[v15,0/7] enable hotplug on multi-process

Message ID 20180816030455.41354-1-qi.z.zhang@intel.com (mailing list archive)

Qi Zhang Aug. 16, 2018, 3:04 a.m. UTC
v15:
- fix missing return in rte_eth_dev_pci_release.
- minor fix and more detail comments for patch 5/7.
- update release notes for v18.11.

v14:
- rebase.
- All changes belong to patch 1/6.
  1) rename rte_eth_dev_release_port_private to rte_eth_dev_release_port_secondary
     since it is only used by secondary processes.
  2) in rte_eth_dev_pci_generic_remove, even on the secondary process,
     I think it's better to call rte_eth_dev_release_port_secondary after
     dev_uninit, since the secondary process may need to release
     some local resources in dev_uninit before releasing the port and returning.
     This does not break any existing user of rte_eth_dev_pci_generic_remove,
     because no existing dev_uninit has special handling for the secondary
     process.
  3) add rte_eth_dev_release_port_secondary into rte_eth_dev_destroy as a
     general step, so we don't need patches for i40e and ixgbe.
  4) fix missing update on rte_ethdev_version.map.
- improve error handling for -EEXIST when attaching a device and -ENOENT
  when detaching a device. A device may be out of sync between processes in
  some situations, so attaching an existing device in the primary still needs
  to sync with the secondaries. Also, it's not necessary to roll back if we
  fail to attach an existing device or detach a nonexistent device on a secondary.
- fix potential NULL pointer dereference in handle_primary_request.
- merge all vdev driver patches into one patch.
- merge all pci driver patches into one patch.

v13:
- Since rte_eth_dev_attach/rte_eth_dev_detach will be deprecated,
  modify the sample code to use rte_eal_hotplug_add and
  rte_eal_hotplug_remove to attach/detach devices.

v12:
- fix return value in eal_dev_hotplug_request_to_primary.
- add more error logs in rte_eal_hotplug_add.
- fix return value in rte_eal_hotplug_add and rte_eal_hotplug_remove:
  any failure due to an IPC error now returns -ENOMSG instead of -1.
- remove unnecessary changes from previous rework.

v11:
- move out common code from pci_vfio_unmap_secondary and
  pci_vfio_unmap_primary.
- move RTE_BUS_NAME_MAX_LEN and RTE_DEV_ARGS_MAX_LEN into hotplug_mp.h
- fix reply check in eal_dev_hotplug_request_to_primary.
- move skeleton code for attaching device from secondary from patch 6/19
  to patch 5/19 to improve code readability.

v10:
- Since hotplug add/remove of a vdev on a secondary process now syncs to
  all processes, it is not necessary to support a private vdev for
  a secondary process, which was identified by a non-NULL devargs in
  "--vdev". So rework all vdev driver changes to simplify the device
  probe scenario on a secondary process; devargs are now ignored on
  secondary processes.
- fix license header in example/multi-process/hotplug_mp/Makefile.

v9:
- Move hotplug IPC from rte_eth_dev_attach/rte_eth_dev_detach to
  eal_dev_hotplug_add and eal_dev_hotplug_remove, now all kinds of
  devices will be synced in multi-process.
- Fix a couple of issues when a device is bound to vfio.
  1) The device can't be detached cleanly in a secondary process, which
     also means it can't be attached again, due to the error that
     /dev/vfio/<group_fd> is still busy (see patches 3/19 and 4/19).
  2) repeated detach/attach of a device causes "cannot find TAILQ entry
     for PCI device" due to incorrect PCI address comparison
     (see patch 2/19).
- Removed device lock.
- Removed private device support.
- Fix commit log grammar issues.

v8:
- update rte_eal_version.map due to new API added.
- minor reword on release note.
- minor fix on commit log and code style.

NOTE:
  Some issues unrelated to this patchset are expected when
  playing with the hotplug_mp sample, as below.

- Attaching a PCI device twice may leave the device impossible to detach;
  the fix below is required:
  https://patches.dpdk.org/patch/42030/

- An ixgbe device can't be detached; the fix below is required:
  https://patches.dpdk.org/patch/42031/

v7:
- update rte_ethdev_version.map for new APIs.
- improve code readability in __handle_secondary_request by using goto.
- add comments to explain why rte_eal_alarm_set needs to be called.
- add error log when process_mp_init_callbacks fails.
- reword release notes based on Anatoly's suggestion.
- add back previous "Acked-by" and "Reviewed-by" in commit log.

  NOTE: the current patchset depends on the IPC fix below; without it,
  attaching a shared vdev may fail.
  https://patches.dpdk.org/patch/41647/

v6:
- remove bus->scan_one, since ABI break is not necessary.
- remove patch for failsafe PMD since it will not support secondary.
- fix wrong implementation on ixgbe.
- add rte_eth_dev_release_port_private into rte_eth_dev_pci_generic_remove for
  secondary process, so we don't need to patch a PMD if the PMD uses the
  default remove function.
- add release notes update.
- agreed to use strdup(peer) as a workaround for replying to a sync request in
  a separate thread.

v5:
- since we will keep the mp thread separate from the interrupt thread,
  it is not necessary to use a temporary thread; we use rte_eal_alarm_set.
- remove the change in rte_eth_dev_release_port, since there is a better
  way to prevent rte_eth_dev_release_port from being called after
  rte_eth_dev_release_port_private.
- fix the issue that the lock does not take effect on secondary due to
  the previous rework.
- fix the issue when the first attached device is a private device from
  secondary. (patch 8/24)
- work around replying to a sync request in a separate thread; this is still
  open and under discussion, see:
  https://mails.dpdk.org/archives/dev/2018-June/105359.html

v4:
- since the mp thread will be merged into the interrupt thread, the fix in v3
  for the sync IPC deadlock will not work. The new version enables a
  mechanism to invoke an mp action callback in a temporary thread to
  avoid the IPC deadlock. With this, the secondary-to-primary request
  implementation is also simplified, since we can use a sync request
  directly in a separate thread.

v3:
- enable mp init callback registration to help non-EAL modules initialize
  the mp channel during rte_eal_init.
- fix attaching a shared device from secondary.
  1) deadlock due to sync IPC being invoked in rte_malloc in the primary
     process when handling a secondary request to attach a device; the
     solution is for the primary process to issue shared device attach/detach
     in the interrupt thread.
  2) returned port_id is not correct.
- check nb_sent and nb_received in sync IPC.
- fix memory leak during error handling at attach_on_secondary.
- improve clean_lock_callback to only lock/unlock spinlock once.
- improve error code return in check-reply during async IPC.
- remove rte_ prefix of internal function in ethdev_mp.c
- sample code improvement.
  1) rename sample to "hotplug_mp", and move to example/multi-process.
  2) cleanup header include.
  3) call rte_eal_cleanup before exit.

v2:
- rename rte_ethdev_mp.* to ethdev_mp.*
- rename rte_ethdev_lock.* to ethdev_lock.*
- move internal function to ethdev_private.h
- separate rte_eth_dev_[un]lock into rte_eth_dev_[un]lock and
  rte_eth_dev_[un]lock_with_callback
- lock callbacks will be removed automatically after device is detached.
- add experimental tag for all new APIs.
- fix coding style issue.
- fix wrong license header in sample code.
- fix spelling
- fix meson.build.
- improve comments.

Background:
===========

Currently a secondary process only syncs ethdevs from the primary
process at the init stage; it is not made aware when a device
is attached or detached on the primary process at runtime.

However, applications that adopt the primary-secondary process
model need this. The primary process works as a
resource management process that creates and destroys virtual devices
at runtime, while the secondary processes handle network processing
on these devices.

Solution:
=========

The original intention was to fix this gap, but beyond that
the patch set provides a more comprehensive solution for handling
different hotplug cases in multi-process situations. It covers the
scenarios below:

1. Attach a device from the primary
2. Detach a device from the primary
3. Attach a device from a secondary
4. Detach a device from a secondary

In the primary-secondary process model, we assume ethernet devices are
shared by default. That means attaching or detaching a device on any process
is broadcast to all other processes through the mp channel, so that device
information is synchronized on all processes.

Any failure during attach or detach leaves the processes in an inconsistent
state, so a proper rollback action should be considered.

Scenario for Case 1, 2:

attach device from primary
a) primary attaches the new device; if it fails, goto h).
b) primary sends attach sync request to all secondaries.
c) each secondary receives the request, attaches the device and sends a reply.
d) primary checks the replies; if all succeeded, goto i).
e) primary sends attach rollback sync request to all secondaries.
f) each secondary receives the request, detaches the device and sends a reply.
g) primary receives the replies and detaches the device as rollback action.
h) attach fail
i) attach success
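
The attach steps a) to i) above can be sketched as a small, self-contained
model. This is not the code from the patch set: struct proc, proc_attach,
proc_detach and attach_from_primary are hypothetical names, and the IPC
round trips are reduced to direct function calls.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: each process only records whether the device is attached. */
struct proc {
    bool attached;     /* device currently attached in this process */
    bool fail_attach;  /* simulation knob: make this process fail to attach */
};

/* Try to attach in one process; returns 0 on success, -1 on failure. */
static int proc_attach(struct proc *p)
{
    if (p->fail_attach)
        return -1;
    p->attached = true;
    return 0;
}

static void proc_detach(struct proc *p)
{
    p->attached = false;
}

/*
 * Attach on the primary, sync to every secondary, and roll everything
 * back if any secondary reports a failure.
 */
static int attach_from_primary(struct proc *primary,
                               struct proc *sec, size_t n_sec)
{
    bool all_ok = true;
    size_t i;

    if (proc_attach(primary) != 0)        /* a) */
        return -1;                        /* h) */

    for (i = 0; i < n_sec; i++)           /* b), c) */
        if (proc_attach(&sec[i]) != 0)
            all_ok = false;

    if (all_ok)                           /* d) */
        return 0;                         /* i) */

    for (i = 0; i < n_sec; i++)           /* e), f) */
        proc_detach(&sec[i]);
    proc_detach(primary);                 /* g) */
    return -1;                            /* h) */
}
```

The point of the model is that a single failing secondary leaves no process
with the device attached.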

detach device from primary
a) primary performs pre-detach check; if the device is locked, goto i).
b) primary sends pre-detach sync request to all secondaries.
c) each secondary performs pre-detach check and sends a reply.
d) primary checks the replies; if any failed, goto i).
e) primary sends detach sync request to all secondaries.
f) each secondary detaches the device and sends a reply (assumed not to fail).
g) primary detaches the device.
h) detach success
i) detach failed
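
The detach flow is a two-phase operation: every process is asked first, and
the device is only removed once nobody objects. A minimal sketch under the
same simplifications (direct calls instead of IPC) follows. The names are
hypothetical, and the locked flag is only a stand-in for whatever makes the
pre-detach check fail.

```c
#include <stdbool.h>
#include <stddef.h>

struct proc {
    bool attached;
    bool locked;   /* hypothetical: this process vetoes detach while set */
};

/* Pre-detach check: 0 if this process allows the detach, -1 otherwise. */
static int pre_detach_check(const struct proc *p)
{
    return p->locked ? -1 : 0;
}

static int detach_from_primary(struct proc *primary,
                               struct proc *sec, size_t n_sec)
{
    size_t i;

    if (pre_detach_check(primary) != 0)      /* a) */
        return -1;                           /* i) */

    for (i = 0; i < n_sec; i++)              /* b), c), d) */
        if (pre_detach_check(&sec[i]) != 0)
            return -1;                       /* i) */

    for (i = 0; i < n_sec; i++)              /* e), f) */
        sec[i].attached = false;
    primary->attached = false;               /* g) */
    return 0;                                /* h) */
}
```

Because nothing is modified before all checks pass, a rejected detach needs
no rollback.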

Scenario for case 3, 4:

attach device from secondary:
a) secondary sends async request to primary and waits on a condition
   which will be released by the matching response from primary.
b) primary receives the request and attaches the new device; if it fails,
   goto i).
c) primary forwards the attach request to all secondaries as an async request
   (because this runs in mp thread context, a sync request would deadlock;
    the same reason applies to all following async requests.)
d) each secondary receives the request, attaches the device and sends a reply.
e) primary checks the replies; if all succeeded, goto j).
f) primary sends attach rollback async request to all secondaries.
g) each secondary receives the request, detaches the device and sends a reply.
h) primary receives the replies and detaches the device as rollback action.
i) send fail response to secondary, goto k).
j) send success response to secondary.
k) secondary process receives the response and returns.
 
detach device from secondary:
a) secondary sends async request to primary and waits on a condition
   which will be released by the matching response from primary.
b) primary receives the request and performs pre-detach check; if the device
   is locked, goto j).
c) primary sends pre-detach async request to all secondaries.
d) each secondary performs pre-detach check and sends a reply.
e) primary checks the replies; if any failed, goto j).
f) primary sends detach async request to all secondaries.
g) each secondary detaches the device and sends a reply.
h) primary detaches the device.
i) send success response to secondary, goto k).
j) send fail response to secondary.
k) secondary process receives the response and returns.
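
In both secondary-initiated flows, the requesting secondary has to pair the
primary's response with its outstanding request (steps a and k). The sketch
below models only that matching step. The message layout, DEV_NAME_LEN and
all function names are assumptions made for illustration (the real layout
lives in hotplug_mp.h), and a plain done flag stands in for the condition
the secondary waits on.

```c
#include <stdbool.h>
#include <string.h>

#define DEV_NAME_LEN 64  /* hypothetical size, not taken from the patch */

enum req_type { REQ_ATTACH, REQ_DETACH };

/* Hypothetical message layout exchanged between secondary and primary. */
struct hotplug_msg {
    enum req_type type;
    char devname[DEV_NAME_LEN];
    int result;   /* filled in by the primary: 0 on success, <0 on failure */
};

/* What the secondary remembers while it waits for the primary. */
struct pending_req {
    struct hotplug_msg req;
    bool done;    /* stands in for the condition being released */
    int result;
};

static void start_request(struct pending_req *p, enum req_type type,
                          const char *devname)
{
    memset(p, 0, sizeof(*p));
    p->req.type = type;
    /* The memset above guarantees NUL termination. */
    strncpy(p->req.devname, devname, DEV_NAME_LEN - 1);
}

/* Accept a response only if it matches the outstanding request. */
static bool on_response(struct pending_req *p, const struct hotplug_msg *resp)
{
    if (p->done || resp->type != p->req.type ||
        strcmp(resp->devname, p->req.devname) != 0)
        return false;
    p->result = resp->result;
    p->done = true;
    return true;
}
```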

API changes:
============

The scope of rte_eal_hotplug_add and rte_eal_hotplug_remove is extended.
In the primary-secondary process model, rte_eal_hotplug_add guarantees
that the device is attached on all processes, while rte_eal_hotplug_remove
guarantees that the device is detached on all processes.
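
With the extended scope, callers may want to tell a local failure apart
from a synchronization failure. The changelog above mentions -EEXIST on
attach, -ENOENT on detach and -ENOMSG for IPC errors. A caller-side helper
might classify those values as sketched below. The enum and function names
are hypothetical, and the exact set of codes returned by the final API
should be checked against the merged code.

```c
#include <errno.h>

enum hotplug_outcome {
    HOTPLUG_OK,
    HOTPLUG_ALREADY_ATTACHED,  /* -EEXIST on attach */
    HOTPLUG_NOT_ATTACHED,      /* -ENOENT on detach */
    HOTPLUG_IPC_ERROR,         /* -ENOMSG: sync between processes failed */
    HOTPLUG_OTHER_ERROR
};

/* Map a hotplug return value to a coarse outcome for logging or retry logic. */
static enum hotplug_outcome classify_hotplug_ret(int ret)
{
    switch (ret) {
    case 0:        return HOTPLUG_OK;
    case -EEXIST:  return HOTPLUG_ALREADY_ATTACHED;
    case -ENOENT:  return HOTPLUG_NOT_ATTACHED;
    case -ENOMSG:  return HOTPLUG_IPC_ERROR;
    default:       return HOTPLUG_OTHER_ERROR;
    }
}
```

An application would pass in the value returned by rte_eal_hotplug_add or
rte_eal_hotplug_remove.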


PMD Impact:
===========

Currently device removal is not handled well in secondary processes by
most PMDs: rte_eth_dev_release_port is invoked and messes up the
primary process since it resets all shared data. So we introduce a new API,
rte_eth_dev_release_port_secondary, which only resets the ethdev's state to
unused but does not touch shared data, so other processes are not impacted.
Since not every device driver aims to support the primary-secondary
process model, the patch set only fixes this for PCI devices whose drivers use
rte_eth_dev_pci_generic_remove or rte_eth_dev_destroy, and for all
vdevs that support secondary processes. It can be used as a reference by
other drivers when an equivalent fix is required.
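
The difference between the two release paths can be illustrated with a toy
model. The structs and functions here are hypothetical stand-ins, not the
ethdev definitions: one release wipes data that every process sees, the
other only touches per-process state.

```c
#include <string.h>

enum dev_state { DEV_UNUSED = 0, DEV_ATTACHED };

/* Hypothetical stand-in for data that lives in shared memory. */
struct shared_data {
    char name[32];
    int nb_rx_queues;
};

/* Hypothetical stand-in for the per-process ethdev entry. */
struct local_dev {
    enum dev_state state;
    struct shared_data *data;   /* points into shared memory */
};

/* Full release: also wipes the shared data, so every process is affected. */
static void release_port(struct local_dev *dev)
{
    memset(dev->data, 0, sizeof(*dev->data));
    dev->state = DEV_UNUSED;
}

/* Secondary release: only the local state changes. */
static void release_port_secondary(struct local_dev *dev)
{
    dev->state = DEV_UNUSED;
}
```

In this model, a secondary calling release_port would clear the name and
queue count the primary still relies on, which is the failure described
above.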

Example:
========

The patchset also contains an example to demonstrate device hotplug
in the multi-process model; detailed instructions are below.

/* start sample code as primary then secondary */
./hotplug_mp --proc-type=auto

Command Line Example:

> help
> list

/* attach a pci device */
> attach 0000:81:00.0

/* detach the pci device */
> detach 0000:81:00.0

/* attach a vdev af_packet device */
> attach net_af_packet,iface=eth0

/* detach the vdev af_packet device */
> detach net_af_packet

Qi Zhang (7):
  ethdev: add function to release port in secondary process
  eal: enable hotplug on multi-process
  eal: support attach or detach share device from secondary
  drivers/net: enable hotplug on secondary process
  drivers/net: enable device detach on secondary
  examples/multi_process: add hotplug sample
  doc: update release notes for multi-process hotplug

 doc/guides/rel_notes/release_18_11.rst       |  11 +
 drivers/net/af_packet/rte_eth_af_packet.c    |   6 +-
 drivers/net/bnxt/bnxt_ethdev.c               |   6 +-
 drivers/net/bonding/rte_eth_bond_pmd.c       |   6 +-
 drivers/net/ena/ena_ethdev.c                 |   2 +-
 drivers/net/kni/rte_eth_kni.c                |   6 +-
 drivers/net/liquidio/lio_ethdev.c            |   2 +-
 drivers/net/null/rte_eth_null.c              |   6 +-
 drivers/net/octeontx/octeontx_ethdev.c       |   8 +
 drivers/net/pcap/rte_eth_pcap.c              |   6 +-
 drivers/net/tap/rte_eth_tap.c                |   8 +-
 drivers/net/vhost/rte_eth_vhost.c            |   6 +-
 drivers/net/virtio/virtio_ethdev.c           |   2 +-
 examples/multi_process/Makefile              |   1 +
 examples/multi_process/hotplug_mp/Makefile   |  23 ++
 examples/multi_process/hotplug_mp/commands.c | 214 ++++++++++++++++
 examples/multi_process/hotplug_mp/commands.h |  10 +
 examples/multi_process/hotplug_mp/main.c     |  41 +++
 lib/librte_eal/bsdapp/eal/Makefile           |   1 +
 lib/librte_eal/common/eal_common_dev.c       | 198 ++++++++++++++-
 lib/librte_eal/common/eal_private.h          |  37 +++
 lib/librte_eal/common/hotplug_mp.c           | 363 +++++++++++++++++++++++++++
 lib/librte_eal/common/hotplug_mp.h           |  48 ++++
 lib/librte_eal/common/include/rte_dev.h      |   6 +
 lib/librte_eal/common/meson.build            |   1 +
 lib/librte_eal/linuxapp/eal/Makefile         |   1 +
 lib/librte_eal/linuxapp/eal/eal.c            |   6 +
 lib/librte_ethdev/rte_ethdev.c               |  17 +-
 lib/librte_ethdev/rte_ethdev_driver.h        |  16 +-
 lib/librte_ethdev/rte_ethdev_pci.h           |  10 +-
 lib/librte_ethdev/rte_ethdev_version.map     |   7 +
 31 files changed, 1046 insertions(+), 29 deletions(-)
 create mode 100644 examples/multi_process/hotplug_mp/Makefile
 create mode 100644 examples/multi_process/hotplug_mp/commands.c
 create mode 100644 examples/multi_process/hotplug_mp/commands.h
 create mode 100644 examples/multi_process/hotplug_mp/main.c
 create mode 100644 lib/librte_eal/common/hotplug_mp.c
 create mode 100644 lib/librte_eal/common/hotplug_mp.h