[v2] kni: restrict bifurcated device support
Checks
Commit Message
To enable bifurcated device support, rtnl_lock is released before calling
userspace callbacks and asynchronous requests are enabled.
But these changes caused more issues, like bug #809, #816. To reduce the
scope of the problems, the bifurcated device support related changes are
only enabled when it is requested explicitly with new 'enable_bifurcated'
module parameter.
And bifurcated device support is disabled by default.
So the bifurcated device related problems are isolated and they can be
fixed without impacting all use cases.
Bugzilla ID: 816
Fixes: 631217c76135 ("kni: fix kernel deadlock with bifurcated device")
Cc: stable@dpdk.org
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Igor Ryzhov <iryzhov@nfware.com>
---
Cc: Igor Ryzhov <iryzhov@nfware.com>
Cc: Elad Nachman <eladv6@gmail.com>
Cc: Eric Christian <erclists@gmail.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: Sahithi Singam <sahithi.singam@oracle.com>
v2:
* Updated release note
* Verified the fix with two KNI interface setting MAC address in endless
loop, and confirmed deadlock is not observed (which was observed
without this patch).
---
.../prog_guide/kernel_nic_interface.rst | 28 +++++++++++++
doc/guides/rel_notes/release_21_11.rst | 6 +++
kernel/linux/kni/kni_dev.h | 3 ++
kernel/linux/kni/kni_misc.c | 36 ++++++++++++++++
kernel/linux/kni/kni_net.c | 42 +++++++++++--------
5 files changed, 98 insertions(+), 17 deletions(-)
Comments
23/11/2021 17:46, Ferruh Yigit:
> To enable bifurcated device support, rtnl_lock is released before calling
> userspace callbacks and asynchronous requests are enabled.
>
> But these changes caused more issues, like bug #809, #816. To reduce the
> scope of the problems, the bifurcated device support related changes are
> only enabled when it is requested explicitly with new 'enable_bifurcated'
> module parameter.
> And bifurcated device support is disabled by default.
>
> So the bifurcated device related problems are isolated and they can be
> fixed without impacting all use cases.
>
> Bugzilla ID: 816
> Fixes: 631217c76135 ("kni: fix kernel deadlock with bifurcated device")
> Cc: stable@dpdk.org
>
> Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
> Acked-by: Igor Ryzhov <iryzhov@nfware.com>
This approach looks pragmatic.
BTW, I'm not sure there is a strong need for KNI with bifurcated device.
Applied, thanks.
@@ -56,6 +56,12 @@ can be specified when the module is loaded to control its behavior:
off Interfaces will be created with carrier state set to off.
on Interfaces will be created with carrier state set to on.
(charp)
+ parm: enable_bifurcated: Enable request processing support for
+ bifurcated drivers, which means releasing rtnl_lock before calling
+ userspace callback and supporting async requests (default=off):
+ on Enable request processing support for bifurcated drivers.
+ (charp)
+
Loading the ``rte_kni`` kernel module without any optional parameters is
the typical way a DPDK application gets packets into and out of the kernel
@@ -174,6 +180,28 @@ To set the default carrier state to *off*:
If the ``carrier`` parameter is not specified, the default carrier state
of KNI interfaces will be set to *off*.
+.. _kni_bifurcated_device_support:
+
+Bifurcated Device Support
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+User callbacks are executed while kernel module holds the ``rtnl`` lock, this
+causes a deadlock when callbacks run control commands on another Linux kernel
+network interface.
+
+Bifurcated devices has kernel network driver part and to prevent deadlock for
+them ``enable_bifurcated`` is used.
+
+To enable bifurcated device support:
+
+.. code-block:: console
+
+ # insmod <build_dir>/kernel/linux/kni/rte_kni.ko enable_bifurcated=on
+
+Enabling bifurcated device support releases ``rtnl`` lock before calling
+callback and locks it back after callback. Also enables asynchronous request to
+support callbacks that requires rtnl lock to work (interface down).
+
KNI Creation and Deletion
-------------------------
@@ -75,6 +75,12 @@ New Features
operations.
* Added multi-process support.
+* **Updated default KNI behavior on net devices control callbacks.**
+
+ Updated KNI net devices control callbacks to run with ``rtnl`` kernel lock
+ held by default. A newly added ``enable_bifurcated`` KNI kernel module
+ parameter can be used to run callbacks with ``rtnl`` lock released.
+
* **Added HiSilicon DMA driver.**
The HiSilicon DMA driver provides device drivers for the Kunpeng's DMA devices.
@@ -34,6 +34,9 @@
/* Default carrier state for created KNI network interfaces */
extern uint32_t kni_dflt_carrier;
+/* Request processing support for bifurcated drivers. */
+extern uint32_t bifurcated_support;
+
/**
* A structure describing the private information for a kni device.
*/
@@ -41,6 +41,10 @@ static uint32_t multiple_kthread_on;
static char *carrier;
uint32_t kni_dflt_carrier;
+/* Request processing support for bifurcated drivers. */
+static char *enable_bifurcated;
+uint32_t bifurcated_support;
+
#define KNI_DEV_IN_USE_BIT_NUM 0 /* Bit number for device in use */
static int kni_net_id;
@@ -565,6 +569,22 @@ kni_parse_carrier_state(void)
return 0;
}
+static int __init
+kni_parse_bifurcated_support(void)
+{
+ if (!enable_bifurcated) {
+ bifurcated_support = 0;
+ return 0;
+ }
+
+ if (strcmp(enable_bifurcated, "on") == 0)
+ bifurcated_support = 1;
+ else
+ return -1;
+
+ return 0;
+}
+
static int __init
kni_init(void)
{
@@ -590,6 +610,13 @@ kni_init(void)
else
pr_debug("Default carrier state set to on.\n");
+ if (kni_parse_bifurcated_support() < 0) {
+ pr_err("Invalid parameter for bifurcated support\n");
+ return -EINVAL;
+ }
+ if (bifurcated_support == 1)
+ pr_debug("bifurcated support is enabled.\n");
+
#ifdef HAVE_SIMPLIFIED_PERNET_OPERATIONS
rc = register_pernet_subsys(&kni_net_ops);
#else
@@ -656,3 +683,12 @@ MODULE_PARM_DESC(carrier,
"\t\ton Interfaces will be created with carrier state set to on.\n"
"\t\t"
);
+
+module_param(enable_bifurcated, charp, 0644);
+MODULE_PARM_DESC(enable_bifurcated,
+"Enable request processing support for bifurcated drivers, "
+"which means releasing rtnl_lock before calling userspace callback and "
+"supporting async requests (default=off):\n"
+"\t\ton Enable request processing support for bifurcated drivers.\n"
+"\t\t"
+);
@@ -113,12 +113,14 @@ kni_net_process_request(struct net_device *dev, struct rte_kni_request *req)
ASSERT_RTNL();
- /* If we need to wait and RTNL mutex is held
- * drop the mutex and hold reference to keep device
- */
- if (req->async == 0) {
- dev_hold(dev);
- rtnl_unlock();
+ if (bifurcated_support) {
+ /* If we need to wait and RTNL mutex is held
+ * drop the mutex and hold reference to keep device
+ */
+ if (req->async == 0) {
+ dev_hold(dev);
+ rtnl_unlock();
+ }
}
mutex_lock(&kni->sync_lock);
@@ -132,12 +134,14 @@ kni_net_process_request(struct net_device *dev, struct rte_kni_request *req)
goto fail;
}
- /* No result available since request is handled
- * asynchronously. set response to success.
- */
- if (req->async != 0) {
- req->result = 0;
- goto async;
+ if (bifurcated_support) {
+ /* No result available since request is handled
+ * asynchronously. set response to success.
+ */
+ if (req->async != 0) {
+ req->result = 0;
+ goto async;
+ }
}
ret_val = wait_event_interruptible_timeout(kni->wq,
@@ -160,9 +164,11 @@ kni_net_process_request(struct net_device *dev, struct rte_kni_request *req)
fail:
mutex_unlock(&kni->sync_lock);
- if (req->async == 0) {
- rtnl_lock();
- dev_put(dev);
+ if (bifurcated_support) {
+ if (req->async == 0) {
+ rtnl_lock();
+ dev_put(dev);
+ }
}
return ret;
}
@@ -207,8 +213,10 @@ kni_net_release(struct net_device *dev)
/* Setting if_up to 0 means down */
req.if_up = 0;
- /* request async because of the deadlock problem */
- req.async = 1;
+ if (bifurcated_support) {
+ /* request async because of the deadlock problem */
+ req.async = 1;
+ }
ret = kni_net_process_request(dev, &req);