[2/3] vfio: fix to add handler lock for hot-unplug
Checks
Commit Message
This patch add hot-unplug handler lock and unlock in device request
handler when process bus and device resource, in order to avoid the
synchronization issue when device be hot-unplugged.
Fixes: c115fd000c32 ("vfio: handle hotplug request notifier")
Signed-off-by: Jeff Guo <jia.guo@intel.com>
---
drivers/bus/pci/linux/pci_vfio.c | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)
Comments
Hi Jeff
Can you detail more in the commit log that we can understand the synchronization problematic scenario. And how does this commit fix it?
> -----Original Message-----
> From: Jeff Guo <jia.guo@intel.com>
> Sent: Tuesday, November 6, 2018 8:07 AM
> To: konstantin.ananyev@intel.com; anatoly.burakov@intel.com; Thomas
> Monjalon <thomas@monjalon.net>; bernard.iremonger@intel.com;
> jingjing.wu@intel.com; wenzhuo.lu@intel.com
> Cc: ferruh.yigit@intel.com; dev@dpdk.org; jia.guo@intel.com;
> helin.zhang@intel.com; Matan Azrad <matan@mellanox.com>;
> shaopeng.he@intel.com
> Subject: [PATCH 2/3] vfio: fix to add handler lock for hot-unplug
>
> This patch add hot-unplug handler lock and unlock in device request handler
> when process bus and device resource, in order to avoid the synchronization
> issue when device be hot-unplugged.
>
> Fixes: c115fd000c32 ("vfio: handle hotplug request notifier")
> Signed-off-by: Jeff Guo <jia.guo@intel.com>
> ---
> drivers/bus/pci/linux/pci_vfio.c | 14 +++++++++++++-
> 1 file changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/bus/pci/linux/pci_vfio.c b/drivers/bus/pci/linux/pci_vfio.c
> index 305cc06..d2c8410 100644
> --- a/drivers/bus/pci/linux/pci_vfio.c
> +++ b/drivers/bus/pci/linux/pci_vfio.c
> @@ -19,6 +19,7 @@
> #include <rte_vfio.h>
> #include <rte_eal.h>
> #include <rte_bus.h>
> +#include <rte_spinlock.h>
>
> #include "eal_filesystem.h"
>
> @@ -35,6 +36,14 @@
> * This file is only compiled if CONFIG_RTE_EAL_VFIO is set to "y".
> */
>
> +/*
> + * spinlock for device hot-unplug failure handling. If it try to access
> +bus or
> + * device, such as handle sigbus on bus or handle memory failure for
> +device
> + * just need to use this lock. It could protect the bus and the device
> +to avoid
> + * race condition.
> + */
> +static rte_spinlock_t failure_handle_lock = RTE_SPINLOCK_INITIALIZER;
> +
> #ifdef VFIO_PRESENT
>
> #ifndef PAGE_SIZE
> @@ -289,11 +298,12 @@ pci_vfio_req_handler(void *param)
> int ret;
> struct rte_device *device = (struct rte_device *)param;
>
> + rte_spinlock_lock(&failure_handle_lock);
> bus = rte_bus_find_by_device(device);
> if (bus == NULL) {
> RTE_LOG(ERR, EAL, "Cannot find bus for device (%s)\n",
> device->name);
> - return;
> + goto handle_end;
> }
>
> /*
> @@ -306,6 +316,8 @@ pci_vfio_req_handler(void *param)
> RTE_LOG(ERR, EAL,
> "Can not handle hot-unplug for device (%s)\n",
> device->name);
> +handle_end:
> + rte_spinlock_unlock(&failure_handle_lock);
> }
>
> /* enable notifier (only enable req now) */
> --
> 2.7.4
On 11/6/2018 2:23 PM, Matan Azrad wrote:
> Hi Jeff
>
> Can you detail more in the commit log that we can understand the synchronization problematic scenario. And how does this commit fix it?
>
Please check my reply in the 1/3 mail. And explain more here is that,
when device be hot-unplugged in vfio, the req notifier will invoked,
then user space could release device resource in user space side,
then vfio check that the device be released out from the device group,
it will take the device control again and trigger the device kernel
release processing, at the mean time it will sent remove uevent to
user space. Here although the req handler seems will always process
before uevent handler, but even for fast path and slow path protection
of device accessing when device is removing , it should also be need.
what do you think about that?
>> -----Original Message-----
>> From: Jeff Guo <jia.guo@intel.com>
>> Sent: Tuesday, November 6, 2018 8:07 AM
>> To: konstantin.ananyev@intel.com; anatoly.burakov@intel.com; Thomas
>> Monjalon <thomas@monjalon.net>; bernard.iremonger@intel.com;
>> jingjing.wu@intel.com; wenzhuo.lu@intel.com
>> Cc: ferruh.yigit@intel.com; dev@dpdk.org; jia.guo@intel.com;
>> helin.zhang@intel.com; Matan Azrad <matan@mellanox.com>;
>> shaopeng.he@intel.com
>> Subject: [PATCH 2/3] vfio: fix to add handler lock for hot-unplug
>>
>> This patch add hot-unplug handler lock and unlock in device request handler
>> when process bus and device resource, in order to avoid the synchronization
>> issue when device be hot-unplugged.
>>
>> Fixes: c115fd000c32 ("vfio: handle hotplug request notifier")
>> Signed-off-by: Jeff Guo <jia.guo@intel.com>
>> ---
>> drivers/bus/pci/linux/pci_vfio.c | 14 +++++++++++++-
>> 1 file changed, 13 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/bus/pci/linux/pci_vfio.c b/drivers/bus/pci/linux/pci_vfio.c
>> index 305cc06..d2c8410 100644
>> --- a/drivers/bus/pci/linux/pci_vfio.c
>> +++ b/drivers/bus/pci/linux/pci_vfio.c
>> @@ -19,6 +19,7 @@
>> #include <rte_vfio.h>
>> #include <rte_eal.h>
>> #include <rte_bus.h>
>> +#include <rte_spinlock.h>
>>
>> #include "eal_filesystem.h"
>>
>> @@ -35,6 +36,14 @@
>> * This file is only compiled if CONFIG_RTE_EAL_VFIO is set to "y".
>> */
>>
>> +/*
>> + * spinlock for device hot-unplug failure handling. If it try to access
>> +bus or
>> + * device, such as handle sigbus on bus or handle memory failure for
>> +device
>> + * just need to use this lock. It could protect the bus and the device
>> +to avoid
>> + * race condition.
>> + */
>> +static rte_spinlock_t failure_handle_lock = RTE_SPINLOCK_INITIALIZER;
>> +
>> #ifdef VFIO_PRESENT
>>
>> #ifndef PAGE_SIZE
>> @@ -289,11 +298,12 @@ pci_vfio_req_handler(void *param)
>> int ret;
>> struct rte_device *device = (struct rte_device *)param;
>>
>> + rte_spinlock_lock(&failure_handle_lock);
>> bus = rte_bus_find_by_device(device);
>> if (bus == NULL) {
>> RTE_LOG(ERR, EAL, "Cannot find bus for device (%s)\n",
>> device->name);
>> - return;
>> + goto handle_end;
>> }
>>
>> /*
>> @@ -306,6 +316,8 @@ pci_vfio_req_handler(void *param)
>> RTE_LOG(ERR, EAL,
>> "Can not handle hot-unplug for device (%s)\n",
>> device->name);
>> +handle_end:
>> + rte_spinlock_unlock(&failure_handle_lock);
>> }
>>
>> /* enable notifier (only enable req now) */
>> --
>> 2.7.4
From: Jeff Guo
> On 11/6/2018 2:23 PM, Matan Azrad wrote:
> > Hi Jeff
> >
> > Can you detail more in the commit log that we can understand the
> synchronization problematic scenario. And how does this commit fix it?
> >
>
> Please check my reply in the 1/3 mail. And explain more here is that, when
> device be hot-unplugged in vfio, the req notifier will invoked, then user
> space could release device resource in user space side,
>
> then vfio check that the device be released out from the device group, it will
> take the device control again and trigger the device kernel release
> processing, at the mean time it will sent remove uevent to
>
> user space. Here although the req handler seems will always process before
> uevent handler, but even for fast path and slow path protection of device
> accessing when device is removing , it should also be need.
>
> what do you think about that?
Just don't understanf hoe the fast\control-path can access to pci_vfio_req_handler...
>
>
> >> -----Original Message-----
> >> From: Jeff Guo <jia.guo@intel.com>
> >> Sent: Tuesday, November 6, 2018 8:07 AM
> >> To: konstantin.ananyev@intel.com; anatoly.burakov@intel.com; Thomas
> >> Monjalon <thomas@monjalon.net>; bernard.iremonger@intel.com;
> >> jingjing.wu@intel.com; wenzhuo.lu@intel.com
> >> Cc: ferruh.yigit@intel.com; dev@dpdk.org; jia.guo@intel.com;
> >> helin.zhang@intel.com; Matan Azrad <matan@mellanox.com>;
> >> shaopeng.he@intel.com
> >> Subject: [PATCH 2/3] vfio: fix to add handler lock for hot-unplug
> >>
> >> This patch add hot-unplug handler lock and unlock in device request
> >> handler when process bus and device resource, in order to avoid the
> >> synchronization issue when device be hot-unplugged.
> >>
> >> Fixes: c115fd000c32 ("vfio: handle hotplug request notifier")
> >> Signed-off-by: Jeff Guo <jia.guo@intel.com>
> >> ---
> >> drivers/bus/pci/linux/pci_vfio.c | 14 +++++++++++++-
> >> 1 file changed, 13 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/bus/pci/linux/pci_vfio.c
> >> b/drivers/bus/pci/linux/pci_vfio.c
> >> index 305cc06..d2c8410 100644
> >> --- a/drivers/bus/pci/linux/pci_vfio.c
> >> +++ b/drivers/bus/pci/linux/pci_vfio.c
> >> @@ -19,6 +19,7 @@
> >> #include <rte_vfio.h>
> >> #include <rte_eal.h>
> >> #include <rte_bus.h>
> >> +#include <rte_spinlock.h>
> >>
> >> #include "eal_filesystem.h"
> >>
> >> @@ -35,6 +36,14 @@
> >> * This file is only compiled if CONFIG_RTE_EAL_VFIO is set to "y".
> >> */
> >>
> >> +/*
> >> + * spinlock for device hot-unplug failure handling. If it try to
> >> +access bus or
> >> + * device, such as handle sigbus on bus or handle memory failure for
> >> +device
> >> + * just need to use this lock. It could protect the bus and the
> >> +device to avoid
> >> + * race condition.
> >> + */
> >> +static rte_spinlock_t failure_handle_lock =
> >> +RTE_SPINLOCK_INITIALIZER;
> >> +
> >> #ifdef VFIO_PRESENT
> >>
> >> #ifndef PAGE_SIZE
> >> @@ -289,11 +298,12 @@ pci_vfio_req_handler(void *param)
> >> int ret;
> >> struct rte_device *device = (struct rte_device *)param;
> >>
> >> + rte_spinlock_lock(&failure_handle_lock);
> >> bus = rte_bus_find_by_device(device);
> >> if (bus == NULL) {
> >> RTE_LOG(ERR, EAL, "Cannot find bus for device (%s)\n",
> >> device->name);
> >> - return;
> >> + goto handle_end;
> >> }
> >>
> >> /*
> >> @@ -306,6 +316,8 @@ pci_vfio_req_handler(void *param)
> >> RTE_LOG(ERR, EAL,
> >> "Can not handle hot-unplug for device (%s)\n",
> >> device->name);
> >> +handle_end:
> >> + rte_spinlock_unlock(&failure_handle_lock);
> >> }
> >>
> >> /* enable notifier (only enable req now) */
> >> --
> >> 2.7.4
On 11/8/2018 3:20 PM, Matan Azrad wrote:
>
> From: Jeff Guo
>> On 11/6/2018 2:23 PM, Matan Azrad wrote:
>>> Hi Jeff
>>>
>>> Can you detail more in the commit log that we can understand the
>> synchronization problematic scenario. And how does this commit fix it?
>> Please check my reply in the 1/3 mail. And explain more here is that, when
>> device be hot-unplugged in vfio, the req notifier will invoked, then user
>> space could release device resource in user space side,
>>
>> then vfio check that the device be released out from the device group, it will
>> take the device control again and trigger the device kernel release
>> processing, at the mean time it will sent remove uevent to
>>
>> user space. Here although the req handler seems will always process before
>> uevent handler, but even for fast path and slow path protection of device
>> accessing when device is removing , it should also be need.
>>
>> what do you think about that?
> Just don't understanf hoe the fast\control-path can access to pci_vfio_req_handler...
The pci_vfio_req_handler is register in eal interrupt thread when pci
probe driver, so it should be access in the control path.
Since the sigbus handler will be enable when enable hotplug , if origin
sigbus occur the sigbus handler be invoked, in the handler it will
process the bus and device. So i think protection of the bus and device
so that it will not cause race condition that should be need.
>>
>>>> -----Original Message-----
>>>> From: Jeff Guo <jia.guo@intel.com>
>>>> Sent: Tuesday, November 6, 2018 8:07 AM
>>>> To: konstantin.ananyev@intel.com; anatoly.burakov@intel.com; Thomas
>>>> Monjalon <thomas@monjalon.net>; bernard.iremonger@intel.com;
>>>> jingjing.wu@intel.com; wenzhuo.lu@intel.com
>>>> Cc: ferruh.yigit@intel.com; dev@dpdk.org; jia.guo@intel.com;
>>>> helin.zhang@intel.com; Matan Azrad <matan@mellanox.com>;
>>>> shaopeng.he@intel.com
>>>> Subject: [PATCH 2/3] vfio: fix to add handler lock for hot-unplug
>>>>
>>>> This patch add hot-unplug handler lock and unlock in device request
>>>> handler when process bus and device resource, in order to avoid the
>>>> synchronization issue when device be hot-unplugged.
>>>>
>>>> Fixes: c115fd000c32 ("vfio: handle hotplug request notifier")
>>>> Signed-off-by: Jeff Guo <jia.guo@intel.com>
>>>> ---
>>>> drivers/bus/pci/linux/pci_vfio.c | 14 +++++++++++++-
>>>> 1 file changed, 13 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/bus/pci/linux/pci_vfio.c
>>>> b/drivers/bus/pci/linux/pci_vfio.c
>>>> index 305cc06..d2c8410 100644
>>>> --- a/drivers/bus/pci/linux/pci_vfio.c
>>>> +++ b/drivers/bus/pci/linux/pci_vfio.c
>>>> @@ -19,6 +19,7 @@
>>>> #include <rte_vfio.h>
>>>> #include <rte_eal.h>
>>>> #include <rte_bus.h>
>>>> +#include <rte_spinlock.h>
>>>>
>>>> #include "eal_filesystem.h"
>>>>
>>>> @@ -35,6 +36,14 @@
>>>> * This file is only compiled if CONFIG_RTE_EAL_VFIO is set to "y".
>>>> */
>>>>
>>>> +/*
>>>> + * spinlock for device hot-unplug failure handling. If it try to
>>>> +access bus or
>>>> + * device, such as handle sigbus on bus or handle memory failure for
>>>> +device
>>>> + * just need to use this lock. It could protect the bus and the
>>>> +device to avoid
>>>> + * race condition.
>>>> + */
>>>> +static rte_spinlock_t failure_handle_lock =
>>>> +RTE_SPINLOCK_INITIALIZER;
>>>> +
>>>> #ifdef VFIO_PRESENT
>>>>
>>>> #ifndef PAGE_SIZE
>>>> @@ -289,11 +298,12 @@ pci_vfio_req_handler(void *param)
>>>> int ret;
>>>> struct rte_device *device = (struct rte_device *)param;
>>>>
>>>> + rte_spinlock_lock(&failure_handle_lock);
>>>> bus = rte_bus_find_by_device(device);
>>>> if (bus == NULL) {
>>>> RTE_LOG(ERR, EAL, "Cannot find bus for device (%s)\n",
>>>> device->name);
>>>> - return;
>>>> + goto handle_end;
>>>> }
>>>>
>>>> /*
>>>> @@ -306,6 +316,8 @@ pci_vfio_req_handler(void *param)
>>>> RTE_LOG(ERR, EAL,
>>>> "Can not handle hot-unplug for device (%s)\n",
>>>> device->name);
>>>> +handle_end:
>>>> + rte_spinlock_unlock(&failure_handle_lock);
>>>> }
>>>>
>>>> /* enable notifier (only enable req now) */
>>>> --
>>>> 2.7.4
@@ -19,6 +19,7 @@
#include <rte_vfio.h>
#include <rte_eal.h>
#include <rte_bus.h>
+#include <rte_spinlock.h>
#include "eal_filesystem.h"
@@ -35,6 +36,14 @@
* This file is only compiled if CONFIG_RTE_EAL_VFIO is set to "y".
*/
+/*
+ * spinlock for device hot-unplug failure handling. If it try to access bus or
+ * device, such as handle sigbus on bus or handle memory failure for device
+ * just need to use this lock. It could protect the bus and the device to avoid
+ * race condition.
+ */
+static rte_spinlock_t failure_handle_lock = RTE_SPINLOCK_INITIALIZER;
+
#ifdef VFIO_PRESENT
#ifndef PAGE_SIZE
@@ -289,11 +298,12 @@ pci_vfio_req_handler(void *param)
int ret;
struct rte_device *device = (struct rte_device *)param;
+ rte_spinlock_lock(&failure_handle_lock);
bus = rte_bus_find_by_device(device);
if (bus == NULL) {
RTE_LOG(ERR, EAL, "Cannot find bus for device (%s)\n",
device->name);
- return;
+ goto handle_end;
}
/*
@@ -306,6 +316,8 @@ pci_vfio_req_handler(void *param)
RTE_LOG(ERR, EAL,
"Can not handle hot-unplug for device (%s)\n",
device->name);
+handle_end:
+ rte_spinlock_unlock(&failure_handle_lock);
}
/* enable notifier (only enable req now) */