[1/3] eal: fix lock issue for hot-unplug

Message ID 1541484436-91320-2-git-send-email-jia.guo@intel.com (mailing list archive)
State Superseded, archived
Delegated to: Thomas Monjalon
Headers
Series fix vfio hot-unplug issue |

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/Intel-compilation success Compilation OK

Commit Message

Guo, Jia Nov. 6, 2018, 6:07 a.m. UTC
  This patch will add missing unlock for hot-unplug handler, without this
patch potential dead lock will occur when device be hotplug-in after
device be hot-unplugged.

Fixes: 0fc54536b14a ("eal: add failure handling for hot-unplug")
Signed-off-by: Jeff Guo <jia.guo@intel.com>
---
 lib/librte_eal/linuxapp/eal/eal_dev.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)
  

Comments

Matan Azrad Nov. 6, 2018, 6:22 a.m. UTC | #1
Hi Jeff

Can you detail more in the commit log that we can understand the deadlock scenario. And how does this commit fix it? 

> -----Original Message-----
> From: Jeff Guo <jia.guo@intel.com>
> Sent: Tuesday, November 6, 2018 8:07 AM
> To: konstantin.ananyev@intel.com; anatoly.burakov@intel.com; Thomas
> Monjalon <thomas@monjalon.net>; bernard.iremonger@intel.com;
> jingjing.wu@intel.com; wenzhuo.lu@intel.com
> Cc: ferruh.yigit@intel.com; dev@dpdk.org; jia.guo@intel.com;
> helin.zhang@intel.com; Matan Azrad <matan@mellanox.com>;
> shaopeng.he@intel.com
> Subject: [PATCH 1/3] eal: fix lock issue for hot-unplug
> 
> This patch will add missing unlock for hot-unplug handler, without this patch
> potential dead lock will occur when device be hotplug-in after device be hot-
> unplugged.
> 
> Fixes: 0fc54536b14a ("eal: add failure handling for hot-unplug")
> Signed-off-by: Jeff Guo <jia.guo@intel.com>
> ---
>  lib/librte_eal/linuxapp/eal/eal_dev.c | 12 ++++++++----
>  1 file changed, 8 insertions(+), 4 deletions(-)
> 
> diff --git a/lib/librte_eal/linuxapp/eal/eal_dev.c
> b/lib/librte_eal/linuxapp/eal/eal_dev.c
> index d589c69..2830c86 100644
> --- a/lib/librte_eal/linuxapp/eal/eal_dev.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_dev.c
> @@ -258,7 +258,7 @@ dev_uev_handler(__rte_unused void *param)
>  			if (bus == NULL) {
>  				RTE_LOG(ERR, EAL, "Cannot find bus (%s)\n",
>  					busname);
> -				return;
> +				goto failure_handle_err;
>  			}
> 
>  			dev = bus->find_device(NULL, cmp_dev_name, @@
> -266,19 +266,23 @@ dev_uev_handler(__rte_unused void *param)
>  			if (dev == NULL) {
>  				RTE_LOG(ERR, EAL, "Cannot find device (%s)
> on "
>  					"bus (%s)\n", uevent.devname,
> busname);
> -				return;
> +				goto failure_handle_err;
>  			}
> 
>  			ret = bus->hot_unplug_handler(dev);
> -			rte_spinlock_unlock(&failure_handle_lock);
>  			if (ret) {
>  				RTE_LOG(ERR, EAL, "Can not handle hot-
> unplug "
>  					"for device (%s)\n", dev->name);
> -				return;
>  			}
> +			rte_spinlock_unlock(&failure_handle_lock);
>  		}
>  		rte_dev_event_callback_process(uevent.devname,
> uevent.type);
>  	}
> +
> +	return;
> +
> +failure_handle_err:
> +	rte_spinlock_unlock(&failure_handle_lock);
>  }
> 
>  int __rte_experimental
> --
> 2.7.4
  
Guo, Jia Nov. 7, 2018, 5:49 a.m. UTC | #2
hi matan

On 11/6/2018 2:22 PM, Matan Azrad wrote:
> Hi Jeff
>
> Can you detail more in the commit log that we can understand the deadlock scenario. And how does this commit fix it?


Before i add more detail in the commit log of next version, i would 
explain to you here at first here.

When the device be hot-unplugged,  the hot-unplug handler will be 
invoked and the device will be detached, at this time if the interrupt 
still not disable soon and the second

remove event come again(kernel will sent pci remove event after sent uio 
remove event) , the bus->find_device will return null and return, at 
this place lack of an unlock.

Without this unlock, it will block the next remove or add event 
detection. So it definitely need an unlock here to avoid dead lock.


>> -----Original Message-----
>> From: Jeff Guo <jia.guo@intel.com>
>> Sent: Tuesday, November 6, 2018 8:07 AM
>> To: konstantin.ananyev@intel.com; anatoly.burakov@intel.com; Thomas
>> Monjalon <thomas@monjalon.net>; bernard.iremonger@intel.com;
>> jingjing.wu@intel.com; wenzhuo.lu@intel.com
>> Cc: ferruh.yigit@intel.com; dev@dpdk.org; jia.guo@intel.com;
>> helin.zhang@intel.com; Matan Azrad <matan@mellanox.com>;
>> shaopeng.he@intel.com
>> Subject: [PATCH 1/3] eal: fix lock issue for hot-unplug
>>
>> This patch will add missing unlock for hot-unplug handler, without this patch
>> potential dead lock will occur when device be hotplug-in after device be hot-
>> unplugged.
>>
>> Fixes: 0fc54536b14a ("eal: add failure handling for hot-unplug")
>> Signed-off-by: Jeff Guo <jia.guo@intel.com>
>> ---
>>   lib/librte_eal/linuxapp/eal/eal_dev.c | 12 ++++++++----
>>   1 file changed, 8 insertions(+), 4 deletions(-)
>>
>> diff --git a/lib/librte_eal/linuxapp/eal/eal_dev.c
>> b/lib/librte_eal/linuxapp/eal/eal_dev.c
>> index d589c69..2830c86 100644
>> --- a/lib/librte_eal/linuxapp/eal/eal_dev.c
>> +++ b/lib/librte_eal/linuxapp/eal/eal_dev.c
>> @@ -258,7 +258,7 @@ dev_uev_handler(__rte_unused void *param)
>>   			if (bus == NULL) {
>>   				RTE_LOG(ERR, EAL, "Cannot find bus (%s)\n",
>>   					busname);
>> -				return;
>> +				goto failure_handle_err;
>>   			}
>>
>>   			dev = bus->find_device(NULL, cmp_dev_name, @@
>> -266,19 +266,23 @@ dev_uev_handler(__rte_unused void *param)
>>   			if (dev == NULL) {
>>   				RTE_LOG(ERR, EAL, "Cannot find device (%s)
>> on "
>>   					"bus (%s)\n", uevent.devname,
>> busname);
>> -				return;
>> +				goto failure_handle_err;
>>   			}
>>
>>   			ret = bus->hot_unplug_handler(dev);
>> -			rte_spinlock_unlock(&failure_handle_lock);
>>   			if (ret) {
>>   				RTE_LOG(ERR, EAL, "Can not handle hot-
>> unplug "
>>   					"for device (%s)\n", dev->name);
>> -				return;
>>   			}
>> +			rte_spinlock_unlock(&failure_handle_lock);
>>   		}
>>   		rte_dev_event_callback_process(uevent.devname,
>> uevent.type);
>>   	}
>> +
>> +	return;
>> +
>> +failure_handle_err:
>> +	rte_spinlock_unlock(&failure_handle_lock);
>>   }
>>
>>   int __rte_experimental
>> --
>> 2.7.4
  
Matan Azrad Nov. 8, 2018, 7:08 a.m. UTC | #3
From: Jeff Guo 
> hi matan
> 
> On 11/6/2018 2:22 PM, Matan Azrad wrote:
> > Hi Jeff
> >
> > Can you detail more in the commit log that we can understand the
> deadlock scenario. And how does this commit fix it?
> 
> 
> Before i add more detail in the commit log of next version, i would explain to
> you here at first here.
> 
> When the device be hot-unplugged,  the hot-unplug handler will be invoked
> and the device will be detached, at this time if the interrupt still not disable
> soon and the second
> 
> remove event come again(kernel will sent pci remove event after sent uio
> remove event) , the bus->find_device will return null and return, at this place
> lack of an unlock.
> 
> Without this unlock, it will block the next remove or add event detection. So
> it definitely need an unlock here to avoid dead lock.
> 

Makes sense.

Thanks

> 
> >> -----Original Message-----
> >> From: Jeff Guo <jia.guo@intel.com>
> >> Sent: Tuesday, November 6, 2018 8:07 AM
> >> To: konstantin.ananyev@intel.com; anatoly.burakov@intel.com; Thomas
> >> Monjalon <thomas@monjalon.net>; bernard.iremonger@intel.com;
> >> jingjing.wu@intel.com; wenzhuo.lu@intel.com
> >> Cc: ferruh.yigit@intel.com; dev@dpdk.org; jia.guo@intel.com;
> >> helin.zhang@intel.com; Matan Azrad <matan@mellanox.com>;
> >> shaopeng.he@intel.com
> >> Subject: [PATCH 1/3] eal: fix lock issue for hot-unplug
> >>
> >> This patch will add missing unlock for hot-unplug handler, without this
> patch
> >> potential dead lock will occur when device be hotplug-in after device be
> hot-
> >> unplugged.
> >>
> >> Fixes: 0fc54536b14a ("eal: add failure handling for hot-unplug")
> >> Signed-off-by: Jeff Guo <jia.guo@intel.com>
> >> ---
> >>   lib/librte_eal/linuxapp/eal/eal_dev.c | 12 ++++++++----
> >>   1 file changed, 8 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/lib/librte_eal/linuxapp/eal/eal_dev.c
> >> b/lib/librte_eal/linuxapp/eal/eal_dev.c
> >> index d589c69..2830c86 100644
> >> --- a/lib/librte_eal/linuxapp/eal/eal_dev.c
> >> +++ b/lib/librte_eal/linuxapp/eal/eal_dev.c
> >> @@ -258,7 +258,7 @@ dev_uev_handler(__rte_unused void *param)
> >>   			if (bus == NULL) {
> >>   				RTE_LOG(ERR, EAL, "Cannot find bus (%s)\n",
> >>   					busname);
> >> -				return;
> >> +				goto failure_handle_err;
> >>   			}
> >>
> >>   			dev = bus->find_device(NULL, cmp_dev_name, @@
> >> -266,19 +266,23 @@ dev_uev_handler(__rte_unused void *param)
> >>   			if (dev == NULL) {
> >>   				RTE_LOG(ERR, EAL, "Cannot find device (%s)
> >> on "
> >>   					"bus (%s)\n", uevent.devname,
> >> busname);
> >> -				return;
> >> +				goto failure_handle_err;
> >>   			}
> >>
> >>   			ret = bus->hot_unplug_handler(dev);
> >> -			rte_spinlock_unlock(&failure_handle_lock);
> >>   			if (ret) {
> >>   				RTE_LOG(ERR, EAL, "Can not handle hot-
> >> unplug "
> >>   					"for device (%s)\n", dev->name);
> >> -				return;
> >>   			}
> >> +			rte_spinlock_unlock(&failure_handle_lock);
> >>   		}
> >>   		rte_dev_event_callback_process(uevent.devname,
> >> uevent.type);
> >>   	}
> >> +
> >> +	return;
> >> +
> >> +failure_handle_err:
> >> +	rte_spinlock_unlock(&failure_handle_lock);
> >>   }
> >>
> >>   int __rte_experimental
> >> --
> >> 2.7.4
  

Patch

diff --git a/lib/librte_eal/linuxapp/eal/eal_dev.c b/lib/librte_eal/linuxapp/eal/eal_dev.c
index d589c69..2830c86 100644
--- a/lib/librte_eal/linuxapp/eal/eal_dev.c
+++ b/lib/librte_eal/linuxapp/eal/eal_dev.c
@@ -258,7 +258,7 @@  dev_uev_handler(__rte_unused void *param)
 			if (bus == NULL) {
 				RTE_LOG(ERR, EAL, "Cannot find bus (%s)\n",
 					busname);
-				return;
+				goto failure_handle_err;
 			}
 
 			dev = bus->find_device(NULL, cmp_dev_name,
@@ -266,19 +266,23 @@  dev_uev_handler(__rte_unused void *param)
 			if (dev == NULL) {
 				RTE_LOG(ERR, EAL, "Cannot find device (%s) on "
 					"bus (%s)\n", uevent.devname, busname);
-				return;
+				goto failure_handle_err;
 			}
 
 			ret = bus->hot_unplug_handler(dev);
-			rte_spinlock_unlock(&failure_handle_lock);
 			if (ret) {
 				RTE_LOG(ERR, EAL, "Can not handle hot-unplug "
 					"for device (%s)\n", dev->name);
-				return;
 			}
+			rte_spinlock_unlock(&failure_handle_lock);
 		}
 		rte_dev_event_callback_process(uevent.devname, uevent.type);
 	}
+
+	return;
+
+failure_handle_err:
+	rte_spinlock_unlock(&failure_handle_lock);
 }
 
 int __rte_experimental