ethdev: recommend against using locks in event callbacks

Message ID 20240201084319.2687896-1-david.marchand@redhat.com (mailing list archive)
State Accepted, archived
Delegated to: Ferruh Yigit
Headers
Series ethdev: recommend against using locks in event callbacks |

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/loongarch-compilation success Compilation OK
ci/loongarch-unit-testing success Unit Testing PASS
ci/iol-broadcom-Performance success Performance Testing PASS
ci/Intel-compilation success Compilation OK
ci/iol-intel-Performance success Performance Testing PASS
ci/github-robot: build success github build: passed
ci/intel-Testing success Testing PASS
ci/iol-abi-testing success Testing PASS
ci/iol-intel-Functional success Functional Testing PASS
ci/iol-broadcom-Functional success Functional Testing PASS
ci/iol-compile-amd64-testing success Testing PASS
ci/iol-sample-apps-testing success Testing PASS
ci/intel-Functional success Functional PASS
ci/iol-unit-amd64-testing success Testing PASS
ci/iol-unit-arm64-testing success Testing PASS
ci/iol-compile-arm64-testing success Testing PASS
ci/iol-mellanox-Performance success Performance Testing PASS

Commit Message

David Marchand Feb. 1, 2024, 8:43 a.m. UTC
  As described in a recent bugzilla opened against the net/iavf driver,
a driver may call a event callback from other calls of the ethdev API.

Nothing guarantees in the ethdev API against such behavior.

Add a notice against using locks in those callbacks.

Bugzilla ID: 1337

Signed-off-by: David Marchand <david.marchand@redhat.com>
---
 lib/ethdev/rte_ethdev.h | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)
  

Comments

Kevin Traynor Feb. 1, 2024, 10:08 a.m. UTC | #1
On 01/02/2024 08:43, David Marchand wrote:
> As described in a recent bugzilla opened against the net/iavf driver,
> a driver may call a event callback from other calls of the ethdev API.
> 
> Nothing guarantees in the ethdev API against such behavior.
> 
> Add a notice against using locks in those callbacks.
> 
> Bugzilla ID: 1337
> 
> Signed-off-by: David Marchand <david.marchand@redhat.com>
> ---
>  lib/ethdev/rte_ethdev.h | 14 +++++++++++++-
>  1 file changed, 13 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> index 21e3a21903..5c6b104fb4 100644
> --- a/lib/ethdev/rte_ethdev.h
> +++ b/lib/ethdev/rte_ethdev.h
> @@ -4090,7 +4090,19 @@ enum rte_eth_event_type {
>  	RTE_ETH_EVENT_MAX       /**< max value of this enum */
>  };
>  
> -/** User application callback to be registered for interrupts. */
> +/**
> + * User application callback to be registered for interrupts.
> + *
> + * Note: there is no guarantee in the DPDK drivers that a callback won't be
> + *       called in the middle of other parts of the ethdev API. For example,
> + *       imagine that thread A calls rte_eth_dev_start() and as part of this
> + *       call, a RTE_ETH_EVENT_INTR_RESET event gets generated and the
> + *       associated callback is ran on thread A. In that example, if the
> + *       application protects its internal data using locks before calling
> + *       rte_eth_dev_start(), and the callback takes a same lock, a deadlock
> + *       occurs. Because of this, it is highly recommended NOT to take locks in
> + *       those callbacks.
> + */

That is a good practical recommendation for an application developer.

I wonder if it should taken further so that the API formally states the
callback MUST be non-blocking?

>  typedef int (*rte_eth_dev_cb_fn)(uint16_t port_id,
>  		enum rte_eth_event_type event, void *cb_arg, void *ret_param);
>
  
Ferruh Yigit Feb. 6, 2024, 8:27 p.m. UTC | #2
On 2/1/2024 8:43 AM, David Marchand wrote:
> As described in a recent bugzilla opened against the net/iavf driver,
> a driver may call a event callback from other calls of the ethdev API.
> 
> Nothing guarantees in the ethdev API against such behavior.
> 
> Add a notice against using locks in those callbacks.
> 
> Bugzilla ID: 1337
> 
> Signed-off-by: David Marchand <david.marchand@redhat.com>
> ---
>  lib/ethdev/rte_ethdev.h | 14 +++++++++++++-
>  1 file changed, 13 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> index 21e3a21903..5c6b104fb4 100644
> --- a/lib/ethdev/rte_ethdev.h
> +++ b/lib/ethdev/rte_ethdev.h
> @@ -4090,7 +4090,19 @@ enum rte_eth_event_type {
>  	RTE_ETH_EVENT_MAX       /**< max value of this enum */
>  };
>  
> -/** User application callback to be registered for interrupts. */
> +/**
> + * User application callback to be registered for interrupts.
> + *
> + * Note: there is no guarantee in the DPDK drivers that a callback won't be
> + *       called in the middle of other parts of the ethdev API. For example,
> + *       imagine that thread A calls rte_eth_dev_start() and as part of this
> + *       call, a RTE_ETH_EVENT_INTR_RESET event gets generated and the
> + *       associated callback is ran on thread A. In that example, if the
> + *       application protects its internal data using locks before calling
> + *       rte_eth_dev_start(), and the callback takes a same lock, a deadlock
> + *       occurs. Because of this, it is highly recommended NOT to take locks in
> + *       those callbacks.
> + */
>  typedef int (*rte_eth_dev_cb_fn)(uint16_t port_id,
>  		enum rte_eth_event_type event, void *cb_arg, void *ret_param);
>  

Acked-by: Ferruh Yigit <ferruh.yigit@amd.com>
  
Ferruh Yigit Feb. 6, 2024, 8:33 p.m. UTC | #3
On 2/1/2024 10:08 AM, Kevin Traynor wrote:
> On 01/02/2024 08:43, David Marchand wrote:
>> As described in a recent bugzilla opened against the net/iavf driver,
>> a driver may call a event callback from other calls of the ethdev API.
>>
>> Nothing guarantees in the ethdev API against such behavior.
>>
>> Add a notice against using locks in those callbacks.
>>
>> Bugzilla ID: 1337
>>
>> Signed-off-by: David Marchand <david.marchand@redhat.com>
>> ---
>>  lib/ethdev/rte_ethdev.h | 14 +++++++++++++-
>>  1 file changed, 13 insertions(+), 1 deletion(-)
>>
>> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
>> index 21e3a21903..5c6b104fb4 100644
>> --- a/lib/ethdev/rte_ethdev.h
>> +++ b/lib/ethdev/rte_ethdev.h
>> @@ -4090,7 +4090,19 @@ enum rte_eth_event_type {
>>  	RTE_ETH_EVENT_MAX       /**< max value of this enum */
>>  };
>>  
>> -/** User application callback to be registered for interrupts. */
>> +/**
>> + * User application callback to be registered for interrupts.
>> + *
>> + * Note: there is no guarantee in the DPDK drivers that a callback won't be
>> + *       called in the middle of other parts of the ethdev API. For example,
>> + *       imagine that thread A calls rte_eth_dev_start() and as part of this
>> + *       call, a RTE_ETH_EVENT_INTR_RESET event gets generated and the
>> + *       associated callback is ran on thread A. In that example, if the
>> + *       application protects its internal data using locks before calling
>> + *       rte_eth_dev_start(), and the callback takes a same lock, a deadlock
>> + *       occurs. Because of this, it is highly recommended NOT to take locks in
>> + *       those callbacks.
>> + */
> 
> That is a good practical recommendation for an application developer.
> 
> I wonder if it should taken further so that the API formally states the
> callback MUST be non-blocking?
> 

Application still can manage the locks in a safe way, but needs to be
aware of above condition and possible deadlock.

I think above note is sufficient instead of forbidding locks in
callbacks completely.
  
Kevin Traynor Feb. 7, 2024, 2:31 p.m. UTC | #4
On 06/02/2024 20:33, Ferruh Yigit wrote:
> On 2/1/2024 10:08 AM, Kevin Traynor wrote:
>> On 01/02/2024 08:43, David Marchand wrote:
>>> As described in a recent bugzilla opened against the net/iavf driver,
>>> a driver may call a event callback from other calls of the ethdev API.
>>>
>>> Nothing guarantees in the ethdev API against such behavior.
>>>
>>> Add a notice against using locks in those callbacks.
>>>
>>> Bugzilla ID: 1337
>>>
>>> Signed-off-by: David Marchand <david.marchand@redhat.com>
>>> ---
>>>  lib/ethdev/rte_ethdev.h | 14 +++++++++++++-
>>>  1 file changed, 13 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
>>> index 21e3a21903..5c6b104fb4 100644
>>> --- a/lib/ethdev/rte_ethdev.h
>>> +++ b/lib/ethdev/rte_ethdev.h
>>> @@ -4090,7 +4090,19 @@ enum rte_eth_event_type {
>>>  	RTE_ETH_EVENT_MAX       /**< max value of this enum */
>>>  };
>>>  
>>> -/** User application callback to be registered for interrupts. */
>>> +/**
>>> + * User application callback to be registered for interrupts.
>>> + *
>>> + * Note: there is no guarantee in the DPDK drivers that a callback won't be
>>> + *       called in the middle of other parts of the ethdev API. For example,
>>> + *       imagine that thread A calls rte_eth_dev_start() and as part of this
>>> + *       call, a RTE_ETH_EVENT_INTR_RESET event gets generated and the
>>> + *       associated callback is ran on thread A. In that example, if the
>>> + *       application protects its internal data using locks before calling
>>> + *       rte_eth_dev_start(), and the callback takes a same lock, a deadlock
>>> + *       occurs. Because of this, it is highly recommended NOT to take locks in
>>> + *       those callbacks.
>>> + */
>>
>> That is a good practical recommendation for an application developer.
>>
>> I wonder if it should taken further so that the API formally states the
>> callback MUST be non-blocking?
>>
> 
> Application still can manage the locks in a safe way, but needs to be
> aware of above condition and possible deadlock.
> 

Just to explain a bit more, if you look at the original issue in the
Bugzilla [0], I think there was an assumption that
rte_eth_dev_configure() would not block or deadlock with the
eal-intr-thread. So then it was assumed that waiting for the lock in the
callback was ok, because rte_eth_dev_configure() would return and
callback would obtain the lock.

So i'm showing that in this example the lack of a guarantee or clarity
or bad assumption about the behavior of rte_eth_dev_configure() made it
difficult for an app developer to know if their locks were safe or not.
That's why I was thinking about something more formal.

> I think above note is sufficient instead of forbidding locks in
> callbacks completely.
> 

In the end the difference between "highly recommended NOT to" and "must
not" is not much and either way is probably enough to scare someone
enough to avoid them.

[0] https://bugs.dpdk.org/show_bug.cgi?id=1337#c0
  
Kevin Traynor Feb. 7, 2024, 3:29 p.m. UTC | #5
On 01/02/2024 08:43, David Marchand wrote:
> As described in a recent bugzilla opened against the net/iavf driver,
> a driver may call a event callback from other calls of the ethdev API.
> 
> Nothing guarantees in the ethdev API against such behavior.
> 
> Add a notice against using locks in those callbacks.
> 
> Bugzilla ID: 1337
> 
> Signed-off-by: David Marchand <david.marchand@redhat.com>
> ---
>  lib/ethdev/rte_ethdev.h | 14 +++++++++++++-
>  1 file changed, 13 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> index 21e3a21903..5c6b104fb4 100644
> --- a/lib/ethdev/rte_ethdev.h
> +++ b/lib/ethdev/rte_ethdev.h
> @@ -4090,7 +4090,19 @@ enum rte_eth_event_type {
>  	RTE_ETH_EVENT_MAX       /**< max value of this enum */
>  };
>  
> -/** User application callback to be registered for interrupts. */
> +/**
> + * User application callback to be registered for interrupts.
> + *
> + * Note: there is no guarantee in the DPDK drivers that a callback won't be
> + *       called in the middle of other parts of the ethdev API. For example,
> + *       imagine that thread A calls rte_eth_dev_start() and as part of this
> + *       call, a RTE_ETH_EVENT_INTR_RESET event gets generated and the
> + *       associated callback is ran on thread A. In that example, if the
> + *       application protects its internal data using locks before calling
> + *       rte_eth_dev_start(), and the callback takes a same lock, a deadlock
> + *       occurs. Because of this, it is highly recommended NOT to take locks in
> + *       those callbacks.
> + */
>  typedef int (*rte_eth_dev_cb_fn)(uint16_t port_id,
>  		enum rte_eth_event_type event, void *cb_arg, void *ret_param);
>  

Acked-by: Kevin Traynor <ktraynor@redhat.com>
  
Dariusz Sosnowski Feb. 9, 2024, 4:07 p.m. UTC | #6
> -----Original Message-----
> From: David Marchand <david.marchand@redhat.com>
> Sent: Thursday, February 1, 2024 09:43
> To: dev@dpdk.org
> Cc: ktraynor@redhat.com; NBU-Contact-Thomas Monjalon (EXTERNAL)
> <thomas@monjalon.net>; Ferruh Yigit <ferruh.yigit@amd.com>; Andrew
> Rybchenko <andrew.rybchenko@oktetlabs.ru>
> Subject: [PATCH] ethdev: recommend against using locks in event callbacks
> 
> As described in a recent bugzilla opened against the net/iavf driver, a driver
> may call a event callback from other calls of the ethdev API.
> 
> Nothing guarantees in the ethdev API against such behavior.
> 
> Add a notice against using locks in those callbacks.
> 
> Bugzilla ID: 1337
> 
> Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Dariusz Sosnowski <dsosnowski@nvidia.com>

Best regards,
Dariusz Sosnowski
  
Ferruh Yigit Feb. 9, 2024, 9:13 p.m. UTC | #7
On 2/9/2024 4:07 PM, Dariusz Sosnowski wrote:
>> -----Original Message-----
>> From: David Marchand <david.marchand@redhat.com>
>> Sent: Thursday, February 1, 2024 09:43
>> To: dev@dpdk.org
>> Cc: ktraynor@redhat.com; NBU-Contact-Thomas Monjalon (EXTERNAL)
>> <thomas@monjalon.net>; Ferruh Yigit <ferruh.yigit@amd.com>; Andrew
>> Rybchenko <andrew.rybchenko@oktetlabs.ru>
>> Subject: [PATCH] ethdev: recommend against using locks in event callbacks
>>
>> As described in a recent bugzilla opened against the net/iavf driver, a driver
>> may call a event callback from other calls of the ethdev API.
>>
>> Nothing guarantees in the ethdev API against such behavior.
>>
>> Add a notice against using locks in those callbacks.
>>
>> Bugzilla ID: 1337
>>
>> Signed-off-by: David Marchand <david.marchand@redhat.com>
> Acked-by: Dariusz Sosnowski <dsosnowski@nvidia.com>
> 

Acked-by: Ferruh Yigit <ferruh.yigit@amd.com>

Applied to dpdk-next-net/main, thanks.
  

Patch

diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
index 21e3a21903..5c6b104fb4 100644
--- a/lib/ethdev/rte_ethdev.h
+++ b/lib/ethdev/rte_ethdev.h
@@ -4090,7 +4090,19 @@  enum rte_eth_event_type {
 	RTE_ETH_EVENT_MAX       /**< max value of this enum */
 };
 
-/** User application callback to be registered for interrupts. */
+/**
+ * User application callback to be registered for interrupts.
+ *
+ * Note: there is no guarantee in the DPDK drivers that a callback won't be
+ *       called in the middle of other parts of the ethdev API. For example,
+ *       imagine that thread A calls rte_eth_dev_start() and as part of this
+ *       call, a RTE_ETH_EVENT_INTR_RESET event gets generated and the
+ *       associated callback is ran on thread A. In that example, if the
+ *       application protects its internal data using locks before calling
+ *       rte_eth_dev_start(), and the callback takes a same lock, a deadlock
+ *       occurs. Because of this, it is highly recommended NOT to take locks in
+ *       those callbacks.
+ */
 typedef int (*rte_eth_dev_cb_fn)(uint16_t port_id,
 		enum rte_eth_event_type event, void *cb_arg, void *ret_param);