eventdev: fix Rx adapter stalls on event device backpressure

Message ID 20211108132558.28748-1-mattias.ronnblom@ericsson.com (mailing list archive)
State Changes Requested, archived
Delegated to: Jerin Jacob
Series eventdev: fix Rx adapter stalls on event device backpressure

Checks

Context                          Check    Description
ci/checkpatch                    success  coding style OK
ci/github-robot: build           success  github build: passed
ci/Intel-compilation             success  Compilation OK
ci/intel-Testing                 success  Testing PASS
ci/iol-mellanox-Performance      success  Performance Testing PASS
ci/iol-broadcom-Functional       success  Functional Testing PASS
ci/iol-broadcom-Performance      success  Performance Testing PASS
ci/iol-aarch64-unit-testing      success  Testing PASS
ci/iol-x86_64-compile-testing    success  Testing PASS
ci/iol-x86_64-unit-testing       success  Testing PASS
ci/iol-aarch64-compile-testing   success  Testing PASS
ci/iol-intel-Performance         success  Performance Testing PASS
ci/iol-intel-Functional          success  Functional Testing PASS

Commit Message

Mattias Rönnblom Nov. 8, 2021, 1:25 p.m. UTC
In the Eventdev Ethernet RX Adapter, correctly handle the case where
the circular enqueue buffer head and tail indices point to the same
element (i.e., the buffer is full) and the buffer has wrapped.

This bug may be triggered in case there is backpressure from the event
device to the RX adapter.

Fixes: 8113fd15e229 ("eventdev/eth_rx: make enqueue buffer circular")
Cc: ganapati.kundapura@intel.com
Cc: stable@dpdk.org

Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
---
 lib/eventdev/rte_event_eth_rx_adapter.c | 22 ++++++++++++++--------
 1 file changed, 14 insertions(+), 8 deletions(-)
  

Comments

Mattias Rönnblom Nov. 8, 2021, 1:43 p.m. UTC | #1
On 2021-11-08 14:25, Mattias Rönnblom wrote:
> In the Eventdev Ethernet RX Adapter, correctly handle the case where
> the circular enqueue buffer head and tail index points to the same
> element (i.e., the buffer is full) and the buffer has wrapped.
>
> This bug may be triggered in case there is backpressure from the event
> device to the RX adapter.
>
> Fixes: 8113fd15e229 ("eventdev/eth_rx: make enqueue buffer circular")
> Cc: ganapati.kundapura@intel.com
> Cc: stable@dpdk.org


Disregard the stable cc. This bug does not appear in any released DPDK 
version (e.g., 21.08).


> Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
> ---
>   lib/eventdev/rte_event_eth_rx_adapter.c | 22 ++++++++++++++--------
>   1 file changed, 14 insertions(+), 8 deletions(-)
>
> diff --git a/lib/eventdev/rte_event_eth_rx_adapter.c b/lib/eventdev/rte_event_eth_rx_adapter.c
> index 56318b5a6f..809416d9b7 100644
> --- a/lib/eventdev/rte_event_eth_rx_adapter.c
> +++ b/lib/eventdev/rte_event_eth_rx_adapter.c
> @@ -777,19 +777,25 @@ rxa_flush_event_buffer(struct event_eth_rx_adapter *rx_adapter,
>   		       struct eth_event_enqueue_buffer *buf,
>   		       struct rte_event_eth_rx_adapter_stats *stats)
>   {
> -	uint16_t count = buf->last ? buf->last - buf->head : buf->count;
> +	uint16_t count = buf->count;
> +	uint16_t n = 0;
>   
>   	if (!count)
>   		return 0;
>   
> -	uint16_t n = rte_event_enqueue_new_burst(rx_adapter->eventdev_id,
> -					rx_adapter->event_port_id,
> -					&buf->events[buf->head],
> -					count);
> -	if (n != count)
> -		stats->rx_enq_retry++;
> +	if (buf->last)
> +		count = buf->last - buf->head;
> +
> +	if (count) {
> +		n = rte_event_enqueue_new_burst(rx_adapter->eventdev_id,
> +						rx_adapter->event_port_id,
> +						&buf->events[buf->head],
> +						count);
> +		if (n != count)
> +			stats->rx_enq_retry++;
>   
> -	buf->head += n;
> +		buf->head += n;
> +	}
>   
>   	if (buf->last && n == count) {
>   		uint16_t n1;
  
Ganapati Kundapura Nov. 9, 2021, 6:26 a.m. UTC | #2
Hi Mattias,

> -----Original Message-----
> From: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
> Sent: 08 November 2021 19:14
> To: jerinj@marvell.com; Jayatheerthan, Jay <jay.jayatheerthan@intel.com>
> Cc: dev@dpdk.org; Kundapura, Ganapati <ganapati.kundapura@intel.com>;
> stable@dpdk.org
> Subject: Re: [PATCH] eventdev: fix Rx adapter stalls on event device
> backpressure
> 
> On 2021-11-08 14:25, Mattias Rönnblom wrote:
> > In the Eventdev Ethernet RX Adapter, correctly handle the case where
> > the circular enqueue buffer head and tail index points to the same
> > element (i.e., the buffer is full) and the buffer has wrapped.
> >
> > This bug may be triggered in case there is backpressure from the event
> > device to the RX adapter.
> >
> > Fixes: 8113fd15e229 ("eventdev/eth_rx: make enqueue buffer circular")
> > Cc: ganapati.kundapura@intel.com
> > Cc: stable@dpdk.org
> 
> 
> Disregard the stable cc. This bug does not appear in any released DPDK
> version (e.g., 21.08).
> 
> 
> > Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
> > ---
> >   lib/eventdev/rte_event_eth_rx_adapter.c | 22 ++++++++++++++--------
> >   1 file changed, 14 insertions(+), 8 deletions(-)
> >
> > diff --git a/lib/eventdev/rte_event_eth_rx_adapter.c
> b/lib/eventdev/rte_event_eth_rx_adapter.c
> > index 56318b5a6f..809416d9b7 100644
> > --- a/lib/eventdev/rte_event_eth_rx_adapter.c
> > +++ b/lib/eventdev/rte_event_eth_rx_adapter.c
> > @@ -777,19 +777,25 @@ rxa_flush_event_buffer(struct
> event_eth_rx_adapter *rx_adapter,
> >   		       struct eth_event_enqueue_buffer *buf,
> >   		       struct rte_event_eth_rx_adapter_stats *stats)
> >   {
> > -	uint16_t count = buf->last ? buf->last - buf->head : buf->count;
> > +	uint16_t count = buf->count;
> > +	uint16_t n = 0;
> >
> >   	if (!count)
> >   		return 0;
> >
> > -	uint16_t n = rte_event_enqueue_new_burst(rx_adapter-
> >eventdev_id,
> > -					rx_adapter->event_port_id,
> > -					&buf->events[buf->head],
> > -					count);
> > -	if (n != count)
> > -		stats->rx_enq_retry++;
> > +	if (buf->last)
> > +		count = buf->last - buf->head;
> > +
> > +	if (count) {
> > +		n = rte_event_enqueue_new_burst(rx_adapter-
> >eventdev_id,
> > +						rx_adapter->event_port_id,
> > +						&buf->events[buf->head],
> > +						count);
> > +		if (n != count)
> > +			stats->rx_enq_retry++;
> >
> > -	buf->head += n;
> > +		buf->head += n;
> > +	}
> >
> >   	if (buf->last && n == count) {
> >   		uint16_t n1;

When head == tail, count is the number of events in the event buffer, i.e., count = buf->count and last = 0.
last is the marker used when the tail rolls over.
When the tail has rolled over and the head has not, events are processed from head to last and then from zero to tail.
The change looks equivalent to the original code.
Could you please explain this change in more detail, and also clarify whether you were able to reproduce the backpressure issue?
  
Mattias Rönnblom Nov. 9, 2021, 8:28 a.m. UTC | #3
On 2021-11-09 07:26, Kundapura, Ganapati wrote:
> Hi Mattias,
>
>> -----Original Message-----
>> From: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
>> Sent: 08 November 2021 19:14
>> To: jerinj@marvell.com; Jayatheerthan, Jay <jay.jayatheerthan@intel.com>
>> Cc: dev@dpdk.org; Kundapura, Ganapati <ganapati.kundapura@intel.com>;
>> stable@dpdk.org
>> Subject: Re: [PATCH] eventdev: fix Rx adapter stalls on event device
>> backpressure
>>
>> On 2021-11-08 14:25, Mattias Rönnblom wrote:
>>> In the Eventdev Ethernet RX Adapter, correctly handle the case where
>>> the circular enqueue buffer head and tail index points to the same
>>> element (i.e., the buffer is full) and the buffer has wrapped.
>>>
>>> This bug may be triggered in case there is backpressure from the event
>>> device to the RX adapter.
>>>
>>> Fixes: 8113fd15e229 ("eventdev/eth_rx: make enqueue buffer circular")
>>> Cc: ganapati.kundapura@intel.com
>>> Cc: stable@dpdk.org
>>
>> Disregard the stable cc. This bug does not appear in any released DPDK
>> version (e.g., 21.08).
>>
>>
>>> Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
>>> ---
>>>    lib/eventdev/rte_event_eth_rx_adapter.c | 22 ++++++++++++++--------
>>>    1 file changed, 14 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/lib/eventdev/rte_event_eth_rx_adapter.c
>> b/lib/eventdev/rte_event_eth_rx_adapter.c
>>> index 56318b5a6f..809416d9b7 100644
>>> --- a/lib/eventdev/rte_event_eth_rx_adapter.c
>>> +++ b/lib/eventdev/rte_event_eth_rx_adapter.c
>>> @@ -777,19 +777,25 @@ rxa_flush_event_buffer(struct
>> event_eth_rx_adapter *rx_adapter,
>>>    		       struct eth_event_enqueue_buffer *buf,
>>>    		       struct rte_event_eth_rx_adapter_stats *stats)
>>>    {
>>> -	uint16_t count = buf->last ? buf->last - buf->head : buf->count;
>>> +	uint16_t count = buf->count;
>>> +	uint16_t n = 0;
>>>
>>>    	if (!count)
>>>    		return 0;
>>>
>>> -	uint16_t n = rte_event_enqueue_new_burst(rx_adapter-
>>> eventdev_id,
>>> -					rx_adapter->event_port_id,
>>> -					&buf->events[buf->head],
>>> -					count);
>>> -	if (n != count)
>>> -		stats->rx_enq_retry++;
>>> +	if (buf->last)
>>> +		count = buf->last - buf->head;
>>> +
>>> +	if (count) {
>>> +		n = rte_event_enqueue_new_burst(rx_adapter-
>>> eventdev_id,
>>> +						rx_adapter->event_port_id,
>>> +						&buf->events[buf->head],
>>> +						count);
>>> +		if (n != count)
>>> +			stats->rx_enq_retry++;
>>>
>>> -	buf->head += n;
>>> +		buf->head += n;
>>> +	}
>>>
>>>    	if (buf->last && n == count) {
>>>    		uint16_t n1;
> When head = tail, count is the number of events in the event buffer i.e count = buf->count and last = 0
> Last is the marker used in case of roll over.
> In case of tail roll over and head is not, events are processed from head to last and zero to tail.
> Looks like change is same as the original.
> Could you please clarify more on this change and also clarify if you were able to reproduce the backpressure issue?


The enqueue buffer state I encountered was last != 0 and head == tail, 
and size != 0. In that case, the function returns early, since count == 
0, even though there are events stored from 0 to tail. head, tail, last 
and size were all 192, from what I remember.


For reasons I didn't analyze, it only seems to occur when the event 
port's enqueue burst size is larger than 32 (the RX burst used against 
the RX adapter's ethdev queues), and there is backpressure from the 
event device.
  
Ganapati Kundapura Nov. 9, 2021, 11:09 a.m. UTC | #4
> -----Original Message-----
> From: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
> Sent: 09 November 2021 13:58
> To: Kundapura, Ganapati <ganapati.kundapura@intel.com>;
> jerinj@marvell.com; Jayatheerthan, Jay <jay.jayatheerthan@intel.com>
> Cc: dev@dpdk.org; stable@dpdk.org
> Subject: Re: [PATCH] eventdev: fix Rx adapter stalls on event device
> backpressure
> 
> On 2021-11-09 07:26, Kundapura, Ganapati wrote:
> > Hi Mattias,
> >
> >> -----Original Message-----
> >> From: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
> >> Sent: 08 November 2021 19:14
> >> To: jerinj@marvell.com; Jayatheerthan, Jay
> >> <jay.jayatheerthan@intel.com>
> >> Cc: dev@dpdk.org; Kundapura, Ganapati
> <ganapati.kundapura@intel.com>;
> >> stable@dpdk.org
> >> Subject: Re: [PATCH] eventdev: fix Rx adapter stalls on event device
> >> backpressure
> >>
> >> On 2021-11-08 14:25, Mattias Rönnblom wrote:
> >>> In the Eventdev Ethernet RX Adapter, correctly handle the case where
> >>> the circular enqueue buffer head and tail index points to the same
> >>> element (i.e., the buffer is full) and the buffer has wrapped.
> >>>
> >>> This bug may be triggered in case there is backpressure from the
> >>> event device to the RX adapter.
> >>>
> >>> Fixes: 8113fd15e229 ("eventdev/eth_rx: make enqueue buffer
> >>> circular")
> >>> Cc: ganapati.kundapura@intel.com
> >>> Cc: stable@dpdk.org
> >>
> >> Disregard the stable cc. This bug does not appear in any released
> >> DPDK version (e.g., 21.08).
> >>
> >>
> >>> Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
> >>> ---
> >>>    lib/eventdev/rte_event_eth_rx_adapter.c | 22 ++++++++++++++------
> --
> >>>    1 file changed, 14 insertions(+), 8 deletions(-)
> >>>
> >>> diff --git a/lib/eventdev/rte_event_eth_rx_adapter.c
> >> b/lib/eventdev/rte_event_eth_rx_adapter.c
> >>> index 56318b5a6f..809416d9b7 100644
> >>> --- a/lib/eventdev/rte_event_eth_rx_adapter.c
> >>> +++ b/lib/eventdev/rte_event_eth_rx_adapter.c
> >>> @@ -777,19 +777,25 @@ rxa_flush_event_buffer(struct
> >> event_eth_rx_adapter *rx_adapter,
> >>>    		       struct eth_event_enqueue_buffer *buf,
> >>>    		       struct rte_event_eth_rx_adapter_stats *stats)
> >>>    {
> >>> -	uint16_t count = buf->last ? buf->last - buf->head : buf->count;
> >>> +	uint16_t count = buf->count;
> >>> +	uint16_t n = 0;
> >>>
> >>>    	if (!count)
> >>>    		return 0;
> >>>
> >>> -	uint16_t n = rte_event_enqueue_new_burst(rx_adapter-
> >>> eventdev_id,
> >>> -					rx_adapter->event_port_id,
> >>> -					&buf->events[buf->head],
> >>> -					count);
> >>> -	if (n != count)
> >>> -		stats->rx_enq_retry++;
> >>> +	if (buf->last)
> >>> +		count = buf->last - buf->head;
> >>> +
> >>> +	if (count) {
> >>> +		n = rte_event_enqueue_new_burst(rx_adapter-
> >>> eventdev_id,
> >>> +						rx_adapter->event_port_id,
> >>> +						&buf->events[buf->head],
> >>> +						count);
> >>> +		if (n != count)
> >>> +			stats->rx_enq_retry++;
> >>>
> >>> -	buf->head += n;
> >>> +		buf->head += n;
> >>> +	}
> >>>
> >>>    	if (buf->last && n == count) {
> >>>    		uint16_t n1;
> > When head = tail, count is the number of events in the event buffer
> > i.e count = buf->count and last = 0 Last is the marker used in case of roll
> over.
> > In case of tail roll over and head is not, events are processed from head to
> last and zero to tail.
> > Looks like change is same as the original.
> > Could you please clarify more on this change and also clarify if you were
> able to reproduce the backpressure issue?
> 
> 
> The enqueue buffer state I encountered was last != 0 and head == tail, and
> size != 0. In that case, the function returns early, since count == 0, even
> though there are events stored from 0 to tail. head, tail, last and size were all
> 192, from what I remember.
>
head == tail holds only for an empty buffer (no events enqueued), in which case count is zero.
When the event buffer is full and the tail has rolled over, the tail will be one batch size behind the head,
so head cannot be equal to tail.

Please provide details on how to reproduce the issue.
 
> 
> For reasons I didn't analyze, it only seem to occur when the event port's
> enqueue burst size was larger than 32 (the RX burst used against the RX
> adapter's ethdev queues), and there was backpressure from the event
> device.
> 
A burst size larger than 32 implies that BATCH_SIZE was modified.
Please provide details of all the changes made to hit the issue.
  
Mattias Rönnblom Nov. 9, 2021, 11:43 a.m. UTC | #5
On 2021-11-09 12:09, Kundapura, Ganapati wrote:
>
>> -----Original Message-----
>> From: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
>> Sent: 09 November 2021 13:58
>> To: Kundapura, Ganapati <ganapati.kundapura@intel.com>;
>> jerinj@marvell.com; Jayatheerthan, Jay <jay.jayatheerthan@intel.com>
>> Cc: dev@dpdk.org; stable@dpdk.org
>> Subject: Re: [PATCH] eventdev: fix Rx adapter stalls on event device
>> backpressure
>>
>> On 2021-11-09 07:26, Kundapura, Ganapati wrote:
>>> Hi Mattias,
>>>
>>>> -----Original Message-----
>>>> From: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
>>>> Sent: 08 November 2021 19:14
>>>> To: jerinj@marvell.com; Jayatheerthan, Jay
>>>> <jay.jayatheerthan@intel.com>
>>>> Cc: dev@dpdk.org; Kundapura, Ganapati
>> <ganapati.kundapura@intel.com>;
>>>> stable@dpdk.org
>>>> Subject: Re: [PATCH] eventdev: fix Rx adapter stalls on event device
>>>> backpressure
>>>>
>>>> On 2021-11-08 14:25, Mattias Rönnblom wrote:
>>>>> In the Eventdev Ethernet RX Adapter, correctly handle the case where
>>>>> the circular enqueue buffer head and tail index points to the same
>>>>> element (i.e., the buffer is full) and the buffer has wrapped.
>>>>>
>>>>> This bug may be triggered in case there is backpressure from the
>>>>> event device to the RX adapter.
>>>>>
>>>>> Fixes: 8113fd15e229 ("eventdev/eth_rx: make enqueue buffer
>>>>> circular")
>>>>> Cc: ganapati.kundapura@intel.com
>>>>> Cc: stable@dpdk.org
>>>> Disregard the stable cc. This bug does not appear in any released
>>>> DPDK version (e.g., 21.08).
>>>>
>>>>
>>>>> Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
>>>>> ---
>>>>>     lib/eventdev/rte_event_eth_rx_adapter.c | 22 ++++++++++++++------
>> --
>>>>>     1 file changed, 14 insertions(+), 8 deletions(-)
>>>>>
>>>>> diff --git a/lib/eventdev/rte_event_eth_rx_adapter.c
>>>> b/lib/eventdev/rte_event_eth_rx_adapter.c
>>>>> index 56318b5a6f..809416d9b7 100644
>>>>> --- a/lib/eventdev/rte_event_eth_rx_adapter.c
>>>>> +++ b/lib/eventdev/rte_event_eth_rx_adapter.c
>>>>> @@ -777,19 +777,25 @@ rxa_flush_event_buffer(struct
>>>> event_eth_rx_adapter *rx_adapter,
>>>>>     		       struct eth_event_enqueue_buffer *buf,
>>>>>     		       struct rte_event_eth_rx_adapter_stats *stats)
>>>>>     {
>>>>> -	uint16_t count = buf->last ? buf->last - buf->head : buf->count;
>>>>> +	uint16_t count = buf->count;
>>>>> +	uint16_t n = 0;
>>>>>
>>>>>     	if (!count)
>>>>>     		return 0;
>>>>>
>>>>> -	uint16_t n = rte_event_enqueue_new_burst(rx_adapter-
>>>>> eventdev_id,
>>>>> -					rx_adapter->event_port_id,
>>>>> -					&buf->events[buf->head],
>>>>> -					count);
>>>>> -	if (n != count)
>>>>> -		stats->rx_enq_retry++;
>>>>> +	if (buf->last)
>>>>> +		count = buf->last - buf->head;
>>>>> +
>>>>> +	if (count) {
>>>>> +		n = rte_event_enqueue_new_burst(rx_adapter-
>>>>> eventdev_id,
>>>>> +						rx_adapter->event_port_id,
>>>>> +						&buf->events[buf->head],
>>>>> +						count);
>>>>> +		if (n != count)
>>>>> +			stats->rx_enq_retry++;
>>>>>
>>>>> -	buf->head += n;
>>>>> +		buf->head += n;
>>>>> +	}
>>>>>
>>>>>     	if (buf->last && n == count) {
>>>>>     		uint16_t n1;
>>> When head = tail, count is the number of events in the event buffer
>>> i.e count = buf->count and last = 0 Last is the marker used in case of roll
>> over.
>>> In case of tail roll over and head is not, events are processed from head to
>> last and zero to tail.
>>> Looks like change is same as the original.
>>> Could you please clarify more on this change and also clarify if you were
>> able to reproduce the backpressure issue?
>>
>>
>> The enqueue buffer state I encountered was last != 0 and head == tail, and
>> size != 0. In that case, the function returns early, since count == 0, even
>> though there are events stored from 0 to tail. head, tail, last and size were all
>> 192, from what I remember.
>>
> head == tail only in case of empty buffer(No events enqueued) and count will be zero.
> In case of event buffer is full and tail is rolled over, tail will be behind one batch size from head,
> head cannot be equal to tail.


Why do you have a count field if it's not to break an ambiguity about 
the state of the buffer? Anyway. This function is from the RX adapter:


static inline bool
rxa_pkt_buf_available(struct eth_event_enqueue_buffer *buf)
{
         uint32_t nb_req = buf->tail + BATCH_SIZE;

         if (!buf->last) {
                 if (nb_req <= buf->events_size)
                         return true;

                 if (buf->head >= BATCH_SIZE) {
                         buf->last_mask = ~0;
                         buf->last = buf->tail;
                         buf->tail = 0;
                         return true;
                 }
         }

         return nb_req <= buf->head;
}


nb_req will potentially be the new tail, yes? So if you want to avoid 
head == tail, then the comparison should be "<" not "<=". Although I 
don't see what's the problem with head == tail, except the bug in the 
buffer flush function.


>
> Please provide details on issue repro.
>   


I was using DSW + RX adapter from DPDK v21.11rc1 when I encountered this 
issue.


>> For reasons I didn't analyze, it only seem to occur when the event port's
>> enqueue burst size was larger than 32 (the RX burst used against the RX
>> adapter's ethdev queues), and there was backpressure from the event
>> device.
>>
> Burst size larger than 32 implies, BATCH_SIZE is modified.
> Please provide details what all the changes done for hitting the issue.


The burst size against the event device can be much larger than 32. The 
RX adapter doesn't honor any particular limit there (except the circular 
buffer size), but the burst in practice will be no longer than the 
configured eventdev or eventdev port enqueue burst limit.


I didn't change anything to reproduce this issue.


The same setup works with v21.08, so it does seem like a regression.
  
Ganapati Kundapura Nov. 10, 2021, 8:26 a.m. UTC | #6
Hi Mattias,

> -----Original Message-----
> From: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
> Sent: 09 November 2021 13:58
> To: Kundapura, Ganapati <ganapati.kundapura@intel.com>;
> jerinj@marvell.com; Jayatheerthan, Jay <jay.jayatheerthan@intel.com>
> Cc: dev@dpdk.org; stable@dpdk.org
> Subject: Re: [PATCH] eventdev: fix Rx adapter stalls on event device
> backpressure
> 
> On 2021-11-09 07:26, Kundapura, Ganapati wrote:
> > Hi Mattias,
> >
> >> -----Original Message-----
> >> From: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
> >> Sent: 08 November 2021 19:14
> >> To: jerinj@marvell.com; Jayatheerthan, Jay
> >> <jay.jayatheerthan@intel.com>
> >> Cc: dev@dpdk.org; Kundapura, Ganapati
> <ganapati.kundapura@intel.com>;
> >> stable@dpdk.org
> >> Subject: Re: [PATCH] eventdev: fix Rx adapter stalls on event device
> >> backpressure
> >>
> >> On 2021-11-08 14:25, Mattias Rönnblom wrote:
> >>> In the Eventdev Ethernet RX Adapter, correctly handle the case where
> >>> the circular enqueue buffer head and tail index points to the same
> >>> element (i.e., the buffer is full) and the buffer has wrapped.
> >>>
> >>> This bug may be triggered in case there is backpressure from the
> >>> event device to the RX adapter.
> >>>
> >>> Fixes: 8113fd15e229 ("eventdev/eth_rx: make enqueue buffer
> >>> circular")
> >>> Cc: ganapati.kundapura@intel.com
> >>> Cc: stable@dpdk.org
> >>
> >> Disregard the stable cc. This bug does not appear in any released
> >> DPDK version (e.g., 21.08).
> >>
> >>
> >>> Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
> >>> ---
> >>>    lib/eventdev/rte_event_eth_rx_adapter.c | 22 ++++++++++++++------
> --
> >>>    1 file changed, 14 insertions(+), 8 deletions(-)
> >>>
> >>> diff --git a/lib/eventdev/rte_event_eth_rx_adapter.c
> >> b/lib/eventdev/rte_event_eth_rx_adapter.c
> >>> index 56318b5a6f..809416d9b7 100644
> >>> --- a/lib/eventdev/rte_event_eth_rx_adapter.c
> >>> +++ b/lib/eventdev/rte_event_eth_rx_adapter.c
> >>> @@ -777,19 +777,25 @@ rxa_flush_event_buffer(struct
> >> event_eth_rx_adapter *rx_adapter,
> >>>    		       struct eth_event_enqueue_buffer *buf,
> >>>    		       struct rte_event_eth_rx_adapter_stats *stats)
> >>>    {
> >>> -	uint16_t count = buf->last ? buf->last - buf->head : buf->count;
> >>> +	uint16_t count = buf->count;
> >>> +	uint16_t n = 0;
> >>>
> >>>    	if (!count)
> >>>    		return 0;
> >>>
> >>> -	uint16_t n = rte_event_enqueue_new_burst(rx_adapter-
> >>> eventdev_id,
> >>> -					rx_adapter->event_port_id,
> >>> -					&buf->events[buf->head],
> >>> -					count);
> >>> -	if (n != count)
> >>> -		stats->rx_enq_retry++;
> >>> +	if (buf->last)
> >>> +		count = buf->last - buf->head;
> >>> +
> >>> +	if (count) {
> >>> +		n = rte_event_enqueue_new_burst(rx_adapter-
> >>> eventdev_id,
> >>> +						rx_adapter->event_port_id,
> >>> +						&buf->events[buf->head],
> >>> +						count);
> >>> +		if (n != count)
> >>> +			stats->rx_enq_retry++;
> >>>
> >>> -	buf->head += n;
> >>> +		buf->head += n;
> >>> +	}
> >>>
> >>>    	if (buf->last && n == count) {
> >>>    		uint16_t n1;
> > When head = tail, count is the number of events in the event buffer
> > i.e count = buf->count and last = 0 Last is the marker used in case of roll
> over.
> > In case of tail roll over and head is not, events are processed from head to
> last and zero to tail.
> > Looks like change is same as the original.
> > Could you please clarify more on this change and also clarify if you were
> able to reproduce the backpressure issue?
> 
> 
> The enqueue buffer state I encountered was last != 0 and head == tail, and
> size != 0. In that case, the function returns early, since count == 0, even
> though there are events stored from 0 to tail. head, tail, last and size were all
> 192, from what I remember.
> 
The issue seems to be that head catches up with last, not tail, when the tail has rolled over.
When head == last, count is zero and the function returns without processing the events from 0 to tail.
Approved. You can add my ack.

> 
> For reasons I didn't analyze, it only seem to occur when the event port's
> enqueue burst size was larger than 32 (the RX burst used against the RX
> adapter's ethdev queues), and there was backpressure from the event
> device.
>
  
Jayatheerthan, Jay Nov. 10, 2021, 10:08 a.m. UTC | #7
> -----Original Message-----
> From: Kundapura, Ganapati <ganapati.kundapura@intel.com>
> Sent: Wednesday, November 10, 2021 1:56 PM
> To: mattias.ronnblom <mattias.ronnblom@ericsson.com>; jerinj@marvell.com; Jayatheerthan, Jay <jay.jayatheerthan@intel.com>
> Cc: dev@dpdk.org; stable@dpdk.org
> Subject: RE: [PATCH] eventdev: fix Rx adapter stalls on event device backpressure
> 
> Hi Mattias,
> 
> > -----Original Message-----
> > From: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
> > Sent: 09 November 2021 13:58
> > To: Kundapura, Ganapati <ganapati.kundapura@intel.com>;
> > jerinj@marvell.com; Jayatheerthan, Jay <jay.jayatheerthan@intel.com>
> > Cc: dev@dpdk.org; stable@dpdk.org
> > Subject: Re: [PATCH] eventdev: fix Rx adapter stalls on event device
> > backpressure
> >
> > On 2021-11-09 07:26, Kundapura, Ganapati wrote:
> > > Hi Mattias,
> > >
> > >> -----Original Message-----
> > >> From: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
> > >> Sent: 08 November 2021 19:14
> > >> To: jerinj@marvell.com; Jayatheerthan, Jay
> > >> <jay.jayatheerthan@intel.com>
> > >> Cc: dev@dpdk.org; Kundapura, Ganapati
> > <ganapati.kundapura@intel.com>;
> > >> stable@dpdk.org
> > >> Subject: Re: [PATCH] eventdev: fix Rx adapter stalls on event device
> > >> backpressure
> > >>
> > >> On 2021-11-08 14:25, Mattias Rönnblom wrote:
> > >>> In the Eventdev Ethernet RX Adapter, correctly handle the case where
> > >>> the circular enqueue buffer head and tail index points to the same
> > >>> element (i.e., the buffer is full) and the buffer has wrapped.

Could you update the description to refer to last instead of tail? last > 0 doesn’t mean the buffer is full. It is just a marker for the event device enqueue when tail has rolled over and head hasn’t. The text below would work, if you want to use it.

"In the Eventdev Ethernet RX Adapter, correctly handle the case where
the circular enqueue buffer head and last indices point to the same
element."

> > >>>
> > >>> This bug may be triggered in case there is backpressure from the
> > >>> event device to the RX adapter.
> > >>>
> > >>> Fixes: 8113fd15e229 ("eventdev/eth_rx: make enqueue buffer
> > >>> circular")
> > >>> Cc: ganapati.kundapura@intel.com
> > >>> Cc: stable@dpdk.org
> > >>
> > >> Disregard the stable cc. This bug does not appear in any released
> > >> DPDK version (e.g., 21.08).
> > >>
> > >>
> > >>> Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
> > >>> ---
> > >>>    lib/eventdev/rte_event_eth_rx_adapter.c | 22 ++++++++++++++------
> > --
> > >>>    1 file changed, 14 insertions(+), 8 deletions(-)
> > >>>
> > >>> diff --git a/lib/eventdev/rte_event_eth_rx_adapter.c
> > >> b/lib/eventdev/rte_event_eth_rx_adapter.c
> > >>> index 56318b5a6f..809416d9b7 100644
> > >>> --- a/lib/eventdev/rte_event_eth_rx_adapter.c
> > >>> +++ b/lib/eventdev/rte_event_eth_rx_adapter.c
> > >>> @@ -777,19 +777,25 @@ rxa_flush_event_buffer(struct
> > >> event_eth_rx_adapter *rx_adapter,
> > >>>    		       struct eth_event_enqueue_buffer *buf,
> > >>>    		       struct rte_event_eth_rx_adapter_stats *stats)
> > >>>    {
> > >>> -	uint16_t count = buf->last ? buf->last - buf->head : buf->count;
> > >>> +	uint16_t count = buf->count;
> > >>> +	uint16_t n = 0;
> > >>>
> > >>>    	if (!count)
> > >>>    		return 0;
> > >>>
> > >>> -	uint16_t n = rte_event_enqueue_new_burst(rx_adapter-
> > >>> eventdev_id,
> > >>> -					rx_adapter->event_port_id,
> > >>> -					&buf->events[buf->head],
> > >>> -					count);
> > >>> -	if (n != count)
> > >>> -		stats->rx_enq_retry++;
> > >>> +	if (buf->last)
> > >>> +		count = buf->last - buf->head;
> > >>> +
> > >>> +	if (count) {
> > >>> +		n = rte_event_enqueue_new_burst(rx_adapter-
> > >>> eventdev_id,
> > >>> +						rx_adapter->event_port_id,
> > >>> +						&buf->events[buf->head],
> > >>> +						count);
> > >>> +		if (n != count)
> > >>> +			stats->rx_enq_retry++;
> > >>>
> > >>> -	buf->head += n;
> > >>> +		buf->head += n;
> > >>> +	}
> > >>>
> > >>>    	if (buf->last && n == count) {
> > >>>    		uint16_t n1;
> > > When head = tail, count is the number of events in the event buffer
> > > i.e count = buf->count and last = 0 Last is the marker used in case of roll
> > over.
> > > In case of tail roll over and head is not, events are processed from head to
> > last and zero to tail.
> > > Looks like change is same as the original.
> > > Could you please clarify more on this change and also clarify if you were
> > able to reproduce the backpressure issue?
> >
> >
> > The enqueue buffer state I encountered was last != 0 and head == tail, and
> > size != 0. In that case, the function returns early, since count == 0, even
> > though there are events stored from 0 to tail. head, tail, last and size were all
> > 192, from what I remember.
> >
> Issue seems to be head is catching up with last not tail when the tail is rolled over.
> When head = last, count is zero and returning without processing the events from 0 to tail.
> Approved. You can add my ack
> 
> >
> > For reasons I didn't analyze, it only seem to occur when the event port's
> > enqueue burst size was larger than 32 (the RX burst used against the RX
> > adapter's ethdev queues), and there was backpressure from the event
> > device.
> >
  
Mattias Rönnblom Nov. 10, 2021, 11:11 a.m. UTC | #8
On 2021-11-10 11:08, Jayatheerthan, Jay wrote:
>> -----Original Message-----
>> From: Kundapura, Ganapati <ganapati.kundapura@intel.com>
>> Sent: Wednesday, November 10, 2021 1:56 PM
>> To: mattias.ronnblom <mattias.ronnblom@ericsson.com>; jerinj@marvell.com; Jayatheerthan, Jay <jay.jayatheerthan@intel.com>
>> Cc: dev@dpdk.org; stable@dpdk.org
>> Subject: RE: [PATCH] eventdev: fix Rx adapter stalls on event device backpressure
>>
>> Hi Mattias,
>>
>>> -----Original Message-----
>>> From: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
>>> Sent: 09 November 2021 13:58
>>> To: Kundapura, Ganapati <ganapati.kundapura@intel.com>;
>>> jerinj@marvell.com; Jayatheerthan, Jay <jay.jayatheerthan@intel.com>
>>> Cc: dev@dpdk.org; stable@dpdk.org
>>> Subject: Re: [PATCH] eventdev: fix Rx adapter stalls on event device
>>> backpressure
>>>
>>> On 2021-11-09 07:26, Kundapura, Ganapati wrote:
>>>> Hi Mattias,
>>>>
>>>>> -----Original Message-----
>>>>> From: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
>>>>> Sent: 08 November 2021 19:14
>>>>> To: jerinj@marvell.com; Jayatheerthan, Jay
>>>>> <jay.jayatheerthan@intel.com>
>>>>> Cc: dev@dpdk.org; Kundapura, Ganapati
>>> <ganapati.kundapura@intel.com>;
>>>>> stable@dpdk.org
>>>>> Subject: Re: [PATCH] eventdev: fix Rx adapter stalls on event device
>>>>> backpressure
>>>>>
>>>>> On 2021-11-08 14:25, Mattias Rönnblom wrote:
>>>>>> In the Eventdev Ethernet RX Adapter, correctly handle the case where
>>>>>> the circular enqueue buffer head and tail index points to the same
>>>>>> element (i.e., the buffer is full) and the buffer has wrapped.
> Could you update the description to reflect last instead of tail ? last > 0 doesn’t mean the buffer is full. It is just a marker for event device enqueue when tail has rolled over and head hasn’t. Below would work, if you want to use it.
>
> "In the Eventdev Ethernet RX Adapter, correctly handle the case where
> the circular enqueue buffer head and last index points to the same
> element."


Good point. I'll send a v2. Thanks.


>>>>>> This bug may be triggered in case there is backpressure from the
>>>>>> event device to the RX adapter.
>>>>>>
>>>>>> Fixes: 8113fd15e229 ("eventdev/eth_rx: make enqueue buffer
>>>>>> circular")
>>>>>> Cc: ganapati.kundapura@intel.com
>>>>>> Cc: stable@dpdk.org
>>>>> Disregard the stable cc. This bug does not appear in any released
>>>>> DPDK version (e.g., 21.08).
>>>>>
>>>>>
>>>>>> Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
>>>>>> ---
>>>>>>     lib/eventdev/rte_event_eth_rx_adapter.c | 22 ++++++++++++++--------
>>>>>>     1 file changed, 14 insertions(+), 8 deletions(-)
>>>>>>
>>>>>> diff --git a/lib/eventdev/rte_event_eth_rx_adapter.c
>>>>> b/lib/eventdev/rte_event_eth_rx_adapter.c
>>>>>> index 56318b5a6f..809416d9b7 100644
>>>>>> --- a/lib/eventdev/rte_event_eth_rx_adapter.c
>>>>>> +++ b/lib/eventdev/rte_event_eth_rx_adapter.c
>>>>>> @@ -777,19 +777,25 @@ rxa_flush_event_buffer(struct
>>>>> event_eth_rx_adapter *rx_adapter,
>>>>>>     		       struct eth_event_enqueue_buffer *buf,
>>>>>>     		       struct rte_event_eth_rx_adapter_stats *stats)
>>>>>>     {
>>>>>> -	uint16_t count = buf->last ? buf->last - buf->head : buf->count;
>>>>>> +	uint16_t count = buf->count;
>>>>>> +	uint16_t n = 0;
>>>>>>
>>>>>>     	if (!count)
>>>>>>     		return 0;
>>>>>>
>>>>>> -	uint16_t n = rte_event_enqueue_new_burst(rx_adapter->eventdev_id,
>>>>>> -					rx_adapter->event_port_id,
>>>>>> -					&buf->events[buf->head],
>>>>>> -					count);
>>>>>> -	if (n != count)
>>>>>> -		stats->rx_enq_retry++;
>>>>>> +	if (buf->last)
>>>>>> +		count = buf->last - buf->head;
>>>>>> +
>>>>>> +	if (count) {
>>>>>> +		n = rte_event_enqueue_new_burst(rx_adapter->eventdev_id,
>>>>>> +						rx_adapter->event_port_id,
>>>>>> +						&buf->events[buf->head],
>>>>>> +						count);
>>>>>> +		if (n != count)
>>>>>> +			stats->rx_enq_retry++;
>>>>>>
>>>>>> -	buf->head += n;
>>>>>> +		buf->head += n;
>>>>>> +	}
>>>>>>
>>>>>>     	if (buf->last && n == count) {
>>>>>>     		uint16_t n1;
>>>> When head = tail, count is the number of events in the event buffer,
>>>> i.e. count = buf->count and last = 0. last is the marker used in case of
>>>> rollover. When tail has rolled over and head has not, events are
>>>> processed from head to last and then from zero to tail.
>>>> The change looks the same as the original.
>>>> Could you please clarify this change, and also whether you were
>>>> able to reproduce the backpressure issue?
>>>
>>>
>>> The enqueue buffer state I encountered was last != 0 and head == tail, and
>>> size != 0. In that case, the function returns early, since count == 0, even
>>> though there are events stored from 0 to tail. head, tail, last and size were all
>>> 192, from what I remember.
>>>
>> The issue seems to be that head catches up with last, not tail, once tail has rolled over.
>> When head = last, count is zero and the function returns without processing the events from 0 to tail.
>> Approved. You can add my ack
>>
>>> For reasons I didn't analyze, it only seems to occur when the event port's
>>> enqueue burst size was larger than 32 (the RX burst used against the RX
>>> adapter's ethdev queues), and there was backpressure from the event
>>> device.
>>>
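
The stall described in this thread can be reproduced with a small stand-alone model. The sketch below is not the DPDK code: struct model_buf, fake_enqueue() and drain_wrapped() are hypothetical names introduced for illustration, fake_enqueue() stands in for rte_event_enqueue_new_burst(), and the wrapped-segment handling is an assumption based on the description above (events are processed from head to last, then from zero to tail).

```c
#include <assert.h>
#include <stdint.h>

/* Simplified model of the enqueue buffer indices (not the DPDK struct). */
struct model_buf {
	uint16_t head;  /* next event to hand to the event device */
	uint16_t tail;  /* next free slot for received events */
	uint16_t last;  /* non-zero once tail has rolled over */
	uint16_t count; /* total number of events stored */
};

/* Hypothetical stand-in for rte_event_enqueue_new_burst():
 * the device accepts at most 'room' events.
 */
static uint16_t fake_enqueue(uint16_t nb, uint16_t room)
{
	return nb < room ? nb : room;
}

/* Assumed second half of the flush: drain [0, tail) once
 * [head, last) has been fully enqueued.
 */
static uint16_t drain_wrapped(struct model_buf *b, uint16_t n, uint16_t room)
{
	uint16_t n1 = fake_enqueue(b->tail, (uint16_t)(room - n));

	b->last = 0;
	b->head = n1;
	return (uint16_t)(n + n1);
}

/* Pre-patch logic: count is derived from last/head before the early return. */
static uint16_t flush_old(struct model_buf *b, uint16_t room)
{
	uint16_t count, n;

	assert(!b->last || b->head <= b->last);
	count = b->last ? (uint16_t)(b->last - b->head) : b->count;
	if (!count)
		return 0; /* taken when head == last, even if count != 0 */

	n = fake_enqueue(count, room);
	b->head += n;
	if (b->last && n == count)
		n = drain_wrapped(b, n, room);
	b->count -= n;
	return n;
}

/* Patched logic: only an empty buffer (count == 0) returns early. */
static uint16_t flush_new(struct model_buf *b, uint16_t room)
{
	uint16_t count = b->count;
	uint16_t n = 0;

	assert(!b->last || b->head <= b->last);
	if (!count)
		return 0;

	if (b->last)
		count = (uint16_t)(b->last - b->head);
	if (count) {
		n = fake_enqueue(count, room);
		b->head += n;
	}
	if (b->last && n == count)
		n = drain_wrapped(b, n, room);
	b->count -= n;
	return n;
}
```

With head, tail, last and count all set to 192 (the state reported above), flush_old() returns 0 on every call and leaves the 192 buffered events untouched, while flush_new() skips the empty head-to-last segment and drains the events stored from 0 to tail.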
  

Patch

diff --git a/lib/eventdev/rte_event_eth_rx_adapter.c b/lib/eventdev/rte_event_eth_rx_adapter.c
index 56318b5a6f..809416d9b7 100644
--- a/lib/eventdev/rte_event_eth_rx_adapter.c
+++ b/lib/eventdev/rte_event_eth_rx_adapter.c
@@ -777,19 +777,25 @@  rxa_flush_event_buffer(struct event_eth_rx_adapter *rx_adapter,
 		       struct eth_event_enqueue_buffer *buf,
 		       struct rte_event_eth_rx_adapter_stats *stats)
 {
-	uint16_t count = buf->last ? buf->last - buf->head : buf->count;
+	uint16_t count = buf->count;
+	uint16_t n = 0;
 
 	if (!count)
 		return 0;
 
-	uint16_t n = rte_event_enqueue_new_burst(rx_adapter->eventdev_id,
-					rx_adapter->event_port_id,
-					&buf->events[buf->head],
-					count);
-	if (n != count)
-		stats->rx_enq_retry++;
+	if (buf->last)
+		count = buf->last - buf->head;
+
+	if (count) {
+		n = rte_event_enqueue_new_burst(rx_adapter->eventdev_id,
+						rx_adapter->event_port_id,
+						&buf->events[buf->head],
+						count);
+		if (n != count)
+			stats->rx_enq_retry++;
 
-	buf->head += n;
+		buf->head += n;
+	}
 
 	if (buf->last && n == count) {
 		uint16_t n1;