net/mana: fix wrong indexing on CQE error when coalescing is used

Message ID 1688689077-5557-1-git-send-email-longli@linuxonhyperv.com (mailing list archive)
State Accepted, archived
Delegated to: Ferruh Yigit
Headers
Series net/mana: fix wrong indexing on CQE error when coalescing is used |

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/loongarch-compilation success Compilation OK
ci/loongarch-unit-testing success Unit Testing PASS
ci/iol-mellanox-Performance success Performance Testing PASS
ci/iol-aarch-unit-testing success Testing PASS
ci/iol-abi-testing success Testing PASS
ci/iol-unit-testing success Testing PASS
ci/github-robot: build success github build: passed
ci/Intel-compilation success Compilation OK
ci/intel-Testing success Testing PASS
ci/intel-Functional success Functional PASS
ci/iol-x86_64-compile-testing success Testing PASS
ci/iol-intel-Performance success Performance Testing PASS
ci/iol-broadcom-Performance success Performance Testing PASS
ci/iol-testing success Testing PASS
ci/iol-x86_64-unit-testing success Testing PASS
ci/iol-aarch64-compile-testing success Testing PASS
ci/iol-broadcom-Functional success Functional Testing PASS
ci/iol-intel-Functional success Functional Testing PASS

Commit Message

Long Li July 7, 2023, 12:17 a.m. UTC
  From: Long Li <longli@microsoft.com>

On a fatal CQE error when coalescing is used, update the correct index
and allow proceeding to the next CQE.

Fixes: 3409e0f172f6 ("net/mana: implement Rx CQE coalescing")
Signed-off-by: Long Li <longli@microsoft.com>
---
 drivers/net/mana/rx.c | 18 ++++++++----------
 1 file changed, 8 insertions(+), 10 deletions(-)
  

Comments

Ferruh Yigit July 7, 2023, 9:01 a.m. UTC | #1
On 7/7/2023 1:17 AM, longli@linuxonhyperv.com wrote:
> From: Long Li <longli@microsoft.com>
> 
> On a fatal CQE error when coalescing is used, update the correct index
> and allow proceeding to the next CQE.
> 
> Fixes: 3409e0f172f6 ("net/mana: implement Rx CQE coalescing")
>

Is above fixes commit correct?
Logic for 'CQE_RX_COALESCED_4' is not changed with this commit, in above
commit and in this commit both does breaks the loop.

This commit changes logic for 'CQE_RX_TRUNCATED' and 'default' cases,
which are added with different commits, not the one in fixes line.

"fatal CQE error when coalescing" mentioned in the commit log, to which
switch case does this corresponds to?


> Signed-off-by: Long Li <longli@microsoft.com>
> ---
>  drivers/net/mana/rx.c | 18 ++++++++----------
>  1 file changed, 8 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/net/mana/rx.c b/drivers/net/mana/rx.c
> index cacfd9ae1b..220b372b15 100644
> --- a/drivers/net/mana/rx.c
> +++ b/drivers/net/mana/rx.c
> @@ -416,23 +416,21 @@ mana_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
>  
>  		switch (oob->cqe_hdr.cqe_type) {
>  		case CQE_RX_OKAY:
> +		case CQE_RX_COALESCED_4:
>  			/* Proceed to process mbuf */
>  			break;
>  
>  		case CQE_RX_TRUNCATED:
> -			DP_LOG(DEBUG, "Drop a truncated packet");
> +		default:
> +			DP_LOG(ERR, "RX CQE type %d client %d vendor %d",
> +			       oob->cqe_hdr.cqe_type, oob->cqe_hdr.client_type,
> +			       oob->cqe_hdr.vendor_err);
> +
>  			rxq->stats.errors++;
>  			rte_pktmbuf_free(mbuf);
> -			goto drop;
> -
> -		case CQE_RX_COALESCED_4:
> -			/* Proceed to process mbuf */
> -			break;
>  
> -		default:
> -			DP_LOG(ERR, "Unknown RX CQE type %d",
> -			       oob->cqe_hdr.cqe_type);
> -			continue;
> +			i++;
> +			goto drop;
>  		}
>  
>  		DP_LOG(DEBUG, "mana_rx_comp_oob type %d rxq %p",
  
Long Li July 7, 2023, 6:01 p.m. UTC | #2
> Subject: Re: [PATCH] net/mana: fix wrong indexing on CQE error when coalescing
> is used
> 
> On 7/7/2023 1:17 AM, longli@linuxonhyperv.com wrote:
> > From: Long Li <longli@microsoft.com>
> >
> > On a fatal CQE error when coalescing is used, update the correct index
> > and allow proceeding to the next CQE.
> >
> > Fixes: 3409e0f172f6 ("net/mana: implement Rx CQE coalescing")
> >
> 
> Is above fixes commit correct?
> Logic for 'CQE_RX_COALESCED_4' is not changed with this commit, in above
> commit and in this commit both does breaks the loop.

Yes, the "Fixes" tag is correct. Here CQE_RX_COALESCED_4 is rearranged to make it easier to read, but it doesn't change the behavior.

> 
> This commit changes logic for 'CQE_RX_TRUNCATED' and 'default' cases, which
> are added with different commits, not the one in fixes line.
> 
> "fatal CQE error when coalescing" mentioned in the commit log, to which switch
> case does this corresponds to?

The previous patch (3409e0f172f6 ) introduced variable "i", an index to completion CQEs. But both 'default' and 'CQE_RX_TRUNCATED' cases don't advance "i", hence not advance to next CQE on error.

The fatal CQE error means the "default" case. On 'CQE_RX_TRUNCATED' the code can recover when all CQEs are read. But on "default", it's a dead loop.

> 
> 
> > Signed-off-by: Long Li <longli@microsoft.com>
> > ---
> >  drivers/net/mana/rx.c | 18 ++++++++----------
> >  1 file changed, 8 insertions(+), 10 deletions(-)
> >
> > diff --git a/drivers/net/mana/rx.c b/drivers/net/mana/rx.c index
> > cacfd9ae1b..220b372b15 100644
> > --- a/drivers/net/mana/rx.c
> > +++ b/drivers/net/mana/rx.c
> > @@ -416,23 +416,21 @@ mana_rx_burst(void *dpdk_rxq, struct rte_mbuf
> > **pkts, uint16_t pkts_n)
> >
> >  		switch (oob->cqe_hdr.cqe_type) {
> >  		case CQE_RX_OKAY:
> > +		case CQE_RX_COALESCED_4:
> >  			/* Proceed to process mbuf */
> >  			break;
> >
> >  		case CQE_RX_TRUNCATED:
> > -			DP_LOG(DEBUG, "Drop a truncated packet");
> > +		default:
> > +			DP_LOG(ERR, "RX CQE type %d client %d vendor %d",
> > +			       oob->cqe_hdr.cqe_type, oob->cqe_hdr.client_type,
> > +			       oob->cqe_hdr.vendor_err);
> > +
> >  			rxq->stats.errors++;
> >  			rte_pktmbuf_free(mbuf);
> > -			goto drop;
> > -
> > -		case CQE_RX_COALESCED_4:
> > -			/* Proceed to process mbuf */
> > -			break;
> >
> > -		default:
> > -			DP_LOG(ERR, "Unknown RX CQE type %d",
> > -			       oob->cqe_hdr.cqe_type);
> > -			continue;
> > +			i++;
> > +			goto drop;
> >  		}
> >
> >  		DP_LOG(DEBUG, "mana_rx_comp_oob type %d rxq %p",
  
Ferruh Yigit July 10, 2023, 10:42 a.m. UTC | #3
On 7/7/2023 7:01 PM, Long Li wrote:
>> Subject: Re: [PATCH] net/mana: fix wrong indexing on CQE error when coalescing
>> is used
>>
>> On 7/7/2023 1:17 AM, longli@linuxonhyperv.com wrote:
>>> From: Long Li <longli@microsoft.com>
>>>
>>> On a fatal CQE error when coalescing is used, update the correct index
>>> and allow proceeding to the next CQE.
>>>
>>> Fixes: 3409e0f172f6 ("net/mana: implement Rx CQE coalescing")
>>>
>>
>> Is above fixes commit correct?
>> Logic for 'CQE_RX_COALESCED_4' is not changed with this commit, in above
>> commit and in this commit both does breaks the loop.
> 
> Yes, the "Fixes" tag is correct. Here CQE_RX_COALESCED_4 is rearranged to make it easier to read, but it doesn't change the behavior.
> 
>>
>> This commit changes logic for 'CQE_RX_TRUNCATED' and 'default' cases, which
>> are added with different commits, not the one in fixes line.
>>
>> "fatal CQE error when coalescing" mentioned in the commit log, to which switch
>> case does this corresponds to?
> 
> The previous patch (3409e0f172f6 ) introduced variable "i", an index to completion CQEs. But both 'default' and 'CQE_RX_TRUNCATED' cases don't advance "i", hence not advance to next CQE on error.
> 

Right, thanks for clarification, I was checking from CQE_RX_COALESCED_4
perspective.

> The fatal CQE error means the "default" case. On 'CQE_RX_TRUNCATED' the code can recover when all CQEs are read. But on "default", it's a dead loop.
> 

ack.

>>
>>
>>> Signed-off-by: Long Li <longli@microsoft.com>
>>> ---
>>>  drivers/net/mana/rx.c | 18 ++++++++----------
>>>  1 file changed, 8 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/drivers/net/mana/rx.c b/drivers/net/mana/rx.c index
>>> cacfd9ae1b..220b372b15 100644
>>> --- a/drivers/net/mana/rx.c
>>> +++ b/drivers/net/mana/rx.c
>>> @@ -416,23 +416,21 @@ mana_rx_burst(void *dpdk_rxq, struct rte_mbuf
>>> **pkts, uint16_t pkts_n)
>>>
>>>  		switch (oob->cqe_hdr.cqe_type) {
>>>  		case CQE_RX_OKAY:
>>> +		case CQE_RX_COALESCED_4:
>>>  			/* Proceed to process mbuf */
>>>  			break;
>>>
>>>  		case CQE_RX_TRUNCATED:
>>> -			DP_LOG(DEBUG, "Drop a truncated packet");
>>> +		default:
>>> +			DP_LOG(ERR, "RX CQE type %d client %d vendor %d",
>>> +			       oob->cqe_hdr.cqe_type, oob->cqe_hdr.client_type,
>>> +			       oob->cqe_hdr.vendor_err);
>>> +
>>>  			rxq->stats.errors++;
>>>  			rte_pktmbuf_free(mbuf);
>>> -			goto drop;
>>> -
>>> -		case CQE_RX_COALESCED_4:
>>> -			/* Proceed to process mbuf */
>>> -			break;
>>>
>>> -		default:
>>> -			DP_LOG(ERR, "Unknown RX CQE type %d",
>>> -			       oob->cqe_hdr.cqe_type);
>>> -			continue;
>>> +			i++;
>>> +			goto drop;
>>>  		}
>>>
>>>  		DP_LOG(DEBUG, "mana_rx_comp_oob type %d rxq %p",
>
  
Ferruh Yigit July 10, 2023, 11 a.m. UTC | #4
On 7/7/2023 1:17 AM, longli@linuxonhyperv.com wrote:
> From: Long Li <longli@microsoft.com>
> 
> On a fatal CQE error when coalescing is used, update the correct index
> and allow proceeding to the next CQE.
> 
> Fixes: 3409e0f172f6 ("net/mana: implement Rx CQE coalescing">
> Signed-off-by: Long Li <longli@microsoft.com>
>

Applied to dpdk-next-net/main, thanks.
  

Patch

diff --git a/drivers/net/mana/rx.c b/drivers/net/mana/rx.c
index cacfd9ae1b..220b372b15 100644
--- a/drivers/net/mana/rx.c
+++ b/drivers/net/mana/rx.c
@@ -416,23 +416,21 @@  mana_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
 
 		switch (oob->cqe_hdr.cqe_type) {
 		case CQE_RX_OKAY:
+		case CQE_RX_COALESCED_4:
 			/* Proceed to process mbuf */
 			break;
 
 		case CQE_RX_TRUNCATED:
-			DP_LOG(DEBUG, "Drop a truncated packet");
+		default:
+			DP_LOG(ERR, "RX CQE type %d client %d vendor %d",
+			       oob->cqe_hdr.cqe_type, oob->cqe_hdr.client_type,
+			       oob->cqe_hdr.vendor_err);
+
 			rxq->stats.errors++;
 			rte_pktmbuf_free(mbuf);
-			goto drop;
-
-		case CQE_RX_COALESCED_4:
-			/* Proceed to process mbuf */
-			break;
 
-		default:
-			DP_LOG(ERR, "Unknown RX CQE type %d",
-			       oob->cqe_hdr.cqe_type);
-			continue;
+			i++;
+			goto drop;
 		}
 
 		DP_LOG(DEBUG, "mana_rx_comp_oob type %d rxq %p",