net/ixgbe: fix RxQ/TxQ release

Message ID 20210927171815.16216-1-julien.meunier@nokia.com (mailing list archive)
State Superseded, archived
Delegated to: Qi Zhang
Headers
Series net/ixgbe: fix RxQ/TxQ release |

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/github-robot: build success github build: passed
ci/Intel-compilation success Compilation OK
ci/intel-Testing success Testing PASS

Commit Message

Julien Meunier Sept. 27, 2021, 5:18 p.m. UTC
  On the vector implementation, during the tear-down, the mbufs not
drained in the RxQ and TxQ are freed based on an algorithm which
supposed that the number of descriptors is a power of 2 (max_desc).
Based on this hypothesis, this algorithm uses a bitmask in order to
detect an index overflow during the iteration, and to restart the loop
from 0.

However, there is no such power of 2 requirement in the ixgbe for the
number of descriptors in the RxQ / TxQ. The only requirement is to have
a number correctly aligned.

If a user requested to configure a number of descriptors which is not a
power of 2, as a consequence, during the tear-down, it was possible to
be in an infinite loop, and to never reach the exit loop condition.

By removing the bitmask and changing the loop method, we can avoid this
issue, and allow the user to configure a RxQ / TxQ which is not a power
of 2.

Fixes: c95584dc2b18 ("ixgbe: new vectorized functions for Rx/Tx")
Cc: bruce.richardson@intel.com
Cc: stable@dpdk.org

Signed-off-by: Julien Meunier <julien.meunier@nokia.com>
---
 drivers/net/ixgbe/ixgbe_rxtx_vec_common.h | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)
  

Comments

Wang, Haiyue Sept. 28, 2021, 3:06 a.m. UTC | #1
> -----Original Message-----
> From: Julien Meunier <julien.meunier@nokia.com>
> Sent: Tuesday, September 28, 2021 01:18
> To: dev@dpdk.org
> Cc: stable@dpdk.org; Richardson, Bruce <bruce.richardson@intel.com>; Wang, Haiyue
> <haiyue.wang@intel.com>
> Subject: [PATCH] net/ixgbe: fix RxQ/TxQ release
> 
> On the vector implementation, during the tear-down, the mbufs not
> drained in the RxQ and TxQ are freed based on an algorithm which
> supposed that the number of descriptors is a power of 2 (max_desc).
> Based on this hypothesis, this algorithm uses a bitmask in order to
> detect an index overflow during the iteration, and to restart the loop
> from 0.
> 
> However, there is no such power of 2 requirement in the ixgbe for the
> number of descriptors in the RxQ / TxQ. The only requirement is to have
> a number correctly aligned.
> 
> If a user requested to configure a number of descriptors which is not a
> power of 2, as a consequence, during the tear-down, it was possible to
> be in an infinite loop, and to never reach the exit loop condition.
> 

Are you able to setup not a power of 2 successfully ?

int
rte_eth_tx_queue_setup(uint16_t port_id, uint16_t tx_queue_id,
		       uint16_t nb_tx_desc, unsigned int socket_id,
		       const struct rte_eth_txconf *tx_conf)
{
	...

	if (nb_tx_desc > dev_info.tx_desc_lim.nb_max ||
	    nb_tx_desc < dev_info.tx_desc_lim.nb_min ||
	    nb_tx_desc % dev_info.tx_desc_lim.nb_align != 0) {
		RTE_ETHDEV_LOG(ERR,
			"Invalid value for nb_tx_desc(=%hu), should be: <= %hu, >= %hu, and a product of %hu\n",
			nb_tx_desc, dev_info.tx_desc_lim.nb_max,
			dev_info.tx_desc_lim.nb_min,
			dev_info.tx_desc_lim.nb_align);
		return -EINVAL;
	}

	...

}

> By removing the bitmask and changing the loop method, we can avoid this
> issue, and allow the user to configure a RxQ / TxQ which is not a power
> of 2.
> 
> Fixes: c95584dc2b18 ("ixgbe: new vectorized functions for Rx/Tx")
> Cc: bruce.richardson@intel.com
> Cc: stable@dpdk.org
> 
> Signed-off-by: Julien Meunier <julien.meunier@nokia.com>
> ---
>  drivers/net/ixgbe/ixgbe_rxtx_vec_common.h | 20 +++++++++++++-------
>  1 file changed, 13 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec_common.h b/drivers/net/ixgbe/ixgbe_rxtx_vec_common.h
> index adba855ca3..8912558918 100644
> --- a/drivers/net/ixgbe/ixgbe_rxtx_vec_common.h
> +++ b/drivers/net/ixgbe/ixgbe_rxtx_vec_common.h
> @@ -150,11 +150,14 @@ _ixgbe_tx_queue_release_mbufs_vec(struct ixgbe_tx_queue *txq)
>  		return;
> 
>  	/* release the used mbufs in sw_ring */
> -	for (i = txq->tx_next_dd - (txq->tx_rs_thresh - 1);
> -	     i != txq->tx_tail;
> -	     i = (i + 1) & max_desc) {
> +	i = txq->tx_next_dd - (txq->tx_rs_thresh - 1);
> +	while (i != txq->tx_tail) {
>  		txe = &txq->sw_ring_v[i];
>  		rte_pktmbuf_free_seg(txe->mbuf);
> +
> +		i = i + 1;
> +		if (i > max_desc)
> +			i = 0;
>  	}
>  	txq->nb_tx_free = max_desc;
> 
> @@ -168,7 +171,7 @@ _ixgbe_tx_queue_release_mbufs_vec(struct ixgbe_tx_queue *txq)
>  static inline void
>  _ixgbe_rx_queue_release_mbufs_vec(struct ixgbe_rx_queue *rxq)
>  {
> -	const unsigned int mask = rxq->nb_rx_desc - 1;
> +	const unsigned int max_desc = rxq->nb_rx_desc - 1;
>  	unsigned int i;
> 
>  	if (rxq->sw_ring == NULL || rxq->rxrearm_nb >= rxq->nb_rx_desc)
> @@ -181,11 +184,14 @@ _ixgbe_rx_queue_release_mbufs_vec(struct ixgbe_rx_queue *rxq)
>  				rte_pktmbuf_free_seg(rxq->sw_ring[i].mbuf);
>  		}
>  	} else {
> -		for (i = rxq->rx_tail;
> -		     i != rxq->rxrearm_start;
> -		     i = (i + 1) & mask) {
> +		i = rxq->rx_tail;
> +		while (i != rxq->rxrearm_start) {
>  			if (rxq->sw_ring[i].mbuf != NULL)
>  				rte_pktmbuf_free_seg(rxq->sw_ring[i].mbuf);
> +
> +			i = i + 1;
> +			if (i > max_desc)
> +				i = 0;
>  		}
>  	}
> 
> --
> 2.17.1
  
Wang, Haiyue Sept. 28, 2021, 3:21 a.m. UTC | #2
> -----Original Message-----
> From: Wang, Haiyue
> Sent: Tuesday, September 28, 2021 11:06
> To: 'Julien Meunier' <julien.meunier@nokia.com>; dev@dpdk.org
> Cc: stable@dpdk.org; Richardson, Bruce <bruce.richardson@intel.com>
> Subject: RE: [PATCH] net/ixgbe: fix RxQ/TxQ release
> 
> > -----Original Message-----
> > From: Julien Meunier <julien.meunier@nokia.com>
> > Sent: Tuesday, September 28, 2021 01:18
> > To: dev@dpdk.org
> > Cc: stable@dpdk.org; Richardson, Bruce <bruce.richardson@intel.com>; Wang, Haiyue
> > <haiyue.wang@intel.com>
> > Subject: [PATCH] net/ixgbe: fix RxQ/TxQ release
> >
> > On the vector implementation, during the tear-down, the mbufs not
> > drained in the RxQ and TxQ are freed based on an algorithm which
> > supposed that the number of descriptors is a power of 2 (max_desc).
> > Based on this hypothesis, this algorithm uses a bitmask in order to
> > detect an index overflow during the iteration, and to restart the loop
> > from 0.
> >
> > However, there is no such power of 2 requirement in the ixgbe for the
> > number of descriptors in the RxQ / TxQ. The only requirement is to have
> > a number correctly aligned.
> >
> > If a user requested to configure a number of descriptors which is not a
> > power of 2, as a consequence, during the tear-down, it was possible to
> > be in an infinite loop, and to never reach the exit loop condition.
> >
> 
> Are you able to setup not a power of 2 successfully ?
> 

My fault, yes, possible. ;-)

> int
> rte_eth_tx_queue_setup(uint16_t port_id, uint16_t tx_queue_id,
> 		       uint16_t nb_tx_desc, unsigned int socket_id,
> 		       const struct rte_eth_txconf *tx_conf)
> {
> 	...
> 
> 	if (nb_tx_desc > dev_info.tx_desc_lim.nb_max ||
> 	    nb_tx_desc < dev_info.tx_desc_lim.nb_min ||
> 	    nb_tx_desc % dev_info.tx_desc_lim.nb_align != 0) {
> 		RTE_ETHDEV_LOG(ERR,
> 			"Invalid value for nb_tx_desc(=%hu), should be: <= %hu, >= %hu, and a product
> of %hu\n",
> 			nb_tx_desc, dev_info.tx_desc_lim.nb_max,
> 			dev_info.tx_desc_lim.nb_min,
> 			dev_info.tx_desc_lim.nb_align);
> 		return -EINVAL;
> 	}
> 
> 	...
> 
> }
> 
> > By removing the bitmask and changing the loop method, we can avoid this
> > issue, and allow the user to configure a RxQ / TxQ which is not a power
> > of 2.
> >
> > Fixes: c95584dc2b18 ("ixgbe: new vectorized functions for Rx/Tx")
> > Cc: bruce.richardson@intel.com
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: Julien Meunier <julien.meunier@nokia.com>
> > ---
> >  drivers/net/ixgbe/ixgbe_rxtx_vec_common.h | 20 +++++++++++++-------
> >  1 file changed, 13 insertions(+), 7 deletions(-)
> >
> > diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec_common.h b/drivers/net/ixgbe/ixgbe_rxtx_vec_common.h
> > index adba855ca3..8912558918 100644
> > --- a/drivers/net/ixgbe/ixgbe_rxtx_vec_common.h
> > +++ b/drivers/net/ixgbe/ixgbe_rxtx_vec_common.h
> > @@ -150,11 +150,14 @@ _ixgbe_tx_queue_release_mbufs_vec(struct ixgbe_tx_queue *txq)
> >  		return;
> >

Just one line ?

i = (i + 1) % txq->nb_tx_desc

> >  	/* release the used mbufs in sw_ring */
> > -	for (i = txq->tx_next_dd - (txq->tx_rs_thresh - 1);
> > -	     i != txq->tx_tail;
> > -	     i = (i + 1) & max_desc) {
> > +	i = txq->tx_next_dd - (txq->tx_rs_thresh - 1);
> > +	while (i != txq->tx_tail) {
> >  		txe = &txq->sw_ring_v[i];
> >  		rte_pktmbuf_free_seg(txe->mbuf);
> > +
> > +		i = i + 1;
> > +		if (i > max_desc)
> > +			i = 0;
> >  	}
> >  	txq->nb_tx_free = max_desc;
> >
> > @@ -168,7 +171,7 @@ _ixgbe_tx_queue_release_mbufs_vec(struct ixgbe_tx_queue *txq)
> >  static inline void
> >  _ixgbe_rx_queue_release_mbufs_vec(struct ixgbe_rx_queue *rxq)
> >  {
> > -	const unsigned int mask = rxq->nb_rx_desc - 1;
> > +	const unsigned int max_desc = rxq->nb_rx_desc - 1;
> >  	unsigned int i;
> >
> >  	if (rxq->sw_ring == NULL || rxq->rxrearm_nb >= rxq->nb_rx_desc)
> > @@ -181,11 +184,14 @@ _ixgbe_rx_queue_release_mbufs_vec(struct ixgbe_rx_queue *rxq)
> >  				rte_pktmbuf_free_seg(rxq->sw_ring[i].mbuf);
> >  		}
> >  	} else {
> > -		for (i = rxq->rx_tail;
> > -		     i != rxq->rxrearm_start;
> > -		     i = (i + 1) & mask) {
> > +		i = rxq->rx_tail;
> > +		while (i != rxq->rxrearm_start) {
> >  			if (rxq->sw_ring[i].mbuf != NULL)
> >  				rte_pktmbuf_free_seg(rxq->sw_ring[i].mbuf);
> > +
> > +			i = i + 1;
> > +			if (i > max_desc)
> > +				i = 0;
> >  		}
> >  	}
> >
> > --
> > 2.17.1
  
Julien Meunier Sept. 28, 2021, 7:46 a.m. UTC | #3
Hello,

On 28/09/2021 05:21, Wang, Haiyue wrote:
>> -----Original Message-----
>> From: Wang, Haiyue
>> Sent: Tuesday, September 28, 2021 11:06
>> To: 'Julien Meunier' <julien.meunier@nokia.com>; dev@dpdk.org
>> Cc: stable@dpdk.org; Richardson, Bruce <bruce.richardson@intel.com>
>> Subject: RE: [PATCH] net/ixgbe: fix RxQ/TxQ release
>>
>>> -----Original Message-----
>>> From: Julien Meunier <julien.meunier@nokia.com>
>>> Sent: Tuesday, September 28, 2021 01:18
>>> To: dev@dpdk.org
>>> Cc: stable@dpdk.org; Richardson, Bruce <bruce.richardson@intel.com>; Wang, Haiyue
>>> <haiyue.wang@intel.com>
>>> Subject: [PATCH] net/ixgbe: fix RxQ/TxQ release
>>>
>>> On the vector implementation, during the tear-down, the mbufs not
>>> drained in the RxQ and TxQ are freed based on an algorithm which
>>> supposed that the number of descriptors is a power of 2 (max_desc).
>>> Based on this hypothesis, this algorithm uses a bitmask in order to
>>> detect an index overflow during the iteration, and to restart the loop
>>> from 0.
>>>
>>> However, there is no such power of 2 requirement in the ixgbe for the
>>> number of descriptors in the RxQ / TxQ. The only requirement is to have
>>> a number correctly aligned.
>>>
>>> If a user requested to configure a number of descriptors which is not a
>>> power of 2, as a consequence, during the tear-down, it was possible to
>>> be in an infinite loop, and to never reach the exit loop condition.
>>>
>>
>> Are you able to setup not a power of 2 successfully ?
>>
> 
> My fault, yes, possible. ;-)
> 

Yes, we have some usecases where the nb of descriptiors for the TxQ is 
set to 1536.
I modified the test_pmd_perf in order to validate this behavior, as my 
ixgbe X550 supports the loopback mode:
- nb_desc = 2048 => txq is drained and stopped correctly
- nb_desc = 1536 => freeze during the teardown

>> int
>> rte_eth_tx_queue_setup(uint16_t port_id, uint16_t tx_queue_id,
>> 		       uint16_t nb_tx_desc, unsigned int socket_id,
>> 		       const struct rte_eth_txconf *tx_conf)
>> {
>> 	...
>>
>> 	if (nb_tx_desc > dev_info.tx_desc_lim.nb_max ||
>> 	    nb_tx_desc < dev_info.tx_desc_lim.nb_min ||
>> 	    nb_tx_desc % dev_info.tx_desc_lim.nb_align != 0) {
>> 		RTE_ETHDEV_LOG(ERR,
>> 			"Invalid value for nb_tx_desc(=%hu), should be: <= %hu, >= %hu, and a product
>> of %hu\n",
>> 			nb_tx_desc, dev_info.tx_desc_lim.nb_max,
>> 			dev_info.tx_desc_lim.nb_min,
>> 			dev_info.tx_desc_lim.nb_align);
>> 		return -EINVAL;
>> 	}
>>
>> 	...
>>
>> }
>>
>>> By removing the bitmask and changing the loop method, we can avoid this
>>> issue, and allow the user to configure a RxQ / TxQ which is not a power
>>> of 2.
>>>
>>> Fixes: c95584dc2b18 ("ixgbe: new vectorized functions for Rx/Tx")
>>> Cc: bruce.richardson@intel.com
>>> Cc: stable@dpdk.org
>>>
>>> Signed-off-by: Julien Meunier <julien.meunier@nokia.com>
>>> ---
>>>   drivers/net/ixgbe/ixgbe_rxtx_vec_common.h | 20 +++++++++++++-------
>>>   1 file changed, 13 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec_common.h b/drivers/net/ixgbe/ixgbe_rxtx_vec_common.h
>>> index adba855ca3..8912558918 100644
>>> --- a/drivers/net/ixgbe/ixgbe_rxtx_vec_common.h
>>> +++ b/drivers/net/ixgbe/ixgbe_rxtx_vec_common.h
>>> @@ -150,11 +150,14 @@ _ixgbe_tx_queue_release_mbufs_vec(struct ixgbe_tx_queue *txq)
>>>   		return;
>>>
> 
> Just one line ?
> 
> i = (i + 1) % txq->nb_tx_desc
> 

Ah yes, I was too focused with this bitmask...
The shorter, the better.

I will send a V2 today.

Thanks for this feedback !

>>>   	/* release the used mbufs in sw_ring */
>>> -	for (i = txq->tx_next_dd - (txq->tx_rs_thresh - 1);
>>> -	     i != txq->tx_tail;
>>> -	     i = (i + 1) & max_desc) {
>>> +	i = txq->tx_next_dd - (txq->tx_rs_thresh - 1);
>>> +	while (i != txq->tx_tail) {
>>>   		txe = &txq->sw_ring_v[i];
>>>   		rte_pktmbuf_free_seg(txe->mbuf);
>>> +
>>> +		i = i + 1;
>>> +		if (i > max_desc)
>>> +			i = 0;
>>>   	}
>>>   	txq->nb_tx_free = max_desc;
>>>
>>> @@ -168,7 +171,7 @@ _ixgbe_tx_queue_release_mbufs_vec(struct ixgbe_tx_queue *txq)
>>>   static inline void
>>>   _ixgbe_rx_queue_release_mbufs_vec(struct ixgbe_rx_queue *rxq)
>>>   {
>>> -	const unsigned int mask = rxq->nb_rx_desc - 1;
>>> +	const unsigned int max_desc = rxq->nb_rx_desc - 1;
>>>   	unsigned int i;
>>>
>>>   	if (rxq->sw_ring == NULL || rxq->rxrearm_nb >= rxq->nb_rx_desc)
>>> @@ -181,11 +184,14 @@ _ixgbe_rx_queue_release_mbufs_vec(struct ixgbe_rx_queue *rxq)
>>>   				rte_pktmbuf_free_seg(rxq->sw_ring[i].mbuf);
>>>   		}
>>>   	} else {
>>> -		for (i = rxq->rx_tail;
>>> -		     i != rxq->rxrearm_start;
>>> -		     i = (i + 1) & mask) {
>>> +		i = rxq->rx_tail;
>>> +		while (i != rxq->rxrearm_start) {
>>>   			if (rxq->sw_ring[i].mbuf != NULL)
>>>   				rte_pktmbuf_free_seg(rxq->sw_ring[i].mbuf);
>>> +
>>> +			i = i + 1;
>>> +			if (i > max_desc)
>>> +				i = 0;
>>>   		}
>>>   	}
>>>
>>> --
>>> 2.17.1
>
  

Patch

diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec_common.h b/drivers/net/ixgbe/ixgbe_rxtx_vec_common.h
index adba855ca3..8912558918 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx_vec_common.h
+++ b/drivers/net/ixgbe/ixgbe_rxtx_vec_common.h
@@ -150,11 +150,14 @@  _ixgbe_tx_queue_release_mbufs_vec(struct ixgbe_tx_queue *txq)
 		return;
 
 	/* release the used mbufs in sw_ring */
-	for (i = txq->tx_next_dd - (txq->tx_rs_thresh - 1);
-	     i != txq->tx_tail;
-	     i = (i + 1) & max_desc) {
+	i = txq->tx_next_dd - (txq->tx_rs_thresh - 1);
+	while (i != txq->tx_tail) {
 		txe = &txq->sw_ring_v[i];
 		rte_pktmbuf_free_seg(txe->mbuf);
+
+		i = i + 1;
+		if (i > max_desc)
+			i = 0;
 	}
 	txq->nb_tx_free = max_desc;
 
@@ -168,7 +171,7 @@  _ixgbe_tx_queue_release_mbufs_vec(struct ixgbe_tx_queue *txq)
 static inline void
 _ixgbe_rx_queue_release_mbufs_vec(struct ixgbe_rx_queue *rxq)
 {
-	const unsigned int mask = rxq->nb_rx_desc - 1;
+	const unsigned int max_desc = rxq->nb_rx_desc - 1;
 	unsigned int i;
 
 	if (rxq->sw_ring == NULL || rxq->rxrearm_nb >= rxq->nb_rx_desc)
@@ -181,11 +184,14 @@  _ixgbe_rx_queue_release_mbufs_vec(struct ixgbe_rx_queue *rxq)
 				rte_pktmbuf_free_seg(rxq->sw_ring[i].mbuf);
 		}
 	} else {
-		for (i = rxq->rx_tail;
-		     i != rxq->rxrearm_start;
-		     i = (i + 1) & mask) {
+		i = rxq->rx_tail;
+		while (i != rxq->rxrearm_start) {
 			if (rxq->sw_ring[i].mbuf != NULL)
 				rte_pktmbuf_free_seg(rxq->sw_ring[i].mbuf);
+
+			i = i + 1;
+			if (i > max_desc)
+				i = 0;
 		}
 	}