[v2] net/i40e: fix Rx instability with vector mode

Message ID 1540535607-115720-1-git-send-email-beilei.xing@intel.com (mailing list archive)
State Superseded, archived
Delegated to: Qi Zhang
Headers
Series [v2] net/i40e: fix Rx instability with vector mode |

Checks

Context Check Description
ci/Intel-compilation success Compilation OK
ci/checkpatch success coding style OK

Commit Message

Xing, Beilei Oct. 26, 2018, 6:33 a.m. UTC
  Previously, there is instability during vector Rx if descriptor
number is not power of 2, e.g. process hang and some Rx packets
are unexpectedly empty. That's because vector Rx mode assumes Rx
descriptor number is power of 2 when doing bit mask.
This patch allows vector mode only when the number of Rx descriptor
is power of 2.

Fixes: 8e109464c022 ("i40e: allow vector Rx and Tx usage")
Fixes: a3c83a2527e1 ("net/i40e: enable runtime queue setup")
Cc: stable@dpdk.org

Signed-off-by: Beilei Xing <beilei.xing@intel.com>
---

v2 changes:
 - rx_vec_allowed is global configuration, avoid overwrite.

 doc/guides/nics/i40e.rst     |  7 +++++++
 drivers/net/i40e/i40e_rxtx.c | 13 ++++++++++++-
 2 files changed, 19 insertions(+), 1 deletion(-)
  

Comments

Ananyev, Konstantin Oct. 26, 2018, 8:47 a.m. UTC | #1
Hi,

> 
> Previously, there is instability during vector Rx if descriptor
> number is not power of 2, e.g. process hang and some Rx packets
> are unexpectedly empty. That's because vector Rx mode assumes Rx
> descriptor number is power of 2 when doing bit mask.
> This patch allows vector mode only when the number of Rx descriptor
> is power of 2.
> 
> Fixes: 8e109464c022 ("i40e: allow vector Rx and Tx usage")
> Fixes: a3c83a2527e1 ("net/i40e: enable runtime queue setup")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Beilei Xing <beilei.xing@intel.com>
> ---
> 
> v2 changes:
>  - rx_vec_allowed is global configuration, avoid overwrite.
> 
>  doc/guides/nics/i40e.rst     |  7 +++++++
>  drivers/net/i40e/i40e_rxtx.c | 13 ++++++++++++-
>  2 files changed, 19 insertions(+), 1 deletion(-)
> 
> diff --git a/doc/guides/nics/i40e.rst b/doc/guides/nics/i40e.rst
> index ab3928a..bfacbd1 100644
> --- a/doc/guides/nics/i40e.rst
> +++ b/doc/guides/nics/i40e.rst
> @@ -172,6 +172,13 @@ Runtime Config Options
> 
>    -w 84:00.0,use-latest-supported-vec=1
> 
> +Vector RX Pre-conditions
> +~~~~~~~~~~~~~~~~~~~~~~~~
> +For Vector RX it is assumed that the number of descriptor rings will be a power
> +of 2. With this pre-condition, the ring pointer can easily scroll back to the
> +head after hitting the tail without a conditional check. In addition Vector RX
> +can use this assumption to do a bit mask using ``ring_size - 1``.
> +
>  Driver compilation and testing
>  ------------------------------
> 
> diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
> index a827456..f6dcca0 100644
> --- a/drivers/net/i40e/i40e_rxtx.c
> +++ b/drivers/net/i40e/i40e_rxtx.c
> @@ -1735,10 +1735,14 @@ i40e_dev_rx_queue_setup_runtime(struct rte_eth_dev *dev,
>  		 * i40e_set_rx_function.
>  		 */
>  		ad->rx_bulk_alloc_allowed = true;
> -		ad->rx_vec_allowed = true;
>  		dev->data->scattered_rx = use_scattered_rx;
>  		if (use_def_burst_func)
>  			ad->rx_bulk_alloc_allowed = false;
> +		/**
> +		 * Vector mode is allowed only when number of Rx queue
> +		 * descriptor is a power of 2.
> +		 */
> +		ad->rx_vec_allowed = !(rxq->nb_rx_desc & (rxq->nb_rx_desc - 1));
>  		i40e_set_rx_function(dev);
>  		return 0;
>  	}

I think this is not complete - if it is not  the first function, we have to do:
If (if (ad->rx_vec_allowed && (rxq->nb_rx_desc & (rxq->nb_rx_desc - 1) != 0) {
      PMD_DRV_LOG(ERR, ...);
       return -EINVAL;
}

BTW, here and below why not to use rte_is_power_of_2()? 
Konstantin

> @@ -1811,6 +1815,13 @@ i40e_dev_rx_queue_setup(struct rte_eth_dev *dev,
>  		return -EINVAL;
>  	}
> 
> +	/**
> +	 * Vector mode is allowed only when number of Rx queue
> +	 * descriptor is a power of 2.
> +	 */
> +	if (ad->rx_vec_allowed)
> +		ad->rx_vec_allowed = !(nb_desc & (nb_desc - 1));
> +
>  	/* Free memory if needed */
>  	if (dev->data->rx_queues[queue_idx]) {
>  		i40e_dev_rx_queue_release(dev->data->rx_queues[queue_idx]);
  
Bruce Richardson Oct. 26, 2018, 11:11 a.m. UTC | #2
On Fri, Oct 26, 2018 at 02:33:27PM +0800, Beilei Xing wrote:
> Previously, there is instability during vector Rx if descriptor
> number is not power of 2, e.g. process hang and some Rx packets
> are unexpectedly empty. That's because vector Rx mode assumes Rx
> descriptor number is power of 2 when doing bit mask.
> This patch allows vector mode only when the number of Rx descriptor
> is power of 2.
> 
> Fixes: 8e109464c022 ("i40e: allow vector Rx and Tx usage")
> Fixes: a3c83a2527e1 ("net/i40e: enable runtime queue setup")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Beilei Xing <beilei.xing@intel.com>
> ---
> 
Do we not limit our descriptor ring sizes to powers of two anyway? Is there
a real need for non-power-of-2 ring sizes?
  
Qi Zhang Oct. 26, 2018, 3:41 p.m. UTC | #3
> -----Original Message-----
> From: Ananyev, Konstantin
> Sent: Friday, October 26, 2018 3:47 AM
> To: Xing, Beilei <beilei.xing@intel.com>; Zhang, Qi Z <qi.z.zhang@intel.com>
> Cc: dev@dpdk.org; stable@dpdk.org
> Subject: RE: [dpdk-dev] [PATCH v2] net/i40e: fix Rx instability with vector
> mode
> 
> 
> Hi,
> 
> >
> > Previously, there is instability during vector Rx if descriptor number
> > is not power of 2, e.g. process hang and some Rx packets are
> > unexpectedly empty. That's because vector Rx mode assumes Rx
> > descriptor number is power of 2 when doing bit mask.
> > This patch allows vector mode only when the number of Rx descriptor is
> > power of 2.
> >
> > Fixes: 8e109464c022 ("i40e: allow vector Rx and Tx usage")
> > Fixes: a3c83a2527e1 ("net/i40e: enable runtime queue setup")
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: Beilei Xing <beilei.xing@intel.com>
> > ---
> >
> > v2 changes:
> >  - rx_vec_allowed is global configuration, avoid overwrite.
> >
> >  doc/guides/nics/i40e.rst     |  7 +++++++
> >  drivers/net/i40e/i40e_rxtx.c | 13 ++++++++++++-
> >  2 files changed, 19 insertions(+), 1 deletion(-)
> >
> > diff --git a/doc/guides/nics/i40e.rst b/doc/guides/nics/i40e.rst index
> > ab3928a..bfacbd1 100644
> > --- a/doc/guides/nics/i40e.rst
> > +++ b/doc/guides/nics/i40e.rst
> > @@ -172,6 +172,13 @@ Runtime Config Options
> >
> >    -w 84:00.0,use-latest-supported-vec=1
> >
> > +Vector RX Pre-conditions
> > +~~~~~~~~~~~~~~~~~~~~~~~~
> > +For Vector RX it is assumed that the number of descriptor rings will
> > +be a power of 2. With this pre-condition, the ring pointer can easily
> > +scroll back to the head after hitting the tail without a conditional
> > +check. In addition Vector RX can use this assumption to do a bit mask
> using ``ring_size - 1``.
> > +
> >  Driver compilation and testing
> >  ------------------------------
> >
> > diff --git a/drivers/net/i40e/i40e_rxtx.c
> > b/drivers/net/i40e/i40e_rxtx.c index a827456..f6dcca0 100644
> > --- a/drivers/net/i40e/i40e_rxtx.c
> > +++ b/drivers/net/i40e/i40e_rxtx.c
> > @@ -1735,10 +1735,14 @@ i40e_dev_rx_queue_setup_runtime(struct
> rte_eth_dev *dev,
> >  		 * i40e_set_rx_function.
> >  		 */
> >  		ad->rx_bulk_alloc_allowed = true;
> > -		ad->rx_vec_allowed = true;
> >  		dev->data->scattered_rx = use_scattered_rx;
> >  		if (use_def_burst_func)
> >  			ad->rx_bulk_alloc_allowed = false;
> > +		/**
> > +		 * Vector mode is allowed only when number of Rx queue
> > +		 * descriptor is a power of 2.
> > +		 */
> > +		ad->rx_vec_allowed = !(rxq->nb_rx_desc & (rxq->nb_rx_desc - 1));
> >  		i40e_set_rx_function(dev);
> >  		return 0;
> >  	}
> 
> I think this is not complete - if it is not  the first function, we have to do:
> If (if (ad->rx_vec_allowed && (rxq->nb_rx_desc & (rxq->nb_rx_desc - 1) != 0)
> {
>       PMD_DRV_LOG(ERR, ...);
>        return -EINVAL;
> }

Yes, we should prevent downgrade if the device is already running.

> 
> BTW, here and below why not to use rte_is_power_of_2()?
> Konstantin
> 
> > @@ -1811,6 +1815,13 @@ i40e_dev_rx_queue_setup(struct rte_eth_dev
> *dev,
> >  		return -EINVAL;
> >  	}
> >
> > +	/**
> > +	 * Vector mode is allowed only when number of Rx queue
> > +	 * descriptor is a power of 2.
> > +	 */
> > +	if (ad->rx_vec_allowed)
> > +		ad->rx_vec_allowed = !(nb_desc & (nb_desc - 1));

Looks like we are not able to get back to vec mode once some queue is set to no power of 2, even if all queues has been re-setup to power of 2 again.
Seems we still need to check all other queue's configure here.

> > +
> >  	/* Free memory if needed */
> >  	if (dev->data->rx_queues[queue_idx]) {
> >  		i40e_dev_rx_queue_release(dev->data->rx_queues[queue_idx]);
> 
> 
> 
> 
> 
>
  
Qi Zhang Oct. 26, 2018, 3:49 p.m. UTC | #4
> -----Original Message-----
> From: Richardson, Bruce
> Sent: Friday, October 26, 2018 6:12 AM
> To: Xing, Beilei <beilei.xing@intel.com>
> Cc: Zhang, Qi Z <qi.z.zhang@intel.com>; dev@dpdk.org; stable@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2] net/i40e: fix Rx instability with vector
> mode
> 
> On Fri, Oct 26, 2018 at 02:33:27PM +0800, Beilei Xing wrote:
> > Previously, there is instability during vector Rx if descriptor number
> > is not power of 2, e.g. process hang and some Rx packets are
> > unexpectedly empty. That's because vector Rx mode assumes Rx
> > descriptor number is power of 2 when doing bit mask.
> > This patch allows vector mode only when the number of Rx descriptor is
> > power of 2.
> >
> > Fixes: 8e109464c022 ("i40e: allow vector Rx and Tx usage")
> > Fixes: a3c83a2527e1 ("net/i40e: enable runtime queue setup")
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: Beilei Xing <beilei.xing@intel.com>
> > ---
> >
> Do we not limit our descriptor ring sizes to powers of two anyway? 
Is there a
> real need for non-power-of-2 ring sizes?

This is tracked by some Bugzilla, seems someone is using this kind of configure, 
and I guess a more flexible queue size configure may help on the platform that have limited cache?
  
Xing, Beilei Oct. 29, 2018, 1:48 a.m. UTC | #5
> -----Original Message-----
> From: Zhang, Qi Z
> Sent: Friday, October 26, 2018 11:42 PM
> To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Xing, Beilei
> <beilei.xing@intel.com>
> Cc: dev@dpdk.org; stable@dpdk.org
> Subject: RE: [dpdk-dev] [PATCH v2] net/i40e: fix Rx instability with vector
> mode
> 
> 
> 
> > -----Original Message-----
> > From: Ananyev, Konstantin
> > Sent: Friday, October 26, 2018 3:47 AM
> > To: Xing, Beilei <beilei.xing@intel.com>; Zhang, Qi Z
> > <qi.z.zhang@intel.com>
> > Cc: dev@dpdk.org; stable@dpdk.org
> > Subject: RE: [dpdk-dev] [PATCH v2] net/i40e: fix Rx instability with
> > vector mode
> >
> >
> > Hi,
> >
> > >
> > > Previously, there is instability during vector Rx if descriptor
> > > number is not power of 2, e.g. process hang and some Rx packets are
> > > unexpectedly empty. That's because vector Rx mode assumes Rx
> > > descriptor number is power of 2 when doing bit mask.
> > > This patch allows vector mode only when the number of Rx descriptor
> > > is power of 2.
> > >
> > > Fixes: 8e109464c022 ("i40e: allow vector Rx and Tx usage")
> > > Fixes: a3c83a2527e1 ("net/i40e: enable runtime queue setup")
> > > Cc: stable@dpdk.org
> > >
> > > Signed-off-by: Beilei Xing <beilei.xing@intel.com>
> > > ---
> > >
> > > v2 changes:
> > >  - rx_vec_allowed is global configuration, avoid overwrite.
> > >
> > >  doc/guides/nics/i40e.rst     |  7 +++++++
> > >  drivers/net/i40e/i40e_rxtx.c | 13 ++++++++++++-
> > >  2 files changed, 19 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/doc/guides/nics/i40e.rst b/doc/guides/nics/i40e.rst
> > > index
> > > ab3928a..bfacbd1 100644
> > > --- a/doc/guides/nics/i40e.rst
> > > +++ b/doc/guides/nics/i40e.rst
> > > @@ -172,6 +172,13 @@ Runtime Config Options
> > >
> > >    -w 84:00.0,use-latest-supported-vec=1
> > >
> > > +Vector RX Pre-conditions
> > > +~~~~~~~~~~~~~~~~~~~~~~~~
> > > +For Vector RX it is assumed that the number of descriptor rings
> > > +will be a power of 2. With this pre-condition, the ring pointer can
> > > +easily scroll back to the head after hitting the tail without a
> > > +conditional check. In addition Vector RX can use this assumption to
> > > +do a bit mask
> > using ``ring_size - 1``.
> > > +
> > >  Driver compilation and testing
> > >  ------------------------------
> > >
> > > diff --git a/drivers/net/i40e/i40e_rxtx.c
> > > b/drivers/net/i40e/i40e_rxtx.c index a827456..f6dcca0 100644
> > > --- a/drivers/net/i40e/i40e_rxtx.c
> > > +++ b/drivers/net/i40e/i40e_rxtx.c
> > > @@ -1735,10 +1735,14 @@ i40e_dev_rx_queue_setup_runtime(struct
> > rte_eth_dev *dev,
> > >  		 * i40e_set_rx_function.
> > >  		 */
> > >  		ad->rx_bulk_alloc_allowed = true;
> > > -		ad->rx_vec_allowed = true;
> > >  		dev->data->scattered_rx = use_scattered_rx;
> > >  		if (use_def_burst_func)
> > >  			ad->rx_bulk_alloc_allowed = false;
> > > +		/**
> > > +		 * Vector mode is allowed only when number of Rx queue
> > > +		 * descriptor is a power of 2.
> > > +		 */
> > > +		ad->rx_vec_allowed = !(rxq->nb_rx_desc & (rxq->nb_rx_desc -
> 1));
> > >  		i40e_set_rx_function(dev);
> > >  		return 0;
> > >  	}
> >
> > I think this is not complete - if it is not  the first function, we have to do:
> > If (if (ad->rx_vec_allowed && (rxq->nb_rx_desc & (rxq->nb_rx_desc - 1)
> > != 0) {
> >       PMD_DRV_LOG(ERR, ...);
> >        return -EINVAL;
> > }
> 
> Yes, we should prevent downgrade if the device is already running.

OK, will update in next version.

> 
> >
> > BTW, here and below why not to use rte_is_power_of_2()?

Thanks for the reminder, will use the inline function in next version.


> > Konstantin
> >
> > > @@ -1811,6 +1815,13 @@ i40e_dev_rx_queue_setup(struct
> rte_eth_dev
> > *dev,
> > >  		return -EINVAL;
> > >  	}
> > >
> > > +	/**
> > > +	 * Vector mode is allowed only when number of Rx queue
> > > +	 * descriptor is a power of 2.
> > > +	 */
> > > +	if (ad->rx_vec_allowed)
> > > +		ad->rx_vec_allowed = !(nb_desc & (nb_desc - 1));
> 
> Looks like we are not able to get back to vec mode once some queue is set
> to no power of 2, even if all queues has been re-setup to power of 2 again.
> Seems we still need to check all other queue's configure here.

We can allow vector again when runtime configuring the first queue whose descriptor number is power of 2, right?

> 
> > > +
> > >  	/* Free memory if needed */
> > >  	if (dev->data->rx_queues[queue_idx]) {
> > >
> 	i40e_dev_rx_queue_release(dev->data->rx_queues[queue_idx]);
> >
> >
> >
> >
> >
> >
  

Patch

diff --git a/doc/guides/nics/i40e.rst b/doc/guides/nics/i40e.rst
index ab3928a..bfacbd1 100644
--- a/doc/guides/nics/i40e.rst
+++ b/doc/guides/nics/i40e.rst
@@ -172,6 +172,13 @@  Runtime Config Options
 
   -w 84:00.0,use-latest-supported-vec=1
 
+Vector RX Pre-conditions
+~~~~~~~~~~~~~~~~~~~~~~~~
+For Vector RX it is assumed that the number of descriptor rings will be a power
+of 2. With this pre-condition, the ring pointer can easily scroll back to the
+head after hitting the tail without a conditional check. In addition Vector RX
+can use this assumption to do a bit mask using ``ring_size - 1``.
+
 Driver compilation and testing
 ------------------------------
 
diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
index a827456..f6dcca0 100644
--- a/drivers/net/i40e/i40e_rxtx.c
+++ b/drivers/net/i40e/i40e_rxtx.c
@@ -1735,10 +1735,14 @@  i40e_dev_rx_queue_setup_runtime(struct rte_eth_dev *dev,
 		 * i40e_set_rx_function.
 		 */
 		ad->rx_bulk_alloc_allowed = true;
-		ad->rx_vec_allowed = true;
 		dev->data->scattered_rx = use_scattered_rx;
 		if (use_def_burst_func)
 			ad->rx_bulk_alloc_allowed = false;
+		/**
+		 * Vector mode is allowed only when number of Rx queue
+		 * descriptor is a power of 2.
+		 */
+		ad->rx_vec_allowed = !(rxq->nb_rx_desc & (rxq->nb_rx_desc - 1));
 		i40e_set_rx_function(dev);
 		return 0;
 	}
@@ -1811,6 +1815,13 @@  i40e_dev_rx_queue_setup(struct rte_eth_dev *dev,
 		return -EINVAL;
 	}
 
+	/**
+	 * Vector mode is allowed only when number of Rx queue
+	 * descriptor is a power of 2.
+	 */
+	if (ad->rx_vec_allowed)
+		ad->rx_vec_allowed = !(nb_desc & (nb_desc - 1));
+
 	/* Free memory if needed */
 	if (dev->data->rx_queues[queue_idx]) {
 		i40e_dev_rx_queue_release(dev->data->rx_queues[queue_idx]);