[dpdk-dev,v2] ixgbe: don't override mbuf buffer length

Message ID 2601191342CEEE43887BDE71AB977258213BCCB6@IRSMSX105.ger.corp.intel.com (mailing list archive)
State Not Applicable, archived
Headers

Commit Message

Ananyev, Konstantin Dec. 5, 2014, 1:10 a.m. UTC
  > -----Original Message-----
> From: Jean-Mickael Guerin [mailto:jean-mickael.guerin@6wind.com]
> Sent: Thursday, December 04, 2014 6:09 PM
> To: dev@dpdk.org
> Cc: Richardson, Bruce; Ananyev, Konstantin
> Subject: [PATCH v2] ixgbe: don't override mbuf buffer length
> 
> The template mbuf_initializer is hard coded with a buflen which
> might have been set differently by the application at the time of
> mbuf pool creation.
> 
> Switch to a mbuf allocation, to fetch the correct default values.
> There is no performance impact because this is not a data-plane API.
> 
> Signed-off-by: Jean-Mickael Guerin <jean-mickael.guerin@6wind.com>
> Acked-by: David Marchand <david.marchand@6wind.com>
> Fixes: 0ff3324da2 ("ixgbe: rework vector pmd following mbuf changes")
> ---
> 
>  v2: check returned value of ixgbe_rxq_vec_setup
> 
>  lib/librte_pmd_ixgbe/ixgbe_rxtx.c     |  5 ++++-
>  lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c | 19 ++++++++++++-------
>  2 files changed, 16 insertions(+), 8 deletions(-)
> 
> diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> index 5c36bff..7994da1 100644
> --- a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> +++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> @@ -2244,7 +2244,10 @@ ixgbe_dev_rx_queue_setup(struct rte_eth_dev *dev,
>  	use_def_burst_func = check_rx_burst_bulk_alloc_preconditions(rxq);
> 
>  #ifdef RTE_IXGBE_INC_VECTOR
> -	ixgbe_rxq_vec_setup(rxq);
> +	if (ixgbe_rxq_vec_setup(rxq) < 0) {
> +		ixgbe_rx_queue_release(rxq);
> +		return (-ENOMEM);
> +	}
>  #endif
>  	/* Check if pre-conditions are satisfied, and no Scattered Rx */
>  	if (!use_def_burst_func && !dev->data->scattered_rx) {
> diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c b/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
> index c1b5a78..f7b02f5 100644
> --- a/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
> +++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
> @@ -732,17 +732,22 @@ static struct ixgbe_txq_ops vec_txq_ops = {
>  int
>  ixgbe_rxq_vec_setup(struct igb_rx_queue *rxq)
>  {
> -	struct rte_mbuf mb_def = { .buf_addr = 0 }; /* zeroed mbuf */
> +	struct rte_mbuf *mb_def;
> 
> -	mb_def.nb_segs = 1;
> -	mb_def.data_off = RTE_PKTMBUF_HEADROOM;
> -	mb_def.buf_len = rxq->mb_pool->elt_size - sizeof(struct rte_mbuf);
> -	mb_def.port = rxq->port_id;
> -	rte_mbuf_refcnt_set(&mb_def, 1);
> +	mb_def = rte_pktmbuf_alloc(rxq->mb_pool);
> +	if (mb_def == NULL) {
> +		PMD_INIT_LOG(ERR, "ixgbe_rxq_vec_setup: could not allocate one mbuf");
> +		return -1;
> +	}
> +	/* nb_segs, refcnt, data_off and buf_len are already set */
> +	mb_def->port = rxq->port_id;
> 
>  	/* prevent compiler reordering: rearm_data covers previous fields */
>  	rte_compiler_barrier();
> -	rxq->mbuf_initializer = *((uint64_t *)&mb_def.rearm_data);
> +	rxq->mbuf_initializer = *((uint64_t *)&mb_def->rearm_data);
> +
> +	rte_pktmbuf_free(mb_def);
> +
>  	return 0;
>  }
> 
> --
> 2.1.3

As I said in another mail, I don't think it is a proper fix.
What we did here - just changed one assumption to another.
Current assumption - all mbuf obj_init() would setup buf_len in exactly the same manner as  rte_pktmbuf_init() does:
buf_len = mp->elt_size - sizeof(struct rte_mbuf);
New assumption - all mbuf obj_init() would setup buf_len for all mbufs in the pool to the same value.
Both assumptions, I believe, are not always correct.
Though, probably the new one would be true more often.

I still think the proper fix is not to update mbuf's buf_len field at ixgbe_rxq_rearm() at all.
We should just leave the original value unmodified.
Actually, while looking at ixgbe_rxq_rearm(), I don't see any reason why we need to update buf_len field.
It is not the data that need to be rearmed.
The fields that need to be rearmed are:
uint16_t data_off;
uint16_t refcnt
uint8_t nb_segs;
uint8_t port;

6B in total. 
We probably would like to keep rearming as one 64bit load/store. 
Though straight below them we have:
uint64_t ol_flags;

As RX fully override ol_flags anyway, we can safely overwrite first 2B of it.
That would allow us to still read/write whole 64bits and avoid overwriting buf_len.  
I am talking about something like patch  below.
I admit that it looks not so pretty, but I think it is much safer and correct.
Konstantin
  

Comments

Bruce Richardson Dec. 5, 2014, 10:38 a.m. UTC | #1
On Fri, Dec 05, 2014 at 01:10:28AM +0000, Ananyev, Konstantin wrote:
> 
> 
> > -----Original Message-----
> > From: Jean-Mickael Guerin [mailto:jean-mickael.guerin@6wind.com]
> > Sent: Thursday, December 04, 2014 6:09 PM
> > To: dev@dpdk.org
> > Cc: Richardson, Bruce; Ananyev, Konstantin
> > Subject: [PATCH v2] ixgbe: don't override mbuf buffer length
> > 
> > The template mbuf_initializer is hard coded with a buflen which
> > might have been set differently by the application at the time of
> > mbuf pool creation.
> > 
> > Switch to a mbuf allocation, to fetch the correct default values.
> > There is no performance impact because this is not a data-plane API.
> > 
> > Signed-off-by: Jean-Mickael Guerin <jean-mickael.guerin@6wind.com>
> > Acked-by: David Marchand <david.marchand@6wind.com>
> > Fixes: 0ff3324da2 ("ixgbe: rework vector pmd following mbuf changes")
> > ---
> > 
> >  v2: check returned value of ixgbe_rxq_vec_setup
> > 
> >  lib/librte_pmd_ixgbe/ixgbe_rxtx.c     |  5 ++++-
> >  lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c | 19 ++++++++++++-------
> >  2 files changed, 16 insertions(+), 8 deletions(-)
> > 
> > diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> > index 5c36bff..7994da1 100644
> > --- a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> > +++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> > @@ -2244,7 +2244,10 @@ ixgbe_dev_rx_queue_setup(struct rte_eth_dev *dev,
> >  	use_def_burst_func = check_rx_burst_bulk_alloc_preconditions(rxq);
> > 
> >  #ifdef RTE_IXGBE_INC_VECTOR
> > -	ixgbe_rxq_vec_setup(rxq);
> > +	if (ixgbe_rxq_vec_setup(rxq) < 0) {
> > +		ixgbe_rx_queue_release(rxq);
> > +		return (-ENOMEM);
> > +	}
> >  #endif
> >  	/* Check if pre-conditions are satisfied, and no Scattered Rx */
> >  	if (!use_def_burst_func && !dev->data->scattered_rx) {
> > diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c b/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
> > index c1b5a78..f7b02f5 100644
> > --- a/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
> > +++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
> > @@ -732,17 +732,22 @@ static struct ixgbe_txq_ops vec_txq_ops = {
> >  int
> >  ixgbe_rxq_vec_setup(struct igb_rx_queue *rxq)
> >  {
> > -	struct rte_mbuf mb_def = { .buf_addr = 0 }; /* zeroed mbuf */
> > +	struct rte_mbuf *mb_def;
> > 
> > -	mb_def.nb_segs = 1;
> > -	mb_def.data_off = RTE_PKTMBUF_HEADROOM;
> > -	mb_def.buf_len = rxq->mb_pool->elt_size - sizeof(struct rte_mbuf);
> > -	mb_def.port = rxq->port_id;
> > -	rte_mbuf_refcnt_set(&mb_def, 1);
> > +	mb_def = rte_pktmbuf_alloc(rxq->mb_pool);
> > +	if (mb_def == NULL) {
> > +		PMD_INIT_LOG(ERR, "ixgbe_rxq_vec_setup: could not allocate one mbuf");
> > +		return -1;
> > +	}
> > +	/* nb_segs, refcnt, data_off and buf_len are already set */
> > +	mb_def->port = rxq->port_id;
> > 
> >  	/* prevent compiler reordering: rearm_data covers previous fields */
> >  	rte_compiler_barrier();
> > -	rxq->mbuf_initializer = *((uint64_t *)&mb_def.rearm_data);
> > +	rxq->mbuf_initializer = *((uint64_t *)&mb_def->rearm_data);
> > +
> > +	rte_pktmbuf_free(mb_def);
> > +
> >  	return 0;
> >  }
> > 
> > --
> > 2.1.3
> 
> As I said in another mail, I don't think it is a proper fix.
> What we did here - just changed one assumption to another.
> Current assumption - all mbuf obj_init() would setup buf_len in exactly the same manner as  rte_pktmbuf_init() does:
> buf_len = mp->elt_size - sizeof(struct rte_mbuf);
> New assumption - all mbuf obj_init() would setup buf_len for all mbufs in the pool to the same value.
> Both assumptions, I believe, are not always correct.
> Though, probably the new one would be true more often.
> 
> I still think the proper fix is not to update mbuf's buf_len field at ixgbe_rxq_rearm() at all.
> We should just leave the original value unmodified.
> Actually, while looking at ixgbe_rxq_rearm(), I don't see any reason why we need to update buf_len field.
> It is not the data that need to be rearmed.
> The fields that need to be rearmed are:
> uint16_t data_off;
> uint16_t refcnt
> uint8_t nb_segs;
> uint8_t port;
> 
> 6B in total. 
> We probably would like to keep rearming as one 64bit load/store. 
> Though straight below them we have:
> uint64_t ol_flags;
> 
> As RX fully override ol_flags anyway, we can safely overwrite first 2B of it.
> That would allow us to still read/write whole 64bits and avoid overwriting buf_len.  
> I am talking about something like patch  below.
> I admit that it looks not so pretty, but I think it is much safer and correct.
> Konstantin

I just don't see this as worthwhile doing. We are looking here at an mbuf pool
which is to be used for packet buffers for RX. If the packet buffers in that
pool are all of different sizes then we need to go back and look at other places
throughout the code too. For instance, we query the buffer length of the mbuf
pool when initializing the RX queues to determine if we need to enable scattered
RX. If the mbufs in the pool can potentially be of different sizes, we need to
turn off the no-scattered-packets optimization and always use the scatter
packets code path - because the assumption that buffers don't get resized could 
also be false. Similarly here, if the mbufs are going to be of different
sizes then the user should disable the vector PMD, and use RX code that doesn't
override the buf_len each time.

Furthermore, we still don't have an actual use-case where the user would want to
have different size mbufs in an mbuf pool used for RX. They can still have
variable-sized mbufs in other pools, but having all buffers the same size in the
pool used to receive packets seems a perfectly fair restriction to have. If
someone has an app that they are creating that needs this functionality, I'll
reconsider this opinion, but right now this is all theoretical.

Regards,
/Bruce

> 
> --- a/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
> +++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
> @@ -79,13 +79,19 @@ ixgbe_rxq_rearm(struct igb_rx_queue *rxq)
>         /* Initialize the mbufs in vector, process 2 mbufs in one loop */
>         for (i = 0; i < RTE_IXGBE_RXQ_REARM_THRESH; i += 2, rxep += 2) {
>                 __m128i vaddr0, vaddr1;
> +               uintptr_t p0, p1;
> 
>                 mb0 = rxep[0].mbuf;
>                 mb1 = rxep[1].mbuf;
> 
>                 /* flush mbuf with pkt template */
> -               mb0->rearm_data[0] = rxq->mbuf_initializer;
> -               mb1->rearm_data[0] = rxq->mbuf_initializer;
> +               p0 = (uintptr_t)&mb0->data_off;
> +               *(uint64_t *)p0 = rxq->mbuf_initializer;
> +               p1 = (uintptr_t)&mb1->data_off;
> +               *(uint64_t *)p1 = rxq->mbuf_initializer;
> +
> +               //mb0->rearm_data[0] = rxq->mbuf_initializer;
> +               //mb1->rearm_data[0] = rxq->mbuf_initializer;
> 
>                 /* load buf_addr(lo 64bit) and buf_physaddr(hi 64bit) */
>                 vaddr0 = _mm_loadu_si128((__m128i *)&(mb0->buf_addr));
> @@ -732,6 +738,7 @@ static struct ixgbe_txq_ops vec_txq_ops = {
>  int
>  ixgbe_rxq_vec_setup(struct igb_rx_queue *rxq)
>  {
> +       uintptr_t p;
>         struct rte_mbuf mb_def = { .buf_addr = 0 }; /* zeroed mbuf */
> 
>         mb_def.nb_segs = 1;
> @@ -739,7 +746,8 @@ ixgbe_rxq_vec_setup(struct igb_rx_queue *rxq)
>         mb_def.buf_len = rxq->mb_pool->elt_size - sizeof(struct rte_mbuf);
>         mb_def.port = rxq->port_id;
>         rte_mbuf_refcnt_set(&mb_def, 1);
> -       rxq->mbuf_initializer = *((uint64_t *)&mb_def.rearm_data);
> +       p = (uintptr_t)&mb_def.data_off;
> +       rxq->mbuf_initializer = *(uint64_t *)p;
>         return 0;
>  }
> 
>
  
Ananyev, Konstantin Dec. 5, 2014, 11:19 a.m. UTC | #2
> -----Original Message-----
> From: Richardson, Bruce
> Sent: Friday, December 05, 2014 10:39 AM
> To: Ananyev, Konstantin
> Cc: Jean-Mickael Guerin; dev@dpdk.org
> Subject: Re: [PATCH v2] ixgbe: don't override mbuf buffer length
> 
> On Fri, Dec 05, 2014 at 01:10:28AM +0000, Ananyev, Konstantin wrote:
> >
> >
> > > -----Original Message-----
> > > From: Jean-Mickael Guerin [mailto:jean-mickael.guerin@6wind.com]
> > > Sent: Thursday, December 04, 2014 6:09 PM
> > > To: dev@dpdk.org
> > > Cc: Richardson, Bruce; Ananyev, Konstantin
> > > Subject: [PATCH v2] ixgbe: don't override mbuf buffer length
> > >
> > > The template mbuf_initializer is hard coded with a buflen which
> > > might have been set differently by the application at the time of
> > > mbuf pool creation.
> > >
> > > Switch to a mbuf allocation, to fetch the correct default values.
> > > There is no performance impact because this is not a data-plane API.
> > >
> > > Signed-off-by: Jean-Mickael Guerin <jean-mickael.guerin@6wind.com>
> > > Acked-by: David Marchand <david.marchand@6wind.com>
> > > Fixes: 0ff3324da2 ("ixgbe: rework vector pmd following mbuf changes")
> > > ---
> > >
> > >  v2: check returned value of ixgbe_rxq_vec_setup
> > >
> > >  lib/librte_pmd_ixgbe/ixgbe_rxtx.c     |  5 ++++-
> > >  lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c | 19 ++++++++++++-------
> > >  2 files changed, 16 insertions(+), 8 deletions(-)
> > >
> > > diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> > > index 5c36bff..7994da1 100644
> > > --- a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> > > +++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> > > @@ -2244,7 +2244,10 @@ ixgbe_dev_rx_queue_setup(struct rte_eth_dev *dev,
> > >  	use_def_burst_func = check_rx_burst_bulk_alloc_preconditions(rxq);
> > >
> > >  #ifdef RTE_IXGBE_INC_VECTOR
> > > -	ixgbe_rxq_vec_setup(rxq);
> > > +	if (ixgbe_rxq_vec_setup(rxq) < 0) {
> > > +		ixgbe_rx_queue_release(rxq);
> > > +		return (-ENOMEM);
> > > +	}
> > >  #endif
> > >  	/* Check if pre-conditions are satisfied, and no Scattered Rx */
> > >  	if (!use_def_burst_func && !dev->data->scattered_rx) {
> > > diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c b/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
> > > index c1b5a78..f7b02f5 100644
> > > --- a/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
> > > +++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
> > > @@ -732,17 +732,22 @@ static struct ixgbe_txq_ops vec_txq_ops = {
> > >  int
> > >  ixgbe_rxq_vec_setup(struct igb_rx_queue *rxq)
> > >  {
> > > -	struct rte_mbuf mb_def = { .buf_addr = 0 }; /* zeroed mbuf */
> > > +	struct rte_mbuf *mb_def;
> > >
> > > -	mb_def.nb_segs = 1;
> > > -	mb_def.data_off = RTE_PKTMBUF_HEADROOM;
> > > -	mb_def.buf_len = rxq->mb_pool->elt_size - sizeof(struct rte_mbuf);
> > > -	mb_def.port = rxq->port_id;
> > > -	rte_mbuf_refcnt_set(&mb_def, 1);
> > > +	mb_def = rte_pktmbuf_alloc(rxq->mb_pool);
> > > +	if (mb_def == NULL) {
> > > +		PMD_INIT_LOG(ERR, "ixgbe_rxq_vec_setup: could not allocate one mbuf");
> > > +		return -1;
> > > +	}
> > > +	/* nb_segs, refcnt, data_off and buf_len are already set */
> > > +	mb_def->port = rxq->port_id;
> > >
> > >  	/* prevent compiler reordering: rearm_data covers previous fields */
> > >  	rte_compiler_barrier();
> > > -	rxq->mbuf_initializer = *((uint64_t *)&mb_def.rearm_data);
> > > +	rxq->mbuf_initializer = *((uint64_t *)&mb_def->rearm_data);
> > > +
> > > +	rte_pktmbuf_free(mb_def);
> > > +
> > >  	return 0;
> > >  }
> > >
> > > --
> > > 2.1.3
> >
> > As I said in another mail, I don't think it is a proper fix.
> > What we did here - just changed one assumption to another.
> > Current assumption - all mbuf obj_init() would setup buf_len in exactly the same manner as  rte_pktmbuf_init() does:
> > buf_len = mp->elt_size - sizeof(struct rte_mbuf);
> > New assumption - all mbuf obj_init() would setup buf_len for all mbufs in the pool to the same value.
> > Both assumptions, I believe, are not always correct.
> > Though, probably the new one would be true more often.
> >
> > I still think the proper fix is not to update mbuf's buf_len field at ixgbe_rxq_rearm() at all.
> > We should just leave the original value unmodified.
> > Actually, while looking at ixgbe_rxq_rearm(), I don't see any reason why we need to update buf_len field.
> > It is not the data that need to be rearmed.
> > The fields that need to be rearmed are:
> > uint16_t data_off;
> > uint16_t refcnt
> > uint8_t nb_segs;
> > uint8_t port;
> >
> > 6B in total.
> > We probably would like to keep rearming as one 64bit load/store.
> > Though straight below them we have:
> > uint64_t ol_flags;
> >
> > As RX fully override ol_flags anyway, we can safely overwrite first 2B of it.
> > That would allow us to still read/write whole 64bits and avoid overwriting buf_len.
> > I am talking about something like patch  below.
> > I admit that it looks not so pretty, but I think it is much safer and correct.
> > Konstantin
> 
> I just don't see this as worthwhile doing. We are looking here at an mbuf pool
> which is to be used for packet buffers for RX. If the packet buffers in that
> pool are all of different sizes then we need to go back and look at other places
> throughout the code too. For instance, we query the buffer length of the mbuf
> pool when initializing the RX queues to determine if we need to enable scattered
> RX. If the mbufs in the pool can potentially be of different sizes, we need to
> turn off the no-scattered-packets optimization and always use the scatter
> packets code path - because the assumption that buffers don't get resized could
> also be false. Similarly here, if the mbufs are going to be of different
> sizes then the user should disable the vector PMD, and use RX code that doesn't
> override the buf_len each time.

Yes, all buffers in the pool supposed to be not smaller than some threshold.
But it doesn't mean that buffer can't be bigger.
Let say our usual case - buf_len = 2K.
Why it should be prohibited to use for RX mbuf with buf_len == 4K?
There is absolutely nothing wrong with it. 

> 
> Furthermore, we still don't have an actual use-case where the user would want to
> have different size mbufs in an mbuf pool used for RX. They can still have
> variable-sized mbufs in other pools, but having all buffers the same size in the
> pool used to receive packets seems a perfectly fair restriction to have. If
> someone has an app that they are creating that needs this functionality, I'll
> reconsider this opinion, but right now this is all theoretical.

Well, the question is why do we need to override buf_len at all?
At first place, it doesn't look correct.
Second - what advantage we are gaining from it?
Performance?
I tried with the changes below and didn't see any performance difference at all. 
So what is the point in keeping that potential point of failure?

About use cases - I think I gave two examples in my previous mail.
Not sure why you consider them artificial.
From other side, do we have right now any actual use-case where people setup buf_len to something
different then: mp->elt_size - sizeof(struct rte_mbuf)?

Konstantin

> 
> Regards,
> /Bruce
> 
> >
> > --- a/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
> > +++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
> > @@ -79,13 +79,19 @@ ixgbe_rxq_rearm(struct igb_rx_queue *rxq)
> >         /* Initialize the mbufs in vector, process 2 mbufs in one loop */
> >         for (i = 0; i < RTE_IXGBE_RXQ_REARM_THRESH; i += 2, rxep += 2) {
> >                 __m128i vaddr0, vaddr1;
> > +               uintptr_t p0, p1;
> >
> >                 mb0 = rxep[0].mbuf;
> >                 mb1 = rxep[1].mbuf;
> >
> >                 /* flush mbuf with pkt template */
> > -               mb0->rearm_data[0] = rxq->mbuf_initializer;
> > -               mb1->rearm_data[0] = rxq->mbuf_initializer;
> > +               p0 = (uintptr_t)&mb0->data_off;
> > +               *(uint64_t *)p0 = rxq->mbuf_initializer;
> > +               p1 = (uintptr_t)&mb1->data_off;
> > +               *(uint64_t *)p1 = rxq->mbuf_initializer;
> > +
> > +               //mb0->rearm_data[0] = rxq->mbuf_initializer;
> > +               //mb1->rearm_data[0] = rxq->mbuf_initializer;
> >
> >                 /* load buf_addr(lo 64bit) and buf_physaddr(hi 64bit) */
> >                 vaddr0 = _mm_loadu_si128((__m128i *)&(mb0->buf_addr));
> > @@ -732,6 +738,7 @@ static struct ixgbe_txq_ops vec_txq_ops = {
> >  int
> >  ixgbe_rxq_vec_setup(struct igb_rx_queue *rxq)
> >  {
> > +       uintptr_t p;
> >         struct rte_mbuf mb_def = { .buf_addr = 0 }; /* zeroed mbuf */
> >
> >         mb_def.nb_segs = 1;
> > @@ -739,7 +746,8 @@ ixgbe_rxq_vec_setup(struct igb_rx_queue *rxq)
> >         mb_def.buf_len = rxq->mb_pool->elt_size - sizeof(struct rte_mbuf);
> >         mb_def.port = rxq->port_id;
> >         rte_mbuf_refcnt_set(&mb_def, 1);
> > -       rxq->mbuf_initializer = *((uint64_t *)&mb_def.rearm_data);
> > +       p = (uintptr_t)&mb_def.data_off;
> > +       rxq->mbuf_initializer = *(uint64_t *)p;
> >         return 0;
> >  }
> >
> >
  
Bruce Richardson Dec. 5, 2014, 11:28 a.m. UTC | #3
On Fri, Dec 05, 2014 at 11:19:11AM +0000, Ananyev, Konstantin wrote:
> 
> 
> > -----Original Message-----
> > From: Richardson, Bruce
> > Sent: Friday, December 05, 2014 10:39 AM
> > To: Ananyev, Konstantin
> > Cc: Jean-Mickael Guerin; dev@dpdk.org
> > Subject: Re: [PATCH v2] ixgbe: don't override mbuf buffer length
> > 
> > On Fri, Dec 05, 2014 at 01:10:28AM +0000, Ananyev, Konstantin wrote:
> > >
> > >
> > > > -----Original Message-----
> > > > From: Jean-Mickael Guerin [mailto:jean-mickael.guerin@6wind.com]
> > > > Sent: Thursday, December 04, 2014 6:09 PM
> > > > To: dev@dpdk.org
> > > > Cc: Richardson, Bruce; Ananyev, Konstantin
> > > > Subject: [PATCH v2] ixgbe: don't override mbuf buffer length
> > > >
> > > > The template mbuf_initializer is hard coded with a buflen which
> > > > might have been set differently by the application at the time of
> > > > mbuf pool creation.
> > > >
> > > > Switch to a mbuf allocation, to fetch the correct default values.
> > > > There is no performance impact because this is not a data-plane API.
> > > >
> > > > Signed-off-by: Jean-Mickael Guerin <jean-mickael.guerin@6wind.com>
> > > > Acked-by: David Marchand <david.marchand@6wind.com>
> > > > Fixes: 0ff3324da2 ("ixgbe: rework vector pmd following mbuf changes")
> > > > ---
> > > >
> > > >  v2: check returned value of ixgbe_rxq_vec_setup
> > > >
> > > >  lib/librte_pmd_ixgbe/ixgbe_rxtx.c     |  5 ++++-
> > > >  lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c | 19 ++++++++++++-------
> > > >  2 files changed, 16 insertions(+), 8 deletions(-)
> > > >
> > > > diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> > > > index 5c36bff..7994da1 100644
> > > > --- a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> > > > +++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> > > > @@ -2244,7 +2244,10 @@ ixgbe_dev_rx_queue_setup(struct rte_eth_dev *dev,
> > > >  	use_def_burst_func = check_rx_burst_bulk_alloc_preconditions(rxq);
> > > >
> > > >  #ifdef RTE_IXGBE_INC_VECTOR
> > > > -	ixgbe_rxq_vec_setup(rxq);
> > > > +	if (ixgbe_rxq_vec_setup(rxq) < 0) {
> > > > +		ixgbe_rx_queue_release(rxq);
> > > > +		return (-ENOMEM);
> > > > +	}
> > > >  #endif
> > > >  	/* Check if pre-conditions are satisfied, and no Scattered Rx */
> > > >  	if (!use_def_burst_func && !dev->data->scattered_rx) {
> > > > diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c b/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
> > > > index c1b5a78..f7b02f5 100644
> > > > --- a/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
> > > > +++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
> > > > @@ -732,17 +732,22 @@ static struct ixgbe_txq_ops vec_txq_ops = {
> > > >  int
> > > >  ixgbe_rxq_vec_setup(struct igb_rx_queue *rxq)
> > > >  {
> > > > -	struct rte_mbuf mb_def = { .buf_addr = 0 }; /* zeroed mbuf */
> > > > +	struct rte_mbuf *mb_def;
> > > >
> > > > -	mb_def.nb_segs = 1;
> > > > -	mb_def.data_off = RTE_PKTMBUF_HEADROOM;
> > > > -	mb_def.buf_len = rxq->mb_pool->elt_size - sizeof(struct rte_mbuf);
> > > > -	mb_def.port = rxq->port_id;
> > > > -	rte_mbuf_refcnt_set(&mb_def, 1);
> > > > +	mb_def = rte_pktmbuf_alloc(rxq->mb_pool);
> > > > +	if (mb_def == NULL) {
> > > > +		PMD_INIT_LOG(ERR, "ixgbe_rxq_vec_setup: could not allocate one mbuf");
> > > > +		return -1;
> > > > +	}
> > > > +	/* nb_segs, refcnt, data_off and buf_len are already set */
> > > > +	mb_def->port = rxq->port_id;
> > > >
> > > >  	/* prevent compiler reordering: rearm_data covers previous fields */
> > > >  	rte_compiler_barrier();
> > > > -	rxq->mbuf_initializer = *((uint64_t *)&mb_def.rearm_data);
> > > > +	rxq->mbuf_initializer = *((uint64_t *)&mb_def->rearm_data);
> > > > +
> > > > +	rte_pktmbuf_free(mb_def);
> > > > +
> > > >  	return 0;
> > > >  }
> > > >
> > > > --
> > > > 2.1.3
> > >
> > > As I said in another mail, I don't think it is a proper fix.
> > > What we did here - just changed one assumption to another.
> > > Current assumption - all mbuf obj_init() would setup buf_len in exactly the same manner as  rte_pktmbuf_init() does:
> > > buf_len = mp->elt_size - sizeof(struct rte_mbuf);
> > > New assumption - all mbuf obj_init() would setup buf_len for all mbufs in the pool to the same value.
> > > Both assumptions, I believe, are not always correct.
> > > Though, probably the new one would be true more often.
> > >
> > > I still think the proper fix is not to update mbuf's buf_len field at ixgbe_rxq_rearm() at all.
> > > We should just leave the original value unmodified.
> > > Actually, while looking at ixgbe_rxq_rearm(), I don't see any reason why we need to update buf_len field.
> > > It is not the data that need to be rearmed.
> > > The fields that need to be rearmed are:
> > > uint16_t data_off;
> > > uint16_t refcnt
> > > uint8_t nb_segs;
> > > uint8_t port;
> > >
> > > 6B in total.
> > > We probably would like to keep rearming as one 64bit load/store.
> > > Though straight below them we have:
> > > uint64_t ol_flags;
> > >
> > > As RX fully override ol_flags anyway, we can safely overwrite first 2B of it.
> > > That would allow us to still read/write whole 64bits and avoid overwriting buf_len.
> > > I am talking about something like patch  below.
> > > I admit that it looks not so pretty, but I think it is much safer and correct.
> > > Konstantin
> > 
> > I just don't see this as worthwhile doing. We are looking here at an mbuf pool
> > which is to be used for packet buffers for RX. If the packet buffers in that
> > pool are all of different sizes then we need to go back and look at other places
> > throughout the code too. For instance, we query the buffer length of the mbuf
> > pool when initializing the RX queues to determine if we need to enable scattered
> > RX. If the mbufs in the pool can potentially be of different sizes, we need to
> > turn off the no-scattered-packets optimization and always use the scatter
> > packets code path - because the assumption that buffers don't get resized could
> > also be false. Similarly here, if the mbufs are going to be of different
> > sizes then the user should disable the vector PMD, and use RX code that doesn't
> > override the buf_len each time.
> 
> Yes, all buffers in the pool supposed to be not smaller than some threshold.
> But it doesn't mean that buffer can't be bigger.
> Let say our usual case - buf_len = 2K.
> Why it should be prohibited to use for RX mbuf with buf_len == 4K?
> There is absolutely nothing wrong with it. 
> 
> > 
> > Furthermore, we still don't have an actual use-case where the user would want to
> > have different size mbufs in an mbuf pool used for RX. They can still have
> > variable-sized mbufs in other pools, but having all buffers the same size in the
> > pool used to receive packets seems a perfectly fair restriction to have. If
> > someone has an app that they are creating that needs this functionality, I'll
> > reconsider this opinion, but right now this is all theoretical.
> 
> Well, the question is why do we need to override buf_len at all?
> At first place, it doesn't look correct.
> Second - what advantage we are gaining from it?
> Performance?
> I tried with the changes below and didn't see any performance difference at all. 
> So what is the point in keeping that potential point of failure?
> 
> About use cases - I think I gave two examples in my previous mail.
> Not sure why you consider them artificial.
> From other side, do we have right now any actual use-case where people setup buf_len to something
> different then: mp->elt_size - sizeof(struct rte_mbuf)?
> 
> Konstantin
>
Ok, thanks for the info Konstantin - I assume there is a current case where the
mbuf is being initialized to something other than the default, on the basis that
this patch was proposed as a fix for it.

However, rather than argue this further, I'll defer to whatever the rest of the
community decides. Thomas, Jean-Mickael, David - what are your opinions on this?

/Bruce
  
Ananyev, Konstantin Dec. 5, 2014, 11:59 a.m. UTC | #4
> -----Original Message-----
> From: Richardson, Bruce
> Sent: Friday, December 05, 2014 11:29 AM
> To: Ananyev, Konstantin
> Cc: Jean-Mickael Guerin; dev@dpdk.org
> Subject: Re: [PATCH v2] ixgbe: don't override mbuf buffer length
> 
> On Fri, Dec 05, 2014 at 11:19:11AM +0000, Ananyev, Konstantin wrote:
> >
> >
> > > -----Original Message-----
> > > From: Richardson, Bruce
> > > Sent: Friday, December 05, 2014 10:39 AM
> > > To: Ananyev, Konstantin
> > > Cc: Jean-Mickael Guerin; dev@dpdk.org
> > > Subject: Re: [PATCH v2] ixgbe: don't override mbuf buffer length
> > >
> > > On Fri, Dec 05, 2014 at 01:10:28AM +0000, Ananyev, Konstantin wrote:
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > From: Jean-Mickael Guerin [mailto:jean-mickael.guerin@6wind.com]
> > > > > Sent: Thursday, December 04, 2014 6:09 PM
> > > > > To: dev@dpdk.org
> > > > > Cc: Richardson, Bruce; Ananyev, Konstantin
> > > > > Subject: [PATCH v2] ixgbe: don't override mbuf buffer length
> > > > >
> > > > > The template mbuf_initializer is hard coded with a buflen which
> > > > > might have been set differently by the application at the time of
> > > > > mbuf pool creation.
> > > > >
> > > > > Switch to a mbuf allocation, to fetch the correct default values.
> > > > > There is no performance impact because this is not a data-plane API.
> > > > >
> > > > > Signed-off-by: Jean-Mickael Guerin <jean-mickael.guerin@6wind.com>
> > > > > Acked-by: David Marchand <david.marchand@6wind.com>
> > > > > Fixes: 0ff3324da2 ("ixgbe: rework vector pmd following mbuf changes")
> > > > > ---
> > > > >
> > > > >  v2: check returned value of ixgbe_rxq_vec_setup
> > > > >
> > > > >  lib/librte_pmd_ixgbe/ixgbe_rxtx.c     |  5 ++++-
> > > > >  lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c | 19 ++++++++++++-------
> > > > >  2 files changed, 16 insertions(+), 8 deletions(-)
> > > > >
> > > > > diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> > > > > index 5c36bff..7994da1 100644
> > > > > --- a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> > > > > +++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> > > > > @@ -2244,7 +2244,10 @@ ixgbe_dev_rx_queue_setup(struct rte_eth_dev *dev,
> > > > >  	use_def_burst_func = check_rx_burst_bulk_alloc_preconditions(rxq);
> > > > >
> > > > >  #ifdef RTE_IXGBE_INC_VECTOR
> > > > > -	ixgbe_rxq_vec_setup(rxq);
> > > > > +	if (ixgbe_rxq_vec_setup(rxq) < 0) {
> > > > > +		ixgbe_rx_queue_release(rxq);
> > > > > +		return (-ENOMEM);
> > > > > +	}
> > > > >  #endif
> > > > >  	/* Check if pre-conditions are satisfied, and no Scattered Rx */
> > > > >  	if (!use_def_burst_func && !dev->data->scattered_rx) {
> > > > > diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c b/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
> > > > > index c1b5a78..f7b02f5 100644
> > > > > --- a/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
> > > > > +++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
> > > > > @@ -732,17 +732,22 @@ static struct ixgbe_txq_ops vec_txq_ops = {
> > > > >  int
> > > > >  ixgbe_rxq_vec_setup(struct igb_rx_queue *rxq)
> > > > >  {
> > > > > -	struct rte_mbuf mb_def = { .buf_addr = 0 }; /* zeroed mbuf */
> > > > > +	struct rte_mbuf *mb_def;
> > > > >
> > > > > -	mb_def.nb_segs = 1;
> > > > > -	mb_def.data_off = RTE_PKTMBUF_HEADROOM;
> > > > > -	mb_def.buf_len = rxq->mb_pool->elt_size - sizeof(struct rte_mbuf);
> > > > > -	mb_def.port = rxq->port_id;
> > > > > -	rte_mbuf_refcnt_set(&mb_def, 1);
> > > > > +	mb_def = rte_pktmbuf_alloc(rxq->mb_pool);
> > > > > +	if (mb_def == NULL) {
> > > > > +		PMD_INIT_LOG(ERR, "ixgbe_rxq_vec_setup: could not allocate one mbuf");
> > > > > +		return -1;
> > > > > +	}
> > > > > +	/* nb_segs, refcnt, data_off and buf_len are already set */
> > > > > +	mb_def->port = rxq->port_id;
> > > > >
> > > > >  	/* prevent compiler reordering: rearm_data covers previous fields */
> > > > >  	rte_compiler_barrier();
> > > > > -	rxq->mbuf_initializer = *((uint64_t *)&mb_def.rearm_data);
> > > > > +	rxq->mbuf_initializer = *((uint64_t *)&mb_def->rearm_data);
> > > > > +
> > > > > +	rte_pktmbuf_free(mb_def);
> > > > > +
> > > > >  	return 0;
> > > > >  }
> > > > >
> > > > > --
> > > > > 2.1.3
> > > >
> > > > As I said in another mail, I don't think it is a proper fix.
> > > > What we did here - just changed one assumption to another.
> > > > Current assumption - all mbuf obj_init() would setup buf_len in exactly the same manner as  rte_pktmbuf_init() does:
> > > > buf_len = mp->elt_size - sizeof(struct rte_mbuf);
> > > > New assumption - all mbuf obj_init() would setup buf_len for all mbufs in the pool to the same value.
> > > > Both assumptions, I believe, are not always correct.
> > > > Though, probably the new one would be true more often.
> > > >
> > > > I still think the proper fix is not to update mbuf's buf_len field at ixgbe_rxq_rearm() at all.
> > > > We should just leave the original value unmodified.
> > > > Actually, while looking at ixgbe_rxq_rearm(), I don't see any reason why we need to update buf_len field.
> > > > It is not the data that need to be rearmed.
> > > > The fields that need to be rearmed are:
> > > > uint16_t data_off;
> > > > uint16_t refcnt
> > > > uint8_t nb_segs;
> > > > uint8_t port;
> > > >
> > > > 6B in total.
> > > > We probably would like to keep rearming as one 64bit load/store.
> > > > Though straight below them we have:
> > > > uint64_t ol_flags;
> > > >
> > > > As RX fully override ol_flags anyway, we can safely overwrite first 2B of it.
> > > > That would allow us to still read/write whole 64bits and avoid overwriting buf_len.
> > > > I am talking about something like patch  below.
> > > > I admit that it looks not so pretty, but I think it is much safer and correct.
> > > > Konstantin
> > >
> > > I just don't see this as worthwhile doing. We are looking here at an mbuf pool
> > > which is to be used for packet buffers for RX. If the packet buffers in that
> > > pool are all of different sizes then we need to go back and look at other places
> > > throughout the code too. For instance, we query the buffer length of the mbuf
> > > pool when initializing the RX queues to determine if we need to enable scattered
> > > RX. If the mbufs in the pool can potentially be of different sizes, we need to
> > > turn off the no-scattered-packets optimization and always use the scatter
> > > packets code path - because the assumption that buffers don't get resized could
> > > also be false. Similarly here, if the mbufs are going to be of different
> > > sizes then the user should disable the vector PMD, and use RX code that doesn't
> > > override the buf_len each time.
> >
> > Yes, all buffers in the pool supposed to be not smaller than some threshold.
> > But it doesn't mean that buffer can't be bigger.
> > Let say our usual case - buf_len = 2K.
> > Why it should be prohibited to use for RX mbuf with buf_len == 4K?
> > There is absolutely nothing wrong with it.
> >
> > >
> > > Furthermore, we still don't have an actual use-case where the user would want to
> > > have different size mbufs in an mbuf pool used for RX. They can still have
> > > variable-sized mbufs in other pools, but having all buffers the same size in the
> > > pool used to receive packets seems a perfectly fair restriction to have. If
> > > someone has an app that they are creating that needs this functionality, I'll
> > > reconsider this opinion, but right now this is all theoretical.
> >
> > Well, the question is why do we need to override buf_len at all?
> > At first place, it doesn't look correct.
> > Second - what advantage we are gaining from it?
> > Performance?
> > I tried with the changes below and didn't see any performance difference at all.
> > So what is the point in keeping that potential point of failure?
> >
> > About use cases - I think I gave two examples in my previous mail.
> > Not sure why you consider them artificial.
> > From other side, do we have right now any actual use-case where people setup buf_len to something
> > different then: mp->elt_size - sizeof(struct rte_mbuf)?
> >
> > Konstantin
> >
> Ok, thanks for the info Konstantin - I assume there is a current case where the
> mbuf is being initialized to something other than the default, on the basis that
> this patch was proposed as a fix for it.
> 
> However, rather than argue this further, I'll defer to whatever the rest of the
> community decides. Thomas, Jean-Mickael, David - what are your opinions on this?

That's ok, but I still wonder why do we need to keep buf_len under rearm_data marker and overwrite it for each RX?
Honestly, probably there exist some obvious reason, that I am missing?
To keep code tidy and easier to read?
Or to keep whole (uint64_t)rearm_data on 64bits boundary?
That are all good reasons, but I suppose data integrity is more important, right?

Konstantin
  
Bruce Richardson Dec. 5, 2014, 12:10 p.m. UTC | #5
On Fri, Dec 05, 2014 at 11:59:50AM +0000, Ananyev, Konstantin wrote:
> 
> 
> > -----Original Message-----
> > From: Richardson, Bruce
> > Sent: Friday, December 05, 2014 11:29 AM
> > To: Ananyev, Konstantin
> > Cc: Jean-Mickael Guerin; dev@dpdk.org
> > Subject: Re: [PATCH v2] ixgbe: don't override mbuf buffer length
> > 
> > On Fri, Dec 05, 2014 at 11:19:11AM +0000, Ananyev, Konstantin wrote:
> > >
> > >
> > > > -----Original Message-----
> > > > From: Richardson, Bruce
> > > > Sent: Friday, December 05, 2014 10:39 AM
> > > > To: Ananyev, Konstantin
> > > > Cc: Jean-Mickael Guerin; dev@dpdk.org
> > > > Subject: Re: [PATCH v2] ixgbe: don't override mbuf buffer length
> > > >
> > > > On Fri, Dec 05, 2014 at 01:10:28AM +0000, Ananyev, Konstantin wrote:
> > > > >
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Jean-Mickael Guerin [mailto:jean-mickael.guerin@6wind.com]
> > > > > > Sent: Thursday, December 04, 2014 6:09 PM
> > > > > > To: dev@dpdk.org
> > > > > > Cc: Richardson, Bruce; Ananyev, Konstantin
> > > > > > Subject: [PATCH v2] ixgbe: don't override mbuf buffer length
> > > > > >
> > > > > > The template mbuf_initializer is hard coded with a buflen which
> > > > > > might have been set differently by the application at the time of
> > > > > > mbuf pool creation.
> > > > > >
> > > > > > Switch to a mbuf allocation, to fetch the correct default values.
> > > > > > There is no performance impact because this is not a data-plane API.
> > > > > >
> > > > > > Signed-off-by: Jean-Mickael Guerin <jean-mickael.guerin@6wind.com>
> > > > > > Acked-by: David Marchand <david.marchand@6wind.com>
> > > > > > Fixes: 0ff3324da2 ("ixgbe: rework vector pmd following mbuf changes")
> > > > > > ---
> > > > > >
> > > > > >  v2: check returned value of ixgbe_rxq_vec_setup
> > > > > >
> > > > > >  lib/librte_pmd_ixgbe/ixgbe_rxtx.c     |  5 ++++-
> > > > > >  lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c | 19 ++++++++++++-------
> > > > > >  2 files changed, 16 insertions(+), 8 deletions(-)
> > > > > >
> > > > > > diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> > > > > > index 5c36bff..7994da1 100644
> > > > > > --- a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> > > > > > +++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> > > > > > @@ -2244,7 +2244,10 @@ ixgbe_dev_rx_queue_setup(struct rte_eth_dev *dev,
> > > > > >  	use_def_burst_func = check_rx_burst_bulk_alloc_preconditions(rxq);
> > > > > >
> > > > > >  #ifdef RTE_IXGBE_INC_VECTOR
> > > > > > -	ixgbe_rxq_vec_setup(rxq);
> > > > > > +	if (ixgbe_rxq_vec_setup(rxq) < 0) {
> > > > > > +		ixgbe_rx_queue_release(rxq);
> > > > > > +		return (-ENOMEM);
> > > > > > +	}
> > > > > >  #endif
> > > > > >  	/* Check if pre-conditions are satisfied, and no Scattered Rx */
> > > > > >  	if (!use_def_burst_func && !dev->data->scattered_rx) {
> > > > > > diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c b/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
> > > > > > index c1b5a78..f7b02f5 100644
> > > > > > --- a/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
> > > > > > +++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
> > > > > > @@ -732,17 +732,22 @@ static struct ixgbe_txq_ops vec_txq_ops = {
> > > > > >  int
> > > > > >  ixgbe_rxq_vec_setup(struct igb_rx_queue *rxq)
> > > > > >  {
> > > > > > -	struct rte_mbuf mb_def = { .buf_addr = 0 }; /* zeroed mbuf */
> > > > > > +	struct rte_mbuf *mb_def;
> > > > > >
> > > > > > -	mb_def.nb_segs = 1;
> > > > > > -	mb_def.data_off = RTE_PKTMBUF_HEADROOM;
> > > > > > -	mb_def.buf_len = rxq->mb_pool->elt_size - sizeof(struct rte_mbuf);
> > > > > > -	mb_def.port = rxq->port_id;
> > > > > > -	rte_mbuf_refcnt_set(&mb_def, 1);
> > > > > > +	mb_def = rte_pktmbuf_alloc(rxq->mb_pool);
> > > > > > +	if (mb_def == NULL) {
> > > > > > +		PMD_INIT_LOG(ERR, "ixgbe_rxq_vec_setup: could not allocate one mbuf");
> > > > > > +		return -1;
> > > > > > +	}
> > > > > > +	/* nb_segs, refcnt, data_off and buf_len are already set */
> > > > > > +	mb_def->port = rxq->port_id;
> > > > > >
> > > > > >  	/* prevent compiler reordering: rearm_data covers previous fields */
> > > > > >  	rte_compiler_barrier();
> > > > > > -	rxq->mbuf_initializer = *((uint64_t *)&mb_def.rearm_data);
> > > > > > +	rxq->mbuf_initializer = *((uint64_t *)&mb_def->rearm_data);
> > > > > > +
> > > > > > +	rte_pktmbuf_free(mb_def);
> > > > > > +
> > > > > >  	return 0;
> > > > > >  }
> > > > > >
> > > > > > --
> > > > > > 2.1.3
> > > > >
> > > > > As I said in another mail, I don't think it is a proper fix.
> > > > > What we did here - just changed one assumption to another.
> > > > > Current assumption - all mbuf obj_init() would setup buf_len in exactly the same manner as  rte_pktmbuf_init() does:
> > > > > buf_len = mp->elt_size - sizeof(struct rte_mbuf);
> > > > > New assumption - all mbuf obj_init() would setup buf_len for all mbufs in the pool to the same value.
> > > > > Both assumptions, I believe, are not always correct.
> > > > > Though, probably the new one would be true more often.
> > > > >
> > > > > I still think the proper fix is not to update mbuf's buf_len field at ixgbe_rxq_rearm() at all.
> > > > > We should just leave the original value unmodified.
> > > > > Actually, while looking at ixgbe_rxq_rearm(), I don't see any reason why we need to update buf_len field.
> > > > > It is not the data that need to be rearmed.
> > > > > The fields that need to be rearmed are:
> > > > > uint16_t data_off;
> > > > > uint16_t refcnt
> > > > > uint8_t nb_segs;
> > > > > uint8_t port;
> > > > >
> > > > > 6B in total.
> > > > > We probably would like to keep rearming as one 64bit load/store.
> > > > > Though straight below them we have:
> > > > > uint64_t ol_flags;
> > > > >
> > > > > As RX fully override ol_flags anyway, we can safely overwrite first 2B of it.
> > > > > That would allow us to still read/write whole 64bits and avoid overwriting buf_len.
> > > > > I am talking about something like patch  below.
> > > > > I admit that it looks not so pretty, but I think it is much safer and correct.
> > > > > Konstantin
> > > >
> > > > I just don't see this as worthwhile doing. We are looking here at an mbuf pool
> > > > which is to be used for packet buffers for RX. If the packet buffers in that
> > > > pool are all of different sizes then we need to go back and look at other places
> > > > throughout the code too. For instance, we query the buffer length of the mbuf
> > > > pool when initializing the RX queues to determine if we need to enable scattered
> > > > RX. If the mbufs in the pool can potentially be of different sizes, we need to
> > > > turn off the no-scattered-packets optimization and always use the scatter
> > > > packets code path - because the assumption that buffers don't get resized could
> > > > also be false. Similarly here, if the mbufs are going to be of different
> > > > sizes then the user should disable the vector PMD, and use RX code that doesn't
> > > > override the buf_len each time.
> > >
> > > Yes, all buffers in the pool supposed to be not smaller than some threshold.
> > > But it doesn't mean that buffer can't be bigger.
> > > Let say our usual case - buf_len = 2K.
> > > Why it should be prohibited to use for RX mbuf with buf_len == 4K?
> > > There is absolutely nothing wrong with it.
> > >
> > > >
> > > > Furthermore, we still don't have an actual use-case where the user would want to
> > > > have different size mbufs in an mbuf pool used for RX. They can still have
> > > > variable-sized mbufs in other pools, but having all buffers the same size in the
> > > > pool used to receive packets seems a perfectly fair restriction to have. If
> > > > someone has an app that they are creating that needs this functionality, I'll
> > > > reconsider this opinion, but right now this is all theoretical.
> > >
> > > Well, the question is why do we need to override buf_len at all?
> > > At first place, it doesn't look correct.
> > > Second - what advantage we are gaining from it?
> > > Performance?
> > > I tried with the changes below and didn't see any performance difference at all.
> > > So what is the point in keeping that potential point of failure?
> > >
> > > About use cases - I think I gave two examples in my previous mail.
> > > Not sure why you consider them artificial.
> > > From other side, do we have right now any actual use-case where people setup buf_len to something
> > > different then: mp->elt_size - sizeof(struct rte_mbuf)?
> > >
> > > Konstantin
> > >
> > Ok, thanks for the info Konstantin - I assume there is a current case where the
> > mbuf is being initialized to something other than the default, on the basis that
> > this patch was proposed as a fix for it.
> > 
> > However, rather than argue this further, I'll defer to whatever the rest of the
> > community decides. Thomas, Jean-Mickael, David - what are your opinions on this?
> 
> That's ok, but I still wonder why do we need to keep buf_len under rearm_data marker and overwrite it for each RX?
> Honestly, probably there exist some obvious reason, that I am missing?
> To keep code tidy and easier to read?
> Or to keep whole (uint64_t)rearm_data on 64bits boundary?
> That are all good reasons, but I suppose data integrity is more important, right?
>
Yes, those are the reasons it was put there (easy to read, aligned), but also 
because it was viewed as being constant for a queue, i.e. the case where you 
would have different sized mbufs in a pool was never considered. [The first line 
of our documentation for the mempools is "A memory pool is an allocator of a 
fixed-sized object", and everything is based around that idea of fixed size 
object pools.]
When doing the mbuf redesign, I pulled together what were the fields that were
constant for each queue, and grouped them to a single 8-byte write to give
faster (and shorter in terms of code) mbuf initialization.

/Bruce
  
Ananyev, Konstantin Dec. 5, 2014, 3:23 p.m. UTC | #6
> -----Original Message-----
> From: Richardson, Bruce
> Sent: Friday, December 05, 2014 12:11 PM
> To: Ananyev, Konstantin
> Cc: Jean-Mickael Guerin; dev@dpdk.org
> Subject: Re: [PATCH v2] ixgbe: don't override mbuf buffer length
> 
> On Fri, Dec 05, 2014 at 11:59:50AM +0000, Ananyev, Konstantin wrote:
> >
> >
> > > -----Original Message-----
> > > From: Richardson, Bruce
> > > Sent: Friday, December 05, 2014 11:29 AM
> > > To: Ananyev, Konstantin
> > > Cc: Jean-Mickael Guerin; dev@dpdk.org
> > > Subject: Re: [PATCH v2] ixgbe: don't override mbuf buffer length
> > >
> > > On Fri, Dec 05, 2014 at 11:19:11AM +0000, Ananyev, Konstantin wrote:
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > From: Richardson, Bruce
> > > > > Sent: Friday, December 05, 2014 10:39 AM
> > > > > To: Ananyev, Konstantin
> > > > > Cc: Jean-Mickael Guerin; dev@dpdk.org
> > > > > Subject: Re: [PATCH v2] ixgbe: don't override mbuf buffer length
> > > > >
> > > > > On Fri, Dec 05, 2014 at 01:10:28AM +0000, Ananyev, Konstantin wrote:
> > > > > >
> > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: Jean-Mickael Guerin [mailto:jean-mickael.guerin@6wind.com]
> > > > > > > Sent: Thursday, December 04, 2014 6:09 PM
> > > > > > > To: dev@dpdk.org
> > > > > > > Cc: Richardson, Bruce; Ananyev, Konstantin
> > > > > > > Subject: [PATCH v2] ixgbe: don't override mbuf buffer length
> > > > > > >
> > > > > > > The template mbuf_initializer is hard coded with a buflen which
> > > > > > > might have been set differently by the application at the time of
> > > > > > > mbuf pool creation.
> > > > > > >
> > > > > > > Switch to a mbuf allocation, to fetch the correct default values.
> > > > > > > There is no performance impact because this is not a data-plane API.
> > > > > > >
> > > > > > > Signed-off-by: Jean-Mickael Guerin <jean-mickael.guerin@6wind.com>
> > > > > > > Acked-by: David Marchand <david.marchand@6wind.com>
> > > > > > > Fixes: 0ff3324da2 ("ixgbe: rework vector pmd following mbuf changes")
> > > > > > > ---
> > > > > > >
> > > > > > >  v2: check returned value of ixgbe_rxq_vec_setup
> > > > > > >
> > > > > > >  lib/librte_pmd_ixgbe/ixgbe_rxtx.c     |  5 ++++-
> > > > > > >  lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c | 19 ++++++++++++-------
> > > > > > >  2 files changed, 16 insertions(+), 8 deletions(-)
> > > > > > >
> > > > > > > diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> > > > > > > index 5c36bff..7994da1 100644
> > > > > > > --- a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> > > > > > > +++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> > > > > > > @@ -2244,7 +2244,10 @@ ixgbe_dev_rx_queue_setup(struct rte_eth_dev *dev,
> > > > > > >  	use_def_burst_func = check_rx_burst_bulk_alloc_preconditions(rxq);
> > > > > > >
> > > > > > >  #ifdef RTE_IXGBE_INC_VECTOR
> > > > > > > -	ixgbe_rxq_vec_setup(rxq);
> > > > > > > +	if (ixgbe_rxq_vec_setup(rxq) < 0) {
> > > > > > > +		ixgbe_rx_queue_release(rxq);
> > > > > > > +		return (-ENOMEM);
> > > > > > > +	}
> > > > > > >  #endif
> > > > > > >  	/* Check if pre-conditions are satisfied, and no Scattered Rx */
> > > > > > >  	if (!use_def_burst_func && !dev->data->scattered_rx) {
> > > > > > > diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c b/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
> > > > > > > index c1b5a78..f7b02f5 100644
> > > > > > > --- a/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
> > > > > > > +++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
> > > > > > > @@ -732,17 +732,22 @@ static struct ixgbe_txq_ops vec_txq_ops = {
> > > > > > >  int
> > > > > > >  ixgbe_rxq_vec_setup(struct igb_rx_queue *rxq)
> > > > > > >  {
> > > > > > > -	struct rte_mbuf mb_def = { .buf_addr = 0 }; /* zeroed mbuf */
> > > > > > > +	struct rte_mbuf *mb_def;
> > > > > > >
> > > > > > > -	mb_def.nb_segs = 1;
> > > > > > > -	mb_def.data_off = RTE_PKTMBUF_HEADROOM;
> > > > > > > -	mb_def.buf_len = rxq->mb_pool->elt_size - sizeof(struct rte_mbuf);
> > > > > > > -	mb_def.port = rxq->port_id;
> > > > > > > -	rte_mbuf_refcnt_set(&mb_def, 1);
> > > > > > > +	mb_def = rte_pktmbuf_alloc(rxq->mb_pool);
> > > > > > > +	if (mb_def == NULL) {
> > > > > > > +		PMD_INIT_LOG(ERR, "ixgbe_rxq_vec_setup: could not allocate one mbuf");
> > > > > > > +		return -1;
> > > > > > > +	}
> > > > > > > +	/* nb_segs, refcnt, data_off and buf_len are already set */
> > > > > > > +	mb_def->port = rxq->port_id;
> > > > > > >
> > > > > > >  	/* prevent compiler reordering: rearm_data covers previous fields */
> > > > > > >  	rte_compiler_barrier();
> > > > > > > -	rxq->mbuf_initializer = *((uint64_t *)&mb_def.rearm_data);
> > > > > > > +	rxq->mbuf_initializer = *((uint64_t *)&mb_def->rearm_data);
> > > > > > > +
> > > > > > > +	rte_pktmbuf_free(mb_def);
> > > > > > > +
> > > > > > >  	return 0;
> > > > > > >  }
> > > > > > >
> > > > > > > --
> > > > > > > 2.1.3
> > > > > >
> > > > > > As I said in another mail, I don't think it is a proper fix.
> > > > > > What we did here - just changed one assumption to another.
> > > > > > Current assumption - all mbuf obj_init() would setup buf_len in exactly the same manner as  rte_pktmbuf_init() does:
> > > > > > buf_len = mp->elt_size - sizeof(struct rte_mbuf);
> > > > > > New assumption - all mbuf obj_init() would setup buf_len for all mbufs in the pool to the same value.
> > > > > > Both assumptions, I believe, are not always correct.
> > > > > > Though, probably the new one would be true more often.
> > > > > >
> > > > > > I still think the proper fix is not to update mbuf's buf_len field at ixgbe_rxq_rearm() at all.
> > > > > > We should just leave the original value unmodified.
> > > > > > Actually, while looking at ixgbe_rxq_rearm(), I don't see any reason why we need to update buf_len field.
> > > > > > It is not the data that need to be rearmed.
> > > > > > The fields that need to be rearmed are:
> > > > > > uint16_t data_off;
> > > > > > uint16_t refcnt
> > > > > > uint8_t nb_segs;
> > > > > > uint8_t port;
> > > > > >
> > > > > > 6B in total.
> > > > > > We probably would like to keep rearming as one 64bit load/store.
> > > > > > Though straight below them we have:
> > > > > > uint64_t ol_flags;
> > > > > >
> > > > > > As RX fully override ol_flags anyway, we can safely overwrite first 2B of it.
> > > > > > That would allow us to still read/write whole 64bits and avoid overwriting buf_len.
> > > > > > I am talking about something like patch  below.
> > > > > > I admit that it looks not so pretty, but I think it is much safer and correct.
> > > > > > Konstantin
> > > > >
> > > > > I just don't see this as worthwhile doing. We are looking here at an mbuf pool
> > > > > which is to be used for packet buffers for RX. If the packet buffers in that
> > > > > pool are all of different sizes then we need to go back and look at other places
> > > > > throughout the code too. For instance, we query the buffer length of the mbuf
> > > > > pool when initializing the RX queues to determine if we need to enable scattered
> > > > > RX. If the mbufs in the pool can potentially be of different sizes, we need to
> > > > > turn off the no-scattered-packets optimization and always use the scatter
> > > > > packets code path - because the assumption that buffers don't get resized could
> > > > > also be false. Similarly here, if the mbufs are going to be of different
> > > > > sizes then the user should disable the vector PMD, and use RX code that doesn't
> > > > > override the buf_len each time.
> > > >
> > > > Yes, all buffers in the pool supposed to be not smaller than some threshold.
> > > > But it doesn't mean that buffer can't be bigger.
> > > > Let say our usual case - buf_len = 2K.
> > > > Why it should be prohibited to use for RX mbuf with buf_len == 4K?
> > > > There is absolutely nothing wrong with it.
> > > >
> > > > >
> > > > > Furthermore, we still don't have an actual use-case where the user would want to
> > > > > have different size mbufs in an mbuf pool used for RX. They can still have
> > > > > variable-sized mbufs in other pools, but having all buffers the same size in the
> > > > > pool used to receive packets seems a perfectly fair restriction to have. If
> > > > > someone has an app that they are creating that needs this functionality, I'll
> > > > > reconsider this opinion, but right now this is all theoretical.
> > > >
> > > > Well, the question is why do we need to override buf_len at all?
> > > > At first place, it doesn't look correct.
> > > > Second - what advantage we are gaining from it?
> > > > Performance?
> > > > I tried with the changes below and didn't see any performance difference at all.
> > > > So what is the point in keeping that potential point of failure?
> > > >
> > > > About use cases - I think I gave two examples in my previous mail.
> > > > Not sure why you consider them artificial.
> > > > From other side, do we have right now any actual use-case where people setup buf_len to something
> > > > different then: mp->elt_size - sizeof(struct rte_mbuf)?
> > > >
> > > > Konstantin
> > > >
> > > Ok, thanks for the info Konstantin - I assume there is a current case where the
> > > mbuf is being initialized to something other than the default, on the basis that
> > > this patch was proposed as a fix for it.
> > >
> > > However, rather than argue this further, I'll defer to whatever the rest of the
> > > community decides. Thomas, Jean-Mickael, David - what are your opinions on this?
> >
> > That's ok, but I still wonder why do we need to keep buf_len under rearm_data marker and overwrite it for each RX?
> > Honestly, probably there exist some obvious reason, that I am missing?
> > To keep code tidy and easier to read?
> > Or to keep whole (uint64_t)rearm_data on 64bits boundary?
> > That are all good reasons, but I suppose data integrity is more important, right?
> >
> Yes, those are the reasons it was put there (easy to read, aligned), but also
> because it was viewed as being constant for a queue, i.e. the case where you
> would have different sized mbufs in a pool was never considered. [The first line
> of our documentation for the mempools is "A memory pool is an allocator of a
> fixed-sized object", and everything is based around that idea of fixed size
> object pools.]
> When doing the mbuf redesign, I pulled together what were the fields that were
> constant for each queue, and grouped them to a single 8-byte write to give
> faster (and shorter in terms of code) mbuf initialization.
> 

Ok, thanks for explanation.
I'll submit the fix I am proposing as RFC patch.
Konstantin
  

Patch

--- a/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
+++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
@@ -79,13 +79,19 @@  ixgbe_rxq_rearm(struct igb_rx_queue *rxq)
        /* Initialize the mbufs in vector, process 2 mbufs in one loop */
        for (i = 0; i < RTE_IXGBE_RXQ_REARM_THRESH; i += 2, rxep += 2) {
                __m128i vaddr0, vaddr1;
+               uintptr_t p0, p1;

                mb0 = rxep[0].mbuf;
                mb1 = rxep[1].mbuf;

                /* flush mbuf with pkt template */
-               mb0->rearm_data[0] = rxq->mbuf_initializer;
-               mb1->rearm_data[0] = rxq->mbuf_initializer;
+               p0 = (uintptr_t)&mb0->data_off;
+               *(uint64_t *)p0 = rxq->mbuf_initializer;
+               p1 = (uintptr_t)&mb1->data_off;
+               *(uint64_t *)p1 = rxq->mbuf_initializer;
+
+               //mb0->rearm_data[0] = rxq->mbuf_initializer;
+               //mb1->rearm_data[0] = rxq->mbuf_initializer;

                /* load buf_addr(lo 64bit) and buf_physaddr(hi 64bit) */
                vaddr0 = _mm_loadu_si128((__m128i *)&(mb0->buf_addr));
@@ -732,6 +738,7 @@  static struct ixgbe_txq_ops vec_txq_ops = {
 int
 ixgbe_rxq_vec_setup(struct igb_rx_queue *rxq)
 {
+       uintptr_t p;
        struct rte_mbuf mb_def = { .buf_addr = 0 }; /* zeroed mbuf */

        mb_def.nb_segs = 1;
@@ -739,7 +746,8 @@  ixgbe_rxq_vec_setup(struct igb_rx_queue *rxq)
        mb_def.buf_len = rxq->mb_pool->elt_size - sizeof(struct rte_mbuf);
        mb_def.port = rxq->port_id;
        rte_mbuf_refcnt_set(&mb_def, 1);
-       rxq->mbuf_initializer = *((uint64_t *)&mb_def.rearm_data);
+       p = (uintptr_t)&mb_def.data_off;
+       rxq->mbuf_initializer = *(uint64_t *)p;
        return 0;
 }