[v4,2/4] mempool: add non-IO flag

Message ID 20211013110131.2909604-3-dkozlyuk@nvidia.com (mailing list archive)
State Superseded, archived
Delegated to: Thomas Monjalon
Headers
Series net/mlx5: implicit mempool registration |

Checks

Context Check Description
ci/checkpatch warning coding style issues

Commit Message

Dmitry Kozlyuk Oct. 13, 2021, 11:01 a.m. UTC
  Mempool is a generic allocator that is not necessarily used for device
IO operations, nor its memory for DMA. Add a MEMPOOL_F_NON_IO flag to mark
such mempools automatically if their objects are not contiguous
or IOVAs are not available. Components can inspect this flag
in order to optimize their memory management.
Discussion: https://mails.dpdk.org/archives/dev/2021-August/216654.html

Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
---
 app/test/test_mempool.c                | 76 ++++++++++++++++++++++++++
 doc/guides/rel_notes/release_21_11.rst |  3 +
 lib/mempool/rte_mempool.c              |  2 +
 lib/mempool/rte_mempool.h              |  5 ++
 4 files changed, 86 insertions(+)
  

Comments

Andrew Rybchenko Oct. 15, 2021, 9:01 a.m. UTC | #1
On 10/13/21 2:01 PM, Dmitry Kozlyuk wrote:
> Mempool is a generic allocator that is not necessarily used for device
> IO operations and its memory for DMA. Add MEMPOOL_F_NON_IO flag to mark
> such mempools automatically if their objects are not contiguous
> or IOVA are not available. Components can inspect this flag
> in order to optimize their memory management.
> Discussion: https://mails.dpdk.org/archives/dev/2021-August/216654.html
> 
> Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
> Acked-by: Matan Azrad <matan@nvidia.com>

See review notes below. With review notes processed:

Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>

[snip]

> diff --git a/doc/guides/rel_notes/release_21_11.rst b/doc/guides/rel_notes/release_21_11.rst
> index f643a61f44..74e0e6f495 100644
> --- a/doc/guides/rel_notes/release_21_11.rst
> +++ b/doc/guides/rel_notes/release_21_11.rst
> @@ -226,6 +226,9 @@ API Changes
>    the crypto/security operation. This field will be used to communicate
>    events such as soft expiry with IPsec in lookaside mode.
>  
> +* mempool: Added ``MEMPOOL_F_NON_IO`` flag to give a hint to DPDK components
> +  that objects from this pool will not be used for device IO (e.g. DMA).
> +
>  
>  ABI Changes
>  -----------
> diff --git a/lib/mempool/rte_mempool.c b/lib/mempool/rte_mempool.c
> index 51c0ba2931..2204f140b3 100644
> --- a/lib/mempool/rte_mempool.c
> +++ b/lib/mempool/rte_mempool.c
> @@ -371,6 +371,8 @@ rte_mempool_populate_iova(struct rte_mempool *mp, char *vaddr,
>  
>  	STAILQ_INSERT_TAIL(&mp->mem_list, memhdr, next);
>  	mp->nb_mem_chunks++;
> +	if (iova == RTE_BAD_IOVA)
> +		mp->flags |= MEMPOOL_F_NON_IO;

As I understand, rte_mempool_populate_iova() may be called
a few times for one mempool. The flag must be set only if all
invocations are done with RTE_BAD_IOVA. So, it should be
set by default and cleared as soon as an invocation with
iova != RTE_BAD_IOVA happens.

Yes, it is a corner case. Maybe it makes sense to
cover it with a unit test as well.

>  
>  	/* Report the mempool as ready only when fully populated. */
>  	if (mp->populated_size >= mp->size)
> diff --git a/lib/mempool/rte_mempool.h b/lib/mempool/rte_mempool.h
> index 663123042f..029b62a650 100644
> --- a/lib/mempool/rte_mempool.h
> +++ b/lib/mempool/rte_mempool.h
> @@ -262,6 +262,8 @@ struct rte_mempool {
>  #define MEMPOOL_F_SC_GET         0x0008 /**< Default get is "single-consumer".*/
>  #define MEMPOOL_F_POOL_CREATED   0x0010 /**< Internal: pool is created. */
>  #define MEMPOOL_F_NO_IOVA_CONTIG 0x0020 /**< Don't need IOVA contiguous objs. */
> +#define MEMPOOL_F_NON_IO         0x0040
> +		/**< Internal: pool is not usable for device IO (DMA). */

Please, put the documentation before the define.
/** Internal: pool is not usable for device IO (DMA). */
#define MEMPOOL_F_NON_IO         0x0040

>  
>  /**
>   * @internal When debug is enabled, store some statistics.
> @@ -991,6 +993,9 @@ typedef void (rte_mempool_ctor_t)(struct rte_mempool *, void *);
>   *     "single-consumer". Otherwise, it is "multi-consumers".
>   *   - MEMPOOL_F_NO_IOVA_CONTIG: If set, allocated objects won't
>   *     necessarily be contiguous in IO memory.
> + *   - MEMPOOL_F_NON_IO: If set, the mempool is considered to be
> + *     never used for device IO, i.e. for DMA operations.
> + *     It's a hint to other components and does not affect the mempool behavior.

I tend to say that it should not be here if the flag is
internal.

>   * @return
>   *   The pointer to the new allocated mempool, on success. NULL on error
>   *   with rte_errno set appropriately. Possible rte_errno values include:
>
  
Dmitry Kozlyuk Oct. 15, 2021, 9:18 a.m. UTC | #2
> -----Original Message-----
> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> [...]
> > diff --git a/lib/mempool/rte_mempool.c b/lib/mempool/rte_mempool.c
> > index 51c0ba2931..2204f140b3 100644
> > --- a/lib/mempool/rte_mempool.c
> > +++ b/lib/mempool/rte_mempool.c
> > @@ -371,6 +371,8 @@ rte_mempool_populate_iova(struct rte_mempool *mp,
> > char *vaddr,
> >
> >       STAILQ_INSERT_TAIL(&mp->mem_list, memhdr, next);
> >       mp->nb_mem_chunks++;
> > +     if (iova == RTE_BAD_IOVA)
> > +             mp->flags |= MEMPOOL_F_NON_IO;
> 
> As I understand rte_mempool_populate_iova() may be called few times for
> one mempool. The flag must be set if all invocations are done with
> RTE_BAD_IOVA. So, it should be set by default and just removed when iova
> != RTE_BAD_IOVA happens.

I don't agree at all. If any object of the pool is unsuitable for IO,
the pool cannot be considered suitable for IO. So if there's a single
invocation with RTE_BAD_IOVA, the flag must be set forever.

> Yes, it is a corner case. May be it makes sense to cover it by unit test
> as well.

True for either your logic or mine, I'll add it.

Ack on the rest of the comments, thanks.
  
David Marchand Oct. 15, 2021, 9:25 a.m. UTC | #3
Hello Dmitry,

On Wed, Oct 13, 2021 at 1:02 PM Dmitry Kozlyuk <dkozlyuk@oss.nvidia.com> wrote:
> diff --git a/lib/mempool/rte_mempool.h b/lib/mempool/rte_mempool.h
> index 663123042f..029b62a650 100644
> --- a/lib/mempool/rte_mempool.h
> +++ b/lib/mempool/rte_mempool.h
> @@ -262,6 +262,8 @@ struct rte_mempool {
>  #define MEMPOOL_F_SC_GET         0x0008 /**< Default get is "single-consumer".*/
>  #define MEMPOOL_F_POOL_CREATED   0x0010 /**< Internal: pool is created. */
>  #define MEMPOOL_F_NO_IOVA_CONTIG 0x0020 /**< Don't need IOVA contiguous objs. */
> +#define MEMPOOL_F_NON_IO         0x0040
> +               /**< Internal: pool is not usable for device IO (DMA). */
>
>  /**
>   * @internal When debug is enabled, store some statistics.
> @@ -991,6 +993,9 @@ typedef void (rte_mempool_ctor_t)(struct rte_mempool *, void *);
>   *     "single-consumer". Otherwise, it is "multi-consumers".
>   *   - MEMPOOL_F_NO_IOVA_CONTIG: If set, allocated objects won't
>   *     necessarily be contiguous in IO memory.
> + *   - MEMPOOL_F_NON_IO: If set, the mempool is considered to be
> + *     never used for device IO, i.e. for DMA operations.
> + *     It's a hint to other components and does not affect the mempool behavior.
>   * @return
>   *   The pointer to the new allocated mempool, on success. NULL on error
>   *   with rte_errno set appropriately. Possible rte_errno values include:

- When rebasing on main, you probably won't be able to call this new flag.
The diff should be something like:

diff --git a/app/test/test_mempool.c b/app/test/test_mempool.c
index d886f4800c..35c80291fa 100644
--- a/app/test/test_mempool.c
+++ b/app/test/test_mempool.c
@@ -214,7 +214,7 @@ static int test_mempool_creation_with_unknown_flag(void)
                MEMPOOL_ELT_SIZE, 0, 0,
                NULL, NULL,
                NULL, NULL,
-               SOCKET_ID_ANY, MEMPOOL_F_NO_IOVA_CONTIG << 1);
+               SOCKET_ID_ANY, MEMPOOL_F_NON_IO << 1);

        if (mp_cov != NULL) {
                rte_mempool_free(mp_cov);
diff --git a/lib/mempool/rte_mempool.c b/lib/mempool/rte_mempool.c
index 8d5f99f7e7..27d197fe86 100644
--- a/lib/mempool/rte_mempool.c
+++ b/lib/mempool/rte_mempool.c
@@ -802,6 +802,7 @@ rte_mempool_cache_free(struct rte_mempool_cache *cache)
        | MEMPOOL_F_SC_GET \
        | MEMPOOL_F_POOL_CREATED \
        | MEMPOOL_F_NO_IOVA_CONTIG \
+       | MEMPOOL_F_NON_IO \
        )
 /* create an empty mempool */
 struct rte_mempool *


- While grepping, I noticed that proc-info also dumps mempool flags.
This could be something to enhance, maybe amending current
rte_mempool_dump() and having this tool use it.
But for now, can you update this tool too?
  
Andrew Rybchenko Oct. 15, 2021, 9:33 a.m. UTC | #4
On 10/15/21 12:18 PM, Dmitry Kozlyuk wrote:
>> -----Original Message-----
>> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
>> [...]
>>> diff --git a/lib/mempool/rte_mempool.c b/lib/mempool/rte_mempool.c
>>> index 51c0ba2931..2204f140b3 100644
>>> --- a/lib/mempool/rte_mempool.c
>>> +++ b/lib/mempool/rte_mempool.c
>>> @@ -371,6 +371,8 @@ rte_mempool_populate_iova(struct rte_mempool *mp,
>>> char *vaddr,
>>>
>>>       STAILQ_INSERT_TAIL(&mp->mem_list, memhdr, next);
>>>       mp->nb_mem_chunks++;
>>> +     if (iova == RTE_BAD_IOVA)
>>> +             mp->flags |= MEMPOOL_F_NON_IO;
>>
>> As I understand rte_mempool_populate_iova() may be called few times for
>> one mempool. The flag must be set if all invocations are done with
>> RTE_BAD_IOVA. So, it should be set by default and just removed when iova
>> != RTE_BAD_IOVA happens.
> 
> I don't agree at all. If any object of the pool is unsuitable for IO,
> the pool cannot be considered suitable for IO. So if there's a single
> invocation with RTE_BAD_IOVA, the flag must be set forever.

If so, some objects may be used for IO while others cannot be.
What should happen if an application allocates an object
which is suitable for IO and tries to use it this way?

> 
>> Yes, it is a corner case. May be it makes sense to cover it by unit test
>> as well.
> 
> True for either your logic or mine, I'll add it.
> 
> Ack on the rest of the comments, thanks.
>
  
Dmitry Kozlyuk Oct. 15, 2021, 9:38 a.m. UTC | #5
> -----Original Message-----
> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> Sent: October 15, 2021, 12:34
> To: Dmitry Kozlyuk <dkozlyuk@nvidia.com>; dev@dpdk.org
> Cc: Matan Azrad <matan@nvidia.com>; Olivier Matz <olivier.matz@6wind.com>
> Subject: Re: [PATCH v4 2/4] mempool: add non-IO flag
> 
> On 10/15/21 12:18 PM, Dmitry Kozlyuk wrote:
> >> -----Original Message-----
> >> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> [...]
> >>> diff --git a/lib/mempool/rte_mempool.c b/lib/mempool/rte_mempool.c
> >>> index 51c0ba2931..2204f140b3 100644
> >>> --- a/lib/mempool/rte_mempool.c
> >>> +++ b/lib/mempool/rte_mempool.c
> >>> @@ -371,6 +371,8 @@ rte_mempool_populate_iova(struct rte_mempool
> >>> *mp, char *vaddr,
> >>>
> >>>       STAILQ_INSERT_TAIL(&mp->mem_list, memhdr, next);
> >>>       mp->nb_mem_chunks++;
> >>> +     if (iova == RTE_BAD_IOVA)
> >>> +             mp->flags |= MEMPOOL_F_NON_IO;
> >>
> >> As I understand rte_mempool_populate_iova() may be called few times
> >> for one mempool. The flag must be set if all invocations are done
> >> with RTE_BAD_IOVA. So, it should be set by default and just removed
> >> when iova != RTE_BAD_IOVA happens.
> >
> > I don't agree at all. If any object of the pool is unsuitable for IO,
> > the pool cannot be considered suitable for IO. So if there's a single
> > invocation with RTE_BAD_IOVA, the flag must be set forever.
> 
> If so, some objects may be used for IO, some cannot be used.
> What should happen if an application allocates an object which is suitable
> for IO and try to use it this way?

Never mind, I was thinking in terms of v1, where the application marked mempools
as not suitable for IO. Since now they're marked automatically, you're correct:
the flag must be set if and only if there is no chance
that objects from this pool will be used for IO.
  
Olivier Matz Oct. 15, 2021, 9:43 a.m. UTC | #6
On Fri, Oct 15, 2021 at 12:33:31PM +0300, Andrew Rybchenko wrote:
> On 10/15/21 12:18 PM, Dmitry Kozlyuk wrote:
> >> -----Original Message-----
> >> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> >> [...]
> >>> diff --git a/lib/mempool/rte_mempool.c b/lib/mempool/rte_mempool.c
> >>> index 51c0ba2931..2204f140b3 100644
> >>> --- a/lib/mempool/rte_mempool.c
> >>> +++ b/lib/mempool/rte_mempool.c
> >>> @@ -371,6 +371,8 @@ rte_mempool_populate_iova(struct rte_mempool *mp,
> >>> char *vaddr,
> >>>
> >>>       STAILQ_INSERT_TAIL(&mp->mem_list, memhdr, next);
> >>>       mp->nb_mem_chunks++;
> >>> +     if (iova == RTE_BAD_IOVA)
> >>> +             mp->flags |= MEMPOOL_F_NON_IO;
> >>
> >> As I understand rte_mempool_populate_iova() may be called few times for
> >> one mempool. The flag must be set if all invocations are done with
> >> RTE_BAD_IOVA. So, it should be set by default and just removed when iova
> >> != RTE_BAD_IOVA happens.
> > 
> > I don't agree at all. If any object of the pool is unsuitable for IO,
> > the pool cannot be considered suitable for IO. So if there's a single
> > invocation with RTE_BAD_IOVA, the flag must be set forever.
> 
> If so, some objects may be used for IO, some cannot be used.
> What should happen if an application allocates an object
> which is suitable for IO and try to use it this way?

If the application can predict if the allocated object is usable for IO
before allocating it, I would be surprised to have it used for IO. I agree
with Dmitry here.
  
Dmitry Kozlyuk Oct. 15, 2021, 9:58 a.m. UTC | #7
> -----Original Message-----
> From: Olivier Matz <olivier.matz@6wind.com>
> Sent: October 15, 2021, 12:43
> To: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> Cc: Dmitry Kozlyuk <dkozlyuk@nvidia.com>; dev@dpdk.org; Matan Azrad
> <matan@nvidia.com>
> Subject: Re: [PATCH v4 2/4] mempool: add non-IO flag
> 
> On Fri, Oct 15, 2021 at 12:33:31PM +0300, Andrew Rybchenko wrote:
> > On 10/15/21 12:18 PM, Dmitry Kozlyuk wrote:
> > >> -----Original Message-----
> > >> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> [...]
> > >>> diff --git a/lib/mempool/rte_mempool.c b/lib/mempool/rte_mempool.c
> > >>> index 51c0ba2931..2204f140b3 100644
> > >>> --- a/lib/mempool/rte_mempool.c
> > >>> +++ b/lib/mempool/rte_mempool.c
> > >>> @@ -371,6 +371,8 @@ rte_mempool_populate_iova(struct rte_mempool
> > >>> *mp, char *vaddr,
> > >>>
> > >>>       STAILQ_INSERT_TAIL(&mp->mem_list, memhdr, next);
> > >>>       mp->nb_mem_chunks++;
> > >>> +     if (iova == RTE_BAD_IOVA)
> > >>> +             mp->flags |= MEMPOOL_F_NON_IO;
> > >>
> > >> As I understand rte_mempool_populate_iova() may be called few times
> > >> for one mempool. The flag must be set if all invocations are done
> > >> with RTE_BAD_IOVA. So, it should be set by default and just removed
> > >> when iova != RTE_BAD_IOVA happens.
> > >
> > > I don't agree at all. If any object of the pool is unsuitable for
> > > IO, the pool cannot be considered suitable for IO. So if there's a
> > > single invocation with RTE_BAD_IOVA, the flag must be set forever.
> >
> > If so, some objects may be used for IO, some cannot be used.
> > What should happen if an application allocates an object which is
> > suitable for IO and try to use it this way?
> 
> If the application can predict if the allocated object is usable for IO
> before allocating it, I would be surprised to have it used for IO. I agree
> with Dmitry here.

The flag hints to components, PMDs above all,
that objects from this mempool will never be used for IO,
so that the component can save some memory mapping or DMA configuration.
If the flag is set when even a single object may be used for IO,
the consumer of the flag will not be ready for that.
However much of a corner case it is, Andrew is correct.
There is a subtle difference between "the pool is not usable"
(as described now) and "objects from this mempool will never be used"
(as stated above); I'll highlight it in the flag description.
  
Dmitry Kozlyuk Oct. 15, 2021, 10:42 a.m. UTC | #8
Hello David,

> [...]
> - When rebasing on main, you probably won't be able to call this new flag.
> The diff should be something like:
> 
> diff --git a/app/test/test_mempool.c b/app/test/test_mempool.c index
> d886f4800c..35c80291fa 100644
> --- a/app/test/test_mempool.c
> +++ b/app/test/test_mempool.c
> @@ -214,7 +214,7 @@ static int
> test_mempool_creation_with_unknown_flag(void)
>                 MEMPOOL_ELT_SIZE, 0, 0,
>                 NULL, NULL,
>                 NULL, NULL,
> -               SOCKET_ID_ANY, MEMPOOL_F_NO_IOVA_CONTIG << 1);
> +               SOCKET_ID_ANY, MEMPOOL_F_NON_IO << 1);
> 
>         if (mp_cov != NULL) {
>                 rte_mempool_free(mp_cov); diff --git
> a/lib/mempool/rte_mempool.c b/lib/mempool/rte_mempool.c index
> 8d5f99f7e7..27d197fe86 100644
> --- a/lib/mempool/rte_mempool.c
> +++ b/lib/mempool/rte_mempool.c
> @@ -802,6 +802,7 @@ rte_mempool_cache_free(struct rte_mempool_cache
> *cache)
>         | MEMPOOL_F_SC_GET \
>         | MEMPOOL_F_POOL_CREATED \
>         | MEMPOOL_F_NO_IOVA_CONTIG \
> +       | MEMPOOL_F_NON_IO \

I wonder why CREATED and NON_IO should be listed here:
they are not supposed to be passed by the user,
which is what MEMPOOL_KNOWN_FLAGS is used for.
The same question stands for the test code.
Could you confirm your suggestion?

>         )
>  /* create an empty mempool */
>  struct rte_mempool *
> 
> 
> - While grepping, I noticed that proc-info also dumps mempool flags.
> This could be something to enhance, maybe amending current
> rte_mempool_dump() and having this tool use it.
> But for now, can you update this tool too?

I will, thanks for the hints.
  
David Marchand Oct. 15, 2021, 11:41 a.m. UTC | #9
On Fri, Oct 15, 2021 at 12:42 PM Dmitry Kozlyuk <dkozlyuk@nvidia.com> wrote:
> > a/lib/mempool/rte_mempool.c b/lib/mempool/rte_mempool.c index
> > 8d5f99f7e7..27d197fe86 100644
> > --- a/lib/mempool/rte_mempool.c
> > +++ b/lib/mempool/rte_mempool.c
> > @@ -802,6 +802,7 @@ rte_mempool_cache_free(struct rte_mempool_cache
> > *cache)
> >         | MEMPOOL_F_SC_GET \
> >         | MEMPOOL_F_POOL_CREATED \
> >         | MEMPOOL_F_NO_IOVA_CONTIG \
> > +       | MEMPOOL_F_NON_IO \
>
> I wonder why CREATED and NON_IO should be listed here:
> they are not supposed to be passed by the user,
> which is what MEMPOOL_KNOWN_FLAGS is used for.
> The same question stands for the test code.
> Could you confirm your suggestion?

There was no distinction in the API between valid flags so far, and indeed
I did not pay attention to MEMPOOL_F_POOL_CREATED and its internal
aspect.
(That's the problem with mixing stuff together.)

We could separate internal and exposed flags into different fields, but
it seems overkill.
It would also be seen as an API change if applications were checking
for this flag.
So let's keep this as is.

As you suggest, we should exclude those internal flags from
KNOWN_FLAGS (and probably rename it too), and we will have to export this
define for the unit test, since the check had been written with
contiguous valid flags in mind.
If your new flag is internal only, I agree we must skip it.

I'll prepare a patch for mempool.
  
Olivier Matz Oct. 15, 2021, 12:11 p.m. UTC | #10
On Fri, Oct 15, 2021 at 09:58:49AM +0000, Dmitry Kozlyuk wrote:
> 
> 
> > -----Original Message-----
> > From: Olivier Matz <olivier.matz@6wind.com>
> > Sent: October 15, 2021, 12:43
> > To: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> > Cc: Dmitry Kozlyuk <dkozlyuk@nvidia.com>; dev@dpdk.org; Matan Azrad
> > <matan@nvidia.com>
> > Subject: Re: [PATCH v4 2/4] mempool: add non-IO flag
> > 
> > On Fri, Oct 15, 2021 at 12:33:31PM +0300, Andrew Rybchenko wrote:
> > > On 10/15/21 12:18 PM, Dmitry Kozlyuk wrote:
> > > >> -----Original Message-----
> > > >> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> [...]
> > > >>> diff --git a/lib/mempool/rte_mempool.c b/lib/mempool/rte_mempool.c
> > > >>> index 51c0ba2931..2204f140b3 100644
> > > >>> --- a/lib/mempool/rte_mempool.c
> > > >>> +++ b/lib/mempool/rte_mempool.c
> > > >>> @@ -371,6 +371,8 @@ rte_mempool_populate_iova(struct rte_mempool
> > > >>> *mp, char *vaddr,
> > > >>>
> > > >>>       STAILQ_INSERT_TAIL(&mp->mem_list, memhdr, next);
> > > >>>       mp->nb_mem_chunks++;
> > > >>> +     if (iova == RTE_BAD_IOVA)
> > > >>> +             mp->flags |= MEMPOOL_F_NON_IO;
> > > >>
> > > >> As I understand rte_mempool_populate_iova() may be called few times
> > > >> for one mempool. The flag must be set if all invocations are done
> > > >> with RTE_BAD_IOVA. So, it should be set by default and just removed
> > > >> when iova != RTE_BAD_IOVA happens.
> > > >
> > > > I don't agree at all. If any object of the pool is unsuitable for
> > > > IO, the pool cannot be considered suitable for IO. So if there's a
> > > > single invocation with RTE_BAD_IOVA, the flag must be set forever.
> > >
> > > If so, some objects may be used for IO, some cannot be used.
> > > What should happen if an application allocates an object which is
> > > suitable for IO and try to use it this way?
> > 
> > If the application can predict if the allocated object is usable for IO
> > before allocating it, I would be surprised to have it used for IO. I agree
> > with Dmitry here.
> 
> The flag hints to components, PMDs before all,
> that objects from this mempool will never be used for IO,
> so that the component can save some memory mapping or DMA configuration.
> If the flag is set when even a single object may be used for IO,
> the consumer of the flag will not be ready for that.
> Whatever a corner case it is, Andrew is correct.
> There is a subtle difference between "pool is not usable"
> (as described now) and "objects from this mempool will never be used"
> (as stated above), I'll highlight it in the flag description.

OK, agreed, thanks.
  
Olivier Matz Oct. 15, 2021, 12:13 p.m. UTC | #11
On Fri, Oct 15, 2021 at 01:41:40PM +0200, David Marchand wrote:
> On Fri, Oct 15, 2021 at 12:42 PM Dmitry Kozlyuk <dkozlyuk@nvidia.com> wrote:
> > > a/lib/mempool/rte_mempool.c b/lib/mempool/rte_mempool.c index
> > > 8d5f99f7e7..27d197fe86 100644
> > > --- a/lib/mempool/rte_mempool.c
> > > +++ b/lib/mempool/rte_mempool.c
> > > @@ -802,6 +802,7 @@ rte_mempool_cache_free(struct rte_mempool_cache
> > > *cache)
> > >         | MEMPOOL_F_SC_GET \
> > >         | MEMPOOL_F_POOL_CREATED \
> > >         | MEMPOOL_F_NO_IOVA_CONTIG \
> > > +       | MEMPOOL_F_NON_IO \
> >
> > I wonder why CREATED and NON_IO should be listed here:
> > they are not supposed to be passed by the user,
> > which is what MEMPOOL_KNOWN_FLAGS is used for.
> > The same question stands for the test code.
> > Could you confirm your suggestion?
> 
> There was no distinction in the API for valid flags so far, and indeed
> I did not pay attention to MEMPOOL_F_POOL_CREATED and its internal
> aspect.
> (That's the problem when mixing stuff together)
> 
> We could separate internal and exposed flags in different fields, but
> it seems overkill.
> It would be seen as an API change too, if application were checking
> for this flag.
> So let's keep this as is.
> 
> As you suggest, we should exclude those internal flags from
> KNOWN_FLAGS (probably rename it too), and we will have to export this

I suggest RTE_MEMPOOL_VALID_USER_FLAGS for the name

> define for the unit test since the check had been written with
> contiguous valid flags in mind.
> If your new flag is internal only, I agree we must skip it.
> 
> I'll prepare a patch for mempool.
> 
> -- 
> David Marchand
>
  
Olivier Matz Oct. 15, 2021, 1:19 p.m. UTC | #12
Hi Dmitry,

On Wed, Oct 13, 2021 at 02:01:29PM +0300, Dmitry Kozlyuk wrote:
> Mempool is a generic allocator that is not necessarily used for device
> IO operations and its memory for DMA. Add MEMPOOL_F_NON_IO flag to mark
> such mempools automatically if their objects are not contiguous
> or IOVA are not available. Components can inspect this flag
> in order to optimize their memory management.
> Discussion: https://mails.dpdk.org/archives/dev/2021-August/216654.html
> 
> Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
> Acked-by: Matan Azrad <matan@nvidia.com>
> ---
>  app/test/test_mempool.c                | 76 ++++++++++++++++++++++++++
>  doc/guides/rel_notes/release_21_11.rst |  3 +
>  lib/mempool/rte_mempool.c              |  2 +
>  lib/mempool/rte_mempool.h              |  5 ++
>  4 files changed, 86 insertions(+)
> 
> diff --git a/app/test/test_mempool.c b/app/test/test_mempool.c
> index bc0cc9ed48..15146dd737 100644
> --- a/app/test/test_mempool.c
> +++ b/app/test/test_mempool.c
> @@ -672,6 +672,74 @@ test_mempool_events_safety(void)
>  	return 0;
>  }
>  
> +static int
> +test_mempool_flag_non_io_set_when_no_iova_contig_set(void)
> +{
> +	struct rte_mempool *mp;
> +	int ret;
> +
> +	mp = rte_mempool_create_empty("empty", MEMPOOL_SIZE,
> +				      MEMPOOL_ELT_SIZE, 0, 0,
> +				      SOCKET_ID_ANY, MEMPOOL_F_NO_IOVA_CONTIG);
> +	RTE_TEST_ASSERT_NOT_NULL(mp, "Cannot create mempool: %s",
> +				 rte_strerror(rte_errno));
> +	rte_mempool_set_ops_byname(mp, rte_mbuf_best_mempool_ops(), NULL);
> +	ret = rte_mempool_populate_default(mp);
> +	RTE_TEST_ASSERT(ret > 0, "Failed to populate mempool: %s",
> +			rte_strerror(rte_errno));
> +	RTE_TEST_ASSERT(mp->flags & MEMPOOL_F_NON_IO,
> +			"NON_IO flag is not set when NO_IOVA_CONTIG is set");
> +	rte_mempool_free(mp);
> +	return 0;
> +}

One comment that also applies to the previous patch. Using
RTE_TEST_ASSERT_*() is convenient to test a condition, display an error
message and return on error in one operation. But here it can cause a
leak on test failure.

I don't know what is the best approach to solve the issue. Having
equivalent test macros that do "goto fail" instead of "return -1" would
help here. I mean something like:
  RTE_TEST_ASSERT_GOTO_*(cond, label, fmt, ...)

What do you think?
  
Dmitry Kozlyuk Oct. 15, 2021, 1:27 p.m. UTC | #13
> [...]
> > +static int
> > +test_mempool_flag_non_io_set_when_no_iova_contig_set(void)
> > +{
> > +     struct rte_mempool *mp;
> > +     int ret;
> > +
> > +     mp = rte_mempool_create_empty("empty", MEMPOOL_SIZE,
> > +                                   MEMPOOL_ELT_SIZE, 0, 0,
> > +                                   SOCKET_ID_ANY,
> MEMPOOL_F_NO_IOVA_CONTIG);
> > +     RTE_TEST_ASSERT_NOT_NULL(mp, "Cannot create mempool: %s",
> > +                              rte_strerror(rte_errno));
> > +     rte_mempool_set_ops_byname(mp, rte_mbuf_best_mempool_ops(), NULL);
> > +     ret = rte_mempool_populate_default(mp);
> > +     RTE_TEST_ASSERT(ret > 0, "Failed to populate mempool: %s",
> > +                     rte_strerror(rte_errno));
> > +     RTE_TEST_ASSERT(mp->flags & MEMPOOL_F_NON_IO,
> > +                     "NON_IO flag is not set when NO_IOVA_CONTIG is
> set");
> > +     rte_mempool_free(mp);
> > +     return 0;
> > +}
> 
> One comment that also applies to the previous patch. Using
> RTE_TEST_ASSERT_*() is convenient to test a condition, display an error
> message and return on error in one operation. But here it can cause a
> leak on test failure.
> 
> I don't know what is the best approach to solve the issue. Having
> equivalent test macros that do "goto fail" instead of "return -1" would
> help here. I mean something like:
>   RTE_TEST_ASSERT_GOTO_*(cond, label, fmt, ...)
> 
> What do you think?

This can work with the existing macros:

	#define TEST_TRACE_FAILURE(...) goto fail

Because of "trace" in the name it looks a bit like hijacking.
Probably the macro should be named TEST_HANDLE_FAILURE
to suggest broader usage than just tracing,
but for now it looks like the neatest way.
  
Olivier Matz Oct. 15, 2021, 1:43 p.m. UTC | #14
On Fri, Oct 15, 2021 at 01:27:59PM +0000, Dmitry Kozlyuk wrote:
> > [...]
> > > +static int
> > > +test_mempool_flag_non_io_set_when_no_iova_contig_set(void)
> > > +{
> > > +     struct rte_mempool *mp;
> > > +     int ret;
> > > +
> > > +     mp = rte_mempool_create_empty("empty", MEMPOOL_SIZE,
> > > +                                   MEMPOOL_ELT_SIZE, 0, 0,
> > > +                                   SOCKET_ID_ANY,
> > MEMPOOL_F_NO_IOVA_CONTIG);
> > > +     RTE_TEST_ASSERT_NOT_NULL(mp, "Cannot create mempool: %s",
> > > +                              rte_strerror(rte_errno));
> > > +     rte_mempool_set_ops_byname(mp, rte_mbuf_best_mempool_ops(), NULL);
> > > +     ret = rte_mempool_populate_default(mp);
> > > +     RTE_TEST_ASSERT(ret > 0, "Failed to populate mempool: %s",
> > > +                     rte_strerror(rte_errno));
> > > +     RTE_TEST_ASSERT(mp->flags & MEMPOOL_F_NON_IO,
> > > +                     "NON_IO flag is not set when NO_IOVA_CONTIG is
> > set");
> > > +     rte_mempool_free(mp);
> > > +     return 0;
> > > +}
> > 
> > One comment that also applies to the previous patch. Using
> > RTE_TEST_ASSERT_*() is convenient to test a condition, display an error
> > message and return on error in one operation. But here it can cause a
> > leak on test failure.
> > 
> > I don't know what is the best approach to solve the issue. Having
> > equivalent test macros that do "goto fail" instead of "return -1" would
> > help here. I mean something like:
> >   RTE_TEST_ASSERT_GOTO_*(cond, label, fmt, ...)
> > 
> > What do you think?
> 
> This can work with existing macros:
> 
> 	#define TEST_TRACE_FAILURE(...) goto fail
> 
> Because of "trace" in the name it looks a bit like a hijacking.
> Probably the macro should be named TEST_HANDLE_FAILURE
> to suggest broader usages than just tracing,
> but for now it looks the most neat way.

That would work for me.
What about introducing another macro for this usage, that would
be "return -1" by default and that can be overridden?
  
Dmitry Kozlyuk Oct. 19, 2021, 1:08 p.m. UTC | #15
> -----Original Message-----
> From: Olivier Matz <olivier.matz@6wind.com>
> Sent: October 15, 2021, 16:43
> To: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
> Cc: dev@dpdk.org; Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>; Matan
> Azrad <matan@nvidia.com>
> Subject: Re: [PATCH v4 2/4] mempool: add non-IO flag
> 
> On Fri, Oct 15, 2021 at 01:27:59PM +0000, Dmitry Kozlyuk wrote:
> > > [...]
> > > > +static int
> > > > +test_mempool_flag_non_io_set_when_no_iova_contig_set(void)
> > > > +{
> > > > +     struct rte_mempool *mp;
> > > > +     int ret;
> > > > +
> > > > +     mp = rte_mempool_create_empty("empty", MEMPOOL_SIZE,
> > > > +                                   MEMPOOL_ELT_SIZE, 0, 0,
> > > > +                                   SOCKET_ID_ANY,
> > > MEMPOOL_F_NO_IOVA_CONTIG);
> > > > +     RTE_TEST_ASSERT_NOT_NULL(mp, "Cannot create mempool: %s",
> > > > +                              rte_strerror(rte_errno));
> > > > +     rte_mempool_set_ops_byname(mp, rte_mbuf_best_mempool_ops(),
> NULL);
> > > > +     ret = rte_mempool_populate_default(mp);
> > > > +     RTE_TEST_ASSERT(ret > 0, "Failed to populate mempool: %s",
> > > > +                     rte_strerror(rte_errno));
> > > > +     RTE_TEST_ASSERT(mp->flags & MEMPOOL_F_NON_IO,
> > > > +                     "NON_IO flag is not set when NO_IOVA_CONTIG is
> > > set");
> > > > +     rte_mempool_free(mp);
> > > > +     return 0;
> > > > +}
> > >
> > > One comment that also applies to the previous patch. Using
> > > RTE_TEST_ASSERT_*() is convenient to test a condition, display an
> error
> > > message and return on error in one operation. But here it can cause a
> > > leak on test failure.
> > >
> > > I don't know what is the best approach to solve the issue. Having
> > > equivalent test macros that do "goto fail" instead of "return -1"
> would
> > > help here. I mean something like:
> > >   RTE_TEST_ASSERT_GOTO_*(cond, label, fmt, ...)
> > >
> > > What do you think?
> >
> > This can work with existing macros:
> >
> >       #define TEST_TRACE_FAILURE(...) goto fail
> >
> > Because of "trace" in the name it looks a bit like a hijacking.
> > Probably the macro should be named TEST_HANDLE_FAILURE
> > to suggest broader usages than just tracing,
> > but for now it looks like the neatest way.
> 
> That would work for me.

Did so in v9.

> What about introducing another macro for this usage, that would
> be "return -1" by default and that can be overridden?

I like this suggestion by itself.
While implementing the solution with RTE_TEST_TRACE_FAILURE()
I didn't like the hassle with #ifdef/#pragma push/pop_macro.
At least some of them could be hidden, need to play with macros
before suggesting something clean.
  

Patch

diff --git a/app/test/test_mempool.c b/app/test/test_mempool.c
index bc0cc9ed48..15146dd737 100644
--- a/app/test/test_mempool.c
+++ b/app/test/test_mempool.c
@@ -672,6 +672,74 @@  test_mempool_events_safety(void)
 	return 0;
 }
 
+static int
+test_mempool_flag_non_io_set_when_no_iova_contig_set(void)
+{
+	struct rte_mempool *mp;
+	int ret;
+
+	mp = rte_mempool_create_empty("empty", MEMPOOL_SIZE,
+				      MEMPOOL_ELT_SIZE, 0, 0,
+				      SOCKET_ID_ANY, MEMPOOL_F_NO_IOVA_CONTIG);
+	RTE_TEST_ASSERT_NOT_NULL(mp, "Cannot create mempool: %s",
+				 rte_strerror(rte_errno));
+	rte_mempool_set_ops_byname(mp, rte_mbuf_best_mempool_ops(), NULL);
+	ret = rte_mempool_populate_default(mp);
+	RTE_TEST_ASSERT(ret > 0, "Failed to populate mempool: %s",
+			rte_strerror(rte_errno));
+	RTE_TEST_ASSERT(mp->flags & MEMPOOL_F_NON_IO,
+			"NON_IO flag is not set when NO_IOVA_CONTIG is set");
+	rte_mempool_free(mp);
+	return 0;
+}
+
+static int
+test_mempool_flag_non_io_set_when_populated_with_bad_iova(void)
+{
+	void *addr;
+	size_t size = 1 << 16;
+	struct rte_mempool *mp;
+	int ret;
+
+	addr = rte_malloc("test_mempool", size, 0);
+	RTE_TEST_ASSERT_NOT_NULL(addr, "Cannot allocate memory");
+	mp = rte_mempool_create_empty("empty", MEMPOOL_SIZE,
+				      MEMPOOL_ELT_SIZE, 0, 0,
+				      SOCKET_ID_ANY, 0);
+	RTE_TEST_ASSERT_NOT_NULL(mp, "Cannot create mempool: %s",
+				 rte_strerror(rte_errno));
+	ret = rte_mempool_populate_iova(mp, addr, RTE_BAD_IOVA, size,
+					NULL, NULL);
+	/* The flag must be inferred even if population isn't full. */
+	RTE_TEST_ASSERT(ret > 0, "Failed to populate mempool: %s",
+			rte_strerror(rte_errno));
+	RTE_TEST_ASSERT(mp->flags & MEMPOOL_F_NON_IO,
+			"NON_IO flag is not set when mempool is populated with RTE_BAD_IOVA");
+	rte_mempool_free(mp);
+	rte_free(addr);
+	return 0;
+}
+
+static int
+test_mempool_flag_non_io_unset_by_default(void)
+{
+	struct rte_mempool *mp;
+	int ret;
+
+	mp = rte_mempool_create_empty("empty", MEMPOOL_SIZE,
+				      MEMPOOL_ELT_SIZE, 0, 0,
+				      SOCKET_ID_ANY, 0);
+	RTE_TEST_ASSERT_NOT_NULL(mp, "Cannot create mempool: %s",
+				 rte_strerror(rte_errno));
+	ret = rte_mempool_populate_default(mp);
+	RTE_TEST_ASSERT_EQUAL(ret, (int)mp->size, "Failed to populate mempool: %s",
+			      rte_strerror(rte_errno));
+	RTE_TEST_ASSERT(!(mp->flags & MEMPOOL_F_NON_IO),
+			"NON_IO flag is set by default");
+	rte_mempool_free(mp);
+	return 0;
+}
+
 static int
 test_mempool(void)
 {
@@ -854,6 +922,14 @@  test_mempool(void)
 	if (test_mempool_events_safety() < 0)
 		GOTO_ERR(ret, err);
 
+	/* test NON_IO flag inference */
+	if (test_mempool_flag_non_io_unset_by_default() < 0)
+		GOTO_ERR(ret, err);
+	if (test_mempool_flag_non_io_set_when_no_iova_contig_set() < 0)
+		GOTO_ERR(ret, err);
+	if (test_mempool_flag_non_io_set_when_populated_with_bad_iova() < 0)
+		GOTO_ERR(ret, err);
+
 	rte_mempool_list_dump(stdout);
 
 	ret = 0;
diff --git a/doc/guides/rel_notes/release_21_11.rst b/doc/guides/rel_notes/release_21_11.rst
index f643a61f44..74e0e6f495 100644
--- a/doc/guides/rel_notes/release_21_11.rst
+++ b/doc/guides/rel_notes/release_21_11.rst
@@ -226,6 +226,9 @@  API Changes
   the crypto/security operation. This field will be used to communicate
   events such as soft expiry with IPsec in lookaside mode.
 
+* mempool: Added ``MEMPOOL_F_NON_IO`` flag to give a hint to DPDK components
+  that objects from this pool will not be used for device IO (e.g. DMA).
+
 
 ABI Changes
 -----------
diff --git a/lib/mempool/rte_mempool.c b/lib/mempool/rte_mempool.c
index 51c0ba2931..2204f140b3 100644
--- a/lib/mempool/rte_mempool.c
+++ b/lib/mempool/rte_mempool.c
@@ -371,6 +371,8 @@  rte_mempool_populate_iova(struct rte_mempool *mp, char *vaddr,
 
 	STAILQ_INSERT_TAIL(&mp->mem_list, memhdr, next);
 	mp->nb_mem_chunks++;
+	if (iova == RTE_BAD_IOVA)
+		mp->flags |= MEMPOOL_F_NON_IO;
 
 	/* Report the mempool as ready only when fully populated. */
 	if (mp->populated_size >= mp->size)
diff --git a/lib/mempool/rte_mempool.h b/lib/mempool/rte_mempool.h
index 663123042f..029b62a650 100644
--- a/lib/mempool/rte_mempool.h
+++ b/lib/mempool/rte_mempool.h
@@ -262,6 +262,8 @@  struct rte_mempool {
 #define MEMPOOL_F_SC_GET         0x0008 /**< Default get is "single-consumer".*/
 #define MEMPOOL_F_POOL_CREATED   0x0010 /**< Internal: pool is created. */
 #define MEMPOOL_F_NO_IOVA_CONTIG 0x0020 /**< Don't need IOVA contiguous objs. */
+#define MEMPOOL_F_NON_IO         0x0040
+		/**< Internal: pool is not usable for device IO (DMA). */
 
 /**
  * @internal When debug is enabled, store some statistics.
@@ -991,6 +993,9 @@  typedef void (rte_mempool_ctor_t)(struct rte_mempool *, void *);
  *     "single-consumer". Otherwise, it is "multi-consumers".
  *   - MEMPOOL_F_NO_IOVA_CONTIG: If set, allocated objects won't
  *     necessarily be contiguous in IO memory.
+ *   - MEMPOOL_F_NON_IO: If set, the mempool is considered to be
+ *     never used for device IO, i.e. for DMA operations.
+ *     It's a hint to other components and does not affect the mempool behavior.
  * @return
  *   The pointer to the new allocated mempool, on success. NULL on error
  *   with rte_errno set appropriately. Possible rte_errno values include: