[v4,1/4] mempool: add event callbacks

Message ID 20211013110131.2909604-2-dkozlyuk@nvidia.com (mailing list archive)
State Superseded, archived
Delegated to: Thomas Monjalon
Series: net/mlx5: implicit mempool registration

Checks

Context        Check     Description
ci/checkpatch  warning   coding style issues

Commit Message

Dmitry Kozlyuk Oct. 13, 2021, 11:01 a.m. UTC
  Data path performance can benefit if the PMD knows which memory it will
need to handle in advance, before the first mbuf is sent to the PMD.
It is impractical, however, to consider all allocated memory for this
purpose. Most often mbuf memory comes from mempools that can come and
go. PMD can enumerate existing mempools on device start, but it also
needs to track creation and destruction of mempools after the forwarding
starts but before an mbuf from the new mempool is sent to the device.

Add an API to register callback for mempool life cycle events:
* rte_mempool_event_callback_register()
* rte_mempool_event_callback_unregister()
Currently tracked events are:
* RTE_MEMPOOL_EVENT_READY (after populating a mempool)
* RTE_MEMPOOL_EVENT_DESTROY (before freeing a mempool)
Provide a unit test for the new API.
The new API is internal, because it is primarily demanded by PMDs that
may need to deal with any mempools and do not control their creation,
while an application, on the other hand, knows which mempools it creates
and doesn't care about internal mempools PMDs might create.

Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
---
 app/test/test_mempool.c   | 209 ++++++++++++++++++++++++++++++++++++++
 lib/mempool/rte_mempool.c | 137 +++++++++++++++++++++++++
 lib/mempool/rte_mempool.h |  61 +++++++++++
 lib/mempool/version.map   |   8 ++
 4 files changed, 415 insertions(+)
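
For illustration of the usage described in the commit message above (not part of the patch), a PMD consuming this API might hook the callbacks up roughly as below; the pmd_* names and the driver context type are hypothetical:

static void
pmd_mempool_event_cb(enum rte_mempool_event event,
		     struct rte_mempool *mp, void *user_data)
{
	struct pmd_priv *priv = user_data;	/* hypothetical driver context */

	if (event == RTE_MEMPOOL_EVENT_READY)
		pmd_register_mempool_memory(priv, mp);		/* hypothetical */
	else if (event == RTE_MEMPOOL_EVENT_DESTROY)
		pmd_unregister_mempool_memory(priv, mp);	/* hypothetical */
}

static int
pmd_dev_start(struct pmd_priv *priv)
{
	/* Mempools that already exist can be enumerated with
	 * rte_mempool_walk(); the callback covers those populated later.
	 */
	return rte_mempool_event_callback_register(pmd_mempool_event_cb, priv);
}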
  

Comments

Andrew Rybchenko Oct. 15, 2021, 8:52 a.m. UTC | #1
On 10/13/21 2:01 PM, Dmitry Kozlyuk wrote:
> Data path performance can benefit if the PMD knows which memory it will
> need to handle in advance, before the first mbuf is sent to the PMD.
> It is impractical, however, to consider all allocated memory for this
> purpose. Most often mbuf memory comes from mempools that can come and
> go. PMD can enumerate existing mempools on device start, but it also
> needs to track creation and destruction of mempools after the forwarding
> starts but before an mbuf from the new mempool is sent to the device.
> 
> Add an API to register callback for mempool life cycle events:
> * rte_mempool_event_callback_register()
> * rte_mempool_event_callback_unregister()
> Currently tracked events are:
> * RTE_MEMPOOL_EVENT_READY (after populating a mempool)
> * RTE_MEMPOOL_EVENT_DESTROY (before freeing a mempool)
> Provide a unit test for the new API.
> The new API is internal, because it is primarily demanded by PMDs that
> may need to deal with any mempools and do not control their creation,
> while an application, on the other hand, knows which mempools it creates
> and doesn't care about internal mempools PMDs might create.
> 
> Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
> Acked-by: Matan Azrad <matan@nvidia.com>

With below review notes processed

Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>

[snip]

> diff --git a/lib/mempool/rte_mempool.c b/lib/mempool/rte_mempool.c
> index c5f859ae71..51c0ba2931 100644
> --- a/lib/mempool/rte_mempool.c
> +++ b/lib/mempool/rte_mempool.c

[snip]

> @@ -360,6 +372,10 @@ rte_mempool_populate_iova(struct rte_mempool *mp, char *vaddr,
>  	STAILQ_INSERT_TAIL(&mp->mem_list, memhdr, next);
>  	mp->nb_mem_chunks++;
>  
> +	/* Report the mempool as ready only when fully populated. */
> +	if (mp->populated_size >= mp->size)
> +		mempool_event_callback_invoke(RTE_MEMPOOL_EVENT_READY, mp);
> +
>  	rte_mempool_trace_populate_iova(mp, vaddr, iova, len, free_cb, opaque);
>  	return i;
>  
> @@ -722,6 +738,7 @@ rte_mempool_free(struct rte_mempool *mp)
>  	}
>  	rte_mcfg_tailq_write_unlock();
>  
> +	mempool_event_callback_invoke(RTE_MEMPOOL_EVENT_DESTROY, mp);
>  	rte_mempool_trace_free(mp);
>  	rte_mempool_free_memchunks(mp);
>  	rte_mempool_ops_free(mp);
> @@ -1343,3 +1360,123 @@ void rte_mempool_walk(void (*func)(struct rte_mempool *, void *),
>  
>  	rte_mcfg_mempool_read_unlock();
>  }
> +
> +struct mempool_callback {

It sounds like it is a mempool callback itself.
Consider: mempool_event_callback_data.
I think this way it will be consistent.

> +	rte_mempool_event_callback *func;
> +	void *user_data;
> +};
> +
> +static void
> +mempool_event_callback_invoke(enum rte_mempool_event event,
> +			      struct rte_mempool *mp)
> +{
> +	struct mempool_callback_list *list;
> +	struct rte_tailq_entry *te;
> +	void *tmp_te;
> +
> +	rte_mcfg_tailq_read_lock();
> +	list = RTE_TAILQ_CAST(callback_tailq.head, mempool_callback_list);
> +	RTE_TAILQ_FOREACH_SAFE(te, list, next, tmp_te) {
> +		struct mempool_callback *cb = te->data;
> +		rte_mcfg_tailq_read_unlock();
> +		cb->func(event, mp, cb->user_data);
> +		rte_mcfg_tailq_read_lock();
> +	}
> +	rte_mcfg_tailq_read_unlock();
> +}
> +
> +int
> +rte_mempool_event_callback_register(rte_mempool_event_callback *func,
> +				    void *user_data)
> +{
> +	struct mempool_callback_list *list;
> +	struct rte_tailq_entry *te = NULL;
> +	struct mempool_callback *cb;
> +	void *tmp_te;
> +	int ret;
> +
> +	if (func == NULL) {
> +		rte_errno = EINVAL;
> +		return -rte_errno;
> +	}
> +
> +	rte_mcfg_mempool_read_lock();
> +	rte_mcfg_tailq_write_lock();
> +
> +	list = RTE_TAILQ_CAST(callback_tailq.head, mempool_callback_list);
> +	RTE_TAILQ_FOREACH_SAFE(te, list, next, tmp_te) {
> +		struct mempool_callback *cb =
> +					(struct mempool_callback *)te->data;
> +		if (cb->func == func && cb->user_data == user_data) {
> +			ret = -EEXIST;
> +			goto exit;
> +		}
> +	}
> +
> +	te = rte_zmalloc("MEMPOOL_TAILQ_ENTRY", sizeof(*te), 0);
> +	if (te == NULL) {
> +		RTE_LOG(ERR, MEMPOOL,
> +			"Cannot allocate event callback tailq entry!\n");
> +		ret = -ENOMEM;
> +		goto exit;
> +	}
> +
> +	cb = rte_malloc("MEMPOOL_EVENT_CALLBACK", sizeof(*cb), 0);
> +	if (cb == NULL) {
> +		RTE_LOG(ERR, MEMPOOL,
> +			"Cannot allocate event callback!\n");
> +		rte_free(te);
> +		ret = -ENOMEM;
> +		goto exit;
> +	}
> +
> +	cb->func = func;
> +	cb->user_data = user_data;
> +	te->data = cb;
> +	TAILQ_INSERT_TAIL(list, te, next);
> +	ret = 0;
> +
> +exit:
> +	rte_mcfg_tailq_write_unlock();
> +	rte_mcfg_mempool_read_unlock();
> +	rte_errno = -ret;
> +	return ret;
> +}
> +
> +int
> +rte_mempool_event_callback_unregister(rte_mempool_event_callback *func,
> +				      void *user_data)
> +{
> +	struct mempool_callback_list *list;
> +	struct rte_tailq_entry *te = NULL;
> +	struct mempool_callback *cb;
> +	int ret;
> +
> +	if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> +		rte_errno = EPERM;
> +		return -1;

The function should behave consistently: below you
return a negative error code, but here just -1. I think it
would be more consistent to return -rte_errno here as well.
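E.g. a sketch of that early return following the same convention as the
other error paths (set rte_errno, then return its negation):

	if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
		rte_errno = EPERM;
		return -rte_errno;
	}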

> +	}
> +
> +	rte_mcfg_mempool_read_lock();
> +	rte_mcfg_tailq_write_lock();
> +	ret = -ENOENT;
> +	list = RTE_TAILQ_CAST(callback_tailq.head, mempool_callback_list);
> +	TAILQ_FOREACH(te, list, next) {
> +		cb = (struct mempool_callback *)te->data;
> +		if (cb->func == func && cb->user_data == user_data)
> +			break;
> +	}
> +	if (te != NULL) {

Here we rely on the fact that TAILQ_FOREACH()
exits with te == NULL when no such entry is found.
I'd suggest avoiding that assumption, i.e. move the
two lines below up before the break and drop the
if condition here altogether.

> +		TAILQ_REMOVE(list, te, next);
> +		ret = 0;
> +	}
> +	rte_mcfg_tailq_write_unlock();
> +	rte_mcfg_mempool_read_unlock();
> +
> +	if (ret == 0) {
> +		rte_free(te);
> +		rte_free(cb);
> +	}
> +	rte_errno = -ret;
> +	return ret;
> +}
> diff --git a/lib/mempool/rte_mempool.h b/lib/mempool/rte_mempool.h
> index f57ecbd6fc..663123042f 100644
> --- a/lib/mempool/rte_mempool.h
> +++ b/lib/mempool/rte_mempool.h
> @@ -1774,6 +1774,67 @@ void rte_mempool_walk(void (*func)(struct rte_mempool *, void *arg),
>  int
>  rte_mempool_get_page_size(struct rte_mempool *mp, size_t *pg_sz);
>  
> +/**
> + * Mempool event type.
> + * @internal

Shouldn't internal go first?

> + */
> +enum rte_mempool_event {
> +	/** Occurs after a mempool is successfully populated. */

successfully -> fully ?

> +	RTE_MEMPOOL_EVENT_READY = 0,
> +	/** Occurs before destruction of a mempool begins. */
> +	RTE_MEMPOOL_EVENT_DESTROY = 1,
> +};

[snip]
  
Dmitry Kozlyuk Oct. 15, 2021, 9:13 a.m. UTC | #2
> -----Original Message-----
> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> [...]
> With below review notes processed
> 
> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> 

Thanks for the comments, I'll fix them all, just a small note below FYI.

> > +     rte_mcfg_mempool_read_lock();
> > +     rte_mcfg_tailq_write_lock();
> > +     ret = -ENOENT;
> > +     list = RTE_TAILQ_CAST(callback_tailq.head, mempool_callback_list);
> > +     TAILQ_FOREACH(te, list, next) {
> > +             cb = (struct mempool_callback *)te->data;
> > +             if (cb->func == func && cb->user_data == user_data)
> > +                     break;
> > +     }
> > +     if (te != NULL) {
> 
> Here we rely on the fact that TAILQ_FOREACH() exists with te==NULL in the
> case of no such entry. I'd suggest to avoid the assumption.
> I.e. do below two lines above before break and have not the if condition
> her at all.

Since you asked the question, the code is non-obvious, so I'll change it.
FWIW, man 3 tailq:

    TAILQ_FOREACH() traverses the queue referenced by head in the forward
    direction, assigning each element in turn to var.  var is set to NULL
    if the loop completes normally, or if there were no elements.
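
A sketch of the reworked lookup (not necessarily the final code):

	ret = -ENOENT;
	list = RTE_TAILQ_CAST(callback_tailq.head, mempool_callback_list);
	TAILQ_FOREACH(te, list, next) {
		cb = (struct mempool_callback *)te->data;
		if (cb->func == func && cb->user_data == user_data) {
			TAILQ_REMOVE(list, te, next);
			ret = 0;
			break;
		}
	}
	/* te and cb are then freed after unlocking, only when ret == 0. */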
  
Olivier Matz Oct. 15, 2021, 12:12 p.m. UTC | #3
Hi Dmitry,

On Wed, Oct 13, 2021 at 02:01:28PM +0300, Dmitry Kozlyuk wrote:
> Data path performance can benefit if the PMD knows which memory it will
> need to handle in advance, before the first mbuf is sent to the PMD.
> It is impractical, however, to consider all allocated memory for this
> purpose. Most often mbuf memory comes from mempools that can come and
> go. PMD can enumerate existing mempools on device start, but it also
> needs to track creation and destruction of mempools after the forwarding
> starts but before an mbuf from the new mempool is sent to the device.
> 
> Add an API to register callback for mempool life cycle events:
> * rte_mempool_event_callback_register()
> * rte_mempool_event_callback_unregister()
> Currently tracked events are:
> * RTE_MEMPOOL_EVENT_READY (after populating a mempool)
> * RTE_MEMPOOL_EVENT_DESTROY (before freeing a mempool)
> Provide a unit test for the new API.
> The new API is internal, because it is primarily demanded by PMDs that
> may need to deal with any mempools and do not control their creation,
> while an application, on the other hand, knows which mempools it creates
> and doesn't care about internal mempools PMDs might create.
> 
> Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
> Acked-by: Matan Azrad <matan@nvidia.com>
> ---
>  app/test/test_mempool.c   | 209 ++++++++++++++++++++++++++++++++++++++
>  lib/mempool/rte_mempool.c | 137 +++++++++++++++++++++++++
>  lib/mempool/rte_mempool.h |  61 +++++++++++
>  lib/mempool/version.map   |   8 ++
>  4 files changed, 415 insertions(+)

(...)

> --- a/lib/mempool/rte_mempool.c
> +++ b/lib/mempool/rte_mempool.c
> @@ -42,6 +42,18 @@ static struct rte_tailq_elem rte_mempool_tailq = {
>  };
>  EAL_REGISTER_TAILQ(rte_mempool_tailq)
>  
> +TAILQ_HEAD(mempool_callback_list, rte_tailq_entry);
> +
> +static struct rte_tailq_elem callback_tailq = {
> +	.name = "RTE_MEMPOOL_CALLBACK",
> +};
> +EAL_REGISTER_TAILQ(callback_tailq)
> +
> +/* Invoke all registered mempool event callbacks. */
> +static void
> +mempool_event_callback_invoke(enum rte_mempool_event event,
> +			      struct rte_mempool *mp);
> +
>  #define CACHE_FLUSHTHRESH_MULTIPLIER 1.5
>  #define CALC_CACHE_FLUSHTHRESH(c)	\
>  	((typeof(c))((c) * CACHE_FLUSHTHRESH_MULTIPLIER))
> @@ -360,6 +372,10 @@ rte_mempool_populate_iova(struct rte_mempool *mp, char *vaddr,
>  	STAILQ_INSERT_TAIL(&mp->mem_list, memhdr, next);
>  	mp->nb_mem_chunks++;
>  
> +	/* Report the mempool as ready only when fully populated. */
> +	if (mp->populated_size >= mp->size)
> +		mempool_event_callback_invoke(RTE_MEMPOOL_EVENT_READY, mp);
> +

One small comment here. I think it does not happen today, but in the
future, something that could happen is:
  - create empty mempool
  - populate mempool
  - use mempool
  - populate mempool with more objects
  - use mempool

I've seen one usage there: https://www.youtube.com/watch?v=SzQFn9tm4Sw

In that case, it would require a POPULATE event instead of a
MEMPOOL_CREATE event.

Enhancing the documentation to better explain when the callback is
invoked looks enough to me for the moment.

>  	rte_mempool_trace_populate_iova(mp, vaddr, iova, len, free_cb, opaque);
>  	return i;
>  
> @@ -722,6 +738,7 @@ rte_mempool_free(struct rte_mempool *mp)
>  	}
>  	rte_mcfg_tailq_write_unlock();
>  
> +	mempool_event_callback_invoke(RTE_MEMPOOL_EVENT_DESTROY, mp);
>  	rte_mempool_trace_free(mp);
>  	rte_mempool_free_memchunks(mp);
>  	rte_mempool_ops_free(mp);
> @@ -1343,3 +1360,123 @@ void rte_mempool_walk(void (*func)(struct rte_mempool *, void *),
>  
>  	rte_mcfg_mempool_read_unlock();
>  }
> +
> +struct mempool_callback {
> +	rte_mempool_event_callback *func;
> +	void *user_data;
> +};
> +
> +static void
> +mempool_event_callback_invoke(enum rte_mempool_event event,
> +			      struct rte_mempool *mp)
> +{
> +	struct mempool_callback_list *list;
> +	struct rte_tailq_entry *te;
> +	void *tmp_te;
> +
> +	rte_mcfg_tailq_read_lock();
> +	list = RTE_TAILQ_CAST(callback_tailq.head, mempool_callback_list);
> +	RTE_TAILQ_FOREACH_SAFE(te, list, next, tmp_te) {
> +		struct mempool_callback *cb = te->data;
> +		rte_mcfg_tailq_read_unlock();
> +		cb->func(event, mp, cb->user_data);
> +		rte_mcfg_tailq_read_lock();

I think it is dangerous to unlock the list before invoking the callback.
During that time, another thread can remove the next mempool callback, and
the next iteration will access a freed element, causing undefined
behavior.

Is it a problem to keep the lock held during the callback invocation?

I see that you have a test for this, and that you wrote a comment in the
documentation:

 * rte_mempool_event_callback_register() may be called from within the callback,
 * but the callbacks registered this way will not be invoked for the same event.
 * rte_mempool_event_callback_unregister() may only be safely called
 * to remove the running callback.

But is there a use-case for this?
If not, I'd tend to say that we can document that it is not allowed to
create, free or list mempools, or to register callbacks, from within the callback.


> +	}
> +	rte_mcfg_tailq_read_unlock();
> +}
> +
> +int
> +rte_mempool_event_callback_register(rte_mempool_event_callback *func,
> +				    void *user_data)
> +{
> +	struct mempool_callback_list *list;
> +	struct rte_tailq_entry *te = NULL;
> +	struct mempool_callback *cb;
> +	void *tmp_te;
> +	int ret;
> +
> +	if (func == NULL) {
> +		rte_errno = EINVAL;
> +		return -rte_errno;
> +	}
> +
> +	rte_mcfg_mempool_read_lock();
> +	rte_mcfg_tailq_write_lock();
> +
> +	list = RTE_TAILQ_CAST(callback_tailq.head, mempool_callback_list);
> +	RTE_TAILQ_FOREACH_SAFE(te, list, next, tmp_te) {
> +		struct mempool_callback *cb =
> +					(struct mempool_callback *)te->data;
> +		if (cb->func == func && cb->user_data == user_data) {
> +			ret = -EEXIST;
> +			goto exit;
> +		}
> +	}
> +
> +	te = rte_zmalloc("MEMPOOL_TAILQ_ENTRY", sizeof(*te), 0);
> +	if (te == NULL) {
> +		RTE_LOG(ERR, MEMPOOL,
> +			"Cannot allocate event callback tailq entry!\n");
> +		ret = -ENOMEM;
> +		goto exit;
> +	}
> +
> +	cb = rte_malloc("MEMPOOL_EVENT_CALLBACK", sizeof(*cb), 0);
> +	if (cb == NULL) {
> +		RTE_LOG(ERR, MEMPOOL,
> +			"Cannot allocate event callback!\n");
> +		rte_free(te);
> +		ret = -ENOMEM;
> +		goto exit;
> +	}
> +
> +	cb->func = func;
> +	cb->user_data = user_data;
> +	te->data = cb;
> +	TAILQ_INSERT_TAIL(list, te, next);
> +	ret = 0;
> +
> +exit:
> +	rte_mcfg_tailq_write_unlock();
> +	rte_mcfg_mempool_read_unlock();
> +	rte_errno = -ret;
> +	return ret;
> +}
> +
> +int
> +rte_mempool_event_callback_unregister(rte_mempool_event_callback *func,
> +				      void *user_data)
> +{
> +	struct mempool_callback_list *list;
> +	struct rte_tailq_entry *te = NULL;
> +	struct mempool_callback *cb;
> +	int ret;
> +
> +	if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> +		rte_errno = EPERM;
> +		return -1;
> +	}

The help of the register function says
 * Callbacks will be invoked in the process that creates the mempool.

So registration is allowed from a primary or a secondary process. Can't a
secondary process destroy the callback it has registered?

> +
> +	rte_mcfg_mempool_read_lock();
> +	rte_mcfg_tailq_write_lock();

I don't understand why there are 2 locks here.

After looking at the code, I think the locking model is already
incorrect in current mempool code:

   rte_mcfg_tailq_write_lock() is used in create and free to protect the
     access to the mempool tailq

   rte_mcfg_mempool_write_lock() is used in create(), to protect from
     concurrent creation (with same name for instance), but I doubt it
     is absolutely needed, because memzone_reserve is already protected.

   rte_mcfg_mempool_read_lock() is used in dump functions, but to me
     it should use rte_mcfg_tailq_read_lock() instead.
     Currently, doing a dump and a free concurrently can cause a crash
     because they are not using the same lock.

In your case, I suggest using only one lock to protect the callback
list. I think it can be rte_mcfg_tailq_*_lock().
  
Dmitry Kozlyuk Oct. 15, 2021, 1:07 p.m. UTC | #4
[...]
> > @@ -360,6 +372,10 @@ rte_mempool_populate_iova(struct rte_mempool *mp,
> char *vaddr,
> >       STAILQ_INSERT_TAIL(&mp->mem_list, memhdr, next);
> >       mp->nb_mem_chunks++;
> >
> > +     /* Report the mempool as ready only when fully populated. */
> > +     if (mp->populated_size >= mp->size)
> > +             mempool_event_callback_invoke(RTE_MEMPOOL_EVENT_READY,
> mp);
> > +
> 
> One small comment here. I think it does not happen today, but in the
> future, something that could happen is:
>   - create empty mempool
>   - populate mempool
>   - use mempool
>   - populate mempool with more objects
>   - use mempool
> 
> I've seen one usage there: https://www.youtube.com/watch?v=SzQFn9tm4Sw
> 
> In that case, it would require a POPULATE event instead of a
> MEMPOOL_CREATE event.

That's a troublesome case.
Technically there could be a single POPULATE event invoked on each population,
even an incomplete one, and the callback could check populated_size vs size.
However, I'd keep the READY event indeed, because it's the common case,
and the first consumer of the feature, the mlx5 PMD, can handle the corner cases.
It is also a little more efficient. POPULATE can be added later.

> Enhancing the documentation to better explain when the callback is
> invoked looks enough to me for the moment.

Per Andrew's suggestion, it will say:
"Occurs after a mempool is fully populated".
I hope this is clear enough.
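
Just to illustrate the populated_size idea (RTE_MEMPOOL_EVENT_POPULATE is
hypothetical and not part of this patch), such a callback could tell partial
from full population on its own:

static void
example_populate_cb(enum rte_mempool_event event,
		    struct rte_mempool *mp, void *user_data)
{
	RTE_SET_USED(user_data);
	if (event == RTE_MEMPOOL_EVENT_POPULATE &&
	    mp->populated_size >= mp->size) {
		/* The mempool is now fully populated. */
	}
}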

> [...]
> > +static void
> > +mempool_event_callback_invoke(enum rte_mempool_event event,
> > +                           struct rte_mempool *mp)
> > +{
> > +     struct mempool_callback_list *list;
> > +     struct rte_tailq_entry *te;
> > +     void *tmp_te;
> > +
> > +     rte_mcfg_tailq_read_lock();
> > +     list = RTE_TAILQ_CAST(callback_tailq.head, mempool_callback_list);
> > +     RTE_TAILQ_FOREACH_SAFE(te, list, next, tmp_te) {
> > +             struct mempool_callback *cb = te->data;
> > +             rte_mcfg_tailq_read_unlock();
> > +             cb->func(event, mp, cb->user_data);
> > +             rte_mcfg_tailq_read_lock();
> 
> I think it is dangerous to unlock the list before invoking the callback.
> During that time, another thread can remove the next mempool callback, and
> the next iteration will access to a freed element, causing an undefined
> behavior.
> 
> Is it a problem to keep the lock held during the callback invocation?
> 
> I see that you have a test for this, and that you wrote a comment in the
> documentation:
> 
>  * rte_mempool_event_callback_register() may be called from within the
> callback,
>  * but the callbacks registered this way will not be invoked for the same
> event.
>  * rte_mempool_event_callback_unregister() may only be safely called
>  * to remove the running callback.
> 
> But is there a use-case for this?
> If no, I'll tend to say that we can document that it is not allowed to
> create, free or list mempools or register cb from the callback.

There is no use-case, but I'd still argue for releasing the lock.
This lock is taken by rte_xxx_create() functions in many libraries,
so the restriction is much wider and, worse, not clearly bounded.

> [...]
> > +int
> > +rte_mempool_event_callback_unregister(rte_mempool_event_callback *func,
> > +                                   void *user_data)
> > +{
> > +     struct mempool_callback_list *list;
> > +     struct rte_tailq_entry *te = NULL;
> > +     struct mempool_callback *cb;
> > +     int ret;
> > +
> > +     if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> > +             rte_errno = EPERM;
> > +             return -1;
> > +     }
> 
> The help of the register function says
>  * Callbacks will be invoked in the process that creates the mempool.

BTW, this is another bug, it should be "populates", not "creates".

> So registration is allowed from primary or secondary process. Can't a
> secondary process destroys the callback it has loaded?
> 
> > +
> > +     rte_mcfg_mempool_read_lock();
> > +     rte_mcfg_tailq_write_lock();
> 
> I don't understand why there are 2 locks here.
> 
> After looking at the code, I think the locking model is already
> incorrect in current mempool code:
> 
>    rte_mcfg_tailq_write_lock() is used in create and free to protect the
>      access to the mempool tailq
> 
>    rte_mcfg_mempool_write_lock() is used in create(), to protect from
>      concurrent creation (with same name for instance), but I doubt it
>      is absolutly needed, because memzone_reserve is already protected.
> 
>    rte_mcfg_mempool_read_lock() is used in dump functions, but to me
>      it should use rte_mcfg_tailq_read_lock() instead.
>      Currently, doing a dump and a free concurrently can cause a crash
>      because they are not using the same lock.
> 
> In your case, I suggest to use only one lock to protect the callback
> list. I think it can be rte_mcfg_tailq_*_lock().

Thanks, I will double-check the locking.
  
Olivier Matz Oct. 15, 2021, 1:40 p.m. UTC | #5
On Fri, Oct 15, 2021 at 01:07:42PM +0000, Dmitry Kozlyuk wrote:
[...]
> > > +static void
> > > +mempool_event_callback_invoke(enum rte_mempool_event event,
> > > +                           struct rte_mempool *mp)
> > > +{
> > > +     struct mempool_callback_list *list;
> > > +     struct rte_tailq_entry *te;
> > > +     void *tmp_te;
> > > +
> > > +     rte_mcfg_tailq_read_lock();
> > > +     list = RTE_TAILQ_CAST(callback_tailq.head, mempool_callback_list);
> > > +     RTE_TAILQ_FOREACH_SAFE(te, list, next, tmp_te) {
> > > +             struct mempool_callback *cb = te->data;
> > > +             rte_mcfg_tailq_read_unlock();
> > > +             cb->func(event, mp, cb->user_data);
> > > +             rte_mcfg_tailq_read_lock();
> > 
> > I think it is dangerous to unlock the list before invoking the callback.
> > During that time, another thread can remove the next mempool callback, and
> > the next iteration will access to a freed element, causing an undefined
> > behavior.
> > 
> > Is it a problem to keep the lock held during the callback invocation?
> > 
> > I see that you have a test for this, and that you wrote a comment in the
> > documentation:
> > 
> >  * rte_mempool_event_callback_register() may be called from within the
> > callback,
> >  * but the callbacks registered this way will not be invoked for the same
> > event.
> >  * rte_mempool_event_callback_unregister() may only be safely called
> >  * to remove the running callback.
> > 
> > But is there a use-case for this?
> > If no, I'll tend to say that we can document that it is not allowed to
> > create, free or list mempools or register cb from the callback.
> 
> There is no use-case, but I'd argue for releasing the lock.
> This lock is taken by rte_xxx_create() functions in many libraries,
> so the restriction is wider and, worse, it is not strictly limited.

Yes... I honestly don't understand why every library uses the same
rte_mcfg_tailq lock if the only code that accesses the list is in the
library itself. Maybe I'm missing something.

I have the impression that having only one mempool lock for all usages in
mempool would be simpler and more specific. It would allow keeping the lock held
while invoking the callbacks without blocking access to the other libraries, and
would also solve the problem described below.
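
For instance, holding a single lock across the whole traversal (a sketch
only; it could equally be a dedicated mempool lock rather than
rte_mcfg_tailq) would also remove the need for the SAFE iteration variant:

	rte_mcfg_tailq_read_lock();
	list = RTE_TAILQ_CAST(callback_tailq.head, mempool_callback_list);
	TAILQ_FOREACH(te, list, next) {
		struct mempool_callback *cb = te->data;
		cb->func(event, mp, cb->user_data);
	}
	rte_mcfg_tailq_read_unlock();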



> > [...]
> > > +int
> > > +rte_mempool_event_callback_unregister(rte_mempool_event_callback *func,
> > > +                                   void *user_data)
> > > +{
> > > +     struct mempool_callback_list *list;
> > > +     struct rte_tailq_entry *te = NULL;
> > > +     struct mempool_callback *cb;
> > > +     int ret;
> > > +
> > > +     if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> > > +             rte_errno = EPERM;
> > > +             return -1;
> > > +     }
> > 
> > The help of the register function says
> >  * Callbacks will be invoked in the process that creates the mempool.
> 
> BTW, this is another bug, it should be "populates", not "creates".
> 
> > So registration is allowed from primary or secondary process. Can't a
> > secondary process destroys the callback it has loaded?
> > 
> > > +
> > > +     rte_mcfg_mempool_read_lock();
> > > +     rte_mcfg_tailq_write_lock();
> > 
> > I don't understand why there are 2 locks here.
> > 
> > After looking at the code, I think the locking model is already
> > incorrect in current mempool code:
> > 
> >    rte_mcfg_tailq_write_lock() is used in create and free to protect the
> >      access to the mempool tailq
> > 
> >    rte_mcfg_mempool_write_lock() is used in create(), to protect from
> >      concurrent creation (with same name for instance), but I doubt it
> >      is absolutly needed, because memzone_reserve is already protected.
> > 
> >    rte_mcfg_mempool_read_lock() is used in dump functions, but to me
> >      it should use rte_mcfg_tailq_read_lock() instead.
> >      Currently, doing a dump and a free concurrently can cause a crash
> >      because they are not using the same lock.
> > 
> > In your case, I suggest to use only one lock to protect the callback
> > list. I think it can be rte_mcfg_tailq_*_lock().
> 
> Thanks, I will double-check the locking.
  
Dmitry Kozlyuk Oct. 19, 2021, 1:08 p.m. UTC | #6
> > +/**
> > + * Mempool event type.
> > + * @internal
> 
> Shouldn't internal go first?
> 
> > + */
> > +enum rte_mempool_event {

It really should, but I had to keep it this way
because otherwise Doxygen fails on multiple systems:

[3110/3279] Generating doxygen with a custom command
FAILED: doc/api/html 
/root/dpdk/doc/api/generate_doxygen.sh doc/api/doxy-api.conf doc/api/html 	/root/dpdk/doc/api/doxy-html-custom.sh
/root/dpdk/lib/mempool/rte_mempool.h:1778: error: Member rte_mempool_event 	(enumeration) of file rte_mempool.h is not documented.
	(warning treated as error, aborting now)
/root/dpdk/doc/api/generate_doxygen.sh: line 12: 51733 Segmentation fault      
	(core dumped) doxygen "${DOXYCONF}" > $OUT_FILE
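
For reference, the two orderings in question (illustrative only):

/**
 * @internal
 * Mempool event type.
 */

fails on those Doxygen versions with the "not documented" error above, while

/**
 * Mempool event type.
 * @internal
 */

is accepted, which is why the patch keeps @internal last.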
  

Patch

diff --git a/app/test/test_mempool.c b/app/test/test_mempool.c
index 7675a3e605..bc0cc9ed48 100644
--- a/app/test/test_mempool.c
+++ b/app/test/test_mempool.c
@@ -14,6 +14,7 @@ 
 #include <rte_common.h>
 #include <rte_log.h>
 #include <rte_debug.h>
+#include <rte_errno.h>
 #include <rte_memory.h>
 #include <rte_launch.h>
 #include <rte_cycles.h>
@@ -471,6 +472,206 @@  test_mp_mem_init(struct rte_mempool *mp,
 	data->ret = 0;
 }
 
+struct test_mempool_events_data {
+	struct rte_mempool *mp;
+	enum rte_mempool_event event;
+	bool invoked;
+};
+
+static void
+test_mempool_events_cb(enum rte_mempool_event event,
+		       struct rte_mempool *mp, void *user_data)
+{
+	struct test_mempool_events_data *data = user_data;
+
+	data->mp = mp;
+	data->event = event;
+	data->invoked = true;
+}
+
+static int
+test_mempool_events(int (*populate)(struct rte_mempool *mp))
+{
+	static const size_t CB_NUM = 3;
+	static const size_t MP_NUM = 2;
+
+	struct test_mempool_events_data data[CB_NUM];
+	struct rte_mempool *mp[MP_NUM];
+	char name[RTE_MEMPOOL_NAMESIZE];
+	size_t i, j;
+	int ret;
+
+	for (i = 0; i < CB_NUM; i++) {
+		ret = rte_mempool_event_callback_register
+				(test_mempool_events_cb, &data[i]);
+		RTE_TEST_ASSERT_EQUAL(ret, 0, "Failed to register the callback %zu: %s",
+				      i, rte_strerror(rte_errno));
+	}
+	ret = rte_mempool_event_callback_unregister(test_mempool_events_cb, mp);
+	RTE_TEST_ASSERT_NOT_EQUAL(ret, 0, "Unregistered a non-registered callback");
+	/* NULL argument has no special meaning in this API. */
+	ret = rte_mempool_event_callback_unregister(test_mempool_events_cb,
+						    NULL);
+	RTE_TEST_ASSERT_NOT_EQUAL(ret, 0, "Unregistered a non-registered callback with NULL argument");
+
+	/* Create mempool 0 that will be observed by all callbacks. */
+	memset(&data, 0, sizeof(data));
+	strcpy(name, "empty0");
+	mp[0] = rte_mempool_create_empty(name, MEMPOOL_SIZE,
+					 MEMPOOL_ELT_SIZE, 0, 0,
+					 SOCKET_ID_ANY, 0);
+	RTE_TEST_ASSERT_NOT_NULL(mp[0], "Cannot create mempool %s: %s",
+				 name, rte_strerror(rte_errno));
+	for (j = 0; j < CB_NUM; j++)
+		RTE_TEST_ASSERT_EQUAL(data[j].invoked, false,
+				      "Callback %zu invoked on %s mempool creation",
+				      j, name);
+
+	rte_mempool_set_ops_byname(mp[0], rte_mbuf_best_mempool_ops(), NULL);
+	ret = populate(mp[0]);
+	RTE_TEST_ASSERT_EQUAL(ret, (int)mp[0]->size, "Failed to populate mempool %s: %s",
+			      name, rte_strerror(rte_errno));
+	for (j = 0; j < CB_NUM; j++) {
+		RTE_TEST_ASSERT_EQUAL(data[j].invoked, true,
+					"Callback %zu not invoked on mempool %s population",
+					j, name);
+		RTE_TEST_ASSERT_EQUAL(data[j].event,
+					RTE_MEMPOOL_EVENT_READY,
+					"Wrong callback invoked, expected READY");
+		RTE_TEST_ASSERT_EQUAL(data[j].mp, mp[0],
+					"Callback %zu invoked for a wrong mempool instead of %s",
+					j, name);
+	}
+
+	/* Check that unregistered callback 0 observes no events. */
+	ret = rte_mempool_event_callback_unregister(test_mempool_events_cb,
+						    &data[0]);
+	RTE_TEST_ASSERT_EQUAL(ret, 0, "Failed to unregister callback 0: %s",
+			      rte_strerror(rte_errno));
+	memset(&data, 0, sizeof(data));
+	strcpy(name, "empty1");
+	mp[1] = rte_mempool_create_empty(name, MEMPOOL_SIZE,
+					 MEMPOOL_ELT_SIZE, 0, 0,
+					 SOCKET_ID_ANY, 0);
+	RTE_TEST_ASSERT_NOT_NULL(mp[1], "Cannot create mempool %s: %s",
+				 name, rte_strerror(rte_errno));
+	rte_mempool_set_ops_byname(mp[1], rte_mbuf_best_mempool_ops(), NULL);
+	ret = populate(mp[1]);
+	RTE_TEST_ASSERT_EQUAL(ret, (int)mp[1]->size, "Failed to populate mempool %s: %s",
+			      name, rte_strerror(rte_errno));
+	RTE_TEST_ASSERT_EQUAL(data[0].invoked, false,
+			      "Unregistered callback 0 invoked on %s mempool populaton",
+			      name);
+
+	for (i = 0; i < MP_NUM; i++) {
+		memset(&data, 0, sizeof(data));
+		sprintf(name, "empty%zu", i);
+		rte_mempool_free(mp[i]);
+		for (j = 1; j < CB_NUM; j++) {
+			RTE_TEST_ASSERT_EQUAL(data[j].invoked, true,
+					      "Callback %zu not invoked on mempool %s destruction",
+					      j, name);
+			RTE_TEST_ASSERT_EQUAL(data[j].event,
+					      RTE_MEMPOOL_EVENT_DESTROY,
+					      "Wrong callback invoked, expected DESTROY");
+			RTE_TEST_ASSERT_EQUAL(data[j].mp, mp[i],
+					      "Callback %zu invoked for a wrong mempool instead of %s",
+					      j, name);
+		}
+		RTE_TEST_ASSERT_EQUAL(data[0].invoked, false,
+				      "Unregistered callback 0 invoked on %s mempool destruction",
+				      name);
+	}
+
+	for (j = 1; j < CB_NUM; j++) {
+		ret = rte_mempool_event_callback_unregister
+					(test_mempool_events_cb, &data[j]);
+		RTE_TEST_ASSERT_EQUAL(ret, 0, "Failed to unregister the callback %zu: %s",
+				      j, rte_strerror(rte_errno));
+	}
+	return 0;
+}
+
+struct test_mempool_events_safety_data {
+	bool invoked;
+	int (*api_func)(rte_mempool_event_callback *func, void *user_data);
+	rte_mempool_event_callback *cb_func;
+	void *cb_user_data;
+	int ret;
+};
+
+static void
+test_mempool_events_safety_cb(enum rte_mempool_event event,
+			      struct rte_mempool *mp, void *user_data)
+{
+	struct test_mempool_events_safety_data *data = user_data;
+
+	RTE_SET_USED(event);
+	RTE_SET_USED(mp);
+	data->invoked = true;
+	data->ret = data->api_func(data->cb_func, data->cb_user_data);
+}
+
+static int
+test_mempool_events_safety(void)
+{
+	struct test_mempool_events_data data;
+	struct test_mempool_events_safety_data sdata[2];
+	struct rte_mempool *mp;
+	size_t i;
+	int ret;
+
+	/* removes itself */
+	sdata[0].api_func = rte_mempool_event_callback_unregister;
+	sdata[0].cb_func = test_mempool_events_safety_cb;
+	sdata[0].cb_user_data = &sdata[0];
+	sdata[0].ret = -1;
+	rte_mempool_event_callback_register(test_mempool_events_safety_cb,
+					    &sdata[0]);
+	/* inserts a callback after itself */
+	sdata[1].api_func = rte_mempool_event_callback_register;
+	sdata[1].cb_func = test_mempool_events_cb;
+	sdata[1].cb_user_data = &data;
+	sdata[1].ret = -1;
+	rte_mempool_event_callback_register(test_mempool_events_safety_cb,
+					    &sdata[1]);
+
+	mp = rte_mempool_create_empty("empty", MEMPOOL_SIZE,
+				      MEMPOOL_ELT_SIZE, 0, 0,
+				      SOCKET_ID_ANY, 0);
+	RTE_TEST_ASSERT_NOT_NULL(mp, "Cannot create mempool: %s",
+				 rte_strerror(rte_errno));
+	memset(&data, 0, sizeof(data));
+	ret = rte_mempool_populate_default(mp);
+	RTE_TEST_ASSERT_EQUAL(ret, (int)mp->size, "Failed to populate mempool: %s",
+			      rte_strerror(rte_errno));
+
+	RTE_TEST_ASSERT_EQUAL(sdata[0].ret, 0, "Callback failed to unregister itself: %s",
+			      rte_strerror(rte_errno));
+	RTE_TEST_ASSERT_EQUAL(sdata[1].ret, 0, "Failed to insert a new callback: %s",
+			      rte_strerror(rte_errno));
+	RTE_TEST_ASSERT_EQUAL(data.invoked, false,
+			      "Inserted callback is invoked on mempool population");
+
+	memset(&data, 0, sizeof(data));
+	sdata[0].invoked = false;
+	rte_mempool_free(mp);
+	RTE_TEST_ASSERT_EQUAL(sdata[0].invoked, false,
+			      "Callback that unregistered itself was called");
+	RTE_TEST_ASSERT_EQUAL(sdata[1].ret, -EEXIST,
+			      "New callback inserted twice");
+	RTE_TEST_ASSERT_EQUAL(data.invoked, true,
+			      "Inserted callback is not invoked on mempool destruction");
+
+	/* cleanup, don't care which callbacks are already removed */
+	rte_mempool_event_callback_unregister(test_mempool_events_cb, &data);
+	for (i = 0; i < RTE_DIM(sdata); i++)
+		rte_mempool_event_callback_unregister
+						(test_mempool_events_safety_cb,
+						 &sdata[i]);
+	return 0;
+}
+
 static int
 test_mempool(void)
 {
@@ -645,6 +846,14 @@  test_mempool(void)
 	if (test_mempool_basic(default_pool, 1) < 0)
 		GOTO_ERR(ret, err);
 
+	/* test mempool event callbacks */
+	if (test_mempool_events(rte_mempool_populate_default) < 0)
+		GOTO_ERR(ret, err);
+	if (test_mempool_events(rte_mempool_populate_anon) < 0)
+		GOTO_ERR(ret, err);
+	if (test_mempool_events_safety() < 0)
+		GOTO_ERR(ret, err);
+
 	rte_mempool_list_dump(stdout);
 
 	ret = 0;
diff --git a/lib/mempool/rte_mempool.c b/lib/mempool/rte_mempool.c
index c5f859ae71..51c0ba2931 100644
--- a/lib/mempool/rte_mempool.c
+++ b/lib/mempool/rte_mempool.c
@@ -42,6 +42,18 @@  static struct rte_tailq_elem rte_mempool_tailq = {
 };
 EAL_REGISTER_TAILQ(rte_mempool_tailq)
 
+TAILQ_HEAD(mempool_callback_list, rte_tailq_entry);
+
+static struct rte_tailq_elem callback_tailq = {
+	.name = "RTE_MEMPOOL_CALLBACK",
+};
+EAL_REGISTER_TAILQ(callback_tailq)
+
+/* Invoke all registered mempool event callbacks. */
+static void
+mempool_event_callback_invoke(enum rte_mempool_event event,
+			      struct rte_mempool *mp);
+
 #define CACHE_FLUSHTHRESH_MULTIPLIER 1.5
 #define CALC_CACHE_FLUSHTHRESH(c)	\
 	((typeof(c))((c) * CACHE_FLUSHTHRESH_MULTIPLIER))
@@ -360,6 +372,10 @@  rte_mempool_populate_iova(struct rte_mempool *mp, char *vaddr,
 	STAILQ_INSERT_TAIL(&mp->mem_list, memhdr, next);
 	mp->nb_mem_chunks++;
 
+	/* Report the mempool as ready only when fully populated. */
+	if (mp->populated_size >= mp->size)
+		mempool_event_callback_invoke(RTE_MEMPOOL_EVENT_READY, mp);
+
 	rte_mempool_trace_populate_iova(mp, vaddr, iova, len, free_cb, opaque);
 	return i;
 
@@ -722,6 +738,7 @@  rte_mempool_free(struct rte_mempool *mp)
 	}
 	rte_mcfg_tailq_write_unlock();
 
+	mempool_event_callback_invoke(RTE_MEMPOOL_EVENT_DESTROY, mp);
 	rte_mempool_trace_free(mp);
 	rte_mempool_free_memchunks(mp);
 	rte_mempool_ops_free(mp);
@@ -1343,3 +1360,123 @@  void rte_mempool_walk(void (*func)(struct rte_mempool *, void *),
 
 	rte_mcfg_mempool_read_unlock();
 }
+
+struct mempool_callback {
+	rte_mempool_event_callback *func;
+	void *user_data;
+};
+
+static void
+mempool_event_callback_invoke(enum rte_mempool_event event,
+			      struct rte_mempool *mp)
+{
+	struct mempool_callback_list *list;
+	struct rte_tailq_entry *te;
+	void *tmp_te;
+
+	rte_mcfg_tailq_read_lock();
+	list = RTE_TAILQ_CAST(callback_tailq.head, mempool_callback_list);
+	RTE_TAILQ_FOREACH_SAFE(te, list, next, tmp_te) {
+		struct mempool_callback *cb = te->data;
+		rte_mcfg_tailq_read_unlock();
+		cb->func(event, mp, cb->user_data);
+		rte_mcfg_tailq_read_lock();
+	}
+	rte_mcfg_tailq_read_unlock();
+}
+
+int
+rte_mempool_event_callback_register(rte_mempool_event_callback *func,
+				    void *user_data)
+{
+	struct mempool_callback_list *list;
+	struct rte_tailq_entry *te = NULL;
+	struct mempool_callback *cb;
+	void *tmp_te;
+	int ret;
+
+	if (func == NULL) {
+		rte_errno = EINVAL;
+		return -rte_errno;
+	}
+
+	rte_mcfg_mempool_read_lock();
+	rte_mcfg_tailq_write_lock();
+
+	list = RTE_TAILQ_CAST(callback_tailq.head, mempool_callback_list);
+	RTE_TAILQ_FOREACH_SAFE(te, list, next, tmp_te) {
+		struct mempool_callback *cb =
+					(struct mempool_callback *)te->data;
+		if (cb->func == func && cb->user_data == user_data) {
+			ret = -EEXIST;
+			goto exit;
+		}
+	}
+
+	te = rte_zmalloc("MEMPOOL_TAILQ_ENTRY", sizeof(*te), 0);
+	if (te == NULL) {
+		RTE_LOG(ERR, MEMPOOL,
+			"Cannot allocate event callback tailq entry!\n");
+		ret = -ENOMEM;
+		goto exit;
+	}
+
+	cb = rte_malloc("MEMPOOL_EVENT_CALLBACK", sizeof(*cb), 0);
+	if (cb == NULL) {
+		RTE_LOG(ERR, MEMPOOL,
+			"Cannot allocate event callback!\n");
+		rte_free(te);
+		ret = -ENOMEM;
+		goto exit;
+	}
+
+	cb->func = func;
+	cb->user_data = user_data;
+	te->data = cb;
+	TAILQ_INSERT_TAIL(list, te, next);
+	ret = 0;
+
+exit:
+	rte_mcfg_tailq_write_unlock();
+	rte_mcfg_mempool_read_unlock();
+	rte_errno = -ret;
+	return ret;
+}
+
+int
+rte_mempool_event_callback_unregister(rte_mempool_event_callback *func,
+				      void *user_data)
+{
+	struct mempool_callback_list *list;
+	struct rte_tailq_entry *te = NULL;
+	struct mempool_callback *cb;
+	int ret;
+
+	if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
+		rte_errno = EPERM;
+		return -1;
+	}
+
+	rte_mcfg_mempool_read_lock();
+	rte_mcfg_tailq_write_lock();
+	ret = -ENOENT;
+	list = RTE_TAILQ_CAST(callback_tailq.head, mempool_callback_list);
+	TAILQ_FOREACH(te, list, next) {
+		cb = (struct mempool_callback *)te->data;
+		if (cb->func == func && cb->user_data == user_data)
+			break;
+	}
+	if (te != NULL) {
+		TAILQ_REMOVE(list, te, next);
+		ret = 0;
+	}
+	rte_mcfg_tailq_write_unlock();
+	rte_mcfg_mempool_read_unlock();
+
+	if (ret == 0) {
+		rte_free(te);
+		rte_free(cb);
+	}
+	rte_errno = -ret;
+	return ret;
+}
diff --git a/lib/mempool/rte_mempool.h b/lib/mempool/rte_mempool.h
index f57ecbd6fc..663123042f 100644
--- a/lib/mempool/rte_mempool.h
+++ b/lib/mempool/rte_mempool.h
@@ -1774,6 +1774,67 @@  void rte_mempool_walk(void (*func)(struct rte_mempool *, void *arg),
 int
 rte_mempool_get_page_size(struct rte_mempool *mp, size_t *pg_sz);
 
+/**
+ * Mempool event type.
+ * @internal
+ */
+enum rte_mempool_event {
+	/** Occurs after a mempool is successfully populated. */
+	RTE_MEMPOOL_EVENT_READY = 0,
+	/** Occurs before destruction of a mempool begins. */
+	RTE_MEMPOOL_EVENT_DESTROY = 1,
+};
+
+/**
+ * @internal
+ * Mempool event callback.
+ *
+ * rte_mempool_event_callback_register() may be called from within the callback,
+ * but the callbacks registered this way will not be invoked for the same event.
+ * rte_mempool_event_callback_unregister() may only be safely called
+ * to remove the running callback.
+ */
+typedef void (rte_mempool_event_callback)(
+		enum rte_mempool_event event,
+		struct rte_mempool *mp,
+		void *user_data);
+
+/**
+ * @internal
+ * Register a callback invoked on mempool life cycle event.
+ * Callbacks will be invoked in the process that creates the mempool.
+ *
+ * @param func
+ *   Callback function.
+ * @param user_data
+ *   User data.
+ *
+ * @return
+ *   0 on success, negative on failure and rte_errno is set.
+ */
+__rte_internal
+int
+rte_mempool_event_callback_register(rte_mempool_event_callback *func,
+				    void *user_data);
+
+/**
+ * @internal
+ * Unregister a callback added with rte_mempool_event_callback_register().
+ * @p func and @p user_data must exactly match registration parameters.
+ *
+ * @param func
+ *   Callback function.
+ * @param user_data
+ *   User data.
+ *
+ * @return
+ *   0 on success, negative on failure and rte_errno is set.
+ */
+__rte_internal
+int
+rte_mempool_event_callback_unregister(rte_mempool_event_callback *func,
+				      void *user_data);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/mempool/version.map b/lib/mempool/version.map
index 9f77da6fff..1b7d7c5456 100644
--- a/lib/mempool/version.map
+++ b/lib/mempool/version.map
@@ -64,3 +64,11 @@  EXPERIMENTAL {
 	__rte_mempool_trace_ops_free;
 	__rte_mempool_trace_set_ops_byname;
 };
+
+INTERNAL {
+	global:
+
+	# added in 21.11
+	rte_mempool_event_callback_register;
+	rte_mempool_event_callback_unregister;
+};