[v3,1/4] mempool: add event callbacks

Message ID: 20211012000409.2751908-2-dkozlyuk@nvidia.com (mailing list archive)
State: Superseded, archived
Delegated to: Thomas Monjalon
Series: net/mlx5: implicit mempool registration

Checks

ci/checkpatch: warning (coding style issues)

Commit Message

Dmitry Kozlyuk Oct. 12, 2021, 12:04 a.m. UTC
  Data path performance can benefit if the PMD knows which memory it will
need to handle in advance, before the first mbuf is sent to the PMD.
It is impractical, however, to consider all allocated memory for this
purpose. Most often mbuf memory comes from mempools that can come and
go. A PMD can enumerate existing mempools on device start, but it also
needs to track creation and destruction of mempools after forwarding
starts but before an mbuf from a new mempool is sent to the device.

Add an internal API to register callbacks for mempool life cycle events:
* rte_mempool_event_callback_register()
* rte_mempool_event_callback_unregister()
Currently tracked events are:
* RTE_MEMPOOL_EVENT_READY (after populating a mempool)
* RTE_MEMPOOL_EVENT_DESTROY (before freeing a mempool)
Provide a unit test for the new API.
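
For illustration, a minimal sketch of how a consumer might use the new
API (the pmd_* names are hypothetical and the printf bodies stand in
for real work such as DMA mapping; only the register/unregister calls
and the callback signature come from this patch):

    #include <stdint.h>
    #include <stdio.h>

    #include <rte_errno.h>
    #include <rte_mempool.h>

    /* Hypothetical handler: a real PMD would (un)map mempool memory here. */
    static void
    pmd_mempool_event_cb(enum rte_mempool_event event,
                         struct rte_mempool *mp, void *user_data)
    {
        const char *dev_name = user_data;

        if (event == RTE_MEMPOOL_EVENT_READY)
            printf("%s: mempool %s fully populated\n", dev_name, mp->name);
        else /* RTE_MEMPOOL_EVENT_DESTROY */
            printf("%s: mempool %s about to be freed\n", dev_name, mp->name);
    }

    static int
    pmd_start_notifications(const char *dev_name)
    {
        /* Start receiving READY/DESTROY notifications. */
        if (rte_mempool_event_callback_register(pmd_mempool_event_cb,
                                                (void *)(uintptr_t)dev_name) < 0)
            return -rte_errno;
        return 0;
    }

    static void
    pmd_stop_notifications(const char *dev_name)
    {
        /* Arguments must exactly match those passed at registration. */
        rte_mempool_event_callback_unregister(pmd_mempool_event_cb,
                                              (void *)(uintptr_t)dev_name);
    }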

Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
---
 app/test/test_mempool.c   |  75 +++++++++++++++++++++
 lib/mempool/rte_mempool.c | 137 ++++++++++++++++++++++++++++++++++++++
 lib/mempool/rte_mempool.h |  56 ++++++++++++++++
 lib/mempool/version.map   |   8 +++
 4 files changed, 276 insertions(+)
  

Comments

Andrew Rybchenko Oct. 12, 2021, 6:33 a.m. UTC | #1
On 10/12/21 3:04 AM, Dmitry Kozlyuk wrote:
> Data path performance can benefit if the PMD knows which memory it will
> need to handle in advance, before the first mbuf is sent to the PMD.
> It is impractical, however, to consider all allocated memory for this
> purpose. Most often mbuf memory comes from mempools that can come and
> go. A PMD can enumerate existing mempools on device start, but it also
> needs to track creation and destruction of mempools after forwarding
> starts but before an mbuf from a new mempool is sent to the device.
> 
> Add an internal API to register callbacks for mempool life cycle events:
> * rte_mempool_event_callback_register()
> * rte_mempool_event_callback_unregister()
> Currently tracked events are:
> * RTE_MEMPOOL_EVENT_READY (after populating a mempool)
> * RTE_MEMPOOL_EVENT_DESTROY (before freeing a mempool)
> Provide a unit test for the new API.

Good idea.

> Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
> Acked-by: Matan Azrad <matan@nvidia.com>

[snip]

I think it would be very useful to test two callbacks as well,
including creation of a new mempool after one of the callbacks is
unregistered, plus registering/unregistering callbacks from a
callback itself; a rough sketch of the first case follows below.
Feel free to drop it, since increasing test coverage
is almost endless :)
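
A rough sketch of that extra case (two callbacks, then a new mempool
after one of them is unregistered), reusing test_mempool_events_cb,
struct test_mempool_events_data, MEMPOOL_SIZE and MEMPOOL_ELT_SIZE
from the patch below; the "pair" pool name and the exact assertions
are only illustrative:

    static int
    test_mempool_events_pair(void)
    {
        struct test_mempool_events_data data[2];
        struct rte_mempool *mp;
        int i, ret;

        memset(data, 0, sizeof(data));
        for (i = 0; i < 2; i++) {
            ret = rte_mempool_event_callback_register(test_mempool_events_cb,
                                                      &data[i]);
            RTE_TEST_ASSERT_EQUAL(ret, 0, "Failed to register callback %d", i);
        }

        /* Drop the first callback; only the second must fire from now on. */
        ret = rte_mempool_event_callback_unregister(test_mempool_events_cb,
                                                    &data[0]);
        RTE_TEST_ASSERT_EQUAL(ret, 0, "Failed to unregister callback 0");

        mp = rte_mempool_create("pair", MEMPOOL_SIZE, MEMPOOL_ELT_SIZE,
                                0, 0, NULL, NULL, NULL, NULL,
                                SOCKET_ID_ANY, 0);
        RTE_TEST_ASSERT_NOT_NULL(mp, "Cannot create a mempool");
        RTE_TEST_ASSERT_EQUAL(data[0].invoked, false,
                              "Unregistered callback was invoked");
        RTE_TEST_ASSERT_EQUAL(data[1].invoked, true,
                              "Remaining callback not invoked on READY");

        rte_mempool_free(mp);
        rte_mempool_event_callback_unregister(test_mempool_events_cb,
                                              &data[1]);
        return 0;
    }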

> diff --git a/lib/mempool/rte_mempool.h b/lib/mempool/rte_mempool.h
> index f57ecbd6fc..e2bf40aa09 100644
> --- a/lib/mempool/rte_mempool.h
> +++ b/lib/mempool/rte_mempool.h
> @@ -1774,6 +1774,62 @@ void rte_mempool_walk(void (*func)(struct rte_mempool *, void *arg),
>  int
>  rte_mempool_get_page_size(struct rte_mempool *mp, size_t *pg_sz);
>  
> +/**
> + * Mempool event type.
> + * @internal
> + */
> +enum rte_mempool_event {
> +	/** Occurs after a mempool is successfully populated. */

successfully -> fully ?

> +	RTE_MEMPOOL_EVENT_READY = 0,
> +	/** Occurs before destruction of a mempool begins. */
> +	RTE_MEMPOOL_EVENT_DESTROY = 1,
> +};
> +
> +/**
> + * @internal
> + * Mempool event callback.
> + */
> +typedef void (rte_mempool_event_callback)(
> +		enum rte_mempool_event event,
> +		struct rte_mempool *mp,
> +		void *user_data);
> +
> +/**
> + * @internal

I'd like to understand why the API is internal (not
experimental). I think the reasons should be clear from
the function description.

> + * Register a callback invoked on mempool life cycle events.
> + * Callbacks will be invoked in the process that creates the mempool.
> + *
> + * @param func
> + *   Callback function.
> + * @param user_data
> + *   User data.
> + *
> + * @return
> + *   0 on success, negative on failure and rte_errno is set.
> + */
> +__rte_internal
> +int
> +rte_mempool_event_callback_register(rte_mempool_event_callback *func,
> +				    void *user_data);
> +
> +/**
> + * @internal
> + * Unregister a callback added with rte_mempool_event_callback_register().
> + * @p func and @p user_data must exactly match registration parameters.
> + *
> + * @param func
> + *   Callback function.
> + * @param user_data
> + *   User data.
> + *
> + * @return
> + *   0 on success, negative on failure and rte_errno is set.
> + */
> +__rte_internal
> +int
> +rte_mempool_event_callback_unregister(rte_mempool_event_callback *func,
> +				      void *user_data);
> +
>  #ifdef __cplusplus
>  }
>  #endif
> diff --git a/lib/mempool/version.map b/lib/mempool/version.map
> index 9f77da6fff..1b7d7c5456 100644
> --- a/lib/mempool/version.map
> +++ b/lib/mempool/version.map
> @@ -64,3 +64,11 @@ EXPERIMENTAL {
>  	__rte_mempool_trace_ops_free;
>  	__rte_mempool_trace_set_ops_byname;
>  };
> +
> +INTERNAL {
> +	global:
> +
> +	# added in 21.11
> +	rte_mempool_event_callback_register;
> +	rte_mempool_event_callback_unregister;
> +};
>
  
Dmitry Kozlyuk Oct. 12, 2021, 9:37 a.m. UTC | #2
> On 10/12/21 3:04 AM, Dmitry Kozlyuk wrote:
> > Data path performance can benefit if the PMD knows which memory it
> > will need to handle in advance, before the first mbuf is sent to the PMD.
> [...]
> 
> I'd like to understand why the API is internal (not experimental). I think the reasons
> should be clear from the function description.

My reasoning was that PMDs need this API while applications don't. PMDs may need to deal with any mempools and don't control their creation, while the application knows which mempools it creates and doesn't care about internal mempools PMDs might create. But maybe I was wrong and there are applications that want to use mbufs from those internal mempools for non-DPDK IO, something SPDK-like. I'll add a note about that in the description and make them experimental.
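
For context, a rough sketch of this PMD-side pattern
(pmd_map_mempool() is a hypothetical helper standing in for real DMA
mapping, and the startup race between the two steps is ignored for
brevity): register the callback first to catch future mempools, then
enumerate the existing ones with rte_mempool_walk().

    #include <rte_errno.h>
    #include <rte_mempool.h>

    /* Hypothetical: make one mempool's memory known to the device. */
    static void
    pmd_map_mempool(struct rte_mempool *mp, void *dev)
    {
        (void)mp;
        (void)dev;
        /* ... DMA-map the mempool memory for the device ... */
    }

    static void
    pmd_mempool_event(enum rte_mempool_event event,
                      struct rte_mempool *mp, void *dev)
    {
        if (event == RTE_MEMPOOL_EVENT_READY)
            pmd_map_mempool(mp, dev);
        /* RTE_MEMPOOL_EVENT_DESTROY would undo the mapping. */
    }

    static int
    pmd_dev_start(void *dev)
    {
        /* Future mempools are reported via the callback... */
        if (rte_mempool_event_callback_register(pmd_mempool_event, dev) < 0)
            return -rte_errno;
        /* ...existing ones are enumerated explicitly. */
        rte_mempool_walk(pmd_map_mempool, dev);
        return 0;
    }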
  
Andrew Rybchenko Oct. 12, 2021, 9:46 a.m. UTC | #3
On 10/12/21 12:37 PM, Dmitry Kozlyuk wrote:
>> On 10/12/21 3:04 AM, Dmitry Kozlyuk wrote:
>>> Data path performance can benefit if the PMD knows which memory it
>>> will need to handle in advance, before the first mbuf is sent to the PMD.
>> [...]
>>
>> I'd like to understand why the API is internal (not experimental). I think the reasons
>> should be clear from the function description.
> 
> My reasoning was that PMDs need this API while applications don't. PMDs may need to deal with any mempools and don't control their creation, while the application knows which mempools it creates and doesn't care about internal mempools PMDs might create. But maybe I was wrong and there are applications that want to use mbufs from those internal mempools for non-DPDK IO, something SPDK-like. I'll add a note about that in the description and make them experimental.
> 

It is a good explanation. Thanks.
Maybe it is safer to keep it internal until
we find the first external user.
  

Patch

diff --git a/app/test/test_mempool.c b/app/test/test_mempool.c
index 7675a3e605..0c4ed7c60b 100644
--- a/app/test/test_mempool.c
+++ b/app/test/test_mempool.c
@@ -14,6 +14,7 @@ 
 #include <rte_common.h>
 #include <rte_log.h>
 #include <rte_debug.h>
+#include <rte_errno.h>
 #include <rte_memory.h>
 #include <rte_launch.h>
 #include <rte_cycles.h>
@@ -471,6 +472,74 @@  test_mp_mem_init(struct rte_mempool *mp,
 	data->ret = 0;
 }
 
+struct test_mempool_events_data {
+	struct rte_mempool *mp;
+	enum rte_mempool_event event;
+	bool invoked;
+};
+
+static void
+test_mempool_events_cb(enum rte_mempool_event event,
+		       struct rte_mempool *mp, void *arg)
+{
+	struct test_mempool_events_data *data = arg;
+
+	data->mp = mp;
+	data->event = event;
+	data->invoked = true;
+}
+
+static int
+test_mempool_events(int (*populate)(struct rte_mempool *mp))
+{
+	struct test_mempool_events_data data;
+	struct rte_mempool *mp;
+	int ret;
+
+	ret = rte_mempool_event_callback_register(NULL, &data);
+	RTE_TEST_ASSERT_NOT_EQUAL(ret, 0, "Registered a NULL callback");
+
+	memset(&data, 0, sizeof(data));
+	ret = rte_mempool_event_callback_register(test_mempool_events_cb,
+						  &data);
+	RTE_TEST_ASSERT_EQUAL(ret, 0, "Failed to register the callback: %s",
+			      rte_strerror(rte_errno));
+
+	mp = rte_mempool_create_empty("empty", MEMPOOL_SIZE,
+				      MEMPOOL_ELT_SIZE, 0, 0,
+				      SOCKET_ID_ANY, 0);
+	RTE_TEST_ASSERT_NOT_NULL(mp, "Cannot create an empty mempool: %s",
+				 rte_strerror(rte_errno));
+	RTE_TEST_ASSERT_EQUAL(data.invoked, false,
+			      "Callback invoked on an empty mempool creation");
+
+	rte_mempool_set_ops_byname(mp, rte_mbuf_best_mempool_ops(), NULL);
+	ret = populate(mp);
+	RTE_TEST_ASSERT_EQUAL(ret, (int)mp->size, "Failed to populate the mempool: %s",
+			      rte_strerror(rte_errno));
+	RTE_TEST_ASSERT_EQUAL(data.invoked, true,
+			      "Callback not invoked on an empty mempool population");
+	RTE_TEST_ASSERT_EQUAL(data.event, RTE_MEMPOOL_EVENT_READY,
+			      "Wrong callback invoked, expected READY");
+	RTE_TEST_ASSERT_EQUAL(data.mp, mp,
+			      "Callback invoked for a wrong mempool");
+
+	memset(&data, 0, sizeof(data));
+	rte_mempool_free(mp);
+	RTE_TEST_ASSERT_EQUAL(data.invoked, true,
+			      "Callback not invoked on mempool destruction");
+	RTE_TEST_ASSERT_EQUAL(data.event, RTE_MEMPOOL_EVENT_DESTROY,
+			      "Wrong callback invoked, expected DESTROY");
+	RTE_TEST_ASSERT_EQUAL(data.mp, mp,
+			      "Callback invoked for a wrong mempool");
+
+	ret = rte_mempool_event_callback_unregister(test_mempool_events_cb,
+						    &data);
+	RTE_TEST_ASSERT_EQUAL(ret, 0, "Failed to unregister the callback: %s",
+			      rte_strerror(rte_errno));
+	return 0;
+}
+
 static int
 test_mempool(void)
 {
@@ -645,6 +714,12 @@  test_mempool(void)
 	if (test_mempool_basic(default_pool, 1) < 0)
 		GOTO_ERR(ret, err);
 
+	/* test mempool event callbacks */
+	if (test_mempool_events(rte_mempool_populate_default) < 0)
+		GOTO_ERR(ret, err);
+	if (test_mempool_events(rte_mempool_populate_anon) < 0)
+		GOTO_ERR(ret, err);
+
 	rte_mempool_list_dump(stdout);
 
 	ret = 0;
diff --git a/lib/mempool/rte_mempool.c b/lib/mempool/rte_mempool.c
index c5f859ae71..51c0ba2931 100644
--- a/lib/mempool/rte_mempool.c
+++ b/lib/mempool/rte_mempool.c
@@ -42,6 +42,18 @@  static struct rte_tailq_elem rte_mempool_tailq = {
 };
 EAL_REGISTER_TAILQ(rte_mempool_tailq)
 
+TAILQ_HEAD(mempool_callback_list, rte_tailq_entry);
+
+static struct rte_tailq_elem callback_tailq = {
+	.name = "RTE_MEMPOOL_CALLBACK",
+};
+EAL_REGISTER_TAILQ(callback_tailq)
+
+/* Invoke all registered mempool event callbacks. */
+static void
+mempool_event_callback_invoke(enum rte_mempool_event event,
+			      struct rte_mempool *mp);
+
 #define CACHE_FLUSHTHRESH_MULTIPLIER 1.5
 #define CALC_CACHE_FLUSHTHRESH(c)	\
 	((typeof(c))((c) * CACHE_FLUSHTHRESH_MULTIPLIER))
@@ -360,6 +372,10 @@  rte_mempool_populate_iova(struct rte_mempool *mp, char *vaddr,
 	STAILQ_INSERT_TAIL(&mp->mem_list, memhdr, next);
 	mp->nb_mem_chunks++;
 
+	/* Report the mempool as ready only when fully populated. */
+	if (mp->populated_size >= mp->size)
+		mempool_event_callback_invoke(RTE_MEMPOOL_EVENT_READY, mp);
+
 	rte_mempool_trace_populate_iova(mp, vaddr, iova, len, free_cb, opaque);
 	return i;
 
@@ -722,6 +738,7 @@  rte_mempool_free(struct rte_mempool *mp)
 	}
 	rte_mcfg_tailq_write_unlock();
 
+	mempool_event_callback_invoke(RTE_MEMPOOL_EVENT_DESTROY, mp);
 	rte_mempool_trace_free(mp);
 	rte_mempool_free_memchunks(mp);
 	rte_mempool_ops_free(mp);
@@ -1343,3 +1360,123 @@  void rte_mempool_walk(void (*func)(struct rte_mempool *, void *),
 
 	rte_mcfg_mempool_read_unlock();
 }
+
+struct mempool_callback {
+	rte_mempool_event_callback *func;
+	void *user_data;
+};
+
+static void
+mempool_event_callback_invoke(enum rte_mempool_event event,
+			      struct rte_mempool *mp)
+{
+	struct mempool_callback_list *list;
+	struct rte_tailq_entry *te;
+	void *tmp_te;
+
+	rte_mcfg_tailq_read_lock();
+	list = RTE_TAILQ_CAST(callback_tailq.head, mempool_callback_list);
+	RTE_TAILQ_FOREACH_SAFE(te, list, next, tmp_te) {
+		struct mempool_callback *cb = te->data;
+		rte_mcfg_tailq_read_unlock();
+		cb->func(event, mp, cb->user_data);
+		rte_mcfg_tailq_read_lock();
+	}
+	rte_mcfg_tailq_read_unlock();
+}
+
+int
+rte_mempool_event_callback_register(rte_mempool_event_callback *func,
+				    void *user_data)
+{
+	struct mempool_callback_list *list;
+	struct rte_tailq_entry *te = NULL;
+	struct mempool_callback *cb;
+	void *tmp_te;
+	int ret;
+
+	if (func == NULL) {
+		rte_errno = EINVAL;
+		return -rte_errno;
+	}
+
+	rte_mcfg_mempool_read_lock();
+	rte_mcfg_tailq_write_lock();
+
+	list = RTE_TAILQ_CAST(callback_tailq.head, mempool_callback_list);
+	RTE_TAILQ_FOREACH_SAFE(te, list, next, tmp_te) {
+		struct mempool_callback *cb =
+					(struct mempool_callback *)te->data;
+		if (cb->func == func && cb->user_data == user_data) {
+			ret = -EEXIST;
+			goto exit;
+		}
+	}
+
+	te = rte_zmalloc("MEMPOOL_TAILQ_ENTRY", sizeof(*te), 0);
+	if (te == NULL) {
+		RTE_LOG(ERR, MEMPOOL,
+			"Cannot allocate event callback tailq entry!\n");
+		ret = -ENOMEM;
+		goto exit;
+	}
+
+	cb = rte_malloc("MEMPOOL_EVENT_CALLBACK", sizeof(*cb), 0);
+	if (cb == NULL) {
+		RTE_LOG(ERR, MEMPOOL,
+			"Cannot allocate event callback!\n");
+		rte_free(te);
+		ret = -ENOMEM;
+		goto exit;
+	}
+
+	cb->func = func;
+	cb->user_data = user_data;
+	te->data = cb;
+	TAILQ_INSERT_TAIL(list, te, next);
+	ret = 0;
+
+exit:
+	rte_mcfg_tailq_write_unlock();
+	rte_mcfg_mempool_read_unlock();
+	rte_errno = -ret;
+	return ret;
+}
+
+int
+rte_mempool_event_callback_unregister(rte_mempool_event_callback *func,
+				      void *user_data)
+{
+	struct mempool_callback_list *list;
+	struct rte_tailq_entry *te = NULL;
+	struct mempool_callback *cb;
+	int ret;
+
+	if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
+		rte_errno = EPERM;
+		return -1;
+	}
+
+	rte_mcfg_mempool_read_lock();
+	rte_mcfg_tailq_write_lock();
+	ret = -ENOENT;
+	list = RTE_TAILQ_CAST(callback_tailq.head, mempool_callback_list);
+	TAILQ_FOREACH(te, list, next) {
+		cb = (struct mempool_callback *)te->data;
+		if (cb->func == func && cb->user_data == user_data)
+			break;
+	}
+	if (te != NULL) {
+		TAILQ_REMOVE(list, te, next);
+		ret = 0;
+	}
+	rte_mcfg_tailq_write_unlock();
+	rte_mcfg_mempool_read_unlock();
+
+	if (ret == 0) {
+		rte_free(te);
+		rte_free(cb);
+	}
+	rte_errno = -ret;
+	return ret;
+}
diff --git a/lib/mempool/rte_mempool.h b/lib/mempool/rte_mempool.h
index f57ecbd6fc..e2bf40aa09 100644
--- a/lib/mempool/rte_mempool.h
+++ b/lib/mempool/rte_mempool.h
@@ -1774,6 +1774,62 @@  void rte_mempool_walk(void (*func)(struct rte_mempool *, void *arg),
 int
 rte_mempool_get_page_size(struct rte_mempool *mp, size_t *pg_sz);
 
+/**
+ * Mempool event type.
+ * @internal
+ */
+enum rte_mempool_event {
+	/** Occurs after a mempool is successfully populated. */
+	RTE_MEMPOOL_EVENT_READY = 0,
+	/** Occurs before destruction of a mempool begins. */
+	RTE_MEMPOOL_EVENT_DESTROY = 1,
+};
+
+/**
+ * @internal
+ * Mempool event callback.
+ */
+typedef void (rte_mempool_event_callback)(
+		enum rte_mempool_event event,
+		struct rte_mempool *mp,
+		void *user_data);
+
+/**
+ * @internal
+ * Register a callback invoked on mempool life cycle events.
+ * Callbacks will be invoked in the process that creates the mempool.
+ *
+ * @param func
+ *   Callback function.
+ * @param user_data
+ *   User data.
+ *
+ * @return
+ *   0 on success, negative on failure and rte_errno is set.
+ */
+__rte_internal
+int
+rte_mempool_event_callback_register(rte_mempool_event_callback *func,
+				    void *user_data);
+
+/**
+ * @internal
+ * Unregister a callback added with rte_mempool_event_callback_register().
+ * @p func and @p user_data must exactly match registration parameters.
+ *
+ * @param func
+ *   Callback function.
+ * @param user_data
+ *   User data.
+ *
+ * @return
+ *   0 on success, negative on failure and rte_errno is set.
+ */
+__rte_internal
+int
+rte_mempool_event_callback_unregister(rte_mempool_event_callback *func,
+				      void *user_data);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/mempool/version.map b/lib/mempool/version.map
index 9f77da6fff..1b7d7c5456 100644
--- a/lib/mempool/version.map
+++ b/lib/mempool/version.map
@@ -64,3 +64,11 @@  EXPERIMENTAL {
 	__rte_mempool_trace_ops_free;
 	__rte_mempool_trace_set_ops_byname;
 };
+
+INTERNAL {
+	global:
+
+	# added in 21.11
+	rte_mempool_event_callback_register;
+	rte_mempool_event_callback_unregister;
+};