eal: allow swapping of malloc heaps

Message ID 20230915122703.475834-1-bruce.richardson@intel.com (mailing list archive)
State Superseded, archived
Delegated to: Thomas Monjalon
Series eal: allow swapping of malloc heaps

Checks

Context                     Check    Description
ci/checkpatch               success  coding style OK
ci/loongarch-compilation    success  Compilation OK
ci/loongarch-unit-testing   success  Unit Testing PASS
ci/Intel-compilation        fail     Compilation issues
ci/intel-Testing            success  Testing PASS
ci/intel-Functional         success  Functional PASS

Commit Message

Bruce Richardson Sept. 15, 2023, 12:27 p.m. UTC
The external memory functions in DPDK allow the addition of externally
allocated memory to malloc heaps, but with one major restriction - the
memory must be allocated to an application-created heap, not one of the
standard DPDK heaps for a NUMA node.

This restriction makes it difficult - if not impossible - to use
externally allocated memory for DPDK by default. However, even if the
restriction is relaxed, so we can add external memory to e.g. the socket
0 heap, there would be no way to guarantee that the external memory
would be used in preference to the standard DPDK hugepage memory for a
given allocation.

To give appropriately defined behaviour, a better solution is to allow
the application to explicitly swap a pair of heaps. With this one new
API in place, it allows the user to configure a new malloc heap, add
external memory to it, and then replace a standard socket heap with the
newly created one - thereby guaranteeing future allocations from the
external memory.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 lib/eal/common/malloc_heap.c | 18 ++++++++++++++++++
 lib/eal/include/rte_malloc.h | 25 +++++++++++++++++++++++++
 lib/eal/version.map          |  2 ++
 3 files changed, 45 insertions(+)
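
The mechanism described in the commit message can be illustrated with a small stand-alone model. This is only a sketch, not DPDK code: `toy_heap`, `toy_socket_to_heap_id()`, `toy_heap_swap_socket()`, `toy_heap_name_for_socket()`, the heap names and the external socket id 256 are all hypothetical and exist purely for illustration. It assumes that a heap is found by scanning a fixed table for a matching socket id, so exchanging the socket ids of two table entries redirects every later lookup.

```c
#include <stddef.h>

#define MAX_HEAPS 4            /* stand-in for RTE_MAX_HEAPS */

/* Toy heap descriptor: only the fields needed to show the swap. */
struct toy_heap {
	const char *name;      /* NULL marks an unused slot */
	int socket_id;
};

static struct toy_heap heaps[MAX_HEAPS] = {
	{ "socket_0_hugepages", 0 },
	{ "socket_1_hugepages", 1 },
	{ "app_external", 256 },   /* hypothetical id of an external heap */
	{ NULL, -1 },
};

/* Linear scan modelling a socket-id to heap-index lookup. */
static int
toy_socket_to_heap_id(int socket_id)
{
	for (int i = 0; i < MAX_HEAPS; i++)
		if (heaps[i].name != NULL && heaps[i].socket_id == socket_id)
			return i;
	return -1;
}

/* Exchange the socket ids of the two heaps; 0 on success, -1 on error. */
static int
toy_heap_swap_socket(int socket1, int socket2)
{
	const int h1 = toy_socket_to_heap_id(socket1);
	const int h2 = toy_socket_to_heap_id(socket2);

	if (h1 < 0 || h2 < 0)
		return -1;

	const int tmp = heaps[h1].socket_id;
	heaps[h1].socket_id = heaps[h2].socket_id;
	heaps[h2].socket_id = tmp;
	return 0;
}

/* Name of the heap that currently serves the given socket id. */
static const char *
toy_heap_name_for_socket(int socket_id)
{
	const int h = toy_socket_to_heap_id(socket_id);

	return h < 0 ? NULL : heaps[h].name;
}
```

In this model, after `toy_heap_swap_socket(0, 256)` a lookup for socket 0 resolves to the external heap, while the original hugepage heap stays reachable under id 256. The heaps themselves never move; only the ids used to find them are exchanged.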
  

Comments

Stephen Hemminger Sept. 15, 2023, 3:43 p.m. UTC | #1
On Fri, 15 Sep 2023 13:27:03 +0100
Bruce Richardson <bruce.richardson@intel.com> wrote:

> The external memory functions in DPDK allow the addition of externally
> allocated memory to malloc heaps, but with one major restriction - the
> memory must be allocated to an application-created heap, not one of the
> standard DPDK heaps for a NUMA node.
> 
> This restriction makes it difficult - if not impossible - to use
> externally allocated memory for DPDK by default. However, even if the
> restriction is relaxed, so we can add external memory to e.g. the socket
> 0 heap, there would be no way to guarantee that the external memory
> would be used in preference to the standard DPDK hugepage memory for a
> given allocation.
> 
> To give appropriately defined behaviour, a better solution is to allow
> the application to explicitly swap a pair of heaps. With this one new
> API in place, it allows the user to configure a new malloc heap, add
> external memory to it, and then replace a standard socket heap with the
> newly created one - thereby guaranteeing future allocations from the
> external memory.
> 
> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> ---

This should document any restrictions on when the swap can be done.
It doesn't look thread safe. I would expect an application to do this
just after EAL init, before doing any work.

What happens to memory allocated from the old heap?
  
Stephen Hemminger Sept. 18, 2023, 4:44 p.m. UTC | #2
On Mon, 18 Sep 2023 17:32:04 +0100
Bruce Richardson <bruce.richardson@intel.com> wrote:

> Sometimes apps (or perhaps DPDK driver components) may want to allow
> use of "external" i.e. non EAL allocated, memory as though it were
> standard DPDK memory. This patchset provides the ability to do this,
> by: firstly, adding an explicit flag to indicate non-EAL memory,
> rather than relying on the socket_id implicitly, and then secondly,
> allowing heaps to be swapped, so an external heap can be used as the
> default heap for socket 0 or 1, etc.
> 
> V3:
> * Expand to 2 patch set, adding patch to tag external memory
>   explicitly, before adding the swap function
> * Add locks to improve thread safety of the swap operation.
> * Add additional notes to function to clarify usage.
> 
> V2:
> * Fix doxygen comment issue on doc builds
> 
> Bruce Richardson (2):
>   eal: add flag to indicate non-EAL malloc heaps
>   eal: allow swapping of malloc heaps
> 
>  lib/eal/common/malloc_heap.c | 46 ++++++++++++++++++++++++++++--------
>  lib/eal/common/malloc_heap.h |  1 +
>  lib/eal/common/malloc_mp.c   |  5 ++--
>  lib/eal/common/rte_malloc.c  | 14 ++++++-----
>  lib/eal/include/rte_malloc.h | 34 ++++++++++++++++++++++++++
>  lib/eal/version.map          |  2 ++
>  6 files changed, 83 insertions(+), 19 deletions(-)
> 
> --
> 2.39.2
> 

Series-Acked-by: Stephen Hemminger <stephen@networkplumber.org>
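
The V3 notes above mention adding locks around the swap, in response to the thread-safety concern raised in comment #1. The following is a minimal stand-alone sketch of one way such protection could look. It is an assumption for illustration, not the locking scheme used in the actual V3 patch: `toy_heap`, `table_lock`, `find_heap_unlocked()`, `toy_heap_index_for_socket()` and `toy_heap_swap_socket_locked()` are hypothetical names, and a single POSIX mutex guards the whole table so that lookups never observe a half-completed swap.

```c
#include <pthread.h>

#define MAX_HEAPS 4            /* stand-in for RTE_MAX_HEAPS */

struct toy_heap {
	int socket_id;         /* negative value marks an unused slot */
};

static struct toy_heap heaps[MAX_HEAPS] = {
	{ 0 }, { 1 }, { 256 }, { -1 },
};

/* One lock for every socket_id field in the table (an assumption). */
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;

/* Caller must hold table_lock. */
static int
find_heap_unlocked(int socket_id)
{
	if (socket_id < 0)
		return -1;
	for (int i = 0; i < MAX_HEAPS; i++)
		if (heaps[i].socket_id == socket_id)
			return i;
	return -1;
}

/* Thread-safe lookup of the heap index serving a socket id. */
static int
toy_heap_index_for_socket(int socket_id)
{
	pthread_mutex_lock(&table_lock);
	const int h = find_heap_unlocked(socket_id);
	pthread_mutex_unlock(&table_lock);
	return h;
}

/* Look up both heaps and exchange their ids inside one critical section. */
static int
toy_heap_swap_socket_locked(int socket1, int socket2)
{
	int ret = -1;

	pthread_mutex_lock(&table_lock);
	const int h1 = find_heap_unlocked(socket1);
	const int h2 = find_heap_unlocked(socket2);
	if (h1 >= 0 && h2 >= 0) {
		const int tmp = heaps[h1].socket_id;
		heaps[h1].socket_id = heaps[h2].socket_id;
		heaps[h2].socket_id = tmp;
		ret = 0;
	}
	pthread_mutex_unlock(&table_lock);
	return ret;
}
```

Doing the lookup and the exchange under the same lock avoids the window in which a concurrent caller could see both heaps carrying the same socket id. A real implementation would also have to work across DPDK's multi-process shared memory, which this sketch ignores.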
  

Patch

diff --git a/lib/eal/common/malloc_heap.c b/lib/eal/common/malloc_heap.c
index 6b6cf9174c..e5d0ad6b52 100644
--- a/lib/eal/common/malloc_heap.c
+++ b/lib/eal/common/malloc_heap.c
@@ -1320,6 +1320,24 @@  malloc_heap_add_external_memory(struct malloc_heap *heap,
 	return 0;
 }
 
+int
+rte_malloc_heap_swap_socket(int socket1, int socket2)
+{
+	const int h1 = malloc_socket_to_heap_id(socket1);
+	if (h1 < 0 || h1 >= RTE_MAX_HEAPS)
+		return -1;
+
+	const int h2 = malloc_socket_to_heap_id(socket2);
+	if (h2 < 0 || h2 >= RTE_MAX_HEAPS)
+		return -1;
+
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	int tmp = mcfg->malloc_heaps[h1].socket_id;
+	mcfg->malloc_heaps[h1].socket_id = mcfg->malloc_heaps[h2].socket_id;
+	mcfg->malloc_heaps[h2].socket_id = tmp;
+	return 0;
+}
+
 int
 malloc_heap_remove_external_memory(struct malloc_heap *heap, void *va_addr,
 		size_t len)
diff --git a/lib/eal/include/rte_malloc.h b/lib/eal/include/rte_malloc.h
index 54a8ac211e..fa70ca7430 100644
--- a/lib/eal/include/rte_malloc.h
+++ b/lib/eal/include/rte_malloc.h
@@ -490,6 +490,31 @@  rte_malloc_heap_get_socket(const char *name);
 int
 rte_malloc_heap_socket_is_external(int socket_id);
 
+/**
+ * Swap the heaps for the given socket ids
+ *
+ * This causes the heaps for the given socket ids to be swapped, allowing
+ * external memory registered as a malloc heap to become the new default memory
+ * for a standard NUMA node. For example, to have allocations on socket 0 come
+ * from external memory, the following sequence of API calls can be used:
+ *   - rte_malloc_heap_create(<name>)
+ *   - rte_malloc_heap_memory_add(<name>,....)
+ *   - id = rte_malloc_heap_get_socket(<name>)
+ *   - rte_malloc_heap_swap_socket(0, id)
+ * Following these calls, the memory originally assigned to socket 0 can still
+ * be allocated from by passing "id" as the socket_id parameter.
+ *
+ * @param socket1
+ *   The socket id of the first heap to swap
+ * @param socket2
+ *   The socket id of the second heap to swap
+ * @return
+ *   0 on success, -1 on error
+ */
+__rte_experimental
+int
+rte_malloc_heap_swap_socket(int socket1, int socket2);
+
 /**
  * Dump statistics.
  *
diff --git a/lib/eal/version.map b/lib/eal/version.map
index 7940431e5a..b06ee7219e 100644
--- a/lib/eal/version.map
+++ b/lib/eal/version.map
@@ -417,6 +417,8 @@  EXPERIMENTAL {
 	# added in 23.07
 	rte_memzone_max_get;
 	rte_memzone_max_set;
+
+	rte_malloc_heap_swap_socket;
 };
 
 INTERNAL {