app/testpmd: fix external buffer allocation

Message ID 20211217095816.2599242-1-dkozlyuk@nvidia.com (mailing list archive)
State Accepted, archived
Delegated to: Ferruh Yigit
Series app/testpmd: fix external buffer allocation

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/iol-broadcom-Performance success Performance Testing PASS
ci/iol-aarch64-compile-testing success Testing PASS
ci/iol-mellanox-Performance success Performance Testing PASS
ci/github-robot: build success github build: passed
ci/iol-x86_64-compile-testing success Testing PASS
ci/iol-broadcom-Functional success Functional Testing PASS
ci/iol-x86_64-unit-testing success Testing PASS
ci/iol-intel-Functional success Functional Testing PASS
ci/iol-aarch64-unit-testing success Testing PASS
ci/Intel-compilation success Compilation OK
ci/intel-Testing success Testing PASS
ci/iol-intel-Performance success Performance Testing PASS

Commit Message

Dmitry Kozlyuk Dec. 17, 2021, 9:58 a.m. UTC
  External pinned buffer memory (--mp-alloc=xbuf)
was allocated as multiple IOVA-contiguous memzones
of 2M size and 2M alignment.
Due to the malloc overhead and the alignment requirement,
each 2M memzone consumed 4M of hugepage memory:
2M of usable memory + X of malloc overhead + (2M-X) padding.
The allocation often failed with 2M hugepages and IOVA-as-PA
if a PA-contiguous span of 2 hugepages could not be found.
Also, with any hugepage size and IOVA mode,
memory consumption was almost twice the usable amount.
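
To make the arithmetic concrete, here is a minimal standalone sketch
(not DPDK or testpmd code; the 256-byte overhead X is an assumption,
4 cache lines of 64 bytes) of why every 2M-sized, 2M-aligned zone
ended up consuming two 2M hugepages:

    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        const uint64_t page = 2u << 20;    /* one 2M hugepage */
        const uint64_t x = 4 * 64;         /* assumed malloc overhead X */
        const uint64_t usable = page;      /* old EXTBUF_ZONE_SIZE = 2M */
        /* usable + x does not fit in one page, and the 2M alignment
         * pushes the next zone to the next page boundary, so each zone
         * consumes usable + x + (page - x) bytes: */
        const uint64_t consumed = usable + x + (page - x);
        printf("%" PRIu64 " bytes = %" PRIu64 " hugepages per zone\n",
               consumed, consumed / page); /* 4194304 bytes = 2 hugepages */
        return 0;
    }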

The 2M alignment requirement for external buffers is redundant.
It was an attempt to ensure IOVA-contiguity
by forcing memzones to start at hugepage boundaries,
while the 2M size was intended to leave no unused space on the page.
As shown above, this in fact caused excessive memory consumption
and decreased the chance of a successful allocation.
RTE_MEMZONE_IOVA_CONTIG already ensures IOVA-contiguity.
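
For reference, a minimal sketch of reserving an IOVA-contiguous zone with
the flag alone and no explicit alignment (the helper and its name are
illustrative, not the testpmd code):

    #include <rte_memzone.h>

    /* RTE_MEMZONE_IOVA_CONTIG by itself requests an IOVA-contiguous
     * reservation; RTE_MEMZONE_1GB plus RTE_MEMZONE_SIZE_HINT_ONLY only
     * express a preference for larger pages when they are available. */
    static const struct rte_memzone *
    reserve_iova_contig(const char *name, size_t len, int socket_id)
    {
        return rte_memzone_reserve(name, len, socket_id,
                                   RTE_MEMZONE_IOVA_CONTIG |
                                   RTE_MEMZONE_1GB |
                                   RTE_MEMZONE_SIZE_HINT_ONLY);
    }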

Remove the alignment requirement.
Reduce the memzone size by the malloc overhead size (4 cache lines),
so that memory consumption for each memzone is
(2M-X) of usable memory + X of malloc overhead = 2M.
This also means that whenever there are free 2M hugepages,
an IOVA-contiguous memzone can always be allocated.
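
The resulting sizing can be sanity-checked with a small standalone sketch
(the 64-byte cache line is an assumption; on targets with 128-byte cache
lines the reserved overhead is 512 bytes instead):

    #include <assert.h>

    #define PGSIZE_2M   (2UL * 1024 * 1024)           /* RTE_PGSIZE_2M */
    #define CACHE_LINE  64UL                          /* assumed RTE_CACHE_LINE_SIZE */
    #define ZONE_SIZE   (PGSIZE_2M - 4 * CACHE_LINE)  /* new EXTBUF_ZONE_SIZE */

    int main(void)
    {
        /* 2096896 usable bytes + up to 256 bytes of malloc overhead = 2M,
         * so a zone plus its metadata fits in a single 2M hugepage. */
        assert(ZONE_SIZE + 4 * CACHE_LINE == PGSIZE_2M);
        return 0;
    }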

Fixes: 72512e1897b2 ("app/testpmd: add mempool with external data buffers")
Cc: stable@dpdk.org

Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 app/test-pmd/testpmd.c | 19 ++++++++++++-------
 1 file changed, 12 insertions(+), 7 deletions(-)
  

Comments

Ferruh Yigit Jan. 18, 2022, 1:28 p.m. UTC | #1
On 12/17/2021 9:58 AM, Dmitry Kozlyuk wrote:
> External pinned buffer memory (--mp-alloc=xbuf)
> was allocated as multiple IOVA-contiguous memzones
> of 2M size and 2M alignment.
> Due to the malloc overhead and the alignment requirement,
> each 2M memzone consumed 4M of hugepage memory:
> 2M of usable memory + X of malloc overhead + (2M-X) padding.
> The allocation often failed with 2M hugepages and IOVA-as-PA
> if a PA-contiguous span of 2 hugepages could not be found.
> Also, with any hugepage size and IOVA mode,
> memory consumption was almost twice the usable amount.
> 
> The 2M alignment requirement for external buffers is redundant.
> It was an attempt to ensure IOVA-contiguity
> by forcing memzones to start at hugepage boundaries,
> while the 2M size was intended to leave no unused space on the page.
> As shown above, this in fact caused excessive memory consumption
> and decreased the chance of a successful allocation.
> RTE_MEMZONE_IOVA_CONTIG already ensures IOVA-contiguity.
> 
> Remove the alignment requirement.
> Reduce the memzone size by the malloc overhead size (4 cache lines),
> so that memory consumption for each memzone is
> (2M-X) of usable memory + X of malloc overhead = 2M.
> This also means that whenever there are free 2M hugepages,
> an IOVA-contiguous memzone can always be allocated.
> 
> Fixes: 72512e1897b2 ("app/testpmd: add mempool with external data buffers")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

Applied to dpdk-next-net/main, thanks.
  

Patch

diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 55eb293cc0..fa04cc6be6 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -84,7 +84,13 @@ 
 #endif
 
 #define EXTMEM_HEAP_NAME "extmem"
-#define EXTBUF_ZONE_SIZE RTE_PGSIZE_2M
+/*
+ * Zone size with the malloc overhead (max of debug and release variants)
+ * must fit into the smallest supported hugepage size (2M),
+ * so that an IOVA-contiguous zone of this size can always be allocated
+ * if there are free 2M hugepages.
+ */
+#define EXTBUF_ZONE_SIZE (RTE_PGSIZE_2M - 4 * RTE_CACHE_LINE_SIZE)
 
 uint16_t verbose_level = 0; /**< Silent by default. */
 int testpmd_logtype; /**< Log type for testpmd logs */
@@ -1061,12 +1067,11 @@  setup_extbuf(uint32_t nb_mbufs, uint16_t mbuf_sz, unsigned int socket_id,
 			ext_num = 0;
 			break;
 		}
-		mz = rte_memzone_reserve_aligned(mz_name, EXTBUF_ZONE_SIZE,
-						 socket_id,
-						 RTE_MEMZONE_IOVA_CONTIG |
-						 RTE_MEMZONE_1GB |
-						 RTE_MEMZONE_SIZE_HINT_ONLY,
-						 EXTBUF_ZONE_SIZE);
+		mz = rte_memzone_reserve(mz_name, EXTBUF_ZONE_SIZE,
+					 socket_id,
+					 RTE_MEMZONE_IOVA_CONTIG |
+					 RTE_MEMZONE_1GB |
+					 RTE_MEMZONE_SIZE_HINT_ONLY);
 		if (mz == NULL) {
 			/*
 			 * The caller exits on external buffer creation