From patchwork Wed Oct 30 14:36:14 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Olivier Matz X-Patchwork-Id: 62227 Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 544A7A00BE; Wed, 30 Oct 2019 15:37:21 +0100 (CET) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 4B5DE1C00E; Wed, 30 Oct 2019 15:37:04 +0100 (CET) Received: from proxy.6wind.com (host.76.145.23.62.rev.coltfrance.com [62.23.145.76]) by dpdk.org (Postfix) with ESMTP id 87B661BF7A for ; Wed, 30 Oct 2019 15:36:57 +0100 (CET) Received: from glumotte.dev.6wind.com. (unknown [10.16.0.195]) by proxy.6wind.com (Postfix) with ESMTP id A4D1E338A49; Wed, 30 Oct 2019 15:36:55 +0100 (CET) From: Olivier Matz To: dev@dpdk.org Cc: Anatoly Burakov , Andrew Rybchenko , Ferruh Yigit , "Giridharan, Ganesan" , Jerin Jacob Kollanukkaran , "Kiran Kumar Kokkilagadda" , Stephen Hemminger , Thomas Monjalon , Vamsi Krishna Attunuru Date: Wed, 30 Oct 2019 15:36:14 +0100 Message-Id: <20191030143619.4007-2-olivier.matz@6wind.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20191030143619.4007-1-olivier.matz@6wind.com> References: <20190719133845.32432-1-olivier.matz@6wind.com> <20191030143619.4007-1-olivier.matz@6wind.com> MIME-Version: 1.0 Subject: [dpdk-dev] [PATCH v2 1/6] mempool: allow unaligned addr/len in populate virt X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" rte_mempool_populate_virt() currently requires that both addr and length are page-aligned. Remove this unneeded constraint, which can be annoying with big hugepages (e.g. 1GB).
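For illustration only, here is a minimal standalone sketch of how the length of the first chunk could be bounded when addr is not page-aligned. It mirrors the idea of aligning addr + off + 1 up to the next page boundary, but uses plain integer arithmetic instead of the DPDK macros. The name first_chunk_len() is hypothetical and not part of the mempool API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not a DPDK function: length of the first chunk
 * starting at addr + off, limited to the end of the current page and to
 * the end of the area. Assumes pg_sz > 0 and off < len.
 */
static size_t
first_chunk_len(uintptr_t addr, size_t off, size_t len, size_t pg_sz)
{
	uintptr_t cur, next_page;
	size_t to_boundary, remaining;

	assert(pg_sz > 0 && off < len);

	cur = addr + off;
	/* align (cur + 1) up, so an already aligned cur yields a full page */
	next_page = ((cur + pg_sz) / pg_sz) * pg_sz;
	to_boundary = (size_t)(next_page - cur);
	remaining = len - off;

	return to_boundary < remaining ? to_boundary : remaining;
}
```

With a 4096-byte page, an aligned start gives a full page, while a start 100 bytes into a page gives the 3996 bytes left before the next boundary. The "+ 1" in the alignment is what prevents a zero-length chunk when the start is already aligned.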
Signed-off-by: Olivier Matz Reviewed-by: Andrew Rybchenko --- lib/librte_mempool/rte_mempool.c | 23 +++++++++++------------ lib/librte_mempool/rte_mempool.h | 3 +-- 2 files changed, 12 insertions(+), 14 deletions(-) diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c index 0f29e8712..88e49c751 100644 --- a/lib/librte_mempool/rte_mempool.c +++ b/lib/librte_mempool/rte_mempool.c @@ -368,17 +368,11 @@ rte_mempool_populate_virt(struct rte_mempool *mp, char *addr, size_t off, phys_len; int ret, cnt = 0; - /* address and len must be page-aligned */ - if (RTE_PTR_ALIGN_CEIL(addr, pg_sz) != addr) - return -EINVAL; - if (RTE_ALIGN_CEIL(len, pg_sz) != len) - return -EINVAL; - if (mp->flags & MEMPOOL_F_NO_IOVA_CONTIG) return rte_mempool_populate_iova(mp, addr, RTE_BAD_IOVA, len, free_cb, opaque); - for (off = 0; off + pg_sz <= len && + for (off = 0; off < len && mp->populated_size < mp->size; off += phys_len) { iova = rte_mem_virt2iova(addr + off); @@ -389,12 +383,18 @@ rte_mempool_populate_virt(struct rte_mempool *mp, char *addr, } /* populate with the largest group of contiguous pages */ - for (phys_len = pg_sz; off + phys_len < len; phys_len += pg_sz) { + for (phys_len = RTE_MIN( + (size_t)(RTE_PTR_ALIGN_CEIL(addr + off + 1, pg_sz) - + (addr + off)), + len - off); + off + phys_len < len; + phys_len = RTE_MIN(phys_len + pg_sz, len - off)) { rte_iova_t iova_tmp; iova_tmp = rte_mem_virt2iova(addr + off + phys_len); - if (iova_tmp != iova + phys_len) + if (iova_tmp == RTE_BAD_IOVA || + iova_tmp != iova + phys_len) break; } @@ -575,8 +575,7 @@ rte_mempool_populate_default(struct rte_mempool *mp) * have */ mz = rte_memzone_reserve_aligned(mz_name, 0, - mp->socket_id, flags, - RTE_MAX(pg_sz, align)); + mp->socket_id, flags, align); } if (mz == NULL) { ret = -rte_errno; @@ -601,7 +600,7 @@ rte_mempool_populate_default(struct rte_mempool *mp) (void *)(uintptr_t)mz); else ret = rte_mempool_populate_virt(mp, mz->addr, - RTE_ALIGN_FLOOR(mz->len, pg_sz), 
pg_sz, + mz->len, pg_sz, rte_mempool_memchunk_mz_free, (void *)(uintptr_t)mz); if (ret < 0) { diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h index 8053f7a04..0fe8aa7b8 100644 --- a/lib/librte_mempool/rte_mempool.h +++ b/lib/librte_mempool/rte_mempool.h @@ -1042,9 +1042,8 @@ int rte_mempool_populate_iova(struct rte_mempool *mp, char *vaddr, * A pointer to the mempool structure. * @param addr * The virtual address of memory that should be used to store objects. - * Must be page-aligned. * @param len - * The length of memory in bytes. Must be page-aligned. + * The length of memory in bytes. * @param pg_sz * The size of memory pages in this virtual area. * @param free_cb From patchwork Wed Oct 30 14:36:15 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Olivier Matz X-Patchwork-Id: 62225 Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 7C1EDA00BE; Wed, 30 Oct 2019 15:37:00 +0100 (CET) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id B86791BF7A; Wed, 30 Oct 2019 15:36:59 +0100 (CET) Received: from proxy.6wind.com (host.76.145.23.62.rev.coltfrance.com [62.23.145.76]) by dpdk.org (Postfix) with ESMTP id 876751BE9C for ; Wed, 30 Oct 2019 15:36:57 +0100 (CET) Received: from glumotte.dev.6wind.com. 
(unknown [10.16.0.195]) by proxy.6wind.com (Postfix) with ESMTP id B27C5338A4A; Wed, 30 Oct 2019 15:36:55 +0100 (CET) From: Olivier Matz To: dev@dpdk.org Cc: Anatoly Burakov , Andrew Rybchenko , Ferruh Yigit , "Giridharan, Ganesan" , Jerin Jacob Kollanukkaran , "Kiran Kumar Kokkilagadda" , Stephen Hemminger , Thomas Monjalon , Vamsi Krishna Attunuru Date: Wed, 30 Oct 2019 15:36:15 +0100 Message-Id: <20191030143619.4007-3-olivier.matz@6wind.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20191030143619.4007-1-olivier.matz@6wind.com> References: <20190719133845.32432-1-olivier.matz@6wind.com> <20191030143619.4007-1-olivier.matz@6wind.com> MIME-Version: 1.0 Subject: [dpdk-dev] [PATCH v2 2/6] mempool: reduce wasted space on mempool populate X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" The size returned by rte_mempool_op_calc_mem_size_default() is aligned to the specified page size. Therefore, with big pages, the returned size can be much more than what we really need to populate the mempool. For instance, populating a mempool that requires 1.1GB of memory with 1GB hugepages can result in allocating 2GB of memory. This problem is hidden most of the time due to the allocation method of rte_mempool_populate_default(): when try_iova_contig_mempool=true, it first tries to allocate an iova-contiguous area, without the alignment constraint. If that fails, it falls back to an aligned allocation that is not required to be iova-contiguous. It can also fall back to several smaller aligned allocations. This commit changes rte_mempool_op_calc_mem_size_default() to relax the alignment constraint to a cache line and to return a smaller size.
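As a worked illustration of the size reduction, the sketch below puts the previous page-rounded calculation next to the relaxed one introduced by this patch, as standalone functions. The names old_mem_size() and new_mem_size() are hypothetical. Only the branch where at least one object fits in a page is covered, and obj_num is assumed to be non-zero.

```c
#include <assert.h>
#include <stddef.h>

/* Previous behaviour: round up to a whole number of pages. */
static size_t
old_mem_size(size_t total_elt_sz, size_t obj_num, unsigned int pg_shift)
{
	size_t pg_sz = (size_t)1 << pg_shift;
	size_t obj_per_page = pg_sz / total_elt_sz;
	size_t pg_num;

	assert(obj_per_page > 0 && obj_num > 0);
	pg_num = (obj_num + obj_per_page - 1) / obj_per_page;
	return pg_num << pg_shift;
}

/* Relaxed behaviour: full pages, a partial last page, and a margin. */
static size_t
new_mem_size(size_t total_elt_sz, size_t obj_num, unsigned int pg_shift)
{
	size_t pg_sz = (size_t)1 << pg_shift;
	size_t obj_per_page = pg_sz / total_elt_sz;
	size_t objs_in_last_page, mem_size;

	assert(obj_per_page > 0 && obj_num > 0);
	objs_in_last_page = ((obj_num - 1) % obj_per_page) + 1;
	/* room required for the last page */
	mem_size = objs_in_last_page * total_elt_sz;
	/* room required for the other pages */
	mem_size += ((obj_num - objs_in_last_page) / obj_per_page) << pg_shift;
	/* margin in case the allocator returns a non-aligned pointer */
	mem_size += total_elt_sz - 1;
	return mem_size;
}
```

For 5 objects of 1500 bytes with 4KB pages (pg_shift = 12), two objects fit per page. The old calculation gives 3 pages, i.e. 12288 bytes. The new one gives 1500 + 2 * 4096 + 1499 = 11191 bytes. The gain is small here but grows with the page size, since at most one object-sized margin is added instead of rounding up to a full hugepage.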
Signed-off-by: Olivier Matz Reviewed-by: Andrew Rybchenko --- lib/librte_mempool/rte_mempool.c | 7 ++--- lib/librte_mempool/rte_mempool.h | 2 +- lib/librte_mempool/rte_mempool_ops.c | 4 ++- lib/librte_mempool/rte_mempool_ops_default.c | 28 +++++++++++++++----- 4 files changed, 28 insertions(+), 13 deletions(-) diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c index 88e49c751..4e0d576f5 100644 --- a/lib/librte_mempool/rte_mempool.c +++ b/lib/librte_mempool/rte_mempool.c @@ -477,11 +477,8 @@ rte_mempool_populate_default(struct rte_mempool *mp) * wasting some space this way, but it's much nicer than looping around * trying to reserve each and every page size. * - * However, since size calculation will produce page-aligned sizes, it - * makes sense to first try and see if we can reserve the entire memzone - * in one contiguous chunk as well (otherwise we might end up wasting a - * 1G page on a 10MB memzone). If we fail to get enough contiguous - * memory, then we'll go and reserve space page-by-page. + * If we fail to get enough contiguous memory, then we'll go and + * reserve space in smaller chunks. * * We also have to take into account the fact that memory that we're * going to allocate from can belong to an externally allocated memory diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h index 0fe8aa7b8..8fa3c04e5 100644 --- a/lib/librte_mempool/rte_mempool.h +++ b/lib/librte_mempool/rte_mempool.h @@ -458,7 +458,7 @@ typedef unsigned (*rte_mempool_get_count)(const struct rte_mempool *mp); * @param[out] align * Location for required memory chunk alignment. * @return - * Required memory size aligned at page boundary. + * Required memory size. 
*/ typedef ssize_t (*rte_mempool_calc_mem_size_t)(const struct rte_mempool *mp, uint32_t obj_num, uint32_t pg_shift, diff --git a/lib/librte_mempool/rte_mempool_ops.c b/lib/librte_mempool/rte_mempool_ops.c index e02eb702c..22c5251eb 100644 --- a/lib/librte_mempool/rte_mempool_ops.c +++ b/lib/librte_mempool/rte_mempool_ops.c @@ -100,7 +100,9 @@ rte_mempool_ops_get_count(const struct rte_mempool *mp) return ops->get_count(mp); } -/* wrapper to notify new memory area to external mempool */ +/* wrapper to calculate the memory size required to store given number + * of objects + */ ssize_t rte_mempool_ops_calc_mem_size(const struct rte_mempool *mp, uint32_t obj_num, uint32_t pg_shift, diff --git a/lib/librte_mempool/rte_mempool_ops_default.c b/lib/librte_mempool/rte_mempool_ops_default.c index 4e2bfc82d..f6aea7662 100644 --- a/lib/librte_mempool/rte_mempool_ops_default.c +++ b/lib/librte_mempool/rte_mempool_ops_default.c @@ -12,7 +12,7 @@ rte_mempool_op_calc_mem_size_default(const struct rte_mempool *mp, size_t *min_chunk_size, size_t *align) { size_t total_elt_sz; - size_t obj_per_page, pg_num, pg_sz; + size_t obj_per_page, pg_sz, objs_in_last_page; size_t mem_size; total_elt_sz = mp->header_size + mp->elt_size + mp->trailer_size; @@ -33,14 +33,30 @@ rte_mempool_op_calc_mem_size_default(const struct rte_mempool *mp, mem_size = RTE_ALIGN_CEIL(total_elt_sz, pg_sz) * obj_num; } else { - pg_num = (obj_num + obj_per_page - 1) / obj_per_page; - mem_size = pg_num << pg_shift; + /* In the best case, the allocator will return a + * page-aligned address. 
For example, with 5 objs, + * the required space is as below: + * | page0 | page1 | page2 (last) | + * |obj0 |obj1 |xxx|obj2 |obj3 |xxx|obj4| + * <------------- mem_size -------------> + */ + objs_in_last_page = ((obj_num - 1) % obj_per_page) + 1; + /* room required for the last page */ + mem_size = objs_in_last_page * total_elt_sz; + /* room required for other pages */ + mem_size += ((obj_num - objs_in_last_page) / + obj_per_page) << pg_shift; + + /* In the worst case, the allocator returns a + * non-aligned pointer, wasting up to + * total_elt_sz. Add a margin for that. + */ + mem_size += total_elt_sz - 1; } } - *min_chunk_size = RTE_MAX((size_t)1 << pg_shift, total_elt_sz); - - *align = RTE_MAX((size_t)RTE_CACHE_LINE_SIZE, (size_t)1 << pg_shift); + *min_chunk_size = total_elt_sz; + *align = RTE_CACHE_LINE_SIZE; return mem_size; } From patchwork Wed Oct 30 14:36:16 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Olivier Matz X-Patchwork-Id: 62226 Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 88127A00BE; Wed, 30 Oct 2019 15:37:08 +0100 (CET) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 0120F1BFE3; Wed, 30 Oct 2019 15:37:02 +0100 (CET) Received: from proxy.6wind.com (host.76.145.23.62.rev.coltfrance.com [62.23.145.76]) by dpdk.org (Postfix) with ESMTP id 8CB891BF8E for ; Wed, 30 Oct 2019 15:36:57 +0100 (CET) Received: from glumotte.dev.6wind.com. 
(unknown [10.16.0.195]) by proxy.6wind.com (Postfix) with ESMTP id C193C338A4B; Wed, 30 Oct 2019 15:36:55 +0100 (CET) From: Olivier Matz To: dev@dpdk.org Cc: Anatoly Burakov , Andrew Rybchenko , Ferruh Yigit , "Giridharan, Ganesan" , Jerin Jacob Kollanukkaran , "Kiran Kumar Kokkilagadda" , Stephen Hemminger , Thomas Monjalon , Vamsi Krishna Attunuru Date: Wed, 30 Oct 2019 15:36:16 +0100 Message-Id: <20191030143619.4007-4-olivier.matz@6wind.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20191030143619.4007-1-olivier.matz@6wind.com> References: <20190719133845.32432-1-olivier.matz@6wind.com> <20191030143619.4007-1-olivier.matz@6wind.com> MIME-Version: 1.0 Subject: [dpdk-dev] [PATCH v2 3/6] mempool: remove optimistic IOVA-contiguous allocation X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" The previous commit reduced the amount of required memory when populating the mempool with non iova-contiguous memory. Since there is no big advantage in having a fully iova-contiguous mempool unless it is explicitly requested, remove this code. This simplifies the populate function.
Signed-off-by: Olivier Matz Reviewed-by: Andrew Rybchenko --- lib/librte_mempool/rte_mempool.c | 47 ++++++-------------------------- 1 file changed, 8 insertions(+), 39 deletions(-) diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c index 4e0d576f5..213e574fc 100644 --- a/lib/librte_mempool/rte_mempool.c +++ b/lib/librte_mempool/rte_mempool.c @@ -430,7 +430,6 @@ rte_mempool_populate_default(struct rte_mempool *mp) unsigned mz_id, n; int ret; bool need_iova_contig_obj; - bool try_iova_contig_mempool; bool alloc_in_ext_mem; ret = mempool_ops_alloc_once(mp); @@ -483,9 +482,7 @@ rte_mempool_populate_default(struct rte_mempool *mp) * We also have to take into account the fact that memory that we're * going to allocate from can belong to an externally allocated memory * area, in which case the assumption of IOVA as VA mode being - * synonymous with IOVA contiguousness will not hold. We should also try - * to go for contiguous memory even if we're in no-huge mode, because - * external memory may in fact be IOVA-contiguous. + * synonymous with IOVA contiguousness will not hold. 
*/ /* check if we can retrieve a valid socket ID */ @@ -494,7 +491,6 @@ rte_mempool_populate_default(struct rte_mempool *mp) return -EINVAL; alloc_in_ext_mem = (ret == 1); need_iova_contig_obj = !(mp->flags & MEMPOOL_F_NO_IOVA_CONTIG); - try_iova_contig_mempool = false; if (!need_iova_contig_obj) { pg_sz = 0; @@ -503,7 +499,6 @@ rte_mempool_populate_default(struct rte_mempool *mp) pg_sz = 0; pg_shift = 0; } else if (rte_eal_has_hugepages() || alloc_in_ext_mem) { - try_iova_contig_mempool = true; pg_sz = get_min_page_size(mp->socket_id); pg_shift = rte_bsf32(pg_sz); } else { @@ -513,14 +508,9 @@ rte_mempool_populate_default(struct rte_mempool *mp) for (mz_id = 0, n = mp->size; n > 0; mz_id++, n -= ret) { size_t min_chunk_size; - unsigned int flags; - if (try_iova_contig_mempool || pg_sz == 0) - mem_size = rte_mempool_ops_calc_mem_size(mp, n, - 0, &min_chunk_size, &align); - else - mem_size = rte_mempool_ops_calc_mem_size(mp, n, - pg_shift, &min_chunk_size, &align); + mem_size = rte_mempool_ops_calc_mem_size( + mp, n, pg_shift, &min_chunk_size, &align); if (mem_size < 0) { ret = mem_size; @@ -534,36 +524,15 @@ rte_mempool_populate_default(struct rte_mempool *mp) goto fail; } - flags = mz_flags; - /* if we're trying to reserve contiguous memory, add appropriate * memzone flag. */ - if (try_iova_contig_mempool) - flags |= RTE_MEMZONE_IOVA_CONTIG; + if (min_chunk_size == (size_t)mem_size) + mz_flags |= RTE_MEMZONE_IOVA_CONTIG; mz = rte_memzone_reserve_aligned(mz_name, mem_size, - mp->socket_id, flags, align); - - /* if we were trying to allocate contiguous memory, failed and - * minimum required contiguous chunk fits minimum page, adjust - * memzone size to the page size, and try again. 
- */ - if (mz == NULL && try_iova_contig_mempool && - min_chunk_size <= pg_sz) { - try_iova_contig_mempool = false; - flags &= ~RTE_MEMZONE_IOVA_CONTIG; - - mem_size = rte_mempool_ops_calc_mem_size(mp, n, - pg_shift, &min_chunk_size, &align); - if (mem_size < 0) { - ret = mem_size; - goto fail; - } + mp->socket_id, mz_flags, align); - mz = rte_memzone_reserve_aligned(mz_name, mem_size, - mp->socket_id, flags, align); - } /* don't try reserving with 0 size if we were asked to reserve * IOVA-contiguous memory. */ @@ -572,7 +541,7 @@ rte_mempool_populate_default(struct rte_mempool *mp) * have */ mz = rte_memzone_reserve_aligned(mz_name, 0, - mp->socket_id, flags, align); + mp->socket_id, mz_flags, align); } if (mz == NULL) { ret = -rte_errno; @@ -590,7 +559,7 @@ rte_mempool_populate_default(struct rte_mempool *mp) else iova = RTE_BAD_IOVA; - if (try_iova_contig_mempool || pg_sz == 0) + if (pg_sz == 0 || (mz_flags & RTE_MEMZONE_IOVA_CONTIG)) ret = rte_mempool_populate_iova(mp, mz->addr, iova, mz->len, rte_mempool_memchunk_mz_free, From patchwork Wed Oct 30 14:36:17 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Olivier Matz X-Patchwork-Id: 62228 Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 1973DA00BE; Wed, 30 Oct 2019 15:37:33 +0100 (CET) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 770AE1C020; Wed, 30 Oct 2019 15:37:06 +0100 (CET) Received: from proxy.6wind.com (host.76.145.23.62.rev.coltfrance.com [62.23.145.76]) by dpdk.org (Postfix) with ESMTP id 8D03A1BF95 for ; Wed, 30 Oct 2019 15:36:57 +0100 (CET) Received: from glumotte.dev.6wind.com. 
(unknown [10.16.0.195]) by proxy.6wind.com (Postfix) with ESMTP id D2FA7338A4C; Wed, 30 Oct 2019 15:36:55 +0100 (CET) From: Olivier Matz To: dev@dpdk.org Cc: Anatoly Burakov , Andrew Rybchenko , Ferruh Yigit , "Giridharan, Ganesan" , Jerin Jacob Kollanukkaran , "Kiran Kumar Kokkilagadda" , Stephen Hemminger , Thomas Monjalon , Vamsi Krishna Attunuru Date: Wed, 30 Oct 2019 15:36:17 +0100 Message-Id: <20191030143619.4007-5-olivier.matz@6wind.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20191030143619.4007-1-olivier.matz@6wind.com> References: <20190719133845.32432-1-olivier.matz@6wind.com> <20191030143619.4007-1-olivier.matz@6wind.com> MIME-Version: 1.0 Subject: [dpdk-dev] [PATCH v2 4/6] mempool: introduce function to get mempool page size X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" In rte_mempool_populate_default(), we determine the page size, which is needed for calc_size and allocation of memory. Move this into a function and export it; it will be used in the next commit. Signed-off-by: Olivier Matz Reviewed-by: Andrew Rybchenko --- lib/librte_mempool/rte_mempool.c | 51 ++++++++++++++-------- lib/librte_mempool/rte_mempool.h | 7 +++ lib/librte_mempool/rte_mempool_version.map | 1 + 3 files changed, 40 insertions(+), 19 deletions(-) diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c index 213e574fc..758c5410b 100644 --- a/lib/librte_mempool/rte_mempool.c +++ b/lib/librte_mempool/rte_mempool.c @@ -414,6 +414,33 @@ rte_mempool_populate_virt(struct rte_mempool *mp, char *addr, return ret; } +/* Get the minimal page size used in a mempool before populating it. 
*/ +int +rte_mempool_get_page_size(struct rte_mempool *mp, size_t *pg_sz) +{ + bool need_iova_contig_obj; + bool alloc_in_ext_mem; + int ret; + + /* check if we can retrieve a valid socket ID */ + ret = rte_malloc_heap_socket_is_external(mp->socket_id); + if (ret < 0) + return -EINVAL; + alloc_in_ext_mem = (ret == 1); + need_iova_contig_obj = !(mp->flags & MEMPOOL_F_NO_IOVA_CONTIG); + + if (!need_iova_contig_obj) + *pg_sz = 0; + else if (!alloc_in_ext_mem && rte_eal_iova_mode() == RTE_IOVA_VA) + *pg_sz = 0; + else if (rte_eal_has_hugepages() || alloc_in_ext_mem) + *pg_sz = get_min_page_size(mp->socket_id); + else + *pg_sz = getpagesize(); + + return 0; +} + /* Default function to populate the mempool: allocate memory in memzones, * and populate them. Return the number of objects added, or a negative * value on error. @@ -425,12 +452,11 @@ rte_mempool_populate_default(struct rte_mempool *mp) char mz_name[RTE_MEMZONE_NAMESIZE]; const struct rte_memzone *mz; ssize_t mem_size; - size_t align, pg_sz, pg_shift; + size_t align, pg_sz, pg_shift = 0; rte_iova_t iova; unsigned mz_id, n; int ret; bool need_iova_contig_obj; - bool alloc_in_ext_mem; ret = mempool_ops_alloc_once(mp); if (ret != 0) @@ -485,26 +511,13 @@ rte_mempool_populate_default(struct rte_mempool *mp) * synonymous with IOVA contiguousness will not hold. 
*/ - /* check if we can retrieve a valid socket ID */ - ret = rte_malloc_heap_socket_is_external(mp->socket_id); - if (ret < 0) - return -EINVAL; - alloc_in_ext_mem = (ret == 1); need_iova_contig_obj = !(mp->flags & MEMPOOL_F_NO_IOVA_CONTIG); + ret = rte_mempool_get_page_size(mp, &pg_sz); + if (ret < 0) + return ret; - if (!need_iova_contig_obj) { - pg_sz = 0; - pg_shift = 0; - } else if (!alloc_in_ext_mem && rte_eal_iova_mode() == RTE_IOVA_VA) { - pg_sz = 0; - pg_shift = 0; - } else if (rte_eal_has_hugepages() || alloc_in_ext_mem) { - pg_sz = get_min_page_size(mp->socket_id); - pg_shift = rte_bsf32(pg_sz); - } else { - pg_sz = getpagesize(); + if (pg_sz != 0) pg_shift = rte_bsf32(pg_sz); - } for (mz_id = 0, n = mp->size; n > 0; mz_id++, n -= ret) { size_t min_chunk_size; diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h index 8fa3c04e5..6d98b3743 100644 --- a/lib/librte_mempool/rte_mempool.h +++ b/lib/librte_mempool/rte_mempool.h @@ -1691,6 +1691,13 @@ uint32_t rte_mempool_calc_obj_size(uint32_t elt_size, uint32_t flags, void rte_mempool_walk(void (*func)(struct rte_mempool *, void *arg), void *arg); +/** + * @internal Get page size used for mempool object allocation. 
+ */ +__rte_experimental +int +rte_mempool_get_page_size(struct rte_mempool *mp, size_t *pg_sz); + #ifdef __cplusplus } #endif diff --git a/lib/librte_mempool/rte_mempool_version.map b/lib/librte_mempool/rte_mempool_version.map index 17cbca460..4eff2767d 100644 --- a/lib/librte_mempool/rte_mempool_version.map +++ b/lib/librte_mempool/rte_mempool_version.map @@ -56,5 +56,6 @@ DPDK_18.05 { EXPERIMENTAL { global: + rte_mempool_get_page_size; rte_mempool_ops_get_info; }; From patchwork Wed Oct 30 14:36:18 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Olivier Matz X-Patchwork-Id: 62231 Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id CE972A00BE; Wed, 30 Oct 2019 15:37:58 +0100 (CET) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 33A2B1C065; Wed, 30 Oct 2019 15:37:12 +0100 (CET) Received: from proxy.6wind.com (host.76.145.23.62.rev.coltfrance.com [62.23.145.76]) by dpdk.org (Postfix) with ESMTP id A5E891BFD4 for ; Wed, 30 Oct 2019 15:36:57 +0100 (CET) Received: from glumotte.dev.6wind.com. 
(unknown [10.16.0.195]) by proxy.6wind.com (Postfix) with ESMTP id E17A9338A4D; Wed, 30 Oct 2019 15:36:55 +0100 (CET) From: Olivier Matz To: dev@dpdk.org Cc: Anatoly Burakov , Andrew Rybchenko , Ferruh Yigit , "Giridharan, Ganesan" , Jerin Jacob Kollanukkaran , "Kiran Kumar Kokkilagadda" , Stephen Hemminger , Thomas Monjalon , Vamsi Krishna Attunuru Date: Wed, 30 Oct 2019 15:36:18 +0100 Message-Id: <20191030143619.4007-6-olivier.matz@6wind.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20191030143619.4007-1-olivier.matz@6wind.com> References: <20190719133845.32432-1-olivier.matz@6wind.com> <20191030143619.4007-1-olivier.matz@6wind.com> MIME-Version: 1.0 Subject: [dpdk-dev] [PATCH v2 5/6] mempool: prevent objects from being across pages X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" When populating a mempool, ensure that objects do not span several pages, unless the user did not request iova-contiguous objects.
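To make the placement rule concrete, here is a simplified standalone sketch that counts how many objects fit in an area when none of them may cross a page boundary. It is not the code of this patch: crosses_page() and place_objects() are hypothetical names, the boundary test compares page indexes obtained by integer division rather than using the RTE_PTR_ALIGN macros, and the per-object header/trailer split and callbacks are left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Return 1 if [addr, addr + elt_sz) spans two pages, else 0. */
static int
crosses_page(uintptr_t addr, size_t pg_sz, size_t elt_sz)
{
	/* no page constraint, or object larger than a page: nothing to check */
	if (pg_sz == 0 || elt_sz > pg_sz)
		return 0;
	return (addr / pg_sz) != ((addr + elt_sz - 1) / pg_sz);
}

/* Count the objects placed in [base, base + len), at most max_objs. */
static unsigned int
place_objects(uintptr_t base, size_t len, size_t pg_sz, size_t elt_sz,
	      unsigned int max_objs)
{
	size_t off = 0;
	unsigned int i;

	for (i = 0; i < max_objs; i++) {
		/* skip to the next page start if the object would cross */
		if (crosses_page(base + off, pg_sz, elt_sz))
			off += pg_sz - ((base + off) % pg_sz);

		if (off + elt_sz > len)
			break;

		off += elt_sz;
	}

	return i;
}
```

With three 4KB pages and 1500-byte objects, this places 2 objects per page, 6 in total, whereas 8 objects fit in the same area when pg_sz is 0 and the boundary check is disabled. The difference is the space left unused at the end of each page.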
Signed-off-by: Vamsi Krishna Attunuru Signed-off-by: Olivier Matz --- drivers/mempool/octeontx2/Makefile | 3 + drivers/mempool/octeontx2/meson.build | 3 + drivers/mempool/octeontx2/otx2_mempool_ops.c | 119 ++++++++++++++++--- lib/librte_mempool/rte_mempool.c | 23 ++-- lib/librte_mempool/rte_mempool_ops_default.c | 32 ++++- 5 files changed, 147 insertions(+), 33 deletions(-) diff --git a/drivers/mempool/octeontx2/Makefile b/drivers/mempool/octeontx2/Makefile index 87cce22c6..d781cbfc6 100644 --- a/drivers/mempool/octeontx2/Makefile +++ b/drivers/mempool/octeontx2/Makefile @@ -27,6 +27,9 @@ EXPORT_MAP := rte_mempool_octeontx2_version.map LIBABIVER := 1 +# for rte_mempool_get_page_size +CFLAGS += -DALLOW_EXPERIMENTAL_API + # # all source are stored in SRCS-y # diff --git a/drivers/mempool/octeontx2/meson.build b/drivers/mempool/octeontx2/meson.build index 9fde40f0e..28f9634da 100644 --- a/drivers/mempool/octeontx2/meson.build +++ b/drivers/mempool/octeontx2/meson.build @@ -21,3 +21,6 @@ foreach flag: extra_flags endforeach deps += ['eal', 'mbuf', 'kvargs', 'bus_pci', 'common_octeontx2', 'mempool'] + +# for rte_mempool_get_page_size +allow_experimental_apis = true diff --git a/drivers/mempool/octeontx2/otx2_mempool_ops.c b/drivers/mempool/octeontx2/otx2_mempool_ops.c index d769575f4..47117aec6 100644 --- a/drivers/mempool/octeontx2/otx2_mempool_ops.c +++ b/drivers/mempool/octeontx2/otx2_mempool_ops.c @@ -713,12 +713,76 @@ static ssize_t otx2_npa_calc_mem_size(const struct rte_mempool *mp, uint32_t obj_num, uint32_t pg_shift, size_t *min_chunk_size, size_t *align) { - /* - * Simply need space for one more object to be able to - * fulfill alignment requirements. 
- */ - return rte_mempool_op_calc_mem_size_default(mp, obj_num + 1, pg_shift, - min_chunk_size, align); + size_t total_elt_sz; + size_t obj_per_page, pg_sz, objs_in_last_page; + size_t mem_size; + + /* derived from rte_mempool_op_calc_mem_size_default() */ + + total_elt_sz = mp->header_size + mp->elt_size + mp->trailer_size; + + if (total_elt_sz == 0) { + mem_size = 0; + } else if (pg_shift == 0) { + /* one object margin to fix alignment */ + mem_size = total_elt_sz * (obj_num + 1); + } else { + pg_sz = (size_t)1 << pg_shift; + obj_per_page = pg_sz / total_elt_sz; + + /* we need to keep one object to fix alignment */ + if (obj_per_page > 0) + obj_per_page--; + + if (obj_per_page == 0) { + /* + * Note that if object size is bigger than page size, + * then it is assumed that pages are grouped in subsets + * of physically continuous pages big enough to store + * at least one object. + */ + mem_size = RTE_ALIGN_CEIL(2 * total_elt_sz, + pg_sz) * obj_num; + } else { + /* In the best case, the allocator will return a + * page-aligned address. For example, with 5 objs, + * the required space is as below: + * | page0 | page1 | page2 (last) | + * |obj0 |obj1 |xxx|obj2 |obj3 |xxx|obj4| + * <------------- mem_size -------------> + */ + objs_in_last_page = ((obj_num - 1) % obj_per_page) + 1; + /* room required for the last page */ + mem_size = objs_in_last_page * total_elt_sz; + /* room required for other pages */ + mem_size += ((obj_num - objs_in_last_page) / + obj_per_page) << pg_shift; + + /* In the worst case, the allocator returns a + * non-aligned pointer, wasting up to + * total_elt_sz. Add a margin for that. 
+ */ + mem_size += total_elt_sz - 1; + } + } + + *min_chunk_size = total_elt_sz * 2; + *align = RTE_CACHE_LINE_SIZE; + + return mem_size; +} + +/* Returns -1 if object crosses a page boundary, else returns 0 */ +static int +check_obj_bounds(char *obj, size_t pg_sz, size_t elt_sz) +{ + if (pg_sz == 0) + return 0; + if (elt_sz > pg_sz) + return 0; + if (RTE_PTR_ALIGN(obj, pg_sz) != RTE_PTR_ALIGN(obj + elt_sz - 1, pg_sz)) + return -1; + return 0; } static int @@ -726,8 +790,12 @@ otx2_npa_populate(struct rte_mempool *mp, unsigned int max_objs, void *vaddr, rte_iova_t iova, size_t len, rte_mempool_populate_obj_cb_t *obj_cb, void *obj_cb_arg) { - size_t total_elt_sz; + char *va = vaddr; + size_t total_elt_sz, pg_sz; size_t off; + unsigned int i; + void *obj; + int ret; if (iova == RTE_BAD_IOVA) return -EINVAL; @@ -735,22 +803,45 @@ otx2_npa_populate(struct rte_mempool *mp, unsigned int max_objs, void *vaddr, total_elt_sz = mp->header_size + mp->elt_size + mp->trailer_size; /* Align object start address to a multiple of total_elt_sz */ - off = total_elt_sz - ((uintptr_t)vaddr % total_elt_sz); + off = total_elt_sz - (((uintptr_t)(va - 1) % total_elt_sz) + 1); if (len < off) return -EINVAL; - vaddr = (char *)vaddr + off; - iova += off; - len -= off; - npa_lf_aura_op_range_set(mp->pool_id, iova, iova + len); + npa_lf_aura_op_range_set(mp->pool_id, iova + off, iova + len - off); if (npa_lf_aura_range_update_check(mp->pool_id) < 0) return -EBUSY; - return rte_mempool_op_populate_default(mp, max_objs, vaddr, iova, len, - obj_cb, obj_cb_arg); + /* the following is derived from rte_mempool_op_populate_default() */ + + ret = rte_mempool_get_page_size(mp, &pg_sz); + if (ret < 0) + return ret; + + for (i = 0; i < max_objs; i++) { + /* avoid objects to cross page boundaries, and align + * offset to a multiple of total_elt_sz. 
+ */ + if (check_obj_bounds(va + off, pg_sz, total_elt_sz) < 0) { + off += RTE_PTR_ALIGN_CEIL(va + off, pg_sz) - (va + off); + off += total_elt_sz - (((uintptr_t)(va + off - 1) % + total_elt_sz) + 1); + } + + if (off + total_elt_sz > len) + break; + + off += mp->header_size; + obj = va + off; + obj_cb(mp, obj_cb_arg, obj, + (iova == RTE_BAD_IOVA) ? RTE_BAD_IOVA : (iova + off)); + rte_mempool_ops_enqueue_bulk(mp, &obj, 1); + off += mp->elt_size + mp->trailer_size; + } + + return i; } static struct rte_mempool_ops otx2_npa_ops = { diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c index 758c5410b..d3db9273d 100644 --- a/lib/librte_mempool/rte_mempool.c +++ b/lib/librte_mempool/rte_mempool.c @@ -431,8 +431,6 @@ rte_mempool_get_page_size(struct rte_mempool *mp, size_t *pg_sz) if (!need_iova_contig_obj) *pg_sz = 0; - else if (!alloc_in_ext_mem && rte_eal_iova_mode() == RTE_IOVA_VA) - *pg_sz = 0; else if (rte_eal_has_hugepages() || alloc_in_ext_mem) *pg_sz = get_min_page_size(mp->socket_id); else @@ -481,17 +479,15 @@ rte_mempool_populate_default(struct rte_mempool *mp) * then just set page shift and page size to 0, because the user has * indicated that there's no need to care about anything. * - * if we do need contiguous objects, there is also an option to reserve - * the entire mempool memory as one contiguous block of memory, in - * which case the page shift and alignment wouldn't matter as well. + * if we do need contiguous objects (if a mempool driver has its + * own calc_size() method returning min_chunk_size = mem_size), + * there is also an option to reserve the entire mempool memory + * as one contiguous block of memory. * * if we require contiguous objects, but not necessarily the entire - * mempool reserved space to be contiguous, then there are two options. 
- * - * if our IO addresses are virtual, not actual physical (IOVA as VA - * case), then no page shift needed - our memory allocation will give us - * contiguous IO memory as far as the hardware is concerned, so - * act as if we're getting contiguous memory. + * mempool reserved space to be contiguous, pg_sz will be != 0, + * and the default ops->populate() will take care of not placing + * objects across pages. * * if our IO addresses are physical, we may get memory from bigger * pages, or we might get memory from smaller pages, and how much of it @@ -504,11 +500,6 @@ rte_mempool_populate_default(struct rte_mempool *mp) * * If we fail to get enough contiguous memory, then we'll go and * reserve space in smaller chunks. - * - * We also have to take into account the fact that memory that we're - * going to allocate from can belong to an externally allocated memory - * area, in which case the assumption of IOVA as VA mode being - * synonymous with IOVA contiguousness will not hold. */ need_iova_contig_obj = !(mp->flags & MEMPOOL_F_NO_IOVA_CONTIG); diff --git a/lib/librte_mempool/rte_mempool_ops_default.c b/lib/librte_mempool/rte_mempool_ops_default.c index f6aea7662..e5cd4600f 100644 --- a/lib/librte_mempool/rte_mempool_ops_default.c +++ b/lib/librte_mempool/rte_mempool_ops_default.c @@ -61,21 +61,47 @@ rte_mempool_op_calc_mem_size_default(const struct rte_mempool *mp, return mem_size; } +/* Returns -1 if object crosses a page boundary, else returns 0 */ +static int +check_obj_bounds(char *obj, size_t pg_sz, size_t elt_sz) +{ + if (pg_sz == 0) + return 0; + if (elt_sz > pg_sz) + return 0; + if (RTE_PTR_ALIGN(obj, pg_sz) != RTE_PTR_ALIGN(obj + elt_sz - 1, pg_sz)) + return -1; + return 0; +} + int rte_mempool_op_populate_default(struct rte_mempool *mp, unsigned int max_objs, void *vaddr, rte_iova_t iova, size_t len, rte_mempool_populate_obj_cb_t *obj_cb, void *obj_cb_arg) { - size_t total_elt_sz; + char *va = vaddr; + size_t total_elt_sz, pg_sz; size_t off; unsigned 
int i; void *obj; + int ret; + + ret = rte_mempool_get_page_size(mp, &pg_sz); + if (ret < 0) + return ret; total_elt_sz = mp->header_size + mp->elt_size + mp->trailer_size; - for (off = 0, i = 0; off + total_elt_sz <= len && i < max_objs; i++) { + for (off = 0, i = 0; i < max_objs; i++) { + /* avoid objects to cross page boundaries */ + if (check_obj_bounds(va + off, pg_sz, total_elt_sz) < 0) + off += RTE_PTR_ALIGN_CEIL(va + off, pg_sz) - (va + off); + + if (off + total_elt_sz > len) + break; + off += mp->header_size; - obj = (char *)vaddr + off; + obj = va + off; obj_cb(mp, obj_cb_arg, obj, (iova == RTE_BAD_IOVA) ? RTE_BAD_IOVA : (iova + off)); rte_mempool_ops_enqueue_bulk(mp, &obj, 1); From patchwork Wed Oct 30 14:36:19 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Olivier Matz X-Patchwork-Id: 62229 Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 9AD08A00BE; Wed, 30 Oct 2019 15:37:42 +0100 (CET) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 7EB141C02C; Wed, 30 Oct 2019 15:37:08 +0100 (CET) Received: from proxy.6wind.com (host.76.145.23.62.rev.coltfrance.com [62.23.145.76]) by dpdk.org (Postfix) with ESMTP id A5C781BFCF for ; Wed, 30 Oct 2019 15:36:57 +0100 (CET) Received: from glumotte.dev.6wind.com. 
(unknown [10.16.0.195]) by proxy.6wind.com (Postfix) with ESMTP id F1136338A4E; Wed, 30 Oct 2019 15:36:55 +0100 (CET) From: Olivier Matz To: dev@dpdk.org Cc: Anatoly Burakov , Andrew Rybchenko , Ferruh Yigit , "Giridharan, Ganesan" , Jerin Jacob Kollanukkaran , "Kiran Kumar Kokkilagadda" , Stephen Hemminger , Thomas Monjalon , Vamsi Krishna Attunuru Date: Wed, 30 Oct 2019 15:36:19 +0100 Message-Id: <20191030143619.4007-7-olivier.matz@6wind.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20191030143619.4007-1-olivier.matz@6wind.com> References: <20190719133845.32432-1-olivier.matz@6wind.com> <20191030143619.4007-1-olivier.matz@6wind.com> MIME-Version: 1.0 Subject: [dpdk-dev] [PATCH v2 6/6] mempool: use the specific macro for object alignment X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" For consistency, RTE_MEMPOOL_ALIGN should be used in place of RTE_CACHE_LINE_SIZE. They have the same value, because the only arch that was defining a specific value for it has been removed from dpdk. 
Signed-off-by: Olivier Matz Reviewed-by: Andrew Rybchenko --- lib/librte_mempool/rte_mempool.c | 2 +- lib/librte_mempool/rte_mempool.h | 3 +++ lib/librte_mempool/rte_mempool_ops_default.c | 2 +- 3 files changed, 5 insertions(+), 2 deletions(-) diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c index d3db9273d..40cae3eb6 100644 --- a/lib/librte_mempool/rte_mempool.c +++ b/lib/librte_mempool/rte_mempool.c @@ -329,7 +329,7 @@ rte_mempool_populate_iova(struct rte_mempool *mp, char *vaddr, if (mp->flags & MEMPOOL_F_NO_CACHE_ALIGN) off = RTE_PTR_ALIGN_CEIL(vaddr, 8) - vaddr; else - off = RTE_PTR_ALIGN_CEIL(vaddr, RTE_CACHE_LINE_SIZE) - vaddr; + off = RTE_PTR_ALIGN_CEIL(vaddr, RTE_MEMPOOL_ALIGN) - vaddr; if (off > len) { ret = -EINVAL; diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h index 6d98b3743..cb7c423e6 100644 --- a/lib/librte_mempool/rte_mempool.h +++ b/lib/librte_mempool/rte_mempool.h @@ -116,6 +116,9 @@ struct rte_mempool_objsz { #define MEMPOOL_PG_NUM_DEFAULT 1 #ifndef RTE_MEMPOOL_ALIGN +/** + * Alignment of elements inside mempool. + */ #define RTE_MEMPOOL_ALIGN RTE_CACHE_LINE_SIZE #endif diff --git a/lib/librte_mempool/rte_mempool_ops_default.c b/lib/librte_mempool/rte_mempool_ops_default.c index e5cd4600f..390c490fd 100644 --- a/lib/librte_mempool/rte_mempool_ops_default.c +++ b/lib/librte_mempool/rte_mempool_ops_default.c @@ -56,7 +56,7 @@ rte_mempool_op_calc_mem_size_default(const struct rte_mempool *mp, } *min_chunk_size = total_elt_sz; - *align = RTE_CACHE_LINE_SIZE; + *align = RTE_MEMPOOL_ALIGN; return mem_size; }
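For readers following the page-boundary logic that patch 5/6 adds to rte_mempool_op_populate_default(), here is a minimal standalone sketch of the same idea. It is not the DPDK code: crosses_page() and place_objects() are hypothetical names, plain integer arithmetic stands in for the RTE_PTR_ALIGN* macros, and the header/trailer split of each element is collapsed into a single elt_sz. It only illustrates how an object that would straddle a page is pushed to the start of the next page before the length check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for check_obj_bounds(): returns -1 if the object
 * [start, start + elt_sz) spans two pages, 0 otherwise. As in the patch,
 * pg_sz == 0 or an object larger than a page disables the check.
 */
static int crosses_page(uintptr_t start, size_t pg_sz, size_t elt_sz)
{
	if (pg_sz == 0 || elt_sz > pg_sz)
		return 0;
	if (start / pg_sz != (start + elt_sz - 1) / pg_sz)
		return -1;
	return 0;
}

/*
 * Hypothetical placement loop modelled on the patched populate loop:
 * records the offset of each object placed in [base, base + len) and
 * returns how many objects fit. offsets must have room for max_objs.
 */
static unsigned int place_objects(uintptr_t base, size_t len, size_t pg_sz,
				  size_t elt_sz, unsigned int max_objs,
				  size_t *offsets)
{
	size_t off = 0;
	unsigned int i;

	assert(offsets != NULL);

	for (i = 0; i < max_objs; i++) {
		uintptr_t cur = base + off;

		/* skip to the next page if the object would straddle one */
		if (crosses_page(cur, pg_sz, elt_sz) < 0)
			off += pg_sz - (cur % pg_sz);

		/* same bound as the patch: stop once the object no longer fits */
		if (off + elt_sz > len)
			break;

		offsets[i] = off;
		off += elt_sz;
	}

	return i;
}
```

With a page-aligned base, three 4096-byte pages and 1500-byte elements, this sketch places two objects per page (offsets 0, 1500, 4096, 5596, 8192, 9692) and leaves the tail of each page unused, which is the trade-off the series accepts in exchange for never splitting an object across pages.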