[1/2] mempool: use actual IOVA addresses when populating
Checks
Commit Message
Currently, when mempool is being populated, we get IOVA address
of every segment using rte_mem_virt2iova(). This works for internal
memory, but does not really work for external memory, and does not
work on platforms which return RTE_BAD_IOVA as a result of this
call (such as FreeBSD). Moreover, even when it works, the function
in question will do unnecessary pagewalks in IOVA as PA mode, as
it falls back to rte_mem_virt2phy() instead of just doing a lookup in
internal memseg table.
To fix it, replace the call to first attempt to look through the
internal memseg table (this takes care of internal and external memory),
and fall back to rte_mem_virt2iova() when unable to perform VA->IOVA
translation via memseg table.
Fixes: 66cc45e293ed ("mem: replace memseg with memseg lists")
Cc: stable@dpdk.org
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
lib/librte_mempool/rte_mempool.c | 17 +++++++++++++++--
1 file changed, 15 insertions(+), 2 deletions(-)
Comments
On Thu, Nov 14, 2019 at 01:58:20PM +0000, Anatoly Burakov wrote:
> Currently, when mempool is being populated, we get IOVA address
> of every segment using rte_mem_virt2iova(). This works for internal
> memory, but does not really work for external memory, and does not
> work on platforms which return RTE_BAD_IOVA as a result of this
> call (such as FreeBSD). Moreover, even when it works, the function
> in question will do unnecessary pagewalks in IOVA as PA mode, as
> it falls back to rte_mem_virt2phy() instead of just doing a lookup in
> internal memseg table.
>
> To fix it, replace the call to first attempt to look through the
> internal memseg table (this takes care of internal and external memory),
> and fall back to rte_mem_virt2iova() when unable to perform VA->IOVA
> translation via memseg table.
>
> Fixes: 66cc45e293ed ("mem: replace memseg with memseg lists")
Shouldn't we also add
Fixes: 035ee5bea5ef ("mempool: remove optimistic IOVA-contiguous allocation")
?
From what I understand, even if the problem existed in populate_virt()
before, this is the commit that makes the problem visible in most cases.
> Cc: stable@dpdk.org
>
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
On 15-Nov-19 8:46 AM, Olivier Matz wrote:
> On Thu, Nov 14, 2019 at 01:58:20PM +0000, Anatoly Burakov wrote:
>> Currently, when mempool is being populated, we get IOVA address
>> of every segment using rte_mem_virt2iova(). This works for internal
>> memory, but does not really work for external memory, and does not
>> work on platforms which return RTE_BAD_IOVA as a result of this
>> call (such as FreeBSD). Moreover, even when it works, the function
>> in question will do unnecessary pagewalks in IOVA as PA mode, as
>> it falls back to rte_mem_virt2phy() instead of just doing a lookup in
>> internal memseg table.
>>
>> To fix it, replace the call to first attempt to look through the
>> internal memseg table (this takes care of internal and external memory),
>> and fall back to rte_mem_virt2iova() when unable to perform VA->IOVA
>> translation via memseg table.
>>
>> Fixes: 66cc45e293ed ("mem: replace memseg with memseg lists")
>
> Shouldn't we also add
> Fixes: 035ee5bea5ef ("mempool: remove optimistic IOVA-contiguous allocation")
> ?
>
>>From what I understand, even if the problem existed in populate_virt()
> before, this is the commit that makes the problem visible in most cases.
The commit didn't introduce the issue - only revealed it. So the commit
itself isn't buggy, and there's nothing to fix :) However, i'm not
opposed to sharing the blame!
>
>> Cc: stable@dpdk.org
>>
>> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
>
> Acked-by: Olivier Matz <olivier.matz@6wind.com>
>
Tested-by: Chen, BoX C <box.c.chen@intel.com>
> -----Original Message-----
> From: stable [mailto:stable-bounces@dpdk.org] On Behalf Of Anatoly Burakov
> Sent: Thursday, November 14, 2019 9:58 PM
> To: dev@dpdk.org
> Cc: Olivier Matz <olivier.matz@6wind.com>; Andrew Rybchenko
> <arybchenko@solarflare.com>; david.marchand@redhat.com; stable@dpdk.org
> Subject: [dpdk-stable] [PATCH 1/2] mempool: use actual IOVA addresses when
> populating
>
> Currently, when mempool is being populated, we get IOVA address of every
> segment using rte_mem_virt2iova(). This works for internal memory, but does
> not really work for external memory, and does not work on platforms which
> return RTE_BAD_IOVA as a result of this call (such as FreeBSD). Moreover, even
> when it works, the function in question will do unnecessary pagewalks in IOVA
> as PA mode, as it falls back to rte_mem_virt2phy() instead of just doing a lookup
> in internal memseg table.
>
> To fix it, replace the call to first attempt to look through the internal memseg
> table (this takes care of internal and external memory), and fall back to
> rte_mem_virt2iova() when unable to perform VA->IOVA translation via memseg
> table.
>
> Fixes: 66cc45e293ed ("mem: replace memseg with memseg lists")
> Cc: stable@dpdk.org
>
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> ---
> lib/librte_mempool/rte_mempool.c | 17 +++++++++++++++--
> 1 file changed, 15 insertions(+), 2 deletions(-)
>
> diff --git a/lib/librte_mempool/rte_mempool.c
> b/lib/librte_mempool/rte_mempool.c
> index 40cae3eb67..8da2e471c7 100644
> --- a/lib/librte_mempool/rte_mempool.c
> +++ b/lib/librte_mempool/rte_mempool.c
> @@ -356,6 +356,19 @@ rte_mempool_populate_iova(struct rte_mempool *mp,
> char *vaddr,
> return ret;
> }
>
> +static rte_iova_t
> +get_iova(void *addr)
> +{
> + struct rte_memseg *ms;
> +
> + /* try registered memory first */
> + ms = rte_mem_virt2memseg(addr, NULL);
> + if (ms == NULL || ms->iova == RTE_BAD_IOVA)
> + /* fall back to actual physical address */
> + return rte_mem_virt2iova(addr);
> + return ms->iova + RTE_PTR_DIFF(addr, ms->addr); }
> +
> /* Populate the mempool with a virtual area. Return the number of
> * objects added, or a negative value on error.
> */
> @@ -375,7 +388,7 @@ rte_mempool_populate_virt(struct rte_mempool *mp,
> char *addr,
> for (off = 0; off < len &&
> mp->populated_size < mp->size; off += phys_len) {
>
> - iova = rte_mem_virt2iova(addr + off);
> + iova = get_iova(addr + off);
>
> if (iova == RTE_BAD_IOVA && rte_eal_has_hugepages()) {
> ret = -EINVAL;
> @@ -391,7 +404,7 @@ rte_mempool_populate_virt(struct rte_mempool *mp,
> char *addr,
> phys_len = RTE_MIN(phys_len + pg_sz, len - off)) {
> rte_iova_t iova_tmp;
>
> - iova_tmp = rte_mem_virt2iova(addr + off + phys_len);
> + iova_tmp = get_iova(addr + off + phys_len);
>
> if (iova_tmp == RTE_BAD_IOVA ||
> iova_tmp != iova + phys_len)
> --
> 2.17.1
On Thu, Nov 14, 2019 at 2:58 PM Anatoly Burakov
<anatoly.burakov@intel.com> wrote:
>
> Currently, when mempool is being populated, we get IOVA address
> of every segment using rte_mem_virt2iova(). This works for internal
> memory, but does not really work for external memory, and does not
> work on platforms which return RTE_BAD_IOVA as a result of this
> call (such as FreeBSD). Moreover, even when it works, the function
> in question will do unnecessary pagewalks in IOVA as PA mode, as
> it falls back to rte_mem_virt2phy() instead of just doing a lookup in
> internal memseg table.
>
> To fix it, replace the call to first attempt to look through the
> internal memseg table (this takes care of internal and external memory),
> and fall back to rte_mem_virt2iova() when unable to perform VA->IOVA
> translation via memseg table.
>
> Fixes: 66cc45e293ed ("mem: replace memseg with memseg lists")
> Cc: stable@dpdk.org
>
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Tested-by: Bo Chen <box.c.chen@intel.com>
Series applied, thanks.
--
David Marchand
@@ -356,6 +356,19 @@ rte_mempool_populate_iova(struct rte_mempool *mp, char *vaddr,
return ret;
}
+static rte_iova_t
+get_iova(void *addr)
+{
+ struct rte_memseg *ms;
+
+ /* try registered memory first */
+ ms = rte_mem_virt2memseg(addr, NULL);
+ if (ms == NULL || ms->iova == RTE_BAD_IOVA)
+ /* fall back to actual physical address */
+ return rte_mem_virt2iova(addr);
+ return ms->iova + RTE_PTR_DIFF(addr, ms->addr);
+}
+
/* Populate the mempool with a virtual area. Return the number of
* objects added, or a negative value on error.
*/
@@ -375,7 +388,7 @@ rte_mempool_populate_virt(struct rte_mempool *mp, char *addr,
for (off = 0; off < len &&
mp->populated_size < mp->size; off += phys_len) {
- iova = rte_mem_virt2iova(addr + off);
+ iova = get_iova(addr + off);
if (iova == RTE_BAD_IOVA && rte_eal_has_hugepages()) {
ret = -EINVAL;
@@ -391,7 +404,7 @@ rte_mempool_populate_virt(struct rte_mempool *mp, char *addr,
phys_len = RTE_MIN(phys_len + pg_sz, len - off)) {
rte_iova_t iova_tmp;
- iova_tmp = rte_mem_virt2iova(addr + off + phys_len);
+ iova_tmp = get_iova(addr + off + phys_len);
if (iova_tmp == RTE_BAD_IOVA ||
iova_tmp != iova + phys_len)