[v2] mem: fix cleanup when multi-process is disabled
Commit Message
rte_eal_memory_detach() did not account for cases where multi-process
mode is disabled: --in-memory and --no-shconf. This resulted
in unmapping memory that had not been mapped, which caused errors:
EAL: Could not unmap memory: No error (Windows)
EAL: Cannot munmap(0x1d47f40, 0x7000): Invalid argument (Linux)
The confusing "No error" message was caused by reading errno instead of
rte_errno, which is the variable rte_mem_unmap() sets.
Skip detaching memory altogether when --in-memory is specified.
Skip unmapping configuration when it's not shared.
Fix and add error handling to produce proper log messages.
Fixes: dfbc61a2f9a6 ("mem: detach memsegs on cleanup")
Cc: Anatoly Burakov <anatoly.burakov@intel.com>
Reported-by: Jie Zhou <jizh@microsoft.com>
Suggested-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
---
lib/librte_eal/common/eal_common_memory.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
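The errno versus rte_errno mix-up described in the commit message can be reproduced without DPDK. The following is a minimal sketch, not code from the patch: lib_errno and lib_unmap() are hypothetical stand-ins for rte_errno and rte_mem_unmap(), assuming a wrapper that reports failure only through its own error variable and leaves the C errno untouched.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for rte_errno: the library's own error variable. */
static int lib_errno;

/*
 * Hypothetical stand-in for rte_mem_unmap(): always fails here and reports
 * the reason through lib_errno only, never through the C errno.
 */
int
lib_unmap(void *addr, size_t len)
{
	(void)addr;
	(void)len;
	lib_errno = EINVAL;
	return -1;
}

/* Buggy caller: reads errno, which the wrapper never set. */
int
code_seen_by_buggy_caller(void)
{
	errno = 0;
	if (lib_unmap(NULL, 0) != 0)
		return errno; /* still 0, so the log would show no real error */
	return 0;
}

/* Fixed caller: reads the variable the wrapper actually sets. */
int
code_seen_by_fixed_caller(void)
{
	errno = 0;
	if (lib_unmap(NULL, 0) != 0)
		return lib_errno; /* EINVAL */
	return 0;
}
```

Formatting the buggy caller's code with strerror() describes error number 0, which is how a failed unmap can end up logged with a "No error" style message; the exact text depends on the C library.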
Comments
On 3/24/2021 12:32 PM, Dmitry Kozlyuk wrote:
> rte_eal_memory_detach() did not account for cases where multi-process
> mode is disabled: --in-memory and --no-shconf. This resulted
> in unmapping memory that had not been mapped, which caused errors:
>
> EAL: Could not unmap memory: No error (Windows)
> EAL: Cannot munmap(0x1d47f40, 0x7000): Invalid argument (Linux)
>
> Confusing "No error" was caused by using errno instead of rte_errno
> set by rte_mem_unmap().
>
> Skip detaching memory altogether when --in-memory is specified.
> Skip unmapping configuration when it's not shared.
> Fix and add error handling to produce proper log messages.
>
> Fixes: dfbc61a2f9a6 ("mem: detach memsegs on cleanup")
> Cc: Anatoly Burakov <anatoly.burakov@intel.com>
>
> Reported-by: Jie Zhou <jizh@microsoft.com>
> Suggested-by: David Marchand <david.marchand@redhat.com>
> Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
> ---
> lib/librte_eal/common/eal_common_memory.c | 12 ++++++++++--
> 1 file changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
> index 0e99986d3d..9495170c86 100644
> --- a/lib/librte_eal/common/eal_common_memory.c
> +++ b/lib/librte_eal/common/eal_common_memory.c
> @@ -1006,10 +1006,15 @@ rte_extmem_detach(void *va_addr, size_t len)
>  int
>  rte_eal_memory_detach(void)
>  {
> +	const struct internal_config *internal_conf =
> +		eal_get_internal_configuration();
>  	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
>  	size_t page_sz = rte_mem_page_size();
>  	unsigned int i;
>
> +	if (internal_conf->in_memory == 1)
> +		return 0;
> +
>  	rte_rwlock_write_lock(&mcfg->memory_hotplug_lock);
>
>  	/* detach internal memory subsystem data first */
> @@ -1032,7 +1037,7 @@ rte_eal_memory_detach(void)
>  		if (!msl->external)
>  			if (rte_mem_unmap(msl->base_va, msl->len) != 0)
>  				RTE_LOG(ERR, EAL, "Could not unmap memory: %s\n",
> -						strerror(errno));
> +						rte_strerror(rte_errno));
>
>  		/*
>  		 * we are detaching the fbarray rather than destroying because
> @@ -1050,7 +1055,10 @@ rte_eal_memory_detach(void)
>  	 * config - we can't zero it out because it might still be referenced
>  	 * by other processes.
>  	 */
> -	rte_mem_unmap(mcfg, RTE_ALIGN(sizeof(*mcfg), page_sz));
> +	if (internal_conf->no_shconf == 0)
> +		if (rte_mem_unmap(mcfg, RTE_ALIGN(sizeof(*mcfg), page_sz)) != 0)
> +			RTE_LOG(ERR, EAL, "Could not unmap shared memory config: %s\n",
> +				rte_strerror(rte_errno));
>  	rte_eal_get_configuration()->mem_config = NULL;
>
>  	return 0;
Acked-by: Ranjit Menon <ranjit.menon@intel.com>
On 24-Mar-21 7:32 PM, Dmitry Kozlyuk wrote:
> rte_eal_memory_detach() did not account for cases where multi-process
> mode is disabled: --in-memory and --no-shconf. This resulted
> in unmapping memory that had not been mapped, which caused errors:
>
> EAL: Could not unmap memory: No error (Windows)
> EAL: Cannot munmap(0x1d47f40, 0x7000): Invalid argument (Linux)
>
> Confusing "No error" was caused by using errno instead of rte_errno
> set by rte_mem_unmap().
>
> Skip detaching memory altogether when --in-memory is specified.
> Skip unmapping configuration when it's not shared.
> Fix and add error handling to produce proper log messages.
>
> Fixes: dfbc61a2f9a6 ("mem: detach memsegs on cleanup")
> Cc: Anatoly Burakov <anatoly.burakov@intel.com>
>
> Reported-by: Jie Zhou <jizh@microsoft.com>
> Suggested-by: David Marchand <david.marchand@redhat.com>
> Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
> ---
LGTM
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
On Wed, Mar 24, 2021 at 8:32 PM Dmitry Kozlyuk <dmitry.kozliuk@gmail.com> wrote:
>
> rte_eal_memory_detach() did not account for cases where multi-process
> mode is disabled: --in-memory and --no-shconf. This resulted
> in unmapping memory that had not been mapped, which caused errors:
>
> EAL: Could not unmap memory: No error (Windows)
> EAL: Cannot munmap(0x1d47f40, 0x7000): Invalid argument (Linux)
>
> Confusing "No error" was caused by using errno instead of rte_errno
> set by rte_mem_unmap().
>
> Skip detaching memory altogether when --in-memory is specified.
> Skip unmapping configuration when it's not shared.
> Fix and add error handling to produce proper log messages.
>
> Fixes: dfbc61a2f9a6 ("mem: detach memsegs on cleanup")
> Cc: Anatoly Burakov <anatoly.burakov@intel.com>
>
> Reported-by: Jie Zhou <jizh@microsoft.com>
> Suggested-by: David Marchand <david.marchand@redhat.com>
> Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Ranjit Menon <ranjit.menon@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Applied, thanks Dmitry.
On Wed, Mar 24, 2021 at 8:32 PM Dmitry Kozlyuk <dmitry.kozliuk@gmail.com> wrote:
> @@ -1050,7 +1055,10 @@ rte_eal_memory_detach(void)
>  	 * config - we can't zero it out because it might still be referenced
>  	 * by other processes.
>  	 */
> -	rte_mem_unmap(mcfg, RTE_ALIGN(sizeof(*mcfg), page_sz));
> +	if (internal_conf->no_shconf == 0)
> +		if (rte_mem_unmap(mcfg, RTE_ALIGN(sizeof(*mcfg), page_sz)) != 0)
> +			RTE_LOG(ERR, EAL, "Could not unmap shared memory config: %s\n",
> +				rte_strerror(rte_errno));
>  	rte_eal_get_configuration()->mem_config = NULL;
>
>  	return 0;
We have another issue if EAL init fails early and the application then
exits by calling rte_exit() -> rte_eal_cleanup() ->
rte_eal_memory_detach().
The issue is not caused by this change but by
dfbc61a2f9a6 ("mem: detach memsegs on cleanup"); it only became visible
with the log message added above.
Example:
$ ./build/app/dpdk-testpmd --plop
...
EAL: FATAL: Invalid 'command line' arguments.
EAL: Invalid 'command line' arguments.
EAL: Error - exiting with code: 1
Cause: Cannot init EAL: Invalid argument
EAL: Could not unmap shared memory config: Invalid argument
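One way to make cleanup tolerate an early init failure is to record whether the config was ever mapped and to unmap only in that case. The sketch below is an assumption for illustration, not the fix adopted for this issue; struct mem_state, fake_unmap() and memory_detach() are hypothetical names that do not exist in DPDK.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bookkeeping; none of these names exist in DPDK. */
struct mem_state {
	void *mem_config;   /* config pointer, mapped or not */
	bool config_mapped; /* true only after the mapping succeeded */
	int unmap_calls;    /* counts unmap attempts, for illustration */
};

/* Stand-in for the real unmap call; always succeeds in this sketch. */
int
fake_unmap(struct mem_state *st)
{
	st->unmap_calls++;
	return 0;
}

/* Cleanup that is safe to call even if init failed before mapping. */
int
memory_detach(struct mem_state *st)
{
	if (st->config_mapped) {
		if (fake_unmap(st) != 0)
			return -1;
		st->config_mapped = false;
	}
	/* Drop the reference whether or not anything was unmapped. */
	st->mem_config = NULL;
	return 0;
}
```

With a state whose config_mapped is false, as after an early init failure, memory_detach() returns 0 without attempting an unmap, so no spurious error would be logged on exit.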
Patch
diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index 0e99986d3d..9495170c86 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -1006,10 +1006,15 @@ rte_extmem_detach(void *va_addr, size_t len)
 int
 rte_eal_memory_detach(void)
 {
+	const struct internal_config *internal_conf =
+		eal_get_internal_configuration();
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
 	size_t page_sz = rte_mem_page_size();
 	unsigned int i;
 
+	if (internal_conf->in_memory == 1)
+		return 0;
+
 	rte_rwlock_write_lock(&mcfg->memory_hotplug_lock);
 
 	/* detach internal memory subsystem data first */
@@ -1032,7 +1037,7 @@ rte_eal_memory_detach(void)
 		if (!msl->external)
 			if (rte_mem_unmap(msl->base_va, msl->len) != 0)
 				RTE_LOG(ERR, EAL, "Could not unmap memory: %s\n",
-						strerror(errno));
+						rte_strerror(rte_errno));
 
 		/*
 		 * we are detaching the fbarray rather than destroying because
@@ -1050,7 +1055,10 @@ rte_eal_memory_detach(void)
 	 * config - we can't zero it out because it might still be referenced
 	 * by other processes.
 	 */
-	rte_mem_unmap(mcfg, RTE_ALIGN(sizeof(*mcfg), page_sz));
+	if (internal_conf->no_shconf == 0)
+		if (rte_mem_unmap(mcfg, RTE_ALIGN(sizeof(*mcfg), page_sz)) != 0)
+			RTE_LOG(ERR, EAL, "Could not unmap shared memory config: %s\n",
+				rte_strerror(rte_errno));
 	rte_eal_get_configuration()->mem_config = NULL;
 
 	return 0;