eal: fix segment fault when exit trace

Message ID 20220607120014.49823-1-fengchengwen@huawei.com (mailing list archive)
State Superseded, archived
Delegated to: Thomas Monjalon
Headers
Series eal: fix segment fault when exit trace |

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/Intel-compilation success Compilation OK
ci/iol-mellanox-Performance success Performance Testing PASS
ci/intel-Testing success Testing PASS
ci/iol-aarch64-unit-testing success Testing PASS
ci/iol-intel-Performance success Performance Testing PASS
ci/iol-intel-Functional success Functional Testing PASS
ci/github-robot: build success github build: passed
ci/iol-aarch64-compile-testing success Testing PASS
ci/iol-x86_64-compile-testing success Testing PASS
ci/iol-x86_64-unit-testing success Testing PASS
ci/iol-abi-testing success Testing PASS

Commit Message

fengchengwen June 7, 2022, noon UTC
  Bug scenario:
1. start testpmd:
	dpdk-testpmd -l 4-6 -a 0000:7d:00.0 --trace=.* -- -i
2. quit testpmd and then observed segment fault:
	Bye...
	Segmentation fault (core dumped)

The root cause is that rte_trace_save() and eal_trace_fini() access
the huge pages which were cleanup by rte_eal_memory_detach().

This patch moves rte_trace_save() and eal_trace_fini() before
rte_eal_memory_detach() to fix the bug.

Fixes: dfbc61a2f9a6 ("mem: detach memsegs on cleanup")
Cc: stable@dpdk.org

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
---
 lib/eal/linux/eal.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
  

Comments

fengchengwen June 7, 2022, 12:26 p.m. UTC | #1
Hi Bruce,

  Could you please test freebsd platform? I think it also have
the same problem, but I hasn't freebsd enviorment.

Thanks

On 2022/6/7 20:00, Chengwen Feng wrote:
> Bug scenario:
> 1. start testpmd:
> 	dpdk-testpmd -l 4-6 -a 0000:7d:00.0 --trace=.* -- -i
> 2. quit testpmd and then observed segment fault:
> 	Bye...
> 	Segmentation fault (core dumped)
> 
> The root cause is that rte_trace_save() and eal_trace_fini() access
> the huge pages which were cleanup by rte_eal_memory_detach().
> 
> This patch moves rte_trace_save() and eal_trace_fini() before
> rte_eal_memory_detach() to fix the bug.
> 
> Fixes: dfbc61a2f9a6 ("mem: detach memsegs on cleanup")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> ---
>  lib/eal/linux/eal.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/eal/linux/eal.c b/lib/eal/linux/eal.c
> index 1ef263434a..c6f2056197 100644
> --- a/lib/eal/linux/eal.c
> +++ b/lib/eal/linux/eal.c
> @@ -1266,13 +1266,13 @@ rte_eal_cleanup(void)
>  	vfio_mp_sync_cleanup();
>  #endif
>  	rte_mp_channel_cleanup();
> +	rte_trace_save();
> +	eal_trace_fini();
>  	/* after this point, any DPDK pointers will become dangling */
>  	rte_eal_memory_detach();
>  	eal_mp_dev_hotplug_cleanup();
>  	rte_eal_malloc_heap_cleanup();
>  	rte_eal_alarm_cleanup();
> -	rte_trace_save();
> -	eal_trace_fini();
>  	eal_cleanup_config(internal_conf);
>  	rte_eal_log_cleanup();
>  	return 0;
>
  
David Marchand June 7, 2022, 1 p.m. UTC | #2
On Tue, Jun 7, 2022 at 2:06 PM Chengwen Feng <fengchengwen@huawei.com> wrote:
>
> Bug scenario:
> 1. start testpmd:
>         dpdk-testpmd -l 4-6 -a 0000:7d:00.0 --trace=.* -- -i
> 2. quit testpmd and then observed segment fault:
>         Bye...
>         Segmentation fault (core dumped)
>
> The root cause is that rte_trace_save() and eal_trace_fini() access
> the huge pages which were cleanup by rte_eal_memory_detach().
>
> This patch moves rte_trace_save() and eal_trace_fini() before
> rte_eal_memory_detach() to fix the bug.
>
> Fixes: dfbc61a2f9a6 ("mem: detach memsegs on cleanup")
> Cc: stable@dpdk.org
>
> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>

Good catch.
It was introduced in 21.05.
It's a bit concerning that it got unnoticed so far.
  
Jerin Jacob June 8, 2022, 10:31 a.m. UTC | #3
On Tue, Jun 7, 2022 at 5:36 PM Chengwen Feng <fengchengwen@huawei.com> wrote:
>
> Bug scenario:
> 1. start testpmd:
>         dpdk-testpmd -l 4-6 -a 0000:7d:00.0 --trace=.* -- -i
> 2. quit testpmd and then observed segment fault:
>         Bye...
>         Segmentation fault (core dumped)
>
> The root cause is that rte_trace_save() and eal_trace_fini() access
> the huge pages which were cleanup by rte_eal_memory_detach().
>
> This patch moves rte_trace_save() and eal_trace_fini() before
> rte_eal_memory_detach() to fix the bug.
>
> Fixes: dfbc61a2f9a6 ("mem: detach memsegs on cleanup")
> Cc: stable@dpdk.org
>
> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>


Tested-by: Jerin Jacob <jerinj@marvell.com>


> ---
>  lib/eal/linux/eal.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/lib/eal/linux/eal.c b/lib/eal/linux/eal.c
> index 1ef263434a..c6f2056197 100644
> --- a/lib/eal/linux/eal.c
> +++ b/lib/eal/linux/eal.c
> @@ -1266,13 +1266,13 @@ rte_eal_cleanup(void)
>         vfio_mp_sync_cleanup();
>  #endif
>         rte_mp_channel_cleanup();
> +       rte_trace_save();
> +       eal_trace_fini();
>         /* after this point, any DPDK pointers will become dangling */
>         rte_eal_memory_detach();
>         eal_mp_dev_hotplug_cleanup();
>         rte_eal_malloc_heap_cleanup();
>         rte_eal_alarm_cleanup();
> -       rte_trace_save();
> -       eal_trace_fini();
>         eal_cleanup_config(internal_conf);
>         rte_eal_log_cleanup();
>         return 0;
> --
> 2.33.0
>
  
David Marchand June 13, 2022, 2:15 p.m. UTC | #4
On Tue, Jun 7, 2022 at 2:26 PM fengchengwen <fengchengwen@huawei.com> wrote:
> Hi Bruce,
>
>   Could you please test freebsd platform? I think it also have
> the same problem, but I hasn't freebsd enviorment.

I can reproduce a crash with FreeBSD 13.

Moving rte_trace_save() and co earlier (as is done for Linux in this
proposed patch) avoids the crash, so I can't tell the traces file is
valid.

dpdk@freebsd:~/dpdk $ git diff
diff --git a/lib/eal/freebsd/eal.c b/lib/eal/freebsd/eal.c
index a6b20960f2..26fbc91b26 100644
--- a/lib/eal/freebsd/eal.c
+++ b/lib/eal/freebsd/eal.c
@@ -893,11 +893,11 @@ rte_eal_cleanup(void)
                eal_get_internal_configuration();
        rte_service_finalize();
        rte_mp_channel_cleanup();
+       rte_trace_save();
+       eal_trace_fini();
        /* after this point, any DPDK pointers will become dangling */
        rte_eal_memory_detach();
        rte_eal_alarm_cleanup();
-       rte_trace_save();
-       eal_trace_fini();
        eal_cleanup_config(internal_conf);
        return 0;
 }
  
fengchengwen June 14, 2022, 12:33 a.m. UTC | #5
Thansk for your testing, David.


On 2022/6/13 22:15, David Marchand wrote:
> On Tue, Jun 7, 2022 at 2:26 PM fengchengwen <fengchengwen@huawei.com> wrote:
>> Hi Bruce,
>>
>>   Could you please test freebsd platform? I think it also have
>> the same problem, but I hasn't freebsd enviorment.
> 
> I can reproduce a crash with FreeBSD 13.
> 
> Moving rte_trace_save() and co earlier (as is done for Linux in this
> proposed patch) avoids the crash, so I can't tell the traces file is
> valid.
> 
> dpdk@freebsd:~/dpdk $ git diff
> diff --git a/lib/eal/freebsd/eal.c b/lib/eal/freebsd/eal.c
> index a6b20960f2..26fbc91b26 100644
> --- a/lib/eal/freebsd/eal.c
> +++ b/lib/eal/freebsd/eal.c
> @@ -893,11 +893,11 @@ rte_eal_cleanup(void)
>                 eal_get_internal_configuration();
>         rte_service_finalize();
>         rte_mp_channel_cleanup();
> +       rte_trace_save();
> +       eal_trace_fini();
>         /* after this point, any DPDK pointers will become dangling */
>         rte_eal_memory_detach();
>         rte_eal_alarm_cleanup();
> -       rte_trace_save();
> -       eal_trace_fini();
>         eal_cleanup_config(internal_conf);
>         return 0;
>  }
> 
>
  

Patch

diff --git a/lib/eal/linux/eal.c b/lib/eal/linux/eal.c
index 1ef263434a..c6f2056197 100644
--- a/lib/eal/linux/eal.c
+++ b/lib/eal/linux/eal.c
@@ -1266,13 +1266,13 @@  rte_eal_cleanup(void)
 	vfio_mp_sync_cleanup();
 #endif
 	rte_mp_channel_cleanup();
+	rte_trace_save();
+	eal_trace_fini();
 	/* after this point, any DPDK pointers will become dangling */
 	rte_eal_memory_detach();
 	eal_mp_dev_hotplug_cleanup();
 	rte_eal_malloc_heap_cleanup();
 	rte_eal_alarm_cleanup();
-	rte_trace_save();
-	eal_trace_fini();
 	eal_cleanup_config(internal_conf);
 	rte_eal_log_cleanup();
 	return 0;