test/mbuf: fix the forked process segment fault
Checks
Commit Message
Access of any memory in the hugepage shared file-backed area will trigger
an unexpected forked child process segment fault. The root cause is DPDK
doesn't support fork model [1] (calling rte_eal_init() before fork()).
Forked child process can't be treated as a secondary process.
Hence fix it by avoiding fork and doing verification in the main process.
[1] https://mails.dpdk.org/archives/dev/2018-July/108106.html
Fixes: af75078fece3 ("first public release")
Cc: stable@dpdk.org
Signed-off-by: Jia He <justin.he@arm.com>
Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
---
app/test/test_mbuf.c | 50 +++++++++++++-------------------------------
1 file changed, 14 insertions(+), 36 deletions(-)
Comments
On 5/22/2023 7:01 AM, Ruifeng Wang wrote:
> Access of any memory in the hugepage shared file-backed area will trigger
> an unexpected forked child process segment fault. The root cause is DPDK
> doesn't support fork model [1] (calling rte_eal_init() before fork()).
> Forked child process can't be treated as a secondary process.
>
> Hence fix it by avoiding fork and doing verification in the main process.
>
> [1] https://mails.dpdk.org/archives/dev/2018-July/108106.html
>
> Fixes: af75078fece3 ("first public release")
> Cc: stable@dpdk.org
>
> Signed-off-by: Jia He <justin.he@arm.com>
> Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
> ---
Would this be something that a secondary process-based test could test?
That's how we test rte_panic() and other calls.
> -----Original Message-----
> From: Burakov, Anatoly <anatoly.burakov@intel.com>
> Sent: Monday, May 22, 2023 5:24 PM
> To: Ruifeng Wang <Ruifeng.Wang@arm.com>; olivier.matz@6wind.com
> Cc: dev@dpdk.org; stable@dpdk.org; thomas@monjalon.net; stephen@networkplumber.org; Justin
> He <Justin.He@arm.com>; Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; nd
> <nd@arm.com>
> Subject: Re: [PATCH] test/mbuf: fix the forked process segment fault
>
> On 5/22/2023 7:01 AM, Ruifeng Wang wrote:
> > Access of any memory in the hugepage shared file-backed area will
> > trigger an unexpected forked child process segment fault. The root
> > cause is DPDK doesn't support fork model [1] (calling rte_eal_init() before fork()).
> > Forked child process can't be treated as a secondary process.
> >
> > Hence fix it by avoiding fork and doing verification in the main process.
> >
> > [1] https://mails.dpdk.org/archives/dev/2018-July/108106.html
> >
> > Fixes: af75078fece3 ("first public release")
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: Jia He <justin.he@arm.com>
> > Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
> > ---
>
> Would this be something that a secondary process-based test could test?
> That's how we test rte_panic() and other calls.
This case validates mbuf. IMO there is no need to do validation in a secondary process.
Unit test for rte_panic() also uses fork() and could have the same issue.
>
> --
> Thanks,
> Anatoly
On 5/22/2023 10:55 AM, Ruifeng Wang wrote:
>> -----Original Message-----
>> From: Burakov, Anatoly <anatoly.burakov@intel.com>
>> Sent: Monday, May 22, 2023 5:24 PM
>> To: Ruifeng Wang <Ruifeng.Wang@arm.com>; olivier.matz@6wind.com
>> Cc: dev@dpdk.org; stable@dpdk.org; thomas@monjalon.net; stephen@networkplumber.org; Justin
>> He <Justin.He@arm.com>; Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; nd
>> <nd@arm.com>
>> Subject: Re: [PATCH] test/mbuf: fix the forked process segment fault
>>
>> On 5/22/2023 7:01 AM, Ruifeng Wang wrote:
>>> Access of any memory in the hugepage shared file-backed area will
>>> trigger an unexpected forked child process segment fault. The root
>>> cause is DPDK doesn't support fork model [1] (calling rte_eal_init() before fork()).
>>> Forked child process can't be treated as a secondary process.
>>>
>>> Hence fix it by avoiding fork and doing verification in the main process.
>>>
>>> [1] https://mails.dpdk.org/archives/dev/2018-July/108106.html
>>>
>>> Fixes: af75078fece3 ("first public release")
>>> Cc: stable@dpdk.org
>>>
>>> Signed-off-by: Jia He <justin.he@arm.com>
>>> Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
>>> ---
>>
>> Would this be something that a secondary process-based test could test?
>> That's how we test rte_panic() and other calls.
>
> This case validates mbuf. IMO there is no need to do validation in a secondary process.
> Unit test for rte_panic() also uses fork() and could have the same issue.
>
In that case, rte_panic() test should be fixed as well.
My concern is that ideally, we shouldn't intentionally crash the test
app if something goes wrong, and calling rte_panic() accomplishes just
that - which is why I suggested running them in secondary processes
instead, so that any call into rte_panic happens inside a secondary
process, and the main test process doesn't crash even if the test has
failed.
On Mon, 22 May 2023 11:19:24 +0100
"Burakov, Anatoly" <anatoly.burakov@intel.com> wrote:
> >
> > This case validates mbuf. IMO there is no need to do validation in a secondary process.
> > Unit test for rte_panic() also uses fork() and could have the same issue.
> >
>
> In that case, rte_panic() test should be fixed as well.
>
> My concern is that ideally, we shouldn't intentionally crash the test
> app if something goes wrong, and calling rte_panic() accomplishes just
> that - which is why I suggested running them in secondary processes
> instead, so that any call into rte_panic happens inside a secondary
> process, and the main test process doesn't crash even if the test has
> failed.
>
All forks outside of EAL are bad. The test should be removed, it was buggy
when first written.
On 5/22/2023 4:21 PM, Stephen Hemminger wrote:
> On Mon, 22 May 2023 11:19:24 +0100
> "Burakov, Anatoly" <anatoly.burakov@intel.com> wrote:
>
>>>
>>> This case validates mbuf. IMO there is no need to do validation in a secondary process.
>>> Unit test for rte_panic() also uses fork() and could have the same issue.
>>>
>>
>> In that case, rte_panic() test should be fixed as well.
>>
>> My concern is that ideally, we shouldn't intentionally crash the test
>> app if something goes wrong, and calling rte_panic() accomplishes just
>> that - which is why I suggested running them in secondary processes
>> instead, so that any call into rte_panic happens inside a secondary
>> process, and the main test process doesn't crash even if the test has
>> failed.
>>
>
> All forks outside of EAL are bad. The test should be removed, it was buggy
> when first written.
I agree that none of the tests (or anything else in DPDK) should fork,
but some things still crash process (as in, cause rte_panic), and the
purpose of using fork() was for that. We can rewrite the tests to not
fork and instead use other methods (such as spinning up secondary
processes), we don't have to remove them.
> -----Original Message-----
> From: Burakov, Anatoly <anatoly.burakov@intel.com>
> Sent: Monday, May 22, 2023 6:19 PM
> To: Ruifeng Wang <Ruifeng.Wang@arm.com>; olivier.matz@6wind.com
> Cc: dev@dpdk.org; stable@dpdk.org; thomas@monjalon.net; stephen@networkplumber.org; Justin
> He <Justin.He@arm.com>; Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; nd
> <nd@arm.com>
> Subject: Re: [PATCH] test/mbuf: fix the forked process segment fault
>
> On 5/22/2023 10:55 AM, Ruifeng Wang wrote:
> >> -----Original Message-----
> >> From: Burakov, Anatoly <anatoly.burakov@intel.com>
> >> Sent: Monday, May 22, 2023 5:24 PM
> >> To: Ruifeng Wang <Ruifeng.Wang@arm.com>; olivier.matz@6wind.com
> >> Cc: dev@dpdk.org; stable@dpdk.org; thomas@monjalon.net;
> >> stephen@networkplumber.org; Justin He <Justin.He@arm.com>; Honnappa
> >> Nagarahalli <Honnappa.Nagarahalli@arm.com>; nd <nd@arm.com>
> >> Subject: Re: [PATCH] test/mbuf: fix the forked process segment fault
> >>
> >> On 5/22/2023 7:01 AM, Ruifeng Wang wrote:
> >>> Access of any memory in the hugepage shared file-backed area will
> >>> trigger an unexpected forked child process segment fault. The root
> >>> cause is DPDK doesn't support fork model [1] (calling rte_eal_init() before fork()).
> >>> Forked child process can't be treated as a secondary process.
> >>>
> >>> Hence fix it by avoiding fork and doing verification in the main process.
> >>>
> >>> [1] https://mails.dpdk.org/archives/dev/2018-July/108106.html
> >>>
> >>> Fixes: af75078fece3 ("first public release")
> >>> Cc: stable@dpdk.org
> >>>
> >>> Signed-off-by: Jia He <justin.he@arm.com>
> >>> Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
> >>> ---
> >>
> >> Would this be something that a secondary process-based test could test?
> >> That's how we test rte_panic() and other calls.
> >
> > This case validates mbuf. IMO there is no need to do validation in a secondary process.
> > Unit test for rte_panic() also uses fork() and could have the same issue.
> >
>
> In that case, rte_panic() test should be fixed as well.
>
> My concern is that ideally, we shouldn't intentionally crash the test app if something
> goes wrong, and calling rte_panic() accomplishes just that - which is why I suggested
> running them in secondary processes instead, so that any call into rte_panic happens
> inside a secondary process, and the main test process doesn't crash even if the test has
> failed.
Agree that intentionally crashing the test app is bad.
In this patch, verification of mbuf is changed to use another API without rte_panic().
Then the verification can be done directly in the primary. And the indirectness of
using a secondary process is removed. Because verification will not crash the process.
>
> --
> Thanks,
> Anatoly
On 5/23/2023 4:45 AM, Ruifeng Wang wrote:
>> -----Original Message-----
>> From: Burakov, Anatoly <anatoly.burakov@intel.com>
>> Sent: Monday, May 22, 2023 6:19 PM
>> To: Ruifeng Wang <Ruifeng.Wang@arm.com>; olivier.matz@6wind.com
>> Cc: dev@dpdk.org; stable@dpdk.org; thomas@monjalon.net; stephen@networkplumber.org; Justin
>> He <Justin.He@arm.com>; Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; nd
>> <nd@arm.com>
>> Subject: Re: [PATCH] test/mbuf: fix the forked process segment fault
>>
>> On 5/22/2023 10:55 AM, Ruifeng Wang wrote:
>>>> -----Original Message-----
>>>> From: Burakov, Anatoly <anatoly.burakov@intel.com>
>>>> Sent: Monday, May 22, 2023 5:24 PM
>>>> To: Ruifeng Wang <Ruifeng.Wang@arm.com>; olivier.matz@6wind.com
>>>> Cc: dev@dpdk.org; stable@dpdk.org; thomas@monjalon.net;
>>>> stephen@networkplumber.org; Justin He <Justin.He@arm.com>; Honnappa
>>>> Nagarahalli <Honnappa.Nagarahalli@arm.com>; nd <nd@arm.com>
>>>> Subject: Re: [PATCH] test/mbuf: fix the forked process segment fault
>>>>
>>>> On 5/22/2023 7:01 AM, Ruifeng Wang wrote:
>>>>> Access of any memory in the hugepage shared file-backed area will
>>>>> trigger an unexpected forked child process segment fault. The root
>>>>> cause is DPDK doesn't support fork model [1] (calling rte_eal_init() before fork()).
>>>>> Forked child process can't be treated as a secondary process.
>>>>>
>>>>> Hence fix it by avoiding fork and doing verification in the main process.
>>>>>
>>>>> [1] https://mails.dpdk.org/archives/dev/2018-July/108106.html
>>>>>
>>>>> Fixes: af75078fece3 ("first public release")
>>>>> Cc: stable@dpdk.org
>>>>>
>>>>> Signed-off-by: Jia He <justin.he@arm.com>
>>>>> Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
>>>>> ---
>>>>
>>>> Would this be something that a secondary process-based test could test?
>>>> That's how we test rte_panic() and other calls.
>>>
>>> This case validates mbuf. IMO there is no need to do validation in a secondary process.
>>> Unit test for rte_panic() also uses fork() and could have the same issue.
>>>
>>
>> In that case, rte_panic() test should be fixed as well.
>>
>> My concern is that ideally, we shouldn't intentionally crash the test app if something
>> goes wrong, and calling rte_panic() accomplishes just that - which is why I suggested
>> running them in secondary processes instead, so that any call into rte_panic happens
>> inside a secondary process, and the main test process doesn't crash even if the test has
>> failed.
>
> Agree that intentionally crashing the test app is bad.
> In this patch, verification of mbuf is changed to use another API without rte_panic().
> Then the verification can be done directly in the primary. And the indirectness of
> using a secondary process is removed. Because verification will not crash the process.
>
Oh,
My apologies, I did not notice that. In that case,
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
23/05/2023 12:12, Burakov, Anatoly:
> On 5/23/2023 4:45 AM, Ruifeng Wang wrote:
> >> -----Original Message-----
> >> From: Burakov, Anatoly <anatoly.burakov@intel.com>
> >> Sent: Monday, May 22, 2023 6:19 PM
> >> To: Ruifeng Wang <Ruifeng.Wang@arm.com>; olivier.matz@6wind.com
> >> Cc: dev@dpdk.org; stable@dpdk.org; thomas@monjalon.net; stephen@networkplumber.org; Justin
> >> He <Justin.He@arm.com>; Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; nd
> >> <nd@arm.com>
> >> Subject: Re: [PATCH] test/mbuf: fix the forked process segment fault
> >>
> >> On 5/22/2023 10:55 AM, Ruifeng Wang wrote:
> >>>> -----Original Message-----
> >>>> From: Burakov, Anatoly <anatoly.burakov@intel.com>
> >>>> Sent: Monday, May 22, 2023 5:24 PM
> >>>> To: Ruifeng Wang <Ruifeng.Wang@arm.com>; olivier.matz@6wind.com
> >>>> Cc: dev@dpdk.org; stable@dpdk.org; thomas@monjalon.net;
> >>>> stephen@networkplumber.org; Justin He <Justin.He@arm.com>; Honnappa
> >>>> Nagarahalli <Honnappa.Nagarahalli@arm.com>; nd <nd@arm.com>
> >>>> Subject: Re: [PATCH] test/mbuf: fix the forked process segment fault
> >>>>
> >>>> On 5/22/2023 7:01 AM, Ruifeng Wang wrote:
> >>>>> Access of any memory in the hugepage shared file-backed area will
> >>>>> trigger an unexpected forked child process segment fault. The root
> >>>>> cause is DPDK doesn't support fork model [1] (calling rte_eal_init() before fork()).
> >>>>> Forked child process can't be treated as a secondary process.
> >>>>>
> >>>>> Hence fix it by avoiding fork and doing verification in the main process.
> >>>>>
> >>>>> [1] https://mails.dpdk.org/archives/dev/2018-July/108106.html
> >>>>>
> >>>>> Fixes: af75078fece3 ("first public release")
> >>>>> Cc: stable@dpdk.org
> >>>>>
> >>>>> Signed-off-by: Jia He <justin.he@arm.com>
> >>>>> Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
> >>>>> ---
> >>>>
> >>>> Would this be something that a secondary process-based test could test?
> >>>> That's how we test rte_panic() and other calls.
> >>>
> >>> This case validates mbuf. IMO there is no need to do validation in a secondary process.
> >>> Unit test for rte_panic() also uses fork() and could have the same issue.
> >>>
> >>
> >> In that case, rte_panic() test should be fixed as well.
> >>
> >> My concern is that ideally, we shouldn't intentionally crash the test app if something
> >> goes wrong, and calling rte_panic() accomplishes just that - which is why I suggested
> >> running them in secondary processes instead, so that any call into rte_panic happens
> >> inside a secondary process, and the main test process doesn't crash even if the test has
> >> failed.
> >
> > Agree that intentionally crashing the test app is bad.
> > In this patch, verification of mbuf is changed to use another API without rte_panic().
> > Then the verification can be done directly in the primary. And the indirectness of
> > using a secondary process is removed. Because verification will not crash the process.
> >
>
> Oh,
>
> My apologies, I did not notice that. In that case,
>
> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Applied, thanks.
@@ -1167,38 +1167,16 @@ test_failing_mbuf_sanity_check(struct rte_mempool *pktmbuf_pool)
return TEST_SKIPPED;
}
#else
-
-#include <unistd.h>
-#include <sys/resource.h>
-#include <sys/time.h>
-#include <sys/wait.h>
-
-/* use fork() to test mbuf errors panic */
-static int
-verify_mbuf_check_panics(struct rte_mbuf *buf)
+/* Verify if mbuf can pass the check */
+static bool
+mbuf_check_pass(struct rte_mbuf *buf)
{
- int pid;
- int status;
+ const char *reason;
- pid = fork();
+ if (rte_mbuf_check(buf, 1, &reason) == 0)
+ return true;
- if (pid == 0) {
- struct rlimit rl;
-
- /* No need to generate a coredump when panicking. */
- rl.rlim_cur = rl.rlim_max = 0;
- setrlimit(RLIMIT_CORE, &rl);
- rte_mbuf_sanity_check(buf, 1); /* should panic */
- exit(0); /* return normally if it doesn't panic */
- } else if (pid < 0) {
- printf("Fork Failed\n");
- return -1;
- }
- wait(&status);
- if(status == 0)
- return -1;
-
- return 0;
+ return false;
}
static int
@@ -1215,19 +1193,19 @@ test_failing_mbuf_sanity_check(struct rte_mempool *pktmbuf_pool)
return -1;
printf("Checking good mbuf initially\n");
- if (verify_mbuf_check_panics(buf) != -1)
+ if (!mbuf_check_pass(buf))
return -1;
printf("Now checking for error conditions\n");
- if (verify_mbuf_check_panics(NULL)) {
+ if (mbuf_check_pass(NULL)) {
printf("Error with NULL mbuf test\n");
return -1;
}
badbuf = *buf;
badbuf.pool = NULL;
- if (verify_mbuf_check_panics(&badbuf)) {
+ if (mbuf_check_pass(&badbuf)) {
printf("Error with bad-pool mbuf test\n");
return -1;
}
@@ -1235,7 +1213,7 @@ test_failing_mbuf_sanity_check(struct rte_mempool *pktmbuf_pool)
if (RTE_IOVA_IN_MBUF) {
badbuf = *buf;
rte_mbuf_iova_set(&badbuf, 0);
- if (verify_mbuf_check_panics(&badbuf)) {
+ if (mbuf_check_pass(&badbuf)) {
printf("Error with bad-physaddr mbuf test\n");
return -1;
}
@@ -1243,21 +1221,21 @@ test_failing_mbuf_sanity_check(struct rte_mempool *pktmbuf_pool)
badbuf = *buf;
badbuf.buf_addr = NULL;
- if (verify_mbuf_check_panics(&badbuf)) {
+ if (mbuf_check_pass(&badbuf)) {
printf("Error with bad-addr mbuf test\n");
return -1;
}
badbuf = *buf;
badbuf.refcnt = 0;
- if (verify_mbuf_check_panics(&badbuf)) {
+ if (mbuf_check_pass(&badbuf)) {
printf("Error with bad-refcnt(0) mbuf test\n");
return -1;
}
badbuf = *buf;
badbuf.refcnt = UINT16_MAX;
- if (verify_mbuf_check_panics(&badbuf)) {
+ if (mbuf_check_pass(&badbuf)) {
printf("Error with bad-refcnt(MAX) mbuf test\n");
return -1;
}