eal: factorize lcore main loop

Message ID 20220323093001.20618-1-david.marchand@redhat.com (mailing list archive)
State Superseded, archived
Delegated to: Thomas Monjalon
Series eal: factorize lcore main loop

Checks

Context                          Check    Description
ci/checkpatch                    warning  coding style issues
ci/Intel-compilation             success  Compilation OK
ci/intel-Testing                 success  Testing PASS
ci/iol-mellanox-Performance      success  Performance Testing PASS
ci/iol-intel-Functional          success  Functional Testing PASS
ci/iol-aarch64-unit-testing      success  Testing PASS
ci/iol-aarch64-compile-testing   success  Testing PASS
ci/github-robot: build           success  github build: passed
ci/iol-intel-Performance         success  Performance Testing PASS
ci/iol-x86_64-unit-testing       success  Testing PASS
ci/iol-x86_64-compile-testing    success  Testing PASS
ci/iol-broadcom-Functional       fail     Functional Testing issues

Commit Message

David Marchand March 23, 2022, 9:30 a.m. UTC
All OS implementations provide the same main loop.
Introduce helpers (shared for Linux and FreeBSD) to handle synchronisation
between the main thread and worker threads, and factorize the rest as
common code.
Thread IDs are now logged as strings in a common format across OSes.

Signed-off-by: David Marchand <david.marchand@redhat.com>
---
I had this patch in store for a long time.
I don't particularly care about it, it's not fixing anything.
But it seems a good cleanup/consolidation, so I rebased it and I am
sending it to get feedback.

---
 lib/eal/common/eal_common_launch.c |  36 +++++++-
 lib/eal/common/eal_common_thread.c |  72 +++++++++++++++
 lib/eal/common/eal_thread.h        |  27 ++++++
 lib/eal/freebsd/eal.c              |   7 +-
 lib/eal/freebsd/eal_thread.c       | 138 ----------------------------
 lib/eal/linux/eal.c                |   4 +-
 lib/eal/linux/eal_thread.c         | 139 +----------------------------
 lib/eal/unix/eal_unix_thread.c     |  63 +++++++++++++
 lib/eal/unix/meson.build           |   5 +-
 lib/eal/windows/eal_thread.c       | 137 +++++++---------------------
 10 files changed, 236 insertions(+), 392 deletions(-)
 create mode 100644 lib/eal/unix/eal_unix_thread.c
  

Comments

Morten Brørup March 23, 2022, 12:01 p.m. UTC | #1
> From: David Marchand [mailto:david.marchand@redhat.com]
> Sent: Wednesday, 23 March 2022 10.30
> 
> All OS implementations provide the same main loop.
> Introduce helpers (shared for Linux and FreeBSD) to handle
> synchronisation
> between main and threads and factorize the rest as common code.
> Thread id are now logged as string in a common format across OS.
> 
> Signed-off-by: David Marchand <david.marchand@redhat.com>
> ---
> I had this patch in store for a long time.
> I don't particularly care about it, it's not fixing anything.
> But it seems a good cleanup/consolidation, so I rebased it and I am
> sending it to get feedback.
> 

LGTM. I'm always in favor of cleaning up! :-)

Thank you, David.

Acked-By: Morten Brørup <mb@smartsharesystems.com>
  
Tyler Retzlaff March 24, 2022, 8:31 a.m. UTC | #2
On Wed, Mar 23, 2022 at 10:30:01AM +0100, David Marchand wrote:
> All OS implementations provide the same main loop.
> Introduce helpers (shared for Linux and FreeBSD) to handle synchronisation
> between main and threads and factorize the rest as common code.
> Thread id are now logged as string in a common format across OS.
> 
> Signed-off-by: David Marchand <david.marchand@redhat.com>
> ---
> I had this patch in store for a long time.
> I don't particularly care about it, it's not fixing anything.
> But it seems a good cleanup/consolidation, so I rebased it and I am
> sending it to get feedback.
> 

... snip ...

> diff --git a/lib/eal/common/eal_common_thread.c b/lib/eal/common/eal_common_thread.c
> index 684bea166c..256de91abc 100644
> --- a/lib/eal/common/eal_common_thread.c
> +++ b/lib/eal/common/eal_common_thread.c
> @@ -9,6 +9,7 @@
>  #include <assert.h>
>  #include <string.h>
>  
> +#include <rte_eal_trace.h>
>  #include <rte_errno.h>
>  #include <rte_lcore.h>
>  #include <rte_log.h>
> @@ -163,6 +164,77 @@ __rte_thread_uninit(void)
>  	RTE_PER_LCORE(_lcore_id) = LCORE_ID_ANY;
>  }
>  
> +/* main loop of threads */
> +__rte_noreturn void *
> +eal_thread_loop(__rte_unused void *arg)
> +{
> +	char cpuset[RTE_CPU_AFFINITY_STR_LEN];
> +	pthread_t thread_id = pthread_self();
> +	unsigned int lcore_id;
> +	int ret;
> +
> +	/* retrieve our lcore_id from the configuration structure */
> +	RTE_LCORE_FOREACH_WORKER(lcore_id) {
> +		if (thread_id == lcore_config[lcore_id].thread_id)
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

i can see that in practice this isn't a problem since the linux
implementation of pthread_create(3) stores to pthread_t *thread before
executing start_routine.

but strictly speaking i don't think the pthread_create api contractually
guarantees that the thread id is stored before start_routine runs. so this
is relying on an internal implementation detail.

https://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_create.html

  "Upon successful completion, pthread_create() shall store the ID of the
   created thread in the location referenced by thread."

https://man7.org/linux/man-pages/man3/pthread_create.3.html

  "Before returning, a successful call to pthread_create() stores
   the ID of the new thread in the buffer pointed to by thread; this
   identifier is used to refer to the thread in subsequent calls to
   other pthreads functions."

it doesn't really say when it does this in relation to start_routine running.
depends how hair splitty you want to be about it. but since you're revamping
the code you might be interested in addressing it.

ty
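
To illustrate the concern above, here is a minimal, self-contained sketch
(plain pthreads, not DPDK code) of one way to sidestep it: each worker gets
its index through the start_routine argument, so it never reads the
pthread_t slot that pthread_create() may still be filling in. NUM_WORKERS,
seen_id, worker_loop and run_workers are hypothetical names used only for
this illustration.

```c
#include <pthread.h>
#include <stdint.h>

#define NUM_WORKERS 4 /* hypothetical worker count for this sketch */

/* Each worker records the index it received so the caller can check it. */
static unsigned int seen_id[NUM_WORKERS];

static void *
worker_loop(void *arg)
{
	/* The index travels by value inside the pointer argument, so the
	 * worker does not depend on when the creator's pthread_t is stored.
	 */
	unsigned int id = (unsigned int)(uintptr_t)arg;

	seen_id[id] = id + 1; /* +1 so that 0 still means "never ran" */
	return NULL;
}

/* Start all workers and wait for them. Returns 0 on success, -1 if a
 * thread could not be created (a real program would also clean up the
 * threads already started).
 */
static int
run_workers(void)
{
	pthread_t tid[NUM_WORKERS];
	unsigned int i;

	for (i = 0; i < NUM_WORKERS; i++) {
		if (pthread_create(&tid[i], NULL, worker_loop,
				(void *)(uintptr_t)i) != 0)
			return -1;
	}
	for (i = 0; i < NUM_WORKERS; i++)
		pthread_join(tid[i], NULL);

	return 0;
}
```

The cast through uintptr_t keeps the integer-to-pointer round trip well
defined for small indices; the pthread_t values are used only by the
creating thread, for pthread_join().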
  
David Marchand March 24, 2022, 2:44 p.m. UTC | #3
On Thu, Mar 24, 2022 at 9:31 AM Tyler Retzlaff
<roretzla@linux.microsoft.com> wrote:
> > diff --git a/lib/eal/common/eal_common_thread.c b/lib/eal/common/eal_common_thread.c
> > index 684bea166c..256de91abc 100644
> > --- a/lib/eal/common/eal_common_thread.c
> > +++ b/lib/eal/common/eal_common_thread.c
> > @@ -9,6 +9,7 @@
> >  #include <assert.h>
> >  #include <string.h>
> >
> > +#include <rte_eal_trace.h>
> >  #include <rte_errno.h>
> >  #include <rte_lcore.h>
> >  #include <rte_log.h>
> > @@ -163,6 +164,77 @@ __rte_thread_uninit(void)
> >       RTE_PER_LCORE(_lcore_id) = LCORE_ID_ANY;
> >  }
> >
> > +/* main loop of threads */
> > +__rte_noreturn void *
> > +eal_thread_loop(__rte_unused void *arg)
> > +{
> > +     char cpuset[RTE_CPU_AFFINITY_STR_LEN];
> > +     pthread_t thread_id = pthread_self();
> > +     unsigned int lcore_id;
> > +     int ret;
> > +
> > +     /* retrieve our lcore_id from the configuration structure */
> > +     RTE_LCORE_FOREACH_WORKER(lcore_id) {
> > +             if (thread_id == lcore_config[lcore_id].thread_id)
>                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> i can see that in practice this isn't a problem since the linux
> implementation of pthread_create(3) stores to pthread_t *thread before
> executing start_routine.
>
> but strictly speaking i don't think the pthread_create api contractually
> guarantees that the thread id is stored before start_routine runs. so this
> is relying on an internal implementation detail.
>
> https://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_create.html
>
>   "Upon successful completion, pthread_create() shall store the ID of the
>    created thread in the location referenced by thread."
>
> https://man7.org/linux/man-pages/man3/pthread_create.3.html
>
>   "Before returning, a successful call to pthread_create() stores
>    the ID of the new thread in the buffer pointed to by thread; this
>    identifier is used to refer to the thread in subsequent calls to
>    other pthreads functions."
>
> it doesn't really say when it does this in relation to start_routine running.
> depends how hair splitty you want to be about it. but since you're revamping
> the code you might be interested in addressing it.

I had wondered about this part too in the past.

I don't see a reason to keep this loop (even considering baremetal,
since this code is within the linux implementation of EAL).
And this comment seems a good reason to cleanup the code (like simply
pass lcore_id via arg).

Something like:

Author: David Marchand <david.marchand@redhat.com>
Date:   Thu Mar 24 11:29:46 2022 +0100

    eal: cleanup lcore hand-over from main thread

    As noted by Tyler, there is nothing in the pthread API that strictly
    guarantees that the new thread won't start running eal_thread_loop
    before pthread_create writes to &lcore_config[xx].thread_id.

    Rather than rely on thread id, the main thread can directly pass the
    worker thread lcore.

    Signed-off-by: David Marchand <david.marchand@redhat.com>

diff --git a/lib/eal/common/eal_common_thread.c b/lib/eal/common/eal_common_thread.c
index 256de91abc..962b7e9ac4 100644
--- a/lib/eal/common/eal_common_thread.c
+++ b/lib/eal/common/eal_common_thread.c
@@ -166,26 +166,17 @@ __rte_thread_uninit(void)

 /* main loop of threads */
 __rte_noreturn void *
-eal_thread_loop(__rte_unused void *arg)
+eal_thread_loop(void *arg)
 {
+    unsigned int lcore_id = (uintptr_t)arg;
     char cpuset[RTE_CPU_AFFINITY_STR_LEN];
-    pthread_t thread_id = pthread_self();
-    unsigned int lcore_id;
     int ret;

-    /* retrieve our lcore_id from the configuration structure */
-    RTE_LCORE_FOREACH_WORKER(lcore_id) {
-        if (thread_id == lcore_config[lcore_id].thread_id)
-            break;
-    }
-    if (lcore_id == RTE_MAX_LCORE)
-        rte_panic("cannot retrieve lcore id\n");
-
     __rte_thread_init(lcore_id, &lcore_config[lcore_id].cpuset);

     ret = eal_thread_dump_current_affinity(cpuset, sizeof(cpuset));
     RTE_LOG(DEBUG, EAL, "lcore %u is ready (tid=%zx;cpuset=[%s%s])\n",
-        lcore_id, (uintptr_t)thread_id, cpuset,
+        lcore_id, (uintptr_t)pthread_self(), cpuset,
         ret == 0 ? "" : "...");

     rte_eal_trace_thread_lcore_ready(lcore_id, cpuset);
diff --git a/lib/eal/common/eal_thread.h b/lib/eal/common/eal_thread.h
index b08dcf34b5..0fde33e70c 100644
--- a/lib/eal/common/eal_thread.h
+++ b/lib/eal/common/eal_thread.h
@@ -11,7 +11,7 @@
  * basic loop of thread, called for each thread by eal_init().
  *
  * @param arg
- *   opaque pointer
+ *   The lcore_id (passed as an integer) of this worker thread.
  */
 __rte_noreturn void *eal_thread_loop(void *arg);

diff --git a/lib/eal/freebsd/eal.c b/lib/eal/freebsd/eal.c
index 80bc3d25e0..a6b20960f2 100644
--- a/lib/eal/freebsd/eal.c
+++ b/lib/eal/freebsd/eal.c
@@ -810,7 +810,7 @@ rte_eal_init(int argc, char **argv)

         /* create a thread for each lcore */
         ret = pthread_create(&lcore_config[i].thread_id, NULL,
-                     eal_thread_loop, NULL);
+                     eal_thread_loop, (void *)(uintptr_t)i);
         if (ret != 0)
             rte_panic("Cannot create thread\n");

diff --git a/lib/eal/linux/eal.c b/lib/eal/linux/eal.c
index 8a405d1d59..1ef263434a 100644
--- a/lib/eal/linux/eal.c
+++ b/lib/eal/linux/eal.c
@@ -1145,7 +1145,7 @@ rte_eal_init(int argc, char **argv)

         /* create a thread for each lcore */
         ret = pthread_create(&lcore_config[i].thread_id, NULL,
-                     eal_thread_loop, NULL);
+                     eal_thread_loop, (void *)(uintptr_t)i);
         if (ret != 0)
             rte_panic("Cannot create thread\n");

diff --git a/lib/eal/windows/eal.c b/lib/eal/windows/eal.c
index ca3c41aaa7..1874f9f6d7 100644
--- a/lib/eal/windows/eal.c
+++ b/lib/eal/windows/eal.c
@@ -420,7 +420,7 @@ rte_eal_init(int argc, char **argv)
         lcore_config[i].state = WAIT;

         /* create a thread for each lcore */
-        if (eal_thread_create(&lcore_config[i].thread_id) != 0)
+        if (eal_thread_create(&lcore_config[i].thread_id, i) != 0)
             rte_panic("Cannot create thread\n");
         ret = pthread_setaffinity_np(lcore_config[i].thread_id,
             sizeof(rte_cpuset_t), &lcore_config[i].cpuset);
diff --git a/lib/eal/windows/eal_thread.c b/lib/eal/windows/eal_thread.c
index de1c0078a5..704781a83c 100644
--- a/lib/eal/windows/eal_thread.c
+++ b/lib/eal/windows/eal_thread.c
@@ -71,13 +71,14 @@ eal_thread_ack_command(void)

 /* function to create threads */
 int
-eal_thread_create(pthread_t *thread)
+eal_thread_create(pthread_t *thread, unsigned int lcore_id)
 {
     HANDLE th;

     th = CreateThread(NULL, 0,
         (LPTHREAD_START_ROUTINE)(ULONG_PTR)eal_thread_loop,
-                        NULL, 0, (LPDWORD)thread);
+                        (LPVOID)(uintptr_t)lcore_id, 0,
+                        (LPDWORD)thread);
     if (!th)
         return -1;



But seeing how this code has been there from day 1, I would not
request a backport.
  
Tyler Retzlaff March 25, 2022, 12:11 p.m. UTC | #4
On Thu, Mar 24, 2022 at 03:44:23PM +0100, David Marchand wrote:
> On Thu, Mar 24, 2022 at 9:31 AM Tyler Retzlaff
> <roretzla@linux.microsoft.com> wrote:
> > > diff --git a/lib/eal/common/eal_common_thread.c b/lib/eal/common/eal_common_thread.c
> > > index 684bea166c..256de91abc 100644
> > > --- a/lib/eal/common/eal_common_thread.c
> > > +++ b/lib/eal/common/eal_common_thread.c
> > > @@ -9,6 +9,7 @@
> > >  #include <assert.h>
> > >  #include <string.h>
> > >
> > > +#include <rte_eal_trace.h>
> > >  #include <rte_errno.h>
> > >  #include <rte_lcore.h>
> > >  #include <rte_log.h>
> > > @@ -163,6 +164,77 @@ __rte_thread_uninit(void)
> > >       RTE_PER_LCORE(_lcore_id) = LCORE_ID_ANY;
> > >  }
> > >
> > > +/* main loop of threads */
> > > +__rte_noreturn void *
> > > +eal_thread_loop(__rte_unused void *arg)
> > > +{
> > > +     char cpuset[RTE_CPU_AFFINITY_STR_LEN];
> > > +     pthread_t thread_id = pthread_self();
> > > +     unsigned int lcore_id;
> > > +     int ret;
> > > +
> > > +     /* retrieve our lcore_id from the configuration structure */
> > > +     RTE_LCORE_FOREACH_WORKER(lcore_id) {
> > > +             if (thread_id == lcore_config[lcore_id].thread_id)
> >                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> >
> > i can see that in practice this isn't a problem since the linux
> > implementation of pthread_create(3) stores to pthread_t *thread before
> > executing start_routine.
> >
> > but strictly speaking i don't think the pthread_create api contractually
> > guarantees that the thread id is stored before start_routine runs. so this
> > is relying on an internal implementation detail.
> >
> > https://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_create.html
> >
> >   "Upon successful completion, pthread_create() shall store the ID of the
> >    created thread in the location referenced by thread."
> >
> > https://man7.org/linux/man-pages/man3/pthread_create.3.html
> >
> >   "Before returning, a successful call to pthread_create() stores
> >    the ID of the new thread in the buffer pointed to by thread; this
> >    identifier is used to refer to the thread in subsequent calls to
> >    other pthreads functions."
> >
> > it doesn't really say when it does this in relation to start_routine running.
> > depends how hair splitty you want to be about it. but since you're revamping
> > the code you might be interested in addressing it.
> 
> I had wondered about this part too in the past.
> 
> I don't see a reason to keep this loop (even considering baremetal,
> since this code is within the linux implementation of EAL).
> And this comment seems a good reason to cleanup the code (like simply
> pass lcore_id via arg).
> 
> Something like:
> 
> Author: David Marchand <david.marchand@redhat.com>
> Date:   Thu Mar 24 11:29:46 2022 +0100
> 
>     eal: cleanup lcore hand-over from main thread
> 
>     As noted by Tyler, there is nothing in the pthread API that strictly
>     guarantees that the new thread won't start running eal_thread_loop
>     before pthread_create writes to &lcore_config[xx].thread_id.
> 
>     Rather than rely on thread id, the main thread can directly pass the
>     worker thread lcore.
> 
>     Signed-off-by: David Marchand <david.marchand@redhat.com>
> 
> diff --git a/lib/eal/common/eal_common_thread.c b/lib/eal/common/eal_common_thread.c
> index 256de91abc..962b7e9ac4 100644
> --- a/lib/eal/common/eal_common_thread.c
> +++ b/lib/eal/common/eal_common_thread.c
> @@ -166,26 +166,17 @@ __rte_thread_uninit(void)
> 
>  /* main loop of threads */
>  __rte_noreturn void *
> -eal_thread_loop(__rte_unused void *arg)
> +eal_thread_loop(void *arg)
>  {
> +    unsigned int lcore_id = (uintptr_t)arg;
>      char cpuset[RTE_CPU_AFFINITY_STR_LEN];
> -    pthread_t thread_id = pthread_self();
> -    unsigned int lcore_id;
>      int ret;
> 
> -    /* retrieve our lcore_id from the configuration structure */
> -    RTE_LCORE_FOREACH_WORKER(lcore_id) {
> -        if (thread_id == lcore_config[lcore_id].thread_id)
> -            break;
> -    }
> -    if (lcore_id == RTE_MAX_LCORE)
> -        rte_panic("cannot retrieve lcore id\n");
> -
>      __rte_thread_init(lcore_id, &lcore_config[lcore_id].cpuset);
> 
>      ret = eal_thread_dump_current_affinity(cpuset, sizeof(cpuset));
>      RTE_LOG(DEBUG, EAL, "lcore %u is ready (tid=%zx;cpuset=[%s%s])\n",
> -        lcore_id, (uintptr_t)thread_id, cpuset,
> +        lcore_id, (uintptr_t)pthread_self(), cpuset,
>          ret == 0 ? "" : "...");
> 
>      rte_eal_trace_thread_lcore_ready(lcore_id, cpuset);
> diff --git a/lib/eal/common/eal_thread.h b/lib/eal/common/eal_thread.h
> index b08dcf34b5..0fde33e70c 100644
> --- a/lib/eal/common/eal_thread.h
> +++ b/lib/eal/common/eal_thread.h
> @@ -11,7 +11,7 @@
>   * basic loop of thread, called for each thread by eal_init().
>   *
>   * @param arg
> - *   opaque pointer
> + *   The lcore_id (passed as an integer) of this worker thread.
>   */
>  __rte_noreturn void *eal_thread_loop(void *arg);
> 
> diff --git a/lib/eal/freebsd/eal.c b/lib/eal/freebsd/eal.c
> index 80bc3d25e0..a6b20960f2 100644
> --- a/lib/eal/freebsd/eal.c
> +++ b/lib/eal/freebsd/eal.c
> @@ -810,7 +810,7 @@ rte_eal_init(int argc, char **argv)
> 
>          /* create a thread for each lcore */
>          ret = pthread_create(&lcore_config[i].thread_id, NULL,
> -                     eal_thread_loop, NULL);
> +                     eal_thread_loop, (void *)(uintptr_t)i);
>          if (ret != 0)
>              rte_panic("Cannot create thread\n");
> 
> diff --git a/lib/eal/linux/eal.c b/lib/eal/linux/eal.c
> index 8a405d1d59..1ef263434a 100644
> --- a/lib/eal/linux/eal.c
> +++ b/lib/eal/linux/eal.c
> @@ -1145,7 +1145,7 @@ rte_eal_init(int argc, char **argv)
> 
>          /* create a thread for each lcore */
>          ret = pthread_create(&lcore_config[i].thread_id, NULL,
> -                     eal_thread_loop, NULL);
> +                     eal_thread_loop, (void *)(uintptr_t)i);
>          if (ret != 0)
>              rte_panic("Cannot create thread\n");
> 
> diff --git a/lib/eal/windows/eal.c b/lib/eal/windows/eal.c
> index ca3c41aaa7..1874f9f6d7 100644
> --- a/lib/eal/windows/eal.c
> +++ b/lib/eal/windows/eal.c
> @@ -420,7 +420,7 @@ rte_eal_init(int argc, char **argv)
>          lcore_config[i].state = WAIT;
> 
>          /* create a thread for each lcore */
> -        if (eal_thread_create(&lcore_config[i].thread_id) != 0)
> +        if (eal_thread_create(&lcore_config[i].thread_id, i) != 0)
>              rte_panic("Cannot create thread\n");
>          ret = pthread_setaffinity_np(lcore_config[i].thread_id,
>              sizeof(rte_cpuset_t), &lcore_config[i].cpuset);
> diff --git a/lib/eal/windows/eal_thread.c b/lib/eal/windows/eal_thread.c
> index de1c0078a5..704781a83c 100644
> --- a/lib/eal/windows/eal_thread.c
> +++ b/lib/eal/windows/eal_thread.c
> @@ -71,13 +71,14 @@ eal_thread_ack_command(void)
> 
>  /* function to create threads */
>  int
> -eal_thread_create(pthread_t *thread)
> +eal_thread_create(pthread_t *thread, unsigned int lcore_id)
>  {
>      HANDLE th;
> 
>      th = CreateThread(NULL, 0,
>          (LPTHREAD_START_ROUTINE)(ULONG_PTR)eal_thread_loop,
> -                        NULL, 0, (LPDWORD)thread);
> +                        (LPVOID)(uintptr_t)lcore_id, 0,
> +                        (LPDWORD)thread);
>      if (!th)
>          return -1;
> 
> 
> 
> But seeing how this code has been there from day 1, I would not
> request a backport.

this looks better to me: it ends up being a bit less code, and it solves
the problem in a general fashion for all platforms, including windows.

on windows the implementation does run the start_routine before assigning
thread which was addressed with this patch. (still not merged)
http://patchwork.dpdk.org/project/dpdk/list/?series=22094

it's likely your patch will be merged before mine so when that happens
i'll just quietly abandon mine.  however if some desire exists for a
backport the simpler patch i provided could be used.

on our downstream UT pipelines the bug was causing intermittent failure
of around 30% of the tests. i'm surprised the bug hasn't had a more
negative impact on the dpdk CI pipelines.
  
Tyler Retzlaff March 25, 2022, 12:23 p.m. UTC | #5
On Wed, Mar 23, 2022 at 10:30:01AM +0100, David Marchand wrote:
> All OS implementations provide the same main loop.
> Introduce helpers (shared for Linux and FreeBSD) to handle synchronisation
> between main and threads and factorize the rest as common code.
> Thread id are now logged as string in a common format across OS.
> 
> Signed-off-by: David Marchand <david.marchand@redhat.com>
> ---
> I had this patch in store for a long time.
> I don't particularly care about it, it's not fixing anything.
> But it seems a good cleanup/consolidation, so I rebased it and I am
> sending it to get feedback.

Acked-By: Tyler Retzlaff <roretzla@linux.microsoft.com>
  
Thomas Monjalon March 25, 2022, 2:58 p.m. UTC | #6
25/03/2022 13:11, Tyler Retzlaff:
> On Thu, Mar 24, 2022 at 03:44:23PM +0100, David Marchand wrote:
> > On Thu, Mar 24, 2022 at 9:31 AM Tyler Retzlaff
> > <roretzla@linux.microsoft.com> wrote:
> > > > diff --git a/lib/eal/common/eal_common_thread.c b/lib/eal/common/eal_common_thread.c
> > > > index 684bea166c..256de91abc 100644
> > > > --- a/lib/eal/common/eal_common_thread.c
> > > > +++ b/lib/eal/common/eal_common_thread.c
> > > > @@ -9,6 +9,7 @@
> > > >  #include <assert.h>
> > > >  #include <string.h>
> > > >
> > > > +#include <rte_eal_trace.h>
> > > >  #include <rte_errno.h>
> > > >  #include <rte_lcore.h>
> > > >  #include <rte_log.h>
> > > > @@ -163,6 +164,77 @@ __rte_thread_uninit(void)
> > > >       RTE_PER_LCORE(_lcore_id) = LCORE_ID_ANY;
> > > >  }
> > > >
> > > > +/* main loop of threads */
> > > > +__rte_noreturn void *
> > > > +eal_thread_loop(__rte_unused void *arg)
> > > > +{
> > > > +     char cpuset[RTE_CPU_AFFINITY_STR_LEN];
> > > > +     pthread_t thread_id = pthread_self();
> > > > +     unsigned int lcore_id;
> > > > +     int ret;
> > > > +
> > > > +     /* retrieve our lcore_id from the configuration structure */
> > > > +     RTE_LCORE_FOREACH_WORKER(lcore_id) {
> > > > +             if (thread_id == lcore_config[lcore_id].thread_id)
> > >                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > >
> > > i can see that in practice this isn't a problem since the linux
> > > implementation of pthread_create(3) stores to pthread_t *thread before
> > > executing start_routine.
> > >
> > > but strictly speaking i don't think the pthread_create api contractually
> > > guarantees that the thread id is stored before start_routine runs. so this
> > > is relying on an internal implementation detail.
> > >
> > > https://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_create.html
> > >
> > >   "Upon successful completion, pthread_create() shall store the ID of the
> > >    created thread in the location referenced by thread."
> > >
> > > https://man7.org/linux/man-pages/man3/pthread_create.3.html
> > >
> > >   "Before returning, a successful call to pthread_create() stores
> > >    the ID of the new thread in the buffer pointed to by thread; this
> > >    identifier is used to refer to the thread in subsequent calls to
> > >    other pthreads functions."
> > >
> > > it doesn't really say when it does this in relation to start_routine running.
> > > depends how hair splitty you want to be about it. but since you're revamping
> > > the code you might be interested in addressing it.
> > 
> > I had wondered about this part too in the past.
> > 
> > I don't see a reason to keep this loop (even considering baremetal,
> > since this code is within the linux implementation of EAL).
> > And this comment seems a good reason to cleanup the code (like simply
> > pass lcore_id via arg).
> > 
> > Something like:
> > 
> > Author: David Marchand <david.marchand@redhat.com>
> > Date:   Thu Mar 24 11:29:46 2022 +0100
> > 
> >     eal: cleanup lcore hand-over from main thread
> > 
> >     As noted by Tyler, there is nothing in the pthread API that strictly
> >     guarantees that the new thread won't start running eal_thread_loop
> >     before pthread_create writes to &lcore_config[xx].thread_id.
> > 
> >     Rather than rely on thread id, the main thread can directly pass the
> >     worker thread lcore.
> > 
> >     Signed-off-by: David Marchand <david.marchand@redhat.com>
> > 
> > diff --git a/lib/eal/common/eal_common_thread.c b/lib/eal/common/eal_common_thread.c
> > index 256de91abc..962b7e9ac4 100644
> > --- a/lib/eal/common/eal_common_thread.c
> > +++ b/lib/eal/common/eal_common_thread.c
> > @@ -166,26 +166,17 @@ __rte_thread_uninit(void)
> > 
> >  /* main loop of threads */
> >  __rte_noreturn void *
> > -eal_thread_loop(__rte_unused void *arg)
> > +eal_thread_loop(void *arg)
> >  {
> > +    unsigned int lcore_id = (uintptr_t)arg;
> >      char cpuset[RTE_CPU_AFFINITY_STR_LEN];
> > -    pthread_t thread_id = pthread_self();
> > -    unsigned int lcore_id;
> >      int ret;
> > 
> > -    /* retrieve our lcore_id from the configuration structure */
> > -    RTE_LCORE_FOREACH_WORKER(lcore_id) {
> > -        if (thread_id == lcore_config[lcore_id].thread_id)
> > -            break;
> > -    }
> > -    if (lcore_id == RTE_MAX_LCORE)
> > -        rte_panic("cannot retrieve lcore id\n");
> > -
> >      __rte_thread_init(lcore_id, &lcore_config[lcore_id].cpuset);
> > 
> >      ret = eal_thread_dump_current_affinity(cpuset, sizeof(cpuset));
> >      RTE_LOG(DEBUG, EAL, "lcore %u is ready (tid=%zx;cpuset=[%s%s])\n",
> > -        lcore_id, (uintptr_t)thread_id, cpuset,
> > +        lcore_id, (uintptr_t)pthread_self(), cpuset,
> >          ret == 0 ? "" : "...");
> > 
> >      rte_eal_trace_thread_lcore_ready(lcore_id, cpuset);
> > diff --git a/lib/eal/common/eal_thread.h b/lib/eal/common/eal_thread.h
> > index b08dcf34b5..0fde33e70c 100644
> > --- a/lib/eal/common/eal_thread.h
> > +++ b/lib/eal/common/eal_thread.h
> > @@ -11,7 +11,7 @@
> >   * basic loop of thread, called for each thread by eal_init().
> >   *
> >   * @param arg
> > - *   opaque pointer
> > + *   The lcore_id (passed as an integer) of this worker thread.
> >   */
> >  __rte_noreturn void *eal_thread_loop(void *arg);
> > 
> > diff --git a/lib/eal/freebsd/eal.c b/lib/eal/freebsd/eal.c
> > index 80bc3d25e0..a6b20960f2 100644
> > --- a/lib/eal/freebsd/eal.c
> > +++ b/lib/eal/freebsd/eal.c
> > @@ -810,7 +810,7 @@ rte_eal_init(int argc, char **argv)
> > 
> >          /* create a thread for each lcore */
> >          ret = pthread_create(&lcore_config[i].thread_id, NULL,
> > -                     eal_thread_loop, NULL);
> > +                     eal_thread_loop, (void *)(uintptr_t)i);
> >          if (ret != 0)
> >              rte_panic("Cannot create thread\n");
> > 
> > diff --git a/lib/eal/linux/eal.c b/lib/eal/linux/eal.c
> > index 8a405d1d59..1ef263434a 100644
> > --- a/lib/eal/linux/eal.c
> > +++ b/lib/eal/linux/eal.c
> > @@ -1145,7 +1145,7 @@ rte_eal_init(int argc, char **argv)
> > 
> >          /* create a thread for each lcore */
> >          ret = pthread_create(&lcore_config[i].thread_id, NULL,
> > -                     eal_thread_loop, NULL);
> > +                     eal_thread_loop, (void *)(uintptr_t)i);
> >          if (ret != 0)
> >              rte_panic("Cannot create thread\n");
> > 
> > diff --git a/lib/eal/windows/eal.c b/lib/eal/windows/eal.c
> > index ca3c41aaa7..1874f9f6d7 100644
> > --- a/lib/eal/windows/eal.c
> > +++ b/lib/eal/windows/eal.c
> > @@ -420,7 +420,7 @@ rte_eal_init(int argc, char **argv)
> >          lcore_config[i].state = WAIT;
> > 
> >          /* create a thread for each lcore */
> > -        if (eal_thread_create(&lcore_config[i].thread_id) != 0)
> > +        if (eal_thread_create(&lcore_config[i].thread_id, i) != 0)
> >              rte_panic("Cannot create thread\n");
> >          ret = pthread_setaffinity_np(lcore_config[i].thread_id,
> >              sizeof(rte_cpuset_t), &lcore_config[i].cpuset);
> > diff --git a/lib/eal/windows/eal_thread.c b/lib/eal/windows/eal_thread.c
> > index de1c0078a5..704781a83c 100644
> > --- a/lib/eal/windows/eal_thread.c
> > +++ b/lib/eal/windows/eal_thread.c
> > @@ -71,13 +71,14 @@ eal_thread_ack_command(void)
> > 
> >  /* function to create threads */
> >  int
> > -eal_thread_create(pthread_t *thread)
> > +eal_thread_create(pthread_t *thread, unsigned int lcore_id)
> >  {
> >      HANDLE th;
> > 
> >      th = CreateThread(NULL, 0,
> >          (LPTHREAD_START_ROUTINE)(ULONG_PTR)eal_thread_loop,
> > -                        NULL, 0, (LPDWORD)thread);
> > +                        (LPVOID)(uintptr_t)lcore_id, 0,
> > +                        (LPDWORD)thread);
> >      if (!th)
> >          return -1;
> > 
> > 
> > 
> > But seeing how this code has been there from day 1, I would not
> > request a backport.
> 
> this looks better to me it ends up being a bit less code and it solves
> the problem in a general fashion for platforms including windows.
> 
> on windows the implementation does run the start_routine before assigning
> thread which was addressed with this patch. (still not merged)
> http://patchwork.dpdk.org/project/dpdk/list/?series=22094
> 
> it's likely your patch will be merged before mine so when that happens
> i'll just quietly abandon mine.  however if some desire exists for a
> backport the simpler patch i provided could be used.

Your patch could be merged now that we start a new cycle.
What do you prefer? Is David's solution better?
In this case, should we reject your patch?
  
David Marchand March 25, 2022, 3:09 p.m. UTC | #7
On Fri, Mar 25, 2022 at 3:58 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > But seeing how this code has been there from day 1, I would not
> > > request a backport.
> >
> > this looks better to me it ends up being a bit less code and it solves
> > the problem in a general fashion for platforms including windows.
> >
> > on windows the implementation does run the start_routine before assigning
> > thread which was addressed with this patch. (still not merged)
> > http://patchwork.dpdk.org/project/dpdk/list/?series=22094
> >
> > it's likely your patch will be merged before mine so when that happens
> > i'll just quietly abandon mine.  however if some desire exists for a
> > backport the simpler patch i provided could be used.
>
> Your patch could be merged now that we start a new cycle.
> What do you prefer? Is David's solution better?
> In this case, should we reject your patch?

We can merge Tyler's fix right away because it addresses a real issue on
Windows and it can be backported.

My series can be rebased and merged later as a cleanup/unified
solution for all OS.
  
Tyler Retzlaff March 25, 2022, 4:38 p.m. UTC | #8
On Fri, Mar 25, 2022 at 04:09:50PM +0100, David Marchand wrote:
> On Fri, Mar 25, 2022 at 3:58 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > > But seeing how this code has been there from day 1, I would not
> > > > request a backport.
> > >
> > > this looks better to me it ends up being a bit less code and it solves
> > > the problem in a general fashion for platforms including windows.
> > >
> > > on windows the implementation does run the start_routine before assigning
> > > thread which was addressed with this patch. (still not merged)
> > > http://patchwork.dpdk.org/project/dpdk/list/?series=22094
> > >
> > > it's likely your patch will be merged before mine so when that happens
> > > i'll just quietly abandon mine.  however if some desire exists for a
> > > backport the simpler patch i provided could be used.
> >
> > Your patch could be merged now that we start a new cycle.
> > What do you prefer? Is David's solution better?
> > In this case, should we reject your patch?
> 
> We can merge Tyler's fix right away because it addresses a real issue on
> Windows and it can be backported.
> 
> My series can be rebased and merged later as a cleanup/unified
> solution for all OS.

sounds about right to me. no objection here.

> 
> 
> -- 
> David Marchand
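
For readers following the thread, the main/worker handshake that the patch
splits into a common loop plus per-OS helpers can be sketched as a small
standalone POSIX program. This is only an illustration under assumptions,
not DPDK code: struct worker, write_byte(), read_byte(), remote_launch(),
wait_worker(), worker_loop() and run_sketch() are hypothetical names, the
worker returns on pipe EOF (the real eal_thread_loop() never returns), and
errors are reported through return values instead of rte_panic().

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>
#include <unistd.h>

enum worker_state { WAIT, RUNNING };
typedef int (worker_fn)(void *);

/* Hypothetical stand-in for one lcore_config[] entry. */
struct worker {
	int m2w[2];	/* main -> worker pipe */
	int w2m[2];	/* worker -> main pipe */
	worker_fn *f;
	void *arg;
	int ret;
	int state;
	pthread_t tid;
};

static int write_byte(int fd)
{
	char c = 0;
	ssize_t n;

	do {
		n = write(fd, &c, 1);
	} while (n == 0 || (n < 0 && errno == EINTR));
	return n < 0 ? -1 : 0;
}

static int read_byte(int fd)
{
	char c;
	ssize_t n;

	do {
		n = read(fd, &c, 1);
	} while (n < 0 && errno == EINTR);
	return n <= 0 ? -1 : 0;
}

/* Worker side: wait for a command, ack it, run f, go back to WAIT. */
static void *worker_loop(void *p)
{
	struct worker *w = p;
	worker_fn *f;

	for (;;) {
		/* Sketch only: EOF on the pipe ends the loop. */
		if (read_byte(w->m2w[0]) < 0)
			return NULL;
		__atomic_store_n(&w->state, RUNNING, __ATOMIC_RELEASE);
		if (write_byte(w->w2m[1]) < 0)
			return NULL;
		/* 'f' is the guard variable published by the main thread. */
		while ((f = __atomic_load_n(&w->f, __ATOMIC_ACQUIRE)) == NULL)
			sched_yield();
		w->ret = f(w->arg);
		w->f = NULL;
		w->arg = NULL;
		/* 'state' is the guard variable read by the main thread. */
		__atomic_store_n(&w->state, WAIT, __ATOMIC_RELEASE);
	}
}

/* Main side: publish f/arg, wake the worker, block until it acks. */
static int remote_launch(struct worker *w, worker_fn *f, void *arg)
{
	if (__atomic_load_n(&w->state, __ATOMIC_ACQUIRE) != WAIT)
		return -EBUSY;
	w->arg = arg;
	__atomic_store_n(&w->f, f, __ATOMIC_RELEASE);
	if (write_byte(w->m2w[1]) < 0 || read_byte(w->w2m[0]) < 0)
		return -EIO;
	return 0;
}

static int wait_worker(struct worker *w)
{
	while (__atomic_load_n(&w->state, __ATOMIC_ACQUIRE) != WAIT)
		sched_yield();
	return w->ret;
}

static int add_one(void *arg)
{
	return *(int *)arg + 1;
}

/* Launch add_one() on a worker thread and return its result. */
int run_sketch(int input)
{
	struct worker w = { .state = WAIT };
	int ret;

	if (pipe(w.m2w) != 0 || pipe(w.w2m) != 0)
		return -1;
	if (pthread_create(&w.tid, NULL, worker_loop, &w) != 0)
		return -1;
	if (remote_launch(&w, add_one, &input) != 0)
		return -1;
	ret = wait_worker(&w);
	close(w.m2w[1]);	/* EOF makes the worker return */
	pthread_join(w.tid, NULL);
	assert(w.state == WAIT);
	close(w.m2w[0]);
	close(w.w2m[0]);
	close(w.w2m[1]);
	return ret;
}
```

Calling run_sketch(41) from a main() built with -pthread launches add_one()
on the worker and returns its result. The pipe read/write pairs play the
role of eal_thread_wake_worker(), eal_thread_wait_command() and
eal_thread_ack_command(); everything else matches the part the patch moves
into common code.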
  

Patch

diff --git a/lib/eal/common/eal_common_launch.c b/lib/eal/common/eal_common_launch.c
index 9f393b9bda..5770803172 100644
--- a/lib/eal/common/eal_common_launch.c
+++ b/lib/eal/common/eal_common_launch.c
@@ -5,11 +5,13 @@ 
 #include <errno.h>
 
 #include <rte_launch.h>
+#include <rte_eal_trace.h>
 #include <rte_atomic.h>
 #include <rte_pause.h>
 #include <rte_lcore.h>
 
 #include "eal_private.h"
+#include "eal_thread.h"
 
 /*
  * Wait until a lcore finished its job.
@@ -18,12 +20,44 @@  int
 rte_eal_wait_lcore(unsigned worker_id)
 {
 	while (__atomic_load_n(&lcore_config[worker_id].state,
-					__ATOMIC_ACQUIRE) != WAIT)
+			__ATOMIC_ACQUIRE) != WAIT)
 		rte_pause();
 
 	return lcore_config[worker_id].ret;
 }
 
+/*
+ * Send a message to a worker lcore identified by worker_id to call a
+ * function f with argument arg. Once the execution is done, the
+ * remote lcore switches to WAIT state.
+ */
+int
+rte_eal_remote_launch(int (*f)(void *), void *arg, unsigned int worker_id)
+{
+	int rc = -EBUSY;
+
+	/* Check if the worker is in 'WAIT' state. Use acquire order
+	 * since 'state' variable is used as the guard variable.
+	 */
+	if (__atomic_load_n(&lcore_config[worker_id].state,
+			__ATOMIC_ACQUIRE) != WAIT)
+		goto finish;
+
+	lcore_config[worker_id].arg = arg;
+	/* Ensure that all the memory operations are completed
+	 * before the worker thread starts running the function.
+	 * Use worker thread function as the guard variable.
+	 */
+	__atomic_store_n(&lcore_config[worker_id].f, f, __ATOMIC_RELEASE);
+
+	eal_thread_wake_worker(worker_id);
+	rc = 0;
+
+finish:
+	rte_eal_trace_thread_remote_launch(f, arg, worker_id, rc);
+	return rc;
+}
+
 /*
  * Check that every WORKER lcores are in WAIT state, then call
  * rte_eal_remote_launch() for all of them. If call_main is true
diff --git a/lib/eal/common/eal_common_thread.c b/lib/eal/common/eal_common_thread.c
index 684bea166c..256de91abc 100644
--- a/lib/eal/common/eal_common_thread.c
+++ b/lib/eal/common/eal_common_thread.c
@@ -9,6 +9,7 @@ 
 #include <assert.h>
 #include <string.h>
 
+#include <rte_eal_trace.h>
 #include <rte_errno.h>
 #include <rte_lcore.h>
 #include <rte_log.h>
@@ -163,6 +164,77 @@  __rte_thread_uninit(void)
 	RTE_PER_LCORE(_lcore_id) = LCORE_ID_ANY;
 }
 
+/* main loop of threads */
+__rte_noreturn void *
+eal_thread_loop(__rte_unused void *arg)
+{
+	char cpuset[RTE_CPU_AFFINITY_STR_LEN];
+	pthread_t thread_id = pthread_self();
+	unsigned int lcore_id;
+	int ret;
+
+	/* retrieve our lcore_id from the configuration structure */
+	RTE_LCORE_FOREACH_WORKER(lcore_id) {
+		if (thread_id == lcore_config[lcore_id].thread_id)
+			break;
+	}
+	if (lcore_id == RTE_MAX_LCORE)
+		rte_panic("cannot retrieve lcore id\n");
+
+	__rte_thread_init(lcore_id, &lcore_config[lcore_id].cpuset);
+
+	ret = eal_thread_dump_current_affinity(cpuset, sizeof(cpuset));
+	RTE_LOG(DEBUG, EAL, "lcore %u is ready (tid=%zx;cpuset=[%s%s])\n",
+		lcore_id, (uintptr_t)thread_id, cpuset,
+		ret == 0 ? "" : "...");
+
+	rte_eal_trace_thread_lcore_ready(lcore_id, cpuset);
+
+	/* wait for commands from the main thread */
+	while (1) {
+		lcore_function_t *f;
+		void *fct_arg;
+
+		eal_thread_wait_command();
+
+		/* Set the state to 'RUNNING'. Use release order
+		 * since 'state' variable is used as the guard variable.
+		 */
+		__atomic_store_n(&lcore_config[lcore_id].state, RUNNING,
+			__ATOMIC_RELEASE);
+
+		eal_thread_ack_command();
+
+		/* Load 'f' with acquire order to ensure that
+		 * the memory operations from the main thread
+		 * are accessed only after update to 'f' is visible.
+		 * Wait till the update to 'f' is visible to the worker.
+		 */
+		while ((f = __atomic_load_n(&lcore_config[lcore_id].f,
+				__ATOMIC_ACQUIRE)) == NULL)
+			rte_pause();
+
+		/* call the function and store the return value */
+		fct_arg = lcore_config[lcore_id].arg;
+		ret = f(fct_arg);
+		lcore_config[lcore_id].ret = ret;
+		lcore_config[lcore_id].f = NULL;
+		lcore_config[lcore_id].arg = NULL;
+
+		/* Store the state with release order to ensure that
+		 * the memory operations from the worker thread
+		 * are completed before the state is updated.
+		 * Use 'state' as the guard variable.
+		 */
+		__atomic_store_n(&lcore_config[lcore_id].state, WAIT,
+			__ATOMIC_RELEASE);
+	}
+
+	/* never reached */
+	/* pthread_exit(NULL); */
+	/* return NULL; */
+}
+
 enum __rte_ctrl_thread_status {
 	CTRL_THREAD_LAUNCHING, /* Yet to call pthread_create function */
 	CTRL_THREAD_RUNNING, /* Control thread is running successfully */
diff --git a/lib/eal/common/eal_thread.h b/lib/eal/common/eal_thread.h
index 4a49117be8..b08dcf34b5 100644
--- a/lib/eal/common/eal_thread.h
+++ b/lib/eal/common/eal_thread.h
@@ -58,4 +58,31 @@  eal_thread_dump_affinity(rte_cpuset_t *cpuset, char *str, unsigned int size);
 int
 eal_thread_dump_current_affinity(char *str, unsigned int size);
 
+/**
+ * Called by the main thread to wake up a worker in 'WAIT' state.
+ * This function blocks until the worker acknowledges it started processing a
+ * new command.
+ * This function is private to EAL.
+ *
+ * @param worker_id
+ *   The lcore_id of a worker thread.
+ */
+void
+eal_thread_wake_worker(unsigned int worker_id);
+
+/**
+ * Called by a worker thread to sleep after entering 'WAIT' state.
+ * This function is private to EAL.
+ */
+void
+eal_thread_wait_command(void);
+
+/**
+ * Called by a worker thread to acknowledge new command after leaving 'WAIT'
+ * state.
+ * This function is private to EAL.
+ */
+void
+eal_thread_ack_command(void);
+
 #endif /* EAL_THREAD_H */
diff --git a/lib/eal/freebsd/eal.c b/lib/eal/freebsd/eal.c
index 71993fe25b..80bc3d25e0 100644
--- a/lib/eal/freebsd/eal.c
+++ b/lib/eal/freebsd/eal.c
@@ -579,7 +579,6 @@  int
 rte_eal_init(int argc, char **argv)
 {
 	int i, fctret, ret;
-	pthread_t thread_id;
 	static uint32_t run_once;
 	uint32_t has_run = 0;
 	char cpuset[RTE_CPU_AFFINITY_STR_LEN];
@@ -604,8 +603,6 @@  rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
-	thread_id = pthread_self();
-
 	eal_reset_internal_config(internal_conf);
 
 	/* clone argv to report out later in telemetry */
@@ -794,8 +791,8 @@  rte_eal_init(int argc, char **argv)
 
 	ret = eal_thread_dump_current_affinity(cpuset, sizeof(cpuset));
 
-	RTE_LOG(DEBUG, EAL, "Main lcore %u is ready (tid=%p;cpuset=[%s%s])\n",
-		config->main_lcore, thread_id, cpuset,
+	RTE_LOG(DEBUG, EAL, "Main lcore %u is ready (tid=%zx;cpuset=[%s%s])\n",
+		config->main_lcore, (uintptr_t)pthread_self(), cpuset,
 		ret == 0 ? "" : "...");
 
 	RTE_LCORE_FOREACH_WORKER(i) {
diff --git a/lib/eal/freebsd/eal_thread.c b/lib/eal/freebsd/eal_thread.c
index 3b18030d73..ab81b527bc 100644
--- a/lib/eal/freebsd/eal_thread.c
+++ b/lib/eal/freebsd/eal_thread.c
@@ -20,148 +20,10 @@ 
 #include <rte_per_lcore.h>
 #include <rte_eal.h>
 #include <rte_lcore.h>
-#include <rte_eal_trace.h>
 
 #include "eal_private.h"
 #include "eal_thread.h"
 
-/*
- * Send a message to a worker lcore identified by worker_id to call a
- * function f with argument arg. Once the execution is done, the
- * remote lcore switches to WAIT state.
- */
-int
-rte_eal_remote_launch(int (*f)(void *), void *arg, unsigned worker_id)
-{
-	int n;
-	char c = 0;
-	int m2w = lcore_config[worker_id].pipe_main2worker[1];
-	int w2m = lcore_config[worker_id].pipe_worker2main[0];
-	int rc = -EBUSY;
-
-	/* Check if the worker is in 'WAIT' state. Use acquire order
-	 * since 'state' variable is used as the guard variable.
-	 */
-	if (__atomic_load_n(&lcore_config[worker_id].state,
-					__ATOMIC_ACQUIRE) != WAIT)
-		goto finish;
-
-	lcore_config[worker_id].arg = arg;
-	/* Ensure that all the memory operations are completed
-	 * before the worker thread starts running the function.
-	 * Use worker thread function as the guard variable.
-	 */
-	__atomic_store_n(&lcore_config[worker_id].f, f, __ATOMIC_RELEASE);
-
-	/* send message */
-	n = 0;
-	while (n == 0 || (n < 0 && errno == EINTR))
-		n = write(m2w, &c, 1);
-	if (n < 0)
-		rte_panic("cannot write on configuration pipe\n");
-
-	/* wait ack */
-	do {
-		n = read(w2m, &c, 1);
-	} while (n < 0 && errno == EINTR);
-
-	if (n <= 0)
-		rte_panic("cannot read on configuration pipe\n");
-
-	rc = 0;
-finish:
-	rte_eal_trace_thread_remote_launch(f, arg, worker_id, rc);
-	return rc;
-}
-
-/* main loop of threads */
-__rte_noreturn void *
-eal_thread_loop(__rte_unused void *arg)
-{
-	char c;
-	int n, ret;
-	unsigned lcore_id;
-	pthread_t thread_id;
-	int m2w, w2m;
-	char cpuset[RTE_CPU_AFFINITY_STR_LEN];
-
-	thread_id = pthread_self();
-
-	/* retrieve our lcore_id from the configuration structure */
-	RTE_LCORE_FOREACH_WORKER(lcore_id) {
-		if (thread_id == lcore_config[lcore_id].thread_id)
-			break;
-	}
-	if (lcore_id == RTE_MAX_LCORE)
-		rte_panic("cannot retrieve lcore id\n");
-
-	m2w = lcore_config[lcore_id].pipe_main2worker[0];
-	w2m = lcore_config[lcore_id].pipe_worker2main[1];
-
-	__rte_thread_init(lcore_id, &lcore_config[lcore_id].cpuset);
-
-	ret = eal_thread_dump_current_affinity(cpuset, sizeof(cpuset));
-	RTE_LOG(DEBUG, EAL, "lcore %u is ready (tid=%p;cpuset=[%s%s])\n",
-		lcore_id, thread_id, cpuset, ret == 0 ? "" : "...");
-
-	rte_eal_trace_thread_lcore_ready(lcore_id, cpuset);
-
-	/* read on our pipe to get commands */
-	while (1) {
-		lcore_function_t *f;
-		void *fct_arg;
-
-		/* wait command */
-		do {
-			n = read(m2w, &c, 1);
-		} while (n < 0 && errno == EINTR);
-
-		if (n <= 0)
-			rte_panic("cannot read on configuration pipe\n");
-
-		/* Set the state to 'RUNNING'. Use release order
-		 * since 'state' variable is used as the guard variable.
-		 */
-		__atomic_store_n(&lcore_config[lcore_id].state, RUNNING,
-					__ATOMIC_RELEASE);
-
-		/* send ack */
-		n = 0;
-		while (n == 0 || (n < 0 && errno == EINTR))
-			n = write(w2m, &c, 1);
-		if (n < 0)
-			rte_panic("cannot write on configuration pipe\n");
-
-		/* Load 'f' with acquire order to ensure that
-		 * the memory operations from the main thread
-		 * are accessed only after update to 'f' is visible.
-		 * Wait till the update to 'f' is visible to the worker.
-		 */
-		while ((f = __atomic_load_n(&lcore_config[lcore_id].f,
-			__ATOMIC_ACQUIRE)) == NULL)
-			rte_pause();
-
-		/* call the function and store the return value */
-		fct_arg = lcore_config[lcore_id].arg;
-		ret = f(fct_arg);
-		lcore_config[lcore_id].ret = ret;
-		lcore_config[lcore_id].f = NULL;
-		lcore_config[lcore_id].arg = NULL;
-
-		/* Store the state with release order to ensure that
-		 * the memory operations from the worker thread
-		 * are completed before the state is updated.
-		 * Use 'state' as the guard variable.
-		 */
-		__atomic_store_n(&lcore_config[lcore_id].state, WAIT,
-					__ATOMIC_RELEASE);
-	}
-
-	/* never reached */
-	/* pthread_exit(NULL); */
-	/* return NULL; */
-}
-
 /* require calling thread tid by gettid() */
 int rte_sys_gettid(void)
 {
diff --git a/lib/eal/linux/eal.c b/lib/eal/linux/eal.c
index 025e5cc10d..8a405d1d59 100644
--- a/lib/eal/linux/eal.c
+++ b/lib/eal/linux/eal.c
@@ -862,7 +862,6 @@  int
 rte_eal_init(int argc, char **argv)
 {
 	int i, fctret, ret;
-	pthread_t thread_id;
 	static uint32_t run_once;
 	uint32_t has_run = 0;
 	const char *p;
@@ -890,7 +889,6 @@  rte_eal_init(int argc, char **argv)
 
 	p = strrchr(argv[0], '/');
 	strlcpy(logid, p ? p + 1 : argv[0], sizeof(logid));
-	thread_id = pthread_self();
 
 	eal_reset_internal_config(internal_conf);
 
@@ -1129,7 +1127,7 @@  rte_eal_init(int argc, char **argv)
 
 	ret = eal_thread_dump_current_affinity(cpuset, sizeof(cpuset));
 	RTE_LOG(DEBUG, EAL, "Main lcore %u is ready (tid=%zx;cpuset=[%s%s])\n",
-		config->main_lcore, (uintptr_t)thread_id, cpuset,
+		config->main_lcore, (uintptr_t)pthread_self(), cpuset,
 		ret == 0 ? "" : "...");
 
 	RTE_LCORE_FOREACH_WORKER(i) {
diff --git a/lib/eal/linux/eal_thread.c b/lib/eal/linux/eal_thread.c
index fa6cd7e2c4..820cc905e0 100644
--- a/lib/eal/linux/eal_thread.c
+++ b/lib/eal/linux/eal_thread.c
@@ -14,148 +14,11 @@ 
 #include <rte_log.h>
 #include <rte_eal.h>
 #include <rte_lcore.h>
-#include <rte_eal_trace.h>
+#include <rte_string_fns.h>
 
 #include "eal_private.h"
 #include "eal_thread.h"
 
-/*
- * Send a message to a worker lcore identified by worker_id to call a
- * function f with argument arg. Once the execution is done, the
- * remote lcore switches to WAIT state.
- */
-int
-rte_eal_remote_launch(int (*f)(void *), void *arg, unsigned int worker_id)
-{
-	int n;
-	char c = 0;
-	int m2w = lcore_config[worker_id].pipe_main2worker[1];
-	int w2m = lcore_config[worker_id].pipe_worker2main[0];
-	int rc = -EBUSY;
-
-	/* Check if the worker is in 'WAIT' state. Use acquire order
-	 * since 'state' variable is used as the guard variable.
-	 */
-	if (__atomic_load_n(&lcore_config[worker_id].state,
-					__ATOMIC_ACQUIRE) != WAIT)
-		goto finish;
-
-	lcore_config[worker_id].arg = arg;
-	/* Ensure that all the memory operations are completed
-	 * before the worker thread starts running the function.
-	 * Use worker thread function pointer as the guard variable.
-	 */
-	__atomic_store_n(&lcore_config[worker_id].f, f, __ATOMIC_RELEASE);
-
-	/* send message */
-	n = 0;
-	while (n == 0 || (n < 0 && errno == EINTR))
-		n = write(m2w, &c, 1);
-	if (n < 0)
-		rte_panic("cannot write on configuration pipe\n");
-
-	/* wait ack */
-	do {
-		n = read(w2m, &c, 1);
-	} while (n < 0 && errno == EINTR);
-
-	if (n <= 0)
-		rte_panic("cannot read on configuration pipe\n");
-
-	rc = 0;
-finish:
-	rte_eal_trace_thread_remote_launch(f, arg, worker_id, rc);
-	return rc;
-}
-
-/* main loop of threads */
-__rte_noreturn void *
-eal_thread_loop(__rte_unused void *arg)
-{
-	char c;
-	int n, ret;
-	unsigned lcore_id;
-	pthread_t thread_id;
-	int m2w, w2m;
-	char cpuset[RTE_CPU_AFFINITY_STR_LEN];
-
-	thread_id = pthread_self();
-
-	/* retrieve our lcore_id from the configuration structure */
-	RTE_LCORE_FOREACH_WORKER(lcore_id) {
-		if (thread_id == lcore_config[lcore_id].thread_id)
-			break;
-	}
-	if (lcore_id == RTE_MAX_LCORE)
-		rte_panic("cannot retrieve lcore id\n");
-
-	m2w = lcore_config[lcore_id].pipe_main2worker[0];
-	w2m = lcore_config[lcore_id].pipe_worker2main[1];
-
-	__rte_thread_init(lcore_id, &lcore_config[lcore_id].cpuset);
-
-	ret = eal_thread_dump_current_affinity(cpuset, sizeof(cpuset));
-	RTE_LOG(DEBUG, EAL, "lcore %u is ready (tid=%zx;cpuset=[%s%s])\n",
-		lcore_id, (uintptr_t)thread_id, cpuset, ret == 0 ? "" : "...");
-
-	rte_eal_trace_thread_lcore_ready(lcore_id, cpuset);
-
-	/* read on our pipe to get commands */
-	while (1) {
-		lcore_function_t *f;
-		void *fct_arg;
-
-		/* wait command */
-		do {
-			n = read(m2w, &c, 1);
-		} while (n < 0 && errno == EINTR);
-
-		if (n <= 0)
-			rte_panic("cannot read on configuration pipe\n");
-
-		/* Set the state to 'RUNNING'. Use release order
-		 * since 'state' variable is used as the guard variable.
-		 */
-		__atomic_store_n(&lcore_config[lcore_id].state, RUNNING,
-					__ATOMIC_RELEASE);
-
-		/* send ack */
-		n = 0;
-		while (n == 0 || (n < 0 && errno == EINTR))
-			n = write(w2m, &c, 1);
-		if (n < 0)
-			rte_panic("cannot write on configuration pipe\n");
-
-		/* Load 'f' with acquire order to ensure that
-		 * the memory operations from the main thread
-		 * are accessed only after update to 'f' is visible.
-		 * Wait till the update to 'f' is visible to the worker.
-		 */
-		while ((f = __atomic_load_n(&lcore_config[lcore_id].f,
-			__ATOMIC_ACQUIRE)) == NULL)
-			rte_pause();
-
-		/* call the function and store the return value */
-		fct_arg = lcore_config[lcore_id].arg;
-		ret = f(fct_arg);
-		lcore_config[lcore_id].ret = ret;
-		lcore_config[lcore_id].f = NULL;
-		lcore_config[lcore_id].arg = NULL;
-
-		/* Store the state with release order to ensure that
-		 * the memory operations from the worker thread
-		 * are completed before the state is updated.
-		 * Use 'state' as the guard variable.
-		 */
-		__atomic_store_n(&lcore_config[lcore_id].state, WAIT,
-					__ATOMIC_RELEASE);
-	}
-
-	/* never reached */
-	/* pthread_exit(NULL); */
-	/* return NULL; */
-}
-
 /* require calling thread tid by gettid() */
 int rte_sys_gettid(void)
 {
diff --git a/lib/eal/unix/eal_unix_thread.c b/lib/eal/unix/eal_unix_thread.c
new file mode 100644
index 0000000000..70b5ba6b98
--- /dev/null
+++ b/lib/eal/unix/eal_unix_thread.c
@@ -0,0 +1,63 @@ 
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2022 Red Hat, Inc.
+ */
+
+#include <errno.h>
+#include <unistd.h>
+
+#include <rte_debug.h>
+
+#include "eal_private.h"
+
+void
+eal_thread_wake_worker(unsigned int worker_id)
+{
+	int m2w = lcore_config[worker_id].pipe_main2worker[1];
+	int w2m = lcore_config[worker_id].pipe_worker2main[0];
+	char c = 0;
+	int n;
+
+	do {
+		n = write(m2w, &c, 1);
+	} while (n == 0 || (n < 0 && errno == EINTR));
+	if (n < 0)
+		rte_panic("cannot write on configuration pipe\n");
+
+	do {
+		n = read(w2m, &c, 1);
+	} while (n < 0 && errno == EINTR);
+	if (n <= 0)
+		rte_panic("cannot read on configuration pipe\n");
+}
+
+void
+eal_thread_wait_command(void)
+{
+	unsigned int lcore_id = rte_lcore_id();
+	int m2w;
+	char c;
+	int n;
+
+	m2w = lcore_config[lcore_id].pipe_main2worker[0];
+	do {
+		n = read(m2w, &c, 1);
+	} while (n < 0 && errno == EINTR);
+	if (n <= 0)
+		rte_panic("cannot read on configuration pipe\n");
+}
+
+void
+eal_thread_ack_command(void)
+{
+	unsigned int lcore_id = rte_lcore_id();
+	char c = 0;
+	int w2m;
+	int n;
+
+	w2m = lcore_config[lcore_id].pipe_worker2main[1];
+	do {
+		n = write(w2m, &c, 1);
+	} while (n == 0 || (n < 0 && errno == EINTR));
+	if (n < 0)
+		rte_panic("cannot write on configuration pipe\n");
+}
diff --git a/lib/eal/unix/meson.build b/lib/eal/unix/meson.build
index a22ea7cabc..781505ca90 100644
--- a/lib/eal/unix/meson.build
+++ b/lib/eal/unix/meson.build
@@ -3,9 +3,10 @@ 
 
 sources += files(
         'eal_file.c',
+        'eal_filesystem.c',
+        'eal_firmware.c',
         'eal_unix_memory.c',
+        'eal_unix_thread.c',
         'eal_unix_timer.c',
-        'eal_firmware.c',
-        'eal_filesystem.c',
         'rte_thread.c',
 )
diff --git a/lib/eal/windows/eal_thread.c b/lib/eal/windows/eal_thread.c
index 54fa93fa62..de1c0078a5 100644
--- a/lib/eal/windows/eal_thread.c
+++ b/lib/eal/windows/eal_thread.c
@@ -11,135 +11,62 @@ 
 #include <rte_per_lcore.h>
 #include <rte_common.h>
 #include <rte_memory.h>
-#include <eal_thread.h>
 
 #include "eal_private.h"
+#include "eal_thread.h"
 #include "eal_windows.h"
 
-/*
- * Send a message to a worker lcore identified by worker_id to call a
- * function f with argument arg. Once the execution is done, the
- * remote lcore switches to WAIT state.
- */
-int
-rte_eal_remote_launch(lcore_function_t *f, void *arg, unsigned int worker_id)
+void
+eal_thread_wake_worker(unsigned int worker_id)
 {
-	int n;
-	char c = 0;
 	int m2w = lcore_config[worker_id].pipe_main2worker[1];
 	int w2m = lcore_config[worker_id].pipe_worker2main[0];
+	char c = 0;
+	int n;
 
-	/* Check if the worker is in 'WAIT' state. Use acquire order
-	 * since 'state' variable is used as the guard variable.
-	 */
-	if (__atomic_load_n(&lcore_config[worker_id].state,
-					__ATOMIC_ACQUIRE) != WAIT)
-		return -EBUSY;
-
-	lcore_config[worker_id].arg = arg;
-	/* Ensure that all the memory operations are completed
-	 * before the worker thread starts running the function.
-	 * Use worker thread function as the guard variable.
-	 */
-	__atomic_store_n(&lcore_config[worker_id].f, f, __ATOMIC_RELEASE);
-
-	/* send message */
-	n = 0;
-	while (n == 0 || (n < 0 && errno == EINTR))
+	do {
 		n = _write(m2w, &c, 1);
+	} while (n == 0 || (n < 0 && errno == EINTR));
 	if (n < 0)
 		rte_panic("cannot write on configuration pipe\n");
 
-	/* wait ack */
 	do {
 		n = _read(w2m, &c, 1);
 	} while (n < 0 && errno == EINTR);
-
 	if (n <= 0)
 		rte_panic("cannot read on configuration pipe\n");
-
-	return 0;
 }
 
-/* main loop of threads */
-void *
-eal_thread_loop(void *arg __rte_unused)
+void
+eal_thread_wait_command(void)
 {
+	unsigned int lcore_id = rte_lcore_id();
+	int m2w;
 	char c;
-	int n, ret;
-	unsigned int lcore_id;
-	pthread_t thread_id;
-	int m2w, w2m;
-	char cpuset[RTE_CPU_AFFINITY_STR_LEN];
-
-	thread_id = pthread_self();
-
-	/* retrieve our lcore_id from the configuration structure */
-	RTE_LCORE_FOREACH_WORKER(lcore_id) {
-		if (thread_id == lcore_config[lcore_id].thread_id)
-			break;
-	}
-	if (lcore_id == RTE_MAX_LCORE)
-		rte_panic("cannot retrieve lcore id\n");
+	int n;
 
 	m2w = lcore_config[lcore_id].pipe_main2worker[0];
-	w2m = lcore_config[lcore_id].pipe_worker2main[1];
+	do {
+		n = _read(m2w, &c, 1);
+	} while (n < 0 && errno == EINTR);
+	if (n <= 0)
+		rte_panic("cannot read on configuration pipe\n");
+}
+
+void
+eal_thread_ack_command(void)
+{
+	unsigned int lcore_id = rte_lcore_id();
+	char c = 0;
+	int w2m;
+	int n;
 
-	__rte_thread_init(lcore_id, &lcore_config[lcore_id].cpuset);
-
-	RTE_LOG(DEBUG, EAL, "lcore %u is ready (tid=%zx;cpuset=[%s])\n",
-		lcore_id, (uintptr_t)thread_id, cpuset);
-
-	/* read on our pipe to get commands */
-	while (1) {
-		lcore_function_t *f;
-		void *fct_arg;
-
-		/* wait command */
-		do {
-			n = _read(m2w, &c, 1);
-		} while (n < 0 && errno == EINTR);
-
-		if (n <= 0)
-			rte_panic("cannot read on configuration pipe\n");
-
-		/* Set the state to 'RUNNING'. Use release order
-		 * since 'state' variable is used as the guard variable.
-		 */
-		__atomic_store_n(&lcore_config[lcore_id].state, RUNNING,
-					__ATOMIC_RELEASE);
-
-		/* send ack */
-		n = 0;
-		while (n == 0 || (n < 0 && errno == EINTR))
-			n = _write(w2m, &c, 1);
-		if (n < 0)
-			rte_panic("cannot write on configuration pipe\n");
-
-		/* Load 'f' with acquire order to ensure that
-		 * the memory operations from the main thread
-		 * are accessed only after update to 'f' is visible.
-		 * Wait till the update to 'f' is visible to the worker.
-		 */
-		while ((f = __atomic_load_n(&lcore_config[lcore_id].f,
-			__ATOMIC_ACQUIRE)) == NULL)
-			rte_pause();
-
-		/* call the function and store the return value */
-		fct_arg = lcore_config[lcore_id].arg;
-		ret = f(fct_arg);
-		lcore_config[lcore_id].ret = ret;
-		lcore_config[lcore_id].f = NULL;
-		lcore_config[lcore_id].arg = NULL;
-
-		/* Store the state with release order to ensure that
-		 * the memory operations from the worker thread
-		 * are completed before the state is updated.
-		 * Use 'state' as the guard variable.
-		 */
-		__atomic_store_n(&lcore_config[lcore_id].state, WAIT,
-					__ATOMIC_RELEASE);
-	}
+	w2m = lcore_config[lcore_id].pipe_worker2main[1];
+	do {
+		n = _write(w2m, &c, 1);
+	} while (n == 0 || (n < 0 && errno == EINTR));
+	if (n < 0)
+		rte_panic("cannot write on configuration pipe\n");
 }
 
 /* function to create threads */