[RFC] eal: RFC to refactor rte_eal_init into sub-functions

Message ID 1698949164-20287-1-git-send-email-rahulgupt@linux.microsoft.com (mailing list archive)
State Superseded, archived
Delegated to: Thomas Monjalon
Headers
Series [RFC] eal: RFC to refactor rte_eal_init into sub-functions |

Checks

Context Check Description
ci/checkpatch warning coding style issues
ci/loongarch-compilation warning apply patch failure
ci/Intel-compilation success Compilation OK
ci/intel-Testing success Testing PASS
ci/intel-Functional success Functional PASS

Commit Message

Rahul Gupta Nov. 2, 2023, 6:19 p.m. UTC
  From: Rahul Gupta <rahulgupt@microsoft.com>

Initialization often requires rte_eal_init + rte_pktmbuf_pool_create
which can consume a total time of 500-600 ms:
a) For many devices FLR may take a significant chunk of time
   (200-250 ms in our use-case), this FLR is triggered during device
   probe in rte_eal_init().
b) rte_pktmbuf_pool_create() can consume upto 300-350 ms for
applications that require huge memory.

This cost is incurred on each restart (which happens in our use-case
during binary updates for servicing).
This patch provides an optimization using pthreads that appplications
can use and which can save 200-230ms.

In this patch, rte_eal_init() is refactored into two parts-
a) 1st part is dependent code ie- it’s a perquisite of the FLR and
   mempool creation. So this code needs to be executed before any
   pthreads. Its named as rte_eal_init_setup()
b) 2nd part of code is independent code ie- it can execute in parallel
   to mempool creation in a pthread. Its named as rte_probe_and_ioctl().

Existing applications require no changes unless they wish to leverage
the optimization.

If the application wants to use pthread functionality, it should call-
a) rte_eal_init_setup() then create two or more pthreads-
b) in one pthread call- rte_probe_and_ioctl(),
c) second pthread call- rte_pktmbuf_pool_create()
d) (optional) Other pthreads for  any other independent function.

Signed-off-by: Rahul Gupta <rahulgupt@linux.microsoft.com>
---
 lib/librte_eal/common/include/rte_eal.h |  2 ++
 lib/librte_eal/linux/eal/eal.c          | 18 ++++++++++++++++--
 lib/librte_eal/rte_eal_version.map      |  2 ++
 3 files changed, 20 insertions(+), 2 deletions(-)
  

Comments

Stephen Hemminger Nov. 2, 2023, 6:37 p.m. UTC | #1
On Thu,  2 Nov 2023 11:19:24 -0700
Rahul Gupta <rahulgupt@linux.microsoft.com> wrote:

> From: Rahul Gupta <rahulgupt@linux.microsoft.com>
> To: dev@dpdk.org,  thomas@monjalon.net
> Cc: sovaradh@linux.microsoft.com, okaya@kernel.org, sujithsankar@microsoft.com,  sowmini.varadhan@microsoft.com, rahulrgupta27@gmail.com,  Rahul Gupta <rahulgupt@microsoft.com>,  Rahul Gupta <rahulgupt@linux.microsoft.com>
> Subject: [RFC] eal: RFC to refactor rte_eal_init into sub-functions
> Date: Thu,  2 Nov 2023 11:19:24 -0700
> X-Mailer: git-send-email 1.8.3.1
> 
> From: Rahul Gupta <rahulgupt@microsoft.com>
> 
> Initialization often requires rte_eal_init + rte_pktmbuf_pool_create
> which can consume a total time of 500-600 ms:
> a) For many devices FLR may take a significant chunk of time
>    (200-250 ms in our use-case), this FLR is triggered during device
>    probe in rte_eal_init().
> b) rte_pktmbuf_pool_create() can consume upto 300-350 ms for
> applications that require huge memory.
> 
> This cost is incurred on each restart (which happens in our use-case
> during binary updates for servicing).
> This patch provides an optimization using pthreads that appplications
> can use and which can save 200-230ms.
> 
> In this patch, rte_eal_init() is refactored into two parts-
> a) 1st part is dependent code ie- it’s a perquisite of the FLR and
>    mempool creation. So this code needs to be executed before any
>    pthreads. Its named as rte_eal_init_setup()
> b) 2nd part of code is independent code ie- it can execute in parallel
>    to mempool creation in a pthread. Its named as rte_probe_and_ioctl().
> 
> Existing applications require no changes unless they wish to leverage
> the optimization.
> 
> If the application wants to use pthread functionality, it should call-
> a) rte_eal_init_setup() then create two or more pthreads-
> b) in one pthread call- rte_probe_and_ioctl(),
> c) second pthread call- rte_pktmbuf_pool_create()
> d) (optional) Other pthreads for  any other independent function.
> 
> Signed-off-by: Rahul Gupta <rahulgupt@linux.microsoft.com>

These probably marked internal rather than part of API/ABI.
  
rahul gupta Nov. 7, 2023, 5:33 p.m. UTC | #2
Hi Stephen,

Thanks for your review.
If I make it __rte_internal then, testpmd or our application can't use it.
So instead I am planning to make it __rte_experimental.

Regards,
Rahul.

On Fri, 3 Nov 2023 at 00:08, Stephen Hemminger <stephen@networkplumber.org>
wrote:

> On Thu,  2 Nov 2023 11:19:24 -0700
> Rahul Gupta <rahulgupt@linux.microsoft.com> wrote:
>
> > From: Rahul Gupta <rahulgupt@linux.microsoft.com>
> > To: dev@dpdk.org,  thomas@monjalon.net
> > Cc: sovaradh@linux.microsoft.com, okaya@kernel.org,
> sujithsankar@microsoft.com,  sowmini.varadhan@microsoft.com,
> rahulrgupta27@gmail.com,  Rahul Gupta <rahulgupt@microsoft.com>,  Rahul
> Gupta <rahulgupt@linux.microsoft.com>
> > Subject: [RFC] eal: RFC to refactor rte_eal_init into sub-functions
> > Date: Thu,  2 Nov 2023 11:19:24 -0700
> > X-Mailer: git-send-email 1.8.3.1
> >
> > From: Rahul Gupta <rahulgupt@microsoft.com>
> >
> > Initialization often requires rte_eal_init + rte_pktmbuf_pool_create
> > which can consume a total time of 500-600 ms:
> > a) For many devices FLR may take a significant chunk of time
> >    (200-250 ms in our use-case), this FLR is triggered during device
> >    probe in rte_eal_init().
> > b) rte_pktmbuf_pool_create() can consume upto 300-350 ms for
> > applications that require huge memory.
> >
> > This cost is incurred on each restart (which happens in our use-case
> > during binary updates for servicing).
> > This patch provides an optimization using pthreads that appplications
> > can use and which can save 200-230ms.
> >
> > In this patch, rte_eal_init() is refactored into two parts-
> > a) 1st part is dependent code ie- it’s a perquisite of the FLR and
> >    mempool creation. So this code needs to be executed before any
> >    pthreads. Its named as rte_eal_init_setup()
> > b) 2nd part of code is independent code ie- it can execute in parallel
> >    to mempool creation in a pthread. Its named as rte_probe_and_ioctl().
> >
> > Existing applications require no changes unless they wish to leverage
> > the optimization.
> >
> > If the application wants to use pthread functionality, it should call-
> > a) rte_eal_init_setup() then create two or more pthreads-
> > b) in one pthread call- rte_probe_and_ioctl(),
> > c) second pthread call- rte_pktmbuf_pool_create()
> > d) (optional) Other pthreads for  any other independent function.
> >
> > Signed-off-by: Rahul Gupta <rahulgupt@linux.microsoft.com>
>
> These probably marked internal rather than part of API/ABI.
>
  
Rahul Gupta Nov. 8, 2023, 4:38 a.m. UTC | #3
On (11/02/23 11:37), Stephen Hemminger wrote:
> Date: Thu, 2 Nov 2023 11:37:59 -0700
> From: Stephen Hemminger <stephen@networkplumber.org>
> To: Rahul Gupta <rahulgupt@linux.microsoft.com>
> Cc: dev@dpdk.org, thomas@monjalon.net, sovaradh@linux.microsoft.com,
>  okaya@kernel.org, sujithsankar@microsoft.com,
>  sowmini.varadhan@microsoft.com, rahulrgupta27@gmail.com, Rahul Gupta
>  <rahulgupt@microsoft.com>
> Subject: Re: [RFC] eal: RFC to refactor rte_eal_init into sub-functions
> X-Mailer: Claws Mail 4.1.1 (GTK 3.24.38; x86_64-redhat-linux-gnu)
> 
> On Thu,  2 Nov 2023 11:19:24 -0700
> Rahul Gupta <rahulgupt@linux.microsoft.com> wrote:
> 
> > From: Rahul Gupta <rahulgupt@linux.microsoft.com>
> > To: dev@dpdk.org,  thomas@monjalon.net
> > Cc: sovaradh@linux.microsoft.com, okaya@kernel.org, sujithsankar@microsoft.com,  sowmini.varadhan@microsoft.com, rahulrgupta27@gmail.com,  Rahul Gupta <rahulgupt@microsoft.com>,  Rahul Gupta <rahulgupt@linux.microsoft.com>
> > Subject: [RFC] eal: RFC to refactor rte_eal_init into sub-functions
> > Date: Thu,  2 Nov 2023 11:19:24 -0700
> > X-Mailer: git-send-email 1.8.3.1
> > 
> > From: Rahul Gupta <rahulgupt@microsoft.com>
> > 
> > Initialization often requires rte_eal_init + rte_pktmbuf_pool_create
> > which can consume a total time of 500-600 ms:
> > a) For many devices FLR may take a significant chunk of time
> >    (200-250 ms in our use-case), this FLR is triggered during device
> >    probe in rte_eal_init().
> > b) rte_pktmbuf_pool_create() can consume upto 300-350 ms for
> > applications that require huge memory.
> > 
> > This cost is incurred on each restart (which happens in our use-case
> > during binary updates for servicing).
> > This patch provides an optimization using pthreads that appplications
> > can use and which can save 200-230ms.
> > 
> > In this patch, rte_eal_init() is refactored into two parts-
> > a) 1st part is dependent code ie- it’s a perquisite of the FLR and
> >    mempool creation. So this code needs to be executed before any
> >    pthreads. Its named as rte_eal_init_setup()
> > b) 2nd part of code is independent code ie- it can execute in parallel
> >    to mempool creation in a pthread. Its named as rte_probe_and_ioctl().
> > 
> > Existing applications require no changes unless they wish to leverage
> > the optimization.
> > 
> > If the application wants to use pthread functionality, it should call-
> > a) rte_eal_init_setup() then create two or more pthreads-
> > b) in one pthread call- rte_probe_and_ioctl(),
> > c) second pthread call- rte_pktmbuf_pool_create()
> > d) (optional) Other pthreads for  any other independent function.
> > 
> > Signed-off-by: Rahul Gupta <rahulgupt@linux.microsoft.com>
> 
> These probably marked internal rather than part of API/ABI.

Hi Stephen,

Thanks for your review.
If I make it __rte_internal then, testpmd or our application can't use it.
So instead I am planning to make it __rte_experimental.

Regards,
Rahul.
  
Bruce Richardson Nov. 8, 2023, 11:51 a.m. UTC | #4
On Thu, Nov 02, 2023 at 11:19:24AM -0700, Rahul Gupta wrote:
> From: Rahul Gupta <rahulgupt@microsoft.com>
> 
> Initialization often requires rte_eal_init + rte_pktmbuf_pool_create
> which can consume a total time of 500-600 ms:
> a) For many devices FLR may take a significant chunk of time
>    (200-250 ms in our use-case), this FLR is triggered during device
>    probe in rte_eal_init().
> b) rte_pktmbuf_pool_create() can consume upto 300-350 ms for
> applications that require huge memory.
> 
> This cost is incurred on each restart (which happens in our use-case
> during binary updates for servicing).
> This patch provides an optimization using pthreads that appplications
> can use and which can save 200-230ms.
> 
> In this patch, rte_eal_init() is refactored into two parts-
> a) 1st part is dependent code ie- it’s a perquisite of the FLR and
>    mempool creation. So this code needs to be executed before any
>    pthreads. Its named as rte_eal_init_setup()
> b) 2nd part of code is independent code ie- it can execute in parallel
>    to mempool creation in a pthread. Its named as rte_probe_and_ioctl().
> 
> Existing applications require no changes unless they wish to leverage
> the optimization.
> 
> If the application wants to use pthread functionality, it should call-
> a) rte_eal_init_setup() then create two or more pthreads-
> b) in one pthread call- rte_probe_and_ioctl(),
> c) second pthread call- rte_pktmbuf_pool_create()
> d) (optional) Other pthreads for  any other independent function.
> 
> Signed-off-by: Rahul Gupta <rahulgupt@linux.microsoft.com>

Reading the description, this seems an interesting idea, and a good saving.

If I may, I wonder if I can suggest a slight alternative. Rather than
splitting EAL init into two functions like that, how about providing an
"rte_eal_init_async()" function, which does part 1, and then spawns a
thread for part 2, before returning. We can then provide an
rte_eal_init_done() [or eal_init_async_done()] function to allow apps to
resync and check for EAL being done.

The reason for suggesting this is that the naming and purpose of the APIs
may be a little clearer for the end user. Allowing the async init function
to create threads also allows possible future parallelism in the function
itself. For example, we could do probing of the devices themselves in
parallel.

Regards,
/Bruce
  
Dmitry Kozlyuk Nov. 8, 2023, 1:53 p.m. UTC | #5
2023-11-07 23:03 (UTC+0530), rahul gupta:
> > > From: Rahul Gupta <rahulgupt@linux.microsoft.com>
> > > To: dev@dpdk.org,  thomas@monjalon.net
> > > Cc: sovaradh@linux.microsoft.com, okaya@kernel.org,  
> > sujithsankar@microsoft.com,  sowmini.varadhan@microsoft.com,
> > rahulrgupta27@gmail.com,  Rahul Gupta <rahulgupt@microsoft.com>,  Rahul
> > Gupta <rahulgupt@linux.microsoft.com>  
> > > Subject: [RFC] eal: RFC to refactor rte_eal_init into sub-functions
> > > Date: Thu,  2 Nov 2023 11:19:24 -0700
> > > X-Mailer: git-send-email 1.8.3.1
> > >
> > > From: Rahul Gupta <rahulgupt@microsoft.com>
> > >
> > > Initialization often requires rte_eal_init + rte_pktmbuf_pool_create
> > > which can consume a total time of 500-600 ms:
> > > a) For many devices FLR may take a significant chunk of time
> > >    (200-250 ms in our use-case), this FLR is triggered during device
> > >    probe in rte_eal_init().
> > > b) rte_pktmbuf_pool_create() can consume upto 300-350 ms for
> > > applications that require huge memory.
> > >
> > > This cost is incurred on each restart (which happens in our use-case
> > > during binary updates for servicing).
> > > This patch provides an optimization using pthreads that appplications
> > > can use and which can save 200-230ms.
> > >
> > > In this patch, rte_eal_init() is refactored into two parts-
> > > a) 1st part is dependent code ie- it’s a perquisite of the FLR and
> > >    mempool creation. So this code needs to be executed before any
> > >    pthreads. Its named as rte_eal_init_setup()
> > > b) 2nd part of code is independent code ie- it can execute in parallel
> > >    to mempool creation in a pthread. Its named as rte_probe_and_ioctl().
> > >
> > > Existing applications require no changes unless they wish to leverage
> > > the optimization.
> > >
> > > If the application wants to use pthread functionality, it should call-
> > > a) rte_eal_init_setup() then create two or more pthreads-
> > > b) in one pthread call- rte_probe_and_ioctl(),
> > > c) second pthread call- rte_pktmbuf_pool_create()
> > > d) (optional) Other pthreads for  any other independent function.
> > >
> > > Signed-off-by: Rahul Gupta <rahulgupt@linux.microsoft.com>  

I doubt that the new API is required.
It is already possible to block all devices from automatic probing
with EAL options and then probe explicitly in any threads desired.
At the same time, this RFC shows a valuable optimization pattern,
so maybe it is worth having in DPDK as an example.
There are DPDK use cases when probing is completely unnecessary.
Exposing the initialization process stages makes it harder to refactor
and requires precise documentation of when and what is initialized
(for example, in this RFC rte_eal_init_setup()
does not make service core API usable yet).

P. S. You may be also interested in using `--huge-unlink=never`
to speed rte_pktmbuf_pool_create() during restarts:

https://doc.dpdk.org/guides/linux_gsg/linux_eal_parameters.html#id3
  
Thomas Monjalon Nov. 8, 2023, 3:40 p.m. UTC | #6
08/11/2023 14:53, Dmitry Kozlyuk:
> 2023-11-07 23:03 (UTC+0530), rahul gupta:
> > > > From: Rahul Gupta <rahulgupt@linux.microsoft.com>
> > > > To: dev@dpdk.org,  thomas@monjalon.net
> > > > Cc: sovaradh@linux.microsoft.com, okaya@kernel.org,  
> > > sujithsankar@microsoft.com,  sowmini.varadhan@microsoft.com,
> > > rahulrgupta27@gmail.com,  Rahul Gupta <rahulgupt@microsoft.com>,  Rahul
> > > Gupta <rahulgupt@linux.microsoft.com>  
> > > > Subject: [RFC] eal: RFC to refactor rte_eal_init into sub-functions
> > > > Date: Thu,  2 Nov 2023 11:19:24 -0700
> > > > X-Mailer: git-send-email 1.8.3.1
> > > >
> > > > From: Rahul Gupta <rahulgupt@microsoft.com>
> > > >
> > > > Initialization often requires rte_eal_init + rte_pktmbuf_pool_create
> > > > which can consume a total time of 500-600 ms:
> > > > a) For many devices FLR may take a significant chunk of time
> > > >    (200-250 ms in our use-case), this FLR is triggered during device
> > > >    probe in rte_eal_init().
> > > > b) rte_pktmbuf_pool_create() can consume upto 300-350 ms for
> > > > applications that require huge memory.
> > > >
> > > > This cost is incurred on each restart (which happens in our use-case
> > > > during binary updates for servicing).
> > > > This patch provides an optimization using pthreads that appplications
> > > > can use and which can save 200-230ms.
> > > >
> > > > In this patch, rte_eal_init() is refactored into two parts-
> > > > a) 1st part is dependent code ie- it’s a perquisite of the FLR and
> > > >    mempool creation. So this code needs to be executed before any
> > > >    pthreads. Its named as rte_eal_init_setup()
> > > > b) 2nd part of code is independent code ie- it can execute in parallel
> > > >    to mempool creation in a pthread. Its named as rte_probe_and_ioctl().
> > > >
> > > > Existing applications require no changes unless they wish to leverage
> > > > the optimization.
> > > >
> > > > If the application wants to use pthread functionality, it should call-
> > > > a) rte_eal_init_setup() then create two or more pthreads-
> > > > b) in one pthread call- rte_probe_and_ioctl(),
> > > > c) second pthread call- rte_pktmbuf_pool_create()
> > > > d) (optional) Other pthreads for  any other independent function.
> > > >
> > > > Signed-off-by: Rahul Gupta <rahulgupt@linux.microsoft.com>  
> 
> I doubt that the new API is required.
> It is already possible to block all devices from automatic probing
> with EAL options and then probe explicitly in any threads desired.
> At the same time, this RFC shows a valuable optimization pattern,
> so maybe it is worth having in DPDK as an example.
> There are DPDK use cases when probing is completely unnecessary.

It seems here we want to do the device probing,
but start it in parallel of other tasks.

> Exposing the initialization process stages makes it harder to refactor
> and requires precise documentation of when and what is initialized
> (for example, in this RFC rte_eal_init_setup()
> does not make service core API usable yet).

Yes the init order is sensitive, that's why we have a big init function.
But in general I would agree to try splitting it with necessary warnings
and explanations.

> P. S. You may be also interested in using `--huge-unlink=never`
> to speed rte_pktmbuf_pool_create() during restarts:
> 
> https://doc.dpdk.org/guides/linux_gsg/linux_eal_parameters.html#id3

Yes good tip :)
  
Thomas Monjalon Nov. 8, 2023, 3:40 p.m. UTC | #7
08/11/2023 12:51, Bruce Richardson:
> On Thu, Nov 02, 2023 at 11:19:24AM -0700, Rahul Gupta wrote:
> > From: Rahul Gupta <rahulgupt@microsoft.com>
> > 
> > Initialization often requires rte_eal_init + rte_pktmbuf_pool_create
> > which can consume a total time of 500-600 ms:
> > a) For many devices FLR may take a significant chunk of time
> >    (200-250 ms in our use-case), this FLR is triggered during device
> >    probe in rte_eal_init().
> > b) rte_pktmbuf_pool_create() can consume upto 300-350 ms for
> > applications that require huge memory.
> > 
> > This cost is incurred on each restart (which happens in our use-case
> > during binary updates for servicing).
> > This patch provides an optimization using pthreads that appplications
> > can use and which can save 200-230ms.
> > 
> > In this patch, rte_eal_init() is refactored into two parts-
> > a) 1st part is dependent code ie- it’s a perquisite of the FLR and
> >    mempool creation. So this code needs to be executed before any
> >    pthreads. Its named as rte_eal_init_setup()
> > b) 2nd part of code is independent code ie- it can execute in parallel
> >    to mempool creation in a pthread. Its named as rte_probe_and_ioctl().
> > 
> > Existing applications require no changes unless they wish to leverage
> > the optimization.
> > 
> > If the application wants to use pthread functionality, it should call-
> > a) rte_eal_init_setup() then create two or more pthreads-
> > b) in one pthread call- rte_probe_and_ioctl(),
> > c) second pthread call- rte_pktmbuf_pool_create()
> > d) (optional) Other pthreads for  any other independent function.
> > 
> > Signed-off-by: Rahul Gupta <rahulgupt@linux.microsoft.com>
> 
> Reading the description, this seems an interesting idea, and a good saving.
> 
> If I may, I wonder if I can suggest a slight alternative. Rather than
> splitting EAL init into two functions like that, how about providing an
> "rte_eal_init_async()" function, which does part 1, and then spawns a
> thread for part 2, before returning. We can then provide an
> rte_eal_init_done() [or eal_init_async_done()] function to allow apps to
> resync and check for EAL being done.
> 
> The reason for suggesting this is that the naming and purpose of the APIs
> may be a little clearer for the end user. Allowing the async init function
> to create threads also allows possible future parallelism in the function
> itself. For example, we could do probing of the devices themselves in
> parallel.

I like the idea of async init.
  
Rahul Gupta Nov. 9, 2023, 5:26 p.m. UTC | #8
On (11/08/23 16:40), Thomas Monjalon wrote:
> Date: Wed, 08 Nov 2023 16:40:07 +0100
> From: Thomas Monjalon <thomas@monjalon.net>
> To: rahul gupta <rahulrgupta27@gmail.com>, Dmitry Kozlyuk
>  <dmitry.kozliuk@gmail.com>
> Cc: Stephen Hemminger <stephen@networkplumber.org>, Rahul Gupta
>  <rahulgupt@linux.microsoft.com>, dev@dpdk.org,
>  sovaradh@linux.microsoft.com, okaya@kernel.org,
>  sujithsankar@microsoft.com, sowmini.varadhan@microsoft.com, Rahul Gupta
>  <rahulgupt@microsoft.com>
> Subject: Re: [RFC] eal: RFC to refactor rte_eal_init into sub-functions
> 
> 08/11/2023 14:53, Dmitry Kozlyuk:
> > 2023-11-07 23:03 (UTC+0530), rahul gupta:
> > > > > From: Rahul Gupta <rahulgupt@linux.microsoft.com>
> > > > > To: dev@dpdk.org,  thomas@monjalon.net
> > > > > Cc: sovaradh@linux.microsoft.com, okaya@kernel.org,  
> > > > sujithsankar@microsoft.com,  sowmini.varadhan@microsoft.com,
> > > > rahulrgupta27@gmail.com,  Rahul Gupta <rahulgupt@microsoft.com>,  Rahul
> > > > Gupta <rahulgupt@linux.microsoft.com>  
> > > > > Subject: [RFC] eal: RFC to refactor rte_eal_init into sub-functions
> > > > > Date: Thu,  2 Nov 2023 11:19:24 -0700
> > > > > X-Mailer: git-send-email 1.8.3.1
> > > > >
> > > > > From: Rahul Gupta <rahulgupt@microsoft.com>
> > > > >
> > > > > Initialization often requires rte_eal_init + rte_pktmbuf_pool_create
> > > > > which can consume a total time of 500-600 ms:
> > > > > a) For many devices FLR may take a significant chunk of time
> > > > >    (200-250 ms in our use-case), this FLR is triggered during device
> > > > >    probe in rte_eal_init().
> > > > > b) rte_pktmbuf_pool_create() can consume upto 300-350 ms for
> > > > > applications that require huge memory.
> > > > >
> > > > > This cost is incurred on each restart (which happens in our use-case
> > > > > during binary updates for servicing).
> > > > > This patch provides an optimization using pthreads that appplications
> > > > > can use and which can save 200-230ms.
> > > > >
> > > > > In this patch, rte_eal_init() is refactored into two parts-
> > > > > a) 1st part is dependent code ie- it’s a perquisite of the FLR and
> > > > >    mempool creation. So this code needs to be executed before any
> > > > >    pthreads. Its named as rte_eal_init_setup()
> > > > > b) 2nd part of code is independent code ie- it can execute in parallel
> > > > >    to mempool creation in a pthread. Its named as rte_probe_and_ioctl().
> > > > >
> > > > > Existing applications require no changes unless they wish to leverage
> > > > > the optimization.
> > > > >
> > > > > If the application wants to use pthread functionality, it should call-
> > > > > a) rte_eal_init_setup() then create two or more pthreads-
> > > > > b) in one pthread call- rte_probe_and_ioctl(),
> > > > > c) second pthread call- rte_pktmbuf_pool_create()
> > > > > d) (optional) Other pthreads for  any other independent function.
> > > > >
> > > > > Signed-off-by: Rahul Gupta <rahulgupt@linux.microsoft.com>  
> > 
> > I doubt that the new API is required.
> > It is already possible to block all devices from automatic probing
> > with EAL options and then probe explicitly in any threads desired.
> > At the same time, this RFC shows a valuable optimization pattern,
> > so maybe it is worth having in DPDK as an example.
> > There are DPDK use cases when probing is completely unnecessary.
> 
> It seems here we want to do the device probing,
> but start it in parallel of other tasks.
> 
> > Exposing the initialization process stages makes it harder to refactor
> > and requires precise documentation of when and what is initialized
> > (for example, in this RFC rte_eal_init_setup()
> > does not make service core API usable yet).
> 
> Yes the init order is sensitive, that's why we have a big init function.
> But in general I would agree to try splitting it with necessary warnings
> and explanations.
> 
> > P. S. You may be also interested in using `--huge-unlink=never`
> > to speed rte_pktmbuf_pool_create() during restarts:
> > 
> > https://doc.dpdk.org/guides/linux_gsg/linux_eal_parameters.html#id3
> 
> Yes good tip :)
> 
> 
Thank you for the comments. I will send a patch shortly.
eal_init_async(); //Internally forks a thread to do FLR.
/* Application can do other stuff, including mempool_create, possibly in
   multiple threads. If threads are forked, then application has to do any
   needed thread-joins */
eal_init_async_done(); //To sync with FLR thread.
  
Bruce Richardson Nov. 9, 2023, 5:32 p.m. UTC | #9
On Thu, Nov 09, 2023 at 09:26:27AM -0800, Rahul Gupta wrote:
> On (11/08/23 16:40), Thomas Monjalon wrote:
> > Date: Wed, 08 Nov 2023 16:40:07 +0100
> > From: Thomas Monjalon <thomas@monjalon.net>
> > To: rahul gupta <rahulrgupta27@gmail.com>, Dmitry Kozlyuk
> >  <dmitry.kozliuk@gmail.com>
> > Cc: Stephen Hemminger <stephen@networkplumber.org>, Rahul Gupta
> >  <rahulgupt@linux.microsoft.com>, dev@dpdk.org,
> >  sovaradh@linux.microsoft.com, okaya@kernel.org,
> >  sujithsankar@microsoft.com, sowmini.varadhan@microsoft.com, Rahul Gupta
> >  <rahulgupt@microsoft.com>
> > Subject: Re: [RFC] eal: RFC to refactor rte_eal_init into sub-functions
> > 
> > 08/11/2023 14:53, Dmitry Kozlyuk:
> > > 2023-11-07 23:03 (UTC+0530), rahul gupta:
> > > > > > From: Rahul Gupta <rahulgupt@linux.microsoft.com>
> > > > > > To: dev@dpdk.org,  thomas@monjalon.net
> > > > > > Cc: sovaradh@linux.microsoft.com, okaya@kernel.org,  
> > > > > sujithsankar@microsoft.com,  sowmini.varadhan@microsoft.com,
> > > > > rahulrgupta27@gmail.com,  Rahul Gupta <rahulgupt@microsoft.com>,  Rahul
> > > > > Gupta <rahulgupt@linux.microsoft.com>  
> > > > > > Subject: [RFC] eal: RFC to refactor rte_eal_init into sub-functions
> > > > > > Date: Thu,  2 Nov 2023 11:19:24 -0700
> > > > > > X-Mailer: git-send-email 1.8.3.1
> > > > > >
> > > > > > From: Rahul Gupta <rahulgupt@microsoft.com>
> > > > > >
> > > > > > Initialization often requires rte_eal_init + rte_pktmbuf_pool_create
> > > > > > which can consume a total time of 500-600 ms:
> > > > > > a) For many devices FLR may take a significant chunk of time
> > > > > >    (200-250 ms in our use-case), this FLR is triggered during device
> > > > > >    probe in rte_eal_init().
> > > > > > b) rte_pktmbuf_pool_create() can consume upto 300-350 ms for
> > > > > > applications that require huge memory.
> > > > > >
> > > > > > This cost is incurred on each restart (which happens in our use-case
> > > > > > during binary updates for servicing).
> > > > > > This patch provides an optimization using pthreads that appplications
> > > > > > can use and which can save 200-230ms.
> > > > > >
> > > > > > In this patch, rte_eal_init() is refactored into two parts-
> > > > > > a) 1st part is dependent code ie- it’s a perquisite of the FLR and
> > > > > >    mempool creation. So this code needs to be executed before any
> > > > > >    pthreads. Its named as rte_eal_init_setup()
> > > > > > b) 2nd part of code is independent code ie- it can execute in parallel
> > > > > >    to mempool creation in a pthread. Its named as rte_probe_and_ioctl().
> > > > > >
> > > > > > Existing applications require no changes unless they wish to leverage
> > > > > > the optimization.
> > > > > >
> > > > > > If the application wants to use pthread functionality, it should call-
> > > > > > a) rte_eal_init_setup() then create two or more pthreads-
> > > > > > b) in one pthread call- rte_probe_and_ioctl(),
> > > > > > c) second pthread call- rte_pktmbuf_pool_create()
> > > > > > d) (optional) Other pthreads for  any other independent function.
> > > > > >
> > > > > > Signed-off-by: Rahul Gupta <rahulgupt@linux.microsoft.com>  
> > > 
> > > I doubt that the new API is required.
> > > It is already possible to block all devices from automatic probing
> > > with EAL options and then probe explicitly in any threads desired.
> > > At the same time, this RFC shows a valuable optimization pattern,
> > > so maybe it is worth having in DPDK as an example.
> > > There are DPDK use cases when probing is completely unnecessary.
> > 
> > It seems here we want to do the device probing,
> > but start it in parallel of other tasks.
> > 
> > > Exposing the initialization process stages makes it harder to refactor
> > > and requires precise documentation of when and what is initialized
> > > (for example, in this RFC rte_eal_init_setup()
> > > does not make service core API usable yet).
> > 
> > Yes the init order is sensitive, that's why we have a big init function.
> > But in general I would agree to try splitting it with necessary warnings
> > and explanations.
> > 
> > > P. S. You may be also interested in using `--huge-unlink=never`
> > > to speed rte_pktmbuf_pool_create() during restarts:
> > > 
> > > https://doc.dpdk.org/guides/linux_gsg/linux_eal_parameters.html#id3
> > 
> > Yes good tip :)
> > 
> > 
> Thank you for the comments. I will send a patch shortly.
> eal_init_async(); //Internally forks a thread to do FLR.
> /* Application can do other stuff, including mempool_create, possibly in
>    multiple threads. If threads are forked, then application has to do any
>    needed thread-joins */
> eal_init_async_done(); //To sync with FLR thread.

Just to note, the documentation on rte_eal_init_async() needs to call out
very explicitly what DPDK APIs, if any, can be called before the call to
async_done().

/Bruce
  
Rahul Gupta Nov. 10, 2023, 5:25 p.m. UTC | #10
On (11/09/23 17:32), Bruce Richardson wrote:
> Date: Thu, 9 Nov 2023 17:32:31 +0000
> From: Bruce Richardson <bruce.richardson@intel.com>
> To: Rahul Gupta <rahulgupt@linux.microsoft.com>
> CC: Thomas Monjalon <thomas@monjalon.net>, rahul gupta
>  <rahulrgupta27@gmail.com>, Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>,
>  Stephen Hemminger <stephen@networkplumber.org>, dev@dpdk.org,
>  sovaradh@linux.microsoft.com, okaya@kernel.org,
>  sujithsankar@microsoft.com, sowmini.varadhan@microsoft.com
> Subject: Re: [RFC] eal: RFC to refactor rte_eal_init into sub-functions
> 
> On Thu, Nov 09, 2023 at 09:26:27AM -0800, Rahul Gupta wrote:
> > On (11/08/23 16:40), Thomas Monjalon wrote:
> > > Date: Wed, 08 Nov 2023 16:40:07 +0100
> > > From: Thomas Monjalon <thomas@monjalon.net>
> > > To: rahul gupta <rahulrgupta27@gmail.com>, Dmitry Kozlyuk
> > >  <dmitry.kozliuk@gmail.com>
> > > Cc: Stephen Hemminger <stephen@networkplumber.org>, Rahul Gupta
> > >  <rahulgupt@linux.microsoft.com>, dev@dpdk.org,
> > >  sovaradh@linux.microsoft.com, okaya@kernel.org,
> > >  sujithsankar@microsoft.com, sowmini.varadhan@microsoft.com, Rahul Gupta
> > >  <rahulgupt@microsoft.com>
> > > Subject: Re: [RFC] eal: RFC to refactor rte_eal_init into sub-functions
> > > 
> > > 08/11/2023 14:53, Dmitry Kozlyuk:
> > > > 2023-11-07 23:03 (UTC+0530), rahul gupta:
> > > > > > > From: Rahul Gupta <rahulgupt@linux.microsoft.com>
> > > > > > > To: dev@dpdk.org,  thomas@monjalon.net
> > > > > > > Cc: sovaradh@linux.microsoft.com, okaya@kernel.org,  
> > > > > > sujithsankar@microsoft.com,  sowmini.varadhan@microsoft.com,
> > > > > > rahulrgupta27@gmail.com,  Rahul Gupta <rahulgupt@microsoft.com>,  Rahul
> > > > > > Gupta <rahulgupt@linux.microsoft.com>  
> > > > > > > Subject: [RFC] eal: RFC to refactor rte_eal_init into sub-functions
> > > > > > > Date: Thu,  2 Nov 2023 11:19:24 -0700
> > > > > > > X-Mailer: git-send-email 1.8.3.1
> > > > > > >
> > > > > > > From: Rahul Gupta <rahulgupt@microsoft.com>
> > > > > > >
> > > > > > > Initialization often requires rte_eal_init + rte_pktmbuf_pool_create
> > > > > > > which can consume a total time of 500-600 ms:
> > > > > > > a) For many devices FLR may take a significant chunk of time
> > > > > > >    (200-250 ms in our use-case), this FLR is triggered during device
> > > > > > >    probe in rte_eal_init().
> > > > > > > b) rte_pktmbuf_pool_create() can consume upto 300-350 ms for
> > > > > > > applications that require huge memory.
> > > > > > >
> > > > > > > This cost is incurred on each restart (which happens in our use-case
> > > > > > > during binary updates for servicing).
> > > > > > > This patch provides an optimization using pthreads that appplications
> > > > > > > can use and which can save 200-230ms.
> > > > > > >
> > > > > > > In this patch, rte_eal_init() is refactored into two parts-
> > > > > > > a) 1st part is dependent code ie- it’s a perquisite of the FLR and
> > > > > > >    mempool creation. So this code needs to be executed before any
> > > > > > >    pthreads. Its named as rte_eal_init_setup()
> > > > > > > b) 2nd part of code is independent code ie- it can execute in parallel
> > > > > > >    to mempool creation in a pthread. Its named as rte_probe_and_ioctl().
> > > > > > >
> > > > > > > Existing applications require no changes unless they wish to leverage
> > > > > > > the optimization.
> > > > > > >
> > > > > > > If the application wants to use pthread functionality, it should call-
> > > > > > > a) rte_eal_init_setup() then create two or more pthreads-
> > > > > > > b) in one pthread call- rte_probe_and_ioctl(),
> > > > > > > c) second pthread call- rte_pktmbuf_pool_create()
> > > > > > > d) (optional) Other pthreads for  any other independent function.
> > > > > > >
> > > > > > > Signed-off-by: Rahul Gupta <rahulgupt@linux.microsoft.com>  
> > > > 
> > > > I doubt that the new API is required.
> > > > It is already possible to block all devices from automatic probing
> > > > with EAL options and then probe explicitly in any threads desired.
> > > > At the same time, this RFC shows a valuable optimization pattern,
> > > > so maybe it is worth having in DPDK as an example.
> > > > There are DPDK use cases when probing is completely unnecessary.
> > > 
> > > It seems here we want to do the device probing,
> > > but start it in parallel of other tasks.
> > > 
> > > > Exposing the initialization process stages makes it harder to refactor
> > > > and requires precise documentation of when and what is initialized
> > > > (for example, in this RFC rte_eal_init_setup()
> > > > does not make service core API usable yet).
> > > 
> > > Yes the init order is sensitive, that's why we have a big init function.
> > > But in general I would agree to try splitting it with necessary warnings
> > > and explanations.
> > > 
> > > > P. S. You may be also interested in using `--huge-unlink=never`
> > > > to speed rte_pktmbuf_pool_create() during restarts:
> > > > 
> > > > https://doc.dpdk.org/guides/linux_gsg/linux_eal_parameters.html#id3
> > > 
> > > Yes good tip :)
> > > 
> > > 
> > Thank you for the comments. I will send a patch shortly.
> > eal_init_async(); //Internally forks a thread to do FLR.
> > /* Application can do other stuff, including mempool_create, possibly in
> >    multiple threads. If threads are forked, then application has to do any
> >    needed thread-joins */
> > eal_init_async_done(); //To sync with FLR thread.
> 
> Just to note, the documentation on rte_eal_init_async() needs to call out
> very explicitly what DPDK APIs, if any, can be called before the call to
> async_done().
> 
> /Bruce

Yes, I will document it in commit log and near code.

Regards,
Rahul.
  

Patch

diff --git a/lib/librte_eal/common/include/rte_eal.h b/lib/librte_eal/common/include/rte_eal.h
index 2f9ed29..90db16b 100644
--- a/lib/librte_eal/common/include/rte_eal.h
+++ b/lib/librte_eal/common/include/rte_eal.h
@@ -129,6 +129,8 @@  enum rte_proc_type_t {
  *     ENOEXEC indicates that a service core failed to launch successfully.
  */
 int rte_eal_init(int argc, char **argv);
+int rte_eal_init_setup(int argc, char **argv);
+int rte_probe_and_ioctl(void);
 
 /**
  * Clean up the Environment Abstraction Layer (EAL)
diff --git a/lib/librte_eal/linux/eal/eal.c b/lib/librte_eal/linux/eal/eal.c
index 5cabba8..6f2a4d0 100644
--- a/lib/librte_eal/linux/eal/eal.c
+++ b/lib/librte_eal/linux/eal/eal.c
@@ -956,9 +956,8 @@  static void rte_eal_init_alert(const char *msg)
 	return n > 2;
 }
 
-/* Launch threads, called at application init(). */
 int
-rte_eal_init(int argc, char **argv)
+rte_eal_init_setup(int argc, char **argv)
 {
 	int i, fctret, ret;
 	pthread_t thread_id;
@@ -1246,7 +1245,13 @@  static void rte_eal_init_alert(const char *msg)
 	 */
 	rte_eal_mp_remote_launch(sync_func, NULL, SKIP_MASTER);
 	rte_eal_mp_wait_lcore();
+	return fctret;
+}
 
+int
+rte_probe_and_ioctl(void)
+{
+	int ret = 0;
 	/* initialize services so vdevs register service during bus_probe. */
 	ret = rte_service_init();
 	if (ret) {
@@ -1296,7 +1301,16 @@  static void rte_eal_init_alert(const char *msg)
 
 	/* Call each registered callback, if enabled */
 	rte_option_init();
+	return 0;	//return Success if control reaches here
+}
 
+/* Launch threads, called at application init(). */
+int
+rte_eal_init(int argc, char **argv)
+{
+	int fctret = rte_eal_init_setup(argc, argv);
+	if (fctret>=0)
+		fctret = rte_probe_and_ioctl();
 	return fctret;
 }
 
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index 6482539..9e6054f 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -53,6 +53,8 @@  DPDK_20.0 {
 	rte_eal_hotplug_remove;
 	rte_eal_hpet_init;
 	rte_eal_init;
+	rte_eal_init_setup;
+	rte_probe_and_ioctl;
 	rte_eal_iopl_init;
 	rte_eal_iova_mode;
 	rte_eal_lcore_role;