[v4,0/8] support reinit flow

Message ID 20230815145023.1386003-1-okaya@kernel.org (mailing list archive)
Headers
Series support reinit flow |

Message

Sinan Kaya Aug. 15, 2023, 2:50 p.m. UTC
  From: Sinan Kaya <okaya@kernel.org>

We want to be able to call rte_eal_init() and rte_eal_cleanup()
APIs back to back for maintanance reasons.

Here is a summary of the code we have seen so far:

1. some code support getting called multiple times by keeping
a static variable.
2. some code initializes once but never clean up after them and
don't have a cleanup API.
3. some code assumes that they only get called once during the
lifecycle of the process.

Most changes in this patch center around following the #1 design
principle.

Why?

It is not always ideal to reinitialize a DPDK process. Memory needs
to be reinitialized, hugetables need to warm up etc.

Changed from

v1:
Fix checkpatch warnings

v2:
rebase to most recent DPDK.

v3:
pick up Stephen's "eal: cleanup plugins data" as a pre-requisite
patch.

Graham Whyte (1):
  eal: fixes for re-initialization issues

Sinan Kaya (6):
  tailq: skip init if already initialized
  eal_memzone: bail out on initialized
  memseg: init once
  eal_memory: skip initialization
  eal_interrupts: don't reinitialize threads
  eal: initialize worker threads once

Stephen Hemminger (1):
  eal: cleanup plugins data

 lib/eal/common/eal_common_memory.c  |  5 +++
 lib/eal/common/eal_common_memzone.c |  7 +++
 lib/eal/common/eal_common_options.c | 20 +++++++++
 lib/eal/common/eal_common_tailqs.c  | 20 ++++++---
 lib/eal/common/eal_options.h        |  1 +
 lib/eal/common/malloc_heap.c        |  7 +++
 lib/eal/linux/eal.c                 | 66 ++++++++++++++++-------------
 lib/eal/linux/eal_interrupts.c      |  7 +++
 lib/eal/linux/eal_memory.c          | 12 +++++-
 9 files changed, 108 insertions(+), 37 deletions(-)
  

Comments

Stephen Hemminger Aug. 15, 2023, 5:45 p.m. UTC | #1
On Tue, 15 Aug 2023 10:50:15 -0400
okaya@kernel.org wrote:

> From: Sinan Kaya <okaya@kernel.org>
> 
> We want to be able to call rte_eal_init() and rte_eal_cleanup()
> APIs back to back for maintanance reasons.
> 
> Here is a summary of the code we have seen so far:
> 
> 1. some code support getting called multiple times by keeping
> a static variable.
> 2. some code initializes once but never clean up after them and
> don't have a cleanup API.
> 3. some code assumes that they only get called once during the
> lifecycle of the process.
> 
> Most changes in this patch center around following the #1 design
> principle.
> 
> Why?
> 
> It is not always ideal to reinitialize a DPDK process. Memory needs
> to be reinitialized, hugetables need to warm up etc.


I am familiar with the backstory of why this is desirable in your case.
But others may not be. It will work for you, but for the wider the
range of libraries and drivers it probably won't.

As a compromise, can this restart be officially tagged as unsupported.
I.e. it may work for some drivers and libraries but not all of them.
If nothing else many parts of DPDK still do leak memory on cleanup
and currently this is harmless.

This has enough impact that it probably needs to wait past 23.11 release.

Could you add a test for restart into standalone tests and test-pmd?
  
Sinan Kaya Aug. 15, 2023, 6:58 p.m. UTC | #2
On Tue, 2023-08-15 at 10:45 -0700, Stephen Hemminger wrote:
> > Why?
> > It is not always ideal to reinitialize a DPDK process. Memory needs
> > to be reinitialized, hugetables need to warm up etc.
> 
> 
> 
> 
> I am familiar with the backstory of why this is desirable in your
> case.
> 
> But others may not be. It will work for you, but for the wider the
> 
> range of libraries and drivers it probably won't.
> 
> 

Fair enough.

> 
> As a compromise, can this restart be officially tagged as
> unsupported.

any pointers how to do this?

I have no idea how to mark something unsupported in code.
If this is acceptable in cover letter, I'm happy to do that too.

> 
> I.e. it may work for some drivers and libraries but not all of them.
> 
> If nothing else many parts of DPDK still do leak memory on cleanup
> 
> and currently this is harmless.
> 
> 
> 
> This has enough impact that it probably needs to wait past 23.11
> release.
> 
> 

Sure, no rush. Happy to wait for time slot and accumulate review
feedbacks.

> 
> Could you add a test for restart into standalone tests and test-pmd?

Will do.
  
Stephen Hemminger Aug. 15, 2023, 7:47 p.m. UTC | #3
On Tue, 15 Aug 2023 14:58:03 -0400
Sinan Kaya <okaya@kernel.org> wrote:

> > 
> > As a compromise, can this restart be officially tagged as
> > unsupported.  
> 
> any pointers how to do this?
> 
> I have no idea how to mark something unsupported in code.
> If this is acceptable in cover letter, I'm happy to do that too.

Put some additional notes in the rte_eal.h like:

diff --git a/lib/eal/include/rte_eal.h b/lib/eal/include/rte_eal.h
index 53c4a5519e61..348b340f006d 100644
--- a/lib/eal/include/rte_eal.h
+++ b/lib/eal/include/rte_eal.h
@@ -67,6 +67,11 @@ int rte_eal_iopl_init(void);
  * as possible in the application's main() function.
  * It puts the WORKER lcores in the WAIT state.
  *
+ * @warning
+ * It maybe possisble to call it again after rte_eal_cleanup().
+ * But this usage is dependent on libraries and drivers support which
+ * may not work. Use at your own risk.
+ *
  * @param argc
  *   A non-negative value.  If it is greater than 0, the array members
  *   for argv[0] through argv[argc] (non-inclusive) shall contain pointers

Or maybe in the Shutdown and Cleanup section of:
 doc/guides/prog_guide/env_abstraction_layer.rst