[RFC,v2] eal: increase passed max multi-process file descriptors

Message ID 20240308203737.208999-1-stephen@networkplumber.org (mailing list archive)
State Deferred
Delegated to: Thomas Monjalon
Headers
Series [RFC,v2] eal: increase passed max multi-process file descriptors |

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/loongarch-compilation success Compilation OK
ci/loongarch-unit-testing success Unit Testing PASS
ci/Intel-compilation success Compilation OK
ci/intel-Testing success Testing PASS
ci/github-robot: build fail github build: failed
ci/intel-Functional success Functional PASS
ci/iol-mellanox-Performance success Performance Testing PASS
ci/iol-abi-testing warning Testing issues
ci/iol-intel-Performance success Performance Testing PASS
ci/iol-intel-Functional success Functional Testing PASS
ci/iol-compile-amd64-testing success Testing PASS
ci/iol-unit-amd64-testing success Testing PASS
ci/iol-sample-apps-testing success Testing PASS
ci/iol-unit-arm64-testing success Testing PASS
ci/iol-compile-arm64-testing success Testing PASS
ci/iol-broadcom-Performance success Performance Testing PASS
ci/iol-broadcom-Functional success Functional Testing PASS

Commit Message

Stephen Hemminger March 8, 2024, 8:36 p.m. UTC
  Both XDP and TAP device are limited in the number of queues
because of limitations on the number of file descriptors that
are allowed. The original choice of 8 was too low; the allowed
maximum is 253 according to unix(7) man page.

This may look like a serious ABI breakage but it is not.
It is simpler for everyone if the limit is increased rather than
building a parallel set of calls.

The case that matters is older application registering MP support
with the newer version of EAL. In this case, since the old application
will always send the more compact structure (less possible fd's)
it is OK.

Request (for up to 8 fds) sent to EAL.
   - EAL only references up to num_fds.
   - The area past the old fd array is not accessed.

Reply callback:
   - EAL will pass pointer to the new (larger structure),
     the old callback will only look at the first part of
     the fd array (num_fds <= 8).

   - Since primary and secondary must both be from same DPDK version
     there is normal way that a reply with more fd's could be possible.
     The only case is the same as above, where application requested
     something that would break in old version and now succeeds.

The one possible incompatibility is that if application passed
a larger number of fd's (32?) and expected an error. Now it will
succeed and get passed through.

Fixes: bacaa2754017 ("eal: add channel for multi-process communication")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
v2 - show the simpler way to address with some minor ABI issue

 doc/guides/rel_notes/release_24_03.rst | 4 ++++
 lib/eal/include/rte_eal.h              | 2 +-
 2 files changed, 5 insertions(+), 1 deletion(-)
  

Comments

Stephen Hemminger March 14, 2024, 3:23 p.m. UTC | #1
On Fri,  8 Mar 2024 12:36:39 -0800
Stephen Hemminger <stephen@networkplumber.org> wrote:

> Both XDP and TAP device are limited in the number of queues
> because of limitations on the number of file descriptors that
> are allowed. The original choice of 8 was too low; the allowed
> maximum is 253 according to unix(7) man page.
> 
> This may look like a serious ABI breakage but it is not.
> It is simpler for everyone if the limit is increased rather than
> building a parallel set of calls.
> 
> The case that matters is older application registering MP support
> with the newer version of EAL. In this case, since the old application
> will always send the more compact structure (less possible fd's)
> it is OK.
> 
> Request (for up to 8 fds) sent to EAL.
>    - EAL only references up to num_fds.
>    - The area past the old fd array is not accessed.
> 
> Reply callback:
>    - EAL will pass pointer to the new (larger structure),
>      the old callback will only look at the first part of
>      the fd array (num_fds <= 8).
> 
>    - Since primary and secondary must both be from same DPDK version
>      there is normal way that a reply with more fd's could be possible.
>      The only case is the same as above, where application requested
>      something that would break in old version and now succeeds.
> 
> The one possible incompatibility is that if application passed
> a larger number of fd's (32?) and expected an error. Now it will
> succeed and get passed through.
> 
> Fixes: bacaa2754017 ("eal: add channel for multi-process communication")
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> ---
> v2 - show the simpler way to address with some minor ABI issue
> 
>  doc/guides/rel_notes/release_24_03.rst | 4 ++++
>  lib/eal/include/rte_eal.h              | 2 +-
>  2 files changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/doc/guides/rel_notes/release_24_03.rst b/doc/guides/rel_notes/release_24_03.rst
> index 932688ca4d82..1d33cfa15dfb 100644
> --- a/doc/guides/rel_notes/release_24_03.rst
> +++ b/doc/guides/rel_notes/release_24_03.rst
> @@ -225,6 +225,10 @@ API Changes
>  * ethdev: Renamed structure ``rte_flow_action_modify_data`` to be
>    ``rte_flow_field_data`` for more generic usage.
>  
> +* eal: The maximum number of file descriptors allowed to be passed in
> +  multi-process requests is increased from 8 to the maximum possible on
> +  Linux unix domain sockets 253. This allows for more queues on XDP and
> +  TAP device.
>  
>  ABI Changes
>  -----------
> diff --git a/lib/eal/include/rte_eal.h b/lib/eal/include/rte_eal.h
> index c2256f832e51..cd84fcdd1bdb 100644
> --- a/lib/eal/include/rte_eal.h
> +++ b/lib/eal/include/rte_eal.h
> @@ -155,7 +155,7 @@ int rte_eal_primary_proc_alive(const char *config_file_path);
>   */
>  bool rte_mp_disable(void);
>  
> -#define RTE_MP_MAX_FD_NUM	8    /* The max amount of fds */
> +#define RTE_MP_MAX_FD_NUM	253    /* The max amount of fds */
>  #define RTE_MP_MAX_NAME_LEN	64   /* The max length of action name */
>  #define RTE_MP_MAX_PARAM_LEN	256  /* The max length of param */
>  struct rte_mp_msg {


Rather than mess with versioning everything, probably better to just
hold off to 24.11 release and do the change there.

It will limit xdp and tap PMD's to 8 queues but no user has been
demanding more yet.
  
David Marchand March 14, 2024, 3:38 p.m. UTC | #2
On Thu, Mar 14, 2024 at 4:23 PM Stephen Hemminger
<stephen@networkplumber.org> wrote:
> Rather than mess with versioning everything, probably better to just
> hold off to 24.11 release and do the change there.
>
> It will limit xdp and tap PMD's to 8 queues but no user has been
> demanding more yet.

IIUC, this limitation only applies to multiprocess setups.

Waiting for next ABI seems the simpler approach if there is no
explicit ask for this change.
Until someone wants to send more than 253 fds :-).
  

Patch

diff --git a/doc/guides/rel_notes/release_24_03.rst b/doc/guides/rel_notes/release_24_03.rst
index 932688ca4d82..1d33cfa15dfb 100644
--- a/doc/guides/rel_notes/release_24_03.rst
+++ b/doc/guides/rel_notes/release_24_03.rst
@@ -225,6 +225,10 @@  API Changes
 * ethdev: Renamed structure ``rte_flow_action_modify_data`` to be
   ``rte_flow_field_data`` for more generic usage.
 
+* eal: The maximum number of file descriptors allowed to be passed in
+  multi-process requests is increased from 8 to the maximum possible on
+  Linux unix domain sockets 253. This allows for more queues on XDP and
+  TAP device.
 
 ABI Changes
 -----------
diff --git a/lib/eal/include/rte_eal.h b/lib/eal/include/rte_eal.h
index c2256f832e51..cd84fcdd1bdb 100644
--- a/lib/eal/include/rte_eal.h
+++ b/lib/eal/include/rte_eal.h
@@ -155,7 +155,7 @@  int rte_eal_primary_proc_alive(const char *config_file_path);
  */
 bool rte_mp_disable(void);
 
-#define RTE_MP_MAX_FD_NUM	8    /* The max amount of fds */
+#define RTE_MP_MAX_FD_NUM	253    /* The max amount of fds */
 #define RTE_MP_MAX_NAME_LEN	64   /* The max length of action name */
 #define RTE_MP_MAX_PARAM_LEN	256  /* The max length of param */
 struct rte_mp_msg {