[v2,01/10] ethdev: introduce flow pre-configuration hints

Message ID 20220118153027.3947448-2-akozyrev@nvidia.com (mailing list archive)
State Superseded, archived
Delegated to: Ferruh Yigit
Headers
Series ethdev: datapath-focused flow rules management |

Checks

Context Check Description
ci/checkpatch success coding style OK

Commit Message

Alexander Kozyrev Jan. 18, 2022, 3:30 p.m. UTC
  The flow rules creation/destruction at a large scale incurs a performance
penalty and may negatively impact the packet processing when used
as part of the datapath logic. This is mainly because software/hardware
resources are allocated and prepared during the flow rule creation.

In order to optimize the insertion rate, PMD may use some hints provided
by the application at the initialization phase. The rte_flow_configure()
function allows to pre-allocate all the needed resources beforehand.
These resources can be used at a later stage without costly allocations.
Every PMD may use only the subset of hints and ignore unused ones or
fail in case the requested configuration is not supported.

Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
---
 doc/guides/prog_guide/rte_flow.rst     | 37 +++++++++++++++
 doc/guides/rel_notes/release_22_03.rst |  2 +
 lib/ethdev/rte_flow.c                  | 20 ++++++++
 lib/ethdev/rte_flow.h                  | 63 ++++++++++++++++++++++++++
 lib/ethdev/rte_flow_driver.h           |  5 ++
 lib/ethdev/version.map                 |  3 ++
 6 files changed, 130 insertions(+)
  

Comments

Jerin Jacob Jan. 24, 2022, 2:36 p.m. UTC | #1
On Tue, Jan 18, 2022 at 9:01 PM Alexander Kozyrev <akozyrev@nvidia.com> wrote:
>
> The flow rules creation/destruction at a large scale incurs a performance
> penalty and may negatively impact the packet processing when used
> as part of the datapath logic. This is mainly because software/hardware
> resources are allocated and prepared during the flow rule creation.
>
> In order to optimize the insertion rate, PMD may use some hints provided
> by the application at the initialization phase. The rte_flow_configure()
> function allows to pre-allocate all the needed resources beforehand.
> These resources can be used at a later stage without costly allocations.
> Every PMD may use only the subset of hints and ignore unused ones or
> fail in case the requested configuration is not supported.
>
> Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
> ---

>
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Flow engine port configuration attributes.
> + */
> +__extension__

Is this __extension__ required ?


> +struct rte_flow_port_attr {
> +       /**
> +        * Version of the struct layout, should be 0.
> +        */
> +       uint32_t version;

Why version number? Across DPDK, we are using dynamic function
versioning, I think, that would
 be sufficient for ABI versioning

> +       /**
> +        * Number of counter actions pre-configured.
> +        * If set to 0, PMD will allocate counters dynamically.
> +        * @see RTE_FLOW_ACTION_TYPE_COUNT
> +        */
> +       uint32_t nb_counters;
> +       /**
> +        * Number of aging actions pre-configured.
> +        * If set to 0, PMD will allocate aging dynamically.
> +        * @see RTE_FLOW_ACTION_TYPE_AGE
> +        */
> +       uint32_t nb_aging;
> +       /**
> +        * Number of traffic metering actions pre-configured.
> +        * If set to 0, PMD will allocate meters dynamically.
> +        * @see RTE_FLOW_ACTION_TYPE_METER
> +        */
> +       uint32_t nb_meters;
> +};
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Configure flow rules module.
> + * To pre-allocate resources as per the flow port attributes
> + * this configuration function must be called before any flow rule is created.
> + * Must be called only after Ethernet device is configured, but may be called
> + * before or after the device is started as long as there are no flow rules.
> + * No other rte_flow function should be called while this function is invoked.
> + * This function can be called again to change the configuration.
> + * Some PMDs may not support re-configuration at all,
> + * or may only allow increasing the number of resources allocated.

Following comment from Ivan looks good to me

* Pre-configure the port's flow API engine.
*
* This API can only be invoked before the application
* starts using the rest of the flow library functions.
*
* The API can be invoked multiple times to change the
* settings. The port, however, may reject the changes.

> + *
> + * @param port_id
> + *   Port identifier of Ethernet device.
> + * @param[in] port_attr
> + *   Port configuration attributes.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL.
> + *   PMDs initialize this structure in case of error only.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +__rte_experimental
> +int
> +rte_flow_configure(uint16_t port_id,

Should we couple, setting resource limit hint to configure function as
if we add future items in
configuration, we may pain to manage all state. Instead how about,
rte_flow_resource_reserve_hint_set()?


> +                  const struct rte_flow_port_attr *port_attr,
> +                  struct rte_flow_error *error);

I think, we should have _get function to get those limit numbers otherwise,
we can not write portable applications as the return value is  kind of
boolean now if
don't define exact values for rte_errno for reasons.



> +
>  #ifdef __cplusplus
>  }
>  #endif
> diff --git a/lib/ethdev/rte_flow_driver.h b/lib/ethdev/rte_flow_driver.h
> index f691b04af4..5f722f1a39 100644
> --- a/lib/ethdev/rte_flow_driver.h
> +++ b/lib/ethdev/rte_flow_driver.h
> @@ -152,6 +152,11 @@ struct rte_flow_ops {
>                 (struct rte_eth_dev *dev,
>                  const struct rte_flow_item_flex_handle *handle,
>                  struct rte_flow_error *error);
> +       /** See rte_flow_configure() */
> +       int (*configure)
> +               (struct rte_eth_dev *dev,
> +                const struct rte_flow_port_attr *port_attr,
> +                struct rte_flow_error *err);
>  };
>
>  /**
> diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
> index c2fb0669a4..7645796739 100644
> --- a/lib/ethdev/version.map
> +++ b/lib/ethdev/version.map
> @@ -256,6 +256,9 @@ EXPERIMENTAL {
>         rte_flow_flex_item_create;
>         rte_flow_flex_item_release;
>         rte_flow_pick_transfer_proxy;
> +
> +       # added in 22.03
> +       rte_flow_configure;
>  };
>
>  INTERNAL {
> --
> 2.18.2
>
  
Thomas Monjalon Jan. 24, 2022, 5:35 p.m. UTC | #2
24/01/2022 15:36, Jerin Jacob:
> On Tue, Jan 18, 2022 at 9:01 PM Alexander Kozyrev <akozyrev@nvidia.com> wrote:
> > +struct rte_flow_port_attr {
> > +       /**
> > +        * Version of the struct layout, should be 0.
> > +        */
> > +       uint32_t version;
> 
> Why version number? Across DPDK, we are using dynamic function
> versioning, I think, that would be sufficient for ABI versioning

Function versioning is not ideal when the structure is accessed
in many places like many drivers and library functions.

The idea of this version field (which can be a bitfield)
is to update it when some new features are added,
so the users of the struct can check if a feature is there
before trying to use it.
It means a bit more code in the functions, but avoid duplicating functions
as in function versioning.

Another approach was suggested by Bruce, and applied to dmadev.
It is assuming we only add new fields at the end (no removal),
and focus on the size of the struct.
By passing sizeof as an extra parameter, the function knows
which fields are OK to use.
Example: http://code.dpdk.org/dpdk/v21.11/source/lib/dmadev/rte_dmadev.c#L476
  
Ajit Khaparde Jan. 24, 2022, 5:40 p.m. UTC | #3
On Mon, Jan 24, 2022 at 6:37 AM Jerin Jacob <jerinjacobk@gmail.com> wrote:
>
> On Tue, Jan 18, 2022 at 9:01 PM Alexander Kozyrev <akozyrev@nvidia.com> wrote:
> >
> > The flow rules creation/destruction at a large scale incurs a performance
> > penalty and may negatively impact the packet processing when used
> > as part of the datapath logic. This is mainly because software/hardware
> > resources are allocated and prepared during the flow rule creation.
> >
> > In order to optimize the insertion rate, PMD may use some hints provided
> > by the application at the initialization phase. The rte_flow_configure()
> > function allows to pre-allocate all the needed resources beforehand.
> > These resources can be used at a later stage without costly allocations.
> > Every PMD may use only the subset of hints and ignore unused ones or
> > fail in case the requested configuration is not supported.
> >
> > Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
> > ---
>
> >
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice.
> > + *
> > + * Flow engine port configuration attributes.
> > + */
> > +__extension__
>
> Is this __extension__ required ?
>
>
> > +struct rte_flow_port_attr {
> > +       /**
> > +        * Version of the struct layout, should be 0.
> > +        */
> > +       uint32_t version;
>
> Why version number? Across DPDK, we are using dynamic function
> versioning, I think, that would
>  be sufficient for ABI versioning
>
> > +       /**
> > +        * Number of counter actions pre-configured.
> > +        * If set to 0, PMD will allocate counters dynamically.
> > +        * @see RTE_FLOW_ACTION_TYPE_COUNT
> > +        */
> > +       uint32_t nb_counters;
> > +       /**
> > +        * Number of aging actions pre-configured.
> > +        * If set to 0, PMD will allocate aging dynamically.
> > +        * @see RTE_FLOW_ACTION_TYPE_AGE
> > +        */
> > +       uint32_t nb_aging;
> > +       /**
> > +        * Number of traffic metering actions pre-configured.
> > +        * If set to 0, PMD will allocate meters dynamically.
> > +        * @see RTE_FLOW_ACTION_TYPE_METER
> > +        */
> > +       uint32_t nb_meters;
> > +};
> > +
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice.
> > + *
> > + * Configure flow rules module.
> > + * To pre-allocate resources as per the flow port attributes
> > + * this configuration function must be called before any flow rule is created.
> > + * Must be called only after Ethernet device is configured, but may be called
> > + * before or after the device is started as long as there are no flow rules.
> > + * No other rte_flow function should be called while this function is invoked.
> > + * This function can be called again to change the configuration.
> > + * Some PMDs may not support re-configuration at all,
> > + * or may only allow increasing the number of resources allocated.
>
> Following comment from Ivan looks good to me
>
> * Pre-configure the port's flow API engine.
> *
> * This API can only be invoked before the application
> * starts using the rest of the flow library functions.
> *
> * The API can be invoked multiple times to change the
> * settings. The port, however, may reject the changes.
>
> > + *
> > + * @param port_id
> > + *   Port identifier of Ethernet device.
> > + * @param[in] port_attr
> > + *   Port configuration attributes.
> > + * @param[out] error
> > + *   Perform verbose error reporting if not NULL.
> > + *   PMDs initialize this structure in case of error only.
> > + *
> > + * @return
> > + *   0 on success, a negative errno value otherwise and rte_errno is set.
> > + */
> > +__rte_experimental
> > +int
> > +rte_flow_configure(uint16_t port_id,
>
> Should we couple, setting resource limit hint to configure function as
> if we add future items in
> configuration, we may pain to manage all state. Instead how about,
> rte_flow_resource_reserve_hint_set()?
+1

>
>
> > +                  const struct rte_flow_port_attr *port_attr,
> > +                  struct rte_flow_error *error);
>
> I think, we should have _get function to get those limit numbers otherwise,
> we can not write portable applications as the return value is  kind of
> boolean now if
> don't define exact values for rte_errno for reasons.
+1
  
Jerin Jacob Jan. 24, 2022, 5:46 p.m. UTC | #4
On Mon, Jan 24, 2022 at 11:05 PM Thomas Monjalon <thomas@monjalon.net> wrote:
>
> 24/01/2022 15:36, Jerin Jacob:
> > On Tue, Jan 18, 2022 at 9:01 PM Alexander Kozyrev <akozyrev@nvidia.com> wrote:
> > > +struct rte_flow_port_attr {
> > > +       /**
> > > +        * Version of the struct layout, should be 0.
> > > +        */
> > > +       uint32_t version;
> >
> > Why version number? Across DPDK, we are using dynamic function
> > versioning, I think, that would be sufficient for ABI versioning
>
> Function versioning is not ideal when the structure is accessed
> in many places like many drivers and library functions.
>
> The idea of this version field (which can be a bitfield)
> is to update it when some new features are added,
> so the users of the struct can check if a feature is there
> before trying to use it.
> It means a bit more code in the functions, but avoid duplicating functions
> as in function versioning.
>
> Another approach was suggested by Bruce, and applied to dmadev.
> It is assuming we only add new fields at the end (no removal),
> and focus on the size of the struct.
> By passing sizeof as an extra parameter, the function knows
> which fields are OK to use.
> Example: http://code.dpdk.org/dpdk/v21.11/source/lib/dmadev/rte_dmadev.c#L476

+ @Richardson, Bruce
Either approach is fine, No strong opinion.  We can have one approach
and use it across DPDK for consistency.

>
>
  
Bruce Richardson Jan. 24, 2022, 6:08 p.m. UTC | #5
On Mon, Jan 24, 2022 at 11:16:15PM +0530, Jerin Jacob wrote:
> On Mon, Jan 24, 2022 at 11:05 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> >
> > 24/01/2022 15:36, Jerin Jacob:
> > > On Tue, Jan 18, 2022 at 9:01 PM Alexander Kozyrev <akozyrev@nvidia.com> wrote:
> > > > +struct rte_flow_port_attr {
> > > > +       /**
> > > > +        * Version of the struct layout, should be 0.
> > > > +        */
> > > > +       uint32_t version;
> > >
> > > Why version number? Across DPDK, we are using dynamic function
> > > versioning, I think, that would be sufficient for ABI versioning
> >
> > Function versioning is not ideal when the structure is accessed
> > in many places like many drivers and library functions.
> >
> > The idea of this version field (which can be a bitfield)
> > is to update it when some new features are added,
> > so the users of the struct can check if a feature is there
> > before trying to use it.
> > It means a bit more code in the functions, but avoid duplicating functions
> > as in function versioning.
> >
> > Another approach was suggested by Bruce, and applied to dmadev.
> > It is assuming we only add new fields at the end (no removal),
> > and focus on the size of the struct.
> > By passing sizeof as an extra parameter, the function knows
> > which fields are OK to use.
> > Example: http://code.dpdk.org/dpdk/v21.11/source/lib/dmadev/rte_dmadev.c#L476
> 
> + @Richardson, Bruce
> Either approach is fine, No strong opinion.  We can have one approach
> and use it across DPDK for consistency.
> 

In general I prefer the size-based approach, mainly because of its
simplicity. However, some other reasons why we may want to choose it:

* It's completely hidden from the end user, and there is no need for an
  extra struct field that needs to be filled in

* Related to that, for the version-field approach, if the field is present
  in a user-allocated struct, then you probably need to start preventing user
  error via:
   - having the external struct not have the field and use a separate
     internal struct to add in the version info after the fact in the
     versioned function. Alternatively,
   - provide a separate init function for each structure to fill in the
     version field appropriately

* In general, using the size-based approach like in the linked example is
  more resilient since it's compiler-inserted, so there is reduced chance
  of error.

* A sizeof field allows simple-enough handling in the drivers - especially
  since it does not allow removed fields. Each driver only needs to check
  that the size passed in is greater than that expected, thereby allowing
  us to have both updated and non-updated drivers co-existing simultaneously.
  [For a version field, the same scheme could also work if we keep the
  no-delete rule, but for a bitmask field, I believe things may get more
  complex in terms of checking]

In terms of the limitations of using sizeof - requiring new fields to
always go on the end, and preventing shrinking the struct - I think that the
simplicity gains far outweigh the impact of these strictions.

* Adding fields to struct is far more common than wanting to remove one

* So long as the added field is at the end, even if the struct size doesn't
  change the scheme can still work as the versioned function for the old
  struct can ensure that the extra field is appropriately zeroed (rather than
  random) on entry into any driver function

* If we do want to remove a field, the space simply needs to be marked as
  reserved in the struct, until the next ABI break release, when it can be
  compacted. Again, function versioning can take care of appropriately
  zeroing this field on return, if necessary.

My 2c from considering this for the implementation in dmadev. :-)

/Bruce
  
Alexander Kozyrev Jan. 25, 2022, 1:14 a.m. UTC | #6
On Monday, January 24, 2022 13:09 Bruce Richardson <bruce.richardson@intel.com> wrote:
> On Mon, Jan 24, 2022 at 11:16:15PM +0530, Jerin Jacob wrote:
> > On Mon, Jan 24, 2022 at 11:05 PM Thomas Monjalon
> <thomas@monjalon.net> wrote:
> > >
> > > 24/01/2022 15:36, Jerin Jacob:
> > > > On Tue, Jan 18, 2022 at 9:01 PM Alexander Kozyrev
> <akozyrev@nvidia.com> wrote:
> > > > > +struct rte_flow_port_attr {
> > > > > +       /**
> > > > > +        * Version of the struct layout, should be 0.
> > > > > +        */
> > > > > +       uint32_t version;
> > > >
> > > > Why version number? Across DPDK, we are using dynamic function
> > > > versioning, I think, that would be sufficient for ABI versioning
> > >
> > > Function versioning is not ideal when the structure is accessed
> > > in many places like many drivers and library functions.
> > >
> > > The idea of this version field (which can be a bitfield)
> > > is to update it when some new features are added,
> > > so the users of the struct can check if a feature is there
> > > before trying to use it.
> > > It means a bit more code in the functions, but avoid duplicating functions
> > > as in function versioning.
> > >
> > > Another approach was suggested by Bruce, and applied to dmadev.
> > > It is assuming we only add new fields at the end (no removal),
> > > and focus on the size of the struct.
> > > By passing sizeof as an extra parameter, the function knows
> > > which fields are OK to use.
> > > Example:
> http://code.dpdk.org/dpdk/v21.11/source/lib/dmadev/rte_dmadev.c#L476
> >
> > + @Richardson, Bruce
> > Either approach is fine, No strong opinion.  We can have one approach
> > and use it across DPDK for consistency.
> >
> 
> In general I prefer the size-based approach, mainly because of its
> simplicity. However, some other reasons why we may want to choose it:
> 
> * It's completely hidden from the end user, and there is no need for an
>   extra struct field that needs to be filled in
> 
> * Related to that, for the version-field approach, if the field is present
>   in a user-allocated struct, then you probably need to start preventing user
>   error via:
>    - having the external struct not have the field and use a separate
>      internal struct to add in the version info after the fact in the
>      versioned function. Alternatively,
>    - provide a separate init function for each structure to fill in the
>      version field appropriately
> 
> * In general, using the size-based approach like in the linked example is
>   more resilient since it's compiler-inserted, so there is reduced chance
>   of error.
> 
> * A sizeof field allows simple-enough handling in the drivers - especially
>   since it does not allow removed fields. Each driver only needs to check
>   that the size passed in is greater than that expected, thereby allowing
>   us to have both updated and non-updated drivers co-existing
> simultaneously.
>   [For a version field, the same scheme could also work if we keep the
>   no-delete rule, but for a bitmask field, I believe things may get more
>   complex in terms of checking]
> 
> In terms of the limitations of using sizeof - requiring new fields to
> always go on the end, and preventing shrinking the struct - I think that the
> simplicity gains far outweigh the impact of these strictions.
> 
> * Adding fields to struct is far more common than wanting to remove one
> 
> * So long as the added field is at the end, even if the struct size doesn't
>   change the scheme can still work as the versioned function for the old
>   struct can ensure that the extra field is appropriately zeroed (rather than
>   random) on entry into any driver function
> 
> * If we do want to remove a field, the space simply needs to be marked as
>   reserved in the struct, until the next ABI break release, when it can be
>   compacted. Again, function versioning can take care of appropriately
>   zeroing this field on return, if necessary.
> 
> My 2c from considering this for the implementation in dmadev. :-)
> 
> /Bruce

Thank you for the suggestions. I have no objections in adopting a size-based approach.
I can keep versions or switch to sizeof as long as we can agree on some uniform way.
  
Alexander Kozyrev Jan. 25, 2022, 1:28 a.m. UTC | #7
On Monday, January 24, 2022 12:41 Ajit Khaparde <ajit.khaparde@broadcom.com> wrote:
> On Mon, Jan 24, 2022 at 6:37 AM Jerin Jacob <jerinjacobk@gmail.com>
> wrote:
> >
> > On Tue, Jan 18, 2022 at 9:01 PM Alexander Kozyrev
> <akozyrev@nvidia.com> wrote:
> > >
> > > The flow rules creation/destruction at a large scale incurs a performance
> > > penalty and may negatively impact the packet processing when used
> > > as part of the datapath logic. This is mainly because software/hardware
> > > resources are allocated and prepared during the flow rule creation.
> > >
> > > In order to optimize the insertion rate, PMD may use some hints
> provided
> > > by the application at the initialization phase. The rte_flow_configure()
> > > function allows to pre-allocate all the needed resources beforehand.
> > > These resources can be used at a later stage without costly allocations.
> > > Every PMD may use only the subset of hints and ignore unused ones or
> > > fail in case the requested configuration is not supported.
> > >
> > > Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
> > > ---
> >
> > >
> > > +/**
> > > + * @warning
> > > + * @b EXPERIMENTAL: this API may change without prior notice.
> > > + *
> > > + * Flow engine port configuration attributes.
> > > + */
> > > +__extension__
> >
> > Is this __extension__ required ?
No, it is not longer required as I removed bitfield from this structure. Thanks for catching.

> >
> > > +struct rte_flow_port_attr {
> > > +       /**
> > > +        * Version of the struct layout, should be 0.
> > > +        */
> > > +       uint32_t version;
> >
> > Why version number? Across DPDK, we are using dynamic function
> > versioning, I think, that would
> >  be sufficient for ABI versioning
> >
> > > +       /**
> > > +        * Number of counter actions pre-configured.
> > > +        * If set to 0, PMD will allocate counters dynamically.
> > > +        * @see RTE_FLOW_ACTION_TYPE_COUNT
> > > +        */
> > > +       uint32_t nb_counters;
> > > +       /**
> > > +        * Number of aging actions pre-configured.
> > > +        * If set to 0, PMD will allocate aging dynamically.
> > > +        * @see RTE_FLOW_ACTION_TYPE_AGE
> > > +        */
> > > +       uint32_t nb_aging;
> > > +       /**
> > > +        * Number of traffic metering actions pre-configured.
> > > +        * If set to 0, PMD will allocate meters dynamically.
> > > +        * @see RTE_FLOW_ACTION_TYPE_METER
> > > +        */
> > > +       uint32_t nb_meters;
> > > +};
> > > +
> > > +/**
> > > + * @warning
> > > + * @b EXPERIMENTAL: this API may change without prior notice.
> > > + *
> > > + * Configure flow rules module.
> > > + * To pre-allocate resources as per the flow port attributes
> > > + * this configuration function must be called before any flow rule is
> created.
> > > + * Must be called only after Ethernet device is configured, but may be
> called
> > > + * before or after the device is started as long as there are no flow rules.
> > > + * No other rte_flow function should be called while this function is
> invoked.
> > > + * This function can be called again to change the configuration.
> > > + * Some PMDs may not support re-configuration at all,
> > > + * or may only allow increasing the number of resources allocated.
> >
> > Following comment from Ivan looks good to me
> >
> > * Pre-configure the port's flow API engine.
> > *
> > * This API can only be invoked before the application
> > * starts using the rest of the flow library functions.
> > *
> > * The API can be invoked multiple times to change the
> > * settings. The port, however, may reject the changes.
Ok, I'll adopt this wording in the v3.

> > > + *
> > > + * @param port_id
> > > + *   Port identifier of Ethernet device.
> > > + * @param[in] port_attr
> > > + *   Port configuration attributes.
> > > + * @param[out] error
> > > + *   Perform verbose error reporting if not NULL.
> > > + *   PMDs initialize this structure in case of error only.
> > > + *
> > > + * @return
> > > + *   0 on success, a negative errno value otherwise and rte_errno is set.
> > > + */
> > > +__rte_experimental
> > > +int
> > > +rte_flow_configure(uint16_t port_id,
> >
> > Should we couple, setting resource limit hint to configure function as
> > if we add future items in
> > configuration, we may pain to manage all state. Instead how about,
> > rte_flow_resource_reserve_hint_set()?
> +1
Port attributes are the hints, PMD can safely ignore anything that is not supported/deemed unreasonable.
Having several functions to call instead of one configuration function seems like a burden to me.

> 
> >
> >
> > > +                  const struct rte_flow_port_attr *port_attr,
> > > +                  struct rte_flow_error *error);
> >
> > I think, we should have _get function to get those limit numbers otherwise,
> > we can not write portable applications as the return value is  kind of
> > boolean now if
> > don't define exact values for rte_errno for reasons.
> +1
We had this discussion in RFC. The limits will vary from NIC to NIC and from system to
system, depending on hardware capabilities and amount of free memory for example.
It is easier to reject a configuration with a clear error description as we do for flow creation.
  
Ori Kam Jan. 25, 2022, 3:58 p.m. UTC | #8
Hi Bruce,

> -----Original Message-----
> From: Bruce Richardson <bruce.richardson@intel.com>
> Sent: Monday, January 24, 2022 8:09 PM
> Subject: Re: [PATCH v2 01/10] ethdev: introduce flow pre-configuration hints
> 
> On Mon, Jan 24, 2022 at 11:16:15PM +0530, Jerin Jacob wrote:
> > On Mon, Jan 24, 2022 at 11:05 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > >
> > > 24/01/2022 15:36, Jerin Jacob:
> > > > On Tue, Jan 18, 2022 at 9:01 PM Alexander Kozyrev <akozyrev@nvidia.com> wrote:
> > > > > +struct rte_flow_port_attr {
> > > > > +       /**
> > > > > +        * Version of the struct layout, should be 0.
> > > > > +        */
> > > > > +       uint32_t version;
> > > >
> > > > Why version number? Across DPDK, we are using dynamic function
> > > > versioning, I think, that would be sufficient for ABI versioning
> > >
> > > Function versioning is not ideal when the structure is accessed
> > > in many places like many drivers and library functions.
> > >
> > > The idea of this version field (which can be a bitfield)
> > > is to update it when some new features are added,
> > > so the users of the struct can check if a feature is there
> > > before trying to use it.
> > > It means a bit more code in the functions, but avoid duplicating functions
> > > as in function versioning.
> > >
> > > Another approach was suggested by Bruce, and applied to dmadev.
> > > It is assuming we only add new fields at the end (no removal),
> > > and focus on the size of the struct.
> > > By passing sizeof as an extra parameter, the function knows
> > > which fields are OK to use.
> > > Example: http://code.dpdk.org/dpdk/v21.11/source/lib/dmadev/rte_dmadev.c#L476
> >
> > + @Richardson, Bruce
> > Either approach is fine, No strong opinion.  We can have one approach
> > and use it across DPDK for consistency.
> >
> 
> In general I prefer the size-based approach, mainly because of its
> simplicity. However, some other reasons why we may want to choose it:
> 
> * It's completely hidden from the end user, and there is no need for an
>   extra struct field that needs to be filled in
> 
> * Related to that, for the version-field approach, if the field is present
>   in a user-allocated struct, then you probably need to start preventing user
>   error via:
>    - having the external struct not have the field and use a separate
>      internal struct to add in the version info after the fact in the
>      versioned function. Alternatively,
>    - provide a separate init function for each structure to fill in the
>      version field appropriately
> 
> * In general, using the size-based approach like in the linked example is
>   more resilient since it's compiler-inserted, so there is reduced chance
>   of error.
> 
> * A sizeof field allows simple-enough handling in the drivers - especially
>   since it does not allow removed fields. Each driver only needs to check
>   that the size passed in is greater than that expected, thereby allowing
>   us to have both updated and non-updated drivers co-existing simultaneously.
>   [For a version field, the same scheme could also work if we keep the
>   no-delete rule, but for a bitmask field, I believe things may get more
>   complex in terms of checking]
> 
> In terms of the limitations of using sizeof - requiring new fields to
> always go on the end, and preventing shrinking the struct - I think that the
> simplicity gains far outweigh the impact of these strictions.
> 
> * Adding fields to struct is far more common than wanting to remove one
> 
> * So long as the added field is at the end, even if the struct size doesn't
>   change the scheme can still work as the versioned function for the old
>   struct can ensure that the extra field is appropriately zeroed (rather than
>   random) on entry into any driver function
> 

Zero can be a valid value so this is may result in an issue.

> * If we do want to remove a field, the space simply needs to be marked as
>   reserved in the struct, until the next ABI break release, when it can be
>   compacted. Again, function versioning can take care of appropriately
>   zeroing this field on return, if necessary.
> 

This means that PMD will have to change just for removal of a field
I would say removal is not allowed.

> My 2c from considering this for the implementation in dmadev. :-)

Some concerns I have about your suggestion:
1. The size of the struct is dependent on the system, for example
Assume this struct 
{
Uint16_t a;
Uint32_t b;
Uint8_t c;
Uint32_t d;
}
Incase of 32 bit machine the size will be 128 bytes, while in 64 machine it will be 96

2. ABI breakage, as far as I know changing size of a struct is ABI breakage, since if 
the application got the size from previous version and for example created array
or allocated memory then using the new structure will result in memory override.

I know that flags/version is not easy since it means creating new 
Structure for each change. I prefer to declare that size can change between
DPDK releases is allowd but as long as we say ABI breakage is forbidden then I don't think your
solution is valid.
And we must go with the version/flags and create new structure for each change.

Best,
Ori
> 
> /Bruce
  
Bruce Richardson Jan. 25, 2022, 6:09 p.m. UTC | #9
On Tue, Jan 25, 2022 at 03:58:45PM +0000, Ori Kam wrote:
> Hi Bruce,
> 
> > -----Original Message-----
> > From: Bruce Richardson <bruce.richardson@intel.com>
> > Sent: Monday, January 24, 2022 8:09 PM
> > Subject: Re: [PATCH v2 01/10] ethdev: introduce flow pre-configuration hints
> > 
> > On Mon, Jan 24, 2022 at 11:16:15PM +0530, Jerin Jacob wrote:
> > > On Mon, Jan 24, 2022 at 11:05 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > >
> > > > 24/01/2022 15:36, Jerin Jacob:
> > > > > On Tue, Jan 18, 2022 at 9:01 PM Alexander Kozyrev <akozyrev@nvidia.com> wrote:
> > > > > > +struct rte_flow_port_attr {
> > > > > > +       /**
> > > > > > +        * Version of the struct layout, should be 0.
> > > > > > +        */
> > > > > > +       uint32_t version;
> > > > >
> > > > > Why version number? Across DPDK, we are using dynamic function
> > > > > versioning, I think, that would be sufficient for ABI versioning
> > > >
> > > > Function versioning is not ideal when the structure is accessed
> > > > in many places like many drivers and library functions.
> > > >
> > > > The idea of this version field (which can be a bitfield)
> > > > is to update it when some new features are added,
> > > > so the users of the struct can check if a feature is there
> > > > before trying to use it.
> > > > It means a bit more code in the functions, but avoid duplicating functions
> > > > as in function versioning.
> > > >
> > > > Another approach was suggested by Bruce, and applied to dmadev.
> > > > It is assuming we only add new fields at the end (no removal),
> > > > and focus on the size of the struct.
> > > > By passing sizeof as an extra parameter, the function knows
> > > > which fields are OK to use.
> > > > Example: http://code.dpdk.org/dpdk/v21.11/source/lib/dmadev/rte_dmadev.c#L476
> > >
> > > + @Richardson, Bruce
> > > Either approach is fine, No strong opinion.  We can have one approach
> > > and use it across DPDK for consistency.
> > >
> > 
> > In general I prefer the size-based approach, mainly because of its
> > simplicity. However, some other reasons why we may want to choose it:
> > 
> > * It's completely hidden from the end user, and there is no need for an
> >   extra struct field that needs to be filled in
> > 
> > * Related to that, for the version-field approach, if the field is present
> >   in a user-allocated struct, then you probably need to start preventing user
> >   error via:
> >    - having the external struct not have the field and use a separate
> >      internal struct to add in the version info after the fact in the
> >      versioned function. Alternatively,
> >    - provide a separate init function for each structure to fill in the
> >      version field appropriately
> > 
> > * In general, using the size-based approach like in the linked example is
> >   more resilient since it's compiler-inserted, so there is reduced chance
> >   of error.
> > 
> > * A sizeof field allows simple-enough handling in the drivers - especially
> >   since it does not allow removed fields. Each driver only needs to check
> >   that the size passed in is greater than that expected, thereby allowing
> >   us to have both updated and non-updated drivers co-existing simultaneously.
> >   [For a version field, the same scheme could also work if we keep the
> >   no-delete rule, but for a bitmask field, I believe things may get more
> >   complex in terms of checking]
> > 
> > In terms of the limitations of using sizeof - requiring new fields to
> > always go on the end, and preventing shrinking the struct - I think that the
> > simplicity gains far outweigh the impact of these strictions.
> > 
> > * Adding fields to struct is far more common than wanting to remove one
> > 
> > * So long as the added field is at the end, even if the struct size doesn't
> >   change the scheme can still work as the versioned function for the old
> >   struct can ensure that the extra field is appropriately zeroed (rather than
> >   random) on entry into any driver function
> > 
> 
> Zero can be a valid value so this is may result in an issue.
> 

In this instance, I was using zero as a neutral, default-option value. If
having zero as the default causes problems, we can always make the
structure size change to force a new size value.

> > * If we do want to remove a field, the space simply needs to be marked as
> >   reserved in the struct, until the next ABI break release, when it can be
> >   compacted. Again, function versioning can take care of appropriately
> >   zeroing this field on return, if necessary.
> > 
> 
> This means that PMD will have to change just for removal of a field
> I would say removal is not allowed.
> 
> > My 2c from considering this for the implementation in dmadev. :-)
> 
> Some concerns I have about your suggestion:
> 1. The size of the struct is dependent on the system, for example
> Assume this struct 
> {
> Uint16_t a;
> Uint32_t b;
> Uint8_t c;
> Uint32_t d;
> }
> Incase of 32 bit machine the size will be 128 bytes, while in 64 machine it will be 96

Actually, I believe that in just about every system we support it will be
4x4B i.e. 16 bytes in size. How do you compute 96 or 128 byte sizes? In any
case, the actual size value doesn't matter in practice, since all sizes
should be computed by the compiler using sizeof, rather than hard-coded.

> 
> 2. ABI breakage, as far as I know changing size of a struct is ABI breakage, since if 
> the application got the size from previous version and for example created array
> or allocated memory then using the new structure will result in memory override.
> 
> I know that flags/version is not easy since it means creating new 
> Structure for each change. I prefer to declare that size can change between
> DPDK releases is allowd but as long as we say ABI breakage is forbidden then I don't think your
> solution is valid.
> And we must go with the version/flags and create new structure for each change.
> 

whatever approach is taken for this, I believe we will always need to
create a new structure for the changes. This is because only functions can
be versioned, not structures. The only question therefore becomes how to
pass ABI version information, and therefore by extension structure version
information across a library to driver boundary. This has to be an extra
field somewhere, either in a structure or as a function parameter. I'd
prefer not in the structure as it exposes it to the user. In terms of the
field value, it can either be explicit version info as version number or
version flags, or implicit versioning via "size". Based off the "YAGNI"
principle, I really would prefer just using sizes, as it's far easier to
manage and work with for all concerned, and requires no additional
documentation for the programmer or driver developer to understand.

Regards,
/Bruce
  
Bruce Richardson Jan. 25, 2022, 6:14 p.m. UTC | #10
On Tue, Jan 25, 2022 at 06:09:42PM +0000, Bruce Richardson wrote:
> On Tue, Jan 25, 2022 at 03:58:45PM +0000, Ori Kam wrote:
> > Hi Bruce,
> > 
> > > -----Original Message----- From: Bruce Richardson
> > > <bruce.richardson@intel.com> Sent: Monday, January 24, 2022 8:09 PM
> > > Subject: Re: [PATCH v2 01/10] ethdev: introduce flow
> > > pre-configuration hints
> > > 
> > > On Mon, Jan 24, 2022 at 11:16:15PM +0530, Jerin Jacob wrote:
> > > > On Mon, Jan 24, 2022 at 11:05 PM Thomas Monjalon
> > > > <thomas@monjalon.net> wrote:
> > > > >
> > > > > 24/01/2022 15:36, Jerin Jacob:
> > > > > > On Tue, Jan 18, 2022 at 9:01 PM Alexander Kozyrev
> > > > > > <akozyrev@nvidia.com> wrote:
> > > > > > > +struct rte_flow_port_attr { +       /** +        * Version
> > > > > > > of the struct layout, should be 0.  +        */ +
> > > > > > > uint32_t version;
> > > > > >
> > > > > > Why version number? Across DPDK, we are using dynamic function
> > > > > > versioning, I think, that would be sufficient for ABI
> > > > > > versioning
> > > > >
> > > > > Function versioning is not ideal when the structure is accessed
> > > > > in many places like many drivers and library functions.
> > > > >
> > > > > The idea of this version field (which can be a bitfield) is to
> > > > > update it when some new features are added, so the users of the
> > > > > struct can check if a feature is there before trying to use it.
> > > > > It means a bit more code in the functions, but avoid duplicating
> > > > > functions as in function versioning.
> > > > >
> > > > > Another approach was suggested by Bruce, and applied to dmadev.
> > > > > It is assuming we only add new fields at the end (no removal),
> > > > > and focus on the size of the struct.  By passing sizeof as an
> > > > > extra parameter, the function knows which fields are OK to use.
> > > > > Example:
> > > > > http://code.dpdk.org/dpdk/v21.11/source/lib/dmadev/rte_dmadev.c#L476
> > > >
> > > > + @Richardson, Bruce Either approach is fine, No strong opinion.
> > > > We can have one approach and use it across DPDK for consistency.
> > > >
> > > 
> > > In general I prefer the size-based approach, mainly because of its
> > > simplicity. However, some other reasons why we may want to choose it:
> > > 
> > > * It's completely hidden from the end user, and there is no need for
> > > an extra struct field that needs to be filled in
> > > 
> > > * Related to that, for the version-field approach, if the field is
> > > present in a user-allocated struct, then you probably need to start
> > > preventing user error via: - having the external struct not have the
> > > field and use a separate internal struct to add in the version info
> > > after the fact in the versioned function. Alternatively, - provide a
> > > separate init function for each structure to fill in the version
> > > field appropriately
> > > 
> > > * In general, using the size-based approach like in the linked
> > > example is more resilient since it's compiler-inserted, so there is
> > > reduced chance of error.
> > > 
> > > * A sizeof field allows simple-enough handling in the drivers -
> > > especially since it does not allow removed fields. Each driver only
> > > needs to check that the size passed in is greater than that expected,
> > > thereby allowing us to have both updated and non-updated drivers
> > > co-existing simultaneously.  [For a version field, the same scheme
> > > could also work if we keep the no-delete rule, but for a bitmask
> > > field, I believe things may get more complex in terms of checking]
> > > 
> > > In terms of the limitations of using sizeof - requiring new fields to
> > > always go on the end, and preventing shrinking the struct - I think
> > > that the simplicity gains far outweigh the impact of these
> > > strictions.
> > > 
> > > * Adding fields to struct is far more common than wanting to remove
> > > one
> > > 
> > > * So long as the added field is at the end, even if the struct size
> > > doesn't change the scheme can still work as the versioned function
> > > for the old struct can ensure that the extra field is appropriately
> > > zeroed (rather than random) on entry into any driver function
> > > 
> > 
> > Zero can be a valid value so this is may result in an issue.
> > 
> 
> In this instance, I was using zero as a neutral, default-option value. If
> having zero as the default causes problems, we can always make the
> structure size change to force a new size value.
> 
> > > * If we do want to remove a field, the space simply needs to be
> > > marked as reserved in the struct, until the next ABI break release,
> > > when it can be compacted. Again, function versioning can take care of
> > > appropriately zeroing this field on return, if necessary.
> > > 
> > 
> > This means that PMD will have to change just for removal of a field I
> > would say removal is not allowed.
> > 
> > > My 2c from considering this for the implementation in dmadev. :-)
> > 
> > Some concerns I have about your suggestion: 1. The size of the struct
> > is dependent on the system, for example Assume this struct { Uint16_t
> > a; Uint32_t b; Uint8_t c; Uint32_t d; } Incase of 32 bit machine the
> > size will be 128 bytes, while in 64 machine it will be 96
> 
> Actually, I believe that in just about every system we support it will be
> 4x4B i.e. 16 bytes in size. How do you compute 96 or 128 byte sizes? In
> any case, the actual size value doesn't matter in practice, since all
> sizes should be computed by the compiler using sizeof, rather than
> hard-coded.
> 
> > 
> > 2. ABI breakage, as far as I know changing size of a struct is ABI
> > breakage, since if the application got the size from previous version
> > and for example created array or allocated memory then using the new
> > structure will result in memory override.
> > 
> > I know that flags/version is not easy since it means creating new
> > Structure for each change. I prefer to declare that size can change
> > between DPDK releases is allowd but as long as we say ABI breakage is
> > forbidden then I don't think your solution is valid.  And we must go
> > with the version/flags and create new structure for each change.
> > 
> 
> whatever approach is taken for this, I believe we will always need to
> create a new structure for the changes. This is because only functions
> can be versioned, not structures. The only question therefore becomes how
> to pass ABI version information, and therefore by extension structure
> version information across a library to driver boundary. This has to be
> an extra field somewhere, either in a structure or as a function
> parameter. I'd prefer not in the structure as it exposes it to the user.
> In terms of the field value, it can either be explicit version info as
> version number or version flags, or implicit versioning via "size". Based
> off the "YAGNI" principle, I really would prefer just using sizes, as
> it's far easier to manage and work with for all concerned, and requires
> no additional documentation for the programmer or driver developer to
> understand.
> 
As a third alternative that I would find acceptable, we could also just
take the approach of passing the ABI version explicitly across the function
call i.e. 22 for DPDK_21.11. I'd find this ok too on the basis that it's
largely self explanatory, and can be inserted automatically by the compiler
- again reducing chances of errors. [However, I also believe that using
sizes is still simpler again, which is why it's still my first choice! :-)]

/Bruce
  
Jerin Jacob Jan. 25, 2022, 6:44 p.m. UTC | #11
On Tue, Jan 25, 2022 at 6:58 AM Alexander Kozyrev <akozyrev@nvidia.com> wrote:
>
> On Monday, January 24, 2022 12:41 Ajit Khaparde <ajit.khaparde@broadcom.com> wrote:
> > On Mon, Jan 24, 2022 at 6:37 AM Jerin Jacob <jerinjacobk@gmail.com>
> > wrote:
> > >

> Ok, I'll adopt this wording in the v3.
>
> > > > + *
> > > > + * @param port_id
> > > > + *   Port identifier of Ethernet device.
> > > > + * @param[in] port_attr
> > > > + *   Port configuration attributes.
> > > > + * @param[out] error
> > > > + *   Perform verbose error reporting if not NULL.
> > > > + *   PMDs initialize this structure in case of error only.
> > > > + *
> > > > + * @return
> > > > + *   0 on success, a negative errno value otherwise and rte_errno is set.
> > > > + */
> > > > +__rte_experimental
> > > > +int
> > > > +rte_flow_configure(uint16_t port_id,
> > >
> > > Should we couple, setting resource limit hint to configure function as
> > > if we add future items in
> > > configuration, we may pain to manage all state. Instead how about,
> > > rte_flow_resource_reserve_hint_set()?
> > +1
> Port attributes are the hints, PMD can safely ignore anything that is not supported/deemed unreasonable.
> Having several functions to call instead of one configuration function seems like a burden to me.

If we add a lot of features which has different state it will be
difficult to manage.
Since it is the slow path and OPTIONAL API. IMO, it should be fine to
have a separate API for a specific purpose
to have a clean interface.


>
> >
> > >
> > >
> > > > +                  const struct rte_flow_port_attr *port_attr,
> > > > +                  struct rte_flow_error *error);
> > >
> > > I think, we should have _get function to get those limit numbers otherwise,
> > > we can not write portable applications as the return value is  kind of
> > > boolean now if
> > > don't define exact values for rte_errno for reasons.
> > +1
> We had this discussion in RFC. The limits will vary from NIC to NIC and from system to
> system, depending on hardware capabilities and amount of free memory for example.
> It is easier to reject a configuration with a clear error description as we do for flow creation.

In that case, we can return a "defined" return value or "defined"
errno to capture this case so that
the application can make forward progress to differentiate between API
failed vs dont having enough resources
and move on.
  
Ori Kam Jan. 26, 2022, 9:45 a.m. UTC | #12
> -----Original Message-----
> From: Bruce Richardson <bruce.richardson@intel.com>
> Sent: Tuesday, January 25, 2022 8:14 PM
> To: Ori Kam <orika@nvidia.com>
> Subject: Re: [PATCH v2 01/10] ethdev: introduce flow pre-configuration hints
> 
> On Tue, Jan 25, 2022 at 06:09:42PM +0000, Bruce Richardson wrote:
> > On Tue, Jan 25, 2022 at 03:58:45PM +0000, Ori Kam wrote:
> > > Hi Bruce,
> > >
> > > > -----Original Message----- From: Bruce Richardson
> > > > <bruce.richardson@intel.com> Sent: Monday, January 24, 2022 8:09 PM
> > > > Subject: Re: [PATCH v2 01/10] ethdev: introduce flow
> > > > pre-configuration hints
> > > >
> > > > On Mon, Jan 24, 2022 at 11:16:15PM +0530, Jerin Jacob wrote:
> > > > > On Mon, Jan 24, 2022 at 11:05 PM Thomas Monjalon
> > > > > <thomas@monjalon.net> wrote:
> > > > > >
> > > > > > 24/01/2022 15:36, Jerin Jacob:
> > > > > > > On Tue, Jan 18, 2022 at 9:01 PM Alexander Kozyrev
> > > > > > > <akozyrev@nvidia.com> wrote:
> > > > > > > > +struct rte_flow_port_attr { +       /** +        * Version
> > > > > > > > of the struct layout, should be 0.  +        */ +
> > > > > > > > uint32_t version;
> > > > > > >
> > > > > > > Why version number? Across DPDK, we are using dynamic function
> > > > > > > versioning, I think, that would be sufficient for ABI
> > > > > > > versioning
> > > > > >
> > > > > > Function versioning is not ideal when the structure is accessed
> > > > > > in many places like many drivers and library functions.
> > > > > >
> > > > > > The idea of this version field (which can be a bitfield) is to
> > > > > > update it when some new features are added, so the users of the
> > > > > > struct can check if a feature is there before trying to use it.
> > > > > > It means a bit more code in the functions, but avoid duplicating
> > > > > > functions as in function versioning.
> > > > > >
> > > > > > Another approach was suggested by Bruce, and applied to dmadev.
> > > > > > It is assuming we only add new fields at the end (no removal),
> > > > > > and focus on the size of the struct.  By passing sizeof as an
> > > > > > extra parameter, the function knows which fields are OK to use.
> > > > > > Example:
> > > > > > http://code.dpdk.org/dpdk/v21.11/source/lib/dmadev/rte_dmadev.c#L476
> > > > >
> > > > > + @Richardson, Bruce Either approach is fine, No strong opinion.
> > > > > We can have one approach and use it across DPDK for consistency.
> > > > >
> > > >
> > > > In general I prefer the size-based approach, mainly because of its
> > > > simplicity. However, some other reasons why we may want to choose it:
> > > >
> > > > * It's completely hidden from the end user, and there is no need for
> > > > an extra struct field that needs to be filled in
> > > >
> > > > * Related to that, for the version-field approach, if the field is
> > > > present in a user-allocated struct, then you probably need to start
> > > > preventing user error via: - having the external struct not have the
> > > > field and use a separate internal struct to add in the version info
> > > > after the fact in the versioned function. Alternatively, - provide a
> > > > separate init function for each structure to fill in the version
> > > > field appropriately
> > > >
> > > > * In general, using the size-based approach like in the linked
> > > > example is more resilient since it's compiler-inserted, so there is
> > > > reduced chance of error.
> > > >
> > > > * A sizeof field allows simple-enough handling in the drivers -
> > > > especially since it does not allow removed fields. Each driver only
> > > > needs to check that the size passed in is greater than that expected,
> > > > thereby allowing us to have both updated and non-updated drivers
> > > > co-existing simultaneously.  [For a version field, the same scheme
> > > > could also work if we keep the no-delete rule, but for a bitmask
> > > > field, I believe things may get more complex in terms of checking]
> > > >
> > > > In terms of the limitations of using sizeof - requiring new fields to
> > > > always go on the end, and preventing shrinking the struct - I think
> > > > that the simplicity gains far outweigh the impact of these
> > > > strictions.
> > > >
> > > > * Adding fields to struct is far more common than wanting to remove
> > > > one
> > > >
> > > > * So long as the added field is at the end, even if the struct size
> > > > doesn't change the scheme can still work as the versioned function
> > > > for the old struct can ensure that the extra field is appropriately
> > > > zeroed (rather than random) on entry into any driver function
> > > >
> > >
> > > Zero can be a valid value so this is may result in an issue.
> > >
> >
> > In this instance, I was using zero as a neutral, default-option value. If
> > having zero as the default causes problems, we can always make the
> > structure size change to force a new size value.
> >
> > > > * If we do want to remove a field, the space simply needs to be
> > > > marked as reserved in the struct, until the next ABI break release,
> > > > when it can be compacted. Again, function versioning can take care of
> > > > appropriately zeroing this field on return, if necessary.
> > > >
> > >
> > > This means that PMD will have to change just for removal of a field I
> > > would say removal is not allowed.
> > >
> > > > My 2c from considering this for the implementation in dmadev. :-)
> > >
> > > Some concerns I have about your suggestion: 1. The size of the struct
> > > is dependent on the system, for example Assume this struct { Uint16_t
> > > a; Uint32_t b; Uint8_t c; Uint32_t d; } Incase of 32 bit machine the
> > > size will be 128 bytes, while in 64 machine it will be 96
> >
> > Actually, I believe that in just about every system we support it will be
> > 4x4B i.e. 16 bytes in size. How do you compute 96 or 128 byte sizes? In
> > any case, the actual size value doesn't matter in practice, since all
> > sizes should be computed by the compiler using sizeof, rather than
> > hard-coded.
> >
You are correct my mistake with the numbers.
I still think there might be some issue but I can't think of anything.
So dropping it.

> > >
> > > 2. ABI breakage, as far as I know changing size of a struct is ABI
> > > breakage, since if the application got the size from previous version
> > > and for example created array or allocated memory then using the new
> > > structure will result in memory override.
> > >
> > > I know that flags/version is not easy since it means creating new
> > > Structure for each change. I prefer to declare that size can change
> > > between DPDK releases is allowd but as long as we say ABI breakage is
> > > forbidden then I don't think your solution is valid.  And we must go
> > > with the version/flags and create new structure for each change.
> > >
> >
> > whatever approach is taken for this, I believe we will always need to
> > create a new structure for the changes. This is because only functions
> > can be versioned, not structures. The only question therefore becomes how
> > to pass ABI version information, and therefore by extension structure
> > version information across a library to driver boundary. This has to be
> > an extra field somewhere, either in a structure or as a function
> > parameter. I'd prefer not in the structure as it exposes it to the user.
> > In terms of the field value, it can either be explicit version info as
> > version number or version flags, or implicit versioning via "size". Based
> > off the "YAGNI" principle, I really would prefer just using sizes, as
> > it's far easier to manage and work with for all concerned, and requires
> > no additional documentation for the programmer or driver developer to
> > understand.
> >
> As a third alternative that I would find acceptable, we could also just
> take the approach of passing the ABI version explicitly across the function
> call i.e. 22 for DPDK_21.11. I'd find this ok too on the basis that it's
> largely self explanatory, and can be inserted automatically by the compiler
> - again reducing chances of errors. [However, I also believe that using
> sizes is still simpler again, which is why it's still my first choice! :-)]
> 

Just to make sure I fully understand your suggestion.
We will create new struct for each change.
The function will  stay the same
For example I had the following:

Struct base {
 Uint32_t x;
}

Function (struct base *input)
{
	Inner_func (input, sizeof(struct base))
}

Now I'm adding new member so it will look like this:
Struct new {
 Uint32_t x;
Uint32_t y;
}

When I want to call the function I need to cast
Function((struct base*) new) 

Right?

This means that in both cases the sizeof wil return the same value,
What am I missing?
 
> /Bruce
  
Bruce Richardson Jan. 26, 2022, 10:52 a.m. UTC | #13
On Wed, Jan 26, 2022 at 09:45:18AM +0000, Ori Kam wrote:
> 
> 
> > -----Original Message-----
> > From: Bruce Richardson <bruce.richardson@intel.com>
> > Sent: Tuesday, January 25, 2022 8:14 PM
> > To: Ori Kam <orika@nvidia.com>
> > Subject: Re: [PATCH v2 01/10] ethdev: introduce flow pre-configuration hints
> > 
> > On Tue, Jan 25, 2022 at 06:09:42PM +0000, Bruce Richardson wrote:
> > > On Tue, Jan 25, 2022 at 03:58:45PM +0000, Ori Kam wrote:
> > > > Hi Bruce,
> > > >
> > > > > -----Original Message----- From: Bruce Richardson
> > > > > <bruce.richardson@intel.com> Sent: Monday, January 24, 2022 8:09 PM
> > > > > Subject: Re: [PATCH v2 01/10] ethdev: introduce flow
> > > > > pre-configuration hints
> > > > >
> > > > > On Mon, Jan 24, 2022 at 11:16:15PM +0530, Jerin Jacob wrote:
> > > > > > On Mon, Jan 24, 2022 at 11:05 PM Thomas Monjalon
> > > > > > <thomas@monjalon.net> wrote:
> > > > > > >
> > > > > > > 24/01/2022 15:36, Jerin Jacob:
> > > > > > > > On Tue, Jan 18, 2022 at 9:01 PM Alexander Kozyrev
> > > > > > > > <akozyrev@nvidia.com> wrote:
> > > > > > > > > +struct rte_flow_port_attr { +       /** +        * Version
> > > > > > > > > of the struct layout, should be 0.  +        */ +
> > > > > > > > > uint32_t version;
> > > > > > > >
> > > > > > > > Why version number? Across DPDK, we are using dynamic function
> > > > > > > > versioning, I think, that would be sufficient for ABI
> > > > > > > > versioning
> > > > > > >
> > > > > > > Function versioning is not ideal when the structure is accessed
> > > > > > > in many places like many drivers and library functions.
> > > > > > >
> > > > > > > The idea of this version field (which can be a bitfield) is to
> > > > > > > update it when some new features are added, so the users of the
> > > > > > > struct can check if a feature is there before trying to use it.
> > > > > > > It means a bit more code in the functions, but avoid duplicating
> > > > > > > functions as in function versioning.
> > > > > > >
> > > > > > > Another approach was suggested by Bruce, and applied to dmadev.
> > > > > > > It is assuming we only add new fields at the end (no removal),
> > > > > > > and focus on the size of the struct.  By passing sizeof as an
> > > > > > > extra parameter, the function knows which fields are OK to use.
> > > > > > > Example:
> > > > > > > http://code.dpdk.org/dpdk/v21.11/source/lib/dmadev/rte_dmadev.c#L476
> > > > > >
> > > > > > + @Richardson, Bruce Either approach is fine, No strong opinion.
> > > > > > We can have one approach and use it across DPDK for consistency.
> > > > > >
> > > > >
> > > > > In general I prefer the size-based approach, mainly because of its
> > > > > simplicity. However, some other reasons why we may want to choose it:
> > > > >
> > > > > * It's completely hidden from the end user, and there is no need for
> > > > > an extra struct field that needs to be filled in
> > > > >
> > > > > * Related to that, for the version-field approach, if the field is
> > > > > present in a user-allocated struct, then you probably need to start
> > > > > preventing user error via: - having the external struct not have the
> > > > > field and use a separate internal struct to add in the version info
> > > > > after the fact in the versioned function. Alternatively, - provide a
> > > > > separate init function for each structure to fill in the version
> > > > > field appropriately
> > > > >
> > > > > * In general, using the size-based approach like in the linked
> > > > > example is more resilient since it's compiler-inserted, so there is
> > > > > reduced chance of error.
> > > > >
> > > > > * A sizeof field allows simple-enough handling in the drivers -
> > > > > especially since it does not allow removed fields. Each driver only
> > > > > needs to check that the size passed in is greater than that expected,
> > > > > thereby allowing us to have both updated and non-updated drivers
> > > > > co-existing simultaneously.  [For a version field, the same scheme
> > > > > could also work if we keep the no-delete rule, but for a bitmask
> > > > > field, I believe things may get more complex in terms of checking]
> > > > >
> > > > > In terms of the limitations of using sizeof - requiring new fields to
> > > > > always go on the end, and preventing shrinking the struct - I think
> > > > > that the simplicity gains far outweigh the impact of these
> > > > > strictions.
> > > > >
> > > > > * Adding fields to struct is far more common than wanting to remove
> > > > > one
> > > > >
> > > > > * So long as the added field is at the end, even if the struct size
> > > > > doesn't change the scheme can still work as the versioned function
> > > > > for the old struct can ensure that the extra field is appropriately
> > > > > zeroed (rather than random) on entry into any driver function
> > > > >
> > > >
> > > > Zero can be a valid value so this is may result in an issue.
> > > >
> > >
> > > In this instance, I was using zero as a neutral, default-option value. If
> > > having zero as the default causes problems, we can always make the
> > > structure size change to force a new size value.
> > >
> > > > > * If we do want to remove a field, the space simply needs to be
> > > > > marked as reserved in the struct, until the next ABI break release,
> > > > > when it can be compacted. Again, function versioning can take care of
> > > > > appropriately zeroing this field on return, if necessary.
> > > > >
> > > >
> > > > This means that PMD will have to change just for removal of a field I
> > > > would say removal is not allowed.
> > > >
> > > > > My 2c from considering this for the implementation in dmadev. :-)
> > > >
> > > > Some concerns I have about your suggestion: 1. The size of the struct
> > > > is dependent on the system, for example Assume this struct { Uint16_t
> > > > a; Uint32_t b; Uint8_t c; Uint32_t d; } Incase of 32 bit machine the
> > > > size will be 128 bytes, while in 64 machine it will be 96
> > >
> > > Actually, I believe that in just about every system we support it will be
> > > 4x4B i.e. 16 bytes in size. How do you compute 96 or 128 byte sizes? In
> > > any case, the actual size value doesn't matter in practice, since all
> > > sizes should be computed by the compiler using sizeof, rather than
> > > hard-coded.
> > >
> You are correct my mistake with the numbers.
> I still think there might be some issue but I can't think of anything.
> So dropping it.
> 
> > > >
> > > > 2. ABI breakage, as far as I know changing size of a struct is ABI
> > > > breakage, since if the application got the size from previous version
> > > > and for example created array or allocated memory then using the new
> > > > structure will result in memory override.
> > > >
> > > > I know that flags/version is not easy since it means creating new
> > > > Structure for each change. I prefer to declare that size can change
> > > > between DPDK releases is allowd but as long as we say ABI breakage is
> > > > forbidden then I don't think your solution is valid.  And we must go
> > > > with the version/flags and create new structure for each change.
> > > >
> > >
> > > whatever approach is taken for this, I believe we will always need to
> > > create a new structure for the changes. This is because only functions
> > > can be versioned, not structures. The only question therefore becomes how
> > > to pass ABI version information, and therefore by extension structure
> > > version information across a library to driver boundary. This has to be
> > > an extra field somewhere, either in a structure or as a function
> > > parameter. I'd prefer not in the structure as it exposes it to the user.
> > > In terms of the field value, it can either be explicit version info as
> > > version number or version flags, or implicit versioning via "size". Based
> > > off the "YAGNI" principle, I really would prefer just using sizes, as
> > > it's far easier to manage and work with for all concerned, and requires
> > > no additional documentation for the programmer or driver developer to
> > > understand.
> > >
> > As a third alternative that I would find acceptable, we could also just
> > take the approach of passing the ABI version explicitly across the function
> > call i.e. 22 for DPDK_21.11. I'd find this ok too on the basis that it's
> > largely self explanatory, and can be inserted automatically by the compiler
> > - again reducing chances of errors. [However, I also believe that using
> > sizes is still simpler again, which is why it's still my first choice! :-)]
> > 
> 
> Just to make sure I fully understand your suggestion.
> We will create new struct for each change.
> The function will  stay the same
> For example I had the following:
> 
> Struct base {
>  Uint32_t x;
> }
> 
> Function (struct base *input)
> {
> 	Inner_func (input, sizeof(struct base))
> }
> 
> Now I'm adding new member so it will look like this:
> Struct new {
>  Uint32_t x;
> Uint32_t y;
> }
> 
> When I want to call the function I need to cast
> Function((struct base*) new) 
> 
> Right?
> 
> This means that in both cases the sizeof wil return the same value,
> What am I missing?
>

The scenario is as follows. Suppose we have the initial state as below:

struct x_dev_cfg {
   int x;
};

int
x_dev_cfg(int dev_id, struct x_dev_cfg *cfg)
{
   struct x_dev *dev = x_devs[id];
   // some setup/config may go here
   return dev->configure(cfg, sizeof(cfg)); // sizeof(cfg) == 4
}

Now, supposing we need to add in a new field into the config structure, a
very common occurance. This will indeed break the ABI, so we need to use
ABI versioning, to ensure that apps passing in the old structure, only call
a function which expects the old structure. Therefore, we need a copy of
the old structure, and a function to work on it. This gives this result:

struct x_dev_cfg {
	int x;
	bool flag; // new field;
};

struct x_dev_cfg_v22 { // needed for ABI-versioned function
	int x;
};

/* this function will only be called by *newly-linked* code, which uses
 * the new structure */
int
x_dev_cfg(int dev_id, struct x_dev_cfg *cfg)
{
   struct x_dev *dev = x_devs[id];
   // some setup/config may go here
   return dev->configure(cfg, sizeof(cfg)); // sizeof(cfg) is now 8
}

/* this function is called by apps linked against old version */
int
x_dev_cfg_v22(int dev_id, struct x_dev_cfg_v22 *cfg)
{
   struct x_dev *dev = x_devs[id];
   // some setup/config may go here
   return dev->configure((void *)cfg, sizeof(cfg)); // sizeof(cfg) is still 4
}

With the above library code, we have different functions using the
different structures, so ABI compatibility is preserved - apps passing in a
4-byte struct call a function using the 4-byte struct, while newer apps can
use the 8-byte version.

The final part of the puzzle is then how drivers react to this change.
Originally, all drivers only use "x" in the config structure because that
is all that there is. That will still continue to work fine in the above
case, as both 4-byte and 8-byte structs have the same x value at the same
offset. i.e. no driver updates for x_dev is needed.

On the other hand, if there are drivers that do want/need the new field,
they can also get to use it, but they do need to check for its presence
before they do so, i.e they would work as below:

	if (size_param > struct(x_dev_cfg_v22)) { // or "== struct(x_dev_cfg)"
		// use flags field
	}

Hope this is clear now.

/Bruce
  
Thomas Monjalon Jan. 26, 2022, 11:21 a.m. UTC | #14
26/01/2022 11:52, Bruce Richardson:
> The scenario is as follows. Suppose we have the initial state as below:
> 
> struct x_dev_cfg {
>    int x;
> };
> 
> int
> x_dev_cfg(int dev_id, struct x_dev_cfg *cfg)
> {
>    struct x_dev *dev = x_devs[id];
>    // some setup/config may go here
>    return dev->configure(cfg, sizeof(cfg)); // sizeof(cfg) == 4
> }
> 
> Now, supposing we need to add in a new field into the config structure, a
> very common occurance. This will indeed break the ABI, so we need to use
> ABI versioning, to ensure that apps passing in the old structure, only call
> a function which expects the old structure. Therefore, we need a copy of
> the old structure, and a function to work on it. This gives this result:
> 
> struct x_dev_cfg {
> 	int x;
> 	bool flag; // new field;
> };
> 
> struct x_dev_cfg_v22 { // needed for ABI-versioned function
> 	int x;
> };
> 
> /* this function will only be called by *newly-linked* code, which uses
>  * the new structure */
> int
> x_dev_cfg(int dev_id, struct x_dev_cfg *cfg)
> {
>    struct x_dev *dev = x_devs[id];
>    // some setup/config may go here
>    return dev->configure(cfg, sizeof(cfg)); // sizeof(cfg) is now 8
> }
> 
> /* this function is called by apps linked against old version */
> int
> x_dev_cfg_v22(int dev_id, struct x_dev_cfg_v22 *cfg)
> {
>    struct x_dev *dev = x_devs[id];
>    // some setup/config may go here
>    return dev->configure((void *)cfg, sizeof(cfg)); // sizeof(cfg) is still 4
> }
> 
> With the above library code, we have different functions using the
> different structures, so ABI compatibility is preserved - apps passing in a
> 4-byte struct call a function using the 4-byte struct, while newer apps can
> use the 8-byte version.
> 
> The final part of the puzzle is then how drivers react to this change.
> Originally, all drivers only use "x" in the config structure because that
> is all that there is. That will still continue to work fine in the above
> case, as both 4-byte and 8-byte structs have the same x value at the same
> offset. i.e. no driver updates for x_dev is needed.
> 
> On the other hand, if there are drivers that do want/need the new field,
> they can also get to use it, but they do need to check for its presence
> before they do so, i.e they would work as below:
> 
> 	if (size_param > struct(x_dev_cfg_v22)) { // or "== struct(x_dev_cfg)"
> 		// use flags field
> 	}
> 
> Hope this is clear now.

Yes, this is the kind of explanation we need in our guideline doc.
Alternatives can be documented as well.
If we can list pros/cons in the doc, it will be easier to choose
the best approach and to explain the choice during code review.
  
Ori Kam Jan. 26, 2022, 12:19 p.m. UTC | #15
> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Wednesday, January 26, 2022 1:22 PM
> Subject: Re: [PATCH v2 01/10] ethdev: introduce flow pre-configuration hints
> 
> 26/01/2022 11:52, Bruce Richardson:
> > The scenario is as follows. Suppose we have the initial state as below:
> >
> > struct x_dev_cfg {
> >    int x;
> > };
> >
> > int
> > x_dev_cfg(int dev_id, struct x_dev_cfg *cfg)
> > {
> >    struct x_dev *dev = x_devs[id];
> >    // some setup/config may go here
> >    return dev->configure(cfg, sizeof(cfg)); // sizeof(cfg) == 4
> > }
> >
> > Now, supposing we need to add in a new field into the config structure, a
> > very common occurance. This will indeed break the ABI, so we need to use
> > ABI versioning, to ensure that apps passing in the old structure, only call
> > a function which expects the old structure. Therefore, we need a copy of
> > the old structure, and a function to work on it. This gives this result:
> >
> > struct x_dev_cfg {
> > 	int x;
> > 	bool flag; // new field;
> > };
> >
> > struct x_dev_cfg_v22 { // needed for ABI-versioned function
> > 	int x;
> > };
> >
> > /* this function will only be called by *newly-linked* code, which uses
> >  * the new structure */
> > int
> > x_dev_cfg(int dev_id, struct x_dev_cfg *cfg)
> > {
> >    struct x_dev *dev = x_devs[id];
> >    // some setup/config may go here
> >    return dev->configure(cfg, sizeof(cfg)); // sizeof(cfg) is now 8
> > }
> >
> > /* this function is called by apps linked against old version */
> > int
> > x_dev_cfg_v22(int dev_id, struct x_dev_cfg_v22 *cfg)
> > {
> >    struct x_dev *dev = x_devs[id];
> >    // some setup/config may go here
> >    return dev->configure((void *)cfg, sizeof(cfg)); // sizeof(cfg) is still 4
> > }
> >
> > With the above library code, we have different functions using the
> > different structures, so ABI compatibility is preserved - apps passing in a
> > 4-byte struct call a function using the 4-byte struct, while newer apps can
> > use the 8-byte version.
> >
> > The final part of the puzzle is then how drivers react to this change.
> > Originally, all drivers only use "x" in the config structure because that
> > is all that there is. That will still continue to work fine in the above
> > case, as both 4-byte and 8-byte structs have the same x value at the same
> > offset. i.e. no driver updates for x_dev is needed.
> >
> > On the other hand, if there are drivers that do want/need the new field,
> > they can also get to use it, but they do need to check for its presence
> > before they do so, i.e they would work as below:
> >
> > 	if (size_param > struct(x_dev_cfg_v22)) { // or "== struct(x_dev_cfg)"
> > 		// use flags field
> > 	}
> >
> > Hope this is clear now.
> 
> Yes, this is the kind of explanation we need in our guideline doc.
> Alternatives can be documented as well.
> If we can list pros/cons in the doc, it will be easier to choose
> the best approach and to explain the choice during code review.
> 
> 
Thanks you very much for the clear explanation.

The draw back is that we need also to duplicate the functions.
Using the flags/version we only need to create new structures
and from application point of view it knows what exta fields it gets.
(I agree that application knowledge has downsides but also advantages)

In the case of flags/version your example will look like this (this is for the record and may other
developers are intrested):

struct x_dev_cfg {  //original struct
	int ver;
	int x;
};
 
struct x_dev_cfg_v2 { // new struct
	int ver;
 	int x;
	bool flag; // new field;
 };


The function is always the same function:
 x_dev_cfg(int dev_id, struct x_dev_cfg *cfg)
 {
    struct x_dev *dev = x_devs[id];
    // some setup/config may go here
    return dev->configure(cfg); 
 }

When calling this function with old struct:
X_dev_cfg(id, (struct x_dev_cfg *)cfg)

When calling this function with new struct:
X_dev_cfg(id, (struct x_dev_cfg *)cfg_v2)

In PMD:
If (cfg->ver >= 2)
	// version 2 logic
Else If (cfg->v >=0)
	// base version logic


When using flags it gives even more control since pmd can tell exactly what
features are required.

All options have prons/cons
I vote for the version one.

We can have a poll 😊
Or like Thomas said list pros and cons and each subsystem can
have it own selection.
  
Bruce Richardson Jan. 26, 2022, 1:41 p.m. UTC | #16
On Wed, Jan 26, 2022 at 12:19:43PM +0000, Ori Kam wrote:
> 
> 
> > -----Original Message-----
> > From: Thomas Monjalon <thomas@monjalon.net>
> > Sent: Wednesday, January 26, 2022 1:22 PM
> > Subject: Re: [PATCH v2 01/10] ethdev: introduce flow pre-configuration hints
> > 
> > 26/01/2022 11:52, Bruce Richardson:
> > > The scenario is as follows. Suppose we have the initial state as below:
> > >
> > > struct x_dev_cfg {
> > >    int x;
> > > };
> > >
> > > int
> > > x_dev_cfg(int dev_id, struct x_dev_cfg *cfg)
> > > {
> > >    struct x_dev *dev = x_devs[id];
> > >    // some setup/config may go here
> > >    return dev->configure(cfg, sizeof(cfg)); // sizeof(cfg) == 4
> > > }
> > >
> > > Now, supposing we need to add in a new field into the config structure, a
> > > very common occurance. This will indeed break the ABI, so we need to use
> > > ABI versioning, to ensure that apps passing in the old structure, only call
> > > a function which expects the old structure. Therefore, we need a copy of
> > > the old structure, and a function to work on it. This gives this result:
> > >
> > > struct x_dev_cfg {
> > > 	int x;
> > > 	bool flag; // new field;
> > > };
> > >
> > > struct x_dev_cfg_v22 { // needed for ABI-versioned function
> > > 	int x;
> > > };
> > >
> > > /* this function will only be called by *newly-linked* code, which uses
> > >  * the new structure */
> > > int
> > > x_dev_cfg(int dev_id, struct x_dev_cfg *cfg)
> > > {
> > >    struct x_dev *dev = x_devs[id];
> > >    // some setup/config may go here
> > >    return dev->configure(cfg, sizeof(cfg)); // sizeof(cfg) is now 8
> > > }
> > >
> > > /* this function is called by apps linked against old version */
> > > int
> > > x_dev_cfg_v22(int dev_id, struct x_dev_cfg_v22 *cfg)
> > > {
> > >    struct x_dev *dev = x_devs[id];
> > >    // some setup/config may go here
> > >    return dev->configure((void *)cfg, sizeof(cfg)); // sizeof(cfg) is still 4
> > > }
> > >
> > > With the above library code, we have different functions using the
> > > different structures, so ABI compatibility is preserved - apps passing in a
> > > 4-byte struct call a function using the 4-byte struct, while newer apps can
> > > use the 8-byte version.
> > >
> > > The final part of the puzzle is then how drivers react to this change.
> > > Originally, all drivers only use "x" in the config structure because that
> > > is all that there is. That will still continue to work fine in the above
> > > case, as both 4-byte and 8-byte structs have the same x value at the same
> > > offset. i.e. no driver updates for x_dev is needed.
> > >
> > > On the other hand, if there are drivers that do want/need the new field,
> > > they can also get to use it, but they do need to check for its presence
> > > before they do so, i.e they would work as below:
> > >
> > > 	if (size_param > struct(x_dev_cfg_v22)) { // or "== struct(x_dev_cfg)"
> > > 		// use flags field
> > > 	}
> > >
> > > Hope this is clear now.
> > 
> > Yes, this is the kind of explanation we need in our guideline doc.
> > Alternatives can be documented as well.
> > If we can list pros/cons in the doc, it will be easier to choose
> > the best approach and to explain the choice during code review.
> > 
> > 
> Thanks you very much for the clear explanation.
> 
> The draw back is that we need also to duplicate the functions.
> Using the flags/version we only need to create new structures
> and from application point of view it knows what exta fields it gets.
> (I agree that application knowledge has downsides but also advantages)
> 
> In the case of flags/version your example will look like this (this is for the record and may other
> developers are intrested):
> 
> struct x_dev_cfg {  //original struct
> 	int ver;
> 	int x;
> };
>  
> struct x_dev_cfg_v2 { // new struct
> 	int ver;
>  	int x;
> 	bool flag; // new field;
>  };
> 
> 
> The function is always the same function:
>  x_dev_cfg(int dev_id, struct x_dev_cfg *cfg)
>  {
>     struct x_dev *dev = x_devs[id];
>     // some setup/config may go here
>     return dev->configure(cfg); 
>  }
> 
> When calling this function with old struct:
> X_dev_cfg(id, (struct x_dev_cfg *)cfg)
> 
> When calling this function with new struct:
> X_dev_cfg(id, (struct x_dev_cfg *)cfg_v2)
> 
> In PMD:
> If (cfg->ver >= 2)
> 	// version 2 logic
> Else If (cfg->v >=0)
> 	// base version logic
> 
> 
> When using flags it gives even more control since pmd can tell exactly what
> features are required.
> 
> All options have prons/cons
> I vote for the version one.
> 
> We can have a poll 😊
> Or like Thomas said list pros and cons and each subsystem can
> have it own selection.

The biggest issue I have with this version approach is how is the user
meant to know what version number to put into the structure? When the user
upgrades from one version of DPDK to the next, are they manually to update
their version numbers in all their structures? If they don't, they then may
be mistified if they use the newer fields and find that they "don't work"
because they forgot that they need to update the version field to the newer
version at the same time. The reason I prefer the size field is that it is
impossible for the end user to mess things up, and the entirity of the
mechanism is internal, and hidden from the user.

Regards,
/Bruce
  
Ori Kam Jan. 26, 2022, 3:12 p.m. UTC | #17
> -----Original Message-----
> From: Bruce Richardson <bruce.richardson@intel.com>
> Sent: Wednesday, January 26, 2022 3:41 PM
> To: Ori Kam <orika@nvidia.com>
> Subject: Re: [PATCH v2 01/10] ethdev: introduce flow pre-configuration hints
> 
> On Wed, Jan 26, 2022 at 12:19:43PM +0000, Ori Kam wrote:
> >
> >
> > > -----Original Message-----
> > > From: Thomas Monjalon <thomas@monjalon.net>
> > > Sent: Wednesday, January 26, 2022 1:22 PM
> > > Subject: Re: [PATCH v2 01/10] ethdev: introduce flow pre-configuration hints
> > >
> > > 26/01/2022 11:52, Bruce Richardson:
> > > > The scenario is as follows. Suppose we have the initial state as below:
> > > >
> > > > struct x_dev_cfg {
> > > >    int x;
> > > > };
> > > >
> > > > int
> > > > x_dev_cfg(int dev_id, struct x_dev_cfg *cfg)
> > > > {
> > > >    struct x_dev *dev = x_devs[id];
> > > >    // some setup/config may go here
> > > >    return dev->configure(cfg, sizeof(cfg)); // sizeof(cfg) == 4
> > > > }
> > > >
> > > > Now, supposing we need to add in a new field into the config structure, a
> > > > very common occurance. This will indeed break the ABI, so we need to use
> > > > ABI versioning, to ensure that apps passing in the old structure, only call
> > > > a function which expects the old structure. Therefore, we need a copy of
> > > > the old structure, and a function to work on it. This gives this result:
> > > >
> > > > struct x_dev_cfg {
> > > > 	int x;
> > > > 	bool flag; // new field;
> > > > };
> > > >
> > > > struct x_dev_cfg_v22 { // needed for ABI-versioned function
> > > > 	int x;
> > > > };
> > > >
> > > > /* this function will only be called by *newly-linked* code, which uses
> > > >  * the new structure */
> > > > int
> > > > x_dev_cfg(int dev_id, struct x_dev_cfg *cfg)
> > > > {
> > > >    struct x_dev *dev = x_devs[id];
> > > >    // some setup/config may go here
> > > >    return dev->configure(cfg, sizeof(cfg)); // sizeof(cfg) is now 8
> > > > }
> > > >
> > > > /* this function is called by apps linked against old version */
> > > > int
> > > > x_dev_cfg_v22(int dev_id, struct x_dev_cfg_v22 *cfg)
> > > > {
> > > >    struct x_dev *dev = x_devs[id];
> > > >    // some setup/config may go here
> > > >    return dev->configure((void *)cfg, sizeof(cfg)); // sizeof(cfg) is still 4
> > > > }
> > > >
> > > > With the above library code, we have different functions using the
> > > > different structures, so ABI compatibility is preserved - apps passing in a
> > > > 4-byte struct call a function using the 4-byte struct, while newer apps can
> > > > use the 8-byte version.
> > > >
> > > > The final part of the puzzle is then how drivers react to this change.
> > > > Originally, all drivers only use "x" in the config structure because that
> > > > is all that there is. That will still continue to work fine in the above
> > > > case, as both 4-byte and 8-byte structs have the same x value at the same
> > > > offset. i.e. no driver updates for x_dev is needed.
> > > >
> > > > On the other hand, if there are drivers that do want/need the new field,
> > > > they can also get to use it, but they do need to check for its presence
> > > > before they do so, i.e they would work as below:
> > > >
> > > > 	if (size_param > struct(x_dev_cfg_v22)) { // or "== struct(x_dev_cfg)"
> > > > 		// use flags field
> > > > 	}
> > > >
> > > > Hope this is clear now.
> > >
> > > Yes, this is the kind of explanation we need in our guideline doc.
> > > Alternatives can be documented as well.
> > > If we can list pros/cons in the doc, it will be easier to choose
> > > the best approach and to explain the choice during code review.
> > >
> > >
> > Thanks you very much for the clear explanation.
> >
> > The draw back is that we need also to duplicate the functions.
> > Using the flags/version we only need to create new structures
> > and from application point of view it knows what exta fields it gets.
> > (I agree that application knowledge has downsides but also advantages)
> >
> > In the case of flags/version your example will look like this (this is for the record and may other
> > developers are intrested):
> >
> > struct x_dev_cfg {  //original struct
> > 	int ver;
> > 	int x;
> > };
> >
> > struct x_dev_cfg_v2 { // new struct
> > 	int ver;
> >  	int x;
> > 	bool flag; // new field;
> >  };
> >
> >
> > The function is always the same function:
> >  x_dev_cfg(int dev_id, struct x_dev_cfg *cfg)
> >  {
> >     struct x_dev *dev = x_devs[id];
> >     // some setup/config may go here
> >     return dev->configure(cfg);
> >  }
> >
> > When calling this function with old struct:
> > X_dev_cfg(id, (struct x_dev_cfg *)cfg)
> >
> > When calling this function with new struct:
> > X_dev_cfg(id, (struct x_dev_cfg *)cfg_v2)
> >
> > In PMD:
> > If (cfg->ver >= 2)
> > 	// version 2 logic
> > Else If (cfg->v >=0)
> > 	// base version logic
> >
> >
> > When using flags it gives even more control since pmd can tell exactly what
> > features are required.
> >
> > All options have prons/cons
> > I vote for the version one.
> >
> > We can have a poll 😊
> > Or like Thomas said list pros and cons and each subsystem can
> > have it own selection.
> 
> The biggest issue I have with this version approach is how is the user
> meant to know what version number to put into the structure? When the user
> upgrades from one version of DPDK to the next, are they manually to update
> their version numbers in all their structures? If they don't, they then may
> be mistified if they use the newer fields and find that they "don't work"
> because they forgot that they need to update the version field to the newer
> version at the same time. The reason I prefer the size field is that it is
> impossible for the end user to mess things up, and the entirity of the
> mechanism is internal, and hidden from the user.
> 

The solution is simple when you define new struct in the struct you write what
should be the version number.
You can also define that 0 is the latest one, so application that are are writing code which
is size agnostic will just set 0 all the time.

 
> Regards,
> /Bruce
  
Alexander Kozyrev Jan. 26, 2022, 10:02 p.m. UTC | #18
On Tuesday, January 25, 2022 13:44 Jerin Jacob <jerinjacobk@gmail.com> wrote:
> On Tue, Jan 25, 2022 at 6:58 AM Alexander Kozyrev <akozyrev@nvidia.com>
> wrote:
> >
> > On Monday, January 24, 2022 12:41 Ajit Khaparde
> <ajit.khaparde@broadcom.com> wrote:
> > > On Mon, Jan 24, 2022 at 6:37 AM Jerin Jacob <jerinjacobk@gmail.com>
> > > wrote:
> > > >
> 
> > Ok, I'll adopt this wording in the v3.
> >
> > > > > + *
> > > > > + * @param port_id
> > > > > + *   Port identifier of Ethernet device.
> > > > > + * @param[in] port_attr
> > > > > + *   Port configuration attributes.
> > > > > + * @param[out] error
> > > > > + *   Perform verbose error reporting if not NULL.
> > > > > + *   PMDs initialize this structure in case of error only.
> > > > > + *
> > > > > + * @return
> > > > > + *   0 on success, a negative errno value otherwise and rte_errno is
> set.
> > > > > + */
> > > > > +__rte_experimental
> > > > > +int
> > > > > +rte_flow_configure(uint16_t port_id,
> > > >
> > > > Should we couple, setting resource limit hint to configure function as
> > > > if we add future items in
> > > > configuration, we may pain to manage all state. Instead how about,
> > > > rte_flow_resource_reserve_hint_set()?
> > > +1
> > Port attributes are the hints, PMD can safely ignore anything that is not
> supported/deemed unreasonable.
> > Having several functions to call instead of one configuration function seems
> like a burden to me.
> 
> If we add a lot of features which has different state it will be
> difficult to manage.
> Since it is the slow path and OPTIONAL API. IMO, it should be fine to
> have a separate API for a specific purpose
> to have a clean interface.

This approach contradicts to the DPDK way of configuring devices.
It you look at the rte_eth_dev_configure or rte_eth_rx_queue_setup API
you will see that the configuration is propagated via config structures.
I would like to conform to this approach with my new API as well.

Another question is how to deal with interdependencies with separate hints?
There could be some resources that requires other resources to be present.
Or one resource shares the hardware registers with another one and needs to
be accounted for. That is not easy to do with separate function calls.

> >
> > >
> > > >
> > > >
> > > > > +                  const struct rte_flow_port_attr *port_attr,
> > > > > +                  struct rte_flow_error *error);
> > > >
> > > > I think, we should have _get function to get those limit numbers
> otherwise,
> > > > we can not write portable applications as the return value is  kind of
> > > > boolean now if
> > > > don't define exact values for rte_errno for reasons.
> > > +1
> > We had this discussion in RFC. The limits will vary from NIC to NIC and from
> system to
> > system, depending on hardware capabilities and amount of free memory
> for example.
> > It is easier to reject a configuration with a clear error description as we do
> for flow creation.
> 
> In that case, we can return a "defined" return value or "defined"
> errno to capture this case so that
> the application can make forward progress to differentiate between API
> failed vs dont having enough resources
> and move on.

I think you are right and it will be useful to provide some hardware capabilities.
I'll add something like rte_flow_info_get() to obtain available flow rule resources.
  
Jerin Jacob Jan. 27, 2022, 9:34 a.m. UTC | #19
On Thu, Jan 27, 2022 at 3:32 AM Alexander Kozyrev <akozyrev@nvidia.com> wrote:
>
> On Tuesday, January 25, 2022 13:44 Jerin Jacob <jerinjacobk@gmail.com> wrote:
> > On Tue, Jan 25, 2022 at 6:58 AM Alexander Kozyrev <akozyrev@nvidia.com>
> > wrote:
> > >
> > > On Monday, January 24, 2022 12:41 Ajit Khaparde
> > <ajit.khaparde@broadcom.com> wrote:
> > > > On Mon, Jan 24, 2022 at 6:37 AM Jerin Jacob <jerinjacobk@gmail.com>
> > > > wrote:
> > > > >
> >
> > > Ok, I'll adopt this wording in the v3.
> > >
> > > > > > + *
> > > > > > + * @param port_id
> > > > > > + *   Port identifier of Ethernet device.
> > > > > > + * @param[in] port_attr
> > > > > > + *   Port configuration attributes.
> > > > > > + * @param[out] error
> > > > > > + *   Perform verbose error reporting if not NULL.
> > > > > > + *   PMDs initialize this structure in case of error only.
> > > > > > + *
> > > > > > + * @return
> > > > > > + *   0 on success, a negative errno value otherwise and rte_errno is
> > set.
> > > > > > + */
> > > > > > +__rte_experimental
> > > > > > +int
> > > > > > +rte_flow_configure(uint16_t port_id,
> > > > >
> > > > > Should we couple, setting resource limit hint to configure function as
> > > > > if we add future items in
> > > > > configuration, we may pain to manage all state. Instead how about,
> > > > > rte_flow_resource_reserve_hint_set()?
> > > > +1
> > > Port attributes are the hints, PMD can safely ignore anything that is not
> > supported/deemed unreasonable.
> > > Having several functions to call instead of one configuration function seems
> > like a burden to me.
> >
> > If we add a lot of features which has different state it will be
> > difficult to manage.
> > Since it is the slow path and OPTIONAL API. IMO, it should be fine to
> > have a separate API for a specific purpose
> > to have a clean interface.
>
> This approach contradicts to the DPDK way of configuring devices.
> It you look at the rte_eth_dev_configure or rte_eth_rx_queue_setup API
> you will see that the configuration is propagated via config structures.
> I would like to conform to this approach with my new API as well.

There is a subtle difference,  those are mandatory APIs. i,e application must
call those API to use the subsequent APIs.

I am OK with introducing rte_flow_configure() for such use cases.
Probably, we can add these parameters in rte_flow_configure() for the
new features.
And make it mandatory API for the next ABI to avoid application breakage.

Also, please change git commit to the description for adding  the
configure state
for rte_flow API.

BTW: Your Queue patch[3/3] probably needs to add the nb_queue
parameter to configure.
So the driver knows, the number queue needed upfront like the ethdev API scheme.


>
> Another question is how to deal with interdependencies with separate hints?
> There could be some resources that requires other resources to be present.
> Or one resource shares the hardware registers with another one and needs to
> be accounted for. That is not easy to do with separate function calls.

I got the use case now.

>
> > >
> > > >
> > > > >
> > > > >
> > > > > > +                  const struct rte_flow_port_attr *port_attr,
> > > > > > +                  struct rte_flow_error *error);
> > > > >
> > > > > I think, we should have _get function to get those limit numbers
> > otherwise,
> > > > > we can not write portable applications as the return value is  kind of
> > > > > boolean now if
> > > > > don't define exact values for rte_errno for reasons.
> > > > +1
> > > We had this discussion in RFC. The limits will vary from NIC to NIC and from
> > system to
> > > system, depending on hardware capabilities and amount of free memory
> > for example.
> > > It is easier to reject a configuration with a clear error description as we do
> > for flow creation.
> >
> > In that case, we can return a "defined" return value or "defined"
> > errno to capture this case so that
> > the application can make forward progress to differentiate between API
> > failed vs dont having enough resources
> > and move on.
>
> I think you are right and it will be useful to provide some hardware capabilities.
> I'll add something like rte_flow_info_get() to obtain available flow rule resources.

Ack.
  

Patch

diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index b4aa9c47c2..86f8c8bda2 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -3589,6 +3589,43 @@  Return values:
 
 - 0 on success, a negative errno value otherwise and ``rte_errno`` is set.
 
+Rules management configuration
+------------------------------
+
+Configure flow rules management.
+
+An application may provide some hints at the initialization phase about
+rules management configuration and/or expected flow rules characteristics.
+These hints may be used by PMD to pre-allocate resources and configure NIC.
+
+Configuration
+~~~~~~~~~~~~~
+
+This function performs the flow rules management configuration and
+pre-allocates needed resources beforehand to avoid costly allocations later.
+Hints about the expected number of counters or meters in an application,
+for example, allow PMD to prepare and optimize NIC memory layout in advance.
+``rte_flow_configure()`` must be called before any flow rule is created,
+but after an Ethernet device is configured.
+
+.. code-block:: c
+
+   int
+   rte_flow_configure(uint16_t port_id,
+                     const struct rte_flow_port_attr *port_attr,
+                     struct rte_flow_error *error);
+
+Arguments:
+
+- ``port_id``: port identifier of Ethernet device.
+- ``port_attr``: port attributes for flow management library.
+- ``error``: perform verbose error reporting if not NULL. PMDs initialize
+  this structure in case of error only.
+
+Return values:
+
+- 0 on success, a negative errno value otherwise and ``rte_errno`` is set.
+
 .. _flow_isolated_mode:
 
 Flow isolated mode
diff --git a/doc/guides/rel_notes/release_22_03.rst b/doc/guides/rel_notes/release_22_03.rst
index 16c66c0641..71b3f0a651 100644
--- a/doc/guides/rel_notes/release_22_03.rst
+++ b/doc/guides/rel_notes/release_22_03.rst
@@ -55,6 +55,8 @@  New Features
      Also, make sure to start the actual text at the margin.
      =======================================================
 
+* ethdev: Added ``rte_flow_configure`` API to configure Flow Management
+  library, allowing to pre-allocate some resources for better performance.
 
 Removed Items
 -------------
diff --git a/lib/ethdev/rte_flow.c b/lib/ethdev/rte_flow.c
index a93f68abbc..5b78780ef9 100644
--- a/lib/ethdev/rte_flow.c
+++ b/lib/ethdev/rte_flow.c
@@ -1391,3 +1391,23 @@  rte_flow_flex_item_release(uint16_t port_id,
 	ret = ops->flex_item_release(dev, handle, error);
 	return flow_err(port_id, ret, error);
 }
+
+int
+rte_flow_configure(uint16_t port_id,
+		   const struct rte_flow_port_attr *port_attr,
+		   struct rte_flow_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
+
+	if (unlikely(!ops))
+		return -rte_errno;
+	if (likely(!!ops->configure)) {
+		return flow_err(port_id,
+				ops->configure(dev, port_attr, error),
+				error);
+	}
+	return rte_flow_error_set(error, ENOTSUP,
+				  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+				  NULL, rte_strerror(ENOTSUP));
+}
diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
index 1031fb246b..e145e68525 100644
--- a/lib/ethdev/rte_flow.h
+++ b/lib/ethdev/rte_flow.h
@@ -4853,6 +4853,69 @@  rte_flow_flex_item_release(uint16_t port_id,
 			   const struct rte_flow_item_flex_handle *handle,
 			   struct rte_flow_error *error);
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Flow engine port configuration attributes.
+ */
+__extension__
+struct rte_flow_port_attr {
+	/**
+	 * Version of the struct layout, should be 0.
+	 */
+	uint32_t version;
+	/**
+	 * Number of counter actions pre-configured.
+	 * If set to 0, PMD will allocate counters dynamically.
+	 * @see RTE_FLOW_ACTION_TYPE_COUNT
+	 */
+	uint32_t nb_counters;
+	/**
+	 * Number of aging actions pre-configured.
+	 * If set to 0, PMD will allocate aging dynamically.
+	 * @see RTE_FLOW_ACTION_TYPE_AGE
+	 */
+	uint32_t nb_aging;
+	/**
+	 * Number of traffic metering actions pre-configured.
+	 * If set to 0, PMD will allocate meters dynamically.
+	 * @see RTE_FLOW_ACTION_TYPE_METER
+	 */
+	uint32_t nb_meters;
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Configure flow rules module.
+ * To pre-allocate resources as per the flow port attributes
+ * this configuration function must be called before any flow rule is created.
+ * Must be called only after Ethernet device is configured, but may be called
+ * before or after the device is started as long as there are no flow rules.
+ * No other rte_flow function should be called while this function is invoked.
+ * This function can be called again to change the configuration.
+ * Some PMDs may not support re-configuration at all,
+ * or may only allow increasing the number of resources allocated.
+ *
+ * @param port_id
+ *   Port identifier of Ethernet device.
+ * @param[in] port_attr
+ *   Port configuration attributes.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL.
+ *   PMDs initialize this structure in case of error only.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+__rte_experimental
+int
+rte_flow_configure(uint16_t port_id,
+		   const struct rte_flow_port_attr *port_attr,
+		   struct rte_flow_error *error);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/ethdev/rte_flow_driver.h b/lib/ethdev/rte_flow_driver.h
index f691b04af4..5f722f1a39 100644
--- a/lib/ethdev/rte_flow_driver.h
+++ b/lib/ethdev/rte_flow_driver.h
@@ -152,6 +152,11 @@  struct rte_flow_ops {
 		(struct rte_eth_dev *dev,
 		 const struct rte_flow_item_flex_handle *handle,
 		 struct rte_flow_error *error);
+	/** See rte_flow_configure() */
+	int (*configure)
+		(struct rte_eth_dev *dev,
+		 const struct rte_flow_port_attr *port_attr,
+		 struct rte_flow_error *err);
 };
 
 /**
diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
index c2fb0669a4..7645796739 100644
--- a/lib/ethdev/version.map
+++ b/lib/ethdev/version.map
@@ -256,6 +256,9 @@  EXPERIMENTAL {
 	rte_flow_flex_item_create;
 	rte_flow_flex_item_release;
 	rte_flow_pick_transfer_proxy;
+
+	# added in 22.03
+	rte_flow_configure;
 };
 
 INTERNAL {