diff mbox series

ethdev: introduce generic flow item and action

Message ID	20230802173451.3151646-1-qi.z.zhang@intel.com (mailing list archive)
State	Superseded, archived
Delegated to:	Ferruh Yigit
Headers	From: Qi Zhang <qi.z.zhang@intel.com> To: thomas@monjalon.net, orika@nvidia.com, david.marchand@redhat.com, bruce.richardson@intel.com, jerinj@marvell.com, ferruh.yigit@amd.com Cc: cristian.dumiterscu@intel.com, techboard@dpdk.org, john.mcnamara@intel.com, helin.zhang@intel.com, dev@dpdk.org, Cristian Dumitrescu <cristian.dumitrescu@intel.com>, Qi Zhang <qi.z.zhang@intel.com> Subject: [PATCH] ethdev: introduce generic flow item and action Date: Wed, 2 Aug 2023 13:34:51 -0400 Message-Id: <20230802173451.3151646-1-qi.z.zhang@intel.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: list Errors-To: dev-bounces@dpdk.org
Series	ethdev: introduce generic flow item and action \| ethdev: introduce generic flow item and action

Checks

Context	Check	Description
ci/checkpatch	success	coding style OK
ci/loongarch-compilation	success	Compilation OK
ci/loongarch-unit-testing	success	Unit Testing PASS
ci/github-robot: build	success	github build: passed
ci/Intel-compilation	success	Compilation OK
ci/intel-Testing	success	Testing PASS
ci/intel-Functional	success	Functional PASS

Commit Message

Qi Zhang Aug. 2, 2023, 5:34 p.m. UTC

  From: Cristian Dumitrescu <cristian.dumitrescu@intel.com>

For network devices that are programmable through languages such as
the P4 language, there are no pre-defined flow items and actions.

The format of the protocol header and metadata fields that are used to
specify the flow items that make up the flow pattern, as well as the
flow actions, are all defined by the program, with an infinity of
possible combinations, as opposed to being selected from a finite
pre-defined list.

It is virtually impossible to pre-define all the flow items and the
flow actions that programs might ever use, as these are only limited
by the set of HW resources and the program developer's imagination.

To support the programmable network devices, we are introducing:

* A generic flow item: The flow item is expressed as an array of bytes
of a given length, whose meaning is defined by the program loaded by
the network device. The device is still expected to accept the
existing pre-defined flow items such as Ethernet, IPv4/IPv6 headers,
etc, as long as the program currently loaded on the device is defining
and using flow items with identical format, but the device is not
limited to the current set of pre-defined RTE_FLOW flow items.

* A generic flow action: The flow action exact processing is defined
by the program loaded by the network device, with the user expected to
know the set of program actions for the purpose of assigning actions
to flows. The device is still expected to accept the existing
pre-defined flow actions such as packet drop, packet redirection to
output queues, packet modifications, etc, as long as the program
currently loaded on the device is defining and using flow actions that
perform identical processing, but the device is not limited to the
current set of pre-defined RTE_FLOW flow actions.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
---
 lib/ethdev/rte_flow.h | 82 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 82 insertions(+)

Comments

Qi Zhang Aug. 2, 2023, 9:37 a.m. UTC | #1

> -----Original Message-----
> From: Zhang, Qi Z <qi.z.zhang@intel.com>
> Sent: Thursday, August 3, 2023 1:35 AM
> To: thomas@monjalon.net; orika@nvidia.com; david.marchand@redhat.com;
> Richardson, Bruce <bruce.richardson@intel.com>; jerinj@marvell.com;
> ferruh.yigit@amd.com
> Cc: cristian.dumiterscu@intel.com; techboard@dpdk.org; Mcnamara, John
> <john.mcnamara@intel.com>; Zhang, Helin <helin.zhang@intel.com>;
> dev@dpdk.org; Dumitrescu, Cristian <cristian.dumitrescu@intel.com>; Zhang,
> Qi Z <qi.z.zhang@intel.com>
> Subject: [PATCH] ethdev: introduce generic flow item and action

Sorry, this is supposed to be an RFC

> 
> From: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> 
> For network devices that are programmable through languages such as the P4
> language, there are no pre-defined flow items and actions.
> 
> The format of the protocol header and metadata fields that are used to specify
> the flow items that make up the flow pattern, as well as the flow actions, are
> all defined by the program, with an infinity of possible combinations, as
> opposed to being selected from a finite pre-defined list.
> 
> It is virtually impossible to pre-define all the flow items and the flow actions
> that programs might ever use, as these are only limited by the set of HW
> resources and the program developer's imagination.
> 
> To support the programmable network devices, we are introducing:
> 
> * A generic flow item: The flow item is expressed as an array of bytes of a given
> length, whose meaning is defined by the program loaded by the network
> device. The device is still expected to accept the existing pre-defined flow items
> such as Ethernet, IPv4/IPv6 headers, etc, as long as the program currently
> loaded on the device is defining and using flow items with identical format, but
> the device is not limited to the current set of pre-defined RTE_FLOW flow items.
> 
> * A generic flow action: The flow action exact processing is defined by the
> program loaded by the network device, with the user expected to know the set
> of program actions for the purpose of assigning actions to flows. The device is
> still expected to accept the existing pre-defined flow actions such as packet
> drop, packet redirection to output queues, packet modifications, etc, as long as
> the program currently loaded on the device is defining and using flow actions
> that perform identical processing, but the device is not limited to the current
> set of pre-defined RTE_FLOW flow actions.
> 
> Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
> ---
>  lib/ethdev/rte_flow.h | 82 +++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 82 insertions(+)
> 
> diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h index
> 3fe57140f9..f7889d7dd0 100644
> --- a/lib/ethdev/rte_flow.h
> +++ b/lib/ethdev/rte_flow.h
> @@ -688,6 +688,15 @@ enum rte_flow_item_type {
>  	 * @see struct rte_flow_item_ib_bth.
>  	 */
>  	RTE_FLOW_ITEM_TYPE_IB_BTH,
> +
> +	/**
> +	 * Matches a custom protocol header or metadata field represented
> +	 * as a byte string of a given length, whose meaning is typically
> +	 * defined by the data plane program.
> +	 *
> +	 * @see struct rte_flow_item_generic.
> +	 */
> +	RTE_FLOW_ITEM_TYPE_GENERIC,
>  };
> 
>  /**
> @@ -2311,6 +2320,32 @@ static const struct rte_flow_item_tx_queue
> rte_flow_item_tx_queue_mask = {  };  #endif
> 
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this structure may change without prior notice
> + *
> + * RTE_FLOW_ITEM_TYPE_GENERIC
> + *
> + * Matches a custom protocol header or metadata field represented as a
> +byte
> + * array of a given length, whose meaning is typically defined by the
> +data
> + * plane program.
> + *
> + * The field length must be non-zero. The field value must be non-NULL,
> +with the
> + * value bytes specified in network byte order.
> + */
> +struct rte_flow_item_generic {
> +	uint32_t length; /**< Field length. */
> +	const uint8_t *value; /**< Field value. */ };
> +
> +/** Default mask for RTE_FLOW_ITEM_TYPE_GENERIC. */ #ifndef __cplusplus
> +static const struct rte_flow_item_generic rte_flow_item_generic_mask = {
> +	.length = 0xffffffff,
> +	.value = NULL,
> +};
> +#endif
> +
>  /**
>   * Action types.
>   *
> @@ -2989,6 +3024,14 @@ enum rte_flow_action_type {
>  	 * @see struct rte_flow_action_indirect_list
>  	 */
>  	RTE_FLOW_ACTION_TYPE_INDIRECT_LIST,
> +
> +	/**
> +	 * Custom action whose processing is typically defined by the data
> plane
> +	 * program.
> +	 *
> +	 * @see struct rte_flow_action_generic.
> +	 */
> +	RTE_FLOW_ACTION_TYPE_GENERIC,
>  };
> 
>  /**
> @@ -4064,6 +4107,45 @@ struct
> rte_flow_indirect_update_flow_meter_mark {
>  	enum rte_color init_color;
>  };
> 
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this structure may change without prior notice.
> + *
> + * Generic action argument configuration parameters.
> + *
> + * The action argument field length must be non-zero. The action
> +argument field
> + * value must be non-NULL, with the value bytes specified in network byte
> order.
> + *
> + * @see struct rte_flow_action_generic
> + */
> +struct rte_flow_action_generic_argument {
> +	/** Argument field length. */
> +	uint32_t length;
> +	/** Argument field value. */
> +	const uint8_t *value;
> +};
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this structure may change without prior notice.
> + *
> + * RTE_FLOW_ACTION_TYPE_GENERIC
> + *
> + * Generic action configuration parameters.
> + *
> + * Each action can have zero or more arguments.
> + *
> + * @see RTE_FLOW_ACTION_TYPE_GENERIC
> + */
> +struct rte_flow_action_generic {
> +	/** Action ID. */
> +	uint32_t action_id;
> +	/** Number of action arguments. */
> +	uint32_t action_args_num;
> +	/** Action arguments array. */
> +	const struct rte_flow_action_generic_argument *action_args; };
> +
>  /* Mbuf dynamic field offset for metadata. */  extern int32_t
> rte_flow_dynf_metadata_offs;
> 
> --
> 2.31.1

Morten Brørup Aug. 2, 2023, 10:25 a.m. UTC | #2

> From: Qi Zhang [mailto:qi.z.zhang@intel.com]
> Sent: Wednesday, 2 August 2023 19.35
> 
> From: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> 
> For network devices that are programmable through languages such as
> the P4 language, there are no pre-defined flow items and actions.
> 
> The format of the protocol header and metadata fields that are used to
> specify the flow items that make up the flow pattern, as well as the
> flow actions, are all defined by the program, with an infinity of
> possible combinations, as opposed to being selected from a finite
> pre-defined list.
> 
> It is virtually impossible to pre-define all the flow items and the
> flow actions that programs might ever use, as these are only limited
> by the set of HW resources and the program developer's imagination.
> 
> To support the programmable network devices, we are introducing:
> 
> * A generic flow item: The flow item is expressed as an array of bytes
> of a given length, whose meaning is defined by the program loaded by
> the network device.

The flow item is not "generic", it is "opaque": Only the application knows what this flow item does.

I hate the concept for two reasons:
1. The inability for applications to detect which flow items the underlying hardware supports.
2. The risk that vendors will use this instead of introducing new flow item types, available for anyone to implement.

If you proceed anyway, there's a few comments inline below.

> The device is still expected to accept the
> existing pre-defined flow items such as Ethernet, IPv4/IPv6 headers,
> etc, as long as the program currently loaded on the device is defining
> and using flow items with identical format, but the device is not
> limited to the current set of pre-defined RTE_FLOW flow items.
> 
> * A generic flow action: The flow action exact processing is defined
> by the program loaded by the network device, with the user expected to
> know the set of program actions for the purpose of assigning actions
> to flows. The device is still expected to accept the existing
> pre-defined flow actions such as packet drop, packet redirection to
> output queues, packet modifications, etc, as long as the program
> currently loaded on the device is defining and using flow actions that
> perform identical processing, but the device is not limited to the
> current set of pre-defined RTE_FLOW flow actions.
> 
> Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
> ---
>  lib/ethdev/rte_flow.h | 82 +++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 82 insertions(+)
> 
> diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
> index 3fe57140f9..f7889d7dd0 100644
> --- a/lib/ethdev/rte_flow.h
> +++ b/lib/ethdev/rte_flow.h
> @@ -688,6 +688,15 @@ enum rte_flow_item_type {
>  	 * @see struct rte_flow_item_ib_bth.
>  	 */
>  	RTE_FLOW_ITEM_TYPE_IB_BTH,
> +
> +	/**
> +	 * Matches a custom protocol header or metadata field represented
> +	 * as a byte string of a given length, whose meaning is typically
> +	 * defined by the data plane program.
> +	 *
> +	 * @see struct rte_flow_item_generic.
> +	 */
> +	RTE_FLOW_ITEM_TYPE_GENERIC,

Rename: _GENERIC -> _OPAQUE, everywhere.

>  };
> 
>  /**
> @@ -2311,6 +2320,32 @@ static const struct rte_flow_item_tx_queue
> rte_flow_item_tx_queue_mask = {
>  };
>  #endif
> 
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this structure may change without prior notice
> + *
> + * RTE_FLOW_ITEM_TYPE_GENERIC
> + *
> + * Matches a custom protocol header or metadata field represented as a byte
> + * array of a given length, whose meaning is typically defined by the data
> + * plane program.

"byte array" -> "opaque data type"

> + *
> + * The field length must be non-zero. The field value must be non-NULL, with
> the
> + * value bytes specified in network byte order.
> + */
> +struct rte_flow_item_generic {
> +	uint32_t length; /**< Field length. */
> +	const uint8_t *value; /**< Field value. */

Suggestion: Change the value type from "const uint8_t *" to "const void *". It makes it easier for the application to use a hierarchy of structures internally for this.

> +};
> +
> +/** Default mask for RTE_FLOW_ITEM_TYPE_GENERIC. */
> +#ifndef __cplusplus
> +static const struct rte_flow_item_generic rte_flow_item_generic_mask = {
> +	.length = 0xffffffff,
> +	.value = NULL,
> +};
> +#endif
> +
>  /**
>   * Action types.
>   *
> @@ -2989,6 +3024,14 @@ enum rte_flow_action_type {
>  	 * @see struct rte_flow_action_indirect_list
>  	 */
>  	RTE_FLOW_ACTION_TYPE_INDIRECT_LIST,
> +
> +	/**
> +	 * Custom action whose processing is typically defined by the data plane
> +	 * program.
> +	 *
> +	 * @see struct rte_flow_action_generic.
> +	 */
> +	RTE_FLOW_ACTION_TYPE_GENERIC,
>  };
> 
>  /**
> @@ -4064,6 +4107,45 @@ struct rte_flow_indirect_update_flow_meter_mark {
>  	enum rte_color init_color;
>  };
> 
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this structure may change without prior notice.
> + *
> + * Generic action argument configuration parameters.
> + *
> + * The action argument field length must be non-zero. The action argument
> field
> + * value must be non-NULL, with the value bytes specified in network byte
> order.
> + *
> + * @see struct rte_flow_action_generic
> + */
> +struct rte_flow_action_generic_argument {
> +	/** Argument field length. */
> +	uint32_t length;
> +	/** Argument field value. */
> +	const uint8_t *value;
> +};
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this structure may change without prior notice.
> + *
> + * RTE_FLOW_ACTION_TYPE_GENERIC
> + *
> + * Generic action configuration parameters.
> + *
> + * Each action can have zero or more arguments.
> + *
> + * @see RTE_FLOW_ACTION_TYPE_GENERIC
> + */
> +struct rte_flow_action_generic {
> +	/** Action ID. */
> +	uint32_t action_id;
> +	/** Number of action arguments. */
> +	uint32_t action_args_num;
> +	/** Action arguments array. */
> +	const struct rte_flow_action_generic_argument *action_args;
> +};

This is a list of arguments, not one argument. You should append "_array" somewhere in the name, and add a normal (non-list) action, e.g.:

struct rte_flow_action_opaque {
//Or: "struct rte_flow_action_generic", if you want to keep that name.
	/** Action ID. */
	uint32_t action_id;
	/** Argument length. */
	uint32_t length;
	/** Argument value. */
	const uint8_t *value;
// Or: "const void *value;" for the same reason as for the flow item.
};


> +
>  /* Mbuf dynamic field offset for metadata. */
>  extern int32_t rte_flow_dynf_metadata_offs;
> 
> --
> 2.31.1

Morten Brørup Aug. 2, 2023, 11:01 a.m. UTC | #3

> From: Morten Brørup [mailto:mb@smartsharesystems.com]
> Sent: Wednesday, 2 August 2023 12.25
> 
> > From: Qi Zhang [mailto:qi.z.zhang@intel.com]
> > Sent: Wednesday, 2 August 2023 19.35
> >
> > From: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> >
> > For network devices that are programmable through languages such as
> > the P4 language, there are no pre-defined flow items and actions.
> >
> > The format of the protocol header and metadata fields that are used to
> > specify the flow items that make up the flow pattern, as well as the
> > flow actions, are all defined by the program, with an infinity of
> > possible combinations, as opposed to being selected from a finite
> > pre-defined list.
> >
> > It is virtually impossible to pre-define all the flow items and the
> > flow actions that programs might ever use, as these are only limited
> > by the set of HW resources and the program developer's imagination.
> >
> > To support the programmable network devices, we are introducing:
> >
> > * A generic flow item: The flow item is expressed as an array of bytes
> > of a given length, whose meaning is defined by the program loaded by
> > the network device.
> 
> The flow item is not "generic", it is "opaque": Only the application knows
> what this flow item does.
> 
> I hate the concept for two reasons:
> 1. The inability for applications to detect which flow items the underlying
> hardware supports.
> 2. The risk that vendors will use this instead of introducing new flow item
> types, available for anyone to implement.

After further consideration, there might be a middle ground.

Consider Vendor-Specific attributes for DHCP and RADIUS, or SNMP MIBs...

Any vendor is free to add his own, proprietary special-purpose attributes, without going through the standardization process. (This is the key challenge this patch seems to be aiming at.)

The vendor might publish these attributes, and other vendors may implement them too.

And in order to prevent collisions, the Vendor-Specific attributes contain a globally unique vendor ID, such as the Private Enterprise Number [1] managed by IANA.

If similar principles can be worked into the patch, I can support it.

Preferably, there should also be a means for applications to query if specific Vendor-Specific flow items and actions are supported or not.


[1]: https://www.iana.org/assignments/enterprise-numbers/

Jerin Jacob Aug. 2, 2023, 11:21 a.m. UTC | #4

On Wed, Aug 2, 2023 at 4:31 PM Morten Brørup <mb@smartsharesystems.com> wrote:
>
> > From: Morten Brørup [mailto:mb@smartsharesystems.com]
> > Sent: Wednesday, 2 August 2023 12.25
> >
> > > From: Qi Zhang [mailto:qi.z.zhang@intel.com]
> > > Sent: Wednesday, 2 August 2023 19.35
> > >
> > > From: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> > >
> > > For network devices that are programmable through languages such as
> > > the P4 language, there are no pre-defined flow items and actions.
> > >
> > > The format of the protocol header and metadata fields that are used to
> > > specify the flow items that make up the flow pattern, as well as the
> > > flow actions, are all defined by the program, with an infinity of
> > > possible combinations, as opposed to being selected from a finite
> > > pre-defined list.
> > >
> > > It is virtually impossible to pre-define all the flow items and the
> > > flow actions that programs might ever use, as these are only limited
> > > by the set of HW resources and the program developer's imagination.
> > >
> > > To support the programmable network devices, we are introducing:
> > >
> > > * A generic flow item: The flow item is expressed as an array of bytes
> > > of a given length, whose meaning is defined by the program loaded by
> > > the network device.
> >
> > The flow item is not "generic", it is "opaque": Only the application knows
> > what this flow item does.
> >
> > I hate the concept for two reasons:
> > 1. The inability for applications to detect which flow items the underlying
> > hardware supports.
> > 2. The risk that vendors will use this instead of introducing new flow item
> > types, available for anyone to implement.
>
> After further consideration, there might be a middle ground.
>
> Consider Vendor-Specific attributes for DHCP and RADIUS, or SNMP MIBs...
>
> Any vendor is free to add his own, proprietary special-purpose attributes, without going through the standardization process. (This is the key challenge this patch seems to be aiming at.)
>
> The vendor might publish these attributes, and other vendors may implement them too.
>
> And in order to prevent collisions, the Vendor-Specific attributes contain a globally unique vendor ID, such as the Private Enterprise Number [1] managed by IANA.
>
> If similar principles can be worked into the patch, I can support it.

+1


>
> Preferably, there should also be a means for applications to query if specific Vendor-Specific flow items and actions are supported or not.
>
>
> [1]: https://www.iana.org/assignments/enterprise-numbers/
>

Cristian Dumitrescu Aug. 2, 2023, 2:06 p.m. UTC | #5

Hi Morten,

Thanks for your reply, comments inline below.

> -----Original Message-----
> From: Morten Brørup <mb@smartsharesystems.com>
> Sent: Wednesday, August 2, 2023 11:25 AM
> To: Zhang, Qi Z <qi.z.zhang@intel.com>; thomas@monjalon.net;
> orika@nvidia.com; david.marchand@redhat.com; Richardson, Bruce
> <bruce.richardson@intel.com>; jerinj@marvell.com; ferruh.yigit@amd.com
> Cc: cristian.dumiterscu@intel.com; techboard@dpdk.org; Mcnamara, John
> <john.mcnamara@intel.com>; Zhang, Helin <helin.zhang@intel.com>;
> dev@dpdk.org; Dumitrescu, Cristian <cristian.dumitrescu@intel.com>
> Subject: RE: [PATCH] ethdev: introduce generic flow item and action
> 
> > From: Qi Zhang [mailto:qi.z.zhang@intel.com]
> > Sent: Wednesday, 2 August 2023 19.35
> >
> > From: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> >
> > For network devices that are programmable through languages such as
> > the P4 language, there are no pre-defined flow items and actions.
> >
> > The format of the protocol header and metadata fields that are used to
> > specify the flow items that make up the flow pattern, as well as the
> > flow actions, are all defined by the program, with an infinity of
> > possible combinations, as opposed to being selected from a finite
> > pre-defined list.
> >
> > It is virtually impossible to pre-define all the flow items and the
> > flow actions that programs might ever use, as these are only limited
> > by the set of HW resources and the program developer's imagination.
> >
> > To support the programmable network devices, we are introducing:
> >
> > * A generic flow item: The flow item is expressed as an array of bytes
> > of a given length, whose meaning is defined by the program loaded by
> > the network device.
> 
> The flow item is not "generic", it is "opaque": Only the application knows what
this flow item does.

This item is definitely not opaque, as it does not contain any pointers/handles.
It actually contains the exact bytes of the field in clear text, so how is this opaque?

We already have flow items in RTE_FLOW such as RAW, FUZZY, META, TAG, FLEX
whose meaning is fully defined by the application, as you state, with some of them
(like FLEX) containing pointers or indices that are completely "opaque". We could
use one of these existing items, but we think it would be cleaner to define a new
one that is more aligned to this specific purpose.

The way I would read your comment is that the _format_ of the flow item is
defined by the application. But this is how the P4-programmable HW devices from
Intel and other vendors work: the P4 program loaded by the device (i.e. the data
plane application) defines the _format_ of the flow items.

> 
> I hate the concept for two reasons:
> 1. The inability for applications to detect which flow items the underlying
> hardware supports.

Does RTE_FLOW have any capability to detect what flows are supported by the
underlying HW support? How is this a problem introduced by this proposal?

> 2. The risk that vendors will use this instead of introducing new flow item
> types, available for anyone to implement.

As stated above, there are already many existing flow items in RTE_FLOW that do just 
thisc(and even worse, as some of them contain opaque pointers or indices), such as
RAW, FUZZY, META, TAG, FLEX.

> 
> If you proceed anyway, there's a few comments inline below.
> 
> > The device is still expected to accept the
> > existing pre-defined flow items such as Ethernet, IPv4/IPv6 headers,
> > etc, as long as the program currently loaded on the device is defining
> > and using flow items with identical format, but the device is not
> > limited to the current set of pre-defined RTE_FLOW flow items.
> >
> > * A generic flow action: The flow action exact processing is defined
> > by the program loaded by the network device, with the user expected to
> > know the set of program actions for the purpose of assigning actions
> > to flows. The device is still expected to accept the existing
> > pre-defined flow actions such as packet drop, packet redirection to
> > output queues, packet modifications, etc, as long as the program
> > currently loaded on the device is defining and using flow actions that
> > perform identical processing, but the device is not limited to the
> > current set of pre-defined RTE_FLOW flow actions.
> >
> > Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> > Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
> > ---
> >  lib/ethdev/rte_flow.h | 82
> +++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 82 insertions(+)
> >
> > diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
> > index 3fe57140f9..f7889d7dd0 100644
> > --- a/lib/ethdev/rte_flow.h
> > +++ b/lib/ethdev/rte_flow.h
> > @@ -688,6 +688,15 @@ enum rte_flow_item_type {
> >  	 * @see struct rte_flow_item_ib_bth.
> >  	 */
> >  	RTE_FLOW_ITEM_TYPE_IB_BTH,
> > +
> > +	/**
> > +	 * Matches a custom protocol header or metadata field represented
> > +	 * as a byte string of a given length, whose meaning is typically
> > +	 * defined by the data plane program.
> > +	 *
> > +	 * @see struct rte_flow_item_generic.
> > +	 */
> > +	RTE_FLOW_ITEM_TYPE_GENERIC,
> 
> Rename: _GENERIC -> _OPAQUE, everywhere.
> 
> >  };
> >
> >  /**
> > @@ -2311,6 +2320,32 @@ static const struct rte_flow_item_tx_queue
> > rte_flow_item_tx_queue_mask = {
> >  };
> >  #endif
> >
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this structure may change without prior notice
> > + *
> > + * RTE_FLOW_ITEM_TYPE_GENERIC
> > + *
> > + * Matches a custom protocol header or metadata field represented as a
> byte
> > + * array of a given length, whose meaning is typically defined by the data
> > + * plane program.
> 
> "byte array" -> "opaque data type"

This flow item does not contain any pointer or index with hidden meaning,
this flow item contains the exact byte array in clear text, so it is not opaque.
The only difference is that you don't have a predefined name for its protocol,
as the field format is defined by the P4 program loaded by the HW device.


> 
> > + *
> > + * The field length must be non-zero. The field value must be non-NULL,
> with
> > the
> > + * value bytes specified in network byte order.
> > + */
> > +struct rte_flow_item_generic {
> > +	uint32_t length; /**< Field length. */
> > +	const uint8_t *value; /**< Field value. */
> 
> Suggestion: Change the value type from "const uint8_t *" to "const void *". It
> makes it easier for the application to use a hierarchy of structures internally
> for this.

The "value" should be an array of exactly "length" bytes. Changing the data type to
"void *" is making this field opaque for no good reason IMO. We are simply following
the same pattern used by other flow items such as RAW and FLEX, which are also
specified as array of bytes, right?

> 
> > +};
> > +
> > +/** Default mask for RTE_FLOW_ITEM_TYPE_GENERIC. */
> > +#ifndef __cplusplus
> > +static const struct rte_flow_item_generic rte_flow_item_generic_mask = {
> > +	.length = 0xffffffff,
> > +	.value = NULL,
> > +};
> > +#endif
> > +
> >  /**
> >   * Action types.
> >   *
> > @@ -2989,6 +3024,14 @@ enum rte_flow_action_type {
> >  	 * @see struct rte_flow_action_indirect_list
> >  	 */
> >  	RTE_FLOW_ACTION_TYPE_INDIRECT_LIST,
> > +
> > +	/**
> > +	 * Custom action whose processing is typically defined by the data
> plane
> > +	 * program.
> > +	 *
> > +	 * @see struct rte_flow_action_generic.
> > +	 */
> > +	RTE_FLOW_ACTION_TYPE_GENERIC,
> >  };
> >
> >  /**
> > @@ -4064,6 +4107,45 @@ struct
> rte_flow_indirect_update_flow_meter_mark {
> >  	enum rte_color init_color;
> >  };
> >
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this structure may change without prior notice.
> > + *
> > + * Generic action argument configuration parameters.
> > + *
> > + * The action argument field length must be non-zero. The action argument
> > field
> > + * value must be non-NULL, with the value bytes specified in network byte
> > order.
> > + *
> > + * @see struct rte_flow_action_generic
> > + */
> > +struct rte_flow_action_generic_argument {
> > +	/** Argument field length. */
> > +	uint32_t length;
> > +	/** Argument field value. */
> > +	const uint8_t *value;
> > +};
> > +
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this structure may change without prior notice.
> > + *
> > + * RTE_FLOW_ACTION_TYPE_GENERIC
> > + *
> > + * Generic action configuration parameters.
> > + *
> > + * Each action can have zero or more arguments.
> > + *
> > + * @see RTE_FLOW_ACTION_TYPE_GENERIC
> > + */
> > +struct rte_flow_action_generic {
> > +	/** Action ID. */
> > +	uint32_t action_id;
> > +	/** Number of action arguments. */
> > +	uint32_t action_args_num;
> > +	/** Action arguments array. */
> > +	const struct rte_flow_action_generic_argument *action_args;
> > +};
> 
> This is a list of arguments, not one argument. You should append "_array"
> somewhere in the name, and add a normal (non-list) action, e.g.:
> 
> struct rte_flow_action_opaque {
> //Or: "struct rte_flow_action_generic", if you want to keep that name.
> 	/** Action ID. */
> 	uint32_t action_id;
> 	/** Argument length. */
> 	uint32_t length;
> 	/** Argument value. */
> 	const uint8_t *value;
> // Or: "const void *value;" for the same reason as for the flow item.
> };
> 

Did you somehow overlooked the "struct rte_flow_action_generic" from just
above?

An action can have zero, one or more arguments, which are expressed
in the "action_args" array; we are requiring the user to provide the action
arguments as an array of "struct rte_flow_action_generic" as opposed to
a single raw buffer with all the arguments concatenated there, makes sense?


> 
> > +
> >  /* Mbuf dynamic field offset for metadata. */
> >  extern int32_t rte_flow_dynf_metadata_offs;
> >
> > --
> > 2.31.1

Regards,
Cristian

Cristian Dumitrescu Aug. 2, 2023, 2:06 p.m. UTC | #6

> -----Original Message-----
> From: Jerin Jacob <jerinjacobk@gmail.com>
> Sent: Wednesday, August 2, 2023 12:22 PM
> To: Morten Brørup <mb@smartsharesystems.com>
> Cc: Zhang, Qi Z <qi.z.zhang@intel.com>; thomas@monjalon.net;
> orika@nvidia.com; david.marchand@redhat.com; Richardson, Bruce
> <bruce.richardson@intel.com>; jerinj@marvell.com; ferruh.yigit@amd.com;
> techboard@dpdk.org; Mcnamara, John <john.mcnamara@intel.com>; Zhang,
> Helin <helin.zhang@intel.com>; dev@dpdk.org; Dumitrescu, Cristian
> <cristian.dumitrescu@intel.com>
> Subject: Re: [PATCH] ethdev: introduce generic flow item and action
> 
> On Wed, Aug 2, 2023 at 4:31 PM Morten Brørup
> <mb@smartsharesystems.com> wrote:
> >
> > > From: Morten Brørup [mailto:mb@smartsharesystems.com]
> > > Sent: Wednesday, 2 August 2023 12.25
> > >
> > > > From: Qi Zhang [mailto:qi.z.zhang@intel.com]
> > > > Sent: Wednesday, 2 August 2023 19.35
> > > >
> > > > From: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> > > >
> > > > For network devices that are programmable through languages such as
> > > > the P4 language, there are no pre-defined flow items and actions.
> > > >
> > > > The format of the protocol header and metadata fields that are used to
> > > > specify the flow items that make up the flow pattern, as well as the
> > > > flow actions, are all defined by the program, with an infinity of
> > > > possible combinations, as opposed to being selected from a finite
> > > > pre-defined list.
> > > >
> > > > It is virtually impossible to pre-define all the flow items and the
> > > > flow actions that programs might ever use, as these are only limited
> > > > by the set of HW resources and the program developer's imagination.
> > > >
> > > > To support the programmable network devices, we are introducing:
> > > >
> > > > * A generic flow item: The flow item is expressed as an array of bytes
> > > > of a given length, whose meaning is defined by the program loaded by
> > > > the network device.
> > >
> > > The flow item is not "generic", it is "opaque": Only the application knows
> > > what this flow item does.
> > >
> > > I hate the concept for two reasons:
> > > 1. The inability for applications to detect which flow items the underlying
> > > hardware supports.
> > > 2. The risk that vendors will use this instead of introducing new flow item
> > > types, available for anyone to implement.
> >
> > After further consideration, there might be a middle ground.
> >
> > Consider Vendor-Specific attributes for DHCP and RADIUS, or SNMP MIBs...
> >
> > Any vendor is free to add his own, proprietary special-purpose attributes,
> without going through the standardization process. (This is the key challenge
> this patch seems to be aiming at.)
> >
> > The vendor might publish these attributes, and other vendors may
> implement them too.
> >
> > And in order to prevent collisions, the Vendor-Specific attributes contain a
> globally unique vendor ID, such as the Private Enterprise Number [1]
> managed by IANA.
> >
> > If similar principles can be worked into the patch, I can support it.
> 
> +1
> 

Morten, Jerin,

I think there is a fundamental misunderstanding here: we are not trying to
provide support for some non-standard vendor-specific features here. What
we are trying to do is add generic multi-vendor support in RTE_FLOW for 
P4 programmable network devices, which requires supporting flow items
and actions that are defined directly by the user through their P4 programs
as opposed to being selected from a pre-defined list.

There are an infinity of flow items and actions that the users can define through
their P4 programs, and they cannot be supported with a finite list of RTE_FLOW
items and actions:

1/ Some flow items map directly to the IETF defined protocols, while some
others do not, and only the user writing the program knows the exact answer;

2/ Some flow items are simply application-specific (not vendor specific)
meta-data that (I hope we all accept) is outside of the standardization
process.

3/ Some flow actions map directly to the existing RTE_FLOW actions (especially
the more straightforward actions such as: packet drop, packet redirection to an
output queue, some specific packet modifications, etc), while the vast majority
of possible actions do not.

Are you saying that the P4 programmable network devices should NOT be
supported by DPDK and RTE_FLOW?

> 
> >
> > Preferably, there should also be a means for applications to query if specific
> Vendor-Specific flow items and actions are supported or not.
> >
> >
> > [1]: https://www.iana.org/assignments/enterprise-numbers/
> >

Regards,
Cristian

Morten Brørup Aug. 2, 2023, 2:54 p.m. UTC | #7

> From: Dumitrescu, Cristian [mailto:cristian.dumitrescu@intel.com]
> Sent: Wednesday, 2 August 2023 16.06
> 
> Hi Morten,
> 
> Thanks for your reply, comments inline below.
> 
> > From: Morten Brørup <mb@smartsharesystems.com>
> > Sent: Wednesday, August 2, 2023 11:25 AM
> >
> > > From: Qi Zhang [mailto:qi.z.zhang@intel.com]
> > > Sent: Wednesday, 2 August 2023 19.35
> > >
> > > From: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> > >
> > > For network devices that are programmable through languages such as
> > > the P4 language, there are no pre-defined flow items and actions.
> > >
> > > The format of the protocol header and metadata fields that are used to
> > > specify the flow items that make up the flow pattern, as well as the
> > > flow actions, are all defined by the program, with an infinity of
> > > possible combinations, as opposed to being selected from a finite
> > > pre-defined list.
> > >
> > > It is virtually impossible to pre-define all the flow items and the
> > > flow actions that programs might ever use, as these are only limited
> > > by the set of HW resources and the program developer's imagination.
> > >
> > > To support the programmable network devices, we are introducing:
> > >
> > > * A generic flow item: The flow item is expressed as an array of bytes
> > > of a given length, whose meaning is defined by the program loaded by
> > > the network device.
> >
> > The flow item is not "generic", it is "opaque": Only the application knows
> what
> this flow item does.
> 
> This item is definitely not opaque, as it does not contain any
> pointers/handles.
> It actually contains the exact bytes of the field in clear text, so how is
> this opaque?

The flow item is opaque in the sense that the DPDK library has no way to interpret it. E.g., unlike the RAW item, it cannot be debug-dumped to some human readable text format by a future DPDK library function.

> 
> We already have flow items in RTE_FLOW such as RAW, FUZZY, META, TAG, FLEX
> whose meaning is fully defined by the application, as you state, with some of
> them
> (like FLEX) containing pointers or indices that are completely "opaque". We
> could
> use one of these existing items, but we think it would be cleaner to define a
> new
> one that is more aligned to this specific purpose.
> 
> The way I would read your comment is that the _format_ of the flow item is
> defined by the application. But this is how the P4-programmable HW devices
> from
> Intel and other vendors work: the P4 program loaded by the device (i.e. the
> data
> plane application) defines the _format_ of the flow items.

Yes, the _format_ is opaque to DPDK, and only understood by the application.

> 
> >
> > I hate the concept for two reasons:
> > 1. The inability for applications to detect which flow items the underlying
> > hardware supports.
> 
> Does RTE_FLOW have any capability to detect what flows are supported by the
> underlying HW support?

I don't know. But a capability query API could be implemented.

> How is this a problem introduced by this proposal?

With this proposal, querying for capabilities seems impossible.

Maybe not.

> 
> > 2. The risk that vendors will use this instead of introducing new flow item
> > types, available for anyone to implement.
> 
> As stated above, there are already many existing flow items in RTE_FLOW that
> do just
> thisc(and even worse, as some of them contain opaque pointers or indices),
> such as
> RAW, FUZZY, META, TAG, FLEX.
> 
> >
> > If you proceed anyway, there's a few comments inline below.
> >
> > > The device is still expected to accept the
> > > existing pre-defined flow items such as Ethernet, IPv4/IPv6 headers,
> > > etc, as long as the program currently loaded on the device is defining
> > > and using flow items with identical format, but the device is not
> > > limited to the current set of pre-defined RTE_FLOW flow items.
> > >
> > > * A generic flow action: The flow action exact processing is defined
> > > by the program loaded by the network device, with the user expected to
> > > know the set of program actions for the purpose of assigning actions
> > > to flows. The device is still expected to accept the existing
> > > pre-defined flow actions such as packet drop, packet redirection to
> > > output queues, packet modifications, etc, as long as the program
> > > currently loaded on the device is defining and using flow actions that
> > > perform identical processing, but the device is not limited to the
> > > current set of pre-defined RTE_FLOW flow actions.
> > >
> > > Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> > > Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
> > > ---
> > >  lib/ethdev/rte_flow.h | 82
> > +++++++++++++++++++++++++++++++++++++++++++
> > >  1 file changed, 82 insertions(+)
> > >
> > > diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
> > > index 3fe57140f9..f7889d7dd0 100644
> > > --- a/lib/ethdev/rte_flow.h
> > > +++ b/lib/ethdev/rte_flow.h
> > > @@ -688,6 +688,15 @@ enum rte_flow_item_type {
> > >  	 * @see struct rte_flow_item_ib_bth.
> > >  	 */
> > >  	RTE_FLOW_ITEM_TYPE_IB_BTH,
> > > +
> > > +	/**
> > > +	 * Matches a custom protocol header or metadata field represented
> > > +	 * as a byte string of a given length, whose meaning is typically
> > > +	 * defined by the data plane program.
> > > +	 *
> > > +	 * @see struct rte_flow_item_generic.
> > > +	 */
> > > +	RTE_FLOW_ITEM_TYPE_GENERIC,
> >
> > Rename: _GENERIC -> _OPAQUE, everywhere.
> >
> > >  };
> > >
> > >  /**
> > > @@ -2311,6 +2320,32 @@ static const struct rte_flow_item_tx_queue
> > > rte_flow_item_tx_queue_mask = {
> > >  };
> > >  #endif
> > >
> > > +/**
> > > + * @warning
> > > + * @b EXPERIMENTAL: this structure may change without prior notice
> > > + *
> > > + * RTE_FLOW_ITEM_TYPE_GENERIC
> > > + *
> > > + * Matches a custom protocol header or metadata field represented as a
> > byte
> > > + * array of a given length, whose meaning is typically defined by the
> data
> > > + * plane program.
> >
> > "byte array" -> "opaque data type"
> 
> This flow item does not contain any pointer or index with hidden meaning,
> this flow item contains the exact byte array in clear text, so it is not
> opaque.

The meaning of the byte array is opaque. It doesn't matter that the byte array is in clear text.

If you look at the other flow items (e.g. RAW, FLEX), anyone can parse the byte array and tell what it means.

> The only difference is that you don't have a predefined name for its protocol,
> as the field format is defined by the P4 program loaded by the HW device.
> 
> 
> >
> > > + *
> > > + * The field length must be non-zero. The field value must be non-NULL,
> > with
> > > the
> > > + * value bytes specified in network byte order.
> > > + */
> > > +struct rte_flow_item_generic {
> > > +	uint32_t length; /**< Field length. */
> > > +	const uint8_t *value; /**< Field value. */
> >
> > Suggestion: Change the value type from "const uint8_t *" to "const void *".
> It
> > makes it easier for the application to use a hierarchy of structures
> internally
> > for this.
> 
> The "value" should be an array of exactly "length" bytes. Changing the data
> type to
> "void *" is making this field opaque for no good reason IMO. We are simply
> following
> the same pattern used by other flow items such as RAW and FLEX, which are also
> specified as array of bytes, right?

OK. It was only a suggestion. It seems you are doing the same as FLEX here.

NB: I wonder why the flow item points to the value (in RAW and FLEX), instead of directly holding the value in a variable length array... but it's not important.

> 
> >
> > > +};
> > > +
> > > +/** Default mask for RTE_FLOW_ITEM_TYPE_GENERIC. */
> > > +#ifndef __cplusplus
> > > +static const struct rte_flow_item_generic rte_flow_item_generic_mask = {
> > > +	.length = 0xffffffff,
> > > +	.value = NULL,
> > > +};
> > > +#endif
> > > +
> > >  /**
> > >   * Action types.
> > >   *
> > > @@ -2989,6 +3024,14 @@ enum rte_flow_action_type {
> > >  	 * @see struct rte_flow_action_indirect_list
> > >  	 */
> > >  	RTE_FLOW_ACTION_TYPE_INDIRECT_LIST,
> > > +
> > > +	/**
> > > +	 * Custom action whose processing is typically defined by the data
> > plane
> > > +	 * program.
> > > +	 *
> > > +	 * @see struct rte_flow_action_generic.
> > > +	 */
> > > +	RTE_FLOW_ACTION_TYPE_GENERIC,
> > >  };
> > >
> > >  /**
> > > @@ -4064,6 +4107,45 @@ struct
> > rte_flow_indirect_update_flow_meter_mark {
> > >  	enum rte_color init_color;
> > >  };
> > >
> > > +/**
> > > + * @warning
> > > + * @b EXPERIMENTAL: this structure may change without prior notice.
> > > + *
> > > + * Generic action argument configuration parameters.
> > > + *
> > > + * The action argument field length must be non-zero. The action argument
> > > field
> > > + * value must be non-NULL, with the value bytes specified in network byte
> > > order.
> > > + *
> > > + * @see struct rte_flow_action_generic
> > > + */
> > > +struct rte_flow_action_generic_argument {
> > > +	/** Argument field length. */
> > > +	uint32_t length;
> > > +	/** Argument field value. */
> > > +	const uint8_t *value;
> > > +};
> > > +
> > > +/**
> > > + * @warning
> > > + * @b EXPERIMENTAL: this structure may change without prior notice.
> > > + *
> > > + * RTE_FLOW_ACTION_TYPE_GENERIC
> > > + *
> > > + * Generic action configuration parameters.
> > > + *
> > > + * Each action can have zero or more arguments.
> > > + *
> > > + * @see RTE_FLOW_ACTION_TYPE_GENERIC
> > > + */
> > > +struct rte_flow_action_generic {
> > > +	/** Action ID. */
> > > +	uint32_t action_id;
> > > +	/** Number of action arguments. */
> > > +	uint32_t action_args_num;
> > > +	/** Action arguments array. */
> > > +	const struct rte_flow_action_generic_argument *action_args;
> > > +};
> >
> > This is a list of arguments, not one argument. You should append "_array"
> > somewhere in the name, and add a normal (non-list) action, e.g.:
> >
> > struct rte_flow_action_opaque {
> > //Or: "struct rte_flow_action_generic", if you want to keep that name.
> > 	/** Action ID. */
> > 	uint32_t action_id;
> > 	/** Argument length. */
> > 	uint32_t length;
> > 	/** Argument value. */
> > 	const uint8_t *value;
> > // Or: "const void *value;" for the same reason as for the flow item.
> > };
> >
> 
> Did you somehow overlooked the "struct rte_flow_action_generic" from just
> above?
> 
> An action can have zero, one or more arguments, which are expressed
> in the "action_args" array; we are requiring the user to provide the action
> arguments as an array of "struct rte_flow_action_generic" as opposed to
> a single raw buffer with all the arguments concatenated there, makes sense?

Makes sense. I was thinking that you should have two different flow items, one item with a single argument, and one item with an array of arguments. However, it seems that FLEX also does what you do here, i.e. it has one flow item to cover zero, one or more arguments.

> 
> 
> >
> > > +
> > >  /* Mbuf dynamic field offset for metadata. */
> > >  extern int32_t rte_flow_dynf_metadata_offs;
> > >
> > > --
> > > 2.31.1
> 
> Regards,
> Cristian

Morten Brørup Aug. 2, 2023, 3:24 p.m. UTC | #8

> From: Dumitrescu, Cristian [mailto:cristian.dumitrescu@intel.com]
> Sent: Wednesday, 2 August 2023 16.06
> 
> > From: Jerin Jacob <jerinjacobk@gmail.com>
> > Sent: Wednesday, August 2, 2023 12:22 PM
> >
> > On Wed, Aug 2, 2023 at 4:31 PM Morten Brørup
> > <mb@smartsharesystems.com> wrote:
> > >
> > > > From: Morten Brørup [mailto:mb@smartsharesystems.com]
> > > > Sent: Wednesday, 2 August 2023 12.25
> > > >
> > > > > From: Qi Zhang [mailto:qi.z.zhang@intel.com]
> > > > > Sent: Wednesday, 2 August 2023 19.35
> > > > >
> > > > > From: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> > > > >
> > > > > For network devices that are programmable through languages such as
> > > > > the P4 language, there are no pre-defined flow items and actions.
> > > > >
> > > > > The format of the protocol header and metadata fields that are used to
> > > > > specify the flow items that make up the flow pattern, as well as the
> > > > > flow actions, are all defined by the program, with an infinity of
> > > > > possible combinations, as opposed to being selected from a finite
> > > > > pre-defined list.
> > > > >
> > > > > It is virtually impossible to pre-define all the flow items and the
> > > > > flow actions that programs might ever use, as these are only limited
> > > > > by the set of HW resources and the program developer's imagination.
> > > > >
> > > > > To support the programmable network devices, we are introducing:
> > > > >
> > > > > * A generic flow item: The flow item is expressed as an array of bytes
> > > > > of a given length, whose meaning is defined by the program loaded by
> > > > > the network device.
> > > >
> > > > The flow item is not "generic", it is "opaque": Only the application
> knows
> > > > what this flow item does.
> > > >
> > > > I hate the concept for two reasons:
> > > > 1. The inability for applications to detect which flow items the
> underlying
> > > > hardware supports.
> > > > 2. The risk that vendors will use this instead of introducing new flow
> item
> > > > types, available for anyone to implement.
> > >
> > > After further consideration, there might be a middle ground.
> > >
> > > Consider Vendor-Specific attributes for DHCP and RADIUS, or SNMP MIBs...
> > >
> > > Any vendor is free to add his own, proprietary special-purpose attributes,
> > without going through the standardization process. (This is the key
> challenge
> > this patch seems to be aiming at.)
> > >
> > > The vendor might publish these attributes, and other vendors may
> > implement them too.
> > >
> > > And in order to prevent collisions, the Vendor-Specific attributes contain
> a
> > globally unique vendor ID, such as the Private Enterprise Number [1]
> > managed by IANA.
> > >
> > > If similar principles can be worked into the patch, I can support it.
> >
> > +1
> >
> 
> Morten, Jerin,
> 
> I think there is a fundamental misunderstanding here: we are not trying to
> provide support for some non-standard vendor-specific features here. What
> we are trying to do is add generic multi-vendor support in RTE_FLOW for
> P4 programmable network devices, which requires supporting flow items
> and actions that are defined directly by the user through their P4 programs
> as opposed to being selected from a pre-defined list.
> 
> There are an infinity of flow items and actions that the users can define
> through
> their P4 programs, and they cannot be supported with a finite list of RTE_FLOW
> items and actions:
> 
> 1/ Some flow items map directly to the IETF defined protocols, while some
> others do not, and only the user writing the program knows the exact answer;
> 
> 2/ Some flow items are simply application-specific (not vendor specific)
> meta-data that (I hope we all accept) is outside of the standardization
> process.

Such items can use a special "reserved" vendor-id.

> 
> 3/ Some flow actions map directly to the existing RTE_FLOW actions (especially
> the more straightforward actions such as: packet drop, packet redirection to
> an
> output queue, some specific packet modifications, etc), while the vast
> majority
> of possible actions do not.
> 
> Are you saying that the P4 programmable network devices should NOT be
> supported by DPDK and RTE_FLOW?

No, I get the need for this. And I understand that since P4 is compiled to hardware-specific binary blobs, there is a need to put such blobs into DPDK as flow items and actions, instead of the "uncompiled" P4 program.

I am suggesting that instead of adding a completely opaque data type:

Struct item {
Int len;      // Length of value in bytes.
Char value[]; // Application specific meaning.
};

...add a semi-opaque data type:

Struct tlv {
Int type;     // Vendor specific type.
Int len;      // Length of value in bytes.
Char value[]; // (Vendor, Type) specific meaning.
};

Struct item {
Int vendor;          // Vendor ID.
Int len;             // Length of values in bytes.
Struct tlv values[]; // Array of TLVs.
};

Like RADIUS Vendor-Specific attributes:
https://datatracker.ietf.org/doc/html/rfc2138#section-5.26

Then some (Vendor, Type) fields can be documented (and thus generally understood by DPDK), and some undocumented.

E.g. like Microsoft documented some of theirs in RFC 2548:
https://datatracker.ietf.org/doc/html/rfc2548


Another benefit is that these new "VENDOR-SPECIFIC" flow types can be reused for other purposes than compiled P4 programs.

> 
> >
> > >
> > > Preferably, there should also be a means for applications to query if
> specific
> > Vendor-Specific flow items and actions are supported or not.
> > >
> > >
> > > [1]: https://www.iana.org/assignments/enterprise-numbers/
> > >
> 
> Regards,
> Cristian

Ori Kam Aug. 2, 2023, 3:47 p.m. UTC | #9

Hi Qi

> -----Original Message-----
> From: Morten Brørup <mb@smartsharesystems.com>
> Sent: Wednesday, August 2, 2023 6:25 PM
> 
> > From: Dumitrescu, Cristian [mailto:cristian.dumitrescu@intel.com]
> > Sent: Wednesday, 2 August 2023 16.06
> >
> > > From: Jerin Jacob <jerinjacobk@gmail.com>
> > > Sent: Wednesday, August 2, 2023 12:22 PM
> > >
> > > On Wed, Aug 2, 2023 at 4:31 PM Morten Brørup
> > > <mb@smartsharesystems.com> wrote:
> > > >
> > > > > From: Morten Brørup [mailto:mb@smartsharesystems.com]
> > > > > Sent: Wednesday, 2 August 2023 12.25
> > > > >
> > > > > > From: Qi Zhang [mailto:qi.z.zhang@intel.com]
> > > > > > Sent: Wednesday, 2 August 2023 19.35
> > > > > >
> > > > > > From: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> > > > > >
> > > > > > For network devices that are programmable through languages such as
> > > > > > the P4 language, there are no pre-defined flow items and actions.
> > > > > >
> > > > > > The format of the protocol header and metadata fields that are used to
> > > > > > specify the flow items that make up the flow pattern, as well as the
> > > > > > flow actions, are all defined by the program, with an infinity of
> > > > > > possible combinations, as opposed to being selected from a finite
> > > > > > pre-defined list.
> > > > > >
> > > > > > It is virtually impossible to pre-define all the flow items and the
> > > > > > flow actions that programs might ever use, as these are only limited
> > > > > > by the set of HW resources and the program developer's imagination.
> > > > > >
> > > > > > To support the programmable network devices, we are introducing:
> > > > > >
> > > > > > * A generic flow item: The flow item is expressed as an array of bytes
> > > > > > of a given length, whose meaning is defined by the program loaded by
> > > > > > the network device.
> > > > >
> > > > > The flow item is not "generic", it is "opaque": Only the application
> > knows
> > > > > what this flow item does.
> > > > >
> > > > > I hate the concept for two reasons:
> > > > > 1. The inability for applications to detect which flow items the
> > underlying
> > > > > hardware supports.
> > > > > 2. The risk that vendors will use this instead of introducing new flow
> > item
> > > > > types, available for anyone to implement.
> > > >
> > > > After further consideration, there might be a middle ground.
> > > >
> > > > Consider Vendor-Specific attributes for DHCP and RADIUS, or SNMP
> MIBs...
> > > >
> > > > Any vendor is free to add his own, proprietary special-purpose attributes,
> > > without going through the standardization process. (This is the key
> > challenge
> > > this patch seems to be aiming at.)
> > > >
> > > > The vendor might publish these attributes, and other vendors may
> > > implement them too.
> > > >
> > > > And in order to prevent collisions, the Vendor-Specific attributes contain
> > a
> > > globally unique vendor ID, such as the Private Enterprise Number [1]
> > > managed by IANA.
> > > >
> > > > If similar principles can be worked into the patch, I can support it.
> > >
> > > +1
> > >
+1 I understand that this is supposed to be generic, but how can it?
how do you know if PMD supports this? 
what if each PMD needs different configurations?

In addition how can you handle number of those action and items?
For example if I have match on protocol X and Y and do actions Z and W
each one of those can be generic item.
if you have a way to define a standard why to read such actions then we have something to talk about.


> >
> > Morten, Jerin,
> >
> > I think there is a fundamental misunderstanding here: we are not trying to
> > provide support for some non-standard vendor-specific features here. What
> > we are trying to do is add generic multi-vendor support in RTE_FLOW for
> > P4 programmable network devices, which requires supporting flow items
> > and actions that are defined directly by the user through their P4 programs
> > as opposed to being selected from a pre-defined list.
> >
> > There are an infinity of flow items and actions that the users can define
> > through
> > their P4 programs, and they cannot be supported with a finite list of
> RTE_FLOW
> > items and actions:
> >
> > 1/ Some flow items map directly to the IETF defined protocols, while some
> > others do not, and only the user writing the program knows the exact answer;
> >
> > 2/ Some flow items are simply application-specific (not vendor specific)
> > meta-data that (I hope we all accept) is outside of the standardization
> > process.
> 
> Such items can use a special "reserved" vendor-id.
> 

Can you show me what items/actions are missing in rte_flow?

> >
> > 3/ Some flow actions map directly to the existing RTE_FLOW actions
> (especially
> > the more straightforward actions such as: packet drop, packet redirection to
> > an
> > output queue, some specific packet modifications, etc), while the vast
> > majority
> > of possible actions do not.
> >
> > Are you saying that the P4 programmable network devices should NOT be
> > supported by DPDK and RTE_FLOW?
> 
> No, I get the need for this. And I understand that since P4 is compiled to
> hardware-specific binary blobs, there is a need to put such blobs into DPDK as
> flow items and actions, instead of the "uncompiled" P4 program.
> 
> I am suggesting that instead of adding a completely opaque data type:
> 
> Struct item {
> Int len;      // Length of value in bytes.
> Char value[]; // Application specific meaning.
> };
> 

But since you didn't define a known protocol for PMD to read the data how 
2 pmds can use the same action?

> ...add a semi-opaque data type:
> 
> Struct tlv {
> Int type;     // Vendor specific type.
> Int len;      // Length of value in bytes.
> Char value[]; // (Vendor, Type) specific meaning.
> };
> 
> Struct item {
> Int vendor;          // Vendor ID.
> Int len;             // Length of values in bytes.
> Struct tlv values[]; // Array of TLVs.
> };
> 
> Like RADIUS Vendor-Specific attributes:
> https://datatracker.ietf.org/doc/html/rfc2138#section-5.26
> 
> Then some (Vendor, Type) fields can be documented (and thus generally
> understood by DPDK), and some undocumented.
> 
> E.g. like Microsoft documented some of theirs in RFC 2548:
> https://datatracker.ietf.org/doc/html/rfc2548
> 
> 
> Another benefit is that these new "VENDOR-SPECIFIC" flow types can be reused
> for other purposes than compiled P4 programs.
> 
> >
> > >
> > > >
> > > > Preferably, there should also be a means for applications to query if
> > specific
> > > Vendor-Specific flow items and actions are supported or not.
> > > >
> > > >
> > > > [1]: https://www.iana.org/assignments/enterprise-numbers/
> > > >
> >
> > Regards,
> > Cristian

Ori Kam Aug. 2, 2023, 4:06 p.m. UTC | #10

Hi Qi,

In addition to my previous email,
I fully support you’re your idea to update the rte_flow API
so it will be easier for P4 integration, I just think the suggested approach is not 
the correct one at least not as appears in the RFC.

I think it will be good if we can discuss some uses cases you are having 
with the API/implementation and see what is the best way to solve them.
The main idea is not to re-invent the wheel, but to solve issues.

To summarize, as I see it there are several issues:
1. no protocol is defined so different PMD can't translate it. 
2. even the same PMD doesn't know what is the action, unless you plan that this will move
directly to the HW, in this case, the action will be HW dependent.
3. when application should use this new action or the old ones. 


Thanks,
Ori

> -----Original Message-----
> From: Ori Kam <orika@nvidia.com>
> Sent: Wednesday, August 2, 2023 6:47 PM
> 
> Hi Qi
> 
> > -----Original Message-----
> > From: Morten Brørup <mb@smartsharesystems.com>
> > Sent: Wednesday, August 2, 2023 6:25 PM
> >
> > > From: Dumitrescu, Cristian [mailto:cristian.dumitrescu@intel.com]
> > > Sent: Wednesday, 2 August 2023 16.06
> > >
> > > > From: Jerin Jacob <jerinjacobk@gmail.com>
> > > > Sent: Wednesday, August 2, 2023 12:22 PM
> > > >
> > > > On Wed, Aug 2, 2023 at 4:31 PM Morten Brørup
> > > > <mb@smartsharesystems.com> wrote:
> > > > >
> > > > > > From: Morten Brørup [mailto:mb@smartsharesystems.com]
> > > > > > Sent: Wednesday, 2 August 2023 12.25
> > > > > >
> > > > > > > From: Qi Zhang [mailto:qi.z.zhang@intel.com]
> > > > > > > Sent: Wednesday, 2 August 2023 19.35
> > > > > > >
> > > > > > > From: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> > > > > > >
> > > > > > > For network devices that are programmable through languages such
> as
> > > > > > > the P4 language, there are no pre-defined flow items and actions.
> > > > > > >
> > > > > > > The format of the protocol header and metadata fields that are used
> to
> > > > > > > specify the flow items that make up the flow pattern, as well as the
> > > > > > > flow actions, are all defined by the program, with an infinity of
> > > > > > > possible combinations, as opposed to being selected from a finite
> > > > > > > pre-defined list.
> > > > > > >
> > > > > > > It is virtually impossible to pre-define all the flow items and the
> > > > > > > flow actions that programs might ever use, as these are only limited
> > > > > > > by the set of HW resources and the program developer's
> imagination.
> > > > > > >
> > > > > > > To support the programmable network devices, we are introducing:
> > > > > > >
> > > > > > > * A generic flow item: The flow item is expressed as an array of bytes
> > > > > > > of a given length, whose meaning is defined by the program loaded
> by
> > > > > > > the network device.
> > > > > >
> > > > > > The flow item is not "generic", it is "opaque": Only the application
> > > knows
> > > > > > what this flow item does.
> > > > > >
> > > > > > I hate the concept for two reasons:
> > > > > > 1. The inability for applications to detect which flow items the
> > > underlying
> > > > > > hardware supports.
> > > > > > 2. The risk that vendors will use this instead of introducing new flow
> > > item
> > > > > > types, available for anyone to implement.
> > > > >
> > > > > After further consideration, there might be a middle ground.
> > > > >
> > > > > Consider Vendor-Specific attributes for DHCP and RADIUS, or SNMP
> > MIBs...
> > > > >
> > > > > Any vendor is free to add his own, proprietary special-purpose
> attributes,
> > > > without going through the standardization process. (This is the key
> > > challenge
> > > > this patch seems to be aiming at.)
> > > > >
> > > > > The vendor might publish these attributes, and other vendors may
> > > > implement them too.
> > > > >
> > > > > And in order to prevent collisions, the Vendor-Specific attributes contain
> > > a
> > > > globally unique vendor ID, such as the Private Enterprise Number [1]
> > > > managed by IANA.
> > > > >
> > > > > If similar principles can be worked into the patch, I can support it.
> > > >
> > > > +1
> > > >
> +1 I understand that this is supposed to be generic, but how can it?
> how do you know if PMD supports this?
> what if each PMD needs different configurations?
> 
> In addition how can you handle number of those action and items?
> For example if I have match on protocol X and Y and do actions Z and W
> each one of those can be generic item.
> if you have a way to define a standard why to read such actions then we have
> something to talk about.
> 
> 
> > >
> > > Morten, Jerin,
> > >
> > > I think there is a fundamental misunderstanding here: we are not trying to
> > > provide support for some non-standard vendor-specific features here. What
> > > we are trying to do is add generic multi-vendor support in RTE_FLOW for
> > > P4 programmable network devices, which requires supporting flow items
> > > and actions that are defined directly by the user through their P4 programs
> > > as opposed to being selected from a pre-defined list.
> > >
> > > There are an infinity of flow items and actions that the users can define
> > > through
> > > their P4 programs, and they cannot be supported with a finite list of
> > RTE_FLOW
> > > items and actions:
> > >
> > > 1/ Some flow items map directly to the IETF defined protocols, while some
> > > others do not, and only the user writing the program knows the exact
> answer;
> > >
> > > 2/ Some flow items are simply application-specific (not vendor specific)
> > > meta-data that (I hope we all accept) is outside of the standardization
> > > process.
> >
> > Such items can use a special "reserved" vendor-id.
> >
> 
> Can you show me what items/actions are missing in rte_flow?
> 
> > >
> > > 3/ Some flow actions map directly to the existing RTE_FLOW actions
> > (especially
> > > the more straightforward actions such as: packet drop, packet redirection to
> > > an
> > > output queue, some specific packet modifications, etc), while the vast
> > > majority
> > > of possible actions do not.
> > >
> > > Are you saying that the P4 programmable network devices should NOT be
> > > supported by DPDK and RTE_FLOW?
> >
> > No, I get the need for this. And I understand that since P4 is compiled to
> > hardware-specific binary blobs, there is a need to put such blobs into DPDK as
> > flow items and actions, instead of the "uncompiled" P4 program.
> >
> > I am suggesting that instead of adding a completely opaque data type:
> >
> > Struct item {
> > Int len;      // Length of value in bytes.
> > Char value[]; // Application specific meaning.
> > };
> >
> 
> But since you didn't define a known protocol for PMD to read the data how
> 2 pmds can use the same action?
> 
> > ...add a semi-opaque data type:
> >
> > Struct tlv {
> > Int type;     // Vendor specific type.
> > Int len;      // Length of value in bytes.
> > Char value[]; // (Vendor, Type) specific meaning.
> > };
> >
> > Struct item {
> > Int vendor;          // Vendor ID.
> > Int len;             // Length of values in bytes.
> > Struct tlv values[]; // Array of TLVs.
> > };
> >
> > Like RADIUS Vendor-Specific attributes:
> > https://datatracker.ietf.org/doc/html/rfc2138#section-5.26
> >
> > Then some (Vendor, Type) fields can be documented (and thus generally
> > understood by DPDK), and some undocumented.
> >
> > E.g. like Microsoft documented some of theirs in RFC 2548:
> > https://datatracker.ietf.org/doc/html/rfc2548
> >
> >
> > Another benefit is that these new "VENDOR-SPECIFIC" flow types can be
> reused
> > for other purposes than compiled P4 programs.
> >
> > >
> > > >
> > > > >
> > > > > Preferably, there should also be a means for applications to query if
> > > specific
> > > > Vendor-Specific flow items and actions are supported or not.
> > > > >
> > > > >
> > > > > [1]: https://www.iana.org/assignments/enterprise-numbers/
> > > > >
> > >
> > > Regards,
> > > Cristian

Cristian Dumitrescu Aug. 2, 2023, 4:14 p.m. UTC | #11

> -----Original Message-----
> From: Morten Brørup <mb@smartsharesystems.com>
> Sent: Wednesday, August 2, 2023 4:25 PM
> To: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>; Jerin Jacob
> <jerinjacobk@gmail.com>
> Cc: Zhang, Qi Z <qi.z.zhang@intel.com>; thomas@monjalon.net;
> orika@nvidia.com; david.marchand@redhat.com; Richardson, Bruce
> <bruce.richardson@intel.com>; jerinj@marvell.com; ferruh.yigit@amd.com;
> techboard@dpdk.org; Mcnamara, John <john.mcnamara@intel.com>; Zhang,
> Helin <helin.zhang@intel.com>; dev@dpdk.org
> Subject: RE: [PATCH] ethdev: introduce generic flow item and action
> 
> > From: Dumitrescu, Cristian [mailto:cristian.dumitrescu@intel.com]
> > Sent: Wednesday, 2 August 2023 16.06
> >
> > > From: Jerin Jacob <jerinjacobk@gmail.com>
> > > Sent: Wednesday, August 2, 2023 12:22 PM
> > >
> > > On Wed, Aug 2, 2023 at 4:31 PM Morten Brørup
> > > <mb@smartsharesystems.com> wrote:
> > > >
> > > > > From: Morten Brørup [mailto:mb@smartsharesystems.com]
> > > > > Sent: Wednesday, 2 August 2023 12.25
> > > > >
> > > > > > From: Qi Zhang [mailto:qi.z.zhang@intel.com]
> > > > > > Sent: Wednesday, 2 August 2023 19.35
> > > > > >
> > > > > > From: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> > > > > >
> > > > > > For network devices that are programmable through languages such
> as
> > > > > > the P4 language, there are no pre-defined flow items and actions.
> > > > > >
> > > > > > The format of the protocol header and metadata fields that are used
> to
> > > > > > specify the flow items that make up the flow pattern, as well as the
> > > > > > flow actions, are all defined by the program, with an infinity of
> > > > > > possible combinations, as opposed to being selected from a finite
> > > > > > pre-defined list.
> > > > > >
> > > > > > It is virtually impossible to pre-define all the flow items and the
> > > > > > flow actions that programs might ever use, as these are only limited
> > > > > > by the set of HW resources and the program developer's
> imagination.
> > > > > >
> > > > > > To support the programmable network devices, we are introducing:
> > > > > >
> > > > > > * A generic flow item: The flow item is expressed as an array of bytes
> > > > > > of a given length, whose meaning is defined by the program loaded
> by
> > > > > > the network device.
> > > > >
> > > > > The flow item is not "generic", it is "opaque": Only the application
> > knows
> > > > > what this flow item does.
> > > > >
> > > > > I hate the concept for two reasons:
> > > > > 1. The inability for applications to detect which flow items the
> > underlying
> > > > > hardware supports.
> > > > > 2. The risk that vendors will use this instead of introducing new flow
> > item
> > > > > types, available for anyone to implement.
> > > >
> > > > After further consideration, there might be a middle ground.
> > > >
> > > > Consider Vendor-Specific attributes for DHCP and RADIUS, or SNMP
> MIBs...
> > > >
> > > > Any vendor is free to add his own, proprietary special-purpose
> attributes,
> > > without going through the standardization process. (This is the key
> > challenge
> > > this patch seems to be aiming at.)
> > > >
> > > > The vendor might publish these attributes, and other vendors may
> > > implement them too.
> > > >
> > > > And in order to prevent collisions, the Vendor-Specific attributes contain
> > a
> > > globally unique vendor ID, such as the Private Enterprise Number [1]
> > > managed by IANA.
> > > >
> > > > If similar principles can be worked into the patch, I can support it.
> > >
> > > +1
> > >
> >
> > Morten, Jerin,
> >
> > I think there is a fundamental misunderstanding here: we are not trying to
> > provide support for some non-standard vendor-specific features here. What
> > we are trying to do is add generic multi-vendor support in RTE_FLOW for
> > P4 programmable network devices, which requires supporting flow items
> > and actions that are defined directly by the user through their P4 programs
> > as opposed to being selected from a pre-defined list.
> >
> > There are an infinity of flow items and actions that the users can define
> > through
> > their P4 programs, and they cannot be supported with a finite list of
> RTE_FLOW
> > items and actions:
> >
> > 1/ Some flow items map directly to the IETF defined protocols, while some
> > others do not, and only the user writing the program knows the exact
> answer;
> >
> > 2/ Some flow items are simply application-specific (not vendor specific)
> > meta-data that (I hope we all accept) is outside of the standardization
> > process.
> 
> Such items can use a special "reserved" vendor-id.
> 
> >
> > 3/ Some flow actions map directly to the existing RTE_FLOW actions
> (especially
> > the more straightforward actions such as: packet drop, packet redirection to
> > an
> > output queue, some specific packet modifications, etc), while the vast
> > majority
> > of possible actions do not.
> >
> > Are you saying that the P4 programmable network devices should NOT be
> > supported by DPDK and RTE_FLOW?
> 
> No, I get the need for this. And I understand that since P4 is compiled to
> hardware-specific binary blobs, there is a need to put such blobs into DPDK as
> flow items and actions, instead of the "uncompiled" P4 program.
> 
> I am suggesting that instead of adding a completely opaque data type:
> 
> Struct item {
> Int len;      // Length of value in bytes.
> Char value[]; // Application specific meaning.
> };
> 
> ...add a semi-opaque data type:
> 
> Struct tlv {
> Int type;     // Vendor specific type.
> Int len;      // Length of value in bytes.
> Char value[]; // (Vendor, Type) specific meaning.
> };
> 
> Struct item {
> Int vendor;          // Vendor ID.
> Int len;             // Length of values in bytes.
> Struct tlv values[]; // Array of TLVs.
> };
> 
> Like RADIUS Vendor-Specific attributes:
> https://datatracker.ietf.org/doc/html/rfc2138#section-5.26
> 
> Then some (Vendor, Type) fields can be documented (and thus generally
> understood by DPDK), and some undocumented.
> 
> E.g. like Microsoft documented some of theirs in RFC 2548:
> https://datatracker.ietf.org/doc/html/rfc2548
> 
> 
> Another benefit is that these new "VENDOR-SPECIFIC" flow types can be
> reused for other purposes than compiled P4 programs.
> 

Hi Morten,

Not sure we managed to level set: It looks to me you are trying to describe
a way to register vendor-specific flow items and actions in a friendlier way;
you are assuming that the meaning of those flow items and actions is well
known to and fully understood by the vendor, correct?

In fact, the key message I am trying to get across is that in the P4 case the
meaning of the flow items and actions is completely unknown to the vendor;
they are only known to the user, who defines them without any involvement
from the vendor through the user-authored P4 program (which represents the
data path application). And the user can change the P4 program at any time,
same as you would change the source code & rebuild & re-run  a CPU app.

Therefore, a "generic" control path application (generic in the sense that it
works with any P4 program) can only handle the flow items and actions in
a generic way: it knows the set of flow items and actions and their details
(such as  the size of each flow item or the arguments and format of each
action), but it does not know, nor care, whether a certain 32-bit flow item
is the RTE_FLOW IPv4 address or not, or a certain action is the RTE_FLOW
"set packet to queue" action or not.

Or am I missing something in what you are saying? Maybe your point is
that for the generic flow item and action we should simply add a numeric or
string-based "type" field to provide an identification/description? Again, the
vendor would not be able to populate it with anything meaningful, as the
flow item & action come from the user transparently to the vendor.

Regards,
Cristian

> >
> > >
> > > >
> > > > Preferably, there should also be a means for applications to query if
> > specific
> > > Vendor-Specific flow items and actions are supported or not.
> > > >
> > > >
> > > > [1]: https://www.iana.org/assignments/enterprise-numbers/
> > > >
> >
> > Regards,
> > Cristian

Cristian Dumitrescu Aug. 2, 2023, 4:56 p.m. UTC | #12

> -----Original Message-----
> From: Ori Kam <orika@nvidia.com>
> Sent: Wednesday, August 2, 2023 4:47 PM
> To: Morten Brørup <mb@smartsharesystems.com>; Dumitrescu, Cristian
> <cristian.dumitrescu@intel.com>; Jerin Jacob <jerinjacobk@gmail.com>
> Cc: Zhang, Qi Z <qi.z.zhang@intel.com>; NBU-Contact-Thomas Monjalon
> (EXTERNAL) <thomas@monjalon.net>; david.marchand@redhat.com;
> Richardson, Bruce <bruce.richardson@intel.com>; jerinj@marvell.com;
> ferruh.yigit@amd.com; techboard@dpdk.org; Mcnamara, John
> <john.mcnamara@intel.com>; Zhang, Helin <helin.zhang@intel.com>;
> dev@dpdk.org
> Subject: RE: [PATCH] ethdev: introduce generic flow item and action
> 
> Hi Qi

Hi Ori,

Thanks for your input!

> 
> > -----Original Message-----
> > From: Morten Brørup <mb@smartsharesystems.com>
> > Sent: Wednesday, August 2, 2023 6:25 PM
> >
> > > From: Dumitrescu, Cristian [mailto:cristian.dumitrescu@intel.com]
> > > Sent: Wednesday, 2 August 2023 16.06
> > >
> > > > From: Jerin Jacob <jerinjacobk@gmail.com>
> > > > Sent: Wednesday, August 2, 2023 12:22 PM
> > > >
> > > > On Wed, Aug 2, 2023 at 4:31 PM Morten Brørup
> > > > <mb@smartsharesystems.com> wrote:
> > > > >
> > > > > > From: Morten Brørup [mailto:mb@smartsharesystems.com]
> > > > > > Sent: Wednesday, 2 August 2023 12.25
> > > > > >
> > > > > > > From: Qi Zhang [mailto:qi.z.zhang@intel.com]
> > > > > > > Sent: Wednesday, 2 August 2023 19.35
> > > > > > >
> > > > > > > From: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> > > > > > >
> > > > > > > For network devices that are programmable through languages
> such as
> > > > > > > the P4 language, there are no pre-defined flow items and actions.
> > > > > > >
> > > > > > > The format of the protocol header and metadata fields that are
> used to
> > > > > > > specify the flow items that make up the flow pattern, as well as the
> > > > > > > flow actions, are all defined by the program, with an infinity of
> > > > > > > possible combinations, as opposed to being selected from a finite
> > > > > > > pre-defined list.
> > > > > > >
> > > > > > > It is virtually impossible to pre-define all the flow items and the
> > > > > > > flow actions that programs might ever use, as these are only
> limited
> > > > > > > by the set of HW resources and the program developer's
> imagination.
> > > > > > >
> > > > > > > To support the programmable network devices, we are introducing:
> > > > > > >
> > > > > > > * A generic flow item: The flow item is expressed as an array of
> bytes
> > > > > > > of a given length, whose meaning is defined by the program loaded
> by
> > > > > > > the network device.
> > > > > >
> > > > > > The flow item is not "generic", it is "opaque": Only the application
> > > knows
> > > > > > what this flow item does.
> > > > > >
> > > > > > I hate the concept for two reasons:
> > > > > > 1. The inability for applications to detect which flow items the
> > > underlying
> > > > > > hardware supports.
> > > > > > 2. The risk that vendors will use this instead of introducing new flow
> > > item
> > > > > > types, available for anyone to implement.
> > > > >
> > > > > After further consideration, there might be a middle ground.
> > > > >
> > > > > Consider Vendor-Specific attributes for DHCP and RADIUS, or SNMP
> > MIBs...
> > > > >
> > > > > Any vendor is free to add his own, proprietary special-purpose
> attributes,
> > > > without going through the standardization process. (This is the key
> > > challenge
> > > > this patch seems to be aiming at.)
> > > > >
> > > > > The vendor might publish these attributes, and other vendors may
> > > > implement them too.
> > > > >
> > > > > And in order to prevent collisions, the Vendor-Specific attributes
> contain
> > > a
> > > > globally unique vendor ID, such as the Private Enterprise Number [1]
> > > > managed by IANA.
> > > > >
> > > > > If similar principles can be worked into the patch, I can support it.
> > > >
> > > > +1
> > > >
> +1 I understand that this is supposed to be generic, but how can it?

What exactly it not generic?

> how do you know if PMD supports this?

What is the current RTE_FLOW mechanism to find out whether a device/PMD
supports a certain flow? The only mechanism available is to try to create the
flow (rte_flow_create) or mimic the creation of the flow (rte_flow_validate)
and see if you get success or error, right? Same mechanism to be used here.
Of course, we would be happy to support the addition of a more complex
query mechanism in RTE_FLOW.

> what if each PMD needs different configurations?

Not sure what you're referring to exactly. The format of the flow items and
flow actions is defined by the P4 program; so all the HW devices that are
able to successfully load a given P4 program, potentially from different
vendors, will have the same understanding of each flow item and action.

>
> In addition how can you handle number of those action and items?
> For example if I have match on protocol X and Y and do actions Z and W
> each one of those can be generic item.
> if you have a way to define a standard why to read such actions then we have
> something to talk about.

Yes, we do. The format of all flow items and flow actions available on the HW
device is fully defined by the P4 program, therefore all the HW devices,
potentially from different vendors, that are able to successfully load a given P4
program, will have the same understanding of them.

> 
> 
> > >
> > > Morten, Jerin,
> > >
> > > I think there is a fundamental misunderstanding here: we are not trying to
> > > provide support for some non-standard vendor-specific features here.
> What
> > > we are trying to do is add generic multi-vendor support in RTE_FLOW for
> > > P4 programmable network devices, which requires supporting flow items
> > > and actions that are defined directly by the user through their P4
> programs
> > > as opposed to being selected from a pre-defined list.
> > >
> > > There are an infinity of flow items and actions that the users can define
> > > through
> > > their P4 programs, and they cannot be supported with a finite list of
> > RTE_FLOW
> > > items and actions:
> > >
> > > 1/ Some flow items map directly to the IETF defined protocols, while some
> > > others do not, and only the user writing the program knows the exact
> answer;
> > >
> > > 2/ Some flow items are simply application-specific (not vendor specific)
> > > meta-data that (I hope we all accept) is outside of the standardization
> > > process.
> >
> > Such items can use a special "reserved" vendor-id.
> >
> 
> Can you show me what items/actions are missing in rte_flow?

The number of flow items (protocol headers, but also metadata) and flow actions
that users can define in their P4 programs is infinite, so unfortunately Ori, as much
as I want to grant you this wish, I am not going to be able to 😊. Also, it is
important to note that the users write their P4 program (defining their data path
pipeline) without any involvement from their device vendor, so the vendor is simply
not aware of meaning of the user's flow items and actions either.

> 
> > >
> > > 3/ Some flow actions map directly to the existing RTE_FLOW actions
> > (especially
> > > the more straightforward actions such as: packet drop, packet redirection
> to
> > > an
> > > output queue, some specific packet modifications, etc), while the vast
> > > majority
> > > of possible actions do not.
> > >
> > > Are you saying that the P4 programmable network devices should NOT be
> > > supported by DPDK and RTE_FLOW?
> >
> > No, I get the need for this. And I understand that since P4 is compiled to
> > hardware-specific binary blobs, there is a need to put such blobs into DPDK
> as
> > flow items and actions, instead of the "uncompiled" P4 program.
> >
> > I am suggesting that instead of adding a completely opaque data type:
> >
> > Struct item {
> > Int len;      // Length of value in bytes.
> > Char value[]; // Application specific meaning.
> > };
> >
> 
> But since you didn't define a known protocol for PMD to read the data how
> 2 pmds can use the same action?

The format of all flow items and flow actions available on the HW device is fully
defined by the P4 program, therefore all the HW devices, even if from different
vendors, will have the same understanding of them.

> 
> > ...add a semi-opaque data type:
> >
> > Struct tlv {
> > Int type;     // Vendor specific type.
> > Int len;      // Length of value in bytes.
> > Char value[]; // (Vendor, Type) specific meaning.
> > };
> >
> > Struct item {
> > Int vendor;          // Vendor ID.
> > Int len;             // Length of values in bytes.
> > Struct tlv values[]; // Array of TLVs.
> > };
> >
> > Like RADIUS Vendor-Specific attributes:
> > https://datatracker.ietf.org/doc/html/rfc2138#section-5.26
> >
> > Then some (Vendor, Type) fields can be documented (and thus generally
> > understood by DPDK), and some undocumented.
> >
> > E.g. like Microsoft documented some of theirs in RFC 2548:
> > https://datatracker.ietf.org/doc/html/rfc2548
> >
> >
> > Another benefit is that these new "VENDOR-SPECIFIC" flow types can be
> reused
> > for other purposes than compiled P4 programs.
> >
> > >
> > > >
> > > > >
> > > > > Preferably, there should also be a means for applications to query if
> > > specific
> > > > Vendor-Specific flow items and actions are supported or not.
> > > > >
> > > > >
> > > > > [1]: https://www.iana.org/assignments/enterprise-numbers/
> > > > >
> > >
> > > Regards,
> > > Cristian

Regards,
Cristian

Cristian Dumitrescu Aug. 2, 2023, 5:22 p.m. UTC | #13

> -----Original Message-----
> From: Ori Kam <orika@nvidia.com>
> Sent: Wednesday, August 2, 2023 5:06 PM
> To: Ori Kam <orika@nvidia.com>; Morten Brørup
> <mb@smartsharesystems.com>; Dumitrescu, Cristian
> <cristian.dumitrescu@intel.com>; Jerin Jacob <jerinjacobk@gmail.com>
> Cc: Zhang, Qi Z <qi.z.zhang@intel.com>; NBU-Contact-Thomas Monjalon
> (EXTERNAL) <thomas@monjalon.net>; david.marchand@redhat.com;
> Richardson, Bruce <bruce.richardson@intel.com>; jerinj@marvell.com;
> ferruh.yigit@amd.com; techboard@dpdk.org; Mcnamara, John
> <john.mcnamara@intel.com>; Zhang, Helin <helin.zhang@intel.com>;
> dev@dpdk.org
> Subject: RE: [PATCH] ethdev: introduce generic flow item and action
> 
> Hi Qi,
> 
> In addition to my previous email,
> I fully support you’re your idea to update the rte_flow API
> so it will be easier for P4 integration, I just think the suggested approach is not
> the correct one at least not as appears in the RFC.
> 
> I think it will be good if we can discuss some uses cases you are having
> with the API/implementation and see what is the best way to solve them.
> The main idea is not to re-invent the wheel, but to solve issues.

Yes, fully agree, it would be great meet and talk through this, as we did it in the
past for other issues. What days & time next week would be good for people?

Meanwhile, some answers below.

> 
> To summarize, as I see it there are several issues:
> 1. no protocol is defined so different PMD can't translate it.

The format of the flow items is defined by the P4 program, so all the HW devices
(from the same or from different vendors) that are able to successfully load the
given P4 program will have the same understanding of the flow items.

> 2. even the same PMD doesn't know what is the action, unless you plan that
> this will move
> directly to the HW, in this case, the action will be HW dependent.

The processing of each flow action, as well as the number of arguments and the
format of each action argument, is defined by the P4 program, so all the HW
devices (from the same or from different vendors) that are able to successfully
load the given P4 program will have the same understanding of the flow actions.

> 3. when application should use this new action or the old ones.

I guess it is good to clarify that there are two application: a data path application
(the P4 program) that defines the packet processing pipeline, and a control path
application that invokes RTE_FLOW to add/delete the flows on the device. I guess
we are now referring to the control path app.

Whenever the P4 program (the data path app) that is currently loaded on the
device is defining and using flow actions that perform identical processing to one
of the existing pre-defined RTE_FLOW actions (such as packet drop, packet
redirection to a given output queue, packet modifications, etc), then the app
(the control path app) can accept these actions as well.

But in the (frequent) case that the user's P4 program defines actions that do not
map to an RTE_FLOW action from the pre-defined list, then the app has no other
option but to use the newly proposed generic flow action in order to specify
(through the action_id field) the exact flow action from the P4 program.

Makes sense?

> 
> 
> Thanks,
> Ori
> 

Regards,
Cristian

> > -----Original Message-----
> > From: Ori Kam <orika@nvidia.com>
> > Sent: Wednesday, August 2, 2023 6:47 PM
> >
> > Hi Qi
> >
> > > -----Original Message-----
> > > From: Morten Brørup <mb@smartsharesystems.com>
> > > Sent: Wednesday, August 2, 2023 6:25 PM
> > >
> > > > From: Dumitrescu, Cristian [mailto:cristian.dumitrescu@intel.com]
> > > > Sent: Wednesday, 2 August 2023 16.06
> > > >
> > > > > From: Jerin Jacob <jerinjacobk@gmail.com>
> > > > > Sent: Wednesday, August 2, 2023 12:22 PM
> > > > >
> > > > > On Wed, Aug 2, 2023 at 4:31 PM Morten Brørup
> > > > > <mb@smartsharesystems.com> wrote:
> > > > > >
> > > > > > > From: Morten Brørup [mailto:mb@smartsharesystems.com]
> > > > > > > Sent: Wednesday, 2 August 2023 12.25
> > > > > > >
> > > > > > > > From: Qi Zhang [mailto:qi.z.zhang@intel.com]
> > > > > > > > Sent: Wednesday, 2 August 2023 19.35
> > > > > > > >
> > > > > > > > From: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> > > > > > > >
> > > > > > > > For network devices that are programmable through languages
> such
> > as
> > > > > > > > the P4 language, there are no pre-defined flow items and actions.
> > > > > > > >
> > > > > > > > The format of the protocol header and metadata fields that are
> used
> > to
> > > > > > > > specify the flow items that make up the flow pattern, as well as
> the
> > > > > > > > flow actions, are all defined by the program, with an infinity of
> > > > > > > > possible combinations, as opposed to being selected from a
> finite
> > > > > > > > pre-defined list.
> > > > > > > >
> > > > > > > > It is virtually impossible to pre-define all the flow items and the
> > > > > > > > flow actions that programs might ever use, as these are only
> limited
> > > > > > > > by the set of HW resources and the program developer's
> > imagination.
> > > > > > > >
> > > > > > > > To support the programmable network devices, we are
> introducing:
> > > > > > > >
> > > > > > > > * A generic flow item: The flow item is expressed as an array of
> bytes
> > > > > > > > of a given length, whose meaning is defined by the program
> loaded
> > by
> > > > > > > > the network device.
> > > > > > >
> > > > > > > The flow item is not "generic", it is "opaque": Only the application
> > > > knows
> > > > > > > what this flow item does.
> > > > > > >
> > > > > > > I hate the concept for two reasons:
> > > > > > > 1. The inability for applications to detect which flow items the
> > > > underlying
> > > > > > > hardware supports.
> > > > > > > 2. The risk that vendors will use this instead of introducing new
> flow
> > > > item
> > > > > > > types, available for anyone to implement.
> > > > > >
> > > > > > After further consideration, there might be a middle ground.
> > > > > >
> > > > > > Consider Vendor-Specific attributes for DHCP and RADIUS, or SNMP
> > > MIBs...
> > > > > >
> > > > > > Any vendor is free to add his own, proprietary special-purpose
> > attributes,
> > > > > without going through the standardization process. (This is the key
> > > > challenge
> > > > > this patch seems to be aiming at.)
> > > > > >
> > > > > > The vendor might publish these attributes, and other vendors may
> > > > > implement them too.
> > > > > >
> > > > > > And in order to prevent collisions, the Vendor-Specific attributes
> contain
> > > > a
> > > > > globally unique vendor ID, such as the Private Enterprise Number [1]
> > > > > managed by IANA.
> > > > > >
> > > > > > If similar principles can be worked into the patch, I can support it.
> > > > >
> > > > > +1
> > > > >
> > +1 I understand that this is supposed to be generic, but how can it?
> > how do you know if PMD supports this?
> > what if each PMD needs different configurations?
> >
> > In addition how can you handle number of those action and items?
> > For example if I have match on protocol X and Y and do actions Z and W
> > each one of those can be generic item.
> > if you have a way to define a standard why to read such actions then we
> have
> > something to talk about.
> >
> >
> > > >
> > > > Morten, Jerin,
> > > >
> > > > I think there is a fundamental misunderstanding here: we are not trying
> to
> > > > provide support for some non-standard vendor-specific features here.
> What
> > > > we are trying to do is add generic multi-vendor support in RTE_FLOW
> for
> > > > P4 programmable network devices, which requires supporting flow
> items
> > > > and actions that are defined directly by the user through their P4
> programs
> > > > as opposed to being selected from a pre-defined list.
> > > >
> > > > There are an infinity of flow items and actions that the users can define
> > > > through
> > > > their P4 programs, and they cannot be supported with a finite list of
> > > RTE_FLOW
> > > > items and actions:
> > > >
> > > > 1/ Some flow items map directly to the IETF defined protocols, while
> some
> > > > others do not, and only the user writing the program knows the exact
> > answer;
> > > >
> > > > 2/ Some flow items are simply application-specific (not vendor specific)
> > > > meta-data that (I hope we all accept) is outside of the standardization
> > > > process.
> > >
> > > Such items can use a special "reserved" vendor-id.
> > >
> >
> > Can you show me what items/actions are missing in rte_flow?
> >
> > > >
> > > > 3/ Some flow actions map directly to the existing RTE_FLOW actions
> > > (especially
> > > > the more straightforward actions such as: packet drop, packet redirection
> to
> > > > an
> > > > output queue, some specific packet modifications, etc), while the vast
> > > > majority
> > > > of possible actions do not.
> > > >
> > > > Are you saying that the P4 programmable network devices should NOT
> be
> > > > supported by DPDK and RTE_FLOW?
> > >
> > > No, I get the need for this. And I understand that since P4 is compiled to
> > > hardware-specific binary blobs, there is a need to put such blobs into
> DPDK as
> > > flow items and actions, instead of the "uncompiled" P4 program.
> > >
> > > I am suggesting that instead of adding a completely opaque data type:
> > >
> > > Struct item {
> > > Int len;      // Length of value in bytes.
> > > Char value[]; // Application specific meaning.
> > > };
> > >
> >
> > But since you didn't define a known protocol for PMD to read the data how
> > 2 pmds can use the same action?
> >
> > > ...add a semi-opaque data type:
> > >
> > > Struct tlv {
> > > Int type;     // Vendor specific type.
> > > Int len;      // Length of value in bytes.
> > > Char value[]; // (Vendor, Type) specific meaning.
> > > };
> > >
> > > Struct item {
> > > Int vendor;          // Vendor ID.
> > > Int len;             // Length of values in bytes.
> > > Struct tlv values[]; // Array of TLVs.
> > > };
> > >
> > > Like RADIUS Vendor-Specific attributes:
> > > https://datatracker.ietf.org/doc/html/rfc2138#section-5.26
> > >
> > > Then some (Vendor, Type) fields can be documented (and thus generally
> > > understood by DPDK), and some undocumented.
> > >
> > > E.g. like Microsoft documented some of theirs in RFC 2548:
> > > https://datatracker.ietf.org/doc/html/rfc2548
> > >
> > >
> > > Another benefit is that these new "VENDOR-SPECIFIC" flow types can be
> > reused
> > > for other purposes than compiled P4 programs.
> > >
> > > >
> > > > >
> > > > > >
> > > > > > Preferably, there should also be a means for applications to query if
> > > > specific
> > > > > Vendor-Specific flow items and actions are supported or not.
> > > > > >
> > > > > >
> > > > > > [1]: https://www.iana.org/assignments/enterprise-numbers/
> > > > > >
> > > >
> > > > Regards,
> > > > Cristian

Morten Brørup Aug. 2, 2023, 5:56 p.m. UTC | #14

> From: Dumitrescu, Cristian [mailto:cristian.dumitrescu@intel.com]
> Sent: Wednesday, 2 August 2023 19.23
> 
> > From: Ori Kam <orika@nvidia.com>
> > Sent: Wednesday, August 2, 2023 5:06 PM
> >
> > Hi Qi,
> >
> > In addition to my previous email,
> > I fully support you’re your idea to update the rte_flow API
> > so it will be easier for P4 integration, I just think the suggested
> approach is not
> > the correct one at least not as appears in the RFC.
> >
> > I think it will be good if we can discuss some uses cases you are
> having
> > with the API/implementation and see what is the best way to solve
> them.
> > The main idea is not to re-invent the wheel, but to solve issues.
> 
> Yes, fully agree, it would be great meet and talk through this, as we
> did it in the
> past for other issues. What days & time next week would be good for
> people?

My calendar is pretty much all open these days, so anytime work hours in the Central European time zone works for me.

> 
> Meanwhile, some answers below.
> 
> >
> > To summarize, as I see it there are several issues:
> > 1. no protocol is defined so different PMD can't translate it.
> 
> The format of the flow items is defined by the P4 program, so all the HW
> devices
> (from the same or from different vendors) that are able to successfully
> load the
> given P4 program will have the same understanding of the flow items.

If the P4 flow items/actions are standardized by some P4 organization or similar, they can be enumerated and defined as DPDK flow items/actions. At least the ones that are standardized.

And if you (for flexibility or other reasons) need to bypass the RTE_FLOW standardization process (getting ACKs etc. on the DPDK mailing list) for faster integration of new DPDK flow items/actions, it does make sense to define a generic flow item (and action) for this purpose (and not just for P4).

In order to avoid conflicts between P4 and non-P4 generic flow items/actions, the generic type should include information about how to interpret the information, which is why I suggest making it a Vendor-Specific type, with vendor-specific TLV's (managed by the vendor), like the RADIUS Vendor-Specific attributes I compared to, instead of just an opaque blob.

The P4 standardized items/actions can use the Vendor ID of the P4 standards organization.

The non-standardized items/actions can use the Vendor ID of the hardware vendor or the application developer.

> 
> > 2. even the same PMD doesn't know what is the action, unless you plan
> that
> > this will move
> > directly to the HW, in this case, the action will be HW dependent.
> 
> The processing of each flow action, as well as the number of arguments
> and the
> format of each action argument, is defined by the P4 program, so all the
> HW
> devices (from the same or from different vendors) that are able to
> successfully
> load the given P4 program will have the same understanding of the flow
> actions.
> 
> > 3. when application should use this new action or the old ones.
> 
> I guess it is good to clarify that there are two application: a data
> path application
> (the P4 program) that defines the packet processing pipeline, and a
> control path
> application that invokes RTE_FLOW to add/delete the flows on the device.
> I guess
> we are now referring to the control path app.
> 
> Whenever the P4 program (the data path app) that is currently loaded on
> the
> device is defining and using flow actions that perform identical
> processing to one
> of the existing pre-defined RTE_FLOW actions (such as packet drop,
> packet
> redirection to a given output queue, packet modifications, etc), then
> the app
> (the control path app) can accept these actions as well.
> 
> But in the (frequent) case that the user's P4 program defines actions
> that do not
> map to an RTE_FLOW action from the pre-defined list, then the app has no
> other
> option but to use the newly proposed generic flow action in order to
> specify
> (through the action_id field) the exact flow action from the P4 program.
> 
> Makes sense?
> 
> >
> >
> > Thanks,
> > Ori
> >
> 
> Regards,
> Cristian
> 
> > > -----Original Message-----
> > > From: Ori Kam <orika@nvidia.com>
> > > Sent: Wednesday, August 2, 2023 6:47 PM
> > >
> > > Hi Qi
> > >
> > > > -----Original Message-----
> > > > From: Morten Brørup <mb@smartsharesystems.com>
> > > > Sent: Wednesday, August 2, 2023 6:25 PM
> > > >
> > > > > From: Dumitrescu, Cristian
> [mailto:cristian.dumitrescu@intel.com]
> > > > > Sent: Wednesday, 2 August 2023 16.06
> > > > >
> > > > > > From: Jerin Jacob <jerinjacobk@gmail.com>
> > > > > > Sent: Wednesday, August 2, 2023 12:22 PM
> > > > > >
> > > > > > On Wed, Aug 2, 2023 at 4:31 PM Morten Brørup
> > > > > > <mb@smartsharesystems.com> wrote:
> > > > > > >
> > > > > > > > From: Morten Brørup [mailto:mb@smartsharesystems.com]
> > > > > > > > Sent: Wednesday, 2 August 2023 12.25
> > > > > > > >
> > > > > > > > > From: Qi Zhang [mailto:qi.z.zhang@intel.com]
> > > > > > > > > Sent: Wednesday, 2 August 2023 19.35
> > > > > > > > >
> > > > > > > > > From: Cristian Dumitrescu
> <cristian.dumitrescu@intel.com>
> > > > > > > > >
> > > > > > > > > For network devices that are programmable through
> languages
> > such
> > > as
> > > > > > > > > the P4 language, there are no pre-defined flow items and
> actions.
> > > > > > > > >
> > > > > > > > > The format of the protocol header and metadata fields
> that are
> > used
> > > to
> > > > > > > > > specify the flow items that make up the flow pattern, as
> well as
> > the
> > > > > > > > > flow actions, are all defined by the program, with an
> infinity of
> > > > > > > > > possible combinations, as opposed to being selected from
> a
> > finite
> > > > > > > > > pre-defined list.
> > > > > > > > >
> > > > > > > > > It is virtually impossible to pre-define all the flow
> items and the
> > > > > > > > > flow actions that programs might ever use, as these are
> only
> > limited
> > > > > > > > > by the set of HW resources and the program developer's
> > > imagination.
> > > > > > > > >
> > > > > > > > > To support the programmable network devices, we are
> > introducing:
> > > > > > > > >
> > > > > > > > > * A generic flow item: The flow item is expressed as an
> array of
> > bytes
> > > > > > > > > of a given length, whose meaning is defined by the
> program
> > loaded
> > > by
> > > > > > > > > the network device.
> > > > > > > >
> > > > > > > > The flow item is not "generic", it is "opaque": Only the
> application
> > > > > knows
> > > > > > > > what this flow item does.
> > > > > > > >
> > > > > > > > I hate the concept for two reasons:
> > > > > > > > 1. The inability for applications to detect which flow
> items the
> > > > > underlying
> > > > > > > > hardware supports.
> > > > > > > > 2. The risk that vendors will use this instead of
> introducing new
> > flow
> > > > > item
> > > > > > > > types, available for anyone to implement.
> > > > > > >
> > > > > > > After further consideration, there might be a middle ground.
> > > > > > >
> > > > > > > Consider Vendor-Specific attributes for DHCP and RADIUS, or
> SNMP
> > > > MIBs...
> > > > > > >
> > > > > > > Any vendor is free to add his own, proprietary special-
> purpose
> > > attributes,
> > > > > > without going through the standardization process. (This is
> the key
> > > > > challenge
> > > > > > this patch seems to be aiming at.)
> > > > > > >
> > > > > > > The vendor might publish these attributes, and other vendors
> may
> > > > > > implement them too.
> > > > > > >
> > > > > > > And in order to prevent collisions, the Vendor-Specific
> attributes
> > contain
> > > > > a
> > > > > > globally unique vendor ID, such as the Private Enterprise
> Number [1]
> > > > > > managed by IANA.
> > > > > > >
> > > > > > > If similar principles can be worked into the patch, I can
> support it.
> > > > > >
> > > > > > +1
> > > > > >
> > > +1 I understand that this is supposed to be generic, but how can it?
> > > how do you know if PMD supports this?
> > > what if each PMD needs different configurations?
> > >
> > > In addition how can you handle number of those action and items?
> > > For example if I have match on protocol X and Y and do actions Z and
> W
> > > each one of those can be generic item.
> > > if you have a way to define a standard why to read such actions then
> we
> > have
> > > something to talk about.
> > >
> > >
> > > > >
> > > > > Morten, Jerin,
> > > > >
> > > > > I think there is a fundamental misunderstanding here: we are not
> trying
> > to
> > > > > provide support for some non-standard vendor-specific features
> here.
> > What
> > > > > we are trying to do is add generic multi-vendor support in
> RTE_FLOW
> > for
> > > > > P4 programmable network devices, which requires supporting flow
> > items
> > > > > and actions that are defined directly by the user through their
> P4
> > programs
> > > > > as opposed to being selected from a pre-defined list.
> > > > >
> > > > > There are an infinity of flow items and actions that the users
> can define
> > > > > through
> > > > > their P4 programs, and they cannot be supported with a finite
> list of
> > > > RTE_FLOW
> > > > > items and actions:
> > > > >
> > > > > 1/ Some flow items map directly to the IETF defined protocols,
> while
> > some
> > > > > others do not, and only the user writing the program knows the
> exact
> > > answer;
> > > > >
> > > > > 2/ Some flow items are simply application-specific (not vendor
> specific)
> > > > > meta-data that (I hope we all accept) is outside of the
> standardization
> > > > > process.
> > > >
> > > > Such items can use a special "reserved" vendor-id.
> > > >
> > >
> > > Can you show me what items/actions are missing in rte_flow?
> > >
> > > > >
> > > > > 3/ Some flow actions map directly to the existing RTE_FLOW
> actions
> > > > (especially
> > > > > the more straightforward actions such as: packet drop, packet
> redirection
> > to
> > > > > an
> > > > > output queue, some specific packet modifications, etc), while
> the vast
> > > > > majority
> > > > > of possible actions do not.
> > > > >
> > > > > Are you saying that the P4 programmable network devices should
> NOT
> > be
> > > > > supported by DPDK and RTE_FLOW?
> > > >
> > > > No, I get the need for this. And I understand that since P4 is
> compiled to
> > > > hardware-specific binary blobs, there is a need to put such blobs
> into
> > DPDK as
> > > > flow items and actions, instead of the "uncompiled" P4 program.
> > > >
> > > > I am suggesting that instead of adding a completely opaque data
> type:
> > > >
> > > > Struct item {
> > > > Int len;      // Length of value in bytes.
> > > > Char value[]; // Application specific meaning.
> > > > };
> > > >
> > >
> > > But since you didn't define a known protocol for PMD to read the
> data how
> > > 2 pmds can use the same action?
> > >
> > > > ...add a semi-opaque data type:
> > > >
> > > > Struct tlv {
> > > > Int type;     // Vendor specific type.
> > > > Int len;      // Length of value in bytes.
> > > > Char value[]; // (Vendor, Type) specific meaning.
> > > > };
> > > >
> > > > Struct item {
> > > > Int vendor;          // Vendor ID.
> > > > Int len;             // Length of values in bytes.
> > > > Struct tlv values[]; // Array of TLVs.
> > > > };
> > > >
> > > > Like RADIUS Vendor-Specific attributes:
> > > > https://datatracker.ietf.org/doc/html/rfc2138#section-5.26
> > > >
> > > > Then some (Vendor, Type) fields can be documented (and thus
> generally
> > > > understood by DPDK), and some undocumented.
> > > >
> > > > E.g. like Microsoft documented some of theirs in RFC 2548:
> > > > https://datatracker.ietf.org/doc/html/rfc2548
> > > >
> > > >
> > > > Another benefit is that these new "VENDOR-SPECIFIC" flow types can
> be
> > > reused
> > > > for other purposes than compiled P4 programs.
> > > >
> > > > >
> > > > > >
> > > > > > >
> > > > > > > Preferably, there should also be a means for applications to
> query if
> > > > > specific
> > > > > > Vendor-Specific flow items and actions are supported or not.
> > > > > > >
> > > > > > >
> > > > > > > [1]: https://www.iana.org/assignments/enterprise-numbers/
> > > > > > >
> > > > >
> > > > > Regards,
> > > > > Cristian

Qi Zhang Aug. 3, 2023, 1:05 a.m. UTC | #15

> -----Original Message-----
> From: Morten Brørup <mb@smartsharesystems.com>
> Sent: Thursday, August 3, 2023 1:56 AM
> To: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>; Ori Kam
> <orika@nvidia.com>; Jerin Jacob <jerinjacobk@gmail.com>
> Cc: Zhang, Qi Z <qi.z.zhang@intel.com>; NBU-Contact-Thomas Monjalon
> (EXTERNAL) <thomas@monjalon.net>; david.marchand@redhat.com;
> Richardson, Bruce <bruce.richardson@intel.com>; jerinj@marvell.com;
> ferruh.yigit@amd.com; techboard@dpdk.org; Mcnamara, John
> <john.mcnamara@intel.com>; Zhang, Helin <helin.zhang@intel.com>;
> dev@dpdk.org
> Subject: RE: [PATCH] ethdev: introduce generic flow item and action
> 
> > From: Dumitrescu, Cristian [mailto:cristian.dumitrescu@intel.com]
> > Sent: Wednesday, 2 August 2023 19.23
> >
> > > From: Ori Kam <orika@nvidia.com>
> > > Sent: Wednesday, August 2, 2023 5:06 PM
> > >
> > > Hi Qi,
> > >
> > > In addition to my previous email,
> > > I fully support you’re your idea to update the rte_flow API so it
> > > will be easier for P4 integration, I just think the suggested
> > approach is not
> > > the correct one at least not as appears in the RFC.
> > >
> > > I think it will be good if we can discuss some uses cases you are
> > having
> > > with the API/implementation and see what is the best way to solve
> > them.
> > > The main idea is not to re-invent the wheel, but to solve issues.
> >
> > Yes, fully agree, it would be great meet and talk through this, as we
> > did it in the past for other issues. What days & time next week would
> > be good for people?
> 
> My calendar is pretty much all open these days, so anytime work hours in the
> Central European time zone works for me.
> 
> >
> > Meanwhile, some answers below.
> >
> > >
> > > To summarize, as I see it there are several issues:
> > > 1. no protocol is defined so different PMD can't translate it.
> >
> > The format of the flow items is defined by the P4 program, so all the
> > HW devices (from the same or from different vendors) that are able to
> > successfully load the given P4 program will have the same
> > understanding of the flow items.
> 
> If the P4 flow items/actions are standardized by some P4 organization or
> similar, they can be enumerated and defined as DPDK flow items/actions. At
> least the ones that are standardized.
> 
> And if you (for flexibility or other reasons) need to bypass the RTE_FLOW
> standardization process (getting ACKs etc. on the DPDK mailing list) for faster
> integration of new DPDK flow items/actions, it does make sense to define a
> generic flow item (and action) for this purpose (and not just for P4).

> 
> In order to avoid conflicts between P4 and non-P4 generic flow items/actions,
> the generic type should include information about how to interpret the
> information, which is why I suggest making it a Vendor-Specific type, with
> vendor-specific TLV's (managed by the vendor), like the RADIUS Vendor-
> Specific attributes I compared to, instead of just an opaque blob.

I like this idea, but it is not necessary to introduce a vendor-specific type, it could be considered a device-specific type (or port-specific in the context of DPDK).

However, the PMD can manage a dictionary, enabling users to query about the format of each generic item or action if we can expose a set of query APIs.

But I guess we don't need vendor-id / vendor-type as RADIUS does, as we have port_id here.


> 
> The P4 standardized items/actions can use the Vendor ID of the P4 standards
> organization.
> 
> The non-standardized items/actions can use the Vendor ID of the hardware
> vendor or the application developer.
> 
> >
> > > 2. even the same PMD doesn't know what is the action, unless you
> > > plan
> > that
> > > this will move
> > > directly to the HW, in this case, the action will be HW dependent.
> >
> > The processing of each flow action, as well as the number of arguments
> > and the format of each action argument, is defined by the P4 program,
> > so all the HW devices (from the same or from different vendors) that
> > are able to successfully load the given P4 program will have the same
> > understanding of the flow actions.
> >
> > > 3. when application should use this new action or the old ones.
> >
> > I guess it is good to clarify that there are two application: a data
> > path application (the P4 program) that defines the packet processing
> > pipeline, and a control path application that invokes RTE_FLOW to
> > add/delete the flows on the device.
> > I guess
> > we are now referring to the control path app.
> >
> > Whenever the P4 program (the data path app) that is currently loaded
> > on the device is defining and using flow actions that perform
> > identical processing to one of the existing pre-defined RTE_FLOW
> > actions (such as packet drop, packet redirection to a given output
> > queue, packet modifications, etc), then the app (the control path app)
> > can accept these actions as well.
> >
> > But in the (frequent) case that the user's P4 program defines actions
> > that do not map to an RTE_FLOW action from the pre-defined list, then
> > the app has no other option but to use the newly proposed generic flow
> > action in order to specify (through the action_id field) the exact
> > flow action from the P4 program.
> >
> > Makes sense?
> >
> > >
> > >
> > > Thanks,
> > > Ori
> > >
> >
> > Regards,
> > Cristian
> >
> > > > -----Original Message-----
> > > > From: Ori Kam <orika@nvidia.com>
> > > > Sent: Wednesday, August 2, 2023 6:47 PM
> > > >
> > > > Hi Qi
> > > >
> > > > > -----Original Message-----
> > > > > From: Morten Brørup <mb@smartsharesystems.com>
> > > > > Sent: Wednesday, August 2, 2023 6:25 PM
> > > > >
> > > > > > From: Dumitrescu, Cristian
> > [mailto:cristian.dumitrescu@intel.com]
> > > > > > Sent: Wednesday, 2 August 2023 16.06
> > > > > >
> > > > > > > From: Jerin Jacob <jerinjacobk@gmail.com>
> > > > > > > Sent: Wednesday, August 2, 2023 12:22 PM
> > > > > > >
> > > > > > > On Wed, Aug 2, 2023 at 4:31 PM Morten Brørup
> > > > > > > <mb@smartsharesystems.com> wrote:
> > > > > > > >
> > > > > > > > > From: Morten Brørup [mailto:mb@smartsharesystems.com]
> > > > > > > > > Sent: Wednesday, 2 August 2023 12.25
> > > > > > > > >
> > > > > > > > > > From: Qi Zhang [mailto:qi.z.zhang@intel.com]
> > > > > > > > > > Sent: Wednesday, 2 August 2023 19.35
> > > > > > > > > >
> > > > > > > > > > From: Cristian Dumitrescu
> > <cristian.dumitrescu@intel.com>
> > > > > > > > > >
> > > > > > > > > > For network devices that are programmable through
> > languages
> > > such
> > > > as
> > > > > > > > > > the P4 language, there are no pre-defined flow items
> > > > > > > > > > and
> > actions.
> > > > > > > > > >
> > > > > > > > > > The format of the protocol header and metadata fields
> > that are
> > > used
> > > > to
> > > > > > > > > > specify the flow items that make up the flow pattern,
> > > > > > > > > > as
> > well as
> > > the
> > > > > > > > > > flow actions, are all defined by the program, with an
> > infinity of
> > > > > > > > > > possible combinations, as opposed to being selected
> > > > > > > > > > from
> > a
> > > finite
> > > > > > > > > > pre-defined list.
> > > > > > > > > >
> > > > > > > > > > It is virtually impossible to pre-define all the flow
> > items and the
> > > > > > > > > > flow actions that programs might ever use, as these
> > > > > > > > > > are
> > only
> > > limited
> > > > > > > > > > by the set of HW resources and the program developer's
> > > > imagination.
> > > > > > > > > >
> > > > > > > > > > To support the programmable network devices, we are
> > > introducing:
> > > > > > > > > >
> > > > > > > > > > * A generic flow item: The flow item is expressed as
> > > > > > > > > > an
> > array of
> > > bytes
> > > > > > > > > > of a given length, whose meaning is defined by the
> > program
> > > loaded
> > > > by
> > > > > > > > > > the network device.
> > > > > > > > >
> > > > > > > > > The flow item is not "generic", it is "opaque": Only the
> > application
> > > > > > knows
> > > > > > > > > what this flow item does.
> > > > > > > > >
> > > > > > > > > I hate the concept for two reasons:
> > > > > > > > > 1. The inability for applications to detect which flow
> > items the
> > > > > > underlying
> > > > > > > > > hardware supports.
> > > > > > > > > 2. The risk that vendors will use this instead of
> > introducing new
> > > flow
> > > > > > item
> > > > > > > > > types, available for anyone to implement.
> > > > > > > >
> > > > > > > > After further consideration, there might be a middle ground.
> > > > > > > >
> > > > > > > > Consider Vendor-Specific attributes for DHCP and RADIUS,
> > > > > > > > or
> > SNMP
> > > > > MIBs...
> > > > > > > >
> > > > > > > > Any vendor is free to add his own, proprietary special-
> > purpose
> > > > attributes,
> > > > > > > without going through the standardization process. (This is
> > the key
> > > > > > challenge
> > > > > > > this patch seems to be aiming at.)
> > > > > > > >
> > > > > > > > The vendor might publish these attributes, and other
> > > > > > > > vendors
> > may
> > > > > > > implement them too.
> > > > > > > >
> > > > > > > > And in order to prevent collisions, the Vendor-Specific
> > attributes
> > > contain
> > > > > > a
> > > > > > > globally unique vendor ID, such as the Private Enterprise
> > Number [1]
> > > > > > > managed by IANA.
> > > > > > > >
> > > > > > > > If similar principles can be worked into the patch, I can
> > support it.
> > > > > > >
> > > > > > > +1
> > > > > > >
> > > > +1 I understand that this is supposed to be generic, but how can it?
> > > > how do you know if PMD supports this?
> > > > what if each PMD needs different configurations?
> > > >
> > > > In addition how can you handle number of those action and items?
> > > > For example if I have match on protocol X and Y and do actions Z
> > > > and
> > W
> > > > each one of those can be generic item.
> > > > if you have a way to define a standard why to read such actions
> > > > then
> > we
> > > have
> > > > something to talk about.
> > > >
> > > >
> > > > > >
> > > > > > Morten, Jerin,
> > > > > >
> > > > > > I think there is a fundamental misunderstanding here: we are
> > > > > > not
> > trying
> > > to
> > > > > > provide support for some non-standard vendor-specific features
> > here.
> > > What
> > > > > > we are trying to do is add generic multi-vendor support in
> > RTE_FLOW
> > > for
> > > > > > P4 programmable network devices, which requires supporting
> > > > > > flow
> > > items
> > > > > > and actions that are defined directly by the user through
> > > > > > their
> > P4
> > > programs
> > > > > > as opposed to being selected from a pre-defined list.
> > > > > >
> > > > > > There are an infinity of flow items and actions that the users
> > can define
> > > > > > through
> > > > > > their P4 programs, and they cannot be supported with a finite
> > list of
> > > > > RTE_FLOW
> > > > > > items and actions:
> > > > > >
> > > > > > 1/ Some flow items map directly to the IETF defined protocols,
> > while
> > > some
> > > > > > others do not, and only the user writing the program knows the
> > exact
> > > > answer;
> > > > > >
> > > > > > 2/ Some flow items are simply application-specific (not vendor
> > specific)
> > > > > > meta-data that (I hope we all accept) is outside of the
> > standardization
> > > > > > process.
> > > > >
> > > > > Such items can use a special "reserved" vendor-id.
> > > > >
> > > >
> > > > Can you show me what items/actions are missing in rte_flow?
> > > >
> > > > > >
> > > > > > 3/ Some flow actions map directly to the existing RTE_FLOW
> > actions
> > > > > (especially
> > > > > > the more straightforward actions such as: packet drop, packet
> > redirection
> > > to
> > > > > > an
> > > > > > output queue, some specific packet modifications, etc), while
> > the vast
> > > > > > majority
> > > > > > of possible actions do not.
> > > > > >
> > > > > > Are you saying that the P4 programmable network devices should
> > NOT
> > > be
> > > > > > supported by DPDK and RTE_FLOW?
> > > > >
> > > > > No, I get the need for this. And I understand that since P4 is
> > compiled to
> > > > > hardware-specific binary blobs, there is a need to put such
> > > > > blobs
> > into
> > > DPDK as
> > > > > flow items and actions, instead of the "uncompiled" P4 program.
> > > > >
> > > > > I am suggesting that instead of adding a completely opaque data
> > type:
> > > > >
> > > > > Struct item {
> > > > > Int len;      // Length of value in bytes.
> > > > > Char value[]; // Application specific meaning.
> > > > > };
> > > > >
> > > >
> > > > But since you didn't define a known protocol for PMD to read the
> > data how
> > > > 2 pmds can use the same action?
> > > >
> > > > > ...add a semi-opaque data type:
> > > > >
> > > > > Struct tlv {
> > > > > Int type;     // Vendor specific type.
> > > > > Int len;      // Length of value in bytes.
> > > > > Char value[]; // (Vendor, Type) specific meaning.
> > > > > };
> > > > >
> > > > > Struct item {
> > > > > Int vendor;          // Vendor ID.
> > > > > Int len;             // Length of values in bytes.
> > > > > Struct tlv values[]; // Array of TLVs.
> > > > > };
> > > > >
> > > > > Like RADIUS Vendor-Specific attributes:
> > > > > https://datatracker.ietf.org/doc/html/rfc2138#section-5.26
> > > > >
> > > > > Then some (Vendor, Type) fields can be documented (and thus
> > generally
> > > > > understood by DPDK), and some undocumented.
> > > > >
> > > > > E.g. like Microsoft documented some of theirs in RFC 2548:
> > > > > https://datatracker.ietf.org/doc/html/rfc2548
> > > > >
> > > > >
> > > > > Another benefit is that these new "VENDOR-SPECIFIC" flow types
> > > > > can
> > be
> > > > reused
> > > > > for other purposes than compiled P4 programs.
> > > > >
> > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > Preferably, there should also be a means for applications
> > > > > > > > to
> > query if
> > > > > > specific
> > > > > > > Vendor-Specific flow items and actions are supported or not.
> > > > > > > >
> > > > > > > >
> > > > > > > > [1]: https://www.iana.org/assignments/enterprise-numbers/
> > > > > > > >
> > > > > >
> > > > > > Regards,
> > > > > > Cristian

Ori Kam Aug. 3, 2023, 8:39 a.m. UTC | #16

Hi Cristian,

> -----Original Message-----
> From: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>
> Sent: Wednesday, August 2, 2023 7:57 PM
> 
> 
> 
> > -----Original Message-----
> > From: Ori Kam <orika@nvidia.com>
> > Sent: Wednesday, August 2, 2023 4:47 PM
> > To: Morten Brørup <mb@smartsharesystems.com>; Dumitrescu, Cristian
> > <cristian.dumitrescu@intel.com>; Jerin Jacob <jerinjacobk@gmail.com>
> > Cc: Zhang, Qi Z <qi.z.zhang@intel.com>; NBU-Contact-Thomas Monjalon
> > (EXTERNAL) <thomas@monjalon.net>; david.marchand@redhat.com;
> > Richardson, Bruce <bruce.richardson@intel.com>; jerinj@marvell.com;
> > ferruh.yigit@amd.com; techboard@dpdk.org; Mcnamara, John
> > <john.mcnamara@intel.com>; Zhang, Helin <helin.zhang@intel.com>;
> > dev@dpdk.org
> > Subject: RE: [PATCH] ethdev: introduce generic flow item and action
> >
> > Hi Qi
> 
> Hi Ori,
> 
> Thanks for your input!
> 
> >
> > > -----Original Message-----
> > > From: Morten Brørup <mb@smartsharesystems.com>
> > > Sent: Wednesday, August 2, 2023 6:25 PM
> > >
> > > > From: Dumitrescu, Cristian [mailto:cristian.dumitrescu@intel.com]
> > > > Sent: Wednesday, 2 August 2023 16.06
> > > >
> > > > > From: Jerin Jacob <jerinjacobk@gmail.com>
> > > > > Sent: Wednesday, August 2, 2023 12:22 PM
> > > > >
> > > > > On Wed, Aug 2, 2023 at 4:31 PM Morten Brørup
> > > > > <mb@smartsharesystems.com> wrote:
> > > > > >
> > > > > > > From: Morten Brørup [mailto:mb@smartsharesystems.com]
> > > > > > > Sent: Wednesday, 2 August 2023 12.25
> > > > > > >
> > > > > > > > From: Qi Zhang [mailto:qi.z.zhang@intel.com]
> > > > > > > > Sent: Wednesday, 2 August 2023 19.35
> > > > > > > >
> > > > > > > > From: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> > > > > > > >
> > > > > > > > For network devices that are programmable through languages
> > such as
> > > > > > > > the P4 language, there are no pre-defined flow items and actions.
> > > > > > > >
> > > > > > > > The format of the protocol header and metadata fields that are
> > used to
> > > > > > > > specify the flow items that make up the flow pattern, as well as the
> > > > > > > > flow actions, are all defined by the program, with an infinity of
> > > > > > > > possible combinations, as opposed to being selected from a finite
> > > > > > > > pre-defined list.
> > > > > > > >
> > > > > > > > It is virtually impossible to pre-define all the flow items and the
> > > > > > > > flow actions that programs might ever use, as these are only
> > limited
> > > > > > > > by the set of HW resources and the program developer's
> > imagination.
> > > > > > > >
> > > > > > > > To support the programmable network devices, we are introducing:
> > > > > > > >
> > > > > > > > * A generic flow item: The flow item is expressed as an array of
> > bytes
> > > > > > > > of a given length, whose meaning is defined by the program loaded
> > by
> > > > > > > > the network device.
> > > > > > >
> > > > > > > The flow item is not "generic", it is "opaque": Only the application
> > > > knows
> > > > > > > what this flow item does.
> > > > > > >
> > > > > > > I hate the concept for two reasons:
> > > > > > > 1. The inability for applications to detect which flow items the
> > > > underlying
> > > > > > > hardware supports.
> > > > > > > 2. The risk that vendors will use this instead of introducing new flow
> > > > item
> > > > > > > types, available for anyone to implement.
> > > > > >
> > > > > > After further consideration, there might be a middle ground.
> > > > > >
> > > > > > Consider Vendor-Specific attributes for DHCP and RADIUS, or SNMP
> > > MIBs...
> > > > > >
> > > > > > Any vendor is free to add his own, proprietary special-purpose
> > attributes,
> > > > > without going through the standardization process. (This is the key
> > > > challenge
> > > > > this patch seems to be aiming at.)
> > > > > >
> > > > > > The vendor might publish these attributes, and other vendors may
> > > > > implement them too.
> > > > > >
> > > > > > And in order to prevent collisions, the Vendor-Specific attributes
> > contain
> > > > a
> > > > > globally unique vendor ID, such as the Private Enterprise Number [1]
> > > > > managed by IANA.
> > > > > >
> > > > > > If similar principles can be worked into the patch, I can support it.
> > > > >
> > > > > +1
> > > > >
> > +1 I understand that this is supposed to be generic, but how can it?
> 
> What exactly it not generic?
> 
> > how do you know if PMD supports this?
> 
> What is the current RTE_FLOW mechanism to find out whether a device/PMD
> supports a certain flow? The only mechanism available is to try to create the
> flow (rte_flow_create) or mimic the creation of the flow (rte_flow_validate)
> and see if you get success or error, right? Same mechanism to be used here.
> Of course, we would be happy to support the addition of a more complex
> query mechanism in RTE_FLOW.
> 

My bad explanation, you are right the way to know if some PMD supports a feature
is trial and error, what I ment is how PMD knows how to phrase the data, it should be 
in a known format.

> > what if each PMD needs different configurations?
> 
> Not sure what you're referring to exactly. The format of the flow items and
> flow actions is defined by the P4 program; so all the HW devices that are
> able to successfully load a given P4 program, potentially from different
> vendors, will have the same understanding of each flow item and action.
> 

So you don't suggest generic, you suggest p4_action/item which will get
a blob based on standard.

> >
> > In addition how can you handle number of those action and items?
> > For example if I have match on protocol X and Y and do actions Z and W
> > each one of those can be generic item.
> > if you have a way to define a standard why to read such actions then we have
> > something to talk about.
> 
> Yes, we do. The format of all flow items and flow actions available on the HW
> device is fully defined by the P4 program, therefore all the HW devices,
> potentially from different vendors, that are able to successfully load a given P4
> program, will have the same understanding of them.
> 

Again this was not clear from the RFC, see my above comment
> >
> >
> > > >
> > > > Morten, Jerin,
> > > >
> > > > I think there is a fundamental misunderstanding here: we are not trying to
> > > > provide support for some non-standard vendor-specific features here.
> > What
> > > > we are trying to do is add generic multi-vendor support in RTE_FLOW for
> > > > P4 programmable network devices, which requires supporting flow items
> > > > and actions that are defined directly by the user through their P4
> > programs
> > > > as opposed to being selected from a pre-defined list.
> > > >
> > > > There are an infinity of flow items and actions that the users can define
> > > > through
> > > > their P4 programs, and they cannot be supported with a finite list of
> > > RTE_FLOW
> > > > items and actions:
> > > >
> > > > 1/ Some flow items map directly to the IETF defined protocols, while some
> > > > others do not, and only the user writing the program knows the exact
> > answer;
> > > >
> > > > 2/ Some flow items are simply application-specific (not vendor specific)
> > > > meta-data that (I hope we all accept) is outside of the standardization
> > > > process.
> > >
> > > Such items can use a special "reserved" vendor-id.
> > >
> >
> > Can you show me what items/actions are missing in rte_flow?
> 
> The number of flow items (protocol headers, but also metadata) and flow
> actions
> that users can define in their P4 programs is infinite, so unfortunately Ori, as
> much
> as I want to grant you this wish, I am not going to be able to 😊. Also, it is
> important to note that the users write their P4 program (defining their data
> path
> pipeline) without any involvement from their device vendor, so the vendor is
> simply
> not aware of meaning of the user's flow items and actions either.
> 

Maybe we should add an API to register actions/items, but at the end we don't want to write code
Twice, meaning if action is already supported by the current API there is no reason to use different
API.
maybe we can add a small translation function the a PMD can convert actions and output the right
rte_flow actions/items, and if it can't, it can create a user action 

> >
> > > >
> > > > 3/ Some flow actions map directly to the existing RTE_FLOW actions
> > > (especially
> > > > the more straightforward actions such as: packet drop, packet redirection
> > to
> > > > an
> > > > output queue, some specific packet modifications, etc), while the vast
> > > > majority
> > > > of possible actions do not.
> > > >
> > > > Are you saying that the P4 programmable network devices should NOT be
> > > > supported by DPDK and RTE_FLOW?
> > >
> > > No, I get the need for this. And I understand that since P4 is compiled to
> > > hardware-specific binary blobs, there is a need to put such blobs into DPDK
> > as
> > > flow items and actions, instead of the "uncompiled" P4 program.
> > >
> > > I am suggesting that instead of adding a completely opaque data type:
> > >
> > > Struct item {
> > > Int len;      // Length of value in bytes.
> > > Char value[]; // Application specific meaning.
> > > };
> > >
> >
> > But since you didn't define a known protocol for PMD to read the data how
> > 2 pmds can use the same action?
> 
> The format of all flow items and flow actions available on the HW device is fully
> defined by the P4 program, therefore all the HW devices, even if from different
> vendors, will have the same understanding of them.
> 
Like I said above it is not generic it is p4 format.

Best,
Ori

> >
> > > ...add a semi-opaque data type:
> > >
> > > Struct tlv {
> > > Int type;     // Vendor specific type.
> > > Int len;      // Length of value in bytes.
> > > Char value[]; // (Vendor, Type) specific meaning.
> > > };
> > >
> > > Struct item {
> > > Int vendor;          // Vendor ID.
> > > Int len;             // Length of values in bytes.
> > > Struct tlv values[]; // Array of TLVs.
> > > };
> > >
> > > Like RADIUS Vendor-Specific attributes:
> > > https://datatracker.ietf.org/doc/html/rfc2138#section-5.26
> > >
> > > Then some (Vendor, Type) fields can be documented (and thus generally
> > > understood by DPDK), and some undocumented.
> > >
> > > E.g. like Microsoft documented some of theirs in RFC 2548:
> > > https://datatracker.ietf.org/doc/html/rfc2548
> > >
> > >
> > > Another benefit is that these new "VENDOR-SPECIFIC" flow types can be
> > reused
> > > for other purposes than compiled P4 programs.
> > >
> > > >
> > > > >
> > > > > >
> > > > > > Preferably, there should also be a means for applications to query if
> > > > specific
> > > > > Vendor-Specific flow items and actions are supported or not.
> > > > > >
> > > > > >
> > > > > > [1]: https://www.iana.org/assignments/enterprise-numbers/
> > > > > >
> > > >
> > > > Regards,
> > > > Cristian
> 
> Regards,
> Cristian

Morten Brørup Aug. 3, 2023, 1:57 p.m. UTC | #17

> From: Zhang, Qi Z [mailto:qi.z.zhang@intel.com]
> Sent: Thursday, 3 August 2023 03.05
> 
> > From: Morten Brørup <mb@smartsharesystems.com>
> > Sent: Thursday, August 3, 2023 1:56 AM
> >
> > > From: Dumitrescu, Cristian [mailto:cristian.dumitrescu@intel.com]
> > > Sent: Wednesday, 2 August 2023 19.23
> > >
> > > > From: Ori Kam <orika@nvidia.com>
> > > > Sent: Wednesday, August 2, 2023 5:06 PM
> > > >
> > > > Hi Qi,
> > > >
> > > > In addition to my previous email,
> > > > I fully support you’re your idea to update the rte_flow API so it
> > > > will be easier for P4 integration, I just think the suggested
> > > approach is not
> > > > the correct one at least not as appears in the RFC.
> > > >
> > > > I think it will be good if we can discuss some uses cases you are
> > > having
> > > > with the API/implementation and see what is the best way to solve
> > > them.
> > > > The main idea is not to re-invent the wheel, but to solve issues.
> > >
> > > Yes, fully agree, it would be great meet and talk through this, as we
> > > did it in the past for other issues. What days & time next week would
> > > be good for people?
> >
> > My calendar is pretty much all open these days, so anytime work hours in the
> > Central European time zone works for me.
> >
> > >
> > > Meanwhile, some answers below.
> > >
> > > >
> > > > To summarize, as I see it there are several issues:
> > > > 1. no protocol is defined so different PMD can't translate it.
> > >
> > > The format of the flow items is defined by the P4 program, so all the
> > > HW devices (from the same or from different vendors) that are able to
> > > successfully load the given P4 program will have the same
> > > understanding of the flow items.
> >
> > If the P4 flow items/actions are standardized by some P4 organization or
> > similar, they can be enumerated and defined as DPDK flow items/actions. At
> > least the ones that are standardized.
> >
> > And if you (for flexibility or other reasons) need to bypass the RTE_FLOW
> > standardization process (getting ACKs etc. on the DPDK mailing list) for
> faster
> > integration of new DPDK flow items/actions, it does make sense to define a
> > generic flow item (and action) for this purpose (and not just for P4).
> 
> >
> > In order to avoid conflicts between P4 and non-P4 generic flow
> items/actions,
> > the generic type should include information about how to interpret the
> > information, which is why I suggest making it a Vendor-Specific type, with
> > vendor-specific TLV's (managed by the vendor), like the RADIUS Vendor-
> > Specific attributes I compared to, instead of just an opaque blob.
> 
> I like this idea, but it is not necessary to introduce a vendor-specific type,
> it could be considered a device-specific type (or port-specific in the context
> of DPDK).
> 
> However, the PMD can manage a dictionary, enabling users to query about the
> format of each generic item or action if we can expose a set of query APIs.
> 
> But I guess we don't need vendor-id / vendor-type as RADIUS does, as we have
> port_id here.

If the flow item itself doesn't have a "type" field (identifying how to interpret the blob), you might have two different NICs using each their own blob format. The NIC must be able to determine if a given flow item is of a type it can understand, before it tries to parse the blob in it.

This is why the "struct rte_flow_item" has a "type" field. It tells the HW how to interpret the values in a flow item.

If we introduce a "generic" flow item type, it can only be used for multiple purposes (i.e. both P4, but also other purposes than P4) if it has a "sub-type" field. Otherwise, someone will create a "generic" flow item containing a P4 program and send it to a NIC, which uses the "generic" flow item type for other program types, e.g. a cBPF program. And this cBPF capable NIC has no way to detect that the blob in the flow item is not a cBPF program, but a P4 program. The P4 capable NIC will accept the P4 program, but will be confused when sent the cBPF program understood by the first NIC.

So I am suggesting that the "generic" flow items and actions follow an existing and well known design patterns for how to identify the meaning of blobs: Include a Vendor-ID followed by vendor-specific TLV formatted data.

As I wrote initially, I am opposed to introducing uninterpretable blobs into DPDK. Flow items/actions need to be well defined. Allowing "Vendor-Specific" flow items/actions is a workaround that allows you to bypass the normal standardization process.

> 
> 
> >
> > The P4 standardized items/actions can use the Vendor ID of the P4 standards
> > organization.
> >
> > The non-standardized items/actions can use the Vendor ID of the hardware
> > vendor or the application developer.
> >
> > >
> > > > 2. even the same PMD doesn't know what is the action, unless you
> > > > plan
> > > that
> > > > this will move
> > > > directly to the HW, in this case, the action will be HW dependent.
> > >
> > > The processing of each flow action, as well as the number of arguments
> > > and the format of each action argument, is defined by the P4 program,
> > > so all the HW devices (from the same or from different vendors) that
> > > are able to successfully load the given P4 program will have the same
> > > understanding of the flow actions.
> > >
> > > > 3. when application should use this new action or the old ones.
> > >
> > > I guess it is good to clarify that there are two application: a data
> > > path application (the P4 program) that defines the packet processing
> > > pipeline, and a control path application that invokes RTE_FLOW to
> > > add/delete the flows on the device.
> > > I guess
> > > we are now referring to the control path app.
> > >
> > > Whenever the P4 program (the data path app) that is currently loaded
> > > on the device is defining and using flow actions that perform
> > > identical processing to one of the existing pre-defined RTE_FLOW
> > > actions (such as packet drop, packet redirection to a given output
> > > queue, packet modifications, etc), then the app (the control path app)
> > > can accept these actions as well.
> > >
> > > But in the (frequent) case that the user's P4 program defines actions
> > > that do not map to an RTE_FLOW action from the pre-defined list, then
> > > the app has no other option but to use the newly proposed generic flow
> > > action in order to specify (through the action_id field) the exact
> > > flow action from the P4 program.
> > >
> > > Makes sense?
> > >
> > > >
> > > >
> > > > Thanks,
> > > > Ori
> > > >
> > >
> > > Regards,
> > > Cristian
> > >
> > > > > -----Original Message-----
> > > > > From: Ori Kam <orika@nvidia.com>
> > > > > Sent: Wednesday, August 2, 2023 6:47 PM
> > > > >
> > > > > Hi Qi
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Morten Brørup <mb@smartsharesystems.com>
> > > > > > Sent: Wednesday, August 2, 2023 6:25 PM
> > > > > >
> > > > > > > From: Dumitrescu, Cristian
> > > [mailto:cristian.dumitrescu@intel.com]
> > > > > > > Sent: Wednesday, 2 August 2023 16.06
> > > > > > >
> > > > > > > > From: Jerin Jacob <jerinjacobk@gmail.com>
> > > > > > > > Sent: Wednesday, August 2, 2023 12:22 PM
> > > > > > > >
> > > > > > > > On Wed, Aug 2, 2023 at 4:31 PM Morten Brørup
> > > > > > > > <mb@smartsharesystems.com> wrote:
> > > > > > > > >
> > > > > > > > > > From: Morten Brørup [mailto:mb@smartsharesystems.com]
> > > > > > > > > > Sent: Wednesday, 2 August 2023 12.25
> > > > > > > > > >
> > > > > > > > > > > From: Qi Zhang [mailto:qi.z.zhang@intel.com]
> > > > > > > > > > > Sent: Wednesday, 2 August 2023 19.35
> > > > > > > > > > >
> > > > > > > > > > > From: Cristian Dumitrescu
> > > <cristian.dumitrescu@intel.com>
> > > > > > > > > > >
> > > > > > > > > > > For network devices that are programmable through
> > > languages
> > > > such
> > > > > as
> > > > > > > > > > > the P4 language, there are no pre-defined flow items
> > > > > > > > > > > and
> > > actions.
> > > > > > > > > > >
> > > > > > > > > > > The format of the protocol header and metadata fields
> > > that are
> > > > used
> > > > > to
> > > > > > > > > > > specify the flow items that make up the flow pattern,
> > > > > > > > > > > as
> > > well as
> > > > the
> > > > > > > > > > > flow actions, are all defined by the program, with an
> > > infinity of
> > > > > > > > > > > possible combinations, as opposed to being selected
> > > > > > > > > > > from
> > > a
> > > > finite
> > > > > > > > > > > pre-defined list.
> > > > > > > > > > >
> > > > > > > > > > > It is virtually impossible to pre-define all the flow
> > > items and the
> > > > > > > > > > > flow actions that programs might ever use, as these
> > > > > > > > > > > are
> > > only
> > > > limited
> > > > > > > > > > > by the set of HW resources and the program developer's
> > > > > imagination.
> > > > > > > > > > >
> > > > > > > > > > > To support the programmable network devices, we are
> > > > introducing:
> > > > > > > > > > >
> > > > > > > > > > > * A generic flow item: The flow item is expressed as
> > > > > > > > > > > an
> > > array of
> > > > bytes
> > > > > > > > > > > of a given length, whose meaning is defined by the
> > > program
> > > > loaded
> > > > > by
> > > > > > > > > > > the network device.
> > > > > > > > > >
> > > > > > > > > > The flow item is not "generic", it is "opaque": Only the
> > > application
> > > > > > > knows
> > > > > > > > > > what this flow item does.
> > > > > > > > > >
> > > > > > > > > > I hate the concept for two reasons:
> > > > > > > > > > 1. The inability for applications to detect which flow
> > > items the
> > > > > > > underlying
> > > > > > > > > > hardware supports.
> > > > > > > > > > 2. The risk that vendors will use this instead of
> > > introducing new
> > > > flow
> > > > > > > item
> > > > > > > > > > types, available for anyone to implement.
> > > > > > > > >
> > > > > > > > > After further consideration, there might be a middle ground.
> > > > > > > > >
> > > > > > > > > Consider Vendor-Specific attributes for DHCP and RADIUS,
> > > > > > > > > or
> > > SNMP
> > > > > > MIBs...
> > > > > > > > >
> > > > > > > > > Any vendor is free to add his own, proprietary special-
> > > purpose
> > > > > attributes,
> > > > > > > > without going through the standardization process. (This is
> > > the key
> > > > > > > challenge
> > > > > > > > this patch seems to be aiming at.)
> > > > > > > > >
> > > > > > > > > The vendor might publish these attributes, and other
> > > > > > > > > vendors
> > > may
> > > > > > > > implement them too.
> > > > > > > > >
> > > > > > > > > And in order to prevent collisions, the Vendor-Specific
> > > attributes
> > > > contain
> > > > > > > a
> > > > > > > > globally unique vendor ID, such as the Private Enterprise
> > > Number [1]
> > > > > > > > managed by IANA.
> > > > > > > > >
> > > > > > > > > If similar principles can be worked into the patch, I can
> > > support it.
> > > > > > > >
> > > > > > > > +1
> > > > > > > >
> > > > > +1 I understand that this is supposed to be generic, but how can it?
> > > > > how do you know if PMD supports this?
> > > > > what if each PMD needs different configurations?
> > > > >
> > > > > In addition how can you handle number of those action and items?
> > > > > For example if I have match on protocol X and Y and do actions Z
> > > > > and
> > > W
> > > > > each one of those can be generic item.
> > > > > if you have a way to define a standard why to read such actions
> > > > > then
> > > we
> > > > have
> > > > > something to talk about.
> > > > >
> > > > >
> > > > > > >
> > > > > > > Morten, Jerin,
> > > > > > >
> > > > > > > I think there is a fundamental misunderstanding here: we are
> > > > > > > not
> > > trying
> > > > to
> > > > > > > provide support for some non-standard vendor-specific features
> > > here.
> > > > What
> > > > > > > we are trying to do is add generic multi-vendor support in
> > > RTE_FLOW
> > > > for
> > > > > > > P4 programmable network devices, which requires supporting
> > > > > > > flow
> > > > items
> > > > > > > and actions that are defined directly by the user through
> > > > > > > their
> > > P4
> > > > programs
> > > > > > > as opposed to being selected from a pre-defined list.
> > > > > > >
> > > > > > > There are an infinity of flow items and actions that the users
> > > can define
> > > > > > > through
> > > > > > > their P4 programs, and they cannot be supported with a finite
> > > list of
> > > > > > RTE_FLOW
> > > > > > > items and actions:
> > > > > > >
> > > > > > > 1/ Some flow items map directly to the IETF defined protocols,
> > > while
> > > > some
> > > > > > > others do not, and only the user writing the program knows the
> > > exact
> > > > > answer;
> > > > > > >
> > > > > > > 2/ Some flow items are simply application-specific (not vendor
> > > specific)
> > > > > > > meta-data that (I hope we all accept) is outside of the
> > > standardization
> > > > > > > process.
> > > > > >
> > > > > > Such items can use a special "reserved" vendor-id.
> > > > > >
> > > > >
> > > > > Can you show me what items/actions are missing in rte_flow?
> > > > >
> > > > > > >
> > > > > > > 3/ Some flow actions map directly to the existing RTE_FLOW
> > > actions
> > > > > > (especially
> > > > > > > the more straightforward actions such as: packet drop, packet
> > > redirection
> > > > to
> > > > > > > an
> > > > > > > output queue, some specific packet modifications, etc), while
> > > the vast
> > > > > > > majority
> > > > > > > of possible actions do not.
> > > > > > >
> > > > > > > Are you saying that the P4 programmable network devices should
> > > NOT
> > > > be
> > > > > > > supported by DPDK and RTE_FLOW?
> > > > > >
> > > > > > No, I get the need for this. And I understand that since P4 is
> > > compiled to
> > > > > > hardware-specific binary blobs, there is a need to put such
> > > > > > blobs
> > > into
> > > > DPDK as
> > > > > > flow items and actions, instead of the "uncompiled" P4 program.
> > > > > >
> > > > > > I am suggesting that instead of adding a completely opaque data
> > > type:
> > > > > >
> > > > > > Struct item {
> > > > > > Int len;      // Length of value in bytes.
> > > > > > Char value[]; // Application specific meaning.
> > > > > > };
> > > > > >
> > > > >
> > > > > But since you didn't define a known protocol for PMD to read the
> > > data how
> > > > > 2 pmds can use the same action?
> > > > >
> > > > > > ...add a semi-opaque data type:
> > > > > >
> > > > > > Struct tlv {
> > > > > > Int type;     // Vendor specific type.
> > > > > > Int len;      // Length of value in bytes.
> > > > > > Char value[]; // (Vendor, Type) specific meaning.
> > > > > > };
> > > > > >
> > > > > > Struct item {
> > > > > > Int vendor;          // Vendor ID.
> > > > > > Int len;             // Length of values in bytes.
> > > > > > Struct tlv values[]; // Array of TLVs.
> > > > > > };
> > > > > >
> > > > > > Like RADIUS Vendor-Specific attributes:
> > > > > > https://datatracker.ietf.org/doc/html/rfc2138#section-5.26
> > > > > >
> > > > > > Then some (Vendor, Type) fields can be documented (and thus
> > > generally
> > > > > > understood by DPDK), and some undocumented.
> > > > > >
> > > > > > E.g. like Microsoft documented some of theirs in RFC 2548:
> > > > > > https://datatracker.ietf.org/doc/html/rfc2548
> > > > > >
> > > > > >
> > > > > > Another benefit is that these new "VENDOR-SPECIFIC" flow types
> > > > > > can
> > > be
> > > > > reused
> > > > > > for other purposes than compiled P4 programs.
> > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > > Preferably, there should also be a means for applications
> > > > > > > > > to
> > > query if
> > > > > > > specific
> > > > > > > > Vendor-Specific flow items and actions are supported or not.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > [1]: https://www.iana.org/assignments/enterprise-numbers/
> > > > > > > > >
> > > > > > >
> > > > > > > Regards,
> > > > > > > Cristian

Cristian Dumitrescu Aug. 16, 2023, 1:12 p.m. UTC | #18

Hi Ori,

<snip>

> > > Can you show me what items/actions are missing in rte_flow?
> >
> > The number of flow items (protocol headers, but also metadata) and flow
> > actions
> > that users can define in their P4 programs is infinite, so unfortunately Ori, as
> > much
> > as I want to grant you this wish, I am not going to be able to 😊. Also, it is
> > important to note that the users write their P4 program (defining their data
> > path
> > pipeline) without any involvement from their device vendor, so the vendor
> is
> > simply
> > not aware of meaning of the user's flow items and actions either.
> >
> 
> Maybe we should add an API to register actions/items, but at the end we
> don't want to write code
> Twice, meaning if action is already supported by the current API there is no
> reason to use different
> API.
> maybe we can add a small translation function the a PMD can convert actions
> and output the right
> rte_flow actions/items, and if it can't, it can create a user action
> 

This is a very interesting idea, do you mean that we can allow the user to
register their own flow items and actions on top on the existing ones that
are already provided by RTE_FLOW?

Something similar to these ones below?

int
rte_flow_item_register(uint16_t port_id,
	int flow_item_id,
	size_t flow_item_length,
	const char *flow_item_description,
	struct rte_flow_error *error);

struct rte_flow_action_argument {
	size_t argument_length;
	const char *argument_description;
};

int
rte_flow_action_register(uint16_t port_id,
	int action_id,
	uint32_t action_args_num,
	const struct rte_flow_action_argument *action_args,
	struct rte_flow_error *error);

<snip>

Regards,
Cristian

Cristian Dumitrescu Aug. 16, 2023, 1:23 p.m. UTC | #19

Hi Morten,

<snip>

> > >
> > > In order to avoid conflicts between P4 and non-P4 generic flow
> > items/actions,
> > > the generic type should include information about how to interpret the
> > > information, which is why I suggest making it a Vendor-Specific type, with
> > > vendor-specific TLV's (managed by the vendor), like the RADIUS Vendor-
> > > Specific attributes I compared to, instead of just an opaque blob.
> >
> > I like this idea, but it is not necessary to introduce a vendor-specific type,
> > it could be considered a device-specific type (or port-specific in the context
> > of DPDK).
> >
> > However, the PMD can manage a dictionary, enabling users to query about
> the
> > format of each generic item or action if we can expose a set of query APIs.
> >
> > But I guess we don't need vendor-id / vendor-type as RADIUS does, as we
> have
> > port_id here.
> 
> If the flow item itself doesn't have a "type" field (identifying how to interpret
> the blob), you might have two different NICs using each their own blob
> format. The NIC must be able to determine if a given flow item is of a type it
> can understand, before it tries to parse the blob in it.
> 
> This is why the "struct rte_flow_item" has a "type" field. It tells the HW how
> to interpret the values in a flow item.
> 
> If we introduce a "generic" flow item type, it can only be used for multiple
> purposes (i.e. both P4, but also other purposes than P4) if it has a "sub-type"
> field. Otherwise, someone will create a "generic" flow item containing a P4
> program and send it to a NIC, which uses the "generic" flow item type for
> other program types, e.g. a cBPF program. And this cBPF capable NIC has no
> way to detect that the blob in the flow item is not a cBPF program, but a P4
> program. The P4 capable NIC will accept the P4 program, but will be confused
> when sent the cBPF program understood by the first NIC.
> 
> So I am suggesting that the "generic" flow items and actions follow an existing
> and well known design patterns for how to identify the meaning of blobs:
> Include a Vendor-ID followed by vendor-specific TLV formatted data.
> 
> As I wrote initially, I am opposed to introducing uninterpretable blobs into
> DPDK. Flow items/actions need to be well defined. Allowing "Vendor-Specific"
> flow items/actions is a workaround that allows you to bypass the normal
> standardization process.
> 

I would be happy to add mechanisms to describe the user-defined flow items
and actions in greater detail. Would you be able to provide some examples for
your proposal for a flow item and a flow action of your choice, please? Thanks!

One thing I would want to stress here: the flow items and flow actions are
defined exclusively by the user (through their P4 program) without any
knowledge or intervention from the HW vendor, so any TLVs / helper fields
must be populated by the user as opposed to the HW vendor.

<snip>

Regards,
Cristian

Morten Brørup Aug. 16, 2023, 2:20 p.m. UTC | #20

> From: Dumitrescu, Cristian [mailto:cristian.dumitrescu@intel.com]
> Sent: Wednesday, 16 August 2023 15.23
> 
> Hi Morten,
> 
> <snip>
> 
> > > >
> > > > In order to avoid conflicts between P4 and non-P4 generic flow
> > > items/actions,
> > > > the generic type should include information about how to interpret the
> > > > information, which is why I suggest making it a Vendor-Specific type,
> with
> > > > vendor-specific TLV's (managed by the vendor), like the RADIUS Vendor-
> > > > Specific attributes I compared to, instead of just an opaque blob.
> > >
> > > I like this idea, but it is not necessary to introduce a vendor-specific
> type,
> > > it could be considered a device-specific type (or port-specific in the
> context
> > > of DPDK).
> > >
> > > However, the PMD can manage a dictionary, enabling users to query about
> > the
> > > format of each generic item or action if we can expose a set of query
> APIs.
> > >
> > > But I guess we don't need vendor-id / vendor-type as RADIUS does, as we
> > have
> > > port_id here.
> >
> > If the flow item itself doesn't have a "type" field (identifying how to
> interpret
> > the blob), you might have two different NICs using each their own blob
> > format. The NIC must be able to determine if a given flow item is of a type
> it
> > can understand, before it tries to parse the blob in it.
> >
> > This is why the "struct rte_flow_item" has a "type" field. It tells the HW
> how
> > to interpret the values in a flow item.
> >
> > If we introduce a "generic" flow item type, it can only be used for multiple
> > purposes (i.e. both P4, but also other purposes than P4) if it has a "sub-
> type"
> > field. Otherwise, someone will create a "generic" flow item containing a P4
> > program and send it to a NIC, which uses the "generic" flow item type for
> > other program types, e.g. a cBPF program. And this cBPF capable NIC has no
> > way to detect that the blob in the flow item is not a cBPF program, but a P4
> > program. The P4 capable NIC will accept the P4 program, but will be confused
> > when sent the cBPF program understood by the first NIC.
> >
> > So I am suggesting that the "generic" flow items and actions follow an
> existing
> > and well known design patterns for how to identify the meaning of blobs:
> > Include a Vendor-ID followed by vendor-specific TLV formatted data.
> >
> > As I wrote initially, I am opposed to introducing uninterpretable blobs into
> > DPDK. Flow items/actions need to be well defined. Allowing "Vendor-Specific"
> > flow items/actions is a workaround that allows you to bypass the normal
> > standardization process.
> >
> 
> I would be happy to add mechanisms to describe the user-defined flow items
> and actions in greater detail. Would you be able to provide some examples for
> your proposal for a flow item and a flow action of your choice, please?
> Thanks!
> 
> One thing I would want to stress here: the flow items and flow actions are
> defined exclusively by the user (through their P4 program) without any
> knowledge or intervention from the HW vendor, so any TLVs / helper fields
> must be populated by the user as opposed to the HW vendor.

Perhaps I have completely misunderstood this patch...

I thought the purpose is for the user to define some generic flow items and actions, which are not in the list of DPDK standardized (and fully documented) RTE_FLOW items/actions, but are understood by a variety of programmable NICs from various HW vendors. In this case, each blob needs to be prefixed with a "type" field, so the HW can determine which of its processing engines needs to parse the blob. E.g. a NIC could have both a P4 processing engine and a BPF processing engine, so the blob needs to indicate which of the two engines to use for the provided flow item/action.

But maybe the purpose is completely different. Is the purpose of this patch to introduce flow items and flow actions, which each make the HW perform a "callback" to the user application? In this case, only the user application (handling the "callbacks") can understand them, and thus they are completely opaque to everything else.

> 
> <snip>
> 
> Regards,
> Cristian

Cristian Dumitrescu Aug. 16, 2023, 4:19 p.m. UTC | #21

Hi folks,

We just set up a community call for next week to discuss in more details the proposal for RTE_FLOW extensions to support P4-programmable devices https://mails.dpdk.org/archives/dev/2023-August/273703.html and look for ways to converge and make progress.

All the people from To: and CC: are already invited. To avoid cluttering people's calendars, I did not add dev@dpdk.org, so if anybody else wants to attend, please send me a private email and I will be happy to forward the invite.

Thanks,
Cristian

Ferruh Yigit Aug. 16, 2023, 5:08 p.m. UTC | #22

On 8/16/2023 3:20 PM, Morten Brørup wrote:
>> From: Dumitrescu, Cristian [mailto:cristian.dumitrescu@intel.com]
>> Sent: Wednesday, 16 August 2023 15.23
>>
>> Hi Morten,
>>
>> <snip>
>>
>>>>>
>>>>> In order to avoid conflicts between P4 and non-P4 generic flow
>>>> items/actions,
>>>>> the generic type should include information about how to interpret the
>>>>> information, which is why I suggest making it a Vendor-Specific type,
>> with
>>>>> vendor-specific TLV's (managed by the vendor), like the RADIUS Vendor-
>>>>> Specific attributes I compared to, instead of just an opaque blob.
>>>>
>>>> I like this idea, but it is not necessary to introduce a vendor-specific
>> type,
>>>> it could be considered a device-specific type (or port-specific in the
>> context
>>>> of DPDK).
>>>>
>>>> However, the PMD can manage a dictionary, enabling users to query about
>>> the
>>>> format of each generic item or action if we can expose a set of query
>> APIs.
>>>>
>>>> But I guess we don't need vendor-id / vendor-type as RADIUS does, as we
>>> have
>>>> port_id here.
>>>
>>> If the flow item itself doesn't have a "type" field (identifying how to
>> interpret
>>> the blob), you might have two different NICs using each their own blob
>>> format. The NIC must be able to determine if a given flow item is of a type
>> it
>>> can understand, before it tries to parse the blob in it.
>>>
>>> This is why the "struct rte_flow_item" has a "type" field. It tells the HW
>> how
>>> to interpret the values in a flow item.
>>>
>>> If we introduce a "generic" flow item type, it can only be used for multiple
>>> purposes (i.e. both P4, but also other purposes than P4) if it has a "sub-
>> type"
>>> field. Otherwise, someone will create a "generic" flow item containing a P4
>>> program and send it to a NIC, which uses the "generic" flow item type for
>>> other program types, e.g. a cBPF program. And this cBPF capable NIC has no
>>> way to detect that the blob in the flow item is not a cBPF program, but a P4
>>> program. The P4 capable NIC will accept the P4 program, but will be confused
>>> when sent the cBPF program understood by the first NIC.
>>>
>>> So I am suggesting that the "generic" flow items and actions follow an
>> existing
>>> and well known design patterns for how to identify the meaning of blobs:
>>> Include a Vendor-ID followed by vendor-specific TLV formatted data.
>>>
>>> As I wrote initially, I am opposed to introducing uninterpretable blobs into
>>> DPDK. Flow items/actions need to be well defined. Allowing "Vendor-Specific"
>>> flow items/actions is a workaround that allows you to bypass the normal
>>> standardization process.
>>>
>>
>> I would be happy to add mechanisms to describe the user-defined flow items
>> and actions in greater detail. Would you be able to provide some examples for
>> your proposal for a flow item and a flow action of your choice, please?
>> Thanks!
>>
>> One thing I would want to stress here: the flow items and flow actions are
>> defined exclusively by the user (through their P4 program) without any
>> knowledge or intervention from the HW vendor, so any TLVs / helper fields
>> must be populated by the user as opposed to the HW vendor.
> 
> Perhaps I have completely misunderstood this patch...
> 
> I thought the purpose is for the user to define some generic flow items and actions, which are not in the list of DPDK standardized (and fully documented) RTE_FLOW items/actions, but are understood by a variety of programmable NICs from various HW vendors. In this case, each blob needs to be prefixed with a "type" field, so the HW can determine which of its processing engines needs to parse the blob. E.g. a NIC could have both a P4 processing engine and a BPF processing engine, so the blob needs to indicate which of the two engines to use for the provided flow item/action.
> 
> But maybe the purpose is completely different. Is the purpose of this patch to introduce flow items and flow actions, which each make the HW perform a "callback" to the user application? In this case, only the user application (handling the "callbacks") can understand them, and thus they are completely opaque to everything else.
> 

@Morten, I agree that this is more "opaque" than "generic" and it is
open to abuse.

As far as I understand, purpose is NOT to implement unsupported RTE_FLOW
items/actions, if that is the case I agree with @Ori to figure out gaps
and implement them.


Purpose seems to provide control path for P4 pipelines.

Each P4 pipeline implementation can have set of tables, used for match
action for the pipeline, and these tables can be for anything, doesn't
have to be standard net functions or protocols etc..

To be able to dynamically program the pipeline, someone needs the
ability to program the tables, since these tables not limited to net
functions it is not possible to address these tables by rte_flow patterns.
Hence this opaque rte_flow pattern/action seems designed to dynamically
update/control the pipeline.


If above understanding is correct, I suggest using a middle ground,
instead of having wide open key/value based rte_flow item, rename it to
RTE_FLOW_ITEM_TYPE_P4[SOMETHING],

and have "struct rte_flow_item_p4[something]" to include all relevant
fields to program a P4 pipeline, like table_id/table_name, value etc..

This way new rte_flow item/action remains scoped and focused, but still
flexible enough to program any type of P4 pipeline.


Also we don't need to maintain a unique vendor ID etc, although what is
the table name/id, what it is for and what is the relevant action is HW
(even pipeline) specific, there is nothing to collide per vendor to manage.
With this approach code can be portable, same application can be used on
top of different HW that implements exact same P4 pipelines (assuming
driver implementations are complete.)

Cristian Dumitrescu Aug. 16, 2023, 5:18 p.m. UTC | #23

> -----Original Message-----
> From: Ferruh Yigit <ferruh.yigit@amd.com>
> Sent: Wednesday, August 16, 2023 6:08 PM
> To: Morten Brørup <mb@smartsharesystems.com>; Dumitrescu, Cristian
> <cristian.dumitrescu@intel.com>; Zhang, Qi Z <qi.z.zhang@intel.com>; Ori
> Kam <orika@nvidia.com>; Jerin Jacob <jerinjacobk@gmail.com>
> Cc: NBU-Contact-Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>;
> david.marchand@redhat.com; Richardson, Bruce
> <bruce.richardson@intel.com>; jerinj@marvell.com; techboard@dpdk.org;
> Mcnamara, John <john.mcnamara@intel.com>; Zhang, Helin
> <helin.zhang@intel.com>; dev@dpdk.org
> Subject: Re: [PATCH] ethdev: introduce generic flow item and action
> 
> On 8/16/2023 3:20 PM, Morten Brørup wrote:
> >> From: Dumitrescu, Cristian [mailto:cristian.dumitrescu@intel.com]
> >> Sent: Wednesday, 16 August 2023 15.23
> >>
> >> Hi Morten,
> >>
> >> <snip>
> >>
> >>>>>
> >>>>> In order to avoid conflicts between P4 and non-P4 generic flow
> >>>> items/actions,
> >>>>> the generic type should include information about how to interpret the
> >>>>> information, which is why I suggest making it a Vendor-Specific type,
> >> with
> >>>>> vendor-specific TLV's (managed by the vendor), like the RADIUS
> Vendor-
> >>>>> Specific attributes I compared to, instead of just an opaque blob.
> >>>>
> >>>> I like this idea, but it is not necessary to introduce a vendor-specific
> >> type,
> >>>> it could be considered a device-specific type (or port-specific in the
> >> context
> >>>> of DPDK).
> >>>>
> >>>> However, the PMD can manage a dictionary, enabling users to query
> about
> >>> the
> >>>> format of each generic item or action if we can expose a set of query
> >> APIs.
> >>>>
> >>>> But I guess we don't need vendor-id / vendor-type as RADIUS does, as
> we
> >>> have
> >>>> port_id here.
> >>>
> >>> If the flow item itself doesn't have a "type" field (identifying how to
> >> interpret
> >>> the blob), you might have two different NICs using each their own blob
> >>> format. The NIC must be able to determine if a given flow item is of a
> type
> >> it
> >>> can understand, before it tries to parse the blob in it.
> >>>
> >>> This is why the "struct rte_flow_item" has a "type" field. It tells the HW
> >> how
> >>> to interpret the values in a flow item.
> >>>
> >>> If we introduce a "generic" flow item type, it can only be used for
> multiple
> >>> purposes (i.e. both P4, but also other purposes than P4) if it has a "sub-
> >> type"
> >>> field. Otherwise, someone will create a "generic" flow item containing a
> P4
> >>> program and send it to a NIC, which uses the "generic" flow item type for
> >>> other program types, e.g. a cBPF program. And this cBPF capable NIC has
> no
> >>> way to detect that the blob in the flow item is not a cBPF program, but a
> P4
> >>> program. The P4 capable NIC will accept the P4 program, but will be
> confused
> >>> when sent the cBPF program understood by the first NIC.
> >>>
> >>> So I am suggesting that the "generic" flow items and actions follow an
> >> existing
> >>> and well known design patterns for how to identify the meaning of blobs:
> >>> Include a Vendor-ID followed by vendor-specific TLV formatted data.
> >>>
> >>> As I wrote initially, I am opposed to introducing uninterpretable blobs into
> >>> DPDK. Flow items/actions need to be well defined. Allowing "Vendor-
> Specific"
> >>> flow items/actions is a workaround that allows you to bypass the normal
> >>> standardization process.
> >>>
> >>
> >> I would be happy to add mechanisms to describe the user-defined flow
> items
> >> and actions in greater detail. Would you be able to provide some examples
> for
> >> your proposal for a flow item and a flow action of your choice, please?
> >> Thanks!
> >>
> >> One thing I would want to stress here: the flow items and flow actions are
> >> defined exclusively by the user (through their P4 program) without any
> >> knowledge or intervention from the HW vendor, so any TLVs / helper fields
> >> must be populated by the user as opposed to the HW vendor.
> >
> > Perhaps I have completely misunderstood this patch...
> >
> > I thought the purpose is for the user to define some generic flow items and
> actions, which are not in the list of DPDK standardized (and fully documented)
> RTE_FLOW items/actions, but are understood by a variety of programmable
> NICs from various HW vendors. In this case, each blob needs to be prefixed
> with a "type" field, so the HW can determine which of its processing engines
> needs to parse the blob. E.g. a NIC could have both a P4 processing engine
> and a BPF processing engine, so the blob needs to indicate which of the two
> engines to use for the provided flow item/action.
> >
> > But maybe the purpose is completely different. Is the purpose of this patch
> to introduce flow items and flow actions, which each make the HW perform a
> "callback" to the user application? In this case, only the user application
> (handling the "callbacks") can understand them, and thus they are completely
> opaque to everything else.
> >
> 
> @Morten, I agree that this is more "opaque" than "generic" and it is
> open to abuse.
> 
> As far as I understand, purpose is NOT to implement unsupported RTE_FLOW
> items/actions, if that is the case I agree with @Ori to figure out gaps
> and implement them.
> 

Exactly.

> 
> Purpose seems to provide control path for P4 pipelines.
> 

+1

> Each P4 pipeline implementation can have set of tables, used for match
> action for the pipeline, and these tables can be for anything, doesn't
> have to be standard net functions or protocols etc..
> 
> To be able to dynamically program the pipeline, someone needs the
> ability to program the tables, since these tables not limited to net
> functions it is not possible to address these tables by rte_flow patterns.
> Hence this opaque rte_flow pattern/action seems designed to dynamically
> update/control the pipeline.
> 

Yes.

> 
> If above understanding is correct, I suggest using a middle ground,
> instead of having wide open key/value based rte_flow item, rename it to
> RTE_FLOW_ITEM_TYPE_P4[SOMETHING],
> 
> and have "struct rte_flow_item_p4[something]" to include all relevant
> fields to program a P4 pipeline, like table_id/table_name, value etc..
> 
> This way new rte_flow item/action remains scoped and focused, but still
> flexible enough to program any type of P4 pipeline.
> 

If this approach would be acceptable, we'd be glad to go for it.

> 
> Also we don't need to maintain a unique vendor ID etc, although what is
> the table name/id, what it is for and what is the relevant action is HW
> (even pipeline) specific, there is nothing to collide per vendor to manage.
> With this approach code can be portable, same application can be used on
> top of different HW that implements exact same P4 pipelines (assuming
> driver implementations are complete.)
> 

Yes, flow items and actions are defined by the user, the vendor is happily
unaware about what the users are doing in their P4 programs.
> 
> 
> 
> 
> 
> 
> 
> 
>

Ori Kam Aug. 27, 2023, 7:48 a.m. UTC | #24

Hi,
Since I was on vacation, can someone please send the meeting summary?

Thanks,
Ori

> -----Original Message-----
> From: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>
> Sent: Wednesday, August 16, 2023 7:20 PM
> To: Morten Brørup <mb@smartsharesystems.com>; Zhang, Qi Z
> <qi.z.zhang@intel.com>; Ori Kam <orika@nvidia.com>; Jerin Jacob
> <jerinjacobk@gmail.com>
> Cc: NBU-Contact-Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>;
> david.marchand@redhat.com; Richardson, Bruce
> <bruce.richardson@intel.com>; jerinj@marvell.com; ferruh.yigit@amd.com;
> techboard@dpdk.org; Mcnamara, John <john.mcnamara@intel.com>; Zhang,
> Helin <helin.zhang@intel.com>; dev@dpdk.org
> Subject: DPDK community: RTE_FLOW support for P4-programmable devices
> 
> Hi folks,
> 
> We just set up a community call for next week to discuss in more details the
> proposal for RTE_FLOW extensions to support P4-programmable devices
> https://mails.dpdk.org/archives/dev/2023-August/273703.html and look for
> ways to converge and make progress.
> 
> All the people from To: and CC: are already invited. To avoid cluttering people's
> calendars, I did not add dev@dpdk.org, so if anybody else wants to attend,
> please send me a private email and I will be happy to forward the invite.
> 
> Thanks,
> Cristian

Cristian Dumitrescu Aug. 28, 2023, 4:12 p.m. UTC | #25

> > We just set up a community call for next week to discuss in more details the
> > proposal for RTE_FLOW extensions to support P4-programmable devices
> > https://mails.dpdk.org/archives/dev/2023-August/273703.html and look for
> > ways to converge and make progress.
> >
> > All the people from To: and CC: are already invited. To avoid cluttering
> people's
> > calendars, I did not add dev@dpdk.org, so if anybody else wants to attend,
> > please send me a private email and I will be happy to forward the invite.
> >
> > Thanks,
> > Cristian

Attendees: Morten Brorup, Jerin Jacob, Anoob Joseph, Vipin Varghese, Qi Zhang,
Cristian Dumitrescu

1. Ori (RTE_FLOW maintainer) and others were not present, probably on vacation,
hopefully they will be able to attend the next call on this topic. Ferruh had a last
minute conflict that he could not avoid.

2. Cristian presented a few slides (attached) with the problem statement, current
RTE_FLOW gaps for P4-programmable devices and the list of current solution
proposals.

3. Everybody on the call agreed that the P4-programmable devices from Intel,
AMD and others need to be fully supported by DPDK and that there are some
gaps in RTE_FLOW to be fixed for supporting these devices.

3. Ori suggested in a previous email to potentially introduce a registration API
In RTE_FLOW for user-defined flow items and actions. Cristian replied with a 
proposal on the email list and currently wating for Ori's reply (see also proposal
#2 on slide 5).

4. Will setup a follow-up call in early September.

Regards,
Cristian

Jerin Jacob Aug. 29, 2023, 7:38 a.m. UTC | #26

On Mon, Aug 28, 2023 at 9:43 PM Dumitrescu, Cristian
<cristian.dumitrescu@intel.com> wrote:
>
> > > We just set up a community call for next week to discuss in more details the
> > > proposal for RTE_FLOW extensions to support P4-programmable devices
> > > https://mails.dpdk.org/archives/dev/2023-August/273703.html and look for
> > > ways to converge and make progress.
> > >
> > > All the people from To: and CC: are already invited. To avoid cluttering
> > people's
> > > calendars, I did not add dev@dpdk.org, so if anybody else wants to attend,
> > > please send me a private email and I will be happy to forward the invite.
> > >
> > > Thanks,
> > > Cristian
>
> Attendees: Morten Brorup, Jerin Jacob, Anoob Joseph, Vipin Varghese, Qi Zhang,
> Cristian Dumitrescu
>
> 1. Ori (RTE_FLOW maintainer) and others were not present, probably on vacation,
> hopefully they will be able to attend the next call on this topic. Ferruh had a last
> minute conflict that he could not avoid.
>
> 2. Cristian presented a few slides (attached) with the problem statement, current
> RTE_FLOW gaps for P4-programmable devices and the list of current solution
> proposals.
>
> 3. Everybody on the call agreed that the P4-programmable devices from Intel,
> AMD and others need to be fully supported by DPDK and that there are some
> gaps in RTE_FLOW to be fixed for supporting these devices.

Personally, It makes sense to me to have normative DPDK API to send p4
runtime message to the
ethdev so that we have "vendor neutral + DPDK based" p4 runtime backend.

I prefer to have specialized ethdev ops for this due to the following reasons.

# If the ethdev has both real TCAM based HW(for existing rte_flow
patterns and actions) and SW agent to receive P4 runtime message etc.
Typically, it needs to take a different path in driver to talk. Assume, if you
have cascaded patterns/actions, One is targeted for TCAM and other for
SW agent for p4, one
need to have serious amount checking for dispatching.It complicates
the driver and forbid to have
driver optimization especially cases for templates etc. if user making
rules for both category of HW.

# All we need "char buffer//string" based communication ethdev <-> app.

Ferruh Yigit Aug. 29, 2023, 10:18 a.m. UTC | #27

On 8/29/2023 8:38 AM, Jerin Jacob wrote:
> On Mon, Aug 28, 2023 at 9:43 PM Dumitrescu, Cristian
> <cristian.dumitrescu@intel.com> wrote:
>>
>>>> We just set up a community call for next week to discuss in more details the
>>>> proposal for RTE_FLOW extensions to support P4-programmable devices
>>>> https://mails.dpdk.org/archives/dev/2023-August/273703.html and look for
>>>> ways to converge and make progress.
>>>>
>>>> All the people from To: and CC: are already invited. To avoid cluttering
>>> people's
>>>> calendars, I did not add dev@dpdk.org, so if anybody else wants to attend,
>>>> please send me a private email and I will be happy to forward the invite.
>>>>
>>>> Thanks,
>>>> Cristian
>>
>> Attendees: Morten Brorup, Jerin Jacob, Anoob Joseph, Vipin Varghese, Qi Zhang,
>> Cristian Dumitrescu
>>
>> 1. Ori (RTE_FLOW maintainer) and others were not present, probably on vacation,
>> hopefully they will be able to attend the next call on this topic. Ferruh had a last
>> minute conflict that he could not avoid.
>>
>> 2. Cristian presented a few slides (attached) with the problem statement, current
>> RTE_FLOW gaps for P4-programmable devices and the list of current solution
>> proposals.
>>
>> 3. Everybody on the call agreed that the P4-programmable devices from Intel,
>> AMD and others need to be fully supported by DPDK and that there are some
>> gaps in RTE_FLOW to be fixed for supporting these devices.
> 
> Personally, It makes sense to me to have normative DPDK API to send p4
> runtime message to the
> ethdev so that we have "vendor neutral + DPDK based" p4 runtime backend.
> 
> I prefer to have specialized ethdev ops for this due to the following reasons.
> 
> # If the ethdev has both real TCAM based HW(for existing rte_flow
> patterns and actions) and SW agent to receive P4 runtime message etc.
> Typically, it needs to take a different path in driver to talk. Assume, if you
> have cascaded patterns/actions, One is targeted for TCAM and other for
> SW agent for p4, one
> need to have serious amount checking for dispatching.It complicates
> the driver and forbid to have
> driver optimization especially cases for templates etc. if user making
> rules for both category of HW.
> 

Indeed I am not against dedicated APIs for P4 runtime backend.

But assuming there is a dedicated rte_flow item for P4, how it is
different than dedicated API in above scenario?
If driver detects P4 runtime specific rule, it can bail it out to SW agent.
Can you please elaborate the complexity it introduces?

> # All we need "char buffer//string" based communication ethdev <-> app.
>

Yes, and both a dedicated API or dedicated rte_flow item can provide
medium for this communication.

rte_flow one has flexibility & extensibility advantages, but maybe not
as straightforward as an API.

Ori Kam Aug. 31, 2023, 10:32 a.m. UTC | #28

Hi

> -----Original Message-----
> From: Ferruh Yigit <ferruh.yigit@amd.com>
> Sent: Tuesday, August 29, 2023 1:18 PM
> devices
> 
> On 8/29/2023 8:38 AM, Jerin Jacob wrote:
> > On Mon, Aug 28, 2023 at 9:43 PM Dumitrescu, Cristian
> > <cristian.dumitrescu@intel.com> wrote:
> >>
> >>>> We just set up a community call for next week to discuss in more details
> the
> >>>> proposal for RTE_FLOW extensions to support P4-programmable devices
> >>>> https://mails.dpdk.org/archives/dev/2023-August/273703.html and look
> for
> >>>> ways to converge and make progress.
> >>>>
> >>>> All the people from To: and CC: are already invited. To avoid cluttering
> >>> people's
> >>>> calendars, I did not add dev@dpdk.org, so if anybody else wants to attend,
> >>>> please send me a private email and I will be happy to forward the invite.
> >>>>
> >>>> Thanks,
> >>>> Cristian
> >>
> >> Attendees: Morten Brorup, Jerin Jacob, Anoob Joseph, Vipin Varghese, Qi
> Zhang,
> >> Cristian Dumitrescu
> >>
> >> 1. Ori (RTE_FLOW maintainer) and others were not present, probably on
> vacation,
> >> hopefully they will be able to attend the next call on this topic. Ferruh had a
> last
> >> minute conflict that he could not avoid.
> >>
> >> 2. Cristian presented a few slides (attached) with the problem statement,
> current
> >> RTE_FLOW gaps for P4-programmable devices and the list of current
> solution
> >> proposals.
> >>
> >> 3. Everybody on the call agreed that the P4-programmable devices from
> Intel,
> >> AMD and others need to be fully supported by DPDK and that there are
> some
> >> gaps in RTE_FLOW to be fixed for supporting these devices.
> >
> > Personally, It makes sense to me to have normative DPDK API to send p4
> > runtime message to the
> > ethdev so that we have "vendor neutral + DPDK based" p4 runtime backend.
> >
> > I prefer to have specialized ethdev ops for this due to the following reasons.
> >
> > # If the ethdev has both real TCAM based HW(for existing rte_flow
> > patterns and actions) and SW agent to receive P4 runtime message etc.
> > Typically, it needs to take a different path in driver to talk. Assume, if you
> > have cascaded patterns/actions, One is targeted for TCAM and other for
> > SW agent for p4, one
> > need to have serious amount checking for dispatching.It complicates
> > the driver and forbid to have
> > driver optimization especially cases for templates etc. if user making
> > rules for both category of HW.
> >
> 
> Indeed I am not against dedicated APIs for P4 runtime backend.
> 
> But assuming there is a dedicated rte_flow item for P4, how it is
> different than dedicated API in above scenario?
> If driver detects P4 runtime specific rule, it can bail it out to SW agent.
> Can you please elaborate the complexity it introduces?
> 
> > # All we need "char buffer//string" based communication ethdev <-> app.
> >
> 
> Yes, and both a dedicated API or dedicated rte_flow item can provide
> medium for this communication.
> 
> rte_flow one has flexibility & extensibility advantages, but maybe not
> as straightforward as an API.

I think not using the rte_flow will also require duplication of all the rule handling functions and table creations, for example aync rule create/destroy query ......

Stephen Hemminger Aug. 31, 2023, 1:42 p.m. UTC | #29

It also introduces lots of new test conditions. For example, can a P4 flow
be deleted, supersed, or read by a flow created by rte_flow.

On Thu, Aug 31, 2023, 12:32 PM Ori Kam <orika@nvidia.com> wrote:

> Hi
>
> > -----Original Message-----
> > From: Ferruh Yigit <ferruh.yigit@amd.com>
> > Sent: Tuesday, August 29, 2023 1:18 PM
> > devices
> >
> > On 8/29/2023 8:38 AM, Jerin Jacob wrote:
> > > On Mon, Aug 28, 2023 at 9:43 PM Dumitrescu, Cristian
> > > <cristian.dumitrescu@intel.com> wrote:
> > >>
> > >>>> We just set up a community call for next week to discuss in more
> details
> > the
> > >>>> proposal for RTE_FLOW extensions to support P4-programmable devices
> > >>>> https://mails.dpdk.org/archives/dev/2023-August/273703.html and
> look
> > for
> > >>>> ways to converge and make progress.
> > >>>>
> > >>>> All the people from To: and CC: are already invited. To avoid
> cluttering
> > >>> people's
> > >>>> calendars, I did not add dev@dpdk.org, so if anybody else wants to
> attend,
> > >>>> please send me a private email and I will be happy to forward the
> invite.
> > >>>>
> > >>>> Thanks,
> > >>>> Cristian
> > >>
> > >> Attendees: Morten Brorup, Jerin Jacob, Anoob Joseph, Vipin Varghese,
> Qi
> > Zhang,
> > >> Cristian Dumitrescu
> > >>
> > >> 1. Ori (RTE_FLOW maintainer) and others were not present, probably on
> > vacation,
> > >> hopefully they will be able to attend the next call on this topic.
> Ferruh had a
> > last
> > >> minute conflict that he could not avoid.
> > >>
> > >> 2. Cristian presented a few slides (attached) with the problem
> statement,
> > current
> > >> RTE_FLOW gaps for P4-programmable devices and the list of current
> > solution
> > >> proposals.
> > >>
> > >> 3. Everybody on the call agreed that the P4-programmable devices from
> > Intel,
> > >> AMD and others need to be fully supported by DPDK and that there are
> > some
> > >> gaps in RTE_FLOW to be fixed for supporting these devices.
> > >
> > > Personally, It makes sense to me to have normative DPDK API to send p4
> > > runtime message to the
> > > ethdev so that we have "vendor neutral + DPDK based" p4 runtime
> backend.
> > >
> > > I prefer to have specialized ethdev ops for this due to the following
> reasons.
> > >
> > > # If the ethdev has both real TCAM based HW(for existing rte_flow
> > > patterns and actions) and SW agent to receive P4 runtime message etc.
> > > Typically, it needs to take a different path in driver to talk.
> Assume, if you
> > > have cascaded patterns/actions, One is targeted for TCAM and other for
> > > SW agent for p4, one
> > > need to have serious amount checking for dispatching.It complicates
> > > the driver and forbid to have
> > > driver optimization especially cases for templates etc. if user making
> > > rules for both category of HW.
> > >
> >
> > Indeed I am not against dedicated APIs for P4 runtime backend.
> >
> > But assuming there is a dedicated rte_flow item for P4, how it is
> > different than dedicated API in above scenario?
> > If driver detects P4 runtime specific rule, it can bail it out to SW
> agent.
> > Can you please elaborate the complexity it introduces?
> >
> > > # All we need "char buffer//string" based communication ethdev <-> app.
> > >
> >
> > Yes, and both a dedicated API or dedicated rte_flow item can provide
> > medium for this communication.
> >
> > rte_flow one has flexibility & extensibility advantages, but maybe not
> > as straightforward as an API.
>
> I think not using the rte_flow will also require duplication of all the
> rule handling functions and table creations, for example aync rule
> create/destroy query ......
>
>

Jerin Jacob Sept. 1, 2023, 2:07 a.m. UTC | #30

On Thu, Aug 31, 2023 at 4:02 PM Ori Kam <orika@nvidia.com> wrote:
>
> Hi

> > >>
> > >> 3. Everybody on the call agreed that the P4-programmable devices from
> > Intel,
> > >> AMD and others need to be fully supported by DPDK and that there are
> > some
> > >> gaps in RTE_FLOW to be fixed for supporting these devices.
> > >
> > > Personally, It makes sense to me to have normative DPDK API to send p4
> > > runtime message to the
> > > ethdev so that we have "vendor neutral + DPDK based" p4 runtime backend.
> > >
> > > I prefer to have specialized ethdev ops for this due to the following reasons.
> > >
> > > # If the ethdev has both real TCAM based HW(for existing rte_flow
> > > patterns and actions) and SW agent to receive P4 runtime message etc.
> > > Typically, it needs to take a different path in driver to talk. Assume, if you
> > > have cascaded patterns/actions, One is targeted for TCAM and other for
> > > SW agent for p4, one
> > > need to have serious amount checking for dispatching.It complicates
> > > the driver and forbid to have
> > > driver optimization especially cases for templates etc. if user making
> > > rules for both category of HW.
> > >
> >
> > Indeed I am not against dedicated APIs for P4 runtime backend.
> >
> > But assuming there is a dedicated rte_flow item for P4, how it is
> > different than dedicated API in above scenario?
> > If driver detects P4 runtime specific rule, it can bail it out to SW agent.
> > Can you please elaborate the complexity it introduces?

Assume, normal existing rte-flow programming include a bunch of
register writes and
p4 runtime backend is more of SW ring. If a template has both patterns
and actions
as cascaded, it will be difficult for driver to optimize the template.


> >
> > > # All we need "char buffer//string" based communication ethdev <-> app.
> > >
> >
> > Yes, and both a dedicated API or dedicated rte_flow item can provide
> > medium for this communication.
> >
> > rte_flow one has flexibility & extensibility advantages, but maybe not
> > as straightforward as an API.
>
> I think not using the rte_flow will also require duplication of all the rule handling functions and table creations, for example aync rule create/destroy query ......

Yes. That is a fair point. I am OK with exposing as rte_flow.
As a driver implementation note, to get rid of the above problem,
driver can choose to have pseudo ethdev devices for p4 if needed(if
driver finds difficult to combine TCAM based on HW rules and p4
runtime rule).


>

Ori Kam Sept. 1, 2023, 6:57 a.m. UTC | #31

> -----Original Message-----
> From: Jerin Jacob <jerinjacobk@gmail.com>
> Sent: Friday, September 1, 2023 5:07 AM
> 
> On Thu, Aug 31, 2023 at 4:02 PM Ori Kam <orika@nvidia.com> wrote:
> >
> > Hi
> 
> > > >>
> > > >> 3. Everybody on the call agreed that the P4-programmable devices from
> > > Intel,
> > > >> AMD and others need to be fully supported by DPDK and that there are
> > > some
> > > >> gaps in RTE_FLOW to be fixed for supporting these devices.
> > > >
> > > > Personally, It makes sense to me to have normative DPDK API to send p4
> > > > runtime message to the
> > > > ethdev so that we have "vendor neutral + DPDK based" p4 runtime
> backend.
> > > >
> > > > I prefer to have specialized ethdev ops for this due to the following
> reasons.
> > > >
> > > > # If the ethdev has both real TCAM based HW(for existing rte_flow
> > > > patterns and actions) and SW agent to receive P4 runtime message etc.
> > > > Typically, it needs to take a different path in driver to talk. Assume, if you
> > > > have cascaded patterns/actions, One is targeted for TCAM and other for
> > > > SW agent for p4, one
> > > > need to have serious amount checking for dispatching.It complicates
> > > > the driver and forbid to have
> > > > driver optimization especially cases for templates etc. if user making
> > > > rules for both category of HW.
> > > >
> > >
> > > Indeed I am not against dedicated APIs for P4 runtime backend.
> > >
> > > But assuming there is a dedicated rte_flow item for P4, how it is
> > > different than dedicated API in above scenario?
> > > If driver detects P4 runtime specific rule, it can bail it out to SW agent.
> > > Can you please elaborate the complexity it introduces?
> 
> Assume, normal existing rte-flow programming include a bunch of
> register writes and
> p4 runtime backend is more of SW ring. If a template has both patterns
> and actions
> as cascaded, it will be difficult for driver to optimize the template.
> 
> 
> > >
> > > > # All we need "char buffer//string" based communication ethdev <-> app.
> > > >
> > >
> > > Yes, and both a dedicated API or dedicated rte_flow item can provide
> > > medium for this communication.
> > >
> > > rte_flow one has flexibility & extensibility advantages, but maybe not
> > > as straightforward as an API.
> >
> > I think not using the rte_flow will also require duplication of all the rule
> handling functions and table creations, for example aync rule create/destroy
> query ......
> 
> Yes. That is a fair point. I am OK with exposing as rte_flow.
> As a driver implementation note, to get rid of the above problem,
> driver can choose to have pseudo ethdev devices for p4 if needed(if
> driver finds difficult to combine TCAM based on HW rules and p4
> runtime rule).
> 

What about the following concept:
The p4 compiler can generate the translation to known PMD objects in rte_flow,
then when a command is sent from the p4 agent to the offload using GRPC or any other way, the DPDK will convert from 
p4 protocol to rte_flow commands (this should be very fast since the commands are known and the mapping is already
defined).

To support the above idea we need to add two new functions to rte_flow (each function will be implemented in PMD level)
Rte_flow_register_p4(void *p4_info, void *p4_blob)
{
	Creates the static structures/objects
	Internal register the p4 commands to PMD translation table.
}

Rte_flow_p4_runtime(p4 command based on the p4 spec)
{
	Based on the registered mapping, translate the command to rte_flow commands.
	Rte_flow_xxx() calls
}

As far as I see, the above code will also allow PMD to implement internal logic if needed, while from DPDK API,
we will only add two new functions.

> 
> >

Jerin Jacob Sept. 1, 2023, 9:52 a.m. UTC | #32

On Fri, Sep 1, 2023 at 12:27 PM Ori Kam <orika@nvidia.com> wrote:
>
>
>
> > -----Original Message-----
> > From: Jerin Jacob <jerinjacobk@gmail.com>
> > Sent: Friday, September 1, 2023 5:07 AM
> >
> > On Thu, Aug 31, 2023 at 4:02 PM Ori Kam <orika@nvidia.com> wrote:
> > >
> > > Hi
> >
> > > > >>
> > > > >> 3. Everybody on the call agreed that the P4-programmable devices from
> > > > Intel,
> > > > >> AMD and others need to be fully supported by DPDK and that there are
> > > > some
> > > > >> gaps in RTE_FLOW to be fixed for supporting these devices.
> > > > >
> > > > > Personally, It makes sense to me to have normative DPDK API to send p4
> > > > > runtime message to the
> > > > > ethdev so that we have "vendor neutral + DPDK based" p4 runtime
> > backend.
> > > > >
> > > > > I prefer to have specialized ethdev ops for this due to the following
> > reasons.
> > > > >
> > > > > # If the ethdev has both real TCAM based HW(for existing rte_flow
> > > > > patterns and actions) and SW agent to receive P4 runtime message etc.
> > > > > Typically, it needs to take a different path in driver to talk. Assume, if you
> > > > > have cascaded patterns/actions, One is targeted for TCAM and other for
> > > > > SW agent for p4, one
> > > > > need to have serious amount checking for dispatching.It complicates
> > > > > the driver and forbid to have
> > > > > driver optimization especially cases for templates etc. if user making
> > > > > rules for both category of HW.
> > > > >
> > > >
> > > > Indeed I am not against dedicated APIs for P4 runtime backend.
> > > >
> > > > But assuming there is a dedicated rte_flow item for P4, how it is
> > > > different than dedicated API in above scenario?
> > > > If driver detects P4 runtime specific rule, it can bail it out to SW agent.
> > > > Can you please elaborate the complexity it introduces?
> >
> > Assume, normal existing rte-flow programming include a bunch of
> > register writes and
> > p4 runtime backend is more of SW ring. If a template has both patterns
> > and actions
> > as cascaded, it will be difficult for driver to optimize the template.
> >
> >
> > > >
> > > > > # All we need "char buffer//string" based communication ethdev <-> app.
> > > > >
> > > >
> > > > Yes, and both a dedicated API or dedicated rte_flow item can provide
> > > > medium for this communication.
> > > >
> > > > rte_flow one has flexibility & extensibility advantages, but maybe not
> > > > as straightforward as an API.
> > >
> > > I think not using the rte_flow will also require duplication of all the rule
> > handling functions and table creations, for example aync rule create/destroy
> > query ......
> >
> > Yes. That is a fair point. I am OK with exposing as rte_flow.
> > As a driver implementation note, to get rid of the above problem,
> > driver can choose to have pseudo ethdev devices for p4 if needed(if
> > driver finds difficult to combine TCAM based on HW rules and p4
> > runtime rule).
> >
>
> What about the following concept:
> The p4 compiler can generate the translation to known PMD objects in rte_flow,
> then when a command is sent from the p4 agent to the offload using GRPC or any other way, the DPDK will convert from
> p4 protocol to rte_flow commands (this should be very fast since the commands are known and the mapping is already
> defined).
>
> To support the above idea we need to add two new functions to rte_flow (each function will be implemented in PMD level)

If the implemention of rte_flow_p4_runtime((p4 command based on the p4 spec))
is defined in PMD level, The PMD driver to decide to use rte_flow or
something else.
I think it is good, this is actually going back to specialized API.
BTW, rte_flow prefix is not needed in that case.



> Rte_flow_register_p4(void *p4_info, void *p4_blob)
> {
>         Creates the static structures/objects
>         Internal register the p4 commands to PMD translation table.
> }
>
> Rte_flow_p4_runtime(p4 command based on the p4 spec)
> {
>         Based on the registered mapping, translate the command to rte_flow commands.
>         Rte_flow_xxx() calls
> }
>
> As far as I see, the above code will also allow PMD to implement internal logic if needed, while from DPDK API,
> we will only add two new functions.
>
> >
> > >

Ferruh Yigit Sept. 4, 2023, 7:45 a.m. UTC | #33

On 9/1/2023 10:52 AM, Jerin Jacob wrote:
> On Fri, Sep 1, 2023 at 12:27 PM Ori Kam <orika@nvidia.com> wrote:
>>
>>
>>
>>> -----Original Message-----
>>> From: Jerin Jacob <jerinjacobk@gmail.com>
>>> Sent: Friday, September 1, 2023 5:07 AM
>>>
>>> On Thu, Aug 31, 2023 at 4:02 PM Ori Kam <orika@nvidia.com> wrote:
>>>>
>>>> Hi
>>>
>>>>>>>
>>>>>>> 3. Everybody on the call agreed that the P4-programmable devices from
>>>>> Intel,
>>>>>>> AMD and others need to be fully supported by DPDK and that there are
>>>>> some
>>>>>>> gaps in RTE_FLOW to be fixed for supporting these devices.
>>>>>>
>>>>>> Personally, It makes sense to me to have normative DPDK API to send p4
>>>>>> runtime message to the
>>>>>> ethdev so that we have "vendor neutral + DPDK based" p4 runtime
>>> backend.
>>>>>>
>>>>>> I prefer to have specialized ethdev ops for this due to the following
>>> reasons.
>>>>>>
>>>>>> # If the ethdev has both real TCAM based HW(for existing rte_flow
>>>>>> patterns and actions) and SW agent to receive P4 runtime message etc.
>>>>>> Typically, it needs to take a different path in driver to talk. Assume, if you
>>>>>> have cascaded patterns/actions, One is targeted for TCAM and other for
>>>>>> SW agent for p4, one
>>>>>> need to have serious amount checking for dispatching.It complicates
>>>>>> the driver and forbid to have
>>>>>> driver optimization especially cases for templates etc. if user making
>>>>>> rules for both category of HW.
>>>>>>
>>>>>
>>>>> Indeed I am not against dedicated APIs for P4 runtime backend.
>>>>>
>>>>> But assuming there is a dedicated rte_flow item for P4, how it is
>>>>> different than dedicated API in above scenario?
>>>>> If driver detects P4 runtime specific rule, it can bail it out to SW agent.
>>>>> Can you please elaborate the complexity it introduces?
>>>
>>> Assume, normal existing rte-flow programming include a bunch of
>>> register writes and
>>> p4 runtime backend is more of SW ring. If a template has both patterns
>>> and actions
>>> as cascaded, it will be difficult for driver to optimize the template.
>>>

I assumed P4 specific item/action can take completely separate path, do
you think existing rte_flow item/actions and P4 item/actions can mix in
same rte_flow rule?


>>>
>>>>>
>>>>>> # All we need "char buffer//string" based communication ethdev <-> app.
>>>>>>
>>>>>
>>>>> Yes, and both a dedicated API or dedicated rte_flow item can provide
>>>>> medium for this communication.
>>>>>
>>>>> rte_flow one has flexibility & extensibility advantages, but maybe not
>>>>> as straightforward as an API.
>>>>
>>>> I think not using the rte_flow will also require duplication of all the rule
>>> handling functions and table creations, for example aync rule create/destroy
>>> query ......
>>>
>>> Yes. That is a fair point. I am OK with exposing as rte_flow.
>>> As a driver implementation note, to get rid of the above problem,
>>> driver can choose to have pseudo ethdev devices for p4 if needed(if
>>> driver finds difficult to combine TCAM based on HW rules and p4
>>> runtime rule).
>>>
>>
>> What about the following concept:
>> The p4 compiler can generate the translation to known PMD objects in rte_flow,
>> then when a command is sent from the p4 agent to the offload using GRPC or any other way, the DPDK will convert from
>> p4 protocol to rte_flow commands (this should be very fast since the commands are known and the mapping is already
>> defined).
>>
>> To support the above idea we need to add two new functions to rte_flow (each function will be implemented in PMD level)
> 
> If the implemention of rte_flow_p4_runtime((p4 command based on the p4 spec))
> is defined in PMD level, The PMD driver to decide to use rte_flow or
> something else.
> I think it is good, this is actually going back to specialized API.
> BTW, rte_flow prefix is not needed in that case.
> 

+1 this is leaning to the dedicated API option.

> 
> 
>> Rte_flow_register_p4(void *p4_info, void *p4_blob)
>> {
>>         Creates the static structures/objects
>>         Internal register the p4 commands to PMD translation table.
>> }
>>
>> Rte_flow_p4_runtime(p4 command based on the p4 spec)
>> {
>>         Based on the registered mapping, translate the command to rte_flow commands.
>>         Rte_flow_xxx() calls

For the 'rte_flow_xxx()' calls here, my understanding is existing
rte_flow calls are not sufficient to meet the requirement to program the
P4 pipeline, that is why this patch is about defining the new flow item.

If these are not rte_flow calls but PMD defined code, than it is not
much different from previously suggested dedicated API solution.


>> }
>>
>> As far as I see, the above code will also allow PMD to implement internal logic if needed, while from DPDK API,
>> we will only add two new functions.
>>
>>>
>>>>

Ori Kam Sept. 6, 2023, 8:30 a.m. UTC | #34

> -----Original Message-----
> From: Ferruh Yigit <ferruh.yigit@amd.com>
> Sent: Monday, September 4, 2023 10:46 AM
> 
> On 9/1/2023 10:52 AM, Jerin Jacob wrote:
> > On Fri, Sep 1, 2023 at 12:27 PM Ori Kam <orika@nvidia.com> wrote:
> >>
> >>
> >>
> >>> -----Original Message-----
> >>> From: Jerin Jacob <jerinjacobk@gmail.com>
> >>> Sent: Friday, September 1, 2023 5:07 AM
> >>>
> >>> On Thu, Aug 31, 2023 at 4:02 PM Ori Kam <orika@nvidia.com> wrote:
> >>>>
> >>>> Hi
> >>>
> >>>>>>>
> >>>>>>> 3. Everybody on the call agreed that the P4-programmable devices
> from
> >>>>> Intel,
> >>>>>>> AMD and others need to be fully supported by DPDK and that there are
> >>>>> some
> >>>>>>> gaps in RTE_FLOW to be fixed for supporting these devices.
> >>>>>>
> >>>>>> Personally, It makes sense to me to have normative DPDK API to send
> p4
> >>>>>> runtime message to the
> >>>>>> ethdev so that we have "vendor neutral + DPDK based" p4 runtime
> >>> backend.
> >>>>>>
> >>>>>> I prefer to have specialized ethdev ops for this due to the following
> >>> reasons.
> >>>>>>
> >>>>>> # If the ethdev has both real TCAM based HW(for existing rte_flow
> >>>>>> patterns and actions) and SW agent to receive P4 runtime message etc.
> >>>>>> Typically, it needs to take a different path in driver to talk. Assume, if
> you
> >>>>>> have cascaded patterns/actions, One is targeted for TCAM and other for
> >>>>>> SW agent for p4, one
> >>>>>> need to have serious amount checking for dispatching.It complicates
> >>>>>> the driver and forbid to have
> >>>>>> driver optimization especially cases for templates etc. if user making
> >>>>>> rules for both category of HW.
> >>>>>>
> >>>>>
> >>>>> Indeed I am not against dedicated APIs for P4 runtime backend.
> >>>>>
> >>>>> But assuming there is a dedicated rte_flow item for P4, how it is
> >>>>> different than dedicated API in above scenario?
> >>>>> If driver detects P4 runtime specific rule, it can bail it out to SW agent.
> >>>>> Can you please elaborate the complexity it introduces?
> >>>
> >>> Assume, normal existing rte-flow programming include a bunch of
> >>> register writes and
> >>> p4 runtime backend is more of SW ring. If a template has both patterns
> >>> and actions
> >>> as cascaded, it will be difficult for driver to optimize the template.
> >>>
> 
> I assumed P4 specific item/action can take completely separate path, do
> you think existing rte_flow item/actions and P4 item/actions can mix in
> same rte_flow rule?
> 

I don't think one rule will use both APIs

> 
> >>>
> >>>>>
> >>>>>> # All we need "char buffer//string" based communication ethdev <->
> app.
> >>>>>>
> >>>>>
> >>>>> Yes, and both a dedicated API or dedicated rte_flow item can provide
> >>>>> medium for this communication.
> >>>>>
> >>>>> rte_flow one has flexibility & extensibility advantages, but maybe not
> >>>>> as straightforward as an API.
> >>>>
> >>>> I think not using the rte_flow will also require duplication of all the rule
> >>> handling functions and table creations, for example aync rule
> create/destroy
> >>> query ......
> >>>
> >>> Yes. That is a fair point. I am OK with exposing as rte_flow.
> >>> As a driver implementation note, to get rid of the above problem,
> >>> driver can choose to have pseudo ethdev devices for p4 if needed(if
> >>> driver finds difficult to combine TCAM based on HW rules and p4
> >>> runtime rule).
> >>>
> >>
> >> What about the following concept:
> >> The p4 compiler can generate the translation to known PMD objects in
> rte_flow,
> >> then when a command is sent from the p4 agent to the offload using GRPC or
> any other way, the DPDK will convert from
> >> p4 protocol to rte_flow commands (this should be very fast since the
> commands are known and the mapping is already
> >> defined).
> >>
> >> To support the above idea we need to add two new functions to rte_flow
> (each function will be implemented in PMD level)
> >
> > If the implemention of rte_flow_p4_runtime((p4 command based on the p4
> spec))
> > is defined in PMD level, The PMD driver to decide to use rte_flow or
> > something else.
> > I think it is good, this is actually going back to specialized API.
> > BTW, rte_flow prefix is not needed in that case.
> >
> 
> +1 this is leaning to the dedicated API option.

I agree that we need some dedicated API but it is very lean,
and doesn't have any duplication with rte_flow.
In my viewpoint there are only the two suggested functions,
I don't want to see a function to insert rules or new items/actions.
In other words, it is only 2 more functions and not a new lib.

One more thing ideally the complier output format will be in rte_flow format,
And the about function are only for the grpc interface.


> 
> >
> >
> >> Rte_flow_register_p4(void *p4_info, void *p4_blob)
> >> {
> >>         Creates the static structures/objects
> >>         Internal register the p4 commands to PMD translation table.
> >> }
> >>
> >> Rte_flow_p4_runtime(p4 command based on the p4 spec)
> >> {
> >>         Based on the registered mapping, translate the command to rte_flow
> commands.
> >>         Rte_flow_xxx() calls
> 
> For the 'rte_flow_xxx()' calls here, my understanding is existing
> rte_flow calls are not sufficient to meet the requirement to program the
> P4 pipeline, that is why this patch is about defining the new flow item.
> 
I think rte_flow is good and doesn’t miss much,
I will be more than happy to see what is missing and add it. We have a complete session about it
in the summit.
Saying that PMD can do whatever it wants internally.

> If these are not rte_flow calls but PMD defined code, than it is not
> much different from previously suggested dedicated API solution.
> 

I agree, this is much closer to the original suggestion,
I think the correct way is that the compiler will output rte_flow
structures. 
This was my original thinking and I still think this is the best one.
And if something is missing in rte_flow we need to add it.

> 
> >> }
> >>
> >> As far as I see, the above code will also allow PMD to implement internal
> logic if needed, while from DPDK API,
> >> we will only add two new functions.
> >>
> >>>
> >>>>

Best,
Ori

Cristian Dumitrescu Sept. 11, 2023, 8:20 p.m. UTC | #35

> -----Original Message-----
> From: Ori Kam <orika@nvidia.com>
> Sent: Friday, September 1, 2023 7:58 AM
> To: Jerin Jacob <jerinjacobk@gmail.com>
> Cc: Ferruh Yigit <ferruh.yigit@amd.com>; Dumitrescu, Cristian
> <cristian.dumitrescu@intel.com>; Morten Brørup
> <mb@smartsharesystems.com>; Zhang, Qi Z <qi.z.zhang@intel.com>; NBU-
> Contact-Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>;
> david.marchand@redhat.com; Richardson, Bruce <bruce.richardson@intel.com>;
> jerinj@marvell.com; techboard@dpdk.org; Mcnamara, John
> <john.mcnamara@intel.com>; Zhang, Helin <helin.zhang@intel.com>;
> dev@dpdk.org
> Subject: RE: DPDK community: RTE_FLOW support for P4-programmable devices
> 
> 
> 
> > -----Original Message-----
> > From: Jerin Jacob <jerinjacobk@gmail.com>
> > Sent: Friday, September 1, 2023 5:07 AM
> >
> > On Thu, Aug 31, 2023 at 4:02 PM Ori Kam <orika@nvidia.com> wrote:
> > >
> > > Hi
> >
> > > > >>
> > > > >> 3. Everybody on the call agreed that the P4-programmable devices from
> > > > Intel,
> > > > >> AMD and others need to be fully supported by DPDK and that there are
> > > > some
> > > > >> gaps in RTE_FLOW to be fixed for supporting these devices.
> > > > >
> > > > > Personally, It makes sense to me to have normative DPDK API to send p4
> > > > > runtime message to the
> > > > > ethdev so that we have "vendor neutral + DPDK based" p4 runtime
> > backend.
> > > > >
> > > > > I prefer to have specialized ethdev ops for this due to the following
> > reasons.
> > > > >
> > > > > # If the ethdev has both real TCAM based HW(for existing rte_flow
> > > > > patterns and actions) and SW agent to receive P4 runtime message etc.
> > > > > Typically, it needs to take a different path in driver to talk. Assume, if you
> > > > > have cascaded patterns/actions, One is targeted for TCAM and other for
> > > > > SW agent for p4, one
> > > > > need to have serious amount checking for dispatching.It complicates
> > > > > the driver and forbid to have
> > > > > driver optimization especially cases for templates etc. if user making
> > > > > rules for both category of HW.
> > > > >
> > > >
> > > > Indeed I am not against dedicated APIs for P4 runtime backend.
> > > >
> > > > But assuming there is a dedicated rte_flow item for P4, how it is
> > > > different than dedicated API in above scenario?
> > > > If driver detects P4 runtime specific rule, it can bail it out to SW agent.
> > > > Can you please elaborate the complexity it introduces?
> >
> > Assume, normal existing rte-flow programming include a bunch of
> > register writes and
> > p4 runtime backend is more of SW ring. If a template has both patterns
> > and actions
> > as cascaded, it will be difficult for driver to optimize the template.
> >
> >
> > > >
> > > > > # All we need "char buffer//string" based communication ethdev <-> app.
> > > > >
> > > >
> > > > Yes, and both a dedicated API or dedicated rte_flow item can provide
> > > > medium for this communication.
> > > >
> > > > rte_flow one has flexibility & extensibility advantages, but maybe not
> > > > as straightforward as an API.
> > >
> > > I think not using the rte_flow will also require duplication of all the rule
> > handling functions and table creations, for example aync rule create/destroy
> > query ......
> >
> > Yes. That is a fair point. I am OK with exposing as rte_flow.
> > As a driver implementation note, to get rid of the above problem,
> > driver can choose to have pseudo ethdev devices for p4 if needed(if
> > driver finds difficult to combine TCAM based on HW rules and p4
> > runtime rule).
> >
> 
> What about the following concept:
> The p4 compiler can generate the translation to known PMD objects in rte_flow,
> then when a command is sent from the p4 agent to the offload using GRPC or any
> other way, the DPDK will convert from
> p4 protocol to rte_flow commands (this should be very fast since the commands
> are known and the mapping is already
> defined).
> 
> To support the above idea we need to add two new functions to rte_flow (each
> function will be implemented in PMD level)
> Rte_flow_register_p4(void *p4_info, void *p4_blob)
> {
> 	Creates the static structures/objects
> 	Internal register the p4 commands to PMD translation table.
> }
> 
> Rte_flow_p4_runtime(p4 command based on the p4 spec)
> {
> 	Based on the registered mapping, translate the command to rte_flow
> commands.
> 	Rte_flow_xxx() calls
> }
> 
> As far as I see, the above code will also allow PMD to implement internal logic if
> needed, while from DPDK API,
> we will only add two new functions.

Hi Ori,

It would be great if we (all the interested folks) could meet at the DPDK summit
in the next few days and brainstorm on this topic!

I don't think the compiler idea fixes the problem at all, unfortunately:

1. The P4 compiler is for the data path, not for the control path. You seem to 
suggest a second compiler focused on the control path, which would be a
completely new tool of high complexity or even tasked to do the impossible
(explained below), which is a path to failure. Therefore,  it is definitely not OK
to make his translator/compiler a mandatory requirement for RTE_FLOW
support of existing HW!

2. This compiler is expected to magically translate an infinite list of possible
user-defined action to one of the existing finite list of RTE_FLOW actions,
and this is obviously not possible. See the example P4 program from the
slides I presented, the user defined P4 actions are a sort of user-defined
subroutines (mixing assignments, arithmetic operations and branches on
metadata and header fields, target dependent intrinsics, etc) as opposed
to simple RTE_FLOW primitives.

3. The translation of user-defined actions to one of the existing RTE_FLOW
actions is sometimes possible, but this takes place only for the most trivial
cases. And even in such cases it typically needs a hint from the user to testify
the mapping as opposed to the mapping being unequivocally identified by a
compiler. Even the simplest action of packet drop can be done in so many
different ways:
	a) straight call to a target dependent drop intrinsic (unknown to the
	   "compiler" unless user provides a hint);
	b) set a meta-data flag that is tested at a later stage for packet drop;
	c) send the packet to a predefined output port that is either dropping
	   or logging/counting the packet;
	d) etc.

4. The P4 actions are simply user-defined actions that are built into the HW
through the program, so the control path to configure them it needs to know
just their ID and their arguments that have to be provided. These actions are
not portable between programs, as their life span is the time the P4 program
is loaded on the target; once a new program is loaded, the old actions cease
to exist, therefore it makes less sense to standardize some "ephemeral"
actions, and it makes more sense to provide a registration mechanism in
RTE_FLOW for these user-defined actions to be registered when the device
is initialized.

What do people think?

Regards,
Cristian

diff mbox series

Patch

diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
index 3fe57140f9..f7889d7dd0 100644
--- a/lib/ethdev/rte_flow.h
+++ b/lib/ethdev/rte_flow.h
@@ -688,6 +688,15 @@  enum rte_flow_item_type {
 	 * @see struct rte_flow_item_ib_bth.
 	 */
 	RTE_FLOW_ITEM_TYPE_IB_BTH,
+
+	/**
+	 * Matches a custom protocol header or metadata field represented
+	 * as a byte string of a given length, whose meaning is typically
+	 * defined by the data plane program.
+	 *
+	 * @see struct rte_flow_item_generic.
+	 */
+	RTE_FLOW_ITEM_TYPE_GENERIC,
 };
 
 /**
@@ -2311,6 +2320,32 @@  static const struct rte_flow_item_tx_queue rte_flow_item_tx_queue_mask = {
 };
 #endif
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ITEM_TYPE_GENERIC
+ *
+ * Matches a custom protocol header or metadata field represented as a byte
+ * array of a given length, whose meaning is typically defined by the data
+ * plane program.
+ *
+ * The field length must be non-zero. The field value must be non-NULL, with the
+ * value bytes specified in network byte order.
+ */
+struct rte_flow_item_generic {
+	uint32_t length; /**< Field length. */
+	const uint8_t *value; /**< Field value. */
+};
+
+/** Default mask for RTE_FLOW_ITEM_TYPE_GENERIC. */
+#ifndef __cplusplus
+static const struct rte_flow_item_generic rte_flow_item_generic_mask = {
+	.length = 0xffffffff,
+	.value = NULL,
+};
+#endif
+
 /**
  * Action types.
  *
@@ -2989,6 +3024,14 @@  enum rte_flow_action_type {
 	 * @see struct rte_flow_action_indirect_list
 	 */
 	RTE_FLOW_ACTION_TYPE_INDIRECT_LIST,
+
+	/**
+	 * Custom action whose processing is typically defined by the data plane
+	 * program.
+	 *
+	 * @see struct rte_flow_action_generic.
+	 */
+	RTE_FLOW_ACTION_TYPE_GENERIC,
 };
 
 /**
@@ -4064,6 +4107,45 @@  struct rte_flow_indirect_update_flow_meter_mark {
 	enum rte_color init_color;
 };
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice.
+ *
+ * Generic action argument configuration parameters.
+ *
+ * The action argument field length must be non-zero. The action argument field
+ * value must be non-NULL, with the value bytes specified in network byte order.
+ *
+ * @see struct rte_flow_action_generic
+ */
+struct rte_flow_action_generic_argument {
+	/** Argument field length. */
+	uint32_t length;
+	/** Argument field value. */
+	const uint8_t *value;
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice.
+ *
+ * RTE_FLOW_ACTION_TYPE_GENERIC
+ *
+ * Generic action configuration parameters.
+ *
+ * Each action can have zero or more arguments.
+ *
+ * @see RTE_FLOW_ACTION_TYPE_GENERIC
+ */
+struct rte_flow_action_generic {
+	/** Action ID. */
+	uint32_t action_id;
+	/** Number of action arguments. */
+	uint32_t action_args_num;
+	/** Action arguments array. */
+	const struct rte_flow_action_generic_argument *action_args;
+};
+
 /* Mbuf dynamic field offset for metadata. */
 extern int32_t rte_flow_dynf_metadata_offs;