[v2] ethdev: extend flow metadata

Message ID 1570723359-24650-1-git-send-email-viacheslavo@mellanox.com (mailing list archive)
State Superseded, archived
Delegated to: Ferruh Yigit
Headers
Series [v2] ethdev: extend flow metadata |

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/Intel-compilation fail apply issues

Commit Message

Slava Ovsiienko Oct. 10, 2019, 4:02 p.m. UTC
  Currently, metadata can be set on egress path via mbuf tx_metadata field
with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_META matches metadata.

This patch extends the metadata feature usability.

1) RTE_FLOW_ACTION_TYPE_SET_META

When supporting multiple tables, Tx metadata can also be set by a rule and
matched by another rule. This new action allows metadata to be set as a
result of flow match.

2) Metadata on ingress

There's also need to support metadata on ingress. Metadata can be set by
SET_META action and matched by META item like Tx. The final value set by
the action will be delivered to application via metadata dynamic field of
mbuf which can be accessed by RTE_FLOW_DYNF_METADATA().
PKT_RX_DYNF_METADATA flag will be set along with the data.

The mbuf dynamic field must be registered by calling
rte_flow_dynf_metadata_register() prior to use SET_META action.

The availability of dynamic mbuf metadata field can be checked
with rte_flow_dynf_metadata_avail() routine.

For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
propagated to the other path depending on hardware capability.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
---

  This patch uses dynamic mbuf field and must be applied after:
  http://patches.dpdk.org/patch/59343/
  "mbuf: support dynamic fields and flags"

  v2: - rebased
      - relies on dynamic mbuf field feature

  v1: http://patches.dpdk.org/patch/56103/

  rfc: http://patches.dpdk.org/patch/54270/

 app/test-pmd/cmdline_flow.c              | 57 ++++++++++++++++++++-
 app/test-pmd/util.c                      |  5 ++
 doc/guides/prog_guide/rte_flow.rst       | 57 +++++++++++++++++++++
 doc/guides/rel_notes/release_19_11.rst   |  8 +++
 lib/librte_ethdev/rte_ethdev.h           |  1 -
 lib/librte_ethdev/rte_ethdev_version.map |  6 +++
 lib/librte_ethdev/rte_flow.c             | 41 +++++++++++++++
 lib/librte_ethdev/rte_flow.h             | 87 ++++++++++++++++++++++++++++++--
 lib/librte_mbuf/rte_mbuf_dyn.h           |  8 +++
 9 files changed, 265 insertions(+), 5 deletions(-)
  

Comments

Olivier Matz Oct. 18, 2019, 9:22 a.m. UTC | #1
Hi Viacheslav,

Few comments on the dynamic mbuf part below.

On Thu, Oct 10, 2019 at 04:02:39PM +0000, Viacheslav Ovsiienko wrote:
> Currently, metadata can be set on egress path via mbuf tx_metadata field
> with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_META matches metadata.
> 
> This patch extends the metadata feature usability.
> 
> 1) RTE_FLOW_ACTION_TYPE_SET_META
> 
> When supporting multiple tables, Tx metadata can also be set by a rule and
> matched by another rule. This new action allows metadata to be set as a
> result of flow match.
> 
> 2) Metadata on ingress
> 
> There's also need to support metadata on ingress. Metadata can be set by
> SET_META action and matched by META item like Tx. The final value set by
> the action will be delivered to application via metadata dynamic field of
> mbuf which can be accessed by RTE_FLOW_DYNF_METADATA().
> PKT_RX_DYNF_METADATA flag will be set along with the data.
> 
> The mbuf dynamic field must be registered by calling
> rte_flow_dynf_metadata_register() prior to use SET_META action.
> 
> The availability of dynamic mbuf metadata field can be checked
> with rte_flow_dynf_metadata_avail() routine.
> 
> For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
> propagated to the other path depending on hardware capability.
> 
> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> ---
> 
>   This patch uses dynamic mbuf field and must be applied after:
>   http://patches.dpdk.org/patch/59343/
>   "mbuf: support dynamic fields and flags"
> 
>   v2: - rebased
>       - relies on dynamic mbuf field feature
> 
>   v1: http://patches.dpdk.org/patch/56103/
> 
>   rfc: http://patches.dpdk.org/patch/54270/
> 
>  app/test-pmd/cmdline_flow.c              | 57 ++++++++++++++++++++-
>  app/test-pmd/util.c                      |  5 ++
>  doc/guides/prog_guide/rte_flow.rst       | 57 +++++++++++++++++++++
>  doc/guides/rel_notes/release_19_11.rst   |  8 +++
>  lib/librte_ethdev/rte_ethdev.h           |  1 -
>  lib/librte_ethdev/rte_ethdev_version.map |  6 +++
>  lib/librte_ethdev/rte_flow.c             | 41 +++++++++++++++
>  lib/librte_ethdev/rte_flow.h             | 87 ++++++++++++++++++++++++++++++--
>  lib/librte_mbuf/rte_mbuf_dyn.h           |  8 +++
>  9 files changed, 265 insertions(+), 5 deletions(-)
> 
> diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
> index b26b8bf..078f256 100644
> --- a/app/test-pmd/cmdline_flow.c
> +++ b/app/test-pmd/cmdline_flow.c
> @@ -305,6 +305,9 @@ enum index {
>  	ACTION_DEC_TCP_ACK_VALUE,
>  	ACTION_RAW_ENCAP,
>  	ACTION_RAW_DECAP,
> +	ACTION_SET_META,
> +	ACTION_SET_META_DATA,
> +	ACTION_SET_META_MASK,
>  };
>  
>  /** Maximum size for pattern in struct rte_flow_item_raw. */
> @@ -994,6 +997,7 @@ struct parse_action_priv {
>  	ACTION_DEC_TCP_ACK,
>  	ACTION_RAW_ENCAP,
>  	ACTION_RAW_DECAP,
> +	ACTION_SET_META,
>  	ZERO,
>  };
>  
> @@ -1180,6 +1184,13 @@ struct parse_action_priv {
>  	ZERO,
>  };
>  
> +static const enum index action_set_meta[] = {
> +	ACTION_SET_META_DATA,
> +	ACTION_SET_META_MASK,
> +	ACTION_NEXT,
> +	ZERO,
> +};
> +
>  static int parse_set_raw_encap_decap(struct context *, const struct token *,
>  				     const char *, unsigned int,
>  				     void *, unsigned int);
> @@ -1238,6 +1249,10 @@ static int parse_vc_action_raw_encap(struct context *,
>  static int parse_vc_action_raw_decap(struct context *,
>  				     const struct token *, const char *,
>  				     unsigned int, void *, unsigned int);
> +static int parse_vc_action_set_meta(struct context *ctx,
> +				    const struct token *token, const char *str,
> +				    unsigned int len, void *buf,
> +				    unsigned int size);
>  static int parse_destroy(struct context *, const struct token *,
>  			 const char *, unsigned int,
>  			 void *, unsigned int);
> @@ -3222,7 +3237,31 @@ static int comp_vc_action_rss_queue(struct context *, const struct token *,
>  		.help = "set raw decap data",
>  		.next = NEXT(next_item),
>  		.call = parse_set_raw_encap_decap,
> -	}
> +	},
> +	[ACTION_SET_META] = {
> +		.name = "set_meta",
> +		.help = "set metadata",
> +		.priv = PRIV_ACTION(SET_META,
> +			sizeof(struct rte_flow_action_set_meta)),
> +		.next = NEXT(action_set_meta),
> +		.call = parse_vc_action_set_meta,
> +	},
> +	[ACTION_SET_META_DATA] = {
> +		.name = "data",
> +		.help = "metadata value",
> +		.next = NEXT(action_set_meta, NEXT_ENTRY(UNSIGNED)),
> +		.args = ARGS(ARGS_ENTRY_HTON
> +			     (struct rte_flow_action_set_meta, data)),
> +		.call = parse_vc_conf,
> +	},
> +	[ACTION_SET_META_MASK] = {
> +		.name = "mask",
> +		.help = "mask for metadata value",
> +		.next = NEXT(action_set_meta, NEXT_ENTRY(UNSIGNED)),
> +		.args = ARGS(ARGS_ENTRY_HTON
> +			     (struct rte_flow_action_set_meta, mask)),
> +		.call = parse_vc_conf,
> +	},
>  };
>  
>  /** Remove and return last entry from argument stack. */
> @@ -4592,6 +4631,22 @@ static int comp_vc_action_rss_queue(struct context *, const struct token *,
>  	return ret;
>  }
>  
> +static int
> +parse_vc_action_set_meta(struct context *ctx, const struct token *token,
> +			 const char *str, unsigned int len, void *buf,
> +			 unsigned int size)
> +{
> +	int ret;
> +
> +	ret = parse_vc(ctx, token, str, len, buf, size);
> +	if (ret < 0)
> +		return ret;
> +	ret = rte_flow_dynf_metadata_register();
> +	if (ret < 0)
> +		return -1;
> +	return len;
> +}
> +
>  /** Parse tokens for destroy command. */
>  static int
>  parse_destroy(struct context *ctx, const struct token *token,
> diff --git a/app/test-pmd/util.c b/app/test-pmd/util.c
> index 1570270..39ff07b 100644
> --- a/app/test-pmd/util.c
> +++ b/app/test-pmd/util.c
> @@ -81,6 +81,11 @@
>  			       mb->vlan_tci, mb->vlan_tci_outer);
>  		else if (ol_flags & PKT_RX_VLAN)
>  			printf(" - VLAN tci=0x%x", mb->vlan_tci);
> +		if (ol_flags & PKT_TX_METADATA)
> +			printf(" - Tx metadata: 0x%x", mb->tx_metadata);
> +		if (ol_flags & PKT_RX_DYNF_METADATA)
> +			printf(" - Rx metadata: 0x%x",
> +			       *RTE_FLOW_DYNF_METADATA(mb));
>  		if (mb->packet_type) {
>  			rte_get_ptype_name(mb->packet_type, buf, sizeof(buf));
>  			printf(" - hw ptype: %s", buf);
> diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
> index ff6fb11..45fc041 100644
> --- a/doc/guides/prog_guide/rte_flow.rst
> +++ b/doc/guides/prog_guide/rte_flow.rst
> @@ -658,6 +658,32 @@ the physical device, with virtual groups in the PMD or not at all.
>     | ``mask`` | ``id``   | zeroed to match any value |
>     +----------+----------+---------------------------+
>  
> +Item: ``META``
> +^^^^^^^^^^^^^^^^^
> +
> +Matches 32 bit metadata item set.
> +
> +On egress, metadata can be set either by mbuf metadata field with
> +PKT_TX_METADATA flag or ``SET_META`` action. On ingress, ``SET_META``
> +action sets metadata for a packet and the metadata will be reported via
> +``metadata`` dynamic field of ``rte_mbuf`` with PKT_RX_DYNF_METADATA flag.
> +
> +- Default ``mask`` matches the specified Rx metadata value.
> +
> +.. _table_rte_flow_item_meta:
> +
> +.. table:: META
> +
> +   +----------+----------+---------------------------------------+
> +   | Field    | Subfield | Value                                 |
> +   +==========+==========+=======================================+
> +   | ``spec`` | ``data`` | 32 bit metadata value                 |
> +   +----------+----------+---------------------------------------+
> +   | ``last`` | ``data`` | upper range value                     |
> +   +----------+----------+---------------------------------------+
> +   | ``mask`` | ``data`` | bit-mask applies to "spec" and "last" |
> +   +----------+----------+---------------------------------------+
> +
>  Data matching item types
>  ~~~~~~~~~~~~~~~~~~~~~~~~
>  
> @@ -2415,6 +2441,37 @@ Value to decrease TCP acknowledgment number by is a big-endian 32 bit integer.
>  
>  Using this action on non-matching traffic will result in undefined behavior.
>  
> +Action: ``SET_META``
> +^^^^^^^^^^^^^^^^^^^^^^^
> +
> +Set metadata. Item ``META`` matches metadata.
> +
> +Metadata set by mbuf metadata field with PKT_TX_METADATA flag on egress will be
> +overridden by this action. On ingress, the metadata will be carried by
> +``metadata`` dynamic field of ``rte_mbuf`` which can be accessed by
> +``RTE_FLOW_DYNF_METADATA()``. PKT_RX_DYNF_METADATA flag will be set along
> +with the data.
> +
> +The mbuf dynamic field must be registered by calling
> +``rte_flow_dynf_metadata_register()`` prior to use ``SET_META`` action.
> +
> +Altering partial bits is supported with ``mask``. For bits which have never been
> +set, unpredictable value will be seen depending on driver implementation. For
> +loopback/hairpin packet, metadata set on Rx/Tx may or may not be propagated to
> +the other path depending on HW capability.
> +
> +.. _table_rte_flow_action_set_meta:
> +
> +.. table:: SET_META
> +
> +   +----------+----------------------------+
> +   | Field    | Value                      |
> +   +==========+============================+
> +   | ``data`` | 32 bit metadata value      |
> +   +----------+----------------------------+
> +   | ``mask`` | bit-mask applies to "data" |
> +   +----------+----------------------------+
> +
>  Negative types
>  ~~~~~~~~~~~~~~
>  
> diff --git a/doc/guides/rel_notes/release_19_11.rst b/doc/guides/rel_notes/release_19_11.rst
> index 8921cfd..904746e 100644
> --- a/doc/guides/rel_notes/release_19_11.rst
> +++ b/doc/guides/rel_notes/release_19_11.rst
> @@ -95,6 +95,14 @@ New Features
>    for specific offload features, where adding a static field or flag
>    in the mbuf is not justified.
>  
> +* **Extended metadata support in rte_flow.**
> +
> +  Flow metadata is extended to both Rx and Tx.
> +
> +  * Tx metadata can also be set by SET_META action of rte_flow.
> +  * Rx metadata is delivered to host via a dynamic field of ``rte_mbuf`` with
> +    PKT_RX_DYNF_METADATA.
> +
>  Removed Items
>  -------------
>  
> diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
> index d937fb4..9a6432c 100644
> --- a/lib/librte_ethdev/rte_ethdev.h
> +++ b/lib/librte_ethdev/rte_ethdev.h
> @@ -1013,7 +1013,6 @@ struct rte_eth_conf {
>  #define DEV_RX_OFFLOAD_KEEP_CRC		0x00010000
>  #define DEV_RX_OFFLOAD_SCTP_CKSUM	0x00020000
>  #define DEV_RX_OFFLOAD_OUTER_UDP_CKSUM  0x00040000
> -
>  #define DEV_RX_OFFLOAD_CHECKSUM (DEV_RX_OFFLOAD_IPV4_CKSUM | \
>  				 DEV_RX_OFFLOAD_UDP_CKSUM | \
>  				 DEV_RX_OFFLOAD_TCP_CKSUM)
> diff --git a/lib/librte_ethdev/rte_ethdev_version.map b/lib/librte_ethdev/rte_ethdev_version.map
> index 6df42a4..3d9cafc 100644
> --- a/lib/librte_ethdev/rte_ethdev_version.map
> +++ b/lib/librte_ethdev/rte_ethdev_version.map
> @@ -283,4 +283,10 @@ EXPERIMENTAL {
>  
>  	# added in 19.08
>  	rte_eth_read_clock;
> +
> +	# added in 19.11
> +	rte_flow_dynf_metadata_offs;
> +	rte_flow_dynf_metadata_flag;
> +	rte_flow_dynf_metadata_avail;
> +	rte_flow_dynf_metadata_register;
>  };
> diff --git a/lib/librte_ethdev/rte_flow.c b/lib/librte_ethdev/rte_flow.c
> index cc03b15..9cbda75 100644
> --- a/lib/librte_ethdev/rte_flow.c
> +++ b/lib/librte_ethdev/rte_flow.c
> @@ -12,10 +12,18 @@
>  #include <rte_errno.h>
>  #include <rte_branch_prediction.h>
>  #include <rte_string_fns.h>
> +#include <rte_mbuf.h>
> +#include <rte_mbuf_dyn.h>
>  #include "rte_ethdev.h"
>  #include "rte_flow_driver.h"
>  #include "rte_flow.h"
>  
> +/* Mbuf dynamic field name for metadata. */
> +int rte_flow_dynf_metadata_offs = -1;
> +
> +/* Mbuf dynamic field flag bit number for metadata. */
> +uint64_t rte_flow_dynf_metadata_mask;
> +
>  /**
>   * Flow elements description tables.
>   */
> @@ -153,8 +161,41 @@ struct rte_flow_desc_data {
>  	MK_FLOW_ACTION(DEC_TCP_SEQ, sizeof(rte_be32_t)),
>  	MK_FLOW_ACTION(INC_TCP_ACK, sizeof(rte_be32_t)),
>  	MK_FLOW_ACTION(DEC_TCP_ACK, sizeof(rte_be32_t)),
> +	MK_FLOW_ACTION(SET_META, sizeof(struct rte_flow_action_set_meta)),
>  };
>  
> +int
> +rte_flow_dynf_metadata_register(void)
> +{
> +	int offset;
> +	int flag;
> +
> +	static const struct rte_mbuf_dynfield desc_offs = {
> +		.name = MBUF_DYNF_METADATA_NAME,
> +		.size = MBUF_DYNF_METADATA_SIZE,
> +		.align = MBUF_DYNF_METADATA_ALIGN,
> +		.flags = MBUF_DYNF_METADATA_FLAGS,
> +	};
> +	static const struct rte_mbuf_dynflag desc_flag = {
> +		.name = MBUF_DYNF_METADATA_NAME,
> +	};

I don't see think we need #defines.
You can directly use the name, sizeof() and __alignof__() here.
If the information is used externally, the structure shall
be made global non-static.


> +
> +	offset = rte_mbuf_dynfield_register(&desc_offs);
> +	if (offset < 0)
> +		goto error;
> +	flag = rte_mbuf_dynflag_register(&desc_flag);
> +	if (flag < 0)
> +		goto error;
> +	rte_flow_dynf_metadata_offs = offset;
> +	rte_flow_dynf_metadata_mask = (1ULL << flag);
> +	return 0;
> +
> +error:
> +	rte_flow_dynf_metadata_offs = -1;
> +	rte_flow_dynf_metadata_mask = 0ULL;
> +	return -rte_errno;
> +}
> +
>  static int
>  flow_err(uint16_t port_id, int ret, struct rte_flow_error *error)
>  {
> diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
> index 391a44a..a27e619 100644
> --- a/lib/librte_ethdev/rte_flow.h
> +++ b/lib/librte_ethdev/rte_flow.h
> @@ -27,6 +27,8 @@
>  #include <rte_udp.h>
>  #include <rte_byteorder.h>
>  #include <rte_esp.h>
> +#include <rte_mbuf.h>
> +#include <rte_mbuf_dyn.h>
>  
>  #ifdef __cplusplus
>  extern "C" {
> @@ -417,7 +419,8 @@ enum rte_flow_item_type {
>  	/**
>  	 * [META]
>  	 *
> -	 * Matches a metadata value specified in mbuf metadata field.
> +	 * Matches a metadata value.
> +	 *
>  	 * See struct rte_flow_item_meta.
>  	 */
>  	RTE_FLOW_ITEM_TYPE_META,
> @@ -1213,9 +1216,17 @@ struct rte_flow_item_icmp6_nd_opt_tla_eth {
>  #endif
>  
>  /**
> - * RTE_FLOW_ITEM_TYPE_META.
> + * @warning
> + * @b EXPERIMENTAL: this structure may change without prior notice
>   *
> - * Matches a specified metadata value.
> + * RTE_FLOW_ITEM_TYPE_META
> + *
> + * Matches a specified metadata value. On egress, metadata can be set either by
> + * mbuf tx_metadata field with PKT_TX_METADATA flag or
> + * RTE_FLOW_ACTION_TYPE_SET_META. On ingress, RTE_FLOW_ACTION_TYPE_SET_META sets
> + * metadata for a packet and the metadata will be reported via mbuf metadata
> + * dynamic field with PKT_RX_DYNF_METADATA flag. The dynamic mbuf field must be
> + * registered in advance by rte_flow_dynf_metadata_register().
>   */
>  struct rte_flow_item_meta {
>  	rte_be32_t data;
> @@ -1813,6 +1824,13 @@ enum rte_flow_action_type {
>  	 * undefined behavior.
>  	 */
>  	RTE_FLOW_ACTION_TYPE_DEC_TCP_ACK,
> +
> +	/**
> +	 * Set metadata on ingress or egress path.
> +	 *
> +	 * See struct rte_flow_action_set_meta.
> +	 */
> +	RTE_FLOW_ACTION_TYPE_SET_META,
>  };
>  
>  /**
> @@ -2300,6 +2318,43 @@ struct rte_flow_action_set_mac {
>  	uint8_t mac_addr[RTE_ETHER_ADDR_LEN];
>  };
>  
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this structure may change without prior notice
> + *
> + * RTE_FLOW_ACTION_TYPE_SET_META
> + *
> + * Set metadata. Metadata set by mbuf tx_metadata field with
> + * PKT_TX_METADATA flag on egress will be overridden by this action. On
> + * ingress, the metadata will be carried by mbuf metadata dynamic field
> + * with PKT_RX_DYNF_METADATA flag if set.  The dynamic mbuf field must be
> + * registered in advance by rte_flow_dynf_metadata_register().
> + *
> + * Altering partial bits is supported with mask. For bits which have never
> + * been set, unpredictable value will be seen depending on driver
> + * implementation. For loopback/hairpin packet, metadata set on Rx/Tx may
> + * or may not be propagated to the other path depending on HW capability.
> + *
> + * RTE_FLOW_ITEM_TYPE_META matches metadata.
> + */
> +struct rte_flow_action_set_meta {
> +	rte_be32_t data;
> +	rte_be32_t mask;
> +};
> +
> +/* Mbuf dynamic field offset for metadata. */
> +extern int rte_flow_dynf_metadata_offs;
> +
> +/* Mbuf dynamic field flag mask for metadata. */
> +extern uint64_t rte_flow_dynf_metadata_mask;
> +
> +/* Mbuf dynamic field pointer for metadata. */
> +#define RTE_FLOW_DYNF_METADATA(m) \
> +	RTE_MBUF_DYNFIELD((m), rte_flow_dynf_metadata_offs, uint32_t *)
> +
> +/* Mbuf dynamic flag for metadata. */
> +#define PKT_RX_DYNF_METADATA (rte_flow_dynf_metadata_mask)
> +

I wonder if helpers like this wouldn't be better, because
they combine the flag and the field:

/**
 * Set metadata dynamic field and flag in mbuf.
 *
 * rte_flow_dynf_metadata_register() must have been called first.
 */
__rte_experimental
static inline void rte_mbuf_dyn_metadata_set(struct rte_mbuf *m,
                                       uint32_t metadata)
{
       *RTE_MBUF_DYNFIELD(m, rte_flow_dynf_metadata_offs,
                       uint32_t *) = metadata;
       m->ol_flags |= rte_flow_dynf_metadata_mask;
}

/**
 * Get metadata dynamic field value in mbuf.
 *
 * rte_flow_dynf_metadata_register() must have been called first.
 */
__rte_experimental
static inline int rte_mbuf_dyn_metadata_get(const struct rte_mbuf *m,
                                       uint32_t *metadata)
{
       if ((m->ol_flags & rte_flow_dynf_metadata_mask) == 0)
               return -1;
       *metadata = *RTE_MBUF_DYNFIELD(m, rte_flow_dynf_metadata_offs,
                               uint32_t *);
       return 0;
}

/**
 * Delete the metadata dynamic flag in mbuf.
 *
 * rte_flow_dynf_metadata_register() must have been called first.
 */
__rte_experimental
static inline void rte_mbuf_dyn_metadata_del(struct rte_mbuf *m)
{
       m->ol_flags &= ~rte_flow_dynf_metadata_mask;
}


>  /*
>   * Definition of a single action.
>   *
> @@ -2533,6 +2588,32 @@ enum rte_flow_conv_op {
>  };
>  
>  /**
> + * Check if mbuf dynamic field for metadata is registered.
> + *
> + * @return
> + *   True if registered, false otherwise.
> + */
> +__rte_experimental
> +static inline int
> +rte_flow_dynf_metadata_avail(void) {
> +	return !!rte_flow_dynf_metadata_mask;
> +}

_registered() instead of _avail() ?

> +
> +/**
> + * Register mbuf dynamic field and flag for metadata.
> + *
> + * This function must be called prior to use SET_META action in order to
> + * register the dynamic mbuf field. Otherwise, the data cannot be delivered to
> + * application.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +__rte_experimental
> +int
> +rte_flow_dynf_metadata_register(void);
> +
> +/**
>   * Check whether a flow rule can be created on a given port.
>   *
>   * The flow rule is validated for correctness and whether it could be accepted
> diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h b/lib/librte_mbuf/rte_mbuf_dyn.h
> index 6e2c816..4ff33ac 100644
> --- a/lib/librte_mbuf/rte_mbuf_dyn.h
> +++ b/lib/librte_mbuf/rte_mbuf_dyn.h
> @@ -160,4 +160,12 @@ int rte_mbuf_dynflag_lookup(const char *name,
>   */
>  #define RTE_MBUF_DYNFIELD(m, offset, type) ((type)((uintptr_t)(m) + (offset)))
>  
> +/**
> + * Flow metadata dynamic field definitions.
> + */
> +#define MBUF_DYNF_METADATA_NAME "flow-metadata"
> +#define MBUF_DYNF_METADATA_SIZE sizeof(uint32_t)
> +#define MBUF_DYNF_METADATA_ALIGN __alignof__(uint32_t)
> +#define MBUF_DYNF_METADATA_FLAGS 0

If this flag is only to be used in rte_flow, it can stay in rte_flow.
The name should follow the function name conventions, I suggest
"rte_flow_metadata".

If the flag is going to be used in several places in dpdk (rte_flow,
pmd, app, ...), I wonder if it shouldn't be defined it in
rte_mbuf_dyn.c. I mean:

====
/* rte_mbuf_dyn.c */
const struct rte_mbuf_dynfield rte_mbuf_dynfield_flow_metadata = {
   ...
};
int rte_mbuf_dynfield_flow_metadata_offset = -1;
const struct rte_mbuf_dynflag rte_mbuf_dynflag_flow_metadata = {
   ...
};
int rte_mbuf_dynflag_flow_metadata_bitnum = -1;

int rte_mbuf_dyn_flow_metadata_register(void)
{
...
}

/* rte_mbuf_dyn.h */
extern const struct rte_mbuf_dynfield rte_mbuf_dynfield_flow_metadata;
extern int rte_mbuf_dynfield_flow_metadata_offset;
extern const struct rte_mbuf_dynflag rte_mbuf_dynflag_flow_metadata;
extern int rte_mbuf_dynflag_flow_metadata_bitnum;

...helpers to set/get metadata...
===

Centralizing the definitions of non-private dynamic fields/flags in
rte_mbuf_dyn may help other people to reuse a field that is well
described if it match their use-case.

In your case, what is carried by metadata? Could it be reused by
others? I think some more description is needed.


Regards,
Olivier
  
Slava Ovsiienko Oct. 19, 2019, 7:47 p.m. UTC | #2
Hi, Olivier

Thank you for your comment (and for the dynamic mbuf patch, btw). Please, see below.

> -----Original Message-----
> From: Olivier Matz <olivier.matz@6wind.com>
> Sent: Friday, October 18, 2019 12:22
> To: Slava Ovsiienko <viacheslavo@mellanox.com>
> Cc: dev@dpdk.org; Matan Azrad <matan@mellanox.com>; Raslan
> Darawsheh <rasland@mellanox.com>; Thomas Monjalon
> <thomas@monjalon.net>; Yongseok Koh <yskoh@mellanox.com>
> Subject: Re: [PATCH v2] ethdev: extend flow metadata
> 
> Hi Viacheslav,
> 
> Few comments on the dynamic mbuf part below.
> 
[snip]

> > @@ -12,10 +12,18 @@
> >  #include <rte_errno.h>
> >  #include <rte_branch_prediction.h>
> >  #include <rte_string_fns.h>
> > +#include <rte_mbuf.h>
> > +#include <rte_mbuf_dyn.h>
> >  #include "rte_ethdev.h"
> >  #include "rte_flow_driver.h"
> >  #include "rte_flow.h"
> >
> > +/* Mbuf dynamic field name for metadata. */ int
> > +rte_flow_dynf_metadata_offs = -1;
> > +
> > +/* Mbuf dynamic field flag bit number for metadata. */ uint64_t
> > +rte_flow_dynf_metadata_mask;
> > +
> >  /**
> >   * Flow elements description tables.
> >   */
> > @@ -153,8 +161,41 @@ struct rte_flow_desc_data {
> >  	MK_FLOW_ACTION(DEC_TCP_SEQ, sizeof(rte_be32_t)),
> >  	MK_FLOW_ACTION(INC_TCP_ACK, sizeof(rte_be32_t)),
> >  	MK_FLOW_ACTION(DEC_TCP_ACK, sizeof(rte_be32_t)),
> > +	MK_FLOW_ACTION(SET_META, sizeof(struct
> rte_flow_action_set_meta)),
> >  };
> >
> > +int
> > +rte_flow_dynf_metadata_register(void)
> > +{
> > +	int offset;
> > +	int flag;
> > +
> > +	static const struct rte_mbuf_dynfield desc_offs = {
> > +		.name = MBUF_DYNF_METADATA_NAME,
> > +		.size = MBUF_DYNF_METADATA_SIZE,
> > +		.align = MBUF_DYNF_METADATA_ALIGN,
> > +		.flags = MBUF_DYNF_METADATA_FLAGS,
> > +	};
> > +	static const struct rte_mbuf_dynflag desc_flag = {
> > +		.name = MBUF_DYNF_METADATA_NAME,
> > +	};
> 
> I don't see think we need #defines.
> You can directly use the name, sizeof() and __alignof__() here.
> If the information is used externally, the structure shall be made global non-
> static.

The intention was to gather all dynamic fields definitions in one place 
(in rte_mbuf_dyn.h). It would be easy to see all fields in one sight (some
might be shared, some might be mutual exclusive, estimate mbuf space,
required by various features, etc.). So, we can't just fill structure fields
with simple sizeof() and alignof() instead of definitions (the field parameters
must be defined once).

I do not see the reasons to make table global. I would prefer the definitions.
- the definitions are compile time processing (table fields are runtime),
it provides code optimization and better performance.

> > +
> > +	offset = rte_mbuf_dynfield_register(&desc_offs);
> > +	if (offset < 0)
> > +		goto error;
> > +	flag = rte_mbuf_dynflag_register(&desc_flag);
> > +	if (flag < 0)
> > +		goto error;
> > +	rte_flow_dynf_metadata_offs = offset;
> > +	rte_flow_dynf_metadata_mask = (1ULL << flag);
> > +	return 0;
> > +
> > +error:
> > +	rte_flow_dynf_metadata_offs = -1;
> > +	rte_flow_dynf_metadata_mask = 0ULL;
> > +	return -rte_errno;
> > +}
> > +
> >  static int
> >  flow_err(uint16_t port_id, int ret, struct rte_flow_error *error)  {
> > diff --git a/lib/librte_ethdev/rte_flow.h
> > b/lib/librte_ethdev/rte_flow.h index 391a44a..a27e619 100644
> > --- a/lib/librte_ethdev/rte_flow.h
> > +++ b/lib/librte_ethdev/rte_flow.h
> > @@ -27,6 +27,8 @@
> >  #include <rte_udp.h>
> >  #include <rte_byteorder.h>
> >  #include <rte_esp.h>
> > +#include <rte_mbuf.h>
> > +#include <rte_mbuf_dyn.h>
> >
> >  #ifdef __cplusplus
> >  extern "C" {
> > @@ -417,7 +419,8 @@ enum rte_flow_item_type {
> >  	/**
> >  	 * [META]
> >  	 *
> > -	 * Matches a metadata value specified in mbuf metadata field.
> > +	 * Matches a metadata value.
> > +	 *
> >  	 * See struct rte_flow_item_meta.
> >  	 */
> >  	RTE_FLOW_ITEM_TYPE_META,
> > @@ -1213,9 +1216,17 @@ struct rte_flow_item_icmp6_nd_opt_tla_eth {
> > #endif
> >
> >  /**
> > - * RTE_FLOW_ITEM_TYPE_META.
> > + * @warning
> > + * @b EXPERIMENTAL: this structure may change without prior notice
> >   *
> > - * Matches a specified metadata value.
> > + * RTE_FLOW_ITEM_TYPE_META
> > + *
> > + * Matches a specified metadata value. On egress, metadata can be set
> > + either by
> > + * mbuf tx_metadata field with PKT_TX_METADATA flag or
> > + * RTE_FLOW_ACTION_TYPE_SET_META. On ingress,
> > + RTE_FLOW_ACTION_TYPE_SET_META sets
> > + * metadata for a packet and the metadata will be reported via mbuf
> > + metadata
> > + * dynamic field with PKT_RX_DYNF_METADATA flag. The dynamic mbuf
> > + field must be
> > + * registered in advance by rte_flow_dynf_metadata_register().
> >   */
> >  struct rte_flow_item_meta {
> >  	rte_be32_t data;
> > @@ -1813,6 +1824,13 @@ enum rte_flow_action_type {
> >  	 * undefined behavior.
> >  	 */
> >  	RTE_FLOW_ACTION_TYPE_DEC_TCP_ACK,
> > +
> > +	/**
> > +	 * Set metadata on ingress or egress path.
> > +	 *
> > +	 * See struct rte_flow_action_set_meta.
> > +	 */
> > +	RTE_FLOW_ACTION_TYPE_SET_META,
> >  };
> >
> >  /**
> > @@ -2300,6 +2318,43 @@ struct rte_flow_action_set_mac {
> >  	uint8_t mac_addr[RTE_ETHER_ADDR_LEN];  };
> >
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this structure may change without prior notice
> > + *
> > + * RTE_FLOW_ACTION_TYPE_SET_META
> > + *
> > + * Set metadata. Metadata set by mbuf tx_metadata field with
> > + * PKT_TX_METADATA flag on egress will be overridden by this action.
> > +On
> > + * ingress, the metadata will be carried by mbuf metadata dynamic
> > +field
> > + * with PKT_RX_DYNF_METADATA flag if set.  The dynamic mbuf field
> > +must be
> > + * registered in advance by rte_flow_dynf_metadata_register().
> > + *
> > + * Altering partial bits is supported with mask. For bits which have
> > +never
> > + * been set, unpredictable value will be seen depending on driver
> > + * implementation. For loopback/hairpin packet, metadata set on Rx/Tx
> > +may
> > + * or may not be propagated to the other path depending on HW
> capability.
> > + *
> > + * RTE_FLOW_ITEM_TYPE_META matches metadata.
> > + */
> > +struct rte_flow_action_set_meta {
> > +	rte_be32_t data;
> > +	rte_be32_t mask;
> > +};
> > +
> > +/* Mbuf dynamic field offset for metadata. */ extern int
> > +rte_flow_dynf_metadata_offs;
> > +
> > +/* Mbuf dynamic field flag mask for metadata. */ extern uint64_t
> > +rte_flow_dynf_metadata_mask;
> > +
> > +/* Mbuf dynamic field pointer for metadata. */ #define
> > +RTE_FLOW_DYNF_METADATA(m) \
> > +	RTE_MBUF_DYNFIELD((m), rte_flow_dynf_metadata_offs, uint32_t
> *)
> > +
> > +/* Mbuf dynamic flag for metadata. */ #define PKT_RX_DYNF_METADATA
> > +(rte_flow_dynf_metadata_mask)
> > +
> 
> I wonder if helpers like this wouldn't be better, because they combine the
> flag and the field:
> 
> /**
>  * Set metadata dynamic field and flag in mbuf.
>  *
>  * rte_flow_dynf_metadata_register() must have been called first.
>  */
> __rte_experimental
> static inline void rte_mbuf_dyn_metadata_set(struct rte_mbuf *m,
>                                        uint32_t metadata) {
>        *RTE_MBUF_DYNFIELD(m, rte_flow_dynf_metadata_offs,
>                        uint32_t *) = metadata;
>        m->ol_flags |= rte_flow_dynf_metadata_mask; }
Setting flag looks redundantly.
What if driver just replaces the metadata and flag is already set?
The other option - the flags (for set of fields) might be set in combinations.
mbuf field is supposed to be engaged in datapath, performance is
very critical, adding one more abstraction layer seems not to be relevant.
Also, metadata is not feature of mbuf. It should have rte_flow prefix.

> /**
>  * Get metadata dynamic field value in mbuf.
>  *
>  * rte_flow_dynf_metadata_register() must have been called first.
>  */
> __rte_experimental
> static inline int rte_mbuf_dyn_metadata_get(const struct rte_mbuf *m,
>                                        uint32_t *metadata) {
>        if ((m->ol_flags & rte_flow_dynf_metadata_mask) == 0)
>                return -1;
What if metadata is 0xFFFFFFFF ?
The checking of availability might embrace larger code block, 
so this might be not the best place to check availability.

>        *metadata = *RTE_MBUF_DYNFIELD(m, rte_flow_dynf_metadata_offs,
>                                uint32_t *);
>        return 0;
> }
> 
> /**
>  * Delete the metadata dynamic flag in mbuf.
>  *
>  * rte_flow_dynf_metadata_register() must have been called first.
>  */
> __rte_experimental
> static inline void rte_mbuf_dyn_metadata_del(struct rte_mbuf *m) {
>        m->ol_flags &= ~rte_flow_dynf_metadata_mask; }
> 
Sorry, I do not see the practical usecase for these helpers. In my opinion it is just some kind of obscuration.
They do replace the very simple code and introduce some risk of performance impact.

> 
> >  /*
> >   * Definition of a single action.
> >   *
> > @@ -2533,6 +2588,32 @@ enum rte_flow_conv_op {  };
> >
> >  /**
> > + * Check if mbuf dynamic field for metadata is registered.
> > + *
> > + * @return
> > + *   True if registered, false otherwise.
> > + */
> > +__rte_experimental
> > +static inline int
> > +rte_flow_dynf_metadata_avail(void) {
> > +	return !!rte_flow_dynf_metadata_mask; }
> 
> _registered() instead of _avail() ?
Accepted, sounds better.

> 
> > +
> > +/**
> > + * Register mbuf dynamic field and flag for metadata.
> > + *
> > + * This function must be called prior to use SET_META action in order
> > +to
> > + * register the dynamic mbuf field. Otherwise, the data cannot be
> > +delivered to
> > + * application.
> > + *
> > + * @return
> > + *   0 on success, a negative errno value otherwise and rte_errno is set.
> > + */
> > +__rte_experimental
> > +int
> > +rte_flow_dynf_metadata_register(void);
> > +
> > +/**
> >   * Check whether a flow rule can be created on a given port.
> >   *
> >   * The flow rule is validated for correctness and whether it could be
> > accepted diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h
> > b/lib/librte_mbuf/rte_mbuf_dyn.h index 6e2c816..4ff33ac 100644
> > --- a/lib/librte_mbuf/rte_mbuf_dyn.h
> > +++ b/lib/librte_mbuf/rte_mbuf_dyn.h
> > @@ -160,4 +160,12 @@ int rte_mbuf_dynflag_lookup(const char *name,
> >   */
> >  #define RTE_MBUF_DYNFIELD(m, offset, type) ((type)((uintptr_t)(m) +
> > (offset)))
> >
> > +/**
> > + * Flow metadata dynamic field definitions.
> > + */
> > +#define MBUF_DYNF_METADATA_NAME "flow-metadata"
> > +#define MBUF_DYNF_METADATA_SIZE sizeof(uint32_t) #define
> > +MBUF_DYNF_METADATA_ALIGN __alignof__(uint32_t) #define
> > +MBUF_DYNF_METADATA_FLAGS 0
> 
> If this flag is only to be used in rte_flow, it can stay in rte_flow.
> The name should follow the function name conventions, I suggest
> "rte_flow_metadata".

The definitions:
MBUF_DYNF_METADATA_NAME, 
MBUF_DYNF_METADATA_SIZE,
MBUF_DYNF_METADATA_ALIGN
are global. rte_flow proposes only minimal set tyo check and access
the metadata. By knowing the field names applications would have the
more flexibility in processing the fields, for example it allows to  optimize
the handling of multiple dynamic fields . The definition of metadata size allows
to generate optimized code:
#if MBUF_DYNF_METADATA_SIZE == sizeof(uint32)
	*RTE_MBUF_DYNFIELD(m) = get_metadata_32bit()
#else
	*RTE_MBUF_DYNFIELD(m) = get_metadata_64bit()
#endif

MBUF_DYNF_METADATA_FLAGS flag is not used by rte_flow,
this flag is related exclusively to dynamic mbuf  " Reserved for future use, must be 0".
Would you like to drop this definition?

> 
> If the flag is going to be used in several places in dpdk (rte_flow, pmd, app,
> ...), I wonder if it shouldn't be defined it in rte_mbuf_dyn.c. I mean:
> 
> ====
> /* rte_mbuf_dyn.c */
> const struct rte_mbuf_dynfield rte_mbuf_dynfield_flow_metadata = {
>    ...
> };
In this case we would make this descriptor global.
It is no needed, because there Is no supposed any usage, but by
rte_flow_dynf_metadata_register() only. The 

> int rte_mbuf_dynfield_flow_metadata_offset = -1; const struct
> rte_mbuf_dynflag rte_mbuf_dynflag_flow_metadata = {
>    ...
> };
> int rte_mbuf_dynflag_flow_metadata_bitnum = -1;
> 
> int rte_mbuf_dyn_flow_metadata_register(void)
> {
> ...
> }
> 
> /* rte_mbuf_dyn.h */
> extern const struct rte_mbuf_dynfield rte_mbuf_dynfield_flow_metadata;
> extern int rte_mbuf_dynfield_flow_metadata_offset;
> extern const struct rte_mbuf_dynflag rte_mbuf_dynflag_flow_metadata;
> extern int rte_mbuf_dynflag_flow_metadata_bitnum;
> 
> ...helpers to set/get metadata...
> ===
> 
> Centralizing the definitions of non-private dynamic fields/flags in
> rte_mbuf_dyn may help other people to reuse a field that is well described if
> it match their use-case.

Yes, centralizing is important, that's why MBUF_DYNF_METADATA_xxx placed
in rte_mbuf_dyn.h. Do you think we should share the descriptors either?
I have no idea why someone (but rte_flow_dynf_metadata_register()) might
register metadata field directly.

> 
> In your case, what is carried by metadata? Could it be reused by others? I
> think some more description is needed.
In my case, metadata is just opaquie rte_flow related 32-bit unsigned value provided by
mlx5 hardrware in rx datapath. I have no guess whether someone wishes to reuse.

Brief summary of you comment (just to make sure I understood your proposal in correct way):
1. drop all definitions MBUF_DYNF_METADATA_xxx, leave MBUF_DYNF_METADATA_NAME only
2. move the descriptor const struct rte_mbuf_dynfield desc_offs = {} to rte_mbuf_dyn.c and make it global
3. provide helpers to access metadata

[1] and [2] look OK in general. Although I think these ones make code less flexible, restrict the potential compile time options.
For now it is rather theoretical question, if you insist on your approach - please, let me know, I'll address [1] and [2]
and update.my patch.

As for [3] - IMHO, the extra abstraction layer is not useful, and might be even harmful.
I tend not to complicate the code, at least, for now.

With best regards,
Slava
 
> Regards,
> Olivier
  
Olivier Matz Oct. 21, 2019, 4:37 p.m. UTC | #3
Hi Slava,

On Sat, Oct 19, 2019 at 07:47:59PM +0000, Slava Ovsiienko wrote:
> Hi, Olivier
> 
> Thank you for your comment (and for the dynamic mbuf patch, btw). Please, see below.
> 
> > -----Original Message-----
> > From: Olivier Matz <olivier.matz@6wind.com>
> > Sent: Friday, October 18, 2019 12:22
> > To: Slava Ovsiienko <viacheslavo@mellanox.com>
> > Cc: dev@dpdk.org; Matan Azrad <matan@mellanox.com>; Raslan
> > Darawsheh <rasland@mellanox.com>; Thomas Monjalon
> > <thomas@monjalon.net>; Yongseok Koh <yskoh@mellanox.com>
> > Subject: Re: [PATCH v2] ethdev: extend flow metadata
> > 
> > Hi Viacheslav,
> > 
> > Few comments on the dynamic mbuf part below.
> > 
> [snip]
> 
> > > @@ -12,10 +12,18 @@
> > >  #include <rte_errno.h>
> > >  #include <rte_branch_prediction.h>
> > >  #include <rte_string_fns.h>
> > > +#include <rte_mbuf.h>
> > > +#include <rte_mbuf_dyn.h>
> > >  #include "rte_ethdev.h"
> > >  #include "rte_flow_driver.h"
> > >  #include "rte_flow.h"
> > >
> > > +/* Mbuf dynamic field name for metadata. */ int
> > > +rte_flow_dynf_metadata_offs = -1;
> > > +
> > > +/* Mbuf dynamic field flag bit number for metadata. */ uint64_t
> > > +rte_flow_dynf_metadata_mask;
> > > +
> > >  /**
> > >   * Flow elements description tables.
> > >   */
> > > @@ -153,8 +161,41 @@ struct rte_flow_desc_data {
> > >  	MK_FLOW_ACTION(DEC_TCP_SEQ, sizeof(rte_be32_t)),
> > >  	MK_FLOW_ACTION(INC_TCP_ACK, sizeof(rte_be32_t)),
> > >  	MK_FLOW_ACTION(DEC_TCP_ACK, sizeof(rte_be32_t)),
> > > +	MK_FLOW_ACTION(SET_META, sizeof(struct
> > rte_flow_action_set_meta)),
> > >  };
> > >
> > > +int
> > > +rte_flow_dynf_metadata_register(void)
> > > +{
> > > +	int offset;
> > > +	int flag;
> > > +
> > > +	static const struct rte_mbuf_dynfield desc_offs = {
> > > +		.name = MBUF_DYNF_METADATA_NAME,
> > > +		.size = MBUF_DYNF_METADATA_SIZE,
> > > +		.align = MBUF_DYNF_METADATA_ALIGN,
> > > +		.flags = MBUF_DYNF_METADATA_FLAGS,
> > > +	};
> > > +	static const struct rte_mbuf_dynflag desc_flag = {
> > > +		.name = MBUF_DYNF_METADATA_NAME,
> > > +	};
> > 
> > I don't see think we need #defines.
> > You can directly use the name, sizeof() and __alignof__() here.
> > If the information is used externally, the structure shall be made global non-
> > static.
> 
> The intention was to gather all dynamic fields definitions in one place 
> (in rte_mbuf_dyn.h).

If the dynamic field is only going to be used inside rte_flow, I think
there is no need to expose it in rte_mbuf_dyn.h.
The other reason is I think the #define are just "passthrough", and do
not really bring added value, just an indirection.

> It would be easy to see all fields in one sight (some
> might be shared, some might be mutual exclusive, estimate mbuf space,
> required by various features, etc.). So, we can't just fill structure fields
> with simple sizeof() and alignof() instead of definitions (the field parameters
> must be defined once).
> 
> I do not see the reasons to make table global. I would prefer the definitions.
> - the definitions are compile time processing (table fields are runtime),
> it provides code optimization and better performance.

There is indeed no need to make the table global if the field is private
to rte_flow. About better performance, my understanding is that it would
only impact registration, am I missing something?

> 
> > > +
> > > +	offset = rte_mbuf_dynfield_register(&desc_offs);
> > > +	if (offset < 0)
> > > +		goto error;
> > > +	flag = rte_mbuf_dynflag_register(&desc_flag);
> > > +	if (flag < 0)
> > > +		goto error;
> > > +	rte_flow_dynf_metadata_offs = offset;
> > > +	rte_flow_dynf_metadata_mask = (1ULL << flag);
> > > +	return 0;
> > > +
> > > +error:
> > > +	rte_flow_dynf_metadata_offs = -1;
> > > +	rte_flow_dynf_metadata_mask = 0ULL;
> > > +	return -rte_errno;
> > > +}
> > > +
> > >  static int
> > >  flow_err(uint16_t port_id, int ret, struct rte_flow_error *error)  {
> > > diff --git a/lib/librte_ethdev/rte_flow.h
> > > b/lib/librte_ethdev/rte_flow.h index 391a44a..a27e619 100644
> > > --- a/lib/librte_ethdev/rte_flow.h
> > > +++ b/lib/librte_ethdev/rte_flow.h
> > > @@ -27,6 +27,8 @@
> > >  #include <rte_udp.h>
> > >  #include <rte_byteorder.h>
> > >  #include <rte_esp.h>
> > > +#include <rte_mbuf.h>
> > > +#include <rte_mbuf_dyn.h>
> > >
> > >  #ifdef __cplusplus
> > >  extern "C" {
> > > @@ -417,7 +419,8 @@ enum rte_flow_item_type {
> > >  	/**
> > >  	 * [META]
> > >  	 *
> > > -	 * Matches a metadata value specified in mbuf metadata field.
> > > +	 * Matches a metadata value.
> > > +	 *
> > >  	 * See struct rte_flow_item_meta.
> > >  	 */
> > >  	RTE_FLOW_ITEM_TYPE_META,
> > > @@ -1213,9 +1216,17 @@ struct rte_flow_item_icmp6_nd_opt_tla_eth {
> > > #endif
> > >
> > >  /**
> > > - * RTE_FLOW_ITEM_TYPE_META.
> > > + * @warning
> > > + * @b EXPERIMENTAL: this structure may change without prior notice
> > >   *
> > > - * Matches a specified metadata value.
> > > + * RTE_FLOW_ITEM_TYPE_META
> > > + *
> > > + * Matches a specified metadata value. On egress, metadata can be set
> > > + either by
> > > + * mbuf tx_metadata field with PKT_TX_METADATA flag or
> > > + * RTE_FLOW_ACTION_TYPE_SET_META. On ingress,
> > > + RTE_FLOW_ACTION_TYPE_SET_META sets
> > > + * metadata for a packet and the metadata will be reported via mbuf
> > > + metadata
> > > + * dynamic field with PKT_RX_DYNF_METADATA flag. The dynamic mbuf
> > > + field must be
> > > + * registered in advance by rte_flow_dynf_metadata_register().
> > >   */
> > >  struct rte_flow_item_meta {
> > >  	rte_be32_t data;
> > > @@ -1813,6 +1824,13 @@ enum rte_flow_action_type {
> > >  	 * undefined behavior.
> > >  	 */
> > >  	RTE_FLOW_ACTION_TYPE_DEC_TCP_ACK,
> > > +
> > > +	/**
> > > +	 * Set metadata on ingress or egress path.
> > > +	 *
> > > +	 * See struct rte_flow_action_set_meta.
> > > +	 */
> > > +	RTE_FLOW_ACTION_TYPE_SET_META,
> > >  };
> > >
> > >  /**
> > > @@ -2300,6 +2318,43 @@ struct rte_flow_action_set_mac {
> > >  	uint8_t mac_addr[RTE_ETHER_ADDR_LEN];  };
> > >
> > > +/**
> > > + * @warning
> > > + * @b EXPERIMENTAL: this structure may change without prior notice
> > > + *
> > > + * RTE_FLOW_ACTION_TYPE_SET_META
> > > + *
> > > + * Set metadata. Metadata set by mbuf tx_metadata field with
> > > + * PKT_TX_METADATA flag on egress will be overridden by this action.
> > > +On
> > > + * ingress, the metadata will be carried by mbuf metadata dynamic
> > > +field
> > > + * with PKT_RX_DYNF_METADATA flag if set.  The dynamic mbuf field
> > > +must be
> > > + * registered in advance by rte_flow_dynf_metadata_register().
> > > + *
> > > + * Altering partial bits is supported with mask. For bits which have
> > > +never
> > > + * been set, unpredictable value will be seen depending on driver
> > > + * implementation. For loopback/hairpin packet, metadata set on Rx/Tx
> > > +may
> > > + * or may not be propagated to the other path depending on HW
> > capability.
> > > + *
> > > + * RTE_FLOW_ITEM_TYPE_META matches metadata.
> > > + */
> > > +struct rte_flow_action_set_meta {
> > > +	rte_be32_t data;
> > > +	rte_be32_t mask;
> > > +};
> > > +
> > > +/* Mbuf dynamic field offset for metadata. */ extern int
> > > +rte_flow_dynf_metadata_offs;
> > > +
> > > +/* Mbuf dynamic field flag mask for metadata. */ extern uint64_t
> > > +rte_flow_dynf_metadata_mask;
> > > +
> > > +/* Mbuf dynamic field pointer for metadata. */ #define
> > > +RTE_FLOW_DYNF_METADATA(m) \
> > > +	RTE_MBUF_DYNFIELD((m), rte_flow_dynf_metadata_offs, uint32_t
> > *)
> > > +
> > > +/* Mbuf dynamic flag for metadata. */ #define PKT_RX_DYNF_METADATA
> > > +(rte_flow_dynf_metadata_mask)
> > > +
> > 
> > I wonder if helpers like this wouldn't be better, because they combine the
> > flag and the field:
> > 
> > /**
> >  * Set metadata dynamic field and flag in mbuf.
> >  *
> >  * rte_flow_dynf_metadata_register() must have been called first.
> >  */
> > __rte_experimental
> > static inline void rte_mbuf_dyn_metadata_set(struct rte_mbuf *m,
> >                                        uint32_t metadata) {
> >        *RTE_MBUF_DYNFIELD(m, rte_flow_dynf_metadata_offs,
> >                        uint32_t *) = metadata;
> >        m->ol_flags |= rte_flow_dynf_metadata_mask; }
> Setting flag looks redundantly.
> What if driver just replaces the metadata and flag is already set?
> The other option - the flags (for set of fields) might be set in combinations.
> mbuf field is supposed to be engaged in datapath, performance is
> very critical, adding one more abstraction layer seems not to be relevant.

Ok, that was just a suggestion. Let's use your accessors if you fear a
performance impact.

Nevertheless I suggest to use static inline functions in place of
macros if possible. For RTE_MBUF_DYNFIELD(), I used a macro because
it's the only way to provide a type to cast the result. But in your
case, you know it's a uint32_t *.

> Also, metadata is not feature of mbuf. It should have rte_flow prefix.

Yes, sure. The example derives from a test I've done, and I forgot to
change it.


> > /**
> >  * Get metadata dynamic field value in mbuf.
> >  *
> >  * rte_flow_dynf_metadata_register() must have been called first.
> >  */
> > __rte_experimental
> > static inline int rte_mbuf_dyn_metadata_get(const struct rte_mbuf *m,
> >                                        uint32_t *metadata) {
> >        if ((m->ol_flags & rte_flow_dynf_metadata_mask) == 0)
> >                return -1;
> What if metadata is 0xFFFFFFFF ?
> The checking of availability might embrace larger code block, 
> so this might be not the best place to check availability.
> 
> >        *metadata = *RTE_MBUF_DYNFIELD(m, rte_flow_dynf_metadata_offs,
> >                                uint32_t *);
> >        return 0;
> > }
> > 
> > /**
> >  * Delete the metadata dynamic flag in mbuf.
> >  *
> >  * rte_flow_dynf_metadata_register() must have been called first.
> >  */
> > __rte_experimental
> > static inline void rte_mbuf_dyn_metadata_del(struct rte_mbuf *m) {
> >        m->ol_flags &= ~rte_flow_dynf_metadata_mask; }
> > 
> Sorry, I do not see the practical usecase for these helpers. In my opinion it is just some kind of obscuration.
> They do replace the very simple code and introduce some risk of performance impact.
> 
> > 
> > >  /*
> > >   * Definition of a single action.
> > >   *
> > > @@ -2533,6 +2588,32 @@ enum rte_flow_conv_op {  };
> > >
> > >  /**
> > > + * Check if mbuf dynamic field for metadata is registered.
> > > + *
> > > + * @return
> > > + *   True if registered, false otherwise.
> > > + */
> > > +__rte_experimental
> > > +static inline int
> > > +rte_flow_dynf_metadata_avail(void) {
> > > +	return !!rte_flow_dynf_metadata_mask; }
> > 
> > _registered() instead of _avail() ?
> Accepted, sounds better.
> 
> > 
> > > +
> > > +/**
> > > + * Register mbuf dynamic field and flag for metadata.
> > > + *
> > > + * This function must be called prior to use SET_META action in order
> > > +to
> > > + * register the dynamic mbuf field. Otherwise, the data cannot be
> > > +delivered to
> > > + * application.
> > > + *
> > > + * @return
> > > + *   0 on success, a negative errno value otherwise and rte_errno is set.
> > > + */
> > > +__rte_experimental
> > > +int
> > > +rte_flow_dynf_metadata_register(void);
> > > +
> > > +/**
> > >   * Check whether a flow rule can be created on a given port.
> > >   *
> > >   * The flow rule is validated for correctness and whether it could be
> > > accepted diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h
> > > b/lib/librte_mbuf/rte_mbuf_dyn.h index 6e2c816..4ff33ac 100644
> > > --- a/lib/librte_mbuf/rte_mbuf_dyn.h
> > > +++ b/lib/librte_mbuf/rte_mbuf_dyn.h
> > > @@ -160,4 +160,12 @@ int rte_mbuf_dynflag_lookup(const char *name,
> > >   */
> > >  #define RTE_MBUF_DYNFIELD(m, offset, type) ((type)((uintptr_t)(m) +
> > > (offset)))
> > >
> > > +/**
> > > + * Flow metadata dynamic field definitions.
> > > + */
> > > +#define MBUF_DYNF_METADATA_NAME "flow-metadata"
> > > +#define MBUF_DYNF_METADATA_SIZE sizeof(uint32_t) #define
> > > +MBUF_DYNF_METADATA_ALIGN __alignof__(uint32_t) #define
> > > +MBUF_DYNF_METADATA_FLAGS 0
> > 
> > If this flag is only to be used in rte_flow, it can stay in rte_flow.
> > The name should follow the function name conventions, I suggest
> > "rte_flow_metadata".
> 
> The definitions:
> MBUF_DYNF_METADATA_NAME, 
> MBUF_DYNF_METADATA_SIZE,
> MBUF_DYNF_METADATA_ALIGN
> are global. rte_flow proposes only minimal set tyo check and access
> the metadata. By knowing the field names applications would have the
> more flexibility in processing the fields, for example it allows to  optimize
> the handling of multiple dynamic fields . The definition of metadata size allows
> to generate optimized code:
> #if MBUF_DYNF_METADATA_SIZE == sizeof(uint32)
> 	*RTE_MBUF_DYNFIELD(m) = get_metadata_32bit()
> #else
> 	*RTE_MBUF_DYNFIELD(m) = get_metadata_64bit()
> #endif

I don't see any reason why the same dynamic field could have different
sizes, I even think it could be dangerous. Your accessors suppose that
the metadata is a uint32_t. Having a compile-time option for that does
not look desirable.

Just a side note: we have to take care when adding a new *public*
dynamic field that it won't change in the future: the accessors are
macros or static inline functions, so they are embedded in the binaries.
This is probably something we should discuss
and may not be when updating the dpdk (as shared lib).


> MBUF_DYNF_METADATA_FLAGS flag is not used by rte_flow,
> this flag is related exclusively to dynamic mbuf  " Reserved for future use, must be 0".
> Would you like to drop this definition?
> 
> > 
> > If the flag is going to be used in several places in dpdk (rte_flow, pmd, app,
> > ...), I wonder if it shouldn't be defined it in rte_mbuf_dyn.c. I mean:
> > 
> > ====
> > /* rte_mbuf_dyn.c */
> > const struct rte_mbuf_dynfield rte_mbuf_dynfield_flow_metadata = {
> >    ...
> > };
> In this case we would make this descriptor global.
> It is no needed, because there Is no supposed any usage, but by
> rte_flow_dynf_metadata_register() only. The 

Yes, in my example I wasn't sure it was going to be private to rte_flow (see "If
the flag is going to be used in several places in dpdk (rte_flow, pmd, app,
...)").

So yes, I agree the struct should remain private.


> > int rte_mbuf_dynfield_flow_metadata_offset = -1; const struct
> > rte_mbuf_dynflag rte_mbuf_dynflag_flow_metadata = {
> >    ...
> > };
> > int rte_mbuf_dynflag_flow_metadata_bitnum = -1;
> > 
> > int rte_mbuf_dyn_flow_metadata_register(void)
> > {
> > ...
> > }
> > 
> > /* rte_mbuf_dyn.h */
> > extern const struct rte_mbuf_dynfield rte_mbuf_dynfield_flow_metadata;
> > extern int rte_mbuf_dynfield_flow_metadata_offset;
> > extern const struct rte_mbuf_dynflag rte_mbuf_dynflag_flow_metadata;
> > extern int rte_mbuf_dynflag_flow_metadata_bitnum;
> > 
> > ...helpers to set/get metadata...
> > ===
> > 
> > Centralizing the definitions of non-private dynamic fields/flags in
> > rte_mbuf_dyn may help other people to reuse a field that is well described if
> > it match their use-case.
> 
> Yes, centralizing is important, that's why MBUF_DYNF_METADATA_xxx placed
> in rte_mbuf_dyn.h. Do you think we should share the descriptors either?
> I have no idea why someone (but rte_flow_dynf_metadata_register()) might
> register metadata field directly.

If the field is private to rte_flow, yes, there is no need to share the
"struct rte_mbuf_dynfield". Even the rte_flow_dynf_metadata_register()
could be marked as internal, right?

One more question: I see the registration is done by
parse_vc_action_set_meta(). My understanding is that this function is
not in datapath, and is called when configuring rte_flow. Do you
confirm?

> > In your case, what is carried by metadata? Could it be reused by others? I
> > think some more description is needed.
> In my case, metadata is just opaquie rte_flow related 32-bit unsigned value provided by
> mlx5 hardrware in rx datapath. I have no guess whether someone wishes to reuse.

What is the user supposed to do with this value? If it is hw-specific
data, I think the name of the mbuf field should include "MLX", and it
should be described.

Are these rte_flow actions somehow specific to mellanox drivers ?

> Brief summary of you comment (just to make sure I understood your proposal in correct way):
> 1. drop all definitions MBUF_DYNF_METADATA_xxx, leave MBUF_DYNF_METADATA_NAME only
> 2. move the descriptor const struct rte_mbuf_dynfield desc_offs = {} to rte_mbuf_dyn.c and make it global
> 3. provide helpers to access metadata
> 
> [1] and [2] look OK in general. Although I think these ones make code less flexible, restrict the potential compile time options.
> For now it is rather theoretical question, if you insist on your approach - please, let me know, I'll address [1] and [2]
> and update.my patch.

[1] I think the #define only adds an indirection, and I didn't see any
    perf constraint here.
[2] My previous comment was surely not clear, sorry. The code can stay
    in rte_flow.

> As for [3] - IMHO, the extra abstraction layer is not useful, and might be even harmful.
> I tend not to complicate the code, at least, for now.

[3] ok for me


Thanks,
Olivier
  
Slava Ovsiienko Oct. 24, 2019, 6:49 a.m. UTC | #4
Hi, Olivier

> > [snip]
> >
> > > > +int
> > > > +rte_flow_dynf_metadata_register(void)
> > > > +{
> > > > +	int offset;
> > > > +	int flag;
> > > > +
> > > > +	static const struct rte_mbuf_dynfield desc_offs = {
> > > > +		.name = MBUF_DYNF_METADATA_NAME,
> > > > +		.size = MBUF_DYNF_METADATA_SIZE,
> > > > +		.align = MBUF_DYNF_METADATA_ALIGN,
> > > > +		.flags = MBUF_DYNF_METADATA_FLAGS,
> > > > +	};
> > > > +	static const struct rte_mbuf_dynflag desc_flag = {
> > > > +		.name = MBUF_DYNF_METADATA_NAME,
> > > > +	};
> > >
> > > I don't see think we need #defines.
> > > You can directly use the name, sizeof() and __alignof__() here.
> > > If the information is used externally, the structure shall be made
> > > global non- static.
> >
> > The intention was to gather all dynamic fields definitions in one
> > place (in rte_mbuf_dyn.h).
> 
> If the dynamic field is only going to be used inside rte_flow, I think there is no
> need to expose it in rte_mbuf_dyn.h.
> The other reason is I think the #define are just "passthrough", and do not
> really bring added value, just an indirection.
> 
> > It would be easy to see all fields in one sight (some might be shared,
> > some might be mutual exclusive, estimate mbuf space, required by
> > various features, etc.). So, we can't just fill structure fields with
> > simple sizeof() and alignof() instead of definitions (the field
> > parameters must be defined once).
> >
> > I do not see the reasons to make table global. I would prefer the
> definitions.
> > - the definitions are compile time processing (table fields are
> > runtime), it provides code optimization and better performance.
> 
> There is indeed no need to make the table global if the field is private to
> rte_flow. About better performance, my understanding is that it would only
> impact registration, am I missing something?

OK, I thought about some opportunity to allow application to register
field directly, bypassing rte_flow_dynf_metadata_register(). So either
definitions or field description table was supposed to be global. 
I agree, let's do not complicate the matter, I'll will make global the
metadata field name definition only - in the rte_mbuf_dyn.h in order
just to have some centralizing point.

> >
> > > > +
> > > > +	offset = rte_mbuf_dynfield_register(&desc_offs);
> > > > +	if (offset < 0)
> > > > +		goto error;
> > > > +	flag = rte_mbuf_dynflag_register(&desc_flag);
> > > > +	if (flag < 0)
> > > > +		goto error;
> > > > +	rte_flow_dynf_metadata_offs = offset;
> > > > +	rte_flow_dynf_metadata_mask = (1ULL << flag);
> > > > +	return 0;
> > > > +
> > > > +error:
> > > > +	rte_flow_dynf_metadata_offs = -1;
> > > > +	rte_flow_dynf_metadata_mask = 0ULL;
> > > > +	return -rte_errno;
> > > > +}
> > > > +
> > > >  static int
> > > >  flow_err(uint16_t port_id, int ret, struct rte_flow_error *error)
> > > > { diff --git a/lib/librte_ethdev/rte_flow.h
> > > > b/lib/librte_ethdev/rte_flow.h index 391a44a..a27e619 100644
> > > > --- a/lib/librte_ethdev/rte_flow.h
> > > > +++ b/lib/librte_ethdev/rte_flow.h
> > > > @@ -27,6 +27,8 @@
> > > >  #include <rte_udp.h>
> > > >  #include <rte_byteorder.h>
> > > >  #include <rte_esp.h>
> > > > +#include <rte_mbuf.h>
> > > > +#include <rte_mbuf_dyn.h>
> > > >
> > > >  #ifdef __cplusplus
> > > >  extern "C" {
> > > > @@ -417,7 +419,8 @@ enum rte_flow_item_type {
> > > >  	/**
> > > >  	 * [META]
> > > >  	 *
> > > > -	 * Matches a metadata value specified in mbuf metadata field.
> > > > +	 * Matches a metadata value.
> > > > +	 *
> > > >  	 * See struct rte_flow_item_meta.
> > > >  	 */
> > > >  	RTE_FLOW_ITEM_TYPE_META,
> > > > @@ -1213,9 +1216,17 @@ struct
> rte_flow_item_icmp6_nd_opt_tla_eth {
> > > > #endif
> > > >
> > > >  /**
> > > > - * RTE_FLOW_ITEM_TYPE_META.
> > > > + * @warning
> > > > + * @b EXPERIMENTAL: this structure may change without prior
> > > > + notice
> > > >   *
> > > > - * Matches a specified metadata value.
> > > > + * RTE_FLOW_ITEM_TYPE_META
> > > > + *
> > > > + * Matches a specified metadata value. On egress, metadata can be
> > > > + set either by
> > > > + * mbuf tx_metadata field with PKT_TX_METADATA flag or
> > > > + * RTE_FLOW_ACTION_TYPE_SET_META. On ingress,
> > > > + RTE_FLOW_ACTION_TYPE_SET_META sets
> > > > + * metadata for a packet and the metadata will be reported via
> > > > + mbuf metadata
> > > > + * dynamic field with PKT_RX_DYNF_METADATA flag. The dynamic
> mbuf
> > > > + field must be
> > > > + * registered in advance by rte_flow_dynf_metadata_register().
> > > >   */
> > > >  struct rte_flow_item_meta {
> > > >  	rte_be32_t data;
> > > > @@ -1813,6 +1824,13 @@ enum rte_flow_action_type {
> > > >  	 * undefined behavior.
> > > >  	 */
> > > >  	RTE_FLOW_ACTION_TYPE_DEC_TCP_ACK,
> > > > +
> > > > +	/**
> > > > +	 * Set metadata on ingress or egress path.
> > > > +	 *
> > > > +	 * See struct rte_flow_action_set_meta.
> > > > +	 */
> > > > +	RTE_FLOW_ACTION_TYPE_SET_META,
> > > >  };
> > > >
> > > >  /**
> > > > @@ -2300,6 +2318,43 @@ struct rte_flow_action_set_mac {
> > > >  	uint8_t mac_addr[RTE_ETHER_ADDR_LEN];  };
> > > >
> > > > +/**
> > > > + * @warning
> > > > + * @b EXPERIMENTAL: this structure may change without prior
> > > > +notice
> > > > + *
> > > > + * RTE_FLOW_ACTION_TYPE_SET_META
> > > > + *
> > > > + * Set metadata. Metadata set by mbuf tx_metadata field with
> > > > + * PKT_TX_METADATA flag on egress will be overridden by this action.
> > > > +On
> > > > + * ingress, the metadata will be carried by mbuf metadata dynamic
> > > > +field
> > > > + * with PKT_RX_DYNF_METADATA flag if set.  The dynamic mbuf field
> > > > +must be
> > > > + * registered in advance by rte_flow_dynf_metadata_register().
> > > > + *
> > > > + * Altering partial bits is supported with mask. For bits which
> > > > +have never
> > > > + * been set, unpredictable value will be seen depending on driver
> > > > + * implementation. For loopback/hairpin packet, metadata set on
> > > > +Rx/Tx may
> > > > + * or may not be propagated to the other path depending on HW
> > > capability.
> > > > + *
> > > > + * RTE_FLOW_ITEM_TYPE_META matches metadata.
> > > > + */
> > > > +struct rte_flow_action_set_meta {
> > > > +	rte_be32_t data;
> > > > +	rte_be32_t mask;
> > > > +};
> > > > +
> > > > +/* Mbuf dynamic field offset for metadata. */ extern int
> > > > +rte_flow_dynf_metadata_offs;
> > > > +
> > > > +/* Mbuf dynamic field flag mask for metadata. */ extern uint64_t
> > > > +rte_flow_dynf_metadata_mask;
> > > > +
> > > > +/* Mbuf dynamic field pointer for metadata. */ #define
> > > > +RTE_FLOW_DYNF_METADATA(m) \
> > > > +	RTE_MBUF_DYNFIELD((m), rte_flow_dynf_metadata_offs, uint32_t
> > > *)
> > > > +
> > > > +/* Mbuf dynamic flag for metadata. */ #define
> > > > +PKT_RX_DYNF_METADATA
> > > > +(rte_flow_dynf_metadata_mask)
> > > > +
> > >
> > > I wonder if helpers like this wouldn't be better, because they
> > > combine the flag and the field:
> > >
> > > /**
> > >  * Set metadata dynamic field and flag in mbuf.
> > >  *
> > >  * rte_flow_dynf_metadata_register() must have been called first.
> > >  */
> > > __rte_experimental
> > > static inline void rte_mbuf_dyn_metadata_set(struct rte_mbuf *m,
> > >                                        uint32_t metadata) {
> > >        *RTE_MBUF_DYNFIELD(m, rte_flow_dynf_metadata_offs,
> > >                        uint32_t *) = metadata;
> > >        m->ol_flags |= rte_flow_dynf_metadata_mask; }
> > Setting flag looks redundantly.
> > What if driver just replaces the metadata and flag is already set?
> > The other option - the flags (for set of fields) might be set in combinations.
> > mbuf field is supposed to be engaged in datapath, performance is very
> > critical, adding one more abstraction layer seems not to be relevant.
> 
> Ok, that was just a suggestion. Let's use your accessors if you fear a
> performance impact.
The simple example - mlx5 PMD has the rx_burst routine implemented
with vector instructions, and it processes four packets at once. No need
to check field availability four times, and the storing the metadata
is the subject for further optimization with vector instructions.
It is a bit difficult to provide common helpers to handle the metadata
field due to extremely high optimization requirements.

> 
> Nevertheless I suggest to use static inline functions in place of macros if
> possible. For RTE_MBUF_DYNFIELD(), I used a macro because it's the only
> way to provide a type to cast the result. But in your case, you know it's a
> uint32_t *.
What If one needs to specify the address of field? Macro allows to do that,
inline functions - do not. Packets may be processed in bizarre ways,
for example in a batch, with vector instructions. OK, I'll provide
the set/get routines, but I'm not sure whether will use ones in mlx5 code.
In my opinion it just obscures the field nature. Field is just a field, AFAIU, 
it is main idea of your patch, the way to handle dynamic field should be close
to handling usual static fields, I think. Macro pointer follows this approach,
routines - does not.

> > Also, metadata is not feature of mbuf. It should have rte_flow prefix.
> 
> Yes, sure. The example derives from a test I've done, and I forgot to change
> it.
> 
> 
> > > /**
> > >  * Get metadata dynamic field value in mbuf.
> > >  *
> > >  * rte_flow_dynf_metadata_register() must have been called first.
> > >  */
> > > __rte_experimental
> > > static inline int rte_mbuf_dyn_metadata_get(const struct rte_mbuf *m,
> > >                                        uint32_t *metadata) {
> > >        if ((m->ol_flags & rte_flow_dynf_metadata_mask) == 0)
> > >                return -1;
> > What if metadata is 0xFFFFFFFF ?
> > The checking of availability might embrace larger code block, so this
> > might be not the best place to check availability.
> >
> > >        *metadata = *RTE_MBUF_DYNFIELD(m,
> rte_flow_dynf_metadata_offs,
> > >                                uint32_t *);
> > >        return 0;
> > > }
> > >
> > > /**
> > >  * Delete the metadata dynamic flag in mbuf.
> > >  *
> > >  * rte_flow_dynf_metadata_register() must have been called first.
> > >  */
> > > __rte_experimental
> > > static inline void rte_mbuf_dyn_metadata_del(struct rte_mbuf *m) {
> > >        m->ol_flags &= ~rte_flow_dynf_metadata_mask; }
> > >
> > Sorry, I do not see the practical usecase for these helpers. In my opinion it
> is just some kind of obscuration.
> > They do replace the very simple code and introduce some risk of
> performance impact.
> >
> > >
> > > >  /*
> > > >   * Definition of a single action.
> > > >   *
> > > > @@ -2533,6 +2588,32 @@ enum rte_flow_conv_op {  };
> > > >
> > > >  /**
> > > > + * Check if mbuf dynamic field for metadata is registered.
> > > > + *
> > > > + * @return
> > > > + *   True if registered, false otherwise.
> > > > + */
> > > > +__rte_experimental
> > > > +static inline int
> > > > +rte_flow_dynf_metadata_avail(void) {
> > > > +	return !!rte_flow_dynf_metadata_mask; }
> > >
> > > _registered() instead of _avail() ?
> > Accepted, sounds better.

Hmm, I changed my opinion - we already have
rte_flow_dynf_metadata_register(void). Is it OK to have
rte_flow_dynf_metadata_registerED(void) ?
It would be easy to mistype. 

> >
> > >
> > > > +
> > > > +/**
> > > > + * Register mbuf dynamic field and flag for metadata.
> > > > + *
> > > > + * This function must be called prior to use SET_META action in
> > > > +order to
> > > > + * register the dynamic mbuf field. Otherwise, the data cannot be
> > > > +delivered to
> > > > + * application.
> > > > + *
> > > > + * @return
> > > > + *   0 on success, a negative errno value otherwise and rte_errno is
> set.
> > > > + */
> > > > +__rte_experimental
> > > > +int
> > > > +rte_flow_dynf_metadata_register(void);
> > > > +
> > > > +/**
> > > >   * Check whether a flow rule can be created on a given port.
> > > >   *
> > > >   * The flow rule is validated for correctness and whether it
> > > > could be accepted diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h
> > > > b/lib/librte_mbuf/rte_mbuf_dyn.h index 6e2c816..4ff33ac 100644
> > > > --- a/lib/librte_mbuf/rte_mbuf_dyn.h
> > > > +++ b/lib/librte_mbuf/rte_mbuf_dyn.h
> > > > @@ -160,4 +160,12 @@ int rte_mbuf_dynflag_lookup(const char
> *name,
> > > >   */
> > > >  #define RTE_MBUF_DYNFIELD(m, offset, type) ((type)((uintptr_t)(m)
> > > > +
> > > > (offset)))
> > > >
> > > > +/**
> > > > + * Flow metadata dynamic field definitions.
> > > > + */
> > > > +#define MBUF_DYNF_METADATA_NAME "flow-metadata"
> > > > +#define MBUF_DYNF_METADATA_SIZE sizeof(uint32_t) #define
> > > > +MBUF_DYNF_METADATA_ALIGN __alignof__(uint32_t) #define
> > > > +MBUF_DYNF_METADATA_FLAGS 0
> > >
> > > If this flag is only to be used in rte_flow, it can stay in rte_flow.
> > > The name should follow the function name conventions, I suggest
> > > "rte_flow_metadata".
> >
> > The definitions:
> > MBUF_DYNF_METADATA_NAME,
> > MBUF_DYNF_METADATA_SIZE,
> > MBUF_DYNF_METADATA_ALIGN
> > are global. rte_flow proposes only minimal set tyo check and access
> > the metadata. By knowing the field names applications would have the
> > more flexibility in processing the fields, for example it allows to
> > optimize the handling of multiple dynamic fields . The definition of
> > metadata size allows to generate optimized code:
> > #if MBUF_DYNF_METADATA_SIZE == sizeof(uint32)
> > 	*RTE_MBUF_DYNFIELD(m) = get_metadata_32bit() #else
> > 	*RTE_MBUF_DYNFIELD(m) = get_metadata_64bit() #endif
> 
> I don't see any reason why the same dynamic field could have different sizes,
> I even think it could be dangerous. Your accessors suppose that the
> metadata is a uint32_t. Having a compile-time option for that does not look
> desirable.

I tried to provide maximal flexibility and It was just an example of the thing
we could do with global definitions. If you think we do not need it - OK,
let's do things simpler.

> 
> Just a side note: we have to take care when adding a new *public* dynamic
> field that it won't change in the future: the accessors are macros or static
> inline functions, so they are embedded in the binaries.
> This is probably something we should discuss and may not be when updating
> the dpdk (as shared lib).

Yes, agree, defines just will not work correct in correct way and even break an ABI.
As we decided - global metadata defines MBUF_DYNF_METADATA_xxxx
should be removed.

> 
> > MBUF_DYNF_METADATA_FLAGS flag is not used by rte_flow, this flag is
> > related exclusively to dynamic mbuf  " Reserved for future use, must be 0".
> > Would you like to drop this definition?
> >
> > >
> > > If the flag is going to be used in several places in dpdk (rte_flow,
> > > pmd, app, ...), I wonder if it shouldn't be defined it in rte_mbuf_dyn.c. I
> mean:
> > >
> > > ====
> > > /* rte_mbuf_dyn.c */
> > > const struct rte_mbuf_dynfield rte_mbuf_dynfield_flow_metadata = {
> > >    ...
> > > };
> > In this case we would make this descriptor global.
> > It is no needed, because there Is no supposed any usage, but by
> > rte_flow_dynf_metadata_register() only. The
> 
> Yes, in my example I wasn't sure it was going to be private to rte_flow (see
> "If the flag is going to be used in several places in dpdk (rte_flow, pmd, app,
> ...)").
> 
> So yes, I agree the struct should remain private.
OK.

> 
> 
> > > int rte_mbuf_dynfield_flow_metadata_offset = -1; const struct
> > > rte_mbuf_dynflag rte_mbuf_dynflag_flow_metadata = {
> > >    ...
> > > };
> > > int rte_mbuf_dynflag_flow_metadata_bitnum = -1;
> > >
> > > int rte_mbuf_dyn_flow_metadata_register(void)
> > > {
> > > ...
> > > }
> > >
> > > /* rte_mbuf_dyn.h */
> > > extern const struct rte_mbuf_dynfield
> > > rte_mbuf_dynfield_flow_metadata; extern int
> > > rte_mbuf_dynfield_flow_metadata_offset;
> > > extern const struct rte_mbuf_dynflag rte_mbuf_dynflag_flow_metadata;
> > > extern int rte_mbuf_dynflag_flow_metadata_bitnum;
> > >
> > > ...helpers to set/get metadata...
> > > ===
> > >
> > > Centralizing the definitions of non-private dynamic fields/flags in
> > > rte_mbuf_dyn may help other people to reuse a field that is well
> > > described if it match their use-case.
> >
> > Yes, centralizing is important, that's why MBUF_DYNF_METADATA_xxx
> > placed in rte_mbuf_dyn.h. Do you think we should share the descriptors
> either?
> > I have no idea why someone (but rte_flow_dynf_metadata_register())
> > might register metadata field directly.
> 
> If the field is private to rte_flow, yes, there is no need to share the "struct
> rte_mbuf_dynfield". Even the rte_flow_dynf_metadata_register() could be
> marked as internal, right?
rte_flow_dynf_metadata_register() is intended to be called by application.
Some applications might wish to engage metadata feature, some ones - not.

> 
> One more question: I see the registration is done by
> parse_vc_action_set_meta(). My understanding is that this function is not in
> datapath, and is called when configuring rte_flow. Do you confirm?
Rather it is called to configure application in general. If user sets metadata 
(by issuing the appropriate command) it is assumed he/she would like
the metadata on Rx side either. This is just for test purposes and it is not brilliant
example of rte_flow_dynf_metadata_register() use case.


> 
> > > In your case, what is carried by metadata? Could it be reused by
> > > others? I think some more description is needed.
> > In my case, metadata is just opaquie rte_flow related 32-bit unsigned
> > value provided by
> > mlx5 hardrware in rx datapath. I have no guess whether someone wishes
> to reuse.
> 
> What is the user supposed to do with this value? If it is hw-specific data, I
> think the name of the mbuf field should include "MLX", and it should be
> described.

Metadata are not HW specific at all - they neither control nor are produced
by HW (abstracting from the flow engine is implemented in HW).
Metadata are some opaque data, it is some kind of link between data
path and flow space.  With metadata application may provide some per packet
information to flow engine and get back some information from flow engine.
it is generic concept, supposed to be neither HW-related nor vendor specific.

> 
> Are these rte_flow actions somehow specific to mellanox drivers ?

AFAIK, currently it is going to be supported by mlx5 PMD only,
but concept is common and is not vendor specific.

> 
> > Brief summary of you comment (just to make sure I understood your
> proposal in correct way):
> > 1. drop all definitions MBUF_DYNF_METADATA_xxx, leave
> > MBUF_DYNF_METADATA_NAME only 2. move the descriptor const struct
> > rte_mbuf_dynfield desc_offs = {} to rte_mbuf_dyn.c and make it global
> > 3. provide helpers to access metadata
> >
> > [1] and [2] look OK in general. Although I think these ones make code less
> flexible, restrict the potential compile time options.
> > For now it is rather theoretical question, if you insist on your
> > approach - please, let me know, I'll address [1] and [2] and update.my
> patch.
> 
> [1] I think the #define only adds an indirection, and I didn't see any
>     perf constraint here.
> [2] My previous comment was surely not clear, sorry. The code can stay
>     in rte_flow.
> 
> > As for [3] - IMHO, the extra abstraction layer is not useful, and might be
> even harmful.
> > I tend not to complicate the code, at least, for now.
> 
> [3] ok for me
> 
> 
> Thanks,
> Olivier

With best regards, Slava
  
Olivier Matz Oct. 24, 2019, 9:22 a.m. UTC | #5
Hi Slava,

On Thu, Oct 24, 2019 at 06:49:41AM +0000, Slava Ovsiienko wrote:
> Hi, Olivier
> 
> > > [snip]
> > >
> > > > > +int
> > > > > +rte_flow_dynf_metadata_register(void)
> > > > > +{
> > > > > +	int offset;
> > > > > +	int flag;
> > > > > +
> > > > > +	static const struct rte_mbuf_dynfield desc_offs = {
> > > > > +		.name = MBUF_DYNF_METADATA_NAME,
> > > > > +		.size = MBUF_DYNF_METADATA_SIZE,
> > > > > +		.align = MBUF_DYNF_METADATA_ALIGN,
> > > > > +		.flags = MBUF_DYNF_METADATA_FLAGS,
> > > > > +	};
> > > > > +	static const struct rte_mbuf_dynflag desc_flag = {
> > > > > +		.name = MBUF_DYNF_METADATA_NAME,
> > > > > +	};
> > > >
> > > > I don't see think we need #defines.
> > > > You can directly use the name, sizeof() and __alignof__() here.
> > > > If the information is used externally, the structure shall be made
> > > > global non- static.
> > >
> > > The intention was to gather all dynamic fields definitions in one
> > > place (in rte_mbuf_dyn.h).
> > 
> > If the dynamic field is only going to be used inside rte_flow, I think there is no
> > need to expose it in rte_mbuf_dyn.h.
> > The other reason is I think the #define are just "passthrough", and do not
> > really bring added value, just an indirection.
> > 
> > > It would be easy to see all fields in one sight (some might be shared,
> > > some might be mutual exclusive, estimate mbuf space, required by
> > > various features, etc.). So, we can't just fill structure fields with
> > > simple sizeof() and alignof() instead of definitions (the field
> > > parameters must be defined once).
> > >
> > > I do not see the reasons to make table global. I would prefer the
> > definitions.
> > > - the definitions are compile time processing (table fields are
> > > runtime), it provides code optimization and better performance.
> > 
> > There is indeed no need to make the table global if the field is private to
> > rte_flow. About better performance, my understanding is that it would only
> > impact registration, am I missing something?
> 
> OK, I thought about some opportunity to allow application to register
> field directly, bypassing rte_flow_dynf_metadata_register(). So either
> definitions or field description table was supposed to be global. 
> I agree, let's do not complicate the matter, I'll will make global the
> metadata field name definition only - in the rte_mbuf_dyn.h in order
> just to have some centralizing point.

By reading your mail, things are also clearer to me about which
parts need access to this field.

To summarize what I understand:
- dyn field registration is done in rte_flow lib when configuring
  a flow using META
- the dynamic field will never be get/set in a mbuf by a PMD or rte_flow
  before a flow using META is added

One question then: why would you need the dyn field name to be exported?
Does the PMD need to know if the field is registered with a lookup or
something like that? If yes, can you detail why?


> 
> > >
> > > > > +
> > > > > +	offset = rte_mbuf_dynfield_register(&desc_offs);
> > > > > +	if (offset < 0)
> > > > > +		goto error;
> > > > > +	flag = rte_mbuf_dynflag_register(&desc_flag);
> > > > > +	if (flag < 0)
> > > > > +		goto error;
> > > > > +	rte_flow_dynf_metadata_offs = offset;
> > > > > +	rte_flow_dynf_metadata_mask = (1ULL << flag);
> > > > > +	return 0;
> > > > > +
> > > > > +error:
> > > > > +	rte_flow_dynf_metadata_offs = -1;
> > > > > +	rte_flow_dynf_metadata_mask = 0ULL;
> > > > > +	return -rte_errno;
> > > > > +}
> > > > > +
> > > > >  static int
> > > > >  flow_err(uint16_t port_id, int ret, struct rte_flow_error *error)
> > > > > { diff --git a/lib/librte_ethdev/rte_flow.h
> > > > > b/lib/librte_ethdev/rte_flow.h index 391a44a..a27e619 100644
> > > > > --- a/lib/librte_ethdev/rte_flow.h
> > > > > +++ b/lib/librte_ethdev/rte_flow.h
> > > > > @@ -27,6 +27,8 @@
> > > > >  #include <rte_udp.h>
> > > > >  #include <rte_byteorder.h>
> > > > >  #include <rte_esp.h>
> > > > > +#include <rte_mbuf.h>
> > > > > +#include <rte_mbuf_dyn.h>
> > > > >
> > > > >  #ifdef __cplusplus
> > > > >  extern "C" {
> > > > > @@ -417,7 +419,8 @@ enum rte_flow_item_type {
> > > > >  	/**
> > > > >  	 * [META]
> > > > >  	 *
> > > > > -	 * Matches a metadata value specified in mbuf metadata field.
> > > > > +	 * Matches a metadata value.
> > > > > +	 *
> > > > >  	 * See struct rte_flow_item_meta.
> > > > >  	 */
> > > > >  	RTE_FLOW_ITEM_TYPE_META,
> > > > > @@ -1213,9 +1216,17 @@ struct
> > rte_flow_item_icmp6_nd_opt_tla_eth {
> > > > > #endif
> > > > >
> > > > >  /**
> > > > > - * RTE_FLOW_ITEM_TYPE_META.
> > > > > + * @warning
> > > > > + * @b EXPERIMENTAL: this structure may change without prior
> > > > > + notice
> > > > >   *
> > > > > - * Matches a specified metadata value.
> > > > > + * RTE_FLOW_ITEM_TYPE_META
> > > > > + *
> > > > > + * Matches a specified metadata value. On egress, metadata can be
> > > > > + set either by
> > > > > + * mbuf tx_metadata field with PKT_TX_METADATA flag or
> > > > > + * RTE_FLOW_ACTION_TYPE_SET_META. On ingress,
> > > > > + RTE_FLOW_ACTION_TYPE_SET_META sets
> > > > > + * metadata for a packet and the metadata will be reported via
> > > > > + mbuf metadata
> > > > > + * dynamic field with PKT_RX_DYNF_METADATA flag. The dynamic
> > mbuf
> > > > > + field must be
> > > > > + * registered in advance by rte_flow_dynf_metadata_register().
> > > > >   */
> > > > >  struct rte_flow_item_meta {
> > > > >  	rte_be32_t data;
> > > > > @@ -1813,6 +1824,13 @@ enum rte_flow_action_type {
> > > > >  	 * undefined behavior.
> > > > >  	 */
> > > > >  	RTE_FLOW_ACTION_TYPE_DEC_TCP_ACK,
> > > > > +
> > > > > +	/**
> > > > > +	 * Set metadata on ingress or egress path.
> > > > > +	 *
> > > > > +	 * See struct rte_flow_action_set_meta.
> > > > > +	 */
> > > > > +	RTE_FLOW_ACTION_TYPE_SET_META,
> > > > >  };
> > > > >
> > > > >  /**
> > > > > @@ -2300,6 +2318,43 @@ struct rte_flow_action_set_mac {
> > > > >  	uint8_t mac_addr[RTE_ETHER_ADDR_LEN];  };
> > > > >
> > > > > +/**
> > > > > + * @warning
> > > > > + * @b EXPERIMENTAL: this structure may change without prior
> > > > > +notice
> > > > > + *
> > > > > + * RTE_FLOW_ACTION_TYPE_SET_META
> > > > > + *
> > > > > + * Set metadata. Metadata set by mbuf tx_metadata field with
> > > > > + * PKT_TX_METADATA flag on egress will be overridden by this action.
> > > > > +On
> > > > > + * ingress, the metadata will be carried by mbuf metadata dynamic
> > > > > +field
> > > > > + * with PKT_RX_DYNF_METADATA flag if set.  The dynamic mbuf field
> > > > > +must be
> > > > > + * registered in advance by rte_flow_dynf_metadata_register().
> > > > > + *
> > > > > + * Altering partial bits is supported with mask. For bits which
> > > > > +have never
> > > > > + * been set, unpredictable value will be seen depending on driver
> > > > > + * implementation. For loopback/hairpin packet, metadata set on
> > > > > +Rx/Tx may
> > > > > + * or may not be propagated to the other path depending on HW
> > > > capability.
> > > > > + *
> > > > > + * RTE_FLOW_ITEM_TYPE_META matches metadata.
> > > > > + */
> > > > > +struct rte_flow_action_set_meta {
> > > > > +	rte_be32_t data;
> > > > > +	rte_be32_t mask;
> > > > > +};
> > > > > +
> > > > > +/* Mbuf dynamic field offset for metadata. */ extern int
> > > > > +rte_flow_dynf_metadata_offs;
> > > > > +
> > > > > +/* Mbuf dynamic field flag mask for metadata. */ extern uint64_t
> > > > > +rte_flow_dynf_metadata_mask;
> > > > > +
> > > > > +/* Mbuf dynamic field pointer for metadata. */ #define
> > > > > +RTE_FLOW_DYNF_METADATA(m) \
> > > > > +	RTE_MBUF_DYNFIELD((m), rte_flow_dynf_metadata_offs, uint32_t
> > > > *)
> > > > > +
> > > > > +/* Mbuf dynamic flag for metadata. */ #define
> > > > > +PKT_RX_DYNF_METADATA
> > > > > +(rte_flow_dynf_metadata_mask)
> > > > > +
> > > >
> > > > I wonder if helpers like this wouldn't be better, because they
> > > > combine the flag and the field:
> > > >
> > > > /**
> > > >  * Set metadata dynamic field and flag in mbuf.
> > > >  *
> > > >  * rte_flow_dynf_metadata_register() must have been called first.
> > > >  */
> > > > __rte_experimental
> > > > static inline void rte_mbuf_dyn_metadata_set(struct rte_mbuf *m,
> > > >                                        uint32_t metadata) {
> > > >        *RTE_MBUF_DYNFIELD(m, rte_flow_dynf_metadata_offs,
> > > >                        uint32_t *) = metadata;
> > > >        m->ol_flags |= rte_flow_dynf_metadata_mask; }
> > > Setting flag looks redundantly.
> > > What if driver just replaces the metadata and flag is already set?
> > > The other option - the flags (for set of fields) might be set in combinations.
> > > mbuf field is supposed to be engaged in datapath, performance is very
> > > critical, adding one more abstraction layer seems not to be relevant.
> > 
> > Ok, that was just a suggestion. Let's use your accessors if you fear a
> > performance impact.
> The simple example - mlx5 PMD has the rx_burst routine implemented
> with vector instructions, and it processes four packets at once. No need
> to check field availability four times, and the storing the metadata
> is the subject for further optimization with vector instructions.
> It is a bit difficult to provide common helpers to handle the metadata
> field due to extremely high optimization requirements.

ok, got it

> > Nevertheless I suggest to use static inline functions in place of macros if
> > possible. For RTE_MBUF_DYNFIELD(), I used a macro because it's the only
> > way to provide a type to cast the result. But in your case, you know it's a
> > uint32_t *.
> What If one needs to specify the address of field? Macro allows to do that,
> inline functions - do not. Packets may be processed in bizarre ways,
> for example in a batch, with vector instructions. OK, I'll provide
> the set/get routines, but I'm not sure whether will use ones in mlx5 code.
> In my opinion it just obscures the field nature. Field is just a field, AFAIU, 
> it is main idea of your patch, the way to handle dynamic field should be close
> to handling usual static fields, I think. Macro pointer follows this approach,
> routines - does not.

Well, I just think that:
  rte_mbuf_set_timestamp(m, 1234);
is more readable than:
  *RTE_MBUF_TIMESTAMP(m) = 1234;

Anyway, in your case, if you need to use vector instructions in the PMD,
I guess you will directly use the offset.

> > > Also, metadata is not feature of mbuf. It should have rte_flow prefix.
> > 
> > Yes, sure. The example derives from a test I've done, and I forgot to change
> > it.
> > 
> > 
> > > > /**
> > > >  * Get metadata dynamic field value in mbuf.
> > > >  *
> > > >  * rte_flow_dynf_metadata_register() must have been called first.
> > > >  */
> > > > __rte_experimental
> > > > static inline int rte_mbuf_dyn_metadata_get(const struct rte_mbuf *m,
> > > >                                        uint32_t *metadata) {
> > > >        if ((m->ol_flags & rte_flow_dynf_metadata_mask) == 0)
> > > >                return -1;
> > > What if metadata is 0xFFFFFFFF ?
> > > The checking of availability might embrace larger code block, so this
> > > might be not the best place to check availability.
> > >
> > > >        *metadata = *RTE_MBUF_DYNFIELD(m,
> > rte_flow_dynf_metadata_offs,
> > > >                                uint32_t *);
> > > >        return 0;
> > > > }
> > > >
> > > > /**
> > > >  * Delete the metadata dynamic flag in mbuf.
> > > >  *
> > > >  * rte_flow_dynf_metadata_register() must have been called first.
> > > >  */
> > > > __rte_experimental
> > > > static inline void rte_mbuf_dyn_metadata_del(struct rte_mbuf *m) {
> > > >        m->ol_flags &= ~rte_flow_dynf_metadata_mask; }
> > > >
> > > Sorry, I do not see the practical usecase for these helpers. In my opinion it
> > is just some kind of obscuration.
> > > They do replace the very simple code and introduce some risk of
> > performance impact.
> > >
> > > >
> > > > >  /*
> > > > >   * Definition of a single action.
> > > > >   *
> > > > > @@ -2533,6 +2588,32 @@ enum rte_flow_conv_op {  };
> > > > >
> > > > >  /**
> > > > > + * Check if mbuf dynamic field for metadata is registered.
> > > > > + *
> > > > > + * @return
> > > > > + *   True if registered, false otherwise.
> > > > > + */
> > > > > +__rte_experimental
> > > > > +static inline int
> > > > > +rte_flow_dynf_metadata_avail(void) {
> > > > > +	return !!rte_flow_dynf_metadata_mask; }
> > > >
> > > > _registered() instead of _avail() ?
> > > Accepted, sounds better.
> 
> Hmm, I changed my opinion - we already have
> rte_flow_dynf_metadata_register(void). Is it OK to have
> rte_flow_dynf_metadata_registerED(void) ?
> It would be easy to mistype. 

what about xxx_is_registered() ?
if you feel it's too long, ok, let's keep avail()

> 
> > >
> > > >
> > > > > +
> > > > > +/**
> > > > > + * Register mbuf dynamic field and flag for metadata.
> > > > > + *
> > > > > + * This function must be called prior to use SET_META action in
> > > > > +order to
> > > > > + * register the dynamic mbuf field. Otherwise, the data cannot be
> > > > > +delivered to
> > > > > + * application.
> > > > > + *
> > > > > + * @return
> > > > > + *   0 on success, a negative errno value otherwise and rte_errno is
> > set.
> > > > > + */
> > > > > +__rte_experimental
> > > > > +int
> > > > > +rte_flow_dynf_metadata_register(void);
> > > > > +
> > > > > +/**
> > > > >   * Check whether a flow rule can be created on a given port.
> > > > >   *
> > > > >   * The flow rule is validated for correctness and whether it
> > > > > could be accepted diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h
> > > > > b/lib/librte_mbuf/rte_mbuf_dyn.h index 6e2c816..4ff33ac 100644
> > > > > --- a/lib/librte_mbuf/rte_mbuf_dyn.h
> > > > > +++ b/lib/librte_mbuf/rte_mbuf_dyn.h
> > > > > @@ -160,4 +160,12 @@ int rte_mbuf_dynflag_lookup(const char
> > *name,
> > > > >   */
> > > > >  #define RTE_MBUF_DYNFIELD(m, offset, type) ((type)((uintptr_t)(m)
> > > > > +
> > > > > (offset)))
> > > > >
> > > > > +/**
> > > > > + * Flow metadata dynamic field definitions.
> > > > > + */
> > > > > +#define MBUF_DYNF_METADATA_NAME "flow-metadata"
> > > > > +#define MBUF_DYNF_METADATA_SIZE sizeof(uint32_t) #define
> > > > > +MBUF_DYNF_METADATA_ALIGN __alignof__(uint32_t) #define
> > > > > +MBUF_DYNF_METADATA_FLAGS 0
> > > >
> > > > If this flag is only to be used in rte_flow, it can stay in rte_flow.
> > > > The name should follow the function name conventions, I suggest
> > > > "rte_flow_metadata".
> > >
> > > The definitions:
> > > MBUF_DYNF_METADATA_NAME,
> > > MBUF_DYNF_METADATA_SIZE,
> > > MBUF_DYNF_METADATA_ALIGN
> > > are global. rte_flow proposes only minimal set tyo check and access
> > > the metadata. By knowing the field names applications would have the
> > > more flexibility in processing the fields, for example it allows to
> > > optimize the handling of multiple dynamic fields . The definition of
> > > metadata size allows to generate optimized code:
> > > #if MBUF_DYNF_METADATA_SIZE == sizeof(uint32)
> > > 	*RTE_MBUF_DYNFIELD(m) = get_metadata_32bit() #else
> > > 	*RTE_MBUF_DYNFIELD(m) = get_metadata_64bit() #endif
> > 
> > I don't see any reason why the same dynamic field could have different sizes,
> > I even think it could be dangerous. Your accessors suppose that the
> > metadata is a uint32_t. Having a compile-time option for that does not look
> > desirable.
> 
> I tried to provide maximal flexibility and It was just an example of the thing
> we could do with global definitions. If you think we do not need it - OK,
> let's do things simpler.
> 
> > 
> > Just a side note: we have to take care when adding a new *public* dynamic
> > field that it won't change in the future: the accessors are macros or static
> > inline functions, so they are embedded in the binaries.
> > This is probably something we should discuss and may not be when updating
> > the dpdk (as shared lib).
> 
> Yes, agree, defines just will not work correct in correct way and even break an ABI.
> As we decided - global metadata defines MBUF_DYNF_METADATA_xxxx
> should be removed.
> 
> > 
> > > MBUF_DYNF_METADATA_FLAGS flag is not used by rte_flow, this flag is
> > > related exclusively to dynamic mbuf  " Reserved for future use, must be 0".
> > > Would you like to drop this definition?
> > >
> > > >
> > > > If the flag is going to be used in several places in dpdk (rte_flow,
> > > > pmd, app, ...), I wonder if it shouldn't be defined it in rte_mbuf_dyn.c. I
> > mean:
> > > >
> > > > ====
> > > > /* rte_mbuf_dyn.c */
> > > > const struct rte_mbuf_dynfield rte_mbuf_dynfield_flow_metadata = {
> > > >    ...
> > > > };
> > > In this case we would make this descriptor global.
> > > It is no needed, because there Is no supposed any usage, but by
> > > rte_flow_dynf_metadata_register() only. The
> > 
> > Yes, in my example I wasn't sure it was going to be private to rte_flow (see
> > "If the flag is going to be used in several places in dpdk (rte_flow, pmd, app,
> > ...)").
> > 
> > So yes, I agree the struct should remain private.
> OK.
> 
> > 
> > 
> > > > int rte_mbuf_dynfield_flow_metadata_offset = -1; const struct
> > > > rte_mbuf_dynflag rte_mbuf_dynflag_flow_metadata = {
> > > >    ...
> > > > };
> > > > int rte_mbuf_dynflag_flow_metadata_bitnum = -1;
> > > >
> > > > int rte_mbuf_dyn_flow_metadata_register(void)
> > > > {
> > > > ...
> > > > }
> > > >
> > > > /* rte_mbuf_dyn.h */
> > > > extern const struct rte_mbuf_dynfield
> > > > rte_mbuf_dynfield_flow_metadata; extern int
> > > > rte_mbuf_dynfield_flow_metadata_offset;
> > > > extern const struct rte_mbuf_dynflag rte_mbuf_dynflag_flow_metadata;
> > > > extern int rte_mbuf_dynflag_flow_metadata_bitnum;
> > > >
> > > > ...helpers to set/get metadata...
> > > > ===
> > > >
> > > > Centralizing the definitions of non-private dynamic fields/flags in
> > > > rte_mbuf_dyn may help other people to reuse a field that is well
> > > > described if it match their use-case.
> > >
> > > Yes, centralizing is important, that's why MBUF_DYNF_METADATA_xxx
> > > placed in rte_mbuf_dyn.h. Do you think we should share the descriptors
> > either?
> > > I have no idea why someone (but rte_flow_dynf_metadata_register())
> > > might register metadata field directly.
> > 
> > If the field is private to rte_flow, yes, there is no need to share the "struct
> > rte_mbuf_dynfield". Even the rte_flow_dynf_metadata_register() could be
> > marked as internal, right?
> rte_flow_dynf_metadata_register() is intended to be called by application.
> Some applications might wish to engage metadata feature, some ones - not.
> 
> > 
> > One more question: I see the registration is done by
> > parse_vc_action_set_meta(). My understanding is that this function is not in
> > datapath, and is called when configuring rte_flow. Do you confirm?
> Rather it is called to configure application in general. If user sets metadata 
> (by issuing the appropriate command) it is assumed he/she would like
> the metadata on Rx side either. This is just for test purposes and it is not brilliant
> example of rte_flow_dynf_metadata_register() use case.
> 
> 
> > 
> > > > In your case, what is carried by metadata? Could it be reused by
> > > > others? I think some more description is needed.
> > > In my case, metadata is just opaquie rte_flow related 32-bit unsigned
> > > value provided by
> > > mlx5 hardrware in rx datapath. I have no guess whether someone wishes
> > to reuse.
> > 
> > What is the user supposed to do with this value? If it is hw-specific data, I
> > think the name of the mbuf field should include "MLX", and it should be
> > described.
> 
> Metadata are not HW specific at all - they neither control nor are produced
> by HW (abstracting from the flow engine is implemented in HW).
> Metadata are some opaque data, it is some kind of link between data
> path and flow space.  With metadata application may provide some per packet
> information to flow engine and get back some information from flow engine.
> it is generic concept, supposed to be neither HW-related nor vendor specific.

ok, understood, it's like a mark or tag.

> > Are these rte_flow actions somehow specific to mellanox drivers ?
> 
> AFAIK, currently it is going to be supported by mlx5 PMD only,
> but concept is common and is not vendor specific.
> 
> > 
> > > Brief summary of you comment (just to make sure I understood your
> > proposal in correct way):
> > > 1. drop all definitions MBUF_DYNF_METADATA_xxx, leave
> > > MBUF_DYNF_METADATA_NAME only 2. move the descriptor const struct
> > > rte_mbuf_dynfield desc_offs = {} to rte_mbuf_dyn.c and make it global
> > > 3. provide helpers to access metadata
> > >
> > > [1] and [2] look OK in general. Although I think these ones make code less
> > flexible, restrict the potential compile time options.
> > > For now it is rather theoretical question, if you insist on your
> > > approach - please, let me know, I'll address [1] and [2] and update.my
> > patch.
> > 
> > [1] I think the #define only adds an indirection, and I didn't see any
> >     perf constraint here.
> > [2] My previous comment was surely not clear, sorry. The code can stay
> >     in rte_flow.
> > 
> > > As for [3] - IMHO, the extra abstraction layer is not useful, and might be
> > even harmful.
> > > I tend not to complicate the code, at least, for now.
> > 
> > [3] ok for me
> > 
> > 
> > Thanks,
> > Olivier
> 
> With best regards, Slava

Thanks
  
Slava Ovsiienko Oct. 24, 2019, 12:30 p.m. UTC | #6
Hi Olivier,
> -----Original Message-----
> From: Olivier Matz <olivier.matz@6wind.com>
> Sent: Thursday, October 24, 2019 12:23
> To: Slava Ovsiienko <viacheslavo@mellanox.com>
> Cc: dev@dpdk.org; Matan Azrad <matan@mellanox.com>; Raslan
> Darawsheh <rasland@mellanox.com>; Thomas Monjalon
> <thomas@monjalon.net>
> Subject: Re: [PATCH v2] ethdev: extend flow metadata
>
> Hi Slava,
>
> On Thu, Oct 24, 2019 at 06:49:41AM +0000, Slava Ovsiienko wrote:
> > Hi, Olivier
> >
> > > > [snip]
> > > >
> > > > > > +int
> > > > > > +rte_flow_dynf_metadata_register(void)
> > > > > > +{
> > > > > > +   int offset;
> > > > > > +   int flag;
> > > > > > +
> > > > > > +   static const struct rte_mbuf_dynfield desc_offs = {
> > > > > > +           .name = MBUF_DYNF_METADATA_NAME,
> > > > > > +           .size = MBUF_DYNF_METADATA_SIZE,
> > > > > > +           .align = MBUF_DYNF_METADATA_ALIGN,
> > > > > > +           .flags = MBUF_DYNF_METADATA_FLAGS,
> > > > > > +   };
> > > > > > +   static const struct rte_mbuf_dynflag desc_flag = {
> > > > > > +           .name = MBUF_DYNF_METADATA_NAME,
> > > > > > +   };
> > > > >
> > > > > I don't see think we need #defines.
> > > > > You can directly use the name, sizeof() and __alignof__() here.
> > > > > If the information is used externally, the structure shall be
> > > > > made global non- static.
> > > >
> > > > The intention was to gather all dynamic fields definitions in one
> > > > place (in rte_mbuf_dyn.h).
> > >
> > > If the dynamic field is only going to be used inside rte_flow, I
> > > think there is no need to expose it in rte_mbuf_dyn.h.
> > > The other reason is I think the #define are just "passthrough", and
> > > do not really bring added value, just an indirection.
> > >
> > > > It would be easy to see all fields in one sight (some might be
> > > > shared, some might be mutual exclusive, estimate mbuf space,
> > > > required by various features, etc.). So, we can't just fill
> > > > structure fields with simple sizeof() and alignof() instead of
> > > > definitions (the field parameters must be defined once).
> > > >
> > > > I do not see the reasons to make table global. I would prefer the
> > > definitions.
> > > > - the definitions are compile time processing (table fields are
> > > > runtime), it provides code optimization and better performance.
> > >
> > > There is indeed no need to make the table global if the field is
> > > private to rte_flow. About better performance, my understanding is
> > > that it would only impact registration, am I missing something?
> >
> > OK, I thought about some opportunity to allow application to register
> > field directly, bypassing rte_flow_dynf_metadata_register(). So either
> > definitions or field description table was supposed to be global.
> > I agree, let's do not complicate the matter, I'll will make global the
> > metadata field name definition only - in the rte_mbuf_dyn.h in order
> > just to have some centralizing point.
>
> By reading your mail, things are also clearer to me about which parts need
> access to this field.
>
> To summarize what I understand:
> - dyn field registration is done in rte_flow lib when configuring
>   a flow using META
> - the dynamic field will never be get/set in a mbuf by a PMD or rte_flow
>   before a flow using META is added

In testpmd with current patch - yes, and this is just a sample. The common practice of
enabling metadata may differ. If application sees some PMD supporting RX/TX_METADATA
offload and it desires to receive metadata - it registers the dynamic field for ones.

> One question then: why would you need the dyn field name to be exported?
> Does the PMD need to know if the field is registered with a lookup or
> something like that? If yes, can you detail why?

I think it might happen the PMD does.
Right now I have an issue with mlx5 PMD compiled as shared library.
The global variables from rte_flow.c is not seen in PMD (just because I forget
to add one into the .map file). The way dynamic data are linked is system
dependent and it might be needed to optimize. I mean - in some
cases PMD might need to do lookup explicitly and use local copies
of offset and mask. So' I'd prefer to see field descriptor be
global visible. Yes, PMD can take the offset/flag directly
from the rte_flow variables and cache ones internally,
so global descriptor is just some kind of insurance.

As for the name - it is less critical, it may
be just useful for various log/debug messages and so on. The other
reason to have name definition is to put it in the "centralizing point"
somewhere in the rte_mbuf_dyn.h, to gather all names together and
eliminate the name conflicts (yes,  the documented name convention
reduces the risk, but it just convenient to see all fields names
within one sight - it is easy to determine which are supported, etc).

> >
> > > >
> > > > > > +
> > > > > > +   offset = rte_mbuf_dynfield_register(&desc_offs);
> > > > > > +   if (offset < 0)
> > > > > > +           goto error;
> > > > > > +   flag = rte_mbuf_dynflag_register(&desc_flag);
> > > > > > +   if (flag < 0)
> > > > > > +           goto error;
> > > > > > +   rte_flow_dynf_metadata_offs = offset;
> > > > > > +   rte_flow_dynf_metadata_mask = (1ULL << flag);
> > > > > > +   return 0;
> > > > > > +
> > > > > > +error:
> > > > > > +   rte_flow_dynf_metadata_offs = -1;
> > > > > > +   rte_flow_dynf_metadata_mask = 0ULL;
> > > > > > +   return -rte_errno;
> > > > > > +}
> > > > > > +
> > > > > >  static int
> > > > > >  flow_err(uint16_t port_id, int ret, struct rte_flow_error
> > > > > > *error) { diff --git a/lib/librte_ethdev/rte_flow.h
> > > > > > b/lib/librte_ethdev/rte_flow.h index 391a44a..a27e619 100644
> > > > > > --- a/lib/librte_ethdev/rte_flow.h
> > > > > > +++ b/lib/librte_ethdev/rte_flow.h
> > > > > > @@ -27,6 +27,8 @@
> > > > > >  #include <rte_udp.h>
> > > > > >  #include <rte_byteorder.h>
> > > > > >  #include <rte_esp.h>
> > > > > > +#include <rte_mbuf.h>
> > > > > > +#include <rte_mbuf_dyn.h>
> > > > > >
> > > > > >  #ifdef __cplusplus
> > > > > >  extern "C" {
> > > > > > @@ -417,7 +419,8 @@ enum rte_flow_item_type {
> > > > > >     /**
> > > > > >      * [META]
> > > > > >      *
> > > > > > -    * Matches a metadata value specified in mbuf metadata
> field.
> > > > > > +    * Matches a metadata value.
> > > > > > +    *
> > > > > >      * See struct rte_flow_item_meta.
> > > > > >      */
> > > > > >     RTE_FLOW_ITEM_TYPE_META,
> > > > > > @@ -1213,9 +1216,17 @@ struct
> > > rte_flow_item_icmp6_nd_opt_tla_eth {
> > > > > > #endif
> > > > > >
> > > > > >  /**
> > > > > > - * RTE_FLOW_ITEM_TYPE_META.
> > > > > > + * @warning
> > > > > > + * @b EXPERIMENTAL: this structure may change without prior
> > > > > > + notice
> > > > > >   *
> > > > > > - * Matches a specified metadata value.
> > > > > > + * RTE_FLOW_ITEM_TYPE_META
> > > > > > + *
> > > > > > + * Matches a specified metadata value. On egress, metadata
> > > > > > + can be set either by
> > > > > > + * mbuf tx_metadata field with PKT_TX_METADATA flag or
> > > > > > + * RTE_FLOW_ACTION_TYPE_SET_META. On ingress,
> > > > > > + RTE_FLOW_ACTION_TYPE_SET_META sets
> > > > > > + * metadata for a packet and the metadata will be reported
> > > > > > + via mbuf metadata
> > > > > > + * dynamic field with PKT_RX_DYNF_METADATA flag. The dynamic
> > > mbuf
> > > > > > + field must be
> > > > > > + * registered in advance by rte_flow_dynf_metadata_register().
> > > > > >   */
> > > > > >  struct rte_flow_item_meta {
> > > > > >     rte_be32_t data;
> > > > > > @@ -1813,6 +1824,13 @@ enum rte_flow_action_type {
> > > > > >      * undefined behavior.
> > > > > >      */
> > > > > >     RTE_FLOW_ACTION_TYPE_DEC_TCP_ACK,
> > > > > > +
> > > > > > +   /**
> > > > > > +    * Set metadata on ingress or egress path.
> > > > > > +    *
> > > > > > +    * See struct rte_flow_action_set_meta.
> > > > > > +    */
> > > > > > +   RTE_FLOW_ACTION_TYPE_SET_META,
> > > > > >  };
> > > > > >
> > > > > >  /**
> > > > > > @@ -2300,6 +2318,43 @@ struct rte_flow_action_set_mac {
> > > > > >     uint8_t mac_addr[RTE_ETHER_ADDR_LEN];  };
> > > > > >
> > > > > > +/**
> > > > > > + * @warning
> > > > > > + * @b EXPERIMENTAL: this structure may change without prior
> > > > > > +notice
> > > > > > + *
> > > > > > + * RTE_FLOW_ACTION_TYPE_SET_META
> > > > > > + *
> > > > > > + * Set metadata. Metadata set by mbuf tx_metadata field with
> > > > > > + * PKT_TX_METADATA flag on egress will be overridden by this
> action.
> > > > > > +On
> > > > > > + * ingress, the metadata will be carried by mbuf metadata
> > > > > > +dynamic field
> > > > > > + * with PKT_RX_DYNF_METADATA flag if set.  The dynamic mbuf
> > > > > > +field must be
> > > > > > + * registered in advance by rte_flow_dynf_metadata_register().
> > > > > > + *
> > > > > > + * Altering partial bits is supported with mask. For bits
> > > > > > +which have never
> > > > > > + * been set, unpredictable value will be seen depending on
> > > > > > +driver
> > > > > > + * implementation. For loopback/hairpin packet, metadata set
> > > > > > +on Rx/Tx may
> > > > > > + * or may not be propagated to the other path depending on HW
> > > > > capability.
> > > > > > + *
> > > > > > + * RTE_FLOW_ITEM_TYPE_META matches metadata.
> > > > > > + */
> > > > > > +struct rte_flow_action_set_meta {
> > > > > > +   rte_be32_t data;
> > > > > > +   rte_be32_t mask;
> > > > > > +};
> > > > > > +
> > > > > > +/* Mbuf dynamic field offset for metadata. */ extern int
> > > > > > +rte_flow_dynf_metadata_offs;
> > > > > > +
> > > > > > +/* Mbuf dynamic field flag mask for metadata. */ extern
> > > > > > +uint64_t rte_flow_dynf_metadata_mask;
> > > > > > +
> > > > > > +/* Mbuf dynamic field pointer for metadata. */ #define
> > > > > > +RTE_FLOW_DYNF_METADATA(m) \
> > > > > > +   RTE_MBUF_DYNFIELD((m), rte_flow_dynf_metadata_offs,
> uint32_t
> > > > > *)
> > > > > > +
> > > > > > +/* Mbuf dynamic flag for metadata. */ #define
> > > > > > +PKT_RX_DYNF_METADATA
> > > > > > +(rte_flow_dynf_metadata_mask)
> > > > > > +
> > > > >
> > > > > I wonder if helpers like this wouldn't be better, because they
> > > > > combine the flag and the field:
> > > > >
> > > > > /**
> > > > >  * Set metadata dynamic field and flag in mbuf.
> > > > >  *
> > > > >  * rte_flow_dynf_metadata_register() must have been called first.
> > > > >  */
> > > > > __rte_experimental
> > > > > static inline void rte_mbuf_dyn_metadata_set(struct rte_mbuf *m,
> > > > >                                        uint32_t metadata) {
> > > > >        *RTE_MBUF_DYNFIELD(m, rte_flow_dynf_metadata_offs,
> > > > >                        uint32_t *) = metadata;
> > > > >        m->ol_flags |= rte_flow_dynf_metadata_mask; }
> > > > Setting flag looks redundantly.
> > > > What if driver just replaces the metadata and flag is already set?
> > > > The other option - the flags (for set of fields) might be set in
> combinations.
> > > > mbuf field is supposed to be engaged in datapath, performance is
> > > > very critical, adding one more abstraction layer seems not to be
> relevant.
> > >
> > > Ok, that was just a suggestion. Let's use your accessors if you fear
> > > a performance impact.
> > The simple example - mlx5 PMD has the rx_burst routine implemented
> > with vector instructions, and it processes four packets at once. No
> > need to check field availability four times, and the storing the
> > metadata is the subject for further optimization with vector instructions.
> > It is a bit difficult to provide common helpers to handle the metadata
> > field due to extremely high optimization requirements.
>
> ok, got it
>
> > > Nevertheless I suggest to use static inline functions in place of
> > > macros if possible. For RTE_MBUF_DYNFIELD(), I used a macro because
> > > it's the only way to provide a type to cast the result. But in your
> > > case, you know it's a uint32_t *.
> > What If one needs to specify the address of field? Macro allows to do
> > that, inline functions - do not. Packets may be processed in bizarre
> > ways, for example in a batch, with vector instructions. OK, I'll
> > provide the set/get routines, but I'm not sure whether will use ones in mlx5
> code.
> > In my opinion it just obscures the field nature. Field is just a
> > field, AFAIU, it is main idea of your patch, the way to handle dynamic
> > field should be close to handling usual static fields, I think. Macro
> > pointer follows this approach, routines - does not.
>
> Well, I just think that:
>   rte_mbuf_set_time_stamp(m, 1234);
> is more readable than:
>   *RTE_MBUF_TIMESTAMP(m) = 1234;

I implemented these metadata set/get in v3, as you proposed.
But, mlx5 PMD does not use these ones (possible, I'll refactor some occurrences)
BTW, I did not find any rte_mbuf_set_xxxx() implemented? Did I miss smth?
Should we start with metadata field specifically? 😊

>
> Anyway, in your case, if you need to use vector instructions in the PMD, I
> guess you will directly use the offset.

Right.

>
> > > > Also, metadata is not feature of mbuf. It should have rte_flow prefix.
> > >
> > > Yes, sure. The example derives from a test I've done, and I forgot
> > > to change it.
> > >
> > >
> > > > > /**
> > > > >  * Get metadata dynamic field value in mbuf.
> > > > >  *
> > > > >  * rte_flow_dynf_metadata_register() must have been called first.
> > > > >  */
> > > > > __rte_experimental
> > > > > static inline int rte_mbuf_dyn_metadata_get(const struct rte_mbuf
> *m,
> > > > >                                        uint32_t *metadata) {
> > > > >        if ((m->ol_flags & rte_flow_dynf_metadata_mask) == 0)
> > > > >                return -1;
> > > > What if metadata is 0xFFFFFFFF ?
> > > > The checking of availability might embrace larger code block, so
> > > > this might be not the best place to check availability.
> > > >
> > > > >        *metadata = *RTE_MBUF_DYNFIELD(m,
> > > rte_flow_dynf_metadata_offs,
> > > > >                                uint32_t *);
> > > > >        return 0;
> > > > > }
> > > > >
> > > > > /**
> > > > >  * Delete the metadata dynamic flag in mbuf.
> > > > >  *
> > > > >  * rte_flow_dynf_metadata_register() must have been called first.
> > > > >  */
> > > > > __rte_experimental
> > > > > static inline void rte_mbuf_dyn_metadata_del(struct rte_mbuf *m) {
> > > > >        m->ol_flags &= ~rte_flow_dynf_metadata_mask; }
> > > > >
> > > > Sorry, I do not see the practical usecase for these helpers. In my
> > > > opinion it
> > > is just some kind of obscuration.
> > > > They do replace the very simple code and introduce some risk of
> > > performance impact.
> > > >
> > > > >
> > > > > >  /*
> > > > > >   * Definition of a single action.
> > > > > >   *
> > > > > > @@ -2533,6 +2588,32 @@ enum rte_flow_conv_op {  };
> > > > > >
> > > > > >  /**
> > > > > > + * Check if mbuf dynamic field for metadata is registered.
> > > > > > + *
> > > > > > + * @return
> > > > > > + *   True if registered, false otherwise.
> > > > > > + */
> > > > > > +__rte_experimental
> > > > > > +static inline int
> > > > > > +rte_flow_dynf_metadata_avail(void) {
> > > > > > +   return !!rte_flow_dynf_metadata_mask; }
> > > > >
> > > > > _registered() instead of _avail() ?
> > > > Accepted, sounds better.
> >
> > Hmm, I changed my opinion - we already have
> > rte_flow_dynf_metadata_register(void). Is it OK to have
> > rte_flow_dynf_metadata_registerED(void) ?
> > It would be easy to mistype.
>
> what about xxx_is_registered() ?
It seems to be not much better, sorry ☹
> if you feel it's too long, ok, let's keep avail()
Actually, I tend to complete with "_available", but it is really long.

> >
> > > >
> > > > >
> > > > > > +
> > > > > > +/**
> > > > > > + * Register mbuf dynamic field and flag for metadata.
> > > > > > + *
> > > > > > + * This function must be called prior to use SET_META action
> > > > > > +in order to
> > > > > > + * register the dynamic mbuf field. Otherwise, the data
> > > > > > +cannot be delivered to
> > > > > > + * application.
> > > > > > + *
> > > > > > + * @return
> > > > > > + *   0 on success, a negative errno value otherwise and rte_errno is
> > > set.
> > > > > > + */
> > > > > > +__rte_experimental
> > > > > > +int
> > > > > > +rte_flow_dynf_metadata_register(void);
> > > > > > +
> > > > > > +/**
> > > > > >   * Check whether a flow rule can be created on a given port.
> > > > > >   *
> > > > > >   * The flow rule is validated for correctness and whether it
> > > > > > could be accepted diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h
> > > > > > b/lib/librte_mbuf/rte_mbuf_dyn.h index 6e2c816..4ff33ac 100644
> > > > > > --- a/lib/librte_mbuf/rte_mbuf_dyn.h
> > > > > > +++ b/lib/librte_mbuf/rte_mbuf_dyn.h
> > > > > > @@ -160,4 +160,12 @@ int rte_mbuf_dynflag_lookup(const char
> > > *name,
> > > > > >   */
> > > > > >  #define RTE_MBUF_DYNFIELD(m, offset, type)
> > > > > > ((type)((uintptr_t)(m)
> > > > > > +
> > > > > > (offset)))
> > > > > >
> > > > > > +/**
> > > > > > + * Flow metadata dynamic field definitions.
> > > > > > + */
> > > > > > +#define MBUF_DYNF_METADATA_NAME "flow-metadata"
> > > > > > +#define MBUF_DYNF_METADATA_SIZE sizeof(uint32_t) #define
> > > > > > +MBUF_DYNF_METADATA_ALIGN __alignof__(uint32_t) #define
> > > > > > +MBUF_DYNF_METADATA_FLAGS 0
> > > > >
> > > > > If this flag is only to be used in rte_flow, it can stay in rte_flow.
> > > > > The name should follow the function name conventions, I suggest
> > > > > "rte_flow_metadata".
> > > >
> > > > The definitions:
> > > > MBUF_DYNF_METADATA_NAME,
> > > > MBUF_DYNF_METADATA_SIZE,
> > > > MBUF_DYNF_METADATA_ALIGN
> > > > are global. rte_flow proposes only minimal set tyo check and
> > > > access the metadata. By knowing the field names applications would
> > > > have the more flexibility in processing the fields, for example it
> > > > allows to optimize the handling of multiple dynamic fields . The
> > > > definition of metadata size allows to generate optimized code:
> > > > #if MBUF_DYNF_METADATA_SIZE == sizeof(uint32)
> > > >         *RTE_MBUF_DYNFIELD(m) = get_metadata_32bit() #else
> > > >         *RTE_MBUF_DYNFIELD(m) = get_metadata_64bit() #endif
> > >
> > > I don't see any reason why the same dynamic field could have
> > > different sizes, I even think it could be dangerous. Your accessors
> > > suppose that the metadata is a uint32_t. Having a compile-time
> > > option for that does not look desirable.
> >
> > I tried to provide maximal flexibility and It was just an example of
> > the thing we could do with global definitions. If you think we do not
> > need it - OK, let's do things simpler.
> >
> > >
> > > Just a side note: we have to take care when adding a new *public*
> > > dynamic field that it won't change in the future: the accessors are
> > > macros or static inline functions, so they are embedded in the binaries.
> > > This is probably something we should discuss and may not be when
> > > updating the dpdk (as shared lib).
> >
> > Yes, agree, defines just will not work correct in correct way and even break
> an ABI.
> > As we decided - global metadata defines MBUF_DYNF_METADATA_xxxx
> should
> > be removed.
> >
> > >
> > > > MBUF_DYNF_METADATA_FLAGS flag is not used by rte_flow, this flag
> > > > is related exclusively to dynamic mbuf  " Reserved for future use, must
> be 0".
> > > > Would you like to drop this definition?
> > > >
> > > > >
> > > > > If the flag is going to be used in several places in dpdk
> > > > > (rte_flow, pmd, app, ...), I wonder if it shouldn't be defined
> > > > > it in rte_mbuf_dyn.c. I
> > > mean:
> > > > >
> > > > > ====
> > > > > /* rte_mbuf_dyn.c */
> > > > > const struct rte_mbuf_dynfield rte_mbuf_dynfield_flow_metadata = {
> > > > >    ...
> > > > > };
> > > > In this case we would make this descriptor global.
> > > > It is no needed, because there Is no supposed any usage, but by
> > > > rte_flow_dynf_metadata_register() only. The
> > >
> > > Yes, in my example I wasn't sure it was going to be private to
> > > rte_flow (see "If the flag is going to be used in several places in
> > > dpdk (rte_flow, pmd, app, ...)").
> > >
> > > So yes, I agree the struct should remain private.
> > OK.
> >
> > >
> > >
> > > > > int rte_mbuf_dynfield_flow_metadata_offset = -1; const struct
> > > > > rte_mbuf_dynflag rte_mbuf_dynflag_flow_metadata = {
> > > > >    ...
> > > > > };
> > > > > int rte_mbuf_dynflag_flow_metadata_bitnum = -1;
> > > > >
> > > > > int rte_mbuf_dyn_flow_metadata_register(void)
> > > > > {
> > > > > ...
> > > > > }
> > > > >
> > > > > /* rte_mbuf_dyn.h */
> > > > > extern const struct rte_mbuf_dynfield
> > > > > rte_mbuf_dynfield_flow_metadata; extern int
> > > > > rte_mbuf_dynfield_flow_metadata_offset;
> > > > > extern const struct rte_mbuf_dynflag
> > > > > rte_mbuf_dynflag_flow_metadata; extern int
> > > > > rte_mbuf_dynflag_flow_metadata_bitnum;
> > > > >
> > > > > ...helpers to set/get metadata...
> > > > > ===
> > > > >
> > > > > Centralizing the definitions of non-private dynamic fields/flags
> > > > > in rte_mbuf_dyn may help other people to reuse a field that is
> > > > > well described if it match their use-case.
> > > >
> > > > Yes, centralizing is important, that's why MBUF_DYNF_METADATA_xxx
> > > > placed in rte_mbuf_dyn.h. Do you think we should share the
> > > > descriptors
> > > either?
> > > > I have no idea why someone (but rte_flow_dynf_metadata_register())
> > > > might register metadata field directly.
> > >
> > > If the field is private to rte_flow, yes, there is no need to share
> > > the "struct rte_mbuf_dynfield". Even the
> > > rte_flow_dynf_metadata_register() could be marked as internal, right?
> > rte_flow_dynf_metadata_register() is intended to be called by application.
> > Some applications might wish to engage metadata feature, some ones -
> not.
> >
> > >
> > > One more question: I see the registration is done by
> > > parse_vc_action_set_meta(). My understanding is that this function
> > > is not in datapath, and is called when configuring rte_flow. Do you
> confirm?
> > Rather it is called to configure application in general. If user sets
> > metadata (by issuing the appropriate command) it is assumed he/she
> > would like the metadata on Rx side either. This is just for test
> > purposes and it is not brilliant example of
> rte_flow_dynf_metadata_register() use case.
> >
> >
> > >
> > > > > In your case, what is carried by metadata? Could it be reused by
> > > > > others? I think some more description is needed.
> > > > In my case, metadata is just opaquie rte_flow related 32-bit
> > > > unsigned value provided by
> > > > mlx5 hardrware in rx datapath. I have no guess whether someone
> > > > wishes
> > > to reuse.
> > >
> > > What is the user supposed to do with this value? If it is
> > > hw-specific data, I think the name of the mbuf field should include
> > > "MLX", and it should be described.
> >
> > Metadata are not HW specific at all - they neither control nor are
> > produced by HW (abstracting from the flow engine is implemented in HW).
> > Metadata are some opaque data, it is some kind of link between data
> > path and flow space.  With metadata application may provide some per
> > packet information to flow engine and get back some information from
> flow engine.
> > it is generic concept, supposed to be neither HW-related nor vendor
> specific.
>
> ok, understood, it's like a mark or tag.
>
> > > Are these rte_flow actions somehow specific to mellanox drivers ?
> >
> > AFAIK, currently it is going to be supported by mlx5 PMD only, but
> > concept is common and is not vendor specific.
> >
> > >
> > > > Brief summary of you comment (just to make sure I understood your
> > > proposal in correct way):
> > > > 1. drop all definitions MBUF_DYNF_METADATA_xxx, leave
> > > > MBUF_DYNF_METADATA_NAME only 2. move the descriptor const
> struct
> > > > rte_mbuf_dynfield desc_offs = {} to rte_mbuf_dyn.c and make it
> > > > global 3. provide helpers to access metadata
> > > >
> > > > [1] and [2] look OK in general. Although I think these ones make
> > > > code less
> > > flexible, restrict the potential compile time options.
> > > > For now it is rather theoretical question, if you insist on your
> > > > approach - please, let me know, I'll address [1] and [2] and
> > > > update.my
> > > patch.
> > >
> > > [1] I think the #define only adds an indirection, and I didn't see any
> > >     perf constraint here.
> > > [2] My previous comment was surely not clear, sorry. The code can stay
> > >     in rte_flow.
> > >
> > > > As for [3] - IMHO, the extra abstraction layer is not useful, and
> > > > might be
> > > even harmful.
> > > > I tend not to complicate the code, at least, for now.
> > >
> > > [3] ok for me
> > >
> > >
> > > Thanks,
> > > Olivier
> >
 With best regards, Slava
  

Patch

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index b26b8bf..078f256 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -305,6 +305,9 @@  enum index {
 	ACTION_DEC_TCP_ACK_VALUE,
 	ACTION_RAW_ENCAP,
 	ACTION_RAW_DECAP,
+	ACTION_SET_META,
+	ACTION_SET_META_DATA,
+	ACTION_SET_META_MASK,
 };
 
 /** Maximum size for pattern in struct rte_flow_item_raw. */
@@ -994,6 +997,7 @@  struct parse_action_priv {
 	ACTION_DEC_TCP_ACK,
 	ACTION_RAW_ENCAP,
 	ACTION_RAW_DECAP,
+	ACTION_SET_META,
 	ZERO,
 };
 
@@ -1180,6 +1184,13 @@  struct parse_action_priv {
 	ZERO,
 };
 
+static const enum index action_set_meta[] = {
+	ACTION_SET_META_DATA,
+	ACTION_SET_META_MASK,
+	ACTION_NEXT,
+	ZERO,
+};
+
 static int parse_set_raw_encap_decap(struct context *, const struct token *,
 				     const char *, unsigned int,
 				     void *, unsigned int);
@@ -1238,6 +1249,10 @@  static int parse_vc_action_raw_encap(struct context *,
 static int parse_vc_action_raw_decap(struct context *,
 				     const struct token *, const char *,
 				     unsigned int, void *, unsigned int);
+static int parse_vc_action_set_meta(struct context *ctx,
+				    const struct token *token, const char *str,
+				    unsigned int len, void *buf,
+				    unsigned int size);
 static int parse_destroy(struct context *, const struct token *,
 			 const char *, unsigned int,
 			 void *, unsigned int);
@@ -3222,7 +3237,31 @@  static int comp_vc_action_rss_queue(struct context *, const struct token *,
 		.help = "set raw decap data",
 		.next = NEXT(next_item),
 		.call = parse_set_raw_encap_decap,
-	}
+	},
+	[ACTION_SET_META] = {
+		.name = "set_meta",
+		.help = "set metadata",
+		.priv = PRIV_ACTION(SET_META,
+			sizeof(struct rte_flow_action_set_meta)),
+		.next = NEXT(action_set_meta),
+		.call = parse_vc_action_set_meta,
+	},
+	[ACTION_SET_META_DATA] = {
+		.name = "data",
+		.help = "metadata value",
+		.next = NEXT(action_set_meta, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY_HTON
+			     (struct rte_flow_action_set_meta, data)),
+		.call = parse_vc_conf,
+	},
+	[ACTION_SET_META_MASK] = {
+		.name = "mask",
+		.help = "mask for metadata value",
+		.next = NEXT(action_set_meta, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY_HTON
+			     (struct rte_flow_action_set_meta, mask)),
+		.call = parse_vc_conf,
+	},
 };
 
 /** Remove and return last entry from argument stack. */
@@ -4592,6 +4631,22 @@  static int comp_vc_action_rss_queue(struct context *, const struct token *,
 	return ret;
 }
 
+static int
+parse_vc_action_set_meta(struct context *ctx, const struct token *token,
+			 const char *str, unsigned int len, void *buf,
+			 unsigned int size)
+{
+	int ret;
+
+	ret = parse_vc(ctx, token, str, len, buf, size);
+	if (ret < 0)
+		return ret;
+	ret = rte_flow_dynf_metadata_register();
+	if (ret < 0)
+		return -1;
+	return len;
+}
+
 /** Parse tokens for destroy command. */
 static int
 parse_destroy(struct context *ctx, const struct token *token,
diff --git a/app/test-pmd/util.c b/app/test-pmd/util.c
index 1570270..39ff07b 100644
--- a/app/test-pmd/util.c
+++ b/app/test-pmd/util.c
@@ -81,6 +81,11 @@ 
 			       mb->vlan_tci, mb->vlan_tci_outer);
 		else if (ol_flags & PKT_RX_VLAN)
 			printf(" - VLAN tci=0x%x", mb->vlan_tci);
+		if (ol_flags & PKT_TX_METADATA)
+			printf(" - Tx metadata: 0x%x", mb->tx_metadata);
+		if (ol_flags & PKT_RX_DYNF_METADATA)
+			printf(" - Rx metadata: 0x%x",
+			       *RTE_FLOW_DYNF_METADATA(mb));
 		if (mb->packet_type) {
 			rte_get_ptype_name(mb->packet_type, buf, sizeof(buf));
 			printf(" - hw ptype: %s", buf);
diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index ff6fb11..45fc041 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -658,6 +658,32 @@  the physical device, with virtual groups in the PMD or not at all.
    | ``mask`` | ``id``   | zeroed to match any value |
    +----------+----------+---------------------------+
 
+Item: ``META``
+^^^^^^^^^^^^^^^^^
+
+Matches 32 bit metadata item set.
+
+On egress, metadata can be set either by mbuf metadata field with
+PKT_TX_METADATA flag or ``SET_META`` action. On ingress, ``SET_META``
+action sets metadata for a packet and the metadata will be reported via
+``metadata`` dynamic field of ``rte_mbuf`` with PKT_RX_DYNF_METADATA flag.
+
+- Default ``mask`` matches the specified Rx metadata value.
+
+.. _table_rte_flow_item_meta:
+
+.. table:: META
+
+   +----------+----------+---------------------------------------+
+   | Field    | Subfield | Value                                 |
+   +==========+==========+=======================================+
+   | ``spec`` | ``data`` | 32 bit metadata value                 |
+   +----------+----------+---------------------------------------+
+   | ``last`` | ``data`` | upper range value                     |
+   +----------+----------+---------------------------------------+
+   | ``mask`` | ``data`` | bit-mask applies to "spec" and "last" |
+   +----------+----------+---------------------------------------+
+
 Data matching item types
 ~~~~~~~~~~~~~~~~~~~~~~~~
 
@@ -2415,6 +2441,37 @@  Value to decrease TCP acknowledgment number by is a big-endian 32 bit integer.
 
 Using this action on non-matching traffic will result in undefined behavior.
 
+Action: ``SET_META``
+^^^^^^^^^^^^^^^^^^^^^^^
+
+Set metadata. Item ``META`` matches metadata.
+
+Metadata set by mbuf metadata field with PKT_TX_METADATA flag on egress will be
+overridden by this action. On ingress, the metadata will be carried by
+``metadata`` dynamic field of ``rte_mbuf`` which can be accessed by
+``RTE_FLOW_DYNF_METADATA()``. PKT_RX_DYNF_METADATA flag will be set along
+with the data.
+
+The mbuf dynamic field must be registered by calling
+``rte_flow_dynf_metadata_register()`` prior to use ``SET_META`` action.
+
+Altering partial bits is supported with ``mask``. For bits which have never been
+set, unpredictable value will be seen depending on driver implementation. For
+loopback/hairpin packet, metadata set on Rx/Tx may or may not be propagated to
+the other path depending on HW capability.
+
+.. _table_rte_flow_action_set_meta:
+
+.. table:: SET_META
+
+   +----------+----------------------------+
+   | Field    | Value                      |
+   +==========+============================+
+   | ``data`` | 32 bit metadata value      |
+   +----------+----------------------------+
+   | ``mask`` | bit-mask applies to "data" |
+   +----------+----------------------------+
+
 Negative types
 ~~~~~~~~~~~~~~
 
diff --git a/doc/guides/rel_notes/release_19_11.rst b/doc/guides/rel_notes/release_19_11.rst
index 8921cfd..904746e 100644
--- a/doc/guides/rel_notes/release_19_11.rst
+++ b/doc/guides/rel_notes/release_19_11.rst
@@ -95,6 +95,14 @@  New Features
   for specific offload features, where adding a static field or flag
   in the mbuf is not justified.
 
+* **Extended metadata support in rte_flow.**
+
+  Flow metadata is extended to both Rx and Tx.
+
+  * Tx metadata can also be set by SET_META action of rte_flow.
+  * Rx metadata is delivered to host via a dynamic field of ``rte_mbuf`` with
+    PKT_RX_DYNF_METADATA.
+
 Removed Items
 -------------
 
diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
index d937fb4..9a6432c 100644
--- a/lib/librte_ethdev/rte_ethdev.h
+++ b/lib/librte_ethdev/rte_ethdev.h
@@ -1013,7 +1013,6 @@  struct rte_eth_conf {
 #define DEV_RX_OFFLOAD_KEEP_CRC		0x00010000
 #define DEV_RX_OFFLOAD_SCTP_CKSUM	0x00020000
 #define DEV_RX_OFFLOAD_OUTER_UDP_CKSUM  0x00040000
-
 #define DEV_RX_OFFLOAD_CHECKSUM (DEV_RX_OFFLOAD_IPV4_CKSUM | \
 				 DEV_RX_OFFLOAD_UDP_CKSUM | \
 				 DEV_RX_OFFLOAD_TCP_CKSUM)
diff --git a/lib/librte_ethdev/rte_ethdev_version.map b/lib/librte_ethdev/rte_ethdev_version.map
index 6df42a4..3d9cafc 100644
--- a/lib/librte_ethdev/rte_ethdev_version.map
+++ b/lib/librte_ethdev/rte_ethdev_version.map
@@ -283,4 +283,10 @@  EXPERIMENTAL {
 
 	# added in 19.08
 	rte_eth_read_clock;
+
+	# added in 19.11
+	rte_flow_dynf_metadata_offs;
+	rte_flow_dynf_metadata_flag;
+	rte_flow_dynf_metadata_avail;
+	rte_flow_dynf_metadata_register;
 };
diff --git a/lib/librte_ethdev/rte_flow.c b/lib/librte_ethdev/rte_flow.c
index cc03b15..9cbda75 100644
--- a/lib/librte_ethdev/rte_flow.c
+++ b/lib/librte_ethdev/rte_flow.c
@@ -12,10 +12,18 @@ 
 #include <rte_errno.h>
 #include <rte_branch_prediction.h>
 #include <rte_string_fns.h>
+#include <rte_mbuf.h>
+#include <rte_mbuf_dyn.h>
 #include "rte_ethdev.h"
 #include "rte_flow_driver.h"
 #include "rte_flow.h"
 
+/* Mbuf dynamic field name for metadata. */
+int rte_flow_dynf_metadata_offs = -1;
+
+/* Mbuf dynamic field flag bit number for metadata. */
+uint64_t rte_flow_dynf_metadata_mask;
+
 /**
  * Flow elements description tables.
  */
@@ -153,8 +161,41 @@  struct rte_flow_desc_data {
 	MK_FLOW_ACTION(DEC_TCP_SEQ, sizeof(rte_be32_t)),
 	MK_FLOW_ACTION(INC_TCP_ACK, sizeof(rte_be32_t)),
 	MK_FLOW_ACTION(DEC_TCP_ACK, sizeof(rte_be32_t)),
+	MK_FLOW_ACTION(SET_META, sizeof(struct rte_flow_action_set_meta)),
 };
 
+int
+rte_flow_dynf_metadata_register(void)
+{
+	int offset;
+	int flag;
+
+	static const struct rte_mbuf_dynfield desc_offs = {
+		.name = MBUF_DYNF_METADATA_NAME,
+		.size = MBUF_DYNF_METADATA_SIZE,
+		.align = MBUF_DYNF_METADATA_ALIGN,
+		.flags = MBUF_DYNF_METADATA_FLAGS,
+	};
+	static const struct rte_mbuf_dynflag desc_flag = {
+		.name = MBUF_DYNF_METADATA_NAME,
+	};
+
+	offset = rte_mbuf_dynfield_register(&desc_offs);
+	if (offset < 0)
+		goto error;
+	flag = rte_mbuf_dynflag_register(&desc_flag);
+	if (flag < 0)
+		goto error;
+	rte_flow_dynf_metadata_offs = offset;
+	rte_flow_dynf_metadata_mask = (1ULL << flag);
+	return 0;
+
+error:
+	rte_flow_dynf_metadata_offs = -1;
+	rte_flow_dynf_metadata_mask = 0ULL;
+	return -rte_errno;
+}
+
 static int
 flow_err(uint16_t port_id, int ret, struct rte_flow_error *error)
 {
diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index 391a44a..a27e619 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -27,6 +27,8 @@ 
 #include <rte_udp.h>
 #include <rte_byteorder.h>
 #include <rte_esp.h>
+#include <rte_mbuf.h>
+#include <rte_mbuf_dyn.h>
 
 #ifdef __cplusplus
 extern "C" {
@@ -417,7 +419,8 @@  enum rte_flow_item_type {
 	/**
 	 * [META]
 	 *
-	 * Matches a metadata value specified in mbuf metadata field.
+	 * Matches a metadata value.
+	 *
 	 * See struct rte_flow_item_meta.
 	 */
 	RTE_FLOW_ITEM_TYPE_META,
@@ -1213,9 +1216,17 @@  struct rte_flow_item_icmp6_nd_opt_tla_eth {
 #endif
 
 /**
- * RTE_FLOW_ITEM_TYPE_META.
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
  *
- * Matches a specified metadata value.
+ * RTE_FLOW_ITEM_TYPE_META
+ *
+ * Matches a specified metadata value. On egress, metadata can be set either by
+ * mbuf tx_metadata field with PKT_TX_METADATA flag or
+ * RTE_FLOW_ACTION_TYPE_SET_META. On ingress, RTE_FLOW_ACTION_TYPE_SET_META sets
+ * metadata for a packet and the metadata will be reported via mbuf metadata
+ * dynamic field with PKT_RX_DYNF_METADATA flag. The dynamic mbuf field must be
+ * registered in advance by rte_flow_dynf_metadata_register().
  */
 struct rte_flow_item_meta {
 	rte_be32_t data;
@@ -1813,6 +1824,13 @@  enum rte_flow_action_type {
 	 * undefined behavior.
 	 */
 	RTE_FLOW_ACTION_TYPE_DEC_TCP_ACK,
+
+	/**
+	 * Set metadata on ingress or egress path.
+	 *
+	 * See struct rte_flow_action_set_meta.
+	 */
+	RTE_FLOW_ACTION_TYPE_SET_META,
 };
 
 /**
@@ -2300,6 +2318,43 @@  struct rte_flow_action_set_mac {
 	uint8_t mac_addr[RTE_ETHER_ADDR_LEN];
 };
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ACTION_TYPE_SET_META
+ *
+ * Set metadata. Metadata set by mbuf tx_metadata field with
+ * PKT_TX_METADATA flag on egress will be overridden by this action. On
+ * ingress, the metadata will be carried by mbuf metadata dynamic field
+ * with PKT_RX_DYNF_METADATA flag if set.  The dynamic mbuf field must be
+ * registered in advance by rte_flow_dynf_metadata_register().
+ *
+ * Altering partial bits is supported with mask. For bits which have never
+ * been set, unpredictable value will be seen depending on driver
+ * implementation. For loopback/hairpin packet, metadata set on Rx/Tx may
+ * or may not be propagated to the other path depending on HW capability.
+ *
+ * RTE_FLOW_ITEM_TYPE_META matches metadata.
+ */
+struct rte_flow_action_set_meta {
+	rte_be32_t data;
+	rte_be32_t mask;
+};
+
+/* Mbuf dynamic field offset for metadata. */
+extern int rte_flow_dynf_metadata_offs;
+
+/* Mbuf dynamic field flag mask for metadata. */
+extern uint64_t rte_flow_dynf_metadata_mask;
+
+/* Mbuf dynamic field pointer for metadata. */
+#define RTE_FLOW_DYNF_METADATA(m) \
+	RTE_MBUF_DYNFIELD((m), rte_flow_dynf_metadata_offs, uint32_t *)
+
+/* Mbuf dynamic flag for metadata. */
+#define PKT_RX_DYNF_METADATA (rte_flow_dynf_metadata_mask)
+
 /*
  * Definition of a single action.
  *
@@ -2533,6 +2588,32 @@  enum rte_flow_conv_op {
 };
 
 /**
+ * Check if mbuf dynamic field for metadata is registered.
+ *
+ * @return
+ *   True if registered, false otherwise.
+ */
+__rte_experimental
+static inline int
+rte_flow_dynf_metadata_avail(void) {
+	return !!rte_flow_dynf_metadata_mask;
+}
+
+/**
+ * Register mbuf dynamic field and flag for metadata.
+ *
+ * This function must be called prior to use SET_META action in order to
+ * register the dynamic mbuf field. Otherwise, the data cannot be delivered to
+ * application.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+__rte_experimental
+int
+rte_flow_dynf_metadata_register(void);
+
+/**
  * Check whether a flow rule can be created on a given port.
  *
  * The flow rule is validated for correctness and whether it could be accepted
diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h b/lib/librte_mbuf/rte_mbuf_dyn.h
index 6e2c816..4ff33ac 100644
--- a/lib/librte_mbuf/rte_mbuf_dyn.h
+++ b/lib/librte_mbuf/rte_mbuf_dyn.h
@@ -160,4 +160,12 @@  int rte_mbuf_dynflag_lookup(const char *name,
  */
 #define RTE_MBUF_DYNFIELD(m, offset, type) ((type)((uintptr_t)(m) + (offset)))
 
+/**
+ * Flow metadata dynamic field definitions.
+ */
+#define MBUF_DYNF_METADATA_NAME "flow-metadata"
+#define MBUF_DYNF_METADATA_SIZE sizeof(uint32_t)
+#define MBUF_DYNF_METADATA_ALIGN __alignof__(uint32_t)
+#define MBUF_DYNF_METADATA_FLAGS 0
+
 #endif