[00/20] net/mlx5: implement extensive metadata feature

Message ID: 1572940915-29416-1-git-send-email-viacheslavo@mellanox.com

Slava Ovsiienko Nov. 5, 2019, 8:01 a.m. UTC
Modern networks operate on the basis of packet switching: in the
network environment, data are transmitted as packets. Within the host,
besides the data actually transmitted on the wire as packets, there
might be some out-of-band data helping to process the packets. These
data are named metadata; they exist on a per-packet basis and are
attached to each packet as some extra dedicated storage (besides the
packet data itself).

In DPDK, network data are represented as mbuf structure chains and
travel along the application/DPDK datapath. On the other side, DPDK
provides the rte_flow API to control the flow engine. To be precise,
there are two kinds of metadata in DPDK: one is purely software
metadata (fields of the mbuf - flags, packet type, data length, etc.),
and the other is metadata within the flow engine. In this scope, we
cover only the second type (flow engine metadata).

Flow engine metadata is extra data, maintained on a per-packet basis
and usually handled by hardware inside the flow engine.

Initially, two metadata-related actions were proposed:

- RTE_FLOW_ACTION_TYPE_FLAG
- RTE_FLOW_ACTION_TYPE_MARK

The FLAG action sets a special flag in the packet metadata, the MARK
action stores a specified value in the metadata storage, and on packet
reception the PMD puts the flag and value into the mbuf, so the
application can see that the packet was treated inside the flow engine
according to the appropriate RTE flow(s). MARK and FLAG act as a kind
of gateway for transferring per-packet information from the flow
engine to the application via the receiving datapath. Also, the item
of type RTE_FLOW_ITEM_TYPE_MARK is provided. It allows us to extend
the flow match pattern with the capability to match the metadata
values set by MARK/FLAG actions in other flows.
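
For illustration, a minimal sketch of this Rx-side gateway (the helper
name, port, queue and mark values are hypothetical; error handling is
trimmed):

#include <stdio.h>
#include <rte_mbuf.h>
#include <rte_flow.h>

/* Mark all ingress TCP packets with 0xCAFE and steer them to queue 1. */
static struct rte_flow *
create_mark_flow(uint16_t port_id, struct rte_flow_error *error)
{
	struct rte_flow_attr attr = { .ingress = 1 };
	struct rte_flow_item pattern[] = {
		{ .type = RTE_FLOW_ITEM_TYPE_ETH },
		{ .type = RTE_FLOW_ITEM_TYPE_IPV4 },
		{ .type = RTE_FLOW_ITEM_TYPE_TCP },
		{ .type = RTE_FLOW_ITEM_TYPE_END },
	};
	struct rte_flow_action_mark mark = { .id = 0xCAFE };
	struct rte_flow_action_queue queue = { .index = 1 };
	struct rte_flow_action actions[] = {
		{ .type = RTE_FLOW_ACTION_TYPE_MARK, .conf = &mark },
		{ .type = RTE_FLOW_ACTION_TYPE_QUEUE, .conf = &queue },
		{ .type = RTE_FLOW_ACTION_TYPE_END },
	};

	return rte_flow_create(port_id, &attr, pattern, actions, error);
}

/* On the receiving datapath the mark shows up in the mbuf. */
static void
check_mark(const struct rte_mbuf *m)
{
	if (m->ol_flags & PKT_RX_FDIR_ID)
		printf("packet marked with %u\n", m->hash.fdir.hi);
}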

From the datapath point of view, MARK and FLAG are related to the
receiving side only. It would be useful to have the same gateway on
the transmitting side, so the item of type RTE_FLOW_ITEM_TYPE_META was
proposed. The application fills the metadata field in the mbuf, and
this value is transferred to some field in the packet metadata inside
the flow engine.
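
A sketch of the Tx side of this gateway, assuming the dynamic mbuf
metadata field from the prerequisite ethdev patches (see [1]/[2]
below) is in place; the metadata value is hypothetical:

#include <rte_mbuf.h>
#include <rte_flow.h>

/*
 * Attach metadata 0x1234 to an outgoing packet. The dynamic field is
 * assumed to have been registered once at startup with
 * rte_flow_dynf_metadata_register().
 */
static void
set_tx_metadata(struct rte_mbuf *m)
{
	*RTE_FLOW_DYNF_METADATA(m) = 0x1234;
	m->ol_flags |= PKT_TX_DYNF_METADATA;
}

An egress flow rule can then match this value with an
RTE_FLOW_ITEM_TYPE_META pattern item.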

It did not matter whether these metadata fields were shared, because
the MARK and META items belonged to different domains (receiving and
transmitting) and could be vendor-specific.

So far, so good: DPDK provides some entities to control metadata
inside the flow engine, and gateways to exchange these values with the
datapath on a per-packet basis.

As we can see, the MARK and META means are not symmetric: there is no
action that would allow us to set the META value on the transmitting
path. So, the action of type:

- RTE_FLOW_ACTION_TYPE_SET_META is proposed.
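
A hedged sketch of the proposed action in an egress rule (the helper
name and metadata value are hypothetical; error handling is trimmed):

#include <rte_flow.h>

/* Egress rule: tag every packet sent on the port with META 0x00AA. */
static struct rte_flow *
create_set_meta_flow(uint16_t port_id, struct rte_flow_error *error)
{
	struct rte_flow_attr attr = { .egress = 1 };
	struct rte_flow_item pattern[] = {
		{ .type = RTE_FLOW_ITEM_TYPE_ETH },
		{ .type = RTE_FLOW_ITEM_TYPE_END },
	};
	struct rte_flow_action_set_meta conf = {
		.data = 0x00AA,
		.mask = 0xFFFFFFFF,
	};
	struct rte_flow_action actions[] = {
		{ .type = RTE_FLOW_ACTION_TYPE_SET_META, .conf = &conf },
		{ .type = RTE_FLOW_ACTION_TYPE_END },
	};

	return rte_flow_create(port_id, &attr, pattern, actions, error);
}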

Next, applications raise new requirements for packet metadata. Flow
engines are getting more complex: internal switches are introduced,
and multiple ports might be supported within the same flow engine
namespace. From the DPDK point of view, this means that packets might
be sent on one eth_dev port and received on another one, while the
packet path inside the flow engine belongs entirely to the same
hardware device. The simplest example is SR-IOV with a PF, VFs and
their representors. This is an excellent opportunity to provide an
out-of-band channel to transfer some extra data from one port to
another, besides the packet data itself, and applications would like
to use this opportunity.

To improve the metadata definitions, it is proposed to:
- treat the MARK and META metadata fields as dedicated, not shared
- extend the applicability of the MARK and META items/actions to all
  flow engine domains - transmitting and receiving
- allow MARK and META metadata to be preserved while crossing the flow
  domains (from the transmit origin through the flow database inside
  the (E-)Switch to the receiving side domain); in simple words, to
  allow metadata to accompany the packet through the entire flow
  engine space (see the sketch after this list)
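
A sketch of such a cross-domain rule in the E-Switch (transfer)
domain; the destination port id and metadata value are hypothetical:

#include <rte_flow.h>

/*
 * Transfer rule: packets sent with META 0x00AA on any port of the
 * device are redirected to port 1 (e.g. a VF representor).
 */
static struct rte_flow *
create_transfer_meta_flow(uint16_t port_id, struct rte_flow_error *error)
{
	struct rte_flow_attr attr = { .transfer = 1 };
	struct rte_flow_item_meta meta_spec = { .data = 0x00AA };
	struct rte_flow_item_meta meta_mask = { .data = 0xFFFFFFFF };
	struct rte_flow_item pattern[] = {
		{ .type = RTE_FLOW_ITEM_TYPE_META,
		  .spec = &meta_spec, .mask = &meta_mask },
		{ .type = RTE_FLOW_ITEM_TYPE_END },
	};
	struct rte_flow_action_port_id dst = { .id = 1 };
	struct rte_flow_action actions[] = {
		{ .type = RTE_FLOW_ACTION_TYPE_PORT_ID, .conf = &dst },
		{ .type = RTE_FLOW_ACTION_TYPE_END },
	};

	return rte_flow_create(port_id, &attr, pattern, actions, error);
}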

Another newly proposed feature is transient per-packet storage inside
the flow engine. It might have a lot of use cases. For example, if
there is VXLAN tunneled traffic and some flow performs VXLAN
decapsulation and wishes to save information regarding the dropped
header, it could use this transient storage. The tools to maintain
this storage are traditional for the DPDK rte_flow API:

- RTE_FLOW_ACTION_TYPE_SET_TAG - to set a value
- RTE_FLOW_ITEM_TYPE_TAG - to match on it

The primary properties of the proposed storage are:
- the storage is presented as an array of 32-bit opaque values
- the size of the array (or even the bitmap of available indices) is
  vendor-specific and is subject to run-time trial
- it is transient: it exists only inside the flow engine, and there
  are no gateways for interacting with the datapath, so applications
  have no way either to specify these data on transmit or to get them
  back on receive
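
A sketch combining these tools, again with hypothetical values: a
group 0 rule stores a value in tag[0] and jumps to group 1, where a
second rule matches on it (available tag indices are expected to be
probed at run time, e.g. with rte_flow_validate()):

#include <rte_flow.h>

static void
create_tag_flows(uint16_t port_id, struct rte_flow_error *error)
{
	/* Group 0: remember 0xBEEF in tag[0], then jump to group 1. */
	struct rte_flow_attr attr0 = { .group = 0, .ingress = 1 };
	struct rte_flow_item pattern0[] = {
		{ .type = RTE_FLOW_ITEM_TYPE_ETH },
		{ .type = RTE_FLOW_ITEM_TYPE_END },
	};
	struct rte_flow_action_set_tag set_tag = {
		.data = 0xBEEF, .mask = 0xFFFFFFFF, .index = 0,
	};
	struct rte_flow_action_jump jump = { .group = 1 };
	struct rte_flow_action actions0[] = {
		{ .type = RTE_FLOW_ACTION_TYPE_SET_TAG, .conf = &set_tag },
		{ .type = RTE_FLOW_ACTION_TYPE_JUMP, .conf = &jump },
		{ .type = RTE_FLOW_ACTION_TYPE_END },
	};

	/* Group 1: match the value stored in tag[0]. */
	struct rte_flow_attr attr1 = { .group = 1, .ingress = 1 };
	struct rte_flow_item_tag tag_spec = { .data = 0xBEEF, .index = 0 };
	struct rte_flow_item pattern1[] = {
		{ .type = RTE_FLOW_ITEM_TYPE_TAG, .spec = &tag_spec },
		{ .type = RTE_FLOW_ITEM_TYPE_END },
	};
	struct rte_flow_action_queue queue = { .index = 0 };
	struct rte_flow_action actions1[] = {
		{ .type = RTE_FLOW_ACTION_TYPE_QUEUE, .conf = &queue },
		{ .type = RTE_FLOW_ACTION_TYPE_END },
	};

	rte_flow_create(port_id, &attr0, pattern0, actions0, error);
	rte_flow_create(port_id, &attr1, pattern1, actions1, error);
}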

This patchset implements the abovementioned extensive metadata
feature in the mlx5 PMD.

The patchset must be applied on top of the public RTE API updates:

[1] http://patches.dpdk.org/patch/62354/
[2] http://patches.dpdk.org/patch/62355/

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>

Viacheslav Ovsiienko (20):
  net/mlx5: convert internal tag endianness
  net/mlx5: update modify header action translator
  net/mlx5: add metadata register copy
  net/mlx5: refactor flow structure
  net/mlx5: update flow functions
  net/mlx5: update meta register matcher set
  net/mlx5: rename structure and function
  net/mlx5: check metadata registers availability
  net/mlx5: add devarg for extensive metadata support
  net/mlx5: adjust shared register according to mask
  net/mlx5: check the maximal modify actions number
  net/mlx5: update metadata register id query
  net/mlx5: add flow tag support
  net/mlx5: extend flow mark support
  net/mlx5: extend flow meta data support
  net/mlx5: add meta data support to Rx datapath
  net/mlx5: add simple hash table
  net/mlx5: introduce flow splitters chain
  net/mlx5: split Rx flows to provide metadata copy
  net/mlx5: add metadata register copy table

 doc/guides/nics/mlx5.rst                 |   49 +
 drivers/net/mlx5/mlx5.c                  |  135 ++-
 drivers/net/mlx5/mlx5.h                  |   19 +-
 drivers/net/mlx5/mlx5_defs.h             |    7 +
 drivers/net/mlx5/mlx5_ethdev.c           |    8 +-
 drivers/net/mlx5/mlx5_flow.c             | 1178 ++++++++++++++++++++++-
 drivers/net/mlx5/mlx5_flow.h             |  108 ++-
 drivers/net/mlx5/mlx5_flow_dv.c          | 1544 ++++++++++++++++++++++++------
 drivers/net/mlx5/mlx5_flow_verbs.c       |   55 +-
 drivers/net/mlx5/mlx5_prm.h              |   45 +-
 drivers/net/mlx5/mlx5_rxtx.c             |    6 +
 drivers/net/mlx5/mlx5_rxtx_vec_altivec.h |   25 +-
 drivers/net/mlx5/mlx5_rxtx_vec_neon.h    |   23 +
 drivers/net/mlx5/mlx5_rxtx_vec_sse.h     |   27 +-
 drivers/net/mlx5/mlx5_utils.h            |  115 ++-
 15 files changed, 2922 insertions(+), 422 deletions(-)
  

Comments

Matan Azrad Nov. 5, 2019, 9:35 a.m. UTC | #1
From: Viacheslav Ovsiienko
> [...]

For all the series:
Acked-by: Matan Azrad <matan@mellanox.com>
