[v8,1/3] ethdev: introduce protocol header based buffer split

Message ID 20220601135059.958882-3-wenxuanx.wu@intel.com (mailing list archive)
State Superseded, archived
Headers
Series [v8,1/3] ethdev: introduce protocol header based buffer split |

Checks

Context Check Description
ci/checkpatch success coding style OK

Commit Message

Wu, WenxuanX June 1, 2022, 1:50 p.m. UTC
  From: Wenxuan Wu <wenxuanx.wu@intel.com>

Currently, Rx buffer split supports length based split. With Rx queue
offload RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT enabled and Rx packet segment
configured, PMD will be able to split the received packets into
multiple segments.

However, length based buffer split is not suitable for NICs that do split
based on protocol headers. Given a arbitrarily variable length in Rx packet
segment, it is almost impossible to pass a fixed protocol header to PMD.
Besides, the existence of tunneling results in the composition of a packet
is various, which makes the situation even worse.

This patch extends current buffer split to support protocol header based
buffer split. A new proto_hdr field is introduced in the reserved field
of rte_eth_rxseg_split structure to specify protocol header. The proto_hdr
field defines the split position of packet, splitting will always happens
after the protocol header defined in the Rx packet segment. When Rx queue
offload RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT is enabled and corresponding
protocol header is configured, PMD will split the ingress packets into
multiple segments.

struct rte_eth_rxseg_split {

        struct rte_mempool *mp; /* memory pools to allocate segment from */
        uint16_t length; /* segment maximal data length,
                            configures "split point" */
        uint16_t offset; /* data offset from beginning
                            of mbuf data buffer */
        uint32_t proto_hdr; /* inner/outer L2/L3/L4 protocol header,
			       configures "split point" */
    };

Both inner and outer L2/L3/L4 level protocol header split can be supported.
Corresponding protocol header capability is RTE_PTYPE_L2_ETHER,
RTE_PTYPE_L3_IPV4, RTE_PTYPE_L3_IPV6, RTE_PTYPE_L4_TCP, RTE_PTYPE_L4_UDP,
RTE_PTYPE_L4_SCTP, RTE_PTYPE_INNER_L2_ETHER, RTE_PTYPE_INNER_L3_IPV4,
RTE_PTYPE_INNER_L3_IPV6, RTE_PTYPE_INNER_L4_TCP, RTE_PTYPE_INNER_L4_UDP,
RTE_PTYPE_INNER_L4_SCTP.

For example, let's suppose we configured the Rx queue with the
following segments:
    seg0 - pool0, proto_hdr0=RTE_PTYPE_L3_IPV4, off0=2B
    seg1 - pool1, proto_hdr1=RTE_PTYPE_L4_UDP, off1=128B
    seg2 - pool2, off1=0B

The packet consists of MAC_IPV4_UDP_PAYLOAD will be split like
following:
    seg0 - ipv4 header @ RTE_PKTMBUF_HEADROOM + 2 in mbuf from pool0
    seg1 - udp header @ 128 in mbuf from pool1
    seg2 - payload @ 0 in mbuf from pool2

Now buffet split can be configured in two modes. For length based
buffer split, the mp, length, offset field in Rx packet segment should
be configured, while the proto_hdr field should not be configured.
For protocol header based buffer split, the mp, offset, proto_hdr field
in Rx packet segment should be configured, while the length field should
not be configured.

The split limitations imposed by underlying PMD is reported in the
rte_eth_dev_info->rx_seg_capa field. The memory attributes for the split
parts may differ either, dpdk memory and external memory, respectively.

Signed-off-by: Xuan Ding <xuan.ding@intel.com>
Signed-off-by: Yuan Wang <yuanx.wang@intel.com>
Signed-off-by: Wenxuan Wu <wenxuanx.wu@intel.com>
Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
---
 lib/ethdev/rte_ethdev.c | 40 +++++++++++++++++++++++++++++++++-------
 lib/ethdev/rte_ethdev.h | 28 +++++++++++++++++++++++++++-
 2 files changed, 60 insertions(+), 8 deletions(-)
  

Comments

Andrew Rybchenko June 2, 2022, 1:20 p.m. UTC | #1
There are two v8 1/3 patches in my mailbox. Which one is the right one?

On 6/1/22 16:50, wenxuanx.wu@intel.com wrote:
> From: Wenxuan Wu <wenxuanx.wu@intel.com>
> 
> Currently, Rx buffer split supports length based split. With Rx queue
> offload RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT enabled and Rx packet segment
> configured, PMD will be able to split the received packets into
> multiple segments.
> 
> However, length based buffer split is not suitable for NICs that do split
> based on protocol headers. Given a arbitrarily variable length in Rx packet
> segment, it is almost impossible to pass a fixed protocol header to PMD.
> Besides, the existence of tunneling results in the composition of a packet
> is various, which makes the situation even worse.
> 
> This patch extends current buffer split to support protocol header based
> buffer split. A new proto_hdr field is introduced in the reserved field
> of rte_eth_rxseg_split structure to specify protocol header. The proto_hdr
> field defines the split position of packet, splitting will always happens
> after the protocol header defined in the Rx packet segment. When Rx queue
> offload RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT is enabled and corresponding
> protocol header is configured, PMD will split the ingress packets into
> multiple segments.
> 
> struct rte_eth_rxseg_split {
> 
>          struct rte_mempool *mp; /* memory pools to allocate segment from */
>          uint16_t length; /* segment maximal data length,
>                              configures "split point" */
>          uint16_t offset; /* data offset from beginning
>                              of mbuf data buffer */
>          uint32_t proto_hdr; /* inner/outer L2/L3/L4 protocol header,
> 			       configures "split point" */
>      };
> 
> Both inner and outer L2/L3/L4 level protocol header split can be supported.
> Corresponding protocol header capability is RTE_PTYPE_L2_ETHER,
> RTE_PTYPE_L3_IPV4, RTE_PTYPE_L3_IPV6, RTE_PTYPE_L4_TCP, RTE_PTYPE_L4_UDP,
> RTE_PTYPE_L4_SCTP, RTE_PTYPE_INNER_L2_ETHER, RTE_PTYPE_INNER_L3_IPV4,
> RTE_PTYPE_INNER_L3_IPV6, RTE_PTYPE_INNER_L4_TCP, RTE_PTYPE_INNER_L4_UDP,
> RTE_PTYPE_INNER_L4_SCTP.
> 
> For example, let's suppose we configured the Rx queue with the
> following segments:
>      seg0 - pool0, proto_hdr0=RTE_PTYPE_L3_IPV4, off0=2B
>      seg1 - pool1, proto_hdr1=RTE_PTYPE_L4_UDP, off1=128B
>      seg2 - pool2, off1=0B
> 
> The packet consists of MAC_IPV4_UDP_PAYLOAD will be split like
> following:
>      seg0 - ipv4 header @ RTE_PKTMBUF_HEADROOM + 2 in mbuf from pool0
>      seg1 - udp header @ 128 in mbuf from pool1
>      seg2 - payload @ 0 in mbuf from pool2
> 
> Now buffet split can be configured in two modes. For length based
> buffer split, the mp, length, offset field in Rx packet segment should
> be configured, while the proto_hdr field should not be configured.
> For protocol header based buffer split, the mp, offset, proto_hdr field
> in Rx packet segment should be configured, while the length field should
> not be configured.
> 
> The split limitations imposed by underlying PMD is reported in the
> rte_eth_dev_info->rx_seg_capa field. The memory attributes for the split
> parts may differ either, dpdk memory and external memory, respectively.
> 
> Signed-off-by: Xuan Ding <xuan.ding@intel.com>
> Signed-off-by: Yuan Wang <yuanx.wang@intel.com>
> Signed-off-by: Wenxuan Wu <wenxuanx.wu@intel.com>
> Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>
> Acked-by: Ray Kinsella <mdr@ashroe.eu>

[snip]
  
Ding, Xuan June 2, 2022, 1:44 p.m. UTC | #2
Hi Andrew,

> -----Original Message-----
> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> Sent: Thursday, June 2, 2022 9:21 PM
> To: Wu, WenxuanX <wenxuanx.wu@intel.com>; thomas@monjalon.net; Li,
> Xiaoyun <xiaoyun.li@intel.com>; ferruh.yigit@xilinx.com; Singh, Aman Deep
> <aman.deep.singh@intel.com>; dev@dpdk.org; Zhang, Yuying
> <yuying.zhang@intel.com>; Zhang, Qi Z <qi.z.zhang@intel.com>;
> jerinjacobk@gmail.com
> Cc: stephen@networkplumber.org; Ding, Xuan <xuan.ding@intel.com>;
> Wang, YuanX <yuanx.wang@intel.com>; Ray Kinsella <mdr@ashroe.eu>
> Subject: Re: [PATCH v8 1/3] ethdev: introduce protocol header based buffer
> split
> 
> There are two v8 1/3 patches in my mailbox. Which one is the right one?

Yes, you are right, the second one is the latest one, sorry for the inconvenience.

Thanks,
Xuan

> 
> On 6/1/22 16:50, wenxuanx.wu@intel.com wrote:
> > From: Wenxuan Wu <wenxuanx.wu@intel.com>
> >
> > Currently, Rx buffer split supports length based split. With Rx queue
> > offload RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT enabled and Rx packet
> segment
> > configured, PMD will be able to split the received packets into
> > multiple segments.
> >
> > However, length based buffer split is not suitable for NICs that do
> > split based on protocol headers. Given a arbitrarily variable length
> > in Rx packet segment, it is almost impossible to pass a fixed protocol
> header to PMD.
> > Besides, the existence of tunneling results in the composition of a
> > packet is various, which makes the situation even worse.
> >
> > This patch extends current buffer split to support protocol header
> > based buffer split. A new proto_hdr field is introduced in the
> > reserved field of rte_eth_rxseg_split structure to specify protocol
> > header. The proto_hdr field defines the split position of packet,
> > splitting will always happens after the protocol header defined in the
> > Rx packet segment. When Rx queue offload
> > RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT is enabled and corresponding
> protocol
> > header is configured, PMD will split the ingress packets into multiple
> segments.
> >
> > struct rte_eth_rxseg_split {
> >
> >          struct rte_mempool *mp; /* memory pools to allocate segment from
> */
> >          uint16_t length; /* segment maximal data length,
> >                              configures "split point" */
> >          uint16_t offset; /* data offset from beginning
> >                              of mbuf data buffer */
> >          uint32_t proto_hdr; /* inner/outer L2/L3/L4 protocol header,
> > 			       configures "split point" */
> >      };
> >
> > Both inner and outer L2/L3/L4 level protocol header split can be supported.
> > Corresponding protocol header capability is RTE_PTYPE_L2_ETHER,
> > RTE_PTYPE_L3_IPV4, RTE_PTYPE_L3_IPV6, RTE_PTYPE_L4_TCP,
> > RTE_PTYPE_L4_UDP, RTE_PTYPE_L4_SCTP, RTE_PTYPE_INNER_L2_ETHER,
> > RTE_PTYPE_INNER_L3_IPV4, RTE_PTYPE_INNER_L3_IPV6,
> > RTE_PTYPE_INNER_L4_TCP, RTE_PTYPE_INNER_L4_UDP,
> RTE_PTYPE_INNER_L4_SCTP.
> >
> > For example, let's suppose we configured the Rx queue with the
> > following segments:
> >      seg0 - pool0, proto_hdr0=RTE_PTYPE_L3_IPV4, off0=2B
> >      seg1 - pool1, proto_hdr1=RTE_PTYPE_L4_UDP, off1=128B
> >      seg2 - pool2, off1=0B
> >
> > The packet consists of MAC_IPV4_UDP_PAYLOAD will be split like
> > following:
> >      seg0 - ipv4 header @ RTE_PKTMBUF_HEADROOM + 2 in mbuf from
> pool0
> >      seg1 - udp header @ 128 in mbuf from pool1
> >      seg2 - payload @ 0 in mbuf from pool2
> >
> > Now buffet split can be configured in two modes. For length based
> > buffer split, the mp, length, offset field in Rx packet segment should
> > be configured, while the proto_hdr field should not be configured.
> > For protocol header based buffer split, the mp, offset, proto_hdr
> > field in Rx packet segment should be configured, while the length
> > field should not be configured.
> >
> > The split limitations imposed by underlying PMD is reported in the
> > rte_eth_dev_info->rx_seg_capa field. The memory attributes for the
> > split parts may differ either, dpdk memory and external memory,
> respectively.
> >
> > Signed-off-by: Xuan Ding <xuan.ding@intel.com>
> > Signed-off-by: Yuan Wang <yuanx.wang@intel.com>
> > Signed-off-by: Wenxuan Wu <wenxuanx.wu@intel.com>
> > Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>
> > Acked-by: Ray Kinsella <mdr@ashroe.eu>
> 
> [snip]
  

Patch

diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
index 29a3d80466..fbd55cdd9d 100644
--- a/lib/ethdev/rte_ethdev.c
+++ b/lib/ethdev/rte_ethdev.c
@@ -1661,6 +1661,7 @@  rte_eth_rx_queue_check_split(const struct rte_eth_rxseg_split *rx_seg,
 		struct rte_mempool *mpl = rx_seg[seg_idx].mp;
 		uint32_t length = rx_seg[seg_idx].length;
 		uint32_t offset = rx_seg[seg_idx].offset;
+		uint32_t proto_hdr = rx_seg[seg_idx].proto_hdr;
 
 		if (mpl == NULL) {
 			RTE_ETHDEV_LOG(ERR, "null mempool pointer\n");
@@ -1694,13 +1695,38 @@  rte_eth_rx_queue_check_split(const struct rte_eth_rxseg_split *rx_seg,
 		}
 		offset += seg_idx != 0 ? 0 : RTE_PKTMBUF_HEADROOM;
 		*mbp_buf_size = rte_pktmbuf_data_room_size(mpl);
-		length = length != 0 ? length : *mbp_buf_size;
-		if (*mbp_buf_size < length + offset) {
-			RTE_ETHDEV_LOG(ERR,
-				       "%s mbuf_data_room_size %u < %u (segment length=%u + segment offset=%u)\n",
-				       mpl->name, *mbp_buf_size,
-				       length + offset, length, offset);
-			return -EINVAL;
+		if (proto_hdr == RTE_PTYPE_UNKNOWN) {
+			/* Split at fixed length. */
+			length = length != 0 ? length : *mbp_buf_size;
+			if (*mbp_buf_size < length + offset) {
+				RTE_ETHDEV_LOG(ERR,
+					"%s mbuf_data_room_size %u < %u (segment length=%u + segment offset=%u)\n",
+					mpl->name, *mbp_buf_size,
+					length + offset, length, offset);
+				return -EINVAL;
+			}
+		} else {
+			/* Split after specified protocol header. */
+			if (!(proto_hdr & RTE_BUFFER_SPLIT_PROTO_HDR_MASK)) {
+				RTE_ETHDEV_LOG(ERR,
+					"Protocol header %u not supported)\n",
+					proto_hdr);
+				return -EINVAL;
+			}
+
+			if (length != 0) {
+				RTE_ETHDEV_LOG(ERR, "segment length should be set to zero in protocol header "
+					       "based buffer split\n");
+				return -EINVAL;
+			}
+
+			if (*mbp_buf_size < offset) {
+				RTE_ETHDEV_LOG(ERR,
+						"%s mbuf_data_room_size %u < %u segment offset)\n",
+						mpl->name, *mbp_buf_size,
+						offset);
+				return -EINVAL;
+			}
 		}
 	}
 	return 0;
diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
index 04cff8ee10..0cd9dd6cc0 100644
--- a/lib/ethdev/rte_ethdev.h
+++ b/lib/ethdev/rte_ethdev.h
@@ -1187,6 +1187,9 @@  struct rte_eth_txmode {
  *   mbuf) the following data will be pushed to the next segment
  *   up to its own length, and so on.
  *
+ * - The proto_hdrs in the elements define the split position of
+ *   received packets.
+ *
  * - If the length in the segment description element is zero
  *   the actual buffer size will be deduced from the appropriate
  *   memory pool properties.
@@ -1197,14 +1200,37 @@  struct rte_eth_txmode {
  *     - pool from the last valid element
  *     - the buffer size from this pool
  *     - zero offset
+ *
+ * - Length based buffer split:
+ *     - mp, length, offset should be configured.
+ *     - The proto_hdr field should not be configured.
+ *
+ * - Protocol header based buffer split:
+ *     - mp, offset, proto_hdr should be configured.
+ *     - The length field should not be configured.
  */
 struct rte_eth_rxseg_split {
 	struct rte_mempool *mp; /**< Memory pool to allocate segment from. */
 	uint16_t length; /**< Segment data length, configures split point. */
 	uint16_t offset; /**< Data offset from beginning of mbuf data buffer. */
-	uint32_t reserved; /**< Reserved field. */
+	uint32_t proto_hdr; /**< Inner/outer L2/L3/L4 protocol header, configures split point. */
 };
 
+/* Buffer split protocol header capability. */
+#define RTE_BUFFER_SPLIT_PROTO_HDR_MASK ( \
+	RTE_PTYPE_L2_ETHER | \
+	RTE_PTYPE_L3_IPV4 | \
+	RTE_PTYPE_L3_IPV6 | \
+	RTE_PTYPE_L4_TCP | \
+	RTE_PTYPE_L4_UDP | \
+	RTE_PTYPE_L4_SCTP | \
+	RTE_PTYPE_INNER_L2_ETHER | \
+	RTE_PTYPE_INNER_L3_IPV4 | \
+	RTE_PTYPE_INNER_L3_IPV6 | \
+	RTE_PTYPE_INNER_L4_TCP | \
+	RTE_PTYPE_INNER_L4_UDP | \
+	RTE_PTYPE_INNER_L4_SCTP)
+
 /**
  * @warning
  * @b EXPERIMENTAL: this structure may change without prior notice.