[v1,1/1] ethdev: introduce pool sort capability

Message ID 20220812104648.1019978-1-hpothula@marvell.com (mailing list archive)
State Superseded, archived
Delegated to: Andrew Rybchenko
Series [v1,1/1] ethdev: introduce pool sort capability

Checks

Context Check Description
ci/checkpatch warning coding style issues
ci/Intel-compilation fail Compilation issues
ci/github-robot: build fail github build: failed

Commit Message

Hanumanth Pothula Aug. 12, 2022, 10:46 a.m. UTC
Presently, the 'Buffer Split' feature supports sending multiple
segments of the received packet to PMD, which programs the HW
to receive the packet in segments from different pools.

This patch extends the feature to support the pool sort capability.
Some HW can choose a memory pool based on the packet's size. The
pool sort capability allows PMD to choose a memory pool based on
the packet's length.

This helps save memory: the application can create a separate pool
for each packet-size class and steer packets of that size to it,
thus enabling effective use of memory.

For example, let's say HW has a capability of three pools,
 - pool-1 size is 2K
 - pool-2 size is > 2K and < 4K
 - pool-3 size is > 4K
Here,
        pool-1 can accommodate packets with sizes < 2K
        pool-2 can accommodate packets with sizes > 2K and < 4K
        pool-3 can accommodate packets with sizes > 4K

With pool sort capability enabled in SW, an application may create
three pools of different sizes and pass them to PMD, which then
programs HW based on packet lengths, so that packets shorter than
2K are received on pool-1, packets with lengths between 2K and 4K
are received on pool-2 and packets longer than 4K are received on
pool-3.

The following two capabilities are added to the rte_eth_rxseg_capa
structure,
1. pool_sort --> tells pool sort capability is supported by HW.
2. max_npool --> max number of pools supported by HW.

A new structure, rte_eth_rxseg_sort, is defined. It is used only when
the pool sort capability is present. If required, it may be extended
further to support more configurations.

Signed-off-by: Hanumanth Pothula <hpothula@marvell.com>
Change-Id: I5a2485a7919616902c468c767b5c01834d4a2c27
---
 lib/ethdev/rte_ethdev.c | 81 ++++++++++++++++++++++++++++++++++++++---
 lib/ethdev/rte_ethdev.h | 46 +++++++++++++++++++++--
 2 files changed, 119 insertions(+), 8 deletions(-)
  

Comments

Morten Brørup Aug. 12, 2022, 1:27 p.m. UTC | #1
I like the concept of a PMD being able to use different mbuf pools depending on packet size.

However, the "pool sort" feature is not an extension of the "buffer split" feature, but a separate feature. The API and documentation must reflect this.

Please also consider this, when you implement it in the drivers: If no buffers are available in one of the pools, the next (larger) pool should be used instead of dropping the packet.

Here's another example use case: Assuming that 25 % of internet traffic is tiny packets (e.g. empty TCP ACK packets), a separate pool for those could be used.
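
The fallback suggested above (use the next larger pool when the best-fit pool is empty) can be sketched in plain C. This is only an illustration under stated assumptions, not driver code: struct sketch_pool and sketch_pick_pool() are hypothetical names that are not part of DPDK or of this patch, and real hardware would make this choice itself. The pools are assumed to be sorted by ascending buffer size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a mempool: only what the sketch needs. */
struct sketch_pool {
	uint32_t buf_size;  /* largest packet one buffer of this pool holds */
	uint32_t free_bufs; /* buffers currently available */
};

/*
 * Pick the smallest pool whose buffers fit pkt_len and that still has a
 * free buffer. The pools array must be sorted by ascending buf_size.
 * Returns the pool index, or -1 if no pool can take the packet.
 */
static int
sketch_pick_pool(const struct sketch_pool *pools, uint16_t n_pools,
		 uint32_t pkt_len)
{
	uint16_t i;

	assert(pools != NULL);

	for (i = 0; i < n_pools; i++) {
		if (pkt_len > pools[i].buf_size)
			continue; /* buffers too small for this packet */
		if (pools[i].free_bufs == 0)
			continue; /* pool exhausted: try the next larger one */
		return i;
	}

	return -1; /* nothing fits or every fitting pool is empty */
}
```

With pools of 2K, 4K and 9K buffers, a 3000-byte packet would normally land in the 4K pool; if that pool has no free buffers, the sketch returns the 9K pool instead of reporting a drop.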
  

Patch

diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
index 1979dc0850..e21a651787 100644
--- a/lib/ethdev/rte_ethdev.c
+++ b/lib/ethdev/rte_ethdev.c
@@ -1634,6 +1634,54 @@  rte_eth_dev_is_removed(uint16_t port_id)
 	return ret;
 }
 
+static int
+rte_eth_rx_queue_check_sort(const struct rte_eth_rxseg_sort *rx_seg,
+			     uint16_t n_seg, uint32_t *mbp_buf_size,
+			     const struct rte_eth_dev_info *dev_info)
+{
+	const struct rte_eth_rxseg_capa *seg_capa = &dev_info->rx_seg_capa;
+	uint16_t seg_idx;
+
+	if (!seg_capa->multi_pools || n_seg > seg_capa->max_npool) {
+		RTE_ETHDEV_LOG(ERR,
+			       "Invalid capabilities, multi_pools:%d different length segments %u exceed supported %u\n",
+			       seg_capa->multi_pools, n_seg, seg_capa->max_npool);
+		return -EINVAL;
+	}
+
+	for (seg_idx = 0; seg_idx < n_seg; seg_idx++) {
+		struct rte_mempool *mpl = rx_seg[seg_idx].mp;
+		uint32_t length = rx_seg[seg_idx].length;
+
+		if (mpl == NULL) {
+			RTE_ETHDEV_LOG(ERR, "null mempool pointer\n");
+			return -EINVAL;
+		}
+
+		if (mpl->private_data_size <
+			sizeof(struct rte_pktmbuf_pool_private)) {
+			RTE_ETHDEV_LOG(ERR,
+				       "%s private_data_size %u < %u\n",
+				       mpl->name, mpl->private_data_size,
+				       (unsigned int)sizeof
+					(struct rte_pktmbuf_pool_private));
+			return -ENOSPC;
+		}
+
+		*mbp_buf_size = rte_pktmbuf_data_room_size(mpl);
+		length = length != 0 ? length : (*mbp_buf_size - RTE_PKTMBUF_HEADROOM);
+		if (*mbp_buf_size < length + RTE_PKTMBUF_HEADROOM) {
+			RTE_ETHDEV_LOG(ERR,
+				       "%s mbuf_data_room_size %u < %u (length + headroom)\n",
+				       mpl->name, *mbp_buf_size,
+				       length + RTE_PKTMBUF_HEADROOM);
+			return -EINVAL;
+		}
+	}
+
+	return 0;
+}
+
 static int
 rte_eth_rx_queue_check_split(const struct rte_eth_rxseg_split *rx_seg,
 			     uint16_t n_seg, uint32_t *mbp_buf_size,
@@ -1693,7 +1741,11 @@  rte_eth_rx_queue_check_split(const struct rte_eth_rxseg_split *rx_seg,
 		}
 		offset += seg_idx != 0 ? 0 : RTE_PKTMBUF_HEADROOM;
 		*mbp_buf_size = rte_pktmbuf_data_room_size(mpl);
-		length = length != 0 ? length : *mbp_buf_size;
+		/* If the segment length is 0, default it to the pool's
+		 * data room size minus the headroom, to make sure enough
+		 * space is left for the headroom.
+		 */
+		length = length != 0 ? length : (*mbp_buf_size - RTE_PKTMBUF_HEADROOM);
 		if (*mbp_buf_size < length + offset) {
 			RTE_ETHDEV_LOG(ERR,
 				       "%s mbuf_data_room_size %u < %u (segment length=%u + segment offset=%u)\n",
@@ -1765,6 +1817,7 @@  rte_eth_rx_queue_setup(uint16_t port_id, uint16_t rx_queue_id,
 		}
 	} else {
 		const struct rte_eth_rxseg_split *rx_seg;
+		const struct rte_eth_rxseg_sort *rx_sort;
 		uint16_t n_seg;
 
 		/* Extended multi-segment configuration check. */
@@ -1774,13 +1827,31 @@  rte_eth_rx_queue_setup(uint16_t port_id, uint16_t rx_queue_id,
 			return -EINVAL;
 		}
 
-		rx_seg = (const struct rte_eth_rxseg_split *)rx_conf->rx_seg;
 		n_seg = rx_conf->rx_nseg;
 
 		if (rx_conf->offloads & RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT) {
-			ret = rte_eth_rx_queue_check_split(rx_seg, n_seg,
-							   &mbp_buf_size,
-							   &dev_info);
+			ret = -1; /* Fail unless one of the modes below matches */
+
+			/* Check that both the device and the application support buffer split */
+			if (dev_info.rx_seg_capa.mode_flag == RTE_ETH_RXSEG_MODE_SPLIT &&
+			    rx_conf->rx_seg->mode_flag == RTE_ETH_RXSEG_MODE_SPLIT) {
+				rx_seg = (const struct rte_eth_rxseg_split *)
+					 &(rx_conf->rx_seg->split);
+				ret = rte_eth_rx_queue_check_split(rx_seg, n_seg,
+								   &mbp_buf_size,
+								   &dev_info);
+			}
+
+			/* Check that both the device and the application support pool sort */
+			if (dev_info.rx_seg_capa.mode_flag == RTE_ETH_RXSEG_MODE_SORT &&
+			    rx_conf->rx_seg->mode_flag == RTE_ETH_RXSEG_MODE_SORT) {
+				rx_sort = (const struct rte_eth_rxseg_sort *)
+					  &(rx_conf->rx_seg->sort);
+				ret = rte_eth_rx_queue_check_sort(rx_sort, n_seg,
+								  &mbp_buf_size,
+								  &dev_info);
+			}
+
 			if (ret != 0)
 				return ret;
 		} else {
diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
index de9e970d4d..9ff8ba8085 100644
--- a/lib/ethdev/rte_ethdev.h
+++ b/lib/ethdev/rte_ethdev.h
@@ -1204,16 +1204,53 @@  struct rte_eth_rxseg_split {
 	uint32_t reserved; /**< Reserved field. */
 };
 
+/**
+ * The pool sort capability allows PMD to choose a memory pool based on the
+ * packet's length. So, basically, PMD programs HW for receiving packets from
+ * different pools, based on the packet's length.
+ *
+ * This is often useful for saving the memory where the application can create
+ * a different pool to steer the specific size of the packet, thus enabling
+ * effective use of memory.
+ */
+struct rte_eth_rxseg_sort {
+	struct rte_mempool *mp; /**< Memory pool to allocate packets from. */
+	uint16_t length; /**< Packet data length. */
+	uint32_t reserved; /**< Reserved field. */
+};
+
+enum rte_eth_rxseg_mode {
+	/**
+	 * Buffer split mode: PMD splits the received packets into multiple segments.
+	 * @see struct rte_eth_rxseg_split
+	 */
+	RTE_ETH_RXSEG_MODE_SPLIT = RTE_BIT64(0),
+	/**
+	 * Pool sort mode: PMD chooses a memory pool based on the packet's length.
+	 * @see struct rte_eth_rxseg_sort
+	 */
+	RTE_ETH_RXSEG_MODE_SORT  = RTE_BIT64(1),
+};
+
 /**
  * @warning
  * @b EXPERIMENTAL: this structure may change without prior notice.
  *
  * A common structure used to describe Rx packet segment properties.
  */
-union rte_eth_rxseg {
+struct rte_eth_rxseg {
+
+	/**
+	 * PMD may support more than one rxseg mode. This allows the application
+	 * to choose which mode to enable.
+	 */
+	enum rte_eth_rxseg_mode mode_flag;
+
 	/* The settings for buffer split offload. */
 	struct rte_eth_rxseg_split split;
-	/* The other features settings should be added here. */
+
+	/* The settings for pool sort offload. */
+	struct rte_eth_rxseg_sort sort;
 };
 
 /**
@@ -1246,7 +1283,7 @@  struct rte_eth_rxconf {
 	 * The supported capabilities of receiving segmentation is reported
 	 * in rte_eth_dev_info.rx_seg_capa field.
 	 */
-	union rte_eth_rxseg *rx_seg;
+	struct rte_eth_rxseg *rx_seg;
 
 	uint64_t reserved_64s[2]; /**< Reserved for future fields */
 	void *reserved_ptrs[2];   /**< Reserved for future fields */
@@ -1831,6 +1868,9 @@  struct rte_eth_rxseg_capa {
 	uint32_t offset_allowed:1; /**< Supports buffer offsets. */
 	uint32_t offset_align_log2:4; /**< Required offset alignment. */
 	uint16_t max_nseg; /**< Maximum amount of segments to split. */
+	/** Maximum number of pools that PMD can sort based on packet/segment lengths. */
+	uint16_t max_npool;
+	enum rte_eth_rxseg_mode mode_flag; /**< Supported rxseg modes. */
 	uint16_t reserved; /**< Reserved field. */
 };
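
The length check that rte_eth_rx_queue_check_sort() applies to each pool can be tried in isolation. The sketch below is a minimal stand-alone version under stated assumptions: sketch_check_sort_len() and SKETCH_HEADROOM are hypothetical names, and the headroom is fixed at 128 bytes here, whereas RTE_PKTMBUF_HEADROOM is a build-time setting in DPDK. A requested length of 0 defaults to the data room size minus the headroom, and the pool is rejected when length plus headroom does not fit in the data room.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed headroom for the sketch; configurable in a real DPDK build. */
#define SKETCH_HEADROOM 128u

/*
 * Validate a requested packet length against a pool's data room size.
 * On success, store the effective length in *eff_len and return 0.
 * Return -EINVAL if the length plus headroom does not fit.
 */
static int
sketch_check_sort_len(uint32_t data_room, uint32_t req_len, uint32_t *eff_len)
{
	assert(eff_len != NULL);

	/* The data room must at least hold the headroom itself. */
	if (data_room < SKETCH_HEADROOM)
		return -EINVAL;

	/* Length 0 means "use everything left after the headroom". */
	if (req_len == 0)
		req_len = data_room - SKETCH_HEADROOM;

	/* Written as a subtraction so req_len + headroom cannot overflow. */
	if (req_len > data_room - SKETCH_HEADROOM)
		return -EINVAL;

	*eff_len = req_len;
	return 0;
}
```

For a pool with a 2176-byte data room, a requested length of 0 resolves to 2048 bytes, while an explicit request for 2049 bytes is rejected.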