diff mbox series

[RFC] ethdev: introduce shared Rx queue

Message ID 20210727034204.20649-1-xuemingl@nvidia.com (mailing list archive)
State Superseded, archived
Delegated to: Ferruh Yigit
Series [RFC] ethdev: introduce shared Rx queue

Checks

Context               Check    Description
ci/intel-Testing      success  Testing PASS
ci/Intel-compilation  success  Compilation OK
ci/checkpatch         success  coding style OK

Commit Message

Xueming(Steven) Li July 27, 2021, 3:42 a.m. UTC
In the eth PMD driver model, each Rx queue is pre-loaded with mbufs for
saving incoming packets. When the number of SFs or VFs scales out in a
switch domain, the memory consumption becomes significant. Most
importantly, polling all ports leads to a high cache miss rate, high
latency and low throughput.

To save memory and speed up polling, this patch introduces shared Rx
queues. Ports with the same configuration in a switch domain can share
an Rx queue set by specifying the offload flag
RTE_ETH_RX_OFFLOAD_SHARED_RXQ. Polling any member port of a shared Rx
queue receives packets for all member ports. The source port is
identified by mbuf->port.

The number of queues should be identical for all ports in a shared
group. Queue indexes are mapped 1:1 within a shared group.

A shared Rx queue is supposed to be polled on the same thread.

Multiple groups are supported through a group ID.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
---
 lib/ethdev/rte_ethdev.c | 1 +
 lib/ethdev/rte_ethdev.h | 7 +++++++
 2 files changed, 8 insertions(+)
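
The receive model described above (poll one member port, then attribute
each packet to its real source via mbuf->port) can be sketched roughly as
follows. This is only an illustration under stated assumptions, not code
from the patch: struct mock_mbuf is a hypothetical stand-in for struct
rte_mbuf, and MAX_PORTS, struct port_stats and demux_burst() are names
invented for this sketch. In a real application the burst would come from
rte_eth_rx_burst() on any member port of the shared group.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PORTS 8 /* hypothetical upper bound used only by this sketch */

/* Hypothetical stand-in for struct rte_mbuf; only the fields used here. */
struct mock_mbuf {
	uint16_t port;    /* source port, mirrors mbuf->port in the RFC */
	uint32_t pkt_len; /* packet length in bytes */
};

/* Per-port counters the application keeps after demultiplexing. */
struct port_stats {
	uint64_t packets;
	uint64_t bytes;
};

/*
 * Account a burst received from a shared Rx queue to the member port
 * each packet actually arrived on. Returns the number of packets whose
 * source port is outside [0, MAX_PORTS).
 */
static unsigned int
demux_burst(struct mock_mbuf *const *pkts, size_t nb_pkts,
	    struct port_stats stats[MAX_PORTS])
{
	unsigned int out_of_range = 0;

	for (size_t i = 0; i < nb_pkts; i++) {
		assert(pkts[i] != NULL);
		uint16_t src = pkts[i]->port;

		if (src >= MAX_PORTS) {
			out_of_range++;
			continue;
		}
		stats[src].packets++;
		stats[src].bytes += pkts[i]->pkt_len;
	}
	return out_of_range;
}
```

Because the queue is expected to be polled from a single thread, the
sketch updates the counters without any locking.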

Comments

Andrew Rybchenko July 28, 2021, 7:56 a.m. UTC | #1
On 7/27/21 6:42 AM, Xueming Li wrote:
> In eth PMD driver model, each RX queue was pre-loaded with mbufs for
> saving incoming packets. When number of SF or VF scale out in a switch
> domain, the memory consumption became significant. Most important,
> polling all ports leads to high cache miss, high latency and low
> throughput.
> 
> To save memory and speed up, this patch introduces shared RX queue.
> Ports with same configuration in a switch domain could share RX queue
> set by specifying offloading flag RTE_ETH_RX_OFFLOAD_SHARED_RXQ. Polling
> a member port in shared RX queue receives packets for all member ports.
> Source port is identified by mbuf->port.
> 
> Queue number of ports in shared group should be identical. Queue index
> is 1:1 mapped in shared group.
> 
> Shared RX queue is supposed to be polled on same thread.
> 
> Multiple groups is supported by group ID.
> 
> Signed-off-by: Xueming Li <xuemingl@nvidia.com>

It looks like it could be useful for artificial benchmarks, but
absolutely useless in real life. SFs and VFs are used by VMs
(or containers?) to have their own part of HW. If so, SF or VF
Rx and Tx queues live in a VM and cannot be shared.

Sharing makes sense for representors, but it is not mentioned in
the description.
Xueming(Steven) Li July 28, 2021, 8:20 a.m. UTC | #2
Hi Andrew,

> -----Original Message-----
> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> Sent: Wednesday, July 28, 2021 3:57 PM
> To: Xueming(Steven) Li <xuemingl@nvidia.com>
> Cc: dev@dpdk.org; Slava Ovsiienko <viacheslavo@nvidia.com>; NBU-Contact-Thomas Monjalon <thomas@monjalon.net>; Ferruh
> Yigit <ferruh.yigit@intel.com>
> Subject: Re: [RFC] ethdev: introduce shared Rx queue
> 
> On 7/27/21 6:42 AM, Xueming Li wrote:
> > In eth PMD driver model, each RX queue was pre-loaded with mbufs for
> > saving incoming packets. When number of SF or VF scale out in a switch
> > domain, the memory consumption became significant. Most important,
> > polling all ports leads to high cache miss, high latency and low
> > throughput.
> >
> > To save memory and speed up, this patch introduces shared RX queue.
> > Ports with same configuration in a switch domain could share RX queue
> > set by specifying offloading flag RTE_ETH_RX_OFFLOAD_SHARED_RXQ.
> > Polling a member port in shared RX queue receives packets for all member ports.
> > Source port is identified by mbuf->port.
> >
> > Queue number of ports in shared group should be identical. Queue index
> > is 1:1 mapped in shared group.
> >
> > Shared RX queue is supposed to be polled on same thread.
> >
> > Multiple groups is supported by group ID.
> >
> > Signed-off-by: Xueming Li <xuemingl@nvidia.com>
> 
> It looks like it could be useful to artificial benchmarks, but absolutely useless for real life. SFs and VFs are used by VMs (or containers?)
> to have its own part of HW. If so, SF or VF Rx and Tx queues live in a VM and cannot be shared.

Thanks for looking at this! Agreed, SFs and VFs can't be shared.

> 
> Sharing makes sense for representors, but it is not mentioned in the description.

Yes, the major target is representors, i.e. ports in the same switch domain. I'll emphasize this in the next version.
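
For illustration, the membership rules stated in the commit message (same
switch domain, same group ID, identical queue count, shared-RXQ offload
requested) might be checked along these lines. This is a sketch under
assumptions, not part of the patch: struct mock_port_cfg and
can_share_rxq() are hypothetical, only the flag value is taken from the
diff, and "same configuration" is approximated here by comparing Rx
offload flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Flag value as proposed in the patch; everything else is a mock. */
#define RTE_ETH_RX_OFFLOAD_SHARED_RXQ 0x00200000

/* Hypothetical per-port summary, not a DPDK structure. */
struct mock_port_cfg {
	uint16_t switch_domain; /* switch domain the port belongs to */
	uint16_t nb_rx_queues;  /* number of Rx queues configured */
	uint64_t rx_offloads;   /* Rx offload flags */
	uint32_t shared_group;  /* group ID, mirrors rxconf.shared_group */
};

/* Return true if ports a and b could be members of one shared group. */
static bool
can_share_rxq(const struct mock_port_cfg *a, const struct mock_port_cfg *b)
{
	assert(a != NULL && b != NULL);

	/* Both ports must request the shared Rx queue offload. */
	if (!(a->rx_offloads & RTE_ETH_RX_OFFLOAD_SHARED_RXQ) ||
	    !(b->rx_offloads & RTE_ETH_RX_OFFLOAD_SHARED_RXQ))
		return false;

	/* Same domain and group, identical queue count and offloads. */
	return a->switch_domain == b->switch_domain &&
	       a->shared_group == b->shared_group &&
	       a->nb_rx_queues == b->nb_rx_queues &&
	       a->rx_offloads == b->rx_offloads;
}
```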

Patch

diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
index a1106f5896..632a0e890b 100644
--- a/lib/ethdev/rte_ethdev.c
+++ b/lib/ethdev/rte_ethdev.c
@@ -127,6 +127,7 @@  static const struct {
 	RTE_RX_OFFLOAD_BIT2STR(OUTER_UDP_CKSUM),
 	RTE_RX_OFFLOAD_BIT2STR(RSS_HASH),
 	RTE_ETH_RX_OFFLOAD_BIT2STR(BUFFER_SPLIT),
+	RTE_ETH_RX_OFFLOAD_BIT2STR(SHARED_RXQ),
 };
 
 #undef RTE_RX_OFFLOAD_BIT2STR
diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
index d2b27c351f..5c63751be0 100644
--- a/lib/ethdev/rte_ethdev.h
+++ b/lib/ethdev/rte_ethdev.h
@@ -1047,6 +1047,7 @@  struct rte_eth_rxconf {
 	uint8_t rx_drop_en; /**< Drop packets if no descriptors are available. */
 	uint8_t rx_deferred_start; /**< Do not start queue with rte_eth_dev_start(). */
 	uint16_t rx_nseg; /**< Number of descriptions in rx_seg array. */
+	uint32_t shared_group; /**< Shared port group index in switch domain. */
 	/**
 	 * Per-queue Rx offloads to be set using DEV_RX_OFFLOAD_* flags.
 	 * Only offloads set on rx_queue_offload_capa or rx_offload_capa
@@ -1373,6 +1374,12 @@  struct rte_eth_conf {
 #define DEV_RX_OFFLOAD_OUTER_UDP_CKSUM  0x00040000
 #define DEV_RX_OFFLOAD_RSS_HASH		0x00080000
 #define RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT 0x00100000
+/**
+ * Rx queue is shared among ports in a switch domain to save memory and
+ * avoid polling every port. Any port in the group can be used to receive
+ * packets. The real source port number is saved in the mbuf->port field.
+ */
+#define RTE_ETH_RX_OFFLOAD_SHARED_RXQ   0x00200000
 
 #define DEV_RX_OFFLOAD_CHECKSUM (DEV_RX_OFFLOAD_IPV4_CKSUM | \
 				 DEV_RX_OFFLOAD_UDP_CKSUM | \