| Message ID | 20210224111852.11947-2-ciara.loftus@intel.com (mailing list archive) |
|---|---|
| State | Changes Requested, archived |
| Delegated to | Ferruh Yigit |
| Series | AF_XDP Preferred Busy Polling |
| Context | Check | Description |
|---|---|---|
| ci/checkpatch | success | coding style OK |
On 2/24/2021 11:18 AM, Ciara Loftus wrote:
> Prior to this the max size was 32 which was unnecessarily
> small.

Can you please describe the impact? Why the change from 32 to 512?
I assume this is to improve performance, but can you please explicitly
document it in the commit log?

> Also enforce the max batch size for TX for both
> copy and zero copy modes. Prior to this only copy mode
> enforced the max size.
>

By enforcing this, the PMD silently ignores the user-provided burst value
when it is larger than the PMD-supported MAX, and there is no way to
discover this MAX value without checking the code.

Overall, why are these max values required at all? After a quick check I can
see they are used for some bulk operations, which I assume can be eliminated,
what do you think?

> Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>

<...>
>
> On 2/24/2021 11:18 AM, Ciara Loftus wrote:
> > Prior to this the max size was 32 which was unnecessarily
> > small.
>
> Can you please describe the impact? Why the change from 32 to 512?
> I assume this is to improve performance, but can you please explicitly
> document it in the commit log?

Indeed - improved performance due to bulk operations and fewer ring accesses
and syscalls.
The value 512 was arbitrary. I will change this to the default ring size as
defined by libbpf (2048) in v2.
Will update the commit log with this info.

> > Also enforce the max batch size for TX for both
> > copy and zero copy modes. Prior to this only copy mode
> > enforced the max size.
> >
>
> By enforcing this, the PMD silently ignores the user-provided burst value
> when it is larger than the PMD-supported MAX, and there is no way to
> discover this MAX value without checking the code.
>
> Overall, why are these max values required at all? After a quick check I can
> see they are used for some bulk operations, which I assume can be eliminated,
> what do you think?

We need to size some arrays at compile time with this max value.

Instead of removing the bulk operations, which may impact performance, how
about taking an approach where we split batches that are > 2048 into smaller
batches and still handle all the packets, instead of discarding those > 2048?
Something like what's done in ixgbe for example:
http://code.dpdk.org/dpdk/v21.02/source/drivers/net/ixgbe/ixgbe_rxtx.c#L318

Thanks,
Ciara

>
> > Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
>
> <...>
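[For illustration only, a minimal sketch of the compile-time array-sizing point above, assuming a simplified RX path; the queue struct, field names, and function name are placeholders and this is not an excerpt from the driver.]

```c
/*
 * Simplified sketch - not verbatim driver code. It shows why a
 * compile-time batch maximum exists: the data path keeps fixed-size
 * scratch arrays for bulk operations, so the burst is clamped to the
 * array length.
 */
#include <stdint.h>
#include <rte_common.h>
#include <rte_mbuf.h>
#include <rte_mempool.h>

#define ETH_AF_XDP_RX_BATCH_SIZE 512

/* placeholder queue type for the sketch */
struct sketch_rx_queue {
	struct rte_mempool *mb_pool;
};

static uint16_t
sketch_af_xdp_rx(struct sketch_rx_queue *rxq, struct rte_mbuf **bufs,
		 uint16_t nb_pkts)
{
	/* scratch array sized by the compile-time batch maximum */
	struct rte_mbuf *mbufs[ETH_AF_XDP_RX_BATCH_SIZE];
	uint16_t i;

	/* clamp the burst so it fits in mbufs[] */
	nb_pkts = RTE_MIN(nb_pkts, ETH_AF_XDP_RX_BATCH_SIZE);

	/* one bulk allocation per burst rather than one call per packet */
	if (rte_pktmbuf_alloc_bulk(rxq->mb_pool, mbufs, nb_pkts) != 0)
		return 0;

	/* the real driver would fill mbufs[] from the AF_XDP RX ring here */
	for (i = 0; i < nb_pkts; i++)
		bufs[i] = mbufs[i];

	return nb_pkts;
}
```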
On 3/3/2021 3:07 PM, Loftus, Ciara wrote:
>>
>> On 2/24/2021 11:18 AM, Ciara Loftus wrote:
>>> Prior to this the max size was 32 which was unnecessarily
>>> small.
>>
>> Can you please describe the impact? Why the change from 32 to 512?
>> I assume this is to improve performance, but can you please explicitly
>> document it in the commit log?
>
> Indeed - improved performance due to bulk operations and fewer ring accesses
> and syscalls.
> The value 512 was arbitrary. I will change this to the default ring size as
> defined by libbpf (2048) in v2.
> Will update the commit log with this info.
>
>>
>>> Also enforce the max batch size for TX for both
>>> copy and zero copy modes. Prior to this only copy mode
>>> enforced the max size.
>>>
>>
>> By enforcing this, the PMD silently ignores the user-provided burst value
>> when it is larger than the PMD-supported MAX, and there is no way to
>> discover this MAX value without checking the code.
>>
>> Overall, why are these max values required at all? After a quick check I can
>> see they are used for some bulk operations, which I assume can be eliminated,
>> what do you think?
>
> We need to size some arrays at compile time with this max value.
>
> Instead of removing the bulk operations, which may impact performance, how
> about taking an approach where we split batches that are > 2048 into smaller
> batches and still handle all the packets, instead of discarding those > 2048?
> Something like what's done in ixgbe for example:
> http://code.dpdk.org/dpdk/v21.02/source/drivers/net/ixgbe/ixgbe_rxtx.c#L318
>

If there is no reasonable way to eliminate the fixed-size arrays, the above
suggestion looks good.

Thanks,
ferruh
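[A rough sketch of the split-and-transmit approach discussed above, in the spirit of the ixgbe wrapper linked in the thread. This is not the v2 patch: af_xdp_tx_batch() is a placeholder stub for the real copy/zero-copy TX path, and 2048 is assumed from the libbpf default ring size mentioned above.]

```c
/*
 * Sketch: split an oversized burst into batches instead of silently
 * truncating it. af_xdp_tx_batch() stands in for the real TX path.
 */
#include <stdint.h>
#include <rte_common.h>
#include <rte_mbuf.h>

#define ETH_AF_XDP_TX_BATCH_SIZE 2048	/* assumed libbpf default ring size */

/* stand-in for the real copy/zero-copy TX path; pretends all were sent */
static uint16_t
af_xdp_tx_batch(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
{
	(void)queue;
	(void)bufs;
	return nb_pkts;
}

static uint16_t
sketch_eth_af_xdp_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
{
	uint16_t nb_tx = 0;

	while (nb_pkts) {
		uint16_t ret, n;

		/* send at most one batch worth of packets per iteration */
		n = (uint16_t)RTE_MIN(nb_pkts, ETH_AF_XDP_TX_BATCH_SIZE);
		ret = af_xdp_tx_batch(queue, &bufs[nb_tx], n);

		nb_tx += ret;
		nb_pkts -= ret;

		/* ring full: report the partial transmit to the caller */
		if (ret < n)
			break;
	}

	return nb_tx;
}
```

The caller still sees standard rte_eth_tx_burst() semantics: the return value tells it how many packets were actually accepted, so nothing is dropped silently.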
diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c b/drivers/net/af_xdp/rte_eth_af_xdp.c
index b8d5ad0d91..b51db90204 100644
--- a/drivers/net/af_xdp/rte_eth_af_xdp.c
+++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
@@ -66,8 +66,8 @@ RTE_LOG_REGISTER(af_xdp_logtype, pmd.net.af_xdp, NOTICE);
 #define ETH_AF_XDP_DFLT_START_QUEUE_IDX	0
 #define ETH_AF_XDP_DFLT_QUEUE_COUNT	1
 
-#define ETH_AF_XDP_RX_BATCH_SIZE	32
-#define ETH_AF_XDP_TX_BATCH_SIZE	32
+#define ETH_AF_XDP_RX_BATCH_SIZE	512
+#define ETH_AF_XDP_TX_BATCH_SIZE	512
 
 struct xsk_umem_info {
@@ -535,8 +535,6 @@ af_xdp_tx_cp(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
 	uint32_t idx_tx;
 	struct xsk_ring_cons *cq = &txq->pair->cq;
 
-	nb_pkts = RTE_MIN(nb_pkts, ETH_AF_XDP_TX_BATCH_SIZE);
-
 	pull_umem_cq(umem, nb_pkts, cq);
 
 	nb_pkts = rte_ring_dequeue_bulk(umem->buf_ring, addrs,
@@ -580,6 +578,8 @@
 static uint16_t
 eth_af_xdp_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
 {
+	nb_pkts = RTE_MIN(nb_pkts, ETH_AF_XDP_TX_BATCH_SIZE);
+
 #if defined(XDP_UMEM_UNALIGNED_CHUNK_FLAG)
 	return af_xdp_tx_zc(queue, bufs, nb_pkts);
 #else
Prior to this the max size was 32 which was unnecessarily
small.

Also enforce the max batch size for TX for both
copy and zero copy modes. Prior to this only copy mode
enforced the max size.

Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
---
 drivers/net/af_xdp/rte_eth_af_xdp.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)