diff mbox series

[1/2] net/bnxt: avoid unnecessary work in AVX2 Rx path

Message ID 20211115182410.5545-2-lance.richardson@broadcom.com (mailing list archive)
State Accepted, archived
Delegated to: Ajit Khaparde
Headers show
Series net/bnxt: minor performance fixes for AVX2 Rx | expand

Checks

Context Check Description
ci/checkpatch success coding style OK

Commit Message

Lance Richardson Nov. 15, 2021, 6:24 p.m. UTC
Each call to the AVX2 vector burst receive function makes at
least one pass through the function's inner loop, loading
256 bytes of completion descriptors and copying 8 rte_mbuf
pointers regardless of whether there are any packets to be
received.

Unidirectional forwarding performance is improved by about
3-4% if we ensure that at least one packet can be received
before entering the inner loop.

Fixes: c4e4c18963b0 ("net/bnxt: add AVX2 RX/Tx")
Cc: stable@dpdk.org
Signed-off-by: Lance Richardson <lance.richardson@broadcom.com>
---
 drivers/net/bnxt/bnxt_rxtx_vec_avx2.c | 4 ++++
 1 file changed, 4 insertions(+)
diff mbox series

Patch

diff --git a/drivers/net/bnxt/bnxt_rxtx_vec_avx2.c b/drivers/net/bnxt/bnxt_rxtx_vec_avx2.c
index e4905b4fd1..54e3af22ac 100644
--- a/drivers/net/bnxt/bnxt_rxtx_vec_avx2.c
+++ b/drivers/net/bnxt/bnxt_rxtx_vec_avx2.c
@@ -98,6 +98,10 @@  recv_burst_vec_avx2(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
 	rte_prefetch0(&cp_desc_ring[cons + 8]);
 	rte_prefetch0(&cp_desc_ring[cons + 12]);
 
+	/* Return immediately if there is not at least one completed packet. */
+	if (!bnxt_cpr_cmp_valid(&cp_desc_ring[cons], raw_cons, cp_ring_size))
+		return 0;
+
 	/* Ensure that we do not go past the ends of the rings. */
 	nb_pkts = RTE_MIN(nb_pkts, RTE_MIN(rx_ring_size - mbcons,
 					   (cp_ring_size - cons) / 2));