[1/2] net/i40e: desc loading is unnecessarily ordered for aarch64

Message ID 1565693011-33998-2-git-send-email-gavin.hu@arm.com
State New
Delegated to: Qi Zhang
Headers show
Series
  • i40e neon vPMD optiomization for aarch64
Related show

Checks

Context Check Description
ci/mellanox-Performance-Testing success Performance Testing PASS
ci/intel-Performance-Testing success Performance Testing PASS
ci/iol-Compile-Testing success Compile Testing PASS
ci/Intel-compilation success Compilation OK
ci/checkpatch success coding style OK

Commit Message

Gavin Hu Aug. 13, 2019, 10:43 a.m.
For x86, the descriptors needs to be loaded in order, so in between two
descriptors loading, there is a compiler barrier in place.[1]
For aarch64, a patch [2] is in place to survive with discontinuous DD bits,
the barriers can be removed to take full advantage of out-of-order
execution.

50% performance gain in the RFC2544 NDR test was measured on ThunderX2.
12.50% performan gain in the RFC2544 NDR test was measured on Ampere
eMAG80 platform.

[1] http://inbox.dpdk.org/users/039ED4275CED7440929022BC67E7061153D71548@
SHSMSX105.ccr.corp.intel.com/
[2] https://mails.dpdk.org/archives/stable/2017-October/003324.html

Fixes: ae0eb310f253 ("net/i40e: implement vector PMD for ARM")
Cc: stable@dpdk.org

Signed-off-by: Gavin Hu <gavin.hu@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
---
 drivers/net/i40e/i40e_rxtx_vec_neon.c | 1 -
 1 file changed, 1 deletion(-)

Patch

diff --git a/drivers/net/i40e/i40e_rxtx_vec_neon.c b/drivers/net/i40e/i40e_rxtx_vec_neon.c
index 83572ef..5555e9b 100644
--- a/drivers/net/i40e/i40e_rxtx_vec_neon.c
+++ b/drivers/net/i40e/i40e_rxtx_vec_neon.c
@@ -285,7 +285,6 @@  _recv_raw_pkts_vec(struct i40e_rx_queue *rxq, struct rte_mbuf **rx_pkts,
 		/* Read desc statuses backwards to avoid race condition */
 		/* A.1 load 4 pkts desc */
 		descs[3] =  vld1q_u64((uint64_t *)(rxdp + 3));
-		rte_rmb();
 
 		/* B.2 copy 2 mbuf point into rx_pkts  */
 		vst1q_u64((uint64_t *)&rx_pkts[pos], mbp1);