[v12,2/2] net/i40e: replace put function

Message ID 20230721162836.1208772-2-dharmikjayesh.thakkar@arm.com (mailing list archive)
State New
Delegated to: Thomas Monjalon
Headers
Series [v12,1/2] mempool cache: add zero-copy get and put functions |

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/Intel-compilation success Compilation OK
ci/loongarch-compilation success Compilation OK
ci/loongarch-unit-testing success Unit Testing PASS
ci/intel-Testing success Testing PASS
ci/github-robot: build success github build: passed
ci/intel-Functional success Functional PASS
ci/iol-mellanox-Performance success Performance Testing PASS
ci/iol-abi-testing success Testing PASS
ci/iol-aarch-unit-testing success Testing PASS
ci/iol-unit-testing fail Testing issues
ci/iol-x86_64-compile-testing success Testing PASS
ci/iol-aarch64-compile-testing success Testing PASS
ci/iol-broadcom-Performance success Performance Testing PASS
ci/iol-intel-Functional success Functional Testing PASS
ci/iol-intel-Performance success Performance Testing PASS
ci/iol-broadcom-Functional success Functional Testing PASS
ci/iol-testing success Testing PASS
ci/iol-x86_64-unit-testing success Testing PASS

Commit Message

Dharmik Thakkar July 21, 2023, 4:28 p.m. UTC
  From: Kamalakshitha Aligeri <kamalakshitha.aligeri@arm.com>

Integrated zero-copy put API in mempool cache in i40e PMD.
On Ampere Altra server, l3fwd single core's performance improves by 5%
with the new API

Signed-off-by: Kamalakshitha Aligeri <kamalakshitha.aligeri@arm.com>
Signed-off-by: Dharmik Thakkar <dharmikjayesh.thakkar@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Feifei Wang <feifei.wang2@arm.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Konstantin Ananyev <konstantin.v.ananyev@yandex.ru>
---
 drivers/net/i40e/i40e_rxtx_vec_common.h | 29 ++++++++++++++++++++-----
 1 file changed, 23 insertions(+), 6 deletions(-)
  

Patch

diff --git a/drivers/net/i40e/i40e_rxtx_vec_common.h b/drivers/net/i40e/i40e_rxtx_vec_common.h
index fe1a6ec75ef5..bb746fc36823 100644
--- a/drivers/net/i40e/i40e_rxtx_vec_common.h
+++ b/drivers/net/i40e/i40e_rxtx_vec_common.h
@@ -85,7 +85,7 @@  i40e_tx_free_bufs(struct i40e_tx_queue *txq)
 	uint32_t n;
 	uint32_t i;
 	int nb_free = 0;
-	struct rte_mbuf *m, *free[RTE_I40E_TX_MAX_FREE_BUF_SZ];
+	struct rte_mbuf *m, *free[RTE_I40E_TX_MAX_FREE_BUF_SZ] = {0};
 
 	/* check DD bits on threshold descriptor */
 	if ((txq->tx_ring[txq->tx_next_dd].cmd_type_offset_bsz &
@@ -95,18 +95,35 @@  i40e_tx_free_bufs(struct i40e_tx_queue *txq)
 
 	n = txq->tx_rs_thresh;
 
-	 /* first buffer to free from S/W ring is at index
-	  * tx_next_dd - (tx_rs_thresh-1)
-	  */
+	/* first buffer to free from S/W ring is at index
+	 * tx_next_dd - (tx_rs_thresh-1)
+	 */
 	txep = &txq->sw_ring[txq->tx_next_dd - (n - 1)];
 
 	if (txq->offloads & RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE) {
+		struct rte_mempool *mp = txep[0].mbuf->pool;
+		struct rte_mempool_cache *cache = rte_mempool_default_cache(mp, rte_lcore_id());
+		void **cache_objs;
+
+		if (unlikely(!cache))
+			goto fallback;
+
+		cache_objs = rte_mempool_cache_zc_put_bulk(cache, mp, n);
+		if (unlikely(!cache_objs))
+			goto fallback;
+
 		for (i = 0; i < n; i++) {
-			free[i] = txep[i].mbuf;
+			cache_objs[i] = txep[i].mbuf;
 			/* no need to reset txep[i].mbuf in vector path */
 		}
-		rte_mempool_put_bulk(free[0]->pool, (void **)free, n);
 		goto done;
+
+fallback:
+		for (i = 0; i < n; i++)
+			free[i] = txep[i].mbuf;
+		rte_mempool_generic_put(mp, (void **)free, n, cache);
+		goto done;
+
 	}
 
 	m = rte_pktmbuf_prefree_seg(txep[0].mbuf);