From patchwork Wed Nov 16 12:14:16 2022
X-Patchwork-Submitter: Morten Brørup
X-Patchwork-Id: 119891
X-Patchwork-Delegate: thomas@monjalon.net
From: Morten Brørup <mb@smartsharesystems.com>
To: olivier.matz@6wind.com, andrew.rybchenko@oktetlabs.ru, dev@dpdk.org
Cc: honnappa.nagarahalli@arm.com, bruce.richardson@intel.com,
 konstantin.ananyev@huawei.com, Morten Brørup <mb@smartsharesystems.com>
Subject: [PATCH v2] mempool: micro-optimize put function
Date: Wed, 16 Nov 2022 13:14:16 +0100
Message-Id: <20221116121416.94990-1-mb@smartsharesystems.com>
In-Reply-To: <20221116101855.93297-1-mb@smartsharesystems.com>
References: <20221116101855.93297-1-mb@smartsharesystems.com>

Micro-optimization: Reduced the most likely code path in the generic
put function by moving an unlikely check out of that path and further
down the function.

Also updated the comments in the function.

v2 (feedback from Andrew Rybchenko):
* Modified the comparison to prevent overflow if n is really huge and
  len is non-zero.
* Added an assertion about the invariant that prevents overflow in the
  comparison.
* Crossing the threshold is not extremely unlikely, so removed likely()
  from that comparison. The compiler will generate code with optimal
  static branch prediction here anyway.

Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
---
 lib/mempool/rte_mempool.h | 36 ++++++++++++++++++++----------------
 1 file changed, 20 insertions(+), 16 deletions(-)

diff --git a/lib/mempool/rte_mempool.h b/lib/mempool/rte_mempool.h
index 9f530db24b..dd1a3177d6 100644
--- a/lib/mempool/rte_mempool.h
+++ b/lib/mempool/rte_mempool.h
@@ -1364,32 +1364,36 @@ rte_mempool_do_generic_put(struct rte_mempool *mp, void * const *obj_table,
 {
 	void **cache_objs;
 
-	/* No cache provided */
+	/* No cache provided? */
 	if (unlikely(cache == NULL))
 		goto driver_enqueue;
 
-	/* increment stat now, adding in mempool always success */
+	/* Increment stats now, adding in mempool always succeeds. */
 	RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_bulk, 1);
 	RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_objs, n);
 
-	/* The request itself is too big for the cache */
-	if (unlikely(n > cache->flushthresh))
-		goto driver_enqueue_stats_incremented;
-
-	/*
-	 * The cache follows the following algorithm:
-	 *   1. If the objects cannot be added to the cache without crossing
-	 *      the flush threshold, flush the cache to the backend.
-	 *   2. Add the objects to the cache.
-	 */
+	/* Assert the invariant preventing overflow in the comparison below. */
+	RTE_ASSERT(cache->len <= cache->flushthresh);
 
-	if (cache->len + n <= cache->flushthresh) {
+	if (n <= cache->flushthresh - cache->len) {
+		/*
+		 * The objects can be added to the cache without crossing the
+		 * flush threshold.
+		 */
 		cache_objs = &cache->objs[cache->len];
 		cache->len += n;
-	} else {
+	} else if (likely(n <= cache->flushthresh)) {
+		/*
+		 * The request itself fits into the cache.
+		 * But first, the cache must be flushed to the backend, so
+		 * adding the objects does not cross the flush threshold.
+		 */
 		cache_objs = &cache->objs[0];
 		rte_mempool_ops_enqueue_bulk(mp, cache_objs, cache->len);
 		cache->len = n;
+	} else {
+		/* The request itself is too big for the cache. */
+		goto driver_enqueue_stats_incremented;
 	}
 
 	/* Add the objects to the cache. */
@@ -1399,13 +1403,13 @@ rte_mempool_do_generic_put(struct rte_mempool *mp, void * const *obj_table,
 
 driver_enqueue:
 
-	/* increment stat now, adding in mempool always success */
+	/* Increment stats now, adding in mempool always succeeds. */
 	RTE_MEMPOOL_STAT_ADD(mp, put_bulk, 1);
 	RTE_MEMPOOL_STAT_ADD(mp, put_objs, n);
 
 driver_enqueue_stats_incremented:
 
-	/* push objects to the backend */
+	/* Push the objects to the backend. */
 	rte_mempool_ops_enqueue_bulk(mp, obj_table, n);
 }
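
Note on the overflow concern addressed in v2: below is a minimal standalone
sketch (not part of the patch) showing why the reworked comparison cannot
wrap around. It assumes uint32_t fields analogous to cache->len and
cache->flushthresh; the toy_cache struct, the field values and the request
size are hypothetical and only serve the demonstration.

#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the relevant mempool cache fields. */
struct toy_cache {
	uint32_t len;         /* current number of cached objects */
	uint32_t flushthresh; /* flush threshold */
};

int main(void)
{
	struct toy_cache cache = { .len = 100, .flushthresh = 512 };
	uint32_t n = UINT32_MAX - 50; /* a "really huge" request */

	/* Invariant asserted by the patch: len never exceeds flushthresh. */
	assert(cache.len <= cache.flushthresh);

	/*
	 * Old comparison: cache.len + n wraps around to a small value, so
	 * the huge request wrongly appears to fit below the threshold.
	 */
	printf("old check says it fits: %d\n",
	       cache.len + n <= cache.flushthresh);   /* prints 1 (wrong) */

	/*
	 * New comparison: flushthresh - len cannot underflow thanks to the
	 * invariant, so the huge request is correctly rejected.
	 */
	printf("new check says it fits: %d\n",
	       n <= cache.flushthresh - cache.len);   /* prints 0 (correct) */

	return 0;
}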