From patchwork Wed May 31 04:26:16 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Pavan Nikhilesh Bhagavatula
X-Patchwork-Id: 127736
X-Patchwork-Delegate: thomas@monjalon.net
To: Ruifeng Wang, Yipeng Wang, Sameh Gobriel, Bruce Richardson,
 Vladimir Medvedkin, Konstantin Ananyev
CC: Pavan Nikhilesh
Subject: [PATCH v4 1/2] ip_frag: optimize key compare and hash generation
Date: Wed, 31 May 2023 09:56:16 +0530
Message-ID: <20230531042617.13282-1-pbhagavatula@marvell.com>
In-Reply-To: <20230529145502.11805-1-pbhagavatula@marvell.com>
References: <20230529145502.11805-1-pbhagavatula@marvell.com>
List-Id: DPDK patches and discussions

From: Pavan Nikhilesh

Use optimized rte_hash_k32_cmp_eq routine for key comparison for
x86 and ARM64.
Use CRC instructions for hash generation on ARM64.

Signed-off-by: Pavan Nikhilesh
Reviewed-by: Ruifeng Wang
---
 On Neoverse-N2, performance improved by 10% when measured with
 examples/ip_reassembly.

 v4 Changes:
 - Fix compilation failures (sys/queue)
 - Update test case to use proper macros.
 v3 Changes:
 - Drop NEON patch.
 v2 Changes:
 - Fix compilation failure with non ARM64/x86 targets

 lib/hash/rte_cmp_arm64.h       | 16 ++++++++--------
 lib/hash/rte_cmp_x86.h         | 16 ++++++++--------
 lib/ip_frag/ip_frag_common.h   | 14 ++++++++++++++
 lib/ip_frag/ip_frag_internal.c |  4 ++--
 4 files changed, 32 insertions(+), 18 deletions(-)

--
2.25.1

diff --git a/lib/hash/rte_cmp_arm64.h b/lib/hash/rte_cmp_arm64.h
index e9e26f9abd..a3e85635eb 100644
--- a/lib/hash/rte_cmp_arm64.h
+++ b/lib/hash/rte_cmp_arm64.h
@@ -3,7 +3,7 @@
  */
 
 /* Functions to compare multiple of 16 byte keys (up to 128 bytes) */
-static int
+static inline int
 rte_hash_k16_cmp_eq(const void *key1, const void *key2,
 		    size_t key_len __rte_unused)
 {
@@ -24,7 +24,7 @@ rte_hash_k16_cmp_eq(const void *key1, const void *key2,
 	return !(x0 == 0 && x1 == 0);
 }
 
-static int
+static inline int
 rte_hash_k32_cmp_eq(const void *key1, const void *key2, size_t key_len)
 {
 	return rte_hash_k16_cmp_eq(key1, key2, key_len) ||
@@ -32,7 +32,7 @@ rte_hash_k32_cmp_eq(const void *key1, const void *key2, size_t key_len)
 				(const char *) key2 + 16, key_len);
 }
 
-static int
+static inline int
 rte_hash_k48_cmp_eq(const void *key1, const void *key2, size_t key_len)
 {
 	return rte_hash_k16_cmp_eq(key1, key2, key_len) ||
@@ -42,7 +42,7 @@ rte_hash_k48_cmp_eq(const void *key1, const void *key2, size_t key_len)
 				(const char *) key2 + 32, key_len);
 }
 
-static int
+static inline int
 rte_hash_k64_cmp_eq(const void *key1, const void *key2, size_t key_len)
 {
 	return rte_hash_k32_cmp_eq(key1, key2, key_len) ||
@@ -50,7 +50,7 @@ rte_hash_k64_cmp_eq(const void *key1, const void *key2, size_t key_len)
 				(const char *) key2 + 32, key_len);
 }
 
-static int
+static inline int
 rte_hash_k80_cmp_eq(const void *key1, const void *key2, size_t key_len)
 {
 	return rte_hash_k64_cmp_eq(key1, key2, key_len) ||
@@ -58,7 +58,7 @@ rte_hash_k80_cmp_eq(const void *key1, const void *key2, size_t key_len)
 				(const char *) key2 + 64, key_len);
 }
 
-static int
+static inline int
 rte_hash_k96_cmp_eq(const void *key1, const void *key2, size_t key_len)
 {
 	return rte_hash_k64_cmp_eq(key1, key2, key_len) ||
@@ -66,7 +66,7 @@ rte_hash_k96_cmp_eq(const void *key1, const void *key2, size_t key_len)
 				(const char *) key2 + 64, key_len);
 }
 
-static int
+static inline int
 rte_hash_k112_cmp_eq(const void *key1, const void *key2, size_t key_len)
 {
 	return rte_hash_k64_cmp_eq(key1, key2, key_len) ||
@@ -76,7 +76,7 @@ rte_hash_k112_cmp_eq(const void *key1, const void *key2, size_t key_len)
 				(const char *) key2 + 96, key_len);
 }
 
-static int
+static inline int
 rte_hash_k128_cmp_eq(const void *key1, const void *key2, size_t key_len)
 {
 	return rte_hash_k64_cmp_eq(key1, key2, key_len) ||
diff --git a/lib/hash/rte_cmp_x86.h b/lib/hash/rte_cmp_x86.h
index 13a5836351..ddfbef462f 100644
--- a/lib/hash/rte_cmp_x86.h
+++ b/lib/hash/rte_cmp_x86.h
@@ -5,7 +5,7 @@
 #include
 
 /* Functions to compare multiple of 16 byte keys (up to 128 bytes) */
-static int
+static inline int
 rte_hash_k16_cmp_eq(const void *key1, const void *key2, size_t key_len __rte_unused)
 {
 	const __m128i k1 = _mm_loadu_si128((const __m128i *) key1);
@@ -15,7 +15,7 @@ rte_hash_k16_cmp_eq(const void *key1, const void *key2, size_t key_len __rte_unu
 	return !_mm_test_all_zeros(x, x);
 }
 
-static int
+static inline int
 rte_hash_k32_cmp_eq(const void *key1, const void *key2, size_t key_len)
 {
 	return rte_hash_k16_cmp_eq(key1, key2, key_len) ||
@@ -23,7 +23,7 @@ rte_hash_k32_cmp_eq(const void *key1, const void *key2, size_t key_len)
 				(const char *) key2 + 16, key_len);
 }
 
-static int
+static inline int
 rte_hash_k48_cmp_eq(const void *key1, const void *key2, size_t key_len)
 {
 	return rte_hash_k16_cmp_eq(key1, key2, key_len) ||
@@ -33,7 +33,7 @@ rte_hash_k48_cmp_eq(const void *key1, const void *key2, size_t key_len)
 				(const char *) key2 + 32, key_len);
 }
 
-static int
+static inline int
 rte_hash_k64_cmp_eq(const void *key1, const void *key2, size_t key_len)
 {
 	return rte_hash_k32_cmp_eq(key1, key2, key_len) ||
@@ -41,7 +41,7 @@ rte_hash_k64_cmp_eq(const void *key1, const void *key2, size_t key_len)
 				(const char *) key2 + 32, key_len);
 }
 
-static int
+static inline int
 rte_hash_k80_cmp_eq(const void *key1, const void *key2, size_t key_len)
 {
 	return rte_hash_k64_cmp_eq(key1, key2, key_len) ||
@@ -49,7 +49,7 @@ rte_hash_k80_cmp_eq(const void *key1, const void *key2, size_t key_len)
 				(const char *) key2 + 64, key_len);
 }
 
-static int
+static inline int
 rte_hash_k96_cmp_eq(const void *key1, const void *key2, size_t key_len)
 {
 	return rte_hash_k64_cmp_eq(key1, key2, key_len) ||
@@ -57,7 +57,7 @@ rte_hash_k96_cmp_eq(const void *key1, const void *key2, size_t key_len)
 				(const char *) key2 + 64, key_len);
 }
 
-static int
+static inline int
 rte_hash_k112_cmp_eq(const void *key1, const void *key2, size_t key_len)
 {
 	return rte_hash_k64_cmp_eq(key1, key2, key_len) ||
@@ -67,7 +67,7 @@ rte_hash_k112_cmp_eq(const void *key1, const void *key2, size_t key_len)
 				(const char *) key2 + 96, key_len);
 }
 
-static int
+static inline int
 rte_hash_k128_cmp_eq(const void *key1, const void *key2, size_t key_len)
 {
 	return rte_hash_k64_cmp_eq(key1, key2, key_len) ||
diff --git a/lib/ip_frag/ip_frag_common.h b/lib/ip_frag/ip_frag_common.h
index 0d8ce6a1e1..7d6c1aa98d 100644
--- a/lib/ip_frag/ip_frag_common.h
+++ b/lib/ip_frag/ip_frag_common.h
@@ -7,6 +7,14 @@
 
 #include
 
+#include
+
+#if defined(RTE_ARCH_ARM64)
+#include
+#elif defined(RTE_ARCH_X86)
+#include
+#endif
+
 #include "rte_ip_frag.h"
 #include "ip_reassembly.h"
 
@@ -75,12 +83,18 @@ ip_frag_key_invalidate(struct ip_frag_key * key)
 static inline uint64_t
 ip_frag_key_cmp(const struct ip_frag_key * k1, const struct ip_frag_key * k2)
 {
+#if defined(RTE_ARCH_X86) || defined(RTE_ARCH_ARM64)
+	return (k1->id_key_len != k2->id_key_len) ||
+	       (k1->key_len == IPV4_KEYLEN ? k1->src_dst[0] != k2->src_dst[0] :
+					     rte_hash_k32_cmp_eq(k1, k2, 32));
+#else
 	uint32_t i;
 	uint64_t val;
 	val = k1->id_key_len ^ k2->id_key_len;
 	for (i = 0; i < k1->key_len; i++)
 		val |= k1->src_dst[i] ^ k2->src_dst[i];
 	return val;
+#endif
 }
 
 /*
diff --git a/lib/ip_frag/ip_frag_internal.c b/lib/ip_frag/ip_frag_internal.c
index b436a4c931..7cbef647df 100644
--- a/lib/ip_frag/ip_frag_internal.c
+++ b/lib/ip_frag/ip_frag_internal.c
@@ -45,7 +45,7 @@ ipv4_frag_hash(const struct ip_frag_key *key, uint32_t *v1, uint32_t *v2)
 
 	p = (const uint32_t *)&key->src_dst;
 
-#ifdef RTE_ARCH_X86
+#if defined(RTE_ARCH_X86) || defined(RTE_ARCH_ARM64)
 	v = rte_hash_crc_4byte(p[0], PRIME_VALUE);
 	v = rte_hash_crc_4byte(p[1], v);
 	v = rte_hash_crc_4byte(key->id, v);
@@ -66,7 +66,7 @@ ipv6_frag_hash(const struct ip_frag_key *key, uint32_t *v1, uint32_t *v2)
 
 	p = (const uint32_t *) &key->src_dst;
 
-#ifdef RTE_ARCH_X86
+#if defined(RTE_ARCH_X86) || defined(RTE_ARCH_ARM64)
 	v = rte_hash_crc_4byte(p[0], PRIME_VALUE);
 	v = rte_hash_crc_4byte(p[1], v);
 	v = rte_hash_crc_4byte(p[2], v);
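
Note on the ip_frag_key_cmp() change: both branches return zero on a match
and non-zero on a mismatch, so callers are unaffected. The standalone sketch
below illustrates that for valid IPv4 and IPv6 keys. It is not DPDK code and
makes several assumptions: struct frag_key is a simplified stand-in for
struct ip_frag_key, the IPV4_KEYLEN/IPV6_KEYLEN values are assumed, memcmp()
stands in for rte_hash_k32_cmp_eq(), and the frag_key_cmp_* names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed values: key length counted in 64-bit words of src_dst. */
#define IPV4_KEYLEN 1
#define IPV6_KEYLEN 4

/* Simplified stand-in for struct ip_frag_key (layout is an assumption). */
struct frag_key {
	uint64_t src_dst[4];            /* only src_dst[0] is used for IPv4 */
	union {
		uint64_t id_key_len;    /* id and key_len fetched as one word */
		struct {
			uint32_t id;
			uint32_t key_len;
		};
	};
};

/* Scalar path, mirroring the #else branch of the patch. */
static uint64_t
frag_key_cmp_scalar(const struct frag_key *k1, const struct frag_key *k2)
{
	uint64_t val = k1->id_key_len ^ k2->id_key_len;
	uint32_t i;

	for (i = 0; i < k1->key_len; i++)
		val |= k1->src_dst[i] ^ k2->src_dst[i];
	return val;
}

/* Fast-path shape; memcmp() replaces the 32-byte SIMD compare here. */
static uint64_t
frag_key_cmp_fast(const struct frag_key *k1, const struct frag_key *k2)
{
	return (k1->id_key_len != k2->id_key_len) ||
	       (k1->key_len == IPV4_KEYLEN ?
			k1->src_dst[0] != k2->src_dst[0] :
			memcmp(k1->src_dst, k2->src_dst, 32) != 0);
}

/* Checks that both paths agree on match/mismatch for a few sample keys. */
static void
frag_key_cmp_selfcheck(void)
{
	struct frag_key a, b;

	/* IPv6-style key: all four words take part in the compare. */
	memset(&a, 0, sizeof(a));
	a.id = 7;
	a.key_len = IPV6_KEYLEN;
	a.src_dst[0] = 0x1111;
	a.src_dst[3] = 0x4444;
	b = a;
	assert(frag_key_cmp_scalar(&a, &b) == 0);
	assert(frag_key_cmp_fast(&a, &b) == 0);
	b.src_dst[3] ^= 1;
	assert(frag_key_cmp_scalar(&a, &b) != 0);
	assert(frag_key_cmp_fast(&a, &b) != 0);

	/* IPv4-style key: words 1..3 are ignored by both paths. */
	memset(&a, 0, sizeof(a));
	a.id = 9;
	a.key_len = IPV4_KEYLEN;
	a.src_dst[0] = 0xc0a80001c0a80002ULL;
	b = a;
	b.src_dst[1] = 0xdead;
	assert(frag_key_cmp_scalar(&a, &b) == 0);
	assert(frag_key_cmp_fast(&a, &b) == 0);
	b.id = 10;
	assert(frag_key_cmp_scalar(&a, &b) != 0);
	assert(frag_key_cmp_fast(&a, &b) != 0);
}
```

Calling frag_key_cmp_selfcheck() from a small main() exercises both paths.
The sketch deliberately leaves out keys with key_len == 0: there the scalar
loop inspects no address words while the fast path still compares all
32 bytes, so the two are only claimed to agree for IPv4 and IPv6 key lengths.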