From patchwork Sat Oct 7 07:15:59 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jieqiang Wang
X-Patchwork-Id: 132382
X-Patchwork-Delegate: thomas@monjalon.net
From: Jieqiang Wang
To: Yipeng Wang, Sameh Gobriel, Bruce Richardson, Vladimir Medvedkin,
 Dharmik Thakkar, Honnappa Nagarahalli
Cc: dev@dpdk.org, nd@arm.com, Jieqiang Wang, stable@dpdk.org,
 Feifei Wang, Ruifeng Wang
Subject: [PATCH v2] hash: fix SSE comparison
Date: Sat, 7 Oct 2023 15:15:59 +0800
Message-Id: <20231007071559.3453852-1-jieqiang.wang@arm.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20230906023100.3618303-1-jieqiang.wang@arm.com>
References: <20230906023100.3618303-1-jieqiang.wang@arm.com>
List-Id: DPDK patches and discussions

_mm_cmpeq_epi16 returns 0xFFFF if the corresponding 16-bit elements are
equal. The original SSE2 implementation of compare_signatures uses
_mm_movemask_epi8 to build a mask from the MSB of each 8-bit element,
whereas only the MSB of the lower 8 bits of each 16-bit element matters.
For example, if every signature matches, the SSE2 path returns 0xFFFF
while the NEON and default scalar paths return 0x5555. This bug has no
negative effect today, since the caller only examines the trailing zeros
of each match mask, but we recommend this fix to keep the SSE2 path
consistent with the NEON and default scalar code.

Fixes: c7d93df552c2 ("hash: use partial-key hashing")
Cc: stable@dpdk.org

Signed-off-by: Feifei Wang
Signed-off-by: Jieqiang Wang
Reviewed-by: Ruifeng Wang
---
 lib/hash/rte_cuckoo_hash.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/lib/hash/rte_cuckoo_hash.c b/lib/hash/rte_cuckoo_hash.c
index d92a903bb3..d348d45246 100644
--- a/lib/hash/rte_cuckoo_hash.c
+++ b/lib/hash/rte_cuckoo_hash.c
@@ -1868,11 +1868,15 @@ compare_signatures(uint32_t *prim_hash_matches, uint32_t *sec_hash_matches,
 				_mm_load_si128(
 				(__m128i const *)prim_bkt->sig_current),
 				_mm_set1_epi16(sig)));
+		/* Extract the even-index bits only */
+		*prim_hash_matches &= 0x5555;
 		/* Compare all signatures in the bucket */
 		*sec_hash_matches = _mm_movemask_epi8(_mm_cmpeq_epi16(
 				_mm_load_si128(
 				(__m128i const *)sec_bkt->sig_current),
 				_mm_set1_epi16(sig)));
+		/* Extract the even-index bits only */
+		*sec_hash_matches &= 0x5555;
 		break;
 #elif defined(__ARM_NEON)
 	case RTE_HASH_COMPARE_NEON: {
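
Illustration (not part of the patch): the mask difference described in
the commit message can be reproduced with a small portable sketch. It
emulates the SSE2 byte mask in plain C instead of using intrinsics, so
it builds on any architecture. The names sse2_style_mask,
scalar_style_mask, first_hit_index, check_masks_agree and
BUCKET_ENTRIES are hypothetical and do not exist in
lib/hash/rte_cuckoo_hash.c. The eight-entry bucket size and the
"trailing zeros, then halve" index derivation are assumptions taken
from the description above, not a copy of the library code.

```c
#include <assert.h>
#include <stdint.h>

#define BUCKET_ENTRIES 8 /* assumed: eight 16-bit signatures per bucket */

/*
 * Emulates _mm_movemask_epi8(_mm_cmpeq_epi16(bucket, set1(sig))):
 * an equal 16-bit lane becomes 0xFFFF, so both of its bytes
 * contribute a set bit (bits 2*i and 2*i + 1).
 */
static uint32_t
sse2_style_mask(const uint16_t sigs[BUCKET_ENTRIES], uint16_t sig)
{
	uint32_t mask = 0;
	unsigned int i;

	for (i = 0; i < BUCKET_ENTRIES; i++)
		if (sigs[i] == sig)
			mask |= 3u << (2 * i);
	return mask;
}

/* Scalar-style mask: one bit per entry, at the even position 2*i. */
static uint32_t
scalar_style_mask(const uint16_t sigs[BUCKET_ENTRIES], uint16_t sig)
{
	uint32_t mask = 0;
	unsigned int i;

	for (i = 0; i < BUCKET_ENTRIES; i++)
		mask |= (uint32_t)(sigs[i] == sig) << (2 * i);
	return mask;
}

/*
 * Entry index of the first hit, derived from the number of trailing
 * zeros in the mask. Returns -1 when the mask is empty.
 */
static int
first_hit_index(uint32_t mask)
{
	int bit = 0;

	if (mask == 0)
		return -1;
	while ((mask & 1u) == 0) {
		mask >>= 1;
		bit++;
	}
	return bit >> 1;
}

/* Checks that masking with 0x5555 makes the two styles identical. */
static void
check_masks_agree(const uint16_t sigs[BUCKET_ENTRIES], uint16_t sig)
{
	uint32_t raw = sse2_style_mask(sigs, sig);
	uint32_t ref = scalar_style_mask(sigs, sig);

	/* The unmasked value already yields the same first hit ... */
	assert(first_hit_index(raw) == first_hit_index(ref));
	/* ... and keeping only the even bits makes the masks equal. */
	assert((raw & 0x5555) == ref);
}
```

Calling check_masks_agree() on a bucket whose entries all equal sig
exercises the 0xFFFF versus 0x5555 case from the commit message. The
first assertion reflects why the unmasked value went unnoticed: the
extra odd bits never change the position of the lowest set bit.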