From patchwork Sat Oct 7 07:36:34 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jieqiang Wang
X-Patchwork-Id: 132383
X-Patchwork-Delegate: david.marchand@redhat.com
X-Original-To: patchwork@inbox.dpdk.org
Delivered-To: patchwork@inbox.dpdk.org
Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124])
 by inbox.dpdk.org (Postfix) with ESMTP id 9887E426DC;
 Sat, 7 Oct 2023 09:36:52 +0200 (CEST)
Received: from mails.dpdk.org (localhost [127.0.0.1])
 by mails.dpdk.org (Postfix) with ESMTP id C3C16402F1;
 Sat, 7 Oct 2023 09:36:51 +0200 (CEST)
Received: from foss.arm.com (foss.arm.com [217.140.110.172])
 by mails.dpdk.org (Postfix) with ESMTP id 96F7340293;
 Sat, 7 Oct 2023 09:36:49 +0200 (CEST)
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14])
 by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 94E3EC15;
 Sat, 7 Oct 2023 00:37:28 -0700 (PDT)
Received: from net-x86-dell-8268.shanghai.arm.com
 (net-x86-dell-8268.shanghai.arm.com [10.169.210.116])
 by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 2E9183F762;
 Sat, 7 Oct 2023 00:36:44 -0700 (PDT)
From: Jieqiang Wang
To: Yipeng Wang, Sameh Gobriel, Bruce Richardson, Vladimir Medvedkin,
 Honnappa Nagarahalli, Dharmik Thakkar
Cc: dev@dpdk.org, nd@arm.com, Jieqiang Wang, stable@dpdk.org,
 Feifei Wang, Ruifeng Wang
Subject: [PATCH v3] hash: fix SSE comparison
Date: Sat, 7 Oct 2023 15:36:34 +0800
Message-Id: <20231007073634.3458294-1-jieqiang.wang@arm.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20230906023100.3618303-1-jieqiang.wang@arm.com>
References: <20230906023100.3618303-1-jieqiang.wang@arm.com>
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: DPDK patches and discussions
Errors-To: dev-bounces@dpdk.org

_mm_cmpeq_epi16 returns 0xFFFF if the corresponding 16-bit elements
are equal. The original SSE2 implementation of compare_signatures uses
_mm_movemask_epi8 to build the match mask from the MSB of every 8-bit
element, whereas only the MSB of the lower 8 bits of each 16-bit element
is relevant. For example, if all signatures match, the SSE2 path returns
0xFFFF while the NEON and default scalar paths return 0x5555.
This bug has no negative effect today, because the caller only examines
the trailing zeros of each match mask, but the fix makes the SSE2 path
consistent with the NEON and default scalar paths.

Fixes: c7d93df552c2 ("hash: use partial-key hashing")
Cc: stable@dpdk.org

v2:
1. Utilize scalar mask instead of vector mask to save extra loads (Bruce)

v3:
1. Fix coding style warnings

Signed-off-by: Feifei Wang
Signed-off-by: Jieqiang Wang
Reviewed-by: Ruifeng Wang
Acked-by: Bruce Richardson
---
 lib/hash/rte_cuckoo_hash.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/lib/hash/rte_cuckoo_hash.c b/lib/hash/rte_cuckoo_hash.c
index d92a903bb3..19b23f2a97 100644
--- a/lib/hash/rte_cuckoo_hash.c
+++ b/lib/hash/rte_cuckoo_hash.c
@@ -1868,11 +1868,15 @@ compare_signatures(uint32_t *prim_hash_matches, uint32_t *sec_hash_matches,
 				_mm_load_si128(
 					(__m128i const *)prim_bkt->sig_current),
 				_mm_set1_epi16(sig)));
+		/* Extract the even-index bits only */
+		*prim_hash_matches &= 0x5555;
 		/* Compare all signatures in the bucket */
 		*sec_hash_matches = _mm_movemask_epi8(_mm_cmpeq_epi16(
 				_mm_load_si128(
 					(__m128i const *)sec_bkt->sig_current),
 				_mm_set1_epi16(sig)));
+		/* Extract the even-index bits only */
+		*sec_hash_matches &= 0x5555;
 		break;
 #elif defined(__ARM_NEON)
 	case RTE_HASH_COMPARE_NEON: {
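
The mask difference described in the commit message can be reproduced
outside DPDK with the minimal sketch below. It is an illustration under
stated assumptions, not the library code: the function names, the
BUCKET_ENTRIES constant (eight 16-bit signatures, i.e. one 128-bit
register), the unaligned load and the non-SSE2 fallback are all
hypothetical. The first_match_index helper assumes the caller turns the
trailing-zero count into an entry index by halving it, which would
explain why the extra odd bits were harmless.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Assumption: eight 16-bit signatures per bucket (one 128-bit register). */
#define BUCKET_ENTRIES 8

/* Scalar reference: one bit per matching entry, at even position 2 * i. */
static uint32_t
match_mask_scalar(const uint16_t sigs[BUCKET_ENTRIES], uint16_t sig)
{
	uint32_t mask = 0;
	unsigned int i;

	for (i = 0; i < BUCKET_ENTRIES; i++)
		mask |= (uint32_t)(sigs[i] == sig) << (i << 1);
	return mask;
}

/* Unmasked byte-wise mask: two bits (2 * i and 2 * i + 1) per match. */
static uint32_t
match_mask_bytes(const uint16_t sigs[BUCKET_ENTRIES], uint16_t sig)
{
#if defined(__SSE2__)
	/* Unaligned load so the sketch does not depend on array alignment. */
	__m128i v = _mm_loadu_si128((const __m128i *)sigs);

	return (uint32_t)_mm_movemask_epi8(
			_mm_cmpeq_epi16(v, _mm_set1_epi16((short)sig)));
#else
	/* Hypothetical fallback emulating the byte-wise movemask result. */
	uint32_t mask = 0;
	unsigned int i;

	for (i = 0; i < BUCKET_ENTRIES; i++)
		if (sigs[i] == sig)
			mask |= 3u << (i << 1);
	return mask;
#endif
}

/* Fixed variant: keep only the even-index bits, as the patch does. */
static uint32_t
match_mask_fixed(const uint16_t sigs[BUCKET_ENTRIES], uint16_t sig)
{
	return match_mask_bytes(sigs, sig) & 0x5555;
}

/*
 * Assumed caller behaviour: entry index = trailing zeros / 2.
 * Returns -1 when the mask is empty.
 */
static int
first_match_index(uint32_t mask)
{
	int bit = 0;

	if (mask == 0)
		return -1;
	while ((mask & 1u) == 0) {
		mask >>= 1;
		bit++;
	}
	return bit >> 1;
}

/* Sanity check: masked and scalar paths agree for a given input. */
static void
check_paths_agree(const uint16_t sigs[BUCKET_ENTRIES], uint16_t sig)
{
	assert(match_mask_fixed(sigs, sig) == match_mask_scalar(sigs, sig));
}
```

With eight identical signatures, match_mask_bytes yields 0xFFFF while
match_mask_fixed and match_mask_scalar both yield 0x5555, and
first_match_index returns 0 for either mask.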