Message ID | 20240430162743.1525484-1-yoan.picchi@arm.com (mailing list archive) |
---|---|
Headers |
Return-Path: <dev-bounces@dpdk.org> X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id B3D1343F00; Tue, 30 Apr 2024 18:27:55 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 3D7AA40262; Tue, 30 Apr 2024 18:27:55 +0200 (CEST) Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by mails.dpdk.org (Postfix) with ESMTP id 999C8400EF for <dev@dpdk.org>; Tue, 30 Apr 2024 18:27:54 +0200 (CEST) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 407C82F4; Tue, 30 Apr 2024 09:28:20 -0700 (PDT) Received: from octeon10-1.usa.Arm.com (unknown [10.118.91.161]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id AC5BB3F73F; Tue, 30 Apr 2024 09:27:53 -0700 (PDT) From: Yoan Picchi <yoan.picchi@arm.com> To: Cc: dev@dpdk.org, nd@arm.com, Yoan Picchi <yoan.picchi@arm.com> Subject: [PATCH v9 0/4] hash: add SVE support for bulk key lookup Date: Tue, 30 Apr 2024 16:27:39 +0000 Message-Id: <20240430162743.1525484-1-yoan.picchi@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20231020165159.1649282-1-yoan.picchi@arm.com> References: <20231020165159.1649282-1-yoan.picchi@arm.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions <dev.dpdk.org> List-Unsubscribe: <https://mails.dpdk.org/options/dev>, <mailto:dev-request@dpdk.org?subject=unsubscribe> List-Archive: <http://mails.dpdk.org/archives/dev/> List-Post: <mailto:dev@dpdk.org> List-Help: <mailto:dev-request@dpdk.org?subject=help> List-Subscribe: <https://mails.dpdk.org/listinfo/dev>, <mailto:dev-request@dpdk.org?subject=subscribe> Errors-To: dev-bounces@dpdk.org |
Series | hash: add SVE support for bulk key lookup |
Message
Yoan Picchi
April 30, 2024, 4:27 p.m. UTC
This patchset adds SVE support for the signature comparison in the cuckoo
hash lookup and improves the existing NEON implementation. These
optimizations required changes to the data format and signature of the
relevant functions to support dense hitmasks (no padding) and having the
primary and secondary hitmasks interleaved instead of being in their own
array each.

Benchmarking the cuckoo hash perf test, I observed this effect on speed:
  There are no significant changes on Intel (ran on Sapphire Rapids)
  NEON is up to 7-10% faster (ran on Ampere Altra)
  128b SVE is about 3-5% slower than the optimized NEON (ran on a Graviton 3
    cloud instance)
  256b SVE is about 0-3% slower than the optimized NEON (ran on a Graviton 3
    cloud instance)

V2->V3:
  Remove a redundant if in the test
  Change a couple of int to uint16_t in compare_signatures_dense
  Several coding-style fixes

V3->V4:
  Rebase

V4->V5:
  Commit message

V5->V6:
  Move the arch-specific code into new arch-specific files
  Isolate the data structure refactor from adding SVE

V6->V7:
  Commit message
  Moved RTE_HASH_COMPARE_SVE to the last commit of the chain

V7->V8:
  Commit message
  Typos and missing spaces

V8->V9:
  Use __rte_unused instead of (void)
  Fix an indentation mistake

Yoan Picchi (4):
  hash: pack the hitmask for hash in bulk lookup
  hash: optimize compare signature for NEON
  test/hash: check bulk lookup of keys after collision
  hash: add SVE support for bulk key lookup

 .mailmap                                  |   2 +
 app/test/test_hash.c                      |  99 ++++++++---
 lib/hash/arch/arm/compare_signatures.h    | 117 +++++++++++++
 lib/hash/arch/common/compare_signatures.h |  37 ++++
 lib/hash/arch/x86/compare_signatures.h    |  53 ++++++
 lib/hash/rte_cuckoo_hash.c                | 199 ++++++++++++----------
 lib/hash/rte_cuckoo_hash.h                |   1 +
 7 files changed, 393 insertions(+), 115 deletions(-)
 create mode 100644 lib/hash/arch/arm/compare_signatures.h
 create mode 100644 lib/hash/arch/common/compare_signatures.h
 create mode 100644 lib/hash/arch/x86/compare_signatures.h
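To make the "dense, interleaved hitmask" wording above concrete, here is a minimal scalar sketch. It is not the code from the series: the function names, the 8-entry bucket width, and the choice of one uint16_t per key (primary hits in the low byte, secondary hits in the high byte) are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define BUCKET_ENTRIES 8 /* assumed bucket width for this sketch */

/*
 * Hypothetical scalar compare producing a dense, combined hitmask:
 * bit i is set when primary entry i matches, bit (i + 8) when
 * secondary entry i matches. There are no padding bits.
 */
static uint16_t
compare_signatures_dense_sketch(const uint16_t *prim_sigs,
		const uint16_t *sec_sigs, uint16_t sig)
{
	uint16_t hitmask = 0;
	unsigned int i;

	for (i = 0; i < BUCKET_ENTRIES; i++) {
		hitmask |= (uint16_t)((prim_sigs[i] == sig) << i);
		hitmask |= (uint16_t)((sec_sigs[i] == sig) << (i + BUCKET_ENTRIES));
	}
	return hitmask;
}

/* Split the combined mask back into its two halves. */
static uint8_t
primary_hits_sketch(uint16_t hitmask)
{
	return (uint8_t)(hitmask & 0xff);
}

static uint8_t
secondary_hits_sketch(uint16_t hitmask)
{
	return (uint8_t)(hitmask >> BUCKET_ENTRIES);
}

/* With a dense mask the bit position is the bucket entry index itself. */
static unsigned int
first_hit_index_sketch(uint8_t hits)
{
	unsigned int idx = 0;

	assert(hits != 0); /* caller must check there is at least one hit */
	while (((hits >> idx) & 1u) == 0)
		idx++;
	return idx;
}
```

In a bulk lookup, one such uint16_t per key would sit in a single buffer, so the primary and secondary masks for a key are adjacent rather than living in two separate arrays.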
Comments
On Tue, Apr 30, 2024 at 6:28 PM Yoan Picchi <yoan.picchi@arm.com> wrote:
> This patchset adds SVE support for the signature comparison in the cuckoo
> hash lookup and improves the existing NEON implementation.
> [...]

Can any of you have a look at this series?
Thanks.
Hi David,

> > This patchset adds SVE support for the signature comparison in the cuckoo
> > hash lookup and improves the existing NEON implementation.
> > [...]
>
> Can any of you have a look at this series?
> Thanks.

It looks ok to me.
The only un-processed comment I have about it, from v7:

Ok, but before that, a 'generic' one (non-x86 and non-ARM) used the 'sparse' one, correct? If so, then it probably needs to be outlined a bit more in the patch comments and maybe even the release notes. At least that would be my expectation; probably the hash lib maintainers need to say what is the best way here. The code refactoring itself - LGTM.
https://inbox.dpdk.org/dev/3cfce8e3b128473096e1d43683fbe6f0@huawei.com/
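For readers unfamiliar with the 'sparse' versus 'dense' distinction raised here, the following is a rough sketch of the difference. It assumes that 'sparse' means each bucket entry is reported at bit 2*i of the mask (odd bits being padding), while 'dense' reports entry i at bit i. The function names and the 8-entry bucket width are hypothetical and not taken from the series.

```c
#include <assert.h>
#include <stdint.h>

#define BUCKET_ENTRIES 8 /* assumed bucket width for this sketch */

/* Sparse layout (assumed): entry i is reported at bit 2*i, odd bits unused. */
static uint32_t
compare_sparse_sketch(const uint16_t *bucket_sigs, uint16_t sig)
{
	uint32_t mask = 0;
	unsigned int i;

	for (i = 0; i < BUCKET_ENTRIES; i++)
		mask |= (uint32_t)(bucket_sigs[i] == sig) << (i << 1);
	return mask;
}

/* Dense layout: entry i is reported at bit i, no padding bits. */
static uint32_t
compare_dense_sketch(const uint16_t *bucket_sigs, uint16_t sig)
{
	uint32_t mask = 0;
	unsigned int i;

	for (i = 0; i < BUCKET_ENTRIES; i++)
		mask |= (uint32_t)(bucket_sigs[i] == sig) << i;
	return mask;
}

/* Squeeze a sparse mask into the dense layout by dropping the padding bits. */
static uint32_t
sparse_to_dense_sketch(uint32_t sparse)
{
	uint32_t dense = 0;
	unsigned int i;

	/* No bits are expected above the last entry's slot. */
	assert((sparse >> (2 * BUCKET_ENTRIES)) == 0);
	for (i = 0; i < BUCKET_ENTRIES; i++)
		dense |= ((sparse >> (2 * i)) & 1u) << i;
	return dense;
}
```

A consumer of the sparse mask has to halve each bit position to get the entry index, whereas a consumer of the dense mask uses the bit position directly. That is why a code path switching from one layout to the other is a behavior change worth calling out.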
30/04/2024 18:27, Yoan Picchi:
> This patchset adds SVE support for the signature comparison in the cuckoo
> hash lookup and improves the existing NEON implementation.
> [...]

Waiting for a new version after comments sent in June please.

Note: we didn't have a review from the lib maintainers.