[v12,0/7] hash: add SVE support for bulk key lookup

Message ID 20240708121411.885996-1-yoan.picchi@arm.com (mailing list archive)

Message

Yoan Picchi July 8, 2024, 12:14 p.m. UTC
This patchset adds SVE support for the signature comparison in the cuckoo
hash lookup and improves the existing NEON implementation. These
optimizations required changes to the data format and to the signature of
the relevant functions, so that they support dense hitmasks (no padding)
and interleave the primary and secondary hitmasks instead of keeping each
in its own array.
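
As an illustration only, here is a minimal scalar sketch of what a dense,
combined hitmask can look like. The names (BUCKET_ENTRIES, dense_hitmask,
prim_hits, sec_hits) are hypothetical and are not the functions in the
series. The bit layout is an assumption: one bit per bucket entry, primary
hits in the low byte and secondary hits in the high byte of a single
16-bit word per key.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for RTE_HASH_BUCKET_ENTRIES. */
#define BUCKET_ENTRIES 8

/* One 16-bit word must hold both the primary and the secondary mask. */
static_assert(2 * BUCKET_ENTRIES <= 16, "hitmask does not fit in uint16_t");

/*
 * Compare one signature against both buckets and return a dense hitmask,
 * one bit per entry (no padding bits between entries).
 * Assumed layout: bits 0..7 are primary-bucket hits,
 * bits 8..15 are secondary-bucket hits.
 */
static inline uint16_t
dense_hitmask(const uint16_t *prim_sigs, const uint16_t *sec_sigs,
		uint16_t sig)
{
	uint16_t hitmask = 0;

	for (unsigned int i = 0; i < BUCKET_ENTRIES; i++) {
		hitmask |= (uint16_t)((sig == prim_sigs[i]) << i);
		hitmask |= (uint16_t)((sig == sec_sigs[i]) << (i + BUCKET_ENTRIES));
	}
	return hitmask;
}

/* Extract the primary half of the combined hitmask. */
static inline uint8_t
prim_hits(uint16_t hitmask)
{
	return (uint8_t)(hitmask & 0xff);
}

/* Extract the secondary half of the combined hitmask. */
static inline uint8_t
sec_hits(uint16_t hitmask)
{
	return (uint8_t)(hitmask >> BUCKET_ENTRIES);
}
```

With this layout a bulk lookup needs a single hitmask array indexed by key,
and the position of each set bit is directly the entry index within the
bucket.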

Benchmarking the cuckoo hash perf test, I observed this effect on speed:
  There are no significant changes on Intel (ran on Sapphire Rapids)
  NEON is up to 7-10% faster (ran on Ampere Altra)
  128b SVE is about 3-5% slower than the optimized NEON (ran on a Graviton
    3 cloud instance)
  256b SVE is about 0-3% slower than the optimized NEON (ran on a Graviton
    3 cloud instance)

V2->V3:
  Remove a redundant if in the test
  Change a couple int to uint16_t in compare_signatures_dense
  Several coding-style fixes

V3->V4:
  Rebase

V4->V5:
  Commit message

V5->V6:
  Move the arch-specific code into new arch-specific files
  Isolate the data structure refactor from adding SVE

V6->V7:
  Commit message
  Moved RTE_HASH_COMPARE_SVE to the last commit of the chain

V7->V8:
  Commit message
  Typos and missing spaces

V8->V9:
  Use __rte_unused instead of (void)
  Fix an indentation mistake

V9->V10:
  Fix more formatting and indentation
  Move the new compare signature file directly in hash instead of being
    in a new subdir
  Re-order includes
  Remove duplicated static check
  Move rte_hash_sig_compare_function's definition into a private header

V10->V11:
  Split the "pack the hitmask" commit into four commits:
    Move the compare function enum out of the ABI
    Move the compare function implementations into arch-specific files
    Add a missing check on RTE_HASH_BUCKET_ENTRIES in case we change it
      in the future
    Implement the dense hitmask
  Add missing header guards
  Move compare function enum into cuckoo_hash.c instead of its own header.

V11->V12:
  Change the name of the compare function file (remove the _pvt suffix)

Yoan Picchi (7):
  hash: make compare signature function enum private
  hash: split compare signature into arch-specific files
  hash: add a check on hash entry max size
  hash: pack the hitmask for hash in bulk lookup
  hash: optimize compare signature for NEON
  test/hash: check bulk lookup of keys after collision
  hash: add SVE support for bulk key lookup

 .mailmap                              |   2 +
 app/test/test_hash.c                  |  99 +++++++++---
 lib/hash/compare_signatures_arm.h     | 121 +++++++++++++++
 lib/hash/compare_signatures_generic.h |  40 +++++
 lib/hash/compare_signatures_x86.h     |  55 +++++++
 lib/hash/rte_cuckoo_hash.c            | 207 ++++++++++++++------------
 lib/hash/rte_cuckoo_hash.h            |  10 +-
 7 files changed, 410 insertions(+), 124 deletions(-)
 create mode 100644 lib/hash/compare_signatures_arm.h
 create mode 100644 lib/hash/compare_signatures_generic.h
 create mode 100644 lib/hash/compare_signatures_x86.h
  

Comments

David Marchand July 9, 2024, 4:48 a.m. UTC | #1
I added release notes updates, reformatted commit logs, fixed header guards
and removed some _pvt leftovers.
Series applied, thanks.