From patchwork Wed Aug 10 07:45:17 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Leyi Rong X-Patchwork-Id: 114803 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id DE1E9A0543; Wed, 10 Aug 2022 09:45:30 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id D9E7E415D7; Wed, 10 Aug 2022 09:45:29 +0200 (CEST) Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by mails.dpdk.org (Postfix) with ESMTP id 7C5D1415D7 for ; Wed, 10 Aug 2022 09:45:27 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1660117527; x=1691653527; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=w6ftPKXPucBvxcNHXrqlBDbR6qF9ueXDMGUsymgYcEc=; b=XwZJHrX90mBn+olOcmdu2iBw7bAunZol2loZToMKPjXFZrTwzdjT3BfD /BYgpyFEqNxTtwqksUW+H5PdnUxYA1d2E+466pAh88EFNMeK0sBcI/SGd vi69pxu35AognXZ2JV7cetP7W3kXXmvAangcGQO7CDFMdhthknvaTUkQo dEetZ4BbVbDVYk7WVXJs+63ctoBJdbQuiQDtfMPiS8V1aSDl56Sal6IQ0 728GUL9xiCpbhXZH+LsQn0OWwN1aXZ5MwP/iN03UkGcvJIRk6WQaSMSf/ W6onD2pNww6swAu+zPGSVj7i2t6hvXJC2qsfsl3dgld0tEeqi7c9a3NgK w==; X-IronPort-AV: E=McAfee;i="6400,9594,10434"; a="271407705" X-IronPort-AV: E=Sophos;i="5.93,226,1654585200"; d="scan'208";a="271407705" Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 10 Aug 2022 00:45:27 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.93,226,1654585200"; d="scan'208";a="664785914" Received: from dpdk-lrong-icx-01.sh.intel.com ([10.67.119.18]) by fmsmga008.fm.intel.com with ESMTP; 10 Aug 2022 00:45:24 -0700 From: Leyi Rong To: yipeng1.wang@intel.com, zaoxingliu@gmail.com, sameh.gobriel@intel.com Cc: dev@dpdk.org, Leyi Rong Subject: [PATCH 1/2] member: implement NitroSketch mode Date: Wed, 10 Aug 2022 15:45:17 +0800 Message-Id: <20220810074518.1695013-2-leyi.rong@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20220810074518.1695013-1-leyi.rong@intel.com> References: <20220810074518.1695013-1-leyi.rong@intel.com> MIME-Version: 1.0 X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sketching algorithm provide high-fidelity approximate measurements and appears as a promising alternative to traditional approches such as packet sampling. NitroSketch [1] is a software sketching framework that optimizes performance, provides accuracy guarantees, and supports a variety of sketches. This commit adds a new data structure called sketch into membership library. This new data structure is an efficient way to profile the traffic for heavy hitters. Also use min-heap structure to maintain the top-k flow keys. [1] Zaoxing Liu, Ran Ben-Basat, Gil Einziger, Yaron Kassner, Vladimir Braverman, Roy Friedman, Vyas Sekar, "NitroSketch: Robust and General Sketch-based Monitoring in Software Switches", in ACM SIGCOMM 2019. Signed-off-by: Alan Liu Signed-off-by: Yipeng Wang Signed-off-by: Leyi Rong --- lib/member/meson.build | 37 +- lib/member/rte_member.c | 75 ++++ lib/member/rte_member.h | 149 ++++++- lib/member/rte_member_heap.h | 420 +++++++++++++++++++ lib/member/rte_member_sketch.c | 583 ++++++++++++++++++++++++++ lib/member/rte_member_sketch.h | 96 +++++ lib/member/rte_member_sketch_avx512.c | 69 +++ lib/member/rte_member_sketch_avx512.h | 36 ++ lib/member/rte_xxh64_avx512.h | 117 ++++++ 9 files changed, 1578 insertions(+), 4 deletions(-) create mode 100644 lib/member/rte_member_heap.h create mode 100644 lib/member/rte_member_sketch.c create mode 100644 lib/member/rte_member_sketch.h create mode 100644 lib/member/rte_member_sketch_avx512.c create mode 100644 lib/member/rte_member_sketch_avx512.h create mode 100644 lib/member/rte_xxh64_avx512.h diff --git a/lib/member/meson.build b/lib/member/meson.build index e06fddc240..ab9d48544a 100644 --- a/lib/member/meson.build +++ b/lib/member/meson.build @@ -7,6 +7,41 @@ if is_windows subdir_done() endif -sources = files('rte_member.c', 'rte_member_ht.c', 'rte_member_vbf.c') +sources = files('rte_member.c', 'rte_member_ht.c', 'rte_member_vbf.c', 'rte_member_sketch.c') headers = files('rte_member.h') deps += ['hash'] +includes += include_directories('../hash', '../ring') + +# compile AVX512 version if: +# we are building 64-bit binary AND binutils can generate proper code +if dpdk_conf.has('RTE_ARCH_X86_64') and binutils_ok + # compile AVX512 version if either: + # a. we have AVX512 supported in minimum instruction set + # baseline + # b. it's not minimum instruction set, but supported by + # compiler + # + # in former case, just add avx512 C file to files list + # in latter case, compile c file to static lib, using correct + # compiler flags, and then have the .o file from static lib + # linked into main lib. + sketch_avx512_cpu_support = ( + cc.get_define('__AVX512F__', args: machine_args) != '' + ) + + if sketch_avx512_cpu_support == true + cflags += ['-DCC_AVX512_SUPPORT'] + if cc.has_argument('-mavx512f') + cflags += ['-mavx', '-mavx2', '-mavx512f', '-mavx512ifma', '-march=icelake-server'] + endif + sources += files('rte_member_sketch_avx512.c') + elif cc.has_argument('-mavx512f') + cflags += '-DCC_AVX512_SUPPORT' + sketch_avx512_tmp = static_library('sketch_avx512_tmp', + 'rte_member_sketch_avx512.c', + include_directories: includes, + dependencies: static_rte_eal, + c_args: cflags + ['-mavx', '-mavx2', '-mavx512f', '-mavx512ifma', '-march=icelake-server']) + objs += sketch_avx512_tmp.extract_objects('rte_member_sketch_avx512.c') + endif +endif diff --git a/lib/member/rte_member.c b/lib/member/rte_member.c index 7e1632e6b5..8f859f7fbd 100644 --- a/lib/member/rte_member.c +++ b/lib/member/rte_member.c @@ -9,10 +9,12 @@ #include #include #include +#include #include "rte_member.h" #include "rte_member_ht.h" #include "rte_member_vbf.h" +#include "rte_member_sketch.h" TAILQ_HEAD(rte_member_list, rte_tailq_entry); static struct rte_tailq_elem rte_member_tailq = { @@ -72,6 +74,9 @@ rte_member_free(struct rte_member_setsum *setsum) case RTE_MEMBER_TYPE_VBF: rte_member_free_vbf(setsum); break; + case RTE_MEMBER_TYPE_SKETCH: + rte_member_free_sketch(setsum); + break; default: break; } @@ -86,6 +91,8 @@ rte_member_create(const struct rte_member_parameters *params) struct rte_member_list *member_list; struct rte_member_setsum *setsum; int ret; + char ring_name[RTE_RING_NAMESIZE]; + struct rte_ring *sketch_key_ring = NULL; if (params == NULL) { rte_errno = EINVAL; @@ -100,6 +107,16 @@ rte_member_create(const struct rte_member_parameters *params) return NULL; } + if (params->type == RTE_MEMBER_TYPE_SKETCH) { + snprintf(ring_name, sizeof(ring_name), "SK_%s", params->name); + sketch_key_ring = rte_ring_create_elem(ring_name, sizeof(uint32_t), + rte_align32pow2(params->top_k), params->socket_id, 0); + if (sketch_key_ring == NULL) { + RTE_MEMBER_LOG(ERR, "Sketch Ring Memory allocation failed\n"); + return NULL; + } + } + member_list = RTE_TAILQ_CAST(rte_member_tailq.head, rte_member_list); rte_mcfg_tailq_write_lock(); @@ -145,6 +162,9 @@ rte_member_create(const struct rte_member_parameters *params) case RTE_MEMBER_TYPE_VBF: ret = rte_member_create_vbf(setsum, params); break; + case RTE_MEMBER_TYPE_SKETCH: + ret = rte_member_create_sketch(setsum, params, sketch_key_ring); + break; default: goto error_unlock_exit; } @@ -162,6 +182,7 @@ rte_member_create(const struct rte_member_parameters *params) error_unlock_exit: rte_free(te); rte_free(setsum); + rte_ring_free(sketch_key_ring); rte_mcfg_tailq_write_unlock(); return NULL; } @@ -178,6 +199,23 @@ rte_member_add(const struct rte_member_setsum *setsum, const void *key, return rte_member_add_ht(setsum, key, set_id); case RTE_MEMBER_TYPE_VBF: return rte_member_add_vbf(setsum, key, set_id); + case RTE_MEMBER_TYPE_SKETCH: + return rte_member_add_sketch(setsum, key, set_id); + default: + return -EINVAL; + } +} + +int +rte_member_add_byte_count(const struct rte_member_setsum *setsum, + const void *key, uint32_t byte_count) +{ + if (setsum == NULL || key == NULL || byte_count == 0) + return -EINVAL; + + switch (setsum->type) { + case RTE_MEMBER_TYPE_SKETCH: + return rte_member_add_sketch_byte_count(setsum, key, byte_count); default: return -EINVAL; } @@ -195,6 +233,8 @@ rte_member_lookup(const struct rte_member_setsum *setsum, const void *key, return rte_member_lookup_ht(setsum, key, set_id); case RTE_MEMBER_TYPE_VBF: return rte_member_lookup_vbf(setsum, key, set_id); + case RTE_MEMBER_TYPE_SKETCH: + return rte_member_lookup_sketch(setsum, key, set_id); default: return -EINVAL; } @@ -261,6 +301,36 @@ rte_member_lookup_multi_bulk(const struct rte_member_setsum *setsum, } } +int +rte_member_query_count(const struct rte_member_setsum *setsum, + const void *key, uint64_t *output) +{ + if (setsum == NULL || key == NULL || output == NULL) + return -EINVAL; + + switch (setsum->type) { + case RTE_MEMBER_TYPE_SKETCH: + return rte_member_query_sketch(setsum, key, output); + default: + return -EINVAL; + } +} + +int +rte_member_report_heavyhitter(const struct rte_member_setsum *setsum, + void **key, uint64_t *count) +{ + if (setsum == NULL || key == NULL || count == NULL) + return -EINVAL; + + switch (setsum->type) { + case RTE_MEMBER_TYPE_SKETCH: + return rte_member_report_heavyhitter_sketch(setsum, key, count); + default: + return -EINVAL; + } +} + int rte_member_delete(const struct rte_member_setsum *setsum, const void *key, member_set_t set_id) @@ -272,6 +342,8 @@ rte_member_delete(const struct rte_member_setsum *setsum, const void *key, case RTE_MEMBER_TYPE_HT: return rte_member_delete_ht(setsum, key, set_id); /* current vBF implementation does not support delete function */ + case RTE_MEMBER_TYPE_SKETCH: + return rte_member_delete_sketch(setsum, key); case RTE_MEMBER_TYPE_VBF: default: return -EINVAL; @@ -290,6 +362,9 @@ rte_member_reset(const struct rte_member_setsum *setsum) case RTE_MEMBER_TYPE_VBF: rte_member_reset_vbf(setsum); return; + case RTE_MEMBER_TYPE_SKETCH: + rte_member_reset_sketch(setsum); + return; default: return; } diff --git a/lib/member/rte_member.h b/lib/member/rte_member.h index 2611015771..6fd591611c 100644 --- a/lib/member/rte_member.h +++ b/lib/member/rte_member.h @@ -39,6 +39,18 @@ * | | | not overwrite | | * | | | existing key. | | * +----------+---------------------+----------------+-------------------------+ + * +==========+=============================+ + * | type | sketch | + * +==========+=============================+ + * |structure | counting bloom filter array | + * +----------+-----------------------------+ + * |set id | 1: heavy set, 0: light set | + * | | | + * +----------+-----------------------------+ + * |usages & | count size of a flow, | + * |properties| used for heavy hitter | + * | | detection. | + * +----------+-----------------------------+ * --> */ @@ -50,6 +62,7 @@ extern "C" { #endif #include +#include #include @@ -66,6 +79,19 @@ typedef uint16_t member_set_t; /** Maximum number of characters in setsum name. */ #define RTE_MEMBER_NAMESIZE 32 +/** + * As packets skipped in the sampling-based algorithm, the accounting + * results accuracy is not guaranteed in the start stage. There should + * be a "convergence time" to achieve the accuracy after receiving enough + * packets. + * For sketch, use the flag if prefer always bounded mode, which only + * starts sampling after receiving enough packets to keep the results + * accuracy always bounded. + */ +#define RTE_MEMBER_SKETCH_ALWAYS_BOUNDED 0x01 +/** For sketch, use the flag if to count packet size instead of packet count */ +#define RTE_MEMBER_SKETCH_COUNT_BYTE 0x02 + /** @internal Hash function used by membership library. */ #if defined(RTE_ARCH_X86) || defined(__ARM_FEATURE_CRC32) #include @@ -104,6 +130,7 @@ struct rte_member_parameters; enum rte_member_setsum_type { RTE_MEMBER_TYPE_HT = 0, /**< Hash table based set summary. */ RTE_MEMBER_TYPE_VBF, /**< Vector of bloom filters. */ + RTE_MEMBER_TYPE_SKETCH, RTE_MEMBER_NUM_TYPE }; @@ -114,6 +141,19 @@ enum rte_member_sig_compare_function { RTE_MEMBER_COMPARE_NUM }; +/* sketch update function with different implementations. */ +typedef void (*sketch_update_fn_t)(const struct rte_member_setsum *ss, + const void *key, + uint32_t count); + +/* sketch lookup function with different implementations. */ +typedef uint64_t (*sketch_lookup_fn_t)(const struct rte_member_setsum *ss, + const void *key); + +/* sketch delete function with different implementations. */ +typedef void (*sketch_delete_fn_t)(const struct rte_member_setsum *ss, + const void *key); + /** @internal setsummary structure. */ struct rte_member_setsum { enum rte_member_setsum_type type; /* Type of the set summary. */ @@ -134,6 +174,21 @@ struct rte_member_setsum { uint32_t bit_mask; /* Bit mask to get bit location in bf. */ uint32_t num_hashes; /* Number of hash values to index bf. */ + /* Parameters for sketch */ + float error_rate; + float sample_rate; + uint32_t num_col; + uint32_t num_row; + int always_bounded; + double converge_thresh; + uint32_t topk; + uint32_t count_byte; + uint64_t *hash_seeds; + sketch_update_fn_t sketch_update; /* Pointer to the sketch update function */ + sketch_lookup_fn_t sketch_lookup; /* Pointer to the sketch lookup function */ + sketch_delete_fn_t sketch_delete; /* Pointer to the sketch delete function */ + + void *runtime_var; uint32_t mul_shift; /* vbf internal variable used during bit test. */ uint32_t div_shift; /* vbf internal variable used during bit test. */ @@ -143,6 +198,9 @@ struct rte_member_setsum { /* Second cache line should start here. */ uint32_t socket_id; /* NUMA Socket ID for memory. */ char name[RTE_MEMBER_NAMESIZE]; /* Name of this set summary. */ +#ifdef RTE_ARCH_X86 + bool use_avx512; +#endif } __rte_cache_aligned; /** @@ -261,8 +319,33 @@ struct rte_member_parameters { */ uint32_t sec_hash_seed; + /** + * For count(min) sketch data structure, error rate defines the accuracy + * required by the user. Higher accuracy leads to more memory usage, but + * the flow size is estimated more accurately. + */ + float error_rate; + + /** + * Sampling rate means the internal sample rate of the rows of the count + * min sketches. Lower sampling rate can reduce CPU overhead, but the + * data structure will require more time to converge statistically. + */ + float sample_rate; + + /** + * How many top heavy hitter to be reported. The library will internally + * keep the keys of heavy hitters for final report. + */ + uint32_t top_k; + + /** + * Extra flags that may passed in by user + */ + uint32_t extra_flag; + int socket_id; /**< NUMA Socket ID for memory. */ -}; +} __rte_cache_aligned; /** * @warning @@ -418,7 +501,7 @@ rte_member_lookup_multi_bulk(const struct rte_member_setsum *setsum, * RTE_MEMBER_NO_MATCH by default is set as 0. * For HT mode, the set_id has range as [1, 0x7FFF], MSB is reserved. * For vBF mode the set id is limited by the num_set parameter when create - * the set-summary. + * the set-summary. For sketch mode, this id is ignored. * @return * HT (cache mode) and vBF should never fail unless the set_id is not in the * valid range. In such case -EINVAL is returned. @@ -429,12 +512,72 @@ rte_member_lookup_multi_bulk(const struct rte_member_setsum *setsum, * Return 0 for HT (cache mode) if the add does not cause * eviction, return 1 otherwise. Return 0 for non-cache mode if success, * -ENOSPC for full, and 1 if cuckoo eviction happens. - * Always returns 0 for vBF mode. + * Always returns 0 for vBF mode and sketch. */ int rte_member_add(const struct rte_member_setsum *setsum, const void *key, member_set_t set_id); +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice + * + * Add the packet byte size into the sketch. + * + * @param setsum + * Pointer of a set-summary. + * @param key + * Pointer of the key to be added. + * @param byte_count + * Add the byte count of the packet into the sketch. + * @return + * Return -EINVAL for invalid parameters, otherwise return 0. + */ +int +rte_member_add_byte_count(const struct rte_member_setsum *setsum, + const void *key, uint32_t byte_count); + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice + * + * Query packet count for a certain flow-key. + * + * @param setsum + * Pointer of a set-summary. + * @param key + * Pointer of the key to be added. + * @param count + * The output packet count or byte count. + * @return + * Return -EINVAL for invalid parameters. + */ +int +rte_member_query_count(const struct rte_member_setsum *setsum, + const void *key, uint64_t *count); + + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice + * + * Report heavyhitter flow-keys into set-summary (SS). + * + * @param setsum + * Pointer of a set-summary. + * @param key + * Pointer of the output top-k key array. + * @param count + * Pointer of the output packet count or byte count array of the top-k keys. + * @return + * Return -EINVAL for invalid parameters. Return a positive integer indicate + * how many heavy hitters are reported. + */ +int +rte_member_report_heavyhitter(const struct rte_member_setsum *setsum, + void **keys, uint64_t *counts); + + /** * @warning * @b EXPERIMENTAL: this API may change without prior notice diff --git a/lib/member/rte_member_heap.h b/lib/member/rte_member_heap.h new file mode 100644 index 0000000000..1756a1eca5 --- /dev/null +++ b/lib/member/rte_member_heap.h @@ -0,0 +1,420 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2020 Intel Corporation + * Copyright(c) 2020, Alan Liu + */ + +#ifndef _RTE_MEMBER_HEAP_H_ +#define _RTE_MEMBER_HEAP_H_ + +#include +#include "rte_member.h" + +#define LCHILD(x) (2 * x + 1) +#define RCHILD(x) (2 * x + 2) +#define PARENT(x) ((x - 1) / 2) + +#define HASH_BKT_SIZE 16 +#define HASH_HP_MULTI 4 +#define HASH_RESIZE_MULTI 2 + +struct hash_bkt { + uint16_t sig[HASH_BKT_SIZE]; + uint16_t idx[HASH_BKT_SIZE]; +}; + +struct hash { + uint16_t bkt_cnt; + uint16_t num_item; + uint32_t seed; + struct hash_bkt buckets[0]; +}; + +struct node { + void *key; + uint64_t count; +}; + +struct minheap { + uint32_t key_len; + uint32_t size; + uint32_t socket; + struct hash *hashtable; + struct node *elem; +}; + +static int +hash_table_insert(const void *key, int value, int key_len, struct hash *table) +{ + uint32_t hash = MEMBER_HASH_FUNC(key, key_len, table->seed); + uint16_t idx = hash % table->bkt_cnt; + uint16_t sig = hash >> 16; + + for (int i = 0; i < HASH_BKT_SIZE; i++) { + if (table->buckets[idx].idx[i] == 0) { + table->buckets[idx].idx[i] = value; + table->buckets[idx].sig[i] = sig; + table->num_item++; + return 0; + } + } + + return -ENOMEM; +} + +static int +hash_table_update(const void *key, int old_value, int value, int key_len, struct hash *table) +{ + uint32_t hash = MEMBER_HASH_FUNC(key, key_len, table->seed); + uint16_t idx = hash % table->bkt_cnt; + uint16_t sig = hash >> 16; + + for (int i = 0; i < HASH_BKT_SIZE; i++) { + if (table->buckets[idx].sig[i] == sig && table->buckets[idx].idx[i] == old_value) { + table->buckets[idx].idx[i] = value; + return 0; + } + } + + return -1; +} + +static int +hash_table_del(const void *key, uint16_t value, int key_len, struct hash *table) +{ + uint32_t hash = MEMBER_HASH_FUNC(key, key_len, table->seed); + uint16_t idx = hash % table->bkt_cnt; + uint16_t sig = hash >> 16; + + for (int i = 0; i < HASH_BKT_SIZE; i++) { + if (table->buckets[idx].sig[i] == sig && table->buckets[idx].idx[i] == value) { + table->buckets[idx].idx[i] = 0; + table->num_item--; + return 0; + } + } + + return -1; +} + +static int +hash_table_lookup(const void *key, int key_len, struct minheap *hp) +{ + struct hash *table = hp->hashtable; + uint32_t hash = MEMBER_HASH_FUNC(key, key_len, table->seed); + uint16_t idx = hash % table->bkt_cnt; + uint16_t sig = hash >> 16; + + for (int i = 0; i < HASH_BKT_SIZE; i++) { + if (table->buckets[idx].sig[i] == sig && table->buckets[idx].idx[i] != 0) { + uint32_t hp_idx = table->buckets[idx].idx[i] - 1; + + if (memcmp(hp->elem[hp_idx].key, key, hp->key_len) == 0) + return hp_idx; + } + } + + return -ENOENT; /* key doesn't exist */ +} + +static int +resize_hash_table(struct minheap *hp) +{ + uint32_t i; + uint32_t new_bkt_cnt; + + while (1) { + new_bkt_cnt = hp->hashtable->bkt_cnt * HASH_RESIZE_MULTI; + + RTE_MEMBER_LOG(ERR, "Sketch Minheap HT load factor is [%f]\n", + hp->hashtable->num_item / ((float)hp->hashtable->bkt_cnt * HASH_BKT_SIZE)); + RTE_MEMBER_LOG(ERR, "Sketch Minheap HT resize happen!\n"); + rte_free(hp->hashtable); + hp->hashtable = rte_zmalloc_socket(NULL, sizeof(struct hash) + + new_bkt_cnt * sizeof(struct hash_bkt), + RTE_CACHE_LINE_SIZE, hp->socket); + + if (hp->hashtable == NULL) { + RTE_MEMBER_LOG(ERR, "Sketch Minheap HT allocation failed\n"); + return -ENOMEM; + } + + hp->hashtable->bkt_cnt = new_bkt_cnt; + + for (i = 0; i < hp->size; ++i) { + if (hash_table_insert(hp->elem[i].key, + i + 1, hp->key_len, hp->hashtable) < 0) { + RTE_MEMBER_LOG(ERR, + "Sketch Minheap HT resize insert fail!\n"); + break; + } + } + if (i == hp->size) + break; + } + + return 0; +} + +/* find the item in the given minheap */ +static int +rte_member_minheap_find(struct minheap *hp, const void *key) +{ + int idx = hash_table_lookup(key, hp->key_len, hp); + return idx; +} + +static int +rte_member_minheap_init(struct minheap *heap, int size, + uint32_t socket, uint32_t seed) +{ + heap->elem = rte_zmalloc_socket(NULL, sizeof(struct node) * size, + RTE_CACHE_LINE_SIZE, socket); + if (heap->elem == NULL) { + RTE_MEMBER_LOG(ERR, "Sketch Minheap elem allocation failed\n"); + return -ENOMEM; + } + + uint32_t hash_bkt_cnt = rte_align32pow2(size * HASH_HP_MULTI) / HASH_BKT_SIZE; + + if (hash_bkt_cnt == 0) + hash_bkt_cnt = 1; + + heap->hashtable = rte_zmalloc_socket(NULL, sizeof(struct hash) + + hash_bkt_cnt * sizeof(struct hash_bkt), + RTE_CACHE_LINE_SIZE, socket); + + if (heap->hashtable == NULL) { + RTE_MEMBER_LOG(ERR, "Sketch Minheap HT allocation failed\n"); + rte_free(heap->elem); + return -ENOMEM; + } + + heap->hashtable->seed = seed; + heap->hashtable->bkt_cnt = hash_bkt_cnt; + heap->socket = socket; + + return 0; +} + +/* swap the minheap nodes */ +static __rte_always_inline void +rte_member_heap_swap(struct node *n1, struct node *n2) +{ + struct node temp = *n1; + *n1 = *n2; + *n2 = temp; +} + +/* heapify function */ +static void +rte_member_heapify(struct minheap *hp, uint32_t idx, bool update_hash) +{ + uint32_t smallest; + + if (LCHILD(idx) < hp->size && + hp->elem[LCHILD(idx)].count < hp->elem[idx].count) + smallest = LCHILD(idx); + else + smallest = idx; + + if (RCHILD(idx) < hp->size && + hp->elem[RCHILD(idx)].count < hp->elem[smallest].count) + smallest = RCHILD(idx); + + if (smallest != idx) { + rte_member_heap_swap(&(hp->elem[idx]), &(hp->elem[smallest])); + + if (update_hash) { + if (hash_table_update(hp->elem[smallest].key, idx + 1, smallest + 1, + hp->key_len, hp->hashtable) < 0) { + RTE_MEMBER_LOG(ERR, "Minheap Hash Table update failed\n"); + return; + } + + if (hash_table_update(hp->elem[idx].key, smallest + 1, idx + 1, + hp->key_len, hp->hashtable) < 0) { + RTE_MEMBER_LOG(ERR, "Minheap Hash Table update failed\n"); + return; + } + } + rte_member_heapify(hp, smallest, update_hash); + } +} + +/* insert a node into the minheap */ +static int +rte_member_minheap_insert_node(struct minheap *hp, const void *key, + int counter, void *key_slot, + struct rte_ring *free_key_slot) +{ + struct node nd; + uint32_t slot_id; + + if (rte_ring_sc_dequeue_elem(free_key_slot, &slot_id, sizeof(uint32_t)) != 0) { + RTE_MEMBER_LOG(ERR, "Minheap get empty keyslot failed\n"); + return -1; + } + + nd.count = counter; + nd.key = RTE_PTR_ADD(key_slot, slot_id * hp->key_len); + + memcpy(nd.key, key, hp->key_len); + + uint32_t i = (hp->size)++; + + while (i && nd.count < hp->elem[PARENT(i)].count) { + hp->elem[i] = hp->elem[PARENT(i)]; + if (hash_table_update(hp->elem[i].key, PARENT(i) + 1, i + 1, + hp->key_len, hp->hashtable) < 0) { + RTE_MEMBER_LOG(ERR, "Minheap Hash Table update failed\n"); + return -1; + } + i = PARENT(i); + } + hp->elem[i] = nd; + + if (hash_table_insert(key, i + 1, hp->key_len, hp->hashtable) < 0) { + if (resize_hash_table(hp) < 0) { + RTE_MEMBER_LOG(ERR, "Minheap Hash Table resize failed\n"); + return -1; + } + } + + return 0; +} + +/* delete a key from the minheap */ +static int +rte_member_minheap_delete_node(struct minheap *hp, const void *key, + void *key_slot, struct rte_ring *free_key_slot) +{ + int idx = rte_member_minheap_find(hp, key); + uint32_t offset = RTE_PTR_DIFF(hp->elem[idx].key, key_slot) / hp->key_len; + + if (hash_table_del(key, idx + 1, hp->key_len, hp->hashtable) < 0) { + RTE_MEMBER_LOG(ERR, "Minheap Hash Table delete failed\n"); + return -1; + } + + rte_ring_sp_enqueue_elem(free_key_slot, &offset, sizeof(uint32_t)); + + if (idx == (int)(hp->size - 1)) { + hp->size--; + return 0; + } + + hp->elem[idx] = hp->elem[hp->size - 1]; + + if (hash_table_update(hp->elem[idx].key, hp->size, idx + 1, + hp->key_len, hp->hashtable) < 0) { + RTE_MEMBER_LOG(ERR, "Minheap Hash Table update failed\n"); + return -1; + } + hp->size--; + rte_member_heapify(hp, idx, true); + + return 0; +} + +/* replace a min node with a new key. */ +static int +rte_member_minheap_replace_node(struct minheap *hp, + const void *new_key, + int new_counter) +{ + struct node nd; + void *recycle_key = NULL; + + recycle_key = hp->elem[0].key; + + if (hash_table_del(recycle_key, 1, hp->key_len, hp->hashtable) < 0) { + RTE_MEMBER_LOG(ERR, "Minheap Hash Table delete failed\n"); + return -1; + } + + hp->elem[0] = hp->elem[hp->size - 1]; + + if (hash_table_update(hp->elem[0].key, hp->size, 1, + hp->key_len, hp->hashtable) < 0) { + RTE_MEMBER_LOG(ERR, "Minheap Hash Table update failed\n"); + return -1; + } + hp->size--; + + rte_member_heapify(hp, 0, true); + + nd.count = new_counter; + nd.key = recycle_key; + + memcpy(nd.key, new_key, hp->key_len); + + uint32_t i = (hp->size)++; + + while (i && nd.count < hp->elem[PARENT(i)].count) { + hp->elem[i] = hp->elem[PARENT(i)]; + if (hash_table_update(hp->elem[i].key, PARENT(i) + 1, i + 1, + hp->key_len, hp->hashtable) < 0) { + RTE_MEMBER_LOG(ERR, "Minheap Hash Table update failed\n"); + return -1; + } + i = PARENT(i); + } + + hp->elem[i] = nd; + + if (hash_table_insert(new_key, i + 1, hp->key_len, hp->hashtable) < 0) { + RTE_MEMBER_LOG(ERR, "Minheap Hash Table replace insert failed\n"); + if (resize_hash_table(hp) < 0) { + RTE_MEMBER_LOG(ERR, "Minheap Hash Table replace resize failed\n"); + return -1; + } + } + + return 0; +} + +/* sort the heap into a decending array */ +static void +rte_member_heapsort(struct minheap *hp, struct node *result_array) +{ + struct minheap new_hp; + + /* build a new heap for using the given array */ + new_hp.size = hp->size; + new_hp.key_len = hp->key_len; + new_hp.elem = result_array; + memcpy(result_array, hp->elem, hp->size * sizeof(struct node)); + + /* sort the new heap */ + while (new_hp.size > 1) { + rte_member_heap_swap(&(new_hp.elem[0]), &(new_hp.elem[new_hp.size - 1])); + new_hp.size--; + rte_member_heapify(&new_hp, 0, false); + } +} + +static void +rte_member_minheap_free(struct minheap *hp) +{ + if (hp == NULL) + return; + + rte_free(hp->elem); + rte_free(hp->hashtable); +} + +static void +rte_member_minheap_reset(struct minheap *hp) +{ + if (hp == NULL) + return; + + memset(hp->elem, 0, sizeof(struct node) * hp->size); + hp->size = 0; + + memset((char *)hp->hashtable + sizeof(struct hash), 0, + hp->hashtable->bkt_cnt * sizeof(struct hash_bkt)); + hp->hashtable->num_item = 0; +} + +#endif /* _RTE_MEMBER_HEAP_H_ */ diff --git a/lib/member/rte_member_sketch.c b/lib/member/rte_member_sketch.c new file mode 100644 index 0000000000..b0a258165f --- /dev/null +++ b/lib/member/rte_member_sketch.c @@ -0,0 +1,583 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2020 Intel Corporation + * Copyright(c) 2020, Alan Liu + */ + +#include +#include + +#include +#include +#include +#include +#include +#include +#include + +#include "rte_member.h" +#include "rte_member_sketch.h" +#include "rte_member_heap.h" + +#ifdef CC_AVX512_SUPPORT +#include "rte_member_sketch_avx512.h" +#endif /* CC_AVX512_SUPPORT */ + +struct sketch_runtime { + uint64_t pkt_cnt; + uint32_t until_next; + int converged; + struct minheap heap; + struct node *report_array; + void *key_slots; + struct rte_ring *free_key_slots; +} __rte_cache_aligned; + +/* + * Geometric sampling to calculate how many packets needs to be + * skipped until next update. This method can mitigate the CPU + * overheads compared with coin-toss sampling. + */ +static uint32_t +draw_geometric(const struct rte_member_setsum *ss) +{ + double rand = 1; + + if (ss->sample_rate == 1) + return 1; + + while (rand == 1 || rand == 0) + rand = (double) rte_rand() / (UINT64_MAX); + + return (uint32_t)ceil(log(1 - rand) / log(1 - ss->sample_rate)); +} + +static void +isort(uint64_t *array, int n) +{ + for (int i = 1; i < n; i++) { + uint64_t t = array[i]; + int j; + + for (j = i - 1; j >= 0; j--) { + if (t < array[j]) + array[j + 1] = array[j]; + else + break; + } + array[j + 1] = t; + } +} + +static __rte_always_inline void +swap(uint64_t *a, uint64_t *b) +{ + uint64_t tmp = *a; + *a = *b; + *b = tmp; +} + +static uint64_t +medianof5(uint64_t a, uint64_t b, uint64_t c, uint64_t d, uint64_t e) +{ + if (a > b) + swap(&a, &b); + if (c > d) + swap(&c, &d); + if (a > c) { + if (d > e) + swap(&c, &e); + else { + swap(&c, &d); + swap(&d, &e); + } + } else { + if (b > e) + swap(&a, &e); + else { + swap(&a, &b); + swap(&b, &e); + } + } + + if (a > c) + return a > d ? d : a; + else + return b > c ? c : b; +} + +int +rte_member_create_sketch(struct rte_member_setsum *ss, + const struct rte_member_parameters *params, + struct rte_ring *ring) +{ + struct sketch_runtime *runtime; + uint32_t num_col; + uint32_t i; + + if (params->sample_rate == 0 || params->sample_rate > 1) { + rte_errno = EINVAL; + RTE_MEMBER_LOG(ERR, + "Membership Sketch created with invalid parameters\n"); + return -EINVAL; + } + + if (params->extra_flag & RTE_MEMBER_SKETCH_COUNT_BYTE) + ss->count_byte = 1; + + if (ss->count_byte == 1 && + rte_vect_get_max_simd_bitwidth() >= RTE_VECT_SIMD_512 && + rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F) == 1 && + rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512IFMA) == 1) { +#ifdef CC_AVX512_SUPPORT + ss->use_avx512 = true; +#else + ss->use_avx512 = false; +#endif + } + + if (ss->use_avx512 == true) { + ss->num_row = NUM_ROW_VEC; + RTE_MEMBER_LOG(NOTICE, + "Membership Sketch AVX512 update/lookup/delete ops is selected\n"); + ss->sketch_update = sketch_update_avx512; + ss->sketch_lookup = sketch_lookup_avx512; + ss->sketch_delete = sketch_delete_avx512; + } else { + ss->num_row = NUM_ROW_SCALAR; + RTE_MEMBER_LOG(NOTICE, + "Membership Sketch SCALAR update/lookup/delete ops is selected\n"); + ss->sketch_update = sketch_update_scalar; + ss->sketch_lookup = sketch_lookup_scalar; + ss->sketch_delete = sketch_delete_scalar; + } + + ss->socket_id = params->socket_id; + + if (ss->count_byte == 0) + num_col = 4.0 / params->error_rate / params->sample_rate; + else if (ss->use_avx512 == true) + num_col = rte_align32pow2(4.0 / params->error_rate); + else + num_col = 4.0 / params->error_rate; + + ss->table = rte_zmalloc_socket(NULL, + sizeof(uint64_t) * num_col * ss->num_row, + RTE_CACHE_LINE_SIZE, ss->socket_id); + if (ss->table == NULL) { + RTE_MEMBER_LOG(ERR, "Sketch Table memory allocation failed\n"); + return -ENOMEM; + } + + ss->hash_seeds = rte_zmalloc_socket(NULL, sizeof(uint64_t) * ss->num_row, + RTE_CACHE_LINE_SIZE, ss->socket_id); + if (ss->hash_seeds == NULL) { + RTE_MEMBER_LOG(ERR, "Sketch Hashseeds memory allocation failed\n"); + return -ENOMEM; + } + + ss->runtime_var = rte_zmalloc_socket(NULL, sizeof(struct sketch_runtime), + RTE_CACHE_LINE_SIZE, ss->socket_id); + if (ss->runtime_var == NULL) { + RTE_MEMBER_LOG(ERR, "Sketch Runtime memory allocation failed\n"); + rte_free(ss); + return -ENOMEM; + } + runtime = ss->runtime_var; + + ss->num_col = num_col; + ss->sample_rate = params->sample_rate; + ss->prim_hash_seed = params->prim_hash_seed; + ss->sec_hash_seed = params->sec_hash_seed; + ss->error_rate = params->error_rate; + ss->topk = params->top_k; + ss->key_len = params->key_len; + runtime->heap.key_len = ss->key_len; + + runtime->key_slots = rte_zmalloc_socket(NULL, ss->key_len * ss->topk, + RTE_CACHE_LINE_SIZE, ss->socket_id); + if (runtime->key_slots == NULL) { + RTE_MEMBER_LOG(ERR, "Sketch Key Slots allocation failed\n"); + goto error; + } + + runtime->free_key_slots = ring; + for (i = 0; i < ss->topk; i++) + rte_ring_sp_enqueue_elem(runtime->free_key_slots, + &i, sizeof(uint32_t)); + + if (rte_member_minheap_init(&(runtime->heap), params->top_k, + ss->socket_id, params->prim_hash_seed) < 0) { + RTE_MEMBER_LOG(ERR, "Sketch Minheap allocation failed\n"); + goto error_runtime; + } + + runtime->report_array = rte_zmalloc_socket(NULL, sizeof(struct node) * ss->topk, + RTE_CACHE_LINE_SIZE, ss->socket_id); + if (runtime->report_array == NULL) { + RTE_MEMBER_LOG(ERR, "Sketch Runtime Report Array allocation failed\n"); + goto error_runtime; + } + + rte_srand(ss->prim_hash_seed); + for (uint32_t i = 0; i < ss->num_row; i++) + ss->hash_seeds[i] = rte_rand(); + + if (params->extra_flag & RTE_MEMBER_SKETCH_ALWAYS_BOUNDED) + ss->always_bounded = 1; + + if (ss->always_bounded) { + double delta = 1.0 / (pow(2, ss->num_row)); + + ss->converge_thresh = 10 * pow(ss->error_rate, -2.0) * sqrt(log(1 / delta)); + } + + RTE_MEMBER_LOG(DEBUG, "Sketch created, " + "the total memory required is %u Bytes\n", ss->num_col * ss->num_row * 8); + + return 0; + +error_runtime: + rte_member_minheap_free(&runtime->heap); + rte_ring_free(runtime->free_key_slots); + rte_free(runtime->key_slots); +error: + rte_free(runtime); + rte_free(ss); + + return -ENOMEM; +} + +uint64_t +sketch_lookup_scalar(const struct rte_member_setsum *ss, const void *key) +{ + uint64_t *count_array = ss->table; + uint32_t col[ss->num_row]; + uint64_t count_row[ss->num_row]; + uint32_t cur_row; + uint64_t count; + + for (cur_row = 0; cur_row < ss->num_row; cur_row++) { + col[cur_row] = MEMBER_HASH_FUNC(key, ss->key_len, + ss->hash_seeds[cur_row]) % ss->num_col; + + rte_prefetch0(&count_array[cur_row * ss->num_col + col[cur_row]]); + } + + /* if sample rate is 1, it is a regular count-min, we report the min */ + if (ss->sample_rate == 1 || ss->count_byte == 1) + return count_min(ss, col); + + /* otherwise we report the median number */ + for (cur_row = 0; cur_row < ss->num_row; cur_row++) + count_row[cur_row] = count_array[cur_row * ss->num_col + col[cur_row]]; + + if (ss->num_row == 5) + return medianof5(count_row[0], count_row[1], + count_row[2], count_row[3], count_row[4]); + + isort(count_row, ss->num_row); + + if (ss->num_row % 2 == 0) { + count = (count_row[ss->num_row / 2] + count_row[ss->num_row / 2 - 1]) / 2; + return count; + } + /* ss->num_row % 2 != 0 */ + count = count_row[ss->num_row / 2]; + + return count; +} + +void +sketch_delete_scalar(const struct rte_member_setsum *ss, const void *key) +{ + uint32_t col[ss->num_row]; + uint64_t *count_array = ss->table; + uint32_t cur_row; + + for (cur_row = 0; cur_row < ss->num_row; cur_row++) { + col[cur_row] = MEMBER_HASH_FUNC(key, ss->key_len, + ss->hash_seeds[cur_row]) % ss->num_col; + + /* set corresponding counter to 0 */ + count_array[cur_row * ss->num_col + col[cur_row]] = 0; + } +} + +int +rte_member_query_sketch(const struct rte_member_setsum *ss, + const void *key, + uint64_t *output) +{ + uint64_t count = ss->sketch_lookup(ss, key); + *output = count; + + return 0; +} + +void +rte_member_update_heap(const struct rte_member_setsum *ss) +{ + uint32_t i; + struct sketch_runtime *runtime_var = ss->runtime_var; + + for (i = 0; i < runtime_var->heap.size; i++) { + uint64_t count = ss->sketch_lookup(ss, runtime_var->heap.elem[i].key); + + runtime_var->heap.elem[i].count = count; + } +} + +int +rte_member_report_heavyhitter_sketch(const struct rte_member_setsum *setsum, + void **key, + uint64_t *count) +{ + uint32_t i; + struct sketch_runtime *runtime_var = setsum->runtime_var; + + rte_member_update_heap(setsum); + rte_member_heapsort(&(runtime_var->heap), runtime_var->report_array); + + for (i = 0; i < runtime_var->heap.size; i++) { + key[i] = runtime_var->report_array[i].key; + count[i] = runtime_var->report_array[i].count; + } + + return runtime_var->heap.size; +} + +int +rte_member_lookup_sketch(const struct rte_member_setsum *ss, + const void *key, member_set_t *set_id) +{ + uint64_t count = ss->sketch_lookup(ss, key); + struct sketch_runtime *runtime_var = ss->runtime_var; + + if (runtime_var->heap.size > 0 && count >= runtime_var->heap.elem[0].count) + *set_id = 1; + else + *set_id = 0; + + if (count == 0) + return 0; + else + return 1; +} + +static void +should_converge(const struct rte_member_setsum *ss) +{ + struct sketch_runtime *runtime_var = ss->runtime_var; + + /* For count min sketch - L1 norm */ + if (runtime_var->pkt_cnt > ss->converge_thresh) { + runtime_var->converged = 1; + RTE_MEMBER_LOG(DEBUG, "Sketch converged, begin sampling " + "from key count %lu\n", + runtime_var->pkt_cnt); + } +} + +static void +sketch_update_row(const struct rte_member_setsum *ss, const void *key, + uint32_t count, uint32_t cur_row) +{ + uint64_t *count_array = ss->table; + uint32_t col = MEMBER_HASH_FUNC(key, ss->key_len, + ss->hash_seeds[cur_row]) % ss->num_col; + + /* sketch counter update */ + count_array[cur_row * ss->num_col + col] += + ceil(count / (ss->sample_rate)); +} + +void +sketch_update_scalar(const struct rte_member_setsum *ss, + const void *key, + uint32_t count) +{ + uint64_t *count_array = ss->table; + uint32_t col; + uint32_t cur_row; + + for (cur_row = 0; cur_row < ss->num_row; cur_row++) { + col = MEMBER_HASH_FUNC(key, ss->key_len, + ss->hash_seeds[cur_row]) % ss->num_col; + count_array[cur_row * ss->num_col + col] += count; + } +} + +static void +heap_update(const struct rte_member_setsum *ss, const void *key) +{ + struct sketch_runtime *runtime_var = ss->runtime_var; + uint64_t key_cnt = 0; + int found; + + /* We also update the heap for this key */ + key_cnt = ss->sketch_lookup(ss, key); + if (key_cnt > runtime_var->heap.elem[0].count) { + found = rte_member_minheap_find(&runtime_var->heap, key); + /* the key is found in the top-k heap */ + if (found >= 0) { + if (runtime_var->heap.elem[found].count < key_cnt) + rte_member_heapify(&runtime_var->heap, found, true); + + runtime_var->heap.elem[found].count = key_cnt; + } else if (runtime_var->heap.size < ss->topk) { + rte_member_minheap_insert_node(&runtime_var->heap, key, + key_cnt, runtime_var->key_slots, runtime_var->free_key_slots); + } else { + rte_member_minheap_replace_node(&runtime_var->heap, key, key_cnt); + } + } else if (runtime_var->heap.size < ss->topk) { + found = rte_member_minheap_find(&runtime_var->heap, key); + if (found >= 0) { + if (runtime_var->heap.elem[found].count < key_cnt) + rte_member_heapify(&runtime_var->heap, found, true); + + runtime_var->heap.elem[found].count = key_cnt; + } else + rte_member_minheap_insert_node(&runtime_var->heap, key, + key_cnt, runtime_var->key_slots, runtime_var->free_key_slots); + } +} + +/* + * Add a single packet into the sketch. + * Sketch value is meatured by packet numbers in this mode. + */ +int +rte_member_add_sketch(const struct rte_member_setsum *ss, + const void *key, + __rte_unused member_set_t set_id) +{ + uint32_t cur_row; + struct sketch_runtime *runtime_var = ss->runtime_var; + uint32_t *until_next = &(runtime_var->until_next); + + /* + * If sketch is mesured by byte count, + * the rte_member_add_sketch_byte_count routine should be used. + */ + if (ss->count_byte == 1) { + RTE_MEMBER_LOG(ERR, "Sketch is Byte Mode, " + "should use rte_member_add_byte_count()!\n"); + return -EINVAL; + } + + if (ss->sample_rate == 1) { + ss->sketch_update(ss, key, 1); + heap_update(ss, key); + return 0; + } + + /* convergence stage if it's needed */ + if (ss->always_bounded && !runtime_var->converged) { + ss->sketch_update(ss, key, 1); + + if (!((++runtime_var->pkt_cnt) & (INTERVAL - 1))) + should_converge(ss); + + heap_update(ss, key); + return 0; + } + + /* should we skip this packet */ + if (*until_next >= ss->num_row) { + *until_next -= ss->num_row; + return 0; + } + cur_row = *until_next; + do { + sketch_update_row(ss, key, 1, cur_row); + *until_next = draw_geometric(ss); + if (cur_row + *until_next >= ss->num_row) + break; + cur_row += *until_next; + } while (1); + + *until_next -= (ss->num_row - cur_row); + + heap_update(ss, key); + + return 0; +} + +/* + * Add the byte count of the packet into the sketch. + * Sketch value is meatured by byte count numbers in this mode. + */ +int +rte_member_add_sketch_byte_count(const struct rte_member_setsum *ss, + const void *key, + uint32_t byte_count) +{ + struct sketch_runtime *runtime_var = ss->runtime_var; + uint32_t *until_next = &(runtime_var->until_next); + + /* should not call this API if not in count byte mode */ + if (ss->count_byte == 0) { + RTE_MEMBER_LOG(ERR, "Sketch is Pkt Mode, " + "should use rte_member_add()!\n"); + return -EINVAL; + } + + /* there's specific optimization for the sketch update */ + ss->sketch_update(ss, key, byte_count); + + if (*until_next != 0) { + *until_next = *until_next - 1; + return 0; + } + + *until_next = draw_geometric(ss) - 1; + + heap_update(ss, key); + + return 0; +} + +int +rte_member_delete_sketch(const struct rte_member_setsum *ss, + const void *key) +{ + struct sketch_runtime *runtime_var = ss->runtime_var; + int found; + + found = rte_member_minheap_find(&runtime_var->heap, key); + if (found < 0) + return -1; + + ss->sketch_delete(ss, key); + + return rte_member_minheap_delete_node + (&runtime_var->heap, key, runtime_var->key_slots, runtime_var->free_key_slots); +} + +void +rte_member_free_sketch(struct rte_member_setsum *ss) +{ + struct sketch_runtime *runtime_var = ss->runtime_var; + + rte_free(ss->table); + rte_member_minheap_free(&runtime_var->heap); + rte_free(runtime_var->key_slots); + rte_ring_free(runtime_var->free_key_slots); + rte_free(runtime_var); +} + +void +rte_member_reset_sketch(const struct rte_member_setsum *ss) +{ + struct sketch_runtime *runtime_var = ss->runtime_var; + uint64_t *sketch = ss->table; + uint32_t i; + + memset(sketch, 0, sizeof(uint64_t) * ss->num_col * ss->num_row); + rte_member_minheap_reset(&runtime_var->heap); + rte_ring_reset(runtime_var->free_key_slots); + + for (i = 0; i < ss->topk; i++) + rte_ring_sp_enqueue_elem(runtime_var->free_key_slots, &i, sizeof(uint32_t)); +} diff --git a/lib/member/rte_member_sketch.h b/lib/member/rte_member_sketch.h new file mode 100644 index 0000000000..a5e633a74e --- /dev/null +++ b/lib/member/rte_member_sketch.h @@ -0,0 +1,96 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2017 Intel Corporation + */ + +#ifndef _RTE_MEMBER_SKETCH_H_ +#define _RTE_MEMBER_SKETCH_H_ + +#ifdef __cplusplus +extern "C" { +#endif + +#include + +#define NUM_ROW_SCALAR 5 +#define INTERVAL (1 << 15) + +#if !RTE_IS_POWER_OF_2(INTERVAL) +#error sketch INTERVAL macro must be a power of 2 +#endif + +int +rte_member_create_sketch(struct rte_member_setsum *ss, + const struct rte_member_parameters *params, + struct rte_ring *r); + +int +rte_member_lookup_sketch(const struct rte_member_setsum *setsum, + const void *key, member_set_t *set_id); + +int +rte_member_add_sketch(const struct rte_member_setsum *setsum, + const void *key, + member_set_t set_id); + +int +rte_member_add_sketch_byte_count(const struct rte_member_setsum *ss, + const void *key, uint32_t byte_count); + +void +sketch_update_scalar(const struct rte_member_setsum *ss, + const void *key, + uint32_t count); + +uint64_t +sketch_lookup_scalar(const struct rte_member_setsum *ss, + const void *key); + +void +sketch_delete_scalar(const struct rte_member_setsum *ss, + const void *key); + +int +rte_member_delete_sketch(const struct rte_member_setsum *setsum, + const void *key); + +int +rte_member_query_sketch(const struct rte_member_setsum *setsum, + const void *key, uint64_t *output); + +void +rte_member_free_sketch(struct rte_member_setsum *ss); + +void +rte_member_reset_sketch(const struct rte_member_setsum *setsum); + +int +rte_member_report_heavyhitter_sketch(const struct rte_member_setsum *setsum, + void **key, uint64_t *count); + +void +rte_member_update_heap(const struct rte_member_setsum *ss); + +static __rte_always_inline uint64_t +count_min(const struct rte_member_setsum *ss, const uint32_t *hash_results) +{ + uint64_t *count_array = ss->table; + uint64_t count; + uint32_t cur_row; + uint64_t min = UINT64_MAX; + + for (cur_row = 0; cur_row < ss->num_row; cur_row++) { + uint64_t cnt = count_array[cur_row * ss->num_col + hash_results[cur_row]]; + + if (cnt < min) + min = cnt; + } + count = min; + + return count; +} + +#ifdef __cplusplus +} +#endif + +#endif /* _RTE_MEMBER_SKETCH_H_ */ diff --git a/lib/member/rte_member_sketch_avx512.c b/lib/member/rte_member_sketch_avx512.c new file mode 100644 index 0000000000..c83f4b6fd1 --- /dev/null +++ b/lib/member/rte_member_sketch_avx512.c @@ -0,0 +1,69 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2020 Intel Corporation + */ + +#include "rte_member_sketch_avx512.h" + +__rte_always_inline void +sketch_update_avx512(const struct rte_member_setsum *ss, + const void *key, + uint32_t count) +{ + uint64_t *count_array = ss->table; + uint32_t num_col = ss->num_col; + uint32_t key_len = ss->key_len; + __m256i v_row_base; + __m256i v_hash_result; + __m512i current_sketch; + __m512i updated_sketch; + __m512i v_count; + + const __m256i v_idx = _mm256_set_epi32(7, 6, 5, 4, 3, 2, 1, 0); + const __m256i v_col = _mm256_set1_epi32(num_col); + + /* compute the hash result parallelly */ + v_hash_result = rte_xxh64_sketch_avx512 + (key, key_len, *(__m512i *)ss->hash_seeds, num_col); + v_row_base = _mm256_mullo_epi32(v_idx, v_col); + v_hash_result = _mm256_add_epi32(v_row_base, v_hash_result); + + current_sketch = + _mm512_i32gather_epi64(v_hash_result, count_array, 8); + v_count = _mm512_set1_epi64(count); + updated_sketch = _mm512_add_epi64(current_sketch, v_count); + _mm512_i32scatter_epi64 + ((void *)count_array, v_hash_result, updated_sketch, 8); +} + +uint64_t +sketch_lookup_avx512(const struct rte_member_setsum *ss, const void *key) +{ + uint32_t col[ss->num_row]; + + /* currently only for sketch byte count mode */ + __m256i v_hash_result = rte_xxh64_sketch_avx512 + (key, ss->key_len, *(__m512i *)ss->hash_seeds, ss->num_col); + _mm256_storeu_si256((__m256i *)col, v_hash_result); + + return count_min(ss, col); +} + +void +sketch_delete_avx512(const struct rte_member_setsum *ss, const void *key) +{ + uint32_t col[ss->num_row]; + uint64_t *count_array = ss->table; + uint64_t min = UINT64_MAX; + uint32_t cur_row; + + __m256i v_hash_result = rte_xxh64_sketch_avx512 + (key, ss->key_len, *(__m512i *)ss->hash_seeds, + RTE_ALIGN_FLOOR(ss->num_col, 32)); + _mm256_storeu_si256((__m256i *)col, v_hash_result); + + min = count_min(ss, col); + + /* subtract the min value from all the counters */ + for (cur_row = 0; cur_row < ss->num_row; cur_row++) + count_array[cur_row * ss->num_col + col[cur_row]] -= min; +} diff --git a/lib/member/rte_member_sketch_avx512.h b/lib/member/rte_member_sketch_avx512.h new file mode 100644 index 0000000000..e7c25da643 --- /dev/null +++ b/lib/member/rte_member_sketch_avx512.h @@ -0,0 +1,36 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2020 Intel Corporation + */ + +#ifndef _RTE_MEMBER_SKETCH_AVX512_H_ +#define _RTE_MEMBER_SKETCH_AVX512_H_ + +#ifdef __cplusplus +extern "C" { +#endif + +#include +#include "rte_member.h" +#include "rte_member_sketch.h" +#include "rte_xxh64_avx512.h" + +#define NUM_ROW_VEC 8 + +void +sketch_update_avx512(const struct rte_member_setsum *ss, + const void *key, + uint32_t count); + +uint64_t +sketch_lookup_avx512(const struct rte_member_setsum *ss, + const void *key); + +void +sketch_delete_avx512(const struct rte_member_setsum *ss, + const void *key); + +#ifdef __cplusplus +} +#endif + +#endif /* _RTE_MEMBER_SKETCH_AVX512_H_ */ diff --git a/lib/member/rte_xxh64_avx512.h b/lib/member/rte_xxh64_avx512.h new file mode 100644 index 0000000000..406c9ac256 --- /dev/null +++ b/lib/member/rte_xxh64_avx512.h @@ -0,0 +1,117 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2020 Intel Corporation + */ + +#ifndef _RTE_XXH64_AVX512_H_ +#define _RTE_XXH64_AVX512_H_ + +#ifdef __cplusplus +extern "C" { +#endif + +#include +#include + +/* 0b1001111000110111011110011011000110000101111010111100101010000111 */ +static const uint64_t PRIME64_1 = 0x9E3779B185EBCA87ULL; +/* 0b1100001010110010101011100011110100100111110101001110101101001111 */ +static const uint64_t PRIME64_2 = 0xC2B2AE3D27D4EB4FULL; +/* 0b0001011001010110011001111011000110011110001101110111100111111001 */ +static const uint64_t PRIME64_3 = 0x165667B19E3779F9ULL; +/* 0b1000010111101011110010100111011111000010101100101010111001100011 */ +static const uint64_t PRIME64_4 = 0x85EBCA77C2B2AE63ULL; +/* 0b0010011111010100111010110010111100010110010101100110011111000101 */ +static const uint64_t PRIME64_5 = 0x27D4EB2F165667C5ULL; + +static __rte_always_inline __m512i +xxh64_round_avx512(__m512i hash, __m512i input) +{ + hash = _mm512_madd52lo_epu64(hash, + input, + _mm512_set1_epi64(PRIME64_2)); + + hash = _mm512_rol_epi64(hash, 31); + + return hash; +} + +static __rte_always_inline __m512i +xxh64_fmix_avx512(__m512i hash) +{ + hash = _mm512_xor_si512(hash, _mm512_srli_epi64(hash, 33)); + + return hash; +} + +static __rte_always_inline __m256i +rte_xxh64_sketch_avx512(const void *key, uint32_t key_len, + __m512i v_seed, uint32_t modulo) +{ + __m512i v_prime64_5, v_hash; + size_t remaining = key_len; + size_t offset = 0; + __m512i input; + + v_prime64_5 = _mm512_set1_epi64(PRIME64_5); + v_hash = _mm512_add_epi64 + (_mm512_add_epi64(v_seed, v_prime64_5), + _mm512_set1_epi64(key_len)); + + while (remaining >= 8) { + input = _mm512_set1_epi64(*(uint64_t *)RTE_PTR_ADD(key, offset)); + v_hash = _mm512_xor_epi64(v_hash, + xxh64_round_avx512(_mm512_setzero_si512(), input)); + v_hash = _mm512_madd52lo_epu64(_mm512_set1_epi64(PRIME64_4), + v_hash, + _mm512_set1_epi64(PRIME64_1)); + + remaining -= 8; + offset += 8; + } + + if (remaining >= 4) { + input = _mm512_set1_epi64 + (*(uint32_t *)RTE_PTR_ADD(key, offset)); + v_hash = _mm512_xor_epi64(v_hash, + _mm512_mullo_epi64(input, + _mm512_set1_epi64(PRIME64_1))); + v_hash = _mm512_madd52lo_epu64 + (_mm512_set1_epi64(PRIME64_3), + _mm512_rol_epi64(v_hash, 23), + _mm512_set1_epi64(PRIME64_2)); + + offset += 4; + remaining -= 4; + } + + while (remaining != 0) { + input = _mm512_set1_epi64 + (*(uint8_t *)RTE_PTR_ADD(key, offset)); + v_hash = _mm512_xor_epi64(v_hash, + _mm512_mullo_epi64(input, + _mm512_set1_epi64(PRIME64_5))); + v_hash = _mm512_mullo_epi64 + (_mm512_rol_epi64(v_hash, 11), + _mm512_set1_epi64(PRIME64_1)); + offset++; + remaining--; + } + + v_hash = xxh64_fmix_avx512(v_hash); + + /* + * theoritically, such modular operations can be replaced by + * _mm512_rem_epi64(), but seems it depends on the complier's + * implementation. so here is the limitation that the modulo + * value should be power of 2. + */ + __m512i v_hash_remainder = _mm512_set1_epi64((modulo - 1)); + + return _mm512_cvtepi64_epi32(_mm512_and_si512(v_hash, v_hash_remainder)); +} + +#ifdef __cplusplus +} +#endif + +#endif /* _RTE_XXH64_AVX512_H_ */ From patchwork Wed Aug 10 07:45:18 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Leyi Rong X-Patchwork-Id: 114804 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id C7CC3A0543; Wed, 10 Aug 2022 09:45:38 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 2898D42BF0; Wed, 10 Aug 2022 09:45:31 +0200 (CEST) Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by mails.dpdk.org (Postfix) with ESMTP id AB2DD42BBF for ; Wed, 10 Aug 2022 09:45:28 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1660117528; x=1691653528; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=a+EERUg3C0gjvaSgl1bpBBaEEkGLrzez/Hd55GtUr1E=; b=J8+sorWtQw3URNLg8OJbU5ObxPypnz++GdkeTPx5JnDYSttFef9iCF40 INjYnMporyrSw1vIeYv5Zdjc5nuOqrVwssd4mxyLyvKHlE4/Y/BhDL0TA quKmwV2zjR8m2GMXWAgMEV8ucdV7Lj6mKhLMThNQgvuOGodD7xwRfkZzH Xy36qVD2o38/SINOo53CDYqAwMjsqdhz/wF8/rTJPp2Jw4MiGzX5VH5zq oe+I7dpXnKSE92nIVyOdGQMUudkP2DZQhd66w+mCcItCkfNMLQ3I/SIwV rVh/SjSwu2ckuokRBr+ITyYtIca0ki86EIEi2O3/lGHStu3Y87Knf7vXi w==; X-IronPort-AV: E=McAfee;i="6400,9594,10434"; a="271407712" X-IronPort-AV: E=Sophos;i="5.93,226,1654585200"; d="scan'208";a="271407712" Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 10 Aug 2022 00:45:27 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.93,226,1654585200"; d="scan'208";a="664785928" Received: from dpdk-lrong-icx-01.sh.intel.com ([10.67.119.18]) by fmsmga008.fm.intel.com with ESMTP; 10 Aug 2022 00:45:26 -0700 From: Leyi Rong To: yipeng1.wang@intel.com, zaoxingliu@gmail.com, sameh.gobriel@intel.com Cc: dev@dpdk.org, Leyi Rong Subject: [PATCH 2/2] test/member: add functional and perf tests for sketch Date: Wed, 10 Aug 2022 15:45:18 +0800 Message-Id: <20220810074518.1695013-3-leyi.rong@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20220810074518.1695013-1-leyi.rong@intel.com> References: <20220810074518.1695013-1-leyi.rong@intel.com> MIME-Version: 1.0 X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org This patch adds functional and performance tests for sketch mode of membership library. Signed-off-by: Yipeng Wang Signed-off-by: Leyi Rong --- app/test/test_member.c | 258 ++++++++++++++++++++++++++++++++++++ app/test/test_member_perf.c | 153 ++++++++++++++++++++- 2 files changed, 407 insertions(+), 4 deletions(-) diff --git a/app/test/test_member.c b/app/test/test_member.c index 26a712439f..abe59bb9f8 100644 --- a/app/test/test_member.c +++ b/app/test/test_member.c @@ -4,6 +4,7 @@ /* This test is for membership library's simple feature test */ +#include #include "test.h" #include @@ -28,6 +29,7 @@ test_member(void) struct rte_member_setsum *setsum_ht; struct rte_member_setsum *setsum_cache; struct rte_member_setsum *setsum_vbf; +struct rte_member_setsum *setsum_sketch; /* 5-tuple key type */ struct flow_key { @@ -108,6 +110,21 @@ static struct rte_member_parameters params = { .socket_id = 0 /* NUMA Socket ID for memory. */ }; +/* for sketch definitions */ +#define TOP_K 10 +#define HH_PKT_SIZE 16 +#define SKETCH_ERROR_RATE 0.05 +#define SKETCH_SAMPLE_RATE 0.001 +#define PRINT_OUT_COUNT 20 + +#define SKETCH_LARGEST_KEY_SIZE 1000000 +#define SKETCH_TOTAL_KEY 500 +#define NUM_OF_KEY(key) {\ + (unsigned int)ceil(SKETCH_LARGEST_KEY_SIZE / (key + 1)) \ +} + +void *heavy_hitters[TOP_K]; + /* * Sequence of operations for find existing setsummary * @@ -684,6 +701,243 @@ perform_free(void) rte_member_free(setsum_vbf); } +static void +print_out_sketch_results(uint64_t *count_result, member_set_t *heavy_set, + uint32_t print_num, bool count_byte) +{ + uint32_t i; + + for (i = 0; i < print_num; i++) { + if (count_byte) + printf("key %2u, count %8lu, real count %8u, " + "heavy_set %u, deviation rate [%.04f]\n", + i, count_result[i], + (unsigned int)ceil(SKETCH_LARGEST_KEY_SIZE / (i + 1)) * + HH_PKT_SIZE, + heavy_set[i], + fabs((float)count_result[i] - (float)NUM_OF_KEY(i) * HH_PKT_SIZE) / + ((float)NUM_OF_KEY(i) * HH_PKT_SIZE)); + else + printf("key %2u, count %8lu, real count %8u, " + "heavy_set %u, deviation rate [%.04f]\n", + i, count_result[i], + (unsigned int)ceil(SKETCH_LARGEST_KEY_SIZE / (i + 1)), + heavy_set[i], + fabs((float)count_result[i] - (float)NUM_OF_KEY(i)) / + (float)NUM_OF_KEY(i)); + } +} + +static int +sketch_test(uint32_t *keys, uint32_t total_pkt, int count_byte, int reset_test) +{ + uint32_t i; + uint64_t result_count[SKETCH_TOTAL_KEY]; + member_set_t heavy_set[SKETCH_TOTAL_KEY]; + uint64_t count[TOP_K]; + int ret; + int hh_cnt; + + setsum_sketch = rte_member_create(¶ms); + if (setsum_sketch == NULL) { + printf("Creation of setsums fail\n"); + return -1; + } + + for (i = 0; i < total_pkt; i++) { + if (count_byte) + ret = rte_member_add_byte_count(setsum_sketch, &keys[i], HH_PKT_SIZE); + else + ret = rte_member_add(setsum_sketch, &keys[i], 1); + + if (ret < 0) { + printf("rte_member_add Failed! Error [%d]\n", ret); + + return -1; + } + } + + for (i = 0; i < SKETCH_TOTAL_KEY; i++) { + uint32_t tmp_key = i; + + rte_member_query_count(setsum_sketch, (void *)&tmp_key, &result_count[i]); + rte_member_lookup(setsum_sketch, (void *)&tmp_key, &heavy_set[i]); + } + + print_out_sketch_results(result_count, heavy_set, PRINT_OUT_COUNT, count_byte); + + hh_cnt = rte_member_report_heavyhitter(setsum_sketch, heavy_hitters, count); + if (hh_cnt < 0) { + printf("sketch report heavy hitter error!"); + + return -1; + } + + printf("Report heavy hitters:"); + for (i = 0; i < (unsigned int)hh_cnt; i++) { + printf("%u: %lu\t", + *((uint32_t *)heavy_hitters[i]), count[i]); + } + printf("\n"); + + if (reset_test) { + printf("\nEntering Sketch Reset Test Process!\n"); + rte_member_reset(setsum_sketch); + + /* after reset, check some key's count */ + for (i = 0; i < SKETCH_TOTAL_KEY; i++) { + uint32_t tmp_key = i; + + rte_member_query_count(setsum_sketch, (void *)&tmp_key, &result_count[i]); + rte_member_lookup(setsum_sketch, (void *)&tmp_key, &heavy_set[i]); + } + + print_out_sketch_results(result_count, heavy_set, PRINT_OUT_COUNT, count_byte); + + printf("\nReinsert keys after Sketch Reset!\n"); + for (i = 0; i < total_pkt; i++) { + if (count_byte) + ret = rte_member_add_byte_count + (setsum_sketch, &keys[i], HH_PKT_SIZE); + else + ret = rte_member_add(setsum_sketch, &keys[i], 1); + + if (ret < 0) { + printf("rte_member_add Failed! Error [%d]\n", ret); + + return -1; + } + } + + for (i = 0; i < SKETCH_TOTAL_KEY; i++) { + uint32_t tmp_key = i; + + rte_member_query_count(setsum_sketch, (void *)&tmp_key, &result_count[i]); + rte_member_lookup(setsum_sketch, (void *)&tmp_key, &heavy_set[i]); + } + + print_out_sketch_results(result_count, heavy_set, PRINT_OUT_COUNT, count_byte); + + hh_cnt = rte_member_report_heavyhitter(setsum_sketch, heavy_hitters, count); + if (hh_cnt < 0) { + printf("sketch report heavy hitter error!"); + + return -1; + } + printf("Report heavy hitters:"); + for (i = 0; i < (unsigned int)hh_cnt; i++) { + printf("%u: %lu\t", + *((uint32_t *)heavy_hitters[i]), count[i]); + } + printf("\n"); + + printf("\nDelete some keys!\n"); + uint32_t tmp_key = 0; + + rte_member_delete(setsum_sketch, (void *)&tmp_key, 0); + tmp_key = 1; + rte_member_delete(setsum_sketch, (void *)&tmp_key, 0); + + for (i = 0; i < SKETCH_TOTAL_KEY; i++) { + uint32_t tmp_key = i; + + rte_member_query_count(setsum_sketch, (void *)&tmp_key, &result_count[i]); + rte_member_lookup(setsum_sketch, (void *)&tmp_key, &heavy_set[i]); + } + + print_out_sketch_results(result_count, heavy_set, PRINT_OUT_COUNT, count_byte); + + hh_cnt = rte_member_report_heavyhitter(setsum_sketch, heavy_hitters, count); + if (hh_cnt < 0) { + printf("sketch report heavy hitter error!"); + + return -1; + } + printf("Report heavy hitters:"); + for (i = 0; i < (unsigned int)hh_cnt; i++) { + printf("%u: %lu\t", + *((uint32_t *)heavy_hitters[i]), count[i]); + } + printf("\n"); + } + + rte_member_free(setsum_sketch); + return 0; +} + +static int +test_member_sketch(void) +{ + unsigned int i, j, index; + uint32_t total_pkt = 0; + uint32_t *keys; + int count_byte = 0; + + for (i = 0; i < SKETCH_TOTAL_KEY; i++) + total_pkt += ceil(SKETCH_LARGEST_KEY_SIZE / (i + 1)); + + printf("\nTotal key count [%u] in Sketch Autotest\n", total_pkt); + + keys = rte_zmalloc(NULL, sizeof(uint32_t) * total_pkt, 0); + + if (keys == NULL) { + printf("RTE_ZMALLOC failed\n"); + return -1; + } + + index = 0; + for (i = 0; i < SKETCH_TOTAL_KEY; i++) { + for (j = 0; j < ceil(SKETCH_LARGEST_KEY_SIZE / (i + 1)); j++) + keys[index++] = i; + } + + /* shuffle the keys */ + for (i = index - 1; i > 0; i--) { + uint32_t swap_idx = rte_rand() % i; + uint32_t tmp_key = keys[i]; + + keys[i] = keys[swap_idx]; + keys[swap_idx] = tmp_key; + } + + params.key_len = 4; + params.name = "test_member_sketch"; + params.type = RTE_MEMBER_TYPE_SKETCH; + params.error_rate = SKETCH_ERROR_RATE; + params.sample_rate = SKETCH_SAMPLE_RATE; + params.extra_flag = 0; + params.top_k = TOP_K; + params.prim_hash_seed = rte_rdtsc(); + int reset_test = 0; + + printf("Default sketching params: Error Rate: [%f]\tSample Rate: [%f]\tTopK: [%d]\n", + SKETCH_ERROR_RATE, SKETCH_SAMPLE_RATE, TOP_K); + + printf("\n[Sketch with Fixed Sampling Rate Mode]\n"); + if (sketch_test(keys, total_pkt, count_byte, reset_test) < 0) + return -1; + + params.extra_flag |= RTE_MEMBER_SKETCH_ALWAYS_BOUNDED; + printf("\n[Sketch with Always Bounded Mode]\n"); + if (sketch_test(keys, total_pkt, count_byte, reset_test) < 0) + return -1; + + count_byte = 1; + params.extra_flag |= RTE_MEMBER_SKETCH_COUNT_BYTE; + printf("\n[Sketch with Packet Size Mode]\n"); + if (sketch_test(keys, total_pkt, count_byte, reset_test) < 0) + return -1; + + count_byte = 0; + params.extra_flag = 0; + reset_test = 1; + printf("\nreset sketch test\n"); + if (sketch_test(keys, total_pkt, count_byte, reset_test) < 0) + return -1; + + return 0; +} + static int test_member(void) { @@ -719,6 +973,10 @@ test_member(void) return -1; } + if (test_member_sketch() < 0) { + perform_free(); + return -1; + } perform_free(); return 0; } diff --git a/app/test/test_member_perf.c b/app/test/test_member_perf.c index 978db0020a..b7eb3e4c66 100644 --- a/app/test/test_member_perf.c +++ b/app/test/test_member_perf.c @@ -13,6 +13,7 @@ #include #include #include +#include #ifdef RTE_EXEC_ENV_WINDOWS static int @@ -26,7 +27,7 @@ test_member_perf(void) #include -#define NUM_KEYSIZES 10 +#define NUM_KEYSIZES RTE_DIM(hashtest_key_lens) #define NUM_SHUFFLES 10 #define MAX_KEYSIZE 64 #define MAX_ENTRIES (1 << 19) @@ -36,12 +37,23 @@ test_member_perf(void) #define BURST_SIZE 64 #define VBF_FALSE_RATE 0.03 +/* for the heavy hitter detection */ +#define SKETCH_LARGEST_KEY_SIZE (1<<15) +#define SKETCH_PKT_SIZE 16 +#define TOP_K 100 +#define SKETCH_ERROR_RATE 0.05 +#define SKETCH_SAMPLE_RATE 0.001 +#define NUM_ADDS (KEYS_TO_ADD * 20) + static unsigned int test_socket_id; enum sstype { HT = 0, CACHE, VBF, + SKETCH, + SKETCH_BOUNDED, + SKETCH_BYTE, NUM_TYPE }; @@ -88,6 +100,7 @@ static member_set_t data[NUM_TYPE][/* Array to store the data */KEYS_TO_ADD]; /* Array to store all input keys */ static uint8_t keys[KEYS_TO_ADD][MAX_KEYSIZE]; +static uint8_t hh_keys[KEYS_TO_ADD][MAX_KEYSIZE]; /* Shuffle the keys that have been added, so lookups will be totally random */ static void @@ -136,6 +149,10 @@ setup_keys_and_data(struct member_perf_params *params, unsigned int cycle, { unsigned int i, j; int num_duplicates; + int distinct_key = 0; + int count_down = SKETCH_LARGEST_KEY_SIZE; + uint32_t swap_idx; + uint8_t temp_key[MAX_KEYSIZE]; params->key_size = hashtest_key_lens[cycle]; params->cycle = cycle; @@ -176,6 +193,22 @@ setup_keys_and_data(struct member_perf_params *params, unsigned int cycle, /* Shuffle the random values again */ shuffle_input_keys(params); + for (i = 0; i < KEYS_TO_ADD; i++) { + if (count_down == 0) { + distinct_key++; + count_down = ceil(SKETCH_LARGEST_KEY_SIZE / (distinct_key + 1)); + } + memcpy(hh_keys[i], keys[distinct_key], params->key_size); + count_down--; + } + + for (i = KEYS_TO_ADD - 1; i > 0; i--) { + swap_idx = rte_rand() % i; + memcpy(temp_key, hh_keys[i], params->key_size); + memcpy(hh_keys[i], hh_keys[swap_idx], params->key_size); + memcpy(hh_keys[swap_idx], temp_key, params->key_size); + } + /* For testing miss lookup, we insert half and lookup the other half */ unsigned int entry_cnt, bf_key_cnt; if (!miss) { @@ -208,6 +241,44 @@ setup_keys_and_data(struct member_perf_params *params, unsigned int cycle, params->setsum[VBF] = rte_member_create(&member_params); if (params->setsum[VBF] == NULL) fprintf(stderr, "VBF create fail\n"); + + member_params.name = "test_member_sketch"; + member_params.key_len = params->key_size; + member_params.type = RTE_MEMBER_TYPE_SKETCH; + member_params.error_rate = SKETCH_ERROR_RATE; + member_params.sample_rate = SKETCH_SAMPLE_RATE; + member_params.extra_flag = 0; + member_params.top_k = TOP_K; + member_params.prim_hash_seed = rte_rdtsc(); + params->setsum[SKETCH] = rte_member_create(&member_params); + if (params->setsum[SKETCH] == NULL) + fprintf(stderr, "sketch create fail\n"); + + member_params.name = "test_member_sketch_bounded"; + member_params.key_len = params->key_size; + member_params.type = RTE_MEMBER_TYPE_SKETCH; + member_params.error_rate = SKETCH_ERROR_RATE; + member_params.sample_rate = SKETCH_SAMPLE_RATE; + member_params.extra_flag |= RTE_MEMBER_SKETCH_ALWAYS_BOUNDED; + member_params.top_k = TOP_K; + member_params.prim_hash_seed = rte_rdtsc(); + params->setsum[SKETCH_BOUNDED] = rte_member_create(&member_params); + if (params->setsum[SKETCH_BOUNDED] == NULL) + fprintf(stderr, "sketch create fail\n"); + + member_params.name = "test_member_sketch_byte"; + member_params.key_len = params->key_size; + member_params.type = RTE_MEMBER_TYPE_SKETCH; + member_params.error_rate = SKETCH_ERROR_RATE; + member_params.sample_rate = SKETCH_SAMPLE_RATE; + member_params.extra_flag |= RTE_MEMBER_SKETCH_COUNT_BYTE; + member_params.top_k = TOP_K; + member_params.prim_hash_seed = rte_rdtsc(); + params->setsum[SKETCH_BYTE] = rte_member_create(&member_params); + if (params->setsum[SKETCH_BYTE] == NULL) + fprintf(stderr, "sketch create fail\n"); + + for (i = 0; i < NUM_TYPE; i++) { if (params->setsum[i] == NULL) return -1; @@ -243,6 +314,39 @@ timed_adds(struct member_perf_params *params, int type) return 0; } +static int +timed_adds_sketch(struct member_perf_params *params, int type) +{ + const uint64_t start_tsc = rte_rdtsc(); + unsigned int i, j, a; + int32_t ret; + + for (i = 0; i < NUM_ADDS / KEYS_TO_ADD; i++) { + for (j = 0; j < KEYS_TO_ADD; j++) { + if (type == SKETCH_BYTE) + ret = rte_member_add_byte_count(params->setsum[type], + &hh_keys[j], SKETCH_PKT_SIZE); + else + ret = rte_member_add(params->setsum[type], &hh_keys[j], 1); + if (ret < 0) { + printf("Error %d in rte_member_add - key=0x", ret); + for (a = 0; a < params->key_size; a++) + printf("%02x", hh_keys[j][a]); + printf("type: %d\n", type); + + return -1; + } + } + } + + const uint64_t end_tsc = rte_rdtsc(); + const uint64_t time_taken = end_tsc - start_tsc; + + cycles[type][params->cycle][ADD] = time_taken / NUM_ADDS; + + return 0; +} + static int timed_lookups(struct member_perf_params *params, int type) { @@ -279,6 +383,36 @@ timed_lookups(struct member_perf_params *params, int type) return 0; } +static int +timed_lookups_sketch(struct member_perf_params *params, int type) +{ + unsigned int i, j; + + false_data[type][params->cycle] = 0; + + const uint64_t start_tsc = rte_rdtsc(); + member_set_t result; + int ret; + + for (i = 0; i < NUM_LOOKUPS / KEYS_TO_ADD; i++) { + for (j = 0; j < KEYS_TO_ADD; j++) { + ret = rte_member_lookup(params->setsum[type], &hh_keys[j], + &result); + if (ret < 0) { + printf("lookup wrong internally"); + return -1; + } + } + } + + const uint64_t end_tsc = rte_rdtsc(); + const uint64_t time_taken = end_tsc - start_tsc; + + cycles[type][params->cycle][LOOKUP] = time_taken / NUM_LOOKUPS; + + return 0; +} + static int timed_lookups_bulk(struct member_perf_params *params, int type) { @@ -531,7 +665,7 @@ run_all_tbl_perf_tests(void) printf("Could not create keys/data/table\n"); return -1; } - for (j = 0; j < NUM_TYPE; j++) { + for (j = 0; j < SKETCH; j++) { if (timed_adds(¶ms, j) < 0) return exit_with_fail("timed_adds", ¶ms, @@ -562,6 +696,17 @@ run_all_tbl_perf_tests(void) /* Print a dot to show progress on operations */ } + + for (j = SKETCH; j < NUM_TYPE; j++) { + if (timed_adds_sketch(¶ms, j) < 0) + return exit_with_fail + ("timed_adds_sketch", ¶ms, i, j); + + if (timed_lookups_sketch(¶ms, j) < 0) + return exit_with_fail + ("timed_lookups_sketch", ¶ms, i, j); + } + printf("."); fflush(stdout); @@ -574,7 +719,7 @@ run_all_tbl_perf_tests(void) printf("Could not create keys/data/table\n"); return -1; } - for (j = 0; j < NUM_TYPE; j++) { + for (j = 0; j < SKETCH; j++) { if (timed_miss_lookup(¶ms, j) < 0) return exit_with_fail("timed_miss_lookup", ¶ms, i, j); @@ -605,7 +750,7 @@ run_all_tbl_perf_tests(void) "fr_multi_bulk", "false_positive_rate"); /* Key size not influence False rate so just print out one key size */ for (i = 0; i < 1; i++) { - for (j = 0; j < NUM_TYPE; j++) { + for (j = 0; j < SKETCH; j++) { printf("%-18d", hashtest_key_lens[i]); printf("%-18d", j); printf("%-18f", (float)false_data[j][i] / NUM_LOOKUPS);