From patchwork Tue Feb 28 09:39:15 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
X-Patchwork-Id: 124569
X-Patchwork-Delegate: thomas@monjalon.net
From: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
To: dev@dpdk.org
CC: Erik Gabriel Carrillo, David Marchand, Stefan Sundkvist, Mattias Rönnblom
Subject: [RFC 1/2] eal: add bitset type
Date: Tue, 28 Feb 2023 10:39:15 +0100
Message-ID: <20230228093916.87206-2-mattias.ronnblom@ericsson.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20230228093916.87206-1-mattias.ronnblom@ericsson.com>
References: <20230228093916.87206-1-mattias.ronnblom@ericsson.com>
List-Id: DPDK patches and discussions

Introduce a set of functions and macros that operate on sets of bits,
kept in arrays of 64-bit elements.

RTE bitset is designed for bitsets which are larger than what fits in
a single machine word (i.e., 64 bits). For very large bitsets, the
rte_bitmap.h API may be a more appropriate choice.

Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
---
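An illustrative sketch of the intended usage (names are hypothetical;
all calls are part of the API introduced below):

    RTE_BITSET_DECLARE(slots, 256);
    ssize_t free_slot;

    rte_bitset_init(slots, 256);

    /* mark slot 17 busy */
    rte_bitset_set(slots, 17);

    if (rte_bitset_test(slots, 17))
        printf("slot 17 is busy\n");

    /* look up an unused slot */
    free_slot = rte_bitset_find_first_clear(slots, 256);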
 app/test/meson.build         |   4 +-
 app/test/test_bitset.c       | 646 ++++++++++++++++++++++++++
 lib/eal/common/meson.build   |   1 +
 lib/eal/common/rte_bitset.c  |  29 ++
 lib/eal/include/meson.build  |   1 +
 lib/eal/include/rte_bitset.h | 878 +++++++++++++++++++++++++++++++++++
 lib/eal/version.map          |   3 +
 7 files changed, 1561 insertions(+), 1 deletion(-)
 create mode 100644 app/test/test_bitset.c
 create mode 100644 lib/eal/common/rte_bitset.c
 create mode 100644 lib/eal/include/rte_bitset.h

diff --git a/app/test/meson.build b/app/test/meson.build
index f34d19e3c3..03811ff692 100644
--- a/app/test/meson.build
+++ b/app/test/meson.build
@@ -13,8 +13,9 @@ test_sources = files(
         'test_alarm.c',
         'test_atomic.c',
         'test_barrier.c',
-        'test_bitops.c',
         'test_bitmap.c',
+        'test_bitset.c',
+        'test_bitops.c',
         'test_bpf.c',
         'test_byteorder.c',
         'test_cksum.c',
@@ -164,6 +165,7 @@ fast_tests = [
         ['bpf_autotest', true, true],
         ['bpf_convert_autotest', true, true],
         ['bitops_autotest', true, true],
+        ['bitset_autotest', true, true],
         ['byteorder_autotest', true, true],
         ['cksum_autotest', true, true],
         ['cmdline_autotest', true, true],
diff --git a/app/test/test_bitset.c b/app/test/test_bitset.c
new file mode 100644
index 0000000000..504363e86e
--- /dev/null
+++ b/app/test/test_bitset.c
@@ -0,0 +1,646 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2023 Ericsson AB
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include <rte_random.h>
+
+#include <rte_bitset.h>
+
+#include "test.h"
+
+#define MAGIC UINT64_C(0xdeadbeefdeadbeef)
+
+static void
+rand_buf(void *buf, size_t n)
+{
+	size_t i;
+
+	for (i = 0; i < n; i++)
+		((char *)buf)[i] = (char)rte_rand();
+}
+
+static uint64_t *
+alloc_bitset(size_t size)
+{
+	uint64_t *p;
+
+	p = malloc(RTE_BITSET_SIZE(size) + 2 * sizeof(uint64_t));
+
+	if (p == NULL)
+		rte_panic("Unable to allocate memory\n");
+
+	rand_buf(&p[0], RTE_BITSET_SIZE(size));
+
+	p[0] = MAGIC;
+	p[RTE_BITSET_NUM_WORDS(size) + 1] = MAGIC;
+
+	return p + 1;
+}
+
+static int
+free_bitset(uint64_t *bitset, size_t size)
+{
+	uint64_t *p;
+
+	p = bitset - 1;
+
+	if (p[0] != MAGIC)
+		return TEST_FAILED;
+
+	if (p[RTE_BITSET_NUM_WORDS(size) + 1] != MAGIC)
+		return TEST_FAILED;
+
+	free(p);
+
+	return TEST_SUCCESS;
+}
+
+static bool
+rand_bool(void)
+{
+	return rte_rand_max(2);
+}
+
+static void
+rand_bool_ary(bool *ary, size_t len)
+{
+	size_t i;
+
+	for (i = 0; i < len; i++)
+		ary[i] = rand_bool();
+}
+
+static int
+test_test_set_size(size_t size)
+{
+	size_t i;
+	bool reference[size];
+	uint64_t *bitset;
+
+	rand_bool_ary(reference, size);
+
+	bitset = alloc_bitset(size);
+
+	if (bitset == NULL)
+		return TEST_FAILED;
+
+	rte_bitset_init(bitset, size);
+
+	for (i = 0; i < size; i++) {
+		if (reference[i])
+			rte_bitset_set(bitset, i);
+		else
+			rte_bitset_clear(bitset, i);
+	}
+
+	for (i = 0; i < size; i++)
+		if (reference[i] != rte_bitset_test(bitset, i))
+			return TEST_FAILED;
+
+	if (free_bitset(bitset, size) != TEST_SUCCESS)
+		return TEST_FAILED;
+
+	return TEST_SUCCESS;
+}
+
+#define RAND_ITERATIONS (10000)
+#define RAND_SET_MAX_SIZE (1000)
+
+static int
+test_test_set(void)
+{
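+	/*
+	 * Sizes are drawn from [1, 1000), so word-boundary cases
+	 * (e.g., 63, 64 and 65 bits) are covered with high
+	 * probability across the 10000 iterations.
+	 */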
+ size_t i; + + for (i = 0; i < RAND_ITERATIONS; i++) { + size_t size = 1 + rte_rand_max(RAND_SET_MAX_SIZE - 1); + + if (test_test_set_size(size) != TEST_SUCCESS) + return TEST_FAILED; + } + + return TEST_SUCCESS; +} + +static ssize_t +find(const bool *ary, size_t num_bools, size_t start, size_t len, bool set) +{ + size_t i; + + for (i = 0; i < len; i++) { + ssize_t idx = (start + i) % num_bools; + + if (ary[idx] == set) + return idx; + } + + return -1; +} + +static ssize_t +find_set(const bool *ary, size_t num_bools, size_t start, size_t len) +{ + return find(ary, num_bools, start, len, true); +} + +static ssize_t +find_clear(const bool *ary, size_t num_bools, size_t start, size_t len) +{ + return find(ary, num_bools, start, len, false); +} + +#define FFS_ITERATIONS (100) + +static int +test_find_size(size_t size, bool set) +{ + uint64_t *bitset; + bool reference[size]; + size_t i; + + bitset = alloc_bitset(size); + + if (bitset == NULL) + return TEST_FAILED; + + rte_bitset_init(bitset, size); + + for (i = 0; i < size; i++) { + bool bit = rand_bool(); + reference[i] = bit; + + if (bit) + rte_bitset_set(bitset, i); + else /* redundant, still useful for testing */ + rte_bitset_clear(bitset, i); + } + + for (i = 0; i < FFS_ITERATIONS; i++) { + size_t start_bit = rte_rand_max(size); + size_t len = rte_rand_max(size + 1); + bool full_range = len == size && start_bit == 0; + bool wraps = start_bit + len > size; + ssize_t rc; + + if (set) { + if (full_range && rand_bool()) + rc = rte_bitset_find_first_set(bitset, + size); + else if (wraps || rand_bool()) { + rc = rte_bitset_find_set_wrap(bitset, size, + start_bit, len); + + } else + rc = rte_bitset_find_set(bitset, size, + start_bit, len); + + if (rc != find_set(reference, size, start_bit, + len)) + return TEST_FAILED; + } else { + if (full_range && rand_bool()) + rc = rte_bitset_find_first_clear(bitset, + size); + else if (wraps || rand_bool()) + rc = rte_bitset_find_clear_wrap(bitset, + size, + start_bit, len); + else + rc = rte_bitset_find_clear(bitset, size, + start_bit, len); + + if (rc != find_clear(reference, size, start_bit, + len)) + return TEST_FAILED; + } + + } + + if (free_bitset(bitset, size) != TEST_SUCCESS) + return TEST_FAILED; + + return TEST_SUCCESS; +} + +static int +test_find_set_size(size_t size) +{ + return test_find_size(size, true); +} + +static int +test_find_clear_size(size_t size) +{ + return test_find_size(size, false); +} + +static int +test_find(void) +{ + size_t i; + + for (i = 0; i < RAND_ITERATIONS; i++) { + size_t size = 2 + rte_rand_max(RAND_SET_MAX_SIZE - 2); + + if (test_find_set_size(size) != TEST_SUCCESS) + return TEST_FAILED; + + if (test_find_clear_size(size) != TEST_SUCCESS) + return TEST_FAILED; + } + + return TEST_SUCCESS; +} + +static int +record_match(ssize_t match_idx, size_t size, int *calls) +{ + if (match_idx < 0 || (size_t)match_idx >= size) + return TEST_FAILED; + + calls[match_idx]++; + + return TEST_SUCCESS; +} + +static int +test_foreach_size(ssize_t size, bool may_wrap, bool set) +{ + bool reference[size]; + int calls[size]; + uint64_t *bitset; + ssize_t i; + ssize_t start_bit; + ssize_t len; + bool full_range; + size_t total_calls = 0; + + rand_bool_ary(reference, size); + + bitset = alloc_bitset(size); + + if (bitset == NULL) + return TEST_FAILED; + + memset(calls, 0, sizeof(calls)); + + start_bit = rte_rand_max(size); + len = may_wrap ? 
rte_rand_max(size + 1) : + rte_rand_max(size - start_bit + 1); + + rte_bitset_init(bitset, size); + + /* random data in the unused bits should not matter */ + rand_buf(bitset, RTE_BITSET_SIZE(size)); + + for (i = start_bit; i < start_bit + len; i++) { + size_t idx = i % size; + + if (reference[idx]) + rte_bitset_set(bitset, idx); + else + rte_bitset_clear(bitset, idx); + + if (rte_bitset_test(bitset, idx) != reference[idx]) + return TEST_FAILED; + } + + full_range = (len == size && start_bit == 0); + + /* XXX: verify iteration order as well */ + if (set) { + if (full_range && rand_bool()) { + RTE_BITSET_FOREACH_SET(i, bitset, size) { + if (record_match(i, size, calls) != + TEST_SUCCESS) + return TEST_FAILED; + } + } else if (may_wrap) { + RTE_BITSET_FOREACH_SET_WRAP(i, bitset, size, + start_bit, len) { + if (record_match(i, size, calls) != + TEST_SUCCESS) { + printf("failed\n"); + return TEST_FAILED; + } + } + } else { + RTE_BITSET_FOREACH_SET_RANGE(i, bitset, size, + start_bit, len) { + if (record_match(i, size, calls) != + TEST_SUCCESS) + return TEST_FAILED; + } + } + } else { + if (full_range && rand_bool()) { + RTE_BITSET_FOREACH_CLEAR(i, bitset, size) + if (record_match(i, size, calls) != + TEST_SUCCESS) + return TEST_FAILED; + } else if (may_wrap) { + RTE_BITSET_FOREACH_CLEAR_WRAP(i, bitset, size, + start_bit, len) { + if (record_match(i, size, calls) != + TEST_SUCCESS) + return TEST_FAILED; + } + } else { + RTE_BITSET_FOREACH_CLEAR_RANGE(i, bitset, size, + start_bit, len) + if (record_match(i, size, calls) != + TEST_SUCCESS) + return TEST_FAILED; + } + } + + for (i = 0; i < len; i++) { + size_t idx = (start_bit + i) % size; + + if (reference[idx] == set && calls[idx] != 1) { + printf("bit %zd shouldn't have been found %d " + "times\n", idx, calls[idx]); + return TEST_FAILED; + } + + if (reference[idx] != set && calls[idx] != 0) { + puts("bar"); + return TEST_FAILED; + } + + total_calls += calls[idx]; + } + + if (full_range) { + size_t count; + + count = set ? 
rte_bitset_count_set(bitset, size) : + rte_bitset_count_clear(bitset, size); + + if (count != total_calls) + return TEST_FAILED; + } + + if (free_bitset(bitset, size) != TEST_SUCCESS) + return TEST_FAILED; + + return TEST_SUCCESS; +} + +static int +test_foreach(void) +{ + size_t i; + + for (i = 0; i < RAND_ITERATIONS; i++) { + size_t size = 1 + rte_rand_max(RAND_SET_MAX_SIZE - 1); + + if (test_foreach_size(size, false, true) != TEST_SUCCESS) + return TEST_FAILED; + + if (test_foreach_size(size, false, false) != TEST_SUCCESS) + return TEST_FAILED; + + if (test_foreach_size(size, true, true) != TEST_SUCCESS) + return TEST_FAILED; + + if (test_foreach_size(size, true, false) != TEST_SUCCESS) + return TEST_FAILED; + } + + return TEST_SUCCESS; +} + +static int +test_count_size(size_t size) +{ + uint64_t *bitset; + + bitset = alloc_bitset(size); + + if (bitset == NULL) + return TEST_FAILED; + + rte_bitset_init(bitset, size); + + if (rte_bitset_count_set(bitset, size) != 0) + return TEST_FAILED; + + if (rte_bitset_count_clear(bitset, size) != size) + return TEST_FAILED; + + rte_bitset_set_all(bitset, size); + + if (rte_bitset_count_set(bitset, size) != size) + return TEST_FAILED; + + if (rte_bitset_count_clear(bitset, size) != 0) + return TEST_FAILED; + + rte_bitset_clear_all(bitset, size); + + if (rte_bitset_count_set(bitset, size) != 0) + return TEST_FAILED; + + if (rte_bitset_count_clear(bitset, size) != size) + return TEST_FAILED; + + rte_bitset_set(bitset, rte_rand_max(size)); + + if (rte_bitset_count_set(bitset, size) != 1) + return TEST_FAILED; + + if (rte_bitset_count_clear(bitset, size) != (size - 1)) + return TEST_FAILED; + + rte_bitset_clear_all(bitset, size); + if (rte_bitset_count_set(bitset, size) != 0) + return TEST_FAILED; + if (rte_bitset_count_clear(bitset, size) != size) + return TEST_FAILED; + + rte_bitset_set_all(bitset, size); + if (rte_bitset_count_set(bitset, size) != size) + return TEST_FAILED; + if (rte_bitset_count_clear(bitset, size) != 0) + return TEST_FAILED; + + if (free_bitset(bitset, size) != TEST_SUCCESS) + return TEST_FAILED; + + return TEST_SUCCESS; +} + +static int +test_count(void) +{ + size_t i; + + if (test_count_size(128) != TEST_SUCCESS) + return TEST_FAILED; + if (test_count_size(1) != TEST_SUCCESS) + return TEST_FAILED; + if (test_count_size(63) != TEST_SUCCESS) + return TEST_FAILED; + if (test_count_size(64) != TEST_SUCCESS) + return TEST_FAILED; + if (test_count_size(65) != TEST_SUCCESS) + return TEST_FAILED; + + for (i = 0; i < RAND_ITERATIONS; i++) { + size_t size = 1 + rte_rand_max(RAND_SET_MAX_SIZE - 1); + + if (test_count_size(size) != TEST_SUCCESS) + return TEST_FAILED; + } + + return TEST_SUCCESS; +} + +#define GEN_DECLARE(size) \ + { \ + RTE_BITSET_DECLARE(bitset, size); \ + size_t idx; \ + \ + idx = rte_rand_max(size); \ + rte_bitset_init(bitset, size); \ + \ + rte_bitset_set(bitset, idx); \ + if (!rte_bitset_test(bitset, idx)) \ + return TEST_FAILED; \ + if (rte_bitset_count_set(bitset, size) != 1) \ + return TEST_FAILED; \ + return TEST_SUCCESS; \ + } + +static int +test_define(void) +{ + GEN_DECLARE(1); + GEN_DECLARE(64); + GEN_DECLARE(65); + GEN_DECLARE(4097); +} + +static int +test_equal(void) +{ + const size_t size = 100; + RTE_BITSET_DECLARE(bitset_a, size); + RTE_BITSET_DECLARE(bitset_b, size); + + rand_buf(bitset_a, RTE_BITSET_SIZE(size)); + rand_buf(bitset_b, RTE_BITSET_SIZE(size)); + + rte_bitset_init(bitset_a, size); + rte_bitset_init(bitset_b, size); + + rte_bitset_set(bitset_a, 9); + rte_bitset_set(bitset_b, 9); + 
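+	/* bit 9 lies in the first 64-bit word; bit 90 in the second */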
rte_bitset_set(bitset_a, 90); + rte_bitset_set(bitset_b, 90); + + if (!rte_bitset_equal(bitset_a, bitset_b, size)) + return TEST_FAILED; + + /* set unused bit, which should be ignored */ + rte_bitset_set(&bitset_a[1], 60); + + if (!rte_bitset_equal(bitset_a, bitset_b, size)) + return TEST_FAILED; + + return TEST_SUCCESS; +} + +static int +test_copy(void) +{ + const size_t size = 100; + RTE_BITSET_DECLARE(bitset_a, size); + RTE_BITSET_DECLARE(bitset_b, size); + + rand_buf(bitset_a, RTE_BITSET_SIZE(size)); + rand_buf(bitset_b, RTE_BITSET_SIZE(size)); + + if (rte_bitset_equal(bitset_a, bitset_b, size)) + return TEST_FAILED; + + rte_bitset_copy(bitset_a, bitset_b, size); + + if (!rte_bitset_equal(bitset_a, bitset_b, size)) + return TEST_FAILED; + + return TEST_SUCCESS; +} + +static int +test_to_str(void) +{ + char buf[1024]; + RTE_BITSET_DECLARE(bitset, 128); + + rte_bitset_init(bitset, 128); + rte_bitset_set(bitset, 1); + + if (rte_bitset_to_str(bitset, 2, buf, 3) != 3) + return TEST_FAILED; + if (strcmp(buf, "10") != 0) + return TEST_FAILED; + + rte_bitset_set(bitset, 0); + + if (rte_bitset_to_str(bitset, 1, buf, sizeof(buf)) != 2) + return TEST_FAILED; + if (strcmp(buf, "1") != 0) + return TEST_FAILED; + + rte_bitset_init(bitset, 99); + rte_bitset_set(bitset, 98); + + if (rte_bitset_to_str(bitset, 99, buf, sizeof(buf)) != 100) + return TEST_FAILED; + + if (buf[0] != '1' || strchr(&buf[1], '1') != NULL) + return TEST_FAILED; + + if (rte_bitset_to_str(bitset, 128, buf, 64) != -EINVAL) + return TEST_FAILED; + + return TEST_SUCCESS; +} + +static int +test_bitset(void) +{ + if (test_test_set() != TEST_SUCCESS) + return TEST_FAILED; + + if (test_find() != TEST_SUCCESS) + return TEST_FAILED; + + if (test_foreach() != TEST_SUCCESS) + return TEST_FAILED; + + if (test_count() != TEST_SUCCESS) + return TEST_FAILED; + + if (test_define() != TEST_SUCCESS) + return TEST_FAILED; + + if (test_equal() != TEST_SUCCESS) + return TEST_FAILED; + + if (test_copy() != TEST_SUCCESS) + return TEST_FAILED; + + if (test_to_str() != TEST_SUCCESS) + return TEST_FAILED; + + return TEST_SUCCESS; +} + +REGISTER_TEST_COMMAND(bitset_autotest, test_bitset); diff --git a/lib/eal/common/meson.build b/lib/eal/common/meson.build index 917758cc65..687ae51d87 100644 --- a/lib/eal/common/meson.build +++ b/lib/eal/common/meson.build @@ -32,6 +32,7 @@ sources += files( 'eal_common_uuid.c', 'malloc_elem.c', 'malloc_heap.c', + 'rte_bitset.c', 'rte_malloc.c', 'rte_random.c', 'rte_reciprocal.c', diff --git a/lib/eal/common/rte_bitset.c b/lib/eal/common/rte_bitset.c new file mode 100644 index 0000000000..35e55a64db --- /dev/null +++ b/lib/eal/common/rte_bitset.c @@ -0,0 +1,29 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2023 Ericsson AB + */ + +#include + +#include "rte_bitset.h" + +ssize_t +rte_bitset_to_str(const uint64_t *bitset, size_t num_bits, char *buf, + size_t capacity) +{ + size_t i; + + if (capacity < (num_bits + 1)) + return -EINVAL; + + for (i = 0; i < num_bits; i++) { + bool value; + + value = rte_bitset_test(bitset, num_bits - 1 - i); + + buf[i] = value ? 
'1' : '0'; + } + + buf[num_bits] = '\0'; + + return num_bits + 1; +} diff --git a/lib/eal/include/meson.build b/lib/eal/include/meson.build index b0db9b3b3a..fa3cb884e9 100644 --- a/lib/eal/include/meson.build +++ b/lib/eal/include/meson.build @@ -5,6 +5,7 @@ includes += include_directories('.') headers += files( 'rte_alarm.h', + 'rte_bitset.h', 'rte_bitmap.h', 'rte_bitops.h', 'rte_branch_prediction.h', diff --git a/lib/eal/include/rte_bitset.h b/lib/eal/include/rte_bitset.h new file mode 100644 index 0000000000..e333e527e5 --- /dev/null +++ b/lib/eal/include/rte_bitset.h @@ -0,0 +1,878 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2023 Ericsson AB + */ + +#ifndef _RTE_BITSET_H_ +#define _RTE_BITSET_H_ + +/** + * @file + * RTE Bitset + * + * This file provides functions and macros for querying and + * manipulating sets of bits kept in arrays of @c uint64_t-sized + * elements. + * + * The bits in a bitset are numbered from 0 to @c size - 1, with the + * lowest index being the least significant bit. + * + * The bitset array must be properly aligned. + * + * For optimal performance, the @c size parameter, required by + * many of the API's functions, should be a compile-time constant. + * + * For large bitsets, the rte_bitmap.h API may be more appropriate. + * + * @warning + * All functions modifying a bitset may overwrite any unused bits of + * the last word. Such unused bits are ignored by all functions reading + * bits. + * + */ + +#include +#include +#include +#include + +#include +#include +#include +#include + +#ifdef __cplusplus +extern "C" { +#endif + +/** + * The size (in bytes) of each element in the array used to represent + * a bitset. + */ +#define RTE_BITSET_WORD_SIZE (sizeof(uint64_t)) + +/** + * The size (in bits) of each element in the array used to represent + * a bitset. + */ +#define RTE_BITSET_WORD_BITS (RTE_BITSET_WORD_SIZE * CHAR_BIT) + +/** + * Computes the number of words required to store @c size bits. + */ +#define RTE_BITSET_NUM_WORDS(size) \ + ((size + RTE_BITSET_WORD_BITS - 1) / RTE_BITSET_WORD_BITS) + +/** + * Computes the amount of memory (in bytes) required to fit a bitset + * holding @c size bits. + */ +#define RTE_BITSET_SIZE(size) \ + ((size_t)(RTE_BITSET_NUM_WORDS(size) * RTE_BITSET_WORD_SIZE)) + +#define __RTE_BITSET_WORD_IDX(bit_num) ((bit_num) / RTE_BITSET_WORD_BITS) +#define __RTE_BITSET_BIT_OFFSET(bit_num) ((bit_num) % RTE_BITSET_WORD_BITS) +#define __RTE_BITSET_UNUSED(size) \ + ((RTE_BITSET_NUM_WORDS(size) * RTE_BITSET_WORD_BITS) \ + - (size)) + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice. + * + * Declare a bitset. + * + * Declare (e.g., as a struct field) or define (e.g., as a stack + * variable) a bitset of the specified size. + * + * @param size + * The number of bits the bitset must be able to represent. Must be + * a compile-time constant. + * @param name + * The field or variable name of the resulting definition. + */ +#define RTE_BITSET_DECLARE(name, size) \ + uint64_t name[RTE_BITSET_NUM_WORDS(size)] + +/* XXX: should one include flags here and use to avoid a comparison? */ +/* XXX: would this be better off as a function? */ + +#define __RTE_BITSET_FOREACH_LEFT(var, size, start_bit, len) \ + ((len) - 1 - ((var) >= (start_bit) ? 
(var) - (start_bit) :	\
+	 (size) - (start_bit) + (var)))
+
+#define __RTE_BITSET_FOREACH(var, bitset, size, start_bit, len, flags)	\
+	for ((var) = __rte_bitset_find(bitset, size, start_bit, len,	\
+				       flags);				\
+	     (var) != -1;						\
+	     (var) = __RTE_BITSET_FOREACH_LEFT(var, size, start_bit,	\
+					       len) > 0 ?		\
+		     __rte_bitset_find(bitset, size,			\
+				       ((var) + 1) % (size),		\
+				       __RTE_BITSET_FOREACH_LEFT(var,	\
+								 size,	\
+								 start_bit, \
+								 len),	\
+				       flags) : -1)
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Iterate over all bits set.
+ *
+ * This macro iterates over all bits set (i.e., all ones) in the
+ * bitset, in the forward direction (i.e., starting with the least
+ * significant '1').
+ *
+ * @param var
+ *   An iterator variable of type @c ssize_t. For each successive
+ *   iteration, this variable will hold the bit index of a set bit.
+ * @param bitset
+ *   A const uint64_t * pointer to the bitset array.
+ * @param size
+ *   The size of the bitset (in bits).
+ */
+
+#define RTE_BITSET_FOREACH_SET(var, bitset, size)			\
+	__RTE_BITSET_FOREACH(var, bitset, size, 0, size, 0)
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Iterate over all bits cleared.
+ *
+ * This macro iterates over all bits cleared in the bitset, in the
+ * forward direction (i.e., starting with the least significant '0').
+ *
+ * @param var
+ *   An iterator variable of type @c ssize_t. For each successive
+ *   iteration, this variable will hold the bit index of a cleared bit.
+ * @param bitset
+ *   A const uint64_t * pointer to the bitset array.
+ * @param size
+ *   The size of the bitset (in bits).
+ */
+
+#define RTE_BITSET_FOREACH_CLEAR(var, bitset, size)			\
+	__RTE_BITSET_FOREACH(var, bitset, size, 0, size,		\
+			     __RTE_BITSET_FIND_FLAG_FIND_CLEAR)
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Iterate over all bits set within a range.
+ *
+ * This macro iterates over all bits set (i.e., all ones) in the
+ * specified range, in the forward direction (i.e., starting with the
+ * least significant '1').
+ *
+ * @param var
+ *   An iterator variable of type @c ssize_t. For each successive
+ *   iteration, this variable will hold the bit index of a set bit.
+ * @param bitset
+ *   A const uint64_t * pointer to the bitset array.
+ * @param size
+ *   The size of the bitset (in bits).
+ * @param start_bit
+ *   The index of the first bit to check. Must be less than @c size.
+ * @param len
+ *   The length (in bits) of the range. @c start_bit + @c len must be
+ *   less than or equal to @c size.
+ */
+
+#define RTE_BITSET_FOREACH_SET_RANGE(var, bitset, size, start_bit,	\
+				     len)				\
+	__RTE_BITSET_FOREACH(var, bitset, size, start_bit, len, 0)
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Iterate over all cleared bits within a range.
+ *
+ * This macro iterates over all bits cleared (i.e., all zeroes) in the
+ * specified range, in the forward direction (i.e., starting with the
+ * least significant '0').
+ *
+ * @param var
+ *   An iterator variable of type @c ssize_t. For each successive
+ *   iteration, this variable will hold the bit index of a cleared bit.
+ * @param bitset
+ *   A const uint64_t * pointer to the bitset array.
+ * @param size
+ *   The size of the bitset (in bits).
+ * @param start_bit
+ *   The index of the first bit to check. Must be less than @c size.
+ * @param len
+ *   The length (in bits) of the range. @c start_bit + @c len must be
+ *   less than or equal to @c size.
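+ *
+ * An illustrative sketch (hypothetical application names):
+ *
+ * @code
+ * RTE_BITSET_DECLARE(slots, 128);
+ * ssize_t idx;
+ *
+ * rte_bitset_init(slots, 128);
+ * rte_bitset_set(slots, 0);
+ *
+ * // visits every '0' bit in the range [1, 65)
+ * RTE_BITSET_FOREACH_CLEAR_RANGE(idx, slots, 128, 1, 64)
+ *         printf("bit %zd is clear\n", idx);
+ * @endcode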
+ */ + +#define RTE_BITSET_FOREACH_CLEAR_RANGE(var, bitset, size, start_bit, \ + len) \ + __RTE_BITSET_FOREACH(var, bitset, size, start_bit, len, \ + __RTE_BITSET_FIND_FLAG_FIND_CLEAR) + +#define RTE_BITSET_FOREACH_SET_WRAP(var, bitset, size, start_bit, \ + len) \ + __RTE_BITSET_FOREACH(var, bitset, size, start_bit, len, \ + __RTE_BITSET_FIND_FLAG_WRAP) + +#define RTE_BITSET_FOREACH_CLEAR_WRAP(var, bitset, size, start_bit, \ + len) \ + __RTE_BITSET_FOREACH(var, bitset, size, start_bit, len, \ + __RTE_BITSET_FIND_FLAG_WRAP | \ + __RTE_BITSET_FIND_FLAG_FIND_CLEAR) + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice. + * + * Initializes a bitset. + * + * All bits are cleared. + * + * @param bitset + * A pointer to the array of bitset 64-bit words. + * @param size + * The size of the bitset (in bits). + */ + +__rte_experimental +static inline void +rte_bitset_init(uint64_t *bitset, size_t size) +{ + memset(bitset, 0, RTE_BITSET_SIZE(size)); +} + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice. + * + * Set a bit in the bitset. + * + * Bits are numbered from 0 to (size - 1) (inclusive). + * + * @param bitset + * A pointer to the array words making up the bitset. + * @param bit_num + * The index of the bit to be set. + */ + +__rte_experimental +static inline void +rte_bitset_set(uint64_t *bitset, size_t bit_num) +{ + size_t word; + size_t offset; + uint64_t mask; + + word = __RTE_BITSET_WORD_IDX(bit_num); + offset = __RTE_BITSET_BIT_OFFSET(bit_num); + mask = UINT64_C(1) << offset; + + bitset[word] |= mask; +} + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice. + * + * Clear a bit in the bitset. + * + * Bits are numbered 0 to (size - 1) (inclusive). + * + * @param bitset + * A pointer to the array words making up the bitset. + * @param bit_num + * The index of the bit to be cleared. + */ + +__rte_experimental +static inline void +rte_bitset_clear(uint64_t *bitset, size_t bit_num) +{ + size_t word; + size_t offset; + uint64_t mask; + + word = __RTE_BITSET_WORD_IDX(bit_num); + offset = __RTE_BITSET_BIT_OFFSET(bit_num); + mask = ~(UINT64_C(1) << offset); + + bitset[word] &= mask; +} + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice. + * + * Set all bits in the bitset. + * + * @param bitset + * A pointer to the array of words making up the bitset. + * @param size + * The size of the bitset (in bits). + */ + +__rte_experimental +static inline void +rte_bitset_set_all(uint64_t *bitset, size_t size) +{ + memset(bitset, 0xFF, RTE_BITSET_SIZE(size)); +} + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice. + * + * Clear all bits in the bitset. + * + * @param bitset + * A pointer to the array of words making up the bitset. + * @param size + * The size of the bitset (in bits). + */ + +__rte_experimental +static inline void +rte_bitset_clear_all(uint64_t *bitset, size_t size) +{ + rte_bitset_init(bitset, size); +} + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice. + * + * Count all set bits. + * + * @param bitset + * A pointer to the array of words making up the bitset. + * @param size + * The size of the bitset (in bits). + * @return + * Returns the number of '1' bits in the bitset. 
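+ *
+ * A minimal sketch (hypothetical application names):
+ *
+ * @code
+ * RTE_BITSET_DECLARE(flags, 100);
+ *
+ * rte_bitset_init(flags, 100);
+ * rte_bitset_set(flags, 10);
+ * rte_bitset_set(flags, 99);
+ *
+ * // rte_bitset_count_set(flags, 100) == 2
+ * // rte_bitset_count_clear(flags, 100) == 98
+ * @endcode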
+ */ + +__rte_experimental +static inline size_t +rte_bitset_count_set(const uint64_t *bitset, size_t size) +{ + size_t i; + size_t total = 0; + uint64_t unused_mask; + + /* + * Unused bits in a rte_bitset are always '0', and thus are + * not included in this count. + */ + for (i = 0; i < RTE_BITSET_NUM_WORDS(size) - 1; i++) + total += __builtin_popcountll(bitset[i]); + + unused_mask = UINT64_MAX >> __RTE_BITSET_UNUSED(size); + total += __builtin_popcountll(bitset[i] & unused_mask); + + return total; +} + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice. + * + * Count all cleared bits. + * + * @param bitset + * A pointer to the array of words making up the bitset. + * @param size + * The size of the bitset (in bits). + * @return + * Returns the number of '0' bits in the bitset. + */ + +__rte_experimental +static inline size_t +rte_bitset_count_clear(const uint64_t *bitset, size_t size) +{ + return size - rte_bitset_count_set(bitset, size); +} + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice. + * + * Test if a bit is set. + * + * @param bitset + * A pointer to the array of words making up the bitset. + * @param bit_num + * Index of the bit to test. Index 0 is the least significant bit. + * @return + * Returns true if the bit is '1', and false if the bit is '0'. + */ + +__rte_experimental +static inline bool +rte_bitset_test(const uint64_t *bitset, size_t bit_num) +{ + size_t word; + size_t offset; + + word = __RTE_BITSET_WORD_IDX(bit_num); + offset = __RTE_BITSET_BIT_OFFSET(bit_num); + + return (bitset[word] >> offset) & 1; +} + +#define __RTE_BITSET_FIND_FLAG_FIND_CLEAR (1U << 0) +#define __RTE_BITSET_FIND_FLAG_WRAP (1U << 1) + +__rte_experimental +static inline ssize_t +__rte_bitset_find_nowrap(const uint64_t *bitset, size_t __rte_unused size, + size_t start_bit, size_t len, bool find_clear) +{ + size_t word_idx; + size_t offset; + size_t end_bit = start_bit + len; + + RTE_ASSERT(end_bit <= size); + + if (unlikely(len == 0)) + return -1; + + word_idx = __RTE_BITSET_WORD_IDX(start_bit); + offset = __RTE_BITSET_BIT_OFFSET(start_bit); + + while (word_idx <= __RTE_BITSET_WORD_IDX(end_bit - 1)) { + uint64_t word; + int word_ffs; + + word = bitset[word_idx]; + if (find_clear) + word = ~word; + + word >>= offset; + + word_ffs = __builtin_ffsll(word); + + if (word_ffs != 0) { + ssize_t ffs = start_bit + word_ffs - 1; + + /* + * Check if set bit were among the last, + * unused bits, in the last word. + */ + if (unlikely(ffs >= (ssize_t)end_bit)) + return -1; + + return ffs; + } + + start_bit += (RTE_BITSET_WORD_BITS - offset); + word_idx++; + offset = 0; + } + + return -1; + +} + +__rte_experimental +static inline ssize_t +__rte_bitset_find(const uint64_t *bitset, size_t size, size_t start_bit, + size_t len, unsigned int flags) +{ + bool find_clear = flags & __RTE_BITSET_FIND_FLAG_FIND_CLEAR; + bool may_wrap = flags & __RTE_BITSET_FIND_FLAG_WRAP; + bool does_wrap = (start_bit + len) > size; + ssize_t rc; + + RTE_ASSERT(len <= size); + if (!may_wrap) + RTE_ASSERT(!does_wrap); + + if (may_wrap && does_wrap) { + size_t len0 = size - start_bit; + size_t len1 = len - len0; + + rc = __rte_bitset_find_nowrap(bitset, size, start_bit, len0, + find_clear); + if (rc < 0) + rc = __rte_bitset_find_nowrap(bitset, size, + 0, len1, find_clear); + } else + rc = __rte_bitset_find_nowrap(bitset, size, start_bit, + len, find_clear); + + return rc; +} + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice. 
+ * + * Find first bit set. + * + * Scans the bitset in the forward direction (i.e., starting at the + * least significant bit), and returns the index of the first '1'. + * + * @param bitset + * A pointer to the array of words making up the bitset. + * @param size + * The size of the bitset (in bits). + * @return + * Returns the index of the least significant '1', or -1 if all + * bits are '0'. + */ + +__rte_experimental +static inline ssize_t +rte_bitset_find_first_set(const uint64_t *bitset, size_t size) +{ + return __rte_bitset_find(bitset, size, 0, size, 0); +} + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice. + * + * Find first bit set at offset. + * + * Scans the bitset in the forward direction (i.e., starting at the + * least significant bit), starting at an offset @c start_bit into the + * bitset, and returns the index of the first '1' encountered. + * + * @param bitset + * A pointer to the array of words making up the bitset. + * @param size + * The size of the bitset (in bits). + * @param start_bit + * The index of the first bit to check. Must be less than @c size. + * @param len + * The number of bits to scan. @c start_bit + @c len must be less + * than or equal to @c size. + * @return + * Returns the index of the least significant '1', or -1 if all + * bits are '0'. + */ + +__rte_experimental +static inline ssize_t +rte_bitset_find_set(const uint64_t *bitset, size_t size, + size_t start_bit, size_t len) +{ + return __rte_bitset_find(bitset, size, start_bit, len, 0); +} + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice. + * + * Find first bit set at offset, with wrap-around. + * + * Scans the bitset in the forward direction (i.e., starting at the + * least significant bit), starting at an offset @c start_bit into the + * bitset. If no '1' is encountered before the end of the bitset, the search + * will continue at index 0. + * + * @param bitset + * A pointer to the array of words making up the bitset. + * @param size + * The size of the bitset (in bits). + * @param start_bit + * The index of the first bit to check. Must be less than @c size. + * @param len + * The number of bits to scan. @c start_bit + @c len must be less + * than or equal to @c size. + * @return + * Returns the index of the least significant '1', or -1 if all + * bits are '0'. + */ + +__rte_experimental +static inline ssize_t +rte_bitset_find_set_wrap(const uint64_t *bitset, size_t size, + size_t start_bit, size_t len) +{ + return __rte_bitset_find(bitset, size, start_bit, len, + __RTE_BITSET_FIND_FLAG_WRAP); +} + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice. + * + * Find first cleared bit. + * + * Scans the bitset in the forward direction (i.e., starting at the + * least significant bit), and returns the index of the first '0'. + * + * @param bitset + * A pointer to the array of words making up the bitset. + * @param size + * The size of the bitset (in bits). + * @return + * Returns the index of the least significant '0', or -1 if all + * bits are '1'. + */ + +__rte_experimental +static inline ssize_t +rte_bitset_find_first_clear(const uint64_t *bitset, size_t size) +{ + return __rte_bitset_find(bitset, size, 0, size, + __RTE_BITSET_FIND_FLAG_FIND_CLEAR); +} + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice. + * + * Find first cleared bit at offset. 
+ * + * Scans the bitset in the forward direction (i.e., starting at the + * least significant bit), starting at an offset @c start_bit into the + * bitset, and returns the index of the first '0' encountered. + * + * @param bitset + * A pointer to the array of words making up the bitset. + * @param size + * The size of the bitset (in bits). + * @param start_bit + * The index of the first bit to check. Must be less than @c size. + * @param len + * The number of bits to scan. @c start_bit + @c len must be less + * than or equal to @c size. + * @return + * Returns the index of the least significant '0', or -1 if all + * bits are '1'. + */ + +__rte_experimental +static inline ssize_t +rte_bitset_find_clear(const uint64_t *bitset, size_t size, + size_t start_bit, size_t len) +{ + return __rte_bitset_find(bitset, size, start_bit, len, + __RTE_BITSET_FIND_FLAG_FIND_CLEAR); +} + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice. + * + * Find first cleared bit at offset, with wrap-around. + * + * Scans the bitset in the forward direction (i.e., starting at the + * least significant bit), starting at an offset @c start_bit into the + * bitset. If no '0' is encountered before the end of the bitset, the + * search will continue at index 0. + * + * @param bitset + * A pointer to the array of words making up the bitset. + * @param size + * The size of the bitset (in bits). + * @param start_bit + * The index of the first bit to check. Must be less than @c size. + * @param len + * The number of bits to scan. @c start_bit + @c len must be less + * than or equal to @c size. + * @return + * Returns the index of the least significant '0', or -1 if all + * bits are '1'. + */ + +__rte_experimental +static inline ssize_t +rte_bitset_find_clear_wrap(const uint64_t *bitset, size_t size, + size_t start_bit, size_t len) +{ + return __rte_bitset_find(bitset, size, start_bit, len, + __RTE_BITSET_FIND_FLAG_FIND_CLEAR | + __RTE_BITSET_FIND_FLAG_WRAP); +} + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice. + * + * Copy bitset. + * + * Copy the bits of the @c src_bitset to the @c dst_bitset. + * + * The bitsets may not overlap and must be of equal size. + * + * @param dst_bitset + * A pointer to the array of words making up the bitset. + * @param src_bitset + * A pointer to the array of words making up the bitset. + * @param size + * The size of the bitsets (in bits). + */ + +__rte_experimental +static inline void +rte_bitset_copy(uint64_t *__rte_restrict dst_bitset, + const uint64_t *__rte_restrict src_bitset, + size_t size) +{ + rte_memcpy(dst_bitset, src_bitset, RTE_BITSET_SIZE(size)); +} + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice. + * + * Bitwise or two bitsets. + * + * Perform a bitwise OR operation on all bits in the two equal-size + * bitsets @c dst_bitset and @c src_bitset, and store the results in + * @c dst_bitset. + * + * @param dst_bitset + * A pointer to the destination bitset. + * @param src_bitset + * A pointer to the source bitset. + * @param size + * The size of the bitsets (in bits). + */ + +__rte_experimental +static inline void +rte_bitset_or(uint64_t *dst_bitset, const uint64_t *src_bitset, size_t size) +{ + size_t i; + + for (i = 0; i < RTE_BITSET_NUM_WORDS(size); i++) + dst_bitset[i] |= src_bitset[i]; +} + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice. + * + * Bitwise and two bitsets. 
+ *
+ * Perform a bitwise AND operation on all bits in the two equal-size
+ * bitsets @c dst_bitset and @c src_bitset, and store the results in
+ * @c dst_bitset.
+ *
+ * @param dst_bitset
+ *   A pointer to the destination bitset.
+ * @param src_bitset
+ *   A pointer to the source bitset.
+ * @param size
+ *   The size of the bitsets (in bits).
+ */
+
+__rte_experimental
+static inline void
+rte_bitset_and(uint64_t *dst_bitset, const uint64_t *src_bitset, size_t size)
+{
+	size_t i;
+
+	for (i = 0; i < RTE_BITSET_NUM_WORDS(size); i++)
+		dst_bitset[i] &= src_bitset[i];
+}
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Bitwise xor two bitsets.
+ *
+ * Perform a bitwise XOR operation on all bits in the two equal-size
+ * bitsets @c dst_bitset and @c src_bitset, and store the results in
+ * @c dst_bitset.
+ *
+ * @param dst_bitset
+ *   A pointer to the destination bitset.
+ * @param src_bitset
+ *   A pointer to the source bitset.
+ * @param size
+ *   The size of the bitsets (in bits).
+ */
+
+__rte_experimental
+static inline void
+rte_bitset_xor(uint64_t *__rte_restrict dst_bitset,
+	       const uint64_t *__rte_restrict src_bitset, size_t size)
+{
+	size_t i;
+
+	for (i = 0; i < RTE_BITSET_NUM_WORDS(size); i++)
+		dst_bitset[i] ^= src_bitset[i];
+}
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Compare two bitsets.
+ *
+ * Compare two bitsets for equality.
+ *
+ * @param bitset_a
+ *   A pointer to the first bitset to compare.
+ * @param bitset_b
+ *   A pointer to the second bitset to compare.
+ * @param size
+ *   The size of the bitsets (in bits).
+ * @return
+ *   Returns true if the bitsets are equal, and false otherwise.
+ */
+
+__rte_experimental
+static inline bool
+rte_bitset_equal(const uint64_t *bitset_a, const uint64_t *bitset_b,
+		 size_t size)
+{
+	size_t i;
+	uint64_t last_a, last_b;
+
+	for (i = 0; i < RTE_BITSET_NUM_WORDS(size) - 1; i++)
+		if (bitset_a[i] != bitset_b[i])
+			return false;
+
+	last_a = bitset_a[i] << __RTE_BITSET_UNUSED(size);
+	last_b = bitset_b[i] << __RTE_BITSET_UNUSED(size);
+
+	return last_a == last_b;
+}
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Converts a bitset to a string.
+ *
+ * This function prints a string representation of the bitset to the
+ * supplied buffer.
+ *
+ * Each bit is represented either by '0' or '1' in the output. The
+ * resulting string is NUL terminated.
+ *
+ * @param bitset
+ *   A pointer to the array of bitset 64-bit words.
+ * @param size
+ *   The number of bits the bitset represents.
+ * @param buf
+ *   A buffer to hold the output.
+ * @param capacity
+ *   The size of the buffer. Must be @c size + 1 or larger.
+ * @return
+ *   Returns the number of bytes written (i.e., @c size + 1), or
+ *   -EINVAL in case the buffer capacity was too small.
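+ *
+ * A usage sketch (hypothetical application names):
+ *
+ * @code
+ * RTE_BITSET_DECLARE(bits, 8);
+ * char buf[8 + 1];
+ *
+ * rte_bitset_init(bits, 8);
+ * rte_bitset_set(bits, 0);
+ *
+ * if (rte_bitset_to_str(bits, 8, buf, sizeof(buf)) > 0)
+ *         printf("%s\n", buf); // prints "00000001"
+ * @endcode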
+ */
+
+__rte_experimental
+ssize_t
+rte_bitset_to_str(const uint64_t *bitset, size_t size, char *buf,
+		  size_t capacity);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_BITSET_H_ */
diff --git a/lib/eal/version.map b/lib/eal/version.map
index 6d6978f108..9136a71c73 100644
--- a/lib/eal/version.map
+++ b/lib/eal/version.map
@@ -430,6 +430,9 @@ EXPERIMENTAL {
 	rte_thread_create_control;
 	rte_thread_set_name;
 	__rte_eal_trace_generic_blob;
+
+	# added in X.Y
+	rte_bitset_to_str;
 };

 INTERNAL {

From patchwork Tue Feb 28 09:39:16 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
X-Patchwork-Id: 124568
X-Patchwork-Delegate: thomas@monjalon.net
From: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
To: dev@dpdk.org
CC: Erik Gabriel Carrillo, David Marchand, Stefan Sundkvist, Mattias Rönnblom
Subject: [RFC 2/2] eal: add high-performance timer facility
Date: Tue, 28 Feb 2023 10:39:16 +0100
Message-ID: <20230228093916.87206-3-mattias.ronnblom@ericsson.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20230228093916.87206-1-mattias.ronnblom@ericsson.com>
References: <20230228093916.87206-1-mattias.ronnblom@ericsson.com>
List-Id: DPDK patches and discussions

The htimer library attempts to provide a timer facility with roughly
the same functionality as the DPDK timer library, but with less
overhead and better scalability. The htimer library employs per-lcore
hierarchical timer wheels and a message-based synchronization/MT-safety
scheme.

Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
---
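An illustrative sketch of the call pattern, distilled from the unit
tests below (argument semantics inferred from the test code; names are
hypothetical):

    /* each worker lcore drives its own per-lcore timer wheel */
    while (!__atomic_load_n(&stop, __ATOMIC_RELAXED))
        rte_htimer_mgr_manage();

    /* ...while another lcore asynchronously adds a timer to it */
    rc = rte_htimer_mgr_async_add(&timer, worker_lcore_id,
                                  expiration_time, 0,
                                  timeout_cb, &timeout_count, 0,
                                  async_add_cb, &async_count);
    if (rc == -EBUSY)
        rte_htimer_mgr_process();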
 app/test/meson.build             |   8 +-
 app/test/test_htimer_mgr.c       | 674 +++++++++++++++++++++++++++++++
 app/test/test_htimer_mgr_perf.c  | 324 +++++++++++++++
 app/test/test_htw.c              | 478 ++++++++++++++++++++++
 app/test/test_htw_perf.c         | 181 +++++++++
 doc/api/doxy-api-index.md        |   5 +-
 doc/api/doxy-api.conf.in         |   1 +
 lib/htimer/meson.build           |   7 +
 lib/htimer/rte_htimer.h          |  65 +++
 lib/htimer/rte_htimer_mgr.c      | 488 ++++++++++++++++++++++
 lib/htimer/rte_htimer_mgr.h      | 497 +++++++++++++++++++++++
 lib/htimer/rte_htimer_msg.h      |  44 ++
 lib/htimer/rte_htimer_msg_ring.c |  18 +
 lib/htimer/rte_htimer_msg_ring.h |  49 +++
 lib/htimer/rte_htw.c             | 437 ++++++++++++++++++++
 lib/htimer/rte_htw.h             |  49 +++
 lib/htimer/version.map           |  17 +
 lib/meson.build                  |   1 +
 18 files changed, 3341 insertions(+), 2 deletions(-)
 create mode 100644 app/test/test_htimer_mgr.c
 create mode 100644 app/test/test_htimer_mgr_perf.c
 create mode 100644 app/test/test_htw.c
 create mode 100644 app/test/test_htw_perf.c
 create mode 100644 lib/htimer/meson.build
 create mode 100644 lib/htimer/rte_htimer.h
 create mode 100644 lib/htimer/rte_htimer_mgr.c
 create mode 100644 lib/htimer/rte_htimer_mgr.h
 create mode 100644 lib/htimer/rte_htimer_msg.h
 create mode 100644 lib/htimer/rte_htimer_msg_ring.c
 create mode 100644 lib/htimer/rte_htimer_msg_ring.h
 create mode 100644 lib/htimer/rte_htw.c
 create mode 100644 lib/htimer/rte_htw.h
 create mode 100644 lib/htimer/version.map

diff --git a/app/test/meson.build b/app/test/meson.build
index 03811ff692..5a48775a60 100644
--- a/app/test/meson.build
+++ b/app/test/meson.build
@@ -143,6 +143,10 @@ test_sources = files(
         'test_timer_perf.c',
         'test_timer_racecond.c',
         'test_timer_secondary.c',
+        'test_htw.c',
+        'test_htw_perf.c',
+        'test_htimer_mgr_perf.c',
+        'test_htimer_mgr.c',
         'test_ticketlock.c',
         'test_trace.c',
         'test_trace_register.c',
@@ -165,7 
+169,6 @@ fast_tests = [ ['bpf_autotest', true, true], ['bpf_convert_autotest', true, true], ['bitops_autotest', true, true], - ['bitset_autotest', true, true], ['byteorder_autotest', true, true], ['cksum_autotest', true, true], ['cmdline_autotest', true, true], @@ -193,6 +196,7 @@ fast_tests = [ ['fib6_autotest', true, true], ['func_reentrancy_autotest', false, true], ['hash_autotest', true, true], + ['htimer_mgr_autotest', true, true], ['interrupt_autotest', true, true], ['ipfrag_autotest', false, true], ['lcores_autotest', true, true], @@ -265,6 +269,8 @@ perf_test_names = [ 'memcpy_perf_autotest', 'hash_perf_autotest', 'timer_perf_autotest', + 'htimer_mgr_perf_autotest', + 'htw_perf_autotest', 'reciprocal_division', 'reciprocal_division_perf', 'lpm_perf_autotest', diff --git a/app/test/test_htimer_mgr.c b/app/test/test_htimer_mgr.c new file mode 100644 index 0000000000..4d82a5e8b0 --- /dev/null +++ b/app/test/test_htimer_mgr.c @@ -0,0 +1,674 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2023 Ericsson AB + */ + +#include "test.h" + +#include +#include +#include + +#include +#include +#include +#include +#include +#include + +static int +timer_lcore(void *arg) +{ + bool *stop = arg; + + while (!__atomic_load_n(stop, __ATOMIC_RELAXED)) + rte_htimer_mgr_manage(); + + return 0; +} + +static void +count_timer_cb(struct rte_htimer *timer __rte_unused, void *arg) +{ + unsigned int *count = arg; + + __atomic_fetch_add(count, 1, __ATOMIC_RELAXED); +} + +static void +count_async_cb(struct rte_htimer *timer __rte_unused, int result, + void *cb_arg) +{ + unsigned int *count = cb_arg; + + if (result == RTE_HTIMER_MGR_ASYNC_RESULT_ADDED) + __atomic_fetch_add(count, 1, __ATOMIC_RELAXED); +} + +static uint64_t s_to_tsc(double s) +{ + return s * rte_get_tsc_hz(); +} + +#define ASYNC_ADD_TEST_EXPIRATION_TIME 0.25 /* s */ +#define ASYNC_TEST_TICK s_to_tsc(1e-6) + +static int +test_htimer_mgr_async_add(unsigned int num_timers_per_lcore) +{ + struct rte_htimer *timers; + unsigned int timer_idx; + unsigned int lcore_id; + bool stop = false; + unsigned int timeout_count = 0; + unsigned int async_count = 0; + unsigned int num_workers = 0; + uint64_t expiration_time; + unsigned int num_total_timers; + + rte_htimer_mgr_init(ASYNC_TEST_TICK); + + RTE_LCORE_FOREACH_WORKER(lcore_id) { + if (rte_eal_remote_launch(timer_lcore, &stop, lcore_id) != 0) + rte_panic("Unable to launch timer lcore\n"); + num_workers++; + } + + num_total_timers = num_workers * num_timers_per_lcore; + + timers = malloc(num_total_timers * sizeof(struct rte_htimer)); + timer_idx = 0; + + if (timers == NULL) + rte_panic("Unable to allocate heap memory\n"); + + expiration_time = rte_get_tsc_hz() * ASYNC_ADD_TEST_EXPIRATION_TIME; + + RTE_LCORE_FOREACH_WORKER(lcore_id) { + unsigned int i; + + for (i = 0; i < num_timers_per_lcore; i++) { + struct rte_htimer *timer = &timers[timer_idx++]; + + for (;;) { + int rc; + + rc = rte_htimer_mgr_async_add(timer, lcore_id, + expiration_time, + 0, + count_timer_cb, + &timeout_count, 0, + count_async_cb, + &async_count); + if (unlikely(rc == -EBUSY)) + rte_htimer_mgr_process(); + else + break; + } + } + } + + while (__atomic_load_n(&async_count, __ATOMIC_RELAXED) != + num_total_timers || + __atomic_load_n(&timeout_count, __ATOMIC_RELAXED) != + num_total_timers) + rte_htimer_mgr_manage(); + + __atomic_store_n(&stop, true, __ATOMIC_RELAXED); + + rte_eal_mp_wait_lcore(); + + rte_htimer_mgr_deinit(); + + free(timers); + + return TEST_SUCCESS; +} + +struct async_recorder_state { + bool timer_cb_run; + 
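+	/* set by record_async_add_cb() once the add has been confirmed */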
bool async_add_cb_run; + bool async_cancel_cb_run; + bool failed; +}; + +static void +record_async_add_cb(struct rte_htimer *timer __rte_unused, + int result, void *cb_arg) +{ + struct async_recorder_state *state = cb_arg; + + if (state->failed) + return; + + if (state->async_add_cb_run || + result != RTE_HTIMER_MGR_ASYNC_RESULT_ADDED) { + puts("async add run already"); + state->failed = true; + } + + state->async_add_cb_run = true; +} + +static void +record_async_cancel_cb(struct rte_htimer *timer __rte_unused, + int result, void *cb_arg) +{ + struct async_recorder_state *state = cb_arg; + + if (state->failed) + return; + + if (state->async_cancel_cb_run) { + state->failed = true; + return; + } + + switch (result) { + case RTE_HTIMER_MGR_ASYNC_RESULT_EXPIRED: + if (!state->timer_cb_run) + state->failed = true; + break; + case RTE_HTIMER_MGR_ASYNC_RESULT_CANCELED: + if (state->timer_cb_run) + state->failed = true; + break; + case RTE_HTIMER_MGR_ASYNC_RESULT_ALREADY_CANCELED: + state->failed = true; + } + + state->async_cancel_cb_run = true; +} + +static int +record_check_consistency(struct async_recorder_state *state) +{ + if (state->failed) + return -1; + + return state->async_cancel_cb_run ? 1 : 0; +} + +static int +records_check_consistency(struct async_recorder_state *states, + unsigned int num_states) +{ + unsigned int i; + int canceled = 0; + + for (i = 0; i < num_states; i++) { + int rc; + + rc = record_check_consistency(&states[i]); + + if (rc < 0) + return -1; + canceled += rc; + } + + return canceled; +} + +static void +log_timer_expiry_cb(struct rte_htimer *timer __rte_unused, + void *arg) +{ + bool *timer_run = arg; + + *timer_run = true; +} + + +#define ASYNC_ADD_CANCEL_TEST_EXPIRATION_TIME_MAX 10e-3 /* s */ + +static int +test_htimer_mgr_async_add_cancel(unsigned int num_timers_per_lcore) +{ + struct rte_htimer *timers; + struct async_recorder_state *recorder_states; + unsigned int timer_idx = 0; + unsigned int lcore_id; + uint64_t now; + unsigned int num_workers = 0; + bool stop = false; + uint64_t max_expiration_time = + s_to_tsc(ASYNC_ADD_CANCEL_TEST_EXPIRATION_TIME_MAX); + unsigned int num_total_timers; + int canceled = 0; + + rte_htimer_mgr_init(ASYNC_TEST_TICK); + + RTE_LCORE_FOREACH_WORKER(lcore_id) { + if (rte_eal_remote_launch(timer_lcore, &stop, lcore_id) != 0) + rte_panic("Unable to launch timer lcore\n"); + num_workers++; + } + + num_total_timers = num_workers * num_timers_per_lcore; + + timers = malloc(num_total_timers * sizeof(struct rte_htimer)); + recorder_states = + malloc(num_total_timers * sizeof(struct async_recorder_state)); + + if (timers == NULL || recorder_states == NULL) + rte_panic("Unable to allocate heap memory\n"); + + now = rte_get_tsc_cycles(); + + RTE_LCORE_FOREACH_WORKER(lcore_id) { + unsigned int i; + + for (i = 0; i < num_timers_per_lcore; i++) { + struct rte_htimer *timer = &timers[timer_idx]; + struct async_recorder_state *state = + &recorder_states[timer_idx]; + + timer_idx++; + + *state = (struct async_recorder_state) {}; + + uint64_t expiration_time = + now + rte_rand_max(max_expiration_time); + + for (;;) { + int rc; + + rc = rte_htimer_mgr_async_add(timer, lcore_id, + expiration_time, + 0, + log_timer_expiry_cb, + &state->timer_cb_run, + 0, + record_async_add_cb, + state); + + if (unlikely(rc == -EBUSY)) + rte_htimer_mgr_process(); + else + break; + } + } + } + + timer_idx = 0; + + RTE_LCORE_FOREACH_WORKER(lcore_id) { + unsigned int i; + + for (i = 0; i < num_timers_per_lcore; i++) { + struct rte_htimer *timer = &timers[timer_idx]; + 
struct async_recorder_state *state =
+				&recorder_states[timer_idx];
+
+			timer_idx++;
+
+			/* cancel roughly half of the timers */
+			if (rte_rand_max(2) == 0)
+				continue;
+
+			for (;;) {
+				int rc;
+
+				rc = rte_htimer_mgr_async_cancel(timer,
+						record_async_cancel_cb,
+						state);
+
+				if (unlikely(rc == -EBUSY))
+					rte_htimer_mgr_process();
+				else
+					break;
+			}
+
+			canceled++;
+		}
+	}
+
+	for (;;) {
+		int cancel_completed;
+
+		cancel_completed = records_check_consistency(recorder_states,
+							     num_total_timers);
+
+		if (cancel_completed < 0) {
+			puts("Inconsistency found");
+			return TEST_FAILED;
+		}
+
+		if (cancel_completed == canceled)
+			break;
+
+		rte_htimer_mgr_process();
+	}
+
+	__atomic_store_n(&stop, true, __ATOMIC_RELAXED);
+
+	rte_eal_mp_wait_lcore();
+
+	rte_htimer_mgr_deinit();
+
+	free(timers);
+	free(recorder_states);
+
+	return TEST_SUCCESS;
+}
+
+/*
+ * This is a test case where one thread asynchronously adds two timers,
+ * with the same expiration time; one on the local lcore and one on a
+ * remote lcore. This creates a tricky situation for the timer
+ * manager, and for the application as well, if the htimer struct is
+ * dynamically allocated.
+ */
+
+struct test_timer {
+	uint32_t ref_cnt;
+	uint64_t expiration_time; /* in TSC, not tick */
+	uint32_t *timeout_count;
+	bool *failure_occured;
+	struct rte_htimer htimer;
+};
+
+static struct test_timer *
+test_timer_create(uint64_t expiration_time, uint32_t *timeout_count,
+		  bool *failure_occured)
+{
+	struct test_timer *timer;
+
+	timer = malloc(sizeof(struct test_timer));
+
+	if (timer == NULL)
+		rte_panic("Unable to allocate timer memory\n");
+
+	timer->ref_cnt = 1;
+	timer->expiration_time = expiration_time;
+	timer->timeout_count = timeout_count;
+	timer->failure_occured = failure_occured;
+
+	return timer;
+}
+
+static void
+test_timer_inc_ref_cnt(struct test_timer *timer)
+{
+	__atomic_add_fetch(&timer->ref_cnt, 1, __ATOMIC_RELEASE);
+}
+
+static void
+test_timer_dec_ref_cnt(struct test_timer *timer)
+{
+	if (timer != NULL) {
+		uint32_t cnt = __atomic_sub_fetch(&timer->ref_cnt, 1,
+						  __ATOMIC_RELEASE);
+		if (cnt == 0)
+			free(timer);
+	}
+}
+
+static void
+test_timer_cb(struct rte_htimer *timer, void *arg __rte_unused)
+{
+	struct test_timer *test_timer =
+		container_of(timer, struct test_timer, htimer);
+	uint64_t now = rte_get_tsc_cycles();
+
+	if (now < test_timer->expiration_time)
+		*(test_timer->failure_occured) = true;
+
+	__atomic_fetch_add(test_timer->timeout_count, 1, __ATOMIC_RELAXED);
+
+	test_timer_dec_ref_cnt(test_timer);
+}
+
+static int
+worker_lcore(void *arg)
+{
+	bool *stop = arg;
+
+	while (!__atomic_load_n(stop, __ATOMIC_RELAXED))
+		rte_htimer_mgr_manage();
+
+	return 0;
+}
+
+struct cancel_timer {
+	bool cancel;
+	struct rte_htimer *target_timer;
+	uint32_t *cancel_count;
+	uint32_t *expired_count;
+	bool *failure_occured;
+	struct rte_htimer htimer;
+};
+
+static struct cancel_timer *
+cancel_timer_create(bool cancel, struct rte_htimer *target_timer,
+		    uint32_t *cancel_count, uint32_t *expired_count,
+		    bool *failure_occured)
+{
+	struct cancel_timer *timer;
+
+	timer = malloc(sizeof(struct cancel_timer));
+
+	if (timer == NULL)
+		rte_panic("Unable to allocate timer memory\n");
+
+	timer->cancel = cancel;
+	timer->target_timer = target_timer;
+	timer->cancel_count = cancel_count;
+	timer->expired_count = expired_count;
+	timer->failure_occured = failure_occured;
+
+	return timer;
+}
+
+static void
+async_cancel_cb(struct rte_htimer *timer, int result, void *cb_arg)
+{
+	struct test_timer
*test_timer =
+		container_of(timer, struct test_timer, htimer);
+	struct cancel_timer *cancel_timer = cb_arg;
+	bool *failure_occured = cancel_timer->failure_occured;
+
+	if (!cancel_timer->cancel || cancel_timer->target_timer != timer)
+		*failure_occured = true;
+
+	if (result == RTE_HTIMER_MGR_ASYNC_RESULT_CANCELED) {
+		uint32_t *cancel_count = cancel_timer->cancel_count;
+
+		/* decrease target lcore's ref count */
+		test_timer_dec_ref_cnt(test_timer);
+		(*cancel_count)++;
+	} else if (result == RTE_HTIMER_MGR_ASYNC_RESULT_EXPIRED) {
+		uint32_t *expired_count = cancel_timer->expired_count;
+
+		(*expired_count)++;
+	} else
+		*failure_occured = true;
+
+	/* source lcore's ref count */
+	test_timer_dec_ref_cnt(test_timer);
+
+	free(cancel_timer);
+}
+
+static void
+cancel_timer_cb(struct rte_htimer *timer, void *arg __rte_unused)
+{
+	struct cancel_timer *cancel_timer =
+		container_of(timer, struct cancel_timer, htimer);
+
+	if (cancel_timer->cancel) {
+		int rc;
+
+		rc = rte_htimer_mgr_async_cancel(cancel_timer->target_timer,
+						 async_cancel_cb,
+						 cancel_timer);
+
+		/* ring full; re-arm to retry on the next manage call */
+		if (rc == -EBUSY)
+			rte_htimer_mgr_add(timer, 0, 0, cancel_timer_cb,
+					   NULL, 0);
+	} else
+		free(cancel_timer);
+}
+
+#define REF_CNT_TEST_TICK s_to_tsc(10e-9)
+#define REF_CNT_AVG_EXPIRATION_TIME (50e-6) /* s */
+#define REF_CNT_MAX_EXPIRATION_TIME (2 * REF_CNT_AVG_EXPIRATION_TIME)
+#define REF_CNT_CANCEL_FUZZ(expiration_time) \
+	((uint64_t)((expiration_time) * (rte_drand()/10 + 0.95)))
+
+static int
+test_htimer_mgr_ref_cnt_timers(unsigned int num_timers_per_lcore)
+{
+	unsigned int lcore_id;
+	bool stop = false;
+	unsigned int num_workers = 0;
+	struct test_timer **timers;
+	struct cancel_timer **cancel_timers;
+	unsigned int num_timers;
+	uint32_t timeout_count = 0;
+	uint32_t cancel_count = 0;
+	uint32_t expired_count = 0;
+	bool failure_occured = false;
+	unsigned int timer_idx;
+	unsigned int expected_cancel_attempts;
+	uint64_t deadline;
+	uint64_t now;
+
+	rte_htimer_mgr_init(REF_CNT_TEST_TICK);
+
+	RTE_LCORE_FOREACH_WORKER(lcore_id) {
+		if (rte_eal_remote_launch(worker_lcore, &stop, lcore_id) != 0)
+			rte_panic("Unable to launch timer lcore\n");
+		num_workers++;
+	}
+
+	/* give the workers a chance to get going */
+	rte_delay_us_block(10*1000);
+
+	num_timers = num_timers_per_lcore * num_workers;
+
+	timers = malloc(sizeof(struct test_timer *) * num_timers);
+	cancel_timers = malloc(sizeof(struct cancel_timer *) * num_timers);
+
+	if (timers == NULL || cancel_timers == NULL)
+		rte_panic("Unable to allocate memory\n");
+
+	timer_idx = 0;
+	expected_cancel_attempts = 0;
+
+	RTE_LCORE_FOREACH_WORKER(lcore_id) {
+		unsigned int i;
+
+		for (i = 0; i < num_timers_per_lcore; i++) {
+			uint64_t expiration_time;
+			struct test_timer *timer;
+			struct rte_htimer *htimer;
+			bool cancel;
+			struct cancel_timer *cancel_timer;
+			uint64_t cancel_expiration_time;
+
+			expiration_time =
+				s_to_tsc(REF_CNT_MAX_EXPIRATION_TIME *
+					 rte_drand());
+
+			timer = test_timer_create(expiration_time,
+						  &timeout_count,
+						  &failure_occured);
+			htimer = &timer->htimer;
+
+			timers[timer_idx++] = timer;
+
+			/* for the target lcore's usage of this timer */
+			test_timer_inc_ref_cnt(timer);
+
+			for (;;) {
+				int rc;
+
+				rc = rte_htimer_mgr_async_add(htimer, lcore_id,
+							      expiration_time,
+							      0, test_timer_cb,
+							      NULL, 0, NULL,
+							      NULL);
+				if (unlikely(rc == -EBUSY))
+					rte_htimer_mgr_process();
+				else
+					break;
+			}
+
+			cancel = rte_rand_max(2);
+
+			cancel_timer =
+				cancel_timer_create(cancel, &timer->htimer,
+						    &cancel_count,
+						    &expired_count,
+						    &failure_occured);
+
+			cancel_expiration_time =
+				REF_CNT_CANCEL_FUZZ(expiration_time);
+
+			rte_htimer_mgr_add(&cancel_timer->htimer,
+					   cancel_expiration_time, 0,
+					   cancel_timer_cb, NULL, 0);
+
+			if (cancel)
+				expected_cancel_attempts++;
+		}
+	}
+
+	deadline = rte_get_tsc_cycles() +
+		s_to_tsc(REF_CNT_MAX_EXPIRATION_TIME + 0.25);
+
+	do {
+		now = rte_get_tsc_cycles();
+
+		rte_htimer_mgr_manage_time(now);
+
+	} while (now < deadline);
+
+	__atomic_store_n(&stop, true, __ATOMIC_RELAXED);
+
+	rte_eal_mp_wait_lcore();
+
+	rte_htimer_mgr_deinit();
+
+	free(timers);
+	free(cancel_timers);
+
+	if (failure_occured)
+		return TEST_FAILED;
+
+	if ((cancel_count + expired_count) != expected_cancel_attempts)
+		return TEST_FAILED;
+
+	if (timeout_count != (num_timers - cancel_count))
+		return TEST_FAILED;
+
+	return TEST_SUCCESS;
+}
+
+static int
+test_htimer_mgr(void)
+{
+	int rc;
+
+	rc = test_htimer_mgr_async_add(1);
+	if (rc != TEST_SUCCESS)
+		return rc;
+
+	rc = test_htimer_mgr_async_add(100000);
+	if (rc != TEST_SUCCESS)
+		return rc;
+
+	rc = test_htimer_mgr_async_add_cancel(100);
+	if (rc != TEST_SUCCESS)
+		return rc;
+
+	rc = test_htimer_mgr_ref_cnt_timers(10);
+	if (rc != TEST_SUCCESS)
+		return rc;
+
+	rc = test_htimer_mgr_ref_cnt_timers(10000);
+	if (rc != TEST_SUCCESS)
+		return rc;
+
+	return TEST_SUCCESS;
+}
+
+REGISTER_TEST_COMMAND(htimer_mgr_autotest, test_htimer_mgr);
diff --git a/app/test/test_htimer_mgr_perf.c b/app/test/test_htimer_mgr_perf.c
new file mode 100644
index 0000000000..179b0ba6e1
--- /dev/null
+++ b/app/test/test_htimer_mgr_perf.c
@@ -0,0 +1,324 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2023 Ericsson AB
+ */
+
+#include "test.h"
+
+#include
+#include
+#include
+
+#include
+#include
+#include
+#include
+#include
+
+static void
+nop_cb(struct rte_htimer *timer __rte_unused, void *arg __rte_unused)
+{
+}
+
+static uint64_t
+add_rand_timers(struct rte_htimer *timers, uint64_t num,
+		uint64_t timeout_start, uint64_t max_timeout)
+{
+	uint64_t i;
+	uint64_t expiration_times[num];
+	uint64_t start_ts;
+	uint64_t end_ts;
+
+	for (i = 0; i < num; i++)
+		expiration_times[i] =
+			1 + timeout_start + rte_rand_max(max_timeout - 1);
+
+	start_ts = rte_get_tsc_cycles();
+
+	for (i = 0; i < num; i++)
+		rte_htimer_mgr_add(&timers[i], expiration_times[i], 0, nop_cb,
+				   NULL, RTE_HTIMER_FLAG_ABSOLUTE_TIME);
+
+	/* make sure the timers are actually scheduled in the wheel */
+	rte_htimer_mgr_process();
+
+	end_ts = rte_get_tsc_cycles();
+
+	return end_ts - start_ts;
+}
+
+#define TIME_STEP 16
+
+static void
+test_add_manage_perf(const char *scenario_name, uint64_t num_timers,
+		     uint64_t timespan)
+{
+	uint64_t manage_calls;
+	struct rte_htimer *timers;
+	uint64_t start;
+	uint64_t now;
+	uint64_t start_ts;
+	uint64_t end_ts;
+	uint64_t add_latency;
+	uint64_t manage_latency;
+
+	rte_htimer_mgr_init(1);
+
+	manage_calls = timespan / TIME_STEP;
+
+	printf("Scenario: %s\n", scenario_name);
+	printf("  Configuration:\n");
+	printf("    Timers: %"PRIu64"\n", num_timers);
+	printf("    Max timeout: %"PRIu64" ticks\n", timespan);
+	printf("    Average timeouts/manage call: %.3f\n",
+	       num_timers / (double)manage_calls);
+	printf("    Time advance per manage call: %d\n", TIME_STEP);
+
+	printf("  Results:\n");
+
+	timers = rte_malloc(NULL, sizeof(struct rte_htimer) * num_timers, 0);
+
+	if (timers == NULL)
+		rte_panic("Unable to allocate memory\n");
+
+	start = 1 + rte_rand_max(UINT64_MAX / 2);
+
+	rte_htimer_mgr_manage_time(start - 1);
+
+	add_latency = add_rand_timers(timers, num_timers, start, timespan);
+
+	start_ts = rte_get_tsc_cycles();
+
+	for (now = start; now < (start + timespan); now += TIME_STEP)
+		rte_htimer_mgr_manage_time(now);
+
+	end_ts = rte_get_tsc_cycles();
+
+	manage_latency = end_ts - start_ts;
+
+	printf("    %.0f TSC cycles / add op\n",
+	       (double)add_latency / num_timers);
+	printf("    %.0f TSC cycles / manage call\n",
+	       (double)manage_latency / manage_calls);
+	printf("    %.1f TSC cycles / tick\n",
+	       (double)manage_latency / timespan);
+
+	rte_htimer_mgr_deinit();
+
+	rte_free(timers);
+}
+
+#define ITERATIONS 500
+
+static int
+test_del_perf(uint64_t num_timers, uint64_t timespan)
+{
+	struct rte_htimer *timers;
+	uint64_t start;
+	uint64_t i, j;
+	uint64_t start_ts;
+	uint64_t end_ts;
+	uint64_t latency = 0;
+
+	rte_htimer_mgr_init(1);
+
+	timers = rte_malloc(NULL, sizeof(struct rte_htimer) * num_timers, 0);
+
+	if (timers == NULL)
+		rte_panic("Unable to allocate memory\n");
+
+	start = 1 + rte_rand_max(UINT64_MAX / 2);
+
+	for (i = 0; i < ITERATIONS; i++) {
+		rte_htimer_mgr_manage_time(start - 1);
+
+		add_rand_timers(timers, num_timers, start, timespan);
+
+		/* A manage (or process) call is required to get all
+		 * timers scheduled, which may in turn make them a
+		 * little more expensive to remove.
+		 */
+		rte_htimer_mgr_manage_time(start);
+
+		start_ts = rte_get_tsc_cycles();
+
+		for (j = 0; j < num_timers; j++)
+			if (rte_htimer_mgr_cancel(&timers[j]) < 0)
+				return TEST_FAILED;
+
+		end_ts = rte_get_tsc_cycles();
+
+		latency += (end_ts - start_ts);
+
+		start += (timespan + 1);
+	}
+
+	printf("Timer delete: %.0f TSC cycles / call\n",
+	       (double)latency / (double)ITERATIONS / (double)num_timers);
+
+	rte_htimer_mgr_deinit();
+
+	rte_free(timers);
+
+	return TEST_SUCCESS;
+}
+
+static int
+target_lcore(void *arg)
+{
+	bool *stop = arg;
+
+	while (!__atomic_load_n(stop, __ATOMIC_RELAXED))
+		rte_htimer_mgr_manage();
+
+	return 0;
+}
+
+static void
+count_async_cb(struct rte_htimer *timer __rte_unused, int result,
+	       void *cb_arg)
+{
+	unsigned int *count = cb_arg;
+
+	if (result == RTE_HTIMER_MGR_ASYNC_RESULT_ADDED)
+		(*count)++;
+}
+
+static uint64_t
+s_to_tsc(double s)
+{
+	return s * rte_get_tsc_hz();
+}
+
+static uint64_t
+tsc_to_us(uint64_t tsc)
+{
+	return (double)tsc / (double)rte_get_tsc_hz() * 1e6;
+}
+
+#define ASYNC_ADD_TEST_TICK s_to_tsc(500e-9)
+/*
+ * The number of test timers must be kept less than the size of the
+ * htimer-internal message ring for this test case to work.
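+ * Otherwise, rte_htimer_mgr_async_add() would begin to fail with
+ * -EBUSY, which this benchmark makes no attempt to handle.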
+ */
+#define ASYNC_ADD_TEST_NUM_TIMERS 1000
+#define ASYNC_ADD_TEST_MIN_TIMEOUT (ASYNC_ADD_TEST_NUM_TIMERS * s_to_tsc(1e-6))
+#define ASYNC_ADD_TEST_MAX_TIMEOUT (2 * ASYNC_ADD_TEST_MIN_TIMEOUT)
+
+static void
+test_async_add_perf(void)
+{
+	uint64_t max_timeout = ASYNC_ADD_TEST_MAX_TIMEOUT;
+	uint64_t min_timeout = ASYNC_ADD_TEST_MIN_TIMEOUT;
+	unsigned int num_timers = ASYNC_ADD_TEST_NUM_TIMERS;
+	struct rte_htimer *timers;
+	bool *stop;
+	unsigned int lcore_id = rte_lcore_id();
+	unsigned int target_lcore_id =
+		rte_get_next_lcore(lcore_id, true, true);
+	uint64_t now;
+	uint64_t request_latency = 0;
+	uint64_t response_latency = 0;
+	unsigned int i;
+
+	rte_htimer_mgr_init(ASYNC_ADD_TEST_TICK);
+
+	timers = rte_malloc(NULL, sizeof(struct rte_htimer) * num_timers,
+			    RTE_CACHE_LINE_SIZE);
+	stop = rte_malloc(NULL, sizeof(bool), RTE_CACHE_LINE_SIZE);
+
+	if (timers == NULL || stop == NULL)
+		rte_panic("Unable to allocate memory\n");
+
+	*stop = false;
+
+	if (rte_eal_remote_launch(target_lcore, stop, target_lcore_id) != 0)
+		rte_panic("Unable to launch worker lcore\n");
+
+	/* wait for launch to complete */
+	rte_delay_us_block(100);
+
+	for (i = 0; i < ITERATIONS; i++) {
+		uint64_t expiration_times[num_timers];
+		unsigned int j;
+		uint64_t start_ts;
+		uint64_t end_ts;
+		unsigned int count = 0;
+
+		now = rte_get_tsc_cycles();
+
+		for (j = 0; j < num_timers; j++)
+			expiration_times[j] = now + min_timeout +
+				rte_rand_max(max_timeout - min_timeout);
+
+		start_ts = rte_get_tsc_cycles();
+
+		for (j = 0; j < num_timers; j++)
+			rte_htimer_mgr_async_add(&timers[j], target_lcore_id,
+						 expiration_times[j], 0,
+						 nop_cb, NULL,
+						 RTE_HTIMER_FLAG_ABSOLUTE_TIME,
+						 count_async_cb, &count);
+
+		end_ts = rte_get_tsc_cycles();
+
+		request_latency += (end_ts - start_ts);
+
+		/* wait long enough for the target lcore to have answered */
+		rte_delay_us_block(1 * num_timers);
+
+		start_ts = rte_get_tsc_cycles();
+
+		while (count != num_timers)
+			rte_htimer_mgr_process();
+
+		end_ts = rte_get_tsc_cycles();
+
+		response_latency += (end_ts - start_ts);
+
+		/* wait until all timeouts have fired */
+		rte_delay_us_block(tsc_to_us(max_timeout));
+	}
+
+	__atomic_store_n(stop, true, __ATOMIC_RELAXED);
+
+	rte_eal_mp_wait_lcore();
+
+	rte_free(timers);
+	rte_free(stop);
+
+	rte_htimer_mgr_deinit();
+
+	printf("Timer async add:\n");
+	printf("  Configuration:\n");
+	printf("    Timers: %d\n", ASYNC_ADD_TEST_NUM_TIMERS);
+	printf("  Results:\n");
+	printf("    Source lcore cost: %.0f TSC cycles / add request\n",
+	       (double)request_latency / (double)ITERATIONS / num_timers);
+	printf("                       %.0f TSC cycles / async add response\n",
+	       (double)response_latency / (double)ITERATIONS / num_timers);
+}
+
+static int
+test_htimer_mgr_perf(void)
+{
+	rte_delay_us_block(10000);
+
+	test_add_manage_perf("Sparse", 100000, 10000000);
+
+	test_add_manage_perf("Dense", 100000, 200000);
+
+	test_add_manage_perf("Idle", 10, 100000);
+
+	test_add_manage_perf("Small", 1000, 100000);
+
+	if (test_del_perf(100000, 100000) != TEST_SUCCESS)
+		return TEST_FAILED;
+
+	test_async_add_perf();
+
+	return TEST_SUCCESS;
+}
+
+REGISTER_TEST_COMMAND(htimer_mgr_perf_autotest, test_htimer_mgr_perf);
diff --git a/app/test/test_htw.c b/app/test/test_htw.c
new file mode 100644
index 0000000000..3cddfaed7f
--- /dev/null
+++ b/app/test/test_htw.c
@@ -0,0 +1,478 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2023 Ericsson AB
+ */
+
+#include "test.h"
+
+#include
+#include
+#include
+
+#include
+#include
+#include
+
+struct recorder {
+	struct rte_htimer_list timeout_list;
+
uint64_t num_timeouts; +}; + +static void +recorder_init(struct recorder *recorder) +{ + recorder->num_timeouts = 0; + LIST_INIT(&recorder->timeout_list); +} + +static void +recorder_cb(struct rte_htimer *timer, void *arg) +{ + struct recorder *recorder = arg; + + recorder->num_timeouts++; + + LIST_INSERT_HEAD(&recorder->timeout_list, timer, entry); +} + +static int +recorder_verify(struct recorder *recorder, uint64_t min_expiry, + uint64_t max_expiry) +{ + struct rte_htimer *timer; + + LIST_FOREACH(timer, &recorder->timeout_list, entry) { + if (timer->expiration_time > max_expiry) + return TEST_FAILED; + + if (timer->expiration_time < min_expiry) + return TEST_FAILED; + } + + return TEST_SUCCESS; +} + +static void +add_rand_timers(struct rte_htw *htw, struct rte_htimer *timers, + uint64_t num, uint64_t timeout_start, uint64_t max_timeout, + rte_htimer_cb_t cb, void *cb_arg) +{ + uint64_t i; + + for (i = 0; i < num; i++) { + struct rte_htimer *timer = &timers[i]; + bool use_absolute = rte_rand() & 1; + unsigned int flags = 0; + uint64_t expiration_time; + + expiration_time = timeout_start + rte_rand_max(max_timeout); + + if (use_absolute) + flags |= RTE_HTIMER_FLAG_ABSOLUTE_TIME; + else { + uint64_t htw_current_time; + + htw_current_time = rte_htw_current_time(htw); + + if (expiration_time < htw_current_time) + expiration_time = 0; + else + expiration_time -= htw_current_time; + } + + rte_htw_add(htw, timer, expiration_time, 0, cb, cb_arg, flags); + } +} + +#define ADVANCE_TIME_MAX_STEP 16 + +static int +test_rand_timers(uint64_t in_flight_timers, uint64_t max_timeout, + uint64_t runtime) +{ + struct recorder recorder; + struct rte_htimer *timers; + uint64_t fired = 0; + uint64_t start; + uint64_t now; + struct rte_htw *htw; + uint64_t added; + + recorder_init(&recorder); + + timers = malloc(sizeof(struct rte_htimer) * in_flight_timers); + + if (timers == NULL) + rte_panic("Unable to allocate heap memory\n"); + + start = rte_rand_max(UINT64_MAX - max_timeout); + + htw = rte_htw_create(); + + if (htw == NULL) + return TEST_FAILED; + + added = in_flight_timers; + add_rand_timers(htw, timers, added, start + 1, max_timeout, + recorder_cb, &recorder); + + for (now = start; now < (start + runtime); ) { + uint64_t advance; + + advance = rte_rand_max(ADVANCE_TIME_MAX_STEP); + + now += advance; + + rte_htw_manage(htw, now); + + if (recorder.num_timeouts > 0) { + struct rte_htimer *timer; + + if (advance == 0) + return TEST_FAILED; + + if (recorder_verify(&recorder, now - advance + 1, now) + != TEST_SUCCESS) + return TEST_FAILED; + + while ((timer = LIST_FIRST(&recorder.timeout_list)) + != NULL) { + LIST_REMOVE(timer, entry); + + add_rand_timers(htw, timer, 1, + now + 1, max_timeout, + recorder_cb, &recorder); + added++; + fired++; + } + + recorder.num_timeouts = 0; + } + } + + /* finish the remaining timeouts */ + + rte_htw_manage(htw, now + max_timeout); + + if (recorder_verify(&recorder, now, now + max_timeout) != TEST_SUCCESS) + return TEST_FAILED; + fired += recorder.num_timeouts; + + if (fired != added) + return TEST_FAILED; + + rte_htw_destroy(htw); + + free(timers); + + return TEST_SUCCESS; +} + +struct counter_state { + int calls; + struct rte_htw *htw; + bool cancel; +}; + +static void +count_timeouts_cb(struct rte_htimer *timer __rte_unused, void *arg) +{ + struct counter_state *state = arg; + + state->calls++; + + if (state->cancel) + rte_htw_cancel(state->htw, timer); +} + +static int +test_single_timeout_type(uint64_t now, uint64_t distance, bool use_absolute) +{ + struct rte_htw *htw; 
+ struct counter_state cstate = {}; + struct rte_htimer timer; + uint64_t expiration_time; + unsigned int flags = 0; + + htw = rte_htw_create(); + + rte_htw_manage(htw, now); + + if (use_absolute) { + expiration_time = now + distance; + flags |= RTE_HTIMER_FLAG_ABSOLUTE_TIME; + } else + expiration_time = distance; + + rte_htw_add(htw, &timer, expiration_time, 0, count_timeouts_cb, + &cstate, flags); + + rte_htw_manage(htw, now); + + if (cstate.calls != 0) + return TEST_FAILED; + + rte_htw_manage(htw, now + distance - 1); + + if (cstate.calls != 0) + return TEST_FAILED; + + rte_htw_manage(htw, now + distance); + + + if (cstate.calls != 1) + return TEST_FAILED; + + rte_htw_manage(htw, now + distance); + + if (cstate.calls != 1) + return TEST_FAILED; + + rte_htw_manage(htw, now + distance + 1); + + if (cstate.calls != 1) + return TEST_FAILED; + + rte_htw_destroy(htw); + + return TEST_SUCCESS; +} + +static int +test_single_timeout(uint64_t now, uint64_t distance) +{ + + int rc; + + rc = test_single_timeout_type(now, distance, true); + if (rc < 0) + return rc; + + rc = test_single_timeout_type(now, distance, false); + if (rc < 0) + return rc; + + return TEST_SUCCESS; +} + +static int +test_periodical_timer(uint64_t now, uint64_t start, uint64_t period) +{ + struct rte_htw *htw; + struct counter_state cstate; + struct rte_htimer timer; + + htw = rte_htw_create(); + + cstate = (struct counter_state) { + .htw = htw + }; + + rte_htw_manage(htw, now); + + rte_htw_add(htw, &timer, start, period, count_timeouts_cb, + &cstate, RTE_HTIMER_FLAG_PERIODICAL); + + rte_htw_manage(htw, now); + + if (cstate.calls != 0) + return TEST_FAILED; + + rte_htw_manage(htw, now + start - 1); + + if (cstate.calls != 0) + return TEST_FAILED; + + rte_htw_manage(htw, now + start); + + if (cstate.calls != 1) + return TEST_FAILED; + + rte_htw_manage(htw, now + start + 1); + + if (cstate.calls != 1) + return TEST_FAILED; + + rte_htw_manage(htw, now + start + period); + + if (cstate.calls != 2) + return TEST_FAILED; + + cstate.cancel = true; + + rte_htw_manage(htw, now + start + 2 * period); + + if (cstate.calls != 3) + return TEST_FAILED; + + rte_htw_manage(htw, now + start + 3 * period); + + if (cstate.calls != 3) + return TEST_FAILED; + + rte_htw_destroy(htw); + + return TEST_SUCCESS; +} + +#define CANCEL_ITERATIONS 1000 +#define CANCEL_NUM_TIMERS 1000 +#define CANCEL_MAX_DISTANCE 10000 + +static int +test_cancel_timer(void) +{ + uint64_t now; + struct rte_htw *htw; + int i; + struct rte_htimer timers[CANCEL_NUM_TIMERS]; + struct counter_state timeouts[CANCEL_NUM_TIMERS]; + + now = rte_rand_max(UINT64_MAX / 2); + + htw = rte_htw_create(); + + for (i = 0; i < CANCEL_ITERATIONS; i++) { + int j; + int target; + + for (j = 0; j < CANCEL_NUM_TIMERS; j++) { + struct rte_htimer *timer = &timers[j]; + uint64_t expiration_time; + + timeouts[j] = (struct counter_state) {}; + + expiration_time = now + 1 + + rte_rand_max(CANCEL_MAX_DISTANCE); + + rte_htw_add(htw, timer, expiration_time, 0, + count_timeouts_cb, &timeouts[j], + RTE_HTIMER_FLAG_ABSOLUTE_TIME); + } + + target = rte_rand_max(CANCEL_NUM_TIMERS); + + rte_htw_cancel(htw, &timers[target]); + + now += CANCEL_MAX_DISTANCE; + + rte_htw_manage(htw, now); + + for (j = 0; j < CANCEL_NUM_TIMERS; j++) { + if (j != target) { + if (timeouts[j].calls != 1) + return TEST_FAILED; + } else { + if (timeouts[j].calls > 0) + return TEST_FAILED; + } + } + } + + rte_htw_destroy(htw); + + return TEST_SUCCESS; +} + +static void +nop_cb(struct rte_htimer *timer __rte_unused, void *arg __rte_unused) 
+{ +} + +#define NEXT_NUM_TIMERS 1000 +#define NEXT_MAX_DISTANCE 10000 + +static int +test_next_timeout(void) +{ + uint64_t now; + struct rte_htw *htw; + int i; + struct rte_htimer timers[NEXT_NUM_TIMERS]; + uint64_t last_expiration; + + now = rte_rand_max(NEXT_MAX_DISTANCE); + + htw = rte_htw_create(); + + if (rte_htw_next_timeout(htw, UINT64_MAX) != UINT64_MAX) + return TEST_FAILED; + if (rte_htw_next_timeout(htw, now + 1) != (now + 1)) + return TEST_FAILED; + + rte_htw_manage(htw, now); + + last_expiration = now + NEXT_MAX_DISTANCE * NEXT_NUM_TIMERS; + + for (i = 0; i < NEXT_NUM_TIMERS; i++) { + struct rte_htimer *timer = &timers[i]; + uint64_t expiration; + uint64_t upper_bound; + + /* add timers, each new one closer than the last */ + + expiration = last_expiration - rte_rand_max(NEXT_MAX_DISTANCE); + + rte_htw_add(htw, timer, expiration, 0, nop_cb, NULL, + RTE_HTIMER_FLAG_ABSOLUTE_TIME); + + if (rte_htw_next_timeout(htw, UINT64_MAX) != expiration) + return TEST_FAILED; + + upper_bound = expiration + rte_rand_max(100000); + + if (rte_htw_next_timeout(htw, upper_bound) != expiration) + return TEST_FAILED; + + upper_bound = expiration - rte_rand_max(expiration); + + if (rte_htw_next_timeout(htw, upper_bound) != upper_bound) + return TEST_FAILED; + + last_expiration = expiration; + } + + rte_htw_destroy(htw); + + return TEST_SUCCESS; +} + +static int +test_htw(void) +{ + if (test_single_timeout(0, 10) != TEST_SUCCESS) + return TEST_FAILED; + + if (test_single_timeout(0, 254) != TEST_SUCCESS) + return TEST_FAILED; + + if (test_single_timeout(0, 255) != TEST_SUCCESS) + return TEST_FAILED; + + if (test_single_timeout(255, 1) != TEST_SUCCESS) + return TEST_FAILED; + + if (test_single_timeout(254, 2) != TEST_SUCCESS) + return TEST_FAILED; + + if (test_periodical_timer(10000, 500, 2) != TEST_SUCCESS) + return TEST_FAILED; + + if (test_periodical_timer(1234567, 12345, 100000) != TEST_SUCCESS) + return TEST_FAILED; + + if (test_cancel_timer() != TEST_SUCCESS) + return TEST_FAILED; + + if (test_rand_timers(1000, 100000, 100000000) != TEST_SUCCESS) + return TEST_FAILED; + + if (test_rand_timers(100000, 100000, 1000000) != TEST_SUCCESS) + return TEST_FAILED; + + if (test_next_timeout() != TEST_SUCCESS) + return TEST_FAILED; + + return TEST_SUCCESS; +} + +REGISTER_TEST_COMMAND(htw_autotest, test_htw); diff --git a/app/test/test_htw_perf.c b/app/test/test_htw_perf.c new file mode 100644 index 0000000000..65901f0874 --- /dev/null +++ b/app/test/test_htw_perf.c @@ -0,0 +1,181 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2023 Ericsson AB + */ + +#include "test.h" + +#include +#include +#include + +#include +#include +#include +#include + +static void +nop_cb(struct rte_htimer *timer __rte_unused, void *arg __rte_unused) +{ +} + +static void +add_rand_timers(struct rte_htw *htw, struct rte_htimer *timers, + uint64_t num, uint64_t timeout_start, uint64_t max_timeout) +{ + uint64_t i; + uint64_t expiration_times[num]; + uint64_t start_ts; + uint64_t end_ts; + + for (i = 0; i < num; i++) + expiration_times[i] = timeout_start + rte_rand_max(max_timeout); + + start_ts = rte_get_tsc_cycles(); + + for (i = 0; i < num; i++) { + struct rte_htimer *timer = &timers[i]; + + rte_htw_add(htw, timer, expiration_times[i], 0, nop_cb, NULL, + RTE_HTIMER_FLAG_ABSOLUTE_TIME); + } + + /* actually install the timers */ + rte_htw_process(htw); + + end_ts = rte_get_tsc_cycles(); + + printf(" %.0f TSC cycles / add op\n", + (double)(end_ts - start_ts) / num); +} + +#define TIME_STEP 16 + +static int 
+test_add_manage_perf(const char *scenario_name, uint64_t num_timers, + uint64_t timespan) +{ + uint64_t manage_calls; + struct rte_htimer *timers; + uint64_t start; + uint64_t now; + struct rte_htw *htw; + uint64_t start_ts; + uint64_t end_ts; + double latency; + + manage_calls = timespan / TIME_STEP; + + printf("Scenario: %s\n", scenario_name); + printf(" Configuration:\n"); + printf(" Timers: %"PRIu64"\n", num_timers); + printf(" Max timeout: %"PRIu64" ticks\n", timespan); + printf(" Average timeouts/manage call: %.3f\n", + num_timers / (double)manage_calls); + printf(" Time advance per manage call: %d\n", TIME_STEP); + + printf(" Results:\n"); + + timers = rte_malloc(NULL, sizeof(struct rte_htimer) * + num_timers, 0); + + if (timers == NULL) + rte_panic("Unable to allocate memory\n"); + + htw = rte_htw_create(); + + if (htw == NULL) + return TEST_FAILED; + + start = 1 + rte_rand_max(UINT64_MAX / 2); + + rte_htw_manage(htw, start - 1); + + add_rand_timers(htw, timers, num_timers, start, timespan); + + start_ts = rte_get_tsc_cycles(); + + for (now = start; now < (start + timespan); now += TIME_STEP) + rte_htw_manage(htw, now); + + end_ts = rte_get_tsc_cycles(); + + latency = end_ts - start_ts; + + printf(" %.0f TSC cycles / manage call\n", + latency / manage_calls); + printf(" %.1f TSC cycles / tick\n", latency / timespan); + + rte_htw_destroy(htw); + + rte_free(timers); + + return TEST_SUCCESS; +} + +static int +test_cancel_perf(uint64_t num_timers, uint64_t timespan) +{ + struct rte_htimer *timers; + uint64_t start; + struct rte_htw *htw; + uint64_t i; + uint64_t start_ts; + uint64_t end_ts; + double latency; + + timers = rte_malloc(NULL, sizeof(struct rte_htimer) * num_timers, 0); + + if (timers == NULL) + rte_panic("Unable to allocate memory\n"); + + htw = rte_htw_create(); + + if (htw == NULL) + return TEST_FAILED; + + start = 1 + rte_rand_max(UINT64_MAX / 2); + + rte_htw_manage(htw, start - 1); + + add_rand_timers(htw, timers, num_timers, start, timespan); + + start_ts = rte_get_tsc_cycles(); + + for (i = 0; i < num_timers; i++) + rte_htw_cancel(htw, &timers[i]); + + end_ts = rte_get_tsc_cycles(); + + latency = end_ts - start_ts; + + printf("Timer delete: %.0f TSC cycles / call\n", + latency / num_timers); + + rte_htw_destroy(htw); + + rte_free(timers); + + return TEST_SUCCESS; +} + +static int +test_htw_perf(void) +{ + rte_delay_us_block(100); + + if (test_add_manage_perf("Sparse", 100000, 10000000) != TEST_SUCCESS) + return TEST_FAILED; + + if (test_add_manage_perf("Dense", 100000, 200000) != TEST_SUCCESS) + return TEST_FAILED; + + if (test_add_manage_perf("Idle", 10, 100000) != TEST_SUCCESS) + return TEST_FAILED; + + if (test_cancel_perf(100000, 100000) != TEST_SUCCESS) + return TEST_FAILED; + + return TEST_SUCCESS; +} + +REGISTER_TEST_COMMAND(htw_perf_autotest, test_htw_perf); diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md index 2deec7ea19..5ea1dfa262 100644 --- a/doc/api/doxy-api-index.md +++ b/doc/api/doxy-api-index.md @@ -67,6 +67,8 @@ The public API headers are grouped by topics: - **timers**: [cycles](@ref rte_cycles.h), [timer](@ref rte_timer.h), + [htimer_mgr](@ref rte_htimer_mgr.h), + [htimer](@ref rte_htimer.h), [alarm](@ref rte_alarm.h) - **locks**: @@ -163,7 +165,8 @@ The public API headers are grouped by topics: [ring](@ref rte_ring.h), [stack](@ref rte_stack.h), [tailq](@ref rte_tailq.h), - [bitmap](@ref rte_bitmap.h) + [bitmap](@ref rte_bitmap.h), + [bitset](@ref rte_bitset.h) - **packet framework**: * [port](@ref rte_port.h): diff --git 
a/doc/api/doxy-api.conf.in b/doc/api/doxy-api.conf.in
index e859426099..c0cd64db34 100644
--- a/doc/api/doxy-api.conf.in
+++ b/doc/api/doxy-api.conf.in
@@ -45,6 +45,7 @@ INPUT = @TOPDIR@/doc/api/doxy-api-index.md \
                           @TOPDIR@/lib/gro \
                           @TOPDIR@/lib/gso \
                           @TOPDIR@/lib/hash \
+                          @TOPDIR@/lib/htimer \
                           @TOPDIR@/lib/ip_frag \
                           @TOPDIR@/lib/ipsec \
                           @TOPDIR@/lib/jobstats \
diff --git a/lib/htimer/meson.build b/lib/htimer/meson.build
new file mode 100644
index 0000000000..2dd5d6a24b
--- /dev/null
+++ b/lib/htimer/meson.build
@@ -0,0 +1,7 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2023 Ericsson AB
+
+sources = files('rte_htw.c', 'rte_htimer_msg_ring.c', 'rte_htimer_mgr.c')
+headers = files('rte_htimer_mgr.h', 'rte_htimer.h')
+
+deps += ['ring']
diff --git a/lib/htimer/rte_htimer.h b/lib/htimer/rte_htimer.h
new file mode 100644
index 0000000000..e245b30c65
--- /dev/null
+++ b/lib/htimer/rte_htimer.h
@@ -0,0 +1,65 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2023 Ericsson AB
+ */
+
+#ifndef _RTE_HTIMER_H_
+#define _RTE_HTIMER_H_
+
+#include <stdbool.h>
+#include <stdint.h>
+#include <sys/queue.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+struct rte_htimer;
+
+typedef void (*rte_htimer_cb_t)(struct rte_htimer *, void *);
+
+struct rte_htimer {
+	/**
+	 * Absolute timer expiration time (in ticks).
+	 */
+	uint64_t expiration_time;
+	/**
+	 * Time between expirations (in ticks). Zero for one-shot timers.
+	 */
+	uint64_t period;
+	/**
+	 * The id of the lcore owning this timer. May safely be read
+	 * from any thread.
+	 */
+	uint32_t owner_lcore_id;
+	/**
+	 * The current state of the timer.
+	 */
+	uint32_t state:4;
+	/**
+	 * Flags set on this timer.
+	 */
+	uint32_t flags:28;
+	/**
+	 * User-specified callback function pointer.
+	 */
+	rte_htimer_cb_t cb;
+	/**
+	 * Argument for user callback.
+	 */
+	void *cb_arg;
+	/**
+	 * Pointers used to add the timer to various internal lists.
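+	 * This field is owned by the htimer library, and must not be
+	 * accessed by the application while the timer may be live.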
+ */ + LIST_ENTRY(rte_htimer) entry; +}; + +#define RTE_HTIMER_FLAG_ABSOLUTE_TIME (UINT32_C(1) << 0) +#define RTE_HTIMER_FLAG_PERIODICAL (UINT32_C(1) << 1) + +#define RTE_HTIMER_STATE_PENDING 1 +#define RTE_HTIMER_STATE_EXPIRED 2 +#define RTE_HTIMER_STATE_CANCELED 3 + +LIST_HEAD(rte_htimer_list, rte_htimer); + +#ifdef __cplusplus +} +#endif + +#endif /* _RTE_HTIMER_H_ */ diff --git a/lib/htimer/rte_htimer_mgr.c b/lib/htimer/rte_htimer_mgr.c new file mode 100644 index 0000000000..7bb1630680 --- /dev/null +++ b/lib/htimer/rte_htimer_mgr.c @@ -0,0 +1,488 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2023 Ericsson AB + */ + +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include + +#include "rte_htimer_mgr.h" +#include "rte_htimer_msg.h" +#include "rte_htimer_msg_ring.h" + +#define MAX_MSG_BATCH_SIZE 16 + +struct htimer_mgr { + struct rte_htimer_msg_ring *msg_ring; + struct rte_htw *htw; + + unsigned int async_msgs_idx __rte_cache_aligned; + unsigned int num_async_msgs; + struct rte_htimer_msg async_msgs[MAX_MSG_BATCH_SIZE]; +} __rte_cache_aligned; + +static uint64_t tsc_per_tick; + +static struct htimer_mgr mgrs[RTE_MAX_LCORE + 1]; + +#define MAX_ASYNC_TRANSACTIONS 1024 +#define MSG_RING_SIZE MAX_ASYNC_TRANSACTIONS + +static inline uint64_t +tsc_to_tick(uint64_t tsc) +{ + return tsc / tsc_per_tick; +} + +static inline uint64_t +tsc_to_tick_round_up(uint64_t tsc) +{ + uint64_t tick; + uint64_t remainder; + + tick = tsc / tsc_per_tick; + remainder = tsc % tsc_per_tick; + + if (likely(remainder > 0)) + tick++; + + return tick; +} + +static uint64_t +tick_to_tsc(uint64_t tick) +{ + return tick * tsc_per_tick; +} + +static struct htimer_mgr * +mgr_get(unsigned int lcore_id) +{ + return &mgrs[lcore_id]; +} + +static int +mgr_init(unsigned int lcore_id) +{ + char ring_name[RTE_RING_NAMESIZE]; + unsigned int socket_id; + struct htimer_mgr *mgr = &mgrs[lcore_id]; + + socket_id = rte_lcore_to_socket_id(lcore_id); + + snprintf(ring_name, sizeof(ring_name), "htimer_%d", lcore_id); + + mgr->msg_ring = + rte_htimer_msg_ring_create(ring_name, MSG_RING_SIZE, socket_id, + RING_F_SC_DEQ); + + if (mgr->msg_ring == NULL) + goto err; + + mgr->htw = rte_htw_create(); + + if (mgr->htw == NULL) + goto err_free_ring; + + mgr->async_msgs_idx = 0; + mgr->num_async_msgs = 0; + + return 0; + +err_free_ring: + rte_htimer_msg_ring_free(mgr->msg_ring); +err: + return -ENOMEM; +} + +static void +mgr_deinit(unsigned int lcore_id) +{ + struct htimer_mgr *mgr = &mgrs[lcore_id]; + + rte_htw_destroy(mgr->htw); + + rte_htimer_msg_ring_free(mgr->msg_ring); +} + +static volatile bool initialized; + +static void +assure_initialized(void) +{ + RTE_ASSERT(initialized); +} + +int +rte_htimer_mgr_init(uint64_t _tsc_per_tick) +{ + unsigned int lcore_id; + + RTE_VERIFY(!initialized); + + tsc_per_tick = _tsc_per_tick; + + for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) { + int rc; + + rc = mgr_init(lcore_id); + + if (rc < 0) { + unsigned int deinit_lcore_id; + + for (deinit_lcore_id = 0; deinit_lcore_id < lcore_id; + deinit_lcore_id++) + mgr_deinit(deinit_lcore_id); + + return rc; + } + } + + initialized = true; + + return 0; +} + +void +rte_htimer_mgr_deinit(void) +{ + unsigned int lcore_id; + + assure_initialized(); + + for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) + mgr_deinit(lcore_id); + + initialized = false; +} + +void +rte_htimer_mgr_add(struct rte_htimer *timer, uint64_t expiration_time_tsc, + uint64_t period_tsc, rte_htimer_cb_t timer_cb, + void 
*timer_cb_arg, uint32_t flags) +{ + unsigned int lcore_id = rte_lcore_id(); + struct htimer_mgr *mgr = mgr_get(lcore_id); + uint64_t expiration_time_tick = + tsc_to_tick_round_up(expiration_time_tsc); + uint64_t period_tick = + tsc_to_tick_round_up(period_tsc); + + assure_initialized(); + + rte_htw_add(mgr->htw, timer, expiration_time_tick, period_tick, + timer_cb, timer_cb_arg, flags); + + timer->owner_lcore_id = lcore_id; +} + +int +rte_htimer_mgr_cancel(struct rte_htimer *timer) +{ + unsigned int lcore_id = rte_lcore_id(); + struct htimer_mgr *mgr = mgr_get(lcore_id); + + assure_initialized(); + + RTE_ASSERT(timer->owner_lcore_id == lcore_id); + + switch (timer->state) { + case RTE_HTIMER_STATE_PENDING: + rte_htw_cancel(mgr->htw, timer); + return 0; + case RTE_HTIMER_STATE_EXPIRED: + return -ETIME; + default: + RTE_ASSERT(timer->state == RTE_HTIMER_STATE_CANCELED); + return -ENOENT; + } +} + +static int +send_msg(unsigned int receiver_lcore_id, enum rte_htimer_msg_type msg_type, + struct rte_htimer *timer, rte_htimer_mgr_async_op_cb_t async_cb, + void *async_cb_arg, const struct rte_htimer_msg_request *request, + const struct rte_htimer_msg_response *response) +{ + struct htimer_mgr *receiver_mgr; + struct rte_htimer_msg_ring *receiver_ring; + struct rte_htimer_msg msg = (struct rte_htimer_msg) { + .msg_type = msg_type, + .timer = timer, + .async_cb = async_cb, + .async_cb_arg = async_cb_arg + }; + int rc; + + if (request != NULL) + msg.request = *request; + else + msg.response = *response; + + receiver_mgr = mgr_get(receiver_lcore_id); + + receiver_ring = receiver_mgr->msg_ring; + + rc = rte_htimer_msg_ring_enqueue(receiver_ring, &msg); + + return rc; +} + +static int +send_request(unsigned int receiver_lcore_id, enum rte_htimer_msg_type msg_type, + struct rte_htimer *timer, + rte_htimer_mgr_async_op_cb_t async_cb, void *async_cb_arg) +{ + unsigned int lcore_id = rte_lcore_id(); + struct rte_htimer_msg_request request = { + .source_lcore_id = lcore_id + }; + + return send_msg(receiver_lcore_id, msg_type, timer, async_cb, + async_cb_arg, &request, NULL); +} + +static int +send_response(unsigned int receiver_lcore_id, enum rte_htimer_msg_type msg_type, + struct rte_htimer *timer, + rte_htimer_mgr_async_op_cb_t async_cb, void *async_cb_arg, + int result) +{ + struct rte_htimer_msg_response response = { + .result = result + }; + + return send_msg(receiver_lcore_id, msg_type, timer, async_cb, + async_cb_arg, NULL, &response); +} + +int +rte_htimer_mgr_async_add(struct rte_htimer *timer, + unsigned int target_lcore_id, + uint64_t expiration_time, uint64_t period, + rte_htimer_cb_t timer_cb, void *timer_cb_arg, + uint32_t flags, + rte_htimer_mgr_async_op_cb_t async_cb, + void *async_cb_arg) +{ + *timer = (struct rte_htimer) { + .expiration_time = expiration_time, + .period = period, + .owner_lcore_id = target_lcore_id, + .flags = flags, + .cb = timer_cb, + .cb_arg = timer_cb_arg + }; + + assure_initialized(); + + if (send_request(target_lcore_id, rte_htimer_msg_type_add_request, + timer, async_cb, async_cb_arg) < 0) + return -EBUSY; + + return 0; +} + +int +rte_htimer_mgr_async_cancel(struct rte_htimer *timer, + rte_htimer_mgr_async_op_cb_t async_cb, + void *async_cb_arg) +{ + if (send_request(timer->owner_lcore_id, + rte_htimer_msg_type_cancel_request, + timer, async_cb, async_cb_arg) < 0) + return -EBUSY; + + return 0; +} + +static int +process_add_request(struct rte_htimer_msg *request) +{ + struct rte_htimer *timer = request->timer; + + if (request->async_cb != NULL && + 
send_response(request->request.source_lcore_id, + rte_htimer_msg_type_add_response, timer, + request->async_cb, request->async_cb_arg, + RTE_HTIMER_MGR_ASYNC_RESULT_ADDED) < 0) + return -EBUSY; + + rte_htimer_mgr_add(timer, timer->expiration_time, timer->period, + timer->cb, timer->cb_arg, timer->flags); + + return 0; +} + +static int +process_cancel_request(struct rte_htimer_msg *request) +{ + unsigned int lcore_id = rte_lcore_id(); + struct htimer_mgr *mgr = mgr_get(lcore_id); + struct rte_htimer *timer = request->timer; + int result; + + switch (timer->state) { + case RTE_HTIMER_STATE_PENDING: + result = RTE_HTIMER_MGR_ASYNC_RESULT_CANCELED; + break; + case RTE_HTIMER_STATE_CANCELED: + result = RTE_HTIMER_MGR_ASYNC_RESULT_ALREADY_CANCELED; + break; + case RTE_HTIMER_STATE_EXPIRED: + result = RTE_HTIMER_MGR_ASYNC_RESULT_EXPIRED; + break; + default: + RTE_ASSERT(0); + result = -1; + } + + if (request->async_cb != NULL && + send_response(request->request.source_lcore_id, + rte_htimer_msg_type_cancel_response, timer, + request->async_cb, request->async_cb_arg, + result) < 0) + return -EBUSY; + + if (timer->state == RTE_HTIMER_STATE_PENDING) + rte_htw_cancel(mgr->htw, timer); + + return 0; +} + +static int +process_response(struct rte_htimer_msg *msg) +{ + struct rte_htimer_msg_response *response = &msg->response; + + if (msg->async_cb != NULL) + msg->async_cb(msg->timer, response->result, msg->async_cb_arg); + + return 0; +} + +static int +process_msg(struct rte_htimer_msg *msg) +{ + switch (msg->msg_type) { + case rte_htimer_msg_type_add_request: + return process_add_request(msg); + case rte_htimer_msg_type_cancel_request: + return process_cancel_request(msg); + case rte_htimer_msg_type_add_response: + case rte_htimer_msg_type_cancel_response: + return process_response(msg); + default: + RTE_ASSERT(0); + return -EBUSY; + } +} + +static void +dequeue_async_msgs(struct htimer_mgr *mgr) +{ + if (mgr->num_async_msgs == 0) { + unsigned int i; + + mgr->async_msgs_idx = 0; + + mgr->num_async_msgs = + rte_htimer_msg_ring_dequeue_burst(mgr->msg_ring, + mgr->async_msgs, + MAX_MSG_BATCH_SIZE); + + for (i = 0; i < mgr->num_async_msgs; i++) + rte_prefetch1(mgr->async_msgs[i].timer); + } +} + +static void +process_async(struct htimer_mgr *mgr) +{ + for (;;) { + struct rte_htimer_msg *msg; + + dequeue_async_msgs(mgr); + + if (mgr->num_async_msgs == 0) + break; + + msg = &mgr->async_msgs[mgr->async_msgs_idx]; + + if (process_msg(msg) < 0) + break; + + mgr->num_async_msgs--; + mgr->async_msgs_idx++; + } +} + +void +rte_htimer_mgr_manage_time(uint64_t current_time) +{ + unsigned int lcore_id = rte_lcore_id(); + struct htimer_mgr *mgr = mgr_get(lcore_id); + uint64_t current_tick; + + assure_initialized(); + + process_async(mgr); + + current_tick = tsc_to_tick(current_time); + + rte_htw_manage(mgr->htw, current_tick); +} + +void +rte_htimer_mgr_manage(void) +{ + uint64_t current_time; + + assure_initialized(); + + current_time = rte_get_tsc_cycles(); + + rte_htimer_mgr_manage_time(current_time); +} + +void +rte_htimer_mgr_process(void) +{ + unsigned int lcore_id = rte_lcore_id(); + struct htimer_mgr *mgr = mgr_get(lcore_id); + + process_async(mgr); + assure_initialized(); + + rte_htw_process(mgr->htw); +} + +uint64_t +rte_htimer_mgr_current_time(void) +{ + uint64_t current_tick; + + current_tick = rte_htimer_mgr_current_tick(); + + return tick_to_tsc(current_tick); +} + +uint64_t +rte_htimer_mgr_current_tick(void) +{ + unsigned int lcore_id = rte_lcore_id(); + struct htimer_mgr *mgr = mgr_get(lcore_id); + 
uint64_t current_tick;
+
+	current_tick = rte_htw_current_time(mgr->htw);
+
+	return current_tick;
+}
diff --git a/lib/htimer/rte_htimer_mgr.h b/lib/htimer/rte_htimer_mgr.h
new file mode 100644
index 0000000000..1fbd69dbf6
--- /dev/null
+++ b/lib/htimer/rte_htimer_mgr.h
@@ -0,0 +1,497 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2023 Ericsson AB
+ */
+
+#ifndef _RTE_HTIMER_MGR_H_
+#define _RTE_HTIMER_MGR_H_
+
+/**
+ * @file
+ *
+ * RTE High-performance Timer Manager
+ *
+ * The high-performance timer manager (htimer_mgr) API provides access
+ * to a low-overhead, scalable timer service.
+ *
+ * The functionality offered is similar to that of <rte_timer.h>, but
+ * the internals differ significantly, and there are slight
+ * differences in the programming interface as well.
+ *
+ * Core timer management is implemented by means of a hierarchical
+ * timer wheel (HWT), as per the Varghese and Lauck paper "Hashed
+ * and Hierarchical Timing Wheels: Data Structures for the Efficient
+ * Implementation of a Timer Facility".
+ *
+ * Varghese et al's approach is further enhanced by the placement of a
+ * bitset in front of each wheel's slots. Each slot has a
+ * corresponding bit in the bitset. If a bit is clear, there are no
+ * pending timers scheduled for that slot. A set bit means there
+ * potentially are timers scheduled for that slot. This scheme reduces
+ * the overhead of the rte_htimer_mgr_manage() function, where slots
+ * of one or more of the wheels of the thread's HWT are scanned if
+ * time has progressed since the last call. This improves performance
+ * in all cases, except for very densely populated timer wheels.
+ *
+ * One such HWT is instantiated for each lcore (EAL thread), and
+ * instances are also available for registered non-EAL threads.
+ *
+ * The API may not be called from unregistered non-EAL threads.
+ *
+ * The per-lcore-id HWT instance is private to that thread.
+ *
+ * The htimer API supports scheduling timers to a different thread
+ * (and thus, a different HWT) than the caller's. It is also possible
+ * to cancel timers managed by a "remote" timer wheel.
+ *
+ * All interaction with a remote HWT (i.e., adding timers to it, or
+ * removing timers from it) is done by sending a request, in the form
+ * of a message on a DPDK ring, to that instance. Such requests are
+ * processed and, if required, acknowledged when the remote (target)
+ * thread calls rte_htimer_mgr_manage(), rte_htimer_mgr_manage_time()
+ * or rte_htimer_mgr_process().
+ *
+ * This message-based interaction avoids comparatively heavy-weight
+ * synchronization primitives such as spinlocks. Only release-acquire
+ * type synchronization on the rings is needed.
+ *
+ * Timer memory management is the responsibility of the
+ * application. After library-level initialization has completed, no
+ * more dynamic memory is allocated by the htimer library. When
+ * installing timers on remote lcores, care must be taken by the
+ * application to avoid race conditions, in particular use-after-free
+ * (or use-after-recycle) issues of the rte_htimer structure. A timer
+ * struct may only be deallocated and/or recycled if the application
+ * can guarantee that there are no cancel requests in flight.
+ *
+ * The htimer library is able to give a definitive answer to the
+ * question of whether or not a remote timer had expired at the time
+ * of cancellation.
+ *
+ * The htimer library uses TSC as the default time source. A different
+ * time source may be used, in which case the application must
+ * explicitly provide the time using rte_htimer_mgr_manage_time().
+ * This function may also be used even if TSC is the time source, in
+ * cases where the application already possesses the current TSC time
+ * for some other purpose, avoiding the overhead of the rdtsc
+ * instruction (or its equivalent on non-x86 ISAs).
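+ *
+ * For example, a manage loop driven by an application time source
+ * might look like the sketch below, with my_clock_read() standing in
+ * for a hypothetical application-provided function returning the
+ * current monotonic time, expressed in the same unit as tsc_per_tick:
+ *
+ * @code{.c}
+ * for (;;) {
+ *         do_application_work();
+ *         rte_htimer_mgr_manage_time(my_clock_read());
+ * }
+ * @endcode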
+ *
+ * The htimer library supports periodic and single-shot timers.
+ *
+ * The timer tick defines a quantum of time in the htimer library. The
+ * length of a tick (quantified in TSC) is left to the application to
+ * specify. The core HWT implementation allows for all 64 bits to be
+ * used.
+ *
+ * Very fine-grained ticks increase the HWT overhead (since more slots
+ * need to be scanned). Very long ticks only allow for very
+ * coarse-grained timers, and in timer-heavy applications may cause
+ * load spikes when time advances into a new tick.
+ *
+ * A seemingly reasonable timer tick length ranges between 100 ns and
+ * 100 us (or maybe up to as high as 1 ms), depending on the
+ * application.
+ */
+
+#include <stdint.h>
+
+#include <rte_compat.h>
+#include <rte_htimer.h>
+
+/**
+ * The timer has been added to the timer manager on the target lcore.
+ */
+#define RTE_HTIMER_MGR_ASYNC_RESULT_ADDED 1
+
+/**
+ * The timer cancellation request has completed, before the timer expired
+ * on the target lcore.
+ */
+#define RTE_HTIMER_MGR_ASYNC_RESULT_CANCELED 2
+
+/**
+ * The timer cancellation request was denied, since the timer was
+ * already marked as canceled.
+ */
+#define RTE_HTIMER_MGR_ASYNC_RESULT_ALREADY_CANCELED 3
+
+/**
+ * At the time the cancellation request was processed on the target
+ * lcore, the timer had already expired.
+ */
+#define RTE_HTIMER_MGR_ASYNC_RESULT_EXPIRED 4
+
+typedef void (*rte_htimer_mgr_async_op_cb_t)(struct rte_htimer *timer,
+					     int result, void *cb_arg);
+
+/**
+ * Initialize the htimer library.
+ *
+ * Instantiates per-lcore (or per-registered non-EAL thread) timer
+ * wheels and other htimer library data structures, for all current
+ * and future threads.
+ *
+ * This function must be called prior to any other API call.
+ *
+ * This function may not be called if the htimer library is already
+ * initialized, but may be called multiple times, provided the library
+ * is deinitialized in between rte_htimer_mgr_init() calls.
+ *
+ * For applications not using TSC as the time source, the \c
+ * tsc_per_tick parameter denotes the number of such application
+ * time-source units per tick.
+ *
+ * This function is not multi-thread safe.
+ *
+ * @param tsc_per_tick
+ *   The length (in TSC) of a HWT tick.
+ * @return
+ *   - 0: Success
+ *   - -ENOMEM: Unable to allocate memory needed to initialize timer
+ *        subsystem
+ *
+ * @see rte_htimer_mgr_deinit()
+ * @see rte_get_tsc_hz()
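+ *
+ * A minimal single-lcore usage sketch (illustrative only; a 25 us
+ * tick, and a one-shot timer roughly 10 ms out):
+ *
+ * @code{.c}
+ * static void done_cb(struct rte_htimer *timer, void *arg)
+ * {
+ *         *(bool *)arg = true;
+ * }
+ *
+ * static void example(void)
+ * {
+ *         struct rte_htimer timer;
+ *         bool done = false;
+ *
+ *         rte_htimer_mgr_init(rte_get_tsc_hz() / 40000);
+ *
+ *         rte_htimer_mgr_add(&timer, rte_get_tsc_hz() / 100, 0,
+ *                            done_cb, &done, 0);
+ *
+ *         while (!done)
+ *                 rte_htimer_mgr_manage();
+ *
+ *         rte_htimer_mgr_deinit();
+ * }
+ * @endcode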
+ */
+
+__rte_experimental
+int
+rte_htimer_mgr_init(uint64_t tsc_per_tick);
+
+/**
+ * Deinitialize the htimer library.
+ *
+ * This function deallocates all dynamic memory used by the library,
+ * including HWT instances used by threads other than the caller.
+ *
+ * After this call has been made, no API call may be made, except
+ * rte_htimer_mgr_init().
+ *
+ * This function may not be called if the htimer library has never
+ * been initialized, or has been deinitialized but not yet initialized
+ * again.
+ *
+ * This function is not multi-thread safe. In particular, no thread
+ * may call any API functions (e.g., rte_htimer_mgr_manage()) while
+ * (or after) the htimer library is deinitialized, unless it is
+ * initialized again.
+ *
+ * @see rte_htimer_mgr_init()
+ */
+
+__rte_experimental
+void
+rte_htimer_mgr_deinit(void);
+
+/**
+ * Add a timer to the calling thread's timer wheel.
+ *
+ * This function schedules a timer on the calling thread's HWT.
+ *
+ * The \c timer_cb callback is called at a point when this thread
+ * calls rte_htimer_mgr_process(), rte_htimer_mgr_manage(), or
+ * rte_htimer_mgr_manage_time() and the expiration time has passed the
+ * current time (either as retrieved by rte_htimer_mgr_manage() or as
+ * specified by the application in rte_htimer_mgr_manage_time()).
+ *
+ * The HWT tracks time in units of \c ticks, which are likely more
+ * coarse-grained than the TSC resolution.
+ *
+ * The \c expiration_time is specified in units of TSC, and rounded up
+ * to the nearest tick. Thus, a timer with a certain expiration time
+ * (specified in TSC) may not yet have expired even though this time
+ * was supplied in a rte_htimer_mgr_manage_time() call. The maximum
+ * such error is the length of one tick (not including any delays
+ * caused by infrequent manage calls).
+ *
+ * This timer may be canceled using rte_htimer_mgr_cancel() or
+ * rte_htimer_mgr_async_cancel().
+ *
+ * rte_htimer_mgr_add() is multi-thread safe, and may only be called
+ * from an EAL thread or a registered non-EAL thread.
+ *
+ * @param timer
+ *   The chunk of memory used for managing this timer. This memory
+ *   must not be read or written (or free'd) by the application until
+ *   this timer has expired, or any cancellation attempts have
+ *   completed.
+ * @param expiration_time
+ *   The expiration time (measured in TSC). For periodical timers,
+ *   this time represents the first expiration time.
+ * @param period
+ *   The time in between periodic timer expirations (measured in TSC).
+ *   Must be set to zero unless the RTE_HTIMER_FLAG_PERIODICAL flag is
+ *   set, in which case it must be a positive integer.
+ * @param timer_cb
+ *   The timer callback to be called upon timer expiration.
+ * @param timer_cb_arg
+ *   A pointer which will be supplied back to the application in the
+ *   timer callback call.
+ * @param flags
+ *   RTE_HTIMER_FLAG_ABSOLUTE_TIME and/or RTE_HTIMER_FLAG_PERIODICAL.
+ */
+
+__rte_experimental
+void
+rte_htimer_mgr_add(struct rte_htimer *timer, uint64_t expiration_time,
+		   uint64_t period, rte_htimer_cb_t timer_cb,
+		   void *timer_cb_arg, uint32_t flags);
+
+/**
+ * Cancel a timer scheduled in the calling thread's timer wheel.
+ *
+ * This function cancels a timer scheduled on the calling thread's
+ * HWT.
+ *
+ * rte_htimer_mgr_cancel() may be called on a timer which has already
+ * (synchronously or asynchronously) been canceled, or may have
+ * expired. However, the \c rte_htimer struct pointed to by \c timer
+ * may not have been freed or recycled since.
+ *
+ * rte_htimer_mgr_cancel() may not be called for a timer that was
+ * never (or, not yet) added.
+ *
+ * A timer added using rte_htimer_mgr_async_add() may not be canceled
+ * using this function until after the add operation has completed
+ * (i.e., the completion callback has been run).
+ *
+ * rte_htimer_mgr_cancel() is multi-thread safe, and may only be
+ * called from an EAL thread or a registered non-EAL thread.
+ *
+ * @param timer
+ *   The timer to be canceled.
+ * @return
+ *   - 0: Success
+ *   - -ETIME: Timer has expired, and thus could not be canceled.
+ *   - -ENOENT: Timer was already canceled.
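+ *
+ * An illustrative sketch of handling the different return values:
+ *
+ * @code{.c}
+ * switch (rte_htimer_mgr_cancel(&timer)) {
+ * case 0:
+ *         // the timer did not run, and will not run
+ *         break;
+ * case -ETIME:
+ *         // the timer callback has already been executed
+ *         break;
+ * case -ENOENT:
+ *         // the timer had already been canceled
+ *         break;
+ * }
+ * @endcode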
+/**
+ * Cancel a timer scheduled in the calling thread's timer wheel.
+ *
+ * This function cancels a timer scheduled on the calling thread's
+ * HWT.
+ *
+ * rte_htimer_mgr_cancel() may be called on a timer which has already
+ * been (synchronously or asynchronously) canceled, or which may have
+ * expired. However, the \c rte_htimer struct pointed to by \c timer
+ * may not have been freed or recycled since.
+ *
+ * rte_htimer_mgr_cancel() may not be called for a timer that was
+ * never (or, not yet) added.
+ *
+ * A timer added using rte_htimer_mgr_async_add() may not be canceled
+ * using this function until after the add operation has completed
+ * (i.e., the completion callback has been run).
+ *
+ * rte_htimer_mgr_cancel() is multi-thread safe, and may only be
+ * called from an EAL thread or a registered non-EAL thread.
+ *
+ * @param timer
+ *   The timer to be canceled.
+ * @return
+ *   - 0: Success
+ *   - -ETIME: Timer has expired, and thus could not be canceled.
+ *   - -ENOENT: Timer was already canceled.
+ */
+
+__rte_experimental
+int
+rte_htimer_mgr_cancel(struct rte_htimer *timer);
+
+/**
+ * Asynchronously add a timer to the specified lcore's timer wheel.
+ *
+ * This function is the equivalent of rte_htimer_mgr_add(), but
+ * allows the calling ("source") thread to schedule a timer in an HWT
+ * other than its own. The operation is asynchronous.
+ *
+ * The timer works the same as a timer added locally. Thus, the \c
+ * timer_cb callback is called by the target thread, and it may be
+ * canceled using rte_htimer_mgr_cancel().
+ *
+ * The source thread may be the same as the target thread.
+ *
+ * Only EAL threads or registered non-EAL threads may be targeted.
+ *
+ * A successful rte_htimer_mgr_async_add() call guarantees that the
+ * timer will be scheduled on the target lcore at some future time,
+ * provided the target thread calls rte_htimer_mgr_process(),
+ * rte_htimer_mgr_manage(), or rte_htimer_mgr_manage_time().
+ *
+ * The \c async_cb callback is called on the source thread as a part
+ * of its rte_htimer_mgr_process(), rte_htimer_mgr_manage(), or
+ * rte_htimer_mgr_manage_time() call, when the asynchronous add
+ * operation has completed (i.e., the timer is scheduled in the
+ * target HWT).
+ *
+ * \c async_cb may be NULL, in which case no notification is given.
+ *
+ * An asynchronously added timer may be asynchronously canceled
+ * (i.e., using rte_htimer_mgr_async_cancel()) at any point, by any
+ * thread, after the rte_htimer_mgr_async_add() call. An
+ * asynchronously added timer may not be canceled using
+ * rte_htimer_mgr_cancel() until after the completion callback has
+ * been executed.
+ *
+ * rte_htimer_mgr_async_add() is multi-thread safe, and may only be
+ * called from an EAL thread or a registered non-EAL thread.
+ *
+ * @param timer
+ *   The chunk of memory used for managing this timer. This memory
+ *   must not be read or written (or freed) by the application until
+ *   this timer has expired, or any cancellation attempts have
+ *   completed.
+ * @param target_lcore_id
+ *   The lcore id of the thread whose HWT will manage this timer.
+ * @param expiration_time
+ *   The expiration time (measured in TSC). For periodical timers,
+ *   this time represents the first expiration time.
+ * @param period
+ *   The time in between periodic timer expirations (measured in
+ *   TSC). Must be set to zero unless the RTE_HTIMER_FLAG_PERIODICAL
+ *   flag is set, in which case it must be a positive integer.
+ * @param timer_cb
+ *   The timer callback to be called upon timer expiration.
+ * @param timer_cb_arg
+ *   A pointer which will be supplied back to the application in the
+ *   timer callback call.
+ * @param flags
+ *   RTE_HTIMER_FLAG_ABSOLUTE_TIME and/or RTE_HTIMER_FLAG_PERIODICAL.
+ * @param async_cb
+ *   The asynchronous operation callback, called when the add
+ *   operation has completed.
+ * @param async_cb_arg
+ *   A pointer which will be supplied back to the application in the
+ *   \c async_cb callback call.
+ * @return
+ *   - 0: Success
+ *   - -EBUSY: The maximum number of concurrently queued asynchronous
+ *     operations has been reached.
+ */
+__rte_experimental
+int
+rte_htimer_mgr_async_add(struct rte_htimer *timer,
+			 unsigned int target_lcore_id,
+			 uint64_t expiration_time, uint64_t period,
+			 rte_htimer_cb_t timer_cb, void *timer_cb_arg,
+			 uint32_t flags,
+			 rte_htimer_mgr_async_op_cb_t async_cb,
+			 void *async_cb_arg);
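+
+/*
+ * A usage sketch (illustration only; worker_lcore_id and the
+ * callbacks are made-up application names), scheduling a timer on
+ * another lcore's HWT:
+ *
+ * static void
+ * add_done_cb(struct rte_htimer *timer, int result, void *cb_arg)
+ * {
+ *         RTE_VERIFY(result == RTE_HTIMER_MGR_ASYNC_RESULT_ADDED);
+ * }
+ *
+ * rte_htimer_mgr_async_add(&timeout_timer, worker_lcore_id,
+ *                          rte_get_tsc_hz() / 1000, 0, timeout_cb,
+ *                          NULL, 0, add_done_cb, NULL);
+ */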
+/**
+ * Asynchronously cancel a timer in any thread's timer wheel.
+ *
+ * This function is the equivalent of rte_htimer_mgr_cancel(), but
+ * allows the calling ("source") thread to also cancel a timer in an
+ * HWT other than its own. The operation is asynchronous.
+ *
+ * A thread may asynchronously cancel a timer scheduled on its own
+ * HWT.
+ *
+ * The \c async_cb callback is called on the source thread as a part
+ * of its rte_htimer_mgr_process(), rte_htimer_mgr_manage(), or
+ * rte_htimer_mgr_manage_time() call, when the asynchronous cancel
+ * operation has completed (i.e., the timer is removed from the
+ * target HWT, or was found to be already canceled or expired).
+ *
+ * \c async_cb may be NULL, in which case no notification is given.
+ *
+ * A timer may be asynchronously canceled at any point, by any
+ * thread, after it has been either synchronously or asynchronously
+ * added.
+ *
+ * rte_htimer_mgr_async_cancel() is multi-thread safe, and may only
+ * be called from an EAL thread or a registered non-EAL thread.
+ *
+ * @param timer
+ *   The memory used for managing this timer. This memory must not
+ *   be read or written (or freed) by the application until this
+ *   timer has expired, or any cancellation attempts have completed.
+ * @param async_cb
+ *   The asynchronous operation callback, called when the cancel
+ *   operation has completed.
+ * @param async_cb_arg
+ *   A pointer which will be supplied back to the application in the
+ *   \c async_cb callback call.
+ * @return
+ *   - 0: Success
+ *   - -EBUSY: The maximum number of concurrently queued asynchronous
+ *     operations has been reached.
+ */
+__rte_experimental
+int
+rte_htimer_mgr_async_cancel(struct rte_htimer *timer,
+			    rte_htimer_mgr_async_op_cb_t async_cb,
+			    void *async_cb_arg);
+
+/**
+ * Update HWT time and perform timer expiry and asynchronous
+ * operation processing.
+ *
+ * This function is the equivalent of retrieving the current TSC
+ * time, and calling rte_htimer_mgr_manage_time().
+ *
+ * rte_htimer_mgr_manage() is multi-thread safe, and may only be
+ * called from an EAL thread or a registered non-EAL thread.
+ */
+
+__rte_experimental
+void
+rte_htimer_mgr_manage(void);
+
+/**
+ * Progress HWT time, and perform timer expiry and asynchronous
+ * operation processing in the process.
+ *
+ * This function progresses the calling thread's HWT up to the point
+ * specified by \c current_time, calling the callbacks of any expired
+ * timers.
+ *
+ * The time source must be a monotonic clock, and thus each new \c
+ * current_time must be equal to or greater than the time supplied in
+ * the previous call.
+ *
+ * The timer precision for timers scheduled on a particular thread's
+ * HWT depends on how frequently that thread calls this function.
+ *
+ * rte_htimer_mgr_manage_time() also performs asynchronous operation
+ * processing. See rte_htimer_mgr_process() for details.
+ *
+ * rte_htimer_mgr_manage_time() is multi-thread safe, and may only be
+ * called from an EAL thread or a registered non-EAL thread.
+ *
+ * @param current_time
+ *   The current time (usually in TSC).
+ */
+
+__rte_experimental
+void
+rte_htimer_mgr_manage_time(uint64_t current_time);
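+
+/*
+ * A typical worker lcore main loop sketch (illustration only;
+ * process_pkts() is a made-up application function), interleaving
+ * packet processing with timer management:
+ *
+ * for (;;) {
+ *         process_pkts();
+ *         rte_htimer_mgr_manage();
+ * }
+ */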
+/**
+ * Perform asynchronous operation processing.
+ *
+ * rte_htimer_mgr_process() serves pending asynchronous add or cancel
+ * requests, and produces the necessary responses. The timer
+ * callbacks of any added timers that have already expired are
+ * called.
+ *
+ * This function also processes any received asynchronous operation
+ * response messages, and calls the corresponding asynchronous
+ * callbacks, if provided by the application.
+ *
+ * rte_htimer_mgr_process() is multi-thread safe, and may only be
+ * called from an EAL thread or a registered non-EAL thread.
+ */
+
+__rte_experimental
+void
+rte_htimer_mgr_process(void);
+
+/**
+ * Return the current local HWT time in TSC.
+ *
+ * This function returns the most recent time provided by this
+ * thread, either via rte_htimer_mgr_manage_time(), or as sampled by
+ * rte_htimer_mgr_manage().
+ *
+ * The initial time, prior to any manage calls, is 0.
+ *
+ * rte_htimer_mgr_current_time() is multi-thread safe, and may only
+ * be called from an EAL thread or a registered non-EAL thread.
+ */
+
+__rte_experimental
+uint64_t
+rte_htimer_mgr_current_time(void);
+
+/**
+ * Return the current local HWT time in ticks.
+ *
+ * This function returns the current time of the calling thread's
+ * HWT, in ticks. The current tick is the current time, either as
+ * provided by the application (via rte_htimer_mgr_manage_time()) or
+ * as retrieved (using rte_get_tsc_cycles() in
+ * rte_htimer_mgr_manage()), divided by the tick length (as provided
+ * in rte_htimer_mgr_init()).
+ *
+ * The initial time, prior to any manage calls, is 0.
+ *
+ * rte_htimer_mgr_current_tick() is multi-thread safe, and may only
+ * be called from an EAL thread or a registered non-EAL thread.
+ */
+
+__rte_experimental
+uint64_t
+rte_htimer_mgr_current_tick(void);
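+
+/*
+ * The relation between the two time representations can be sketched
+ * as follows (illustration only; tsc_per_tick refers to the value
+ * passed to rte_htimer_mgr_init()). After any manage call:
+ *
+ * rte_htimer_mgr_current_tick() ==
+ *         rte_htimer_mgr_current_time() / tsc_per_tick
+ */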
+
+#endif
diff --git a/lib/htimer/rte_htimer_msg.h b/lib/htimer/rte_htimer_msg.h
new file mode 100644
index 0000000000..ceb106e263
--- /dev/null
+++ b/lib/htimer/rte_htimer_msg.h
@@ -0,0 +1,44 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2023 Ericsson AB
+ */
+
+#ifndef _RTE_HTIMER_MSG_
+#define _RTE_HTIMER_MSG_
+
+#include
+
+typedef void (*rte_htimer_msg_async_op_cb_t)(struct rte_htimer *timer,
+					     int result, void *cb_arg);
+
+typedef rte_htimer_msg_async_op_cb_t async_cb;
+
+enum rte_htimer_msg_type {
+	rte_htimer_msg_type_add_request,
+	rte_htimer_msg_type_add_response,
+	rte_htimer_msg_type_cancel_request,
+	rte_htimer_msg_type_cancel_response
+};
+
+struct rte_htimer_msg_request {
+	unsigned int source_lcore_id;
+};
+
+struct rte_htimer_msg_response {
+	int result;
+};
+
+struct rte_htimer_msg {
+	enum rte_htimer_msg_type msg_type;
+
+	struct rte_htimer *timer;
+
+	rte_htimer_msg_async_op_cb_t async_cb;
+	void *async_cb_arg;
+
+	union {
+		struct rte_htimer_msg_request request;
+		struct rte_htimer_msg_response response;
+	};
+};
+
+#endif
diff --git a/lib/htimer/rte_htimer_msg_ring.c b/lib/htimer/rte_htimer_msg_ring.c
new file mode 100644
index 0000000000..4019b7819a
--- /dev/null
+++ b/lib/htimer/rte_htimer_msg_ring.c
@@ -0,0 +1,18 @@
+#include "rte_htimer_msg_ring.h"
+
+struct rte_htimer_msg_ring *
+rte_htimer_msg_ring_create(const char *name, unsigned int count,
+			   int socket_id, unsigned int flags)
+{
+	struct rte_ring *ring =
+		rte_ring_create_elem(name, sizeof(struct rte_htimer_msg),
+				     count, socket_id, flags);
+
+	return (struct rte_htimer_msg_ring *)ring;
+}
+
+void
+rte_htimer_msg_ring_free(struct rte_htimer_msg_ring *msg_ring)
+{
+	rte_ring_free((struct rte_ring *)msg_ring);
+}
diff --git a/lib/htimer/rte_htimer_msg_ring.h b/lib/htimer/rte_htimer_msg_ring.h
new file mode 100644
index 0000000000..0e408991d1
--- /dev/null
+++ b/lib/htimer/rte_htimer_msg_ring.h
@@ -0,0 +1,49 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2023 Ericsson AB
+ */
+
+#ifndef _RTE_HTIMER_MSG_RING_
+#define _RTE_HTIMER_MSG_RING_
+
+#include
+
+#include "rte_htimer_msg.h"
+
+struct rte_htimer_msg_ring {
+	struct rte_ring ring;
+};
+
+struct rte_htimer_msg_ring *
+rte_htimer_msg_ring_create(const char *name, unsigned int count,
+			   int socket_id, unsigned int flags);
+
+void
+rte_htimer_msg_ring_free(struct rte_htimer_msg_ring *msg_ring);
+
+static inline unsigned int
+rte_htimer_msg_ring_dequeue_burst(struct rte_htimer_msg_ring *msg_ring,
+				  struct rte_htimer_msg *msgs,
+				  unsigned int n)
+{
+	unsigned int dequeued;
+
+	dequeued = rte_ring_dequeue_burst_elem(&msg_ring->ring, msgs,
+					       sizeof(struct rte_htimer_msg),
+					       n, NULL);
+
+	return dequeued;
+}
+
+static inline int
+rte_htimer_msg_ring_enqueue(struct rte_htimer_msg_ring *msg_ring,
+			    struct rte_htimer_msg *msg)
+{
+	int rc;
+
+	rc = rte_ring_enqueue_elem(&msg_ring->ring, msg,
+				   sizeof(struct rte_htimer_msg));
+
+	return rc;
+}
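+
+/*
+ * A dequeue-side usage sketch (illustration only; serve_msg() and
+ * BURST_SIZE are made-up application names):
+ *
+ * struct rte_htimer_msg msgs[BURST_SIZE];
+ * unsigned int i, n;
+ *
+ * n = rte_htimer_msg_ring_dequeue_burst(msg_ring, msgs, BURST_SIZE);
+ *
+ * for (i = 0; i < n; i++)
+ *         serve_msg(&msgs[i]);
+ */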
+
+#endif
diff --git a/lib/htimer/rte_htw.c b/lib/htimer/rte_htw.c
new file mode 100644
index 0000000000..0104dced34
--- /dev/null
+++ b/lib/htimer/rte_htw.c
@@ -0,0 +1,437 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2023 Ericsson AB
+ */
+
+/*
+ * This is an implementation of a hierarchical timer wheel based on
+ * "Hashed and Hierarchical Timing Wheels: Data Structures for the
+ * Efficient Implementation of a Timer Facility" by Varghese and
+ * Lauck.
+ *
+ * To improve efficiency when the slots are sparsely populated (i.e.,
+ * many ticks do not have any timers), each slot is represented by a
+ * bit in a separately-managed, per-wheel bitset. This allows for
+ * very efficient scanning. The cost of managing this bitset is
+ * small.
+ */
+
+/* XXX: remove */
+#include
+
+#include
+#include
+#include
+#include
+
+#include "rte_htw.h"
+
+#define TICK_BITS 64
+
+#define WHEEL_BITS 8
+#define WHEEL_SLOTS (1U << WHEEL_BITS)
+#define WHEEL_LEVELS (TICK_BITS / WHEEL_BITS)
+
+struct wheel {
+	uint64_t wheel_time;
+	RTE_BITSET_DECLARE(used_slots, WHEEL_SLOTS);
+	struct rte_htimer_list slots[WHEEL_SLOTS];
+};
+
+struct rte_htw {
+	uint64_t current_time;
+
+	struct wheel wheels[WHEEL_LEVELS];
+
+	struct rte_htimer_list added;
+	struct rte_htimer_list expiring;
+
+	struct rte_htimer *running_timer;
+};
+
+static uint64_t
+time_to_wheel_time(uint64_t t, uint16_t level)
+{
+	return t >> (level * WHEEL_BITS);
+}
+
+static uint64_t
+wheel_time_to_time(uint64_t wheel_time, uint16_t level)
+{
+	return wheel_time << (level * WHEEL_BITS);
+}
+
+static void
+wheel_init(struct wheel *wheel)
+{
+	uint16_t i;
+
+	wheel->wheel_time = 0;
+
+	rte_bitset_init(wheel->used_slots, WHEEL_SLOTS);
+
+	for (i = 0; i < WHEEL_SLOTS; i++)
+		LIST_INIT(&wheel->slots[i]);
+}
+
+static uint64_t
+list_next_timeout(struct rte_htimer_list *timers)
+{
+	struct rte_htimer *timer;
+	uint64_t candidate = UINT64_MAX;
+
+	LIST_FOREACH(timer, timers, entry)
+		candidate = RTE_MIN(timer->expiration_time, candidate);
+
+	return candidate;
+}
+
+static uint16_t
+wheel_time_to_slot(uint64_t wheel_time)
+{
+	return wheel_time % WHEEL_SLOTS;
+}
+
+static uint64_t
+wheel_current_slot_time(struct wheel *wheel, uint16_t level)
+{
+	return wheel->wheel_time << (level * WHEEL_BITS);
+}
+
+static uint64_t
+wheel_next_timeout(struct wheel *wheel, uint16_t level, uint64_t upper_bound)
+{
+	uint16_t start_slot;
+	ssize_t slot;
+
+	start_slot = wheel_current_slot_time(wheel, level);
+
+	if (wheel_time_to_time(wheel->wheel_time, level) >= upper_bound)
+		return upper_bound;
+
+	RTE_BITSET_FOREACH_SET_WRAP(slot, wheel->used_slots, WHEEL_SLOTS,
+				    start_slot, WHEEL_SLOTS) {
+		struct rte_htimer_list *timers = &wheel->slots[slot];
+		uint64_t next_timeout;
+
+		next_timeout = list_next_timeout(timers);
+
+		if (next_timeout != UINT64_MAX)
+			return next_timeout;
+	}
+
+	return UINT64_MAX;
+}
+
+static uint16_t
+get_slot(uint64_t t, uint16_t level)
+{
+	uint64_t wheel_time;
+	uint16_t slot;
+
+	wheel_time = time_to_wheel_time(t, level);
+	slot = wheel_time_to_slot(wheel_time);
+
+	return slot;
+}
+
+struct rte_htw *
+rte_htw_create(void)
+{
+	struct rte_htw *htw;
+	uint16_t level;
+
+	RTE_BUILD_BUG_ON((TICK_BITS % WHEEL_BITS) != 0);
+	RTE_BUILD_BUG_ON(sizeof(uint16_t) * CHAR_BIT <= WHEEL_BITS);
+
+	htw = rte_malloc(NULL, sizeof(struct rte_htw), RTE_CACHE_LINE_SIZE);
+
+	if (htw == NULL) {
+		rte_errno = ENOMEM;
+		return NULL;
+	}
+
+	htw->current_time = 0;
+
+	LIST_INIT(&htw->added);
+	LIST_INIT(&htw->expiring);
+
+	for (level = 0; level < WHEEL_LEVELS; level++)
+		wheel_init(&htw->wheels[level]);
+
+	return htw;
+}
+
+void
+rte_htw_destroy(struct rte_htw *htw)
+{
+	rte_free(htw);
+}
+
+static uint16_t
+get_level(uint64_t remaining_time)
+{
+	int last_set = 64 - __builtin_clzll(remaining_time);
+
+	return (last_set - 1) / WHEEL_BITS;
+}
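+
+/*
+ * An illustrative example (editorial sketch): with WHEEL_BITS == 8,
+ * a timer due in 5000 ticks has its highest set bit at position 12,
+ * so last_set is 13 and the timer initially lands on level
+ * (13 - 1) / 8 = 1, where each slot spans 256 ticks. As wheel time
+ * advances, process_slot() below reschedules it onto level 0, where
+ * each slot spans a single tick.
+ */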
+
+static void
+mark_added(struct rte_htw *htw, struct rte_htimer *timer)
+{
+	timer->state = RTE_HTIMER_STATE_PENDING;
+	LIST_INSERT_HEAD(&htw->added, timer, entry);
+}
+
+void
+rte_htw_add(struct rte_htw *htw, struct rte_htimer *timer,
+	    uint64_t expiration_time, uint64_t period,
+	    rte_htimer_cb_t timer_cb, void *timer_cb_arg, uint32_t flags)
+{
+	/* the timer's flags field is not yet assigned at this point */
+	RTE_ASSERT((flags & RTE_HTIMER_FLAG_PERIODICAL) ?
+		   period > 0 : period == 0);
+
+	if (flags & RTE_HTIMER_FLAG_ABSOLUTE_TIME)
+		timer->expiration_time = expiration_time;
+	else
+		timer->expiration_time = htw->current_time + expiration_time;
+
+	timer->period = period;
+	timer->flags = flags;
+	timer->cb = timer_cb;
+	timer->cb_arg = timer_cb_arg;
+
+	mark_added(htw, timer);
+}
+
+void
+rte_htw_cancel(struct rte_htw *htw, struct rte_htimer *timer)
+{
+	/*
+	 * One could consider clearing the relevant used_slots bit in
+	 * case this was the last entry in the wheel's slot list.
+	 * However, from a correctness point of view, a "false
+	 * positive" is not an issue. From a performance perspective,
+	 * checking the list head and clearing the bit is likely more
+	 * expensive than just deferring a minor cost to a future
+	 * rte_htw_manage() call.
+	 */
+
+	RTE_ASSERT(timer->state == RTE_HTIMER_STATE_PENDING ||
+		   timer->state == RTE_HTIMER_STATE_EXPIRED);
+
+	if (likely(timer->state == RTE_HTIMER_STATE_PENDING)) {
+		LIST_REMOVE(timer, entry);
+		timer->state = RTE_HTIMER_STATE_CANCELED;
+	} else if (timer == htw->running_timer) {
+		/* a periodical timer being canceled by its own callback */
+		RTE_ASSERT(timer->flags & RTE_HTIMER_FLAG_PERIODICAL);
+
+		timer->state = RTE_HTIMER_STATE_CANCELED;
+
+		/* signals that the running timer was canceled */
+		htw->running_timer = NULL;
+	}
+}
+
+static void
+mark_expiring(struct rte_htw *htw, struct rte_htimer *timer)
+{
+	LIST_INSERT_HEAD(&htw->expiring, timer, entry);
+}
+
+static void
+schedule_timer(struct rte_htw *htw, struct rte_htimer *timer)
+{
+	uint64_t remaining_time;
+	uint16_t level;
+	struct wheel *wheel;
+	uint16_t slot;
+	struct rte_htimer_list *slot_timers;
+
+	remaining_time = timer->expiration_time - htw->current_time;
+
+	level = get_level(remaining_time);
+
+	wheel = &htw->wheels[level];
+
+	slot = get_slot(timer->expiration_time, level);
+
+	slot_timers = &htw->wheels[level].slots[slot];
+
+	LIST_INSERT_HEAD(slot_timers, timer, entry);
+
+	rte_bitset_set(wheel->used_slots, slot);
+}
+
+static void
+process_added(struct rte_htw *htw)
+{
+	struct rte_htimer *timer;
+
+	while ((timer = LIST_FIRST(&htw->added)) != NULL) {
+		LIST_REMOVE(timer, entry);
+
+		if (timer->expiration_time > htw->current_time)
+			schedule_timer(htw, timer);
+		else
+			mark_expiring(htw, timer);
+	}
+}
+
+static void
+process_expiring(struct rte_htw *htw)
+{
+	struct rte_htimer *timer;
+
+	while ((timer = LIST_FIRST(&htw->expiring)) != NULL) {
+		bool is_periodical;
+		bool running_timer_canceled;
+
+		/*
+		 * The timer struct cannot be safely accessed after
+		 * the callback has been called (except for
+		 * non-canceled periodical timers), since the callback
+		 * may have freed (or reused) the memory.
+		 */
+
+		LIST_REMOVE(timer, entry);
+
+		is_periodical = timer->flags & RTE_HTIMER_FLAG_PERIODICAL;
+
+		timer->state = RTE_HTIMER_STATE_EXPIRED;
+
+		htw->running_timer = timer;
+
+		timer->cb(timer, timer->cb_arg);
+
+		running_timer_canceled = htw->running_timer == NULL;
+
+		htw->running_timer = NULL;
+
+		if (is_periodical && !running_timer_canceled) {
+			timer->expiration_time += timer->period;
+			mark_added(htw, timer);
+		}
+	}
+}
+
+uint64_t
+rte_htw_current_time(struct rte_htw *htw)
+{
+	return htw->current_time;
+}
+
+uint64_t
+rte_htw_next_timeout(struct rte_htw *htw, uint64_t upper_bound)
+{
+	uint16_t level;
+
+	/* scheduling the added timers sorts them in temporal order */
+	process_added(htw);
+
+	if (!LIST_EMPTY(&htw->expiring))
+		return 0;
+
+	for (level = 0; level < WHEEL_LEVELS; level++) {
+		uint64_t wheel_timeout;
+
+		wheel_timeout = wheel_next_timeout(&htw->wheels[level],
+						   level, upper_bound);
+		if (wheel_timeout != UINT64_MAX)
+			return RTE_MIN(wheel_timeout, upper_bound);
+	}
+
+	return upper_bound;
+}
+
+static __rte_always_inline void
+process_slot(struct rte_htw *htw, uint16_t level, struct wheel *wheel,
+	     uint16_t slot)
+{
+	struct rte_htimer_list *slot_timers;
+	struct rte_htimer *timer;
+
+	slot_timers = &wheel->slots[slot];
+
+	rte_bitset_clear(wheel->used_slots, slot);
+
+	while ((timer = LIST_FIRST(slot_timers)) != NULL) {
+		LIST_REMOVE(timer, entry);
+
+		if (level == 0 || timer->expiration_time <= htw->current_time)
+			mark_expiring(htw, timer);
+		else
+			schedule_timer(htw, timer);
+	}
+}
+
+static __rte_always_inline void
+process_slots(struct rte_htw *htw, uint16_t level, struct wheel *wheel,
+	      uint16_t start_slot, uint16_t num_slots)
+{
+	ssize_t slot;
+
+	RTE_BITSET_FOREACH_SET_WRAP(slot, wheel->used_slots, WHEEL_SLOTS,
+				    start_slot, num_slots)
+		process_slot(htw, level, wheel, slot);
+}
+
+static void
+advance(struct rte_htw *htw)
+{
+	uint16_t level;
+
+	for (level = 0; level < WHEEL_LEVELS; level++) {
+		struct wheel *wheel = &htw->wheels[level];
+		uint64_t new_wheel_time;
+		uint16_t start_slot;
+		uint16_t num_slots;
+
+		new_wheel_time = time_to_wheel_time(htw->current_time, level);
+
+		if (new_wheel_time == wheel->wheel_time)
+			break;
+
+		start_slot = wheel_time_to_slot(wheel->wheel_time + 1);
+		num_slots = RTE_MIN(new_wheel_time - wheel->wheel_time,
+				    WHEEL_SLOTS);
+
+		wheel->wheel_time = new_wheel_time;
+
+		process_slots(htw, level, wheel, start_slot, num_slots);
+	}
+}
+
+void
+rte_htw_manage(struct rte_htw *htw, uint64_t new_time)
+{
+	RTE_VERIFY(new_time >= htw->current_time);
+
+	/*
+	 * Scheduling added timers, core timer wheel processing, and
+	 * expiry callback execution are kept as separate stages, to
+	 * avoid having the core wheel traversal code deal with a
+	 * timer callback re-adding the timer. This split also results
+	 * in reasonable semantics for already-expired timeouts (e.g.,
+	 * with expiration time 0) added from within a timer callback:
+	 * instead of creating an endless loop, with rte_htw_manage()
+	 * never returning, the execution of such a timer is deferred
+	 * until the next rte_htw_manage() call.
+	 */
+
+	process_added(htw);
+
+	htw->current_time = new_time;
+
+	advance(htw);
+
+	process_expiring(htw);
+}
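+
+/*
+ * A minimal driving sketch (illustration only; timeout_cb is a
+ * made-up callback):
+ *
+ * struct rte_htw *htw = rte_htw_create();
+ * struct rte_htimer timer;
+ *
+ * rte_htw_add(htw, &timer, 100, 0, timeout_cb, NULL, 0);
+ *
+ * rte_htw_manage(htw, 100);
+ *
+ * The rte_htw_manage() call advances the wheel to tick 100, at
+ * which point the timer expires and timeout_cb() is run.
+ */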
+
+void
+rte_htw_process(struct rte_htw *htw)
+{
+	process_added(htw);
+	process_expiring(htw);
+}
diff --git a/lib/htimer/rte_htw.h b/lib/htimer/rte_htw.h
new file mode 100644
index 0000000000..c93358bb13
--- /dev/null
+++ b/lib/htimer/rte_htw.h
@@ -0,0 +1,49 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2023 Ericsson AB
+ */
+
+#ifndef _RTE_HTW_H_
+#define _RTE_HTW_H_
+
+#include
+#include
+
+#include
+
+#ifdef __cplusplus
extern "C" {
+#endif
+
+struct rte_htw;
+
+struct rte_htw *
+rte_htw_create(void);
+
+void
+rte_htw_destroy(struct rte_htw *htw);
+
+void
+rte_htw_add(struct rte_htw *htw, struct rte_htimer *timer,
+	    uint64_t expiration_time, uint64_t period,
+	    rte_htimer_cb_t cb, void *cb_arg, uint32_t flags);
+
+void
+rte_htw_cancel(struct rte_htw *htw, struct rte_htimer *timer);
+
+uint64_t
+rte_htw_current_time(struct rte_htw *htw);
+
+uint64_t
+rte_htw_next_timeout(struct rte_htw *htw, uint64_t upper_bound);
+
+void
+rte_htw_manage(struct rte_htw *htw, uint64_t new_time);
+
+void
+rte_htw_process(struct rte_htw *htw);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_HTW_H_ */
diff --git a/lib/htimer/version.map b/lib/htimer/version.map
new file mode 100644
index 0000000000..0e71dc7d57
--- /dev/null
+++ b/lib/htimer/version.map
@@ -0,0 +1,17 @@
+EXPERIMENTAL {
+	global:
+
+	rte_htimer_mgr_init;
+	rte_htimer_mgr_deinit;
+	rte_htimer_mgr_add;
+	rte_htimer_mgr_cancel;
+	rte_htimer_mgr_async_add;
+	rte_htimer_mgr_async_cancel;
+	rte_htimer_mgr_manage;
+	rte_htimer_mgr_manage_time;
+	rte_htimer_mgr_process;
+	rte_htimer_mgr_current_time;
+	rte_htimer_mgr_current_tick;
+
+	local: *;
+};
diff --git a/lib/meson.build b/lib/meson.build
index 2bc0932ad5..c7c0e42ae8 100644
--- a/lib/meson.build
+++ b/lib/meson.build
@@ -37,6 +37,7 @@ libraries = [
         'gpudev',
         'gro',
         'gso',
+        'htimer',
         'ip_frag',
         'jobstats',
         'kni',