From: Daniel Gregory
To: Stanislaw Kardach, Bruce Richardson
Cc: dev@dpdk.org, Punit Agrawal, Liang Ma, Pengcheng Wang, Chunsong Feng, Daniel Gregory, Stephen Hemminger
Subject: [PATCH v3 1/9] config/riscv: detect presence of Zbc extension
Date: Tue, 27 Aug 2024 16:32:21 +0100
Message-Id: <20240827153230.52880-2-daniel.gregory@bytedance.com>
In-Reply-To: <20240827153230.52880-1-daniel.gregory@bytedance.com>
References: <20240712154645.80622-1-daniel.gregory@bytedance.com> <20240827153230.52880-1-daniel.gregory@bytedance.com>

The RISC-V Zbc extension adds carry-less multiply instructions we can use to
implement more efficient CRC hashing algorithms.
The RISC-V C API defines architecture extension test macros
https://github.com/riscv-non-isa/riscv-c-api-doc/blob/main/riscv-c-api.md#architecture-extension-test-macros
These let us detect whether the Zbc extension is supported by the compiler
and -march we're building with. The C API also defines Zbc intrinsics we can
use rather than inline assembly on newer versions of GCC (14.1.0+) and Clang
(18.1.0+).

The Linux kernel exposes a RISC-V hardware probing syscall for getting
information about the system at run-time, including which extensions are
available. We detect whether this interface is present by looking for the
header, as it's only present in newer kernels (v6.4+). Furthermore, support
for detecting certain extensions, including Zbc, wasn't present until
versions after this, so we need to check the constants this header exports.

The kernel exposes bitmasks for each extension supported by the probing
interface, rather than the bit index that is set if that extension is
present, so modify the existing cpu flag HWCAP table entries to line up with
this. The values returned by the interface are 64 bits long, so grow the
hwcap registers array to be able to hold them.

If the Zbc extension and intrinsics are both present and we can detect the
Zbc extension at runtime, we define a flag, RTE_RISCV_FEATURE_ZBC.
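As an illustration of the switch from bit indices to bitmasks described
above, here is a minimal standalone sketch. The helper names
flag_set_by_bit() and flag_set_by_mask() are hypothetical and are not part
of DPDK; they only mirror the old and new return expressions in
rte_cpu_get_flag_enabled().

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative sketch only: flag_set_by_bit() and flag_set_by_mask() are
 * hypothetical helpers, not DPDK API.
 */

/* Old scheme: the table stores a bit index into the register value. */
static inline int
flag_set_by_bit(uint64_t reg_value, unsigned int bit)
{
	assert(bit < 64); /* shifting by 64 or more is undefined behaviour */
	return (reg_value >> bit) & 1;
}

/* New scheme: the table stores a bitmask, as the hwprobe header does. */
static inline int
flag_set_by_mask(uint64_t reg_value, uint64_t mask)
{
	return (reg_value & mask) != 0;
}
```

For bit index 12 and mask UINT64_C(1) << 12 the two helpers agree. A mask of
0, as used for the Zbc table entry when RTE_RISCV_FEATURE_ZBC is not
defined, never matches, so that flag always reads as absent.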
Signed-off-by: Daniel Gregory
---
 config/riscv/meson.build             |  41 ++++++++++
 lib/eal/riscv/include/rte_cpuflags.h |   2 +
 lib/eal/riscv/rte_cpuflags.c         | 112 +++++++++++++++++++--------
 3 files changed, 123 insertions(+), 32 deletions(-)

diff --git a/config/riscv/meson.build b/config/riscv/meson.build
index 07d7d9da23..5d8411b254 100644
--- a/config/riscv/meson.build
+++ b/config/riscv/meson.build
@@ -119,6 +119,47 @@ foreach flag: arch_config['machine_args']
     endif
 endforeach
 
+# check if we can do buildtime detection of extensions supported by the target
+riscv_extension_macros = false
+if (cc.get_define('__riscv_arch_test', args: machine_args) == '1')
+    message('Detected architecture extension test macros')
+    riscv_extension_macros = true
+else
+    warning('RISC-V architecture extension test macros not available. Build-time detection of extensions not possible')
+endif
+
+# check if we can use hwprobe interface for runtime extension detection
+riscv_hwprobe = false
+if (cc.check_header('asm/hwprobe.h', args: machine_args))
+    message('Detected hwprobe interface, enabling runtime detection of supported extensions')
+    machine_args += ['-DRTE_RISCV_FEATURE_HWPROBE']
+    riscv_hwprobe = true
+else
+    warning('Hwprobe interface not available (present in Linux v6.4+), instruction extensions won\'t be enabled')
+endif
+
+# detect extensions
+# RISC-V Carry-less multiplication extension (Zbc) for hardware implementations
+# of CRC-32C (lib/hash/rte_crc_riscv64.h) and CRC-32/16 (lib/net/net_crc_zbc.c).
+# Requires intrinsics available in GCC 14.1.0+ and Clang 18.1.0+
+if (riscv_extension_macros and riscv_hwprobe and
+    (cc.get_define('__riscv_zbc', args: machine_args) != ''))
+    if ((cc.get_id() == 'gcc' and cc.version().version_compare('>=14.1.0'))
+        or (cc.get_id() == 'clang' and cc.version().version_compare('>=18.1.0')))
+        # determine whether we can detect Zbc extension (this wasn't possible until
+        # Linux kernel v6.8)
+        if (cc.compiles('''#include
+            int a = RISCV_HWPROBE_EXT_ZBC;''', args: machine_args))
+            message('Compiling with the Zbc extension')
+            machine_args += ['-DRTE_RISCV_FEATURE_ZBC']
+        else
+            warning('Detected Zbc extension but cannot use because runtime detection doesn\'t support it (support present in Linux kernel v6.8+)')
+        endif
+    else
+        warning('Detected Zbc extension but cannot use because intrinsics are not available (present in GCC 14.1.0+ and Clang 18.1.0+)')
+    endif
+endif
+
 # apply flags
 foreach flag: dpdk_flags
     if flag.length() > 0
diff --git a/lib/eal/riscv/include/rte_cpuflags.h b/lib/eal/riscv/include/rte_cpuflags.h
index d742efc40f..4e26b584b3 100644
--- a/lib/eal/riscv/include/rte_cpuflags.h
+++ b/lib/eal/riscv/include/rte_cpuflags.h
@@ -42,6 +42,8 @@ enum rte_cpu_flag_t {
 	RTE_CPUFLAG_RISCV_ISA_X, /* Non-standard extension present */
 	RTE_CPUFLAG_RISCV_ISA_Y, /* Reserved */
 	RTE_CPUFLAG_RISCV_ISA_Z, /* Reserved */
+
+	RTE_CPUFLAG_RISCV_EXT_ZBC, /* Carry-less multiplication */
 };
 
 #include "generic/rte_cpuflags.h"
diff --git a/lib/eal/riscv/rte_cpuflags.c b/lib/eal/riscv/rte_cpuflags.c
index eb4105c18b..dedf0395ab 100644
--- a/lib/eal/riscv/rte_cpuflags.c
+++ b/lib/eal/riscv/rte_cpuflags.c
@@ -11,6 +11,15 @@
 #include
 #include
 #include
+#include
+
+/*
+ * when hardware probing is not possible, we assume all extensions are missing
+ * at runtime
+ */
+#ifdef RTE_RISCV_FEATURE_HWPROBE
+#include
+#endif
 
 #ifndef AT_HWCAP
 #define AT_HWCAP 16
@@ -29,54 +38,90 @@ enum cpu_register_t {
 	REG_HWCAP,
 	REG_HWCAP2,
 	REG_PLATFORM,
-	REG_MAX
+	REG_HWPROBE_IMA_EXT_0,
+	REG_MAX,
 };
 
-typedef uint32_t hwcap_registers_t[REG_MAX];
+typedef uint64_t hwcap_registers_t[REG_MAX];
 
 /**
  * Struct to hold a processor feature entry
  */
 struct feature_entry {
 	uint32_t reg;
-	uint32_t bit;
+	uint64_t mask;
 #define CPU_FLAG_NAME_MAX_LEN 64
 	char name[CPU_FLAG_NAME_MAX_LEN];
 };
 
-#define FEAT_DEF(name, reg, bit) \
-	[RTE_CPUFLAG_##name] = {reg, bit, #name},
+#define FEAT_DEF(name, reg, mask) \
+	[RTE_CPUFLAG_##name] = {reg, mask, #name},
 
 typedef Elf64_auxv_t _Elfx_auxv_t;
 
 const struct feature_entry rte_cpu_feature_table[] = {
-	FEAT_DEF(RISCV_ISA_A, REG_HWCAP, 0)
-	FEAT_DEF(RISCV_ISA_B, REG_HWCAP, 1)
-	FEAT_DEF(RISCV_ISA_C, REG_HWCAP, 2)
-	FEAT_DEF(RISCV_ISA_D, REG_HWCAP, 3)
-	FEAT_DEF(RISCV_ISA_E, REG_HWCAP, 4)
-	FEAT_DEF(RISCV_ISA_F, REG_HWCAP, 5)
-	FEAT_DEF(RISCV_ISA_G, REG_HWCAP, 6)
-	FEAT_DEF(RISCV_ISA_H, REG_HWCAP, 7)
-	FEAT_DEF(RISCV_ISA_I, REG_HWCAP, 8)
-	FEAT_DEF(RISCV_ISA_J, REG_HWCAP, 9)
-	FEAT_DEF(RISCV_ISA_K, REG_HWCAP, 10)
-	FEAT_DEF(RISCV_ISA_L, REG_HWCAP, 11)
-	FEAT_DEF(RISCV_ISA_M, REG_HWCAP, 12)
-	FEAT_DEF(RISCV_ISA_N, REG_HWCAP, 13)
-	FEAT_DEF(RISCV_ISA_O, REG_HWCAP, 14)
-	FEAT_DEF(RISCV_ISA_P, REG_HWCAP, 15)
-	FEAT_DEF(RISCV_ISA_Q, REG_HWCAP, 16)
-	FEAT_DEF(RISCV_ISA_R, REG_HWCAP, 17)
-	FEAT_DEF(RISCV_ISA_S, REG_HWCAP, 18)
-	FEAT_DEF(RISCV_ISA_T, REG_HWCAP, 19)
-	FEAT_DEF(RISCV_ISA_U, REG_HWCAP, 20)
-	FEAT_DEF(RISCV_ISA_V, REG_HWCAP, 21)
-	FEAT_DEF(RISCV_ISA_W, REG_HWCAP, 22)
-	FEAT_DEF(RISCV_ISA_X, REG_HWCAP, 23)
-	FEAT_DEF(RISCV_ISA_Y, REG_HWCAP, 24)
-	FEAT_DEF(RISCV_ISA_Z, REG_HWCAP, 25)
+	FEAT_DEF(RISCV_ISA_A, REG_HWCAP, 1 << 0)
+	FEAT_DEF(RISCV_ISA_B, REG_HWCAP, 1 << 1)
+	FEAT_DEF(RISCV_ISA_C, REG_HWCAP, 1 << 2)
+	FEAT_DEF(RISCV_ISA_D, REG_HWCAP, 1 << 3)
+	FEAT_DEF(RISCV_ISA_E, REG_HWCAP, 1 << 4)
+	FEAT_DEF(RISCV_ISA_F, REG_HWCAP, 1 << 5)
+	FEAT_DEF(RISCV_ISA_G, REG_HWCAP, 1 << 6)
+	FEAT_DEF(RISCV_ISA_H, REG_HWCAP, 1 << 7)
+	FEAT_DEF(RISCV_ISA_I, REG_HWCAP, 1 << 8)
+	FEAT_DEF(RISCV_ISA_J, REG_HWCAP, 1 << 9)
+	FEAT_DEF(RISCV_ISA_K, REG_HWCAP, 1 << 10)
+	FEAT_DEF(RISCV_ISA_L, REG_HWCAP, 1 << 11)
+	FEAT_DEF(RISCV_ISA_M, REG_HWCAP, 1 << 12)
+	FEAT_DEF(RISCV_ISA_N, REG_HWCAP, 1 << 13)
+	FEAT_DEF(RISCV_ISA_O, REG_HWCAP, 1 << 14)
+	FEAT_DEF(RISCV_ISA_P, REG_HWCAP, 1 << 15)
+	FEAT_DEF(RISCV_ISA_Q, REG_HWCAP, 1 << 16)
+	FEAT_DEF(RISCV_ISA_R, REG_HWCAP, 1 << 17)
+	FEAT_DEF(RISCV_ISA_S, REG_HWCAP, 1 << 18)
+	FEAT_DEF(RISCV_ISA_T, REG_HWCAP, 1 << 19)
+	FEAT_DEF(RISCV_ISA_U, REG_HWCAP, 1 << 20)
+	FEAT_DEF(RISCV_ISA_V, REG_HWCAP, 1 << 21)
+	FEAT_DEF(RISCV_ISA_W, REG_HWCAP, 1 << 22)
+	FEAT_DEF(RISCV_ISA_X, REG_HWCAP, 1 << 23)
+	FEAT_DEF(RISCV_ISA_Y, REG_HWCAP, 1 << 24)
+	FEAT_DEF(RISCV_ISA_Z, REG_HWCAP, 1 << 25)
+
+#ifdef RTE_RISCV_FEATURE_ZBC
+	FEAT_DEF(RISCV_EXT_ZBC, REG_HWPROBE_IMA_EXT_0, RISCV_HWPROBE_EXT_ZBC)
+#else
+	FEAT_DEF(RISCV_EXT_ZBC, REG_HWPROBE_IMA_EXT_0, 0)
+#endif
 };
+
+#ifdef RTE_RISCV_FEATURE_HWPROBE
+/*
+ * Use kernel interface for probing hardware capabilities to get extensions
+ * present on this machine
+ */
+static uint64_t
+rte_cpu_hwprobe_ima_ext(void)
+{
+	long ret;
+	struct riscv_hwprobe extensions_pair;
+
+	struct riscv_hwprobe *pairs = &extensions_pair;
+	size_t pair_count = 1;
+	/* empty set of cpus returns extensions present on all cpus */
+	cpu_set_t *cpus = NULL;
+	size_t cpusetsize = 0;
+	unsigned int flags = 0;
+
+	extensions_pair.key = RISCV_HWPROBE_KEY_IMA_EXT_0;
+	ret = syscall(__NR_riscv_hwprobe, pairs, pair_count, cpusetsize, cpus,
+		      flags);
+
+	if (ret != 0)
+		return 0;
+	return extensions_pair.value;
+}
+#endif /* RTE_RISCV_FEATURE_HWPROBE */
+
 /*
  * Read AUXV software register and get cpu features for ARM
  */
@@ -85,6 +130,9 @@ rte_cpu_get_features(hwcap_registers_t out)
 {
 	out[REG_HWCAP] = rte_cpu_getauxval(AT_HWCAP);
 	out[REG_HWCAP2] = rte_cpu_getauxval(AT_HWCAP2);
+#ifdef RTE_RISCV_FEATURE_HWPROBE
+	out[REG_HWPROBE_IMA_EXT_0] = rte_cpu_hwprobe_ima_ext();
+#endif
 }
 
 /*
@@ -104,7 +152,7 @@ rte_cpu_get_flag_enabled(enum rte_cpu_flag_t feature)
 		return -EFAULT;
 
 	rte_cpu_get_features(regs);
-	return (regs[feat->reg] >> feat->bit) & 1;
+	return (regs[feat->reg] & feat->mask) != 0;
 }
 
 const char *