From patchwork Fri Oct 9 16:02:18 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Burakov, Anatoly" X-Patchwork-Id: 80190 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 80C49A04BC; Fri, 9 Oct 2020 18:02:35 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 6945B1D6E5; Fri, 9 Oct 2020 18:02:34 +0200 (CEST) Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by dpdk.org (Postfix) with ESMTP id 1C6901D6E4 for ; Fri, 9 Oct 2020 18:02:31 +0200 (CEST) IronPort-SDR: kCs+odQxRa6+JTLVRYik3D71BKSaa0JA1sOVkedhnMrFLTXct9p0JBJUvRbiJmgYU5UcKlXleV cAt6SUdA3OAQ== X-IronPort-AV: E=McAfee;i="6000,8403,9769"; a="250197254" X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="250197254" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Oct 2020 09:02:29 -0700 IronPort-SDR: jKod50cnRcIflFf40fYr0qz61D3fzOnhdbXEd+S3jMCXpAyocqDEPyRSR42e8cwTV6hV58B+AA piVQXpNHAMAA== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="528981757" Received: from silpixa00399498.ir.intel.com (HELO silpixa00399498.ger.corp.intel.com) ([10.237.222.52]) by orsmga005.jf.intel.com with ESMTP; 09 Oct 2020 09:02:26 -0700 From: Anatoly Burakov To: dev@dpdk.org Cc: Liang Ma , Bruce Richardson , Konstantin Ananyev , david.hunt@intel.com, jerinjacobk@gmail.com, thomas@monjalon.net, timothy.mcdaniel@intel.com, gage.eads@intel.com, chris.macnamara@intel.com Date: Fri, 9 Oct 2020 17:02:18 +0100 Message-Id: <532f45c5d79b4c30a919553d322bb66e91534466.1602258833.git.anatoly.burakov@intel.com> X-Mailer: git-send-email 2.17.1 
In-Reply-To: <1601647919-25312-1-git-send-email-liang.j.ma@intel.com>
References: <1601647919-25312-1-git-send-email-liang.j.ma@intel.com>
Subject: [dpdk-dev] [PATCH v5 01/10] eal: add new x86 cpuid support for WAITPKG
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

From: Liang Ma

Add new x86 cpuid support for WAITPKG.
This flag indicates that the processor supports the UMONITOR/UMWAIT/TPAUSE
instructions.

Signed-off-by: Liang Ma
Signed-off-by: Anatoly Burakov
Acked-by: Konstantin Ananyev
---
 lib/librte_eal/x86/include/rte_cpuflags.h | 2 ++
 lib/librte_eal/x86/rte_cpuflags.c         | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/lib/librte_eal/x86/include/rte_cpuflags.h b/lib/librte_eal/x86/include/rte_cpuflags.h
index c1d20364d1..5041a830a7 100644
--- a/lib/librte_eal/x86/include/rte_cpuflags.h
+++ b/lib/librte_eal/x86/include/rte_cpuflags.h
@@ -132,6 +132,8 @@ enum rte_cpu_flag_t {
 	RTE_CPUFLAG_MOVDIR64B, /**< Direct Store Instructions 64B */
 	RTE_CPUFLAG_AVX512VP2INTERSECT, /**< AVX512 Two Register Intersection */
+	/**< UMWAIT/TPAUSE Instructions */
+	RTE_CPUFLAG_WAITPKG, /**< UMONITOR/UMWAIT/TPAUSE */
 	/* The last item */
 	RTE_CPUFLAG_NUMFLAGS, /**< This should always be the last!
*/ }; diff --git a/lib/librte_eal/x86/rte_cpuflags.c b/lib/librte_eal/x86/rte_cpuflags.c index 30439e7951..0325c4b93b 100644 --- a/lib/librte_eal/x86/rte_cpuflags.c +++ b/lib/librte_eal/x86/rte_cpuflags.c @@ -110,6 +110,8 @@ const struct feature_entry rte_cpu_feature_table[] = { FEAT_DEF(AVX512F, 0x00000007, 0, RTE_REG_EBX, 16) FEAT_DEF(RDSEED, 0x00000007, 0, RTE_REG_EBX, 18) + FEAT_DEF(WAITPKG, 0x00000007, 0, RTE_REG_ECX, 5) + FEAT_DEF(LAHF_SAHF, 0x80000001, 0, RTE_REG_ECX, 0) FEAT_DEF(LZCNT, 0x80000001, 0, RTE_REG_ECX, 4) From patchwork Fri Oct 9 16:02:19 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Burakov, Anatoly" X-Patchwork-Id: 80191 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id F2FE1A04BC; Fri, 9 Oct 2020 18:02:55 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 321FA1D713; Fri, 9 Oct 2020 18:02:37 +0200 (CEST) Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by dpdk.org (Postfix) with ESMTP id 6AF941D6E8 for ; Fri, 9 Oct 2020 18:02:33 +0200 (CEST) IronPort-SDR: 8uOB8YLW3aog3ez0CKngLRCFxqAPuWf74EFmKnv1lr0Zx0QSY7rHV2z+I3QbuxkzQWi2ES1UiV j/RyJOCDANYg== X-IronPort-AV: E=McAfee;i="6000,8403,9769"; a="250197281" X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="250197281" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Oct 2020 09:02:32 -0700 IronPort-SDR: n5QY+O9hc4IcqakAINI1Zw/b6ThFYqpjCsXEPqqnIGA4VQayjB4rfeiJJR3TmlpMPUnaO3JGkv a35I+wYlL/lw== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="528981779" Received: from silpixa00399498.ir.intel.com 
(HELO silpixa00399498.ger.corp.intel.com) ([10.237.222.52]) by orsmga005.jf.intel.com with ESMTP; 09 Oct 2020 09:02:29 -0700 From: Anatoly Burakov To: dev@dpdk.org Cc: Liang Ma , Jan Viktorin , Ruifeng Wang , David Christensen , Bruce Richardson , Konstantin Ananyev , david.hunt@intel.com, jerinjacobk@gmail.com, thomas@monjalon.net, timothy.mcdaniel@intel.com, gage.eads@intel.com, chris.macnamara@intel.com Date: Fri, 9 Oct 2020 17:02:19 +0100 Message-Id: X-Mailer: git-send-email 2.17.1 In-Reply-To: <532f45c5d79b4c30a919553d322bb66e91534466.1602258833.git.anatoly.burakov@intel.com> References: <532f45c5d79b4c30a919553d322bb66e91534466.1602258833.git.anatoly.burakov@intel.com> In-Reply-To: <1601647919-25312-1-git-send-email-liang.j.ma@intel.com> References: <1601647919-25312-1-git-send-email-liang.j.ma@intel.com> Subject: [dpdk-dev] [PATCH v5 02/10] eal: add power management intrinsics X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" From: Liang Ma Add two new power management intrinsics, and provide an implementation in eal/x86 based on UMONITOR/UMWAIT instructions. The instructions are implemented as raw byte opcodes because there is not yet widespread compiler support for these instructions. The power management instructions provide an architecture-specific function to either wait until a specified TSC timestamp is reached, or optionally wait until either a TSC timestamp is reached or a memory location is written to. The monitor function also provides an optional comparison, to avoid sleeping when the expected write has already happened, and no more writes are expected. For more details, please refer to Intel(R) 64 and IA-32 Architectures Software Developer's Manual, Volume 2. 
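
As an illustration only (not part of this patch): the optional comparison
described above can be sketched in plain C. The helper names
`wait_can_be_skipped` and `split_tsc_deadline` are hypothetical and do not
exist in DPDK; the second helper assumes the deadline is handed over as two
32-bit halves, as the x86 implementation in this patch does.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true when the masked value already equals
 * the expected value, i.e. the awaited write has happened and entering a
 * power-optimized state would be pointless.
 */
static inline bool
wait_can_be_skipped(uint64_t cur_value, uint64_t expected_value,
		uint64_t value_mask)
{
	/* A zero mask disables the comparison entirely. */
	if (value_mask == 0)
		return false;
	return (cur_value & value_mask) == expected_value;
}

/*
 * Hypothetical helper: split a 64-bit TSC deadline into low and high
 * 32-bit halves.
 */
static inline void
split_tsc_deadline(uint64_t tsc_timestamp, uint32_t *lo, uint32_t *hi)
{
	*lo = (uint32_t)tsc_timestamp;
	*hi = (uint32_t)(tsc_timestamp >> 32);
}
```

Keeping the comparison separate from the wait itself makes the "mask of zero
means no check" rule easy to test without WAITPKG-capable hardware.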
Signed-off-by: Liang Ma Signed-off-by: Anatoly Burakov Acked-by: David Christensen --- Notes: v5: - Removed return values - Simplified intrinsics and hardcoded C0.2 state - Added other arch stubs lib/librte_eal/arm/include/meson.build | 1 + .../arm/include/rte_power_intrinsics.h | 62 ++++++++++ .../include/generic/rte_power_intrinsics.h | 61 ++++++++++ lib/librte_eal/include/meson.build | 1 + lib/librte_eal/ppc/include/meson.build | 1 + .../ppc/include/rte_power_intrinsics.h | 62 ++++++++++ lib/librte_eal/x86/include/meson.build | 1 + .../x86/include/rte_power_intrinsics.h | 106 ++++++++++++++++++ 8 files changed, 295 insertions(+) create mode 100644 lib/librte_eal/arm/include/rte_power_intrinsics.h create mode 100644 lib/librte_eal/include/generic/rte_power_intrinsics.h create mode 100644 lib/librte_eal/ppc/include/rte_power_intrinsics.h create mode 100644 lib/librte_eal/x86/include/rte_power_intrinsics.h diff --git a/lib/librte_eal/arm/include/meson.build b/lib/librte_eal/arm/include/meson.build index 73b750a18f..c6a9f70d73 100644 --- a/lib/librte_eal/arm/include/meson.build +++ b/lib/librte_eal/arm/include/meson.build @@ -20,6 +20,7 @@ arch_headers = files( 'rte_pause_32.h', 'rte_pause_64.h', 'rte_pause.h', + 'rte_power_intrinsics.h', 'rte_prefetch_32.h', 'rte_prefetch_64.h', 'rte_prefetch.h', diff --git a/lib/librte_eal/arm/include/rte_power_intrinsics.h b/lib/librte_eal/arm/include/rte_power_intrinsics.h new file mode 100644 index 0000000000..4aad44a0b9 --- /dev/null +++ b/lib/librte_eal/arm/include/rte_power_intrinsics.h @@ -0,0 +1,62 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2020 Intel Corporation + */ + +#ifndef _RTE_POWER_INTRINSIC_ARM_H_ +#define _RTE_POWER_INTRINSIC_ARM_H_ + +#ifdef __cplusplus +extern "C" { +#endif + +#include +#include + +#include "generic/rte_power_intrinsics.h" + +/** + * This function is not supported on ARM. + * + * @param p + * Address to monitor for changes. Must be aligned on an 64-byte boundary. 
+ * @param expected_value + * Before attempting the monitoring, the `p` address may be read and compared + * against this value. If `value_mask` is zero, this step will be skipped. + * @param value_mask + * The 64-bit mask to use to extract current value from `p`. + * @param tsc_timestamp + * Maximum TSC timestamp to wait for. + * + * @return + * - 0 on success + */ +static inline void rte_power_monitor(const volatile void *p, + const uint64_t expected_value, const uint64_t value_mask, + const uint64_t tsc_timestamp) +{ + RTE_SET_USED(p); + RTE_SET_USED(expected_value); + RTE_SET_USED(value_mask); + RTE_SET_USED(tsc_timestamp); +} + +/** + * This function is not supported on ARM. + * + * @param tsc_timestamp + * Maximum TSC timestamp to wait for. + * + * @return + * - 1 if wakeup was due to TSC timeout expiration. + * - 0 if wakeup was due to other reasons. + */ +static inline void rte_power_pause(const uint64_t tsc_timestamp) +{ + RTE_SET_USED(tsc_timestamp); +} + +#ifdef __cplusplus +} +#endif + +#endif /* _RTE_POWER_INTRINSIC_ARM_H_ */ diff --git a/lib/librte_eal/include/generic/rte_power_intrinsics.h b/lib/librte_eal/include/generic/rte_power_intrinsics.h new file mode 100644 index 0000000000..e36c1f8976 --- /dev/null +++ b/lib/librte_eal/include/generic/rte_power_intrinsics.h @@ -0,0 +1,61 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2020 Intel Corporation + */ + +#ifndef _RTE_POWER_INTRINSIC_H_ +#define _RTE_POWER_INTRINSIC_H_ + +#include + +/** + * @file + * Advanced power management operations. + * + * This file define APIs for advanced power management, + * which are architecture-dependent. + */ + +/** + * Monitor specific address for changes. This will cause the CPU to enter an + * architecture-defined optimized power state until either the specified + * memory address is written to, a certain TSC timestamp is reached, or other + * reasons cause the CPU to wake up. 
+ * + * Additionally, an `expected` 64-bit value and 64-bit mask are provided. If + * mask is non-zero, the current value pointed to by the `p` pointer will be + * checked against the expected value, and if they match, the entering of + * optimized power state may be aborted. + * + * @param p + * Address to monitor for changes. Must be aligned on an 64-byte boundary. + * @param expected_value + * Before attempting the monitoring, the `p` address may be read and compared + * against this value. If `value_mask` is zero, this step will be skipped. + * @param value_mask + * The 64-bit mask to use to extract current value from `p`. + * @param tsc_timestamp + * Maximum TSC timestamp to wait for. Note that the wait behavior is + * architecture-dependent. + * + * @return + * - 0 on success + * - -ENOTSUP if not supported + */ +static inline void rte_power_monitor(const volatile void *p, + const uint64_t expected_value, const uint64_t value_mask, + const uint64_t tsc_timestamp); + +/** + * Enter an architecture-defined optimized power state until a certain TSC + * timestamp is reached. + * + * @param tsc_timestamp + * Maximum TSC timestamp to wait for. Note that the wait behavior is + * architecture-dependent. + * + * @return + * Architecture-dependent return value. 
+ */ +static inline void rte_power_pause(const uint64_t tsc_timestamp); + +#endif /* _RTE_POWER_INTRINSIC_H_ */ diff --git a/lib/librte_eal/include/meson.build b/lib/librte_eal/include/meson.build index cd09027958..3a12e87e19 100644 --- a/lib/librte_eal/include/meson.build +++ b/lib/librte_eal/include/meson.build @@ -60,6 +60,7 @@ generic_headers = files( 'generic/rte_memcpy.h', 'generic/rte_pause.h', 'generic/rte_prefetch.h', + 'generic/rte_power_intrinsics.h', 'generic/rte_rwlock.h', 'generic/rte_spinlock.h', 'generic/rte_ticketlock.h', diff --git a/lib/librte_eal/ppc/include/meson.build b/lib/librte_eal/ppc/include/meson.build index ab4bd28092..0873b2aecb 100644 --- a/lib/librte_eal/ppc/include/meson.build +++ b/lib/librte_eal/ppc/include/meson.build @@ -10,6 +10,7 @@ arch_headers = files( 'rte_io.h', 'rte_memcpy.h', 'rte_pause.h', + 'rte_power_intrinsics.h', 'rte_prefetch.h', 'rte_rwlock.h', 'rte_spinlock.h', diff --git a/lib/librte_eal/ppc/include/rte_power_intrinsics.h b/lib/librte_eal/ppc/include/rte_power_intrinsics.h new file mode 100644 index 0000000000..70fd7b094f --- /dev/null +++ b/lib/librte_eal/ppc/include/rte_power_intrinsics.h @@ -0,0 +1,62 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2020 Intel Corporation + */ + +#ifndef _RTE_POWER_INTRINSIC_PPC_H_ +#define _RTE_POWER_INTRINSIC_PPC_H_ + +#ifdef __cplusplus +extern "C" { +#endif + +#include +#include + +#include "generic/rte_power_intrinsics.h" + +/** + * This function is not supported on PPC64. + * + * @param p + * Address to monitor for changes. Must be aligned on an 64-byte boundary. + * @param expected_value + * Before attempting the monitoring, the `p` address may be read and compared + * against this value. If `value_mask` is zero, this step will be skipped. + * @param value_mask + * The 64-bit mask to use to extract current value from `p`. + * @param tsc_timestamp + * Maximum TSC timestamp to wait for. 
+ * + * @return + * - 0 on success + */ +static inline void rte_power_monitor(const volatile void *p, + const uint64_t expected_value, const uint64_t value_mask, + const uint64_t tsc_timestamp) +{ + RTE_SET_USED(p); + RTE_SET_USED(expected_value); + RTE_SET_USED(value_mask); + RTE_SET_USED(tsc_timestamp); +} + +/** + * This function is not supported on PPC64. + * + * @param tsc_timestamp + * Maximum TSC timestamp to wait for. + * + * @return + * - 1 if wakeup was due to TSC timeout expiration. + * - 0 if wakeup was due to other reasons. + */ +static inline void rte_power_pause(const uint64_t tsc_timestamp) +{ + RTE_SET_USED(tsc_timestamp); +} + +#ifdef __cplusplus +} +#endif + +#endif /* _RTE_POWER_INTRINSIC_PPC_H_ */ diff --git a/lib/librte_eal/x86/include/meson.build b/lib/librte_eal/x86/include/meson.build index f0e998c2fe..494a8142a2 100644 --- a/lib/librte_eal/x86/include/meson.build +++ b/lib/librte_eal/x86/include/meson.build @@ -13,6 +13,7 @@ arch_headers = files( 'rte_io.h', 'rte_memcpy.h', 'rte_prefetch.h', + 'rte_power_intrinsics.h', 'rte_pause.h', 'rte_rtm.h', 'rte_rwlock.h', diff --git a/lib/librte_eal/x86/include/rte_power_intrinsics.h b/lib/librte_eal/x86/include/rte_power_intrinsics.h new file mode 100644 index 0000000000..8d579eaf64 --- /dev/null +++ b/lib/librte_eal/x86/include/rte_power_intrinsics.h @@ -0,0 +1,106 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2020 Intel Corporation + */ + +#ifndef _RTE_POWER_INTRINSIC_X86_64_H_ +#define _RTE_POWER_INTRINSIC_X86_64_H_ + +#ifdef __cplusplus +extern "C" { +#endif + +#include +#include + +#include "generic/rte_power_intrinsics.h" + +/** + * Monitor specific address for changes. This will cause the CPU to enter an + * architecture-defined optimized power state until either the specified + * memory address is written to, a certain TSC timestamp is reached, or other + * reasons cause the CPU to wake up. + * + * Additionally, an `expected` 64-bit value and 64-bit mask are provided. 
If + * mask is non-zero, the current value pointed to by the `p` pointer will be + * checked against the expected value, and if they match, the entering of + * optimized power state may be aborted. + * + * This function uses UMONITOR/UMWAIT instructions and will enter C0.2 state. + * For more information about usage of these instructions, please refer to + * Intel(R) 64 and IA-32 Architectures Software Developer's Manual. + * + * @param p + * Address to monitor for changes. Must be aligned on an 64-byte boundary. + * @param expected_value + * Before attempting the monitoring, the `p` address may be read and compared + * against this value. If `value_mask` is zero, this step will be skipped. + * @param value_mask + * The 64-bit mask to use to extract current value from `p`. + * @param tsc_timestamp + * Maximum TSC timestamp to wait for. + * + * @return + * - 0 on success + */ +static inline void rte_power_monitor(const volatile void *p, + const uint64_t expected_value, const uint64_t value_mask, + const uint64_t tsc_timestamp) +{ + const uint32_t tsc_l = (uint32_t)tsc_timestamp; + const uint32_t tsc_h = (uint32_t)(tsc_timestamp >> 32); + /* + * we're using raw byte codes for now as only the newest compiler + * versions support this instruction natively. + */ + + /* set address for UMONITOR */ + asm volatile(".byte 0xf3, 0x0f, 0xae, 0xf7;" + : + : "D"(p)); + + if (value_mask) { + const uint64_t cur_value = *(const volatile uint64_t *)p; + const uint64_t masked = cur_value & value_mask; + /* if the masked value is already matching, abort */ + if (masked == expected_value) + return; + } + /* execute UMWAIT */ + asm volatile(".byte 0xf2, 0x0f, 0xae, 0xf7;" + : /* ignore rflags */ + : "D"(0), /* enter C0.2 */ + "a"(tsc_l), "d"(tsc_h)); +} + +/** + * Enter an architecture-defined optimized power state until a certain TSC + * timestamp is reached. + * + * This function uses TPAUSE instruction and will enter C0.2 state. 
For more + * information about usage of this instruction, please refer to Intel(R) 64 and + * IA-32 Architectures Software Developer's Manual. + * + * @param tsc_timestamp + * Maximum TSC timestamp to wait for. + * + * @return + * - 1 if wakeup was due to TSC timeout expiration. + * - 0 if wakeup was due to other reasons. + */ +static inline void rte_power_pause(const uint64_t tsc_timestamp) +{ + const uint32_t tsc_l = (uint32_t)tsc_timestamp; + const uint32_t tsc_h = (uint32_t)(tsc_timestamp >> 32); + + /* execute TPAUSE */ + asm volatile(".byte 0x66, 0x0f, 0xae, 0xf7;" + : /* ignore rflags */ + : "D"(0), /* enter C0.2 */ + "a"(tsc_l), "d"(tsc_h)); +} + +#ifdef __cplusplus +} +#endif + +#endif /* _RTE_POWER_INTRINSIC_X86_64_H_ */ From patchwork Fri Oct 9 16:02:20 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Burakov, Anatoly" X-Patchwork-Id: 80192 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id F0870A04BC; Fri, 9 Oct 2020 18:03:25 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 6F6C21D722; Fri, 9 Oct 2020 18:02:41 +0200 (CEST) Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by dpdk.org (Postfix) with ESMTP id 726FE1D717 for ; Fri, 9 Oct 2020 18:02:37 +0200 (CEST) IronPort-SDR: 87aZmHtd4AsC4H+Dy/hsSlseDgrT0kUn2BaERXhpIWK7XSeOmNtuHrj3wfYJj/DstDQXuxht/1 lTVgN+rD6tmA== X-IronPort-AV: E=McAfee;i="6000,8403,9769"; a="250197322" X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="250197322" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Oct 2020 09:02:37 -0700 IronPort-SDR: 
SVZinkiOjgqutzKf+g4StzvHt/5509L3b8H6qoisTHMlHgK21eij8WIwFLowSmHwb4jx0Jf+Bn Yp+dpvpb2Lgw== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="528981842" Received: from silpixa00399498.ir.intel.com (HELO silpixa00399498.ger.corp.intel.com) ([10.237.222.52]) by orsmga005.jf.intel.com with ESMTP; 09 Oct 2020 09:02:32 -0700 From: Anatoly Burakov To: dev@dpdk.org Cc: Jan Viktorin , Ruifeng Wang , David Christensen , Ray Kinsella , Neil Horman , Bruce Richardson , Konstantin Ananyev , david.hunt@intel.com, liang.j.ma@intel.com, jerinjacobk@gmail.com, thomas@monjalon.net, timothy.mcdaniel@intel.com, gage.eads@intel.com, chris.macnamara@intel.com Date: Fri, 9 Oct 2020 17:02:20 +0100 Message-Id: X-Mailer: git-send-email 2.17.1 In-Reply-To: <532f45c5d79b4c30a919553d322bb66e91534466.1602258833.git.anatoly.burakov@intel.com> References: <532f45c5d79b4c30a919553d322bb66e91534466.1602258833.git.anatoly.burakov@intel.com> In-Reply-To: <1601647919-25312-1-git-send-email-liang.j.ma@intel.com> References: <1601647919-25312-1-git-send-email-liang.j.ma@intel.com> Subject: [dpdk-dev] [PATCH v5 03/10] eal: add intrinsics support check infrastructure X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Currently, it is not possible to check support for intrinsics that are platform-specific, cannot be abstracted in a generic way, or do not have support on all architectures. The CPUID flags can be used to some extent, but they are only defined for their platform, while intrinsics will be available to all code as they are in generic headers. This patch introduces infrastructure to check support for certain platform-specific intrinsics, and adds support for checking support for IA power management-related intrinsics for UMWAIT/UMONITOR and TPAUSE. 
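
A minimal sketch of the intended usage pattern, assuming a caller that falls
back to busy polling when the intrinsic is unavailable. Everything prefixed
`mock_` is a stand-in written for this example so that it compiles without
DPDK; the real structure and query function are the ones added by this patch,
and the `has_waitpkg` parameter is a hypothetical substitute for the CPUID
lookup.

```c
#include <stdint.h>
#include <string.h>

/* Mock mirroring the layout of the structure proposed in this patch. */
struct mock_cpu_intrinsics {
	uint32_t power_monitor : 1;
	uint32_t power_pause : 1;
};

/*
 * Mock of the query function. has_waitpkg is a hypothetical parameter
 * standing in for the result of the WAITPKG CPUID check.
 */
static void
mock_get_intrinsics_support(struct mock_cpu_intrinsics *intrinsics,
		int has_waitpkg)
{
	memset(intrinsics, 0, sizeof(*intrinsics));
	if (has_waitpkg) {
		intrinsics->power_monitor = 1;
		intrinsics->power_pause = 1;
	}
}

/* Hypothetical application-side choice between two wait strategies. */
enum wait_strategy {
	WAIT_BUSY_POLL,
	WAIT_POWER_MONITOR,
};

static enum wait_strategy
choose_wait_strategy(int has_waitpkg)
{
	struct mock_cpu_intrinsics intr;

	/* Query once, then branch on the bit for the intrinsic needed. */
	mock_get_intrinsics_support(&intr, has_waitpkg);
	return intr.power_monitor ? WAIT_POWER_MONITOR : WAIT_BUSY_POLL;
}
```

Performing the query once at setup time, rather than before every call, keeps
the check out of the fast path.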
Signed-off-by: Anatoly Burakov
Acked-by: David Christensen
---
 .../arm/include/rte_power_intrinsics.h        |  8 ++++++
 lib/librte_eal/arm/rte_cpuflags.c             |  6 +++++
 lib/librte_eal/include/generic/rte_cpuflags.h | 26 +++++++++++++++++++
 .../include/generic/rte_power_intrinsics.h    |  8 ++++++
 .../ppc/include/rte_power_intrinsics.h        |  8 ++++++
 lib/librte_eal/ppc/rte_cpuflags.c             |  6 +++++
 lib/librte_eal/rte_eal_version.map            |  1 +
 .../x86/include/rte_power_intrinsics.h        |  8 ++++++
 lib/librte_eal/x86/rte_cpuflags.c             | 12 +++++++++
 9 files changed, 83 insertions(+)

diff --git a/lib/librte_eal/arm/include/rte_power_intrinsics.h b/lib/librte_eal/arm/include/rte_power_intrinsics.h
index 4aad44a0b9..055ec5877a 100644
--- a/lib/librte_eal/arm/include/rte_power_intrinsics.h
+++ b/lib/librte_eal/arm/include/rte_power_intrinsics.h
@@ -17,6 +17,10 @@ extern "C" {
 /**
  * This function is not supported on ARM.
  *
+ * @warning It is the responsibility of the user to check if this function is
+ * supported at runtime using the `rte_cpu_get_intrinsics_support()` API call.
+ * Failing to do so may result in an illegal CPU instruction error.
+ *
  * @param p
  *   Address to monitor for changes. Must be aligned on an 64-byte boundary.
  * @param expected_value
@@ -43,6 +47,10 @@ static inline void rte_power_monitor(const volatile void *p,
 /**
  * This function is not supported on ARM.
  *
+ * @warning It is the responsibility of the user to check if this function is
+ * supported at runtime using the `rte_cpu_get_intrinsics_support()` API call.
+ * Failing to do so may result in an illegal CPU instruction error.
+ *
  * @param tsc_timestamp
  *   Maximum TSC timestamp to wait for.
* diff --git a/lib/librte_eal/arm/rte_cpuflags.c b/lib/librte_eal/arm/rte_cpuflags.c index caf3dc83a5..7eef11fa02 100644 --- a/lib/librte_eal/arm/rte_cpuflags.c +++ b/lib/librte_eal/arm/rte_cpuflags.c @@ -138,3 +138,9 @@ rte_cpu_get_flag_name(enum rte_cpu_flag_t feature) return NULL; return rte_cpu_feature_table[feature].name; } + +void +rte_cpu_get_intrinsics_support(struct rte_cpu_intrinsics *intrinsics) +{ + memset(intrinsics, 0, sizeof(*intrinsics)); +} diff --git a/lib/librte_eal/include/generic/rte_cpuflags.h b/lib/librte_eal/include/generic/rte_cpuflags.h index 872f0ebe3e..28a5aecde8 100644 --- a/lib/librte_eal/include/generic/rte_cpuflags.h +++ b/lib/librte_eal/include/generic/rte_cpuflags.h @@ -13,6 +13,32 @@ #include "rte_common.h" #include +#include + +/** + * Structure used to describe platform-specific intrinsics that may or may not + * be supported at runtime. + */ +struct rte_cpu_intrinsics { + uint32_t power_monitor : 1; + /**< indicates support for rte_power_monitor function */ + uint32_t power_pause : 1; + /**< indicates support for rte_power_pause function */ +}; + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice + * + * Check CPU support for various intrinsics at runtime. + * + * @param intrinsics + * Pointer to a structure to be filled. + */ +__rte_experimental +void +rte_cpu_get_intrinsics_support(struct rte_cpu_intrinsics *intrinsics); + /** * Enumeration of all CPU features supported */ diff --git a/lib/librte_eal/include/generic/rte_power_intrinsics.h b/lib/librte_eal/include/generic/rte_power_intrinsics.h index e36c1f8976..218eda7e86 100644 --- a/lib/librte_eal/include/generic/rte_power_intrinsics.h +++ b/lib/librte_eal/include/generic/rte_power_intrinsics.h @@ -26,6 +26,10 @@ * checked against the expected value, and if they match, the entering of * optimized power state may be aborted. 
  *
+ * @warning It is the responsibility of the user to check if this function is
+ * supported at runtime using the `rte_cpu_get_intrinsics_support()` API call.
+ * Failing to do so may result in an illegal CPU instruction error.
+ *
  * @param p
  *   Address to monitor for changes. Must be aligned on an 64-byte boundary.
  * @param expected_value
@@ -49,6 +53,10 @@ static inline void rte_power_monitor(const volatile void *p,
  * Enter an architecture-defined optimized power state until a certain TSC
  * timestamp is reached.
  *
+ * @warning It is the responsibility of the user to check if this function is
+ * supported at runtime using the `rte_cpu_get_intrinsics_support()` API call.
+ * Failing to do so may result in an illegal CPU instruction error.
+ *
  * @param tsc_timestamp
  *   Maximum TSC timestamp to wait for. Note that the wait behavior is
  *   architecture-dependent.
diff --git a/lib/librte_eal/ppc/include/rte_power_intrinsics.h b/lib/librte_eal/ppc/include/rte_power_intrinsics.h
index 70fd7b094f..d63ad86849 100644
--- a/lib/librte_eal/ppc/include/rte_power_intrinsics.h
+++ b/lib/librte_eal/ppc/include/rte_power_intrinsics.h
@@ -17,6 +17,10 @@ extern "C" {
 /**
  * This function is not supported on PPC64.
  *
+ * @warning It is the responsibility of the user to check if this function is
+ * supported at runtime using the `rte_cpu_get_intrinsics_support()` API call.
+ * Failing to do so may result in an illegal CPU instruction error.
+ *
  * @param p
  *   Address to monitor for changes. Must be aligned on an 64-byte boundary.
  * @param expected_value
@@ -43,6 +47,10 @@ static inline void rte_power_monitor(const volatile void *p,
 /**
  * This function is not supported on PPC64.
  *
+ * @warning It is the responsibility of the user to check if this function is
+ * supported at runtime using the `rte_cpu_get_intrinsics_support()` API call.
+ * Failing to do so may result in an illegal CPU instruction error.
+ *
  * @param tsc_timestamp
  *   Maximum TSC timestamp to wait for.
  *
diff --git a/lib/librte_eal/ppc/rte_cpuflags.c b/lib/librte_eal/ppc/rte_cpuflags.c
index 3bb7563ce9..eee8234384 100644
--- a/lib/librte_eal/ppc/rte_cpuflags.c
+++ b/lib/librte_eal/ppc/rte_cpuflags.c
@@ -108,3 +108,9 @@ rte_cpu_get_flag_name(enum rte_cpu_flag_t feature)
 		return NULL;
 	return rte_cpu_feature_table[feature].name;
 }
+
+void
+rte_cpu_get_intrinsics_support(struct rte_cpu_intrinsics *intrinsics)
+{
+	memset(intrinsics, 0, sizeof(*intrinsics));
+}
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index a93dea9fe6..ed944f2bd4 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -400,6 +400,7 @@ EXPERIMENTAL {
 	# added in 20.11
 	__rte_eal_trace_generic_size_t;
 	rte_service_lcore_may_be_active;
+	rte_cpu_get_intrinsics_support;
 };
 INTERNAL {
diff --git a/lib/librte_eal/x86/include/rte_power_intrinsics.h b/lib/librte_eal/x86/include/rte_power_intrinsics.h
index 8d579eaf64..3afc165a1f 100644
--- a/lib/librte_eal/x86/include/rte_power_intrinsics.h
+++ b/lib/librte_eal/x86/include/rte_power_intrinsics.h
@@ -29,6 +29,10 @@ extern "C" {
  * For more information about usage of these instructions, please refer to
  * Intel(R) 64 and IA-32 Architectures Software Developer's Manual.
  *
+ * @warning It is the responsibility of the user to check if this function is
+ * supported at runtime using the `rte_cpu_get_intrinsics_support()` API call.
+ * Failing to do so may result in an illegal CPU instruction error.
+ *
  * @param p
  *   Address to monitor for changes. Must be aligned on an 64-byte boundary.
  * @param expected_value
@@ -80,6 +84,10 @@ static inline void rte_power_monitor(const volatile void *p,
  * information about usage of this instruction, please refer to Intel(R) 64 and
  * IA-32 Architectures Software Developer's Manual.
  *
+ * @warning It is the responsibility of the user to check if this function is
+ * supported at runtime using the `rte_cpu_get_intrinsics_support()` API call.
Failing to do + * so may result in an illegal CPU instruction error. + * * @param tsc_timestamp * Maximum TSC timestamp to wait for. * diff --git a/lib/librte_eal/x86/rte_cpuflags.c b/lib/librte_eal/x86/rte_cpuflags.c index 0325c4b93b..a96312ff7f 100644 --- a/lib/librte_eal/x86/rte_cpuflags.c +++ b/lib/librte_eal/x86/rte_cpuflags.c @@ -7,6 +7,7 @@ #include #include #include +#include #include "rte_cpuid.h" @@ -179,3 +180,14 @@ rte_cpu_get_flag_name(enum rte_cpu_flag_t feature) return NULL; return rte_cpu_feature_table[feature].name; } + +void +rte_cpu_get_intrinsics_support(struct rte_cpu_intrinsics *intrinsics) +{ + memset(intrinsics, 0, sizeof(*intrinsics)); + + if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_WAITPKG)) { + intrinsics->power_monitor = 1; + intrinsics->power_pause = 1; + } +} From patchwork Fri Oct 9 16:02:21 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Burakov, Anatoly" X-Patchwork-Id: 80193 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 9F191A04BC; Fri, 9 Oct 2020 18:03:49 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id E6ED71D728; Fri, 9 Oct 2020 18:02:44 +0200 (CEST) Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by dpdk.org (Postfix) with ESMTP id C2A251D727 for ; Fri, 9 Oct 2020 18:02:42 +0200 (CEST) IronPort-SDR: bFY2FOpLpxOvM8dvfVKi51y/jD8dd4t7PpQmDnPLUTuunNNuaCJ8v/vWiDYprnY4fGxjE+lpvF 1PhqlNYFNepg== X-IronPort-AV: E=McAfee;i="6000,8403,9769"; a="250197349" X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="250197349" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga105.fm.intel.com with 
ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Oct 2020 09:02:41 -0700 IronPort-SDR: EeTgId3zgrL9dGZoFlysz7wz82FcGJKwj/Llp3fWhVHqsHeCkv2SSm6SMkHzqfnc9hNtYsgxoT 74RY/Axt/HVQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="528981877" Received: from silpixa00399498.ir.intel.com (HELO silpixa00399498.ger.corp.intel.com) ([10.237.222.52]) by orsmga005.jf.intel.com with ESMTP; 09 Oct 2020 09:02:37 -0700 From: Anatoly Burakov To: dev@dpdk.org Cc: Liang Ma , Thomas Monjalon , Ferruh Yigit , Andrew Rybchenko , Ray Kinsella , Neil Horman , david.hunt@intel.com, konstantin.ananyev@intel.com, jerinjacobk@gmail.com, bruce.richardson@intel.com, timothy.mcdaniel@intel.com, gage.eads@intel.com, chris.macnamara@intel.com Date: Fri, 9 Oct 2020 17:02:21 +0100 Message-Id: <931cbea6d091f16a51ad7eed736b4b6e69df93aa.1602258833.git.anatoly.burakov@intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <532f45c5d79b4c30a919553d322bb66e91534466.1602258833.git.anatoly.burakov@intel.com> References: <532f45c5d79b4c30a919553d322bb66e91534466.1602258833.git.anatoly.burakov@intel.com> In-Reply-To: <1601647919-25312-1-git-send-email-liang.j.ma@intel.com> References: <1601647919-25312-1-git-send-email-liang.j.ma@intel.com> Subject: [dpdk-dev] [PATCH v5 04/10] ethdev: add simple power management API X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" From: Liang Ma Add a simple API to allow getting address of next RX descriptor from the PMD, as well as release notes information. 
Signed-off-by: Liang Ma Signed-off-by: Anatoly Burakov --- Notes: v5: - Bring function format in line with other functions in the file - Ensure the API is supported by the driver before calling it (Konstantin) doc/guides/rel_notes/release_20_11.rst | 16 ++++++++++++++ lib/librte_ethdev/rte_ethdev.c | 17 ++++++++++++++ lib/librte_ethdev/rte_ethdev.h | 24 ++++++++++++++++++++ lib/librte_ethdev/rte_ethdev_driver.h | 28 ++++++++++++++++++++++++ lib/librte_ethdev/rte_ethdev_version.map | 1 + 5 files changed, 86 insertions(+) diff --git a/doc/guides/rel_notes/release_20_11.rst b/doc/guides/rel_notes/release_20_11.rst index 808bdc4e54..e85af5d3e9 100644 --- a/doc/guides/rel_notes/release_20_11.rst +++ b/doc/guides/rel_notes/release_20_11.rst @@ -55,6 +55,11 @@ New Features Also, make sure to start the actual text at the margin. ======================================================= +* **ethdev: add 1 new EXPERIMENTAL API for PMD power management.** + + * ``rte_eth_get_wake_addr()`` + * add new eth_dev_ops ``get_wake_addr`` + * **Updated Broadcom bnxt driver.** Updated the Broadcom bnxt driver with new features and improvements, including: @@ -136,6 +141,17 @@ New Features * Extern objects and functions can be plugged into the pipeline. * Transaction-oriented table updates. +* **Add PMD power management mechanism** + + Three new Ethernet PMD power management mechanisms are added through the + existing RX callback infrastructure.
+ + * Add power saving scheme based on UMWAIT instruction (x86 only) + * Add power saving scheme based on ``rte_pause()`` + * Add power saving scheme based on frequency scaling through the power library + * Add new EXPERIMENTAL API ``rte_power_pmd_mgmt_queue_enable()`` + * Add new EXPERIMENTAL API ``rte_power_pmd_mgmt_queue_disable()`` + Removed Items ------------- diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c index 48d1333b17..352108f43c 100644 --- a/lib/librte_ethdev/rte_ethdev.c +++ b/lib/librte_ethdev/rte_ethdev.c @@ -4804,6 +4804,23 @@ rte_eth_tx_burst_mode_get(uint16_t port_id, uint16_t queue_id, dev->dev_ops->tx_burst_mode_get(dev, queue_id, mode)); } +int +rte_eth_get_wake_addr(uint16_t port_id, uint16_t queue_id, + volatile void **wake_addr, uint64_t *expected, uint64_t *mask) +{ + struct rte_eth_dev *dev; + + RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV); + + dev = &rte_eth_devices[port_id]; + + RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->get_wake_addr, -ENOTSUP); + + return eth_err(port_id, + dev->dev_ops->get_wake_addr(dev->data->rx_queues[queue_id], + wake_addr, expected, mask)); +} + int rte_eth_dev_set_mc_addr_list(uint16_t port_id, struct rte_ether_addr *mc_addr_set, diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h index d2bf74f128..a6cfe3cd57 100644 --- a/lib/librte_ethdev/rte_ethdev.h +++ b/lib/librte_ethdev/rte_ethdev.h @@ -4014,6 +4014,30 @@ __rte_experimental int rte_eth_tx_burst_mode_get(uint16_t port_id, uint16_t queue_id, struct rte_eth_burst_mode *mode); +/** + * Retrieve the wake-up address of a specific Rx queue. + * + * @param port_id + * The port identifier of the Ethernet device. + * @param queue_id + * The Rx queue on the Ethernet device for which information + * will be retrieved. + * @param wake_addr + * Pointer that receives the address to be monitored. + * @param expected + * Pointer that receives the value to expect once the descriptor is written.
+ * @param mask + * Pointer that receives the comparison bitmask for the expected value. + * + * @return + * - 0: Success. + * - -EINVAL: Failed to get wake address. + */ +__rte_experimental +int rte_eth_get_wake_addr(uint16_t port_id, uint16_t queue_id, + volatile void **wake_addr, + uint64_t *expected, uint64_t *mask); + /** * Retrieve device registers and register attributes (number of registers and * register size) diff --git a/lib/librte_ethdev/rte_ethdev_driver.h b/lib/librte_ethdev/rte_ethdev_driver.h index c3062c246c..935d46f25c 100644 --- a/lib/librte_ethdev/rte_ethdev_driver.h +++ b/lib/librte_ethdev/rte_ethdev_driver.h @@ -574,6 +574,31 @@ typedef int (*eth_tx_hairpin_queue_setup_t) uint16_t nb_tx_desc, const struct rte_eth_hairpin_conf *hairpin_conf); +/** + * @internal + * Get the wake-up address. + * + * @param rxq + * Ethdev queue pointer. + * @param tail_desc_addr + * Pointer that receives the descriptor address. + * @param expected + * Pointer that receives the value to expect once the descriptor is written. + * @param mask + * Pointer that receives the comparison bitmask for the expected value. + * @return + * Negative errno value on error, 0 on success. + * + * @retval 0 + * Success. + * @retval -EINVAL + * Failed to get descriptor address. + */ +typedef int (*eth_get_wake_addr_t) + (void *rxq, volatile void **tail_desc_addr, + uint64_t *expected, uint64_t *mask); + + /** * @internal A structure containing the functions exported by an Ethernet driver. */ @@ -713,6 +738,9 @@ struct eth_dev_ops { /**< Set up device RX hairpin queue. */ eth_tx_hairpin_queue_setup_t tx_hairpin_queue_setup; /**< Set up device TX hairpin queue. */ + eth_get_wake_addr_t get_wake_addr; + /**< Get wake up address.
*/ + }; /** diff --git a/lib/librte_ethdev/rte_ethdev_version.map b/lib/librte_ethdev/rte_ethdev_version.map index c95ef5157a..3cb2093980 100644 --- a/lib/librte_ethdev/rte_ethdev_version.map +++ b/lib/librte_ethdev/rte_ethdev_version.map @@ -229,6 +229,7 @@ EXPERIMENTAL { # added in 20.11 rte_eth_link_speed_to_str; rte_eth_link_to_str; + rte_eth_get_wake_addr; }; INTERNAL { From patchwork Fri Oct 9 16:02:22 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Burakov, Anatoly" X-Patchwork-Id: 80194 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 0D220A04BC; Fri, 9 Oct 2020 18:04:12 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 7F0A61D72F; Fri, 9 Oct 2020 18:02:49 +0200 (CEST) Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by dpdk.org (Postfix) with ESMTP id 497FE1D72C for ; Fri, 9 Oct 2020 18:02:45 +0200 (CEST) IronPort-SDR: M8/ulO1IyR5Q1BfTq7nnC8hr4cgO7/DlwjkpujdJZL/e3VQhzzFX0fFjMHojE81jcC8XZqwDbW top+W06I+74w== X-IronPort-AV: E=McAfee;i="6000,8403,9769"; a="250197375" X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="250197375" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Oct 2020 09:02:44 -0700 IronPort-SDR: VtNdzPr7g2vvqTmtbG7zGQ8QjRoANOFc+O50VLTup8u1TpiilnKDvPNH2aVgscF4b+nAWFbyF1 v1aIP1usIe6A== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="528981897" Received: from silpixa00399498.ir.intel.com (HELO silpixa00399498.ger.corp.intel.com) ([10.237.222.52]) by orsmga005.jf.intel.com with ESMTP; 09 Oct 2020 09:02:41 -0700 From: Anatoly Burakov To: 
dev@dpdk.org Cc: Liang Ma , David Hunt , Ray Kinsella , Neil Horman , konstantin.ananyev@intel.com, jerinjacobk@gmail.com, bruce.richardson@intel.com, thomas@monjalon.net, timothy.mcdaniel@intel.com, gage.eads@intel.com, chris.macnamara@intel.com Date: Fri, 9 Oct 2020 17:02:22 +0100 Message-Id: <4831f3a979c41eb542994c0ad2b64f46eb818939.1602258833.git.anatoly.burakov@intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <532f45c5d79b4c30a919553d322bb66e91534466.1602258833.git.anatoly.burakov@intel.com> References: <532f45c5d79b4c30a919553d322bb66e91534466.1602258833.git.anatoly.burakov@intel.com> In-Reply-To: <1601647919-25312-1-git-send-email-liang.j.ma@intel.com> References: <1601647919-25312-1-git-send-email-liang.j.ma@intel.com> Subject: [dpdk-dev] [PATCH v5 05/10] power: add PMD power management API and callback X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" From: Liang Ma Add a simple on/off switch that will enable saving power when no packets are arriving. It is based on counting the number of empty polls and, when the number reaches a certain threshold, entering an architecture-defined optimized power state that will wait until either a TSC timestamp expires or packets arrive. This API mandates a core-to-single-queue mapping (that is, multiple queues per device are supported, but they have to be polled on different cores). This design uses PMD RX callbacks. 1. UMWAIT/UMONITOR: When a certain threshold of empty polls is reached, the core will go into a power optimized sleep while waiting for the address of the next RX descriptor to be written to. 2. Pause instruction Instead of moving the core into a deeper C state, this method uses the pause instruction to avoid busy polling. 3.
Frequency scaling Reuse existing DPDK power library to scale up/down core frequency depending on traffic volume. Signed-off-by: Liang Ma Signed-off-by: Anatoly Burakov --- Notes: v5: - Make error checking more robust - Prevent initializing scaling if ACPI or PSTATE env wasn't set - Prevent initializing UMWAIT path if PMD doesn't support get_wake_addr - Add some debug logging - Replace x86-specific code path to generic path using the intrinsic check lib/librte_power/meson.build | 5 +- lib/librte_power/pmd_mgmt.h | 38 ++++ lib/librte_power/rte_power_pmd_mgmt.c | 244 +++++++++++++++++++++++++ lib/librte_power/rte_power_pmd_mgmt.h | 88 +++++++++ lib/librte_power/rte_power_version.map | 4 + 5 files changed, 377 insertions(+), 2 deletions(-) create mode 100644 lib/librte_power/pmd_mgmt.h create mode 100644 lib/librte_power/rte_power_pmd_mgmt.c create mode 100644 lib/librte_power/rte_power_pmd_mgmt.h diff --git a/lib/librte_power/meson.build b/lib/librte_power/meson.build index 78c031c943..cc3c7a8646 100644 --- a/lib/librte_power/meson.build +++ b/lib/librte_power/meson.build @@ -9,6 +9,7 @@ sources = files('rte_power.c', 'power_acpi_cpufreq.c', 'power_kvm_vm.c', 'guest_channel.c', 'rte_power_empty_poll.c', 'power_pstate_cpufreq.c', + 'rte_power_pmd_mgmt.c', 'power_common.c') -headers = files('rte_power.h','rte_power_empty_poll.h') -deps += ['timer'] +headers = files('rte_power.h','rte_power_empty_poll.h','rte_power_pmd_mgmt.h') +deps += ['timer' ,'ethdev'] diff --git a/lib/librte_power/pmd_mgmt.h b/lib/librte_power/pmd_mgmt.h new file mode 100644 index 0000000000..20be53bacf --- /dev/null +++ b/lib/librte_power/pmd_mgmt.h @@ -0,0 +1,38 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2010-2020 Intel Corporation + */ + +#ifndef _PMD_MGMT_H +#define _PMD_MGMT_H + +/** + * @file + * Power Management + */ + +/** + * Possible power management states of an ethdev port. + */ +enum pmd_mgmt_state { + /** Device power management is disabled. 
*/ + PMD_MGMT_DISABLED = 0, + /** Device power management is enabled. */ + PMD_MGMT_ENABLED, +}; + +struct pmd_queue_cfg { + enum pmd_mgmt_state pwr_mgmt_state; + /**< Power mgmt Callback mode */ + enum rte_power_pmd_mgmt_type cb_mode; + /**< Empty poll number */ + uint16_t empty_poll_stats; + /**< Callback instance */ + const struct rte_eth_rxtx_callback *cur_cb; +} __rte_cache_aligned; + +struct pmd_port_cfg { + int ref_cnt; + struct pmd_queue_cfg *queue_cfg; +} __rte_cache_aligned; + +#endif diff --git a/lib/librte_power/rte_power_pmd_mgmt.c b/lib/librte_power/rte_power_pmd_mgmt.c new file mode 100644 index 0000000000..07dfe7c077 --- /dev/null +++ b/lib/librte_power/rte_power_pmd_mgmt.c @@ -0,0 +1,244 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2010-2020 Intel Corporation + */ + +#include +#include +#include +#include +#include +#include + +#include "rte_power_pmd_mgmt.h" +#include "pmd_mgmt.h" + + +#define EMPTYPOLL_MAX 512 +#define PAUSE_NUM 64 + +static struct pmd_port_cfg port_cfg[RTE_MAX_ETHPORTS]; + +static uint16_t +rte_power_mgmt_umwait(uint16_t port_id, uint16_t qidx, + struct rte_mbuf **pkts __rte_unused, uint16_t nb_rx, + uint16_t max_pkts __rte_unused, void *_ __rte_unused) +{ + + struct pmd_queue_cfg *q_conf; + q_conf = &port_cfg[port_id].queue_cfg[qidx]; + + if (unlikely(nb_rx == 0)) { + q_conf->empty_poll_stats++; + if (unlikely(q_conf->empty_poll_stats > EMPTYPOLL_MAX)) { + volatile void *target_addr; + uint64_t expected, mask; + int ret; + + /* + * get address of next descriptor in the RX + * ring for this queue, as well as expected + * value and a mask.
+ */ + ret = rte_eth_get_wake_addr(port_id, qidx, + &target_addr, &expected, &mask); + if (ret == 0) + /* -1ULL is maximum value for TSC */ + rte_power_monitor(target_addr, expected, + mask, -1ULL); + } + } else + q_conf->empty_poll_stats = 0; + + return nb_rx; +} + +static uint16_t +rte_power_mgmt_pause(uint16_t port_id, uint16_t qidx, + struct rte_mbuf **pkts __rte_unused, uint16_t nb_rx, + uint16_t max_pkts __rte_unused, void *_ __rte_unused) +{ + struct pmd_queue_cfg *q_conf; + int i; + q_conf = &port_cfg[port_id].queue_cfg[qidx]; + + if (unlikely(nb_rx == 0)) { + q_conf->empty_poll_stats++; + if (unlikely(q_conf->empty_poll_stats > EMPTYPOLL_MAX)) { + for (i = 0; i < PAUSE_NUM; i++) + rte_pause(); + } + } else + q_conf->empty_poll_stats = 0; + + return nb_rx; +} + +static uint16_t +rte_power_mgmt_scalefreq(uint16_t port_id, uint16_t qidx, + struct rte_mbuf **pkts __rte_unused, uint16_t nb_rx, + uint16_t max_pkts __rte_unused, void *_ __rte_unused) +{ + struct pmd_queue_cfg *q_conf; + q_conf = &port_cfg[port_id].queue_cfg[qidx]; + + if (unlikely(nb_rx == 0)) { + q_conf->empty_poll_stats++; + if (unlikely(q_conf->empty_poll_stats > EMPTYPOLL_MAX)) { + /* scale down freq */ + rte_power_freq_min(rte_lcore_id()); + + } + } else { + q_conf->empty_poll_stats = 0; + /* scale up freq */ + rte_power_freq_max(rte_lcore_id()); + } + + return nb_rx; +} + +int +rte_power_pmd_mgmt_queue_enable(unsigned int lcore_id, + uint16_t port_id, + uint16_t queue_id, + enum rte_power_pmd_mgmt_type mode) +{ + struct rte_eth_dev *dev; + struct pmd_queue_cfg *queue_cfg; + int ret = 0; + + RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL); + dev = &rte_eth_devices[port_id]; + + if (port_cfg[port_id].queue_cfg == NULL) { + port_cfg[port_id].ref_cnt = 0; + /* allocate zeroed memory so every queue starts as DISABLED */ + port_cfg[port_id].queue_cfg = rte_zmalloc_socket(NULL, + sizeof(struct pmd_queue_cfg) + * RTE_MAX_QUEUES_PER_PORT, + 0, dev->data->numa_node); + if (port_cfg[port_id].queue_cfg == NULL) + return
-ENOMEM; + } + + queue_cfg = &port_cfg[port_id].queue_cfg[queue_id]; + + if (queue_cfg->pwr_mgmt_state == PMD_MGMT_ENABLED) { + ret = -EINVAL; + goto failure_handler; + } + + switch (mode) { + case RTE_POWER_MGMT_TYPE_WAIT: + { + /* check if rte_power_monitor is supported */ + uint64_t dummy_expected, dummy_mask; + struct rte_cpu_intrinsics i; + void *dummy_addr; + + rte_cpu_get_intrinsics_support(&i); + + if (!i.power_monitor) { + RTE_LOG(DEBUG, POWER, "Monitoring intrinsics are not supported\n"); + ret = -ENOTSUP; + goto failure_handler; + } + + /* check if the device supports the necessary PMD API */ + if (rte_eth_get_wake_addr(port_id, queue_id, &dummy_addr, + &dummy_expected, &dummy_mask) == -ENOTSUP) { + RTE_LOG(DEBUG, POWER, "The device does not support get_wake_addr\n"); + ret = -ENOTSUP; + goto failure_handler; + } + + queue_cfg->cur_cb = rte_eth_add_rx_callback(port_id, queue_id, + rte_power_mgmt_umwait, NULL); + break; + } + case RTE_POWER_MGMT_TYPE_SCALE: + { + enum power_management_env env; + /* only PSTATE and ACPI modes are supported */ + if (!rte_power_check_env_supported(PM_ENV_ACPI_CPUFREQ) && + !rte_power_check_env_supported(PM_ENV_PSTATE_CPUFREQ)) { + RTE_LOG(DEBUG, POWER, "Neither ACPI nor PSTATE modes are supported\n"); + ret = -ENOTSUP; + goto failure_handler; + } + /* ensure we could initialize the power library */ + if (rte_power_init(lcore_id)) { + ret = -EINVAL; + goto failure_handler; + } + /* ensure we initialized the correct env */ + env = rte_power_get_env(); + if (env != PM_ENV_ACPI_CPUFREQ && + env != PM_ENV_PSTATE_CPUFREQ) { + RTE_LOG(DEBUG, POWER, "Neither ACPI nor PSTATE modes were initialized\n"); + ret = -ENOTSUP; + goto failure_handler; + } + queue_cfg->cur_cb = rte_eth_add_rx_callback(port_id, queue_id, + rte_power_mgmt_scalefreq, NULL); + break; + } + case RTE_POWER_MGMT_TYPE_PAUSE: + queue_cfg->cur_cb = rte_eth_add_rx_callback(port_id, queue_id, + rte_power_mgmt_pause, NULL); + break; + } + queue_cfg->cb_mode = mode; + 
port_cfg[port_id].ref_cnt++; + queue_cfg->pwr_mgmt_state = PMD_MGMT_ENABLED; + return ret; + +failure_handler: + if (port_cfg[port_id].ref_cnt == 0) { + rte_free(port_cfg[port_id].queue_cfg); + port_cfg[port_id].queue_cfg = NULL; + } + return ret; +} + +int +rte_power_pmd_mgmt_queue_disable(unsigned int lcore_id, + uint16_t port_id, + uint16_t queue_id) +{ + struct pmd_queue_cfg *queue_cfg; + + if (port_cfg[port_id].ref_cnt <= 0) + return -EINVAL; + + queue_cfg = &port_cfg[port_id].queue_cfg[queue_id]; + + if (queue_cfg->pwr_mgmt_state == PMD_MGMT_DISABLED) + return -EINVAL; + + switch (queue_cfg->cb_mode) { + case RTE_POWER_MGMT_TYPE_WAIT: + case RTE_POWER_MGMT_TYPE_PAUSE: + rte_eth_remove_rx_callback(port_id, queue_id, + queue_cfg->cur_cb); + break; + case RTE_POWER_MGMT_TYPE_SCALE: + rte_power_freq_max(lcore_id); + rte_eth_remove_rx_callback(port_id, queue_id, + queue_cfg->cur_cb); + rte_power_exit(lcore_id); + break; + } + /* It is not recommended to free the callback instance here. + * Leaving it allocated leaks memory, which is a known issue. + */ + queue_cfg->cur_cb = NULL; + queue_cfg->pwr_mgmt_state = PMD_MGMT_DISABLED; + port_cfg[port_id].ref_cnt--; + + if (port_cfg[port_id].ref_cnt == 0) { + rte_free(port_cfg[port_id].queue_cfg); + port_cfg[port_id].queue_cfg = NULL; + } + return 0; +} diff --git a/lib/librte_power/rte_power_pmd_mgmt.h b/lib/librte_power/rte_power_pmd_mgmt.h new file mode 100644 index 0000000000..8b110f1148 --- /dev/null +++ b/lib/librte_power/rte_power_pmd_mgmt.h @@ -0,0 +1,88 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2010-2020 Intel Corporation + */ + +#ifndef _RTE_POWER_PMD_MGMT_H +#define _RTE_POWER_PMD_MGMT_H + +/** + * @file + * RTE PMD Power Management + */ +#include +#include + +#include +#include +#include +#include +#include + +#ifdef __cplusplus +extern "C" { +#endif + +/** + * PMD Power Management Type + */ +enum rte_power_pmd_mgmt_type { + /** WAIT callback mode. */ + RTE_POWER_MGMT_TYPE_WAIT = 1, + /** PAUSE callback mode.
*/ + RTE_POWER_MGMT_TYPE_PAUSE, + /** Freq Scaling callback mode. */ + RTE_POWER_MGMT_TYPE_SCALE, +}; + +/** + * @warning + * @b EXPERIMENTAL: this API may change, or be removed, without prior notice + * + * Setup per-queue power management callback. + * @param lcore_id + * lcore_id. + * @param port_id + * The port identifier of the Ethernet device. + * @param queue_id + * The queue identifier of the Ethernet device. + * @param mode + * The power management callback function type. + + * @return + * 0 on success + * <0 on error + */ + +__rte_experimental +int +rte_power_pmd_mgmt_queue_enable(unsigned int lcore_id, + uint16_t port_id, + uint16_t queue_id, + enum rte_power_pmd_mgmt_type mode); + +/** + * @warning + * @b EXPERIMENTAL: this API may change, or be removed, without prior notice + * + * Remove per-queue power management callback. + * @param lcore_id + * lcore_id. + * @param port_id + * The port identifier of the Ethernet device. + * @param queue_id + * The queue identifier of the Ethernet device. 
+ * @return + * 0 on success + * <0 on error + */ + +__rte_experimental +int +rte_power_pmd_mgmt_queue_disable(unsigned int lcore_id, + uint16_t port_id, + uint16_t queue_id); +#ifdef __cplusplus +} +#endif + +#endif diff --git a/lib/librte_power/rte_power_version.map b/lib/librte_power/rte_power_version.map index 69ca9af616..3f2f6cd6f6 100644 --- a/lib/librte_power/rte_power_version.map +++ b/lib/librte_power/rte_power_version.map @@ -34,4 +34,8 @@ EXPERIMENTAL { rte_power_guest_channel_receive_msg; rte_power_poll_stat_fetch; rte_power_poll_stat_update; + # added in 20.11 + rte_power_pmd_mgmt_queue_enable; + rte_power_pmd_mgmt_queue_disable; + }; From patchwork Fri Oct 9 16:02:23 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Burakov, Anatoly" X-Patchwork-Id: 80195 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 97A2FA04BC; Fri, 9 Oct 2020 18:04:41 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 67D971D8CC; Fri, 9 Oct 2020 18:02:51 +0200 (CEST) Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by dpdk.org (Postfix) with ESMTP id 3CF871D72E for ; Fri, 9 Oct 2020 18:02:48 +0200 (CEST) IronPort-SDR: wfPzpk4Jmzy/eGi25XGBMgntGpv5y0tuCGgjVF9225bbtGHAe3bxaQlEbeY3OMECEOJFJk0mGI LGIg7GuLT1xw== X-IronPort-AV: E=McAfee;i="6000,8403,9769"; a="250197400" X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="250197400" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Oct 2020 09:02:47 -0700 IronPort-SDR: uhr7YlIrHs8LI1a2IG9mITKXpB8yvrOkUmavPnqJ5t0PTf2lHDBHF7NUb4+penOD0hXFQRjPSQ N8mUaNB+sLVQ== 
X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="528981923" Received: from silpixa00399498.ir.intel.com (HELO silpixa00399498.ger.corp.intel.com) ([10.237.222.52]) by orsmga005.jf.intel.com with ESMTP; 09 Oct 2020 09:02:44 -0700 From: Anatoly Burakov To: dev@dpdk.org Cc: Liang Ma , Jeff Guo , Haiyue Wang , david.hunt@intel.com, konstantin.ananyev@intel.com, jerinjacobk@gmail.com, bruce.richardson@intel.com, thomas@monjalon.net, timothy.mcdaniel@intel.com, gage.eads@intel.com, chris.macnamara@intel.com Date: Fri, 9 Oct 2020 17:02:23 +0100 Message-Id: X-Mailer: git-send-email 2.17.1 In-Reply-To: <532f45c5d79b4c30a919553d322bb66e91534466.1602258833.git.anatoly.burakov@intel.com> References: <532f45c5d79b4c30a919553d322bb66e91534466.1602258833.git.anatoly.burakov@intel.com> In-Reply-To: <1601647919-25312-1-git-send-email-liang.j.ma@intel.com> References: <1601647919-25312-1-git-send-email-liang.j.ma@intel.com> Subject: [dpdk-dev] [PATCH v5 06/10] net/ixgbe: implement power management API X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" From: Liang Ma Implement support for the power management API by implementing a `get_wake_addr` function that will return an address of an RX ring's status bit. 
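As a rough illustration of what "an address of an RX ring's status bit" means, the sketch below models the idea with made-up types. Everything prefixed fake_ or FAKE_ is hypothetical and is not taken from the ixgbe driver. The descriptor layout and the position of the "descriptor done" bit are assumptions chosen for readability, and the byte-order conversion a real driver performs is omitted (a little-endian host is assumed).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FAKE_STAT_DD 0x01u /* assumed "descriptor done" bit */

/* made-up write-back descriptor: only status_error matters here */
struct fake_rx_desc {
	uint64_t buf_addr;
	uint32_t status_error;
	uint32_t length;
};

struct fake_rx_queue {
	volatile struct fake_rx_desc *rx_ring;
	uint16_t rx_tail; /* index of the next descriptor to be filled */
};

/* hand back the status word of the tail descriptor plus value and mask */
static int
fake_get_wake_addr(void *rx_queue, volatile void **tail_desc_addr,
		uint64_t *expected, uint64_t *mask)
{
	struct fake_rx_queue *rxq = rx_queue;

	assert(rxq != NULL && rxq->rx_ring != NULL);

	*tail_desc_addr = &rxq->rx_ring[rxq->rx_tail].status_error;
	*expected = FAKE_STAT_DD;
	*mask = FAKE_STAT_DD;
	return 0;
}

/* model of the waiter's check; reads only the 32-bit status word */
static int
fake_desc_done(const volatile void *addr, uint64_t expected, uint64_t mask)
{
	const uint32_t cur = *(const volatile uint32_t *)addr;

	return ((uint64_t)cur & mask) == expected;
}
```

Before the device writes the tail descriptor back, fake_desc_done() returns 0; once the status word has the done bit set it returns 1, which is the event a monitoring core would wake up on.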
Signed-off-by: Anatoly Burakov Signed-off-by: Liang Ma Acked-by: Haiyue Wang --- drivers/net/ixgbe/ixgbe_ethdev.c | 1 + drivers/net/ixgbe/ixgbe_rxtx.c | 22 ++++++++++++++++++++++ drivers/net/ixgbe/ixgbe_rxtx.h | 2 ++ 3 files changed, 25 insertions(+) diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c index 0b98e210e7..30b3f416d4 100644 --- a/drivers/net/ixgbe/ixgbe_ethdev.c +++ b/drivers/net/ixgbe/ixgbe_ethdev.c @@ -588,6 +588,7 @@ static const struct eth_dev_ops ixgbe_eth_dev_ops = { .udp_tunnel_port_del = ixgbe_dev_udp_tunnel_port_del, .tm_ops_get = ixgbe_tm_ops_get, .tx_done_cleanup = ixgbe_dev_tx_done_cleanup, + .get_wake_addr = ixgbe_get_wake_addr, }; /* diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c index 977ecf5137..7a9fd2aec6 100644 --- a/drivers/net/ixgbe/ixgbe_rxtx.c +++ b/drivers/net/ixgbe/ixgbe_rxtx.c @@ -1366,6 +1366,28 @@ const uint32_t RTE_PTYPE_INNER_L3_IPV4_EXT | RTE_PTYPE_INNER_L4_UDP, }; +int ixgbe_get_wake_addr(void *rx_queue, volatile void **tail_desc_addr, + uint64_t *expected, uint64_t *mask) +{ + volatile union ixgbe_adv_rx_desc *rxdp; + struct ixgbe_rx_queue *rxq = rx_queue; + uint16_t desc; + + desc = rxq->rx_tail; + rxdp = &rxq->rx_ring[desc]; + /* watch for changes in status bit */ + *tail_desc_addr = &rxdp->wb.upper.status_error; + + /* + * we expect the DD bit to be set to 1 if this descriptor was already + * written to. + */ + *expected = rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD); + *mask = rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD); + + return 0; +} + /* @note: fix ixgbe_dev_supported_ptypes_get() if any change here. 
*/ static inline uint32_t ixgbe_rxd_pkt_info_to_pkt_type(uint32_t pkt_info, uint16_t ptype_mask) diff --git a/drivers/net/ixgbe/ixgbe_rxtx.h b/drivers/net/ixgbe/ixgbe_rxtx.h index 7e09291b22..75020fa2fc 100644 --- a/drivers/net/ixgbe/ixgbe_rxtx.h +++ b/drivers/net/ixgbe/ixgbe_rxtx.h @@ -299,5 +299,7 @@ uint64_t ixgbe_get_tx_port_offloads(struct rte_eth_dev *dev); uint64_t ixgbe_get_rx_queue_offloads(struct rte_eth_dev *dev); uint64_t ixgbe_get_rx_port_offloads(struct rte_eth_dev *dev); uint64_t ixgbe_get_tx_queue_offloads(struct rte_eth_dev *dev); +int ixgbe_get_wake_addr(void *rx_queue, volatile void **tail_desc_addr, + uint64_t *expected, uint64_t *mask); #endif /* _IXGBE_RXTX_H_ */ From patchwork Fri Oct 9 16:02:24 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Burakov, Anatoly" X-Patchwork-Id: 80196 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id DEBF7A04BC; Fri, 9 Oct 2020 18:05:03 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 1CAA61D8CB; Fri, 9 Oct 2020 18:02:56 +0200 (CEST) Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by dpdk.org (Postfix) with ESMTP id 3198C1D73C for ; Fri, 9 Oct 2020 18:02:51 +0200 (CEST) IronPort-SDR: u493QrAs0PL+8tQbILpKQ0f4rejMfuEiMMqwTSJCSCGT1ue0te6cz0MglywzSysGUKjFjN5o6N KbuwPgPBA2dQ== X-IronPort-AV: E=McAfee;i="6000,8403,9769"; a="250197416" X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="250197416" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Oct 2020 09:02:50 -0700 IronPort-SDR: 
+jAeOGTPGyauu5xaUsVfXRWckUFAwaIRlRQcn/X/8b+lGNWUzjxfSQZkVe12KN6yeb+JrbPlUl 1kSx7r03T1Xg== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="528981942" Received: from silpixa00399498.ir.intel.com (HELO silpixa00399498.ger.corp.intel.com) ([10.237.222.52]) by orsmga005.jf.intel.com with ESMTP; 09 Oct 2020 09:02:47 -0700 From: Anatoly Burakov To: dev@dpdk.org Cc: Liang Ma , Beilei Xing , Jeff Guo , david.hunt@intel.com, konstantin.ananyev@intel.com, jerinjacobk@gmail.com, bruce.richardson@intel.com, thomas@monjalon.net, timothy.mcdaniel@intel.com, gage.eads@intel.com, chris.macnamara@intel.com Date: Fri, 9 Oct 2020 17:02:24 +0100 Message-Id: <78bfa354463be2c3560ee97c369ae7266e0fb50f.1602258833.git.anatoly.burakov@intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <532f45c5d79b4c30a919553d322bb66e91534466.1602258833.git.anatoly.burakov@intel.com> References: <532f45c5d79b4c30a919553d322bb66e91534466.1602258833.git.anatoly.burakov@intel.com> In-Reply-To: <1601647919-25312-1-git-send-email-liang.j.ma@intel.com> References: <1601647919-25312-1-git-send-email-liang.j.ma@intel.com> Subject: [dpdk-dev] [PATCH v5 07/10] net/i40e: implement power management API X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" From: Liang Ma Implement support for the power management API by implementing a `get_wake_addr` function that will return an address of an RX ring's status bit. 
Signed-off-by: Liang Ma Signed-off-by: Anatoly Burakov Acked-by: Jeff Guo --- drivers/net/i40e/i40e_ethdev.c | 1 + drivers/net/i40e/i40e_rxtx.c | 23 +++++++++++++++++++++++ drivers/net/i40e/i40e_rxtx.h | 2 ++ 3 files changed, 26 insertions(+) diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c index 943cfe71dc..cab86f8ec9 100644 --- a/drivers/net/i40e/i40e_ethdev.c +++ b/drivers/net/i40e/i40e_ethdev.c @@ -513,6 +513,7 @@ static const struct eth_dev_ops i40e_eth_dev_ops = { .mtu_set = i40e_dev_mtu_set, .tm_ops_get = i40e_tm_ops_get, .tx_done_cleanup = i40e_tx_done_cleanup, + .get_wake_addr = i40e_get_wake_addr, }; /* store statistics names and its offset in stats structure */ diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c index 322fc1ed75..c17f27292f 100644 --- a/drivers/net/i40e/i40e_rxtx.c +++ b/drivers/net/i40e/i40e_rxtx.c @@ -71,6 +71,29 @@ #define I40E_TX_OFFLOAD_NOTSUP_MASK \ (PKT_TX_OFFLOAD_MASK ^ I40E_TX_OFFLOAD_MASK) +int +i40e_get_wake_addr(void *rx_queue, volatile void **tail_desc_addr, + uint64_t *expected, uint64_t *mask) +{ + struct i40e_rx_queue *rxq = rx_queue; + volatile union i40e_rx_desc *rxdp; + uint16_t desc; + + desc = rxq->rx_tail; + rxdp = &rxq->rx_ring[desc]; + /* watch for changes in status bit */ + *tail_desc_addr = &rxdp->wb.qword1.status_error_len; + + /* + * we expect the DD bit to be set to 1 if this descriptor was already + * written to. 
+ */ + *expected = rte_cpu_to_le_64(1 << I40E_RX_DESC_STATUS_DD_SHIFT); + *mask = rte_cpu_to_le_64(1 << I40E_RX_DESC_STATUS_DD_SHIFT); + + return 0; +} + static inline void i40e_rxd_to_vlan_tci(struct rte_mbuf *mb, volatile union i40e_rx_desc *rxdp) { diff --git a/drivers/net/i40e/i40e_rxtx.h b/drivers/net/i40e/i40e_rxtx.h index 57d7b4160b..f23a2073e3 100644 --- a/drivers/net/i40e/i40e_rxtx.h +++ b/drivers/net/i40e/i40e_rxtx.h @@ -248,6 +248,8 @@ uint16_t i40e_recv_scattered_pkts_vec_avx2(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts); uint16_t i40e_xmit_pkts_vec_avx2(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts); +int i40e_get_wake_addr(void *rx_queue, volatile void **tail_desc_addr, + uint64_t *expected, uint64_t *mask); /* For each value it means, datasheet of hardware can tell more details * From patchwork Fri Oct 9 16:02:25 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Burakov, Anatoly" X-Patchwork-Id: 80197 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 0E7F3A04BC; Fri, 9 Oct 2020 18:05:26 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id AB3F71D736; Fri, 9 Oct 2020 18:03:00 +0200 (CEST) Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by dpdk.org (Postfix) with ESMTP id 2E38C1D73C for ; Fri, 9 Oct 2020 18:02:54 +0200 (CEST) IronPort-SDR: BuOtpjEyt3MpGxxvW1zGkGkFq53ryuQNO5bSDpnzSwWJV0E4PMrs6R7YXAm3BLdovtiAFge++I xVJG47JKh2+A== X-IronPort-AV: E=McAfee;i="6000,8403,9769"; a="250197428" X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="250197428" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga105.fm.intel.com
with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Oct 2020 09:02:53 -0700 IronPort-SDR: WK4ad5nGiR9pURkSKkS7t7ddfchlJ0cToNOko2luPhhwKdRggsU00oNfP4QzgsdYivpAwMJS1W QaFdx5jIVByQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="528981957" Received: from silpixa00399498.ir.intel.com (HELO silpixa00399498.ger.corp.intel.com) ([10.237.222.52]) by orsmga005.jf.intel.com with ESMTP; 09 Oct 2020 09:02:50 -0700 From: Anatoly Burakov To: dev@dpdk.org Cc: Liang Ma , Qiming Yang , Qi Zhang , david.hunt@intel.com, konstantin.ananyev@intel.com, jerinjacobk@gmail.com, bruce.richardson@intel.com, thomas@monjalon.net, timothy.mcdaniel@intel.com, gage.eads@intel.com, chris.macnamara@intel.com Date: Fri, 9 Oct 2020 17:02:25 +0100 Message-Id: X-Mailer: git-send-email 2.17.1 In-Reply-To: <532f45c5d79b4c30a919553d322bb66e91534466.1602258833.git.anatoly.burakov@intel.com> References: <532f45c5d79b4c30a919553d322bb66e91534466.1602258833.git.anatoly.burakov@intel.com> In-Reply-To: <1601647919-25312-1-git-send-email-liang.j.ma@intel.com> References: <1601647919-25312-1-git-send-email-liang.j.ma@intel.com> Subject: [dpdk-dev] [PATCH v5 08/10] net/ice: implement power management API X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" From: Liang Ma Implement support for the power management API by implementing a `get_wake_addr` function that will return an address of an RX ring's status bit. 
Signed-off-by: Liang Ma Signed-off-by: Anatoly Burakov --- drivers/net/ice/ice_ethdev.c | 1 + drivers/net/ice/ice_rxtx.c | 23 +++++++++++++++++++++++ drivers/net/ice/ice_rxtx.h | 2 ++ 3 files changed, 26 insertions(+) diff --git a/drivers/net/ice/ice_ethdev.c b/drivers/net/ice/ice_ethdev.c index d8ce09d28f..260de5dfd7 100644 --- a/drivers/net/ice/ice_ethdev.c +++ b/drivers/net/ice/ice_ethdev.c @@ -216,6 +216,7 @@ static const struct eth_dev_ops ice_eth_dev_ops = { .udp_tunnel_port_add = ice_dev_udp_tunnel_port_add, .udp_tunnel_port_del = ice_dev_udp_tunnel_port_del, .tx_done_cleanup = ice_tx_done_cleanup, + .get_wake_addr = ice_get_wake_addr, }; /* store statistics names and its offset in stats structure */ diff --git a/drivers/net/ice/ice_rxtx.c b/drivers/net/ice/ice_rxtx.c index 93a0ac6918..9e55eca942 100644 --- a/drivers/net/ice/ice_rxtx.c +++ b/drivers/net/ice/ice_rxtx.c @@ -25,6 +25,29 @@ uint64_t rte_net_ice_dynflag_proto_xtr_ipv6_flow_mask; uint64_t rte_net_ice_dynflag_proto_xtr_tcp_mask; uint64_t rte_net_ice_dynflag_proto_xtr_ip_offset_mask; +int ice_get_wake_addr(void *rx_queue, volatile void **tail_desc_addr, + uint64_t *expected, uint64_t *mask) +{ + volatile union ice_rx_flex_desc *rxdp; + struct ice_rx_queue *rxq = rx_queue; + uint16_t desc; + + desc = rxq->rx_tail; + rxdp = &rxq->rx_ring[desc]; + /* watch for changes in status bit */ + *tail_desc_addr = &rxdp->wb.status_error0; + + /* + * we expect the DD bit to be set to 1 if this descriptor was already + * written to. 
+ */ + *expected = rte_cpu_to_le_16(1 << ICE_RX_FLEX_DESC_STATUS0_DD_S); + *mask = rte_cpu_to_le_16(1 << ICE_RX_FLEX_DESC_STATUS0_DD_S); + + return 0; +} + + static inline uint8_t ice_proto_xtr_type_to_rxdid(uint8_t xtr_type) { diff --git a/drivers/net/ice/ice_rxtx.h b/drivers/net/ice/ice_rxtx.h index 1c23c7541e..c729e474c9 100644 --- a/drivers/net/ice/ice_rxtx.h +++ b/drivers/net/ice/ice_rxtx.h @@ -250,6 +250,8 @@ uint16_t ice_xmit_pkts_vec_avx2(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts); int ice_fdir_programming(struct ice_pf *pf, struct ice_fltr_desc *fdir_desc); int ice_tx_done_cleanup(void *txq, uint32_t free_cnt); +int ice_get_wake_addr(void *rx_queue, volatile void **tail_desc_addr, + uint64_t *expected, uint64_t *mask); #define FDIR_PARSING_ENABLE_PER_QUEUE(ad, on) do { \ int i; \ From patchwork Fri Oct 9 16:02:26 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Burakov, Anatoly" X-Patchwork-Id: 80198 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 95E2CA04BC; Fri, 9 Oct 2020 18:05:44 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 411831D8D8; Fri, 9 Oct 2020 18:03:02 +0200 (CEST) Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by dpdk.org (Postfix) with ESMTP id D4D2B1D8D1 for ; Fri, 9 Oct 2020 18:02:56 +0200 (CEST) IronPort-SDR: KZNn7GMa3lL7DmP+YZRfuaRG2yRaP6pAN2SHUl7nrF2IlFXjGS6ogdmBCfE53ZJb2Kesvbuu+R ajYPjuEUzCWg== X-IronPort-AV: E=McAfee;i="6000,8403,9769"; a="250197444" X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="250197444" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga105.fm.intel.com with 
ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Oct 2020 09:02:56 -0700 IronPort-SDR: jvJ1MYtzXKBUikxbD4LEsIhad2qHblId6ByHnxYXtTbUeDR/MUjqdhjVKr6RJy1cCbXBpP5Apw 63sXdGDGSOng== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="528981975" Received: from silpixa00399498.ir.intel.com (HELO silpixa00399498.ger.corp.intel.com) ([10.237.222.52]) by orsmga005.jf.intel.com with ESMTP; 09 Oct 2020 09:02:53 -0700 From: Anatoly Burakov To: dev@dpdk.org Cc: Liang Ma , David Hunt , konstantin.ananyev@intel.com, jerinjacobk@gmail.com, bruce.richardson@intel.com, thomas@monjalon.net, timothy.mcdaniel@intel.com, gage.eads@intel.com, chris.macnamara@intel.com Date: Fri, 9 Oct 2020 17:02:26 +0100 Message-Id: X-Mailer: git-send-email 2.17.1 In-Reply-To: <532f45c5d79b4c30a919553d322bb66e91534466.1602258833.git.anatoly.burakov@intel.com> References: <532f45c5d79b4c30a919553d322bb66e91534466.1602258833.git.anatoly.burakov@intel.com> In-Reply-To: <1601647919-25312-1-git-send-email-liang.j.ma@intel.com> References: <1601647919-25312-1-git-send-email-liang.j.ma@intel.com> Subject: [dpdk-dev] [PATCH v5 09/10] examples/l3fwd-power: enable PMD power mgmt X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" From: Liang Ma Add PMD power management feature support to l3fwd-power sample app. 
Signed-off-by: Liang Ma Signed-off-by: Anatoly Burakov --- Notes: v5: - Moved doc update here - Some minor formatting fixes .../sample_app_ug/l3_forward_power_man.rst | 13 ++++++ examples/l3fwd-power/main.c | 41 ++++++++++++++++++- 2 files changed, 53 insertions(+), 1 deletion(-) diff --git a/doc/guides/sample_app_ug/l3_forward_power_man.rst b/doc/guides/sample_app_ug/l3_forward_power_man.rst index 0cc6f2e62e..8722fbaeaa 100644 --- a/doc/guides/sample_app_ug/l3_forward_power_man.rst +++ b/doc/guides/sample_app_ug/l3_forward_power_man.rst @@ -109,6 +109,8 @@ where, * --telemetry: Telemetry mode. +* --pmd-mgmt: PMD power management mode. + See :doc:`l3_forward` for details. The L3fwd-power example reuses the L3fwd command line options. @@ -459,3 +461,14 @@ reference cycles and accordingly busy rate is set to either 0% or The new stats ``empty_poll`` , ``full_poll`` and ``busy_percent`` can be viewed by running the script ``/usertools/dpdk-telemetry-client.py`` and selecting the menu option ``Send for global Metrics``. + +PMD power management Mode +------------------------- + +PMD power management mode is a standalone mode of ``l3fwd-power``. In this mode, +``l3fwd-power`` does simple L3 forwarding and enables the power saving scheme on a specific +port/queue/lcore. The main purpose of this mode is to demonstrate how to use the PMD power management API. + +.. 
code-block:: console + + ./examples/l3fwd-power/build/l3fwd-power -l 1-3 -- --pmd-mgmt -p 0x0f --config="(0,0,2),(0,1,3)" diff --git a/examples/l3fwd-power/main.c b/examples/l3fwd-power/main.c index d0e6c9bd77..af64dd521f 100644 --- a/examples/l3fwd-power/main.c +++ b/examples/l3fwd-power/main.c @@ -47,6 +47,7 @@ #include #include #include +#include #include "perf_core.h" #include "main.h" @@ -199,7 +200,8 @@ enum appmode { APP_MODE_LEGACY, APP_MODE_EMPTY_POLL, APP_MODE_TELEMETRY, - APP_MODE_INTERRUPT + APP_MODE_INTERRUPT, + APP_MODE_PMD_MGMT }; enum appmode app_mode; @@ -1750,6 +1752,7 @@ parse_ep_config(const char *q_arg) #define CMD_LINE_OPT_EMPTY_POLL "empty-poll" #define CMD_LINE_OPT_INTERRUPT_ONLY "interrupt-only" #define CMD_LINE_OPT_TELEMETRY "telemetry" +#define CMD_LINE_OPT_PMD_MGMT "pmd-mgmt" /* Parse the argument given in the command line of the application */ static int @@ -1771,6 +1774,7 @@ parse_args(int argc, char **argv) {CMD_LINE_OPT_LEGACY, 0, 0, 0}, {CMD_LINE_OPT_TELEMETRY, 0, 0, 0}, {CMD_LINE_OPT_INTERRUPT_ONLY, 0, 0, 0}, + {CMD_LINE_OPT_PMD_MGMT, 0, 0, 0}, {NULL, 0, 0, 0} }; @@ -1881,6 +1885,16 @@ parse_args(int argc, char **argv) printf("telemetry mode is enabled\n"); } + if (!strncmp(lgopts[option_index].name, + CMD_LINE_OPT_PMD_MGMT, + sizeof(CMD_LINE_OPT_PMD_MGMT))) { + if (app_mode != APP_MODE_DEFAULT) { + printf(" power mgmt mode is mutually exclusive with other modes\n"); + return -1; + } + app_mode = APP_MODE_PMD_MGMT; + printf("PMD power mgmt mode is enabled\n"); + } if (!strncmp(lgopts[option_index].name, CMD_LINE_OPT_INTERRUPT_ONLY, sizeof(CMD_LINE_OPT_INTERRUPT_ONLY))) { @@ -2437,6 +2451,8 @@ mode_to_str(enum appmode mode) return "telemetry"; case APP_MODE_INTERRUPT: return "interrupt-only"; + case APP_MODE_PMD_MGMT: + return "pmd mgmt"; default: return "invalid"; } @@ -2705,6 +2721,12 @@ main(int argc, char **argv) } else if (!check_ptype(portid)) rte_exit(EXIT_FAILURE, "PMD can not provide needed ptypes\n"); + if (app_mode == 
APP_MODE_PMD_MGMT) { + rte_power_pmd_mgmt_queue_enable(lcore_id, + portid, queueid, + RTE_POWER_MGMT_TYPE_SCALE); + + } } } @@ -2790,6 +2812,9 @@ main(int argc, char **argv) SKIP_MASTER); } else if (app_mode == APP_MODE_INTERRUPT) { rte_eal_mp_remote_launch(main_intr_loop, NULL, CALL_MASTER); + } else if (app_mode == APP_MODE_PMD_MGMT) { + rte_eal_mp_remote_launch(main_telemetry_loop, NULL, + CALL_MASTER); } if (app_mode == APP_MODE_EMPTY_POLL || app_mode == APP_MODE_TELEMETRY) @@ -2812,6 +2837,20 @@ main(int argc, char **argv) if (app_mode == APP_MODE_EMPTY_POLL) rte_power_empty_poll_stat_free(); + if (app_mode == APP_MODE_PMD_MGMT) { + for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) { + if (rte_lcore_is_enabled(lcore_id) == 0) + continue; + qconf = &lcore_conf[lcore_id]; + for (queue = 0; queue < qconf->n_rx_queue; ++queue) { + portid = qconf->rx_queue_list[queue].port_id; + queueid = qconf->rx_queue_list[queue].queue_id; + rte_power_pmd_mgmt_queue_disable(lcore_id, + portid, queueid); + } + } + } + if ((app_mode == APP_MODE_LEGACY || app_mode == APP_MODE_EMPTY_POLL) && deinit_power_library()) rte_exit(EXIT_FAILURE, "deinit_power_library failed\n"); From patchwork Fri Oct 9 16:02:27 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Burakov, Anatoly" X-Patchwork-Id: 80199 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 76207A04BC; Fri, 9 Oct 2020 18:06:11 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id D465D1D735; Fri, 9 Oct 2020 18:03:06 +0200 (CEST) Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by dpdk.org (Postfix) with ESMTP id A75BC1D734 for ; Fri, 9 Oct 2020 18:02:59 +0200 (CEST) IronPort-SDR: 
1/jM2OIcfF4R2A6ZgPUGgUCCgDpQcRbZ26gRC0hhD9Ime/eUB3c8/2fDoSP0cjB3SqoffzBuT/ NzdBFOvVLREw== X-IronPort-AV: E=McAfee;i="6000,8403,9769"; a="250197456" X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="250197456" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Oct 2020 09:02:58 -0700 IronPort-SDR: 9NKHKq/bUNZ7x4Hue66PJtJbKs1w9btEJeok0GsivchyPfT78TKozMq4Q3dR78K9i1kl5LJmQw z7BEOlGBu7ww== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="528981986" Received: from silpixa00399498.ir.intel.com (HELO silpixa00399498.ger.corp.intel.com) ([10.237.222.52]) by orsmga005.jf.intel.com with ESMTP; 09 Oct 2020 09:02:56 -0700 From: Anatoly Burakov To: dev@dpdk.org Cc: Liang Ma , David Hunt , konstantin.ananyev@intel.com, jerinjacobk@gmail.com, bruce.richardson@intel.com, thomas@monjalon.net, timothy.mcdaniel@intel.com, gage.eads@intel.com, chris.macnamara@intel.com Date: Fri, 9 Oct 2020 17:02:27 +0100 Message-Id: X-Mailer: git-send-email 2.17.1 In-Reply-To: <532f45c5d79b4c30a919553d322bb66e91534466.1602258833.git.anatoly.burakov@intel.com> References: <532f45c5d79b4c30a919553d322bb66e91534466.1602258833.git.anatoly.burakov@intel.com> In-Reply-To: <1601647919-25312-1-git-send-email-liang.j.ma@intel.com> References: <1601647919-25312-1-git-send-email-liang.j.ma@intel.com> Subject: [dpdk-dev] [PATCH v5 10/10] doc: update programmer's guide for power library X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" From: Liang Ma Update programmer's guide to document PMD power management usage. 
Signed-off-by: Liang Ma Signed-off-by: Anatoly Burakov --- Notes: v5: - Moved l3fwd-power update to the l3fwd-power-related commit - Some rewordings and clarifications doc/guides/prog_guide/power_man.rst | 42 +++++++++++++++++++++++++++++ 1 file changed, 42 insertions(+) diff --git a/doc/guides/prog_guide/power_man.rst b/doc/guides/prog_guide/power_man.rst index 0a3755a901..38c64d31e4 100644 --- a/doc/guides/prog_guide/power_man.rst +++ b/doc/guides/prog_guide/power_man.rst @@ -192,6 +192,45 @@ User Cases ---------- The mechanism can applied to any device which is based on polling. e.g. NIC, FPGA. +PMD Power Management API +------------------------ + +Abstract +~~~~~~~~ +Existing power management mechanisms require developers to change application +design or change code to make use of them. The PMD power management API provides a +convenient alternative by utilizing Ethernet PMD RX callbacks, and triggering +power saving whenever the empty poll count reaches a certain number. + + * UMWAIT/UMONITOR + + This power saving scheme will put the CPU into an optimized power state and use + the UMWAIT/UMONITOR instructions to monitor the Ethernet PMD RX descriptor + address, and wake the CPU up whenever there's new traffic. + + * Pause + + This power saving scheme will use the `rte_pause` function to avoid busy + polling. + + * Frequency scaling + + This power saving scheme will use existing power library functionality to + scale the core frequency up/down depending on traffic volume. + + +.. note:: + + Currently, this power management API is limited to mandatory mapping of 1 + queue to 1 core (multiple queues are supported, but they must be polled from + different cores). 
+ +API Overview for PMD Power Management +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +* **Queue Enable**: Enable a specific power scheme for a given queue/port/core + +* **Queue Disable**: Disable the power scheme for a given queue/port/core + References ---------- @@ -200,3 +239,6 @@ References * The :doc:`../sample_app_ug/vm_power_management` chapter in the :doc:`../sample_app_ug/index` section. + +* The :doc:`../sample_app_ug/rxtx_callbacks` + chapter in the :doc:`../sample_app_ug/index` section.
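The guide text in this patch says power saving is triggered once the empty poll count reaches a certain number. A rough sketch of that bookkeeping, as it might look inside an RX callback, is given below. Everything here is an assumption for illustration: `struct empty_poll_state`, `empty_poll_init()`, `empty_poll_update()` and the threshold value are hypothetical and are not taken from the library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-queue state; the library's real bookkeeping may differ. */
struct empty_poll_state {
	uint32_t empty_polls; /* consecutive RX bursts that returned no packets */
	uint32_t threshold;   /* assumed trigger point for power saving */
};

static void
empty_poll_init(struct empty_poll_state *st, uint32_t threshold)
{
	assert(st != NULL && threshold > 0);
	st->empty_polls = 0;
	st->threshold = threshold;
}

/*
 * Call after every RX burst with the number of packets received.
 * Returns true when the chosen power saving scheme should be applied.
 */
static bool
empty_poll_update(struct empty_poll_state *st, uint16_t nb_rx)
{
	if (nb_rx > 0) {
		/* traffic arrived, start counting from scratch */
		st->empty_polls = 0;
		return false;
	}
	/* saturate at the threshold so the counter cannot overflow */
	if (st->empty_polls < st->threshold)
		st->empty_polls++;
	return st->empty_polls >= st->threshold;
}
```

Resetting the counter on any non-empty burst means a single packet is enough to leave the power saving state, so the scheme stays biased towards low latency under load.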