From patchwork Wed Mar 20 10:55:28 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: "lihuisong (C)"
X-Patchwork-Id: 138489
X-Patchwork-Delegate: thomas@monjalon.net
Return-Path:
X-Original-To: patchwork@inbox.dpdk.org
Delivered-To: patchwork@inbox.dpdk.org
Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124])
 by inbox.dpdk.org (Postfix) with ESMTP id 5F3D143CFF;
 Wed, 20 Mar 2024 12:02:58 +0100 (CET)
Received: from mails.dpdk.org (localhost [127.0.0.1])
 by mails.dpdk.org (Postfix) with ESMTP id 5E9B241132;
 Wed, 20 Mar 2024 12:02:47 +0100 (CET)
Received: from szxga02-in.huawei.com (szxga02-in.huawei.com [45.249.212.188])
 by mails.dpdk.org (Postfix) with ESMTP id 24A7B40298
 for ; Wed, 20 Mar 2024 12:02:44 +0100 (CET)
Received: from mail.maildlp.com (unknown [172.19.163.174])
 by szxga02-in.huawei.com (SkyGuard) with ESMTP id 4V05GF10fczXjP1;
 Wed, 20 Mar 2024 19:00:05 +0800 (CST)
Received: from kwepemm600004.china.huawei.com (unknown [7.193.23.242])
 by mail.maildlp.com (Postfix) with ESMTPS id B12A61404DB;
 Wed, 20 Mar 2024 19:02:41 +0800 (CST)
Received: from localhost.localdomain (10.28.79.22) by
 kwepemm600004.china.huawei.com (7.193.23.242) with Microsoft SMTP Server
 (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id
 15.1.2507.35; Wed, 20 Mar 2024 19:02:41 +0800
From: Huisong Li
To:
CC: , , , , , ,
Subject: [PATCH 1/2] power: introduce PM QoS interface
Date: Wed, 20 Mar 2024 18:55:28 +0800
Message-ID: <20240320105529.5626-2-lihuisong@huawei.com>
X-Mailer: git-send-email 2.22.0
In-Reply-To: <20240320105529.5626-1-lihuisong@huawei.com>
References: <20240320105529.5626-1-lihuisong@huawei.com>
MIME-Version: 1.0
X-Originating-IP: [10.28.79.22]
X-ClientProxiedBy: dggems706-chm.china.huawei.com (10.3.19.183) To
 kwepemm600004.china.huawei.com (7.193.23.242)
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: DPDK patches and discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: dev-bounces@dpdk.org

The system-wide CPU latency QoS limit influences the idle state selection
in the cpuidle governor.

Linux creates a cpu_dma_latency device under '/dev' that lets userspace
obtain the system CPU latency QoS limit and send a QoS request.
Please see the PM QoS framework in the following link:
https://docs.kernel.org/power/pm_qos_interface.html?highlight=qos
This feature has been supported since kernel v2.6.25.

The deeper the idle state, the lower the power consumption, but the longer
the resume time. Some services are delay sensitive and expect a low resume
time, such as the interrupt packet receiving mode.

So this PM QoS API makes it easy to obtain the system CPU latency limit and
to send a CPU latency QoS request for applications that need it.

The recommended usage is as follows:
1) an application process first creates a QoS request.
2) update the CPU latency request to zero when needed.
3) go back to the default value when it is no longer needed (this step
   is optional).
4) release the QoS request when the process exits.

Signed-off-by: Huisong Li
---
 doc/guides/prog_guide/power_man.rst    |  16 ++++
 doc/guides/rel_notes/release_24_03.rst |   4 +
 lib/power/meson.build                  |   2 +
 lib/power/rte_power_qos.c              |  98 ++++++++++++++++++++++++
 lib/power/rte_power_qos.h              | 101 +++++++++++++++++++++++++
 lib/power/version.map                  |   4 +
 6 files changed, 225 insertions(+)
 create mode 100644 lib/power/rte_power_qos.c
 create mode 100644 lib/power/rte_power_qos.h

diff --git a/doc/guides/prog_guide/power_man.rst b/doc/guides/prog_guide/power_man.rst
index f6674efe2d..493c75bf9d 100644
--- a/doc/guides/prog_guide/power_man.rst
+++ b/doc/guides/prog_guide/power_man.rst
@@ -249,6 +249,22 @@ Get Num Pkgs
 Get Num Dies
   Get the number of die's on a given package.
 
+PM QoS API
+----------
+The deeper the idle state, the lower the power consumption, but the longer
+the resume time.
Some service threads are delay sensitive and expect
+a low resume time, such as the interrupt packet receiving mode.
+
+This PM QoS API obtains the system CPU latency limit and sends a CPU latency
+QoS request for applications that need it.
+
+* ``rte_power_qos_get_curr_cpu_latency()`` gets the current CPU latency
+  limit on the system.
+* To send a CPU latency QoS request, first call ``rte_power_create_qos_request()``
+  to create a QoS request, then update the CPU latency value by calling
+  ``rte_power_qos_update_request()``. ``rte_power_release_qos_request()``
+  releases this QoS request when the process exits.
+
 References
 ----------
 
diff --git a/doc/guides/rel_notes/release_24_03.rst b/doc/guides/rel_notes/release_24_03.rst
index 14826ea08f..b5be724133 100644
--- a/doc/guides/rel_notes/release_24_03.rst
+++ b/doc/guides/rel_notes/release_24_03.rst
@@ -196,6 +196,10 @@ New Features
   Added DMA producer mode to measure performance of ``OP_FORWARD`` mode of event
   DMA adapter.
 
+* **Added CPU latency PM QoS support.**
+
+  Added an interface to query the system CPU latency PM QoS limit and
+  an interface to send a CPU latency QoS request in the power library.
 
 Removed Items
 -------------
diff --git a/lib/power/meson.build b/lib/power/meson.build
index b8426589b2..8222e178b0 100644
--- a/lib/power/meson.build
+++ b/lib/power/meson.build
@@ -23,12 +23,14 @@ sources = files(
         'rte_power.c',
         'rte_power_uncore.c',
         'rte_power_pmd_mgmt.c',
+        'rte_power_qos.c',
 )
 headers = files(
         'rte_power.h',
         'rte_power_guest_channel.h',
         'rte_power_pmd_mgmt.h',
         'rte_power_uncore.h',
+        'rte_power_qos.h',
 )
 if cc.has_argument('-Wno-cast-qual')
     cflags += '-Wno-cast-qual'
diff --git a/lib/power/rte_power_qos.c b/lib/power/rte_power_qos.c
new file mode 100644
index 0000000000..d2b55923a0
--- /dev/null
+++ b/lib/power/rte_power_qos.c
@@ -0,0 +1,98 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 HiSilicon Limited
+ */
+
+#include
+#include
+#include
+
+#include
+
+#include "power_common.h"
+#include "rte_power_qos.h"
+
+#define QOS_CPU_DMA_LATENCY_DEV "/dev/cpu_dma_latency"
+
+struct rte_power_qos_info {
+	/*
+	 * Keep the file descriptor open to update the QoS request until it
+	 * is no longer needed.
+	 */
+	int fd;
+	int cur_cpu_latency; /* unit microseconds */
+};
+
+struct rte_power_qos_info g_qos = {
+	.fd = -1,
+	.cur_cpu_latency = -1,
+};
+
+int
+rte_power_qos_get_curr_cpu_latency(int *latency)
+{
+	int fd, ret;
+
+	fd = open(QOS_CPU_DMA_LATENCY_DEV, O_RDONLY);
+	if (fd < 0) {
+		POWER_LOG(ERR, "Failed to open %s", QOS_CPU_DMA_LATENCY_DEV);
+		return -1;
+	}
+
+	ret = read(fd, latency, sizeof(*latency));
+	close(fd); /* close on both the success and the failure path */
+	if (ret != (int)sizeof(*latency)) { /* error or short read */
+		POWER_LOG(ERR, "Failed to read %s", QOS_CPU_DMA_LATENCY_DEV);
+		return -1;
+	}
+
+	return 0;
+}
+
+int
+rte_power_qos_update_request(int latency)
+{
+	int ret;
+
+	if (g_qos.fd == -1) {
+		POWER_LOG(ERR, "Please create a QoS request first.");
+		return -EINVAL;
+	}
+
+	if (latency < 0) {
+		POWER_LOG(ERR, "latency should be a non-negative number.");
+		return -EINVAL;
+	}
+
+	if (g_qos.cur_cpu_latency != -1 && latency == g_qos.cur_cpu_latency)
+		return 0;
+
+	ret = write(g_qos.fd, &latency, sizeof(latency));
+	if (ret != (int)sizeof(latency)) { /* error or short write */
+		POWER_LOG(ERR, "Failed to write %s", QOS_CPU_DMA_LATENCY_DEV);
+		return -1;
+	}
+	g_qos.cur_cpu_latency = latency;
+
+	return 0;
+}
+
+int
+rte_power_create_qos_request(void)
+{
+	g_qos.fd = open(QOS_CPU_DMA_LATENCY_DEV, O_WRONLY);
+	if (g_qos.fd < 0) {
+		POWER_LOG(ERR, "Failed to open %s.", QOS_CPU_DMA_LATENCY_DEV);
+		return -1;
+	}
+
+	return 0;
+}
+
+void
+rte_power_release_qos_request(void)
+{
+	if (g_qos.fd != -1) {
+		close(g_qos.fd);
+		g_qos.fd = -1;
+	}
+}
diff --git a/lib/power/rte_power_qos.h b/lib/power/rte_power_qos.h
new file mode 100644
index 0000000000..d39f5d0c0f
--- /dev/null
+++ b/lib/power/rte_power_qos.h
@@ -0,0 +1,101 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 HiSilicon Limited
+ */
+
+#ifndef RTE_POWER_QOS_H
+#define RTE_POWER_QOS_H
+
+#include
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * @file rte_power_qos.h
+ *
+ * PM QoS API.
+ *
+ * The system-wide CPU latency QoS limit influences the idle state
+ * selection in the cpuidle governor.
+ *
+ * Linux creates a cpu_dma_latency device under '/dev' that lets userspace
+ * obtain the system CPU latency QoS limit and send a QoS request.
+ * Please see the PM QoS framework in the following link:
+ * https://docs.kernel.org/power/pm_qos_interface.html?highlight=qos
+ *
+ * The deeper the idle state, the lower the power consumption, but the longer
+ * the resume time. Some services are delay sensitive and expect a low
+ * resume time, such as the interrupt packet receiving mode.
+ *
+ * So this PM QoS API makes it easy to obtain the system CPU latency limit and
+ * to send a CPU latency QoS request for applications that need it.
+ *
+ * The recommended usage is as follows:
+ * 1) an application process first creates a QoS request.
+ * 2) update the CPU latency request to zero when needed.
+ * 3) go back to the default value @see PM_QOS_CPU_LATENCY_DEFAULT_VALUE when
+ *    it is no longer needed (this step is optional).
+ * 4) release the QoS request when the process exits.
+ */
+
+#define QOS_USEC_PER_SEC 1000000
+#define PM_QOS_CPU_LATENCY_DEFAULT_VALUE (2000 * QOS_USEC_PER_SEC)
+#define PM_QOS_STRICT_LATENCY_VALUE 0
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Create a CPU latency QoS request. Release this request with
+ * @see rte_power_release_qos_request.
+ *
+ * @return
+ *   0 on success. Otherwise negative value is returned.
+ */
+__rte_experimental
+int rte_power_create_qos_request(void);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Release the CPU latency QoS request.
+ */
+__rte_experimental
+void rte_power_release_qos_request(void);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Get the current CPU latency QoS limit on the system.
+ * The default value in the kernel is @see PM_QOS_CPU_LATENCY_DEFAULT_VALUE.
+ *
+ * @return
+ *   0 on success. Otherwise negative value is returned.
+ */
+__rte_experimental
+int rte_power_qos_get_curr_cpu_latency(int *latency);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Update the CPU latency QoS request.
+ * Note: a QoS request must be created before calling this API.
+ *
+ * @param latency
+ *   The latency should be greater than or equal to zero.
+ *
+ * @return
+ *   0 on success. Otherwise negative value is returned.
+ */
+__rte_experimental
+int rte_power_qos_update_request(int latency);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* RTE_POWER_QOS_H */
diff --git a/lib/power/version.map b/lib/power/version.map
index ad92a65f91..42770762b1 100644
--- a/lib/power/version.map
+++ b/lib/power/version.map
@@ -51,4 +51,8 @@ EXPERIMENTAL {
 	rte_power_set_uncore_env;
 	rte_power_uncore_freqs;
 	rte_power_unset_uncore_env;
+	rte_power_create_qos_request;
+	rte_power_release_qos_request;
+	rte_power_qos_get_curr_cpu_latency;
+	rte_power_qos_update_request;
 };

From patchwork Wed Mar 20 10:55:29 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "lihuisong (C)"
X-Patchwork-Id: 138488
X-Patchwork-Delegate: thomas@monjalon.net
Return-Path:
X-Original-To: patchwork@inbox.dpdk.org
Delivered-To: patchwork@inbox.dpdk.org
Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124])
 by inbox.dpdk.org (Postfix) with ESMTP id 4D4B843CFF;
 Wed, 20 Mar 2024 12:02:51 +0100 (CET)
Received: from mails.dpdk.org (localhost [127.0.0.1])
 by mails.dpdk.org (Postfix) with ESMTP id 2FEBC410FC;
 Wed, 20 Mar 2024 12:02:46 +0100 (CET)
Received: from szxga08-in.huawei.com (szxga08-in.huawei.com [45.249.212.255])
 by mails.dpdk.org (Postfix) with ESMTP id B6E6D402A2
 for ; Wed, 20 Mar 2024 12:02:43 +0100 (CET)
Received: from mail.maildlp.com (unknown [172.19.162.254])
 by szxga08-in.huawei.com (SkyGuard) with ESMTP id 4V05Gf0VMlz1Q9lj;
 Wed, 20 Mar 2024 19:00:26 +0800 (CST)
Received: from kwepemm600004.china.huawei.com (unknown
 [7.193.23.242])
 by mail.maildlp.com (Postfix) with ESMTPS id 18FCF18007B;
 Wed, 20 Mar 2024 19:02:42 +0800 (CST)
Received: from localhost.localdomain (10.28.79.22) by
 kwepemm600004.china.huawei.com (7.193.23.242) with Microsoft SMTP Server
 (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id
 15.1.2507.35; Wed, 20 Mar 2024 19:02:41 +0800
From: Huisong Li
To:
CC: , , , , , ,
Subject: [PATCH 2/2] examples/l3fwd-power: add PM QoS request configuration
Date: Wed, 20 Mar 2024 18:55:29 +0800
Message-ID: <20240320105529.5626-3-lihuisong@huawei.com>
X-Mailer: git-send-email 2.22.0
In-Reply-To: <20240320105529.5626-1-lihuisong@huawei.com>
References: <20240320105529.5626-1-lihuisong@huawei.com>
MIME-Version: 1.0
X-Originating-IP: [10.28.79.22]
X-ClientProxiedBy: dggems706-chm.china.huawei.com (10.3.19.183) To
 kwepemm600004.china.huawei.com (7.193.23.242)
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: DPDK patches and discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: dev-bounces@dpdk.org

Add PM QoS request configuration to decrease the process resume latency.
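
For reviewers unfamiliar with the kernel side, the mechanism this
example relies on can be sketched as below. This is only an
illustration and is not part of the patch. It assumes the documented
behaviour of /dev/cpu_dma_latency: a request is a binary 32-bit value
in microseconds, and it stays in force only while the file descriptor
remains open. The helper names qos_hold() and qos_drop() are
hypothetical.

```c
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of this series: open the PM QoS device
 * node at `path` (normally /dev/cpu_dma_latency) and register a latency
 * request of `latency_us` microseconds. Returns the descriptor that must
 * stay open for the request to remain active, or -1 on failure.
 */
static int
qos_hold(const char *path, int32_t latency_us)
{
	int fd = open(path, O_WRONLY);

	if (fd < 0)
		return -1;

	/* A failed or short write means the request was not registered. */
	if (write(fd, &latency_us, sizeof(latency_us)) !=
			(ssize_t)sizeof(latency_us)) {
		close(fd);
		return -1;
	}

	return fd;
}

/* Hypothetical helper: closing the descriptor removes the request. */
static void
qos_drop(int fd)
{
	if (fd >= 0)
		close(fd);
}
```

Taking the device path as a parameter keeps the sketch testable against
an ordinary file. A real caller would pass /dev/cpu_dma_latency, which
may require elevated privileges to open.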

Signed-off-by: Huisong Li
---
 examples/l3fwd-power/main.c | 41 ++++++++++++++++++++++++++++++++++++-
 1 file changed, 40 insertions(+), 1 deletion(-)

diff --git a/examples/l3fwd-power/main.c b/examples/l3fwd-power/main.c
index f4adcf41b5..78f292ed02 100644
--- a/examples/l3fwd-power/main.c
+++ b/examples/l3fwd-power/main.c
@@ -47,6 +47,7 @@
 #include
 #include
 #include
+#include
 
 #include "perf_core.h"
 #include "main.h"
@@ -2232,12 +2233,48 @@ static int check_ptype(uint16_t portid)
 
 }
 
+static int
+pm_qos_init(void)
+{
+	int cur_cpu_latency;
+	int ret;
+
+	ret = rte_power_qos_get_curr_cpu_latency(&cur_cpu_latency);
+	if (ret < 0) {
+		RTE_LOG(ERR, L3FWD_POWER, "Failed to get current cpu latency.\n");
+		return ret;
+	}
+	RTE_LOG(INFO, L3FWD_POWER, "current cpu latency is %dus on system.\n",
+		cur_cpu_latency); /* value is already in microseconds */
+
+	ret = rte_power_create_qos_request();
+	if (ret < 0) {
+		RTE_LOG(ERR, L3FWD_POWER, "Failed to create power QoS request.\n");
+		return ret;
+	}
+
+	/*
+	 * Set a strict latency requirement to prevent the service thread from
+	 * going into a deeper sleep state whose resume time is longer.
+	 */
+	ret = rte_power_qos_update_request(PM_QOS_STRICT_LATENCY_VALUE);
+	if (ret < 0)
+		RTE_LOG(ERR, L3FWD_POWER, "Failed to change cpu latency to 0.\n");
+	return ret;
+}
+
 static int
 init_power_library(void)
 {
 	enum power_management_env env;
 	unsigned int lcore_id;
-	int ret = 0;
+	int ret;
+
+	ret = pm_qos_init();
+	if (ret != 0) {
+		RTE_LOG(ERR, L3FWD_POWER, "Failed to init power QoS.\n");
+		return ret;
+	}
 
 	RTE_LCORE_FOREACH(lcore_id) {
 		/* init power management library */
@@ -2268,6 +2305,8 @@ deinit_power_library(void)
 	unsigned int lcore_id, max_pkg, max_die, die, pkg;
 	int ret = 0;
 
+	rte_power_release_qos_request();
+
 	RTE_LCORE_FOREACH(lcore_id) {
 		/* deinit power management library */
 		ret = rte_power_exit(lcore_id);