From patchwork Thu Feb 2 12:49:48 2023
X-Patchwork-Submitter: Tomasz Duszynski
X-Patchwork-Id: 122952
X-Patchwork-Delegate: david.marchand@redhat.com
From: Tomasz Duszynski
To: Thomas Monjalon, Tomasz Duszynski
Subject: [PATCH v9 1/4] lib: add generic support for reading PMU events
Date: Thu, 2 Feb 2023 13:49:48 +0100
Message-ID: <20230202124951.2915770-2-tduszynski@marvell.com>
In-Reply-To: <20230202124951.2915770-1-tduszynski@marvell.com>
References: <20230202094358.2838758-1-tduszynski@marvell.com> <20230202124951.2915770-1-tduszynski@marvell.com>
List-Id: DPDK patches and discussions

Add support for programming PMU counters and reading their values at runtime,
bypassing the kernel completely.

This is especially useful in cases where CPU cores are isolated (nohz_full),
i.e. run dedicated tasks. In such cases one cannot use the standard perf utility
without sacrificing latency and performance.

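For readers following the series, the event-encoding step that rte_pmu.c performs before calling perf_event_open() can be tried in isolation. The sketch below reuses the GENMASK_ULL and FIELD_PREP macros from this patch. The helper names (encode_term, encode_example, encode_selfcheck) and the bit ranges (an "event" term in bits 0-7, a "umask" term in bits 8-15) are illustrative assumptions; the library reads the real ranges from the format directory of the PMU under /sys/bus/event_source/devices at runtime.

```c
#include <assert.h>
#include <stdint.h>

/* Same definitions as in lib/pmu/rte_pmu.c from this patch. */
#define GENMASK_ULL(h, l) ((~0ULL - (1ULL << (l)) + 1) & (~0ULL >> ((64 - 1 - (h)))))
#define FIELD_PREP(m, v) (((uint64_t)(v) << (__builtin_ffsll(m) - 1)) & (m))

/*
 * Hypothetical helper (not part of the patch): place val into bits
 * [low, high] of a config word, as parse_event() does for each term.
 * Bits of val that do not fit into the field are dropped by the mask.
 */
static uint64_t
encode_term(int low, int high, uint64_t val)
{
	uint64_t mask = GENMASK_ULL(high, low);

	return FIELD_PREP(mask, val);
}

/*
 * Hypothetical example: combine an "event" term in bits 0-7 with a "umask"
 * term in bits 8-15. The bit ranges are assumptions made for illustration.
 */
static uint64_t
encode_example(uint64_t event, uint64_t umask)
{
	return encode_term(0, 7, event) | encode_term(8, 15, umask);
}

/* Small usage example with assumed input values. */
static void
encode_selfcheck(void)
{
	assert(encode_term(0, 7, 0x11) == 0x11ULL);
	assert(encode_example(0x3c, 0x01) == 0x13cULL);
}
```

Calling encode_selfcheck() from a test program exercises both helpers. The sketch covers only the bit packing; selecting config, config1 or config2 from the last digit of the format name is left out.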
Signed-off-by: Tomasz Duszynski Acked-by: Morten Brørup --- MAINTAINERS | 5 + app/test/meson.build | 1 + app/test/test_pmu.c | 55 +++ doc/api/doxy-api-index.md | 3 +- doc/api/doxy-api.conf.in | 1 + doc/guides/prog_guide/profile_app.rst | 8 + doc/guides/rel_notes/release_23_03.rst | 9 + lib/meson.build | 1 + lib/pmu/meson.build | 13 + lib/pmu/pmu_private.h | 29 ++ lib/pmu/rte_pmu.c | 464 +++++++++++++++++++++++++ lib/pmu/rte_pmu.h | 205 +++++++++++ lib/pmu/version.map | 20 ++ 13 files changed, 813 insertions(+), 1 deletion(-) create mode 100644 app/test/test_pmu.c create mode 100644 lib/pmu/meson.build create mode 100644 lib/pmu/pmu_private.h create mode 100644 lib/pmu/rte_pmu.c create mode 100644 lib/pmu/rte_pmu.h create mode 100644 lib/pmu/version.map diff --git a/MAINTAINERS b/MAINTAINERS index 9a0f416d2e..9f13eafd95 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -1697,6 +1697,11 @@ M: Nithin Dabilpuram M: Pavan Nikhilesh F: lib/node/ +PMU - EXPERIMENTAL +M: Tomasz Duszynski +F: lib/pmu/ +F: app/test/test_pmu* + Test Applications ----------------- diff --git a/app/test/meson.build b/app/test/meson.build index f34d19e3c3..7b6b69dcf1 100644 --- a/app/test/meson.build +++ b/app/test/meson.build @@ -111,6 +111,7 @@ test_sources = files( 'test_reciprocal_division_perf.c', 'test_red.c', 'test_pie.c', + 'test_pmu.c', 'test_reorder.c', 'test_rib.c', 'test_rib6.c', diff --git a/app/test/test_pmu.c b/app/test/test_pmu.c new file mode 100644 index 0000000000..a9bfb1a427 --- /dev/null +++ b/app/test/test_pmu.c @@ -0,0 +1,55 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(C) 2023 Marvell International Ltd. 
+ */ + +#include "test.h" + +#ifndef RTE_EXEC_ENV_LINUX + +static int +test_pmu(void) +{ + printf("pmu_autotest only supported on Linux, skipping test\n"); + return TEST_SKIPPED; +} + +#else + +#include + +static int +test_pmu_read(void) +{ + int tries = 10, event = -1; + uint64_t val = 0; + + if (rte_pmu_init() < 0) + return TEST_FAILED; + + while (tries--) + val += rte_pmu_read(event); + + rte_pmu_fini(); + + return val ? TEST_SUCCESS : TEST_FAILED; +} + +static struct unit_test_suite pmu_tests = { + .suite_name = "pmu autotest", + .setup = NULL, + .teardown = NULL, + .unit_test_cases = { + TEST_CASE(test_pmu_read), + TEST_CASES_END() + } +}; + +static int +test_pmu(void) +{ + return unit_test_suite_runner(&pmu_tests); +} + +#endif /* RTE_EXEC_ENV_LINUX */ + +REGISTER_TEST_COMMAND(pmu_autotest, test_pmu); diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md index de488c7abf..7f1938f92f 100644 --- a/doc/api/doxy-api-index.md +++ b/doc/api/doxy-api-index.md @@ -222,7 +222,8 @@ The public API headers are grouped by topics: [log](@ref rte_log.h), [errno](@ref rte_errno.h), [trace](@ref rte_trace.h), - [trace_point](@ref rte_trace_point.h) + [trace_point](@ref rte_trace_point.h), + [pmu](@ref rte_pmu.h) - **misc**: [EAL config](@ref rte_eal.h), diff --git a/doc/api/doxy-api.conf.in b/doc/api/doxy-api.conf.in index f0886c3bd1..920e615996 100644 --- a/doc/api/doxy-api.conf.in +++ b/doc/api/doxy-api.conf.in @@ -63,6 +63,7 @@ INPUT = @TOPDIR@/doc/api/doxy-api-index.md \ @TOPDIR@/lib/pci \ @TOPDIR@/lib/pdump \ @TOPDIR@/lib/pipeline \ + @TOPDIR@/lib/pmu \ @TOPDIR@/lib/port \ @TOPDIR@/lib/power \ @TOPDIR@/lib/rawdev \ diff --git a/doc/guides/prog_guide/profile_app.rst b/doc/guides/prog_guide/profile_app.rst index 14292d4c25..a8b501fe0c 100644 --- a/doc/guides/prog_guide/profile_app.rst +++ b/doc/guides/prog_guide/profile_app.rst @@ -7,6 +7,14 @@ Profile Your Application The following sections describe methods of profiling DPDK applications on different 
 architectures.
 
+Performance counter based profiling
+-----------------------------------
+
+The majority of architectures support some sort of hardware measurement unit which provides a set
+of programmable counters that monitor specific events. Different tools can gather that
+information, perf being one example. In some scenarios, however, e.g. when CPU cores are isolated
+(nohz_full) and run dedicated tasks, using perf is less than ideal. In such cases one can read
+specific events directly from the application via ``rte_pmu_read()``.
 
 Profiling on x86
 ----------------
diff --git a/doc/guides/rel_notes/release_23_03.rst b/doc/guides/rel_notes/release_23_03.rst
index 73f5d94e14..733541d56c 100644
--- a/doc/guides/rel_notes/release_23_03.rst
+++ b/doc/guides/rel_notes/release_23_03.rst
@@ -55,10 +55,19 @@ New Features
      Also, make sure to start the actual text at the margin.
      =======================================================
 
+* **Added PMU library.**
+
+  Added a new PMU (performance measurement unit) library which allows applications
+  to perform self-monitoring activities without depending on external utilities like perf.
+  After integration with :doc:`../prog_guide/trace_lib`, data gathered from hardware counters
+  can be stored in CTF format for further analysis.
+
 * **Updated AMD axgbe driver.**
 
   * Added multi-process support.
 
+* **Added multi-process support for axgbe PMD.**
+
 * **Updated Corigine nfp driver.**
 
   * Added support for meter options.
diff --git a/lib/meson.build b/lib/meson.build index a90fee31b7..7132131b5c 100644 --- a/lib/meson.build +++ b/lib/meson.build @@ -11,6 +11,7 @@ libraries = [ 'kvargs', # eal depends on kvargs 'telemetry', # basic info querying + 'pmu', 'eal', # everything depends on eal 'ring', 'rcu', # rcu depends on ring diff --git a/lib/pmu/meson.build b/lib/pmu/meson.build new file mode 100644 index 0000000000..a4160b494e --- /dev/null +++ b/lib/pmu/meson.build @@ -0,0 +1,13 @@ +# SPDX-License-Identifier: BSD-3-Clause +# Copyright(C) 2023 Marvell International Ltd. + +if not is_linux + build = false + reason = 'only supported on Linux' + subdir_done() +endif + +includes = [global_inc] + +sources = files('rte_pmu.c') +headers = files('rte_pmu.h') diff --git a/lib/pmu/pmu_private.h b/lib/pmu/pmu_private.h new file mode 100644 index 0000000000..849549b125 --- /dev/null +++ b/lib/pmu/pmu_private.h @@ -0,0 +1,29 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2023 Marvell + */ + +#ifndef _PMU_PRIVATE_H_ +#define _PMU_PRIVATE_H_ + +/** + * Architecture specific PMU init callback. + * + * @return + * 0 in case of success, negative value otherwise. + */ +int +pmu_arch_init(void); + +/** + * Architecture specific PMU cleanup callback. + */ +void +pmu_arch_fini(void); + +/** + * Apply architecture specific settings to config before passing it to syscall. + */ +void +pmu_arch_fixup_config(uint64_t config[3]); + +#endif /* _PMU_PRIVATE_H_ */ diff --git a/lib/pmu/rte_pmu.c b/lib/pmu/rte_pmu.c new file mode 100644 index 0000000000..4cf3161155 --- /dev/null +++ b/lib/pmu/rte_pmu.c @@ -0,0 +1,464 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(C) 2023 Marvell International Ltd. 
+ */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include + +#include "pmu_private.h" + +#define EVENT_SOURCE_DEVICES_PATH "/sys/bus/event_source/devices" + +#ifndef GENMASK_ULL +#define GENMASK_ULL(h, l) ((~0ULL - (1ULL << (l)) + 1) & (~0ULL >> ((64 - 1 - (h))))) +#endif + +#ifndef FIELD_PREP +#define FIELD_PREP(m, v) (((uint64_t)(v) << (__builtin_ffsll(m) - 1)) & (m)) +#endif + +RTE_DEFINE_PER_LCORE(struct rte_pmu_event_group, _event_group); +struct rte_pmu rte_pmu; + +/* + * Following __rte_weak functions provide default no-op. Architectures should override them if + * necessary. + */ + +int +__rte_weak pmu_arch_init(void) +{ + return 0; +} + +void +__rte_weak pmu_arch_fini(void) +{ +} + +void +__rte_weak pmu_arch_fixup_config(uint64_t __rte_unused config[3]) +{ +} + +static int +get_term_format(const char *name, int *num, uint64_t *mask) +{ + char *config = NULL; + char path[PATH_MAX]; + int high, low, ret; + FILE *fp; + + /* quiesce -Wmaybe-uninitialized warning */ + *num = 0; + *mask = 0; + + snprintf(path, sizeof(path), EVENT_SOURCE_DEVICES_PATH "/%s/format/%s", rte_pmu.name, name); + fp = fopen(path, "r"); + if (fp == NULL) + return -errno; + + errno = 0; + ret = fscanf(fp, "%m[^:]:%d-%d", &config, &low, &high); + if (ret < 2) { + ret = -ENODATA; + goto out; + } + if (errno) { + ret = -errno; + goto out; + } + + if (ret == 2) + high = low; + + *mask = GENMASK_ULL(high, low); + /* Last digit should be [012]. If last digit is missing 0 is implied. */ + *num = config[strlen(config) - 1]; + *num = isdigit(*num) ? 
*num - '0' : 0; + + ret = 0; +out: + free(config); + fclose(fp); + + return ret; +} + +static int +parse_event(char *buf, uint64_t config[3]) +{ + char *token, *term; + int num, ret, val; + uint64_t mask; + + config[0] = config[1] = config[2] = 0; + + token = strtok(buf, ","); + while (token) { + errno = 0; + /* = */ + ret = sscanf(token, "%m[^=]=%i", &term, &val); + if (ret < 1) + return -ENODATA; + if (errno) + return -errno; + if (ret == 1) + val = 1; + + ret = get_term_format(term, &num, &mask); + free(term); + if (ret) + return ret; + + config[num] |= FIELD_PREP(mask, val); + token = strtok(NULL, ","); + } + + return 0; +} + +static int +get_event_config(const char *name, uint64_t config[3]) +{ + char path[PATH_MAX], buf[BUFSIZ]; + FILE *fp; + int ret; + + snprintf(path, sizeof(path), EVENT_SOURCE_DEVICES_PATH "/%s/events/%s", rte_pmu.name, name); + fp = fopen(path, "r"); + if (fp == NULL) + return -errno; + + ret = fread(buf, 1, sizeof(buf), fp); + if (ret == 0) { + fclose(fp); + + return -EINVAL; + } + fclose(fp); + buf[ret] = '\0'; + + return parse_event(buf, config); +} + +static int +do_perf_event_open(uint64_t config[3], int group_fd) +{ + struct perf_event_attr attr = { + .size = sizeof(struct perf_event_attr), + .type = PERF_TYPE_RAW, + .exclude_kernel = 1, + .exclude_hv = 1, + .disabled = 1, + }; + + pmu_arch_fixup_config(config); + + attr.config = config[0]; + attr.config1 = config[1]; + attr.config2 = config[2]; + + return syscall(SYS_perf_event_open, &attr, 0, -1, group_fd, 0); +} + +static int +open_events(struct rte_pmu_event_group *group) +{ + struct rte_pmu_event *event; + uint64_t config[3]; + int num = 0, ret; + + /* group leader gets created first, with fd = -1 */ + group->fds[0] = -1; + + TAILQ_FOREACH(event, &rte_pmu.event_list, next) { + ret = get_event_config(event->name, config); + if (ret) + continue; + + ret = do_perf_event_open(config, group->fds[0]); + if (ret == -1) { + ret = -errno; + goto out; + } + + group->fds[event->index] = 
ret; + num++; + } + + return 0; +out: + for (--num; num >= 0; num--) { + close(group->fds[num]); + group->fds[num] = -1; + } + + + return ret; +} + +static int +mmap_events(struct rte_pmu_event_group *group) +{ + long page_size = sysconf(_SC_PAGE_SIZE); + unsigned int i; + void *addr; + int ret; + + for (i = 0; i < rte_pmu.num_group_events; i++) { + addr = mmap(0, page_size, PROT_READ, MAP_SHARED, group->fds[i], 0); + if (addr == MAP_FAILED) { + ret = -errno; + goto out; + } + + group->mmap_pages[i] = addr; + } + + return 0; +out: + for (; i; i--) { + munmap(group->mmap_pages[i - 1], page_size); + group->mmap_pages[i - 1] = NULL; + } + + return ret; +} + +static void +cleanup_events(struct rte_pmu_event_group *group) +{ + unsigned int i; + + if (group->fds[0] != -1) + ioctl(group->fds[0], PERF_EVENT_IOC_DISABLE, PERF_IOC_FLAG_GROUP); + + for (i = 0; i < rte_pmu.num_group_events; i++) { + if (group->mmap_pages[i]) { + munmap(group->mmap_pages[i], sysconf(_SC_PAGE_SIZE)); + group->mmap_pages[i] = NULL; + } + + if (group->fds[i] != -1) { + close(group->fds[i]); + group->fds[i] = -1; + } + } + + group->enabled = false; +} + +int __rte_noinline +rte_pmu_enable_group(void) +{ + struct rte_pmu_event_group *group = &RTE_PER_LCORE(_event_group); + int ret; + + if (rte_pmu.num_group_events == 0) + return -ENODEV; + + ret = open_events(group); + if (ret) + goto out; + + ret = mmap_events(group); + if (ret) + goto out; + + if (ioctl(group->fds[0], PERF_EVENT_IOC_RESET, PERF_IOC_FLAG_GROUP) == -1) { + ret = -errno; + goto out; + } + + if (ioctl(group->fds[0], PERF_EVENT_IOC_ENABLE, PERF_IOC_FLAG_GROUP) == -1) { + ret = -errno; + goto out; + } + + rte_spinlock_lock(&rte_pmu.lock); + TAILQ_INSERT_TAIL(&rte_pmu.event_group_list, group, next); + rte_spinlock_unlock(&rte_pmu.lock); + group->enabled = true; + + return 0; + +out: + cleanup_events(group); + + return ret; +} + +static int +scan_pmus(void) +{ + char path[PATH_MAX]; + struct dirent *dent; + const char *name; + DIR *dirp; 
+ + dirp = opendir(EVENT_SOURCE_DEVICES_PATH); + if (dirp == NULL) + return -errno; + + while ((dent = readdir(dirp))) { + name = dent->d_name; + if (name[0] == '.') + continue; + + /* sysfs entry should either contain cpus or be a cpu */ + if (!strcmp(name, "cpu")) + break; + + snprintf(path, sizeof(path), EVENT_SOURCE_DEVICES_PATH "/%s/cpus", name); + if (access(path, F_OK) == 0) + break; + } + + if (dent) { + rte_pmu.name = strdup(name); + if (rte_pmu.name == NULL) { + closedir(dirp); + + return -ENOMEM; + } + } + + closedir(dirp); + + return rte_pmu.name ? 0 : -ENODEV; +} + +static struct rte_pmu_event * +new_event(const char *name) +{ + struct rte_pmu_event *event; + + event = calloc(1, sizeof(*event)); + if (event == NULL) + goto out; + + event->name = strdup(name); + if (event->name == NULL) { + free(event); + event = NULL; + } + +out: + return event; +} + +static void +free_event(struct rte_pmu_event *event) +{ + free(event->name); + free(event); +} + +int +rte_pmu_add_event(const char *name) +{ + struct rte_pmu_event *event; + char path[PATH_MAX]; + + if (rte_pmu.name == NULL) + return -ENODEV; + + if (rte_pmu.num_group_events + 1 >= MAX_NUM_GROUP_EVENTS) + return -ENOSPC; + + snprintf(path, sizeof(path), EVENT_SOURCE_DEVICES_PATH "/%s/events/%s", rte_pmu.name, name); + if (access(path, R_OK)) + return -ENODEV; + + TAILQ_FOREACH(event, &rte_pmu.event_list, next) { + if (!strcmp(event->name, name)) + return event->index; + continue; + } + + event = new_event(name); + if (event == NULL) + return -ENOMEM; + + event->index = rte_pmu.num_group_events++; + TAILQ_INSERT_TAIL(&rte_pmu.event_list, event, next); + + return event->index; +} + +int +rte_pmu_init(void) +{ + int ret; + + /* Allow calling init from multiple contexts within a single thread. This simplifies + * resource management a bit e.g in case fast-path tracepoint has already been enabled + * via command line but application doesn't care enough and performs init/fini again. 
+ */ + if (rte_pmu.initialized) { + rte_pmu.initialized++; + return 0; + } + + ret = scan_pmus(); + if (ret) + goto out; + + ret = pmu_arch_init(); + if (ret) + goto out; + + TAILQ_INIT(&rte_pmu.event_list); + TAILQ_INIT(&rte_pmu.event_group_list); + rte_spinlock_init(&rte_pmu.lock); + rte_pmu.initialized = 1; + + return 0; +out: + free(rte_pmu.name); + rte_pmu.name = NULL; + + return ret; +} + +void +rte_pmu_fini(void) +{ + struct rte_pmu_event_group *group, *tmp_group; + struct rte_pmu_event *event, *tmp_event; + + /* cleanup once init count drops to zero */ + if (!rte_pmu.initialized || --rte_pmu.initialized) + return; + + RTE_TAILQ_FOREACH_SAFE(event, &rte_pmu.event_list, next, tmp_event) { + TAILQ_REMOVE(&rte_pmu.event_list, event, next); + free_event(event); + } + + RTE_TAILQ_FOREACH_SAFE(group, &rte_pmu.event_group_list, next, tmp_group) { + TAILQ_REMOVE(&rte_pmu.event_group_list, group, next); + cleanup_events(group); + } + + pmu_arch_fini(); + free(rte_pmu.name); + rte_pmu.name = NULL; + rte_pmu.num_group_events = 0; +} diff --git a/lib/pmu/rte_pmu.h b/lib/pmu/rte_pmu.h new file mode 100644 index 0000000000..e360375a0c --- /dev/null +++ b/lib/pmu/rte_pmu.h @@ -0,0 +1,205 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2023 Marvell + */ + +#ifndef _RTE_PMU_H_ +#define _RTE_PMU_H_ + +/** + * @file + * + * PMU event tracing operations + * + * This file defines generic API and types necessary to setup PMU and + * read selected counters in runtime. + */ + +#ifdef __cplusplus +extern "C" { +#endif + +#include + +#include +#include +#include +#include +#include + +/** Maximum number of events in a group */ +#define MAX_NUM_GROUP_EVENTS 8 + +/** + * A structure describing a group of events. 
+ */ +struct rte_pmu_event_group { + struct perf_event_mmap_page *mmap_pages[MAX_NUM_GROUP_EVENTS]; /**< array of user pages */ + int fds[MAX_NUM_GROUP_EVENTS]; /**< array of event descriptors */ + bool enabled; /**< true if group was enabled on particular lcore */ + TAILQ_ENTRY(rte_pmu_event_group) next; /**< list entry */ +} __rte_cache_aligned; + +/** + * A structure describing an event. + */ +struct rte_pmu_event { + char *name; /**< name of an event */ + unsigned int index; /**< event index into fds/mmap_pages */ + TAILQ_ENTRY(rte_pmu_event) next; /**< list entry */ +}; + +/** + * A PMU state container. + */ +struct rte_pmu { + char *name; /**< name of core PMU listed under /sys/bus/event_source/devices */ + rte_spinlock_t lock; /**< serialize access to event group list */ + TAILQ_HEAD(, rte_pmu_event_group) event_group_list; /**< list of event groups */ + unsigned int num_group_events; /**< number of events in a group */ + TAILQ_HEAD(, rte_pmu_event) event_list; /**< list of matching events */ + unsigned int initialized; /**< initialization counter */ +}; + +/** lcore event group */ +RTE_DECLARE_PER_LCORE(struct rte_pmu_event_group, _event_group); + +/** PMU state container */ +extern struct rte_pmu rte_pmu; + +/** Each architecture supporting PMU needs to provide its own version */ +#ifndef rte_pmu_pmc_read +#define rte_pmu_pmc_read(index) ({ 0; }) +#endif + +/** + * @internal + * + * Read PMU counter. + * + * @param pc + * Pointer to the mmapped user page. + * @return + * Counter value read from hardware. 
+ */ +__rte_internal +static __rte_always_inline uint64_t +rte_pmu_read_userpage(struct perf_event_mmap_page *pc) +{ + uint64_t width, offset; + uint32_t seq, index; + int64_t pmc; + + for (;;) { + seq = pc->lock; + rte_compiler_barrier(); + index = pc->index; + offset = pc->offset; + width = pc->pmc_width; + + /* index set to 0 means that particular counter cannot be used */ + if (likely(pc->cap_user_rdpmc && index)) { + pmc = rte_pmu_pmc_read(index - 1); + pmc <<= 64 - width; + pmc >>= 64 - width; + offset += pmc; + } + + rte_compiler_barrier(); + + if (likely(pc->lock == seq)) + return offset; + } + + return 0; +} + +/** + * @internal + * + * Enable group of events on the calling lcore. + * + * @return + * 0 in case of success, negative value otherwise. + */ +__rte_internal +int +rte_pmu_enable_group(void); + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice + * + * Initialize PMU library. + * + * @return + * 0 in case of success, negative value otherwise. + */ +__rte_experimental +int +rte_pmu_init(void); + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice + * + * Finalize PMU library. This should be called after PMU counters are no longer being read. + */ +__rte_experimental +void +rte_pmu_fini(void); + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice + * + * Add event to the group of enabled events. + * + * @param name + * Name of an event listed under /sys/bus/event_source/devices/pmu/events. + * @return + * Event index in case of success, negative value otherwise. + */ +__rte_experimental +int +rte_pmu_add_event(const char *name); + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice + * + * Read hardware counter configured to count occurrences of an event. + * + * @param index + * Index of an event to be read. + * @return + * Event value read from register. In case of errors or lack of support + * 0 is returned. 
In other words, a stream of zeros in a trace file
+ * indicates a problem with reading a particular PMU event register.
+ */
+__rte_experimental
+static __rte_always_inline uint64_t
+rte_pmu_read(unsigned int index)
+{
+	struct rte_pmu_event_group *group = &RTE_PER_LCORE(_event_group);
+	int ret;
+
+	if (unlikely(!rte_pmu.initialized))
+		return 0;
+
+	if (unlikely(!group->enabled)) {
+		ret = rte_pmu_enable_group();
+		if (ret)
+			return 0;
+	}
+
+	if (unlikely(index >= rte_pmu.num_group_events))
+		return 0;
+
+	return rte_pmu_read_userpage(group->mmap_pages[index]);
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_PMU_H_ */
diff --git a/lib/pmu/version.map b/lib/pmu/version.map
new file mode 100644
index 0000000000..50fb0f354e
--- /dev/null
+++ b/lib/pmu/version.map
@@ -0,0 +1,20 @@
+DPDK_23 {
+	local: *;
+};
+
+EXPERIMENTAL {
+	global:
+
+	per_lcore__event_group;
+	rte_pmu;
+	rte_pmu_add_event;
+	rte_pmu_fini;
+	rte_pmu_init;
+	rte_pmu_read;
+};
+
+INTERNAL {
+	global:
+
+	rte_pmu_enable_group;
+};

From patchwork Thu Feb 2 12:49:49 2023
X-Patchwork-Submitter: Tomasz Duszynski
X-Patchwork-Id: 122953
X-Patchwork-Delegate: david.marchand@redhat.com
From: Tomasz Duszynski
To: Tomasz Duszynski, Ruifeng Wang
Subject: [PATCH v9 2/4] pmu: support reading ARM PMU events in runtime
Date: Thu, 2 Feb 2023 13:49:49 +0100
Message-ID: <20230202124951.2915770-3-tduszynski@marvell.com>
In-Reply-To: <20230202124951.2915770-1-tduszynski@marvell.com>
References: <20230202094358.2838758-1-tduszynski@marvell.com> <20230202124951.2915770-1-tduszynski@marvell.com>
List-Id: DPDK patches and discussions

Add support for reading ARM PMU events at runtime.

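The per-architecture rte_pmu_pmc_read() added by this patch supplies the raw counter value that rte_pmu_read_userpage() from patch 1/4 sign-extends to pmc_width bits and adds to the offset taken from the mmapped user page. Below is a minimal standalone sketch of that arithmetic, assuming the raw value is passed in as a plain integer rather than read from hardware. The function names sign_extend_counter, counter_value and counter_selfcheck are hypothetical and not part of the series. Unlike the patch, the sketch shifts an unsigned value left before converting it to a signed type, which avoids left-shifting a negative number.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: interpret the low `width` bits of raw as a signed
 * value. width must be in the range 1..64. Converting an out-of-range
 * value to int64_t and right-shifting a negative int64_t are both
 * implementation-defined in C; GCC and Clang wrap and shift arithmetically.
 */
static int64_t
sign_extend_counter(uint64_t raw, unsigned int width)
{
	unsigned int shift = 64 - width;

	return (int64_t)(raw << shift) >> shift;
}

/*
 * Hypothetical helper mirroring the arithmetic in rte_pmu_read_userpage():
 * the kernel-provided offset plus the sign-extended hardware counter.
 */
static uint64_t
counter_value(int64_t offset, uint64_t raw, unsigned int width)
{
	return (uint64_t)offset + (uint64_t)sign_extend_counter(raw, width);
}

/* Small usage example with assumed input values. */
static void
counter_selfcheck(void)
{
	/* A 32-bit counter holding all ones reads as -1. */
	assert(sign_extend_counter(0xFFFFFFFFULL, 32) == -1);
	assert(counter_value(100, 0xFFFFFFFFULL, 32) == 99);
}
```

The sketch leaves out the seqlock retry loop around pc->lock and the cap_user_rdpmc and index checks, which the real function needs in order to get a consistent snapshot of the user page.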
Signed-off-by: Tomasz Duszynski Acked-by: Morten Brørup --- app/test/test_pmu.c | 4 ++ lib/pmu/meson.build | 7 +++ lib/pmu/pmu_arm64.c | 94 +++++++++++++++++++++++++++++++++++++ lib/pmu/rte_pmu.h | 4 ++ lib/pmu/rte_pmu_pmc_arm64.h | 30 ++++++++++++ 5 files changed, 139 insertions(+) create mode 100644 lib/pmu/pmu_arm64.c create mode 100644 lib/pmu/rte_pmu_pmc_arm64.h diff --git a/app/test/test_pmu.c b/app/test/test_pmu.c index a9bfb1a427..623e04b691 100644 --- a/app/test/test_pmu.c +++ b/app/test/test_pmu.c @@ -26,6 +26,10 @@ test_pmu_read(void) if (rte_pmu_init() < 0) return TEST_FAILED; +#if defined(RTE_ARCH_ARM64) + event = rte_pmu_add_event("cpu_cycles"); +#endif + while (tries--) val += rte_pmu_read(event); diff --git a/lib/pmu/meson.build b/lib/pmu/meson.build index a4160b494e..e857681137 100644 --- a/lib/pmu/meson.build +++ b/lib/pmu/meson.build @@ -11,3 +11,10 @@ includes = [global_inc] sources = files('rte_pmu.c') headers = files('rte_pmu.h') +indirect_headers += files( + 'rte_pmu_pmc_arm64.h', +) + +if dpdk_conf.has('RTE_ARCH_ARM64') + sources += files('pmu_arm64.c') +endif diff --git a/lib/pmu/pmu_arm64.c b/lib/pmu/pmu_arm64.c new file mode 100644 index 0000000000..9e15727948 --- /dev/null +++ b/lib/pmu/pmu_arm64.c @@ -0,0 +1,94 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(C) 2023 Marvell International Ltd. 
+ */ + +#include +#include +#include +#include + +#include +#include + +#include "pmu_private.h" + +#define PERF_USER_ACCESS_PATH "/proc/sys/kernel/perf_user_access" + +static int restore_uaccess; + +static int +read_attr_int(const char *path, int *val) +{ + char buf[BUFSIZ]; + int ret, fd; + + fd = open(path, O_RDONLY); + if (fd == -1) + return -errno; + + ret = read(fd, buf, sizeof(buf)); + if (ret == -1) { + close(fd); + + return -errno; + } + + *val = strtol(buf, NULL, 10); + close(fd); + + return 0; +} + +static int +write_attr_int(const char *path, int val) +{ + char buf[BUFSIZ]; + int num, ret, fd; + + fd = open(path, O_WRONLY); + if (fd == -1) + return -errno; + + num = snprintf(buf, sizeof(buf), "%d", val); + ret = write(fd, buf, num); + if (ret == -1) { + close(fd); + + return -errno; + } + + close(fd); + + return 0; +} + +int +pmu_arch_init(void) +{ + int ret; + + ret = read_attr_int(PERF_USER_ACCESS_PATH, &restore_uaccess); + if (ret) + return ret; + + /* user access already enabled */ + if (restore_uaccess == 1) + return 0; + + return write_attr_int(PERF_USER_ACCESS_PATH, 1); +} + +void +pmu_arch_fini(void) +{ + write_attr_int(PERF_USER_ACCESS_PATH, restore_uaccess); +} + +void +pmu_arch_fixup_config(uint64_t config[3]) +{ + /* select 64 bit counters */ + config[1] |= RTE_BIT64(0); + /* enable userspace access */ + config[1] |= RTE_BIT64(1); +} diff --git a/lib/pmu/rte_pmu.h b/lib/pmu/rte_pmu.h index e360375a0c..b18938dab1 100644 --- a/lib/pmu/rte_pmu.h +++ b/lib/pmu/rte_pmu.h @@ -26,6 +26,10 @@ extern "C" { #include #include +#if defined(RTE_ARCH_ARM64) +#include "rte_pmu_pmc_arm64.h" +#endif + /** Maximum number of events in a group */ #define MAX_NUM_GROUP_EVENTS 8 diff --git a/lib/pmu/rte_pmu_pmc_arm64.h b/lib/pmu/rte_pmu_pmc_arm64.h new file mode 100644 index 0000000000..10648f0c5f --- /dev/null +++ b/lib/pmu/rte_pmu_pmc_arm64.h @@ -0,0 +1,30 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2023 Marvell. 
+ */
+#ifndef _RTE_PMU_PMC_ARM64_H_
+#define _RTE_PMU_PMC_ARM64_H_
+
+#include
+
+static __rte_always_inline uint64_t
+rte_pmu_pmc_read(int index)
+{
+	uint64_t val;
+
+	if (index == 31) {
+		/* CPU Cycles (0x11) must be read via pmccntr_el0 */
+		asm volatile("mrs %0, pmccntr_el0" : "=r" (val));
+	} else {
+		asm volatile(
+			"msr pmselr_el0, %x1\n"
+			"mrs %0, pmxevcntr_el0\n"
+			: "=r" (val)
+			: "rZ" (index)
+		);
+	}
+
+	return val;
+}
+#define rte_pmu_pmc_read rte_pmu_pmc_read
+
+#endif /* _RTE_PMU_PMC_ARM64_H_ */

From patchwork Thu Feb 2 12:49:50 2023
X-Patchwork-Submitter: Tomasz Duszynski
X-Patchwork-Id: 122954
X-Patchwork-Delegate: david.marchand@redhat.com
From: Tomasz Duszynski
To: Tomasz Duszynski
Subject: [PATCH v9 3/4] pmu: support reading Intel x86_64 PMU events in runtime
Date: Thu, 2 Feb 2023 13:49:50 +0100
Message-ID: <20230202124951.2915770-4-tduszynski@marvell.com>
In-Reply-To: <20230202124951.2915770-1-tduszynski@marvell.com>
References: <20230202094358.2838758-1-tduszynski@marvell.com> <20230202124951.2915770-1-tduszynski@marvell.com>
List-Id: DPDK patches and discussions

Add support for reading Intel x86_64 PMU events at runtime.

Signed-off-by: Tomasz Duszynski Acked-by: Morten Brørup --- app/test/test_pmu.c | 2 ++ lib/pmu/meson.build | 1 + lib/pmu/rte_pmu.h | 2 ++ lib/pmu/rte_pmu_pmc_x86_64.h | 24 ++++++++++++++++++++++++ 4 files changed, 29 insertions(+) create mode 100644 lib/pmu/rte_pmu_pmc_x86_64.h diff --git a/app/test/test_pmu.c b/app/test/test_pmu.c index 623e04b691..614395482f 100644 --- a/app/test/test_pmu.c +++ b/app/test/test_pmu.c @@ -28,6 +28,8 @@ test_pmu_read(void) #if defined(RTE_ARCH_ARM64) event = rte_pmu_add_event("cpu_cycles"); +#elif defined(RTE_ARCH_X86_64) + event = rte_pmu_add_event("cpu-cycles"); #endif while (tries--) diff --git a/lib/pmu/meson.build b/lib/pmu/meson.build index e857681137..5b92e5c4e3 100644 --- a/lib/pmu/meson.build +++ b/lib/pmu/meson.build @@ -13,6 +13,7 @@ sources = files('rte_pmu.c') headers = files('rte_pmu.h') indirect_headers += files( 'rte_pmu_pmc_arm64.h', + 'rte_pmu_pmc_x86_64.h', ) if dpdk_conf.has('RTE_ARCH_ARM64') diff --git a/lib/pmu/rte_pmu.h b/lib/pmu/rte_pmu.h index b18938dab1..0f7004c31c 100644 --- a/lib/pmu/rte_pmu.h +++ b/lib/pmu/rte_pmu.h @@ -28,6 +28,8 @@ extern "C" { #if defined(RTE_ARCH_ARM64) #include "rte_pmu_pmc_arm64.h" +#elif defined(RTE_ARCH_X86_64) +#include "rte_pmu_pmc_x86_64.h" #endif /** Maximum number of events in a group */ diff --git a/lib/pmu/rte_pmu_pmc_x86_64.h b/lib/pmu/rte_pmu_pmc_x86_64.h new file mode 100644 index 0000000000..7b67466960 --- /dev/null +++ b/lib/pmu/rte_pmu_pmc_x86_64.h @@ -0,0 +1,24 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2023 Marvell. 
+ */ +#ifndef _RTE_PMU_PMC_X86_64_H_ +#define _RTE_PMU_PMC_X86_64_H_ + +#include + +static __rte_always_inline uint64_t +rte_pmu_pmc_read(int index) +{ + uint64_t low, high; + + asm volatile( + "rdpmc\n" + : "=a" (low), "=d" (high) + : "c" (index) + ); + + return low | (high << 32); +} +#define rte_pmu_pmc_read rte_pmu_pmc_read + +#endif /* _RTE_PMU_PMC_X86_64_H_ */ From patchwork Thu Feb 2 12:49:51 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Tomasz Duszynski X-Patchwork-Id: 122955 X-Patchwork-Delegate: david.marchand@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 80EF341BAE; Thu, 2 Feb 2023 13:50:27 +0100 (CET) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id D405942FA8; Thu, 2 Feb 2023 13:50:18 +0100 (CET) Received: from mx0b-0016f401.pphosted.com (mx0b-0016f401.pphosted.com [67.231.156.173]) by mails.dpdk.org (Postfix) with ESMTP id B688B40689 for ; Thu, 2 Feb 2023 13:50:16 +0100 (CET) Received: from pps.filterd (m0045851.ppops.net [127.0.0.1]) by mx0b-0016f401.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 3127GW7C007589; Thu, 2 Feb 2023 04:50:14 -0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=marvell.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-type : content-transfer-encoding; s=pfpt0220; bh=tN8t/Gpvra0lzbgE+T/fKh3goqVz2KTEifS4xeCaC04=; b=RgJswxjtTR7kl4ENGeR4gBIsJS12K1fTR0Cigf2oo8ccLfD8Ie5xAmgCYYo458kcQX/6 ZPLLBjNnzcDFQyByNN54k9ULRPU7+zx1sqCLYn8cPHD3TraB81UUUs78B52YnRoCaxBV mWfANYfk8HdUCqC2N2p6Ba5GE11aVn6TdzeNYnca9GusNYfAzGV0rttvdMalfX4pNs4a +IR9g4abcOvRLr6yFFeEe5nu03TR1Fk+iyWgGII7ENZ3WV65C5TAJBWSiEopxjBOvcNU 3WXnnhPB7j4lFMwJywH2dSB1dfKFk8b1dgVT7ZEdcmbWHoOC5TsEpFAzzg7/hyvuHIXC fQ== Received: from 
dc5-exch01.marvell.com ([199.233.59.181]) by mx0b-0016f401.pphosted.com (PPS) with ESMTPS id 3nfjrj833f-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-SHA384 bits=256 verify=NOT); Thu, 02 Feb 2023 04:50:14 -0800 Received: from DC5-EXCH01.marvell.com (10.69.176.38) by DC5-EXCH01.marvell.com (10.69.176.38) with Microsoft SMTP Server (TLS) id 15.0.1497.42; Thu, 2 Feb 2023 04:50:12 -0800 Received: from maili.marvell.com (10.69.176.80) by DC5-EXCH01.marvell.com (10.69.176.38) with Microsoft SMTP Server id 15.0.1497.42 via Frontend Transport; Thu, 2 Feb 2023 04:50:12 -0800 Received: from cavium-DT10.. (unknown [10.28.34.39]) by maili.marvell.com (Postfix) with ESMTP id A95AF3F7080; Thu, 2 Feb 2023 04:50:08 -0800 (PST) From: Tomasz Duszynski To: , Jerin Jacob , Sunil Kumar Kori , Tomasz Duszynski CC: , , , , , , Subject: [PATCH v9 4/4] eal: add PMU support to tracing library Date: Thu, 2 Feb 2023 13:49:51 +0100 Message-ID: <20230202124951.2915770-5-tduszynski@marvell.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230202124951.2915770-1-tduszynski@marvell.com> References: <20230202094358.2838758-1-tduszynski@marvell.com> <20230202124951.2915770-1-tduszynski@marvell.com> MIME-Version: 1.0 X-Proofpoint-ORIG-GUID: WkBWStPSQgzPQZ2cjB75jnGkLbqqnYQt X-Proofpoint-GUID: WkBWStPSQgzPQZ2cjB75jnGkLbqqnYQt X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.219,Aquarius:18.0.930,Hydra:6.0.562,FMLib:17.11.122.1 definitions=2023-02-02_04,2023-02-02_01,2022-06-22_01 X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org In order to profile an app one needs to store a significant number of samples somewhere for later analysis. Since the trace library supports storing data in CTF format, let's take advantage of that and add a dedicated PMU tracepoint.
Signed-off-by: Tomasz Duszynski Acked-by: Morten Brørup --- app/test/test_trace_perf.c | 10 ++++ doc/guides/prog_guide/profile_app.rst | 5 ++ doc/guides/prog_guide/trace_lib.rst | 32 +++++++++++++ lib/eal/common/eal_common_trace.c | 13 ++++- lib/eal/common/eal_common_trace_points.c | 5 ++ lib/eal/include/rte_eal_trace.h | 13 +++++ lib/eal/meson.build | 3 ++ lib/eal/version.map | 1 + lib/pmu/rte_pmu.c | 61 ++++++++++++++++++++++++ lib/pmu/rte_pmu.h | 14 ++++++ lib/pmu/version.map | 1 + 11 files changed, 157 insertions(+), 1 deletion(-) diff --git a/app/test/test_trace_perf.c b/app/test/test_trace_perf.c index 46ae7d8074..f1929f2734 100644 --- a/app/test/test_trace_perf.c +++ b/app/test/test_trace_perf.c @@ -114,6 +114,10 @@ worker_fn_##func(void *arg) \ #define GENERIC_DOUBLE rte_eal_trace_generic_double(3.66666) #define GENERIC_STR rte_eal_trace_generic_str("hello world") #define VOID_FP app_dpdk_test_fp() +#ifdef RTE_EXEC_ENV_LINUX +/* 0 corresponds to the first event passed via --trace= */ +#define READ_PMU rte_eal_trace_pmu_read(0) +#endif WORKER_DEFINE(GENERIC_VOID) WORKER_DEFINE(GENERIC_U64) @@ -122,6 +126,9 @@ WORKER_DEFINE(GENERIC_FLOAT) WORKER_DEFINE(GENERIC_DOUBLE) WORKER_DEFINE(GENERIC_STR) WORKER_DEFINE(VOID_FP) +#ifdef RTE_EXEC_ENV_LINUX +WORKER_DEFINE(READ_PMU) +#endif static void run_test(const char *str, lcore_function_t f, struct test_data *data, size_t sz) @@ -174,6 +181,9 @@ test_trace_perf(void) run_test("double", worker_fn_GENERIC_DOUBLE, data, sz); run_test("string", worker_fn_GENERIC_STR, data, sz); run_test("void_fp", worker_fn_VOID_FP, data, sz); +#ifdef RTE_EXEC_ENV_LINUX + run_test("read_pmu", worker_fn_READ_PMU, data, sz); +#endif rte_free(data); return TEST_SUCCESS; diff --git a/doc/guides/prog_guide/profile_app.rst b/doc/guides/prog_guide/profile_app.rst index a8b501fe0c..6a53341c6b 100644 --- a/doc/guides/prog_guide/profile_app.rst +++ b/doc/guides/prog_guide/profile_app.rst @@ -16,6 +16,11 @@ that information, perf being an example here.
Though in some scenarios, eg. when isolated (nohz_full) and run dedicated tasks, using perf is less than ideal. In such cases one can read specific events directly from application via ``rte_pmu_read()``. +Alternatively, the tracing library can be used, which offers a dedicated tracepoint +``rte_eal_trace_pmu_read()``. + +Refer to :doc:`../prog_guide/trace_lib` for more details. + Profiling on x86 ---------------- diff --git a/doc/guides/prog_guide/trace_lib.rst b/doc/guides/prog_guide/trace_lib.rst index 9a8f38073d..a8e97ee1ec 100644 --- a/doc/guides/prog_guide/trace_lib.rst +++ b/doc/guides/prog_guide/trace_lib.rst @@ -46,6 +46,7 @@ DPDK tracing library features trace format and is compatible with ``LTTng``. For detailed information, refer to `Common Trace Format `_. +- Support reading PMU events on ARM64 and x86-64 (Intel) How to add a tracepoint? ------------------------ @@ -137,6 +138,37 @@ the user must use ``RTE_TRACE_POINT_FP`` instead of ``RTE_TRACE_POINT``. ``RTE_TRACE_POINT_FP`` is compiled out by default and it can be enabled using the ``enable_trace_fp`` option for meson build. +PMU tracepoint +-------------- + +Performance measurement unit (PMU) event values can be read from hardware +registers using the predefined ``rte_eal_trace_pmu_read`` tracepoint. + +Tracing is enabled via the ``--trace`` EAL option by passing both an expression +matching the PMU tracepoint name, i.e. ``lib.eal.pmu.read``, and an expression +``e=ev1[,ev2,...]`` matching particular events:: + + --trace='.*pmu.read\|e=cpu_cycles,l1d_cache' + +Event names are available under the ``/sys/bus/event_source/devices/PMU/events`` +directory, where ``PMU`` is a placeholder for either a ``cpu`` or a directory +containing ``cpus``. + +Contrary to other tracepoints, this one does not need any extra variables +added to source files. Instead, the caller passes an index which follows the order of +events specified via the ``--trace`` parameter.
In the following example index ``0`` +corresponds to ``cpu_cycles`` while index ``1`` corresponds to ``l1d_cache``. + +.. code-block:: c + + ... + rte_eal_trace_pmu_read(0); + rte_eal_trace_pmu_read(1); + ... + +PMU tracing support must be explicitly enabled using the ``enable_trace_fp`` +option for meson build. + Event record mode ----------------- diff --git a/lib/eal/common/eal_common_trace.c b/lib/eal/common/eal_common_trace.c index 75162b722d..8796052d0c 100644 --- a/lib/eal/common/eal_common_trace.c +++ b/lib/eal/common/eal_common_trace.c @@ -11,6 +11,9 @@ #include #include #include +#ifdef RTE_EXEC_ENV_LINUX +#include +#endif #include #include "eal_trace.h" @@ -71,8 +74,13 @@ eal_trace_init(void) goto free_meta; /* Apply global configurations */ - STAILQ_FOREACH(arg, &trace.args, next) + STAILQ_FOREACH(arg, &trace.args, next) { trace_args_apply(arg->val); +#ifdef RTE_EXEC_ENV_LINUX + if (rte_pmu_init() == 0) + rte_pmu_add_events_by_pattern(arg->val); +#endif + } rte_trace_mode_set(trace.mode); @@ -88,6 +96,9 @@ eal_trace_init(void) void eal_trace_fini(void) { +#ifdef RTE_EXEC_ENV_LINUX + rte_pmu_fini(); +#endif trace_mem_free(); trace_metadata_destroy(); eal_trace_args_free(); diff --git a/lib/eal/common/eal_common_trace_points.c b/lib/eal/common/eal_common_trace_points.c index 0b0b254615..1e46ce549a 100644 --- a/lib/eal/common/eal_common_trace_points.c +++ b/lib/eal/common/eal_common_trace_points.c @@ -75,3 +75,8 @@ RTE_TRACE_POINT_REGISTER(rte_eal_trace_intr_enable, lib.eal.intr.enable) RTE_TRACE_POINT_REGISTER(rte_eal_trace_intr_disable, lib.eal.intr.disable) + +#ifdef RTE_EXEC_ENV_LINUX +RTE_TRACE_POINT_REGISTER(rte_eal_trace_pmu_read, + lib.eal.pmu.read) +#endif diff --git a/lib/eal/include/rte_eal_trace.h b/lib/eal/include/rte_eal_trace.h index 5ef4398230..afb459b198 100644 --- a/lib/eal/include/rte_eal_trace.h +++ b/lib/eal/include/rte_eal_trace.h @@ -17,6 +17,9 @@ extern "C" { #include #include +#ifdef RTE_EXEC_ENV_LINUX +#include +#endif #include
#include "eal_interrupts.h" @@ -279,6 +282,16 @@ RTE_TRACE_POINT( rte_trace_point_emit_string(cpuset); ) +#ifdef RTE_EXEC_ENV_LINUX +RTE_TRACE_POINT_FP( + rte_eal_trace_pmu_read, + RTE_TRACE_POINT_ARGS(unsigned int index), + uint64_t val; + val = rte_pmu_read(index); + rte_trace_point_emit_u64(val); +) +#endif + #ifdef __cplusplus } #endif diff --git a/lib/eal/meson.build b/lib/eal/meson.build index 056beb9461..f5865dbcd9 100644 --- a/lib/eal/meson.build +++ b/lib/eal/meson.build @@ -26,6 +26,9 @@ deps += ['kvargs'] if not is_windows deps += ['telemetry'] endif +if is_linux + deps += ['pmu'] +endif if dpdk_conf.has('RTE_USE_LIBBSD') ext_deps += libbsd endif diff --git a/lib/eal/version.map b/lib/eal/version.map index 6523102157..2f8f66874b 100644 --- a/lib/eal/version.map +++ b/lib/eal/version.map @@ -441,6 +441,7 @@ EXPERIMENTAL { rte_thread_join; # added in 23.03 + __rte_eal_trace_pmu_read; # WINDOWS_NO_EXPORT rte_thread_set_name; }; diff --git a/lib/pmu/rte_pmu.c b/lib/pmu/rte_pmu.c index 4cf3161155..f1c5630344 100644 --- a/lib/pmu/rte_pmu.c +++ b/lib/pmu/rte_pmu.c @@ -402,6 +402,67 @@ rte_pmu_add_event(const char *name) return event->index; } +static int +add_events(const char *pattern) +{ + char *token, *copy; + int ret = 0; + + copy = strdup(pattern); + if (copy == NULL) + return -ENOMEM; + + token = strtok(copy, ","); + while (token) { + ret = rte_pmu_add_event(token); + if (ret < 0) + break; + + token = strtok(NULL, ","); + } + + free(copy); + + return ret >= 0 ? 0 : ret; +} + +int +rte_pmu_add_events_by_pattern(const char *pattern) +{ + regmatch_t rmatch; + char buf[BUFSIZ]; + unsigned int num; + regex_t reg; + int ret; + + /* events are matched against occurrences of e=ev1[,ev2,..] 
pattern */ + ret = regcomp(&reg, "e=([_[:alnum:]-],?)+", REG_EXTENDED); + if (ret) + return -EINVAL; + + for (;;) { + if (regexec(&reg, pattern, 1, &rmatch, 0)) + break; + + num = rmatch.rm_eo - rmatch.rm_so; + if (num > sizeof(buf)) + num = sizeof(buf); + + /* skip e= pattern prefix */ + memcpy(buf, pattern + rmatch.rm_so + 2, num - 2); + buf[num - 2] = '\0'; + ret = add_events(buf); + if (ret) + break; + + pattern += rmatch.rm_eo; + } + + regfree(&reg); + + return ret; +} + int rte_pmu_init(void) { diff --git a/lib/pmu/rte_pmu.h b/lib/pmu/rte_pmu.h index 0f7004c31c..0f6250e81f 100644 --- a/lib/pmu/rte_pmu.h +++ b/lib/pmu/rte_pmu.h @@ -169,6 +169,20 @@ __rte_experimental int rte_pmu_add_event(const char *name); +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice + * + * Add events matching pattern to the group of enabled events. + * + * @param pattern + * Pattern e=ev1[,ev2,...] matching events, where evX is a placeholder for an event listed under + * /sys/bus/event_source/devices/pmu/events. + */ +__rte_experimental +int +rte_pmu_add_events_by_pattern(const char *pattern); + /** * @warning * @b EXPERIMENTAL: this API may change without prior notice diff --git a/lib/pmu/version.map b/lib/pmu/version.map index 50fb0f354e..20a27d085c 100644 --- a/lib/pmu/version.map +++ b/lib/pmu/version.map @@ -8,6 +8,7 @@ EXPERIMENTAL { per_lcore__event_group; rte_pmu; rte_pmu_add_event; + rte_pmu_add_events_by_pattern; rte_pmu_fini; rte_pmu_init; rte_pmu_read;
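Note for readers of this series: the e=ev1[,ev2,...] matching done by rte_pmu_add_events_by_pattern() can be tried outside of DPDK. The following is only a minimal standalone sketch, assuming a POSIX system. It reuses the regular expression from the patch, but collect_events(), MAX_EVENTS and MAX_NAME are hypothetical names introduced for illustration, and the event names are copied into an array rather than being passed to rte_pmu_add_event().

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <regex.h>
#include <stdio.h>
#include <string.h>

#define MAX_EVENTS 8	/* hypothetical limit, chosen for this sketch only */
#define MAX_NAME 64	/* hypothetical limit, chosen for this sketch only */

/*
 * Hypothetical helper (not part of the patch): scan a --trace style string
 * for e=ev1[,ev2,...] groups and copy each event name into names[].
 * Returns the number of names stored, or -1 if the regex fails to compile.
 */
static int
collect_events(const char *pattern, char names[][MAX_NAME], int max)
{
	regmatch_t rmatch;
	char buf[256];
	regex_t reg;
	int count = 0;

	/* same expression as used in the patch */
	if (regcomp(&reg, "e=([_[:alnum:]-],?)+", REG_EXTENDED))
		return -1;

	while (regexec(&reg, pattern, 1, &rmatch, 0) == 0) {
		/* length of the match without the leading "e=" */
		size_t len = (size_t)(rmatch.rm_eo - rmatch.rm_so) - 2;
		char *token, *save;

		if (len >= sizeof(buf))
			len = sizeof(buf) - 1;

		memcpy(buf, pattern + rmatch.rm_so + 2, len);
		buf[len] = '\0';

		/* split the comma separated list into single event names */
		for (token = strtok_r(buf, ",", &save);
		     token != NULL && count < max;
		     token = strtok_r(NULL, ",", &save)) {
			snprintf(names[count], MAX_NAME, "%s", token);
			count++;
		}

		/* continue searching right after the current match */
		pattern += rmatch.rm_eo;
	}

	regfree(&reg);

	return count;
}
```

Calling collect_events() with the string from the documentation example, '.*pmu.read\|e=cpu_cycles,l1d_cache', should store cpu_cycles at position 0 and l1d_cache at position 1, which is the same order as the indices later passed to rte_eal_trace_pmu_read().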