From patchwork Wed Sep 14 09:29:26 2022
X-Patchwork-Submitter: Kevin Laatz
X-Patchwork-Id: 116300
X-Patchwork-Delegate: david.marchand@redhat.com
From: Kevin Laatz
To: dev@dpdk.org
Cc: anatoly.burakov@intel.com,
Kevin Laatz, Conor Walsh, David Hunt, Bruce Richardson, Nicolas Chautru, Fan Zhang, Ashish Gupta, Akhil Goyal, Chengwen Feng, Ray Kinsella, Thomas Monjalon, Ferruh Yigit, Andrew Rybchenko, Jerin Jacob, Sachin Saxena, Hemant Agrawal, Ori Kam, Honnappa Nagarahalli, Konstantin Ananyev
Subject: [PATCH v7 1/4] eal: add lcore poll busyness telemetry
Date: Wed, 14 Sep 2022 10:29:26 +0100
Message-Id: <20220914092929.1159773-2-kevin.laatz@intel.com>
In-Reply-To: <20220914092929.1159773-1-kevin.laatz@intel.com>
References: <24c49429394294cfbf0d9c506b205029bac77c8b.1657890378.git.anatoly.burakov@intel.com> <20220914092929.1159773-1-kevin.laatz@intel.com>

From: Anatoly Burakov

Currently, there is no way to measure lcore poll busyness passively, without any modifications to the application. This patch adds a new EAL API that passively tracks core polling busyness.

The poll busyness is calculated by relying on the fact that most DPDK APIs poll for work (packets, completions, eventdev events, etc.). Empty polls are counted as "idle", while non-empty polls are counted as "busy". To measure lcore poll busyness, we simply call the telemetry timestamping function with the number of polls a particular code section has processed, and count the number of cycles spent processing empty bursts. The more empty bursts we encounter, the fewer cycles we spend in the "busy" state, and the lower the reported core poll busyness.

In order for all of the above to work without modifications to the application, the library code needs to be instrumented with calls to the lcore poll busyness timestamping function.
The following parts of DPDK are instrumented with lcore poll busyness timestamping calls:

- All major driver APIs:
  - ethdev
  - cryptodev
  - compressdev
  - regexdev
  - bbdev
  - rawdev
  - eventdev
  - dmadev
- Some additional libraries:
  - ring
  - distributor

To avoid a performance impact from having lcore telemetry support, a global variable is exported by EAL, and the call to the timestamping function is wrapped in a macro, so that whenever telemetry is disabled, only one additional branch is taken and no function calls are performed. It is disabled at compile time by default.

This patch also adds a telemetry endpoint to report lcore poll busyness, as well as telemetry endpoints to enable/disable lcore telemetry. A documentation entry has been added to the howto guides to explain the usage of the new telemetry endpoints and API.

Signed-off-by: Kevin Laatz
Signed-off-by: Conor Walsh
Signed-off-by: David Hunt
Signed-off-by: Anatoly Burakov

---
v7:
* Rename funcs, vars, files to include "poll" where missing.

v5:
* Fix Windows build
* Make lcore_telemetry_free() an internal interface
* Minor cleanup

v4:
* Fix doc build
* Rename timestamp macro to RTE_LCORE_POLL_BUSYNESS_TIMESTAMP
* Make enable/disable read and write atomic
* Change rte_lcore_poll_busyness_enabled_set() param to bool
* Move mem alloc from enable/disable to init/cleanup
* Other minor fixes

v3:
* Fix missed renaming to poll busyness
* Fix clang compilation
* Fix arm compilation

v2:
* Use rte_get_tsc_hz() to adjust the telemetry period
* Rename to reflect polling busyness vs general busyness
* Fix segfault when calling telemetry timestamp from an unregistered non-EAL thread.
* Minor cleanup --- config/meson.build | 1 + config/rte_config.h | 1 + lib/bbdev/rte_bbdev.h | 17 +- lib/compressdev/rte_compressdev.c | 2 + lib/cryptodev/rte_cryptodev.h | 2 + lib/distributor/rte_distributor.c | 21 +- lib/distributor/rte_distributor_single.c | 14 +- lib/dmadev/rte_dmadev.h | 15 +- .../common/eal_common_lcore_poll_telemetry.c | 303 ++++++++++++++++++ lib/eal/common/meson.build | 1 + lib/eal/freebsd/eal.c | 1 + lib/eal/include/rte_lcore.h | 85 ++++- lib/eal/linux/eal.c | 1 + lib/eal/meson.build | 3 + lib/eal/version.map | 7 + lib/ethdev/rte_ethdev.h | 2 + lib/eventdev/rte_eventdev.h | 10 +- lib/rawdev/rte_rawdev.c | 6 +- lib/regexdev/rte_regexdev.h | 5 +- lib/ring/rte_ring_elem_pvt.h | 1 + meson_options.txt | 2 + 21 files changed, 475 insertions(+), 25 deletions(-) create mode 100644 lib/eal/common/eal_common_lcore_poll_telemetry.c diff --git a/config/meson.build b/config/meson.build index 7f7b6c92fd..d5954a059c 100644 --- a/config/meson.build +++ b/config/meson.build @@ -297,6 +297,7 @@ endforeach dpdk_conf.set('RTE_MAX_ETHPORTS', get_option('max_ethports')) dpdk_conf.set('RTE_LIBEAL_USE_HPET', get_option('use_hpet')) dpdk_conf.set('RTE_ENABLE_TRACE_FP', get_option('enable_trace_fp')) +dpdk_conf.set('RTE_LCORE_POLL_BUSYNESS', get_option('enable_lcore_poll_busyness')) # values which have defaults which may be overridden dpdk_conf.set('RTE_MAX_VFIO_GROUPS', 64) dpdk_conf.set('RTE_DRIVER_MEMPOOL_BUCKET_SIZE_KB', 64) diff --git a/config/rte_config.h b/config/rte_config.h index ae56a86394..86ac3b8a6e 100644 --- a/config/rte_config.h +++ b/config/rte_config.h @@ -39,6 +39,7 @@ #define RTE_LOG_DP_LEVEL RTE_LOG_INFO #define RTE_BACKTRACE 1 #define RTE_MAX_VFIO_CONTAINERS 64 +#define RTE_LCORE_POLL_BUSYNESS_PERIOD_MS 2 /* bsd module defines */ #define RTE_CONTIGMEM_MAX_NUM_BUFS 64 diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h index b88c88167e..d6a98d3f11 100644 --- a/lib/bbdev/rte_bbdev.h +++ b/lib/bbdev/rte_bbdev.h @@ -28,6 +28,7 @@ extern "C" 
{ #include #include +#include #include "rte_bbdev_op.h" @@ -599,7 +600,9 @@ rte_bbdev_dequeue_enc_ops(uint16_t dev_id, uint16_t queue_id, { struct rte_bbdev *dev = &rte_bbdev_devices[dev_id]; struct rte_bbdev_queue_data *q_data = &dev->data->queues[queue_id]; - return dev->dequeue_enc_ops(q_data, ops, num_ops); + const uint16_t nb_ops = dev->dequeue_enc_ops(q_data, ops, num_ops); + RTE_LCORE_POLL_BUSYNESS_TIMESTAMP(nb_ops); + return nb_ops; } /** @@ -631,7 +634,9 @@ rte_bbdev_dequeue_dec_ops(uint16_t dev_id, uint16_t queue_id, { struct rte_bbdev *dev = &rte_bbdev_devices[dev_id]; struct rte_bbdev_queue_data *q_data = &dev->data->queues[queue_id]; - return dev->dequeue_dec_ops(q_data, ops, num_ops); + const uint16_t nb_ops = dev->dequeue_dec_ops(q_data, ops, num_ops); + RTE_LCORE_POLL_BUSYNESS_TIMESTAMP(nb_ops); + return nb_ops; } @@ -662,7 +667,9 @@ rte_bbdev_dequeue_ldpc_enc_ops(uint16_t dev_id, uint16_t queue_id, { struct rte_bbdev *dev = &rte_bbdev_devices[dev_id]; struct rte_bbdev_queue_data *q_data = &dev->data->queues[queue_id]; - return dev->dequeue_ldpc_enc_ops(q_data, ops, num_ops); + const uint16_t nb_ops = dev->dequeue_ldpc_enc_ops(q_data, ops, num_ops); + RTE_LCORE_POLL_BUSYNESS_TIMESTAMP(nb_ops); + return nb_ops; } /** @@ -692,7 +699,9 @@ rte_bbdev_dequeue_ldpc_dec_ops(uint16_t dev_id, uint16_t queue_id, { struct rte_bbdev *dev = &rte_bbdev_devices[dev_id]; struct rte_bbdev_queue_data *q_data = &dev->data->queues[queue_id]; - return dev->dequeue_ldpc_dec_ops(q_data, ops, num_ops); + const uint16_t nb_ops = dev->dequeue_ldpc_dec_ops(q_data, ops, num_ops); + RTE_LCORE_POLL_BUSYNESS_TIMESTAMP(nb_ops); + return nb_ops; } /** Definitions of device event types */ diff --git a/lib/compressdev/rte_compressdev.c b/lib/compressdev/rte_compressdev.c index 22c438f2dd..fabc495a8e 100644 --- a/lib/compressdev/rte_compressdev.c +++ b/lib/compressdev/rte_compressdev.c @@ -580,6 +580,8 @@ rte_compressdev_dequeue_burst(uint8_t dev_id, uint16_t qp_id, nb_ops = 
(*dev->dequeue_burst) (dev->data->queue_pairs[qp_id], ops, nb_ops); + RTE_LCORE_POLL_BUSYNESS_TIMESTAMP(nb_ops); + return nb_ops; } diff --git a/lib/cryptodev/rte_cryptodev.h b/lib/cryptodev/rte_cryptodev.h index 56f459c6a0..a5b1d7c594 100644 --- a/lib/cryptodev/rte_cryptodev.h +++ b/lib/cryptodev/rte_cryptodev.h @@ -1915,6 +1915,8 @@ rte_cryptodev_dequeue_burst(uint8_t dev_id, uint16_t qp_id, rte_rcu_qsbr_thread_offline(list->qsbr, 0); } #endif + + RTE_LCORE_POLL_BUSYNESS_TIMESTAMP(nb_ops); return nb_ops; } diff --git a/lib/distributor/rte_distributor.c b/lib/distributor/rte_distributor.c index 3035b7a999..428157ec64 100644 --- a/lib/distributor/rte_distributor.c +++ b/lib/distributor/rte_distributor.c @@ -56,6 +56,8 @@ rte_distributor_request_pkt(struct rte_distributor *d, while (rte_rdtsc() < t) rte_pause(); + /* this was an empty poll */ + RTE_LCORE_POLL_BUSYNESS_TIMESTAMP(0); } /* @@ -134,24 +136,29 @@ rte_distributor_get_pkt(struct rte_distributor *d, if (unlikely(d->alg_type == RTE_DIST_ALG_SINGLE)) { if (return_count <= 1) { + uint16_t cnt; pkts[0] = rte_distributor_get_pkt_single(d->d_single, - worker_id, return_count ? oldpkt[0] : NULL); - return (pkts[0]) ? 1 : 0; - } else - return -EINVAL; + worker_id, + return_count ? oldpkt[0] : NULL); + cnt = (pkts[0] != NULL) ? 
1 : 0; + RTE_LCORE_POLL_BUSYNESS_TIMESTAMP(cnt); + return cnt; + } + return -EINVAL; } rte_distributor_request_pkt(d, worker_id, oldpkt, return_count); - count = rte_distributor_poll_pkt(d, worker_id, pkts); - while (count == -1) { + while ((count = rte_distributor_poll_pkt(d, worker_id, pkts)) == -1) { uint64_t t = rte_rdtsc() + 100; while (rte_rdtsc() < t) rte_pause(); - count = rte_distributor_poll_pkt(d, worker_id, pkts); + /* this was an empty poll */ + RTE_LCORE_POLL_BUSYNESS_TIMESTAMP(0); } + RTE_LCORE_POLL_BUSYNESS_TIMESTAMP(count); return count; } diff --git a/lib/distributor/rte_distributor_single.c b/lib/distributor/rte_distributor_single.c index 2c77ac454a..4c916c0fd2 100644 --- a/lib/distributor/rte_distributor_single.c +++ b/lib/distributor/rte_distributor_single.c @@ -31,8 +31,13 @@ rte_distributor_request_pkt_single(struct rte_distributor_single *d, union rte_distributor_buffer_single *buf = &d->bufs[worker_id]; int64_t req = (((int64_t)(uintptr_t)oldpkt) << RTE_DISTRIB_FLAG_BITS) | RTE_DISTRIB_GET_BUF; - RTE_WAIT_UNTIL_MASKED(&buf->bufptr64, RTE_DISTRIB_FLAGS_MASK, - ==, 0, __ATOMIC_RELAXED); + + while ((__atomic_load_n(&buf->bufptr64, __ATOMIC_RELAXED) + & RTE_DISTRIB_FLAGS_MASK) != 0) { + rte_pause(); + /* this was an empty poll */ + RTE_LCORE_POLL_BUSYNESS_TIMESTAMP(0); + } /* Sync with distributor on GET_BUF flag. 
*/ __atomic_store_n(&(buf->bufptr64), req, __ATOMIC_RELEASE); @@ -59,8 +64,11 @@ rte_distributor_get_pkt_single(struct rte_distributor_single *d, { struct rte_mbuf *ret; rte_distributor_request_pkt_single(d, worker_id, oldpkt); - while ((ret = rte_distributor_poll_pkt_single(d, worker_id)) == NULL) + while ((ret = rte_distributor_poll_pkt_single(d, worker_id)) == NULL) { rte_pause(); + /* this was an empty poll */ + RTE_LCORE_POLL_BUSYNESS_TIMESTAMP(0); + } return ret; } diff --git a/lib/dmadev/rte_dmadev.h b/lib/dmadev/rte_dmadev.h index e7f992b734..3e27e0fd2b 100644 --- a/lib/dmadev/rte_dmadev.h +++ b/lib/dmadev/rte_dmadev.h @@ -149,6 +149,7 @@ #include #include #include +#include #ifdef __cplusplus extern "C" { @@ -1027,7 +1028,7 @@ rte_dma_completed(int16_t dev_id, uint16_t vchan, const uint16_t nb_cpls, uint16_t *last_idx, bool *has_error) { struct rte_dma_fp_object *obj = &rte_dma_fp_objs[dev_id]; - uint16_t idx; + uint16_t idx, nb_ops; bool err; #ifdef RTE_DMADEV_DEBUG @@ -1050,8 +1051,10 @@ rte_dma_completed(int16_t dev_id, uint16_t vchan, const uint16_t nb_cpls, has_error = &err; *has_error = false; - return (*obj->completed)(obj->dev_private, vchan, nb_cpls, last_idx, - has_error); + nb_ops = (*obj->completed)(obj->dev_private, vchan, nb_cpls, last_idx, + has_error); + RTE_LCORE_POLL_BUSYNESS_TIMESTAMP(nb_ops); + return nb_ops; } /** @@ -1090,7 +1093,7 @@ rte_dma_completed_status(int16_t dev_id, uint16_t vchan, enum rte_dma_status_code *status) { struct rte_dma_fp_object *obj = &rte_dma_fp_objs[dev_id]; - uint16_t idx; + uint16_t idx, nb_ops; #ifdef RTE_DMADEV_DEBUG if (!rte_dma_is_valid(dev_id) || nb_cpls == 0 || status == NULL) @@ -1101,8 +1104,10 @@ rte_dma_completed_status(int16_t dev_id, uint16_t vchan, if (last_idx == NULL) last_idx = &idx; - return (*obj->completed_status)(obj->dev_private, vchan, nb_cpls, + nb_ops = (*obj->completed_status)(obj->dev_private, vchan, nb_cpls, last_idx, status); + RTE_LCORE_POLL_BUSYNESS_TIMESTAMP(nb_ops); + return 
nb_ops; } /** diff --git a/lib/eal/common/eal_common_lcore_poll_telemetry.c b/lib/eal/common/eal_common_lcore_poll_telemetry.c new file mode 100644 index 0000000000..d97996e85f --- /dev/null +++ b/lib/eal/common/eal_common_lcore_poll_telemetry.c @@ -0,0 +1,303 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2022 Intel Corporation + */ + +#include +#include +#include + +#include +#include +#include +#include + +#ifdef RTE_LCORE_POLL_BUSYNESS +#include +#endif + +rte_atomic32_t __rte_lcore_poll_telemetry_enabled; + +#ifdef RTE_LCORE_POLL_BUSYNESS + +struct lcore_poll_telemetry { + int poll_busyness; + /**< Calculated poll busyness (gets set/returned by the API) */ + int raw_poll_busyness; + /**< Calculated poll busyness times 100. */ + uint64_t interval_ts; + /**< when previous telemetry interval started */ + uint64_t empty_cycles; + /**< empty cycle count since last interval */ + uint64_t last_poll_ts; + /**< last poll timestamp */ + bool last_empty; + /**< if last poll was empty */ + unsigned int contig_poll_cnt; + /**< contiguous (always empty/non empty) poll counter */ +} __rte_cache_aligned; + +static struct lcore_poll_telemetry *telemetry_data; + +#define LCORE_POLL_BUSYNESS_MAX 100 +#define LCORE_POLL_BUSYNESS_NOT_SET -1 +#define LCORE_POLL_BUSYNESS_MIN 0 + +#define SMOOTH_COEFF 5 +#define STATE_CHANGE_OPT 32 + +static void lcore_config_init(void) +{ + int lcore_id; + + RTE_LCORE_FOREACH(lcore_id) { + struct lcore_poll_telemetry *td = &telemetry_data[lcore_id]; + + td->interval_ts = 0; + td->last_poll_ts = 0; + td->empty_cycles = 0; + td->last_empty = true; + td->contig_poll_cnt = 0; + td->poll_busyness = LCORE_POLL_BUSYNESS_NOT_SET; + td->raw_poll_busyness = 0; + } +} + +int rte_lcore_poll_busyness(unsigned int lcore_id) +{ + const uint64_t tsc_ms = rte_get_timer_hz() / MS_PER_S; + /* if more than 1000 busyness periods have passed, this core is considered inactive */ + const uint64_t active_thresh = RTE_LCORE_POLL_BUSYNESS_PERIOD_MS * tsc_ms * 
1000; + struct lcore_poll_telemetry *tdata; + + if (lcore_id >= RTE_MAX_LCORE) + return -EINVAL; + tdata = &telemetry_data[lcore_id]; + + /* if the lcore is not active */ + if (tdata->interval_ts == 0) + return LCORE_POLL_BUSYNESS_NOT_SET; + /* if the core hasn't been active in a while */ + else if ((rte_rdtsc() - tdata->interval_ts) > active_thresh) + return LCORE_POLL_BUSYNESS_NOT_SET; + + /* this core is active, report its poll busyness */ + return telemetry_data[lcore_id].poll_busyness; +} + +int rte_lcore_poll_busyness_enabled(void) +{ + return rte_atomic32_read(&__rte_lcore_poll_telemetry_enabled); +} + +void rte_lcore_poll_busyness_enabled_set(bool enable) +{ + int set = rte_atomic32_cmpset((volatile uint32_t *)&__rte_lcore_poll_telemetry_enabled, + (int)!enable, (int)enable); + + /* Reset counters on successful disable */ + if (set && !enable) + lcore_config_init(); +} + +static inline int calc_raw_poll_busyness(const struct lcore_poll_telemetry *tdata, + const uint64_t empty, const uint64_t total) +{ + /* + * We don't want to use floating point math here, but we want for our poll + * busyness to react smoothly to sudden changes, while still keeping the + * accuracy and making sure that over time the average follows poll busyness + * as measured just-in-time. Therefore, we will calculate the average poll + * busyness using integer math, but shift the decimal point two places + * to the right, so that 100.0 becomes 10000. This allows us to report + * integer values (0..100) while still allowing ourselves to follow the + * just-in-time measurements when we calculate our averages. 
+ */ + const int max_raw_idle = LCORE_POLL_BUSYNESS_MAX * 100; + + const int prev_raw_idle = max_raw_idle - tdata->raw_poll_busyness; + + /* calculate rate of idle cycles, times 100 */ + const int cur_raw_idle = (int)((empty * max_raw_idle) / total); + + /* smoothen the idleness */ + const int smoothened_idle = + (cur_raw_idle + prev_raw_idle * (SMOOTH_COEFF - 1)) / SMOOTH_COEFF; + + /* convert idleness to poll busyness */ + return max_raw_idle - smoothened_idle; +} + +void __rte_lcore_poll_busyness_timestamp(uint16_t nb_rx) +{ + const unsigned int lcore_id = rte_lcore_id(); + uint64_t interval_ts, empty_cycles, cur_tsc, last_poll_ts; + struct lcore_poll_telemetry *tdata; + const bool empty = nb_rx == 0; + uint64_t diff_int, diff_last; + bool last_empty; + + /* This telemetry is not supported for unregistered non-EAL threads */ + if (lcore_id >= RTE_MAX_LCORE) { + RTE_LOG(DEBUG, EAL, + "Lcore telemetry not supported on unregistered non-EAL thread %d", + lcore_id); + return; + } + + tdata = &telemetry_data[lcore_id]; + last_empty = tdata->last_empty; + + /* optimization: don't do anything if status hasn't changed */ + if (last_empty == empty && tdata->contig_poll_cnt++ < STATE_CHANGE_OPT) + return; + /* status changed or we're waiting for too long, reset counter */ + tdata->contig_poll_cnt = 0; + + cur_tsc = rte_rdtsc(); + + interval_ts = tdata->interval_ts; + empty_cycles = tdata->empty_cycles; + last_poll_ts = tdata->last_poll_ts; + + diff_int = cur_tsc - interval_ts; + diff_last = cur_tsc - last_poll_ts; + + /* is this the first time we're here? */ + if (interval_ts == 0) { + tdata->poll_busyness = LCORE_POLL_BUSYNESS_MIN; + tdata->raw_poll_busyness = 0; + tdata->interval_ts = cur_tsc; + tdata->empty_cycles = 0; + tdata->contig_poll_cnt = 0; + goto end; + } + + /* update the empty counter if we got an empty poll earlier */ + if (last_empty) + empty_cycles += diff_last; + + /* have we passed the interval? 
*/ + uint64_t interval = ((rte_get_tsc_hz() / MS_PER_S) * RTE_LCORE_POLL_BUSYNESS_PERIOD_MS); + if (diff_int > interval) { + int raw_poll_busyness; + + /* get updated poll_busyness value */ + raw_poll_busyness = calc_raw_poll_busyness(tdata, empty_cycles, diff_int); + + /* set a new interval, reset empty counter */ + tdata->interval_ts = cur_tsc; + tdata->empty_cycles = 0; + tdata->raw_poll_busyness = raw_poll_busyness; + /* bring poll busyness back to 0..100 range, biased to round up */ + tdata->poll_busyness = (raw_poll_busyness + 50) / 100; + } else + /* we may have updated empty counter */ + tdata->empty_cycles = empty_cycles; + +end: + /* update status for next poll */ + tdata->last_poll_ts = cur_tsc; + tdata->last_empty = empty; +} + +static int +lcore_poll_busyness_enable(const char *cmd __rte_unused, + const char *params __rte_unused, + struct rte_tel_data *d) +{ + rte_lcore_poll_busyness_enabled_set(true); + + rte_tel_data_start_dict(d); + + rte_tel_data_add_dict_int(d, "poll_busyness_enabled", 1); + + return 0; +} + +static int +lcore_poll_busyness_disable(const char *cmd __rte_unused, + const char *params __rte_unused, + struct rte_tel_data *d) +{ + rte_lcore_poll_busyness_enabled_set(false); + + rte_tel_data_start_dict(d); + + rte_tel_data_add_dict_int(d, "poll_busyness_enabled", 0); + + return 0; +} + +static int +lcore_handle_poll_busyness(const char *cmd __rte_unused, + const char *params __rte_unused, struct rte_tel_data *d) +{ + char corenum[64]; + int i; + + rte_tel_data_start_dict(d); + + RTE_LCORE_FOREACH(i) { + if (!rte_lcore_is_enabled(i)) + continue; + snprintf(corenum, sizeof(corenum), "%d", i); + rte_tel_data_add_dict_int(d, corenum, rte_lcore_poll_busyness(i)); + } + + return 0; +} + +void +eal_lcore_poll_telemetry_free(void) +{ + if (telemetry_data != NULL) { + free(telemetry_data); + telemetry_data = NULL; + } +} + +RTE_INIT(lcore_init_poll_telemetry) +{ + telemetry_data = calloc(RTE_MAX_LCORE, sizeof(telemetry_data[0])); + if 
(telemetry_data == NULL) + rte_panic("Could not init lcore telemetry data: Out of memory\n"); + + lcore_config_init(); + + rte_telemetry_register_cmd("/eal/lcore/poll_busyness", lcore_handle_poll_busyness, + "return percentage poll busyness of cores"); + + rte_telemetry_register_cmd("/eal/lcore/poll_busyness_enable", lcore_poll_busyness_enable, + "enable lcore poll busyness measurement"); + + rte_telemetry_register_cmd("/eal/lcore/poll_busyness_disable", lcore_poll_busyness_disable, + "disable lcore poll busyness measurement"); + + rte_atomic32_set(&__rte_lcore_poll_telemetry_enabled, true); +} + +#else + +int rte_lcore_poll_busyness(unsigned int lcore_id __rte_unused) +{ + return -ENOTSUP; +} + +int rte_lcore_poll_busyness_enabled(void) +{ + return -ENOTSUP; +} + +void rte_lcore_poll_busyness_enabled_set(bool enable __rte_unused) +{ +} + +void __rte_lcore_poll_busyness_timestamp(uint16_t nb_rx __rte_unused) +{ +} + +void eal_lcore_poll_telemetry_free(void) +{ +} + +#endif diff --git a/lib/eal/common/meson.build b/lib/eal/common/meson.build index 917758cc65..e5741ce9f9 100644 --- a/lib/eal/common/meson.build +++ b/lib/eal/common/meson.build @@ -17,6 +17,7 @@ sources += files( 'eal_common_hexdump.c', 'eal_common_interrupts.c', 'eal_common_launch.c', + 'eal_common_lcore_poll_telemetry.c', 'eal_common_lcore.c', 'eal_common_log.c', 'eal_common_mcfg.c', diff --git a/lib/eal/freebsd/eal.c b/lib/eal/freebsd/eal.c index 26fbc91b26..92c4af9c28 100644 --- a/lib/eal/freebsd/eal.c +++ b/lib/eal/freebsd/eal.c @@ -895,6 +895,7 @@ rte_eal_cleanup(void) rte_mp_channel_cleanup(); rte_trace_save(); eal_trace_fini(); + eal_lcore_poll_telemetry_free(); /* after this point, any DPDK pointers will become dangling */ rte_eal_memory_detach(); rte_eal_alarm_cleanup(); diff --git a/lib/eal/include/rte_lcore.h b/lib/eal/include/rte_lcore.h index b598e1b9ec..2191c2473a 100644 --- a/lib/eal/include/rte_lcore.h +++ b/lib/eal/include/rte_lcore.h @@ -16,6 +16,7 @@ #include #include #include 
+#include #ifdef __cplusplus extern "C" { @@ -415,9 +416,91 @@ rte_ctrl_thread_create(pthread_t *thread, const char *name, const pthread_attr_t *attr, void *(*start_routine)(void *), void *arg); +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice. + * + * Read poll busyness value corresponding to an lcore. + * + * @param lcore_id + * Lcore to read poll busyness value for. + * @return + * - value between 0 and 100 on success + * - -1 if lcore is not active + * - -EINVAL if lcore is invalid + * - -ENOMEM if not enough memory available + * - -ENOTSUP if not supported + */ +__rte_experimental +int +rte_lcore_poll_busyness(unsigned int lcore_id); + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice. + * + * Check if lcore poll busyness telemetry is enabled. + * + * @return + * - true if lcore telemetry is enabled + * - false if lcore telemetry is disabled + * - -ENOTSUP if not lcore telemetry supported + */ +__rte_experimental +int +rte_lcore_poll_busyness_enabled(void); + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice. + * + * Enable or disable poll busyness telemetry. + * + * @param enable + * 1 to enable, 0 to disable + */ +__rte_experimental +void +rte_lcore_poll_busyness_enabled_set(bool enable); + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice. + * + * Lcore poll busyness timestamping function. + * + * @param nb_rx + * Number of buffers processed by lcore. + */ +__rte_experimental +void +__rte_lcore_poll_busyness_timestamp(uint16_t nb_rx); + +/** @internal lcore telemetry enabled status */ +extern rte_atomic32_t __rte_lcore_poll_telemetry_enabled; + +/** @internal free memory allocated for lcore telemetry */ +void +eal_lcore_poll_telemetry_free(void); + +/** + * Call lcore poll busyness timestamp function. + * + * @param nb_rx + * Number of buffers processed by lcore. 
+ */ +#ifdef RTE_LCORE_POLL_BUSYNESS +#define RTE_LCORE_POLL_BUSYNESS_TIMESTAMP(nb_rx) do { \ + int enabled = (int)rte_atomic32_read(&__rte_lcore_poll_telemetry_enabled); \ + if (enabled) \ + __rte_lcore_poll_busyness_timestamp(nb_rx); \ +} while (0) +#else +#define RTE_LCORE_POLL_BUSYNESS_TIMESTAMP(nb_rx) do { } while (0) +#endif + #ifdef __cplusplus } #endif - #endif /* _RTE_LCORE_H_ */ diff --git a/lib/eal/linux/eal.c b/lib/eal/linux/eal.c index 37d29643a5..5e81352a81 100644 --- a/lib/eal/linux/eal.c +++ b/lib/eal/linux/eal.c @@ -1364,6 +1364,7 @@ rte_eal_cleanup(void) rte_mp_channel_cleanup(); rte_trace_save(); eal_trace_fini(); + eal_lcore_poll_telemetry_free(); /* after this point, any DPDK pointers will become dangling */ rte_eal_memory_detach(); eal_mp_dev_hotplug_cleanup(); diff --git a/lib/eal/meson.build b/lib/eal/meson.build index 056beb9461..2fb90d446b 100644 --- a/lib/eal/meson.build +++ b/lib/eal/meson.build @@ -25,6 +25,9 @@ subdir(arch_subdir) deps += ['kvargs'] if not is_windows deps += ['telemetry'] +else + # core poll busyness telemetry depends on telemetry library + dpdk_conf.set('RTE_LCORE_POLL_BUSYNESS', false) endif if dpdk_conf.has('RTE_USE_LIBBSD') ext_deps += libbsd diff --git a/lib/eal/version.map b/lib/eal/version.map index 1f293e768b..3275d1fac4 100644 --- a/lib/eal/version.map +++ b/lib/eal/version.map @@ -424,6 +424,13 @@ EXPERIMENTAL { rte_thread_self; rte_thread_set_affinity_by_id; rte_thread_set_priority; + + # added in 22.11 + __rte_lcore_poll_busyness_timestamp; + __rte_lcore_poll_telemetry_enabled; + rte_lcore_poll_busyness; + rte_lcore_poll_busyness_enabled; + rte_lcore_poll_busyness_enabled_set; }; INTERNAL { diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h index de9e970d4d..4c8113f31f 100644 --- a/lib/ethdev/rte_ethdev.h +++ b/lib/ethdev/rte_ethdev.h @@ -5675,6 +5675,8 @@ rte_eth_rx_burst(uint16_t port_id, uint16_t queue_id, #endif rte_ethdev_trace_rx_burst(port_id, queue_id, (void **)rx_pkts, nb_rx); + + 
RTE_LCORE_POLL_BUSYNESS_TIMESTAMP(nb_rx); return nb_rx; } diff --git a/lib/eventdev/rte_eventdev.h b/lib/eventdev/rte_eventdev.h index 6a6f6ea4c1..a65b3c7c85 100644 --- a/lib/eventdev/rte_eventdev.h +++ b/lib/eventdev/rte_eventdev.h @@ -2153,6 +2153,7 @@ rte_event_dequeue_burst(uint8_t dev_id, uint8_t port_id, struct rte_event ev[], uint16_t nb_events, uint64_t timeout_ticks) { const struct rte_event_fp_ops *fp_ops; + uint16_t nb_evts; void *port; fp_ops = &rte_event_fp_ops[dev_id]; @@ -2175,10 +2176,13 @@ rte_event_dequeue_burst(uint8_t dev_id, uint8_t port_id, struct rte_event ev[], * requests nb_events as const one */ if (nb_events == 1) - return (fp_ops->dequeue)(port, ev, timeout_ticks); + nb_evts = (fp_ops->dequeue)(port, ev, timeout_ticks); else - return (fp_ops->dequeue_burst)(port, ev, nb_events, - timeout_ticks); + nb_evts = (fp_ops->dequeue_burst)(port, ev, nb_events, + timeout_ticks); + + RTE_LCORE_POLL_BUSYNESS_TIMESTAMP(nb_evts); + return nb_evts; } #define RTE_EVENT_DEV_MAINT_OP_FLUSH (1 << 0) diff --git a/lib/rawdev/rte_rawdev.c b/lib/rawdev/rte_rawdev.c index 2f0a4f132e..1cba53270a 100644 --- a/lib/rawdev/rte_rawdev.c +++ b/lib/rawdev/rte_rawdev.c @@ -16,6 +16,7 @@ #include #include #include +#include #include "rte_rawdev.h" #include "rte_rawdev_pmd.h" @@ -226,12 +227,15 @@ rte_rawdev_dequeue_buffers(uint16_t dev_id, rte_rawdev_obj_t context) { struct rte_rawdev *dev; + int nb_ops; RTE_RAWDEV_VALID_DEVID_OR_ERR_RET(dev_id, -EINVAL); dev = &rte_rawdevs[dev_id]; RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dequeue_bufs, -ENOTSUP); - return (*dev->dev_ops->dequeue_bufs)(dev, buffers, count, context); + nb_ops = (*dev->dev_ops->dequeue_bufs)(dev, buffers, count, context); + RTE_LCORE_POLL_BUSYNESS_TIMESTAMP(nb_ops); + return nb_ops; } int diff --git a/lib/regexdev/rte_regexdev.h b/lib/regexdev/rte_regexdev.h index 3bce8090f6..8caaed502f 100644 --- a/lib/regexdev/rte_regexdev.h +++ b/lib/regexdev/rte_regexdev.h @@ -1530,6 +1530,7 @@ 
rte_regexdev_dequeue_burst(uint8_t dev_id, uint16_t qp_id, struct rte_regex_ops **ops, uint16_t nb_ops) { struct rte_regexdev *dev = &rte_regex_devices[dev_id]; + uint16_t deq_ops; #ifdef RTE_LIBRTE_REGEXDEV_DEBUG RTE_REGEXDEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL); RTE_FUNC_PTR_OR_ERR_RET(*dev->dequeue, -ENOTSUP); @@ -1538,7 +1539,9 @@ rte_regexdev_dequeue_burst(uint8_t dev_id, uint16_t qp_id, return -EINVAL; } #endif - return (*dev->dequeue)(dev, qp_id, ops, nb_ops); + deq_ops = (*dev->dequeue)(dev, qp_id, ops, nb_ops); + RTE_LCORE_POLL_BUSYNESS_TIMESTAMP(deq_ops); + return deq_ops; } #ifdef __cplusplus diff --git a/lib/ring/rte_ring_elem_pvt.h b/lib/ring/rte_ring_elem_pvt.h index 83788c56e6..cf2370c238 100644 --- a/lib/ring/rte_ring_elem_pvt.h +++ b/lib/ring/rte_ring_elem_pvt.h @@ -379,6 +379,7 @@ __rte_ring_do_dequeue_elem(struct rte_ring *r, void *obj_table, end: if (available != NULL) *available = entries - n; + RTE_LCORE_POLL_BUSYNESS_TIMESTAMP(n); return n; } diff --git a/meson_options.txt b/meson_options.txt index 7c220ad68d..9b20a36fdb 100644 --- a/meson_options.txt +++ b/meson_options.txt @@ -20,6 +20,8 @@ option('enable_driver_sdk', type: 'boolean', value: false, description: 'Install headers to build drivers.') option('enable_kmods', type: 'boolean', value: false, description: 'build kernel modules') +option('enable_lcore_poll_busyness', type: 'boolean', value: false, description: + 'enable collection of lcore poll busyness telemetry') option('examples', type: 'string', value: '', description: 'Comma-separated list of examples to build by default') option('flexran_sdk', type: 'string', value: '', description:

From patchwork Wed Sep 14 09:29:27 2022
X-Patchwork-Submitter: Kevin Laatz
X-Patchwork-Id: 116301
X-Patchwork-Delegate: david.marchand@redhat.com
From: Kevin Laatz
To: dev@dpdk.org
Cc: anatoly.burakov@intel.com
Subject: [PATCH v7 2/4] eal: add cpuset lcore telemetry entries
Date: Wed, 14 Sep 2022 10:29:27 +0100
Message-Id: <20220914092929.1159773-3-kevin.laatz@intel.com>
In-Reply-To: <20220914092929.1159773-1-kevin.laatz@intel.com>
References:
<24c49429394294cfbf0d9c506b205029bac77c8b.1657890378.git.anatoly.burakov@intel.com> <20220914092929.1159773-1-kevin.laatz@intel.com> MIME-Version: 1.0 X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org From: Anatoly Burakov Expose per-lcore cpuset information to telemetry. Signed-off-by: Anatoly Burakov --- .../common/eal_common_lcore_poll_telemetry.c | 47 +++++++++++++++++++ 1 file changed, 47 insertions(+) diff --git a/lib/eal/common/eal_common_lcore_poll_telemetry.c b/lib/eal/common/eal_common_lcore_poll_telemetry.c index d97996e85f..a19d6ccb95 100644 --- a/lib/eal/common/eal_common_lcore_poll_telemetry.c +++ b/lib/eal/common/eal_common_lcore_poll_telemetry.c @@ -19,6 +19,8 @@ rte_atomic32_t __rte_lcore_poll_telemetry_enabled; #ifdef RTE_LCORE_POLL_BUSYNESS +#include "eal_private.h" + struct lcore_poll_telemetry { int poll_busyness; /**< Calculated poll busyness (gets set/returned by the API) */ @@ -247,6 +249,48 @@ lcore_handle_poll_busyness(const char *cmd __rte_unused, return 0; } +static int +lcore_handle_cpuset(const char *cmd __rte_unused, + const char *params __rte_unused, + struct rte_tel_data *d) +{ + char corenum[64]; + int i; + + rte_tel_data_start_dict(d); + + RTE_LCORE_FOREACH(i) { + const struct lcore_config *cfg = &lcore_config[i]; + const rte_cpuset_t *cpuset = &cfg->cpuset; + struct rte_tel_data *ld; + unsigned int cpu; + + if (!rte_lcore_is_enabled(i)) + continue; + + /* create an array of integers */ + ld = rte_tel_data_alloc(); + if (ld == NULL) + return -ENOMEM; + rte_tel_data_start_array(ld, RTE_TEL_INT_VAL); + + /* add cpu ID's from cpuset to the array */ + for (cpu = 0; cpu < CPU_SETSIZE; cpu++) { + if (!CPU_ISSET(cpu, cpuset)) + continue; + rte_tel_data_add_array_int(ld, cpu); + } + + /* add array to the per-lcore container */ + snprintf(corenum, sizeof(corenum), "%d", i); + + 
/* tell telemetry library to free this array automatically */ + rte_tel_data_add_dict_container(d, corenum, ld, 0); + } + + return 0; +} + void eal_lcore_poll_telemetry_free(void) { @@ -273,6 +317,9 @@ RTE_INIT(lcore_init_poll_telemetry) rte_telemetry_register_cmd("/eal/lcore/poll_busyness_disable", lcore_poll_busyness_disable, "disable lcore poll busyness measurement"); + rte_telemetry_register_cmd("/eal/lcore/cpuset", lcore_handle_cpuset, + "list physical core affinity for each lcore"); + rte_atomic32_set(&__rte_lcore_poll_telemetry_enabled, true); } From patchwork Wed Sep 14 09:29:28 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kevin Laatz X-Patchwork-Id: 116302 X-Patchwork-Delegate: david.marchand@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id BA63FA0032; Wed, 14 Sep 2022 11:26:38 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id D757D42B72; Wed, 14 Sep 2022 11:26:28 +0200 (CEST) Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by mails.dpdk.org (Postfix) with ESMTP id AC27E42802 for ; Wed, 14 Sep 2022 11:26:25 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1663147585; x=1694683585; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=iPDn5M7HmAVmLELuKMR+oK3Vqd3NucyVVdXDKuUDDes=; b=HxJ3ueP3QlNbTXdWaGH9GsQeN42FQlSQ+uh2vPWbF2ToWJb0Ib2gj3rq QItpKyS+O5dFq5svz6Av6ZaDA13qX9jDTWrGr8suS4K5smG3FHPsHzlu8 x1Mnsqc9reZTZpRs6KAOwyPJz/Jv3HmsgzjJONCD3KMGbLB0VeiNNRjLk ZZ6iIWgs77on3Kv2otEngyCa/hIjPR5Gkw5fAvAsU1m7/YlbAvu3QSPSj 9lsY7BNZT5jBNU2hgWBrSavwmUSmnHa1jA3/nXdw8rnMWUwceCoUZ0elg ZrJXZOdB0DBTaEHnGBfSjhJOoHlRL4vv1inhudHniE1dnkGIntnMDHQGf g==; 
X-IronPort-AV: E=McAfee;i="6500,9779,10469"; a="384675038" X-IronPort-AV: E=Sophos;i="5.93,315,1654585200"; d="scan'208";a="384675038" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Sep 2022 02:26:24 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.93,315,1654585200"; d="scan'208";a="612446522" Received: from silpixa00401122.ir.intel.com ([10.237.213.42]) by orsmga007.jf.intel.com with ESMTP; 14 Sep 2022 02:26:23 -0700 From: Kevin Laatz To: dev@dpdk.org Cc: anatoly.burakov@intel.com, Kevin Laatz Subject: [PATCH v7 3/4] app/test: add unit tests for lcore poll busyness Date: Wed, 14 Sep 2022 10:29:28 +0100 Message-Id: <20220914092929.1159773-4-kevin.laatz@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20220914092929.1159773-1-kevin.laatz@intel.com> References: <24c49429394294cfbf0d9c506b205029bac77c8b.1657890378.git.anatoly.burakov@intel.com> <20220914092929.1159773-1-kevin.laatz@intel.com> MIME-Version: 1.0 X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Add API unit tests and perf unit tests for the newly added lcore poll busyness feature. 
Signed-off-by: Kevin Laatz
---
 app/test/meson.build                     |   4 +
 app/test/test_lcore_poll_busyness_api.c  | 134 +++++++++++++++++++++++
 app/test/test_lcore_poll_busyness_perf.c |  72 ++++++++++++
 3 files changed, 210 insertions(+)
 create mode 100644 app/test/test_lcore_poll_busyness_api.c
 create mode 100644 app/test/test_lcore_poll_busyness_perf.c

diff --git a/app/test/meson.build b/app/test/meson.build
index bf1d81f84a..d543e730a2 100644
--- a/app/test/meson.build
+++ b/app/test/meson.build
@@ -74,6 +74,8 @@ test_sources = files(
         'test_ipsec_perf.c',
         'test_kni.c',
         'test_kvargs.c',
+        'test_lcore_poll_busyness_api.c',
+        'test_lcore_poll_busyness_perf.c',
         'test_lcores.c',
         'test_logs.c',
         'test_lpm.c',
@@ -192,6 +194,7 @@ fast_tests = [
         ['interrupt_autotest', true, true],
         ['ipfrag_autotest', false, true],
         ['lcores_autotest', true, true],
+        ['lcore_poll_busyness_autotest', true, true],
         ['logs_autotest', true, true],
         ['lpm_autotest', true, true],
         ['lpm6_autotest', true, true],
@@ -292,6 +295,7 @@ perf_test_names = [
         'trace_perf_autotest',
         'ipsec_perf_autotest',
         'thash_perf_autotest',
+        'lcore_poll_busyness_perf_autotest',
 ]
 
 driver_test_names = [
diff --git a/app/test/test_lcore_poll_busyness_api.c b/app/test/test_lcore_poll_busyness_api.c
new file mode 100644
index 0000000000..db76322994
--- /dev/null
+++ b/app/test/test_lcore_poll_busyness_api.c
@@ -0,0 +1,134 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2022 Intel Corporation
+ */
+
+#include <rte_lcore.h>
+
+#include "test.h"
+
+/* Arbitrary amount of "work" to simulate busyness with */
+#define WORK 32
+#define TIMESTAMP_ITERS 1000000
+
+#define LCORE_POLL_BUSYNESS_NOT_SET -1
+
+static int
+test_lcore_poll_busyness_enable_disable(void)
+{
+	int initial_state, curr_state;
+	bool req_state;
+
+	/* Get the initial state */
+	initial_state = rte_lcore_poll_busyness_enabled();
+	if (initial_state == -ENOTSUP)
+		return TEST_SKIPPED;
+
+	/* Set state to the inverse of the initial state and check for the change */
+	req_state = !initial_state;
+	rte_lcore_poll_busyness_enabled_set(req_state);
+	curr_state = rte_lcore_poll_busyness_enabled();
+	if (curr_state != req_state)
+		return TEST_FAILED;
+
+	/* Now change the state back to the original state. By changing it back, both
+	 * enable and disable will have been tested.
+	 */
+	req_state = !curr_state;
+	rte_lcore_poll_busyness_enabled_set(req_state);
+	curr_state = rte_lcore_poll_busyness_enabled();
+	if (curr_state != req_state)
+		return TEST_FAILED;
+
+	return TEST_SUCCESS;
+}
+
+static int
+test_lcore_poll_busyness_invalid_lcore(void)
+{
+	int ret;
+
+	/* Check if lcore poll busyness is enabled */
+	if (rte_lcore_poll_busyness_enabled() == -ENOTSUP)
+		return TEST_SKIPPED;
+
+	/* Only lcore_id < RTE_MAX_LCORE is valid */
+	ret = rte_lcore_poll_busyness(RTE_MAX_LCORE);
+	if (ret != -EINVAL)
+		return TEST_FAILED;
+
+	return TEST_SUCCESS;
+}
+
+static int
+test_lcore_poll_busyness_inactive_lcore(void)
+{
+	int ret;
+
+	/* Check if lcore poll busyness is enabled */
+	if (rte_lcore_poll_busyness_enabled() == -ENOTSUP)
+		return TEST_SKIPPED;
+
+	/* Use the test thread lcore_id for this test. Since it is not a polling
+	 * application, the busyness is expected to return -1.
+	 *
+	 * Note: this will not work with affinitized cores
+	 */
+	ret = rte_lcore_poll_busyness(rte_lcore_id());
+	if (ret != LCORE_POLL_BUSYNESS_NOT_SET)
+		return TEST_FAILED;
+
+	return TEST_SUCCESS;
+}
+
+static void
+simulate_lcore_poll_busyness(int iters)
+{
+	int i;
+
+	for (i = 0; i < iters; i++)
+		RTE_LCORE_POLL_BUSYNESS_TIMESTAMP(WORK);
+}
+
+/* The test cannot know of an application running to test for valid lcore poll
+ * busyness data. For this test, we simulate lcore poll busyness for the
+ * lcore_id of the test thread for testing purposes.
+ */
+static int
+test_lcore_poll_busyness_active_lcore(void)
+{
+	int ret;
+
+	/* Check if lcore poll busyness is enabled */
+	if (rte_lcore_poll_busyness_enabled() == -ENOTSUP)
+		return TEST_SKIPPED;
+
+	simulate_lcore_poll_busyness(TIMESTAMP_ITERS);
+
+	/* After timestamping with "work" many times, lcore poll busyness should be > 0 */
+	ret = rte_lcore_poll_busyness(rte_lcore_id());
+	if (ret <= 0)
+		return TEST_FAILED;
+
+	return TEST_SUCCESS;
+}
+
+static struct unit_test_suite lcore_poll_busyness_tests = {
+	.suite_name = "lcore poll busyness autotest",
+	.setup = NULL,
+	.teardown = NULL,
+	.unit_test_cases = {
+		TEST_CASE(test_lcore_poll_busyness_enable_disable),
+		TEST_CASE(test_lcore_poll_busyness_invalid_lcore),
+		TEST_CASE(test_lcore_poll_busyness_inactive_lcore),
+		TEST_CASE(test_lcore_poll_busyness_active_lcore),
+		TEST_CASES_END()
+	}
+};
+
+static int
+test_lcore_poll_busyness_api(void)
+{
+	return unit_test_suite_runner(&lcore_poll_busyness_tests);
+}
+
+REGISTER_TEST_COMMAND(lcore_poll_busyness_autotest, test_lcore_poll_busyness_api);
diff --git a/app/test/test_lcore_poll_busyness_perf.c b/app/test/test_lcore_poll_busyness_perf.c
new file mode 100644
index 0000000000..5c27d21b00
--- /dev/null
+++ b/app/test/test_lcore_poll_busyness_perf.c
@@ -0,0 +1,72 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2022 Intel Corporation
+ */
+
+#include <inttypes.h>
+#include <stdio.h>
+
+#include <rte_cycles.h>
+#include <rte_lcore.h>
+
+#include "test.h"
+
+/* Arbitrary amount of "work" to simulate busyness with */
+#define WORK 32
+#define TIMESTAMP_ITERS 1000000
+#define TEST_ITERS 10000
+
+static void
+simulate_lcore_poll_busyness(int iters)
+{
+	int i;
+
+	for (i = 0; i < iters; i++)
+		RTE_LCORE_POLL_BUSYNESS_TIMESTAMP(WORK);
+}
+
+static void
+test_timestamp_perf(void)
+{
+	uint64_t start, end, diff;
+	uint64_t min = UINT64_MAX;
+	uint64_t max = 0;
+	uint64_t total = 0;
+	int i;
+
+	for (i = 0; i < TEST_ITERS; i++) {
+		start = rte_rdtsc();
+		RTE_LCORE_POLL_BUSYNESS_TIMESTAMP(WORK);
+		end = rte_rdtsc();
+
+		diff = end - start;
+		min = RTE_MIN(diff, min);
+		max = RTE_MAX(diff, max);
+		total += diff;
+	}
+
+	printf("### Timestamp perf ###\n");
+	printf("Min cycles: %"PRIu64"\n", min);
+	printf("Avg cycles: %"PRIu64"\n", total / TEST_ITERS);
+	printf("Max cycles: %"PRIu64"\n", max);
+	printf("\n");
+}
+
+static int
+test_lcore_poll_busyness_perf(void)
+{
+	if (rte_lcore_poll_busyness_enabled() == -ENOTSUP) {
+		printf("Lcore poll busyness may be disabled...\n");
+		return TEST_SKIPPED;
+	}
+
+	/* Initialize and prime the timestamp struct with simulated "work" for this lcore */
+	simulate_lcore_poll_busyness(10000);
+
+	/* Run perf tests */
+	test_timestamp_perf();
+
+	return TEST_SUCCESS;
+}
+
+REGISTER_TEST_COMMAND(lcore_poll_busyness_perf_autotest, test_lcore_poll_busyness_perf);

From patchwork Wed Sep 14 09:29:29 2022
X-Patchwork-Submitter: Kevin Laatz
X-Patchwork-Id: 116303
X-Patchwork-Delegate: david.marchand@redhat.com
From: Kevin Laatz
To: dev@dpdk.org
Cc: anatoly.burakov@intel.com, Kevin Laatz
Subject: [PATCH v7 4/4] doc: add howto guide for lcore poll busyness
Date: Wed, 14 Sep 2022 10:29:29 +0100
Message-Id: <20220914092929.1159773-5-kevin.laatz@intel.com>
In-Reply-To: <20220914092929.1159773-1-kevin.laatz@intel.com>

Add a new section to the howto guides for using the new lcore poll
busyness telemetry endpoints and describe general usage.

Signed-off-by: Kevin Laatz
---
v6:
  * Add mention of perf autotest in note mentioning perf impact.
v4:
  * Include note on perf impact when the feature is enabled
  * Add doc to toctree
  * Updates to incorporate changes made earlier in the patchset

v3:
  * Update naming to poll busyness
---
 doc/guides/howto/index.rst               |  1 +
 doc/guides/howto/lcore_poll_busyness.rst | 93 ++++++++++++++++++++++++
 2 files changed, 94 insertions(+)
 create mode 100644 doc/guides/howto/lcore_poll_busyness.rst

diff --git a/doc/guides/howto/index.rst b/doc/guides/howto/index.rst
index bf6337d021..0a9060c1d3 100644
--- a/doc/guides/howto/index.rst
+++ b/doc/guides/howto/index.rst
@@ -21,3 +21,4 @@ HowTo Guides
     debug_troubleshoot
     openwrt
     avx512
+    lcore_poll_busyness
diff --git a/doc/guides/howto/lcore_poll_busyness.rst b/doc/guides/howto/lcore_poll_busyness.rst
new file mode 100644
index 0000000000..be5ea2a85d
--- /dev/null
+++ b/doc/guides/howto/lcore_poll_busyness.rst
@@ -0,0 +1,93 @@
+..  SPDX-License-Identifier: BSD-3-Clause
+    Copyright(c) 2022 Intel Corporation.
+
+Lcore Poll Busyness Telemetry
+=============================
+
+The lcore poll busyness telemetry provides a built-in, generic method of gathering
+lcore utilization metrics for running applications. These metrics are exposed
+via a new telemetry endpoint.
+
+Since most DPDK APIs are polling based, the poll busyness is calculated based on
+APIs receiving 'work' (packets, completions, events, etc.). Empty polls are
+considered idle, while non-empty polls are considered busy. Using the amount
+of cycles spent processing empty polls, the busyness can be calculated and recorded.
+
+Application Specified Busyness
+------------------------------
+
+Improving the accuracy of the reported busyness may require more contextual
+awareness from the application. For example, an application may make a number
+of calls to rx_burst before processing packets. If the last burst was an
+"empty poll", then the processing time of the packets would be falsely counted
+as "idle", since the last burst was empty. The application should track whether
+any of the polls contained "work" to do, and should mark the whole 'bulk' as
+"busy" cycles before proceeding to the processing. This type of awareness is
+only available within the application.
+
+Applications can be modified to incorporate this extra contextual awareness in
+order to improve the reported busyness by marking areas of code as "busy" or
+"idle" appropriately. This can be done by inserting the timestamping macro::
+
+    RTE_LCORE_POLL_BUSYNESS_TIMESTAMP(0)   /* to mark section as idle */
+    RTE_LCORE_POLL_BUSYNESS_TIMESTAMP(32)  /* where 32 is nb_pkts to mark section as busy (non-zero is busy) */
+
+All cycles since the last state change (idle to busy, or vice versa) will be
+counted towards the current state's counter.
+
+Consuming the Telemetry
+-----------------------
+
+The telemetry gathered for lcore poll busyness can be read from the `telemetry.py`
+script via the new `/eal/lcore/poll_busyness` endpoint::
+
+    $ ./usertools/dpdk-telemetry.py
+    --> /eal/lcore/poll_busyness
+    {"/eal/lcore/poll_busyness": {"12": -1, "13": 85, "14": 84}}
+
+* Cores not collecting poll busyness will report "-1", e.g. control cores or inactive cores.
+* All enabled cores will report their poll busyness in the range 0-100.
+
+Enabling and Disabling Lcore Poll Busyness Telemetry
+----------------------------------------------------
+
+By default, the lcore poll busyness telemetry is disabled at compile time. In
+order to allow DPDK to gather this metric, the ``enable_lcore_poll_busyness``
+meson option must be set to ``true``.
+
+.. note::
+
+    Enabling lcore poll busyness telemetry may impact performance due to the
+    additional timestamping, potentially per poll, depending on the application.
+    This impact can be measured with the `lcore_poll_busyness_perf_autotest`.
+
+At compile time
+^^^^^^^^^^^^^^^
+
+Support can be enabled/disabled at compile time via the meson option.
+It is disabled by default::
+
+    $ meson configure -Denable_lcore_poll_busyness=true   #enable
+
+    $ meson configure -Denable_lcore_poll_busyness=false  #disable
+
+At run time
+^^^^^^^^^^^
+
+Support can also be enabled/disabled at runtime (if the meson option was
+enabled at compile time). Disabling at runtime comes at the cost of an
+additional branch; however, no additional function calls are performed.
+
+To enable/disable support at runtime, a call can be made to the appropriate
+telemetry endpoint.
+
+Disable::
+
+    $ ./usertools/dpdk-telemetry.py
+    --> /eal/lcore/poll_busyness_disable
+    {"/eal/lcore/poll_busyness_disable": {"poll_busyness_enabled": 0}}
+
+Enable::
+
+    $ ./usertools/dpdk-telemetry.py
+    --> /eal/lcore/poll_busyness_enable
+    {"/eal/lcore/poll_busyness_enable": {"poll_busyness_enabled": 1}}
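[Editor's note] The idle/busy accounting that the howto above describes can be sketched in plain C. The snippet below is a toy model only — `toy_poll_stats`, `toy_timestamp`, and `toy_poll_busyness` are hypothetical names, not part of the DPDK API — illustrating how zero-work polls accumulate "idle" cycles, non-zero polls accumulate "busy" cycles, and how the 0-100 busyness figure (or -1 when nothing has been recorded) falls out of those two counters:

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of the poll-busyness bookkeeping (hypothetical, not DPDK code):
 * each "poll" reports how many items of work it returned and how many cycles
 * elapsed since the previous timestamp. */
struct toy_poll_stats {
	uint64_t busy_cycles;
	uint64_t idle_cycles;
};

/* Analogue of RTE_LCORE_POLL_BUSYNESS_TIMESTAMP(nb_work): an empty poll
 * attributes the elapsed cycles to idle, a non-empty poll to busy. */
static void
toy_timestamp(struct toy_poll_stats *s, unsigned int nb_work, uint64_t cycles)
{
	if (nb_work == 0)
		s->idle_cycles += cycles;	/* empty poll: count as idle */
	else
		s->busy_cycles += cycles;	/* non-empty poll: count as busy */
}

/* Busyness is the busy share of all counted cycles, as a 0-100 percentage;
 * -1 mirrors the "-1" reported by telemetry for cores with no data. */
static int
toy_poll_busyness(const struct toy_poll_stats *s)
{
	uint64_t total = s->busy_cycles + s->idle_cycles;

	if (total == 0)
		return -1;
	return (int)((s->busy_cycles * 100) / total);
}
```

For example, three non-empty polls and one empty poll of equal duration would report a busyness of 75, which is the kind of figure the `/eal/lcore/poll_busyness` endpoint returns per lcore.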