From patchwork Tue May 16 14:37:50 2023
X-Patchwork-Submitter: Pavan Nikhilesh Bhagavatula
X-Patchwork-Id: 126888
X-Patchwork-Delegate: jerinj@marvell.com
From: Pavan Nikhilesh <pbhagavatula@marvell.com>
To: Pavan Nikhilesh, Shijith Thotton, Nithin Dabilpuram,
 Kiran Kumar K, Sunil Kumar Kori, Satha Rao
Subject: [PATCH 1/3] event/cnxk: align TX queue buffer adjustment
Date: Tue, 16 May 2023 20:07:50 +0530
Message-ID: <20230516143752.4941-1-pbhagavatula@marvell.com>

From: Pavan Nikhilesh <pbhagavatula@marvell.com>

Stop recalculating the SQB thresholds in the Tx queue buffer
adjustment; they are already computed during Tx queue setup.
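[Editor's note: a minimal standalone sketch, not part of the patch, of
the reworked depth math. The helper name and sample values are
hypothetical; the field names mirror the driver. The new formula counts
avail * (sqes_per_sqb - 1) entries instead of avail * sqes_per_sqb,
i.e. one SQE slot per SQB is no longer treated as usable.]

	#include <stdint.h>
	#include <stdio.h>

	/* Mirrors cn10k_sso_sq_depth()/cn9k_sso_sq_depth() after this patch. */
	static int32_t
	sq_depth(int32_t nb_sqb_bufs_adj, int32_t fc_mem, uint16_t sqes_per_sqb_log2)
	{
		/* Free SQBs = configured adjustment - in-flight count. */
		int32_t avail = nb_sqb_bufs_adj - fc_mem;

		/* (avail << log2) - avail == avail * (sqes_per_sqb - 1). */
		return (avail << sqes_per_sqb_log2) - avail;
	}

	int main(void)
	{
		/* 8 free SQBs of 32 SQEs each -> 8 * 31 = 248 usable slots. */
		printf("%d\n", sq_depth(10, 2, 5));
		return 0;
	}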
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
---
Depends-on: series-27660

 drivers/event/cnxk/cn10k_eventdev.c  |  9 +--------
 drivers/event/cnxk/cn10k_tx_worker.h |  6 +++---
 drivers/event/cnxk/cn9k_eventdev.c   |  9 +--------
 drivers/event/cnxk/cn9k_worker.h     | 12 +++++++++---
 drivers/net/cnxk/cn10k_tx.h          | 12 ++++++------
 drivers/net/cnxk/cn9k_tx.h           |  5 +++--
 6 files changed, 23 insertions(+), 30 deletions(-)

--
2.39.1

diff --git a/drivers/event/cnxk/cn10k_eventdev.c b/drivers/event/cnxk/cn10k_eventdev.c
index 89f32c4d1e..f7c6a83ff0 100644
--- a/drivers/event/cnxk/cn10k_eventdev.c
+++ b/drivers/event/cnxk/cn10k_eventdev.c
@@ -840,16 +840,9 @@ cn10k_sso_txq_fc_update(const struct rte_eth_dev *eth_dev, int32_t tx_queue_id)
 		sq = &cnxk_eth_dev->sqs[tx_queue_id];
 		txq = eth_dev->data->tx_queues[tx_queue_id];
 		sqes_per_sqb = 1U << txq->sqes_per_sqb_log2;
-		sq->nb_sqb_bufs_adj =
-			sq->nb_sqb_bufs -
-			RTE_ALIGN_MUL_CEIL(sq->nb_sqb_bufs, sqes_per_sqb) /
-				sqes_per_sqb;
 		if (cnxk_eth_dev->tx_offloads & RTE_ETH_TX_OFFLOAD_SECURITY)
-			sq->nb_sqb_bufs_adj -= (cnxk_eth_dev->outb.nb_desc /
-						(sqes_per_sqb - 1));
+			sq->nb_sqb_bufs_adj -= (cnxk_eth_dev->outb.nb_desc / sqes_per_sqb);
 		txq->nb_sqb_bufs_adj = sq->nb_sqb_bufs_adj;
-		txq->nb_sqb_bufs_adj =
-			((100 - ROC_NIX_SQB_THRESH) * txq->nb_sqb_bufs_adj) / 100;
 	}
 }
 
diff --git a/drivers/event/cnxk/cn10k_tx_worker.h b/drivers/event/cnxk/cn10k_tx_worker.h
index c18786a14c..7b2798ad2e 100644
--- a/drivers/event/cnxk/cn10k_tx_worker.h
+++ b/drivers/event/cnxk/cn10k_tx_worker.h
@@ -32,9 +32,9 @@ cn10k_sso_txq_fc_wait(const struct cn10k_eth_txq *txq)
 static __rte_always_inline int32_t
 cn10k_sso_sq_depth(const struct cn10k_eth_txq *txq)
 {
-	return (txq->nb_sqb_bufs_adj -
-		__atomic_load_n((int16_t *)txq->fc_mem, __ATOMIC_RELAXED))
-	       << txq->sqes_per_sqb_log2;
+	int32_t avail = (int32_t)txq->nb_sqb_bufs_adj -
+			(int32_t)__atomic_load_n(txq->fc_mem, __ATOMIC_RELAXED);
+	return (avail << txq->sqes_per_sqb_log2) - avail;
 }
 
 static __rte_always_inline uint16_t
diff --git a/drivers/event/cnxk/cn9k_eventdev.c b/drivers/event/cnxk/cn9k_eventdev.c
index df23219f14..a9d603c22f 100644
--- a/drivers/event/cnxk/cn9k_eventdev.c
+++ b/drivers/event/cnxk/cn9k_eventdev.c
@@ -893,16 +893,9 @@ cn9k_sso_txq_fc_update(const struct rte_eth_dev *eth_dev, int32_t tx_queue_id)
 		sq = &cnxk_eth_dev->sqs[tx_queue_id];
 		txq = eth_dev->data->tx_queues[tx_queue_id];
 		sqes_per_sqb = 1U << txq->sqes_per_sqb_log2;
-		sq->nb_sqb_bufs_adj =
-			sq->nb_sqb_bufs -
-			RTE_ALIGN_MUL_CEIL(sq->nb_sqb_bufs, sqes_per_sqb) /
-				sqes_per_sqb;
 		if (cnxk_eth_dev->tx_offloads & RTE_ETH_TX_OFFLOAD_SECURITY)
-			sq->nb_sqb_bufs_adj -= (cnxk_eth_dev->outb.nb_desc /
-						(sqes_per_sqb - 1));
+			sq->nb_sqb_bufs_adj -= (cnxk_eth_dev->outb.nb_desc / sqes_per_sqb);
 		txq->nb_sqb_bufs_adj = sq->nb_sqb_bufs_adj;
-		txq->nb_sqb_bufs_adj =
-			((100 - ROC_NIX_SQB_THRESH) * txq->nb_sqb_bufs_adj) / 100;
 	}
 }
 
diff --git a/drivers/event/cnxk/cn9k_worker.h b/drivers/event/cnxk/cn9k_worker.h
index 988cb3acb6..d15dd309fe 100644
--- a/drivers/event/cnxk/cn9k_worker.h
+++ b/drivers/event/cnxk/cn9k_worker.h
@@ -711,6 +711,14 @@ cn9k_sso_hws_xmit_sec_one(const struct cn9k_eth_txq *txq, uint64_t base,
 }
 #endif
 
+static __rte_always_inline int32_t
+cn9k_sso_sq_depth(const struct cn9k_eth_txq *txq)
+{
+	int32_t avail = (int32_t)txq->nb_sqb_bufs_adj -
+			(int32_t)__atomic_load_n(txq->fc_mem, __ATOMIC_RELAXED);
+	return (avail << txq->sqes_per_sqb_log2) - avail;
+}
+
 static __rte_always_inline uint16_t
 cn9k_sso_hws_event_tx(uint64_t base, struct rte_event *ev, uint64_t *cmd,
 		      uint64_t *txq_data, const uint32_t flags)
@@ -734,9 +742,7 @@ cn9k_sso_hws_event_tx(uint64_t base, struct rte_event *ev, uint64_t *cmd,
 	if (flags & NIX_TX_OFFLOAD_MBUF_NOFF_F && txq->tx_compl.ena)
 		handle_tx_completion_pkts(txq, 1, 1);
 
-	if (((txq->nb_sqb_bufs_adj -
-	      __atomic_load_n((int16_t *)txq->fc_mem, __ATOMIC_RELAXED))
-	     << txq->sqes_per_sqb_log2) <= 0)
+	if (cn9k_sso_sq_depth(txq) <= 0)
 		return 0;
 	cn9k_nix_tx_skeleton(txq, cmd, flags, 0);
 	cn9k_nix_xmit_prepare(txq, m, cmd, flags, txq->lso_tun_fmt, txq->mark_flag,
diff --git a/drivers/net/cnxk/cn10k_tx.h b/drivers/net/cnxk/cn10k_tx.h
index c9ec01cd9d..bab08a2d3b 100644
--- a/drivers/net/cnxk/cn10k_tx.h
+++ b/drivers/net/cnxk/cn10k_tx.h
@@ -35,12 +35,13 @@
 
 #define NIX_XMIT_FC_OR_RETURN(txq, pkts)                                       \
 	do {                                                                   \
+		int64_t avail;                                                 \
 		/* Cached value is low, Update the fc_cache_pkts */            \
 		if (unlikely((txq)->fc_cache_pkts < (pkts))) {                 \
+			avail = txq->nb_sqb_bufs_adj - *txq->fc_mem;           \
 			/* Multiply with sqe_per_sqb to express in pkts */     \
 			(txq)->fc_cache_pkts =                                 \
-				((txq)->nb_sqb_bufs_adj - *(txq)->fc_mem)      \
-				<< (txq)->sqes_per_sqb_log2;                   \
+				(avail << (txq)->sqes_per_sqb_log2) - avail;   \
 			/* Check it again for the room */                      \
 			if (unlikely((txq)->fc_cache_pkts < (pkts)))           \
 				return 0;                                      \
@@ -113,10 +114,9 @@ cn10k_nix_vwqe_wait_fc(struct cn10k_eth_txq *txq, int64_t req)
 	if (cached < 0) {
 		/* Check if we have space else retry. */
 		do {
-			refill =
-				(txq->nb_sqb_bufs_adj -
-				 __atomic_load_n(txq->fc_mem, __ATOMIC_RELAXED))
-				<< txq->sqes_per_sqb_log2;
+			refill = txq->nb_sqb_bufs_adj -
+				 __atomic_load_n(txq->fc_mem, __ATOMIC_RELAXED);
+			refill = (refill << txq->sqes_per_sqb_log2) - refill;
 		} while (refill <= 0);
 		__atomic_compare_exchange(&txq->fc_cache_pkts, &cached, &refill,
 					  0, __ATOMIC_RELEASE,
diff --git a/drivers/net/cnxk/cn9k_tx.h b/drivers/net/cnxk/cn9k_tx.h
index e956c1ad2a..8efb75b505 100644
--- a/drivers/net/cnxk/cn9k_tx.h
+++ b/drivers/net/cnxk/cn9k_tx.h
@@ -32,12 +32,13 @@
 
 #define NIX_XMIT_FC_OR_RETURN(txq, pkts)                                       \
 	do {                                                                   \
+		int64_t avail;                                                 \
 		/* Cached value is low, Update the fc_cache_pkts */            \
 		if (unlikely((txq)->fc_cache_pkts < (pkts))) {                 \
+			avail = txq->nb_sqb_bufs_adj - *txq->fc_mem;           \
 			/* Multiply with sqe_per_sqb to express in pkts */     \
 			(txq)->fc_cache_pkts =                                 \
-				((txq)->nb_sqb_bufs_adj - *(txq)->fc_mem)      \
-				<< (txq)->sqes_per_sqb_log2;                   \
+				(avail << (txq)->sqes_per_sqb_log2) - avail;   \
 			/* Check it again for the room */                      \
 			if (unlikely((txq)->fc_cache_pkts < (pkts)))           \
 				return 0;                                      \
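[Editor's note: besides the "- avail" term, this patch also drops the
(int16_t *) casts on the fc_mem loads. Below is a minimal sketch, with
made-up values, of why reading a 64-bit flow-control word through a
16-bit view is fragile; whether this ever misfired in practice is not
stated in the patch.]

	#include <inttypes.h>
	#include <stdint.h>
	#include <stdio.h>

	int main(void)
	{
		uint64_t fc_mem = 70000;	/* hypothetical in-flight count */
		/* The old code's view of it: the low 16 bits only. */
		int16_t sliced = (int16_t)fc_mem;

		printf("full=%" PRIu64 " sliced=%d\n", fc_mem, sliced);
		/* full=70000 sliced=4464 -> depth math breaks past 2^16. */
		return 0;
	}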
From patchwork Tue May 16 14:37:51 2023
X-Patchwork-Submitter: Pavan Nikhilesh Bhagavatula
X-Patchwork-Id: 126889
X-Patchwork-Delegate: jerinj@marvell.com
From: Pavan Nikhilesh <pbhagavatula@marvell.com>
To: Pavan Nikhilesh, Shijith Thotton
Subject: [PATCH 2/3] event/cnxk: use local labels in asm intrinsic
Date: Tue, 16 May 2023 20:07:51 +0530
Message-ID: <20230516143752.4941-2-pbhagavatula@marvell.com>
In-Reply-To: <20230516143752.4941-1-pbhagavatula@marvell.com>
References: <20230516143752.4941-1-pbhagavatula@marvell.com>

From: Pavan Nikhilesh <pbhagavatula@marvell.com>

Named labels in inline asm are emitted as regular function symbols,
which clutters the call stack in tools such as gdb and perf. Use
assembler-local labels instead for better visibility.
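[Editor's note: a minimal sketch, outside the series, of the idiom this
patch switches to. On ELF/AArch64 targets the assembler treats labels
beginning with ".L" as local: they never reach the symbol table, so
perf and gdb stop attributing samples to stray "rty"/"done" symbols.
"%=" expands to a number unique to each asm instance, keeping labels
distinct when the surrounding function is inlined many times. The
helper name is hypothetical.]

	static inline int wait_flag_set(const volatile int *flag)
	{
		int v;

	#if defined(__aarch64__)
		asm volatile(
			/* ".Lrty%=" emits no symbol; plain "rty%=" would. */
			".Lrty%=:			\n"
			"	ldr %w[v], [%[f]]	\n"
			"	cbz %w[v], .Lrty%=	\n"
			: [v] "=&r"(v)
			: [f] "r"(flag)
			: "memory");
	#else
		while ((v = *flag) == 0)
			;
	#endif
		return v;
	}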
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
---
 drivers/event/cnxk/cn10k_worker.h    |  8 ++---
 drivers/event/cnxk/cn9k_worker.h     | 25 ++++++++--------
 drivers/event/cnxk/cnxk_tim_worker.h | 44 ++++++++++++++--------------
 drivers/event/cnxk/cnxk_worker.h     |  8 ++---
 4 files changed, 43 insertions(+), 42 deletions(-)

diff --git a/drivers/event/cnxk/cn10k_worker.h b/drivers/event/cnxk/cn10k_worker.h
index 27d3a9bca2..6ec81a5ce5 100644
--- a/drivers/event/cnxk/cn10k_worker.h
+++ b/drivers/event/cnxk/cn10k_worker.h
@@ -268,12 +268,12 @@ cn10k_sso_hws_get_work_empty(struct cn10k_sso_hws *ws, struct rte_event *ev,
 #ifdef RTE_ARCH_ARM64
 	asm volatile(PLT_CPU_FEATURE_PREAMBLE
 		     "		ldp %[tag], %[wqp], [%[tag_loc]]	\n"
-		     "		tbz %[tag], 63, done%=			\n"
+		     "		tbz %[tag], 63, .Ldone%=		\n"
 		     "		sevl					\n"
-		     "rty%=:	wfe					\n"
+		     ".Lrty%=:	wfe					\n"
 		     "		ldp %[tag], %[wqp], [%[tag_loc]]	\n"
-		     "		tbnz %[tag], 63, rty%=			\n"
-		     "done%=:	dmb ld					\n"
+		     "		tbnz %[tag], 63, .Lrty%=		\n"
+		     ".Ldone%=:	dmb ld					\n"
 		     : [tag] "=&r"(gw.u64[0]), [wqp] "=&r"(gw.u64[1])
 		     : [tag_loc] "r"(ws->base + SSOW_LF_GWS_WQE0)
 		     : "memory");
diff --git a/drivers/event/cnxk/cn9k_worker.h b/drivers/event/cnxk/cn9k_worker.h
index d15dd309fe..51b4db419e 100644
--- a/drivers/event/cnxk/cn9k_worker.h
+++ b/drivers/event/cnxk/cn9k_worker.h
@@ -232,18 +232,19 @@ cn9k_sso_hws_dual_get_work(uint64_t base, uint64_t pair_base,
 	rte_prefetch_non_temporal(dws->lookup_mem);
 #ifdef RTE_ARCH_ARM64
 	asm volatile(PLT_CPU_FEATURE_PREAMBLE
-		     "rty%=:					\n"
+		     ".Lrty%=:					\n"
 		     "		ldr %[tag], [%[tag_loc]]	\n"
 		     "		ldr %[wqp], [%[wqp_loc]]	\n"
-		     "		tbnz %[tag], 63, rty%=		\n"
-		     "done%=:	str %[gw], [%[pong]]		\n"
+		     "		tbnz %[tag], 63, .Lrty%=	\n"
+		     ".Ldone%=:	str %[gw], [%[pong]]		\n"
 		     "		dmb ld				\n"
 		     "		sub %[mbuf], %[wqp], #0x80	\n"
 		     "		prfm pldl1keep, [%[mbuf]]	\n"
 		     : [tag] "=&r"(gw.u64[0]), [wqp] "=&r"(gw.u64[1]),
 		       [mbuf] "=&r"(mbuf)
 		     : [tag_loc] "r"(base + SSOW_LF_GWS_TAG),
-		       [wqp_loc] "r"(base + SSOW_LF_GWS_WQP), [gw] "r"(dws->gw_wdata),
+		       [wqp_loc] "r"(base + SSOW_LF_GWS_WQP),
+		       [gw] "r"(dws->gw_wdata),
 		       [pong] "r"(pair_base + SSOW_LF_GWS_OP_GET_WORK0));
 #else
 	gw.u64[0] = plt_read64(base + SSOW_LF_GWS_TAG);
@@ -282,13 +283,13 @@ cn9k_sso_hws_get_work(struct cn9k_sso_hws *ws, struct rte_event *ev,
 	asm volatile(PLT_CPU_FEATURE_PREAMBLE
 		     "		ldr %[tag], [%[tag_loc]]	\n"
 		     "		ldr %[wqp], [%[wqp_loc]]	\n"
-		     "		tbz %[tag], 63, done%=		\n"
+		     "		tbz %[tag], 63, .Ldone%=	\n"
 		     "		sevl				\n"
-		     "rty%=:	wfe				\n"
+		     ".Lrty%=:	wfe				\n"
 		     "		ldr %[tag], [%[tag_loc]]	\n"
 		     "		ldr %[wqp], [%[wqp_loc]]	\n"
-		     "		tbnz %[tag], 63, rty%=		\n"
-		     "done%=:	dmb ld				\n"
+		     "		tbnz %[tag], 63, .Lrty%=	\n"
+		     ".Ldone%=:	dmb ld				\n"
 		     "		sub %[mbuf], %[wqp], #0x80	\n"
 		     "		prfm pldl1keep, [%[mbuf]]	\n"
 		     : [tag] "=&r"(gw.u64[0]), [wqp] "=&r"(gw.u64[1]),
@@ -330,13 +331,13 @@ cn9k_sso_hws_get_work_empty(uint64_t base, struct rte_event *ev,
 	asm volatile(PLT_CPU_FEATURE_PREAMBLE
 		     "		ldr %[tag], [%[tag_loc]]	\n"
 		     "		ldr %[wqp], [%[wqp_loc]]	\n"
-		     "		tbz %[tag], 63, done%=		\n"
+		     "		tbz %[tag], 63, .Ldone%=	\n"
 		     "		sevl				\n"
-		     "rty%=:	wfe				\n"
+		     ".Lrty%=:	wfe				\n"
 		     "		ldr %[tag], [%[tag_loc]]	\n"
 		     "		ldr %[wqp], [%[wqp_loc]]	\n"
-		     "		tbnz %[tag], 63, rty%=		\n"
-		     "done%=:	dmb ld				\n"
+		     "		tbnz %[tag], 63, .Lrty%=	\n"
+		     ".Ldone%=:	dmb ld				\n"
 		     "		sub %[mbuf], %[wqp], #0x80	\n"
 		     : [tag] "=&r"(gw.u64[0]), [wqp] "=&r"(gw.u64[1]),
 		       [mbuf] "=&r"(mbuf)
diff --git a/drivers/event/cnxk/cnxk_tim_worker.h b/drivers/event/cnxk/cnxk_tim_worker.h
index 8fafb8f09c..f0857f26ba 100644
--- a/drivers/event/cnxk/cnxk_tim_worker.h
+++ b/drivers/event/cnxk/cnxk_tim_worker.h
@@ -262,12 +262,12 @@ cnxk_tim_add_entry_sp(struct cnxk_tim_ring *const tim_ring,
 #ifdef RTE_ARCH_ARM64
 	asm volatile(PLT_CPU_FEATURE_PREAMBLE
 		     "		ldxr %[hbt], [%[w1]]	\n"
-		     "		tbz %[hbt], 33, dne%=	\n"
+		     "		tbz %[hbt], 33, .Ldne%=	\n"
 		     "		sevl			\n"
-		     "rty%=:	wfe			\n"
+		     ".Lrty%=:	wfe			\n"
 		     "		ldxr %[hbt], [%[w1]]	\n"
-		     "		tbnz %[hbt], 33, rty%=	\n"
-		     "dne%=:				\n"
+		     "		tbnz %[hbt], 33, .Lrty%=\n"
+		     ".Ldne%=:				\n"
 		     : [hbt] "=&r"(hbt_state)
 		     : [w1] "r"((&bkt->w1))
 		     : "memory");
@@ -345,12 +345,12 @@ cnxk_tim_add_entry_mp(struct cnxk_tim_ring *const tim_ring,
 #ifdef RTE_ARCH_ARM64
 	asm volatile(PLT_CPU_FEATURE_PREAMBLE
 		     "		ldxr %[hbt], [%[w1]]	\n"
-		     "		tbz %[hbt], 33, dne%=	\n"
+		     "		tbz %[hbt], 33, .Ldne%=	\n"
 		     "		sevl			\n"
-		     "rty%=:	wfe			\n"
+		     ".Lrty%=:	wfe			\n"
 		     "		ldxr %[hbt], [%[w1]]	\n"
-		     "		tbnz %[hbt], 33, rty%=	\n"
-		     "dne%=:				\n"
+		     "		tbnz %[hbt], 33, .Lrty%=\n"
+		     ".Ldne%=:				\n"
 		     : [hbt] "=&r"(hbt_state)
 		     : [w1] "r"((&bkt->w1))
 		     : "memory");
@@ -374,13 +374,13 @@ cnxk_tim_add_entry_mp(struct cnxk_tim_ring *const tim_ring,
 		cnxk_tim_bkt_dec_lock(bkt);
 #ifdef RTE_ARCH_ARM64
 		asm volatile(PLT_CPU_FEATURE_PREAMBLE
-			     "		ldxr %[rem], [%[crem]]	\n"
-			     "		tbz %[rem], 63, dne%=	\n"
+			     "		ldxr %[rem], [%[crem]]	 \n"
+			     "		tbz %[rem], 63, .Ldne%=	 \n"
 			     "		sevl			\n"
-			     "rty%=:	wfe			\n"
-			     "		ldxr %[rem], [%[crem]]	\n"
-			     "		tbnz %[rem], 63, rty%=	\n"
-			     "dne%=:				\n"
+			     ".Lrty%=:	wfe			 \n"
+			     "		ldxr %[rem], [%[crem]]	 \n"
+			     "		tbnz %[rem], 63, .Lrty%= \n"
+			     ".Ldne%=:				 \n"
			     : [rem] "=&r"(rem)
 			     : [crem] "r"(&bkt->w1)
 			     : "memory");
@@ -478,12 +478,12 @@ cnxk_tim_add_entry_brst(struct cnxk_tim_ring *const tim_ring,
 #ifdef RTE_ARCH_ARM64
 	asm volatile(PLT_CPU_FEATURE_PREAMBLE
 		     "		ldxr %[hbt], [%[w1]]	\n"
-		     "		tbz %[hbt], 33, dne%=	\n"
+		     "		tbz %[hbt], 33, .Ldne%=	\n"
 		     "		sevl			\n"
-		     "rty%=:	wfe			\n"
+		     ".Lrty%=:	wfe			\n"
 		     "		ldxr %[hbt], [%[w1]]	\n"
-		     "		tbnz %[hbt], 33, rty%=	\n"
-		     "dne%=:				\n"
+		     "		tbnz %[hbt], 33, .Lrty%=\n"
+		     ".Ldne%=:				\n"
 		     : [hbt] "=&r"(hbt_state)
 		     : [w1] "r"((&bkt->w1))
 		     : "memory");
@@ -510,13 +510,13 @@ cnxk_tim_add_entry_brst(struct cnxk_tim_ring *const tim_ring,
 	asm volatile(PLT_CPU_FEATURE_PREAMBLE
 		     "		ldxrb %w[lock_cnt], [%[lock]]	\n"
 		     "		tst %w[lock_cnt], 255		\n"
-		     "		beq dne%=			\n"
+		     "		beq .Ldne%=			\n"
 		     "		sevl				\n"
-		     "rty%=:	wfe				\n"
+		     ".Lrty%=:	wfe				\n"
 		     "		ldxrb %w[lock_cnt], [%[lock]]	\n"
 		     "		tst %w[lock_cnt], 255		\n"
-		     "		bne rty%=			\n"
-		     "dne%=:					\n"
+		     "		bne .Lrty%=			\n"
+		     ".Ldne%=:					\n"
 		     : [lock_cnt] "=&r"(lock_cnt)
 		     : [lock] "r"(&bkt->lock)
 		     : "memory");
diff --git a/drivers/event/cnxk/cnxk_worker.h b/drivers/event/cnxk/cnxk_worker.h
index 22d90afba2..2bd41f8a5e 100644
--- a/drivers/event/cnxk/cnxk_worker.h
+++ b/drivers/event/cnxk/cnxk_worker.h
@@ -71,12 +71,12 @@ cnxk_sso_hws_swtag_wait(uintptr_t tag_op)
 
 	asm volatile(PLT_CPU_FEATURE_PREAMBLE
 		     "		ldr %[swtb], [%[swtp_loc]]	\n"
-		     "		tbz %[swtb], 62, done%=		\n"
+		     "		tbz %[swtb], 62, .Ldone%=	\n"
 		     "		sevl				\n"
-		     "rty%=:	wfe				\n"
+		     ".Lrty%=:	wfe				\n"
 		     "		ldr %[swtb], [%[swtp_loc]]	\n"
-		     "		tbnz %[swtb], 62, rty%=		\n"
-		     "done%=:					\n"
+		     "		tbnz %[swtb], 62, .Lrty%=	\n"
+		     ".Ldone%=:					\n"
 		     : [swtb] "=&r"(swtp)
 		     : [swtp_loc] "r"(tag_op));
 #else
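[Editor's note: a minimal sketch, outside the series, of the recurring
"wait until a status bit clears" idiom this patch relabels, mirroring
the driver's ldr-based loops (the event source that wakes WFE here is
platform-specific). tbz/tbnz branch on a single bit, so the common case
(bit already clear) falls straight through; bit 63 stands in for the
hardware pending bits (63, 33 or 62) polled above. The helper name is
hypothetical.]

	#include <stdint.h>

	static inline uint64_t wait_bit63_clear(const uint64_t *addr)
	{
		uint64_t v;

	#if defined(__aarch64__)
		asm volatile(
			"	ldr %[v], [%[a]]	\n"
			"	tbz %[v], 63, .Ldone%=	\n"
			"	sevl			\n"
			".Lrty%=: wfe			\n"
			"	ldr %[v], [%[a]]	\n"
			"	tbnz %[v], 63, .Lrty%=	\n"
			".Ldone%=:			\n"
			: [v] "=&r"(v)
			: [a] "r"(addr)
			: "memory");
	#else
		do {
			v = __atomic_load_n(addr, __ATOMIC_RELAXED);
		} while (v >> 63);
	#endif
		return v;
	}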
From patchwork Tue May 16 14:37:52 2023
X-Patchwork-Submitter: Pavan Nikhilesh Bhagavatula
X-Patchwork-Id: 126890
X-Patchwork-Delegate: jerinj@marvell.com
From: Pavan Nikhilesh <pbhagavatula@marvell.com>
To: Pavan Nikhilesh, Shijith Thotton, Nithin Dabilpuram,
 Kiran Kumar K, Sunil Kumar Kori, Satha Rao
Subject: [PATCH 3/3] event/cnxk: use WFE in Tx fc wait
Date: Tue, 16 May 2023 20:07:52 +0530
Message-ID: <20230516143752.4941-3-pbhagavatula@marvell.com>
In-Reply-To: <20230516143752.4941-1-pbhagavatula@marvell.com>
References: <20230516143752.4941-1-pbhagavatula@marvell.com>

From: Pavan Nikhilesh <pbhagavatula@marvell.com>

Use WFE in the Tx path when waiting for space in the Tx queue.
Depending on Tx queue contention and size, WFE reduces cache pressure
and power consumption. In multi-core scenarios we have observed up to
8 W of power reduction.
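[Editor's note: a minimal sketch, outside the series, of the LDXR/WFE
shape this patch adds. LDXR tags the address in the exclusive monitor;
a store to it by another agent clears the monitor and generates the
wake-up event WFE sleeps on, so the core idles instead of busy-polling
the cache line. SEVL posts a local event so the first WFE falls through
and the condition is re-checked at least once. The helper name is
hypothetical.]

	#include <stdint.h>

	static inline void wait_until_nonzero(const uint64_t *addr)
	{
	#if defined(__aarch64__)
		uint64_t val;

		asm volatile(
			"	ldxr %[val], [%[ad]]	\n"
			"	cbnz %[val], .Ldne%=	\n"
			"	sevl			\n"
			".Lrty%=: wfe			\n"
			"	ldxr %[val], [%[ad]]	\n"
			"	cbz %[val], .Lrty%=	\n"
			".Ldne%=:			\n"
			: [val] "=&r"(val)
			: [ad] "r"(addr)
			: "memory");
	#else
		while (__atomic_load_n(addr, __ATOMIC_RELAXED) == 0)
			;
	#endif
	}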
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
---
 drivers/event/cnxk/cn10k_tx_worker.h |  18 ++++
 drivers/net/cnxk/cn10k_tx.h          | 152 +++++++++++++++++++++++----
 2 files changed, 147 insertions(+), 23 deletions(-)

diff --git a/drivers/event/cnxk/cn10k_tx_worker.h b/drivers/event/cnxk/cn10k_tx_worker.h
index 7b2798ad2e..df57a4b137 100644
--- a/drivers/event/cnxk/cn10k_tx_worker.h
+++ b/drivers/event/cnxk/cn10k_tx_worker.h
@@ -24,9 +24,27 @@ cn10k_sso_hws_xtract_meta(struct rte_mbuf *m, const uint64_t *txq_data)
 static __rte_always_inline void
 cn10k_sso_txq_fc_wait(const struct cn10k_eth_txq *txq)
 {
+#ifdef RTE_ARCH_ARM64
+	uint64_t space;
+
+	asm volatile(PLT_CPU_FEATURE_PREAMBLE
+		     "		ldxr %[space], [%[addr]]	\n"
+		     "		cmp %[adj], %[space]		\n"
+		     "		b.hi .Ldne%=			\n"
+		     "		sevl				\n"
+		     ".Lrty%=:	wfe				\n"
+		     "		ldxr %[space], [%[addr]]	\n"
+		     "		cmp %[adj], %[space]		\n"
+		     "		b.ls .Lrty%=			\n"
+		     ".Ldne%=:					\n"
+		     : [space] "=&r"(space)
+		     : [adj] "r"(txq->nb_sqb_bufs_adj), [addr] "r"(txq->fc_mem)
+		     : "memory");
+#else
 	while ((uint64_t)txq->nb_sqb_bufs_adj <=
 	       __atomic_load_n(txq->fc_mem, __ATOMIC_RELAXED))
 		;
+#endif
 }
 
 static __rte_always_inline int32_t
diff --git a/drivers/net/cnxk/cn10k_tx.h b/drivers/net/cnxk/cn10k_tx.h
index bab08a2d3b..9049ac6b1a 100644
--- a/drivers/net/cnxk/cn10k_tx.h
+++ b/drivers/net/cnxk/cn10k_tx.h
@@ -102,27 +102,72 @@ cn10k_nix_tx_mbuf_validate(struct rte_mbuf *m, const uint32_t flags)
 }
 
 static __plt_always_inline void
-cn10k_nix_vwqe_wait_fc(struct cn10k_eth_txq *txq, int64_t req)
+cn10k_nix_vwqe_wait_fc(struct cn10k_eth_txq *txq, uint16_t req)
 {
 	int64_t cached, refill;
+	int64_t pkts;
 
 retry:
+#ifdef RTE_ARCH_ARM64
+
+	asm volatile(PLT_CPU_FEATURE_PREAMBLE
+		     "		ldxr %[pkts], [%[addr]]		\n"
+		     "		tbz %[pkts], 63, .Ldne%=	\n"
+		     "		sevl				\n"
+		     ".Lrty%=:	wfe				\n"
+		     "		ldxr %[pkts], [%[addr]]		\n"
+		     "		tbnz %[pkts], 63, .Lrty%=	\n"
+		     ".Ldne%=:					\n"
+		     : [pkts] "=&r"(pkts)
+		     : [addr] "r"(&txq->fc_cache_pkts)
+		     : "memory");
+#else
+	RTE_SET_USED(pkts);
 	while (__atomic_load_n(&txq->fc_cache_pkts, __ATOMIC_RELAXED) < 0)
 		;
+#endif
 	cached = __atomic_fetch_sub(&txq->fc_cache_pkts, req, __ATOMIC_ACQUIRE) - req;
 	/* Check if there is enough space, else update and retry. */
-	if (cached < 0) {
-		/* Check if we have space else retry. */
-		do {
-			refill = txq->nb_sqb_bufs_adj -
-				 __atomic_load_n(txq->fc_mem, __ATOMIC_RELAXED);
-			refill = (refill << txq->sqes_per_sqb_log2) - refill;
-		} while (refill <= 0);
-		__atomic_compare_exchange(&txq->fc_cache_pkts, &cached, &refill,
-					  0, __ATOMIC_RELEASE,
-					  __ATOMIC_RELAXED);
+	if (cached >= 0)
+		return;
+
+	/* Check if we have space else retry. */
+#ifdef RTE_ARCH_ARM64
+	int64_t val;
+
+	asm volatile(PLT_CPU_FEATURE_PREAMBLE
+		     "		ldxr %[val], [%[addr]]			\n"
+		     "		sub %[val], %[adj], %[val]		\n"
+		     "		lsl %[refill], %[val], %[shft]		\n"
+		     "		sub %[refill], %[refill], %[val]	\n"
+		     "		sub %[refill], %[refill], %[sub]	\n"
+		     "		cmp %[refill], #0x0			\n"
+		     "		b.ge .Ldne%=				\n"
+		     "		sevl					\n"
+		     ".Lrty%=:	wfe					\n"
+		     "		ldxr %[val], [%[addr]]			\n"
+		     "		sub %[val], %[adj], %[val]		\n"
+		     "		lsl %[refill], %[val], %[shft]		\n"
+		     "		sub %[refill], %[refill], %[val]	\n"
+		     "		sub %[refill], %[refill], %[sub]	\n"
+		     "		cmp %[refill], #0x0			\n"
+		     "		b.lt .Lrty%=				\n"
+		     ".Ldne%=:						\n"
+		     : [refill] "=&r"(refill), [val] "=&r"(val)
+		     : [addr] "r"(txq->fc_mem), [adj] "r"(txq->nb_sqb_bufs_adj),
+		       [shft] "r"(txq->sqes_per_sqb_log2), [sub] "r"(req)
+		     : "memory");
+#else
+	do {
+		refill = (txq->nb_sqb_bufs_adj - __atomic_load_n(txq->fc_mem, __ATOMIC_RELAXED));
+		refill = (refill << txq->sqes_per_sqb_log2) - refill;
+		refill -= req;
+	} while (refill < 0);
+#endif
+	if (!__atomic_compare_exchange(&txq->fc_cache_pkts, &cached, &refill,
+				       0, __ATOMIC_RELEASE,
+				       __ATOMIC_RELAXED))
 		goto retry;
-	}
 }
 
 /* Function to determine no of tx subdesc required in case ext
@@ -283,10 +328,27 @@ static __rte_always_inline void
 cn10k_nix_sec_fc_wait_one(struct cn10k_eth_txq *txq)
 {
 	uint64_t nb_desc = txq->cpt_desc;
-	uint64_t *fc = txq->cpt_fc;
-
-	while (nb_desc <= __atomic_load_n(fc, __ATOMIC_RELAXED))
+	uint64_t fc;
+
+#ifdef RTE_ARCH_ARM64
+	asm volatile(PLT_CPU_FEATURE_PREAMBLE
+		     "		ldxr %[space], [%[addr]]	\n"
+		     "		cmp %[nb_desc], %[space]	\n"
+		     "		b.hi .Ldne%=			\n"
+		     "		sevl				\n"
+		     ".Lrty%=:	wfe				\n"
+		     "		ldxr %[space], [%[addr]]	\n"
+		     "		cmp %[nb_desc], %[space]	\n"
+		     "		b.ls .Lrty%=			\n"
+		     ".Ldne%=:					\n"
+		     : [space] "=&r"(fc)
+		     : [nb_desc] "r"(nb_desc), [addr] "r"(txq->cpt_fc)
+		     : "memory");
+#else
+	RTE_SET_USED(fc);
+	while (nb_desc <= __atomic_load_n(txq->cpt_fc, __ATOMIC_RELAXED))
 		;
+#endif
 }
 
 static __rte_always_inline void
@@ -294,7 +356,7 @@ cn10k_nix_sec_fc_wait(struct cn10k_eth_txq *txq, uint16_t nb_pkts)
 {
 	int32_t nb_desc, val, newval;
 	int32_t *fc_sw;
-	volatile uint64_t *fc;
+	uint64_t *fc;
 
 	/* Check if there is any CPT instruction to submit */
 	if (!nb_pkts)
@@ -302,21 +364,59 @@ cn10k_nix_sec_fc_wait(struct cn10k_eth_txq *txq, uint16_t nb_pkts)
 
 again:
 	fc_sw = txq->cpt_fc_sw;
-	val = __atomic_fetch_sub(fc_sw, nb_pkts, __ATOMIC_RELAXED) - nb_pkts;
+#ifdef RTE_ARCH_ARM64
+	asm volatile(PLT_CPU_FEATURE_PREAMBLE
+		     "		ldxr %w[pkts], [%[addr]]	\n"
+		     "		tbz %w[pkts], 31, .Ldne%=	\n"
+		     "		sevl				\n"
+		     ".Lrty%=:	wfe				\n"
+		     "		ldxr %w[pkts], [%[addr]]	\n"
+		     "		tbnz %w[pkts], 31, .Lrty%=	\n"
+		     ".Ldne%=:					\n"
+		     : [pkts] "=&r"(val)
+		     : [addr] "r"(fc_sw)
+		     : "memory");
+#else
+	/* Wait for primary core to refill FC. */
+	while (__atomic_load_n(fc_sw, __ATOMIC_RELAXED) < 0)
+		;
+#endif
+
+	val = __atomic_fetch_sub(fc_sw, nb_pkts, __ATOMIC_ACQUIRE) - nb_pkts;
 	if (likely(val >= 0))
 		return;
 
 	nb_desc = txq->cpt_desc;
 	fc = txq->cpt_fc;
+#ifdef RTE_ARCH_ARM64
+	asm volatile(PLT_CPU_FEATURE_PREAMBLE
+		     "		ldxr %[refill], [%[addr]]		\n"
+		     "		sub %[refill], %[desc], %[refill]	\n"
+		     "		sub %[refill], %[refill], %[pkts]	\n"
+		     "		cmp %[refill], #0x0			\n"
+		     "		b.ge .Ldne%=				\n"
+		     "		sevl					\n"
+		     ".Lrty%=:	wfe					\n"
+		     "		ldxr %[refill], [%[addr]]		\n"
+		     "		sub %[refill], %[desc], %[refill]	\n"
+		     "		sub %[refill], %[refill], %[pkts]	\n"
+		     "		cmp %[refill], #0x0			\n"
+		     "		b.lt .Lrty%=				\n"
+		     ".Ldne%=:						\n"
+		     : [refill] "=&r"(newval)
+		     : [addr] "r"(fc), [desc] "r"(nb_desc), [pkts] "r"(nb_pkts)
+		     : "memory");
+#else
 	while (true) {
 		newval = nb_desc - __atomic_load_n(fc, __ATOMIC_RELAXED);
 		newval -= nb_pkts;
 		if (newval >= 0)
 			break;
 	}
+#endif
 
-	if (!__atomic_compare_exchange_n(fc_sw, &val, newval, false,
-					 __ATOMIC_RELAXED, __ATOMIC_RELAXED))
+	if (!__atomic_compare_exchange_n(fc_sw, &val, newval, false, __ATOMIC_RELEASE,
+					 __ATOMIC_RELAXED))
 		goto again;
 }
 
@@ -3110,10 +3210,16 @@ cn10k_nix_xmit_pkts_vector(void *tx_queue, uint64_t *ws,
 			wd.data[1] |= ((uint64_t)(lnum - 17)) << 12;
 			wd.data[1] |= (uint64_t)(lmt_id + 16);
 
-			if (flags & NIX_TX_VWQE_F)
-				cn10k_nix_vwqe_wait_fc(txq,
-					burst - (cn10k_nix_pkts_per_vec_brst(flags) >> 1));
+			if (flags & NIX_TX_VWQE_F) {
+				if (flags & NIX_TX_MULTI_SEG_F) {
+					if (burst - (cn10k_nix_pkts_per_vec_brst(flags) >> 1) > 0)
+						cn10k_nix_vwqe_wait_fc(txq,
+							burst - (cn10k_nix_pkts_per_vec_brst(flags) >> 1));
+				} else {
+					cn10k_nix_vwqe_wait_fc(txq,
+						burst - (cn10k_nix_pkts_per_vec_brst(flags) >> 1));
+				}
+			}
 			/* STEOR1 */
 			roc_lmt_submit_steorl(wd.data[1], pa);
 		} else if (lnum) {
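[Editor's note: a minimal standalone sketch, outside the series, of why
the multi-seg path above now checks "> 0" before calling
cn10k_nix_vwqe_wait_fc(). The request parameter changed from int64_t to
uint16_t, so a negative difference would silently wrap to a huge
request; the values below are made up.]

	#include <stdint.h>
	#include <stdio.h>

	int main(void)
	{
		int burst = 10, half_brst = 32;	/* small multi-seg burst */
		uint16_t req = (uint16_t)(burst - half_brst);

		/* Prints 65514, not -22: hence the explicit guard before
		 * the call when the difference may be non-positive. */
		printf("req = %u\n", req);
		return 0;
	}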