From patchwork Tue Jun 1 07:56:51 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ruifeng Wang X-Patchwork-Id: 93703 X-Patchwork-Delegate: david.marchand@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id B0B71A0524; Tue, 1 Jun 2021 09:57:24 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 9A731410FC; Tue, 1 Jun 2021 09:57:24 +0200 (CEST) Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by mails.dpdk.org (Postfix) with ESMTP id 9C30240040 for ; Tue, 1 Jun 2021 09:57:23 +0200 (CEST) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 174A11FB; Tue, 1 Jun 2021 00:57:23 -0700 (PDT) Received: from net-arm-n1amp-01.shanghai.arm.com (net-arm-n1amp-01.shanghai.arm.com [10.169.210.111]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 3C9963F73D; Tue, 1 Jun 2021 00:57:19 -0700 (PDT) From: Ruifeng Wang To: jerinj@marvell.com, hemant.agrawal@nxp.com, ferruh.yigit@intel.com, thomas@monjalon.net, david.marchand@redhat.com Cc: dev@dpdk.org, nd@arm.com, honnappa.nagarahalli@arm.com, Ruifeng Wang Date: Tue, 1 Jun 2021 07:56:51 +0000 Message-Id: <20210601075653.84927-2-ruifeng.wang@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210601075653.84927-1-ruifeng.wang@arm.com> References: <20210318102550.59265-1-ruifeng.wang@arm.com> <20210601075653.84927-1-ruifeng.wang@arm.com> MIME-Version: 1.0 Subject: [dpdk-dev] [PATCH v2 1/3] examples/l3fwd: reorganize code for better performance X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Moved rfc1812 process prior to NEON registers store. On N1SDP, this reorganization mitigates CPU frontend stall and backend stall when forwarding. On N1SDP with MLX5 40G NIC, this change showed 10.2% performance gain in single port single core MRR test. On ThunderX2, this changed showed no performance degradation. Signed-off-by: Ruifeng Wang --- examples/l3fwd/l3fwd_neon.h | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/examples/l3fwd/l3fwd_neon.h b/examples/l3fwd/l3fwd_neon.h index 86ac5971d7..ea7fe22d00 100644 --- a/examples/l3fwd/l3fwd_neon.h +++ b/examples/l3fwd/l3fwd_neon.h @@ -43,11 +43,6 @@ processx4_step3(struct rte_mbuf *pkt[FWDSTEP], uint16_t dst_port[FWDSTEP]) ve[2] = vsetq_lane_u32(vgetq_lane_u32(te[2], 3), ve[2], 3); ve[3] = vsetq_lane_u32(vgetq_lane_u32(te[3], 3), ve[3], 3); - vst1q_u32(p[0], ve[0]); - vst1q_u32(p[1], ve[1]); - vst1q_u32(p[2], ve[2]); - vst1q_u32(p[3], ve[3]); - rfc1812_process((struct rte_ipv4_hdr *) ((struct rte_ether_hdr *)p[0] + 1), &dst_port[0], pkt[0]->packet_type); @@ -60,6 +55,11 @@ processx4_step3(struct rte_mbuf *pkt[FWDSTEP], uint16_t dst_port[FWDSTEP]) rfc1812_process((struct rte_ipv4_hdr *) ((struct rte_ether_hdr *)p[3] + 1), &dst_port[3], pkt[3]->packet_type); + + vst1q_u32(p[0], ve[0]); + vst1q_u32(p[1], ve[1]); + vst1q_u32(p[2], ve[2]); + vst1q_u32(p[3], ve[3]); } /* From patchwork Tue Jun 1 07:56:52 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ruifeng Wang X-Patchwork-Id: 93704 X-Patchwork-Delegate: david.marchand@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id CFAFBA0524; Tue, 1 Jun 2021 09:57:31 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id C0E81410FA; Tue, 1 Jun 2021 09:57:31 +0200 (CEST) Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by mails.dpdk.org (Postfix) with ESMTP id 6C37640040 for ; Tue, 1 Jun 2021 09:57:30 +0200 (CEST) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id EB27C1FB; Tue, 1 Jun 2021 00:57:29 -0700 (PDT) Received: from net-arm-n1amp-01.shanghai.arm.com (net-arm-n1amp-01.shanghai.arm.com [10.169.210.111]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 1A7203F73D; Tue, 1 Jun 2021 00:57:26 -0700 (PDT) From: Ruifeng Wang To: jerinj@marvell.com, hemant.agrawal@nxp.com, ferruh.yigit@intel.com, thomas@monjalon.net, david.marchand@redhat.com Cc: dev@dpdk.org, nd@arm.com, honnappa.nagarahalli@arm.com, Ruifeng Wang Date: Tue, 1 Jun 2021 07:56:52 +0000 Message-Id: <20210601075653.84927-3-ruifeng.wang@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210601075653.84927-1-ruifeng.wang@arm.com> References: <20210318102550.59265-1-ruifeng.wang@arm.com> <20210601075653.84927-1-ruifeng.wang@arm.com> MIME-Version: 1.0 Subject: [dpdk-dev] [PATCH v2 2/3] examples/l3fwd: eliminate unnecessary calculations X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Both L2 and L3 headers will be used in forward processing. And these two headers are in the same cache line. It has the same effect for prefetching with L2 header address and prefetching with L3 header address. Changed to use L2 header address for prefetching. The change showed no measurable performance improvement, but it definitely removed unnecessary instructions for address calculation. Signed-off-by: Ruifeng Wang Acked-by: Jerin Jacob --- examples/l3fwd/l3fwd_lpm_neon.h | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/examples/l3fwd/l3fwd_lpm_neon.h b/examples/l3fwd/l3fwd_lpm_neon.h index d6c0ba64ab..78ee83b76c 100644 --- a/examples/l3fwd/l3fwd_lpm_neon.h +++ b/examples/l3fwd/l3fwd_lpm_neon.h @@ -98,14 +98,14 @@ l3fwd_lpm_send_packets(int nb_rx, struct rte_mbuf **pkts_burst, if (k) { for (i = 0; i < FWDSTEP; i++) { rte_prefetch0(rte_pktmbuf_mtod(pkts_burst[i], - struct rte_ether_hdr *) + 1); + void *)); } for (j = 0; j != k - FWDSTEP; j += FWDSTEP) { for (i = 0; i < FWDSTEP; i++) { rte_prefetch0(rte_pktmbuf_mtod( pkts_burst[j + i + FWDSTEP], - struct rte_ether_hdr *) + 1); + void *)); } processx4_step1(&pkts_burst[j], &dip, &ipv4_flag); @@ -125,17 +125,17 @@ l3fwd_lpm_send_packets(int nb_rx, struct rte_mbuf **pkts_burst, switch (m) { case 3: rte_prefetch0(rte_pktmbuf_mtod(pkts_burst[j], - struct rte_ether_hdr *) + 1); + void *)); j++; /* fallthrough */ case 2: rte_prefetch0(rte_pktmbuf_mtod(pkts_burst[j], - struct rte_ether_hdr *) + 1); + void *)); j++; /* fallthrough */ case 1: rte_prefetch0(rte_pktmbuf_mtod(pkts_burst[j], - struct rte_ether_hdr *) + 1); + void *)); j++; } From patchwork Tue Jun 1 07:56:53 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ruifeng Wang X-Patchwork-Id: 93705 X-Patchwork-Delegate: david.marchand@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id E7226A0524; Tue, 1 Jun 2021 09:57:39 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id D56DA4111C; Tue, 1 Jun 2021 09:57:39 +0200 (CEST) Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by mails.dpdk.org (Postfix) with ESMTP id 49A7E41104 for ; Tue, 1 Jun 2021 09:57:38 +0200 (CEST) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id C2C001FB; Tue, 1 Jun 2021 00:57:37 -0700 (PDT) Received: from net-arm-n1amp-01.shanghai.arm.com (net-arm-n1amp-01.shanghai.arm.com [10.169.210.111]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id A80373F73D; Tue, 1 Jun 2021 00:57:34 -0700 (PDT) From: Ruifeng Wang To: jerinj@marvell.com, hemant.agrawal@nxp.com, ferruh.yigit@intel.com, thomas@monjalon.net, david.marchand@redhat.com Cc: dev@dpdk.org, nd@arm.com, honnappa.nagarahalli@arm.com, Ruifeng Wang , Ola Liljedahl Date: Tue, 1 Jun 2021 07:56:53 +0000 Message-Id: <20210601075653.84927-4-ruifeng.wang@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210601075653.84927-1-ruifeng.wang@arm.com> References: <20210318102550.59265-1-ruifeng.wang@arm.com> <20210601075653.84927-1-ruifeng.wang@arm.com> MIME-Version: 1.0 Subject: [dpdk-dev] [PATCH v2 3/3] examples/l3fwd: eliminate unnecessary reloads in loop X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Number of rx queue and number of rx port in lcore config are constants during the period of l3 forward application running. But compiler has no this information. Copied values from lcore config to local variables and used the local variables for iteration. Compiler can see that the local variables are not changed, so qconf reloads at each iteration can be eliminated. The change showed 1.8% performance uplift in single core, single port, single queue test on N1SDP platform with MLX5 NIC. Signed-off-by: Ruifeng Wang Reviewed-by: Ola Liljedahl Reviewed-by: Honnappa Nagarahalli Acked-by: Jerin Jacob --- examples/l3fwd/l3fwd_lpm.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/examples/l3fwd/l3fwd_lpm.c b/examples/l3fwd/l3fwd_lpm.c index 427c72b1d2..ff1c18a442 100644 --- a/examples/l3fwd/l3fwd_lpm.c +++ b/examples/l3fwd/l3fwd_lpm.c @@ -154,14 +154,16 @@ lpm_main_loop(__rte_unused void *dummy) lcore_id = rte_lcore_id(); qconf = &lcore_conf[lcore_id]; - if (qconf->n_rx_queue == 0) { + const uint16_t n_rx_q = qconf->n_rx_queue; + const uint16_t n_tx_p = qconf->n_tx_port; + if (n_rx_q == 0) { RTE_LOG(INFO, L3FWD, "lcore %u has nothing to do\n", lcore_id); return 0; } RTE_LOG(INFO, L3FWD, "entering main loop on lcore %u\n", lcore_id); - for (i = 0; i < qconf->n_rx_queue; i++) { + for (i = 0; i < n_rx_q; i++) { portid = qconf->rx_queue_list[i].port_id; queueid = qconf->rx_queue_list[i].queue_id; @@ -181,7 +183,7 @@ lpm_main_loop(__rte_unused void *dummy) diff_tsc = cur_tsc - prev_tsc; if (unlikely(diff_tsc > drain_tsc)) { - for (i = 0; i < qconf->n_tx_port; ++i) { + for (i = 0; i < n_tx_p; ++i) { portid = qconf->tx_port_id[i]; if (qconf->tx_mbufs[portid].len == 0) continue; @@ -197,7 +199,7 @@ lpm_main_loop(__rte_unused void *dummy) /* * Read packet from RX queues */ - for (i = 0; i < qconf->n_rx_queue; ++i) { + for (i = 0; i < n_rx_q; ++i) { portid = qconf->rx_queue_list[i].port_id; queueid = qconf->rx_queue_list[i].queue_id; nb_rx = rte_eth_rx_burst(portid, queueid, pkts_burst,