From patchwork Tue May 30 05:48:04 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ruifeng Wang
X-Patchwork-Id: 127682
X-Patchwork-Delegate: rasland@nvidia.com
From: Ruifeng Wang
To: rasland@nvidia.com, matan@nvidia.com, viacheslavo@nvidia.com
Cc: dev@dpdk.org, honnappa.nagarahalli@arm.com, stable@dpdk.org, nd@arm.com,
 Ruifeng Wang, Ali Alnubani
Subject: [PATCH v2] net/mlx5: fix risk in Rx descriptor read in NEON vector path
Date: Tue, 30 May 2023 13:48:04 +0800
Message-Id: <20230530054804.4101060-1-ruifeng.wang@arm.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20220104030056.268974-1-ruifeng.wang@arm.com>
References: <20220104030056.268974-1-ruifeng.wang@arm.com>
List-Id: DPDK patches and discussions

In the NEON vector PMD, a vector load reads two contiguous 8B words of
descriptor data into a vector register. Because the vector load gives no
16B atomicity guarantee, the read of the word that contains the op_own
field can be reordered after the reads of the other words, in which case
those other words may hold invalid data.

Reload qword0 after the read barrier to update the vector register. This
ensures that the fetched data is valid.

A testpmd single-core test on N1SDP/ThunderX2 showed no performance drop.

Fixes: 1742c2d9fab0 ("net/mlx5: fix synchronization on polling Rx completions")
Cc: stable@dpdk.org

Signed-off-by: Ruifeng Wang
Tested-by: Ali Alnubani
Acked-by: Viacheslav Ovsiienko
---
v2: Rebased and added the tags received.

 drivers/net/mlx5/mlx5_rxtx_vec_neon.h | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
index 75e8ed7e5a..9079da65de 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
@@ -675,6 +675,14 @@ rxq_cq_process_v(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cq,
 		c0 = vld1q_u64((uint64_t *)(p0 + 48));
 		/* Synchronize for loading the rest of blocks. */
 		rte_io_rmb();
+		/* B.0 (CQE 3) reload lower half of the block. */
+		c3 = vld1q_lane_u64((uint64_t *)(p3 + 48), c3, 0);
+		/* B.0 (CQE 2) reload lower half of the block. */
+		c2 = vld1q_lane_u64((uint64_t *)(p2 + 48), c2, 0);
+		/* B.0 (CQE 1) reload lower half of the block. */
+		c1 = vld1q_lane_u64((uint64_t *)(p1 + 48), c1, 0);
+		/* B.0 (CQE 0) reload lower half of the block. */
+		c0 = vld1q_lane_u64((uint64_t *)(p0 + 48), c0, 0);
 		/* Prefetch next 4 CQEs. */
 		if (pkts_n - pos >= 2 * MLX5_VPMD_DESCS_PER_LOOP) {
 			unsigned int next = pos + MLX5_VPMD_DESCS_PER_LOOP;
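
For context, the load/check/barrier/reload pattern the patch applies can be
shown in isolation. The sketch below is illustrative only and not part of the
patch: read_desc, DESC_OWNED, and the two-qword descriptor layout are
hypothetical, while vld1q_u64(), vld1q_lane_u64(), vgetq_lane_u64(), and
rte_io_rmb() are the real NEON/DPDK primitives used above.

#include <stdint.h>
#include <arm_neon.h>
#include <rte_io.h>

#define DESC_OWNED (1ULL << 0) /* hypothetical ownership bit in qword1 */

/*
 * Read a 16B descriptor laid out as { qword0 = data, qword1 = op_own }.
 * Returns 1 and stores the descriptor in *out when software owns it.
 */
static inline int
read_desc(const uint64_t *d, uint64x2_t *out)
{
	/* The 16B vector load is performed as two 8B loads with no
	 * atomicity guarantee, so qword0 may be read before qword1.
	 */
	uint64x2_t v = vld1q_u64(d);

	if (!(vgetq_lane_u64(v, 1) & DESC_OWNED))
		return 0; /* still owned by HW, data not ready */
	/* Order the ownership check before the loads that follow. */
	rte_io_rmb();
	/* qword0 may have been fetched before qword1 and hold stale
	 * data; reload it into lane 0 after the barrier.
	 */
	v = vld1q_lane_u64(d, v, 0);
	*out = v;
	return 1;
}

Reloading a single lane with vld1q_lane_u64() refreshes only the half that
may be stale, which is cheaper than repeating the full 16B load; this matches
the patch, which reloads only the lower half of each block.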