From patchwork Mon Jul 29 12:41:03 2019
From: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
To: dev@dpdk.org
Cc: yskoh@mellanox.com, shahafs@mellanox.com
Date: Mon, 29 Jul 2019 12:41:03 +0000
Message-Id: <1564404065-4823-2-git-send-email-viacheslavo@mellanox.com>
In-Reply-To: <1564404065-4823-1-git-send-email-viacheslavo@mellanox.com>
References: <1564404065-4823-1-git-send-email-viacheslavo@mellanox.com>
Subject: [dpdk-dev] [PATCH 1/3] net/mlx5: fix Tx completion descriptors fetching loop

This patch limits the number of completion descriptors fetched and
processed in a single tx_burst routine call. Completion processing
involves freeing the transmitted buffers, which may be time consuming
and introduce significant latency, so bounding the number of processed
completions per call mitigates the latency issue.

Fixes: 18a1c20044c0 ("net/mlx5: implement Tx burst template")

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
---
 drivers/net/mlx5/mlx5_defs.h |  7 +++++++
 drivers/net/mlx5/mlx5_rxtx.c | 46 +++++++++++++++++++++++++++++---------------
 2 files changed, 38 insertions(+), 15 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_defs.h b/drivers/net/mlx5/mlx5_defs.h
index 8c118d5..461e916 100644
--- a/drivers/net/mlx5/mlx5_defs.h
+++ b/drivers/net/mlx5/mlx5_defs.h
@@ -37,6 +37,13 @@
  */
 #define MLX5_TX_COMP_THRESH_INLINE_DIV (1 << 3)
 
+/*
+ * Maximal amount of normal completion CQEs
+ * processed in one call of tx_burst() routine.
+ */
+#define MLX5_TX_COMP_MAX_CQE 2u
+
+
 /* Size of per-queue MR cache array for linear search. */
 #define MLX5_MR_CACHE_N 8
 
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 007df8f..c2b93c6 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -1992,13 +1992,13 @@ enum mlx5_txcmp_code {
 mlx5_tx_handle_completion(struct mlx5_txq_data *restrict txq,
 			  unsigned int olx __rte_unused)
 {
+	unsigned int count = MLX5_TX_COMP_MAX_CQE;
 	bool update = false;
+	uint16_t tail = txq->elts_tail;
 	int ret;
 
 	do {
-		volatile struct mlx5_wqe_cseg *cseg;
 		volatile struct mlx5_cqe *cqe;
-		uint16_t tail;
 
 		cqe = &txq->cqes[txq->cq_ci & txq->cqe_m];
 		ret = check_cqe(cqe, txq->cqe_s, txq->cq_ci);
@@ -2006,19 +2006,21 @@ enum mlx5_txcmp_code {
 			if (likely(ret != MLX5_CQE_STATUS_ERR)) {
 				/* No new CQEs in completion queue. */
 				assert(ret == MLX5_CQE_STATUS_HW_OWN);
-				if (likely(update)) {
-					/* Update the consumer index. */
-					rte_compiler_barrier();
-					*txq->cq_db =
-					rte_cpu_to_be_32(txq->cq_ci);
-				}
-				return;
+				break;
 			}
 			/* Some error occurred, try to restart. */
 			rte_wmb();
 			tail = mlx5_tx_error_cqe_handle
 				(txq, (volatile struct mlx5_err_cqe *)cqe);
+			if (likely(tail != txq->elts_tail)) {
+				mlx5_tx_free_elts(txq, tail, olx);
+				assert(tail == txq->elts_tail);
+			}
+			/* Allow flushing all CQEs from the queue. */
+			count = txq->cqe_s;
 		} else {
+			volatile struct mlx5_wqe_cseg *cseg;
+
 			/* Normal transmit completion. */
 			++txq->cq_ci;
 			rte_cio_rmb();
@@ -2031,13 +2033,27 @@ enum mlx5_txcmp_code {
 		if (txq->cq_pi)
 			--txq->cq_pi;
 #endif
-		if (likely(tail != txq->elts_tail)) {
-			/* Free data buffers from elts. */
-			mlx5_tx_free_elts(txq, tail, olx);
-			assert(tail == txq->elts_tail);
-		}
 		update = true;
-	} while (true);
+		/*
+		 * We have to restrict the amount of processed CQEs
+		 * in one tx_burst routine call. The CQ may be large
+		 * and many CQEs may be updated by the NIC in one
+		 * transaction. Buffers freeing is time consuming,
+		 * multiple iterations may introduce significant
+		 * latency.
+		 */
+	} while (--count);
+	if (likely(tail != txq->elts_tail)) {
+		/* Free data buffers from elts. */
+		mlx5_tx_free_elts(txq, tail, olx);
+		assert(tail == txq->elts_tail);
+	}
+	if (likely(update)) {
+		/* Update the consumer index. */
+		rte_compiler_barrier();
+		*txq->cq_db =
+		rte_cpu_to_be_32(txq->cq_ci);
+	}
 }
 
 /**
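
As a side note for readers unfamiliar with the driver internals, below
is a minimal, self-contained C sketch of the pattern this patch
applies: process at most a fixed number of normal completions per
call, batch the buffer freeing into a single call after the loop, and
update the consumer index once at the end. All names in the sketch
(toy_txq, poll_cqe, free_elts, TX_COMP_MAX_CQE, handle_completion) are
hypothetical stand-ins chosen for illustration, not the actual mlx5
data structures or API.

/*
 * Simplified stand-ins for the driver structures; a sketch only,
 * not the mlx5 code.
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define TX_COMP_MAX_CQE 2u	/* mirrors MLX5_TX_COMP_MAX_CQE */

enum cqe_status { CQE_HW_OWN, CQE_SW_OWN };

struct toy_txq {
	uint16_t cq_ci;     /* completion queue consumer index */
	uint16_t cqe_n;     /* completions posted by "hardware" */
	uint16_t elts_tail; /* first buffer not yet freed */
};

/* Pretend hardware: a CQE is ready while cq_ci lags behind cqe_n. */
static enum cqe_status
poll_cqe(const struct toy_txq *q, uint16_t *tail)
{
	if (q->cq_ci >= q->cqe_n)
		return CQE_HW_OWN; /* no new completions */
	/* Each completion reports four more buffers as sent. */
	*tail = (uint16_t)((q->cq_ci + 1) * 4);
	return CQE_SW_OWN;
}

/* Free buffers up to the new tail (stubbed out as a print). */
static void
free_elts(struct toy_txq *q, uint16_t tail)
{
	printf("freeing elts [%u, %u)\n", q->elts_tail, tail);
	q->elts_tail = tail;
}

static void
handle_completion(struct toy_txq *q)
{
	unsigned int count = TX_COMP_MAX_CQE;
	uint16_t tail = q->elts_tail;
	bool update = false;

	do {
		if (poll_cqe(q, &tail) == CQE_HW_OWN)
			break;	/* queue drained, stop early */
		++q->cq_ci;	/* consume one normal completion */
		update = true;
		/* Bound the work done in one call to `count` CQEs. */
	} while (--count);
	if (tail != q->elts_tail)
		free_elts(q, tail); /* one batched free after the loop */
	if (update)
		/* The real driver writes the CQ doorbell here, once. */
		printf("cq_db <- %u\n", q->cq_ci);
}

int
main(void)
{
	struct toy_txq q = { .cq_ci = 0, .cqe_n = 5, .elts_tail = 0 };

	/* Five completions pending; each call handles at most two. */
	while (q.cq_ci < q.cqe_n)
		handle_completion(&q);
	return 0;
}

The trade-off is deliberate: deferring reclamation of a few buffers to
the next tx_burst call is cheap, while an unbounded loop over a large
CQ makes the worst-case latency of a single call unpredictable. Note
that the error path in the actual patch lifts the bound
(count = txq->cqe_s) so that an errored queue can still be flushed
completely.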