net/mlx5: fix Tx queue size created with DevX

Message ID 20210204120409.1194-1-viacheslavo@nvidia.com (mailing list archive)
State Accepted, archived
Delegated to: Raslan Darawsheh
Series: net/mlx5: fix Tx queue size created with DevX

Checks

Context Check Description
ci/checkpatch warning coding style issues
ci/iol-broadcom-Performance success Performance Testing PASS
ci/iol-broadcom-Functional success Functional Testing PASS
ci/Intel-compilation success Compilation OK
ci/intel-Testing success Testing PASS
ci/iol-intel-Performance success Performance Testing PASS
ci/travis-robot warning Travis build: failed
ci/iol-testing warning Testing issues

Commit Message

Slava Ovsiienko Feb. 4, 2021, 12:04 p.m. UTC
  The number of descriptors specified for queue creation
implies the queue should be able to contain the specified
number of packets being sent. Typically one packet takes
one queue descriptor (WQE) to be handled. If the inline
data option is enabled, one packet might require more WQEs
to embrace the inline data, and the overall queue size (the
number of queue descriptors) should be adjusted accordingly.

In the mlx5 PMD the queues can be created either via Verbs,
using the rdma-core library, or via DevX as a direct
kernel/firmware call. The rdma-core library does the queue
size adjustment internally, depending on the TSO and inline
settings. The DevX approach missed this point, which caused
a queue size discrepancy and performance variations.

The patch adjusts the Tx queue size for the DevX approach
in the same way as it is done in the rdma-core
implementation.

Fixes: 86d259cec852 ("net/mlx5: separate Tx queue object creations")
Cc: stable@dpdk.org

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 drivers/net/mlx5/mlx5_devx.c | 21 +++++++++++++++++++--
 1 file changed, 19 insertions(+), 2 deletions(-)
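
For a concrete sense of the adjustment (the byte sizes below are
illustrative assumptions; the PMD takes the real values from its
PRM definitions): with a 64-byte WQE basic block (WQEBB), a
16-byte control segment, a 16-byte Ethernet segment and a 16-byte
data segment, a queue requested with 512 descriptors and
inlen_send = 192 needs max(16 + 16 + 16, 192 + 16 + 16) = 224
bytes per packet, i.e. ceil(224 / 64) = 4 WQEBBs, so the SQ is
created with wqe_n = 512 * 4 = 2048 descriptors, capped by the
device's max_qp_wr limit.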
  

Comments

Raslan Darawsheh Feb. 4, 2021, 4:51 p.m. UTC | #1
Hi,

> -----Original Message-----
> From: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
> Sent: Thursday, February 4, 2021 2:04 PM
> To: dev@dpdk.org
> Cc: Raslan Darawsheh <rasland@nvidia.com>; Matan Azrad
> <matan@nvidia.com>; Ori Kam <orika@nvidia.com>; NBU-Contact-Thomas
> Monjalon <thomas@monjalon.net>; stable@dpdk.org
> Subject: [PATCH] net/mlx5: fix Tx queue size created with DevX
> 
> The number of descriptors specified for queue creation
> implies the queue should be able to contain the specified
> number of packets being sent. Typically one packet takes
> one queue descriptor (WQE) to be handled. If the inline
> data option is enabled, one packet might require more WQEs
> to embrace the inline data, and the overall queue size (the
> number of queue descriptors) should be adjusted accordingly.
> 
> In the mlx5 PMD the queues can be created either via Verbs,
> using the rdma-core library, or via DevX as a direct
> kernel/firmware call. The rdma-core library does the queue
> size adjustment internally, depending on the TSO and inline
> settings. The DevX approach missed this point, which caused
> a queue size discrepancy and performance variations.
> 
> The patch adjusts the Tx queue size for the DevX approach
> in the same way as it is done in the rdma-core
> implementation.
> 
> Fixes: 86d259cec852 ("net/mlx5: separate Tx queue object creations")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Patch applied to next-net-mlx,

Kindest regards,
Raslan Darawsheh
  

Patch

diff --git a/drivers/net/mlx5/mlx5_devx.c b/drivers/net/mlx5/mlx5_devx.c
index 935cbd03ab..ef34c38580 100644
--- a/drivers/net/mlx5/mlx5_devx.c
+++ b/drivers/net/mlx5/mlx5_devx.c
@@ -1036,7 +1036,7 @@  mlx5_txq_devx_obj_new(struct rte_eth_dev *dev, uint16_t idx)
 	};
 	void *reg_addr;
 	uint32_t cqe_n, log_desc_n;
-	uint32_t wqe_n;
+	uint32_t wqe_n, wqe_size;
 	int ret = 0;
 
 	MLX5_ASSERT(txq_data);
@@ -1069,8 +1069,25 @@  mlx5_txq_devx_obj_new(struct rte_eth_dev *dev, uint16_t idx)
 	txq_data->cq_pi = 0;
 	txq_data->cq_db = txq_obj->cq_obj.db_rec;
 	*txq_data->cq_db = 0;
+	/*
+	 * Adjust the amount of WQEs depending on inline settings.
+	 * The number of descriptors should be enough to handle
+	 * the specified number of packets. If the queue is being
+	 * created with Verbs, rdma-core does the size adjustment
+	 * internally in mlx5_calc_sq_size(); do the same here for
+	 * the queue being created with DevX.
+	 */
+	wqe_size = txq_data->tso_en ? txq_ctrl->max_tso_header : 0;
+	wqe_size += sizeof(struct mlx5_wqe_cseg) +
+		    sizeof(struct mlx5_wqe_eseg) +
+		    sizeof(struct mlx5_wqe_dseg);
+	if (txq_data->inlen_send)
+		wqe_size = RTE_MAX(wqe_size, txq_data->inlen_send +
+					     sizeof(struct mlx5_wqe_cseg) +
+					     sizeof(struct mlx5_wqe_eseg));
+	wqe_size = RTE_ALIGN_CEIL(wqe_size, MLX5_WQE_SIZE) / MLX5_WQE_SIZE;
 	/* Create Send Queue object with DevX. */
-	wqe_n = RTE_MIN(1UL << txq_data->elts_n,
+	wqe_n = RTE_MIN((1UL << txq_data->elts_n) * wqe_size,
 			(uint32_t)priv->sh->device_attr.max_qp_wr);
 	log_desc_n = log2above(wqe_n);
 	ret = mlx5_txq_create_devx_sq_resources(dev, idx, log_desc_n);
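
A minimal standalone sketch of the same adjustment, runnable
outside the PMD. The WQEBB and segment sizes are assumptions
standing in for the mlx5_prm.h definitions, and tx_wqe_count()
is a hypothetical helper, not a PMD function:

#include <stdint.h>
#include <stdio.h>

#define WQEBB_SIZE 64u /* WQE basic block, bytes (assumed) */
#define CSEG_SIZE  16u /* control segment, bytes (assumed) */
#define ESEG_SIZE  16u /* Ethernet segment, bytes (assumed) */
#define DSEG_SIZE  16u /* data pointer segment, bytes (assumed) */

#define MAX_U32(a, b) ((a) > (b) ? (a) : (b))
#define MIN_U32(a, b) ((a) < (b) ? (a) : (b))
#define ALIGN_CEIL(v, a) (((v) + (a) - 1) / (a) * (a))

/* Worst-case WQE count for a Tx queue, mirroring the patch above. */
static uint32_t
tx_wqe_count(uint32_t log_elts_n, /* log2 of requested descriptors */
	     uint32_t inlen_send, /* max inline data per packet, bytes */
	     uint32_t tso_header, /* max TSO header, 0 if TSO disabled */
	     uint32_t max_qp_wr)  /* device limit on WQEs per queue */
{
	uint32_t wqe_size = tso_header + CSEG_SIZE + ESEG_SIZE + DSEG_SIZE;

	if (inlen_send)
		wqe_size = MAX_U32(wqe_size,
				   inlen_send + CSEG_SIZE + ESEG_SIZE);
	/* WQEBBs needed per packet in the worst case. */
	wqe_size = ALIGN_CEIL(wqe_size, WQEBB_SIZE) / WQEBB_SIZE;
	return MIN_U32((1u << log_elts_n) * wqe_size, max_qp_wr);
}

int
main(void)
{
	/* 512 packets, 192B inline, no TSO, 32K device limit -> 2048. */
	printf("wqe_n = %u\n", tx_wqe_count(9, 192, 0, 32768));
	return 0;
}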