net/mlx5: fix UAR memory mapping type

Message ID 1595429948-20873-1-git-send-email-viacheslavo@mellanox.com (mailing list archive)
State Accepted, archived
Delegated to: Raslan Darawsheh
Series: net/mlx5: fix UAR memory mapping type

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/Intel-compilation success Compilation OK
ci/travis-robot success Travis build: passed
ci/iol-intel-Performance success Performance Testing PASS
ci/iol-mellanox-Performance success Performance Testing PASS
ci/iol-testing success Testing PASS

Commit Message

Slava Ovsiienko July 22, 2020, 2:59 p.m. UTC
  The User Access Region (UAR) is a special mechanism that provides
direct access to the hardware registers; it is the part of the PCI
address space that is mapped to a CPU virtual address. The mapping
can be performed with the "Write-Combining" or "Non-Cached" type,
and either of them may or may not be supported on a given setup.

To prevent a device probing failure, the UAR allocation is retried
with the alternative mapping type. The datapath takes the actual
UAR mapping into account on queue creation.

There was another issue with a NULL UAR base address.
OFED 5.0.x and upstream rdma_core before v29 returned NULL as the
UAR base address if the UAR was not the first object in the UAR page.
This caused a PMD failure, so we should try to get another UAR
until one with a non-NULL base address is returned.

Fixes: fc4d4f732bbc ("net/mlx5: introduce shared UAR resource")

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
---
 drivers/net/mlx5/mlx5.c      | 160 ++++++++++++++++++++++++++++++++++++++-----
 drivers/net/mlx5/mlx5_defs.h |  11 +++
 2 files changed, 153 insertions(+), 18 deletions(-)
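
The allocation strategy described above boils down to two nested fallbacks: try the preferred mapping type first and fall back to the alternative, then retry the whole allocation if the returned UAR has a NULL base address. Below is a minimal sketch of that pattern only; the try_alloc_uar() helper and the other names are hypothetical stand-ins, not mlx5 APIs, and the complete implementation is mlx5_alloc_rxtx_uars() in the patch further down (which calls mlx5_glue->devx_alloc_uar()).

#include <stddef.h>

enum uar_map { UAR_MAP_WC, UAR_MAP_NC };	/* Write-Combining / Non-Cached */

struct uar { void *base_addr; };

/* Hypothetical allocator: returns NULL if the mapping type is unsupported. */
struct uar *try_alloc_uar(enum uar_map type);

#define UAR_ALLOC_RETRY 2	/* mirrors MLX5_ALLOC_UAR_RETRY in the patch */

struct uar *
alloc_uar_robust(enum uar_map preferred)
{
	struct uar *uar;
	int retry;

	for (retry = 0; retry < UAR_ALLOC_RETRY; ++retry) {
		/* Try the preferred mapping type, fall back to the other one. */
		uar = try_alloc_uar(preferred);
		if (uar == NULL)
			uar = try_alloc_uar(preferred == UAR_MAP_WC ?
					    UAR_MAP_NC : UAR_MAP_WC);
		if (uar == NULL)
			return NULL;	/* neither mapping type is supported */
		if (uar->base_addr != NULL)
			return uar;	/* got a usable UAR */
		/*
		 * Old rdma_core may return a UAR with a NULL base address;
		 * keep the object (it is freed with the device context)
		 * and simply ask for another one.
		 */
	}
	return NULL;	/* no UAR with a non-NULL base address */
}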
  

Comments

Ori Kam July 23, 2020, 11:58 a.m. UTC | #1
> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Viacheslav Ovsiienko
> 
> [...]

Acked-by: Ori Kam <orika@mellanox.com>
Thanks,
Ori
  
Raslan Darawsheh July 23, 2020, 12:34 p.m. UTC | #2
Hi,

> -----Original Message-----
> From: Ori Kam <orika@mellanox.com>
> Sent: Thursday, July 23, 2020 2:58 PM
> To: Slava Ovsiienko <viacheslavo@mellanox.com>; dev@dpdk.org
> Cc: Matan Azrad <matan@mellanox.com>; Raslan Darawsheh
> <rasland@mellanox.com>; Thomas Monjalon <thomas@monjalon.net>;
> Dekel Peled <dekelp@mellanox.com>
> Subject: RE: [dpdk-dev] [PATCH] net/mlx5: fix UAR memory mapping type
> 
> 
> [...]


Patch applied to next-net-mlx,

Kindest regards,
Raslan Darawsheh
  

Patch

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 6c7a7ee..a5cccd1 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -706,6 +706,141 @@  struct mlx5_flow_id_pool *
 	prf->obj = NULL;
 }
 
+/*
+ * Allocate Rx and Tx UARs in a robust fashion.
+ * This routine handles the following UAR allocation issues:
+ *
+ *  - tries to allocate the UAR with the most appropriate memory
+ *    mapping type from the ones supported by the host;
+ *
+ *  - tries to allocate a UAR with a non-NULL base address.
+ *    OFED 5.0.x and upstream rdma_core before v29 returned NULL as
+ *    the UAR base address if the UAR was not the first object in the
+ *    UAR page. This caused a PMD failure, so we should try to get
+ *    another UAR until one with a non-NULL base address is returned.
+ */
+static int
+mlx5_alloc_rxtx_uars(struct mlx5_dev_ctx_shared *sh,
+		     const struct mlx5_dev_config *config)
+{
+	uint32_t uar_mapping, retry;
+	int err = 0;
+
+	for (retry = 0; retry < MLX5_ALLOC_UAR_RETRY; ++retry) {
+#ifdef MLX5DV_UAR_ALLOC_TYPE_NC
+		/* Control the mapping type according to the settings. */
+		uar_mapping = (config->dbnc == MLX5_TXDB_NCACHED) ?
+			      MLX5DV_UAR_ALLOC_TYPE_NC :
+			      MLX5DV_UAR_ALLOC_TYPE_BF;
+#else
+		RTE_SET_USED(config);
+		/*
+		 * It seems we have no way to control the memory mapping type
+		 * for the UAR, so the default "Write-Combining" type is assumed.
+		 * The UAR initialization on queue creation queries the
+		 * actual mapping type done by Verbs/kernel and sets up the
+		 * PMD datapath accordingly.
+		 */
+		uar_mapping = 0;
+#endif
+		sh->tx_uar = mlx5_glue->devx_alloc_uar(sh->ctx, uar_mapping);
+#ifdef MLX5DV_UAR_ALLOC_TYPE_NC
+		if (!sh->tx_uar &&
+		    uar_mapping == MLX5DV_UAR_ALLOC_TYPE_BF) {
+			if (config->dbnc == MLX5_TXDB_CACHED ||
+			    config->dbnc == MLX5_TXDB_HEURISTIC)
+				DRV_LOG(WARNING, "Devarg tx_db_nc setting "
+						 "is not supported by DevX");
+			/*
+			 * In some environments, like a virtual machine,
+			 * the Write-Combining mapping might not be supported
+			 * and the UAR allocation fails. We try the "Non-Cached"
+			 * mapping in that case. The tx_burst routines take
+			 * the UAR mapping type into account on UAR setup
+			 * on queue creation.
+			 */
+			DRV_LOG(WARNING, "Failed to allocate Tx DevX UAR (BF)");
+			uar_mapping = MLX5DV_UAR_ALLOC_TYPE_NC;
+			sh->tx_uar = mlx5_glue->devx_alloc_uar
+							(sh->ctx, uar_mapping);
+		} else if (!sh->tx_uar &&
+			   uar_mapping == MLX5DV_UAR_ALLOC_TYPE_NC) {
+			if (config->dbnc == MLX5_TXDB_NCACHED)
+				DRV_LOG(WARNING, "Devarg tx_db_nc setting "
+						 "is not supported by DevX");
+			/*
+			 * If Verbs/kernel does not support "Non-Cached"
+			 * try the "Write-Combining".
+			 */
+			DRV_LOG(WARNING, "Failed to allocate Tx DevX UAR (NC)");
+			uar_mapping = MLX5DV_UAR_ALLOC_TYPE_BF;
+			sh->tx_uar = mlx5_glue->devx_alloc_uar
+							(sh->ctx, uar_mapping);
+		}
+#endif
+		if (!sh->tx_uar) {
+			DRV_LOG(ERR, "Failed to allocate Tx DevX UAR (BF/NC)");
+			err = ENOMEM;
+			goto exit;
+		}
+		if (sh->tx_uar->base_addr)
+			break;
+		/*
+		 * The UARs are allocated by rdma_core within the
+		 * IB device context; on context closure all UARs
+		 * will be freed, so there should be no memory/object leakage.
+		 */
+		DRV_LOG(WARNING, "Retrying to allocate Tx DevX UAR");
+		sh->tx_uar = NULL;
+	}
+	/* Check whether we finally succeeded with valid UAR allocation. */
+	if (!sh->tx_uar) {
+		DRV_LOG(ERR, "Failed to allocate Tx DevX UAR (NULL base)");
+		err = ENOMEM;
+		goto exit;
+	}
+	for (retry = 0; retry < MLX5_ALLOC_UAR_RETRY; ++retry) {
+		uar_mapping = 0;
+		sh->devx_rx_uar = mlx5_glue->devx_alloc_uar
+							(sh->ctx, uar_mapping);
+#ifdef MLX5DV_UAR_ALLOC_TYPE_NC
+		if (!sh->devx_rx_uar &&
+		    uar_mapping == MLX5DV_UAR_ALLOC_TYPE_BF) {
+			/*
+			 * The Rx UAR is used to control interrupts only,
+			 * so there should be no noticeable datapath impact,
+			 * and we can safely try the "Non-Cached" mapping.
+			 */
+			DRV_LOG(WARNING, "Failed to allocate Rx DevX UAR (BF)");
+			uar_mapping = MLX5DV_UAR_ALLOC_TYPE_NC;
+			sh->devx_rx_uar = mlx5_glue->devx_alloc_uar
+							(sh->ctx, uar_mapping);
+		}
+#endif
+		if (!sh->devx_rx_uar) {
+			DRV_LOG(ERR, "Failed to allocate Rx DevX UAR (BF/NC)");
+			err = ENOMEM;
+			goto exit;
+		}
+		if (sh->devx_rx_uar->base_addr)
+			break;
+		/*
+		 * The UARs are allocated by rdma_core within the
+		 * IB device context; on context closure all UARs
+		 * will be freed, so there should be no memory/object leakage.
+		 */
+		DRV_LOG(WARNING, "Retrying to allocate Rx DevX UAR");
+		sh->devx_rx_uar = NULL;
+	}
+	/* Check whether we finally succeeded with valid UAR allocation. */
+	if (!sh->devx_rx_uar) {
+		DRV_LOG(ERR, "Failed to allocate Rx DevX UAR (NULL base)");
+		err = ENOMEM;
+	}
+exit:
+	return err;
+}
+
 /**
  * Allocate shared device context. If there is multiport device the
  * master and representors will share this context, if there is single
@@ -807,18 +942,11 @@  struct mlx5_dev_ctx_shared *
 			err = ENOMEM;
 			goto error;
 		}
-		sh->tx_uar = mlx5_glue->devx_alloc_uar(sh->ctx, 0);
-		if (!sh->tx_uar) {
-			DRV_LOG(ERR, "Failed to allocate DevX UAR.");
-			err = ENOMEM;
-			goto error;
-		}
-		sh->devx_rx_uar = mlx5_glue->devx_alloc_uar(sh->ctx, 0);
-		if (!sh->devx_rx_uar) {
-			DRV_LOG(ERR, "Failed to allocate Rx DevX UAR.");
-			err = ENOMEM;
+		err = mlx5_alloc_rxtx_uars(sh, config);
+		if (err)
 			goto error;
-		}
+		MLX5_ASSERT(sh->tx_uar && sh->tx_uar->base_addr);
+		MLX5_ASSERT(sh->devx_rx_uar && sh->devx_rx_uar->base_addr);
 	}
 	sh->flow_id_pool = mlx5_flow_id_pool_alloc
 					((1 << HAIRPIN_FLOW_ID_BITS) - 1);
@@ -874,20 +1002,16 @@  struct mlx5_dev_ctx_shared *
 	pthread_mutex_destroy(&sh->txpp.mutex);
 	pthread_mutex_unlock(&mlx5_dev_ctx_list_mutex);
 	MLX5_ASSERT(sh);
-	if (sh->cnt_id_tbl) {
+	if (sh->cnt_id_tbl)
 		mlx5_l3t_destroy(sh->cnt_id_tbl);
-		sh->cnt_id_tbl = NULL;
-	}
-	if (sh->tx_uar) {
-		mlx5_glue->devx_free_uar(sh->tx_uar);
-		sh->tx_uar = NULL;
-	}
 	if (sh->tis)
 		claim_zero(mlx5_devx_cmd_destroy(sh->tis));
 	if (sh->td)
 		claim_zero(mlx5_devx_cmd_destroy(sh->td));
 	if (sh->devx_rx_uar)
 		mlx5_glue->devx_free_uar(sh->devx_rx_uar);
+	if (sh->tx_uar)
+		mlx5_glue->devx_free_uar(sh->tx_uar);
 	if (sh->pd)
 		claim_zero(mlx5_glue->dealloc_pd(sh->pd));
 	if (sh->ctx)
diff --git a/drivers/net/mlx5/mlx5_defs.h b/drivers/net/mlx5/mlx5_defs.h
index 7ed3e88..e5f7acc 100644
--- a/drivers/net/mlx5/mlx5_defs.h
+++ b/drivers/net/mlx5/mlx5_defs.h
@@ -196,4 +196,15 @@ 
 #define static_assert _Static_assert
 #endif
 
+/*
+ * Defines the number of retries to allocate the first UAR in the page.
+ * OFED 5.0.x and upstream rdma_core before v29 returned NULL as the
+ * UAR base address if the UAR was not the first object in the UAR page.
+ * This caused a PMD failure, so we should try to get another UAR
+ * until one with a non-NULL base address is returned.
+ * The value should follow the rdma_core internal (not exported)
+ * definition MLX5_NUM_NON_FP_BFREGS_PER_UAR.
+ */
+#define MLX5_ALLOC_UAR_RETRY 2
+
 #endif /* RTE_PMD_MLX5_DEFS_H_ */
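
For context on the commit message's note that "the datapath takes the actual UAR mapping into account on queue creation": a write to a Write-Combining mapped register can sit in the CPU's write-combining buffer until it is explicitly flushed, while a Non-Cached mapping posts the write without that extra flush. The sketch below only illustrates that difference and is not the PMD's tx_burst code; the names and the __sync_synchronize() stand-in (real DPDK code would use rte_wmb()/rte_io_wmb()) are assumptions for the example, and the uar_is_wc flag represents whatever mapping type the allocation above ended up with.

#include <stdint.h>
#include <stdbool.h>

/* Stand-in for a store fence; DPDK code would use rte_wmb()/rte_io_wmb(). */
#define store_fence() __sync_synchronize()

/*
 * Illustrative doorbell write: queue setup records which mapping type
 * the UAR allocation ended up with, and the datapath flushes the
 * write-combining buffer only when the UAR is WC-mapped.
 */
static inline void
ring_doorbell(volatile uint64_t *uar_reg, uint64_t db_val, bool uar_is_wc)
{
	/* Make the queue descriptors visible before notifying the HW. */
	store_fence();
	*uar_reg = db_val;		/* write the doorbell to the UAR */
	if (uar_is_wc)
		store_fence();		/* flush the write-combining buffer */
}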