common/mlx5: fix error handling in multi-class probe

Message ID 20211124220238.3119860-1-michaelba@nvidia.com (mailing list archive)
State Accepted, archived
Delegated to: Raslan Darawsheh

Checks

Context                         Check    Description
ci/checkpatch                   success  coding style OK
ci/Intel-compilation            success  Compilation OK
ci/intel-Testing                success  Testing PASS
ci/github-robot: build          success  github build: passed
ci/iol-mellanox-Performance     success  Performance Testing PASS
ci/iol-broadcom-Performance     success  Performance Testing PASS
ci/iol-broadcom-Functional      success  Functional Testing PASS
ci/iol-x86_64-compile-testing   success  Testing PASS
ci/iol-x86_64-unit-testing      success  Testing PASS
ci/iol-aarch64-unit-testing     success  Testing PASS
ci/iol-intel-Functional         success  Functional Testing PASS
ci/iol-intel-Performance        success  Performance Testing PASS
ci/iol-aarch64-compile-testing  success  Testing PASS

Commit Message

Michael Baum Nov. 24, 2021, 10:02 p.m. UTC
  From: Michael Baum <michaelba@nvidia.com>

The common drivers_probe function loops over the probe functions of all
classes requested by the user. After it has probed them all successfully,
it records them on the device in the "classes_loaded" field.

If one of them fails, all the classes probed up to that point are removed
using the drivers_remove function. However, this function releases only
the classes already set in the "classes_loaded" field of the given device,
so it misses the newly probed classes.

This patch removes that condition from the release function and makes the
caller pass a more accurate parameter: only the classes probed in this
call that were not already loaded on the device.

Fixes: 8a41f4deccc3 ("common/mlx5: introduce layer for multiple class drivers")
Cc: stable@dpdk.org

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
---
 drivers/common/mlx5/mlx5_common.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)
  

Comments

Thomas Monjalon Nov. 25, 2021, 9:26 a.m. UTC | #1
24/11/2021 23:02, michaelba@nvidia.com:
> From: Michael Baum <michaelba@nvidia.com>
> 
> The common drivers_probe function calls in a loop to all probe functions
> for classes requested by the user. After it manages to probe them all,
> it updates this on the device in the "classes_loaded" field.
> 
> If one of them fails, all those probed to it are remove using the
> drivers_remove function. However, this function only releases the
> classes in the "classes_loaded" field on the given device and misses the
> newly probed classes.
> 
> This patch removes the condition from the release function, and ensures
> that the caller function sends a more accurate parameter.
> 
> Fixes: 8a41f4deccc3 ("common/mlx5: introduce layer for multiple class drivers")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Michael Baum <michaelba@nvidia.com>
> Acked-by: Matan Azrad <matan@nvidia.com>

How well has it been tested?
How critical is it to have in 21.11?
  
Matan Azrad Nov. 25, 2021, 10:34 a.m. UTC | #2
From: Thomas Monjalon
> 24/11/2021 23:02, michaelba@nvidia.com:
> > From: Michael Baum <michaelba@nvidia.com>
> >
> > The common drivers_probe function calls in a loop to all probe
> > functions for classes requested by the user. After it manages to probe
> > them all, it updates this on the device in the "classes_loaded" field.
> >
> > If one of them fails, all those probed to it are remove using the
> > drivers_remove function. However, this function only releases the
> > classes in the "classes_loaded" field on the given device and misses
> > the newly probed classes.
> >
> > This patch removes the condition from the release function, and
> > ensures that the caller function sends a more accurate parameter.
> >
> > Fixes: 8a41f4deccc3 ("common/mlx5: introduce layer for multiple class
> > drivers")
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: Michael Baum <michaelba@nvidia.com>
> > Acked-by: Matan Azrad <matan@nvidia.com>
> 
> How well has it been tested?

It was tested carefully for all the remove cases.

> How critical is it to have in 21.11?

It is an error-flow issue, not critical.

Raslan Darawsheh Jan. 6, 2022, 9:05 a.m. UTC | #3
Hi,

> -----Original Message-----
> From: Michael Baum <michaelba@nvidia.com>
> Sent: Thursday, November 25, 2021 12:03 AM
> To: dev@dpdk.org
> Cc: Matan Azrad <matan@nvidia.com>; Raslan Darawsheh
> <rasland@nvidia.com>; Slava Ovsiienko <viacheslavo@nvidia.com>; Michael
> Baum <michaelba@nvidia.com>; stable@dpdk.org
> Subject: [PATCH] common/mlx5: fix error handling in multi-class probe
> 
> From: Michael Baum <michaelba@nvidia.com>
> 
> The common drivers_probe function calls in a loop to all probe functions
> for classes requested by the user. After it manages to probe them all,
> it updates this on the device in the "classes_loaded" field.
> 
> If one of them fails, all those probed to it are remove using the
> drivers_remove function. However, this function only releases the
> classes in the "classes_loaded" field on the given device and misses the
> newly probed classes.
> 
> This patch removes the condition from the release function, and ensures
> that the caller function sends a more accurate parameter.
> 
> Fixes: 8a41f4deccc3 ("common/mlx5: introduce layer for multiple class
> drivers")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Michael Baum <michaelba@nvidia.com>
> Acked-by: Matan Azrad <matan@nvidia.com>

Patch applied to next-net-mlx,

Kindest regards,
Raslan Darawsheh
  

Patch

diff --git a/drivers/common/mlx5/mlx5_common.c b/drivers/common/mlx5/mlx5_common.c
index f1650f94c6..faa3d65ab3 100644
--- a/drivers/common/mlx5/mlx5_common.c
+++ b/drivers/common/mlx5/mlx5_common.c
@@ -616,7 +616,6 @@  drivers_remove(struct mlx5_common_device *cdev, uint32_t enabled_classes)
 	unsigned int i = 0;
 	int ret = 0;
 
-	enabled_classes &= cdev->classes_loaded;
 	while (enabled_classes) {
 		driver = driver_get(RTE_BIT64(i));
 		if (driver != NULL) {
@@ -665,9 +664,11 @@  drivers_probe(struct mlx5_common_device *cdev, uint32_t user_classes)
 	cdev->classes_loaded |= enabled_classes;
 	return 0;
 probe_err:
-	/* Only unload drivers which are enabled which were enabled
-	 * in this probe instance.
+	/*
+	 * Need to remove only drivers which were not probed before this probe
+	 * instance, but have already been probed before this failure.
 	 */
+	enabled_classes &= ~cdev->classes_loaded;
 	drivers_remove(cdev, enabled_classes);
 	return ret;
 }
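
The effect of moving the mask from drivers_remove to the probe_err path can be checked with a small standalone sketch. It is not the driver code: the class bits and helper names below are hypothetical, and it assumes that enabled_classes holds the classes probed successfully so far in the current call, which the hunk above suggests but does not show.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical class bits, for illustration only (not the mlx5 values). */
#define CLASS_A (UINT32_C(1) << 0)
#define CLASS_B (UINT32_C(1) << 1)

/*
 * Before the patch: drivers_remove() intersected the requested mask with
 * classes_loaded, so classes probed in the failing call were skipped.
 */
uint32_t
removed_before_fix(uint32_t classes_loaded, uint32_t enabled_classes)
{
	return enabled_classes & classes_loaded;
}

/*
 * After the patch: the caller clears the classes that were loaded before
 * this call, and drivers_remove() releases everything it is given.
 */
uint32_t
removed_after_fix(uint32_t classes_loaded, uint32_t enabled_classes)
{
	return enabled_classes & ~classes_loaded;
}

/* Worked cases; each assert documents the expected removal mask. */
void
check_examples(void)
{
	/* First probe: A and B succeeded, then a later class failed. */
	assert(removed_before_fix(0, CLASS_A | CLASS_B) == 0);
	assert(removed_after_fix(0, CLASS_A | CLASS_B) == (CLASS_A | CLASS_B));
	/* A was loaded by an earlier probe and probed again; only B is new. */
	assert(removed_before_fix(CLASS_A, CLASS_A | CLASS_B) == CLASS_A);
	assert(removed_after_fix(CLASS_A, CLASS_A | CLASS_B) == CLASS_B);
}
```

In the first case the old mask releases nothing, so the newly probed classes are never removed. In the second case the old mask releases the class that was already loaded and leaves the new one in place. The new mask selects only the classes added by the failing call.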