net/mlx5: fix xstats get reinitialization

Message ID 20210925151129.24727-1-themiron@yandex-team.ru (mailing list archive)
State Superseded, archived
Delegated to: Raslan Darawsheh
Headers
Series net/mlx5: fix xstats get reinitialization |

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/Intel-compilation success Compilation OK
ci/intel-Testing success Testing PASS
ci/github-robot: build success github build: passed
ci/iol-broadcom-Functional success Functional Testing PASS
ci/iol-broadcom-Performance success Performance Testing PASS
ci/iol-intel-Functional success Functional Testing PASS
ci/iol-intel-Performance success Performance Testing PASS
ci/iol-aarch64-compile-testing success Testing PASS
ci/iol-x86_64-compile-testing success Testing PASS
ci/iol-x86_64-unit-testing success Testing PASS
ci/iol-mellanox-Performance success Performance Testing PASS

Commit Message

Vladislav Grishenko Sept. 25, 2021, 3:11 p.m. UTC
  The mlx5_xstats_get gets the device extended statistics.
In this function the driver may reinitialize the structures that are
used to read device counters.

In case of reinitialization, the number of counters may change, which
wouldn't be taken into account by the get API callback and can cause
a segmentation fault.

In case of insufficient supplied stats table size, ex. zero to query
the number of extended stats, reinitialization may never happen and
the returned stats number, that is used for subsequent stats getting,
will not be sufficient too.

This issue is fixed by getting and allocating the counters size after
the reinitialization.

Fixes: 1a611fdaf6ec ("net/mlx5: support missing counter in extended statistics")
Fixes: a4193ae3bc4f ("net/mlx5: support extended statistics")
Cc: stable@dpdk.org

Signed-off-by: Vladislav Grishenko <themiron@yandex-team.ru>
---
 drivers/net/mlx5/mlx5_stats.c | 35 +++++++++++++++++++++++++----------
 1 file changed, 25 insertions(+), 10 deletions(-)
  

Comments

Slava Ovsiienko Nov. 10, 2021, 1:37 p.m. UTC | #1
Hi, Vladislav

Thank you for the fix.
In general, it looks good to me.
Please, see my comments below.

> -----Original Message-----
> From: Vladislav Grishenko <themiron@yandex-team.ru>
> Sent: Saturday, September 25, 2021 18:11
> To: Matan Azrad <matan@nvidia.com>; Slava Ovsiienko
> <viacheslavo@nvidia.com>; Shahaf Shuler <shahafs@nvidia.com>; NBU-
> Contact-Adrien Mazarguil <adrien.mazarguil@6wind.com>; Elad Persiko
> <eladpe@mellanox.com>
> Cc: dev@dpdk.org; stable@dpdk.org
> Subject: [PATCH] net/mlx5: fix xstats get reinitialization
> 
> The mlx5_xstats_get gets the device extended statistics.
> In this function the driver may reinitialize the structures that are used to read
> device counters.
> 
> In case of reinitialization, the number of counters may change, which
> wouldn't be taken into account by the get API callback and can cause a
> segmentation fault.
> 
> In case of insufficient supplied stats table size, ex. zero to query the number of
[SO] I was embarrassed a little bit with ".ex". Let's clarify?

> extended stats, reinitialization may never happen and the returned stats
> number, that is used for subsequent stats getting, will not be sufficient too.
> 
> This issue is fixed by getting and allocating the counters size after the
> reinitialization.
Yes, it would be fair.

> 
> Fixes: 1a611fdaf6ec ("net/mlx5: support missing counter in extended
> statistics")
> Fixes: a4193ae3bc4f ("net/mlx5: support extended statistics")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Vladislav Grishenko <themiron@yandex-team.ru>
> ---
>  drivers/net/mlx5/mlx5_stats.c | 35 +++++++++++++++++++++++++----------
>  1 file changed, 25 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/net/mlx5/mlx5_stats.c b/drivers/net/mlx5/mlx5_stats.c
> index ae2f5668a7..7dd7724b05 100644
> --- a/drivers/net/mlx5/mlx5_stats.c
> +++ b/drivers/net/mlx5/mlx5_stats.c
> @@ -39,23 +39,37 @@ mlx5_xstats_get(struct rte_eth_dev *dev, struct
> rte_eth_xstat *stats,
>  		unsigned int n)
>  {
>  	struct mlx5_priv *priv = dev->data->dev_private;
> -	unsigned int i;
> -	uint64_t counters[n];
>  	struct mlx5_xstats_ctrl *xstats_ctrl = &priv->xstats_ctrl;
> -	uint16_t mlx5_stats_n = xstats_ctrl->mlx5_stats_n;
> +	uint16_t mlx5_stats_n;
> +	int stats_n;
> +
> +	stats_n = mlx5_os_get_stats_n(dev);
> +	if (stats_n < 0)
> +		return stats_n;
> +	if (xstats_ctrl->stats_n != stats_n)
> +		mlx5_os_stats_init(dev);
> +	mlx5_stats_n = xstats_ctrl->mlx5_stats_n;
> 
>  	if (n >= mlx5_stats_n && stats) {
> -		int stats_n;
> +		uint64_t *counters;
> +		unsigned int i;
>  		int ret;
> 
> -		stats_n = mlx5_os_get_stats_n(dev);
> -		if (stats_n < 0)
> -			return stats_n;
> -		if (xstats_ctrl->stats_n != stats_n)
> -			mlx5_os_stats_init(dev);
> +		counters = mlx5_malloc(MLX5_MEM_SYS, sizeof(*counters) *

[SO] My concern is alloc/free - can we revert back to local counter[n] on stack ?


With best regards,
Slava

> +				mlx5_stats_n, 0,
> +				SOCKET_ID_ANY);
> +		if (counters == NULL) {
> +			DRV_LOG(WARNING, "port %u unable to allocate "
> +					"memory for xstats counters",
> +					dev->data->port_id);
> +			rte_errno = ENOMEM;
> +			return -rte_errno;
> +		}
>  		ret = mlx5_os_read_dev_counters(dev, counters);
> -		if (ret)
> +		if (ret) {
> +			mlx5_free(counters);
>  			return ret;
> +		}
>  		for (i = 0; i != mlx5_stats_n; ++i) {
>  			stats[i].id = i;
>  			if (xstats_ctrl->info[i].dev) {
> @@ -76,6 +90,7 @@ mlx5_xstats_get(struct rte_eth_dev *dev, struct
> rte_eth_xstat *stats,
>  					(counters[i] - xstats_ctrl->base[i]);
>  			}
>  		}
> +		mlx5_free(counters);
>  	}
>  	mlx5_stats_n = mlx5_txpp_xstats_get(dev, stats, n, mlx5_stats_n);
>  	return mlx5_stats_n;
> --
> 2.17.1
  

Patch

diff --git a/drivers/net/mlx5/mlx5_stats.c b/drivers/net/mlx5/mlx5_stats.c
index ae2f5668a7..7dd7724b05 100644
--- a/drivers/net/mlx5/mlx5_stats.c
+++ b/drivers/net/mlx5/mlx5_stats.c
@@ -39,23 +39,37 @@  mlx5_xstats_get(struct rte_eth_dev *dev, struct rte_eth_xstat *stats,
 		unsigned int n)
 {
 	struct mlx5_priv *priv = dev->data->dev_private;
-	unsigned int i;
-	uint64_t counters[n];
 	struct mlx5_xstats_ctrl *xstats_ctrl = &priv->xstats_ctrl;
-	uint16_t mlx5_stats_n = xstats_ctrl->mlx5_stats_n;
+	uint16_t mlx5_stats_n;
+	int stats_n;
+
+	stats_n = mlx5_os_get_stats_n(dev);
+	if (stats_n < 0)
+		return stats_n;
+	if (xstats_ctrl->stats_n != stats_n)
+		mlx5_os_stats_init(dev);
+	mlx5_stats_n = xstats_ctrl->mlx5_stats_n;
 
 	if (n >= mlx5_stats_n && stats) {
-		int stats_n;
+		uint64_t *counters;
+		unsigned int i;
 		int ret;
 
-		stats_n = mlx5_os_get_stats_n(dev);
-		if (stats_n < 0)
-			return stats_n;
-		if (xstats_ctrl->stats_n != stats_n)
-			mlx5_os_stats_init(dev);
+		counters = mlx5_malloc(MLX5_MEM_SYS, sizeof(*counters) *
+				mlx5_stats_n, 0,
+				SOCKET_ID_ANY);
+		if (counters == NULL) {
+			DRV_LOG(WARNING, "port %u unable to allocate "
+					"memory for xstats counters",
+					dev->data->port_id);
+			rte_errno = ENOMEM;
+			return -rte_errno;
+		}
 		ret = mlx5_os_read_dev_counters(dev, counters);
-		if (ret)
+		if (ret) {
+			mlx5_free(counters);
 			return ret;
+		}
 		for (i = 0; i != mlx5_stats_n; ++i) {
 			stats[i].id = i;
 			if (xstats_ctrl->info[i].dev) {
@@ -76,6 +90,7 @@  mlx5_xstats_get(struct rte_eth_dev *dev, struct rte_eth_xstat *stats,
 					(counters[i] - xstats_ctrl->base[i]);
 			}
 		}
+		mlx5_free(counters);
 	}
 	mlx5_stats_n = mlx5_txpp_xstats_get(dev, stats, n, mlx5_stats_n);
 	return mlx5_stats_n;