[v2,1/2] net/mlx5: fix use after free when releasing tx queues

Message ID 952a177cf4cc074101bb13773326b7107f496290.1661223500.git.wangyunjian@huawei.com (mailing list archive)
State Changes Requested, archived
Delegated to: Raslan Darawsheh
Headers
Series fixes for mlx5 |

Checks

Context Check Description
ci/checkpatch warning coding style issues

Commit Message

Yunjian Wang Aug. 23, 2022, 6:45 a.m. UTC
  The bonding slave remove function was calling the eth_dev_tx_queue_config
function, which frees dev->data->tx_queues, and then tries to free
priv->txqs[idx] in mlx5_txq_release function, which causes the heap use
after free issue. Add checks whether dev->data->tx_queues is not NULL.

Fixes: 94e257ec8ca ("net/mlx5: fix Rx/Tx queue checks")
Cc: stable@dpdk.org

Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
---
 drivers/net/mlx5/mlx5_txq.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
  

Comments

Yunjian Wang Sept. 23, 2022, 9:31 a.m. UTC | #1
Friendly ping.

> -----Original Message-----
> From: wangyunjian
> Sent: Tuesday, August 23, 2022 2:46 PM
> To: dev@dpdk.org
> Cc: matan@nvidia.com; rasland@nvidia.com; viacheslavo@nvidia.com;
> dkozlyuk@nvidia.com; Huangshaozhang <huangshaozhang@huawei.com>;
> wangyunjian <wangyunjian@huawei.com>; stable@dpdk.org
> Subject: [dpdk-dev] [PATCH v2 1/2] net/mlx5: fix use after free when releasing
> tx queues
> 
> The bonding slave remove function was calling the eth_dev_tx_queue_config
> function, which frees dev->data->tx_queues, and then tries to free
> priv->txqs[idx] in mlx5_txq_release function, which causes the heap use
> after free issue. Add checks whether dev->data->tx_queues is not NULL.
> 
> Fixes: 94e257ec8ca ("net/mlx5: fix Rx/Tx queue checks")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
> ---
>  drivers/net/mlx5/mlx5_txq.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c index
> 0140f8b3b2..cb2c33a060 100644
> --- a/drivers/net/mlx5/mlx5_txq.c
> +++ b/drivers/net/mlx5/mlx5_txq.c
> @@ -1198,7 +1198,8 @@ mlx5_txq_release(struct rte_eth_dev *dev, uint16_t
> idx)
>  	struct mlx5_priv *priv = dev->data->dev_private;
>  	struct mlx5_txq_ctrl *txq_ctrl;
> 
> -	if (priv->txqs == NULL || (*priv->txqs)[idx] == NULL)
> +	if (dev->data->tx_queues == NULL || priv->txqs == NULL ||
> +		(*priv->txqs)[idx] == NULL)
>  		return 0;
>  	txq_ctrl = container_of((*priv->txqs)[idx], struct mlx5_txq_ctrl, txq);
>  	if (__atomic_sub_fetch(&txq_ctrl->refcnt, 1, __ATOMIC_RELAXED) > 1)
> --
> 2.27.0
  
Slava Ovsiienko Sept. 26, 2022, 12:30 p.m. UTC | #2
Hi, Yunjian

Could you, please, tell more details about problematic scenario?
In bonding slave? It is not fully clean for me how mlx5_txq_release
frees priv->txqs[idx] (BTW NULL is OK to free, it is safe).
We have check for NULL here:
> > -	if (priv->txqs == NULL || (*priv->txqs)[idx] == NULL)

priv->txq is internal objects managed by PMD, dev->data->tx_queues
are DPDK-wide ones. Theoretically it might happen when DPDK objects
are created and internals are not, and vice versa. So, checking 
for existence of external objects in the routine that manages internals
does not look so reasonable. Internal queue object management is based
on the atomic reference counter and, generally speaking, should not depend
on externals.

With best regards,
Slava 

> -----Original Message-----
> From: wangyunjian <wangyunjian@huawei.com>
> Sent: Friday, September 23, 2022 12:32
> To: dev@dpdk.org
> Cc: Matan Azrad <matan@nvidia.com>; Raslan Darawsheh <rasland@nvidia.com>;
> Slava Ovsiienko <viacheslavo@nvidia.com>; Dmitry Kozlyuk
> <dkozlyuk@nvidia.com>; Huangshaozhang <huangshaozhang@huawei.com>;
> stable@dpdk.org
> Subject: RE: [dpdk-dev] [PATCH v2 1/2] net/mlx5: fix use after free when
> releasing tx queues
> 
> Friendly ping.
> 
> > -----Original Message-----
> > From: wangyunjian
> > Sent: Tuesday, August 23, 2022 2:46 PM
> > To: dev@dpdk.org
> > Cc: matan@nvidia.com; rasland@nvidia.com; viacheslavo@nvidia.com;
> > dkozlyuk@nvidia.com; Huangshaozhang <huangshaozhang@huawei.com>;
> > wangyunjian <wangyunjian@huawei.com>; stable@dpdk.org
> > Subject: [dpdk-dev] [PATCH v2 1/2] net/mlx5: fix use after free when
> > releasing tx queues
> >
> > The bonding slave remove function was calling the
> > eth_dev_tx_queue_config function, which frees dev->data->tx_queues,
> > and then tries to free
> > priv->txqs[idx] in mlx5_txq_release function, which causes the heap
> > priv->use
> > after free issue. Add checks whether dev->data->tx_queues is not NULL.
> >
> > Fixes: 94e257ec8ca ("net/mlx5: fix Rx/Tx queue checks")
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
> > ---
> >  drivers/net/mlx5/mlx5_txq.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
> > index
> > 0140f8b3b2..cb2c33a060 100644
> > --- a/drivers/net/mlx5/mlx5_txq.c
> > +++ b/drivers/net/mlx5/mlx5_txq.c
> > @@ -1198,7 +1198,8 @@ mlx5_txq_release(struct rte_eth_dev *dev,
> > uint16_t
> > idx)
> >  	struct mlx5_priv *priv = dev->data->dev_private;
> >  	struct mlx5_txq_ctrl *txq_ctrl;
> >
> > -	if (priv->txqs == NULL || (*priv->txqs)[idx] == NULL)
> > +	if (dev->data->tx_queues == NULL || priv->txqs == NULL ||
> > +		(*priv->txqs)[idx] == NULL)
> >  		return 0;
> >  	txq_ctrl = container_of((*priv->txqs)[idx], struct mlx5_txq_ctrl,
> txq);
> >  	if (__atomic_sub_fetch(&txq_ctrl->refcnt, 1, __ATOMIC_RELAXED) > 1)
> > --
> > 2.27.0
  

Patch

diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
index 0140f8b3b2..cb2c33a060 100644
--- a/drivers/net/mlx5/mlx5_txq.c
+++ b/drivers/net/mlx5/mlx5_txq.c
@@ -1198,7 +1198,8 @@  mlx5_txq_release(struct rte_eth_dev *dev, uint16_t idx)
 	struct mlx5_priv *priv = dev->data->dev_private;
 	struct mlx5_txq_ctrl *txq_ctrl;
 
-	if (priv->txqs == NULL || (*priv->txqs)[idx] == NULL)
+	if (dev->data->tx_queues == NULL || priv->txqs == NULL ||
+		(*priv->txqs)[idx] == NULL)
 		return 0;
 	txq_ctrl = container_of((*priv->txqs)[idx], struct mlx5_txq_ctrl, txq);
 	if (__atomic_sub_fetch(&txq_ctrl->refcnt, 1, __ATOMIC_RELAXED) > 1)