net/mlx5: fix use after free when releasing tx queues
Commit Message
From: Pengfei Sun <sunpengfei16@huawei.com>
In mlx5_dev_configure(), dev->data->tx_queues is assigned to
priv->txqs, so both pointers refer to the same array. When a member
is removed from a bond, eth_dev_tx_queue_config() is called and
releases dev->data->tx_queues. mlx5_dev_close() later accesses
priv->txqs again, which causes a use after free.
Fix this in mlx5_dev_close() by checking that dev->data->tx_queues
is not NULL before releasing the queues in priv->txqs.
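The aliasing described above can be sketched in isolation. This is only a toy model, not driver code: struct toy_dev_data, struct toy_priv and the toy_*() helpers are hypothetical stand-ins for the ethdev data, the mlx5 private data and the configure/reset/close paths. It assumes, as the new check implies, that the generic layer sets its pointer to NULL after freeing the array. The guard tests the owner pointer first, which differs in order from the patch, so the dangling alias is never read.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the generic ethdev data (not the real struct). */
struct toy_dev_data {
	void **tx_queues;
	unsigned int nb_tx_queues;
};

/* Hypothetical stand-in for the driver private data. */
struct toy_priv {
	void **txqs;          /* alias of tx_queues, not a separate allocation */
	unsigned int txqs_n;
};

/* Configure: allocate the array once and let the driver alias it. */
static int
toy_configure(struct toy_dev_data *data, struct toy_priv *priv, unsigned int n)
{
	data->tx_queues = calloc(n, sizeof(*data->tx_queues));
	if (data->tx_queues == NULL)
		return -1;
	data->nb_tx_queues = n;
	priv->txqs = data->tx_queues;
	priv->txqs_n = n;
	return 0;
}

/* Generic-layer reset: frees the array; the driver alias is left dangling. */
static void
toy_reset(struct toy_dev_data *data)
{
	free(data->tx_queues);
	data->tx_queues = NULL;
	data->nb_tx_queues = 0;
}

/* Driver close with the guard; returns the number of slots it visited. */
static unsigned int
toy_close(const struct toy_dev_data *data, struct toy_priv *priv)
{
	unsigned int visited = 0;
	unsigned int i;

	/* Owner pointer is tested first so a dangling alias is never read. */
	if (data->tx_queues != NULL && priv->txqs != NULL) {
		for (i = 0; i != priv->txqs_n; ++i) {
			(void)priv->txqs[i]; /* safe: the array is still live */
			++visited;
		}
	}
	priv->txqs = NULL;
	priv->txqs_n = 0;
	return visited;
}

/* Replays "remove bonding member" followed by "quit" on the toy model. */
static unsigned int
toy_remove_then_close(void)
{
	struct toy_dev_data data = {0};
	struct toy_priv priv = {0};
	int ret = toy_configure(&data, &priv, 4);

	assert(ret == 0);
	(void)ret;
	toy_reset(&data);
	return toy_close(&data, &priv);
}
```

In this model toy_remove_then_close() returns 0, because the guard skips the loop once the owner pointer has been cleared. Without the guard, the loop would read through the stale priv->txqs alias, which is the access ASan flags in the report that follows.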
build/app/dpdk-testpmd -c7 -a 0000:08:00.2 -- -i --nb-cores=2 --total-num-mbufs=2048
testpmd> port stop 0
testpmd> create bonding device 4 0
testpmd> add bonding member 0 1
testpmd> remove bonding member 0 1
testpmd> quit
ASan reports:
==2571911==ERROR: AddressSanitizer: heap-use-after-free on address
0x000174529880 at pc 0x0000113c8440 bp 0xffffefae0ea0 sp 0xffffefae0eb0
READ of size 8 at 0x000174529880 thread T0
#0 0x113c843c in mlx5_txq_release ../drivers/net/mlx5/mlx5_txq.c:1203
#1 0xffdb53c in mlx5_dev_close ../drivers/net/mlx5/mlx5.c:2286
#2 0xe12dc0 in rte_eth_dev_close ../lib/ethdev/rte_ethdev.c:1877
#3 0x6bac1c in close_port ../app/test-pmd/testpmd.c:3540
#4 0x6bc320 in pmd_test_exit ../app/test-pmd/testpmd.c:3808
#5 0x6c1a94 in main ../app/test-pmd/testpmd.c:4759
#6 0xffff9328f038 (/usr/lib64/libc.so.6+0x2b038)
#7 0xffff9328f110 in __libc_start_main (/usr/lib64/libc.so.6+0x2b110)
Fixes: 6e78005 ("net/mlx5: add reference counter on DPDK Tx queues")
Cc: stable@dpdk.org
Reported-by: Yunjian Wang <wangyunjian@huawei.com>
Signed-off-by: Pengfei Sun <sunpengfei16@huawei.com>
---
drivers/net/mlx5/mlx5.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Comments
Hi,
> -----Original Message-----
> From: Yunjian Wang <wangyunjian@huawei.com>
> Sent: Tuesday, February 20, 2024 10:32
> To: dev@dpdk.org
> Cc: Dariusz Sosnowski <dsosnowski@nvidia.com>; Ori Kam
> <orika@nvidia.com>; Matan Azrad <matan@nvidia.com>; Slava Ovsiienko
> <viacheslavo@nvidia.com>; Suanming Mou <suanmingm@nvidia.com>;
> luyicai@huawei.com; Pengfei Sun <sunpengfei16@huawei.com>;
> stable@dpdk.org
> Subject: [PATCH] net/mlx5: fix use after free when releasing tx queues
Acked-by: Dariusz Sosnowski <dsosnowski@nvidia.com>
Thank you for the patch.
Question to ethdev maintainers:
While reviewing this patch, I took a look at rte_eth_dev_internal_reset(), which is called by the bonding PMD for removed members.
It resets the Rx and Tx queue configuration and dev->data->dev_conf,
but not the dev->data->dev_configured flag.
So theoretically, after this call, a port can be started without being configured, which seems invalid.
What do you think? Should it be fixed?
Best regards,
Dariusz Sosnowski
20/02/2024 14:55, Dariusz Sosnowski:
> Question to ethdev maintainers:
>
> While reviewing this patch, I took a look at rte_eth_dev_internal_reset(), which is called by the bonding PMD for removed members.
> It resets the Rx and Tx queue configuration and dev->data->dev_conf,
> but not the dev->data->dev_configured flag.
> So theoretically, after this call, a port can be started without being configured, which seems invalid.
> What do you think? Should it be fixed?
Probably yes
Hi,
Patch applied to next-net-mlx,
Kindest regards
Raslan Darawsheh
@@ -2279,7 +2279,7 @@ mlx5_dev_close(struct rte_eth_dev *dev)
 		mlx5_free(priv->rxq_privs);
 		priv->rxq_privs = NULL;
 	}
-	if (priv->txqs != NULL) {
+	if (priv->txqs != NULL && dev->data->tx_queues != NULL) {
 		/* XXX race condition if mlx5_tx_burst() is still running. */
 		rte_delay_us_sleep(1000);
 		for (i = 0; (i != priv->txqs_n); ++i)