[2/2] net/bonding: fix oob access in "other" aggregator modes

Message ID: 1553200094-5487-2-git-send-email-david.marchand@redhat.com
State: Accepted, archived
Delegated to: Ferruh Yigit
Series: [1/2] net/bonding: fix more incorrect slave id types

Checks

Context               Check    Description
ci/checkpatch         success  coding style OK
ci/Intel-compilation  success  Compilation OK

Commit Message

David Marchand March 21, 2019, 8:28 p.m. UTC
  From: Zhaohui <zhaohui8@huawei.com>

slave aggregator_port_id is in [0, RTE_MAX_ETHPORTS-1] range.
If RTE_MAX_ETHPORTS is > 8, we can hit out of bound accesses on
agg_bandwidth[] and agg_count[] arrays.

Fixes: 6d72657ce379 ("net/bonding: add other aggregator modes")
Cc: stable@dpdk.org

Signed-off-by: Zhaohui <zhaohui8@huawei.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
---
 drivers/net/bonding/rte_eth_bond_8023ad.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
  

Comments

Maxime Coquelin March 22, 2019, 11:18 a.m. UTC | #1
On 3/21/19 9:28 PM, David Marchand wrote:
> From: Zhaohui <zhaohui8@huawei.com>
> 
> slave aggregator_port_id is in [0, RTE_MAX_ETHPORTS-1] range.
> If RTE_MAX_ETHPORTS is > 8, we can hit out of bound accesses on
> agg_bandwidth[] and agg_count[] arrays.
> 
> Fixes: 6d72657ce379 ("net/bonding: add other aggregator modes")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Zhaohui <zhaohui8@huawei.com>
> Signed-off-by: David Marchand <david.marchand@redhat.com>
> ---
>   drivers/net/bonding/rte_eth_bond_8023ad.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/bonding/rte_eth_bond_8023ad.c b/drivers/net/bonding/rte_eth_bond_8023ad.c
> index 3943ec1..5004898 100644
> --- a/drivers/net/bonding/rte_eth_bond_8023ad.c
> +++ b/drivers/net/bonding/rte_eth_bond_8023ad.c
> @@ -669,8 +669,8 @@
>   	struct port *agg, *port;
>   	uint16_t slaves_count, new_agg_id, i, j = 0;
>   	uint16_t *slaves;
> -	uint64_t agg_bandwidth[8] = {0};
> -	uint64_t agg_count[8] = {0};
> +	uint64_t agg_bandwidth[RTE_MAX_ETHPORTS] = {0};
> +	uint64_t agg_count[RTE_MAX_ETHPORTS] = {0};
>   	uint16_t default_slave = 0;
>   	uint16_t mode_count_id;
>   	uint16_t mode_band_id;
> 

Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
  
Chas Williams March 24, 2019, 1:35 p.m. UTC | #2
Have you ever experienced this problem in practice? I ask because I am
considering some fixes that would limit the number of slaves to a more
reasonable number (and reduce the overall stack usage of the bonding driver
in general).

On 3/21/19 4:28 PM, David Marchand wrote:
> From: Zhaohui <zhaohui8@huawei.com>
> 
> slave aggregator_port_id is in [0, RTE_MAX_ETHPORTS-1] range.
> If RTE_MAX_ETHPORTS is > 8, we can hit out of bound accesses on
> agg_bandwidth[] and agg_count[] arrays.
> 
> Fixes: 6d72657ce379 ("net/bonding: add other aggregator modes")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Zhaohui <zhaohui8@huawei.com>
> Signed-off-by: David Marchand <david.marchand@redhat.com>

Acked-by: Chas Williams <chas3@att.com>

> ---
>   drivers/net/bonding/rte_eth_bond_8023ad.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/bonding/rte_eth_bond_8023ad.c b/drivers/net/bonding/rte_eth_bond_8023ad.c
> index 3943ec1..5004898 100644
> --- a/drivers/net/bonding/rte_eth_bond_8023ad.c
> +++ b/drivers/net/bonding/rte_eth_bond_8023ad.c
> @@ -669,8 +669,8 @@
>   	struct port *agg, *port;
>   	uint16_t slaves_count, new_agg_id, i, j = 0;
>   	uint16_t *slaves;
> -	uint64_t agg_bandwidth[8] = {0};
> -	uint64_t agg_count[8] = {0};
> +	uint64_t agg_bandwidth[RTE_MAX_ETHPORTS] = {0};
> +	uint64_t agg_count[RTE_MAX_ETHPORTS] = {0};
>   	uint16_t default_slave = 0;
>   	uint16_t mode_count_id;
>   	uint16_t mode_band_id;
>
  
David Marchand March 24, 2019, 5:11 p.m. UTC | #3
On Sun, Mar 24, 2019 at 2:35 PM Chas Williams <3chas3@gmail.com> wrote:

> Have you ever experienced this problem in practice? I ask because I am
> considering some fixes that would limit the number of slaves to a more
> reasonable number (and reduce the overall stack usage of the bonding driver
> in general).
>

It is not too hard to reproduce; the problem is not the number of slaves.
With the default RTE_MAX_ETHPORTS of 32, any slave whose port id is >= 8
would trigger an oob access.


Example:
# git describe
v19.02-294-gad2f555

# gdb ./master/app/testpmd
(gdb) b selection_logic
(gdb) run -c 0x3 --no-pci --vdev net_ring0 --vdev net_ring1 --vdev
net_ring2 --vdev net_ring3 --vdev net_ring4 --vdev net_ring5 --vdev
net_ring6 --vdev net_ring7 --vdev net_ring8 --vdev net_ring9 --vdev
net_bonding0,mode=4,slave=8,slave=9 -- -i --total-num-mbufs 2048

Breakpoint 1, selection_logic (internals=0x17fe51980, slave_id=8) at
/root/dpdk/drivers/net/bonding/rte_eth_bond_8023ad.c:670
670        uint16_t slaves_count, new_agg_id, i, j = 0;
Missing separate debuginfos, use: debuginfo-install
glibc-2.17-260.el7_6.3.x86_64 libgcc-4.8.5-36.el7.x86_64
numactl-libs-2.0.9-7.el7.x86_64
(gdb) n
672        uint64_t agg_bandwidth[8] = {0};
(gdb)
673        uint64_t agg_count[8] = {0};
(gdb)
674        uint16_t default_slave = 0;
(gdb) p agg_bandwidth
$1 = {0, 0, 0, 0, 0, 0, 0, 0}
(gdb) p agg_count
$2 = {0, 0, 0, 0, 0, 0, 0, 0}
(gdb) n
678        slaves = internals->active_slaves;
(gdb)
679        slaves_count = internals->active_slave_count;
(gdb)
680        port = &bond_mode_8023ad_ports[slave_id];
(gdb)
683        for (i = 0; i < slaves_count; ++i) {
(gdb)
684            agg = &bond_mode_8023ad_ports[slaves[i]];
(gdb)
686            if (agg->aggregator_port_id != slaves[i])
(gdb)
689            agg_count[agg->aggregator_port_id] += 1;
(gdb)
690            rte_eth_link_get_nowait(slaves[i], &link_info);
(gdb) p agg_bandwidth
$3 = {1, 0, 0, 0, 0, 0, 0, 0}
(gdb) p agg_count
$4 = {0, 0, 0, 0, 0, 0, 0, 0}
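
Note what the last two prints show: the increment at line 689 targeted
agg_count[8], yet agg_count is unchanged and agg_bandwidth[0] went from 0 to 1.
The C sketch below is only a model of that observation. It assumes the two
8-entry arrays sit back to back on the stack, which this gdb session suggests
but which the C standard does not guarantee. The names frame_model,
count_aggregator, agg_count_at and agg_bandwidth_at are hypothetical and do not
exist in the driver. A single 16-slot buffer stands in for the stack frame so
the example itself stays within bounds.

```c
#include <stdint.h>
#include <assert.h>

#define OLD_ARRAY_LEN 8 /* hard-coded array size before the patch */

/*
 * Hypothetical model of the stack frame: slots [0, 8) play the role of
 * agg_count[], slots [8, 16) play the role of agg_bandwidth[].
 * The adjacency is an assumption taken from the gdb output.
 */
struct frame_model {
	uint64_t slots[2 * OLD_ARRAY_LEN];
};

/* Mimics "agg_count[port_id] += 1" with no check against OLD_ARRAY_LEN. */
static void
count_aggregator(struct frame_model *f, uint16_t port_id)
{
	assert(port_id < 2 * OLD_ARRAY_LEN); /* keep the model itself in bounds */
	f->slots[port_id] += 1;
}

/* View of the model as the original agg_count[] array. */
static uint64_t
agg_count_at(const struct frame_model *f, unsigned int i)
{
	assert(i < OLD_ARRAY_LEN);
	return f->slots[i];
}

/* View of the model as the original agg_bandwidth[] array. */
static uint64_t
agg_bandwidth_at(const struct frame_model *f, unsigned int i)
{
	assert(i < OLD_ARRAY_LEN);
	return f->slots[OLD_ARRAY_LEN + i];
}
```

Calling count_aggregator() with port id 8 on a zeroed model leaves every
agg_count_at() entry at 0 and sets agg_bandwidth_at(&f, 0) to 1, which matches
$3 and $4 above. Sizing both arrays with RTE_MAX_ETHPORTS, as the patch does,
keeps every valid port id inside agg_count[].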
  
Chas Williams March 24, 2019, 5:24 p.m. UTC | #4
On 3/24/19 1:11 PM, David Marchand wrote:
> On Sun, Mar 24, 2019 at 2:35 PM Chas Williams <3chas3@gmail.com> wrote:
> 
>     Have you ever experienced this problem in practice? I ask because I am
>     considering some fixes that would limit the number of slaves to a more
>     reasonable number (and reduce the overall stack usage of the bonding
>     driver in general).
> 
> 
> Not too hard to reproduce, the problem is not the number of slaves.
> With a default RTE_MAX_ETHPORTS at 32, any slave whose portid >= 8 would 
> trigger an oob access.
Err... Well I have a lot of questions then about this whole thing.  What 
is max_index() doing?

                 mode_count_id = max_index(agg_count, slaves_count);

It's indexing up to slaves_count, which is likely to be somewhere around 
2. agg_count[] is indexed by the port id. It's likely agg_count[] was 
intended to be indexed by the slave index and not the port id.
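
To make the mismatch concrete, here is a rough C sketch. It is not the driver
code: MAX_PORTS stands in for RTE_MAX_ETHPORTS, the body of max_index() is an
assumed shape (index of the largest of the first n entries), and
pick_by_port_id() and pick_by_slave_index() are hypothetical helpers that take
each active slave's aggregator port id as input.

```c
#include <stdint.h>
#include <assert.h>

#define MAX_PORTS 32 /* stand-in for RTE_MAX_ETHPORTS (default per this thread) */

/* Assumed shape of the helper: index of the largest of the first n entries. */
static uint16_t
max_index(const uint64_t *a, uint16_t n)
{
	uint16_t i, max_i = 0;

	for (i = 1; i < n; ++i)
		if (a[i] > a[max_i])
			max_i = i;
	return max_i;
}

/* Counts are indexed by port id, but only slaves_count entries are scanned. */
static uint16_t
pick_by_port_id(const uint16_t *agg_port_ids, uint16_t slaves_count)
{
	uint64_t agg_count[MAX_PORTS] = {0};
	uint16_t i;

	for (i = 0; i < slaves_count; ++i) {
		assert(agg_port_ids[i] < MAX_PORTS);
		agg_count[agg_port_ids[i]] += 1;
	}
	/* Entries at index >= slaves_count are never looked at. */
	return max_index(agg_count, slaves_count);
}

/* Alternative: counts indexed by position in slaves[], mapped back to a port id. */
static uint16_t
pick_by_slave_index(const uint16_t *agg_port_ids, const uint16_t *slaves,
		uint16_t slaves_count)
{
	uint64_t agg_count[MAX_PORTS] = {0};
	uint16_t i, j;

	assert(slaves_count > 0 && slaves_count <= MAX_PORTS);
	for (i = 0; i < slaves_count; ++i)
		for (j = 0; j < slaves_count; ++j)
			if (slaves[j] == agg_port_ids[i])
				agg_count[j] += 1;
	return slaves[max_index(agg_count, slaves_count)];
}
```

With slaves {8, 9} that both use port 9 as aggregator, pick_by_port_id() scans
only entries 0 and 1, finds them both zero and returns 0, which is not even one
of the slaves. pick_by_slave_index() returns 9. Whether the second variant is
what the original author intended is a guess on my side.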

> 
> 
> Example:
> # git describe
> v19.02-294-gad2f555
> 
> # gdb ./master/app/testpmd
> (gdb) b selection_logic
> (gdb) run -c 0x3 --no-pci --vdev net_ring0 --vdev net_ring1 --vdev 
> net_ring2 --vdev net_ring3 --vdev net_ring4 --vdev net_ring5 --vdev 
> net_ring6 --vdev net_ring7 --vdev net_ring8 --vdev net_ring9 --vdev 
> net_bonding0,mode=4,slave=8,slave=9 -- -i --total-num-mbufs 2048
> 
> Breakpoint 1, selection_logic (internals=0x17fe51980, slave_id=8) at 
> /root/dpdk/drivers/net/bonding/rte_eth_bond_8023ad.c:670
> 670        uint16_t slaves_count, new_agg_id, i, j = 0;
> Missing separate debuginfos, use: debuginfo-install 
> glibc-2.17-260.el7_6.3.x86_64 libgcc-4.8.5-36.el7.x86_64 
> numactl-libs-2.0.9-7.el7.x86_64
> (gdb) n
> 672        uint64_t agg_bandwidth[8] = {0};
> (gdb)
> 673        uint64_t agg_count[8] = {0};
> (gdb)
> 674        uint16_t default_slave = 0;
> (gdb) p agg_bandwidth
> $1 = {0, 0, 0, 0, 0, 0, 0, 0}
> (gdb) p agg_count
> $2 = {0, 0, 0, 0, 0, 0, 0, 0}
> (gdb) n
> 678        slaves = internals->active_slaves;
> (gdb)
> 679        slaves_count = internals->active_slave_count;
> (gdb)
> 680        port = &bond_mode_8023ad_ports[slave_id];
> (gdb)
> 683        for (i = 0; i < slaves_count; ++i) {
> (gdb)
> 684            agg = &bond_mode_8023ad_ports[slaves[i]];
> (gdb)
> 686            if (agg->aggregator_port_id != slaves[i])
> (gdb)
> 689            agg_count[agg->aggregator_port_id] += 1;
> (gdb)
> 690            rte_eth_link_get_nowait(slaves[i], &link_info);
> (gdb) p agg_bandwidth
> $3 = {1, 0, 0, 0, 0, 0, 0, 0}
> (gdb) p agg_count
> $4 = {0, 0, 0, 0, 0, 0, 0, 0}
> 
> 
> -- 
> David Marchand
  
David Marchand March 24, 2019, 6:01 p.m. UTC | #5
On Sun, Mar 24, 2019 at 6:24 PM Chas Williams <3chas3@gmail.com> wrote:

> On 3/24/19 1:11 PM, David Marchand wrote:
> > On Sun, Mar 24, 2019 at 2:35 PM Chas Williams <3chas3@gmail.com> wrote:
> >
> >     Have you ever experienced this problem in practice? I ask because I am
> >     considering some fixes that would limit the number of slaves to a more
> >     reasonable number (and reduce the overall stack usage of the bonding
> >     driver in general).
> >
> >
> > Not too hard to reproduce, the problem is not the number of slaves.
> > With a default RTE_MAX_ETHPORTS at 32, any slave whose portid >= 8 would
> > trigger an oob access.
> Err... Well I have a lot of questions then about this whole thing.  What
> is max_index() doing?
>
>                  mode_count_id = max_index(agg_count, slaves_count);
>
> It's indexing up to slaves_count, which is likely to be somewhere around
> 2. agg_count[] is indexed by the port id. It's likely agg_count[] was
> intended to be indexed by the slave index and not the port id.
>

Good point, it is likely that this whole code is not working at all...
I did not go far enough to test/verify this part functionally.
  
Ferruh Yigit Sept. 30, 2019, 1:49 p.m. UTC | #6
On 3/24/2019 1:35 PM, Chas Williams wrote:
> Have you ever experienced this problem in practice? I ask because I am 
> considering some fixes that would limit the number of slaves to a more 
> reasonable number (and reduce the overall stack usage of the bonding driver 
> in general).
> 
> On 3/21/19 4:28 PM, David Marchand wrote:
>> From: Zhaohui <zhaohui8@huawei.com>
>>
>> slave aggregator_port_id is in [0, RTE_MAX_ETHPORTS-1] range.
>> If RTE_MAX_ETHPORTS is > 8, we can hit out of bound accesses on
>> agg_bandwidth[] and agg_count[] arrays.
>>
>> Fixes: 6d72657ce379 ("net/bonding: add other aggregator modes")
>> Cc: stable@dpdk.org
>>
>> Signed-off-by: Zhaohui <zhaohui8@huawei.com>
>> Signed-off-by: David Marchand <david.marchand@redhat.com>
> 
> Acked-by: Chas Williams <chas3@att.com>

Applied to dpdk-next-net/master, thanks.
  

Patch

diff --git a/drivers/net/bonding/rte_eth_bond_8023ad.c b/drivers/net/bonding/rte_eth_bond_8023ad.c
index 3943ec1..5004898 100644
--- a/drivers/net/bonding/rte_eth_bond_8023ad.c
+++ b/drivers/net/bonding/rte_eth_bond_8023ad.c
@@ -669,8 +669,8 @@ 
 	struct port *agg, *port;
 	uint16_t slaves_count, new_agg_id, i, j = 0;
 	uint16_t *slaves;
-	uint64_t agg_bandwidth[8] = {0};
-	uint64_t agg_count[8] = {0};
+	uint64_t agg_bandwidth[RTE_MAX_ETHPORTS] = {0};
+	uint64_t agg_count[RTE_MAX_ETHPORTS] = {0};
 	uint16_t default_slave = 0;
 	uint16_t mode_count_id;
 	uint16_t mode_band_id;