[V2,2/4] net/bonding: fix non-terminable while loop
Checks
Commit Message
From: Huisong Li <lihuisong@huawei.com>
All slaves will be stopped and removed when closing a bonded port. But the
while loop can not stop if both rte_eth_dev_stop and
rte_eth_bond_slave_remove fail to run.
Fixes: fb0379bc5db3 ("net/bonding: check stop call status")
Cc: stable@dpdk.org
Signed-off-by: Huisong Li <lihuisong@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
---
drivers/net/bonding/rte_eth_bond_pmd.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
Comments
On 3/24/2022 3:00 AM, Min Hu (Connor) wrote:
> From: Huisong Li <lihuisong@huawei.com>
>
> All slaves will be stopped and removed when closing a bonded port. But the
> while loop can not stop if both rte_eth_dev_stop and
> rte_eth_bond_slave_remove fail to run.
>
Agree that this is a defect introduced in below commit. Thanks for the fix.
> Fixes: fb0379bc5db3 ("net/bonding: check stop call status")
> Cc: stable@dpdk.org
>
> Signed-off-by: Huisong Li <lihuisong@huawei.com>
> Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
> ---
> drivers/net/bonding/rte_eth_bond_pmd.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c b/drivers/net/bonding/rte_eth_bond_pmd.c
> index 469dc71170..00d4deda44 100644
> --- a/drivers/net/bonding/rte_eth_bond_pmd.c
> +++ b/drivers/net/bonding/rte_eth_bond_pmd.c
> @@ -2149,13 +2149,14 @@ bond_ethdev_close(struct rte_eth_dev *dev)
> return 0;
>
> RTE_BOND_LOG(INFO, "Closing bonded device %s", dev->device->name);
> - while (internals->slave_count != skipped) {
> + while (skipped < internals->slave_count) {
When below fixed with adding 'continue', no need to change the check,
right? Although new one is also correct.
> uint16_t port_id = internals->slaves[skipped].port_id;
>
> if (rte_eth_dev_stop(port_id) != 0) {
> RTE_BOND_LOG(ERR, "Failed to stop device on port %u",
> port_id);
> skipped++;
> + continue;
Can't we remove the slave even if 'stop()' failed? If so I think better
to just log the error and keep continue in that case, what do you think?
> }
>
> if (rte_eth_bond_slave_remove(bond_port_id, port_id) != 0) {
Hi, Ferruh,
在 2022/4/27 2:19, Ferruh Yigit 写道:
> On 3/24/2022 3:00 AM, Min Hu (Connor) wrote:
>> From: Huisong Li <lihuisong@huawei.com>
>>
>> All slaves will be stopped and removed when closing a bonded port. But
>> the
>> while loop can not stop if both rte_eth_dev_stop and
>> rte_eth_bond_slave_remove fail to run.
>>
>
> Agree that this is a defect introduced in below commit. Thanks for the fix.
thanks.
>
>> Fixes: fb0379bc5db3 ("net/bonding: check stop call status")
>> Cc: stable@dpdk.org
>>
>> Signed-off-by: Huisong Li <lihuisong@huawei.com>
>> Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
>> ---
>> drivers/net/bonding/rte_eth_bond_pmd.c | 3 ++-
>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c
>> b/drivers/net/bonding/rte_eth_bond_pmd.c
>> index 469dc71170..00d4deda44 100644
>> --- a/drivers/net/bonding/rte_eth_bond_pmd.c
>> +++ b/drivers/net/bonding/rte_eth_bond_pmd.c
>> @@ -2149,13 +2149,14 @@ bond_ethdev_close(struct rte_eth_dev *dev)
>> return 0;
>> RTE_BOND_LOG(INFO, "Closing bonded device %s", dev->device->name);
>> - while (internals->slave_count != skipped) {
>> + while (skipped < internals->slave_count) {
>
> When below fixed with adding 'continue', no need to change the check,
> right? Although new one is also correct.
Agreed.
>
>> uint16_t port_id = internals->slaves[skipped].port_id;
>> if (rte_eth_dev_stop(port_id) != 0) {
>> RTE_BOND_LOG(ERR, "Failed to stop device on port %u",
>> port_id);
>> skipped++;
>> + continue;
>
> Can't we remove the slave even if 'stop()' failed? If so I think better
> to just log the error and keep continue in that case, what do you think?
NO, if slave stop failed, we cannot remove the slave.
just see the function stack:
rte_eth_bond_slave_remove
__eth_bond_slave_remove_lock_free
slave_remove
rte_eth_dev_internal_reset
if (dev->data->dev_started) {
RTE_ETHDEV_LOG(ERR, "Port %u must be stopped to allow reset\n",
dev->data->port_id);
return;
}
>
>> }
>> if (rte_eth_bond_slave_remove(bond_port_id, port_id) != 0) {
>
> .
On 4/29/2022 7:52 AM, Min Hu (Connor) wrote:
> Hi, Ferruh,
>
> 在 2022/4/27 2:19, Ferruh Yigit 写道:
>> On 3/24/2022 3:00 AM, Min Hu (Connor) wrote:
>>> From: Huisong Li <lihuisong@huawei.com>
>>>
>>> All slaves will be stopped and removed when closing a bonded port.
>>> But the
>>> while loop can not stop if both rte_eth_dev_stop and
>>> rte_eth_bond_slave_remove fail to run.
>>>
>>
>> Agree that this is a defect introduced in below commit. Thanks for the
>> fix.
> thanks.
>>
>>> Fixes: fb0379bc5db3 ("net/bonding: check stop call status")
>>> Cc: stable@dpdk.org
>>>
>>> Signed-off-by: Huisong Li <lihuisong@huawei.com>
>>> Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
>>> ---
>>> drivers/net/bonding/rte_eth_bond_pmd.c | 3 ++-
>>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c
>>> b/drivers/net/bonding/rte_eth_bond_pmd.c
>>> index 469dc71170..00d4deda44 100644
>>> --- a/drivers/net/bonding/rte_eth_bond_pmd.c
>>> +++ b/drivers/net/bonding/rte_eth_bond_pmd.c
>>> @@ -2149,13 +2149,14 @@ bond_ethdev_close(struct rte_eth_dev *dev)
>>> return 0;
>>> RTE_BOND_LOG(INFO, "Closing bonded device %s", dev->device->name);
>>> - while (internals->slave_count != skipped) {
>>> + while (skipped < internals->slave_count) {
>>
>> When below fixed with adding 'continue', no need to change the check,
>> right? Although new one is also correct.
> Agreed.
>>
>>> uint16_t port_id = internals->slaves[skipped].port_id;
>>> if (rte_eth_dev_stop(port_id) != 0) {
>>> RTE_BOND_LOG(ERR, "Failed to stop device on port %u",
>>> port_id);
>>> skipped++;
>>> + continue;
>>
>> Can't we remove the slave even if 'stop()' failed? If so I think
>> better to just log the error and keep continue in that case, what do
>> you think?
> NO, if slave stop failed, we cannot remove the slave.
Got it, thanks for clarification.
> just see the function stack:
> rte_eth_bond_slave_remove
> __eth_bond_slave_remove_lock_free
> slave_remove
> rte_eth_dev_internal_reset
> if (dev->data->dev_started) {
> RTE_ETHDEV_LOG(ERR, "Port %u must be stopped to allow reset\n",
> dev->data->port_id);
> return;
> }
>
>>
>>> }
>>> if (rte_eth_bond_slave_remove(bond_port_id, port_id) != 0) {
>>
>> .
@@ -2149,13 +2149,14 @@ bond_ethdev_close(struct rte_eth_dev *dev)
return 0;
RTE_BOND_LOG(INFO, "Closing bonded device %s", dev->device->name);
- while (internals->slave_count != skipped) {
+ while (skipped < internals->slave_count) {
uint16_t port_id = internals->slaves[skipped].port_id;
if (rte_eth_dev_stop(port_id) != 0) {
RTE_BOND_LOG(ERR, "Failed to stop device on port %u",
port_id);
skipped++;
+ continue;
}
if (rte_eth_bond_slave_remove(bond_port_id, port_id) != 0) {