[v2] app/testpmd : fix testpmd quit error
Checks
Commit Message
From: wenxuan wu <wenxuanx.wu@intel.com>
When testpmd use func get_eth_dev_info() while related resource
had been released.
Change the logic of func port_is_bonding_slave, this func
eth_dev_info_get_print_err while pf is released would result
in this error. Use ports instead to avoid this bug.
Fixes: 0a0821bcf312 ("app/testpmd: remove most uses of internal ethdev array")
Cc: stable@dpdk.org
Signed-off-by: wenxuan wu <wenxuanx.wu@intel.com>
---
app/test-pmd/testpmd.c | 12 +-----------
1 file changed, 1 insertion(+), 11 deletions(-)
Comments
Hi Wenxuan,
> -----Original Message-----
> From: Wu, WenxuanX <wenxuanx.wu@intel.com>
> Sent: Friday, March 4, 2022 10:37 AM
> To: Yang, Qiming <qiming.yang@intel.com>; Zhang, Qi Z
> <qi.z.zhang@intel.com>; Li, Xiaoyun <xiaoyun.li@intel.com>; Singh, Aman
> Deep <aman.deep.singh@intel.com>; Zhang, Yuying
> <yuying.zhang@intel.com>
> Cc: dev@dpdk.org
> Subject: [PATCH v2] app/testpmd : fix testpmd quit error
>
> From: wenxuan wu <wenxuanx.wu@intel.com>
>
> When testpmd use func get_eth_dev_info() while related resource had been
> released.
>
> Change the logic of func port_is_bonding_slave, this func
> eth_dev_info_get_print_err while pf is released would result in this error.
> Use ports instead to avoid this bug.
>
> Fixes: 0a0821bcf312 ("app/testpmd: remove most uses of internal ethdev
> array")
> Cc: stable@dpdk.org
>
> Signed-off-by: wenxuan wu <wenxuanx.wu@intel.com>
> ---
> app/test-pmd/testpmd.c | 12 +-----------
> 1 file changed, 1 insertion(+), 11 deletions(-)
>
> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c index
> e1da961311..37038c9183 100644
> --- a/app/test-pmd/testpmd.c
> +++ b/app/test-pmd/testpmd.c
> @@ -3824,19 +3824,9 @@ void clear_port_slave_flag(portid_t slave_pid)
> uint8_t port_is_bonding_slave(portid_t slave_pid) {
> struct rte_port *port;
> - struct rte_eth_dev_info dev_info;
> - int ret;
>
> port = &ports[slave_pid];
> - ret = eth_dev_info_get_print_err(slave_pid, &dev_info);
> - if (ret != 0) {
> - TESTPMD_LOG(ERR,
> - "Failed to get device info for port id %d,"
> - "cannot determine if the port is a bonded slave",
> - slave_pid);
> - return 0;
> - }
> - if ((*dev_info.dev_flags & RTE_ETH_DEV_BONDED_SLAVE) || (port-
> >slave_flag == 1))
> + if ((*port->dev_info.dev_flags & RTE_ETH_DEV_BONDED_SLAVE) ||
> +(port->slave_flag == 1))
Is port->dev_info.dev_flags updated in time when the bonding status changes?
It may use eth_dev_info_get_print_err() to update dev_info of port.
> return 1;
> return 0;
> }
> --
> 2.25.1
On 3/4/2022 2:37 AM, wenxuanx.wu@intel.com wrote:
> From: wenxuan wu <wenxuanx.wu@intel.com>
>
> When testpmd use func get_eth_dev_info() while related resource
> had been released.
>
Is 'eth_dev_info_get_print_err()' fails at this stage?
What resource is released, the 'slave_port' itself?
And there may be another logic wrong, it shouldn't try
to detect if a released port is bonding port or not.
> Change the logic of func port_is_bonding_slave, this func
> eth_dev_info_get_print_err while pf is released would result
> in this error. Use ports instead to avoid this bug.
>
This relies to application level stored value to decide about
port, not sure if this is reliable.
> Fixes: 0a0821bcf312 ("app/testpmd: remove most uses of internal ethdev array")
> Cc: stable@dpdk.org
>
> Signed-off-by: wenxuan wu <wenxuanx.wu@intel.com>
> ---
> app/test-pmd/testpmd.c | 12 +-----------
> 1 file changed, 1 insertion(+), 11 deletions(-)
>
> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
> index e1da961311..37038c9183 100644
> --- a/app/test-pmd/testpmd.c
> +++ b/app/test-pmd/testpmd.c
> @@ -3824,19 +3824,9 @@ void clear_port_slave_flag(portid_t slave_pid)
> uint8_t port_is_bonding_slave(portid_t slave_pid)
> {
> struct rte_port *port;
> - struct rte_eth_dev_info dev_info;
> - int ret;
>
> port = &ports[slave_pid];
> - ret = eth_dev_info_get_print_err(slave_pid, &dev_info);
> - if (ret != 0) {
> - TESTPMD_LOG(ERR,
> - "Failed to get device info for port id %d,"
> - "cannot determine if the port is a bonded slave",
> - slave_pid);
> - return 0;
> - }
> - if ((*dev_info.dev_flags & RTE_ETH_DEV_BONDED_SLAVE) || (port->slave_flag == 1))
> + if ((*port->dev_info.dev_flags & RTE_ETH_DEV_BONDED_SLAVE) || (port->slave_flag == 1))
> return 1;
> return 0;
> }
> -----Original Message-----
> From: Yigit, Ferruh <ferruh.yigit@intel.com>
> Sent: 2022年3月5日 0:19
> To: Wu, WenxuanX <wenxuanx.wu@intel.com>; Yang, Qiming
> <qiming.yang@intel.com>; Zhang, Qi Z <qi.z.zhang@intel.com>; Li, Xiaoyun
> <xiaoyun.li@intel.com>; Singh, Aman Deep <aman.deep.singh@intel.com>;
> Zhang, Yuying <yuying.zhang@intel.com>
> Cc: dev@dpdk.org
> Subject: Re: [PATCH v2] app/testpmd : fix testpmd quit error
>
> On 3/4/2022 2:37 AM, wenxuanx.wu@intel.com wrote:
> > From: wenxuan wu <wenxuanx.wu@intel.com>
> >
> > When testpmd use func get_eth_dev_info() while related resource had
> > been released.
> >
>
> Is 'eth_dev_info_get_print_err()' fails at this stage?
> What resource is released, the 'slave_port' itself?
Yeah, 1PF ,2VF_repr.
Close port pf ,ok.
Close port vf , error.
In port_close() func,
Is_bonding_slave() call eth_dev_info_get_print_err() to confirm whether it is a slave or not , but when port is a vf port ,and pf had been released, this eth_dev_info_get_print_err(vf_id) would read a freed memory ,result in this bug.
>
> And there may be another logic wrong, it shouldn't try to detect if a released
> port is bonding port or not.
>
> > Change the logic of func port_is_bonding_slave, this func
> > eth_dev_info_get_print_err while pf is released would result in this
> > error. Use ports instead to avoid this bug.
> >
>
> This relies to application level stored value to decide about port, not sure if
> this is reliable.
>
> > Fixes: 0a0821bcf312 ("app/testpmd: remove most uses of internal ethdev
> > array")
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: wenxuan wu <wenxuanx.wu@intel.com>
> > ---
> > app/test-pmd/testpmd.c | 12 +-----------
> > 1 file changed, 1 insertion(+), 11 deletions(-)
> >
> > diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c index
> > e1da961311..37038c9183 100644
> > --- a/app/test-pmd/testpmd.c
> > +++ b/app/test-pmd/testpmd.c
> > @@ -3824,19 +3824,9 @@ void clear_port_slave_flag(portid_t slave_pid)
> > uint8_t port_is_bonding_slave(portid_t slave_pid)
> > {
> > struct rte_port *port;
> > - struct rte_eth_dev_info dev_info;
> > - int ret;
> >
> > port = &ports[slave_pid];
> > - ret = eth_dev_info_get_print_err(slave_pid, &dev_info);
> > - if (ret != 0) {
> > - TESTPMD_LOG(ERR,
> > - "Failed to get device info for port id %d,"
> > - "cannot determine if the port is a bonded slave",
> > - slave_pid);
> > - return 0;
> > - }
> > - if ((*dev_info.dev_flags & RTE_ETH_DEV_BONDED_SLAVE) || (port-
> >slave_flag == 1))
> > + if ((*port->dev_info.dev_flags & RTE_ETH_DEV_BONDED_SLAVE) ||
> > +(port->slave_flag == 1))
> > return 1;
> > return 0;
> > }
On 3/9/2022 10:10 AM, Wu, WenxuanX wrote:
>
>
>> -----Original Message-----
>> From: Yigit, Ferruh <ferruh.yigit@intel.com>
>> Sent: 2022年3月5日 0:19
>> To: Wu, WenxuanX <wenxuanx.wu@intel.com>; Yang, Qiming
>> <qiming.yang@intel.com>; Zhang, Qi Z <qi.z.zhang@intel.com>; Li, Xiaoyun
>> <xiaoyun.li@intel.com>; Singh, Aman Deep <aman.deep.singh@intel.com>;
>> Zhang, Yuying <yuying.zhang@intel.com>
>> Cc: dev@dpdk.org
>> Subject: Re: [PATCH v2] app/testpmd : fix testpmd quit error
>>
>> On 3/4/2022 2:37 AM, wenxuanx.wu@intel.com wrote:
>>> From: wenxuan wu <wenxuanx.wu@intel.com>
>>>
>>> When testpmd use func get_eth_dev_info() while related resource had
>>> been released.
>>>
>>
>> Is 'eth_dev_info_get_print_err()' fails at this stage?
>> What resource is released, the 'slave_port' itself?
> Yeah, 1PF ,2VF_repr.
> Close port pf ,ok.
> Close port vf , error.
> In port_close() func,
> Is_bonding_slave() call eth_dev_info_get_print_err() to confirm whether it is a slave or not , but when port is a vf port ,and pf had been released, this eth_dev_info_get_print_err(vf_id) would read a freed memory ,result in this bug.
I see the intention now, as rephrase:
When PF closed first, ethdev calls to VFs will fail,
in this case 'eth_dev_info_get_print_err()' fails
if it is called for VF when its PF is closed.
I think this approach is hack, more proper option can be
PF port refuse to close when there are outstanding VF ports
but this is more work.
For quick fix perhaps we can continue with first version
of your patch, which closes the ports in reverse order.
It has its shortcomings as we have discussed in that version,
but better than this approach and it cover most of the cases
properly.
But please add comment for intention of the change
and that it may not fix all cases clearly in the code.
>>
>> And there may be another logic wrong, it shouldn't try to detect if a released
>> port is bonding port or not.
>>
>>> Change the logic of func port_is_bonding_slave, this func
>>> eth_dev_info_get_print_err while pf is released would result in this
>>> error. Use ports instead to avoid this bug.
>>>
>>
>> This relies to application level stored value to decide about port, not sure if
>> this is reliable.
>>
>>> Fixes: 0a0821bcf312 ("app/testpmd: remove most uses of internal ethdev
>>> array")
>>> Cc: stable@dpdk.org
>>>
>>> Signed-off-by: wenxuan wu <wenxuanx.wu@intel.com>
>>> ---
>>> app/test-pmd/testpmd.c | 12 +-----------
>>> 1 file changed, 1 insertion(+), 11 deletions(-)
>>>
>>> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c index
>>> e1da961311..37038c9183 100644
>>> --- a/app/test-pmd/testpmd.c
>>> +++ b/app/test-pmd/testpmd.c
>>> @@ -3824,19 +3824,9 @@ void clear_port_slave_flag(portid_t slave_pid)
>>> uint8_t port_is_bonding_slave(portid_t slave_pid)
>>> {
>>> struct rte_port *port;
>>> - struct rte_eth_dev_info dev_info;
>>> - int ret;
>>>
>>> port = &ports[slave_pid];
>>> - ret = eth_dev_info_get_print_err(slave_pid, &dev_info);
>>> - if (ret != 0) {
>>> - TESTPMD_LOG(ERR,
>>> - "Failed to get device info for port id %d,"
>>> - "cannot determine if the port is a bonded slave",
>>> - slave_pid);
>>> - return 0;
>>> - }
>>> - if ((*dev_info.dev_flags & RTE_ETH_DEV_BONDED_SLAVE) || (port-
>>> slave_flag == 1))
>>> + if ((*port->dev_info.dev_flags & RTE_ETH_DEV_BONDED_SLAVE) ||
>>> +(port->slave_flag == 1))
>>> return 1;
>>> return 0;
>>> }
>
@@ -3824,19 +3824,9 @@ void clear_port_slave_flag(portid_t slave_pid)
uint8_t port_is_bonding_slave(portid_t slave_pid)
{
struct rte_port *port;
- struct rte_eth_dev_info dev_info;
- int ret;
port = &ports[slave_pid];
- ret = eth_dev_info_get_print_err(slave_pid, &dev_info);
- if (ret != 0) {
- TESTPMD_LOG(ERR,
- "Failed to get device info for port id %d,"
- "cannot determine if the port is a bonded slave",
- slave_pid);
- return 0;
- }
- if ((*dev_info.dev_flags & RTE_ETH_DEV_BONDED_SLAVE) || (port->slave_flag == 1))
+ if ((*port->dev_info.dev_flags & RTE_ETH_DEV_BONDED_SLAVE) || (port->slave_flag == 1))
return 1;
return 0;
}