[v2] app/testpmd : fix testpmd quit error

Message ID 20220304023701.499961-1-wenxuanx.wu@intel.com (mailing list archive)
State Superseded, archived
Delegated to: Ferruh Yigit
Series [v2] app/testpmd : fix testpmd quit error

Checks

Context Check Description
ci/checkpatch warning coding style issues
ci/Intel-compilation success Compilation OK
ci/intel-Testing success Testing PASS
ci/iol-broadcom-Functional success Functional Testing PASS
ci/iol-mellanox-Performance success Performance Testing PASS
ci/iol-intel-Functional success Functional Testing PASS
ci/iol-broadcom-Performance success Performance Testing PASS
ci/iol-intel-Performance success Performance Testing PASS
ci/github-robot: build success github build: passed
ci/iol-aarch64-unit-testing success Testing PASS
ci/iol-x86_64-compile-testing success Testing PASS
ci/iol-x86_64-unit-testing success Testing PASS
ci/iol-aarch64-compile-testing success Testing PASS
ci/iol-abi-testing success Testing PASS

Commit Message

Wu, WenxuanX March 4, 2022, 2:37 a.m. UTC
  From: wenxuan wu <wenxuanx.wu@intel.com>

testpmd reports an error on quit when eth_dev_info_get_print_err()
is called for a port whose related resource has already been released.

port_is_bonding_slave() calls eth_dev_info_get_print_err(), and
calling it after the PF has been released results in this error.
Use the device info stored in ports[] instead to avoid this bug.

Fixes: 0a0821bcf312 ("app/testpmd: remove most uses of internal ethdev array")
Cc: stable@dpdk.org

Signed-off-by: wenxuan wu <wenxuanx.wu@intel.com>
---
 app/test-pmd/testpmd.c | 12 +-----------
 1 file changed, 1 insertion(+), 11 deletions(-)
  

Comments

Zhang, Yuying March 4, 2022, 10:12 a.m. UTC | #1
Hi Wenxuan,

> -----Original Message-----
> From: Wu, WenxuanX <wenxuanx.wu@intel.com>
> Sent: Friday, March 4, 2022 10:37 AM
> To: Yang, Qiming <qiming.yang@intel.com>; Zhang, Qi Z
> <qi.z.zhang@intel.com>; Li, Xiaoyun <xiaoyun.li@intel.com>; Singh, Aman
> Deep <aman.deep.singh@intel.com>; Zhang, Yuying
> <yuying.zhang@intel.com>
> Cc: dev@dpdk.org
> Subject: [PATCH v2] app/testpmd : fix testpmd quit error
> 
> From: wenxuan wu <wenxuanx.wu@intel.com>
> 
> When testpmd use func get_eth_dev_info() while related resource had been
> released.
> 
> Change the logic of func port_is_bonding_slave, this func
> eth_dev_info_get_print_err while pf is released would result in this error.
> Use ports instead to avoid this bug.
> 
> Fixes: 0a0821bcf312 ("app/testpmd: remove most uses of internal ethdev
> array")
> Cc: stable@dpdk.org
> 
> Signed-off-by: wenxuan wu <wenxuanx.wu@intel.com>
> ---
>  app/test-pmd/testpmd.c | 12 +-----------
>  1 file changed, 1 insertion(+), 11 deletions(-)
> 
> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
> index e1da961311..37038c9183 100644
> --- a/app/test-pmd/testpmd.c
> +++ b/app/test-pmd/testpmd.c
> @@ -3824,19 +3824,9 @@ void clear_port_slave_flag(portid_t slave_pid)
>  uint8_t port_is_bonding_slave(portid_t slave_pid)
>  {
>  	struct rte_port *port;
> -	struct rte_eth_dev_info dev_info;
> -	int ret;
> 
>  	port = &ports[slave_pid];
> -	ret = eth_dev_info_get_print_err(slave_pid, &dev_info);
> -	if (ret != 0) {
> -		TESTPMD_LOG(ERR,
> -			"Failed to get device info for port id %d,"
> -			"cannot determine if the port is a bonded slave",
> -			slave_pid);
> -		return 0;
> -	}
> -	if ((*dev_info.dev_flags & RTE_ETH_DEV_BONDED_SLAVE) || (port->slave_flag == 1))
> +	if ((*port->dev_info.dev_flags & RTE_ETH_DEV_BONDED_SLAVE) || (port->slave_flag == 1))

Is port->dev_info.dev_flags updated in time when the bonding status changes?
eth_dev_info_get_print_err() could be used to update the port's dev_info.

>  		return 1;
>  	return 0;
>  }
> --
> 2.25.1
  
Ferruh Yigit March 4, 2022, 4:18 p.m. UTC | #2
On 3/4/2022 2:37 AM, wenxuanx.wu@intel.com wrote:
> From: wenxuan wu <wenxuanx.wu@intel.com>
> 
> When testpmd use func get_eth_dev_info() while related resource
> had been released.
> 

Does 'eth_dev_info_get_print_err()' fail at this stage?
Which resource is released, the 'slave_port' itself?

There may also be a separate logic error: it shouldn't try
to detect whether a released port is a bonding port or not.

> Change the logic of func port_is_bonding_slave, this func
> eth_dev_info_get_print_err while pf is released would result
> in this error. Use ports instead to avoid this bug.
> 

This relies on an application-level stored value to decide about
the port; I am not sure this is reliable.
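
To make the reliability concern concrete, here is a minimal sketch of how
an application-level copy can lag behind the live device state. All names
(demo_dev, demo_port, DEMO_FLAG_BONDED_SLAVE and the helpers) are
hypothetical and are not testpmd or ethdev code. It also assumes the stored
value is a plain copy; in the patch the stored dev_flags field is a pointer
that gets dereferenced, which this model does not capture.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a "bonded slave" device flag. */
#define DEMO_FLAG_BONDED_SLAVE 0x1u

/* Hypothetical device: 'flags' is the live state owned by the driver layer. */
struct demo_dev {
	uint32_t flags;
};

/* Hypothetical application-side port entry holding a copied snapshot. */
struct demo_port {
	uint32_t cached_flags;
};

/* Copy the live flags into the application-side snapshot. */
static void demo_refresh(struct demo_port *port, const struct demo_dev *dev)
{
	assert(port != NULL && dev != NULL);
	port->cached_flags = dev->flags;
}

/* Decide from the snapshot only; may be out of date. */
static int demo_is_slave_cached(const struct demo_port *port)
{
	return (port->cached_flags & DEMO_FLAG_BONDED_SLAVE) != 0;
}

/* Decide from the live device state. */
static int demo_is_slave_live(const struct demo_dev *dev)
{
	return (dev->flags & DEMO_FLAG_BONDED_SLAVE) != 0;
}
```

In this model the cached answer only matches the live one if demo_refresh()
runs after every change of bonding status, which is the condition the
question above is asking about.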

> Fixes: 0a0821bcf312 ("app/testpmd: remove most uses of internal ethdev array")
> Cc: stable@dpdk.org
> 
> Signed-off-by: wenxuan wu <wenxuanx.wu@intel.com>
> ---
>   app/test-pmd/testpmd.c | 12 +-----------
>   1 file changed, 1 insertion(+), 11 deletions(-)
> 
> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
> index e1da961311..37038c9183 100644
> --- a/app/test-pmd/testpmd.c
> +++ b/app/test-pmd/testpmd.c
> @@ -3824,19 +3824,9 @@ void clear_port_slave_flag(portid_t slave_pid)
>   uint8_t port_is_bonding_slave(portid_t slave_pid)
>   {
>   	struct rte_port *port;
> -	struct rte_eth_dev_info dev_info;
> -	int ret;
>   
>   	port = &ports[slave_pid];
> -	ret = eth_dev_info_get_print_err(slave_pid, &dev_info);
> -	if (ret != 0) {
> -		TESTPMD_LOG(ERR,
> -			"Failed to get device info for port id %d,"
> -			"cannot determine if the port is a bonded slave",
> -			slave_pid);
> -		return 0;
> -	}
> -	if ((*dev_info.dev_flags & RTE_ETH_DEV_BONDED_SLAVE) || (port->slave_flag == 1))
> +	if ((*port->dev_info.dev_flags & RTE_ETH_DEV_BONDED_SLAVE) || (port->slave_flag == 1))
>   		return 1;
>   	return 0;
>   }
  
Wu, WenxuanX March 9, 2022, 10:10 a.m. UTC | #3
> -----Original Message-----
> From: Yigit, Ferruh <ferruh.yigit@intel.com>
> Sent: Saturday, March 5, 2022 12:19 AM
> To: Wu, WenxuanX <wenxuanx.wu@intel.com>; Yang, Qiming
> <qiming.yang@intel.com>; Zhang, Qi Z <qi.z.zhang@intel.com>; Li, Xiaoyun
> <xiaoyun.li@intel.com>; Singh, Aman Deep <aman.deep.singh@intel.com>;
> Zhang, Yuying <yuying.zhang@intel.com>
> Cc: dev@dpdk.org
> Subject: Re: [PATCH v2] app/testpmd : fix testpmd quit error
> 
> On 3/4/2022 2:37 AM, wenxuanx.wu@intel.com wrote:
> > From: wenxuan wu <wenxuanx.wu@intel.com>
> >
> > When testpmd use func get_eth_dev_info() while related resource had
> > been released.
> >
> 
> Is 'eth_dev_info_get_print_err()' fails at this stage?
> What resource is released, the 'slave_port' itself?
Yes. The setup is 1 PF, 2 VF_repr.
Closing the PF port works.
Closing a VF port then fails.
In port_close(), port_is_bonding_slave() calls eth_dev_info_get_print_err()
to check whether the port is a slave. When the port is a VF port and the
PF has already been released, eth_dev_info_get_print_err(vf_id) reads
freed memory, which results in this bug.
> 
> And there may be another logic wrong, it shouldn't try to detect if a released
> port is bonding port or not.
> 
> > Change the logic of func port_is_bonding_slave, this func
> > eth_dev_info_get_print_err while pf is released would result in this
> > error. Use ports instead to avoid this bug.
> >
> 
> This relies to application level stored value to decide about port, not sure if
> this is reliable.
> 
> > Fixes: 0a0821bcf312 ("app/testpmd: remove most uses of internal ethdev
> > array")
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: wenxuan wu <wenxuanx.wu@intel.com>
> > ---
> >   app/test-pmd/testpmd.c | 12 +-----------
> >   1 file changed, 1 insertion(+), 11 deletions(-)
> >
> > diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
> > index e1da961311..37038c9183 100644
> > --- a/app/test-pmd/testpmd.c
> > +++ b/app/test-pmd/testpmd.c
> > @@ -3824,19 +3824,9 @@ void clear_port_slave_flag(portid_t slave_pid)
> >   uint8_t port_is_bonding_slave(portid_t slave_pid)
> >   {
> >   	struct rte_port *port;
> > -	struct rte_eth_dev_info dev_info;
> > -	int ret;
> >
> >   	port = &ports[slave_pid];
> > -	ret = eth_dev_info_get_print_err(slave_pid, &dev_info);
> > -	if (ret != 0) {
> > -		TESTPMD_LOG(ERR,
> > -			"Failed to get device info for port id %d,"
> > -			"cannot determine if the port is a bonded slave",
> > -			slave_pid);
> > -		return 0;
> > -	}
> > -	if ((*dev_info.dev_flags & RTE_ETH_DEV_BONDED_SLAVE) || (port->slave_flag == 1))
> > +	if ((*port->dev_info.dev_flags & RTE_ETH_DEV_BONDED_SLAVE) || (port->slave_flag == 1))
> >   		return 1;
> >   	return 0;
> >   }
  
Ferruh Yigit March 10, 2022, 11:11 a.m. UTC | #4
On 3/9/2022 10:10 AM, Wu, WenxuanX wrote:
> 
> 
>> -----Original Message-----
>> From: Yigit, Ferruh <ferruh.yigit@intel.com>
>> Sent: Saturday, March 5, 2022 12:19 AM
>> To: Wu, WenxuanX <wenxuanx.wu@intel.com>; Yang, Qiming
>> <qiming.yang@intel.com>; Zhang, Qi Z <qi.z.zhang@intel.com>; Li, Xiaoyun
>> <xiaoyun.li@intel.com>; Singh, Aman Deep <aman.deep.singh@intel.com>;
>> Zhang, Yuying <yuying.zhang@intel.com>
>> Cc: dev@dpdk.org
>> Subject: Re: [PATCH v2] app/testpmd : fix testpmd quit error
>>
>> On 3/4/2022 2:37 AM, wenxuanx.wu@intel.com wrote:
>>> From: wenxuan wu <wenxuanx.wu@intel.com>
>>>
>>> When testpmd use func get_eth_dev_info() while related resource had
>>> been released.
>>>
>>
>> Is 'eth_dev_info_get_print_err()' fails at this stage?
>> What resource is released, the 'slave_port' itself?
> Yeah, 1PF ,2VF_repr.
> Close port pf ,ok.
> Close port vf , error.
> In port_close() func,
>     Is_bonding_slave() call eth_dev_info_get_print_err() to confirm whether it is a slave or not , but when port is a vf port ,and  pf had been released, this eth_dev_info_get_print_err(vf_id) would read a freed memory ,result in this bug.

I see the intention now. To rephrase:

When the PF is closed first, ethdev calls to its VFs will fail.
In this case 'eth_dev_info_get_print_err()' fails
if it is called for a VF whose PF is already closed.

I think this approach is a hack. A more proper option would be
for the PF port to refuse to close while there are outstanding VF ports,
but this is more work.
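
A minimal sketch of that "refuse to close" idea, under stated assumptions:
the types and helpers below (demo_port, demo_open, demo_close) are
hypothetical, not existing ethdev or testpmd code, and the -EBUSY return
value is only one possible way to signal the refusal.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical port: a VF points at its PF, a PF has parent == NULL. */
struct demo_port {
	struct demo_port *parent;
	int open_children; /* number of VF ports still open on this PF */
	int closed;
};

/* Register a port as open and account for it on its parent, if any. */
static void demo_open(struct demo_port *p, struct demo_port *parent)
{
	assert(p != NULL);
	p->parent = parent;
	p->open_children = 0;
	p->closed = 0;
	if (parent != NULL)
		parent->open_children++;
}

/* Close a port; a PF with outstanding VF ports refuses with -EBUSY. */
static int demo_close(struct demo_port *p)
{
	assert(p != NULL);
	if (p->closed)
		return 0;
	if (p->open_children > 0)
		return -EBUSY;
	if (p->parent != NULL)
		p->parent->open_children--;
	p->closed = 1;
	return 0;
}
```

With one PF and two VFs opened through demo_open(), demo_close() on the PF
returns -EBUSY until both VFs have been closed, so a VF never outlives the
PF it depends on.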

For a quick fix, perhaps we can continue with the first version
of your patch, which closes the ports in reverse order.
It has its shortcomings, as we discussed in that version,
but it is better than this approach and covers most of the cases
properly.
But please add a comment in the code that clearly states the intention
of the change and that it may not fix all cases.
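
For illustration only, the effect of the close order can be modelled as
below. This is a toy model, not the first version of the patch: demo_port,
demo_info_get and demo_close_all are hypothetical names, and the model
simply assumes that an info query on a VF fails once its PF is closed.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NO_PARENT (-1)

/* Hypothetical port table entry: 'parent' is the index of the PF. */
struct demo_port {
	int parent; /* DEMO_NO_PARENT for a PF */
	int closed;
};

/* Model of an info query: fails if the port's PF is already closed. */
static int demo_info_get(const struct demo_port *ports, int id)
{
	int parent = ports[id].parent;

	if (parent != DEMO_NO_PARENT && ports[parent].closed)
		return -1;
	return 0;
}

/*
 * Close every port, querying its info first (as a close path might).
 * Returns how many info queries failed.
 */
static int demo_close_all(struct demo_port *ports, int n, int reverse)
{
	int failures = 0;

	assert(ports != NULL && n >= 0);
	for (int i = 0; i < n; i++) {
		int id = reverse ? n - 1 - i : i;

		if (demo_info_get(ports, id) != 0)
			failures++;
		ports[id].closed = 1;
	}
	return failures;
}
```

With the PF at index 0 and two VFs at indexes 1 and 2, closing in ascending
order gives two failed queries and closing in reverse order gives none. If a
VF happens to have a lower index than its PF, the reverse loop closes the PF
first and the query for that VF fails, which is one case a reverse-order
loop does not cover in this model.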

>>
>> And there may be another logic wrong, it shouldn't try to detect if a released
>> port is bonding port or not.
>>
>>> Change the logic of func port_is_bonding_slave, this func
>>> eth_dev_info_get_print_err while pf is released would result in this
>>> error. Use ports instead to avoid this bug.
>>>
>>
>> This relies to application level stored value to decide about port, not sure if
>> this is reliable.
>>
>>> Fixes: 0a0821bcf312 ("app/testpmd: remove most uses of internal ethdev
>>> array")
>>> Cc: stable@dpdk.org
>>>
>>> Signed-off-by: wenxuan wu <wenxuanx.wu@intel.com>
>>> ---
>>>    app/test-pmd/testpmd.c | 12 +-----------
>>>    1 file changed, 1 insertion(+), 11 deletions(-)
>>>
>>> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
>>> index e1da961311..37038c9183 100644
>>> --- a/app/test-pmd/testpmd.c
>>> +++ b/app/test-pmd/testpmd.c
>>> @@ -3824,19 +3824,9 @@ void clear_port_slave_flag(portid_t slave_pid)
>>>    uint8_t port_is_bonding_slave(portid_t slave_pid)
>>>    {
>>>    	struct rte_port *port;
>>> -	struct rte_eth_dev_info dev_info;
>>> -	int ret;
>>>
>>>    	port = &ports[slave_pid];
>>> -	ret = eth_dev_info_get_print_err(slave_pid, &dev_info);
>>> -	if (ret != 0) {
>>> -		TESTPMD_LOG(ERR,
>>> -			"Failed to get device info for port id %d,"
>>> -			"cannot determine if the port is a bonded slave",
>>> -			slave_pid);
>>> -		return 0;
>>> -	}
>>> -	if ((*dev_info.dev_flags & RTE_ETH_DEV_BONDED_SLAVE) || (port->slave_flag == 1))
>>> +	if ((*port->dev_info.dev_flags & RTE_ETH_DEV_BONDED_SLAVE) || (port->slave_flag == 1))
>>>    		return 1;
>>>    	return 0;
>>>    }
>
  

Patch

diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index e1da961311..37038c9183 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -3824,19 +3824,9 @@  void clear_port_slave_flag(portid_t slave_pid)
 uint8_t port_is_bonding_slave(portid_t slave_pid)
 {
 	struct rte_port *port;
-	struct rte_eth_dev_info dev_info;
-	int ret;
 
 	port = &ports[slave_pid];
-	ret = eth_dev_info_get_print_err(slave_pid, &dev_info);
-	if (ret != 0) {
-		TESTPMD_LOG(ERR,
-			"Failed to get device info for port id %d,"
-			"cannot determine if the port is a bonded slave",
-			slave_pid);
-		return 0;
-	}
-	if ((*dev_info.dev_flags & RTE_ETH_DEV_BONDED_SLAVE) || (port->slave_flag == 1))
+	if ((*port->dev_info.dev_flags & RTE_ETH_DEV_BONDED_SLAVE) || (port->slave_flag == 1))
 		return 1;
 	return 0;
 }