ethdev: stop the device before close

Message ID 20211020104715.2526074-1-andrew.rybchenko@oktetlabs.ru (mailing list archive)
State Superseded, archived
Delegated to: Ferruh Yigit
Headers
Series ethdev: stop the device before close |

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/github-robot: build success github build: passed
ci/iol-broadcom-Functional success Functional Testing PASS
ci/iol-aarch64-compile-testing success Testing PASS
ci/iol-broadcom-Performance success Performance Testing PASS
ci/iol-x86_64-compile-testing success Testing PASS
ci/iol-x86_64-unit-testing fail Testing issues
ci/iol-intel-Performance success Performance Testing PASS
ci/iol-intel-Functional fail Functional Testing issues
ci/Intel-compilation success Compilation OK
ci/intel-Testing success Testing PASS
ci/iol-mellanox-Performance success Performance Testing PASS

Commit Message

Andrew Rybchenko Oct. 20, 2021, 10:47 a.m. UTC
  From: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru>

In drivers important cleanup could happen on the device stop.
Do stop in the rte_eth_dev_close() function for robustness and
to simplify drivers code.

Signed-off-by: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
---
In fact the patch is required to fix segfault in the case of
net/virtio on close without stop after Rx interrupts enabled.

I believe that the right way to address the problem is automated
stop from close, but I guess it cannot not be backported and
may be fix in a different way required in stable branches.

 lib/ethdev/rte_ethdev.c | 11 +++++++++++
 1 file changed, 11 insertions(+)
  

Comments

Thomas Monjalon Oct. 20, 2021, 12:17 p.m. UTC | #1
20/10/2021 12:47, Andrew Rybchenko:
> From: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru>
> 
> In drivers important cleanup could happen on the device stop.
> Do stop in the rte_eth_dev_close() function for robustness and
> to simplify drivers code.
> 
> Signed-off-by: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru>
> Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> ---
> In fact the patch is required to fix segfault in the case of
> net/virtio on close without stop after Rx interrupts enabled.
> 
> I believe that the right way to address the problem is automated
> stop from close, but I guess it cannot not be backported and
> may be fix in a different way required in stable branches.

It is possible to do this addition.
But the right fix (not changing API behaviour) should be to return early
if the port is not stopped.

> @@ -1894,6 +1894,17 @@ rte_eth_dev_close(uint16_t port_id)
>  	dev = &rte_eth_devices[port_id];
>  
>  	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_close, -ENOTSUP);
> +
> +	if (dev->data->dev_started) {
> +		*lasterr = rte_eth_dev_stop(port_id);
> +		if (*lasterr != 0) {
> +			RTE_ETHDEV_LOG(ERR,
> +				"Failed to stop device (port %u) before close: %s - ignore\n",
> +				port_id, rte_strerror(-*lasterr));
> +			lasterr = &binerr;
> +		}
> +	}
> +
>  	*lasterr = (*dev->dev_ops->dev_close)(dev);
>  	if (*lasterr != 0)
>  		lasterr = &binerr;
  
Andrew Rybchenko Oct. 20, 2021, 12:24 p.m. UTC | #2
On 10/20/21 3:17 PM, Thomas Monjalon wrote:
> 20/10/2021 12:47, Andrew Rybchenko:
>> From: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru>
>>
>> In drivers important cleanup could happen on the device stop.
>> Do stop in the rte_eth_dev_close() function for robustness and
>> to simplify drivers code.
>>
>> Signed-off-by: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru>
>> Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
>> ---
>> In fact the patch is required to fix segfault in the case of
>> net/virtio on close without stop after Rx interrupts enabled.
>>
>> I believe that the right way to address the problem is automated
>> stop from close, but I guess it cannot not be backported and
>> may be fix in a different way required in stable branches.
> 
> It is possible to do this addition.
> But the right fix (not changing API behaviour) should be to return early
> if the port is not stopped.

Isn't returning an error a change of API behaviour?

This way we change behaviour less since some PMDs allow
to close in started state and do stop itself.

> 
>> @@ -1894,6 +1894,17 @@ rte_eth_dev_close(uint16_t port_id)
>>  	dev = &rte_eth_devices[port_id];
>>  
>>  	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_close, -ENOTSUP);
>> +
>> +	if (dev->data->dev_started) {
>> +		*lasterr = rte_eth_dev_stop(port_id);
>> +		if (*lasterr != 0) {
>> +			RTE_ETHDEV_LOG(ERR,
>> +				"Failed to stop device (port %u) before close: %s - ignore\n",
>> +				port_id, rte_strerror(-*lasterr));
>> +			lasterr = &binerr;
>> +		}
>> +	}
>> +
>>  	*lasterr = (*dev->dev_ops->dev_close)(dev);
>>  	if (*lasterr != 0)
>>  		lasterr = &binerr;
> 
>
  
Andrew Rybchenko Oct. 20, 2021, 1:06 p.m. UTC | #3
On 10/20/21 3:24 PM, Andrew Rybchenko wrote:
> On 10/20/21 3:17 PM, Thomas Monjalon wrote:
>> 20/10/2021 12:47, Andrew Rybchenko:
>>> From: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru>
>>>
>>> In drivers important cleanup could happen on the device stop.
>>> Do stop in the rte_eth_dev_close() function for robustness and
>>> to simplify drivers code.
>>>
>>> Signed-off-by: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru>
>>> Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
>>> ---
>>> In fact the patch is required to fix segfault in the case of
>>> net/virtio on close without stop after Rx interrupts enabled.
>>>
>>> I believe that the right way to address the problem is automated
>>> stop from close, but I guess it cannot not be backported and
>>> may be fix in a different way required in stable branches.
>>
>> It is possible to do this addition.
>> But the right fix (not changing API behaviour) should be to return early
>> if the port is not stopped.
> 
> Isn't returning an error a change of API behaviour?

After looking at rte_eth_dev_close() description I agreethat we
should return an error. Yes, it is a behaviour change, but a
change in accordance with the documentation.
I'll submit v2 which checks that port  is stopped and return
error.

> 
> This way we change behaviour less since some PMDs allow
> to close in started state and do stop itself.
> 
>>
>>> @@ -1894,6 +1894,17 @@ rte_eth_dev_close(uint16_t port_id)
>>>  	dev = &rte_eth_devices[port_id];
>>>  
>>>  	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_close, -ENOTSUP);
>>> +
>>> +	if (dev->data->dev_started) {
>>> +		*lasterr = rte_eth_dev_stop(port_id);
>>> +		if (*lasterr != 0) {
>>> +			RTE_ETHDEV_LOG(ERR,
>>> +				"Failed to stop device (port %u) before close: %s - ignore\n",
>>> +				port_id, rte_strerror(-*lasterr));
>>> +			lasterr = &binerr;
>>> +		}
>>> +	}
>>> +
>>>  	*lasterr = (*dev->dev_ops->dev_close)(dev);
>>>  	if (*lasterr != 0)
>>>  		lasterr = &binerr;
>>
>>
  

Patch

diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
index a151c05849..b9f0938f20 100644
--- a/lib/ethdev/rte_ethdev.c
+++ b/lib/ethdev/rte_ethdev.c
@@ -1894,6 +1894,17 @@  rte_eth_dev_close(uint16_t port_id)
 	dev = &rte_eth_devices[port_id];
 
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_close, -ENOTSUP);
+
+	if (dev->data->dev_started) {
+		*lasterr = rte_eth_dev_stop(port_id);
+		if (*lasterr != 0) {
+			RTE_ETHDEV_LOG(ERR,
+				"Failed to stop device (port %u) before close: %s - ignore\n",
+				port_id, rte_strerror(-*lasterr));
+			lasterr = &binerr;
+		}
+	}
+
 	*lasterr = (*dev->dev_ops->dev_close)(dev);
 	if (*lasterr != 0)
 		lasterr = &binerr;