net/mlx5: fix link state update

Message ID 20200327172449.6514-1-bganne@cisco.com (mailing list archive)
State Superseded, archived
Delegated to: Raslan Darawsheh
Headers
Series net/mlx5: fix link state update |

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/iol-intel-Performance success Performance Testing PASS
ci/iol-mellanox-Performance success Performance Testing PASS
ci/iol-testing success Testing PASS
ci/travis-robot success Travis build: passed
ci/Intel-compilation success Compilation OK

Commit Message

Benoit Ganne (bganne) March 27, 2020, 5:24 p.m. UTC
  mlx5 PMD refuses to update link state if link speed is defined but
status is down or if link speed is undefined but status is up, even if
the ioctl() succeeded.
This prevents application to detect link up/down event, especially when
the link speed is not correctly detected.
As link speed is nice to have whereas link status is mandatory for
operations, always update link state regardless of link speed. The
application can then check link speed if needs be.

Signed-off-by: Benoît Ganne <bganne@cisco.com>
---

Following the discussion on dpdk-users [1], I am submitting a tentative
patch.
[1] http://mails.dpdk.org/archives/users/2020-March/thread.html#4744

 drivers/net/mlx5/mlx5_ethdev.c | 10 ----------
 1 file changed, 10 deletions(-)
  

Comments

Matan Azrad March 29, 2020, 7 a.m. UTC | #1
Hi

 From: Benoît Ganne
> mlx5 PMD refuses to update link state if link speed is defined but status is
> down or if link speed is undefined but status is up, even if the ioctl()
> succeeded.
> This prevents application to detect link up/down event, especially when the
> link speed is not correctly detected.

Do you use the wait option? Or no wait?

> As link speed is nice to have whereas link status is mandatory for operations,
> always update link state regardless of link speed. The application can then
> check link speed if needs be.

Is it documented well? I didn't find doc\description says link speed is best effort.

PMD cannot guess whether link speed is mandatory for the user or not....

I think, at least you should update ethdev relevant API descriptions and get agreement from more PMD maintainers... 

 
> Signed-off-by: Benoît Ganne <bganne@cisco.com>
> ---
> 
> Following the discussion on dpdk-users [1], I am submitting a tentative patch.
> [1]
> https://eur03.safelinks.protection.outlook.com/?url=http%3A%2F%2Fmails.
> dpdk.org%2Farchives%2Fusers%2F2020-
> March%2Fthread.html%234744&amp;data=02%7C01%7Cmatan%40mellanox.
> com%7C2cb881fc068c434e2f6b08d7d273c924%7Ca652971c7d2e4d9ba6a4d14
> 9256f461b%7C0%7C0%7C637209267055735109&amp;sdata=WV%2Fsd%2B%2
> BKssI3d8uxMf8cqackb%2FHrpqRIWOos2BWynU4%3D&amp;reserved=0
> 
>  drivers/net/mlx5/mlx5_ethdev.c | 10 ----------
>  1 file changed, 10 deletions(-)
> 
> diff --git a/drivers/net/mlx5/mlx5_ethdev.c
> b/drivers/net/mlx5/mlx5_ethdev.c index d7d3bc73c..c15f4d62b 100644
> --- a/drivers/net/mlx5/mlx5_ethdev.c
> +++ b/drivers/net/mlx5/mlx5_ethdev.c
> @@ -896,11 +896,6 @@ mlx5_link_update_unlocked_gset(struct
> rte_eth_dev *dev,
>  				ETH_LINK_HALF_DUPLEX :
> ETH_LINK_FULL_DUPLEX);
>  	dev_link.link_autoneg = !(dev->data->dev_conf.link_speeds &
>  			ETH_LINK_SPEED_FIXED);
> -	if (((dev_link.link_speed && !dev_link.link_status) ||
> -	     (!dev_link.link_speed && dev_link.link_status))) {
> -		rte_errno = EAGAIN;
> -		return -rte_errno;
> -	}
>  	*link = dev_link;
>  	return 0;
>  }
> @@ -1032,11 +1027,6 @@ mlx5_link_update_unlocked_gs(struct
> rte_eth_dev *dev,
>  				ETH_LINK_HALF_DUPLEX :
> ETH_LINK_FULL_DUPLEX);
>  	dev_link.link_autoneg = !(dev->data->dev_conf.link_speeds &
>  				  ETH_LINK_SPEED_FIXED);
> -	if (((dev_link.link_speed && !dev_link.link_status) ||
> -	     (!dev_link.link_speed && dev_link.link_status))) {
> -		rte_errno = EAGAIN;
> -		return -rte_errno;
> -	}
>  	*link = dev_link;
>  	return 0;
>  }
> --
> 2.17.1
  
Benoit Ganne (bganne) March 30, 2020, 9:55 a.m. UTC | #2
> From: Matan Azrad <matan@mellanox.com>

>> From: Benoît Ganne
>> mlx5 PMD refuses to update link state if link speed is defined but
>> status is down or if link speed is undefined but status is up,
>> even if the ioctl() succeeded.
>> This prevents application to detect link up/down event, especially
>> when the link speed is not correctly detected.

> Do you use the wait option? Or no wait?

We are using the no wait option.

>> As link speed is nice to have whereas link status is mandatory for
>> operations, always update link state regardless of link speed.
>> The application can then check link speed if needs be.

> Is it documented well? I didn't find doc\description says link speed is
> best effort.
> PMD cannot guess whether link speed is mandatory for the user or not....

I do not think it is documented and I suppose at the end of the day it depends from the HW and configurations...
My commit message is misleading, my apologizes. What I meant was to let the app decide whether it should retry or not, based on the data it gets.
Right now, the PMD *prevents* the app to get link state if the link speed is undefined even if the app does not care about link speed. It makes an assumption on behalf of the apps. Moreover, it seems to be one of the few PMDs to make this assumption: at least the mlx4 PMD and the i40e PMD does not seem to make this assumption.
This is a problem for our app (fd.io VPP) but I was also told about other apps having the same issue too.

> I think, at least you should update ethdev relevant API descriptions and
> get agreement from more PMD maintainers...

I hope to get feedback 😊 it is precisely the goal of this.

ben
  
Matan Azrad March 30, 2020, 10:12 a.m. UTC | #3
Hi

From: Benoit Ganne (bganne) <bganne@cisco.com>
> > From: Matan Azrad <matan@mellanox.com>
> 
> >> From: Benoît Ganne
> >> mlx5 PMD refuses to update link state if link speed is defined but
> >> status is down or if link speed is undefined but status is up, even
> >> if the ioctl() succeeded.
> >> This prevents application to detect link up/down event, especially
> >> when the link speed is not correctly detected.
> 
> > Do you use the wait option? Or no wait?
> 
> We are using the no wait option.

I suggest to call again if failed for N retries time.

> >> As link speed is nice to have whereas link status is mandatory for
> >> operations, always update link state regardless of link speed.
> >> The application can then check link speed if needs be.
> 
> > Is it documented well? I didn't find doc\description says link speed
> > is best effort.
> > PMD cannot guess whether link speed is mandatory for the user or not....
> 
> I do not think it is documented and I suppose at the end of the day it
> depends from the HW and configurations...
> My commit message is misleading, my apologizes. What I meant was to let
> the app decide whether it should retry or not, based on the data it gets.
> Right now, the PMD *prevents* the app to get link state if the link speed is
> undefined even if the app does not care about link speed.

In mlx5 this is not the case, we have no one updated and second not - there are going together:

You can see that we have 2 different system calls: 1 to get up\down and second to get link speed.
If link speed doesn't appropriate to the link state it may say that something was changed between the calls and the link status we got from the first call is not correct anymore.

In this case, we should call both calls again, that’s what we are doing in "nowait" option.
If the user doesn't want "nowait" option, (means PMD is not allowed to take more time for response) he should call again when the callback failed in the time and retries manner the user prefers.

> It makes an
> assumption on behalf of the apps. Moreover, it seems to be one of the few
> PMDs to make this assumption: at least the mlx4 PMD and the i40e PMD
> does not seem to make this assumption.
> This is a problem for our app (fd.io VPP) but I was also told about other apps
> having the same issue too.
> 
> > I think, at least you should update ethdev relevant API descriptions
> > and get agreement from more PMD maintainers...
> 
> I hope to get feedback 😊 it is precisely the goal of this.
> 
> ben
  
Benoit Ganne (bganne) March 30, 2020, 12:03 p.m. UTC | #4
Hi Matan,

>>>> mlx5 PMD refuses to update link state if link speed is defined but
>>>> status is down or if link speed is undefined but status is up, even
>>>> if the ioctl() succeeded.
>>>> This prevents application to detect link up/down event, especially
>>>> when the link speed is not correctly detected.
>>> Do you use the wait option? Or no wait?
>> We are using the no wait option.
> I suggest to call again if failed for N retries time.

Unfortunately it will not solve our problem: if link speed is undefined but the link is up then the test '!dev_link.link_speed && dev_link.link_status' at http://git.dpdk.org/dpdk/tree/drivers/net/mlx5/mlx5_ethdev.c#n899 will always be true and the function will always return EAGAIN.
This actually happens in Azure with CX4-Lx VFs.

>> What I meant was to let the app decide whether it should retry or not,
>> based on the data it gets.
>> Right now, the PMD *prevents* the app to get link state if the link
>> speed is undefined even if the app does not care about link speed.

> In mlx5 this is not the case, we have no one updated and second not -
> there are going together:
> You can see that we have 2 different system calls: 1 to get up\down and
> second to get link speed.
> If link speed doesn't appropriate to the link state it may say that
> something was changed between the calls and the link status we got from
> the first call is not correct anymore.
> In this case, we should call both calls again, that’s what we are doing in
> "nowait" option.
> If the user doesn't want "nowait" option, (means PMD is not allowed to
> take more time for response) he should call again when the callback failed
> in the time and retries manner the user prefers.

Ok, now I understand the logic behind the current behavior: the 2 syscalls being not atomics, you try to detect inconsistencies that way.
But if the link speed is undefined, then the state will never be correctly updated.
I still believe it is unnecessarily heavy-handed: in most networking application I have seen (and I have 2 examples of current shipping networking products), a missing link speed is not critical whereas link being reported as down means no traffic flowing.

Best
ben
  
Matan Azrad March 30, 2020, 1:44 p.m. UTC | #5
Hi
 From: Benoit Ganne 
> Sent: Monday, March 30, 2020 3:03 PM
> To: Matan Azrad <matan@mellanox.com>; dev@dpdk.org
> Subject: RE: [dpdk-dev] [PATCH] net/mlx5: fix link state update
> 
> Hi Matan,
> 
> >>>> mlx5 PMD refuses to update link state if link speed is defined but
> >>>> status is down or if link speed is undefined but status is up, even
> >>>> if the ioctl() succeeded.
> >>>> This prevents application to detect link up/down event, especially
> >>>> when the link speed is not correctly detected.
> >>> Do you use the wait option? Or no wait?
> >> We are using the no wait option.
> > I suggest to call again if failed for N retries time.
> 
> Unfortunately it will not solve our problem: if link speed is undefined but the
> link is up then the test '!dev_link.link_speed && dev_link.link_status' at
> https://eur03.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgit.dp
> dk.org%2Fdpdk%2Ftree%2Fdrivers%2Fnet%2Fmlx5%2Fmlx5_ethdev.c%23n8
> 99&amp;data=02%7C01%7Cmatan%40mellanox.com%7C8a0fb9f9ba94422a4a
> 4208d7d4a2539c%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C637
> 211665962283891&amp;sdata=%2FExOJyhm0Bi4kkp434P0U4kRpQs7U0knuTa
> kvsWFO5Q%3D&amp;reserved=0 will always be true and the function will
> always return EAGAIN.
> This actually happens in Azure with CX4-Lx VFs.
> 
> >> What I meant was to let the app decide whether it should retry or
> >> not, based on the data it gets.
> >> Right now, the PMD *prevents* the app to get link state if the link
> >> speed is undefined even if the app does not care about link speed.
> 
> > In mlx5 this is not the case, we have no one updated and second not -
> > there are going together:
> > You can see that we have 2 different system calls: 1 to get up\down
> > and second to get link speed.
> > If link speed doesn't appropriate to the link state it may say that
> > something was changed between the calls and the link status we got
> > from the first call is not correct anymore.
> > In this case, we should call both calls again, that’s what we are
> > doing in "nowait" option.
> > If the user doesn't want "nowait" option, (means PMD is not allowed to
> > take more time for response) he should call again when the callback
> > failed in the time and retries manner the user prefers.
> 
> Ok, now I understand the logic behind the current behavior: the 2 syscalls
> being not atomics, you try to detect inconsistencies that way.
> But if the link speed is undefined, then the state will never be correctly
> updated.

Why link speed is undefined?
Old kernel?
Kernel mlx5 driver issue?

Do you know?

> I still believe it is unnecessarily heavy-handed: in most networking application
> I have seen (and I have 2 examples of current shipping networking products),
> a missing link speed is not critical whereas link being reported as down means
> no traffic flowing.
> 
> Best
> ben
  
Benoit Ganne (bganne) March 30, 2020, 1:53 p.m. UTC | #6
> Why link speed is undefined?
> Old kernel?
> Kernel mlx5 driver issue?
> Do you know?

Seems to be a bug in kernel mlx5 driver: http://mails.dpdk.org/archives/users/2020-March/004758.html
I still think this test is too strict, and from what I can see mlx4 PMD do not implement it (even if it uses 2 syscalls too).

ben
  
Matan Azrad March 30, 2020, 4:13 p.m. UTC | #7
Hi

From: Benoit Ganne (bganne)
> > Why link speed is undefined?
> > Old kernel?
> > Kernel mlx5 driver issue?
> > Do you know?
> 
> Seems to be a bug in kernel mlx5 driver:
> https://eur03.safelinks.protection.outlook.com/?url=http%3A%2F%2Fmails.
> dpdk.org%2Farchives%2Fusers%2F2020-
> March%2F004758.html&amp;data=02%7C01%7Cmatan%40mellanox.com%7C
> 69ce3bfd96484267869908d7d4b1ca89%7Ca652971c7d2e4d9ba6a4d149256f46
> 1b%7C0%7C0%7C637211732388079395&amp;sdata=%2BnfjST6fM4zoIYwJz7L
> Z2zL3rGmmOFXbpW0fK7Weps4%3D&amp;reserved=0
> I still think this test is too strict, and from what I can see mlx4 PMD do not
> implement it (even if it uses 2 syscalls too).
Yes, Doesn't mean this is not a bug in mlx4😊


Thanks Ben.

So we have 2 orthogonal efforts here:

1. To check with our driver team what is the issue to report a VF link speed : For me 😊
2. To change documentation of ethdev API to say that only link status is mandatory report. (all others "nice to have" or "best effort"):
	This is for you - I suggest to send prior patch to ethdev for this and let the community to decide.

Are you agree?


> ben
  
Benoit Ganne (bganne) April 1, 2020, 10:17 a.m. UTC | #8
Hi Matan,

>> Seems to be a bug in kernel mlx5 driver:
> 1. To check with our driver team what is the issue to report a VF link

Note that even with the kernel patch, there still seem to be a valid case where it can return SPEED_UNKNOWN.
If this is true, then I don't think we can rely on valid link speed to detect whether the info are correct.

ben
  
Matan Azrad April 1, 2020, 12:46 p.m. UTC | #9
Hi

From: Benoit Ganne
> Hi Matan,
> 
> >> Seems to be a bug in kernel mlx5 driver:
> > 1. To check with our driver team what is the issue to report a VF link

Yes, driver team said the 3 kernel patches should solve the issue.

> Note that even with the kernel patch, there still seem to be a valid case
> where it can return SPEED_UNKNOWN.
> If this is true, then I don't think we can rely on valid link speed to detect
> whether the info are correct.

As I said I'm not against this approach, but need to define the API well.
 
> ben
  
Thomas Monjalon April 7, 2020, 12:54 p.m. UTC | #10
30/03/2020 18:13, Matan Azrad:
> 2. To change documentation of ethdev API to say that only link status
> is mandatory report. (all others "nice to have" or "best effort"):
> 	This is for you - I suggest to send prior patch to ethdev
> 	for this and let the community to decide.

Matan, Benoit, I guess nobody tried to update ethdev API doc yet?
I can do it if you prefer.
  
Benoit Ganne (bganne) April 7, 2020, 1:41 p.m. UTC | #11
>> 2. To change documentation of ethdev API to say that only link status
>> is mandatory report. (all others "nice to have" or "best effort"):
>> 	This is for you - I suggest to send prior patch to ethdev
>> 	for this and let the community to decide.
> Matan, Benoit, I guess nobody tried to update ethdev API doc yet?
> I can do it if you prefer.

My understanding was that this [1] was basically saying 'returning UNKNOWN is ok'. But if this is not what you had in mind, I'll be glad you can take care of it 😊

[1] http://mails.dpdk.org/archives/dev/2020-April/thread.html#161538

Best
ben
  
Matan Azrad Nov. 19, 2020, 8:30 a.m. UTC | #12
+@Asaf Penso

From: Benoit Ganne 
> >> 2. To change documentation of ethdev API to say that only link status
> >> is mandatory report. (all others "nice to have" or "best effort"):
> >> 	This is for you - I suggest to send prior patch to ethdev
> >> 	for this and let the community to decide.
> > Matan, Benoit, I guess nobody tried to update ethdev API doc yet?
> > I can do it if you prefer.
> 
> My understanding was that this [1] was basically saying 'returning UNKNOWN
> is ok'. But if this is not what you had in mind, I'll be glad you can take care of it
> 😊
> 
> [1]
> https://eur03.safelinks.protection.outlook.com/?url=http%3A%2F%2Fmails.dpdk
> .org%2Farchives%2Fdev%2F2020-
> April%2Fthread.html%23161538&amp;data=02%7C01%7Cmatan%40mellanox.c
> om%7Cee2c8eebc00d4ea3106b08d7daf97473%7Ca652971c7d2e4d9ba6a4d149
> 256f461b%7C0%7C0%7C637218637252717747&amp;sdata=JSFaHkmSzE1tWB7
> DqEaqVI5B1pIZuIdDrfw0SFyUE8s%3D&amp;reserved=0
> 
> Best
> ben


Hi
Now, when the API is updated that the speed value is best effort using ETH_SPEED_NUM_UNKNOWN new value,

You can go for your approach:
In case the link is up and speed is invalid by kernel we can return link up and ETH_SPEED_NUM_UNKNOWN.

In case applications wants a valid speed they should call the API again.
Need to update it in mlx5 release notes section.

The original patch should be rebased and updated.

Matan
  

Patch

diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index d7d3bc73c..c15f4d62b 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -896,11 +896,6 @@  mlx5_link_update_unlocked_gset(struct rte_eth_dev *dev,
 				ETH_LINK_HALF_DUPLEX : ETH_LINK_FULL_DUPLEX);
 	dev_link.link_autoneg = !(dev->data->dev_conf.link_speeds &
 			ETH_LINK_SPEED_FIXED);
-	if (((dev_link.link_speed && !dev_link.link_status) ||
-	     (!dev_link.link_speed && dev_link.link_status))) {
-		rte_errno = EAGAIN;
-		return -rte_errno;
-	}
 	*link = dev_link;
 	return 0;
 }
@@ -1032,11 +1027,6 @@  mlx5_link_update_unlocked_gs(struct rte_eth_dev *dev,
 				ETH_LINK_HALF_DUPLEX : ETH_LINK_FULL_DUPLEX);
 	dev_link.link_autoneg = !(dev->data->dev_conf.link_speeds &
 				  ETH_LINK_SPEED_FIXED);
-	if (((dev_link.link_speed && !dev_link.link_status) ||
-	     (!dev_link.link_speed && dev_link.link_status))) {
-		rte_errno = EAGAIN;
-		return -rte_errno;
-	}
 	*link = dev_link;
 	return 0;
 }