net/ixgbe: fix link status inconsistencies

Message ID 20200226120637.13109-1-taox.zhu@intel.com (mailing list archive)
State Accepted, archived
Delegated to: xiaolong ye
Headers
Series net/ixgbe: fix link status inconsistencies |

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/iol-mellanox-Performance success Performance Testing PASS
ci/iol-testing success Testing PASS
ci/Intel-compilation success Compilation OK
ci/travis-robot success Travis build: passed

Commit Message

Zhu, TaoX Feb. 26, 2020, 12:06 p.m. UTC
  From: Zhu Tao <taox.zhu@intel.com>

Setting LINK UP or LINK DOWN is divided into two parts, with
the main task done in a separate thread, which can take up
to 9 seconds. If cancel the thread in execution, may cause state
inconsistencies. Therefore, must wait for the previous setting
to exit normally before setting the new state.

Note: before using threads, use alarm to handle main tasks.
When canceling alarm, the execution of alarm will not be interrupted.

Fixes: 819d0d1d57 ("net/ixgbe: fix blocking system events")
Cc: stable@dpdk.org

Signed-off-by: Zhu Tao <taox.zhu@intel.com>
---
 drivers/net/ixgbe/ixgbe_ethdev.c | 24 ++++++++++++++++++++++--
 1 file changed, 22 insertions(+), 2 deletions(-)
  

Comments

Zhu, TaoX March 3, 2020, 2 a.m. UTC | #1
+Xiaolong

BR,
Zhu, Tao

> -----Original Message-----
> From: Zhu, TaoX
> Sent: Wednesday, February 26, 2020 8:07 PM
> To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Lu, Wenzhuo
> <wenzhuo.lu@intel.com>
> Cc: dev@dpdk.org; Zhu, TaoX <taox.zhu@intel.com>; stable@dpdk.org
> Subject: [PATCH] net/ixgbe: fix link status inconsistencies
> 
> From: Zhu Tao <taox.zhu@intel.com>
> 
> Setting LINK UP or LINK DOWN is divided into two parts, with the main task
> done in a separate thread, which can take up to 9 seconds. If cancel the
> thread in execution, may cause state inconsistencies. Therefore, must wait
> for the previous setting to exit normally before setting the new state.
> 
> Note: before using threads, use alarm to handle main tasks.
> When canceling alarm, the execution of alarm will not be interrupted.
> 
> Fixes: 819d0d1d57 ("net/ixgbe: fix blocking system events")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Zhu Tao <taox.zhu@intel.com>
> ---
>  drivers/net/ixgbe/ixgbe_ethdev.c | 24 ++++++++++++++++++++++--
>  1 file changed, 22 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c
> b/drivers/net/ixgbe/ixgbe_ethdev.c
> index 23b3f5b0c..2c5797635 100644
> --- a/drivers/net/ixgbe/ixgbe_ethdev.c
> +++ b/drivers/net/ixgbe/ixgbe_ethdev.c
> @@ -4143,16 +4143,35 @@ ixgbevf_check_link(struct ixgbe_hw *hw,
> ixgbe_link_speed *speed,
>  	return ret_val;
>  }
> 
> +/* return 1: setup complete, return 0: setup not complete, and wait
> +timeout*/ static int ixgbe_dev_wait_setup_link_complete(struct
> +rte_eth_dev *dev) { #define DELAY_INTERVAL 100 /* 100ms */
> +#define MAX_TIMEOUT    90 /* 9s (90 * 100ms) in total */
> +	struct ixgbe_adapter *ad = dev->data->dev_private;
> +	int timeout = MAX_TIMEOUT;
> +
> +	while (rte_atomic32_read(&ad->link_thread_running) && timeout) {
> +		msec_delay(DELAY_INTERVAL);
> +		timeout--;
> +	}
> +
> +
> +	return !!timeout;
> +}
> +
>  static void
>  ixgbe_dev_cancel_link_thread(struct rte_eth_dev *dev)  {
>  	struct ixgbe_adapter *ad = dev->data->dev_private;
>  	void *retval;
> 
> -	if (rte_atomic32_read(&ad->link_thread_running)) {
> +	if (!ixgbe_dev_wait_setup_link_complete(dev)) {
>  		pthread_cancel(ad->link_thread_tid);
>  		pthread_join(ad->link_thread_tid, &retval);
>  		rte_atomic32_clear(&ad->link_thread_running);
> +		PMD_DRV_LOG(ERR, "Link thread not complete, cancel it!");
>  	}
>  }
> 
> @@ -4263,7 +4282,8 @@ ixgbe_dev_link_update_share(struct rte_eth_dev
> *dev,
>  	if (link_up == 0) {
>  		if (ixgbe_get_media_type(hw) == ixgbe_media_type_fiber)
> {
>  			intr->flags |= IXGBE_FLAG_NEED_LINK_CONFIG;
> -			if (rte_atomic32_test_and_set(&ad-
> >link_thread_running)) {
> +			if (ixgbe_dev_wait_setup_link_complete(dev) &&
> +			    rte_atomic32_test_and_set(&ad-
> >link_thread_running)) {
>  				if (rte_ctrl_thread_create(&ad-
> >link_thread_tid,
>  					"ixgbe-link-handler",
>  					NULL,
> --
> 2.17.1
  
Zhu, TaoX March 19, 2020, 2:43 a.m. UTC | #2
+Jiang Yu


BR,
Zhu, Tao

> -----Original Message-----
> From: Zhu, TaoX
> Sent: Tuesday, March 3, 2020 10:00 AM
> To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Lu, Wenzhuo
> <wenzhuo.lu@intel.com>; Ye, Xiaolong <xiaolong.ye@intel.com>
> Cc: dev@dpdk.org; stable@dpdk.org
> Subject: RE: [PATCH] net/ixgbe: fix link status inconsistencies
> 
> +Xiaolong
> 
> BR,
> Zhu, Tao
> 
> > -----Original Message-----
> > From: Zhu, TaoX
> > Sent: Wednesday, February 26, 2020 8:07 PM
> > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Lu, Wenzhuo
> > <wenzhuo.lu@intel.com>
> > Cc: dev@dpdk.org; Zhu, TaoX <taox.zhu@intel.com>; stable@dpdk.org
> > Subject: [PATCH] net/ixgbe: fix link status inconsistencies
> >
> > From: Zhu Tao <taox.zhu@intel.com>
> >
> > Setting LINK UP or LINK DOWN is divided into two parts, with the main
> > task done in a separate thread, which can take up to 9 seconds. If
> > cancel the thread in execution, may cause state inconsistencies.
> > Therefore, must wait for the previous setting to exit normally before
> setting the new state.
> >
> > Note: before using threads, use alarm to handle main tasks.
> > When canceling alarm, the execution of alarm will not be interrupted.
> >
> > Fixes: 819d0d1d57 ("net/ixgbe: fix blocking system events")
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: Zhu Tao <taox.zhu@intel.com>
> > ---
> >  drivers/net/ixgbe/ixgbe_ethdev.c | 24 ++++++++++++++++++++++--
> >  1 file changed, 22 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c
> > b/drivers/net/ixgbe/ixgbe_ethdev.c
> > index 23b3f5b0c..2c5797635 100644
> > --- a/drivers/net/ixgbe/ixgbe_ethdev.c
> > +++ b/drivers/net/ixgbe/ixgbe_ethdev.c
> > @@ -4143,16 +4143,35 @@ ixgbevf_check_link(struct ixgbe_hw *hw,
> > ixgbe_link_speed *speed,
> >  	return ret_val;
> >  }
> >
> > +/* return 1: setup complete, return 0: setup not complete, and wait
> > +timeout*/ static int ixgbe_dev_wait_setup_link_complete(struct
> > +rte_eth_dev *dev) { #define DELAY_INTERVAL 100 /* 100ms */
> > +#define MAX_TIMEOUT    90 /* 9s (90 * 100ms) in total */
> > +	struct ixgbe_adapter *ad = dev->data->dev_private;
> > +	int timeout = MAX_TIMEOUT;
> > +
> > +	while (rte_atomic32_read(&ad->link_thread_running) && timeout) {
> > +		msec_delay(DELAY_INTERVAL);
> > +		timeout--;
> > +	}
> > +
> > +
> > +	return !!timeout;
> > +}
> > +
> >  static void
> >  ixgbe_dev_cancel_link_thread(struct rte_eth_dev *dev)  {
> >  	struct ixgbe_adapter *ad = dev->data->dev_private;
> >  	void *retval;
> >
> > -	if (rte_atomic32_read(&ad->link_thread_running)) {
> > +	if (!ixgbe_dev_wait_setup_link_complete(dev)) {
> >  		pthread_cancel(ad->link_thread_tid);
> >  		pthread_join(ad->link_thread_tid, &retval);
> >  		rte_atomic32_clear(&ad->link_thread_running);
> > +		PMD_DRV_LOG(ERR, "Link thread not complete, cancel it!");
> >  	}
> >  }
> >
> > @@ -4263,7 +4282,8 @@ ixgbe_dev_link_update_share(struct
> rte_eth_dev
> > *dev,
> >  	if (link_up == 0) {
> >  		if (ixgbe_get_media_type(hw) == ixgbe_media_type_fiber)
> {
> >  			intr->flags |= IXGBE_FLAG_NEED_LINK_CONFIG;
> > -			if (rte_atomic32_test_and_set(&ad-
> > >link_thread_running)) {
> > +			if (ixgbe_dev_wait_setup_link_complete(dev) &&
> > +			    rte_atomic32_test_and_set(&ad-
> > >link_thread_running)) {
> >  				if (rte_ctrl_thread_create(&ad-
> > >link_thread_tid,
> >  					"ixgbe-link-handler",
> >  					NULL,
> > --
> > 2.17.1
  
Xiaolong Ye March 29, 2020, 4:14 a.m. UTC | #3
On 02/26, taox.zhu@intel.com wrote:
>From: Zhu Tao <taox.zhu@intel.com>
>
>Setting LINK UP or LINK DOWN is divided into two parts, with
>the main task done in a separate thread, which can take up
>to 9 seconds. If cancel the thread in execution, may cause state
>inconsistencies. Therefore, must wait for the previous setting
>to exit normally before setting the new state.
>
>Note: before using threads, use alarm to handle main tasks.
>When canceling alarm, the execution of alarm will not be interrupted.
>
>Fixes: 819d0d1d57 ("net/ixgbe: fix blocking system events")
>Cc: stable@dpdk.org
>
>Signed-off-by: Zhu Tao <taox.zhu@intel.com>
>---
> drivers/net/ixgbe/ixgbe_ethdev.c | 24 ++++++++++++++++++++++--
> 1 file changed, 22 insertions(+), 2 deletions(-)
>
>diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
>index 23b3f5b0c..2c5797635 100644
>--- a/drivers/net/ixgbe/ixgbe_ethdev.c
>+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
>@@ -4143,16 +4143,35 @@ ixgbevf_check_link(struct ixgbe_hw *hw, ixgbe_link_speed *speed,
> 	return ret_val;
> }
> 
>+/* return 1: setup complete, return 0: setup not complete, and wait timeout*/
>+static int
>+ixgbe_dev_wait_setup_link_complete(struct rte_eth_dev *dev)
>+{
>+#define DELAY_INTERVAL 100 /* 100ms */
>+#define MAX_TIMEOUT    90 /* 9s (90 * 100ms) in total */
>+	struct ixgbe_adapter *ad = dev->data->dev_private;
>+	int timeout = MAX_TIMEOUT;
>+
>+	while (rte_atomic32_read(&ad->link_thread_running) && timeout) {
>+		msec_delay(DELAY_INTERVAL);
>+		timeout--;
>+	}
>+
>+
>+	return !!timeout;
>+}
>+
> static void
> ixgbe_dev_cancel_link_thread(struct rte_eth_dev *dev)
> {
> 	struct ixgbe_adapter *ad = dev->data->dev_private;
> 	void *retval;
> 
>-	if (rte_atomic32_read(&ad->link_thread_running)) {
>+	if (!ixgbe_dev_wait_setup_link_complete(dev)) {
> 		pthread_cancel(ad->link_thread_tid);
> 		pthread_join(ad->link_thread_tid, &retval);
> 		rte_atomic32_clear(&ad->link_thread_running);
>+		PMD_DRV_LOG(ERR, "Link thread not complete, cancel it!");
> 	}
> }
> 
>@@ -4263,7 +4282,8 @@ ixgbe_dev_link_update_share(struct rte_eth_dev *dev,
> 	if (link_up == 0) {
> 		if (ixgbe_get_media_type(hw) == ixgbe_media_type_fiber) {
> 			intr->flags |= IXGBE_FLAG_NEED_LINK_CONFIG;
>-			if (rte_atomic32_test_and_set(&ad->link_thread_running)) {
>+			if (ixgbe_dev_wait_setup_link_complete(dev) &&
>+			    rte_atomic32_test_and_set(&ad->link_thread_running)) {
> 				if (rte_ctrl_thread_create(&ad->link_thread_tid,
> 					"ixgbe-link-handler",
> 					NULL,
>-- 
>2.17.1
>

Acked-by: Xiaolong Ye <xiaolong.ye@intel.com>

Applied to dpdk-next-net-intel, Thanks.
  

Patch

diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
index 23b3f5b0c..2c5797635 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -4143,16 +4143,35 @@  ixgbevf_check_link(struct ixgbe_hw *hw, ixgbe_link_speed *speed,
 	return ret_val;
 }
 
+/* return 1: setup complete, return 0: setup not complete, and wait timeout*/
+static int
+ixgbe_dev_wait_setup_link_complete(struct rte_eth_dev *dev)
+{
+#define DELAY_INTERVAL 100 /* 100ms */
+#define MAX_TIMEOUT    90 /* 9s (90 * 100ms) in total */
+	struct ixgbe_adapter *ad = dev->data->dev_private;
+	int timeout = MAX_TIMEOUT;
+
+	while (rte_atomic32_read(&ad->link_thread_running) && timeout) {
+		msec_delay(DELAY_INTERVAL);
+		timeout--;
+	}
+
+
+	return !!timeout;
+}
+
 static void
 ixgbe_dev_cancel_link_thread(struct rte_eth_dev *dev)
 {
 	struct ixgbe_adapter *ad = dev->data->dev_private;
 	void *retval;
 
-	if (rte_atomic32_read(&ad->link_thread_running)) {
+	if (!ixgbe_dev_wait_setup_link_complete(dev)) {
 		pthread_cancel(ad->link_thread_tid);
 		pthread_join(ad->link_thread_tid, &retval);
 		rte_atomic32_clear(&ad->link_thread_running);
+		PMD_DRV_LOG(ERR, "Link thread not complete, cancel it!");
 	}
 }
 
@@ -4263,7 +4282,8 @@  ixgbe_dev_link_update_share(struct rte_eth_dev *dev,
 	if (link_up == 0) {
 		if (ixgbe_get_media_type(hw) == ixgbe_media_type_fiber) {
 			intr->flags |= IXGBE_FLAG_NEED_LINK_CONFIG;
-			if (rte_atomic32_test_and_set(&ad->link_thread_running)) {
+			if (ixgbe_dev_wait_setup_link_complete(dev) &&
+			    rte_atomic32_test_and_set(&ad->link_thread_running)) {
 				if (rte_ctrl_thread_create(&ad->link_thread_tid,
 					"ixgbe-link-handler",
 					NULL,