[3/5] net/bnxt: fix race-condition when report error recovery

Message ID: 20230301030610.49468-4-fengchengwen@huawei.com (mailing list archive)
State: Changes Requested, archived
Delegated to: Ferruh Yigit
Series: fix race-condition of proactive error handling mode

Checks

Context        Check    Description
ci/checkpatch  success  coding style OK

Commit Message

fengchengwen March 1, 2023, 3:06 a.m. UTC
If the data path functions are set to dummy functions before the error
recovering event is reported, there may be a race condition with the
data path threads. This patch fixes it by setting the data path
functions to dummy functions only after the event is reported.

Fixes: e11052f3a46f ("net/bnxt: support proactive error handling mode")
Cc: stable@dpdk.org

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
---
 drivers/net/bnxt/bnxt_cpr.c    | 13 +++++++------
 drivers/net/bnxt/bnxt_ethdev.c |  4 ++--
 2 files changed, 9 insertions(+), 8 deletions(-)
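
The ordering described in the commit message can be sketched in a small standalone model. This is only an illustration under stated assumptions: every `toy_*` name below is hypothetical and none of it is DPDK or bnxt code. It assumes the application reacts to the recovering event by pausing its own data path calls, and it is single-threaded, so it shows the order of operations rather than reproducing the race itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Burst-function signature used by this toy model only (hypothetical). */
typedef uint16_t (*toy_rx_burst_fn)(void *queue, void **pkts, uint16_t n);

/* Stand-in for the real receive path: pretends to deliver n packets. */
static uint16_t toy_real_rx_burst(void *queue, void **pkts, uint16_t n)
{
	(void)queue;
	(void)pkts;
	return n;
}

/* Stand-in for the dummy receive path: never delivers anything. */
static uint16_t toy_dummy_rx_burst(void *queue, void **pkts, uint16_t n)
{
	(void)queue;
	(void)pkts;
	(void)n;
	return 0;
}

struct toy_dev {
	toy_rx_burst_fn rx_burst;              /* current data path function */
	void (*event_cb)(struct toy_dev *dev); /* application event callback */
	int app_paused;                        /* set by the application */
	int app_paused_at_swap;                /* app state seen at swap time */
};

/* Assumed application behaviour: stop polling once the event arrives. */
void toy_app_on_err_recovering(struct toy_dev *dev)
{
	dev->app_paused = 1;
}

/* Toy counterpart of stopping rx/tx: switch to the dummy function. */
void toy_stop_rxtx(struct toy_dev *dev)
{
	dev->app_paused_at_swap = dev->app_paused;
	dev->rx_burst = toy_dummy_rx_burst;
}

/* Order used by the patch: report the event first, swap afterwards. */
void toy_handle_reset_notify(struct toy_dev *dev)
{
	assert(dev->event_cb != NULL);
	dev->event_cb(dev);
	toy_stop_rxtx(dev);
}

/* Start with the real data path and the application callback installed. */
void toy_dev_init(struct toy_dev *dev)
{
	dev->rx_burst = toy_real_rx_burst;
	dev->event_cb = toy_app_on_err_recovering;
	dev->app_paused = 0;
	dev->app_paused_at_swap = 0;
}
```

After `toy_dev_init()` followed by `toy_handle_reset_notify()`, `app_paused_at_swap` is 1: the application had already been told to pause when the function pointer was switched. With the two calls in `toy_handle_reset_notify()` reversed, the same field would be 0, which corresponds to the window the patch closes.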
  

Comments

Konstantin Ananyev March 2, 2023, 12:23 p.m. UTC | #1
> If the data path functions are set to dummy functions before the error
> recovering event is reported, there may be a race condition with the
> data path threads. This patch fixes it by setting the data path
> functions to dummy functions only after the event is reported.
> 
> Fixes: e11052f3a46f ("net/bnxt: support proactive error handling mode")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> ---
>  drivers/net/bnxt/bnxt_cpr.c    | 13 +++++++------
>  drivers/net/bnxt/bnxt_ethdev.c |  4 ++--
>  2 files changed, 9 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/net/bnxt/bnxt_cpr.c b/drivers/net/bnxt/bnxt_cpr.c
> index 5bb376d4d5..3950840600 100644
> --- a/drivers/net/bnxt/bnxt_cpr.c
> +++ b/drivers/net/bnxt/bnxt_cpr.c
> @@ -168,14 +168,9 @@ void bnxt_handle_async_event(struct bnxt *bp,
>  		PMD_DRV_LOG(INFO, "Port conn async event\n");
>  		break;
>  	case HWRM_ASYNC_EVENT_CMPL_EVENT_ID_RESET_NOTIFY:
> -		/*
> -		 * Avoid any rx/tx packet processing during firmware reset
> -		 * operation.
> -		 */
> -		bnxt_stop_rxtx(bp->eth_dev);
> -
>  		/* Ignore reset notify async events when stopping the port */
>  		if (!bp->eth_dev->data->dev_started) {
> +			bnxt_stop_rxtx(bp->eth_dev);
>  			bp->flags |= BNXT_FLAG_FATAL_ERROR;
>  			return;
>  		}
> @@ -184,6 +179,12 @@ void bnxt_handle_async_event(struct bnxt *bp,
>  					     RTE_ETH_EVENT_ERR_RECOVERING,
>  					     NULL);
> 
> +		/*
> +		 * Avoid any rx/tx packet processing during firmware reset
> +		 * operation.
> +		 */
> +		bnxt_stop_rxtx(bp->eth_dev);
> +
>  		pthread_mutex_lock(&bp->err_recovery_lock);
>  		event_data = data1;
>  		/* timestamp_lo/hi values are in units of 100ms */
> diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
> index 753e86b4b2..4083a69d02 100644
> --- a/drivers/net/bnxt/bnxt_ethdev.c
> +++ b/drivers/net/bnxt/bnxt_ethdev.c
> @@ -4562,14 +4562,14 @@ static void bnxt_check_fw_health(void *arg)
>  	bp->flags |= BNXT_FLAG_FATAL_ERROR;
>  	bp->flags |= BNXT_FLAG_FW_RESET;
> 
> -	bnxt_stop_rxtx(bp->eth_dev);
> -
>  	PMD_DRV_LOG(ERR, "Detected FW dead condition\n");
> 
>  	rte_eth_dev_callback_process(bp->eth_dev,
>  				     RTE_ETH_EVENT_ERR_RECOVERING,
>  				     NULL);
> 
> +	bnxt_stop_rxtx(bp->eth_dev);
> +
>  	if (bnxt_is_primary_func(bp))
>  		wait_msec = info->primary_func_wait_period;
>  	else
> --

Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>

> 2.17.1
  

Patch

diff --git a/drivers/net/bnxt/bnxt_cpr.c b/drivers/net/bnxt/bnxt_cpr.c
index 5bb376d4d5..3950840600 100644
--- a/drivers/net/bnxt/bnxt_cpr.c
+++ b/drivers/net/bnxt/bnxt_cpr.c
@@ -168,14 +168,9 @@  void bnxt_handle_async_event(struct bnxt *bp,
 		PMD_DRV_LOG(INFO, "Port conn async event\n");
 		break;
 	case HWRM_ASYNC_EVENT_CMPL_EVENT_ID_RESET_NOTIFY:
-		/*
-		 * Avoid any rx/tx packet processing during firmware reset
-		 * operation.
-		 */
-		bnxt_stop_rxtx(bp->eth_dev);
-
 		/* Ignore reset notify async events when stopping the port */
 		if (!bp->eth_dev->data->dev_started) {
+			bnxt_stop_rxtx(bp->eth_dev);
 			bp->flags |= BNXT_FLAG_FATAL_ERROR;
 			return;
 		}
@@ -184,6 +179,12 @@  void bnxt_handle_async_event(struct bnxt *bp,
 					     RTE_ETH_EVENT_ERR_RECOVERING,
 					     NULL);
 
+		/*
+		 * Avoid any rx/tx packet processing during firmware reset
+		 * operation.
+		 */
+		bnxt_stop_rxtx(bp->eth_dev);
+
 		pthread_mutex_lock(&bp->err_recovery_lock);
 		event_data = data1;
 		/* timestamp_lo/hi values are in units of 100ms */
diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
index 753e86b4b2..4083a69d02 100644
--- a/drivers/net/bnxt/bnxt_ethdev.c
+++ b/drivers/net/bnxt/bnxt_ethdev.c
@@ -4562,14 +4562,14 @@  static void bnxt_check_fw_health(void *arg)
 	bp->flags |= BNXT_FLAG_FATAL_ERROR;
 	bp->flags |= BNXT_FLAG_FW_RESET;
 
-	bnxt_stop_rxtx(bp->eth_dev);
-
 	PMD_DRV_LOG(ERR, "Detected FW dead condition\n");
 
 	rte_eth_dev_callback_process(bp->eth_dev,
 				     RTE_ETH_EVENT_ERR_RECOVERING,
 				     NULL);
 
+	bnxt_stop_rxtx(bp->eth_dev);
+
 	if (bnxt_is_primary_func(bp))
 		wait_msec = info->primary_func_wait_period;
 	else