net/bnxt: fix crash caused by error recovery

Message ID 20211119035041.4493-1-somnath.kotur@broadcom.com (mailing list archive)
State Accepted, archived
Delegated to: Ajit Khaparde
Series net/bnxt: fix crash caused by error recovery

Checks

Context                          Check    Description
ci/checkpatch                    success  coding style OK
ci/Intel-compilation             success  Compilation OK
ci/intel-Testing                 success  Testing PASS
ci/iol-broadcom-Performance      success  Performance Testing PASS
ci/iol-mellanox-Performance      success  Performance Testing PASS
ci/iol-broadcom-Functional       success  Functional Testing PASS
ci/iol-intel-Functional          success  Functional Testing PASS
ci/iol-intel-Performance         success  Performance Testing PASS
ci/github-robot: build           success  github build: passed
ci/iol-x86_64-unit-testing       success  Testing PASS
ci/iol-aarch64-unit-testing      success  Testing PASS
ci/iol-aarch64-compile-testing   success  Testing PASS
ci/iol-x86_64-compile-testing    success  Testing PASS

Commit Message

Somnath Kotur Nov. 19, 2021, 3:50 a.m. UTC
bnxt_stop_rxtx() does not stop data path processing as intended
because it does not update the recently introduced fast-path pointers
'(struct rte_eth_fp_ops)->rx_pkt_burst' and
'(struct rte_eth_fp_ops)->tx_pkt_burst'. Since both burst routines
use only the fast-path pointers, the real burst routines are invoked
instead of the dummy ones set by bnxt_stop_rxtx(), leading to crashes
in the data path (e.g. dereferencing freed structures).

Fix the segfault by updating the fast-path pointers as well.

Fixes: c87d435a4d79 ("ethdev: copy fast-path API into separate structure")

Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
---
 drivers/net/bnxt/bnxt_cpr.c    | 9 +++++++++
 drivers/net/bnxt/bnxt_ethdev.c | 7 ++++++-
 2 files changed, 15 insertions(+), 1 deletion(-)
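
The failure mode described in the commit message can be reproduced in
miniature without DPDK. The sketch below is illustrative only: every
demo_* name is hypothetical and merely mirrors the roles of
struct rte_eth_dev, rte_eth_fp_ops[] and bnxt_stop_rxtx(). It assumes a
single thread and uses a C11 fence as a stand-in for rte_mb(); the
100 ms grace period from the patch is omitted.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_PORTS 4

/* Hypothetical burst signature, simplified from the real ethdev one. */
typedef uint16_t (*demo_burst_t)(void *queue, void **pkts, uint16_t nb_pkts);

/* Slow-path device object (plays the role of struct rte_eth_dev). */
struct demo_eth_dev {
	uint16_t port_id;
	demo_burst_t rx_pkt_burst;
	demo_burst_t tx_pkt_burst;
};

/* Per-port fast-path copy (plays the role of rte_eth_fp_ops[]). */
struct demo_fp_ops {
	demo_burst_t rx_pkt_burst;
	demo_burst_t tx_pkt_burst;
};

static struct demo_fp_ops demo_fp_table[DEMO_MAX_PORTS];

/* "Real" burst: pretends every requested packet was processed. */
static uint16_t demo_real_burst(void *queue, void **pkts, uint16_t nb_pkts)
{
	(void)queue;
	(void)pkts;
	return nb_pkts;
}

/* Dummy burst: touches nothing and reports zero packets. */
static uint16_t demo_dummy_burst(void *queue, void **pkts, uint16_t nb_pkts)
{
	(void)queue;
	(void)pkts;
	(void)nb_pkts;
	return 0;
}

/* Application-facing wrappers consult ONLY the fast-path table. */
static uint16_t demo_rx_burst(uint16_t port_id, void **pkts, uint16_t nb_pkts)
{
	assert(port_id < DEMO_MAX_PORTS);
	return demo_fp_table[port_id].rx_pkt_burst(NULL, pkts, nb_pkts);
}

static uint16_t demo_tx_burst(uint16_t port_id, void **pkts, uint16_t nb_pkts)
{
	assert(port_id < DEMO_MAX_PORTS);
	return demo_fp_table[port_id].tx_pkt_burst(NULL, pkts, nb_pkts);
}

/* Device start: install the real routines in both places. */
static void demo_dev_start(struct demo_eth_dev *dev)
{
	dev->rx_pkt_burst = demo_real_burst;
	dev->tx_pkt_burst = demo_real_burst;
	demo_fp_table[dev->port_id].rx_pkt_burst = dev->rx_pkt_burst;
	demo_fp_table[dev->port_id].tx_pkt_burst = dev->tx_pkt_burst;
}

/* Pre-fix behaviour: only the slow-path pointers are swapped. */
static void demo_stop_rxtx_slow_only(struct demo_eth_dev *dev)
{
	dev->rx_pkt_burst = demo_dummy_burst;
	dev->tx_pkt_burst = demo_dummy_burst;
}

/* Post-fix behaviour: the fast-path copy is updated as well. */
static void demo_stop_rxtx_both(struct demo_eth_dev *dev)
{
	demo_stop_rxtx_slow_only(dev);
	demo_fp_table[dev->port_id].rx_pkt_burst = dev->rx_pkt_burst;
	demo_fp_table[dev->port_id].tx_pkt_burst = dev->tx_pkt_burst;
	/* Stand-in for rte_mb(): order the stores before what follows. */
	atomic_thread_fence(memory_order_seq_cst);
}
```

After demo_dev_start(), calling demo_stop_rxtx_slow_only() leaves
demo_rx_burst() dispatching to the real routine, which is the situation
that led to the crash. Only demo_stop_rxtx_both() makes the wrappers
reach the dummy routine.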
  

Comments

Ajit Khaparde Nov. 19, 2021, 6:07 a.m. UTC | #1
On Thu, Nov 18, 2021 at 7:57 PM Somnath Kotur
<somnath.kotur@broadcom.com> wrote:
>
> bnxt_stop_rxtx() does not stop data path processing as intended
> as it does not update the recently introduced fast-path pointers
> '(struct rte_eth_fp_ops)->rx_pkt_burst'. Since both the burst routines
> only use the fast-path pointer, the real burst routines get invoked
> instead of the dummy ones set by bnxt_stop_rxtx() leading to crashes
> in the data path (e.g. dereferencing freed structures)
>
> Fix the segfault by updating the fast-path pointer as well
>
> Fixes: c87d435a4d79 ("ethdev: copy fast-path API into separate structure")
>
> Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Patch applied to dpdk-next-net-brcm.

> ---
>  drivers/net/bnxt/bnxt_cpr.c    | 9 +++++++++
>  drivers/net/bnxt/bnxt_ethdev.c | 7 ++++++-
>  2 files changed, 15 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/bnxt/bnxt_cpr.c b/drivers/net/bnxt/bnxt_cpr.c
> index 6bb70d516e..a43b22a8f8 100644
> --- a/drivers/net/bnxt/bnxt_cpr.c
> +++ b/drivers/net/bnxt/bnxt_cpr.c
> @@ -387,4 +387,13 @@ void bnxt_stop_rxtx(struct bnxt *bp)
>  {
>         bp->eth_dev->rx_pkt_burst = &bnxt_dummy_recv_pkts;
>         bp->eth_dev->tx_pkt_burst = &bnxt_dummy_xmit_pkts;
> +
> +       rte_eth_fp_ops[bp->eth_dev->data->port_id].rx_pkt_burst =
> +               bp->eth_dev->rx_pkt_burst;
> +       rte_eth_fp_ops[bp->eth_dev->data->port_id].tx_pkt_burst =
> +               bp->eth_dev->tx_pkt_burst;
> +       rte_mb();
> +
> +       /* Allow time for threads to exit the real burst functions. */
> +       rte_delay_ms(100);
>  }
> diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
> index 4413b5d72e..c1bdf9a921 100644
> --- a/drivers/net/bnxt/bnxt_ethdev.c
> +++ b/drivers/net/bnxt/bnxt_ethdev.c
> @@ -4323,6 +4323,8 @@ static void bnxt_dev_recover(void *arg)
>
>         /* Clear Error flag so that device re-init should happen */
>         bp->flags &= ~BNXT_FLAG_FATAL_ERROR;
> +       PMD_DRV_LOG(INFO, "Port: %u Starting recovery...\n",
> +                   bp->eth_dev->data->port_id);
>
>         rc = bnxt_check_fw_ready(bp);
>         if (rc)
> @@ -4347,7 +4349,8 @@ static void bnxt_dev_recover(void *arg)
>         if (rc)
>                 goto err_start;
>
> -       PMD_DRV_LOG(INFO, "Recovered from FW reset\n");
> +       PMD_DRV_LOG(INFO, "Port: %u Recovered from FW reset\n",
> +                   bp->eth_dev->data->port_id);
>         pthread_mutex_unlock(&bp->err_recovery_lock);
>
>         return;
> @@ -4372,6 +4375,8 @@ void bnxt_dev_reset_and_resume(void *arg)
>         int rc;
>
>         bnxt_dev_cleanup(bp);
> +       PMD_DRV_LOG(INFO, "Port: %u Finished bnxt_dev_cleanup\n",
> +                   bp->eth_dev->data->port_id);
>
>         bnxt_wait_for_device_shutdown(bp);
>
> --
> 2.28.0.497.g54e85e7
>
  

Patch

diff --git a/drivers/net/bnxt/bnxt_cpr.c b/drivers/net/bnxt/bnxt_cpr.c
index 6bb70d516e..a43b22a8f8 100644
--- a/drivers/net/bnxt/bnxt_cpr.c
+++ b/drivers/net/bnxt/bnxt_cpr.c
@@ -387,4 +387,13 @@  void bnxt_stop_rxtx(struct bnxt *bp)
 {
 	bp->eth_dev->rx_pkt_burst = &bnxt_dummy_recv_pkts;
 	bp->eth_dev->tx_pkt_burst = &bnxt_dummy_xmit_pkts;
+
+	rte_eth_fp_ops[bp->eth_dev->data->port_id].rx_pkt_burst =
+		bp->eth_dev->rx_pkt_burst;
+	rte_eth_fp_ops[bp->eth_dev->data->port_id].tx_pkt_burst =
+		bp->eth_dev->tx_pkt_burst;
+	rte_mb();
+
+	/* Allow time for threads to exit the real burst functions. */
+	rte_delay_ms(100);
 }
diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
index 4413b5d72e..c1bdf9a921 100644
--- a/drivers/net/bnxt/bnxt_ethdev.c
+++ b/drivers/net/bnxt/bnxt_ethdev.c
@@ -4323,6 +4323,8 @@  static void bnxt_dev_recover(void *arg)
 
 	/* Clear Error flag so that device re-init should happen */
 	bp->flags &= ~BNXT_FLAG_FATAL_ERROR;
+	PMD_DRV_LOG(INFO, "Port: %u Starting recovery...\n",
+		    bp->eth_dev->data->port_id);
 
 	rc = bnxt_check_fw_ready(bp);
 	if (rc)
@@ -4347,7 +4349,8 @@  static void bnxt_dev_recover(void *arg)
 	if (rc)
 		goto err_start;
 
-	PMD_DRV_LOG(INFO, "Recovered from FW reset\n");
+	PMD_DRV_LOG(INFO, "Port: %u Recovered from FW reset\n",
+		    bp->eth_dev->data->port_id);
 	pthread_mutex_unlock(&bp->err_recovery_lock);
 
 	return;
@@ -4372,6 +4375,8 @@  void bnxt_dev_reset_and_resume(void *arg)
 	int rc;
 
 	bnxt_dev_cleanup(bp);
+	PMD_DRV_LOG(INFO, "Port: %u Finished bnxt_dev_cleanup\n",
+		    bp->eth_dev->data->port_id);
 
 	bnxt_wait_for_device_shutdown(bp);