vdpa/mlx5: fix unregister kick handler order

Message ID 20230808113221.227319-1-yajunw@nvidia.com (mailing list archive)
State Accepted, archived
Delegated to: Maxime Coquelin
Series vdpa/mlx5: fix unregister kick handler order

Checks

Context                          Check    Description
ci/checkpatch                    success  coding style OK
ci/loongarch-compilation         success  Compilation OK
ci/loongarch-unit-testing        success  Unit Testing PASS
ci/Intel-compilation             success  Compilation OK
ci/intel-Testing                 success  Testing PASS
ci/github-robot: build           success  github build: passed
ci/intel-Functional              success  Functional PASS
ci/iol-x86_64-compile-testing    success  Testing PASS
ci/iol-testing                   success  Testing PASS
ci/iol-x86_64-unit-testing       success  Testing PASS
ci/iol-unit-testing              success  Testing PASS
ci/iol-aarch64-compile-testing   success  Testing PASS
ci/iol-aarch-unit-testing        success  Testing PASS
ci/iol-broadcom-Performance      success  Performance Testing PASS
ci/iol-broadcom-Functional       success  Functional Testing PASS
ci/iol-intel-Functional          success  Functional Testing PASS
ci/iol-intel-Performance         success  Performance Testing PASS

Commit Message

Yajun Wu Aug. 8, 2023, 11:32 a.m. UTC
The mlx5_vdpa_virtq_kick_handler function may still be running and waiting
on virtq->virtq_lock while the mlx5_vdpa_cqe_event_unset function is trying
to re-initialize virtq->virtq_lock.

As a result, the mlx5_vdpa_virtq_kick_handler thread cannot be woken up and
its interrupt handler cannot be unregistered. The following message may then
be printed in an endless loop when calling
rte_vhost_driver_unregister(socket_path):

    mlx5_vdpa: Try again to unregister fd 154 of virtq 11 interrupt
    mlx5_vdpa: Try again to unregister fd 154 of virtq 11 interrupt
    ...

The fix is to unregister the virtq interrupt handlers
(mlx5_vdpa_virtq_unreg_intr_handle_all) before calling
mlx5_vdpa_cqe_event_unset.

Fixes: 057f7d2084 ("vdpa/mlx5: optimize datapath-control synchronization")
Cc: stable@dpdk.org

Signed-off-by: Yajun Wu <yajunw@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
---
 drivers/vdpa/mlx5/mlx5_vdpa.c         | 1 +
 drivers/vdpa/mlx5/mlx5_vdpa_cthread.c | 1 -
 2 files changed, 1 insertion(+), 1 deletion(-)
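
The ordering described in the commit message can be illustrated with a
minimal pthread sketch. This is not the driver code: struct demo_virtq and
all demo_* functions are hypothetical stand-ins, and a plain thread is
assumed in place of the DPDK interrupt thread. It only shows why the handler
has to be quiesced before the lock it takes is re-initialized.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a virtq: a lock and its kick handler thread. */
struct demo_virtq {
	pthread_mutex_t lock;
	atomic_bool handler_registered;
	pthread_t handler_thread;
	int kicks_handled; /* protected by lock */
};

/* Stand-in for the kick handler: it repeatedly takes the virtq lock. */
static void *
demo_kick_handler(void *arg)
{
	struct demo_virtq *vq = arg;

	while (atomic_load(&vq->handler_registered)) {
		pthread_mutex_lock(&vq->lock);
		vq->kicks_handled++;
		pthread_mutex_unlock(&vq->lock);
		sched_yield();
	}
	return NULL;
}

/* Initialize the lock and start the handler thread. Returns 0 on success. */
static int
demo_register_handler(struct demo_virtq *vq)
{
	int ret = pthread_mutex_init(&vq->lock, NULL);

	if (ret != 0)
		return ret;
	vq->kicks_handled = 0;
	atomic_init(&vq->handler_registered, true);
	ret = pthread_create(&vq->handler_thread, NULL, demo_kick_handler, vq);
	if (ret != 0) {
		atomic_store(&vq->handler_registered, false);
		pthread_mutex_destroy(&vq->lock);
	}
	return ret;
}

/* Step 1: stop the handler and wait until it has fully returned. */
static void
demo_unregister_handler(struct demo_virtq *vq)
{
	atomic_store(&vq->handler_registered, false);
	pthread_join(vq->handler_thread, NULL);
}

/* Step 2: re-initialize the lock. Safe only when no thread can hold or
 * wait on it, which step 1 guarantees in this sketch. */
static int
demo_lock_reinit(struct demo_virtq *vq)
{
	pthread_mutex_destroy(&vq->lock);
	return pthread_mutex_init(&vq->lock, NULL);
}

/* Teardown in the order the patch establishes: unregister, then re-init. */
static int
demo_dev_close(struct demo_virtq *vq)
{
	demo_unregister_handler(vq);
	return demo_lock_reinit(vq);
}
```

Swapping the two calls in demo_dev_close would re-initialize a mutex that
the handler may be holding or blocked on, which POSIX leaves undefined; that
is the class of problem the patch avoids by reordering the teardown.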
  

Comments

Maxime Coquelin Oct. 12, 2023, 1:44 p.m. UTC | #1
On 8/8/23 13:32, Yajun Wu wrote:
> The mlx5_vdpa_virtq_kick_handler function may still be running and waiting
> on virtq->virtq_lock while mlx5_vdpa_cqe_event_unset function is trying to
> re-initialize the virtq->virtq_lock.
> 
> This causes mlx5_vdpa_virtq_kick_handler thread can't be wake up and can't
> be unregister. Following print may loop forever when calling
> rte_vhost_driver_unregister(socket_path):
> 
>      mlx5_vdpa: Try again to unregister fd 154 of virtq 11 interrupt
>      mlx5_vdpa: Try again to unregister fd 154 of virtq 11 interrupt
>      ...
> 
> The fix is to move mlx5_vdpa_virtq_unregister_intr_handle before
> mlx5_vdpa_cqe_event_unset.
> 
> Fixes: 057f7d2084 ("vdpa/mlx5: optimize datapath-control synchronization")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Yajun Wu <yajunw@nvidia.com>
> Acked-by: Matan Azrad <matan@nvidia.com>
> ---
>   drivers/vdpa/mlx5/mlx5_vdpa.c         | 1 +
>   drivers/vdpa/mlx5/mlx5_vdpa_cthread.c | 1 -
>   2 files changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.c b/drivers/vdpa/mlx5/mlx5_vdpa.c
> index f1737f82a8..8b1de8bd62 100644
> --- a/drivers/vdpa/mlx5/mlx5_vdpa.c
> +++ b/drivers/vdpa/mlx5/mlx5_vdpa.c
> @@ -282,6 +282,7 @@ _internal_mlx5_vdpa_dev_close(struct mlx5_vdpa_priv *priv,
>   	int ret = 0;
>   	int vid = priv->vid;
>   
> +	mlx5_vdpa_virtq_unreg_intr_handle_all(priv);
>   	mlx5_vdpa_cqe_event_unset(priv);
>   	if (priv->state == MLX5_VDPA_STATE_CONFIGURED) {
>   		ret |= mlx5_vdpa_lm_log(priv);
> diff --git a/drivers/vdpa/mlx5/mlx5_vdpa_cthread.c b/drivers/vdpa/mlx5/mlx5_vdpa_cthread.c
> index 6e6624e5a3..1d84e422d4 100644
> --- a/drivers/vdpa/mlx5/mlx5_vdpa_cthread.c
> +++ b/drivers/vdpa/mlx5/mlx5_vdpa_cthread.c
> @@ -190,7 +190,6 @@ mlx5_vdpa_c_thread_handle(void *arg)
>   			pthread_mutex_unlock(&virtq->virtq_lock);
>   			break;
>   		case MLX5_VDPA_TASK_DEV_CLOSE_NOWAIT:
> -			mlx5_vdpa_virtq_unreg_intr_handle_all(priv);
>   			pthread_mutex_lock(&priv->steer_update_lock);
>   			mlx5_vdpa_steer_unset(priv);
>   			pthread_mutex_unlock(&priv->steer_update_lock);

Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>

Thanks,
Maxime
  
Maxime Coquelin Oct. 12, 2023, 1:50 p.m. UTC | #2
On 8/8/23 13:32, Yajun Wu wrote:
> The mlx5_vdpa_virtq_kick_handler function may still be running and waiting
> on virtq->virtq_lock while mlx5_vdpa_cqe_event_unset function is trying to
> re-initialize the virtq->virtq_lock.
> 
> This causes mlx5_vdpa_virtq_kick_handler thread can't be wake up and can't
> be unregister. Following print may loop forever when calling
> rte_vhost_driver_unregister(socket_path):
> 
>      mlx5_vdpa: Try again to unregister fd 154 of virtq 11 interrupt
>      mlx5_vdpa: Try again to unregister fd 154 of virtq 11 interrupt
>      ...
> 
> The fix is to move mlx5_vdpa_virtq_unregister_intr_handle before
> mlx5_vdpa_cqe_event_unset.
> 
> Fixes: 057f7d2084 ("vdpa/mlx5: optimize datapath-control synchronization")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Yajun Wu <yajunw@nvidia.com>
> Acked-by: Matan Azrad <matan@nvidia.com>
> ---
>   drivers/vdpa/mlx5/mlx5_vdpa.c         | 1 +
>   drivers/vdpa/mlx5/mlx5_vdpa_cthread.c | 1 -
>   2 files changed, 1 insertion(+), 1 deletion(-)


Applied to next-virtio/for-next-net.

Thanks,
Maxime
  

Patch

diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.c b/drivers/vdpa/mlx5/mlx5_vdpa.c
index f1737f82a8..8b1de8bd62 100644
--- a/drivers/vdpa/mlx5/mlx5_vdpa.c
+++ b/drivers/vdpa/mlx5/mlx5_vdpa.c
@@ -282,6 +282,7 @@  _internal_mlx5_vdpa_dev_close(struct mlx5_vdpa_priv *priv,
 	int ret = 0;
 	int vid = priv->vid;
 
+	mlx5_vdpa_virtq_unreg_intr_handle_all(priv);
 	mlx5_vdpa_cqe_event_unset(priv);
 	if (priv->state == MLX5_VDPA_STATE_CONFIGURED) {
 		ret |= mlx5_vdpa_lm_log(priv);
diff --git a/drivers/vdpa/mlx5/mlx5_vdpa_cthread.c b/drivers/vdpa/mlx5/mlx5_vdpa_cthread.c
index 6e6624e5a3..1d84e422d4 100644
--- a/drivers/vdpa/mlx5/mlx5_vdpa_cthread.c
+++ b/drivers/vdpa/mlx5/mlx5_vdpa_cthread.c
@@ -190,7 +190,6 @@  mlx5_vdpa_c_thread_handle(void *arg)
 			pthread_mutex_unlock(&virtq->virtq_lock);
 			break;
 		case MLX5_VDPA_TASK_DEV_CLOSE_NOWAIT:
-			mlx5_vdpa_virtq_unreg_intr_handle_all(priv);
 			pthread_mutex_lock(&priv->steer_update_lock);
 			mlx5_vdpa_steer_unset(priv);
 			pthread_mutex_unlock(&priv->steer_update_lock);