[v2,2/7] vdpa/mlx5: fix dead loop when process interrupted

Message ID 20220224155101.1991626-3-xuemingl@nvidia.com (mailing list archive)
State Superseded, archived
Delegated to: Maxime Coquelin
Series vdpa/mlx5: improve device shutdown time

Checks

Context        Check    Description
ci/checkpatch  success  coding style OK

Commit Message

Xueming Li Feb. 24, 2022, 3:50 p.m. UTC
When the process is interrupted with Ctrl+C, the kick handling thread
sometimes gets an endless EAGAIN error and falls into a dead loop.

Kicks happen frequently in a real system due to busy traffic or the
retry mechanism. This patch simplifies the handler: it kicks firmware
anyway and skips setting the hardware notifier because of a potential
device error. The notifier can be set in the next successful kick
request.

Fixes: 62c813706e41 ("vdpa/mlx5: map doorbell")
Cc: stable@dpdk.org

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
---
 drivers/vdpa/mlx5/mlx5_vdpa_virtq.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)
  

Comments

Maxime Coquelin April 20, 2022, 10:33 a.m. UTC | #1
On 2/24/22 16:50, Xueming Li wrote:
> When the process is interrupted with Ctrl+C, the kick handling thread
> sometimes gets an endless EAGAIN error and falls into a dead loop.
> 
> Kicks happen frequently in a real system due to busy traffic or the
> retry mechanism. This patch simplifies the handler: it kicks firmware
> anyway and skips setting the hardware notifier because of a potential
> device error. The notifier can be set in the next successful kick
> request.
> 
> Fixes: 62c813706e41 ("vdpa/mlx5: map doorbell")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Xueming Li <xuemingl@nvidia.com>
> ---
>   drivers/vdpa/mlx5/mlx5_vdpa_virtq.c | 8 +++++---
>   1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/vdpa/mlx5/mlx5_vdpa_virtq.c b/drivers/vdpa/mlx5/mlx5_vdpa_virtq.c
> index de324506cb9..e1e05924a40 100644
> --- a/drivers/vdpa/mlx5/mlx5_vdpa_virtq.c
> +++ b/drivers/vdpa/mlx5/mlx5_vdpa_virtq.c
> @@ -23,11 +23,11 @@ mlx5_vdpa_virtq_kick_handler(void *cb_arg)
>   	struct mlx5_vdpa_priv *priv = virtq->priv;
>   	uint64_t buf;
>   	int nbytes;
> +	int retry;
>   
>   	if (rte_intr_fd_get(virtq->intr_handle) < 0)
>   		return;
> -
> -	do {
> +	for (retry = 0; retry < 3; ++retry) {
>   		nbytes = read(rte_intr_fd_get(virtq->intr_handle), &buf,
>   			      8);
>   		if (nbytes < 0) {
> @@ -39,7 +39,9 @@ mlx5_vdpa_virtq_kick_handler(void *cb_arg)
>   				virtq->index, strerror(errno));
>   		}
>   		break;
> -	} while (1);
> +	}
> +	if (nbytes < 0)
> +		return;
>   	rte_write32(virtq->index, priv->virtq_db_addr);
>   	if (virtq->notifier_state == MLX5_VDPA_NOTIFIER_STATE_DISABLED) {
>   		if (rte_vhost_host_notifier_ctrl(priv->vid, virtq->index, true))

Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>

Thanks,
Maxime
  

Patch

diff --git a/drivers/vdpa/mlx5/mlx5_vdpa_virtq.c b/drivers/vdpa/mlx5/mlx5_vdpa_virtq.c
index de324506cb9..e1e05924a40 100644
--- a/drivers/vdpa/mlx5/mlx5_vdpa_virtq.c
+++ b/drivers/vdpa/mlx5/mlx5_vdpa_virtq.c
@@ -23,11 +23,11 @@  mlx5_vdpa_virtq_kick_handler(void *cb_arg)
 	struct mlx5_vdpa_priv *priv = virtq->priv;
 	uint64_t buf;
 	int nbytes;
+	int retry;
 
 	if (rte_intr_fd_get(virtq->intr_handle) < 0)
 		return;
-
-	do {
+	for (retry = 0; retry < 3; ++retry) {
 		nbytes = read(rte_intr_fd_get(virtq->intr_handle), &buf,
 			      8);
 		if (nbytes < 0) {
@@ -39,7 +39,9 @@  mlx5_vdpa_virtq_kick_handler(void *cb_arg)
 				virtq->index, strerror(errno));
 		}
 		break;
-	} while (1);
+	}
+	if (nbytes < 0)
+		return;
 	rte_write32(virtq->index, priv->virtq_db_addr);
 	if (virtq->notifier_state == MLX5_VDPA_NOTIFIER_STATE_DISABLED) {
 		if (rte_vhost_host_notifier_ctrl(priv->vid, virtq->index, true))