vhost: cleanup vq resubmit info when set_inflight_fd

Message ID 20240321095805.923117-1-haoqian.he@smartx.com (mailing list archive)
State New
Delegated to: Maxime Coquelin
Headers
Series vhost: cleanup vq resubmit info when set_inflight_fd |

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/loongarch-compilation success Compilation OK
ci/loongarch-unit-testing success Unit Testing PASS
ci/Intel-compilation success Compilation OK
ci/intel-Testing success Testing PASS
ci/github-robot: build success github build: passed
ci/intel-Functional success Functional PASS
ci/iol-mellanox-Performance success Performance Testing PASS
ci/iol-intel-Performance success Performance Testing PASS
ci/iol-abi-testing success Testing PASS
ci/iol-intel-Functional success Functional Testing PASS
ci/iol-compile-amd64-testing success Testing PASS
ci/iol-unit-arm64-testing success Testing PASS
ci/iol-unit-amd64-testing fail Testing issues
ci/iol-compile-arm64-testing success Testing PASS
ci/iol-sample-apps-testing success Testing PASS
ci/iol-broadcom-Performance success Performance Testing PASS
ci/iol-broadcom-Functional success Functional Testing PASS

Commit Message

Haoqian He March 21, 2024, 9:57 a.m. UTC
  We should cleanup vq resubmit info when set_inflight_fd
before set_vring_kick which will check if there is any
inflight io waiting for resubmission.

Otherwise, when the vm is rebooting immediately after
reconnecting to the vhost target (inflight io has not
been resubmitted yet), the vhost backend still use the
old resubmit info set when reconnection.

Signed-off-by: Haoqian He <haoqian.he@smartx.com>
---
 lib/vhost/vhost_user.c | 1 +
 1 file changed, 1 insertion(+)
  

Comments

Haoqian He April 3, 2024, 5:08 a.m. UTC | #1
> 2024年3月21日 17:57,Haoqian He <haoqian.he@smartx.com> 写道:
> 
> We should cleanup vq resubmit info when set_inflight_fd
> before set_vring_kick which will check if there is any
> inflight io waiting for resubmission.
> 
> Otherwise, when the vm is rebooting immediately after
> reconnecting to the vhost target (inflight io has not
> been resubmitted yet), the vhost backend still use the
> old resubmit info set when reconnection.
> 
> Signed-off-by: Haoqian He <haoqian.he@smartx.com>
> ---
> lib/vhost/vhost_user.c | 1 +
> 1 file changed, 1 insertion(+)
> 
> diff --git a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c
> index 414192500e..7c54afc5fb 100644
> --- a/lib/vhost/vhost_user.c
> +++ b/lib/vhost/vhost_user.c
> @@ -1871,6 +1871,7 @@ vhost_user_set_inflight_fd(struct virtio_net **pdev,
> 		if (!vq)
> 			continue;
> 
> +		cleanup_vq_inflight(dev, vq);
> 		if (vq_is_packed(dev)) {
> 			vq->inflight_packed = addr;
> 			vq->inflight_packed->desc_num = queue_size;
> -- 
> 2.41.0
> 

Ping.

Hi, This issue can be reproduced by restarting vm internally and the vhost live recovery continuously.

Thanks,
Haoqian
  
Haoqian He April 12, 2024, 8:10 a.m. UTC | #2
> 2024年3月21日 17:57,Haoqian He <haoqian.he@smartx.com> 写道:
> 
> We should cleanup vq resubmit info when set_inflight_fd
> before set_vring_kick which will check if there is any
> inflight io waiting for resubmission.
> 
> Otherwise, when the vm is rebooting immediately after
> reconnecting to the vhost target (inflight io has not
> been resubmitted yet), the vhost backend still use the
> old resubmit info set when reconnection.
> 
> Signed-off-by: Haoqian He <haoqian.he@smartx.com>
> ---
> lib/vhost/vhost_user.c | 1 +
> 1 file changed, 1 insertion(+)
> 
> diff --git a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c
> index 414192500e..7c54afc5fb 100644
> --- a/lib/vhost/vhost_user.c
> +++ b/lib/vhost/vhost_user.c
> @@ -1871,6 +1871,7 @@ vhost_user_set_inflight_fd(struct virtio_net **pdev,
> 		if (!vq)
> 			continue;
> 
> +		cleanup_vq_inflight(dev, vq);
> 		if (vq_is_packed(dev)) {
> 			vq->inflight_packed = addr;
> 			vq->inflight_packed->desc_num = queue_size;
> -- 
> 2.41.0
> 

Ping.

Hi, Maxime.

This patch fix the potential error when VM reboot after vhost live recovery which
could lead to the VM hang as missing resubmit info cleanup.

If inflight io that should be resubmitted during the latest vhost reconnection has
not been submitted yet, so GET_VRING_BASE would not wait these inflight io,
at this time the resubmit info has been set and restart the VM immediately.

Currently, we do not cleanup the resubmit info before VM restart, so when VM
restarts, SET_VRING_KICK will resubmit these inflight io (If resubmit info is not
null, function set_vring_kick will return without updating resubmit info).

It’s an error, any stale inflight io should not be resubmitted after the VM restart.

Thanks,
Haoqian
  
Maxime Coquelin April 25, 2024, 1:12 p.m. UTC | #3
Hi Haoqian,

We try to avoid passing functions or variable names in the commit title.
Maybe something like this would work:
"vhost: cleanup resubmit info before inflight setup"

On 3/21/24 10:57, Haoqian He wrote:
> We should cleanup vq resubmit info when set_inflight_fd

virtqueue*

> before set_vring_kick which will check if there is any
> inflight io waiting for resubmission.

IO

> 
> Otherwise, when the vm is rebooting immediately after

VM

> reconnecting to the vhost target (inflight io has not
> been resubmitted yet), the vhost backend still use the

uses

> old resubmit info set when reconnection.

reconnecting

> 

You need to add "Fixes:" tag pointing to the commit introducing the 
issue, so that it is backported in stable releases.

This is documented in the contribution guidelines if you have any doubt
on the formatting.

> Signed-off-by: Haoqian He <haoqian.he@smartx.com>
> ---
>   lib/vhost/vhost_user.c | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c
> index 414192500e..7c54afc5fb 100644
> --- a/lib/vhost/vhost_user.c
> +++ b/lib/vhost/vhost_user.c
> @@ -1871,6 +1871,7 @@ vhost_user_set_inflight_fd(struct virtio_net **pdev,
>   		if (!vq)
>   			continue;
>   
> +		cleanup_vq_inflight(dev, vq);
>   		if (vq_is_packed(dev)) {
>   			vq->inflight_packed = addr;
>   			vq->inflight_packed->desc_num = queue_size;
  

Patch

diff --git a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c
index 414192500e..7c54afc5fb 100644
--- a/lib/vhost/vhost_user.c
+++ b/lib/vhost/vhost_user.c
@@ -1871,6 +1871,7 @@  vhost_user_set_inflight_fd(struct virtio_net **pdev,
 		if (!vq)
 			continue;
 
+		cleanup_vq_inflight(dev, vq);
 		if (vq_is_packed(dev)) {
 			vq->inflight_packed = addr;
 			vq->inflight_packed->desc_num = queue_size;