[v5] vhost: fix crash on port deletion

Message ID 20210820154615.551-1-gaoxiangliu0@163.com (mailing list archive)
State Superseded, archived
Delegated to: Maxime Coquelin
Headers
Series [v5] vhost: fix crash on port deletion |

Checks

Context Check Description
ci/checkpatch warning coding style issues
ci/github-robot: build success github build: passed
ci/iol-broadcom-Performance success Performance Testing PASS
ci/iol-broadcom-Functional success Functional Testing PASS
ci/iol-intel-Functional success Functional Testing PASS
ci/iol-intel-Performance success Performance Testing PASS
ci/iol-aarch64-compile-testing success Testing PASS
ci/iol-x86_64-compile-testing success Testing PASS
ci/iol-x86_64-unit-testing fail Testing issues
ci/iol-mellanox-Performance success Performance Testing PASS
ci/Intel-compilation success Compilation OK
ci/intel-Testing success Testing PASS

Commit Message

Gaoxiang Liu Aug. 20, 2021, 3:46 p.m. UTC
  The rte_vhost_driver_unregister() and vhost_user_read_cb()
can be called at the same time by 2 threads.
when memory of vsocket is freed in rte_vhost_driver_unregister(),
the invalid memory of vsocket is accessd in vhost_user_read_cb().
It's a bug of both mode for vhost as server or client.

Eg vhostuser port is created as server.
Thread1 calls rte_vhost_driver_unregister().
Before the listen fd is deleted from poll waiting fds,
"vhost-events" thread then calls vhost_user_server_new_connection(),
then a new conn fd is added in fdset when trying to reconnect.
"vhost-events" thread then calls vhost_user_read_cb() and
access invalid memory of socket while thread1 frees the memory of
vsocket.

Eg vhostuser port is created as client.
Thread1 calls rte_vhost_driver_unregister().
Before vsocket of reconn is deleted from reconn list,
"vhost_reconn" thread then calls vhost_user_add_connection()
then a new conn fd is added in fdset when trying to reconnect.
"vhost-events" thread then calls vhost_user_read_cb() and
access invalid memory of socket while thread1 frees the memory of
vsocket.

The fix is to move the "fdset_try_del" in front of free memory of conn,
then avoid the race condition.

The core trace is:
Program terminated with signal 11, Segmentation fault.

Fixes: 52d874dc6705 ("vhost: fix crash on closing in client mode")

Signed-off-by: Gaoxiang Liu <liugaoxiang@huawei.com>

v2:
* Fix coding style issues.

v3:
* Add detailed log.

v4:
* Add the reason when vhostuser port is created as server.

v5:
* Add detailed log when vhostuser port is created as client.
---
 lib/vhost/socket.c | 27 ++++++++++++++-------------
 1 file changed, 14 insertions(+), 13 deletions(-)
  

Comments

Chenbo Xia Aug. 26, 2021, 8:37 a.m. UTC | #1
Hi Gaoxiang,

> -----Original Message-----
> From: Gaoxiang Liu <gaoxiangliu0@163.com>
> Sent: Friday, August 20, 2021 11:46 PM
> To: maxime.coquelin@redhat.com; Xia, Chenbo <chenbo.xia@intel.com>
> Cc: dev@dpdk.org; liugaoxiang@huawei.com; Gaoxiang Liu <gaoxiangliu0@163.com>
> Subject: [PATCH v5] vhost: fix crash on port deletion
> 
> The rte_vhost_driver_unregister() and vhost_user_read_cb()
> can be called at the same time by 2 threads.
> when memory of vsocket is freed in rte_vhost_driver_unregister(),
> the invalid memory of vsocket is accessd in vhost_user_read_cb().
> It's a bug of both mode for vhost as server or client.
> 
> Eg vhostuser port is created as server.
> Thread1 calls rte_vhost_driver_unregister().
> Before the listen fd is deleted from poll waiting fds,
> "vhost-events" thread then calls vhost_user_server_new_connection(),
> then a new conn fd is added in fdset when trying to reconnect.
> "vhost-events" thread then calls vhost_user_read_cb() and
> access invalid memory of socket while thread1 frees the memory of
> vsocket.
> 
> Eg vhostuser port is created as client.
> Thread1 calls rte_vhost_driver_unregister().
> Before vsocket of reconn is deleted from reconn list,
> "vhost_reconn" thread then calls vhost_user_add_connection()
> then a new conn fd is added in fdset when trying to reconnect.
> "vhost-events" thread then calls vhost_user_read_cb() and
> access invalid memory of socket while thread1 frees the memory of
> vsocket.
> 
> The fix is to move the "fdset_try_del" in front of free memory of conn,
> then avoid the race condition.
> 
> The core trace is:
> Program terminated with signal 11, Segmentation fault.

Thanks for the detailed commit log. It looks better now and easier to understand.

And please make the some words more formal. Like 'Eg' -> 'E.g.,'

> 
> Fixes: 52d874dc6705 ("vhost: fix crash on closing in client mode")
> 
> Signed-off-by: Gaoxiang Liu <liugaoxiang@huawei.com>

Please add a new line with " --- " (as I showed you in v4). It should look like:


Signed-off-by: Gaoxiang Liu <liugaoxiang@huawei.com>
---

v2:
* Fix coding style issues.

If you don’t do this, the description will be shown in the commit log, which is
not wanted.

> 
> v2:
> * Fix coding style issues.
> 
> v3:
> * Add detailed log.
> 
> v4:
> * Add the reason when vhostuser port is created as server.
> 
> v5:
> * Add detailed log when vhostuser port is created as client.
> ---
>  lib/vhost/socket.c | 27 ++++++++++++++-------------
>  1 file changed, 14 insertions(+), 13 deletions(-)
> 
> diff --git a/lib/vhost/socket.c b/lib/vhost/socket.c
> index 5d0d728d5..2eb8fcadd 100644
> --- a/lib/vhost/socket.c
> +++ b/lib/vhost/socket.c
> @@ -1024,6 +1024,20 @@ rte_vhost_driver_unregister(const char *path)
>  	for (i = 0; i < vhost_user.vsocket_cnt; i++) {
>  		struct vhost_user_socket *vsocket = vhost_user.vsockets[i];
> 
> +		if (vsocket->is_server) {
> +			/*
> +			 * If r/wcb is executing, release vhost_user's
> +			 * mutex lock, and try again since the r/wcb
> +			 * may use the mutex lock.
> +			 */
> +			if (fdset_try_del(&vhost_user.fdset, vsocket->socket_fd) ==
> -1) {
> +				pthread_mutex_unlock(&vhost_user.mutex);
> +				goto again;
> +			}
> +		} else if (vsocket->reconnect) {
> +			vhost_user_remove_reconnect(vsocket);
> +		}
> +
>  		if (!strcmp(vsocket->path, path)) {

You should first check the param 'path' before doing anything, so please move the
strcmp too.

>  			pthread_mutex_lock(&vsocket->conn_mutex);
>  			for (conn = TAILQ_FIRST(&vsocket->conn_list);
> @@ -1056,21 +1070,8 @@ rte_vhost_driver_unregister(const char *path)
>  			pthread_mutex_unlock(&vsocket->conn_mutex);
> 
>  			if (vsocket->is_server) {
> -				/*
> -				 * If r/wcb is executing, release vhost_user's
> -				 * mutex lock, and try again since the r/wcb
> -				 * may use the mutex lock.
> -				 */
> -				if (fdset_try_del(&vhost_user.fdset,
> -						vsocket->socket_fd) == -1) {
> -					pthread_mutex_unlock(&vhost_user.mutex);
> -					goto again;
> -				}
> -
>  				close(vsocket->socket_fd);
>  				unlink(path);

Any reason why we don't move this?

Thanks,
Chenbo

> -			} else if (vsocket->reconnect) {
> -				vhost_user_remove_reconnect(vsocket);
>  			}
> 
>  			pthread_mutex_destroy(&vsocket->conn_mutex);
> --
> 2.32.0
>
  

Patch

diff --git a/lib/vhost/socket.c b/lib/vhost/socket.c
index 5d0d728d5..2eb8fcadd 100644
--- a/lib/vhost/socket.c
+++ b/lib/vhost/socket.c
@@ -1024,6 +1024,20 @@  rte_vhost_driver_unregister(const char *path)
 	for (i = 0; i < vhost_user.vsocket_cnt; i++) {
 		struct vhost_user_socket *vsocket = vhost_user.vsockets[i];
 
+		if (vsocket->is_server) {
+			/*
+			 * If r/wcb is executing, release vhost_user's
+			 * mutex lock, and try again since the r/wcb
+			 * may use the mutex lock.
+			 */
+			if (fdset_try_del(&vhost_user.fdset, vsocket->socket_fd) == -1) {
+				pthread_mutex_unlock(&vhost_user.mutex);
+				goto again;
+			}
+		} else if (vsocket->reconnect) {
+			vhost_user_remove_reconnect(vsocket);
+		}
+
 		if (!strcmp(vsocket->path, path)) {
 			pthread_mutex_lock(&vsocket->conn_mutex);
 			for (conn = TAILQ_FIRST(&vsocket->conn_list);
@@ -1056,21 +1070,8 @@  rte_vhost_driver_unregister(const char *path)
 			pthread_mutex_unlock(&vsocket->conn_mutex);
 
 			if (vsocket->is_server) {
-				/*
-				 * If r/wcb is executing, release vhost_user's
-				 * mutex lock, and try again since the r/wcb
-				 * may use the mutex lock.
-				 */
-				if (fdset_try_del(&vhost_user.fdset,
-						vsocket->socket_fd) == -1) {
-					pthread_mutex_unlock(&vhost_user.mutex);
-					goto again;
-				}
-
 				close(vsocket->socket_fd);
 				unlink(path);
-			} else if (vsocket->reconnect) {
-				vhost_user_remove_reconnect(vsocket);
 			}
 
 			pthread_mutex_destroy(&vsocket->conn_mutex);