[v7] vhost: fix crash on port deletion
Commit Message
rte_vhost_driver_unregister() and vhost_user_read_cb()
can be called at the same time by two threads.
When the memory of vsocket is freed in rte_vhost_driver_unregister(),
invalid memory of vsocket is accessed in vhost_user_read_cb().
The bug exists in both modes, with vhost as server or as client.

E.g., when the vhost-user port is created as a server:
Thread1 calls rte_vhost_driver_unregister().
Before the listen fd is deleted from the poll waiting fds,
the "vhost-events" thread calls vhost_user_server_new_connection(),
and a new conn fd is added to the fdset when a connection arrives.
The "vhost-events" thread then calls vhost_user_read_cb() and
accesses invalid memory of vsocket while thread1 frees the memory of
vsocket.

E.g., when the vhost-user port is created as a client:
Thread1 calls rte_vhost_driver_unregister().
Before the vsocket of reconn is deleted from the reconn list,
the "vhost_reconn" thread calls vhost_user_add_connection(),
and a new conn fd is added to the fdset when trying to reconnect.
The "vhost-events" thread then calls vhost_user_read_cb() and
accesses invalid memory of vsocket while thread1 frees the memory of
vsocket.

The fix is to move the "fdset_try_del" before freeing the memory of
conn, which avoids the race condition.
The core trace is:
Program terminated with signal 11, Segmentation fault.
Fixes: 52d874dc6705 ("vhost: fix crash on closing in client mode")
Signed-off-by: Gaoxiang Liu <liugaoxiang@huawei.com>
---
v2:
* Fix coding style issues.
v3:
* Add detailed log.
v4:
* Add the reason when the vhost-user port is created as a server.
v5:
* Add detailed log when the vhost-user port is created as a client.
v6:
* Add 'path' check before deleting the listen fd.
* Fix spelling issues.
v7:
* Fix coding style issues.
---
lib/vhost/socket.c | 107 ++++++++++++++++++++++-----------------------
1 file changed, 53 insertions(+), 54 deletions(-)
Comments
Hi,
> -----Original Message-----
> From: Gaoxiang Liu <gaoxiangliu0@163.com>
> Sent: Thursday, September 2, 2021 11:46 PM
> To: maxime.coquelin@redhat.com; Xia, Chenbo <chenbo.xia@intel.com>
> Cc: dev@dpdk.org; liugaoxiang@huawei.com; Gaoxiang Liu <gaoxiangliu0@163.com>
> Subject: [PATCH v7] vhost: fix crash on port deletion
>
> [full commit message quoted above, snipped]
Please check my comment/reply on v6. And a suggestion:
wait until the comments/problems on the old version are resolved before sending a new version.
It saves everyone the effort of tracking all versions.
> -----Original Message-----
> From: Gaoxiang Liu <gaoxiangliu0@163.com>
> Sent: Thursday, September 2, 2021 11:46 PM
> To: maxime.coquelin@redhat.com; Xia, Chenbo <chenbo.xia@intel.com>
> Cc: dev@dpdk.org; liugaoxiang@huawei.com; Gaoxiang Liu <gaoxiangliu0@163.com>
> Subject: [PATCH v7] vhost: fix crash on port deletion
>
> [full commit message quoted above, snipped]
>
> Signed-off-by: Gaoxiang Liu <liugaoxiang@huawei.com>
@Maxime, I noticed the author and Signed-off-by tag are using different emails. You
may need to change the author email when applying.
For this patch:
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
On 9/2/21 5:45 PM, Gaoxiang Liu wrote:
> [full patch quoted above, snipped]
>
Applied to dpdk-next-virtio/main.
Thanks,
Maxime
@@ -1023,66 +1023,65 @@ rte_vhost_driver_unregister(const char *path)
for (i = 0; i < vhost_user.vsocket_cnt; i++) {
struct vhost_user_socket *vsocket = vhost_user.vsockets[i];
+ if (strcmp(vsocket->path, path))
+ continue;
- if (!strcmp(vsocket->path, path)) {
- pthread_mutex_lock(&vsocket->conn_mutex);
- for (conn = TAILQ_FIRST(&vsocket->conn_list);
- conn != NULL;
- conn = next) {
- next = TAILQ_NEXT(conn, next);
-
- /*
- * If r/wcb is executing, release vsocket's
- * conn_mutex and vhost_user's mutex locks, and
- * try again since the r/wcb may use the
- * conn_mutex and mutex locks.
- */
- if (fdset_try_del(&vhost_user.fdset,
- conn->connfd) == -1) {
- pthread_mutex_unlock(
- &vsocket->conn_mutex);
- pthread_mutex_unlock(&vhost_user.mutex);
- goto again;
- }
-
- VHOST_LOG_CONFIG(INFO,
- "free connfd = %d for device '%s'\n",
- conn->connfd, path);
- close(conn->connfd);
- vhost_destroy_device(conn->vid);
- TAILQ_REMOVE(&vsocket->conn_list, conn, next);
- free(conn);
- }
- pthread_mutex_unlock(&vsocket->conn_mutex);
-
- if (vsocket->is_server) {
- /*
- * If r/wcb is executing, release vhost_user's
- * mutex lock, and try again since the r/wcb
- * may use the mutex lock.
- */
- if (fdset_try_del(&vhost_user.fdset,
- vsocket->socket_fd) == -1) {
- pthread_mutex_unlock(&vhost_user.mutex);
- goto again;
- }
-
- close(vsocket->socket_fd);
- unlink(path);
- } else if (vsocket->reconnect) {
- vhost_user_remove_reconnect(vsocket);
+ if (vsocket->is_server) {
+ /*
+ * If r/wcb is executing, release vhost_user's
+ * mutex lock, and try again since the r/wcb
+ * may use the mutex lock.
+ */
+ if (fdset_try_del(&vhost_user.fdset, vsocket->socket_fd) == -1) {
+ pthread_mutex_unlock(&vhost_user.mutex);
+ goto again;
}
+ } else if (vsocket->reconnect) {
+ vhost_user_remove_reconnect(vsocket);
+ }
- pthread_mutex_destroy(&vsocket->conn_mutex);
- vhost_user_socket_mem_free(vsocket);
+ pthread_mutex_lock(&vsocket->conn_mutex);
+ for (conn = TAILQ_FIRST(&vsocket->conn_list);
+ conn != NULL;
+ conn = next) {
+ next = TAILQ_NEXT(conn, next);
- count = --vhost_user.vsocket_cnt;
- vhost_user.vsockets[i] = vhost_user.vsockets[count];
- vhost_user.vsockets[count] = NULL;
- pthread_mutex_unlock(&vhost_user.mutex);
+ /*
+ * If r/wcb is executing, release vsocket's
+ * conn_mutex and vhost_user's mutex locks, and
+ * try again since the r/wcb may use the
+ * conn_mutex and mutex locks.
+ */
+ if (fdset_try_del(&vhost_user.fdset,
+ conn->connfd) == -1) {
+ pthread_mutex_unlock(&vsocket->conn_mutex);
+ pthread_mutex_unlock(&vhost_user.mutex);
+ goto again;
+ }
- return 0;
+ VHOST_LOG_CONFIG(INFO,
+ "free connfd = %d for device '%s'\n",
+ conn->connfd, path);
+ close(conn->connfd);
+ vhost_destroy_device(conn->vid);
+ TAILQ_REMOVE(&vsocket->conn_list, conn, next);
+ free(conn);
+ }
+ pthread_mutex_unlock(&vsocket->conn_mutex);
+
+ if (vsocket->is_server) {
+ close(vsocket->socket_fd);
+ unlink(path);
}
+
+ pthread_mutex_destroy(&vsocket->conn_mutex);
+ vhost_user_socket_mem_free(vsocket);
+
+ count = --vhost_user.vsocket_cnt;
+ vhost_user.vsockets[i] = vhost_user.vsockets[count];
+ vhost_user.vsockets[count] = NULL;
+ pthread_mutex_unlock(&vhost_user.mutex);
+ return 0;
}
pthread_mutex_unlock(&vhost_user.mutex);