vhost: fix use-after-free race during cleanup

Message ID 20251104080931.8102-1-shperetz@nvidia.com (mailing list archive)
State Changes Requested
Delegated to: Maxime Coquelin
Headers
Series vhost: fix use-after-free race during cleanup |

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/loongarch-compilation success Compilation OK
ci/loongarch-unit-testing success Unit Testing PASS
ci/iol-mellanox-Functional success Functional Testing PASS
ci/iol-marvell-Functional success Functional Testing PASS
ci/iol-mellanox-Performance success Performance Testing PASS
ci/Intel-compilation success Compilation OK
ci/iol-intel-Performance success Performance Testing PASS
ci/intel-Testing success Testing PASS
ci/intel-Functional success Functional PASS
ci/iol-sample-apps-testing success Testing PASS
ci/github-robot: build success github build: passed
ci/github-robot-post success github post: success
ci/iol-compile-amd64-testing success Testing PASS
ci/iol-unit-amd64-testing success Testing PASS
ci/iol-intel-Functional success Functional Testing PASS
ci/iol-broadcom-Performance success Performance Testing PASS
ci/iol-compile-arm64-testing success Testing PASS
ci/iol-unit-arm64-testing success Testing PASS
ci/aws-unit-testing success Unit Testing PASS

Commit Message

Shani Peretz Nov. 4, 2025, 8:09 a.m. UTC
This commit fixes a use-after-free that causes the application
to crash on shutdown (detected by ASAN).

The vhost library uses a background event dispatch thread that monitors
fds with epoll. It runs in an infinite loop, waiting for I/O events
and calling callbacks when they occur.

During cleanup, a race condition existed:

  Main Thread:                    Event Dispatch Thread:
  1. Remove fds from fdset        while (1) {
  2. Close file descriptors           epoll_wait() [gets interrupted]
  3. Free fdset memory                [continues loop]
  4. Continue...                      Accesses fdset...   CRASH
                                  }

The main thread would free the fdset memory while the background thread
was still running and using it.

The code had a `destroy` flag that the event dispatch thread checked,
but it was never set during cleanup, and the code never waited for
the thread to actually exit before freeing memory.

This commit implements `fdset_destroy()` that will set the destroy
flag, wait for thread termination, and clean up all resources.
The socket.c is updated to call fdset_destroy() when the last vhost-user
socket is unregistered.

Fixes: 0e38b42bf61c ("vhost: manage FD with epoll")
Cc: stable@dpdk.org

Signed-off-by: Shani Peretz <shperetz@nvidia.com>
---
 lib/vhost/fd_man.c | 38 ++++++++++++++++++++++++++++++++++++++
 lib/vhost/fd_man.h |  1 +
 lib/vhost/socket.c |  7 +++++++
 3 files changed, 46 insertions(+)
  

Comments

fengchengwen Nov. 4, 2025, 9:32 a.m. UTC | #1
On 11/4/2025 4:09 PM, Shani Peretz wrote:
> This commit fixes a use-after-free that causes the application
> to crash on shutdown (detected by ASAN).
> 
> The vhost library uses a background event dispatch thread that monitors
> fds with epoll. It runs in an infinite loop, waiting for I/O events
> and calling callbacks when they occur.
> 
> During cleanup, a race condition existed:
> 
>   Main Thread:                    Event Dispatch Thread:
>   1. Remove fds from fdset        while (1) {
>   2. Close file descriptors           epoll_wait() [gets interrupted]
>   3. Free fdset memory                [continues loop]
>   4. Continue...                      Accesses fdset...   CRASH
>                                   }
> 
> The main thread would free the fdset memory while the background thread
> was still running and using it.

Who will free fdset memory ? I check the lib/vhost/socket.c and found there are no explicit free.

I think it maybe the hugepage free because the fdset use rte_zmalloc(). If it's, please explicit
add it into the commit log.

> 
> The code had a `destroy` flag that the event dispatch thread checked,
> but it was never set during cleanup, and the code never waited for
> the thread to actually exit before freeing memory.
> 
> This commit implements `fdset_destroy()` that will set the destroy
> flag, wait for thread termination, and clean up all resources.
> The socket.c is updated to call fdset_destroy() when the last vhost-user
> socket is unregistered.
> 
> Fixes: 0e38b42bf61c ("vhost: manage FD with epoll")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Shani Peretz <shperetz@nvidia.com>
  
Maxime Coquelin Nov. 4, 2025, 2:31 p.m. UTC | #2
Hi Shani,

Thanks for the fix, more comments below:

On Tue, Nov 4, 2025 at 10:50 AM fengchengwen <fengchengwen@huawei.com> wrote:
>
> On 11/4/2025 4:09 PM, Shani Peretz wrote:
> > This commit fixes a use-after-free that causes the application
> > to crash on shutdown (detected by ASAN).
> >
> > The vhost library uses a background event dispatch thread that monitors
> > fds with epoll. It runs in an infinite loop, waiting for I/O events
> > and calling callbacks when they occur.
> >
> > During cleanup, a race condition existed:
> >
> >   Main Thread:                    Event Dispatch Thread:
> >   1. Remove fds from fdset        while (1) {
> >   2. Close file descriptors           epoll_wait() [gets interrupted]
> >   3. Free fdset memory                [continues loop]
> >   4. Continue...                      Accesses fdset...   CRASH
> >                                   }
> >
> > The main thread would free the fdset memory while the background thread
> > was still running and using it.
>
> Who will free fdset memory ? I check the lib/vhost/socket.c and found there are no explicit free.
>
> I think it maybe the hugepage free because the fdset use rte_zmalloc(). If it's, please explicit
> add it into the commit log.

I agree with Feng, it would be good to provide more information on who
is freeing the memory.

> >
> > The code had a `destroy` flag that the event dispatch thread checked,
> > but it was never set during cleanup, and the code never waited for
> > the thread to actually exit before freeing memory.
> >
> > This commit implements `fdset_destroy()` that will set the destroy
> > flag, wait for thread termination, and clean up all resources.
> > The socket.c is updated to call fdset_destroy() when the last vhost-user
> > socket is unregistered.
> >
> > Fixes: 0e38b42bf61c ("vhost: manage FD with epoll")
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: Shani Peretz <shperetz@nvidia.com>
>
>

We also need to call fdset_destroy in vduse_device_destroy() if it is
destorying the last VDUSE device.
We might need to add a counter to struct vduse to know whether this is
the last device.

Other than that, the patch looks good to me.

Thanks,
Maxime
  
Shani Peretz Nov. 9, 2025, 12:25 p.m. UTC | #3
> -----Original Message-----
> From: fengchengwen <fengchengwen@huawei.com>
> Sent: Tuesday, 4 November 2025 11:33
> To: Shani Peretz <shperetz@nvidia.com>; dev@dpdk.org
> Cc: stable@dpdk.org; Maxime Coquelin <maxime.coquelin@redhat.com>;
> Chenbo Xia <chenbox@nvidia.com>; David Marchand
> <david.marchand@redhat.com>
> Subject: Re: [PATCH] vhost: fix use-after-free race during cleanup
> 
> External email: Use caution opening links or attachments
> 
> 
> On 11/4/2025 4:09 PM, Shani Peretz wrote:
> > This commit fixes a use-after-free that causes the application to
> > crash on shutdown (detected by ASAN).
> >
> > The vhost library uses a background event dispatch thread that
> > monitors fds with epoll. It runs in an infinite loop, waiting for I/O
> > events and calling callbacks when they occur.
> >
> > During cleanup, a race condition existed:
> >
> >   Main Thread:                    Event Dispatch Thread:
> >   1. Remove fds from fdset        while (1) {
> >   2. Close file descriptors           epoll_wait() [gets interrupted]
> >   3. Free fdset memory                [continues loop]
> >   4. Continue...                      Accesses fdset...   CRASH
> >                                   }
> >
> > The main thread would free the fdset memory while the background
> > thread was still running and using it.
> 
> Who will free fdset memory ? I check the lib/vhost/socket.c and found there
> are no explicit free.
> 
> I think it maybe the hugepage free because the fdset use rte_zmalloc(). If it's,
> please explicit add it into the commit log.

Yes you're right I double checked with a debugger and indeed the fdset memory is freed when hugepage free, I'll update the commit log.

> 
> >
> > The code had a `destroy` flag that the event dispatch thread checked,
> > but it was never set during cleanup, and the code never waited for the
> > thread to actually exit before freeing memory.
> >
> > This commit implements `fdset_destroy()` that will set the destroy
> > flag, wait for thread termination, and clean up all resources.
> > The socket.c is updated to call fdset_destroy() when the last
> > vhost-user socket is unregistered.
> >
> > Fixes: 0e38b42bf61c ("vhost: manage FD with epoll")
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: Shani Peretz <shperetz@nvidia.com>
>
  

Patch

diff --git a/lib/vhost/fd_man.c b/lib/vhost/fd_man.c
index f9147edee7..ba1b2ead86 100644
--- a/lib/vhost/fd_man.c
+++ b/lib/vhost/fd_man.c
@@ -393,3 +393,41 @@  fdset_event_dispatch(void *arg)
 
 	return 0;
 }
+
+/**
+ * Destroy the fdset and stop its event dispatch thread.
+ */
+void
+fdset_destroy(struct fdset *pfdset)
+{
+	uint32_t val;
+	int i;
+
+	if (pfdset == NULL)
+		return;
+
+	/* Signal the event dispatch thread to stop */
+	pfdset->destroy = true;
+
+	/* Wait for the event dispatch thread to finish */
+	rte_thread_join(pfdset->tid, &val);
+
+	/* Close the epoll file descriptor */
+	close(pfdset->epfd);
+
+	/* Destroy the mutex */
+	pthread_mutex_destroy(&pfdset->fd_mutex);
+
+	/* Remove from global registry */
+	pthread_mutex_lock(&fdsets_mutex);
+	for (i = 0; i < MAX_FDSETS; i++) {
+		if (fdsets[i] == pfdset) {
+			fdsets[i] = NULL;
+			break;
+		}
+	}
+	pthread_mutex_unlock(&fdsets_mutex);
+
+	/* Free the fdset structure */
+	rte_free(pfdset);
+}
diff --git a/lib/vhost/fd_man.h b/lib/vhost/fd_man.h
index eadcc6fb42..ed2109f3c8 100644
--- a/lib/vhost/fd_man.h
+++ b/lib/vhost/fd_man.h
@@ -21,5 +21,6 @@  int fdset_add(struct fdset *pfdset, int fd,
 
 void fdset_del(struct fdset *pfdset, int fd);
 int fdset_try_del(struct fdset *pfdset, int fd);
+void fdset_destroy(struct fdset *pfdset);
 
 #endif
diff --git a/lib/vhost/socket.c b/lib/vhost/socket.c
index 9b4f332f94..0240da8ff0 100644
--- a/lib/vhost/socket.c
+++ b/lib/vhost/socket.c
@@ -1141,6 +1141,13 @@  rte_vhost_driver_unregister(const char *path)
 		count = --vhost_user.vsocket_cnt;
 		vhost_user.vsockets[i] = vhost_user.vsockets[count];
 		vhost_user.vsockets[count] = NULL;
+
+		/* Check if we need to destroy the vhost fdset */
+		if (vhost_user.vsocket_cnt == 0 && vhost_user.fdset != NULL) {
+			fdset_destroy(vhost_user.fdset);
+			vhost_user.fdset = NULL;
+		}
+
 		pthread_mutex_unlock(&vhost_user.mutex);
 		return 0;
 	}