From patchwork Tue Jun 11 13:39:53 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Maxime Coquelin
X-Patchwork-Id: 140927
X-Patchwork-Delegate: maxime.coquelin@redhat.com
Return-Path:
X-Original-To: patchwork@inbox.dpdk.org
Delivered-To: patchwork@inbox.dpdk.org
Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124])
 by inbox.dpdk.org (Postfix) with ESMTP id 28941424CD;
 Tue, 11 Jun 2024 15:40:12 +0200 (CEST)
Received: from mails.dpdk.org (localhost [127.0.0.1])
 by mails.dpdk.org (Postfix) with ESMTP id 839D04067C;
 Tue, 11 Jun 2024 15:40:11 +0200 (CEST)
Received: from us-smtp-delivery-124.mimecast.com
 (us-smtp-delivery-124.mimecast.com [170.10.133.124])
 by mails.dpdk.org (Postfix) with ESMTP id A0CD44021F
 for ; Tue, 11 Jun 2024 15:40:10 +0200 (CEST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com;
 s=mimecast20190719; t=1718113210;
 h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
 to:to:cc:cc:mime-version:mime-version:content-type:content-type:
 content-transfer-encoding:content-transfer-encoding:
 in-reply-to:in-reply-to:references:references;
 bh=/9+TMbHix+Vm2cBCVE/9hx0DFnwiSd2cSBZVHtpIxLs=;
 b=Kje9U4PptoLtWdOPd0La5S81CZiuHp8F5XLuOqcH+g8I8HAVlhlyLuZ+UEcO32YiLZTlm2
 R/5hGHR9/L5djJPE2s1uFPPJlPJMQ2BwsuKsjreYAGSr+rbuqwPykZ+j/gWOQ1UOkLzJRc
 xd/dPPgAoVetAkfSl01hcg9R1qsawb0=
Received: from mx-prod-mc-02.mail-002.prod.us-west-2.aws.redhat.com
 (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by
 relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3,
 cipher=TLS_AES_256_GCM_SHA384) id us-mta-450-LQjGsJf5P96o04-epedknw-1;
 Tue, 11 Jun 2024 09:40:06 -0400
X-MC-Unique: LQjGsJf5P96o04-epedknw-1
Received: from mx-prod-int-02.mail-002.prod.us-west-2.aws.redhat.com
 (mx-prod-int-02.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.15])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256)
 (No client certificate requested)
 by mx-prod-mc-02.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS
 id B007F19560AB; Tue, 11 Jun 2024 13:40:05 +0000 (UTC)
Received: from max-p1.redhat.com (unknown [10.39.208.15])
 by mx-prod-int-02.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP
 id D0A981956087; Tue, 11 Jun 2024 13:40:03 +0000 (UTC)
From: Maxime Coquelin
To: dev@dpdk.org, david.marchand@redhat.com, chenbox@nvidia.com
Cc: Maxime Coquelin
Subject: [PATCH v4 1/5] vhost: rename polling mutex
Date: Tue, 11 Jun 2024 15:39:53 +0200
Message-ID: <20240611133957.72032-2-maxime.coquelin@redhat.com>
In-Reply-To: <20240611133957.72032-1-maxime.coquelin@redhat.com>
References: <20240611133957.72032-1-maxime.coquelin@redhat.com>
MIME-Version: 1.0
X-Scanned-By: MIMEDefang 3.0 on 10.30.177.15
X-Mimecast-Spam-Score: 0
X-Mimecast-Originator: redhat.com
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: DPDK patches and discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: dev-bounces@dpdk.org

This trivial patch fixes a typo in the FD manager's polling mutex name.

Reviewed-by: David Marchand
Signed-off-by: Maxime Coquelin
---
 lib/vhost/fd_man.c | 8 ++++----
 lib/vhost/fd_man.h | 2 +-
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/lib/vhost/fd_man.c b/lib/vhost/fd_man.c
index 481e6b900a..67ee1589e1 100644
--- a/lib/vhost/fd_man.c
+++ b/lib/vhost/fd_man.c
@@ -125,9 +125,9 @@ fdset_add(struct fdset *pfdset, int fd, fd_cb rcb, fd_cb wcb, void *dat)
 	pthread_mutex_lock(&pfdset->fd_mutex);
 	i = pfdset->num < MAX_FDS ? pfdset->num++ : -1;
 	if (i == -1) {
-		pthread_mutex_lock(&pfdset->fd_pooling_mutex);
+		pthread_mutex_lock(&pfdset->fd_polling_mutex);
 		fdset_shrink_nolock(pfdset);
-		pthread_mutex_unlock(&pfdset->fd_pooling_mutex);
+		pthread_mutex_unlock(&pfdset->fd_polling_mutex);
 		i = pfdset->num < MAX_FDS ? pfdset->num++ : -1;
 		if (i == -1) {
 			pthread_mutex_unlock(&pfdset->fd_mutex);
@@ -244,9 +244,9 @@ fdset_event_dispatch(void *arg)
 		numfds = pfdset->num;
 		pthread_mutex_unlock(&pfdset->fd_mutex);

-		pthread_mutex_lock(&pfdset->fd_pooling_mutex);
+		pthread_mutex_lock(&pfdset->fd_polling_mutex);
 		val = poll(pfdset->rwfds, numfds, 1000 /* millisecs */);
-		pthread_mutex_unlock(&pfdset->fd_pooling_mutex);
+		pthread_mutex_unlock(&pfdset->fd_polling_mutex);
 		if (val < 0)
 			continue;

diff --git a/lib/vhost/fd_man.h b/lib/vhost/fd_man.h
index 7816fb11ac..4e00f94758 100644
--- a/lib/vhost/fd_man.h
+++ b/lib/vhost/fd_man.h
@@ -24,7 +24,7 @@ struct fdset {
 	struct pollfd rwfds[MAX_FDS];
 	struct fdentry fd[MAX_FDS];
 	pthread_mutex_t fd_mutex;
-	pthread_mutex_t fd_pooling_mutex;
+	pthread_mutex_t fd_polling_mutex;
 	int num;	/* current fd number of this fdset */

 	union pipefds {

From patchwork Tue Jun 11 13:39:54 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Maxime Coquelin
X-Patchwork-Id: 140928
X-Patchwork-Delegate: maxime.coquelin@redhat.com
From: Maxime Coquelin
To: dev@dpdk.org, david.marchand@redhat.com, chenbox@nvidia.com
Cc: Maxime Coquelin
Subject: [PATCH v4 2/5] vhost: make use of FD manager init function
Date: Tue, 11 Jun 2024 15:39:54 +0200
Message-ID: <20240611133957.72032-3-maxime.coquelin@redhat.com>
In-Reply-To: <20240611133957.72032-1-maxime.coquelin@redhat.com>
References: <20240611133957.72032-1-maxime.coquelin@redhat.com>

Instead of statically initializing the fdset, this patch converts VDUSE
and Vhost-user to use the fdset_init() function, which now also
initializes the mutexes.

This is preliminary rework to hide the FD manager's pipe from its users.

Signed-off-by: Maxime Coquelin
---
 lib/vhost/fd_man.c | 10 +++++++---
 lib/vhost/fd_man.h | 2 +-
 lib/vhost/socket.c | 12 +++++-------
 lib/vhost/vduse.c | 15 ++++++---------
 4 files changed, 19 insertions(+), 20 deletions(-)

diff --git a/lib/vhost/fd_man.c b/lib/vhost/fd_man.c
index 67ee1589e1..a07fba5b6d 100644
--- a/lib/vhost/fd_man.c
+++ b/lib/vhost/fd_man.c
@@ -96,19 +96,23 @@ fdset_add_fd(struct fdset *pfdset, int idx, int fd,
 	pfd->revents = 0;
 }

-void
+int
 fdset_init(struct fdset *pfdset)
 {
 	int i;

-	if (pfdset == NULL)
-		return;
+	pthread_mutex_init(&pfdset->fd_mutex, NULL);
+	pthread_mutex_init(&pfdset->fd_polling_mutex, NULL);
+	pthread_mutex_init(&pfdset->sync_mutex, NULL);
+	pthread_cond_init(&pfdset->sync_cond, NULL);

 	for (i = 0; i < MAX_FDS; i++) {
 		pfdset->fd[i].fd = -1;
 		pfdset->fd[i].dat = NULL;
 	}
 	pfdset->num = 0;
+
+	return 0;
 }

 /**
diff --git a/lib/vhost/fd_man.h b/lib/vhost/fd_man.h
index 4e00f94758..0f4cddfe56 100644
--- a/lib/vhost/fd_man.h
+++ b/lib/vhost/fd_man.h
@@ -43,7 +43,7 @@ struct fdset {
 };


-void fdset_init(struct fdset *pfdset);
+int fdset_init(struct fdset *pfdset);

 int fdset_add(struct fdset *pfdset, int fd,
 	fd_cb rcb, fd_cb wcb, void *dat);
diff --git a/lib/vhost/socket.c b/lib/vhost/socket.c
index c681d53abb..896c8e6471 100644
--- a/lib/vhost/socket.c
+++ b/lib/vhost/socket.c
@@ -90,13 +90,6 @@ static int create_unix_socket(struct vhost_user_socket *vsocket);
 static int vhost_user_start_client(struct vhost_user_socket *vsocket);

 static struct vhost_user vhost_user = {
-	.fdset = {
-		.fd = { [0 ... MAX_FDS - 1] = {-1, NULL, NULL, NULL, 0} },
-		.fd_mutex = PTHREAD_MUTEX_INITIALIZER,
-		.fd_pooling_mutex = PTHREAD_MUTEX_INITIALIZER,
-		.sync_mutex = PTHREAD_MUTEX_INITIALIZER,
-		.num = 0
-	},
 	.vsocket_cnt = 0,
 	.mutex = PTHREAD_MUTEX_INITIALIZER,
 };
@@ -1192,6 +1185,11 @@ rte_vhost_driver_start(const char *path)
 		return vduse_device_create(path, vsocket->net_compliant_ol_flags);

 	if (fdset_tid.opaque_id == 0) {
+		if (fdset_init(&vhost_user.fdset) < 0) {
+			VHOST_CONFIG_LOG(path, ERR, "failed to init Vhost-user fdset");
+			return -1;
+		}
+
 		/**
 		 * create a pipe which will be waited by poll and notified to
 		 * rebuild the wait list of poll.
diff --git a/lib/vhost/vduse.c b/lib/vhost/vduse.c
index e0c6991b69..530c148399 100644
--- a/lib/vhost/vduse.c
+++ b/lib/vhost/vduse.c
@@ -31,15 +31,7 @@ struct vduse {
 	struct fdset fdset;
 };

-static struct vduse vduse = {
-	.fdset = {
-		.fd = { [0 ... MAX_FDS - 1] = {-1, NULL, NULL, NULL, 0} },
-		.fd_mutex = PTHREAD_MUTEX_INITIALIZER,
-		.fd_pooling_mutex = PTHREAD_MUTEX_INITIALIZER,
-		.sync_mutex = PTHREAD_MUTEX_INITIALIZER,
-		.num = 0
-	},
-};
+static struct vduse vduse;

 static bool vduse_events_thread;

@@ -435,6 +427,11 @@ vduse_device_create(const char *path, bool compliant_ol_flags)

 	/* If first device, create events dispatcher thread */
 	if (vduse_events_thread == false) {
+		if (fdset_init(&vduse.fdset) < 0) {
+			VHOST_CONFIG_LOG(path, ERR, "failed to init VDUSE fdset");
+			return -1;
+		}
+
 		/**
 		 * create a pipe which will be waited by poll and notified to
 		 * rebuild the wait list of poll.
From patchwork Tue Jun 11 13:39:55 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Maxime Coquelin
X-Patchwork-Id: 140929
X-Patchwork-Delegate: maxime.coquelin@redhat.com
From: Maxime Coquelin
To: dev@dpdk.org, david.marchand@redhat.com, chenbox@nvidia.com
Cc: Maxime Coquelin
Subject: [PATCH v4 3/5] vhost: hide synchronization within FD manager
Date: Tue, 11 Jun 2024 15:39:55 +0200
Message-ID: <20240611133957.72032-4-maxime.coquelin@redhat.com>
In-Reply-To: <20240611133957.72032-1-maxime.coquelin@redhat.com>
References: <20240611133957.72032-1-maxime.coquelin@redhat.com>

This patch forces synchronization for every FD addition to or deletion
from the FD set. With that, the user no longer needs to know about the
FD set pipe, so its initialization is hidden in the FD manager.

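For illustration only (not part of the patch): a minimal, self-contained
sketch of the write-to-pipe-then-wait handshake this change relies on,
written with plain POSIX threads rather than the DPDK helpers. All demo_*
names are hypothetical and exist only for this sketch; the actual logic is
in fdset_sync() and fdset_pipe_read_cb() in the diff.

```c
/* Hypothetical, simplified model of the FD set synchronization.
 * None of the demo_* names are DPDK APIs. */
#include <assert.h>
#include <poll.h>
#include <pthread.h>
#include <stdbool.h>
#include <unistd.h>

struct demo_set {
	int pipefd[2];			/* [0]: polled by dispatcher, [1]: written by callers */
	pthread_mutex_t sync_mutex;
	pthread_cond_t sync_cond;
	bool sync;			/* set once the dispatcher has woken up */
	bool stop;			/* asks the dispatcher to exit */
	int wakeups;			/* number of notifications handled */
	pthread_t tid;
};

/* Dispatcher thread: sleeps in poll() until a caller writes to the pipe. */
static void *demo_dispatch(void *arg)
{
	struct demo_set *s = arg;
	struct pollfd pfd = { .fd = s->pipefd[0], .events = POLLIN };

	for (;;) {
		char buf[16];
		bool stop;

		if (poll(&pfd, 1, 1000) <= 0)
			continue;
		/* Drain the pipe; the byte values carry no meaning. */
		(void)!read(s->pipefd[0], buf, sizeof(buf));

		/* Tell every waiting caller that the wake-up happened. */
		pthread_mutex_lock(&s->sync_mutex);
		s->sync = true;
		s->wakeups++;
		stop = s->stop;
		pthread_cond_broadcast(&s->sync_cond);
		pthread_mutex_unlock(&s->sync_mutex);

		if (stop)
			break;
	}
	return NULL;
}

/* Wake the dispatcher and block until it acknowledges. */
static int demo_sync(struct demo_set *s)
{
	int ret = 0;

	pthread_mutex_lock(&s->sync_mutex);
	s->sync = false;
	if (write(s->pipefd[1], "1", 1) < 0) {
		ret = -1;	/* nothing will wake us, so do not wait */
	} else {
		while (!s->sync)
			pthread_cond_wait(&s->sync_cond, &s->sync_mutex);
	}
	pthread_mutex_unlock(&s->sync_mutex);
	return ret;
}

static int demo_init(struct demo_set *s)
{
	pthread_mutex_init(&s->sync_mutex, NULL);
	pthread_cond_init(&s->sync_cond, NULL);
	s->sync = false;
	s->stop = false;
	s->wakeups = 0;
	if (pipe(s->pipefd) < 0)
		return -1;
	if (pthread_create(&s->tid, NULL, demo_dispatch, s) != 0) {
		close(s->pipefd[0]);
		close(s->pipefd[1]);
		return -1;
	}
	return 0;
}

static void demo_stop(struct demo_set *s)
{
	pthread_mutex_lock(&s->sync_mutex);
	s->stop = true;
	pthread_mutex_unlock(&s->sync_mutex);
	demo_sync(s);		/* one last wake-up so the thread sees stop */
	pthread_join(s->tid, NULL);
	close(s->pipefd[0]);
	close(s->pipefd[1]);
}
```

A caller would pair demo_init() with demo_stop() and call demo_sync()
after each change it wants the dispatcher to observe. Note that, in this
sketch, calling demo_sync() from the dispatcher thread itself would block
forever, since the same thread would have to service the pipe.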
Signed-off-by: Maxime Coquelin
---
 lib/vhost/fd_man.c | 180 ++++++++++++++++++++++++---------------------
 lib/vhost/fd_man.h | 8 +-
 lib/vhost/socket.c | 12 +--
 lib/vhost/vduse.c | 18 +----
 4 files changed, 101 insertions(+), 117 deletions(-)

diff --git a/lib/vhost/fd_man.c b/lib/vhost/fd_man.c
index a07fba5b6d..75843b52ef 100644
--- a/lib/vhost/fd_man.c
+++ b/lib/vhost/fd_man.c
@@ -2,7 +2,9 @@
  * Copyright(c) 2010-2014 Intel Corporation
  */

+#include
 #include
+#include
 #include

 #include
@@ -17,6 +19,87 @@ RTE_LOG_REGISTER_SUFFIX(vhost_fdset_logtype, fdset, INFO);

 #define FDPOLLERR (POLLERR | POLLHUP | POLLNVAL)

+static void
+fdset_pipe_read_cb(int readfd, void *dat,
+		   int *remove __rte_unused)
+{
+	char charbuf[16];
+	struct fdset *fdset = dat;
+	int r = read(readfd, charbuf, sizeof(charbuf));
+	/*
+	 * Just an optimization, we don't care if read() failed
+	 * so ignore explicitly its return value to make the
+	 * compiler happy
+	 */
+	RTE_SET_USED(r);
+
+	pthread_mutex_lock(&fdset->sync_mutex);
+	fdset->sync = true;
+	pthread_cond_broadcast(&fdset->sync_cond);
+	pthread_mutex_unlock(&fdset->sync_mutex);
+}
+
+static void
+fdset_pipe_uninit(struct fdset *fdset)
+{
+	fdset_del(fdset, fdset->u.readfd);
+	close(fdset->u.readfd);
+	fdset->u.readfd = -1;
+	close(fdset->u.writefd);
+	fdset->u.writefd = -1;
+}
+
+static int
+fdset_pipe_init(struct fdset *fdset)
+{
+	int ret;
+
+	pthread_mutex_init(&fdset->sync_mutex, NULL);
+	pthread_cond_init(&fdset->sync_cond, NULL);
+
+	if (pipe(fdset->u.pipefd) < 0) {
+		VHOST_FDMAN_LOG(ERR,
+			"failed to create pipe for vhost fdset");
+		return -1;
+	}
+
+	ret = fdset_add(fdset, fdset->u.readfd,
+			fdset_pipe_read_cb, NULL, fdset);
+	if (ret < 0) {
+		VHOST_FDMAN_LOG(ERR,
+			"failed to add pipe readfd %d into vhost server fdset",
+			fdset->u.readfd);
+
+		fdset_pipe_uninit(fdset);
+		return -1;
+	}
+
+	return 0;
+}
+
+static void
+fdset_sync(struct fdset *fdset)
+{
+	int ret;
+
+	pthread_mutex_lock(&fdset->sync_mutex);
+
+	fdset->sync = false;
+	ret = write(fdset->u.writefd, "1", 1);
+	if (ret < 0) {
+		VHOST_FDMAN_LOG(ERR,
+			"Failed to write to notification pipe: %s",
+			strerror(errno));
+		goto out_unlock;
+	}
+
+	while (!fdset->sync)
+		pthread_cond_wait(&fdset->sync_cond, &fdset->sync_mutex);
+
+out_unlock:
+	pthread_mutex_unlock(&fdset->sync_mutex);
+}
+
 static int
 get_last_valid_idx(struct fdset *pfdset, int last_valid_idx)
 {
@@ -96,6 +179,12 @@ fdset_add_fd(struct fdset *pfdset, int idx, int fd,
 	pfd->revents = 0;
 }

+void
+fdset_uninit(struct fdset *pfdset)
+{
+	fdset_pipe_uninit(pfdset);
+}
+
 int
 fdset_init(struct fdset *pfdset)
 {
@@ -103,8 +192,6 @@ fdset_init(struct fdset *pfdset)

 	pthread_mutex_init(&pfdset->fd_mutex, NULL);
 	pthread_mutex_init(&pfdset->fd_polling_mutex, NULL);
-	pthread_mutex_init(&pfdset->sync_mutex, NULL);
-	pthread_cond_init(&pfdset->sync_cond, NULL);

 	for (i = 0; i < MAX_FDS; i++) {
 		pfdset->fd[i].fd = -1;
@@ -112,7 +199,7 @@ fdset_init(struct fdset *pfdset)
 	}
 	pfdset->num = 0;

-	return 0;
+	return fdset_pipe_init(pfdset);
 }

 /**
@@ -142,6 +229,8 @@ fdset_add(struct fdset *pfdset, int fd, fd_cb rcb, fd_cb wcb, void *dat)
 	fdset_add_fd(pfdset, i, fd, rcb, wcb, dat);
 	pthread_mutex_unlock(&pfdset->fd_mutex);

+	fdset_sync(pfdset);
+
 	return 0;
 }

@@ -173,6 +262,8 @@ fdset_del(struct fdset *pfdset, int fd)
 		pthread_mutex_unlock(&pfdset->fd_mutex);
 	} while (i != -1);

+	fdset_sync(pfdset);
+
 	return dat;
 }

@@ -206,6 +297,9 @@ fdset_try_del(struct fdset *pfdset, int fd)
 	}

 	pthread_mutex_unlock(&pfdset->fd_mutex);
+
+	fdset_sync(pfdset);
+
 	return 0;
 }

@@ -311,83 +405,3 @@ fdset_event_dispatch(void *arg)

 	return 0;
 }
-
-static void
-fdset_pipe_read_cb(int readfd, void *dat,
-		   int *remove __rte_unused)
-{
-	char charbuf[16];
-	struct fdset *fdset = dat;
-	int r = read(readfd, charbuf, sizeof(charbuf));
-	/*
-	 * Just an optimization, we don't care if read() failed
-	 * so ignore explicitly its return value to make the
-	 * compiler happy
-	 */
-	RTE_SET_USED(r);
-
-	pthread_mutex_lock(&fdset->sync_mutex);
-	fdset->sync = true;
-	pthread_cond_broadcast(&fdset->sync_cond);
-	pthread_mutex_unlock(&fdset->sync_mutex);
-}
-
-void
-fdset_pipe_uninit(struct fdset *fdset)
-{
-	fdset_del(fdset, fdset->u.readfd);
-	close(fdset->u.readfd);
-	close(fdset->u.writefd);
-}
-
-int
-fdset_pipe_init(struct fdset *fdset)
-{
-	int ret;
-
-	if (pipe(fdset->u.pipefd) < 0) {
-		VHOST_FDMAN_LOG(ERR,
-			"failed to create pipe for vhost fdset");
-		return -1;
-	}
-
-	ret = fdset_add(fdset, fdset->u.readfd,
-			fdset_pipe_read_cb, NULL, fdset);
-
-	if (ret < 0) {
-		VHOST_FDMAN_LOG(ERR,
-			"failed to add pipe readfd %d into vhost server fdset",
-			fdset->u.readfd);
-
-		fdset_pipe_uninit(fdset);
-		return -1;
-	}
-
-	return 0;
-}
-
-void
-fdset_pipe_notify(struct fdset *fdset)
-{
-	int r = write(fdset->u.writefd, "1", 1);
-	/*
-	 * Just an optimization, we don't care if write() failed
-	 * so ignore explicitly its return value to make the
-	 * compiler happy
-	 */
-	RTE_SET_USED(r);
-}
-
-void
-fdset_pipe_notify_sync(struct fdset *fdset)
-{
-	pthread_mutex_lock(&fdset->sync_mutex);
-
-	fdset->sync = false;
-	fdset_pipe_notify(fdset);
-
-	while (!fdset->sync)
-		pthread_cond_wait(&fdset->sync_cond, &fdset->sync_mutex);
-
-	pthread_mutex_unlock(&fdset->sync_mutex);
-}
diff --git a/lib/vhost/fd_man.h b/lib/vhost/fd_man.h
index 0f4cddfe56..c18e3a435c 100644
--- a/lib/vhost/fd_man.h
+++ b/lib/vhost/fd_man.h
@@ -42,6 +42,7 @@ struct fdset {
 	bool sync;
 };

+void fdset_uninit(struct fdset *pfdset);

 int fdset_init(struct fdset *pfdset);

@@ -53,11 +54,4 @@ int fdset_try_del(struct fdset *pfdset, int fd);

 uint32_t fdset_event_dispatch(void *arg);

-int fdset_pipe_init(struct fdset *fdset);
-
-void fdset_pipe_uninit(struct fdset *fdset);
-
-void fdset_pipe_notify(struct fdset *fdset);
-void fdset_pipe_notify_sync(struct fdset *fdset);
-
 #endif
diff --git a/lib/vhost/socket.c b/lib/vhost/socket.c
index 896c8e6471..0251111017 100644
--- a/lib/vhost/socket.c
+++ b/lib/vhost/socket.c
@@ -279,7 +279,6 @@ vhost_user_add_connection(int fd, struct vhost_user_socket *vsocket)
 	TAILQ_INSERT_TAIL(&vsocket->conn_list, conn, next);
 	pthread_mutex_unlock(&vsocket->conn_mutex);

-	fdset_pipe_notify(&vhost_user.fdset);
 	return;

 err_cleanup:
@@ -1190,20 +1189,11 @@ rte_vhost_driver_start(const char *path)
 			return -1;
 		}

-		/**
-		 * create a pipe which will be waited by poll and notified to
-		 * rebuild the wait list of poll.
-		 */
-		if (fdset_pipe_init(&vhost_user.fdset) < 0) {
-			VHOST_CONFIG_LOG(path, ERR, "failed to create pipe for vhost fdset");
-			return -1;
-		}
-
 		int ret = rte_thread_create_internal_control(&fdset_tid,
 			"vhost-evt", fdset_event_dispatch, &vhost_user.fdset);
 		if (ret != 0) {
 			VHOST_CONFIG_LOG(path, ERR, "failed to create fdset handling thread");
-			fdset_pipe_uninit(&vhost_user.fdset);
+			fdset_uninit(&vhost_user.fdset);
 			return -1;
 		}
 	}
diff --git a/lib/vhost/vduse.c b/lib/vhost/vduse.c
index 530c148399..d87fc500d4 100644
--- a/lib/vhost/vduse.c
+++ b/lib/vhost/vduse.c
@@ -225,7 +225,6 @@ vduse_vring_setup(struct virtio_net *dev, unsigned int index)
 			close(vq->kickfd);
 			vq->kickfd = VIRTIO_UNINITIALIZED_EVENTFD;
 		}
-		fdset_pipe_notify(&vduse.fdset);
 		vhost_enable_guest_notification(dev, vq, 1);
 		VHOST_CONFIG_LOG(dev->ifname, INFO, "Ctrl queue event handler installed");
 	}
@@ -238,10 +237,8 @@ vduse_vring_cleanup(struct virtio_net *dev, unsigned int index)
 	struct vduse_vq_eventfd vq_efd;
 	int ret;

-	if (vq == dev->cvq && vq->kickfd >= 0) {
+	if (vq == dev->cvq && vq->kickfd >= 0)
 		fdset_del(&vduse.fdset, vq->kickfd);
-		fdset_pipe_notify(&vduse.fdset);
-	}

 	vq_efd.index = index;
 	vq_efd.fd = VDUSE_EVENTFD_DEASSIGN;
@@ -432,20 +429,11 @@ vduse_device_create(const char *path, bool compliant_ol_flags)
 			return -1;
 		}

-		/**
-		 * create a pipe which will be waited by poll and notified to
-		 * rebuild the wait list of poll.
-		 */
-		if (fdset_pipe_init(&vduse.fdset) < 0) {
-			VHOST_CONFIG_LOG(path, ERR, "failed to create pipe for vduse fdset");
-			return -1;
-		}
-
 		ret = rte_thread_create_internal_control(&fdset_tid, "vduse-evt",
 				fdset_event_dispatch, &vduse.fdset);
 		if (ret != 0) {
 			VHOST_CONFIG_LOG(path, ERR, "failed to create vduse fdset handling thread");
-			fdset_pipe_uninit(&vduse.fdset);
+			fdset_uninit(&vduse.fdset);
 			return -1;
 		}

@@ -573,7 +561,6 @@ vduse_device_create(const char *path, bool compliant_ol_flags)
 				dev->vduse_dev_fd);
 		goto out_dev_destroy;
 	}
-	fdset_pipe_notify(&vduse.fdset);

 	free(dev_config);

@@ -616,7 +603,6 @@ vduse_device_destroy(const char *path)
 	vduse_device_stop(dev);

 	fdset_del(&vduse.fdset, dev->vduse_dev_fd);
-	fdset_pipe_notify_sync(&vduse.fdset);

 	if (dev->vduse_dev_fd >= 0) {
 		close(dev->vduse_dev_fd);

From patchwork Tue Jun 11 13:39:56 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Maxime Coquelin
X-Patchwork-Id: 140930
X-Patchwork-Delegate: maxime.coquelin@redhat.com
From: Maxime Coquelin
To: dev@dpdk.org, david.marchand@redhat.com, chenbox@nvidia.com
Cc: Maxime Coquelin
Subject: [PATCH v4 4/5] vhost: improve fdset initialization
Date: Tue, 11 Jun 2024 15:39:56 +0200
Message-ID: <20240611133957.72032-5-maxime.coquelin@redhat.com>
In-Reply-To: <20240611133957.72032-1-maxime.coquelin@redhat.com>
References: <20240611133957.72032-1-maxime.coquelin@redhat.com>

This patch heavily reworks fdset initialization:
 - fdsets are now dynamically allocated by the FD manager
 - the event dispatcher is now created by the FD manager
 - struct fdset is now opaque to VDUSE and Vhost

Signed-off-by: Maxime Coquelin
---
 lib/vhost/fd_man.c | 171 ++++++++++++++++++++++++++++++++++++++++-----
 lib/vhost/fd_man.h | 39 ++---------
 lib/vhost/socket.c | 24 +++----
 lib/vhost/vduse.c | 29 +++-----
 4 files changed, 173 insertions(+), 90 deletions(-)

diff --git a/lib/vhost/fd_man.c b/lib/vhost/fd_man.c
index 75843b52ef..866904016e 100644
--- a/lib/vhost/fd_man.c
+++ b/lib/vhost/fd_man.c
@@ -3,12 +3,16 @@
  */

 #include
+#include
 #include
 #include
 #include

 #include
 #include
+#include
+#include
+#include

 #include "fd_man.h"

@@ -19,6 +23,79 @@ RTE_LOG_REGISTER_SUFFIX(vhost_fdset_logtype, fdset, INFO);

 #define FDPOLLERR (POLLERR | POLLHUP | POLLNVAL)

+struct fdentry {
+	int fd;		/* -1 indicates this entry is empty */
+	fd_cb rcb;	/* callback when this fd is readable. */
+	fd_cb wcb;	/* callback when this fd is writeable.*/
+	void *dat;	/* fd context */
+	int busy;	/* whether this entry is being used in cb. */
+};
+
+struct fdset {
+	char name[RTE_THREAD_NAME_SIZE];
+	struct pollfd rwfds[MAX_FDS];
+	struct fdentry fd[MAX_FDS];
+	rte_thread_t tid;
+	pthread_mutex_t fd_mutex;
+	pthread_mutex_t fd_polling_mutex;
+	int num;	/* current fd number of this fdset */
+
+	union pipefds {
+		struct {
+			int pipefd[2];
+		};
+		struct {
+			int readfd;
+			int writefd;
+		};
+	} u;
+
+	pthread_mutex_t sync_mutex;
+	pthread_cond_t sync_cond;
+	bool sync;
+	bool destroy;
+};
+
+static int fdset_add_no_sync(struct fdset *pfdset, int fd, fd_cb rcb, fd_cb wcb, void *dat);
+static uint32_t fdset_event_dispatch(void *arg);
+
+#define MAX_FDSETS 8
+
+static struct fdset *fdsets[MAX_FDSETS];
+static pthread_mutex_t fdsets_mutex = PTHREAD_MUTEX_INITIALIZER;
+
+static struct fdset *
+fdset_lookup(const char *name)
+{
+	int i;
+
+	for (i = 0; i < MAX_FDSETS; i++) {
+		struct fdset *fdset = fdsets[i];
+		if (fdset == NULL)
+			continue;
+
+		if (!strncmp(fdset->name, name, RTE_THREAD_NAME_SIZE))
+			return fdset;
+	}
+
+	return NULL;
+}
+
+static int
+fdset_insert(struct fdset *fdset)
+{
+	int i;
+
+	for (i = 0; i < MAX_FDSETS; i++) {
+		if (fdsets[i] == NULL) {
+			fdsets[i] = fdset;
+			return 0;
+		}
+	}
+
+	return -1;
+}
+
 static void
 fdset_pipe_read_cb(int readfd, void *dat,
 		   int *remove __rte_unused)
@@ -63,7 +140,7 @@ fdset_pipe_init(struct fdset *fdset)
 		return -1;
 	}

-	ret = fdset_add(fdset, fdset->u.readfd,
+	ret = fdset_add_no_sync(fdset, fdset->u.readfd,
 			fdset_pipe_read_cb, NULL, fdset);
 	if (ret < 0) {
 		VHOST_FDMAN_LOG(ERR,
@@ -179,34 +256,77 @@ fdset_add_fd(struct fdset *pfdset, int idx, int fd,
 	pfd->revents = 0;
 }

-void
-fdset_uninit(struct fdset *pfdset)
-{
-	fdset_pipe_uninit(pfdset);
-}
-
-int
-fdset_init(struct fdset *pfdset)
+struct fdset *
+fdset_init(const char *name)
 {
+	struct fdset *fdset;
+	uint32_t val;
 	int i;

-	pthread_mutex_init(&pfdset->fd_mutex, NULL);
-	pthread_mutex_init(&pfdset->fd_polling_mutex, NULL);
+	pthread_mutex_lock(&fdsets_mutex);
+	fdset = fdset_lookup(name);
+	if (fdset) {
+		pthread_mutex_unlock(&fdsets_mutex);
+		return fdset;
+	}
+
+	fdset = rte_zmalloc(NULL, sizeof(*fdset), 0);
+	if (!fdset) {
+		VHOST_FDMAN_LOG(ERR, "Failed to alloc fdset %s", name);
+		goto err_unlock;
+	}
+
+	rte_strscpy(fdset->name, name, RTE_THREAD_NAME_SIZE);
+
+	pthread_mutex_init(&fdset->fd_mutex, NULL);
+	pthread_mutex_init(&fdset->fd_polling_mutex, NULL);

 	for (i = 0; i < MAX_FDS; i++) {
-		pfdset->fd[i].fd = -1;
-		pfdset->fd[i].dat = NULL;
+		fdset->fd[i].fd = -1;
+		fdset->fd[i].dat = NULL;
+	}
+	fdset->num = 0;
+
+	if (fdset_pipe_init(fdset)) {
+		VHOST_FDMAN_LOG(ERR, "Failed to init pipe for %s", name);
+		goto err_free;
+	}
+
+	if (rte_thread_create_internal_control(&fdset->tid, fdset->name,
+					fdset_event_dispatch, fdset)) {
+		VHOST_FDMAN_LOG(ERR, "Failed to create %s event dispatch thread",
+				fdset->name);
+		goto err_pipe;
 	}
-	pfdset->num = 0;

-	return fdset_pipe_init(pfdset);
+	if (fdset_insert(fdset)) {
+		VHOST_FDMAN_LOG(ERR, "Failed to insert fdset %s", name);
+		goto err_thread;
+	}
+
+	pthread_mutex_unlock(&fdsets_mutex);
+
+	return fdset;
+
+err_thread:
+	fdset->destroy = true;
+	fdset_sync(fdset);
+	rte_thread_join(fdset->tid, &val);
+err_pipe:
+	fdset_pipe_uninit(fdset);
+err_free:
+	rte_free(fdset);
+err_unlock:
+	pthread_mutex_unlock(&fdsets_mutex);
+
+	return NULL;
 }

 /**
  * Register the fd in the fdset with read/write handler and context.
  */
-int
-fdset_add(struct fdset *pfdset, int fd, fd_cb rcb, fd_cb wcb, void *dat)
+static int
+fdset_add_no_sync(struct fdset *pfdset, int fd, fd_cb rcb, fd_cb wcb, void *dat)
 {
 	int i;

@@ -229,6 +349,18 @@ fdset_add(struct fdset *pfdset, int fd, fd_cb rcb, fd_cb wcb, void *dat)
 	fdset_add_fd(pfdset, i, fd, rcb, wcb, dat);
 	pthread_mutex_unlock(&pfdset->fd_mutex);

+	return 0;
+}
+
+int
+fdset_add(struct fdset *pfdset, int fd, fd_cb rcb, fd_cb wcb, void *dat)
+{
+	int ret;
+
+	ret = fdset_add_no_sync(pfdset, fd, rcb, wcb, dat);
+	if (ret < 0)
+		return ret;
+
 	fdset_sync(pfdset);

 	return 0;
@@ -312,7 +444,7 @@ fdset_try_del(struct fdset *pfdset, int fd)
  * will wait until the flag is reset to zero(which indicates the callback is
  * finished), then it could free the context after fdset_del.
  */
-uint32_t
+static uint32_t
 fdset_event_dispatch(void *arg)
 {
 	int i;
@@ -401,6 +533,9 @@ fdset_event_dispatch(void *arg)

 		if (need_shrink)
 			fdset_shrink(pfdset);
+
+		if (pfdset->destroy)
+			break;
 	}

 	return 0;
diff --git a/lib/vhost/fd_man.h b/lib/vhost/fd_man.h
index c18e3a435c..079fa0155f 100644
--- a/lib/vhost/fd_man.h
+++ b/lib/vhost/fd_man.h
@@ -8,50 +8,19 @@
 #include
 #include

+struct fdset;
+
 #define MAX_FDS 1024

 typedef void (*fd_cb)(int fd, void *dat, int *remove);

-struct fdentry {
-	int fd;		/* -1 indicates this entry is empty */
-	fd_cb rcb;	/* callback when this fd is readable. */
-	fd_cb wcb;	/* callback when this fd is writeable.*/
-	void *dat;	/* fd context */
-	int busy;	/* whether this entry is being used in cb. */
-};
-
-struct fdset {
-	struct pollfd rwfds[MAX_FDS];
-	struct fdentry fd[MAX_FDS];
-	pthread_mutex_t fd_mutex;
-	pthread_mutex_t fd_polling_mutex;
-	int num;	/* current fd number of this fdset */
-
-	union pipefds {
-		struct {
-			int pipefd[2];
-		};
-		struct {
-			int readfd;
-			int writefd;
-		};
-	} u;
-
-	pthread_mutex_t sync_mutex;
-	pthread_cond_t sync_cond;
-	bool sync;
-};
-
-void fdset_uninit(struct fdset *pfdset);
-
-int fdset_init(struct fdset *pfdset);
+struct fdset *fdset_init(const char *name);

 int fdset_add(struct fdset *pfdset, int fd,
 	fd_cb rcb, fd_cb wcb, void *dat);

 void *fdset_del(struct fdset *pfdset, int fd);
-int fdset_try_del(struct fdset *pfdset, int fd);

-uint32_t fdset_event_dispatch(void *arg);
+int fdset_try_del(struct fdset *pfdset, int fd);

 #endif
diff --git a/lib/vhost/socket.c b/lib/vhost/socket.c
index 0251111017..a75728a2e4 100644
--- a/lib/vhost/socket.c
+++ b/lib/vhost/socket.c
@@ -77,7 +77,7 @@ struct vhost_user_connection {
 #define MAX_VHOST_SOCKET 1024
 struct vhost_user {
 	struct vhost_user_socket *vsockets[MAX_VHOST_SOCKET];
-	struct fdset fdset;
+	struct fdset *fdset;
 	int vsocket_cnt;
 	pthread_mutex_t mutex;
 };
@@ -262,7 +262,7 @@ vhost_user_add_connection(int fd, struct vhost_user_socket *vsocket)
 	conn->connfd = fd;
 	conn->vsocket = vsocket;
 	conn->vid = vid;
-	ret = fdset_add(&vhost_user.fdset, fd, vhost_user_read_cb,
+	ret = fdset_add(vhost_user.fdset, fd, vhost_user_read_cb,
 			NULL, conn);
 	if (ret < 0) {
 		VHOST_CONFIG_LOG(vsocket->path, ERR,
@@ -395,7 +395,7 @@ vhost_user_start_server(struct vhost_user_socket *vsocket)
 	if (ret < 0)
 		goto err;

-	ret = fdset_add(&vhost_user.fdset, fd, vhost_user_server_new_connection,
+	ret = fdset_add(vhost_user.fdset, fd, vhost_user_server_new_connection,
 		  NULL, vsocket);
 	if (ret < 0) {
 		VHOST_CONFIG_LOG(path, ERR, "failed to add listen fd %d to vhost server fdset",
@@ -1083,7 +1083,7 @@ rte_vhost_driver_unregister(const char *path)
 			 * mutex lock, and try again since the r/wcb
 			 * may use the mutex lock.
 			 */
-			if (fdset_try_del(&vhost_user.fdset, vsocket->socket_fd) == -1) {
+			if (fdset_try_del(vhost_user.fdset, vsocket->socket_fd) == -1) {
 				pthread_mutex_unlock(&vhost_user.mutex);
 				goto again;
 			}
@@ -1103,7 +1103,7 @@ rte_vhost_driver_unregister(const char *path)
 				 * try again since the r/wcb may use the
 				 * conn_mutex and mutex locks.
 				 */
-				if (fdset_try_del(&vhost_user.fdset,
+				if (fdset_try_del(vhost_user.fdset,
 						  conn->connfd) == -1) {
 					pthread_mutex_unlock(&vsocket->conn_mutex);
 					pthread_mutex_unlock(&vhost_user.mutex);
@@ -1171,7 +1171,6 @@ int
 rte_vhost_driver_start(const char *path)
 {
 	struct vhost_user_socket *vsocket;
-	static rte_thread_t fdset_tid;

 	pthread_mutex_lock(&vhost_user.mutex);
 	vsocket = find_vhost_user_socket(path);
@@ -1183,19 +1182,12 @@ rte_vhost_driver_start(const char *path)
 	if (vsocket->is_vduse)
 		return vduse_device_create(path, vsocket->net_compliant_ol_flags);

-	if (fdset_tid.opaque_id == 0) {
-		if (fdset_init(&vhost_user.fdset) < 0) {
+	if (vhost_user.fdset == NULL) {
+		vhost_user.fdset = fdset_init("vhost-evt");
+		if (vhost_user.fdset == NULL) {
 			VHOST_CONFIG_LOG(path, ERR, "failed to init Vhost-user fdset");
 			return -1;
 		}
-
-		int ret = rte_thread_create_internal_control(&fdset_tid,
-			"vhost-evt", fdset_event_dispatch, &vhost_user.fdset);
-		if (ret != 0) {
-			VHOST_CONFIG_LOG(path, ERR, "failed to create fdset handling thread");
-			fdset_uninit(&vhost_user.fdset);
-			return -1;
-		}
 	}

 	if (vsocket->is_server)
diff --git a/lib/vhost/vduse.c b/lib/vhost/vduse.c
index d87fc500d4..c66602905c 100644
--- a/lib/vhost/vduse.c
+++ b/lib/vhost/vduse.c
@@ -28,13 +28,11 @@
 #define VDUSE_CTRL_PATH "/dev/vduse/control"

 struct vduse {
-	struct fdset fdset;
+	struct fdset *fdset;
 };

 static struct vduse vduse;

-static bool vduse_events_thread;
-
 static const char * const vduse_reqs_str[] = {
 	"VDUSE_GET_VQ_STATE",
 	"VDUSE_SET_STATUS",
@@ -215,7 +213,7 @@ vduse_vring_setup(struct virtio_net *dev, unsigned int index)
 	}

 	if (vq == dev->cvq) {
-		ret =
fdset_add(&vduse.fdset, vq->kickfd, vduse_control_queue_event, NULL, dev); + ret = fdset_add(vduse.fdset, vq->kickfd, vduse_control_queue_event, NULL, dev); if (ret) { VHOST_CONFIG_LOG(dev->ifname, ERR, "Failed to setup kickfd handler for VQ %u: %s", @@ -238,7 +236,7 @@ vduse_vring_cleanup(struct virtio_net *dev, unsigned int index) int ret; if (vq == dev->cvq && vq->kickfd >= 0) - fdset_del(&vduse.fdset, vq->kickfd); + fdset_del(vduse.fdset, vq->kickfd); vq_efd.index = index; vq_efd.fd = VDUSE_EVENTFD_DEASSIGN; @@ -413,7 +411,6 @@ int vduse_device_create(const char *path, bool compliant_ol_flags) { int control_fd, dev_fd, vid, ret; - rte_thread_t fdset_tid; uint32_t i, max_queue_pairs, total_queues; struct virtio_net *dev; struct virtio_net_config vnet_config = {{ 0 }}; @@ -422,22 +419,12 @@ vduse_device_create(const char *path, bool compliant_ol_flags) struct vduse_dev_config *dev_config = NULL; const char *name = path + strlen("/dev/vduse/"); - /* If first device, create events dispatcher thread */ - if (vduse_events_thread == false) { - if (fdset_init(&vduse.fdset) < 0) { + if (vduse.fdset == NULL) { + vduse.fdset = fdset_init("vduse-evt"); + if (vduse.fdset == NULL) { VHOST_CONFIG_LOG(path, ERR, "failed to init VDUSE fdset"); return -1; } - - ret = rte_thread_create_internal_control(&fdset_tid, "vduse-evt", - fdset_event_dispatch, &vduse.fdset); - if (ret != 0) { - VHOST_CONFIG_LOG(path, ERR, "failed to create vduse fdset handling thread"); - fdset_uninit(&vduse.fdset); - return -1; - } - - vduse_events_thread = true; } control_fd = open(VDUSE_CTRL_PATH, O_RDWR); @@ -555,7 +542,7 @@ vduse_device_create(const char *path, bool compliant_ol_flags) dev->cvq = dev->virtqueue[max_queue_pairs * 2]; - ret = fdset_add(&vduse.fdset, dev->vduse_dev_fd, vduse_events_handler, NULL, dev); + ret = fdset_add(vduse.fdset, dev->vduse_dev_fd, vduse_events_handler, NULL, dev); if (ret) { VHOST_CONFIG_LOG(name, ERR, "Failed to add fd %d to vduse fdset", dev->vduse_dev_fd); @@ 
-602,7 +589,7 @@ vduse_device_destroy(const char *path) vduse_device_stop(dev); - fdset_del(&vduse.fdset, dev->vduse_dev_fd); + fdset_del(vduse.fdset, dev->vduse_dev_fd); if (dev->vduse_dev_fd >= 0) { close(dev->vduse_dev_fd); From patchwork Tue Jun 11 13:39:57 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 140931 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 1F437424CD; Tue, 11 Jun 2024 15:40:53 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 035DD40B9C; Tue, 11 Jun 2024 15:40:21 +0200 (CEST) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by mails.dpdk.org (Postfix) with ESMTP id 8878740A8B for ; Tue, 11 Jun 2024 15:40:19 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1718113219; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ZTdIPN2YAuD8H+AaweVXwPCT4yY4yqshXOxCzLgxaJA=; b=jQn1Cx2dnbX+JVUyksO6fg1EV/TjpKG2YfOZjVzXtUQk+f6IktoZNztQxPbLOKVR5R3ZOI qNWPdq5ZkP6TF1W8XU4yOraEYQLwFRv4/IMnCKRgXairCz4L23tZjSGRL8QKJZi+I7LsMF 28HrKuOT9/xh0wfdNQVOhJvfH6TkPc4= Received: from mx-prod-mc-02.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-682-kMi8-TX0PvKJ1lm9m_PygQ-1; Tue, 11 Jun 2024 09:40:16 -0400 X-MC-Unique: kMi8-TX0PvKJ1lm9m_PygQ-1 Received: from 
mx-prod-int-02.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-02.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.15]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-02.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 8F3941955E8E; Tue, 11 Jun 2024 13:40:15 +0000 (UTC) Received: from max-p1.redhat.com (unknown [10.39.208.15]) by mx-prod-int-02.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 6B1021955E84; Tue, 11 Jun 2024 13:40:13 +0000 (UTC) From: Maxime Coquelin To: dev@dpdk.org, david.marchand@redhat.com, chenbox@nvidia.com Cc: Maxime Coquelin Subject: [PATCH v4 5/5] vhost: manage FD with epoll Date: Tue, 11 Jun 2024 15:39:57 +0200 Message-ID: <20240611133957.72032-6-maxime.coquelin@redhat.com> In-Reply-To: <20240611133957.72032-1-maxime.coquelin@redhat.com> References: <20240611133957.72032-1-maxime.coquelin@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.0 on 10.30.177.15 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org From: David Marchand Switch to epoll so that the concern over the poll() fd array is removed. Add a simple list of used entries and track the next free entry. Since epoll is thread safe, a synchronization mechanism is no longer needed, so the notification pipe can be removed. 
Signed-off-by: David Marchand Signed-off-by: Maxime Coquelin --- lib/vhost/fd_man.c | 439 +++++++++++++++------------------------------ lib/vhost/fd_man.h | 5 +- 2 files changed, 144 insertions(+), 300 deletions(-) diff --git a/lib/vhost/fd_man.c b/lib/vhost/fd_man.c index 866904016e..12c0f2c2b3 100644 --- a/lib/vhost/fd_man.c +++ b/lib/vhost/fd_man.c @@ -3,9 +3,9 @@ */ #include -#include #include #include +#include #include #include @@ -21,49 +21,33 @@ RTE_LOG_REGISTER_SUFFIX(vhost_fdset_logtype, fdset, INFO); #define VHOST_FDMAN_LOG(level, ...) \ RTE_LOG_LINE(level, VHOST_FDMAN, "" __VA_ARGS__) -#define FDPOLLERR (POLLERR | POLLHUP | POLLNVAL) - struct fdentry { int fd; /* -1 indicates this entry is empty */ fd_cb rcb; /* callback when this fd is readable. */ fd_cb wcb; /* callback when this fd is writeable.*/ void *dat; /* fd context */ int busy; /* whether this entry is being used in cb. */ + LIST_ENTRY(fdentry) next; }; struct fdset { char name[RTE_THREAD_NAME_SIZE]; - struct pollfd rwfds[MAX_FDS]; + int epfd; struct fdentry fd[MAX_FDS]; + LIST_HEAD(, fdentry) fdlist; + int next_free_idx; rte_thread_t tid; pthread_mutex_t fd_mutex; - pthread_mutex_t fd_polling_mutex; - int num; /* current fd number of this fdset */ - - union pipefds { - struct { - int pipefd[2]; - }; - struct { - int readfd; - int writefd; - }; - } u; - - pthread_mutex_t sync_mutex; - pthread_cond_t sync_cond; - bool sync; bool destroy; }; -static int fdset_add_no_sync(struct fdset *pfdset, int fd, fd_cb rcb, fd_cb wcb, void *dat); -static uint32_t fdset_event_dispatch(void *arg); - #define MAX_FDSETS 8 static struct fdset *fdsets[MAX_FDSETS]; static pthread_mutex_t fdsets_mutex = PTHREAD_MUTEX_INITIALIZER; +static uint32_t fdset_event_dispatch(void *arg); + static struct fdset * fdset_lookup(const char *name) { @@ -96,166 +80,6 @@ fdset_insert(struct fdset *fdset) return -1; } -static void -fdset_pipe_read_cb(int readfd, void *dat, - int *remove __rte_unused) -{ - char charbuf[16]; - struct 
fdset *fdset = dat; - int r = read(readfd, charbuf, sizeof(charbuf)); - /* - * Just an optimization, we don't care if read() failed - * so ignore explicitly its return value to make the - * compiler happy - */ - RTE_SET_USED(r); - - pthread_mutex_lock(&fdset->sync_mutex); - fdset->sync = true; - pthread_cond_broadcast(&fdset->sync_cond); - pthread_mutex_unlock(&fdset->sync_mutex); -} - -static void -fdset_pipe_uninit(struct fdset *fdset) -{ - fdset_del(fdset, fdset->u.readfd); - close(fdset->u.readfd); - fdset->u.readfd = -1; - close(fdset->u.writefd); - fdset->u.writefd = -1; -} - -static int -fdset_pipe_init(struct fdset *fdset) -{ - int ret; - - pthread_mutex_init(&fdset->sync_mutex, NULL); - pthread_cond_init(&fdset->sync_cond, NULL); - - if (pipe(fdset->u.pipefd) < 0) { - VHOST_FDMAN_LOG(ERR, - "failed to create pipe for vhost fdset"); - return -1; - } - - ret = fdset_add_no_sync(fdset, fdset->u.readfd, - fdset_pipe_read_cb, NULL, fdset); - if (ret < 0) { - VHOST_FDMAN_LOG(ERR, - "failed to add pipe readfd %d into vhost server fdset", - fdset->u.readfd); - - fdset_pipe_uninit(fdset); - return -1; - } - - return 0; -} - -static void -fdset_sync(struct fdset *fdset) -{ - int ret; - - pthread_mutex_lock(&fdset->sync_mutex); - - fdset->sync = false; - ret = write(fdset->u.writefd, "1", 1); - if (ret < 0) { - VHOST_FDMAN_LOG(ERR, - "Failed to write to notification pipe: %s", - strerror(errno)); - goto out_unlock; - } - - while (!fdset->sync) - pthread_cond_wait(&fdset->sync_cond, &fdset->sync_mutex); - -out_unlock: - pthread_mutex_unlock(&fdset->sync_mutex); -} - -static int -get_last_valid_idx(struct fdset *pfdset, int last_valid_idx) -{ - int i; - - for (i = last_valid_idx; i >= 0 && pfdset->fd[i].fd == -1; i--) - ; - - return i; -} - -static void -fdset_move(struct fdset *pfdset, int dst, int src) -{ - pfdset->fd[dst] = pfdset->fd[src]; - pfdset->rwfds[dst] = pfdset->rwfds[src]; -} - -static void -fdset_shrink_nolock(struct fdset *pfdset) -{ - int i; - int 
last_valid_idx = get_last_valid_idx(pfdset, pfdset->num - 1); - - for (i = 0; i < last_valid_idx; i++) { - if (pfdset->fd[i].fd != -1) - continue; - - fdset_move(pfdset, i, last_valid_idx); - last_valid_idx = get_last_valid_idx(pfdset, last_valid_idx - 1); - } - pfdset->num = last_valid_idx + 1; -} - -/* - * Find deleted fd entries and remove them - */ -static void -fdset_shrink(struct fdset *pfdset) -{ - pthread_mutex_lock(&pfdset->fd_mutex); - fdset_shrink_nolock(pfdset); - pthread_mutex_unlock(&pfdset->fd_mutex); -} - -/** - * Returns the index in the fdset for a given fd. - * @return - * index for the fd, or -1 if fd isn't in the fdset. - */ -static int -fdset_find_fd(struct fdset *pfdset, int fd) -{ - int i; - - for (i = 0; i < pfdset->num && pfdset->fd[i].fd != fd; i++) - ; - - return i == pfdset->num ? -1 : i; -} - -static void -fdset_add_fd(struct fdset *pfdset, int idx, int fd, - fd_cb rcb, fd_cb wcb, void *dat) -{ - struct fdentry *pfdentry = &pfdset->fd[idx]; - struct pollfd *pfd = &pfdset->rwfds[idx]; - - pfdentry->fd = fd; - pfdentry->rcb = rcb; - pfdentry->wcb = wcb; - pfdentry->dat = dat; - - pfd->fd = fd; - pfd->events = rcb ? POLLIN : 0; - pfd->events |= wcb ? 
POLLOUT : 0; - pfd->revents = 0; -} - struct fdset * fdset_init(const char *name) { @@ -272,23 +96,27 @@ fdset_init(const char *name) fdset = rte_zmalloc(NULL, sizeof(*fdset), 0); if (!fdset) { - VHOST_FDMAN_LOG(ERR, "Failed to alloc fdset %s", name); + VHOST_FDMAN_LOG(ERR, "failed to alloc fdset %s", name); goto err_unlock; } rte_strscpy(fdset->name, name, RTE_THREAD_NAME_SIZE); pthread_mutex_init(&fdset->fd_mutex, NULL); - pthread_mutex_init(&fdset->fd_polling_mutex, NULL); - for (i = 0; i < MAX_FDS; i++) { + for (i = 0; i < (int)RTE_DIM(fdset->fd); i++) { fdset->fd[i].fd = -1; fdset->fd[i].dat = NULL; } - fdset->num = 0; + LIST_INIT(&fdset->fdlist); - if (fdset_pipe_init(fdset)) { - VHOST_FDMAN_LOG(ERR, "Failed to init pipe for %s", name); + /* + * Any non-zero value would work (see man epoll_create), + * but pass MAX_FDS for consistency. + */ + fdset->epfd = epoll_create(MAX_FDS); + if (fdset->epfd < 0) { + VHOST_FDMAN_LOG(ERR, "failed to create epoll for %s fdset", name); goto err_free; } @@ -296,7 +124,7 @@ fdset_init(const char *name) fdset_event_dispatch, fdset)) { VHOST_FDMAN_LOG(ERR, "Failed to create %s event dispatch thread", fdset->name); - goto err_pipe; + goto err_epoll; } if (fdset_insert(fdset)) { @@ -310,10 +138,9 @@ fdset_init(const char *name) err_thread: fdset->destroy = true; - fdset_sync(fdset); rte_thread_join(fdset->tid, &val); -err_pipe: - fdset_pipe_uninit(fdset); +err_epoll: + close(fdset->epfd); err_free: rte_free(fdset); err_unlock: @@ -322,81 +149,136 @@ fdset_init(const char *name) return NULL; } -/** - * Register the fd in the fdset with read/write handler and context. 
- */ static int -fdset_add_no_sync(struct fdset *pfdset, int fd, fd_cb rcb, fd_cb wcb, void *dat) +fdset_insert_entry(struct fdset *pfdset, int fd, fd_cb rcb, fd_cb wcb, void *dat) { - int i; + struct fdentry *pfdentry; - if (pfdset == NULL || fd == -1) + if (pfdset->next_free_idx >= (int)RTE_DIM(pfdset->fd)) return -1; - pthread_mutex_lock(&pfdset->fd_mutex); - i = pfdset->num < MAX_FDS ? pfdset->num++ : -1; - if (i == -1) { - pthread_mutex_lock(&pfdset->fd_polling_mutex); - fdset_shrink_nolock(pfdset); - pthread_mutex_unlock(&pfdset->fd_polling_mutex); - i = pfdset->num < MAX_FDS ? pfdset->num++ : -1; - if (i == -1) { - pthread_mutex_unlock(&pfdset->fd_mutex); - return -2; - } - } + pfdentry = &pfdset->fd[pfdset->next_free_idx]; + pfdentry->fd = fd; + pfdentry->rcb = rcb; + pfdentry->wcb = wcb; + pfdentry->dat = dat; - fdset_add_fd(pfdset, i, fd, rcb, wcb, dat); - pthread_mutex_unlock(&pfdset->fd_mutex); + LIST_INSERT_HEAD(&pfdset->fdlist, pfdentry, next); + + /* Find next free slot */ + pfdset->next_free_idx++; + for (; pfdset->next_free_idx < (int)RTE_DIM(pfdset->fd); pfdset->next_free_idx++) { + if (pfdset->fd[pfdset->next_free_idx].fd != -1) + continue; + break; + } return 0; } -int -fdset_add(struct fdset *pfdset, int fd, fd_cb rcb, fd_cb wcb, void *dat) +static void +fdset_remove_entry(struct fdset *pfdset, struct fdentry *pfdentry) { - int ret; + int entry_idx; - ret = fdset_add_no_sync(pfdset, fd, rcb, wcb, dat); - if (ret < 0) - return ret; + pfdentry->fd = -1; + pfdentry->rcb = pfdentry->wcb = NULL; + pfdentry->dat = NULL; - fdset_sync(pfdset); + entry_idx = pfdentry - pfdset->fd; + if (entry_idx < pfdset->next_free_idx) + pfdset->next_free_idx = entry_idx; - return 0; + LIST_REMOVE(pfdentry, next); +} + +static struct fdentry * +fdset_find_entry_locked(struct fdset *pfdset, int fd) +{ + struct fdentry *pfdentry; + + LIST_FOREACH(pfdentry, &pfdset->fdlist, next) { + if (pfdentry->fd != fd) + continue; + return pfdentry; + } + + return NULL; } /** - * 
Unregister the fd from the fdset. - * Returns context of a given fd or NULL. + * Register the fd in the fdset with read/write handler and context. */ -void * +int +fdset_add(struct fdset *pfdset, int fd, fd_cb rcb, fd_cb wcb, void *dat) +{ + struct epoll_event ev; + struct fdentry *pfdentry; + int ret = 0; + + if (pfdset == NULL || fd == -1) { + ret = -1; + goto out; + } + + pthread_mutex_lock(&pfdset->fd_mutex); + ret = fdset_insert_entry(pfdset, fd, rcb, wcb, dat); + if (ret < 0) { + VHOST_FDMAN_LOG(ERR, "failed to insert fdset entry"); + pthread_mutex_unlock(&pfdset->fd_mutex); + goto out; + } + pthread_mutex_unlock(&pfdset->fd_mutex); + + ev.events = EPOLLERR; + ev.events |= rcb ? EPOLLIN : 0; + ev.events |= wcb ? EPOLLOUT : 0; + ev.data.fd = fd; + + ret = epoll_ctl(pfdset->epfd, EPOLL_CTL_ADD, fd, &ev); + if (ret < 0) { + VHOST_FDMAN_LOG(ERR, "could not add %d fd to %d epfd: %s", + fd, pfdset->epfd, strerror(errno)); + goto out_remove; + } + +out_remove: + pthread_mutex_lock(&pfdset->fd_mutex); + pfdentry = fdset_find_entry_locked(pfdset, fd); + if (pfdentry) + fdset_remove_entry(pfdset, pfdentry); + pthread_mutex_unlock(&pfdset->fd_mutex); +out: + return ret; +} + +static void +fdset_del_locked(struct fdset *pfdset, struct fdentry *pfdentry) +{ + if (epoll_ctl(pfdset->epfd, EPOLL_CTL_DEL, pfdentry->fd, NULL) == -1) + VHOST_FDMAN_LOG(ERR, "could not remove %d fd from %d epfd: %s", + pfdentry->fd, pfdset->epfd, strerror(errno)); + + fdset_remove_entry(pfdset, pfdentry); +} + +void fdset_del(struct fdset *pfdset, int fd) { - int i; - void *dat = NULL; + struct fdentry *pfdentry; if (pfdset == NULL || fd == -1) - return NULL; + return; do { pthread_mutex_lock(&pfdset->fd_mutex); - - i = fdset_find_fd(pfdset, fd); - if (i != -1 && pfdset->fd[i].busy == 0) { - /* busy indicates r/wcb is executing! 
*/ - dat = pfdset->fd[i].dat; - pfdset->fd[i].fd = -1; - pfdset->fd[i].rcb = pfdset->fd[i].wcb = NULL; - pfdset->fd[i].dat = NULL; - i = -1; + pfdentry = fdset_find_entry_locked(pfdset, fd); + if (pfdentry != NULL && pfdentry->busy == 0) { + fdset_del_locked(pfdset, pfdentry); + pfdentry = NULL; } pthread_mutex_unlock(&pfdset->fd_mutex); - } while (i != -1); - - fdset_sync(pfdset); - - return dat; + } while (pfdentry != NULL); } /** @@ -410,28 +292,22 @@ fdset_del(struct fdset *pfdset, int fd) int fdset_try_del(struct fdset *pfdset, int fd) { - int i; + struct fdentry *pfdentry; if (pfdset == NULL || fd == -1) return -2; pthread_mutex_lock(&pfdset->fd_mutex); - i = fdset_find_fd(pfdset, fd); - if (i != -1 && pfdset->fd[i].busy) { + pfdentry = fdset_find_entry_locked(pfdset, fd); + if (pfdentry != NULL && pfdentry->busy != 0) { pthread_mutex_unlock(&pfdset->fd_mutex); return -1; } - if (i != -1) { - pfdset->fd[i].fd = -1; - pfdset->fd[i].rcb = pfdset->fd[i].wcb = NULL; - pfdset->fd[i].dat = NULL; - } + if (pfdentry != NULL) + fdset_del_locked(pfdset, pfdentry); pthread_mutex_unlock(&pfdset->fd_mutex); - - fdset_sync(pfdset); - return 0; } @@ -448,53 +324,29 @@ static uint32_t fdset_event_dispatch(void *arg) { int i; - struct pollfd *pfd; - struct fdentry *pfdentry; fd_cb rcb, wcb; void *dat; int fd, numfds; int remove1, remove2; - int need_shrink; struct fdset *pfdset = arg; - int val; if (pfdset == NULL) return 0; while (1) { + struct epoll_event events[MAX_FDS]; + struct fdentry *pfdentry; - /* - * When poll is blocked, other threads might unregister - * listenfds from and register new listenfds into fdset. - * When poll returns, the entries for listenfds in the fdset - * might have been updated. It is ok if there is unwanted call - * for new listenfds. 
- */ - pthread_mutex_lock(&pfdset->fd_mutex); - numfds = pfdset->num; - pthread_mutex_unlock(&pfdset->fd_mutex); - - pthread_mutex_lock(&pfdset->fd_polling_mutex); - val = poll(pfdset->rwfds, numfds, 1000 /* millisecs */); - pthread_mutex_unlock(&pfdset->fd_polling_mutex); - if (val < 0) + numfds = epoll_wait(pfdset->epfd, events, RTE_DIM(events), 1000); + if (numfds < 0) continue; - need_shrink = 0; for (i = 0; i < numfds; i++) { pthread_mutex_lock(&pfdset->fd_mutex); - pfdentry = &pfdset->fd[i]; - fd = pfdentry->fd; - pfd = &pfdset->rwfds[i]; - - if (fd < 0) { - need_shrink = 1; - pthread_mutex_unlock(&pfdset->fd_mutex); - continue; - } - - if (!pfd->revents) { + fd = events[i].data.fd; + pfdentry = fdset_find_entry_locked(pfdset, fd); + if (pfdentry == NULL) { pthread_mutex_unlock(&pfdset->fd_mutex); continue; } @@ -508,9 +360,9 @@ fdset_event_dispatch(void *arg) pthread_mutex_unlock(&pfdset->fd_mutex); - if (rcb && pfd->revents & (POLLIN | FDPOLLERR)) + if (rcb && events[i].events & (EPOLLIN | EPOLLERR | EPOLLHUP)) rcb(fd, dat, &remove1); - if (wcb && pfd->revents & (POLLOUT | FDPOLLERR)) + if (wcb && events[i].events & (EPOLLOUT | EPOLLERR | EPOLLHUP)) wcb(fd, dat, &remove2); pfdentry->busy = 0; /* @@ -519,21 +371,14 @@ fdset_event_dispatch(void *arg) * directly. */ /* - * When we are to clean up the fd from fdset, - * because the fd is closed in the cb, - * the old fd val could be reused by when creates new - * listen fd in another thread, we couldn't call - * fdset_del. + * A concurrent fdset_del may have been waiting for the + * fdentry not to be busy, so we can't call + * fdset_del_locked(). 
*/ - if (remove1 || remove2) { - pfdentry->fd = -1; - need_shrink = 1; - } + if (remove1 || remove2) + fdset_del(pfdset, fd); } - if (need_shrink) - fdset_shrink(pfdset); - if (pfdset->destroy) break; } diff --git a/lib/vhost/fd_man.h b/lib/vhost/fd_man.h index 079fa0155f..6398343a6a 100644 --- a/lib/vhost/fd_man.h +++ b/lib/vhost/fd_man.h @@ -6,7 +6,7 @@ #define _FD_MAN_H_ #include #include -#include +#include struct fdset; @@ -19,8 +19,7 @@ struct fdset *fdset_init(const char *name); int fdset_add(struct fdset *pfdset, int fd, fd_cb rcb, fd_cb wcb, void *dat); -void *fdset_del(struct fdset *pfdset, int fd); - +void fdset_del(struct fdset *pfdset, int fd); int fdset_try_del(struct fdset *pfdset, int fd); #endif
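
For readers following the epoll conversion, here is a minimal standalone sketch of the register-then-dispatch pattern the commit message describes. It is not the fd_man.c code. All demo_* names and DEMO_MAX_FDS are hypothetical, and it leaves out the locking, the busy flag and the sys/queue.h list. It assumes Linux with <sys/epoll.h>.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/epoll.h>
#include <unistd.h>

#define DEMO_MAX_FDS 8 /* hypothetical, kept small for illustration */

typedef void (*demo_cb)(int fd, void *dat);

struct demo_entry {
	int fd;      /* -1 marks a free slot */
	demo_cb rcb; /* called when fd is readable */
	void *dat;   /* callback context */
};

struct demo_set {
	int epfd;
	struct demo_entry e[DEMO_MAX_FDS];
};

static int
demo_init(struct demo_set *s)
{
	int i;

	assert(s != NULL);
	for (i = 0; i < DEMO_MAX_FDS; i++)
		s->e[i].fd = -1;
	s->epfd = epoll_create1(0);
	return s->epfd < 0 ? -1 : 0;
}

/* Linear lookup; passing -1 returns the first free slot. */
static struct demo_entry *
demo_find(struct demo_set *s, int fd)
{
	int i;

	for (i = 0; i < DEMO_MAX_FDS; i++)
		if (s->e[i].fd == fd)
			return &s->e[i];
	return NULL;
}

static int
demo_add(struct demo_set *s, int fd, demo_cb rcb, void *dat)
{
	struct epoll_event ev;
	struct demo_entry *slot;

	if (fd < 0)
		return -1;
	slot = demo_find(s, -1);
	if (slot == NULL)
		return -1;
	slot->fd = fd;
	slot->rcb = rcb;
	slot->dat = dat;

	memset(&ev, 0, sizeof(ev));
	ev.events = EPOLLERR | (rcb ? EPOLLIN : 0);
	ev.data.fd = fd;
	if (epoll_ctl(s->epfd, EPOLL_CTL_ADD, fd, &ev) < 0)
		goto out_remove;

	return 0; /* success: the entry stays registered */

out_remove:
	slot->fd = -1; /* error path only: release the slot */
	return -1;
}

/* One dispatch pass; returns the number of callbacks run, or -1. */
static int
demo_dispatch_once(struct demo_set *s, int timeout_ms)
{
	struct epoll_event events[DEMO_MAX_FDS];
	int i, n, handled = 0;

	n = epoll_wait(s->epfd, events, DEMO_MAX_FDS, timeout_ms);
	if (n < 0)
		return -1;
	for (i = 0; i < n; i++) {
		struct demo_entry *ent = demo_find(s, events[i].data.fd);

		if (ent == NULL || ent->rcb == NULL)
			continue;
		if (events[i].events & (EPOLLIN | EPOLLERR | EPOLLHUP)) {
			ent->rcb(ent->fd, ent->dat);
			handled++;
		}
	}
	return handled;
}

/* Example callback: drain one byte and count the event in *dat. */
static void
demo_count_cb(int fd, void *dat)
{
	char c;

	if (read(fd, &c, 1) == 1)
		(*(int *)dat)++;
}
```

In this sketch the success path returns before the out_remove label, so the slot is released only when epoll_ctl() fails. A caller would run demo_init() once, demo_add() for each fd, then call demo_dispatch_once() in a loop from the event thread.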