From patchwork Thu Dec 28 01:23:27 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: =?utf-8?b?546L5b+X5YWL?=
X-Patchwork-Id: 32774
X-Patchwork-Delegate: yuanhan.liu@linux.intel.com
Return-Path:
X-Original-To: patchwork@dpdk.org
Delivered-To: patchwork@dpdk.org
Received: from [92.243.14.124] (localhost [127.0.0.1])
 by dpdk.org (Postfix) with ESMTP id 0A5AD1B23A;
 Thu, 28 Dec 2017 02:24:02 +0100 (CET)
Received: from smtp.jd.com (smtp.jd.com [58.83.206.59])
 by dpdk.org (Postfix) with ESMTP id 58A9C1B238
 for ; Thu, 28 Dec 2017 02:24:00 +0100 (CET)
Received: from USHUB02.360buyAD.local (172.26.2.76) by
 hub04.360buyAD.local (172.17.27.18) with Microsoft SMTP Server (TLS)
 id 14.3.361.1; Thu, 28 Dec 2017 09:24:00 +0800
Received: from localhost.localdomain (39.109.125.68) by
 USHUB02.360buyAD.local (138.229.76.5) with Microsoft SMTP Server
 id 14.3.361.1; Thu, 28 Dec 2017 09:23:45 +0800
From: zhike wang
To:
CC: wang zhike
Date: Wed, 27 Dec 2017 17:23:27 -0800
Message-ID: <1514424207-98179-1-git-send-email-wangzhike@jd.com>
X-Mailer: git-send-email 1.8.3.1
MIME-Version: 1.0
X-Originating-IP: [39.109.125.68]
Subject: [dpdk-dev] [PATCH v2] lib/librte_vhost: move fdset_del out of
 conn_mutex
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

From: wang zhike

v2:
* Move fdset_del before conn destroy.
* Fix coding style.

This patch fixes the following race condition:
1. One thread calls rte_vhost_driver_unregister(), which locks
   conn_mutex and then calls fdset_del(), which loops until fd.busy
   is cleared.
2. Another thread runs fdset_event_dispatch(), which clears the busy
   flag only AFTER the handler for the fd, i.e. rcb(), has returned.
   However, the rcb, such as vhost_user_read_cb(), tries to acquire
   conn_mutex.
The issue is that the first thread keeps polling the busy flag while
holding the mutex, while the second thread is blocked on the mutex and
cannot clear the flag. A deadlock is then observed.

Signed-off-by: zhike wang
---
 lib/librte_vhost/socket.c | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/lib/librte_vhost/socket.c b/lib/librte_vhost/socket.c
index 422da00..017f824 100644
--- a/lib/librte_vhost/socket.c
+++ b/lib/librte_vhost/socket.c
@@ -749,6 +749,9 @@ struct vhost_user_reconnect_list {
 		struct vhost_user_socket *vsocket = vhost_user.vsockets[i];
 
 		if (!strcmp(vsocket->path, path)) {
+			int del_fds[MAX_FDS];
+			int num_of_fds = 0, i;
+
 			if (vsocket->is_server) {
 				fdset_del(&vhost_user.fdset, vsocket->socket_fd);
 				close(vsocket->socket_fd);
@@ -757,13 +760,26 @@ struct vhost_user_reconnect_list {
 				vhost_user_remove_reconnect(vsocket);
 			}
 
+			/* fdset_del() must be called without conn_mutex. */
+			pthread_mutex_lock(&vsocket->conn_mutex);
+			for (conn = TAILQ_FIRST(&vsocket->conn_list);
+			     conn != NULL;
+			     conn = next) {
+				next = TAILQ_NEXT(conn, next);
+
+				del_fds[num_of_fds++] = conn->connfd;
+			}
+			pthread_mutex_unlock(&vsocket->conn_mutex);
+
+			for (i = 0; i < num_of_fds; i++)
+				fdset_del(&vhost_user.fdset, del_fds[i]);
+
 			pthread_mutex_lock(&vsocket->conn_mutex);
 			for (conn = TAILQ_FIRST(&vsocket->conn_list);
 			     conn != NULL;
 			     conn = next) {
 				next = TAILQ_NEXT(conn, next);
 
-				fdset_del(&vhost_user.fdset, conn->connfd);
 				RTE_LOG(INFO, VHOST_CONFIG,
 					"free connfd = %d for device '%s'\n",
 					conn->connfd, path);
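
For illustration only, the locking pattern used by this patch can be
sketched in isolation as below. This is a minimal stand-alone sketch,
not the librte_vhost code: struct demo_socket, DEMO_MAX_FDS and every
demo_* function are hypothetical names invented for the example.
demo_fdset_del() merely emulates a delete that has to wait for a
callback which takes conn_mutex; it uses pthread_mutex_trylock() to
report the case where the real code would block forever. The sketch
also caps the copy at the size of the local array, which is an
assumption of the example rather than something taken from the diff.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define DEMO_MAX_FDS 8 /* hypothetical bound, standing in for MAX_FDS */

/* Hypothetical stand-in for a vhost-user socket and its connections. */
struct demo_socket {
	pthread_mutex_t conn_mutex;
	int connfds[DEMO_MAX_FDS];
	size_t nr_conns;
};

static void demo_socket_init(struct demo_socket *s, const int *fds, size_t n)
{
	int rc = pthread_mutex_init(&s->conn_mutex, NULL);

	assert(rc == 0 && n <= DEMO_MAX_FDS);
	(void)rc;
	s->nr_conns = n;
	for (size_t i = 0; i < n; i++)
		s->connfds[i] = fds[i];
}

/*
 * Emulates a delete that must wait for a read callback needing conn_mutex.
 * Returns 0 if the mutex was free, -1 if the caller still holds it (the
 * situation in which the real code would deadlock).
 */
static int demo_fdset_del(struct demo_socket *s, int fd)
{
	(void)fd;
	if (pthread_mutex_trylock(&s->conn_mutex) != 0)
		return -1;
	pthread_mutex_unlock(&s->conn_mutex);
	return 0;
}

/* Old ordering: delete while holding conn_mutex. Returns -1 on "deadlock". */
static int demo_unregister_locked(struct demo_socket *s)
{
	int deleted = 0;

	pthread_mutex_lock(&s->conn_mutex);
	for (size_t i = 0; i < s->nr_conns; i++) {
		if (demo_fdset_del(s, s->connfds[i]) != 0) {
			pthread_mutex_unlock(&s->conn_mutex);
			return -1;
		}
		deleted++;
	}
	pthread_mutex_unlock(&s->conn_mutex);
	return deleted;
}

/* New ordering: snapshot the fds under the lock, delete without it. */
static int demo_unregister(struct demo_socket *s)
{
	int fds[DEMO_MAX_FDS];
	size_t n = 0;
	int deleted = 0;

	pthread_mutex_lock(&s->conn_mutex);
	while (n < s->nr_conns && n < DEMO_MAX_FDS) {
		fds[n] = s->connfds[n];
		n++;
	}
	pthread_mutex_unlock(&s->conn_mutex);

	for (size_t i = 0; i < n; i++) {
		if (demo_fdset_del(s, fds[i]) != 0)
			return -1;
		deleted++;
	}

	/* Re-take the lock only for the list teardown itself. */
	pthread_mutex_lock(&s->conn_mutex);
	s->nr_conns = 0;
	pthread_mutex_unlock(&s->conn_mutex);
	return deleted;
}
```

Calling demo_unregister_locked() on a socket with at least one
connection reports the would-be deadlock by returning -1, whereas
demo_unregister() on the same socket returns the number of fds it
deleted. The cost of the snapshot approach is that the connection list
may change between the two locked sections, so it fits best where no
new connections can be added at that point.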