From patchwork Thu Dec 14 11:35:31 2017
Subject: [dpdk-dev] [PATCH v3] vhost_user: protect active rings from async
 ring changes
From: Victor Kaplansky
Date: Thu, 14 Dec 2017 13:35:31 +0200
To: dev@dpdk.org
Cc: stable@dpdk.org, Jens Freimann, Maxime Coquelin, Yuanhan Liu,
 Tiwei Bie, "Tan, Jianfeng", Victor Kaplansky
Message-ID: <20171214133531-mutt-send-email-victork@redhat.com>

When performing live migration or memory hot-plugging, the changes to
the device and vrings made by the message handler are done
independently of the vring usage by the PMD threads.

This causes, for example, segfaults during live migration with MQ
enabled, but in general virtually any request sent by QEMU that
changes the state of the device can cause problems.

This patch fixes all of the above issues by adding a spinlock to every
vring and requiring the message handler to start its operation only
after ensuring that all PMD threads related to the device are outside
the critical section that accesses the vring data.

Each vring has its own lock in order to avoid contention between PMD
threads of different vrings and to prevent performance degradation
when scaling the number of queue pairs.

See https://bugzilla.redhat.com/show_bug.cgi?id=1450680

Signed-off-by: Victor Kaplansky
Tested-by: Maxime Coquelin
---
v3:
   o Added locking to enqueue flow.
   o Enqueue path guarded as well as dequeue path.
   o Changed name of active_lock.
   o Added initialization of guarding spinlock.
   o Reworked the functions that iterate over all virtqueues.
   o Performance measurements done by Maxime Coquelin show no
     degradation in bandwidth or throughput.
   o Spelling.
   o Taking lock only on set operations.
   o IOMMU messages are not guarded by access lock.

v2:
   o Fixed checkpatch complaints.
   o Added Signed-off-by.
   o Refined placement of guard to exclude IOMMU messages.
   o TODO: performance degradation measurement.
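As a rough illustration for reviewers (not part of the patch): the
scheme boils down to "each data-path thread takes its own vring's lock
for the duration of a burst; the message handler takes every vring's
lock before acting". The standalone toy below sketches that discipline
under stated assumptions: pthread spinlocks stand in for
rte_spinlock_t and struct toy_queue stands in for struct
vhost_virtqueue, so it builds without DPDK; toy_queue, pmd_thread,
reconfigure_device and NR_QUEUES are made-up names. Because each
data-path thread holds at most one lock at a time, the all-locks
acquisition in the control path cannot deadlock against them.

/*
 * Standalone sketch of the locking discipline (illustration only).
 * Build: cc -O2 -o toy toy.c -lpthread
 */
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

#define NR_QUEUES 4

struct toy_queue {
	pthread_spinlock_t access_lock; /* plays the role of vq->access_lock */
	bool enabled;
	unsigned long bursts;
};

static struct toy_queue queues[NR_QUEUES];
static atomic_bool running = true;

/* Data path: each polling thread guards only its own queue and holds
 * at most one lock at a time, so it cannot deadlock against the
 * control path taking all locks. */
static void *pmd_thread(void *arg)
{
	struct toy_queue *q = arg;

	while (atomic_load(&running)) {
		pthread_spin_lock(&q->access_lock);
		if (q->enabled)
			q->bursts++; /* stands in for ring processing */
		pthread_spin_unlock(&q->access_lock);
	}
	return NULL;
}

/* Control path: take every queue's lock before touching shared state,
 * mirroring vhost_user_lock_all_queue_pairs() in the patch. */
static void reconfigure_device(void)
{
	int i;

	for (i = 0; i < NR_QUEUES; i++)
		pthread_spin_lock(&queues[i].access_lock);

	/* Safe: no data-path thread is inside its critical section. */
	for (i = 0; i < NR_QUEUES; i++)
		queues[i].enabled = !queues[i].enabled;

	for (i = 0; i < NR_QUEUES; i++)
		pthread_spin_unlock(&queues[i].access_lock);
}

int main(void)
{
	pthread_t tids[NR_QUEUES];
	int i;

	for (i = 0; i < NR_QUEUES; i++) {
		pthread_spin_init(&queues[i].access_lock,
				  PTHREAD_PROCESS_PRIVATE);
		queues[i].enabled = true;
		pthread_create(&tids[i], NULL, pmd_thread, &queues[i]);
	}

	reconfigure_device(); /* as if a vhost-user message arrived */
	sleep(1);
	atomic_store(&running, false);

	for (i = 0; i < NR_QUEUES; i++)
		pthread_join(tids[i], NULL);

	printf("queue 0 ran %lu bursts\n", queues[0].bursts);
	return 0;
}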
 lib/librte_vhost/vhost.h      | 15 ++++++++++
 lib/librte_vhost/vhost.c      |  1 +
 lib/librte_vhost/vhost_user.c | 65 +++++++++++++++++++++++++++++++++++++++++++
 lib/librte_vhost/virtio_net.c | 20 +++++++++++--
 4 files changed, 99 insertions(+), 2 deletions(-)

diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
index 1cc81c17..26e2c571 100644
--- a/lib/librte_vhost/vhost.h
+++ b/lib/librte_vhost/vhost.h
@@ -137,6 +137,8 @@ struct vhost_virtqueue {
 	TAILQ_HEAD(, vhost_iotlb_entry) iotlb_list;
 	int				iotlb_cache_nr;
 	TAILQ_HEAD(, vhost_iotlb_entry) iotlb_pending_list;
+
+	rte_spinlock_t		access_lock;
 } __rte_cache_aligned;
 
 /* Old kernels have no such macros defined */
@@ -302,6 +304,19 @@ vhost_log_used_vring(struct virtio_net *dev, struct vhost_virtqueue *vq,
 	vhost_log_write(dev, vq->log_guest_addr + offset, len);
 }
 
+static __rte_always_inline void
+vhost_user_access_lock(struct vhost_virtqueue *vq)
+{
+	rte_spinlock_lock(&vq->access_lock);
+}
+
+static __rte_always_inline void
+vhost_user_access_unlock(struct vhost_virtqueue *vq)
+{
+	rte_spinlock_unlock(&vq->access_lock);
+}
+
+
 /* Macros for printing using RTE_LOG */
 #define RTE_LOGTYPE_VHOST_CONFIG RTE_LOGTYPE_USER1
 #define RTE_LOGTYPE_VHOST_DATA   RTE_LOGTYPE_USER1
diff --git a/lib/librte_vhost/vhost.c b/lib/librte_vhost/vhost.c
index 4f8b73a0..dcc42fc7 100644
--- a/lib/librte_vhost/vhost.c
+++ b/lib/librte_vhost/vhost.c
@@ -259,6 +259,7 @@ alloc_vring_queue(struct virtio_net *dev, uint32_t vring_idx)
 
 	dev->virtqueue[vring_idx] = vq;
 	init_vring_queue(dev, vring_idx);
+	rte_spinlock_init(&vq->access_lock);
 
 	dev->nr_vring += 1;
diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
index f4c7ce46..a5b5e4be 100644
--- a/lib/librte_vhost/vhost_user.c
+++ b/lib/librte_vhost/vhost_user.c
@@ -1190,12 +1190,48 @@ vhost_user_check_and_alloc_queue_pair(struct virtio_net *dev, VhostUserMsg *msg)
 	return alloc_vring_queue(dev, vring_idx);
 }
 
+static void
+vhost_user_lock_all_queue_pairs(struct virtio_net *dev)
+{
+	unsigned int i = 0;
+	unsigned int vq_num = 0;
+
+	while (vq_num < dev->nr_vring) {
+		struct vhost_virtqueue *vq = dev->virtqueue[i];
+
+		if (vq) {
+			vhost_user_access_lock(vq);
+			vq_num++;
+		}
+		i++;
+	}
+}
+
+static void
+vhost_user_unlock_all_queue_pairs(struct virtio_net *dev)
+{
+	unsigned int i = 0;
+	unsigned int vq_num = 0;
+
+	while (vq_num < dev->nr_vring) {
+		struct vhost_virtqueue *vq = dev->virtqueue[i];
+
+		if (vq) {
+			vhost_user_access_unlock(vq);
+			vq_num++;
+		}
+		i++;
+	}
+}
+
 int
 vhost_user_msg_handler(int vid, int fd)
 {
 	struct virtio_net *dev;
 	struct VhostUserMsg msg;
 	int ret;
+	int unlock_required = 0;
+
 	dev = get_device(vid);
 	if (dev == NULL)
@@ -1241,6 +1277,32 @@ vhost_user_msg_handler(int vid, int fd)
 		return -1;
 	}
 
+	switch (msg.request.master) {
+	case VHOST_USER_SET_FEATURES:
+	case VHOST_USER_SET_PROTOCOL_FEATURES:
+	case VHOST_USER_SET_OWNER:
+	case VHOST_USER_RESET_OWNER:
+	case VHOST_USER_SET_MEM_TABLE:
+	case VHOST_USER_SET_LOG_BASE:
+	case VHOST_USER_SET_LOG_FD:
+	case VHOST_USER_SET_VRING_NUM:
+	case VHOST_USER_SET_VRING_ADDR:
+	case VHOST_USER_SET_VRING_BASE:
+	case VHOST_USER_SET_VRING_KICK:
+	case VHOST_USER_SET_VRING_CALL:
+	case VHOST_USER_SET_VRING_ERR:
+	case VHOST_USER_SET_VRING_ENABLE:
+	case VHOST_USER_SEND_RARP:
+	case VHOST_USER_NET_SET_MTU:
+	case VHOST_USER_SET_SLAVE_REQ_FD:
+		vhost_user_lock_all_queue_pairs(dev);
+		unlock_required = 1;
+		break;
+	default:
+		break;
+
+	}
+
 	switch (msg.request.master) {
 	case VHOST_USER_GET_FEATURES:
 		msg.payload.u64 = vhost_user_get_features(dev);
@@ -1342,6 +1404,9 @@ vhost_user_msg_handler(int vid, int fd)
 	}
 
+	if (unlock_required)
+		vhost_user_unlock_all_queue_pairs(dev);
+
 	if (msg.flags & VHOST_USER_NEED_REPLY) {
 		msg.payload.u64 = !!ret;
 		msg.size = sizeof(msg.payload.u64);
diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index 6fee16e5..b584e380 100644
--- a/lib/librte_vhost/virtio_net.c
+++ b/lib/librte_vhost/virtio_net.c
@@ -44,6 +44,7 @@
 #include <rte_udp.h>
 #include <rte_sctp.h>
 #include <rte_arp.h>
+#include <rte_spinlock.h>
 
 #include "iotlb.h"
 #include "vhost.h"
@@ -326,8 +327,11 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
 	}
 
 	vq = dev->virtqueue[queue_id];
+
+	vhost_user_access_lock(vq);
+
 	if (unlikely(vq->enabled == 0))
-		return 0;
+		goto out;
 
 	if (dev->features & (1ULL << VIRTIO_F_IOMMU_PLATFORM))
 		vhost_user_iotlb_rd_lock(vq);
@@ -416,6 +420,7 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
 			&& (vq->callfd >= 0))
 		eventfd_write(vq->callfd, (eventfd_t)1);
 out:
+	vhost_user_access_unlock(vq);
 	if (dev->features & (1ULL << VIRTIO_F_IOMMU_PLATFORM))
 		vhost_user_iotlb_rd_unlock(vq);
@@ -651,8 +656,11 @@ virtio_dev_merge_rx(struct virtio_net *dev, uint16_t queue_id,
 	}
 
 	vq = dev->virtqueue[queue_id];
+
+	vhost_user_access_lock(vq);
+
 	if (unlikely(vq->enabled == 0))
-		return 0;
+		goto out;
 
 	if (dev->features & (1ULL << VIRTIO_F_IOMMU_PLATFORM))
 		vhost_user_iotlb_rd_lock(vq);
@@ -712,6 +720,7 @@ virtio_dev_merge_rx(struct virtio_net *dev, uint16_t queue_id,
 	}
 
 out:
+	vhost_user_access_unlock(vq);
 	if (dev->features & (1ULL << VIRTIO_F_IOMMU_PLATFORM))
 		vhost_user_iotlb_rd_unlock(vq);
@@ -1180,9 +1189,12 @@ rte_vhost_dequeue_burst(int vid, uint16_t queue_id,
 	}
 
 	vq = dev->virtqueue[queue_id];
+
 	if (unlikely(vq->enabled == 0))
 		return 0;
 
+	vhost_user_access_lock(vq);
+
 	vq->batch_copy_nb_elems = 0;
 
 	if (dev->features & (1ULL << VIRTIO_F_IOMMU_PLATFORM))
@@ -1240,6 +1252,7 @@ rte_vhost_dequeue_burst(int vid, uint16_t queue_id,
 		if (rarp_mbuf == NULL) {
 			RTE_LOG(ERR, VHOST_DATA,
 				"Failed to allocate memory for mbuf.\n");
+			vhost_user_access_unlock(vq);
 			return 0;
 		}
@@ -1356,6 +1369,8 @@ rte_vhost_dequeue_burst(int vid, uint16_t queue_id,
 	if (dev->features & (1ULL << VIRTIO_F_IOMMU_PLATFORM))
 		vhost_user_iotlb_rd_unlock(vq);
 
+	vhost_user_access_unlock(vq);
+
 	if (unlikely(rarp_mbuf != NULL)) {
 		/*
 		 * Inject it to the head of "pkts" array, so that switch's mac
@@ -1366,5 +1381,6 @@ rte_vhost_dequeue_burst(int vid, uint16_t queue_id,
 		i += 1;
 	}
 
+
 	return i;
 }
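A final note on the data-path hunks above: once a burst function has
taken vq->access_lock near its top, every exit path must release it.
The patch does this either by funneling early exits through the
existing "out:" label (the enqueue paths) or by unlocking explicitly
before an early return (the RARP allocation-failure path in
rte_vhost_dequeue_burst). Below is a minimal compile-and-run toy of
that pattern, under the same pthread stand-in assumptions as the
earlier sketch; do_work and process_queue are hypothetical names.

/*
 * Toy illustration of the unlock-on-every-exit pattern.
 * Build: cc -O2 -o toy2 toy2.c -lpthread
 */
#include <pthread.h>
#include <stdbool.h>

struct toy_queue {
	pthread_spinlock_t access_lock;
	bool enabled;
	int work;
};

/* Hypothetical helper standing in for real ring processing. */
static int do_work(struct toy_queue *q)
{
	return ++q->work;
}

/* Every exit path releases the lock: early exits funnel through the
 * "out:" label, mirroring the goto-out pattern used in
 * virtio_dev_rx() and virtio_dev_merge_rx() above. */
static int process_queue(struct toy_queue *q)
{
	int ret = 0;

	pthread_spin_lock(&q->access_lock);
	if (!q->enabled)
		goto out; /* early exit still unlocks below */

	ret = do_work(q);
out:
	pthread_spin_unlock(&q->access_lock);
	return ret;
}

int main(void)
{
	struct toy_queue q = { .enabled = true };

	pthread_spin_init(&q.access_lock, PTHREAD_PROCESS_PRIVATE);
	return process_queue(&q) == 1 ? 0 : 1;
}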