From patchwork Sun Nov 17 15:12:43 2019
X-Patchwork-Submitter: David Marchand
X-Patchwork-Id: 63064
X-Patchwork-Delegate: david.marchand@redhat.com
From: David Marchand
To: dev@dpdk.org
Cc: thomas@monjalon.net, kirankumark@marvell.com, olivier.matz@6wind.com,
 ferruh.yigit@intel.com, anatoly.burakov@intel.com, arybchenko@solarflare.com,
 stephen@networkplumber.org, vattunuru@marvell.com
Date: Sun, 17 Nov 2019 16:12:43 +0100
Message-Id: <20191117151244.3854-2-david.marchand@redhat.com>
In-Reply-To: <20191117151244.3854-1-david.marchand@redhat.com>
References: <20191117151244.3854-1-david.marchand@redhat.com>
Subject: [dpdk-dev] [PATCH v15 1/2] kni: support userspace VA

From: Vamsi Attunuru

This patch adds support for the KNI kernel module to work in IOVA = VA
mode by providing address translation routines that convert userspace VA
to kernel VA.

KNI performance when using PA is unchanged by this patch. Comparing KNI
using PA to KNI using VA, however, the latter has lower performance due
to the cost of the added translation.

This translation is only available with kernel versions starting at 4.6.0.
Signed-off-by: Vamsi Attunuru
Signed-off-by: Kiran Kumar K
Reviewed-by: Jerin Jacob
---
Changelog since v14:
- reworded commitlog

---
 kernel/linux/kni/compat.h                     | 14 +++++
 kernel/linux/kni/kni_dev.h                    | 42 +++++++++++++
 kernel/linux/kni/kni_misc.c                   | 39 +++++++++---
 kernel/linux/kni/kni_net.c                    | 62 +++++++++++++++----
 .../linux/eal/include/rte_kni_common.h        |  1 +
 5 files changed, 136 insertions(+), 22 deletions(-)

diff --git a/kernel/linux/kni/compat.h b/kernel/linux/kni/compat.h
index 562d8bf94..062b170ef 100644
--- a/kernel/linux/kni/compat.h
+++ b/kernel/linux/kni/compat.h
@@ -121,3 +121,17 @@
 #if LINUX_VERSION_CODE >= KERNEL_VERSION(4, 11, 0)
 #define HAVE_SIGNAL_FUNCTIONS_OWN_HEADER
 #endif
+
+#if KERNEL_VERSION(4, 6, 0) <= LINUX_VERSION_CODE
+
+#define HAVE_IOVA_TO_KVA_MAPPING_SUPPORT
+
+#if KERNEL_VERSION(4, 9, 0) > LINUX_VERSION_CODE
+#define GET_USER_PAGES_REMOTE_API_V1
+#elif KERNEL_VERSION(4, 9, 0) == LINUX_VERSION_CODE
+#define GET_USER_PAGES_REMOTE_API_V2
+#else
+#define GET_USER_PAGES_REMOTE_API_V3
+#endif
+
+#endif
diff --git a/kernel/linux/kni/kni_dev.h b/kernel/linux/kni/kni_dev.h
index c1ca6789c..fb641b696 100644
--- a/kernel/linux/kni/kni_dev.h
+++ b/kernel/linux/kni/kni_dev.h
@@ -41,6 +41,8 @@ struct kni_dev {
 	/* kni list */
 	struct list_head list;

+	uint8_t iova_mode;
+
 	uint32_t core_id;            /* Core ID to bind */
 	char name[RTE_KNI_NAMESIZE]; /* Network device name */
 	struct task_struct *pthread;
@@ -84,8 +86,48 @@ struct kni_dev {
 	void *va[MBUF_BURST_SZ];
 	void *alloc_pa[MBUF_BURST_SZ];
 	void *alloc_va[MBUF_BURST_SZ];
+
+	struct task_struct *usr_tsk;
 };

+#ifdef HAVE_IOVA_TO_KVA_MAPPING_SUPPORT
+static inline phys_addr_t iova_to_phys(struct task_struct *tsk,
+				       unsigned long iova)
+{
+	phys_addr_t offset, phys_addr;
+	struct page *page = NULL;
+	long ret;
+
+	offset = iova & (PAGE_SIZE - 1);
+
+	/* Read one page struct info */
+#ifdef GET_USER_PAGES_REMOTE_API_V3
+	ret = get_user_pages_remote(tsk, tsk->mm, iova, 1,
+				    FOLL_TOUCH, &page, NULL, NULL);
+#endif
+#ifdef GET_USER_PAGES_REMOTE_API_V2
+	ret = get_user_pages_remote(tsk, tsk->mm, iova, 1,
+				    FOLL_TOUCH, &page, NULL);
+#endif
+#ifdef GET_USER_PAGES_REMOTE_API_V1
+	ret = get_user_pages_remote(tsk, tsk->mm, iova, 1,
+				    0, 0, &page, NULL);
+#endif
+	if (ret < 0)
+		return 0;
+
+	phys_addr = page_to_phys(page) | offset;
+	put_page(page);
+
+	return phys_addr;
+}
+
+static inline void *iova_to_kva(struct task_struct *tsk, unsigned long iova)
+{
+	return phys_to_virt(iova_to_phys(tsk, iova));
+}
+#endif
+
 void kni_net_release_fifo_phy(struct kni_dev *kni);
 void kni_net_rx(struct kni_dev *kni);
 void kni_net_init(struct net_device *dev);
diff --git a/kernel/linux/kni/kni_misc.c b/kernel/linux/kni/kni_misc.c
index 84ef03b5f..cda71bde0 100644
--- a/kernel/linux/kni/kni_misc.c
+++ b/kernel/linux/kni/kni_misc.c
@@ -348,15 +348,36 @@ kni_ioctl_create(struct net *net, uint32_t ioctl_num,
 	strncpy(kni->name, dev_info.name, RTE_KNI_NAMESIZE);

 	/* Translate user space info into kernel space info */
-	kni->tx_q = phys_to_virt(dev_info.tx_phys);
-	kni->rx_q = phys_to_virt(dev_info.rx_phys);
-	kni->alloc_q = phys_to_virt(dev_info.alloc_phys);
-	kni->free_q = phys_to_virt(dev_info.free_phys);
-
-	kni->req_q = phys_to_virt(dev_info.req_phys);
-	kni->resp_q = phys_to_virt(dev_info.resp_phys);
-	kni->sync_va = dev_info.sync_va;
-	kni->sync_kva = phys_to_virt(dev_info.sync_phys);
+	if (dev_info.iova_mode) {
+#ifdef HAVE_IOVA_TO_KVA_MAPPING_SUPPORT
+		kni->tx_q = iova_to_kva(current, dev_info.tx_phys);
+		kni->rx_q = iova_to_kva(current, dev_info.rx_phys);
+		kni->alloc_q = iova_to_kva(current, dev_info.alloc_phys);
+		kni->free_q = iova_to_kva(current, dev_info.free_phys);
+
+		kni->req_q = iova_to_kva(current, dev_info.req_phys);
+		kni->resp_q = iova_to_kva(current, dev_info.resp_phys);
+		kni->sync_va = dev_info.sync_va;
+		kni->sync_kva = iova_to_kva(current, dev_info.sync_phys);
+		kni->usr_tsk = current;
+		kni->iova_mode = 1;
+#else
+		pr_err("KNI module does not support IOVA to VA translation\n");
+		return -EINVAL;
+#endif
+	} else {
+
+		kni->tx_q = phys_to_virt(dev_info.tx_phys);
+		kni->rx_q = phys_to_virt(dev_info.rx_phys);
+		kni->alloc_q = phys_to_virt(dev_info.alloc_phys);
+		kni->free_q = phys_to_virt(dev_info.free_phys);
+
+		kni->req_q = phys_to_virt(dev_info.req_phys);
+		kni->resp_q = phys_to_virt(dev_info.resp_phys);
+		kni->sync_va = dev_info.sync_va;
+		kni->sync_kva = phys_to_virt(dev_info.sync_phys);
+		kni->iova_mode = 0;
+	}

 	kni->mbuf_size = dev_info.mbuf_size;
diff --git a/kernel/linux/kni/kni_net.c b/kernel/linux/kni/kni_net.c
index f25b1277b..1ba9b1b99 100644
--- a/kernel/linux/kni/kni_net.c
+++ b/kernel/linux/kni/kni_net.c
@@ -36,6 +36,22 @@ static void kni_net_rx_normal(struct kni_dev *kni);
 /* kni rx function pointer, with default to normal rx */
 static kni_net_rx_t kni_net_rx_func = kni_net_rx_normal;

+#ifdef HAVE_IOVA_TO_KVA_MAPPING_SUPPORT
+/* iova to kernel virtual address */
+static inline void *
+iova2kva(struct kni_dev *kni, void *iova)
+{
+	return phys_to_virt(iova_to_phys(kni->usr_tsk, (unsigned long)iova));
+}
+
+static inline void *
+iova2data_kva(struct kni_dev *kni, struct rte_kni_mbuf *m)
+{
+	return phys_to_virt(iova_to_phys(kni->usr_tsk, m->buf_physaddr) +
+			    m->data_off);
+}
+#endif
+
 /* physical address to kernel virtual address */
 static void *
 pa2kva(void *pa)
@@ -62,6 +78,26 @@ kva2data_kva(struct rte_kni_mbuf *m)
 	return phys_to_virt(m->buf_physaddr + m->data_off);
 }

+static inline void *
+get_kva(struct kni_dev *kni, void *pa)
+{
+#ifdef HAVE_IOVA_TO_KVA_MAPPING_SUPPORT
+	if (kni->iova_mode == 1)
+		return iova2kva(kni, pa);
+#endif
+	return pa2kva(pa);
+}
+
+static inline void *
+get_data_kva(struct kni_dev *kni, void *pkt_kva)
+{
+#ifdef HAVE_IOVA_TO_KVA_MAPPING_SUPPORT
+	if (kni->iova_mode == 1)
+		return iova2data_kva(kni, pkt_kva);
+#endif
+	return kva2data_kva(pkt_kva);
+}
+
 /*
  * It can be called to process the request.
  */
@@ -178,7 +214,7 @@ kni_fifo_trans_pa2va(struct kni_dev *kni,
 		return;

 	for (i = 0; i < num_rx; i++) {
-		kva = pa2kva(kni->pa[i]);
+		kva = get_kva(kni, kni->pa[i]);
 		kni->va[i] = pa2va(kni->pa[i], kva);

 		kva_nb_segs = kva->nb_segs;
@@ -266,8 +302,8 @@ kni_net_tx(struct sk_buff *skb, struct net_device *dev)
 	if (likely(ret == 1)) {
 		void *data_kva;

-		pkt_kva = pa2kva(pkt_pa);
-		data_kva = kva2data_kva(pkt_kva);
+		pkt_kva = get_kva(kni, pkt_pa);
+		data_kva = get_data_kva(kni, pkt_kva);
 		pkt_va = pa2va(pkt_pa, pkt_kva);

 		len = skb->len;
@@ -338,9 +374,9 @@ kni_net_rx_normal(struct kni_dev *kni)
 	/* Transfer received packets to netif */
 	for (i = 0; i < num_rx; i++) {
-		kva = pa2kva(kni->pa[i]);
+		kva = get_kva(kni, kni->pa[i]);
 		len = kva->pkt_len;
-		data_kva = kva2data_kva(kva);
+		data_kva = get_data_kva(kni, kva);
 		kni->va[i] = pa2va(kni->pa[i], kva);

 		skb = netdev_alloc_skb(dev, len);
@@ -437,9 +473,9 @@ kni_net_rx_lo_fifo(struct kni_dev *kni)
 			num = ret;

 	/* Copy mbufs */
 	for (i = 0; i < num; i++) {
-		kva = pa2kva(kni->pa[i]);
+		kva = get_kva(kni, kni->pa[i]);
 		len = kva->data_len;
-		data_kva = kva2data_kva(kva);
+		data_kva = get_data_kva(kni, kva);
 		kni->va[i] = pa2va(kni->pa[i], kva);

 		while (kva->next) {
@@ -449,8 +485,8 @@ kni_net_rx_lo_fifo(struct kni_dev *kni)
 			kva = next_kva;
 		}

-		alloc_kva = pa2kva(kni->alloc_pa[i]);
-		alloc_data_kva = kva2data_kva(alloc_kva);
+		alloc_kva = get_kva(kni, kni->alloc_pa[i]);
+		alloc_data_kva = get_data_kva(kni, alloc_kva);
 		kni->alloc_va[i] = pa2va(kni->alloc_pa[i], alloc_kva);

 		memcpy(alloc_data_kva, data_kva, len);
@@ -517,9 +553,9 @@ kni_net_rx_lo_fifo_skb(struct kni_dev *kni)
 	/* Copy mbufs to sk buffer and then call tx interface */
 	for (i = 0; i < num; i++) {
-		kva = pa2kva(kni->pa[i]);
+		kva = get_kva(kni, kni->pa[i]);
 		len = kva->pkt_len;
-		data_kva = kva2data_kva(kva);
+		data_kva = get_data_kva(kni, kva);
 		kni->va[i] = pa2va(kni->pa[i], kva);

 		skb = netdev_alloc_skb(dev, len);
@@ -550,8 +586,8 @@ kni_net_rx_lo_fifo_skb(struct kni_dev *kni)
 				break;

 			prev_kva = kva;
-			kva = pa2kva(kva->next);
-			data_kva = kva2data_kva(kva);
+			kva = get_kva(kni, kva->next);
+			data_kva = get_data_kva(kni, kva);
 			/* Convert physical address to virtual address */
 			prev_kva->next = pa2va(prev_kva->next, kva);
 		}
diff --git a/lib/librte_eal/linux/eal/include/rte_kni_common.h b/lib/librte_eal/linux/eal/include/rte_kni_common.h
index 46f75a710..2427a965c 100644
--- a/lib/librte_eal/linux/eal/include/rte_kni_common.h
+++ b/lib/librte_eal/linux/eal/include/rte_kni_common.h
@@ -125,6 +125,7 @@ struct rte_kni_device_info {
 	unsigned int min_mtu;
 	unsigned int max_mtu;
 	uint8_t mac_addr[6];
+	uint8_t iova_mode;
 };

 #define KNI_DEVICE "kni"