[v15,1/2] kni: support userspace VA

Message ID 20191117151244.3854-2-david.marchand@redhat.com (mailing list archive)
State Accepted, archived
Delegated to: David Marchand
Series kni: support IOVA mode

Checks

Context                      Check    Description
ci/checkpatch                success  coding style OK
ci/Intel-compilation         success  Compilation OK
ci/iol-intel-Performance     success  Performance Testing PASS
ci/iol-compilation           success  Compile Testing PASS
ci/iol-mellanox-Performance  success  Performance Testing PASS

Commit Message

David Marchand Nov. 17, 2019, 3:12 p.m. UTC
  From: Vamsi Attunuru <vattunuru@marvell.com>

This patch adds support for the kernel module to work in IOVA = VA mode by
providing address translation routines that convert userspace VA to
kernel VA.

KNI performance using PA is not changed by this patch.
However, KNI using VA will have lower performance than KNI using PA
because of the cost of the added translation.

This translation is implemented only for kernel versions 4.6.0 and later.

Signed-off-by: Vamsi Attunuru <vattunuru@marvell.com>
Signed-off-by: Kiran Kumar K <kirankumark@marvell.com>
---
Changelog since v14:
- reworded commit log

---
 kernel/linux/kni/compat.h                     | 14 +++++
 kernel/linux/kni/kni_dev.h                    | 42 +++++++++++++
 kernel/linux/kni/kni_misc.c                   | 39 +++++++++---
 kernel/linux/kni/kni_net.c                    | 62 +++++++++++++++----
 .../linux/eal/include/rte_kni_common.h        |  1 +
 5 files changed, 136 insertions(+), 22 deletions(-)
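
The translation described in the commit message looks up the page that backs a userspace VA, then re-applies the in-page offset to that page's physical base. The offset arithmetic alone can be sketched in userspace as below. This is only an illustration: `SKETCH_PAGE_SIZE`, `page_offset_of` and `combine_phys` are hypothetical names that are not part of the patch, and the page size is assumed to be 4 KiB, whereas the kernel code uses `PAGE_SIZE`.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed page size for this sketch; the kernel code uses PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096UL

/* Byte offset of an address inside its page (hypothetical helper). */
static inline uint64_t page_offset_of(uint64_t addr)
{
	return addr & (SKETCH_PAGE_SIZE - 1);
}

/*
 * Recombine a page-aligned physical base with the in-page offset of a
 * userspace VA, as iova_to_phys() does with page_to_phys(page) | offset.
 */
static inline uint64_t combine_phys(uint64_t page_phys_base, uint64_t user_va)
{
	/* The base must be page aligned for the OR to act as an addition. */
	assert(page_offset_of(page_phys_base) == 0);
	return page_phys_base | page_offset_of(user_va);
}
```

For example, with a page base of 0x1a2b3000 and a userspace VA of 0x7f0000001234, `combine_phys` yields 0x1a2b3234. In the kernel module the page lookup itself is done with `get_user_pages_remote()`, which this sketch does not model.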
  

Comments

Jerin Jacob Nov. 18, 2019, 2:04 p.m. UTC | #1
On Sun, Nov 17, 2019 at 8:43 PM David Marchand
<david.marchand@redhat.com> wrote:
>
> From: Vamsi Attunuru <vattunuru@marvell.com>
>
> Patch adds support for kernel module to work in IOVA = VA mode by
> providing address translation routines to convert userspace VA to
> kernel VA.
>
> KNI performance using PA is not changed by this patch.
> But comparing KNI using PA to KNI using VA, the latter will have lower
> performance due to the cost of the added translation.
>
> This translation is implemented only with kernel versions starting 4.6.0.
>
> Signed-off-by: Vamsi Attunuru <vattunuru@marvell.com>
> Signed-off-by: Kiran Kumar K <kirankumark@marvell.com>

Reviewed-by: Jerin Jacob <jerinj@marvell.com>
  
Ferruh Yigit Nov. 20, 2019, 10:47 a.m. UTC | #2
On 11/17/2019 3:12 PM, David Marchand wrote:
> From: Vamsi Attunuru <vattunuru@marvell.com>
> 
> Patch adds support for kernel module to work in IOVA = VA mode by
> providing address translation routines to convert userspace VA to
> kernel VA.
> 
> KNI performance using PA is not changed by this patch.
> But comparing KNI using PA to KNI using VA, the latter will have lower
> performance due to the cost of the added translation.
> 
> This translation is implemented only with kernel versions starting 4.6.0.
> 
> Signed-off-by: Vamsi Attunuru <vattunuru@marvell.com>
> Signed-off-by: Kiran Kumar K <kirankumark@marvell.com>
> ---
> Changelog since v14:
> - reworded commitlog,
> 
> ---
>  kernel/linux/kni/compat.h                     | 14 +++++
>  kernel/linux/kni/kni_dev.h                    | 42 +++++++++++++
>  kernel/linux/kni/kni_misc.c                   | 39 +++++++++---
>  kernel/linux/kni/kni_net.c                    | 62 +++++++++++++++----
>  .../linux/eal/include/rte_kni_common.h        |  1 +
>  5 files changed, 136 insertions(+), 22 deletions(-)
> 
> diff --git a/kernel/linux/kni/compat.h b/kernel/linux/kni/compat.h
> index 562d8bf94..062b170ef 100644
> --- a/kernel/linux/kni/compat.h
> +++ b/kernel/linux/kni/compat.h
> @@ -121,3 +121,17 @@
>  #if LINUX_VERSION_CODE >= KERNEL_VERSION(4, 11, 0)
>  #define HAVE_SIGNAL_FUNCTIONS_OWN_HEADER
>  #endif
> +
> +#if KERNEL_VERSION(4, 6, 0) <= LINUX_VERSION_CODE
> +
> +#define HAVE_IOVA_TO_KVA_MAPPING_SUPPORT
> +
> +#if KERNEL_VERSION(4, 9, 0) > LINUX_VERSION_CODE
> +#define GET_USER_PAGES_REMOTE_API_V1
> +#elif KERNEL_VERSION(4, 9, 0) == LINUX_VERSION_CODE
> +#define GET_USER_PAGES_REMOTE_API_V2
> +#else
> +#define GET_USER_PAGES_REMOTE_API_V3
> +#endif

A build error has been reported [1]. It looks like a version check alone is not
enough for distro kernels, which may have backported some of the related commits.

I am also looking for another way to determine which version of the
function to use, but can you please check as well?

Thanks,
ferruh


[1]
In file included from
/tmp/dpdk/i686-native-linuxapp-gcc/build/kernel/linux/kni/kni_net.c:25:0:
/tmp/dpdk/kernel/linux/kni/kni_dev.h: In function ‘iova_to_phys’:
/tmp/dpdk/kernel/linux/kni/kni_dev.h:114:9: error: expected ‘)’ before numeric
constant
         0, 0, &page, NULL);
         ^
/tmp/dpdk/kernel/linux/kni/kni_dev.h:113:8: error: too few arguments to function
‘get_user_pages_remote’
  ret = get_user_pages_remote(tsk, tsk->mm, iova, 1
        ^~~~~~~~~~~~~~~~~~~~~
In file included from
/usr/src/linux-headers-4.8.0-22-generic/include/linux/scatterlist.h:7:0,
                 from
/usr/src/linux-headers-4.8.0-22-generic/include/linux/dmaengine.h:24,
                 from
/usr/src/linux-headers-4.8.0-22-generic/include/linux/netdevice.h:38,
                 from
/tmp/dpdk/i686-native-linuxapp-gcc/build/kernel/linux/kni/kni_net.c:14:
/usr/src/linux-headers-4.8.0-22-generic/include/linux/mm.h:1311:6: note:
declared here
 long get_user_pages_remote(struct task_struct *tsk, struct mm_struct *mm,
      ^~~~~~~~~~~~~~~~~~~~~
In file included from
/tmp/dpdk/i686-native-linuxapp-gcc/build/kernel/linux/kni/kni_misc.c:22:0:
/tmp/dpdk/kernel/linux/kni/kni_dev.h: In function ‘iova_to_phys’:
/tmp/dpdk/kernel/linux/kni/kni_dev.h:114:9: error: expected ‘)’ before numeric
constant
         0, 0, &page, NULL);
         ^
/tmp/dpdk/kernel/linux/kni/kni_dev.h:113:8: error: too few arguments to function
‘get_user_pages_remote’
  ret = get_user_pages_remote(tsk, tsk->mm, iova, 1
        ^~~~~~~~~~~~~~~~~~~~~
In file included from
/usr/src/linux-headers-4.8.0-22-generic/include/linux/scatterlist.h:7:0,
                 from
/usr/src/linux-headers-4.8.0-22-generic/include/linux/dmaengine.h:24,
                 from
/usr/src/linux-headers-4.8.0-22-generic/include/linux/netdevice.h:38,
                 from
/tmp/dpdk/i686-native-linuxapp-gcc/build/kernel/linux/kni/kni_misc.c:9:
/usr/src/linux-headers-4.8.0-22-generic/include/linux/mm.h:1311:6: note:
declared here
 long get_user_pages_remote(struct task_struct *tsk, struct mm_struct *mm,
      ^~~~~~~~~~~~~~~~~~~~~
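
To make the version ladder in compat.h easier to reason about, the selection can be mirrored in a small userspace sketch. The names `gup_remote_api`, `kver` and `select_api` are hypothetical and exist only for this illustration; `kver` assumes the classic `KERNEL_VERSION(a, b, c)` packing of `(a << 16) + (b << 8) + c`.

```c
#include <assert.h>

/* Variants named after the GET_USER_PAGES_REMOTE_API_V* macros. */
enum gup_remote_api { GUP_API_NONE, GUP_API_V1, GUP_API_V2, GUP_API_V3 };

/* Same packing as the classic KERNEL_VERSION(a, b, c) macro (assumed). */
static unsigned int kver(unsigned int a, unsigned int b, unsigned int c)
{
	assert(b <= 255 && c <= 255); /* each field occupies 8 bits */
	return (a << 16) + (b << 8) + c;
}

/* Mirror of the #if / #elif / #else ladder added to compat.h. */
static enum gup_remote_api select_api(unsigned int code)
{
	if (code < kver(4, 6, 0))
		return GUP_API_NONE; /* HAVE_IOVA_TO_KVA_MAPPING_SUPPORT undefined */
	if (kver(4, 9, 0) > code)
		return GUP_API_V1;
	if (kver(4, 9, 0) == code)
		return GUP_API_V2;
	return GUP_API_V3;
}
```

With this ladder, 4.8.0 selects V1, 4.9.0 selects V2, and 4.9.1 already selects V3, because the equality test matches only the exact 4.9.0 version code. The ladder also sees only the upstream version number, so it cannot detect signature changes that a distro kernel has backported.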
  

Patch

diff --git a/kernel/linux/kni/compat.h b/kernel/linux/kni/compat.h
index 562d8bf94..062b170ef 100644
--- a/kernel/linux/kni/compat.h
+++ b/kernel/linux/kni/compat.h
@@ -121,3 +121,17 @@ 
 #if LINUX_VERSION_CODE >= KERNEL_VERSION(4, 11, 0)
 #define HAVE_SIGNAL_FUNCTIONS_OWN_HEADER
 #endif
+
+#if KERNEL_VERSION(4, 6, 0) <= LINUX_VERSION_CODE
+
+#define HAVE_IOVA_TO_KVA_MAPPING_SUPPORT
+
+#if KERNEL_VERSION(4, 9, 0) > LINUX_VERSION_CODE
+#define GET_USER_PAGES_REMOTE_API_V1
+#elif KERNEL_VERSION(4, 9, 0) == LINUX_VERSION_CODE
+#define GET_USER_PAGES_REMOTE_API_V2
+#else
+#define GET_USER_PAGES_REMOTE_API_V3
+#endif
+
+#endif
diff --git a/kernel/linux/kni/kni_dev.h b/kernel/linux/kni/kni_dev.h
index c1ca6789c..fb641b696 100644
--- a/kernel/linux/kni/kni_dev.h
+++ b/kernel/linux/kni/kni_dev.h
@@ -41,6 +41,8 @@  struct kni_dev {
 	/* kni list */
 	struct list_head list;
 
+	uint8_t iova_mode;
+
 	uint32_t core_id;            /* Core ID to bind */
 	char name[RTE_KNI_NAMESIZE]; /* Network device name */
 	struct task_struct *pthread;
@@ -84,8 +86,48 @@  struct kni_dev {
 	void *va[MBUF_BURST_SZ];
 	void *alloc_pa[MBUF_BURST_SZ];
 	void *alloc_va[MBUF_BURST_SZ];
+
+	struct task_struct *usr_tsk;
 };
 
+#ifdef HAVE_IOVA_TO_KVA_MAPPING_SUPPORT
+static inline phys_addr_t iova_to_phys(struct task_struct *tsk,
+				       unsigned long iova)
+{
+	phys_addr_t offset, phys_addr;
+	struct page *page = NULL;
+	long ret;
+
+	offset = iova & (PAGE_SIZE - 1);
+
+	/* Read one page struct info */
+#ifdef GET_USER_PAGES_REMOTE_API_V3
+	ret = get_user_pages_remote(tsk, tsk->mm, iova, 1,
+				    FOLL_TOUCH, &page, NULL, NULL);
+#endif
+#ifdef GET_USER_PAGES_REMOTE_API_V2
+	ret = get_user_pages_remote(tsk, tsk->mm, iova, 1,
+				    FOLL_TOUCH, &page, NULL);
+#endif
+#ifdef GET_USER_PAGES_REMOTE_API_V1
+	ret = get_user_pages_remote(tsk, tsk->mm, iova, 1,
+				    0, 0, &page, NULL);
+#endif
+	if (ret < 0)
+		return 0;
+
+	phys_addr = page_to_phys(page) | offset;
+	put_page(page);
+
+	return phys_addr;
+}
+
+static inline void *iova_to_kva(struct task_struct *tsk, unsigned long iova)
+{
+	return phys_to_virt(iova_to_phys(tsk, iova));
+}
+#endif
+
 void kni_net_release_fifo_phy(struct kni_dev *kni);
 void kni_net_rx(struct kni_dev *kni);
 void kni_net_init(struct net_device *dev);
diff --git a/kernel/linux/kni/kni_misc.c b/kernel/linux/kni/kni_misc.c
index 84ef03b5f..cda71bde0 100644
--- a/kernel/linux/kni/kni_misc.c
+++ b/kernel/linux/kni/kni_misc.c
@@ -348,15 +348,36 @@  kni_ioctl_create(struct net *net, uint32_t ioctl_num,
 	strncpy(kni->name, dev_info.name, RTE_KNI_NAMESIZE);
 
 	/* Translate user space info into kernel space info */
-	kni->tx_q = phys_to_virt(dev_info.tx_phys);
-	kni->rx_q = phys_to_virt(dev_info.rx_phys);
-	kni->alloc_q = phys_to_virt(dev_info.alloc_phys);
-	kni->free_q = phys_to_virt(dev_info.free_phys);
-
-	kni->req_q = phys_to_virt(dev_info.req_phys);
-	kni->resp_q = phys_to_virt(dev_info.resp_phys);
-	kni->sync_va = dev_info.sync_va;
-	kni->sync_kva = phys_to_virt(dev_info.sync_phys);
+	if (dev_info.iova_mode) {
+#ifdef HAVE_IOVA_TO_KVA_MAPPING_SUPPORT
+		kni->tx_q = iova_to_kva(current, dev_info.tx_phys);
+		kni->rx_q = iova_to_kva(current, dev_info.rx_phys);
+		kni->alloc_q = iova_to_kva(current, dev_info.alloc_phys);
+		kni->free_q = iova_to_kva(current, dev_info.free_phys);
+
+		kni->req_q = iova_to_kva(current, dev_info.req_phys);
+		kni->resp_q = iova_to_kva(current, dev_info.resp_phys);
+		kni->sync_va = dev_info.sync_va;
+		kni->sync_kva = iova_to_kva(current, dev_info.sync_phys);
+		kni->usr_tsk = current;
+		kni->iova_mode = 1;
+#else
+		pr_err("KNI module does not support IOVA to VA translation\n");
+		return -EINVAL;
+#endif
+	} else {
+
+		kni->tx_q = phys_to_virt(dev_info.tx_phys);
+		kni->rx_q = phys_to_virt(dev_info.rx_phys);
+		kni->alloc_q = phys_to_virt(dev_info.alloc_phys);
+		kni->free_q = phys_to_virt(dev_info.free_phys);
+
+		kni->req_q = phys_to_virt(dev_info.req_phys);
+		kni->resp_q = phys_to_virt(dev_info.resp_phys);
+		kni->sync_va = dev_info.sync_va;
+		kni->sync_kva = phys_to_virt(dev_info.sync_phys);
+		kni->iova_mode = 0;
+	}
 
 	kni->mbuf_size = dev_info.mbuf_size;
 
diff --git a/kernel/linux/kni/kni_net.c b/kernel/linux/kni/kni_net.c
index f25b1277b..1ba9b1b99 100644
--- a/kernel/linux/kni/kni_net.c
+++ b/kernel/linux/kni/kni_net.c
@@ -36,6 +36,22 @@  static void kni_net_rx_normal(struct kni_dev *kni);
 /* kni rx function pointer, with default to normal rx */
 static kni_net_rx_t kni_net_rx_func = kni_net_rx_normal;
 
+#ifdef HAVE_IOVA_TO_KVA_MAPPING_SUPPORT
+/* iova to kernel virtual address */
+static inline void *
+iova2kva(struct kni_dev *kni, void *iova)
+{
+	return phys_to_virt(iova_to_phys(kni->usr_tsk, (unsigned long)iova));
+}
+
+static inline void *
+iova2data_kva(struct kni_dev *kni, struct rte_kni_mbuf *m)
+{
+	return phys_to_virt(iova_to_phys(kni->usr_tsk, m->buf_physaddr) +
+			    m->data_off);
+}
+#endif
+
 /* physical address to kernel virtual address */
 static void *
 pa2kva(void *pa)
@@ -62,6 +78,26 @@  kva2data_kva(struct rte_kni_mbuf *m)
 	return phys_to_virt(m->buf_physaddr + m->data_off);
 }
 
+static inline void *
+get_kva(struct kni_dev *kni, void *pa)
+{
+#ifdef HAVE_IOVA_TO_KVA_MAPPING_SUPPORT
+	if (kni->iova_mode == 1)
+		return iova2kva(kni, pa);
+#endif
+	return pa2kva(pa);
+}
+
+static inline void *
+get_data_kva(struct kni_dev *kni, void *pkt_kva)
+{
+#ifdef HAVE_IOVA_TO_KVA_MAPPING_SUPPORT
+	if (kni->iova_mode == 1)
+		return iova2data_kva(kni, pkt_kva);
+#endif
+	return kva2data_kva(pkt_kva);
+}
+
 /*
  * It can be called to process the request.
  */
@@ -178,7 +214,7 @@  kni_fifo_trans_pa2va(struct kni_dev *kni,
 			return;
 
 		for (i = 0; i < num_rx; i++) {
-			kva = pa2kva(kni->pa[i]);
+			kva = get_kva(kni, kni->pa[i]);
 			kni->va[i] = pa2va(kni->pa[i], kva);
 
 			kva_nb_segs = kva->nb_segs;
@@ -266,8 +302,8 @@  kni_net_tx(struct sk_buff *skb, struct net_device *dev)
 	if (likely(ret == 1)) {
 		void *data_kva;
 
-		pkt_kva = pa2kva(pkt_pa);
-		data_kva = kva2data_kva(pkt_kva);
+		pkt_kva = get_kva(kni, pkt_pa);
+		data_kva = get_data_kva(kni, pkt_kva);
 		pkt_va = pa2va(pkt_pa, pkt_kva);
 
 		len = skb->len;
@@ -338,9 +374,9 @@  kni_net_rx_normal(struct kni_dev *kni)
 
 	/* Transfer received packets to netif */
 	for (i = 0; i < num_rx; i++) {
-		kva = pa2kva(kni->pa[i]);
+		kva = get_kva(kni, kni->pa[i]);
 		len = kva->pkt_len;
-		data_kva = kva2data_kva(kva);
+		data_kva = get_data_kva(kni, kva);
 		kni->va[i] = pa2va(kni->pa[i], kva);
 
 		skb = netdev_alloc_skb(dev, len);
@@ -437,9 +473,9 @@  kni_net_rx_lo_fifo(struct kni_dev *kni)
 		num = ret;
 		/* Copy mbufs */
 		for (i = 0; i < num; i++) {
-			kva = pa2kva(kni->pa[i]);
+			kva = get_kva(kni, kni->pa[i]);
 			len = kva->data_len;
-			data_kva = kva2data_kva(kva);
+			data_kva = get_data_kva(kni, kva);
 			kni->va[i] = pa2va(kni->pa[i], kva);
 
 			while (kva->next) {
@@ -449,8 +485,8 @@  kni_net_rx_lo_fifo(struct kni_dev *kni)
 				kva = next_kva;
 			}
 
-			alloc_kva = pa2kva(kni->alloc_pa[i]);
-			alloc_data_kva = kva2data_kva(alloc_kva);
+			alloc_kva = get_kva(kni, kni->alloc_pa[i]);
+			alloc_data_kva = get_data_kva(kni, alloc_kva);
 			kni->alloc_va[i] = pa2va(kni->alloc_pa[i], alloc_kva);
 
 			memcpy(alloc_data_kva, data_kva, len);
@@ -517,9 +553,9 @@  kni_net_rx_lo_fifo_skb(struct kni_dev *kni)
 
 	/* Copy mbufs to sk buffer and then call tx interface */
 	for (i = 0; i < num; i++) {
-		kva = pa2kva(kni->pa[i]);
+		kva = get_kva(kni, kni->pa[i]);
 		len = kva->pkt_len;
-		data_kva = kva2data_kva(kva);
+		data_kva = get_data_kva(kni, kva);
 		kni->va[i] = pa2va(kni->pa[i], kva);
 
 		skb = netdev_alloc_skb(dev, len);
@@ -550,8 +586,8 @@  kni_net_rx_lo_fifo_skb(struct kni_dev *kni)
 					break;
 
 				prev_kva = kva;
-				kva = pa2kva(kva->next);
-				data_kva = kva2data_kva(kva);
+				kva = get_kva(kni, kva->next);
+				data_kva = get_data_kva(kni, kva);
 				/* Convert physical address to virtual address */
 				prev_kva->next = pa2va(prev_kva->next, kva);
 			}
diff --git a/lib/librte_eal/linux/eal/include/rte_kni_common.h b/lib/librte_eal/linux/eal/include/rte_kni_common.h
index 46f75a710..2427a965c 100644
--- a/lib/librte_eal/linux/eal/include/rte_kni_common.h
+++ b/lib/librte_eal/linux/eal/include/rte_kni_common.h
@@ -125,6 +125,7 @@  struct rte_kni_device_info {
 	unsigned int min_mtu;
 	unsigned int max_mtu;
 	uint8_t mac_addr[6];
+	uint8_t iova_mode;
 };
 
 #define KNI_DEVICE "kni"