[dpdk-dev,v5] eal: map uio resources after hugepages when the base_virtaddr is configured.

Message ID C6ECDF3AB251BE4894318F4E4512369780C07670@IRSMSX109.ger.corp.intel.com (mailing list archive)
State Not Applicable, archived
Headers

Commit Message

Burakov, Anatoly Nov. 6, 2014, 3:41 p.m. UTC
The explanation of the patch should be generic and impartial, i.e. when this and this happens, it results in such and such problem, and this patch fixes it by doing this and that. In other words, this will appear in the git history, so whoever is reading the commit log will be able to figure out what this patch does and why it's been applied.

Thomas, do we need to do similar changes to VFIO code, to keep consistency? Also, do we really need for this to depend on --base-virtaddr? Why not do it unconditionally, i.e. map PCI resources right after hugepages in memory?

Thanks,
Anatoly

-----Original Message-----
From: lxu [mailto:liang.xu@cinfotech.cn] 
Sent: Thursday, November 6, 2014 3:32 PM
To: dev@dpdk.org
Cc: Burakov, Anatoly; thomas.monjalon@6wind.com; De Lara Guarch, Pablo
Subject: [PATCH v5] eal: map uio resources after hugepages when the base_virtaddr is configured.

Sorry, I'm learning the right way to send a patch by git. 
I have a multiple processes application. When start the secondary process, I got error message "EAL: pci_map_resource(): cannot mmap(11, 0x7ffff7fba000, 0x20000, 0x0): Bad file descriptor (0x7ffff7fb9000)".
The secondary process links a lot of additional shared libraries, so the address 0x7ffff7fba000 had already be used.
I had fixed similar hugepages mmap problems by base_virtaddr. So I believe the uio resource should be mapped into base_virtaddr at this situation.
This patch try to fix it.


Signed-off-by: lxu <liang.xu@cinfotech.cn>
---
 lib/librte_eal/linuxapp/eal/eal_pci_uio.c | 29 ++++++++++++++++++++++++++++-
 1 file changed, 28 insertions(+), 1 deletion(-)

 
+	/* map uio resource into user required virtual address */
+	static void * requested_addr;
+	if (internal_config.base_virtaddr && NULL == requested_addr) {
+		const struct rte_memseg * last = get_physmem_last();
+		requested_addr = RTE_PTR_ADD(last->addr, last->len);
+	}
+
 	dev->intr_handle.fd = -1;
 	dev->intr_handle.type = RTE_INTR_HANDLE_UNKNOWN;
 
@@ -371,10 +396,12 @@ pci_uio_map_resource(struct rte_pci_device *dev)
 			if (maps[j].addr != NULL)
 				fail = 1;
 			else {
-				mapaddr = pci_map_resource(NULL, fd, (off_t)offset,
+				mapaddr = pci_map_resource(requested_addr, fd, (off_t)offset,
 						(size_t)maps[j].size);
 				if (mapaddr == NULL)
 					fail = 1;
+				else if (NULL != requested_addr)
+					requested_addr = RTE_PTR_ADD(mapaddr, maps[j].size);
 			}
 
 			if (fail) {
--
1.9.1
  

Comments

Thomas Monjalon Nov. 6, 2014, 3:58 p.m. UTC | #1
2014-11-06 15:41, Burakov, Anatoly:
> Thomas, do we need to do similar changes to VFIO code, to keep consistency?
> Also, do we really need for this to depend on --base-virtaddr? Why not do it
> unconditionally, i.e. map PCI resources right after hugepages in memory?

I don't really like the secondary process mechanism at all.
So I won't give good advice here ;)
But I feel this option --base-virtaddr should be improved or removed.
  
Burakov, Anatoly Nov. 6, 2014, 4:10 p.m. UTC | #2
Well, removing --base-virtaddr is not what I'm asking about.

The issue at hand here is that, given our secondary process mechanism (that you don't like :-) ), some stuff may be attempted to be mapped into space a secondary process may already have mapped something else into (some libraries, for example). This issue was originally discovered by OVDK, so we added a --base-virtaddr option to try and map hugepages at exact virtual address, rather than wherever mmap decides to do so on its own.

The issue encountered by Liang (the author of the patch) is similar, only it's not the hugepages are mapped into the occupied space, but rather the PCI resources (which are mapped with NULL by default, so can be mapped anywhere). Therefore he suggested a patch that maps the PCI resources into a space just after the last hugepage when --base-virtaddr is provided. I'm not sure we need the dependence on --base-virtaddr though, it can probably be done unconditionally. If you have no opinion on the matter, we can leave this detail of the patch as it is, then.

Also, I would suspect that if we are to modify where UIO resources are mapped, VFIO code should be modified the same way to avoid inconsistency between the two.

Thanks,
Anatoly

-----Original Message-----
From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com] 
Sent: Thursday, November 6, 2014 3:58 PM
To: Burakov, Anatoly
Cc: lxu; dev@dpdk.org; De Lara Guarch, Pablo
Subject: Re: [PATCH v5] eal: map uio resources after hugepages when the base_virtaddr is configured.

2014-11-06 15:41, Burakov, Anatoly:
> Thomas, do we need to do similar changes to VFIO code, to keep consistency?
> Also, do we really need for this to depend on --base-virtaddr? Why not 
> do it unconditionally, i.e. map PCI resources right after hugepages in memory?

I don't really like the secondary process mechanism at all.
So I won't give good advice here ;)
But I feel this option --base-virtaddr should be improved or removed.

--
Thomas
  
Bruce Richardson Nov. 6, 2014, 5:30 p.m. UTC | #3
On Thu, Nov 06, 2014 at 04:10:15PM +0000, Burakov, Anatoly wrote:
> Well, removing --base-virtaddr is not what I'm asking about.
> 
> The issue at hand here is that, given our secondary process mechanism (that you don't like :-) ), some stuff may be attempted to be mapped into space a secondary process may already have mapped something else into (some libraries, for example). This issue was originally discovered by OVDK, so we added a --base-virtaddr option to try and map hugepages at exact virtual address, rather than wherever mmap decides to do so on its own.
> 
> The issue encountered by Liang (the author of the patch) is similar, only it's not the hugepages are mapped into the occupied space, but rather the PCI resources (which are mapped with NULL by default, so can be mapped anywhere). Therefore he suggested a patch that maps the PCI resources into a space just after the last hugepage when --base-virtaddr is provided. I'm not sure we need the dependence on --base-virtaddr though, it can probably be done unconditionally. If you have no opinion on the matter, we can leave this detail of the patch as it is, then.
> 
> Also, I would suspect that if we are to modify where UIO resources are mapped, VFIO code should be modified the same way to avoid inconsistency between the two.
> 

I find nothing wrong with your logic, Anatoly, it makes sense to me. :-)

I'm curious, however, as to what Thomas has in mind for how we might improve
the base-virtaddr flag.

/Bruce

> Thanks,
> Anatoly
> 
> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com] 
> Sent: Thursday, November 6, 2014 3:58 PM
> To: Burakov, Anatoly
> Cc: lxu; dev@dpdk.org; De Lara Guarch, Pablo
> Subject: Re: [PATCH v5] eal: map uio resources after hugepages when the base_virtaddr is configured.
> 
> 2014-11-06 15:41, Burakov, Anatoly:
> > Thomas, do we need to do similar changes to VFIO code, to keep consistency?
> > Also, do we really need for this to depend on --base-virtaddr? Why not 
> > do it unconditionally, i.e. map PCI resources right after hugepages in memory?
> 
> I don't really like the secondary process mechanism at all.
> So I won't give good advice here ;)
> But I feel this option --base-virtaddr should be improved or removed.
> 
> --
> Thomas
  

Patch

diff --git a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
index 7e62266..a2c9ab6 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
@@ -273,6 +273,24 @@  pci_get_uio_dev(struct rte_pci_device *dev, char *dstbuf,
 	return uio_num;
 }
 
+static inline const struct rte_memseg *
+get_physmem_last(void)
+{
+	const struct rte_memseg * seg = rte_eal_get_physmem_layout();
+	const struct rte_memseg * last = seg;
+	unsigned i = 0;
+
+	for (i=0; i<RTE_MAX_MEMSEG; i++, seg++) {
+		if (seg->addr == NULL)
+			break;
+
+		if(seg->addr > last->addr)
+		 	last = seg;
+
+	}
+	return last;
+}
+
 /* map the PCI resource of a PCI device in virtual memory */  int  pci_uio_map_resource(struct rte_pci_device *dev) @@ -290,6 +308,13 @@ pci_uio_map_resource(struct rte_pci_device *dev)
 	struct mapped_pci_resource *uio_res;
 	struct pci_map *maps;