[dpdk-dev,1/2] vfio: fix pci_vfio_map_resource

Message ID 1467930397-39777-1-git-send-email-yongwang@vmware.com (mailing list archive)
State Superseded, archived
Delegated to: Thomas Monjalon
Headers

Commit Message

Yong Wang July 7, 2016, 10:26 p.m. UTC
  The offset of the 2nd mmap when mapping the region after msix_bar
needs to take region address into consideration.  This is exposed
when using vfio-pci to manage vmxnet3 pmd.

Fixes: 90a1633b2347 ("eal/linux: allow to map BARs with MSI-X tables")

Signed-off-by: Yong Wang <yongwang@vmware.com>
Signed-off-by: Ronghua Zhang <rzhang@vmware.com>
---
 lib/librte_eal/linuxapp/eal/eal_pci_vfio.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)
  

Comments

Thomas Monjalon July 14, 2016, 1:25 p.m. UTC | #1
Someone to review this patch please?
It can be integrated in RC3 if we are sure it doesn't break anything.

2016-07-07 15:26, Yong Wang:
> The offset of the 2nd mmap when mapping the region after msix_bar
> needs to take region address into consideration.  This is exposed
> when using vfio-pci to manage vmxnet3 pmd.
> 
> Fixes: 90a1633b2347 ("eal/linux: allow to map BARs with MSI-X tables")
> 
> Signed-off-by: Yong Wang <yongwang@vmware.com>
> Signed-off-by: Ronghua Zhang <rzhang@vmware.com>
> ---
>  lib/librte_eal/linuxapp/eal/eal_pci_vfio.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/librte_eal/linuxapp/eal/eal_pci_vfio.c b/lib/librte_eal/linuxapp/eal/eal_pci_vfio.c
> index f91b924..3729c35 100644
> --- a/lib/librte_eal/linuxapp/eal/eal_pci_vfio.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_pci_vfio.c
> @@ -896,7 +896,7 @@ pci_vfio_map_resource(struct rte_pci_device *dev)
>  			} else {
>  				memreg[0].offset = reg.offset;
>  				memreg[0].size = table_start;
> -				memreg[1].offset = table_end;
> +				memreg[1].offset = reg.offset + table_end;
>  				memreg[1].size = reg.size - table_end;
>  
>  				RTE_LOG(DEBUG, EAL,
> @@ -939,7 +939,8 @@ pci_vfio_map_resource(struct rte_pci_device *dev)
>  			/* if there's a second part, try to map it */
>  			if (map_addr != MAP_FAILED
>  			    && memreg[1].offset && memreg[1].size) {
> -				void *second_addr = RTE_PTR_ADD(bar_addr, memreg[1].offset);
> +				void *second_addr = RTE_PTR_ADD(bar_addr,
> +								memreg[1].offset - memreg[0].offset);
>  				map_addr = pci_map_resource(second_addr,
>  							    vfio_dev_fd, memreg[1].offset,
>  							    memreg[1].size,
>
  
Anatoly Burakov July 14, 2016, 2:50 p.m. UTC | #2
> Someone to review this patch please?
> It can be integrated in RC3 if we are sure it doesn't break anything.
> 
> 2016-07-07 15:26, Yong Wang:
> > The offset of the 2nd mmap when mapping the region after msix_bar
> > needs to take region address into consideration.  This is exposed when
> > using vfio-pci to manage vmxnet3 pmd.
> >
> > Fixes: 90a1633b2347 ("eal/linux: allow to map BARs with MSI-X tables")
> >
> > Signed-off-by: Yong Wang <yongwang@vmware.com>
> > Signed-off-by: Ronghua Zhang <rzhang@vmware.com>

Hi Thomas,

The patch makes sense to me, but I don't have any devices that trigger that particular codepath in my immediate vicinity. The original patch was mentioning an NVMe adapter, so that probably should be tested with this change. I'm CC'ing the original author of that patch just in case. Do we know of any other NICs/devices that might be affected by this? Aside from the obvious example of vmxnet3... 

Thanks,
Anatoly
  
Dan Aloni July 14, 2016, 3:34 p.m. UTC | #3
On Thu, Jul 14, 2016 at 02:50:51PM +0000, Burakov, Anatoly wrote:
> > Someone to review this patch please?
> > It can be integrated in RC3 if we are sure it doesn't break anything.
> > 
> > 2016-07-07 15:26, Yong Wang:
> > > The offset of the 2nd mmap when mapping the region after msix_bar
> > > needs to take region address into consideration.  This is exposed when
> > > using vfio-pci to manage vmxnet3 pmd.
> > >
> > > Fixes: 90a1633b2347 ("eal/linux: allow to map BARs with MSI-X tables")
> > >
> > > Signed-off-by: Yong Wang <yongwang@vmware.com>
> > > Signed-off-by: Ronghua Zhang <rzhang@vmware.com>
> 
> Hi Thomas,
> 
> The patch makes sense to me, but I don't have any devices that trigger that particular codepath in my immediate vicinity. The original patch was mentioning an NVMe adapter, so that probably should be tested with this change. I'm CC'ing the original author of that patch just in case. Do we know of any other NICs/devices that might be affected by this? Aside from the obvious example of vmxnet3... 

The patch makes sense. To explain further, the reg.offset is
resource-relative and not bar-relative like table_start and table_end.
The mmap() needs to be resource-relative.

I think it would be cleaner to use 'reg.offset' to substruct,
conveying the fact that bar_addr already accounts for it.

-                               void *second_addr = RTE_PTR_ADD(bar_addr, memreg[1].offset);
+                               void *second_addr = RTE_PTR_ADD(bar_addr,
+                                                               memreg[1].offset - reg.offset);

The work I've done with the Intel NVMe adapter was cut short due to
other tasks I've had to shift to. But it was left in a state where it
almost worked under VFIO, in a sense that the NVME card entered a
fault state and I had not invested enough time to figure out why.

If it's due to this bug, then I would be able to finally close this
one. But that's a bit hopeful, though. Happy to find actual users of
the changes.
  

Patch

diff --git a/lib/librte_eal/linuxapp/eal/eal_pci_vfio.c b/lib/librte_eal/linuxapp/eal/eal_pci_vfio.c
index f91b924..3729c35 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci_vfio.c
@@ -896,7 +896,7 @@  pci_vfio_map_resource(struct rte_pci_device *dev)
 			} else {
 				memreg[0].offset = reg.offset;
 				memreg[0].size = table_start;
-				memreg[1].offset = table_end;
+				memreg[1].offset = reg.offset + table_end;
 				memreg[1].size = reg.size - table_end;
 
 				RTE_LOG(DEBUG, EAL,
@@ -939,7 +939,8 @@  pci_vfio_map_resource(struct rte_pci_device *dev)
 			/* if there's a second part, try to map it */
 			if (map_addr != MAP_FAILED
 			    && memreg[1].offset && memreg[1].size) {
-				void *second_addr = RTE_PTR_ADD(bar_addr, memreg[1].offset);
+				void *second_addr = RTE_PTR_ADD(bar_addr,
+								memreg[1].offset - memreg[0].offset);
 				map_addr = pci_map_resource(second_addr,
 							    vfio_dev_fd, memreg[1].offset,
 							    memreg[1].size,