[v2] vfio: do not coalesce DMA mappings

Message ID 20230104051936.2456411-1-nipun.gupta@amd.com (mailing list archive)
State Accepted, archived
Delegated to: David Marchand
Headers
Series [v2] vfio: do not coalesce DMA mappings |

Checks

Context Check Description
ci/checkpatch warning coding style issues
ci/loongarch-compilation success Compilation OK
ci/loongarch-unit-testing success Unit Testing PASS
ci/Intel-compilation success Compilation OK
ci/iol-broadcom-Performance success Performance Testing PASS
ci/intel-Testing success Testing PASS
ci/iol-intel-Performance success Performance Testing PASS
ci/iol-intel-Functional fail Functional Testing issues
ci/github-robot: build success github build: passed
ci/iol-aarch64-unit-testing success Testing PASS
ci/iol-testing success Testing PASS
ci/iol-x86_64-unit-testing success Testing PASS
ci/iol-x86_64-compile-testing success Testing PASS
ci/iol-aarch64-compile-testing success Testing PASS
ci/iol-abi-testing success Testing PASS

Commit Message

Gupta, Nipun Jan. 4, 2023, 5:19 a.m. UTC
  At the cleanup time when dma unmap is done, linux kernel
does not allow unmap of individual segments which were
coalesced together while creating the DMA map for type1 IOMMU
mappings. So, this change updates the mapping of the memory
segments(hugepages) on a per-page basis.

Signed-off-by: Nipun Gupta <nipun.gupta@amd.com>
Signed-off-by: Nikhil Agarwal <nikhil.agarwal@amd.com>
---

Changes in v2:
- Fix checkpatch errors by updated mailmap

 .mailmap                 |  4 ++--
 lib/eal/linux/eal_vfio.c | 29 -----------------------------
 2 files changed, 2 insertions(+), 31 deletions(-)
  

Comments

David Marchand Feb. 2, 2023, 10:48 a.m. UTC | #1
Hi,

On Wed, Jan 4, 2023 at 6:19 AM Nipun Gupta <nipun.gupta@amd.com> wrote:
>
> At the cleanup time when dma unmap is done, linux kernel
> does not allow unmap of individual segments which were
> coalesced together while creating the DMA map for type1 IOMMU
> mappings. So, this change updates the mapping of the memory
> segments(hugepages) on a per-page basis.
>
> Signed-off-by: Nipun Gupta <nipun.gupta@amd.com>
> Signed-off-by: Nikhil Agarwal <nikhil.agarwal@amd.com>

This change is scary.

I won't take it without a review from the maintainer.
Anatoly, can you have a look?


Thanks.
  
Gupta, Nipun Feb. 7, 2023, 8:56 a.m. UTC | #2
[AMD Official Use Only - General]

Hi David,

I agree that change is not straightforward to review, but it should not cause any functional issue as we are still creating all the memory mappings, but one by one for each segment.
For hot plug case this causes issue as mentioned, that VFIO does not allow unmap of the individual segments in case mapping was created of a single coalesced segment. 

But yes, I am not sure why this code was added, which Anatoly may have more understanding on.

Anatoly,

Can you please provide your feedback on this change?

Thanks,
Nipun

> -----Original Message-----
> From: David Marchand <david.marchand@redhat.com>
> Sent: Thursday, February 2, 2023 4:19 PM
> To: anatoly.burakov@intel.com; Gupta, Nipun <Nipun.Gupta@amd.com>
> Cc: dev@dpdk.org; thomas@monjalon.net; Yigit, Ferruh
> <Ferruh.Yigit@amd.com>; Agarwal, Nikhil <nikhil.agarwal@amd.com>
> Subject: Re: [PATCH v2] vfio: do not coalesce DMA mappings
> 
> Caution: This message originated from an External Source. Use proper caution
> when opening attachments, clicking links, or responding.
> 
> 
> Hi,
> 
> On Wed, Jan 4, 2023 at 6:19 AM Nipun Gupta <nipun.gupta@amd.com> wrote:
> >
> > At the cleanup time when dma unmap is done, linux kernel
> > does not allow unmap of individual segments which were
> > coalesced together while creating the DMA map for type1 IOMMU
> > mappings. So, this change updates the mapping of the memory
> > segments(hugepages) on a per-page basis.
> >
> > Signed-off-by: Nipun Gupta <nipun.gupta@amd.com>
> > Signed-off-by: Nikhil Agarwal <nikhil.agarwal@amd.com>
> 
> This change is scary.
> 
> I won't take it without a review from the maintainer.
> Anatoly, can you have a look?
> 
> 
> Thanks.
> --
> David Marchand
  
Gupta, Nipun April 4, 2023, 2:57 p.m. UTC | #3
Hi David,
	As the DPDK 23.03 release is out can we have a relook at this change?

Thanks,
Nipun

> -----Original Message-----
> From: Gupta, Nipun
> Sent: Tuesday, February 7, 2023 2:27 PM
> To: David Marchand <david.marchand@redhat.com>;
> anatoly.burakov@intel.com
> Cc: dev@dpdk.org; thomas@monjalon.net; Yigit, Ferruh
> <Ferruh.Yigit@amd.com>; Agarwal, Nikhil <nikhil.agarwal@amd.com>
> Subject: RE: [PATCH v2] vfio: do not coalesce DMA mappings
> 
> [AMD Official Use Only - General]
> 
> Hi David,
> 
> I agree that change is not straightforward to review, but it should not cause any
> functional issue as we are still creating all the memory mappings, but one by one
> for each segment.
> For hot plug case this causes issue as mentioned, that VFIO does not allow
> unmap of the individual segments in case mapping was created of a single
> coalesced segment.
> 
> But yes, I am not sure why this code was added, which Anatoly may have more
> understanding on.
> 
> Anatoly,
> 
> Can you please provide your feedback on this change?
> 
> Thanks,
> Nipun
> 
> > -----Original Message-----
> > From: David Marchand <david.marchand@redhat.com>
> > Sent: Thursday, February 2, 2023 4:19 PM
> > To: anatoly.burakov@intel.com; Gupta, Nipun <Nipun.Gupta@amd.com>
> > Cc: dev@dpdk.org; thomas@monjalon.net; Yigit, Ferruh
> > <Ferruh.Yigit@amd.com>; Agarwal, Nikhil <nikhil.agarwal@amd.com>
> > Subject: Re: [PATCH v2] vfio: do not coalesce DMA mappings
> >
> > Caution: This message originated from an External Source. Use proper caution
> > when opening attachments, clicking links, or responding.
> >
> >
> > Hi,
> >
> > On Wed, Jan 4, 2023 at 6:19 AM Nipun Gupta <nipun.gupta@amd.com>
> wrote:
> > >
> > > At the cleanup time when dma unmap is done, linux kernel
> > > does not allow unmap of individual segments which were
> > > coalesced together while creating the DMA map for type1 IOMMU
> > > mappings. So, this change updates the mapping of the memory
> > > segments(hugepages) on a per-page basis.
> > >
> > > Signed-off-by: Nipun Gupta <nipun.gupta@amd.com>
> > > Signed-off-by: Nikhil Agarwal <nikhil.agarwal@amd.com>
> >
> > This change is scary.
> >
> > I won't take it without a review from the maintainer.
> > Anatoly, can you have a look?
> >
> >
> > Thanks.
> > --
> > David Marchand
  
Anatoly Burakov April 4, 2023, 3:13 p.m. UTC | #4
On 2/7/2023 8:56 AM, Gupta, Nipun wrote:
> [AMD Official Use Only - General]
> 
> Hi David,
> 
> I agree that change is not straightforward to review, but it should not cause any functional issue as we are still creating all the memory mappings, but one by one for each segment.
> For hot plug case this causes issue as mentioned, that VFIO does not allow unmap of the individual segments in case mapping was created of a single coalesced segment.
> 
> But yes, I am not sure why this code was added, which Anatoly may have more understanding on.

The motivation behind this code was that Linux allows limited amount of 
page mappings, so we were trying to save on those. However, since then 
there have been a few changes related to partial unmaps that may make it 
so that this code is not only no longer necessary, but is in fact 
actively harmful. I agree that this at least warrants a second look.

> 
> Anatoly,
> 
> Can you please provide your feedback on this change?

The patch probably shouldn't include the mailmap changes :)

Could you please provide some steps to reproduce the hotplug issue 
you're having? It would be great to have a test case for this patchset 
to put it in context.
  
Gupta, Nipun April 4, 2023, 4:32 p.m. UTC | #5
On 4/4/2023 8:43 PM, Burakov, Anatoly wrote:
> Caution: This message originated from an External Source. Use proper 
> caution when opening attachments, clicking links, or responding.
> 
> 
> On 2/7/2023 8:56 AM, Gupta, Nipun wrote:
>> [AMD Official Use Only - General]
>>
>> Hi David,
>>
>> I agree that change is not straightforward to review, but it should 
>> not cause any functional issue as we are still creating all the memory 
>> mappings, but one by one for each segment.
>> For hot plug case this causes issue as mentioned, that VFIO does not 
>> allow unmap of the individual segments in case mapping was created of 
>> a single coalesced segment.
>>
>> But yes, I am not sure why this code was added, which Anatoly may have 
>> more understanding on.
> 
> The motivation behind this code was that Linux allows limited amount of
> page mappings, so we were trying to save on those. However, since then
> there have been a few changes related to partial unmaps that may make it
> so that this code is not only no longer necessary, but is in fact
> actively harmful. I agree that this at least warrants a second look.
> 
>>
>> Anatoly,
>>
>> Can you please provide your feedback on this change?
> 
> The patch probably shouldn't include the mailmap changes :)

Sure, will send a separate patch for it.

> 
> Could you please provide some steps to reproduce the hotplug issue
> you're having? It would be great to have a test case for this patchset
> to put it in context.

I am working on CDX bus 
(http://patchwork.dpdk.org/project/dpdk/patch/20230124140746.594066-2-nipun.gupta@amd.com/) 
and trying out some cases for plug/unplug.

The test is as follows:
   # Run testpmd application
   ./dpdk-testpmd -c 0x3 -- -i --nb-cores=1

   # Bind to VFIO
   echo "vfio-cdx" >  /sys/bus/cdx/devices/cdx-00\:00/driver_override
   echo "cdx-00:00" > /sys/bus/cdx/drivers_probe

   # Plug a device
   testpmd> port attach cdx:cdx-00:00

   #quit testpmd
   testpmd> quit

This gave error at testpmd exit that memory cannot be freed. On 
debugging I updated this code and seems it should be seen with any of 
the device.

I see similar test case (without quit) mentioned 
https://doc.dpdk.org/dts/test_plans/hotplug_test_plan.html, but the 
difference is that it is with igb_uio and issue is being observed with VFIO.

Please note the device/bus mentioned in the commands is not yet 
upstreamed in DPDK, but patches would be sent out soon.

Thanks,
Nipun

> 
> -- 
> Thanks,
> Anatoly
>
  
Anatoly Burakov April 5, 2023, 2:16 p.m. UTC | #6
On 4/4/2023 5:32 PM, Nipun Gupta wrote:
> 
> 
> On 4/4/2023 8:43 PM, Burakov, Anatoly wrote:
>> Caution: This message originated from an External Source. Use proper 
>> caution when opening attachments, clicking links, or responding.
>>
>>
>> On 2/7/2023 8:56 AM, Gupta, Nipun wrote:
>>> [AMD Official Use Only - General]
>>>
>>> Hi David,
>>>
>>> I agree that change is not straightforward to review, but it should 
>>> not cause any functional issue as we are still creating all the 
>>> memory mappings, but one by one for each segment.
>>> For hot plug case this causes issue as mentioned, that VFIO does not 
>>> allow unmap of the individual segments in case mapping was created of 
>>> a single coalesced segment.
>>>
>>> But yes, I am not sure why this code was added, which Anatoly may 
>>> have more understanding on.
>>
>> The motivation behind this code was that Linux allows limited amount of
>> page mappings, so we were trying to save on those. However, since then
>> there have been a few changes related to partial unmaps that may make it
>> so that this code is not only no longer necessary, but is in fact
>> actively harmful. I agree that this at least warrants a second look.
>>
>>>
>>> Anatoly,
>>>
>>> Can you please provide your feedback on this change?
>>
>> The patch probably shouldn't include the mailmap changes :)
> 
> Sure, will send a separate patch for it.
> 
>>
>> Could you please provide some steps to reproduce the hotplug issue
>> you're having? It would be great to have a test case for this patchset
>> to put it in context.
> 
> I am working on CDX bus 
> (http://patchwork.dpdk.org/project/dpdk/patch/20230124140746.594066-2-nipun.gupta@amd.com/) and trying out some cases for plug/unplug.
> 
> The test is as follows:
>    # Run testpmd application
>    ./dpdk-testpmd -c 0x3 -- -i --nb-cores=1
> 
>    # Bind to VFIO
>    echo "vfio-cdx" >  /sys/bus/cdx/devices/cdx-00\:00/driver_override
>    echo "cdx-00:00" > /sys/bus/cdx/drivers_probe
> 
>    # Plug a device
>    testpmd> port attach cdx:cdx-00:00
> 
>    #quit testpmd
>    testpmd> quit
> 
> This gave error at testpmd exit that memory cannot be freed. On 
> debugging I updated this code and seems it should be seen with any of 
> the device.
> 
> I see similar test case (without quit) mentioned 
> https://doc.dpdk.org/dts/test_plans/hotplug_test_plan.html, but the 
> difference is that it is with igb_uio and issue is being observed with 
> VFIO.
> 
> Please note the device/bus mentioned in the commands is not yet 
> upstreamed in DPDK, but patches would be sent out soon.
> 
> Thanks,
> Nipun
> 

Thanks, I can reproduce this issue with regular devices too (run testpmd 
with no devices, bind a NIC to VFIO, attach it, then quit). You're 
correct in that since the initial mapping was done with mapping large 
contiguous zones (such as when mempools are created before attach), any 
subsequent freeing of memory will cause these errors to happen.

I don't think this can be fixed by anything other than not doing the 
contiguous mapping thing, so provisionally, I think this patch should be 
accepted. I'll play around with it some more and get back to you :)

>>
>> -- 
>> Thanks,
>> Anatoly
>>
  
Gupta, Nipun April 5, 2023, 3:06 p.m. UTC | #7
[AMD Official Use Only - General]



> -----Original Message-----
> From: Burakov, Anatoly <anatoly.burakov@intel.com>
> Sent: Wednesday, April 5, 2023 7:46 PM
> To: Gupta, Nipun <Nipun.Gupta@amd.com>; David Marchand
> <david.marchand@redhat.com>
> Cc: dev@dpdk.org; thomas@monjalon.net; Yigit, Ferruh
> <Ferruh.Yigit@amd.com>; Agarwal, Nikhil <nikhil.agarwal@amd.com>
> Subject: Re: [PATCH v2] vfio: do not coalesce DMA mappings
> 
> Caution: This message originated from an External Source. Use proper caution
> when opening attachments, clicking links, or responding.
> 
> 
> On 4/4/2023 5:32 PM, Nipun Gupta wrote:
> >
> >
> > On 4/4/2023 8:43 PM, Burakov, Anatoly wrote:
> >> Caution: This message originated from an External Source. Use proper
> >> caution when opening attachments, clicking links, or responding.
> >>
> >>
> >> On 2/7/2023 8:56 AM, Gupta, Nipun wrote:
> >>> [AMD Official Use Only - General]
> >>>
> >>> Hi David,
> >>>
> >>> I agree that change is not straightforward to review, but it should
> >>> not cause any functional issue as we are still creating all the
> >>> memory mappings, but one by one for each segment.
> >>> For hot plug case this causes issue as mentioned, that VFIO does not
> >>> allow unmap of the individual segments in case mapping was created of
> >>> a single coalesced segment.
> >>>
> >>> But yes, I am not sure why this code was added, which Anatoly may
> >>> have more understanding on.
> >>
> >> The motivation behind this code was that Linux allows limited amount of
> >> page mappings, so we were trying to save on those. However, since then
> >> there have been a few changes related to partial unmaps that may make it
> >> so that this code is not only no longer necessary, but is in fact
> >> actively harmful. I agree that this at least warrants a second look.
> >>
> >>>
> >>> Anatoly,
> >>>
> >>> Can you please provide your feedback on this change?
> >>
> >> The patch probably shouldn't include the mailmap changes :)
> >
> > Sure, will send a separate patch for it.
> >
> >>
> >> Could you please provide some steps to reproduce the hotplug issue
> >> you're having? It would be great to have a test case for this patchset
> >> to put it in context.
> >
> > I am working on CDX bus
> > (http://patchwork.dpdk.org/project/dpdk/patch/20230124140746.594066-2-
> nipun.gupta@amd.com/) and trying out some cases for plug/unplug.
> >
> > The test is as follows:
> >    # Run testpmd application
> >    ./dpdk-testpmd -c 0x3 -- -i --nb-cores=1
> >
> >    # Bind to VFIO
> >    echo "vfio-cdx" >  /sys/bus/cdx/devices/cdx-00\:00/driver_override
> >    echo "cdx-00:00" > /sys/bus/cdx/drivers_probe
> >
> >    # Plug a device
> >    testpmd> port attach cdx:cdx-00:00
> >
> >    #quit testpmd
> >    testpmd> quit
> >
> > This gave error at testpmd exit that memory cannot be freed. On
> > debugging I updated this code and seems it should be seen with any of
> > the device.
> >
> > I see similar test case (without quit) mentioned
> > https://doc.dpdk.org/dts/test_plans/hotplug_test_plan.html, but the
> > difference is that it is with igb_uio and issue is being observed with
> > VFIO.
> >
> > Please note the device/bus mentioned in the commands is not yet
> > upstreamed in DPDK, but patches would be sent out soon.
> >
> > Thanks,
> > Nipun
> >
> 
> Thanks, I can reproduce this issue with regular devices too (run testpmd
> with no devices, bind a NIC to VFIO, attach it, then quit). You're
> correct in that since the initial mapping was done with mapping large
> contiguous zones (such as when mempools are created before attach), any
> subsequent freeing of memory will cause these errors to happen.

Thanks for trying on other devices too, and good to know that this is also
reproduced with them :)

Regards,
Nipun

> 
> I don't think this can be fixed by anything other than not doing the
> contiguous mapping thing, so provisionally, I think this patch should be
> accepted. I'll play around with it some more and get back to you :)
> 
> >>
> >> --
> >> Thanks,
> >> Anatoly
> >>
> 
> --
> Thanks,
> Anatoly
  
Gupta, Nipun April 7, 2023, 6:13 a.m. UTC | #8
On 4/4/2023 8:43 PM, Burakov, Anatoly wrote:
> Caution: This message originated from an External Source. Use proper 
> caution when opening attachments, clicking links, or responding.
> 
> 
> On 2/7/2023 8:56 AM, Gupta, Nipun wrote:
>> [AMD Official Use Only - General]
>>
>> Hi David,
>>
>> I agree that change is not straightforward to review, but it should 
>> not cause any functional issue as we are still creating all the memory 
>> mappings, but one by one for each segment.
>> For hot plug case this causes issue as mentioned, that VFIO does not 
>> allow unmap of the individual segments in case mapping was created of 
>> a single coalesced segment.
>>
>> But yes, I am not sure why this code was added, which Anatoly may have 
>> more understanding on.
> 
> The motivation behind this code was that Linux allows limited amount of
> page mappings, so we were trying to save on those. However, since then
> there have been a few changes related to partial unmaps that may make it
> so that this code is not only no longer necessary, but is in fact
> actively harmful. I agree that this at least warrants a second look.
> 
>>
>> Anatoly,
>>
>> Can you please provide your feedback on this change?
> 
> The patch probably shouldn't include the mailmap changes :)

I see in "git log" that all the mailmap changes are with the patch 
submitted, probably as it shows checkpatch warning, so it seems this 
should be fine?

Thanks,
Nipun

> 
> Could you please provide some steps to reproduce the hotplug issue
> you're having? It would be great to have a test case for this patchset
> to put it in context.
> 
> -- 
> Thanks,
> Anatoly
>
  
David Marchand April 7, 2023, 7:26 a.m. UTC | #9
Hello Nipun, Anatoly,

On Fri, Apr 7, 2023 at 8:13 AM Nipun Gupta <nipun.gupta@amd.com> wrote:
> > The patch probably shouldn't include the mailmap changes :)

Mailmap changes can be done by the submitter.
So Nipun did nothing wrong.

>
> I see in "git log" that all the mailmap changes are with the patch
> submitted, probably as it shows checkpatch warning, so it seems this
> should be fine?

If a submitter includes the mailmap change in his patches, it is fine.
Otherwise, the maintainer that merges the first patch of a new
contributor will fix the mailmap at the same time.
  
Thomas Monjalon April 11, 2023, 3:13 p.m. UTC | #10
04/04/2023 18:32, Nipun Gupta:
> On 4/4/2023 8:43 PM, Burakov, Anatoly wrote:
> > The patch probably shouldn't include the mailmap changes :)
> 
> Sure, will send a separate patch for it.

No please, we squash mailmap changes with 1st patch.
  
Gupta, Nipun April 11, 2023, 4:51 p.m. UTC | #11
[AMD Official Use Only - General]



> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Tuesday, April 11, 2023 8:43 PM
> To: Burakov, Anatoly <anatoly.burakov@intel.com>; David Marchand
> <david.marchand@redhat.com>; Gupta, Nipun <Nipun.Gupta@amd.com>
> Cc: dev@dpdk.org; Yigit, Ferruh <Ferruh.Yigit@amd.com>; Agarwal, Nikhil
> <nikhil.agarwal@amd.com>
> Subject: Re: [PATCH v2] vfio: do not coalesce DMA mappings
> 
> Caution: This message originated from an External Source. Use proper caution
> when opening attachments, clicking links, or responding.
> 
> 
> 04/04/2023 18:32, Nipun Gupta:
> > On 4/4/2023 8:43 PM, Burakov, Anatoly wrote:
> > > The patch probably shouldn't include the mailmap changes :)
> >
> > Sure, will send a separate patch for it.
> 
> No please, we squash mailmap changes with 1st patch.

Agree, I have replied same to this later in this mail thread.
So, there is no change/resubmission required in this patch as of now.

>
  
David Marchand April 24, 2023, 3:22 p.m. UTC | #12
Hello Anatoly,

On Wed, Apr 5, 2023 at 4:17 PM Burakov, Anatoly
<anatoly.burakov@intel.com> wrote:
> >> Could you please provide some steps to reproduce the hotplug issue
> >> you're having? It would be great to have a test case for this patchset
> >> to put it in context.
> >
> > I am working on CDX bus
> > (http://patchwork.dpdk.org/project/dpdk/patch/20230124140746.594066-2-nipun.gupta@amd.com/) and trying out some cases for plug/unplug.
> >
> > The test is as follows:
> >    # Run testpmd application
> >    ./dpdk-testpmd -c 0x3 -- -i --nb-cores=1
> >
> >    # Bind to VFIO
> >    echo "vfio-cdx" >  /sys/bus/cdx/devices/cdx-00\:00/driver_override
> >    echo "cdx-00:00" > /sys/bus/cdx/drivers_probe
> >
> >    # Plug a device
> >    testpmd> port attach cdx:cdx-00:00
> >
> >    #quit testpmd
> >    testpmd> quit
> >
> > This gave error at testpmd exit that memory cannot be freed. On
> > debugging I updated this code and seems it should be seen with any of
> > the device.
> >
> > I see similar test case (without quit) mentioned
> > https://doc.dpdk.org/dts/test_plans/hotplug_test_plan.html, but the
> > difference is that it is with igb_uio and issue is being observed with
> > VFIO.
> >
> > Please note the device/bus mentioned in the commands is not yet
> > upstreamed in DPDK, but patches would be sent out soon.
> >
> > Thanks,
> > Nipun
> >
>
> Thanks, I can reproduce this issue with regular devices too (run testpmd
> with no devices, bind a NIC to VFIO, attach it, then quit). You're
> correct in that since the initial mapping was done with mapping large
> contiguous zones (such as when mempools are created before attach), any
> subsequent freeing of memory will cause these errors to happen.
>
> I don't think this can be fixed by anything other than not doing the
> contiguous mapping thing, so provisionally, I think this patch should be
> accepted. I'll play around with it some more and get back to you :)

Can we conclude on this topic?
It is best we merge this kind of change the sooner possible for a release.

Thanks.
  
Stephen Hemminger April 24, 2023, 4:10 p.m. UTC | #13
On Mon, 24 Apr 2023 17:22:46 +0200
David Marchand <david.marchand@redhat.com> wrote:

> > >  
> >
> > Thanks, I can reproduce this issue with regular devices too (run testpmd
> > with no devices, bind a NIC to VFIO, attach it, then quit). You're
> > correct in that since the initial mapping was done with mapping large
> > contiguous zones (such as when mempools are created before attach), any
> > subsequent freeing of memory will cause these errors to happen.
> >
> > I don't think this can be fixed by anything other than not doing the
> > contiguous mapping thing, so provisionally, I think this patch should be
> > accepted. I'll play around with it some more and get back to you :)  
> 
> Can we conclude on this topic?
> It is best we merge this kind of change the sooner possible for a release.
> 
> Thanks.

Shouldn't the coalesced mappings be able to have correct datastructure
(accounting) so that on shutdown the unmap's are done for the right size?
  
Gupta, Nipun April 24, 2023, 4:16 p.m. UTC | #14
> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Monday, April 24, 2023 9:41 PM
> To: David Marchand <david.marchand@redhat.com>
> Cc: Burakov, Anatoly <anatoly.burakov@intel.com>; Gupta, Nipun
> <Nipun.Gupta@amd.com>; dev@dpdk.org; thomas@monjalon.net; Yigit, Ferruh
> <Ferruh.Yigit@amd.com>; Agarwal, Nikhil <nikhil.agarwal@amd.com>
> Subject: Re: [PATCH v2] vfio: do not coalesce DMA mappings
> 
> 
> On Mon, 24 Apr 2023 17:22:46 +0200
> David Marchand <david.marchand@redhat.com> wrote:
> 
> > > >
> > >
> > > Thanks, I can reproduce this issue with regular devices too (run testpmd
> > > with no devices, bind a NIC to VFIO, attach it, then quit). You're
> > > correct in that since the initial mapping was done with mapping large
> > > contiguous zones (such as when mempools are created before attach), any
> > > subsequent freeing of memory will cause these errors to happen.
> > >
> > > I don't think this can be fixed by anything other than not doing the
> > > contiguous mapping thing, so provisionally, I think this patch should be
> > > accepted. I'll play around with it some more and get back to you :)
> >
> > Can we conclude on this topic?
> > It is best we merge this kind of change the sooner possible for a release.
> >
> > Thanks.
> 
> Shouldn't the coalesced mappings be able to have correct datastructure
> (accounting) so that on shutdown the unmap's are done for the right size?

This issue occurs only on the hotplug case. Other devices which are not hot plugged
and are existing from the start of the application need to have individual (non-
coalesced) mappings. So individual (non-coalesced) mappings are definitely
required. IMO we should not maintain separate mapping for each hot-plugged
device as it would be unrequired overhead.

Regards,
Nipun
  
Gupta, Nipun May 10, 2023, 12:58 p.m. UTC | #15
On 4/24/2023 8:52 PM, David Marchand wrote:
> 
> Hello Anatoly,
> 
> On Wed, Apr 5, 2023 at 4:17 PM Burakov, Anatoly
> <anatoly.burakov@intel.com> wrote:
>>>> Could you please provide some steps to reproduce the hotplug issue
>>>> you're having? It would be great to have a test case for this patchset
>>>> to put it in context.
>>>
>>> I am working on CDX bus
>>> (http://patchwork.dpdk.org/project/dpdk/patch/20230124140746.594066-2-nipun.gupta@amd.com/) and trying out some cases for plug/unplug.
>>>
>>> The test is as follows:
>>>     # Run testpmd application
>>>     ./dpdk-testpmd -c 0x3 -- -i --nb-cores=1
>>>
>>>     # Bind to VFIO
>>>     echo "vfio-cdx" >  /sys/bus/cdx/devices/cdx-00\:00/driver_override
>>>     echo "cdx-00:00" > /sys/bus/cdx/drivers_probe
>>>
>>>     # Plug a device
>>>     testpmd> port attach cdx:cdx-00:00
>>>
>>>     #quit testpmd
>>>     testpmd> quit
>>>
>>> This gave error at testpmd exit that memory cannot be freed. On
>>> debugging I updated this code and seems it should be seen with any of
>>> the device.
>>>
>>> I see similar test case (without quit) mentioned
>>> https://doc.dpdk.org/dts/test_plans/hotplug_test_plan.html, but the
>>> difference is that it is with igb_uio and issue is being observed with
>>> VFIO.
>>>
>>> Please note the device/bus mentioned in the commands is not yet
>>> upstreamed in DPDK, but patches would be sent out soon.
>>>
>>> Thanks,
>>> Nipun
>>>
>>
>> Thanks, I can reproduce this issue with regular devices too (run testpmd
>> with no devices, bind a NIC to VFIO, attach it, then quit). You're
>> correct in that since the initial mapping was done with mapping large
>> contiguous zones (such as when mempools are created before attach), any
>> subsequent freeing of memory will cause these errors to happen.
>>
>> I don't think this can be fixed by anything other than not doing the
>> contiguous mapping thing, so provisionally, I think this patch should be
>> accepted. I'll play around with it some more and get back to you :)
> 
> Can we conclude on this topic?
> It is best we merge this kind of change the sooner possible for a release.

Hi Anatoly,
	Can you kindly update on this?

Thanks,
Nipun

> 
> Thanks.
> 
> 
> --
> David Marchand
>
  
Anatoly Burakov May 11, 2023, 2:08 p.m. UTC | #16
On 5/10/2023 1:58 PM, Nipun Gupta wrote:
> 
> 
> On 4/24/2023 8:52 PM, David Marchand wrote:
>>
>> Hello Anatoly,
>>
>> On Wed, Apr 5, 2023 at 4:17 PM Burakov, Anatoly
>> <anatoly.burakov@intel.com> wrote:
>>>>> Could you please provide some steps to reproduce the hotplug issue
>>>>> you're having? It would be great to have a test case for this patchset
>>>>> to put it in context.
>>>>
>>>> I am working on CDX bus
>>>> (http://patchwork.dpdk.org/project/dpdk/patch/20230124140746.594066-2-nipun.gupta@amd.com/) and trying out some cases for plug/unplug.
>>>>
>>>> The test is as follows:
>>>>     # Run testpmd application
>>>>     ./dpdk-testpmd -c 0x3 -- -i --nb-cores=1
>>>>
>>>>     # Bind to VFIO
>>>>     echo "vfio-cdx" >  /sys/bus/cdx/devices/cdx-00\:00/driver_override
>>>>     echo "cdx-00:00" > /sys/bus/cdx/drivers_probe
>>>>
>>>>     # Plug a device
>>>>     testpmd> port attach cdx:cdx-00:00
>>>>
>>>>     #quit testpmd
>>>>     testpmd> quit
>>>>
>>>> This gave error at testpmd exit that memory cannot be freed. On
>>>> debugging I updated this code and seems it should be seen with any of
>>>> the device.
>>>>
>>>> I see similar test case (without quit) mentioned
>>>> https://doc.dpdk.org/dts/test_plans/hotplug_test_plan.html, but the
>>>> difference is that it is with igb_uio and issue is being observed with
>>>> VFIO.
>>>>
>>>> Please note the device/bus mentioned in the commands is not yet
>>>> upstreamed in DPDK, but patches would be sent out soon.
>>>>
>>>> Thanks,
>>>> Nipun
>>>>
>>>
>>> Thanks, I can reproduce this issue with regular devices too (run testpmd
>>> with no devices, bind a NIC to VFIO, attach it, then quit). You're
>>> correct in that since the initial mapping was done with mapping large
>>> contiguous zones (such as when mempools are created before attach), any
>>> subsequent freeing of memory will cause these errors to happen.
>>>
>>> I don't think this can be fixed by anything other than not doing the
>>> contiguous mapping thing, so provisionally, I think this patch should be
>>> accepted. I'll play around with it some more and get back to you :)
>>
>> Can we conclude on this topic?
>> It is best we merge this kind of change the sooner possible for a 
>> release.
> 
> Hi Anatoly,
>      Can you kindly update on this?
> 

Hi all,

apologies for late reply.

Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
  
David Marchand May 15, 2023, 11:16 a.m. UTC | #17
On Wed, Jan 4, 2023 at 6:19 AM Nipun Gupta <nipun.gupta@amd.com> wrote:
> At the cleanup time when dma unmap is done, linux kernel
> does not allow unmap of individual segments which were
> coalesced together while creating the DMA map for type1 IOMMU
> mappings. So, this change updates the mapping of the memory
> segments(hugepages) on a per-page basis.
>
> Signed-off-by: Nipun Gupta <nipun.gupta@amd.com>
> Signed-off-by: Nikhil Agarwal <nikhil.agarwal@amd.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>


Applied, thanks.
  

Patch

diff --git a/.mailmap b/.mailmap
index 75884b6fe2..a234c9b3de 100644
--- a/.mailmap
+++ b/.mailmap
@@ -954,7 +954,7 @@  Nicolas Chautru <nicolas.chautru@intel.com>
 Nicolas Dichtel <nicolas.dichtel@6wind.com>
 Nicolas Harnois <nicolas.harnois@6wind.com>
 Nicolás Pernas Maradei <nico@emutex.com> <nicolas.pernas.maradei@emutex.com>
-Nikhil Agarwal <nikhil.agarwal@linaro.org>
+Nikhil Agarwal <nikhil.agarwal@amd.com> <nikhil.agarwal@xilinx.com> <nikhil.agarwal@linaro.org>
 Nikhil Jagtap <nikhil.jagtap@gmail.com>
 Nikhil Rao <nikhil.rao@intel.com>
 Nikhil Vasoya <nikhil.vasoya@chelsio.com>
@@ -962,7 +962,7 @@  Nikita Kozlov <nikita@elyzion.net>
 Niklas Söderlund <niklas.soderlund@corigine.com>
 Nikolay Nikolaev <nicknickolaev@gmail.com>
 Ning Li <muziding001@163.com> <lining18@jd.com>
-Nipun Gupta <nipun.gupta@nxp.com>
+Nipun Gupta <nipun.gupta@amd.com> <nipun.gupta@xilinx.com> <nipun.gupta@nxp.com>
 Nir Efrati <nir.efrati@intel.com>
 Nirmoy Das <ndas@suse.de>
 Nithin Dabilpuram <ndabilpuram@marvell.com> <nithin.dabilpuram@caviumnetworks.com>
diff --git a/lib/eal/linux/eal_vfio.c b/lib/eal/linux/eal_vfio.c
index 549b86ae1d..56edccb0db 100644
--- a/lib/eal/linux/eal_vfio.c
+++ b/lib/eal/linux/eal_vfio.c
@@ -1369,19 +1369,6 @@  rte_vfio_get_group_num(const char *sysfs_base,
 	return 1;
 }
 
-static int
-type1_map_contig(const struct rte_memseg_list *msl, const struct rte_memseg *ms,
-		size_t len, void *arg)
-{
-	int *vfio_container_fd = arg;
-
-	if (msl->external)
-		return 0;
-
-	return vfio_type1_dma_mem_map(*vfio_container_fd, ms->addr_64, ms->iova,
-			len, 1);
-}
-
 static int
 type1_map(const struct rte_memseg_list *msl, const struct rte_memseg *ms,
 		void *arg)
@@ -1396,10 +1383,6 @@  type1_map(const struct rte_memseg_list *msl, const struct rte_memseg *ms,
 	if (ms->iova == RTE_BAD_IOVA)
 		return 0;
 
-	/* if IOVA mode is VA, we've already mapped the internal segments */
-	if (!msl->external && rte_eal_iova_mode() == RTE_IOVA_VA)
-		return 0;
-
 	return vfio_type1_dma_mem_map(*vfio_container_fd, ms->addr_64, ms->iova,
 			ms->len, 1);
 }
@@ -1464,18 +1447,6 @@  vfio_type1_dma_mem_map(int vfio_container_fd, uint64_t vaddr, uint64_t iova,
 static int
 vfio_type1_dma_map(int vfio_container_fd)
 {
-	if (rte_eal_iova_mode() == RTE_IOVA_VA) {
-		/* with IOVA as VA mode, we can get away with mapping contiguous
-		 * chunks rather than going page-by-page.
-		 */
-		int ret = rte_memseg_contig_walk(type1_map_contig,
-				&vfio_container_fd);
-		if (ret)
-			return ret;
-		/* we have to continue the walk because we've skipped the
-		 * external segments during the config walk.
-		 */
-	}
 	return rte_memseg_walk(type1_map, &vfio_container_fd);
 }