vhost: fix async DMA map

Message ID 20211025203353.147346-1-maxime.coquelin@redhat.com (mailing list archive)
State Superseded, archived
Delegated to: Maxime Coquelin
Headers
Series vhost: fix async DMA map |

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/iol-broadcom-Functional success Functional Testing PASS
ci/Intel-compilation success Compilation OK
ci/intel-Testing success Testing PASS
ci/iol-broadcom-Performance success Performance Testing PASS
ci/github-robot: build success github build: passed
ci/iol-x86_64-compile-testing success Testing PASS
ci/iol-x86_64-unit-testing success Testing PASS
ci/iol-mellanox-Performance success Performance Testing PASS
ci/iol-aarch64-compile-testing success Testing PASS
ci/iol-intel-Performance success Performance Testing PASS
ci/iol-intel-Functional success Functional Testing PASS
ci/iol-aarch64-unit-testing success Testing PASS

Commit Message

Maxime Coquelin Oct. 25, 2021, 8:33 p.m. UTC
  This patch fixes possible NULL-pointer dereferencing
reported by Coverity and also fixes NUMA reallocation
of the async DMA map.

Fixes: 7c61fa08b716 ("vhost: enable IOMMU for async vhost")

Coverity issue: 373655

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
 lib/vhost/vhost_user.c | 45 +++++++++++++++++++-----------------------
 1 file changed, 20 insertions(+), 25 deletions(-)
  

Comments

Maxime Coquelin Oct. 25, 2021, 8:47 p.m. UTC | #1
Hi Xuan,

On 10/25/21 22:33, Maxime Coquelin wrote:
> This patch fixes possible NULL-pointer dereferencing
> reported by Coverity and also fixes NUMA reallocation
> of the async DMA map.
> 
> Fixes: 7c61fa08b716 ("vhost: enable IOMMU for async vhost")
> 
> Coverity issue: 373655
> 
> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> ---
>   lib/vhost/vhost_user.c | 45 +++++++++++++++++++-----------------------
>   1 file changed, 20 insertions(+), 25 deletions(-)
> 

I posted this patch to fix the issue reported by Coverity and also other
issue on NUMA realloc that I found at the same time. But I wonder
whether all this async_map_status is needed.

Indeed, if the only place where we DMA map is in
vhost_user_mmap_region(). If it fails, the error is propagated, the mem
table are freed and NACK is replied to the master. IOW, the device will
be in an unusable state.

Removing the async DMA map will simplify a lot the code, do you agree to
remove it or there is something I missed?

Thanks,
Maxime
  
Ding, Xuan Oct. 26, 2021, 2:07 a.m. UTC | #2
Hi Maxime,

>-----Original Message-----
>From: Maxime Coquelin <maxime.coquelin@redhat.com>
>Sent: Tuesday, October 26, 2021 4:47 AM
>To: dev@dpdk.org; david.marchand@redhat.com; Xia, Chenbo
><chenbo.xia@intel.com>; Ding, Xuan <xuan.ding@intel.com>
>Subject: Re: [PATCH] vhost: fix async DMA map
>
>Hi Xuan,
>
>On 10/25/21 22:33, Maxime Coquelin wrote:
>> This patch fixes possible NULL-pointer dereferencing
>> reported by Coverity and also fixes NUMA reallocation
>> of the async DMA map.
>>
>> Fixes: 7c61fa08b716 ("vhost: enable IOMMU for async vhost")
>>
>> Coverity issue: 373655
>>
>> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>> ---
>>   lib/vhost/vhost_user.c | 45 +++++++++++++++++++-----------------------
>>   1 file changed, 20 insertions(+), 25 deletions(-)
>>
>
>I posted this patch to fix the issue reported by Coverity and also other
>issue on NUMA realloc that I found at the same time. But I wonder
>whether all this async_map_status is needed.

Thanks for your fix! I can help to review and test the patch later.

I add the async_map_status in v2 for compatibility. Some DMA device,
like DSA, may use kernel idxd driver only. If there is no device bound to
DPDK vfio and kernel vfio module is modprobed to ensure rte_vfio_is_enabled() is true,
we will unavoidably do DMA map/unmap and it will fail.

Therefore, the dma_map_status here is used to filter this situation by preventing
unnecessary DMA unmap.

>
>Indeed, if the only place where we DMA map is in
>vhost_user_mmap_region(). If it fails, the error is propagated, the mem
>table are freed and NACK is replied to the master. IOW, the device will
>be in an unusable state.

I agree with you, this is the place I consider right to do DMA map
because we also do SW mapping here, any suggestions?

>
>Removing the async DMA map will simplify a lot the code, do you agree to
>remove it or there is something I missed?

See above. Indeed, it adds a lot of code. But we can't know the driver for
each device in vhost lib, or we can only restrict the user to bind some devices
to DPDK vfio if async logic needed.

>
>Thanks,
>Maxime
  
Maxime Coquelin Oct. 26, 2021, 6:52 a.m. UTC | #3
On 10/26/21 04:07, Ding, Xuan wrote:
> Hi Maxime,
> 
>> -----Original Message-----
>> From: Maxime Coquelin <maxime.coquelin@redhat.com>
>> Sent: Tuesday, October 26, 2021 4:47 AM
>> To: dev@dpdk.org; david.marchand@redhat.com; Xia, Chenbo
>> <chenbo.xia@intel.com>; Ding, Xuan <xuan.ding@intel.com>
>> Subject: Re: [PATCH] vhost: fix async DMA map
>>
>> Hi Xuan,
>>
>> On 10/25/21 22:33, Maxime Coquelin wrote:
>>> This patch fixes possible NULL-pointer dereferencing
>>> reported by Coverity and also fixes NUMA reallocation
>>> of the async DMA map.
>>>
>>> Fixes: 7c61fa08b716 ("vhost: enable IOMMU for async vhost")
>>>
>>> Coverity issue: 373655
>>>
>>> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>>> ---
>>>    lib/vhost/vhost_user.c | 45 +++++++++++++++++++-----------------------
>>>    1 file changed, 20 insertions(+), 25 deletions(-)
>>>
>>
>> I posted this patch to fix the issue reported by Coverity and also other
>> issue on NUMA realloc that I found at the same time. But I wonder
>> whether all this async_map_status is needed.
> 
> Thanks for your fix! I can help to review and test the patch later.
> 
> I add the async_map_status in v2 for compatibility. Some DMA device,
> like DSA, may use kernel idxd driver only. If there is no device bound to
> DPDK vfio and kernel vfio module is modprobed to ensure rte_vfio_is_enabled() is true,
> we will unavoidably do DMA map/unmap and it will fail.
> 
> Therefore, the dma_map_status here is used to filter this situation by preventing
> unnecessary DMA unmap.

Ok, then I think we can just remove the async DMA map.

>>
>> Indeed, if the only place where we DMA map is in
>> vhost_user_mmap_region(). If it fails, the error is propagated, the mem
>> table are freed and NACK is replied to the master. IOW, the device will
>> be in an unusable state.
> 
> I agree with you, this is the place I consider right to do DMA map
> because we also do SW mapping here, any suggestions?

No suggestion, I was just explaining that at the only place where
DMA map were done, mapping errors were properly handled and propagated.

>>
>> Removing the async DMA map will simplify a lot the code, do you agree to
>> remove it or there is something I missed?
> 
> See above. Indeed, it adds a lot of code. But we can't know the driver for
> each device in vhost lib, or we can only restrict the user to bind some devices
> to DPDK vfio if async logic needed.

I would think we don't care if DMA unmap fails, we can just do the same
as what you do for DMA map, i.e. just ignore the error.

Thanks to this discussion, I have now more concerns on how it works. I
think we have a problem here in case of DMA device hotplug, that device
could miss the necessary map entries from Vhost if no VFIO devices were
attached at VHST_USER_SET_MEM_TABLE time. How would you handle that
case?

Regards,
Maxime

>>
>> Thanks,
>> Maxime
>
  
Ding, Xuan Oct. 26, 2021, 8:49 a.m. UTC | #4
Hi Maxime,

>-----Original Message-----
>From: Maxime Coquelin <maxime.coquelin@redhat.com>
>Sent: Tuesday, October 26, 2021 2:53 PM
>To: Ding, Xuan <xuan.ding@intel.com>; dev@dpdk.org;
>david.marchand@redhat.com; Xia, Chenbo <chenbo.xia@intel.com>
>Cc: Burakov, Anatoly <anatoly.burakov@intel.com>
>Subject: Re: [PATCH] vhost: fix async DMA map
>
>
>
>On 10/26/21 04:07, Ding, Xuan wrote:
>> Hi Maxime,
>>
>>> -----Original Message-----
>>> From: Maxime Coquelin <maxime.coquelin@redhat.com>
>>> Sent: Tuesday, October 26, 2021 4:47 AM
>>> To: dev@dpdk.org; david.marchand@redhat.com; Xia, Chenbo
>>> <chenbo.xia@intel.com>; Ding, Xuan <xuan.ding@intel.com>
>>> Subject: Re: [PATCH] vhost: fix async DMA map
>>>
>>> Hi Xuan,
>>>
>>> On 10/25/21 22:33, Maxime Coquelin wrote:
>>>> This patch fixes possible NULL-pointer dereferencing
>>>> reported by Coverity and also fixes NUMA reallocation
>>>> of the async DMA map.
>>>>
>>>> Fixes: 7c61fa08b716 ("vhost: enable IOMMU for async vhost")
>>>>
>>>> Coverity issue: 373655
>>>>
>>>> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>>>> ---
>>>>    lib/vhost/vhost_user.c | 45 +++++++++++++++++++-----------------------
>>>>    1 file changed, 20 insertions(+), 25 deletions(-)
>>>>
>>>
>>> I posted this patch to fix the issue reported by Coverity and also other
>>> issue on NUMA realloc that I found at the same time. But I wonder
>>> whether all this async_map_status is needed.
>>
>> Thanks for your fix! I can help to review and test the patch later.
>>
>> I add the async_map_status in v2 for compatibility. Some DMA device,
>> like DSA, may use kernel idxd driver only. If there is no device bound to
>> DPDK vfio and kernel vfio module is modprobed to ensure
>rte_vfio_is_enabled() is true,
>> we will unavoidably do DMA map/unmap and it will fail.
>>
>> Therefore, the dma_map_status here is used to filter this situation by
>preventing
>> unnecessary DMA unmap.
>
>Ok, then I think we can just remove the async DMA map.
>
>>>
>>> Indeed, if the only place where we DMA map is in
>>> vhost_user_mmap_region(). If it fails, the error is propagated, the mem
>>> table are freed and NACK is replied to the master. IOW, the device will
>>> be in an unusable state.
>>
>> I agree with you, this is the place I consider right to do DMA map
>> because we also do SW mapping here, any suggestions?
>
>No suggestion, I was just explaining that at the only place where
>DMA map were done, mapping errors were properly handled and propagated.

What about just setting async_copy to false, and allow switching to sync path.

>
>>>
>>> Removing the async DMA map will simplify a lot the code, do you agree to
>>> remove it or there is something I missed?
>>
>> See above. Indeed, it adds a lot of code. But we can't know the driver for
>> each device in vhost lib, or we can only restrict the user to bind some
>devices
>> to DPDK vfio if async logic needed.
>
>I would think we don't care if DMA unmap fails, we can just do the same
>as what you do for DMA map, i.e. just ignore the error.

Get your idea, we can do the same as DMA map, and in this way dma_map_status flag can be removed.

>
>Thanks to this discussion, I have now more concerns on how it works. I
>think we have a problem here in case of DMA device hotplug, that device
>could miss the necessary map entries from Vhost if no VFIO devices were
>attached at VHST_USER_SET_MEM_TABLE time. How would you handle that
>case?

DMA device is uncore, so I don't see the  hotplug issue here.
I will have another patch containing compatibility with sync path, and async_map_status flag will be removed.
Hope to get your insights.

Thanks,
Xuan

>
>Regards,
>Maxime
>
>>>
>>> Thanks,
>>> Maxime
>>
  
Maxime Coquelin Oct. 26, 2021, 9:49 a.m. UTC | #5
On 10/26/21 10:49, Ding, Xuan wrote:
> Hi Maxime,
> 
>> -----Original Message-----
>> From: Maxime Coquelin <maxime.coquelin@redhat.com>
>> Sent: Tuesday, October 26, 2021 2:53 PM
>> To: Ding, Xuan <xuan.ding@intel.com>; dev@dpdk.org;
>> david.marchand@redhat.com; Xia, Chenbo <chenbo.xia@intel.com>
>> Cc: Burakov, Anatoly <anatoly.burakov@intel.com>
>> Subject: Re: [PATCH] vhost: fix async DMA map
>>
>>
>>
>> On 10/26/21 04:07, Ding, Xuan wrote:
>>> Hi Maxime,
>>>
>>>> -----Original Message-----
>>>> From: Maxime Coquelin <maxime.coquelin@redhat.com>
>>>> Sent: Tuesday, October 26, 2021 4:47 AM
>>>> To: dev@dpdk.org; david.marchand@redhat.com; Xia, Chenbo
>>>> <chenbo.xia@intel.com>; Ding, Xuan <xuan.ding@intel.com>
>>>> Subject: Re: [PATCH] vhost: fix async DMA map
>>>>
>>>> Hi Xuan,
>>>>
>>>> On 10/25/21 22:33, Maxime Coquelin wrote:
>>>>> This patch fixes possible NULL-pointer dereferencing
>>>>> reported by Coverity and also fixes NUMA reallocation
>>>>> of the async DMA map.
>>>>>
>>>>> Fixes: 7c61fa08b716 ("vhost: enable IOMMU for async vhost")
>>>>>
>>>>> Coverity issue: 373655
>>>>>
>>>>> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>>>>> ---
>>>>>     lib/vhost/vhost_user.c | 45 +++++++++++++++++++-----------------------
>>>>>     1 file changed, 20 insertions(+), 25 deletions(-)
>>>>>
>>>>
>>>> I posted this patch to fix the issue reported by Coverity and also other
>>>> issue on NUMA realloc that I found at the same time. But I wonder
>>>> whether all this async_map_status is needed.
>>>
>>> Thanks for your fix! I can help to review and test the patch later.
>>>
>>> I add the async_map_status in v2 for compatibility. Some DMA device,
>>> like DSA, may use kernel idxd driver only. If there is no device bound to
>>> DPDK vfio and kernel vfio module is modprobed to ensure
>> rte_vfio_is_enabled() is true,
>>> we will unavoidably do DMA map/unmap and it will fail.
>>>
>>> Therefore, the dma_map_status here is used to filter this situation by
>> preventing
>>> unnecessary DMA unmap.
>>
>> Ok, then I think we can just remove the async DMA map.
>>
>>>>
>>>> Indeed, if the only place where we DMA map is in
>>>> vhost_user_mmap_region(). If it fails, the error is propagated, the mem
>>>> table are freed and NACK is replied to the master. IOW, the device will
>>>> be in an unusable state.
>>>
>>> I agree with you, this is the place I consider right to do DMA map
>>> because we also do SW mapping here, any suggestions?
>>
>> No suggestion, I was just explaining that at the only place where
>> DMA map were done, mapping errors were properly handled and propagated.
> 
> What about just setting async_copy to false, and allow switching to sync path.
> 
>>
>>>>
>>>> Removing the async DMA map will simplify a lot the code, do you agree to
>>>> remove it or there is something I missed?
>>>
>>> See above. Indeed, it adds a lot of code. But we can't know the driver for
>>> each device in vhost lib, or we can only restrict the user to bind some
>> devices
>>> to DPDK vfio if async logic needed.
>>
>> I would think we don't care if DMA unmap fails, we can just do the same
>> as what you do for DMA map, i.e. just ignore the error.
> 
> Get your idea, we can do the same as DMA map, and in this way dma_map_status flag can be removed.
> 
>>
>> Thanks to this discussion, I have now more concerns on how it works. I
>> think we have a problem here in case of DMA device hotplug, that device
>> could miss the necessary map entries from Vhost if no VFIO devices were
>> attached at VHST_USER_SET_MEM_TABLE time. How would you handle that
>> case?
> 
> DMA device is uncore, so I don't see the  hotplug issue here.

I'm not sure what 'uncore' is, I suppose you mean your device cannot be
physically added/removed to the system.

I was not clear enough in my question. I meant that for example, the
application is started and the Vhost port is created. Then, the DMA
device is bound to VFIO, and probed by the DPDK application. Finally,
the application register the DMA device as an async channel in Vhost.

I think it will not work as the SET_MEM_TABLE will already have
happened, so the guest memory won't be mapped in the VFIO container.

Do you have the same understanding?

> I will have another patch containing compatibility with sync path, and async_map_status flag will be removed.
> Hope to get your insights.

What do you mean by compatibility with sync path?

Thanks for taking care of the async map status removal.

Maxime
> Thanks,
> Xuan
> 
>>
>> Regards,
>> Maxime
>>
>>>>
>>>> Thanks,
>>>> Maxime
>>>
>
  
Ding, Xuan Oct. 26, 2021, 10:27 a.m. UTC | #6
Hi,

>-----Original Message-----
>From: Maxime Coquelin <maxime.coquelin@redhat.com>
>Sent: Tuesday, October 26, 2021 5:49 PM
>To: Ding, Xuan <xuan.ding@intel.com>; dev@dpdk.org;
>david.marchand@redhat.com; Xia, Chenbo <chenbo.xia@intel.com>
>Cc: Burakov, Anatoly <anatoly.burakov@intel.com>
>Subject: Re: [PATCH] vhost: fix async DMA map
>
>
>
>On 10/26/21 10:49, Ding, Xuan wrote:
>> Hi Maxime,
>>
>>> -----Original Message-----
>>> From: Maxime Coquelin <maxime.coquelin@redhat.com>
>>> Sent: Tuesday, October 26, 2021 2:53 PM
>>> To: Ding, Xuan <xuan.ding@intel.com>; dev@dpdk.org;
>>> david.marchand@redhat.com; Xia, Chenbo <chenbo.xia@intel.com>
>>> Cc: Burakov, Anatoly <anatoly.burakov@intel.com>
>>> Subject: Re: [PATCH] vhost: fix async DMA map
>>>
>>>
>>>
>>> On 10/26/21 04:07, Ding, Xuan wrote:
>>>> Hi Maxime,
>>>>
>>>>> -----Original Message-----
>>>>> From: Maxime Coquelin <maxime.coquelin@redhat.com>
>>>>> Sent: Tuesday, October 26, 2021 4:47 AM
>>>>> To: dev@dpdk.org; david.marchand@redhat.com; Xia, Chenbo
>>>>> <chenbo.xia@intel.com>; Ding, Xuan <xuan.ding@intel.com>
>>>>> Subject: Re: [PATCH] vhost: fix async DMA map
>>>>>
>>>>> Hi Xuan,
>>>>>
>>>>> On 10/25/21 22:33, Maxime Coquelin wrote:
>>>>>> This patch fixes possible NULL-pointer dereferencing
>>>>>> reported by Coverity and also fixes NUMA reallocation
>>>>>> of the async DMA map.
>>>>>>
>>>>>> Fixes: 7c61fa08b716 ("vhost: enable IOMMU for async vhost")
>>>>>>
>>>>>> Coverity issue: 373655
>>>>>>
>>>>>> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>>>>>> ---
>>>>>>     lib/vhost/vhost_user.c | 45 +++++++++++++++++++-----------------------
>>>>>>     1 file changed, 20 insertions(+), 25 deletions(-)
>>>>>>
>>>>>
>>>>> I posted this patch to fix the issue reported by Coverity and also other
>>>>> issue on NUMA realloc that I found at the same time. But I wonder
>>>>> whether all this async_map_status is needed.
>>>>
>>>> Thanks for your fix! I can help to review and test the patch later.
>>>>
>>>> I add the async_map_status in v2 for compatibility. Some DMA device,
>>>> like DSA, may use kernel idxd driver only. If there is no device bound to
>>>> DPDK vfio and kernel vfio module is modprobed to ensure
>>> rte_vfio_is_enabled() is true,
>>>> we will unavoidably do DMA map/unmap and it will fail.
>>>>
>>>> Therefore, the dma_map_status here is used to filter this situation by
>>> preventing
>>>> unnecessary DMA unmap.
>>>
>>> Ok, then I think we can just remove the async DMA map.
>>>
>>>>>
>>>>> Indeed, if the only place where we DMA map is in
>>>>> vhost_user_mmap_region(). If it fails, the error is propagated, the mem
>>>>> table are freed and NACK is replied to the master. IOW, the device will
>>>>> be in an unusable state.
>>>>
>>>> I agree with you, this is the place I consider right to do DMA map
>>>> because we also do SW mapping here, any suggestions?
>>>
>>> No suggestion, I was just explaining that at the only place where
>>> DMA map were done, mapping errors were properly handled and
>propagated.
>>
>> What about just setting async_copy to false, and allow switching to sync
>path.
>>
>>>
>>>>>
>>>>> Removing the async DMA map will simplify a lot the code, do you agree
>to
>>>>> remove it or there is something I missed?
>>>>
>>>> See above. Indeed, it adds a lot of code. But we can't know the driver for
>>>> each device in vhost lib, or we can only restrict the user to bind some
>>> devices
>>>> to DPDK vfio if async logic needed.
>>>
>>> I would think we don't care if DMA unmap fails, we can just do the same
>>> as what you do for DMA map, i.e. just ignore the error.
>>
>> Get your idea, we can do the same as DMA map, and in this way
>dma_map_status flag can be removed.
>>
>>>
>>> Thanks to this discussion, I have now more concerns on how it works. I
>>> think we have a problem here in case of DMA device hotplug, that device
>>> could miss the necessary map entries from Vhost if no VFIO devices were
>>> attached at VHST_USER_SET_MEM_TABLE time. How would you handle
>that
>>> case?
>>
>> DMA device is uncore, so I don't see the  hotplug issue here.
>
>I'm not sure what 'uncore' is, I suppose you mean your device cannot be
>physically added/removed to the system.

Yes, we are at the same understanding.

>
>I was not clear enough in my question. I meant that for example, the
>application is started and the Vhost port is created. Then, the DMA
>device is bound to VFIO, and probed by the DPDK application. Finally,
>the application register the DMA device as an async channel in Vhost.
>
>I think it will not work as the SET_MEM_TABLE will already have
>happened, so the guest memory won't be mapped in the VFIO container.

Assuming the DMA device supports hotplug, this situation was not actually took into consideration, and maybe the handling should be added in app, with an event callback.

>
>Do you have the same understanding?
>
>> I will have another patch containing compatibility with sync path, and
>async_map_status flag will be removed.
>> Hope to get your insights.
>
>What do you mean by compatibility with sync path?

I mean whatever DMA map succeeds or fails, we return 0 so as not to prevent
SET_EM_TABLE.

>
>Thanks for taking care of the async map status removal.

Found a tricky point for removing dma_map_status flag, currently DMA unmap API will directly access the vfio_config,
And if without dma_map_status, we cannot get rte_errno at this moment.

BTW, I will consider this Coverity fix in new patch. :)

Thanks,
Xuan

>
>Maxime
>> Thanks,
>> Xuan
>>
>>>
>>> Regards,
>>> Maxime
>>>
>>>>>
>>>>> Thanks,
>>>>> Maxime
>>>>
>>
  
Burakov, Anatoly Oct. 26, 2021, 10:58 a.m. UTC | #7
On 26-Oct-21 11:27 AM, Ding, Xuan wrote:
> Hi,
> 
>> -----Original Message-----
>> From: Maxime Coquelin <maxime.coquelin@redhat.com>
>> Sent: Tuesday, October 26, 2021 5:49 PM
>> To: Ding, Xuan <xuan.ding@intel.com>; dev@dpdk.org;
>> david.marchand@redhat.com; Xia, Chenbo <chenbo.xia@intel.com>
>> Cc: Burakov, Anatoly <anatoly.burakov@intel.com>
>> Subject: Re: [PATCH] vhost: fix async DMA map
>>
>>
>>
>> On 10/26/21 10:49, Ding, Xuan wrote:
>>> Hi Maxime,
>>>
>>>> -----Original Message-----
>>>> From: Maxime Coquelin <maxime.coquelin@redhat.com>
>>>> Sent: Tuesday, October 26, 2021 2:53 PM
>>>> To: Ding, Xuan <xuan.ding@intel.com>; dev@dpdk.org;
>>>> david.marchand@redhat.com; Xia, Chenbo <chenbo.xia@intel.com>
>>>> Cc: Burakov, Anatoly <anatoly.burakov@intel.com>
>>>> Subject: Re: [PATCH] vhost: fix async DMA map
>>>>
>>>>
>>>>
>>>> On 10/26/21 04:07, Ding, Xuan wrote:
>>>>> Hi Maxime,
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Maxime Coquelin <maxime.coquelin@redhat.com>
>>>>>> Sent: Tuesday, October 26, 2021 4:47 AM
>>>>>> To: dev@dpdk.org; david.marchand@redhat.com; Xia, Chenbo
>>>>>> <chenbo.xia@intel.com>; Ding, Xuan <xuan.ding@intel.com>
>>>>>> Subject: Re: [PATCH] vhost: fix async DMA map
>>>>>>
>>>>>> Hi Xuan,
>>>>>>
>>>>>> On 10/25/21 22:33, Maxime Coquelin wrote:
>>>>>>> This patch fixes possible NULL-pointer dereferencing
>>>>>>> reported by Coverity and also fixes NUMA reallocation
>>>>>>> of the async DMA map.
>>>>>>>
>>>>>>> Fixes: 7c61fa08b716 ("vhost: enable IOMMU for async vhost")
>>>>>>>
>>>>>>> Coverity issue: 373655
>>>>>>>
>>>>>>> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>>>>>>> ---
>>>>>>>      lib/vhost/vhost_user.c | 45 +++++++++++++++++++-----------------------
>>>>>>>      1 file changed, 20 insertions(+), 25 deletions(-)
>>>>>>>
>>>>>>
>>>>>> I posted this patch to fix the issue reported by Coverity and also other
>>>>>> issue on NUMA realloc that I found at the same time. But I wonder
>>>>>> whether all this async_map_status is needed.
>>>>>
>>>>> Thanks for your fix! I can help to review and test the patch later.
>>>>>
>>>>> I add the async_map_status in v2 for compatibility. Some DMA device,
>>>>> like DSA, may use kernel idxd driver only. If there is no device bound to
>>>>> DPDK vfio and kernel vfio module is modprobed to ensure
>>>> rte_vfio_is_enabled() is true,
>>>>> we will unavoidably do DMA map/unmap and it will fail.
>>>>>
>>>>> Therefore, the dma_map_status here is used to filter this situation by
>>>> preventing
>>>>> unnecessary DMA unmap.
>>>>
>>>> Ok, then I think we can just remove the async DMA map.
>>>>
>>>>>>
>>>>>> Indeed, if the only place where we DMA map is in
>>>>>> vhost_user_mmap_region(). If it fails, the error is propagated, the mem
>>>>>> table are freed and NACK is replied to the master. IOW, the device will
>>>>>> be in an unusable state.
>>>>>
>>>>> I agree with you, this is the place I consider right to do DMA map
>>>>> because we also do SW mapping here, any suggestions?
>>>>
>>>> No suggestion, I was just explaining that at the only place where
>>>> DMA map were done, mapping errors were properly handled and
>> propagated.
>>>
>>> What about just setting async_copy to false, and allow switching to sync
>> path.
>>>
>>>>
>>>>>>
>>>>>> Removing the async DMA map will simplify a lot the code, do you agree
>> to
>>>>>> remove it or there is something I missed?
>>>>>
>>>>> See above. Indeed, it adds a lot of code. But we can't know the driver for
>>>>> each device in vhost lib, or we can only restrict the user to bind some
>>>> devices
>>>>> to DPDK vfio if async logic needed.
>>>>
>>>> I would think we don't care if DMA unmap fails, we can just do the same
>>>> as what you do for DMA map, i.e. just ignore the error.
>>>
>>> Get your idea, we can do the same as DMA map, and in this way
>> dma_map_status flag can be removed.
>>>
>>>>
>>>> Thanks to this discussion, I have now more concerns on how it works. I
>>>> think we have a problem here in case of DMA device hotplug, that device
>>>> could miss the necessary map entries from Vhost if no VFIO devices were
>>>> attached at VHST_USER_SET_MEM_TABLE time. How would you handle
>> that
>>>> case?
>>>
>>> DMA device is uncore, so I don't see the  hotplug issue here.
>>
>> I'm not sure what 'uncore' is, I suppose you mean your device cannot be
>> physically added/removed to the system.
> 
> Yes, we are at the same understanding.
> 
>>
>> I was not clear enough in my question. I meant that for example, the
>> application is started and the Vhost port is created. Then, the DMA
>> device is bound to VFIO, and probed by the DPDK application. Finally,
>> the application register the DMA device as an async channel in Vhost.
>>
>> I think it will not work as the SET_MEM_TABLE will already have
>> happened, so the guest memory won't be mapped in the VFIO container.
> 
> Assuming the DMA device supports hotplug, this situation was not actually took into consideration, and maybe the handling should be added in app, with an event callback.
> 
>>
>> Do you have the same understanding?
>>
>>> I will have another patch containing compatibility with sync path, and
>> async_map_status flag will be removed.
>>> Hope to get your insights.
>>
>> What do you mean by compatibility with sync path?
> 
> I mean whatever DMA map succeeds or fails, we return 0 so as not to prevent
> SET_EM_TABLE.
> 
>>
>> Thanks for taking care of the async map status removal.
> 
> Found a tricky point for removing dma_map_status flag, currently DMA unmap API will directly access the vfio_config,
> And if without dma_map_status, we cannot get rte_errno at this moment.

This is a bug in VFIO API. We always assume VFIO type is valid, but it 
is only valid whenever 1) VFIO support is detected at startup, and 2) we 
assign at least one device to VFIO. If VFIO was detected but no devices 
were assigned, the VFIO type will not be assigned, and the pointer will 
be NULL. In this case, we should not attempt any unmaps, because there 
are no active devices.

I'll submit a patch to fix this, thanks for catching this issue!

> 
> BTW, I will consider this Coverity fix in new patch. :)
> 
> Thanks,
> Xuan
> 
>>
>> Maxime
>>> Thanks,
>>> Xuan
>>>
>>>>
>>>> Regards,
>>>> Maxime
>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Maxime
>>>>>
>>>
>
  
Burakov, Anatoly Oct. 26, 2021, 3:24 p.m. UTC | #8
On 26-Oct-21 11:58 AM, Burakov, Anatoly wrote:
> On 26-Oct-21 11:27 AM, Ding, Xuan wrote:
>> Hi,
>>
>>> -----Original Message-----
>>> From: Maxime Coquelin <maxime.coquelin@redhat.com>
>>> Sent: Tuesday, October 26, 2021 5:49 PM
>>> To: Ding, Xuan <xuan.ding@intel.com>; dev@dpdk.org;
>>> david.marchand@redhat.com; Xia, Chenbo <chenbo.xia@intel.com>
>>> Cc: Burakov, Anatoly <anatoly.burakov@intel.com>
>>> Subject: Re: [PATCH] vhost: fix async DMA map
>>>
>>>
>>>
>>> On 10/26/21 10:49, Ding, Xuan wrote:
>>>> Hi Maxime,
>>>>
>>>>> -----Original Message-----
>>>>> From: Maxime Coquelin <maxime.coquelin@redhat.com>
>>>>> Sent: Tuesday, October 26, 2021 2:53 PM
>>>>> To: Ding, Xuan <xuan.ding@intel.com>; dev@dpdk.org;
>>>>> david.marchand@redhat.com; Xia, Chenbo <chenbo.xia@intel.com>
>>>>> Cc: Burakov, Anatoly <anatoly.burakov@intel.com>
>>>>> Subject: Re: [PATCH] vhost: fix async DMA map
>>>>>
>>>>>
>>>>>
>>>>> On 10/26/21 04:07, Ding, Xuan wrote:
>>>>>> Hi Maxime,
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Maxime Coquelin <maxime.coquelin@redhat.com>
>>>>>>> Sent: Tuesday, October 26, 2021 4:47 AM
>>>>>>> To: dev@dpdk.org; david.marchand@redhat.com; Xia, Chenbo
>>>>>>> <chenbo.xia@intel.com>; Ding, Xuan <xuan.ding@intel.com>
>>>>>>> Subject: Re: [PATCH] vhost: fix async DMA map
>>>>>>>
>>>>>>> Hi Xuan,
>>>>>>>
>>>>>>> On 10/25/21 22:33, Maxime Coquelin wrote:
>>>>>>>> This patch fixes possible NULL-pointer dereferencing
>>>>>>>> reported by Coverity and also fixes NUMA reallocation
>>>>>>>> of the async DMA map.
>>>>>>>>
>>>>>>>> Fixes: 7c61fa08b716 ("vhost: enable IOMMU for async vhost")
>>>>>>>>
>>>>>>>> Coverity issue: 373655
>>>>>>>>
>>>>>>>> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>>>>>>>> ---
>>>>>>>>      lib/vhost/vhost_user.c | 45 
>>>>>>>> +++++++++++++++++++-----------------------
>>>>>>>>      1 file changed, 20 insertions(+), 25 deletions(-)
>>>>>>>>
>>>>>>>
>>>>>>> I posted this patch to fix the issue reported by Coverity and 
>>>>>>> also other
>>>>>>> issue on NUMA realloc that I found at the same time. But I wonder
>>>>>>> whether all this async_map_status is needed.
>>>>>>
>>>>>> Thanks for your fix! I can help to review and test the patch later.
>>>>>>
>>>>>> I add the async_map_status in v2 for compatibility. Some DMA device,
>>>>>> like DSA, may use kernel idxd driver only. If there is no device 
>>>>>> bound to
>>>>>> DPDK vfio and kernel vfio module is modprobed to ensure
>>>>> rte_vfio_is_enabled() is true,
>>>>>> we will unavoidably do DMA map/unmap and it will fail.
>>>>>>
>>>>>> Therefore, the dma_map_status here is used to filter this 
>>>>>> situation by
>>>>> preventing
>>>>>> unnecessary DMA unmap.
>>>>>
>>>>> Ok, then I think we can just remove the async DMA map.
>>>>>
>>>>>>>
>>>>>>> Indeed, if the only place where we DMA map is in
>>>>>>> vhost_user_mmap_region(). If it fails, the error is propagated, 
>>>>>>> the mem
>>>>>>> table are freed and NACK is replied to the master. IOW, the 
>>>>>>> device will
>>>>>>> be in an unusable state.
>>>>>>
>>>>>> I agree with you, this is the place I consider right to do DMA map
>>>>>> because we also do SW mapping here, any suggestions?
>>>>>
>>>>> No suggestion, I was just explaining that at the only place where
>>>>> DMA map were done, mapping errors were properly handled and
>>> propagated.
>>>>
>>>> What about just setting async_copy to false, and allow switching to 
>>>> sync
>>> path.
>>>>
>>>>>
>>>>>>>
>>>>>>> Removing the async DMA map will simplify a lot the code, do you 
>>>>>>> agree
>>> to
>>>>>>> remove it or there is something I missed?
>>>>>>
>>>>>> See above. Indeed, it adds a lot of code. But we can't know the 
>>>>>> driver for
>>>>>> each device in vhost lib, or we can only restrict the user to bind 
>>>>>> some
>>>>> devices
>>>>>> to DPDK vfio if async logic needed.
>>>>>
>>>>> I would think we don't care if DMA unmap fails, we can just do the 
>>>>> same
>>>>> as what you do for DMA map, i.e. just ignore the error.
>>>>
>>>> Get your idea, we can do the same as DMA map, and in this way
>>> dma_map_status flag can be removed.
>>>>
>>>>>
>>>>> Thanks to this discussion, I have now more concerns on how it works. I
>>>>> think we have a problem here in case of DMA device hotplug, that 
>>>>> device
>>>>> could miss the necessary map entries from Vhost if no VFIO devices 
>>>>> were
>>>>> attached at VHST_USER_SET_MEM_TABLE time. How would you handle
>>> that
>>>>> case?
>>>>
>>>> DMA device is uncore, so I don't see the  hotplug issue here.
>>>
>>> I'm not sure what 'uncore' is, I suppose you mean your device cannot be
>>> physically added/removed to the system.
>>
>> Yes, we are at the same understanding.
>>
>>>
>>> I was not clear enough in my question. I meant that for example, the
>>> application is started and the Vhost port is created. Then, the DMA
>>> device is bound to VFIO, and probed by the DPDK application. Finally,
>>> the application register the DMA device as an async channel in Vhost.
>>>
>>> I think it will not work as the SET_MEM_TABLE will already have
>>> happened, so the guest memory won't be mapped in the VFIO container.
>>
>> Assuming the DMA device supports hotplug, this situation was not 
>> actually took into consideration, and maybe the handling should be 
>> added in app, with an event callback.
>>
>>>
>>> Do you have the same understanding?
>>>
>>>> I will have another patch containing compatibility with sync path, and
>>> async_map_status flag will be removed.
>>>> Hope to get your insights.
>>>
>>> What do you mean by compatibility with sync path?
>>
>> I mean whatever DMA map succeeds or fails, we return 0 so as not to 
>> prevent
>> SET_EM_TABLE.
>>
>>>
>>> Thanks for taking care of the async map status removal.
>>
>> Found a tricky point for removing dma_map_status flag, currently DMA 
>> unmap API will directly access the vfio_config,
>> And if without dma_map_status, we cannot get rte_errno at this moment.
> 
> This is a bug in VFIO API. We always assume VFIO type is valid, but it 
> is only valid whenever 1) VFIO support is detected at startup, and 2) we 
> assign at least one device to VFIO. If VFIO was detected but no devices 
> were assigned, the VFIO type will not be assigned, and the pointer will 
> be NULL. In this case, we should not attempt any unmaps, because there 
> are no active devices.
> 
> I'll submit a patch to fix this, thanks for catching this issue!

Hi Xuan/Maxime,

It would be great if you could test and confirm that this patch fixes 
the segfault:

http://patches.dpdk.org/project/dpdk/patch/8079312ba39435a0ac92e084cc1a3fe291008a47.1635254797.git.anatoly.burakov@intel.com/
  

Patch

diff --git a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c
index 720d1c1c9d..b8fbb8b415 100644
--- a/lib/vhost/vhost_user.c
+++ b/lib/vhost/vhost_user.c
@@ -149,6 +149,9 @@  async_dma_map(struct rte_vhost_mem_region *region, bool *dma_map_success, bool d
 	uint64_t host_iova;
 	int ret = 0;
 
+	if (!rte_vfio_is_enabled("vfio"))
+		return 0;
+
 	host_iova = rte_mem_virt2iova((void *)(uintptr_t)region->host_user_addr);
 	if (do_map) {
 		/* Add mapped region into the default container of DPDK. */
@@ -176,9 +179,7 @@  async_dma_map(struct rte_vhost_mem_region *region, bool *dma_map_success, bool d
 
 			VHOST_LOG_CONFIG(ERR, "DMA engine map failed\n");
 			return ret;
-
 		}
-
 	} else {
 		/* No need to do vfio unmap if the map failed. */
 		if (!*dma_map_success)
@@ -228,11 +229,8 @@  vhost_backend_cleanup(struct virtio_net *dev)
 		free_mem_region(dev);
 		rte_free(dev->mem);
 		dev->mem = NULL;
-
-		if (dev->async_map_status) {
-			rte_free(dev->async_map_status);
-			dev->async_map_status = NULL;
-		}
+		rte_free(dev->async_map_status);
+		dev->async_map_status = NULL;
 	}
 
 	rte_free(dev->guest_pages);
@@ -689,16 +687,17 @@  numa_realloc(struct virtio_net *dev, int index)
 	dev->mem = mem;
 
 	if (dev->async_copy && rte_vfio_is_enabled("vfio")) {
-		if (dev->async_map_status == NULL) {
-			dev->async_map_status = rte_zmalloc_socket("async-dma-map-status",
-					sizeof(bool) * dev->mem->nregions, 0, node);
-			if (!dev->async_map_status) {
-				VHOST_LOG_CONFIG(ERR,
-					"(%d) failed to realloc dma mapping status on node\n",
-					dev->vid);
-				return dev;
-			}
+		bool *ams;
+
+		ams = rte_realloc_socket(dev->async_map_status, sizeof(*ams) * dev->mem->nregions,
+				0, node);
+		if (!ams) {
+			VHOST_LOG_CONFIG(ERR, "Failed to realloc dma mapping status on node %d\n",
+					node);
+			return dev;
 		}
+
+		dev->async_map_status = ams;
 	}
 
 	gp = rte_realloc_socket(dev->guest_pages, dev->max_guest_pages * sizeof(*gp),
@@ -1294,16 +1293,14 @@  vhost_user_mmap_region(struct virtio_net *dev,
 
 	if (dev->async_copy) {
 		if (add_guest_pages(dev, region, alignment) < 0) {
-			VHOST_LOG_CONFIG(ERR,
-					"adding guest pages to region failed.\n");
+			VHOST_LOG_CONFIG(ERR, "adding guest pages to region failed.\n");
 			return -1;
 		}
 
-		if (rte_vfio_is_enabled("vfio")) {
+		if (dev->async_map_status) {
 			ret = async_dma_map(region, &dev->async_map_status[region_index], true);
 			if (ret) {
-				VHOST_LOG_CONFIG(ERR, "Configure IOMMU for DMA "
-							"engine failed\n");
+				VHOST_LOG_CONFIG(ERR, "Configure IOMMU for DMA engine failed\n");
 				return -1;
 			}
 		}
@@ -1501,10 +1498,8 @@  vhost_user_set_mem_table(struct virtio_net **pdev, struct VhostUserMsg *msg,
 	free_mem_region(dev);
 	rte_free(dev->mem);
 	dev->mem = NULL;
-	if (dev->async_map_status) {
-		rte_free(dev->async_map_status);
-		dev->async_map_status = NULL;
-	}
+	rte_free(dev->async_map_status);
+	dev->async_map_status = NULL;
 free_guest_pages:
 	rte_free(dev->guest_pages);
 	dev->guest_pages = NULL;