[1/2] vfio: fix check for vfio_group_fd
Checks
Commit Message
From: Yunjian Wang <wangyunjian@huawei.com>
The issue is that a file descriptor at 0 is a valid one. Currently
the file not found, the return value will be set to 0. As a result,
it is impossible to distinguish between a correct descriptor and a
failed return value. Fix it to return -ENOENT instead of 0.
Fixes: b758423bc4fe ("vfio: fix race condition with sysfs")
Fixes: ff0b67d1c868 ("vfio: DMA mappinge")
Cc: stable@dpdk.org
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
---
lib/librte_eal/linux/eal_vfio.c | 23 +++++++++++++----------
lib/librte_eal/linux/eal_vfio_mp_sync.c | 4 ++--
2 files changed, 15 insertions(+), 12 deletions(-)
Comments
On Tue, May 19, 2020 at 5:42 AM wangyunjian <wangyunjian@huawei.com> wrote:
>
> From: Yunjian Wang <wangyunjian@huawei.com>
>
> The issue is that a file descriptor at 0 is a valid one. Currently
> the file not found, the return value will be set to 0. As a result,
> it is impossible to distinguish between a correct descriptor and a
> failed return value. Fix it to return -ENOENT instead of 0.
>
> Fixes: b758423bc4fe ("vfio: fix race condition with sysfs")
> Fixes: ff0b67d1c868 ("vfio: DMA mappinge")
> Cc: stable@dpdk.org
This patch reverts the (marked for stable) fix b758423bc4fe ("vfio:
fix race condition with sysfs") and comes with a different fix.
- Kevin, Luca, I would put this b758423bc4fe backport on hold.
- Anatoly, I don't want to put 20.05 at risk.
My simple question for 20.05 is, should we revert b758423bc4fe?
- This patchset will go to 20.08 as I don't feel confident in taking it now.
On 19-May-20 8:43 AM, David Marchand wrote:
> On Tue, May 19, 2020 at 5:42 AM wangyunjian <wangyunjian@huawei.com> wrote:
>>
>> From: Yunjian Wang <wangyunjian@huawei.com>
>>
>> The issue is that a file descriptor at 0 is a valid one. Currently
>> the file not found, the return value will be set to 0. As a result,
>> it is impossible to distinguish between a correct descriptor and a
>> failed return value. Fix it to return -ENOENT instead of 0.
>>
>> Fixes: b758423bc4fe ("vfio: fix race condition with sysfs")
>> Fixes: ff0b67d1c868 ("vfio: DMA mappinge")
>> Cc: stable@dpdk.org
>
> This patch reverts the (marked for stable) fix b758423bc4fe ("vfio:
> fix race condition with sysfs") and comes with a different fix.
>
> - Kevin, Luca, I would put this b758423bc4fe backport on hold.
>
> - Anatoly, I don't want to put 20.05 at risk.
> My simple question for 20.05 is, should we revert b758423bc4fe?
No, let's not revert anything.
As far as i can tell, this patch is a more complete fix, but it
essentially does the same thing, just in a different (and better) way.
Still, i haven't reviewed it in detail.
>
> - This patchset will go to 20.08 as I don't feel confident in taking it now.
>
Yes, let's leave it for 20.08.
On Thu, May 21, 2020 at 2:54 PM Burakov, Anatoly
<anatoly.burakov@intel.com> wrote:
>
> On 19-May-20 8:43 AM, David Marchand wrote:
> > On Tue, May 19, 2020 at 5:42 AM wangyunjian <wangyunjian@huawei.com> wrote:
> >>
> >> From: Yunjian Wang <wangyunjian@huawei.com>
> >>
> >> The issue is that a file descriptor at 0 is a valid one. Currently
> >> the file not found, the return value will be set to 0. As a result,
> >> it is impossible to distinguish between a correct descriptor and a
> >> failed return value. Fix it to return -ENOENT instead of 0.
> >>
> >> Fixes: b758423bc4fe ("vfio: fix race condition with sysfs")
> >> Fixes: ff0b67d1c868 ("vfio: DMA mappinge")
> >> Cc: stable@dpdk.org
> >
> > This patch reverts the (marked for stable) fix b758423bc4fe ("vfio:
> > fix race condition with sysfs") and comes with a different fix.
> >
> > - Kevin, Luca, I would put this b758423bc4fe backport on hold.
> >
> > - Anatoly, I don't want to put 20.05 at risk.
> > My simple question for 20.05 is, should we revert b758423bc4fe?
>
> No, let's not revert anything.
>
> As far as i can tell, this patch is a more complete fix, but it
> essentially does the same thing, just in a different (and better) way.
> Still, i haven't reviewed it in detail.
>
> >
> > - This patchset will go to 20.08 as I don't feel confident in taking it now.
> >
>
> Yes, let's leave it for 20.08.
Thanks for the analysis Anatoly.
On 21/05/2020 13:53, Burakov, Anatoly wrote:
> On 19-May-20 8:43 AM, David Marchand wrote:
>> On Tue, May 19, 2020 at 5:42 AM wangyunjian <wangyunjian@huawei.com> wrote:
>>>
>>> From: Yunjian Wang <wangyunjian@huawei.com>
>>>
>>> The issue is that a file descriptor at 0 is a valid one. Currently
>>> the file not found, the return value will be set to 0. As a result,
>>> it is impossible to distinguish between a correct descriptor and a
>>> failed return value. Fix it to return -ENOENT instead of 0.
>>>
>>> Fixes: b758423bc4fe ("vfio: fix race condition with sysfs")
>>> Fixes: ff0b67d1c868 ("vfio: DMA mappinge")
>>> Cc: stable@dpdk.org
>>
>> This patch reverts the (marked for stable) fix b758423bc4fe ("vfio:
>> fix race condition with sysfs") and comes with a different fix.
>>
>> - Kevin, Luca, I would put this b758423bc4fe backport on hold.
>>
>> - Anatoly, I don't want to put 20.05 at risk.
>> My simple question for 20.05 is, should we revert b758423bc4fe?
>
> No, let's not revert anything.
>
> As far as i can tell, this patch is a more complete fix, but it
> essentially does the same thing, just in a different (and better) way.
> Still, i haven't reviewed it in detail.
>
b758423bc4fe was not reverted in master - should that be backported now?
or should we wait until this new fix is ready for backport too?
>>
>> - This patchset will go to 20.08 as I don't feel confident in taking it now.
>>
>
> Yes, let's leave it for 20.08.
>
Ping for review
Thanks,
Yunjian
> -----Original Message-----
> From: Kevin Traynor [mailto:ktraynor@redhat.com]
> Sent: Wednesday, May 27, 2020 6:45 PM
> To: Burakov, Anatoly <anatoly.burakov@intel.com>; David Marchand
> <david.marchand@redhat.com>; wangyunjian <wangyunjian@huawei.com>;
> Luca Boccassi <bluca@debian.org>
> Cc: dev <dev@dpdk.org>; Hemant Agrawal <hemant.agrawal@nxp.com>;
> Sachin Saxena <sachin.saxena@nxp.com>; Lilijun (Jerry)
> <jerry.lilijun@huawei.com>; xudingke <xudingke@huawei.com>; dpdk stable
> <stable@dpdk.org>; Luca Boccassi <bluca@debian.org>
> Subject: Re: [dpdk-stable] [dpdk-dev] [PATCH 1/2] vfio: fix check for
> vfio_group_fd
>
> On 21/05/2020 13:53, Burakov, Anatoly wrote:
> > On 19-May-20 8:43 AM, David Marchand wrote:
> >> On Tue, May 19, 2020 at 5:42 AM wangyunjian <wangyunjian@huawei.com>
> wrote:
> >>>
> >>> From: Yunjian Wang <wangyunjian@huawei.com>
> >>>
> >>> The issue is that a file descriptor at 0 is a valid one. Currently
> >>> the file not found, the return value will be set to 0. As a result,
> >>> it is impossible to distinguish between a correct descriptor and a
> >>> failed return value. Fix it to return -ENOENT instead of 0.
> >>>
> >>> Fixes: b758423bc4fe ("vfio: fix race condition with sysfs")
> >>> Fixes: ff0b67d1c868 ("vfio: DMA mappinge")
> >>> Cc: stable@dpdk.org
> >>
> >> This patch reverts the (marked for stable) fix b758423bc4fe ("vfio:
> >> fix race condition with sysfs") and comes with a different fix.
> >>
> >> - Kevin, Luca, I would put this b758423bc4fe backport on hold.
> >>
> >> - Anatoly, I don't want to put 20.05 at risk.
> >> My simple question for 20.05 is, should we revert b758423bc4fe?
> >
> > No, let's not revert anything.
> >
> > As far as i can tell, this patch is a more complete fix, but it
> > essentially does the same thing, just in a different (and better) way.
> > Still, i haven't reviewed it in detail.
> >
>
> b758423bc4fe was not reverted in master - should that be backported now?
> or should we wait until this new fix is ready for backport too?
>
> >>
> >> - This patchset will go to 20.08 as I don't feel confident in taking it now.
> >>
> >
> > Yes, let's leave it for 20.08.
> >
On 19-May-20 4:42 AM, wangyunjian wrote:
> From: Yunjian Wang <wangyunjian@huawei.com>
>
> The issue is that a file descriptor at 0 is a valid one. Currently
> the file not found, the return value will be set to 0. As a result,
> it is impossible to distinguish between a correct descriptor and a
> failed return value. Fix it to return -ENOENT instead of 0.
>
> Fixes: b758423bc4fe ("vfio: fix race condition with sysfs")
> Fixes: ff0b67d1c868 ("vfio: DMA mappinge")
> Cc: stable@dpdk.org
>
> Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
> ---
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
@@ -293,7 +293,7 @@ vfio_open_group_fd(int iommu_group_num)
strerror(errno));
return -1;
}
- return 0;
+ return -ENOENT;
}
/* noiommu group found */
}
@@ -318,12 +318,12 @@ vfio_open_group_fd(int iommu_group_num)
vfio_group_fd = mp_rep->fds[0];
} else if (p->result == SOCKET_NO_FD) {
RTE_LOG(ERR, EAL, " bad VFIO group fd\n");
- vfio_group_fd = 0;
+ vfio_group_fd = -ENOENT;
}
}
free(mp_reply.msgs);
- if (vfio_group_fd < 0)
+ if (vfio_group_fd < 0 && vfio_group_fd != -ENOENT)
RTE_LOG(ERR, EAL, " cannot request group fd\n");
return vfio_group_fd;
}
@@ -379,9 +379,9 @@ vfio_get_group_fd(struct vfio_config *vfio_cfg,
}
vfio_group_fd = vfio_open_group_fd(iommu_group_num);
- if (vfio_group_fd <= 0) {
+ if (vfio_group_fd < 0) {
RTE_LOG(ERR, EAL, "Failed to open group %d\n", iommu_group_num);
- return -1;
+ return vfio_group_fd;
}
cur_grp->group_num = iommu_group_num;
@@ -728,11 +728,14 @@ rte_vfio_setup_device(const char *sysfs_base, const char *dev_addr,
/* get the actual group fd */
vfio_group_fd = rte_vfio_get_group_fd(iommu_group_num);
- if (vfio_group_fd < 0)
+ if (vfio_group_fd < 0 && vfio_group_fd != -ENOENT)
return -1;
- /* if group_fd == 0, that means the device isn't managed by VFIO */
- if (vfio_group_fd == 0) {
+ /*
+ * if vfio_group_fd == -ENOENT, that means the device
+ * isn't managed by VFIO
+ */
+ if (vfio_group_fd == -ENOENT) {
RTE_LOG(WARNING, EAL, " %s not managed by VFIO driver, skipping\n",
dev_addr);
return 1;
@@ -955,10 +958,10 @@ rte_vfio_release_device(const char *sysfs_base, const char *dev_addr,
/* get the actual group fd */
vfio_group_fd = rte_vfio_get_group_fd(iommu_group_num);
- if (vfio_group_fd <= 0) {
+ if (vfio_group_fd < 0) {
RTE_LOG(INFO, EAL, "rte_vfio_get_group_fd failed for %s\n",
dev_addr);
- ret = -1;
+ ret = vfio_group_fd;
goto out;
}
@@ -44,9 +44,9 @@ vfio_mp_primary(const struct rte_mp_msg *msg, const void *peer)
r->req = SOCKET_REQ_GROUP;
r->group_num = m->group_num;
fd = rte_vfio_get_group_fd(m->group_num);
- if (fd < 0)
+ if (fd < 0 && fd != -ENOENT)
r->result = SOCKET_ERR;
- else if (fd == 0)
+ else if (fd == -ENOENT)
/* if VFIO group exists but isn't bound to VFIO driver */
r->result = SOCKET_NO_FD;
else {