[v2] eal: fix positive error codes from probe/remove
Commit Message
According to the API, 'rte_dev_probe()' and 'rte_dev_remove()' must
return 0 or a negative error code. Bus code returns positive values
if the device wasn't recognized by any driver, so the result of
'bus->plug/unplug()' must be converted. 'local_dev_probe()' and
'local_dev_remove()' also have their own internal API, so the
conversion should be done there.
A positive value on remove means that the device was not found by
the driver.
A positive value on probe means that there are no suitable
buses/drivers, i.e. the device is not supported.
Callers of these APIs are fixed to set a good example by respecting
the DPDK API. This will also help to catch such issues in the future.
CC: stable@dpdk.org
Fixes: a3ee360f4440 ("eal: add hotplug add/remove device")
Fixes: 244d5130719c ("eal: enable hotplug on multi-process")
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
---
Version 2:
* Fixed API callers.
* Check for probe moved from 'rte_dev_probe' to 'local_dev_probe'.
app/test-pmd/testpmd.c | 4 ++--
drivers/net/failsafe/failsafe.c | 2 +-
drivers/net/failsafe/failsafe_eal.c | 4 ++--
drivers/net/failsafe/failsafe_ether.c | 2 +-
drivers/net/vdev_netvsc/vdev_netvsc.c | 2 +-
lib/librte_eal/common/eal_common_dev.c | 5 ++++-
6 files changed, 11 insertions(+), 8 deletions(-)
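For illustration, the conversion described in the commit message can be sketched in isolation. This is a minimal sketch, not the EAL code itself: the helper names normalize_probe_ret() and normalize_remove_ret() are hypothetical, and only the mapping (positive on probe becomes -ENOTSUP, positive on remove becomes -ENOENT) is taken from the patch.

```c
#include <errno.h>

/*
 * Hypothetical helpers, named for this sketch only.
 * Input is the raw value returned by a bus plug/unplug callback:
 *   0  : success
 *   <0 : error, already a negative errno
 *   >0 : no driver recognized the device
 */

/* Probe side: a positive value means "device not supported". */
int
normalize_probe_ret(int bus_ret)
{
	return (bus_ret > 0) ? -ENOTSUP : bus_ret;
}

/* Remove side: a positive value means "device not found by driver". */
int
normalize_remove_ret(int bus_ret)
{
	return (bus_ret > 0) ? -ENOENT : bus_ret;
}
```

With this mapping in place, a caller can rely on a single `ret < 0` test to detect failure, which is what the updated callers in the patch do.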
Comments
On Thu, Jun 6, 2019 at 12:03 PM Ilya Maximets <i.maximets@samsung.com>
wrote:
> According to API, 'rte_dev_probe()' and 'rte_dev_remove()' must
> return 0 or negative error code. Bus code returns positive values
> if device wasn't recognized by any driver, so the result of
> 'bus->plug/unplug()' must be converted. 'local_dev_probe()' and
> 'local_dev_remove()' also has their internal API, so the conversion
> should be done there.
>
> Positive on remove means that device not found by driver.
>
For backports, it is safer to add the check on > 0.
The patch looks good to me.
Reviewed-by: David Marchand <david.marchand@redhat.com>
But I have some comments on the current state of the code.
After inspecting the eal and buses, this problem is not supposed to happen
on the rte_dev_remove path.
rte_dev_remove() ensures that it calls local_dev_remove() only after
checking that the device is attached to a driver (see the check on
!rte_dev_is_probed()).
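The guard described above can be modelled with a small stand-alone sketch. Everything here is an assumption made for illustration: struct mock_device, mock_local_remove() and mock_dev_remove() are hypothetical stand-ins, not the real rte_device or EAL functions. The point is only that a device without a driver is rejected before the bus unplug path is reached.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified device; not the real struct rte_device. */
struct mock_device {
	bool probed;       /* true while a driver is attached */
	int unplug_ret;    /* value the mock bus unplug will return */
	int unplug_calls;  /* how many times unplug was reached */
};

/* Stand-in for the local remove path, with the patch's conversion. */
int
mock_local_remove(struct mock_device *dev)
{
	int ret = dev->unplug_ret;

	dev->unplug_calls++;
	if (ret != 0)
		return (ret < 0) ? ret : -ENOENT;
	dev->probed = false;
	return 0;
}

/* Stand-in for the public remove call: check the probed state first. */
int
mock_dev_remove(struct mock_device *dev)
{
	if (dev == NULL)
		return -EINVAL;
	if (!dev->probed)
		return -ENOENT; /* never reaches the bus unplug */
	return mock_local_remove(dev);
}
```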
Anatoly,
- When handling a detach operation in the primary process
https://git.dpdk.org/dpdk/tree/lib/librte_eal/common/hotplug_mp.c#n124, we
signal all other secondary processes to detach right away.
Then we do a bus/device lookup.
Then we call the bus unplug.
Would it not be better to check the device exists _and_ check if the device
is attached to a driver in the primary process before calling other
secondary processes?
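The ordering asked about above could look roughly like the following. This is a sketch of the suggestion only, not the existing hotplug_mp.c logic; struct mock_dev, secondary_requests and both functions are hypothetical names, and the IPC to secondary processes is reduced to a counter.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device state as seen by the primary process. */
struct mock_dev {
	bool exists;  /* bus/device lookup would succeed */
	bool probed;  /* a driver is attached */
};

/* Counts detach requests that would be sent to secondary processes. */
int secondary_requests;

void
notify_secondaries_detach(void)
{
	secondary_requests++;
}

/* Validate locally first, then involve the secondary processes. */
int
primary_detach_checked(struct mock_dev *dev)
{
	if (dev == NULL || !dev->exists)
		return -ENOENT; /* nothing to detach, no IPC sent */
	if (!dev->probed)
		return -ENOENT; /* no driver attached, no IPC sent */

	notify_secondaries_detach();
	dev->probed = false; /* stands in for the bus unplug */
	return 0;
}
```

In this model a request for a missing or unprobed device fails in the primary process without any secondary process being asked to detach.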
Thomas,
- Calling unplug on a device that is not attached is a bit weird to me, all
the more so since we have rte_dev_is_probed().
But there might be users calling the bus unplug API directly rather than the
official API...
Does this enter the ABI stability perimeter?
If not, I would be in favor of changing the unplug API so that we only deal
with 0 or < 0 on the remove path.
On the plug side, is there a reason why we do not check for
rte_dev_is_probed() and let the bus reply that the device is already probed?
Does it have something to do with representors?
Only guessing.
- On the plug side again, can't we have an indication from the buses that
they have a driver that can handle the device rather than this odd (and
historical) > 0 return code?
This should not change the current behavior, just make the code a bit
easier to understand.
I know you are travelling, so this can wait anyway.
On 06.06.2019 13:02, Ilya Maximets wrote:
> According to API, 'rte_dev_probe()' and 'rte_dev_remove()' must
> return 0 or negative error code. Bus code returns positive values
> if device wasn't recognized by any driver, so the result of
> 'bus->plug/unplug()' must be converted. 'local_dev_probe()' and
> 'local_dev_remove()' also has their internal API, so the conversion
> should be done there.
>
> Positive on remove means that device not found by driver.
> Positive on probe means that there are no suitable buses/drivers,
> i.e. device is not supported.
>
> Users of these API fixed to provide a good example by respecting
> DPDK API. This also will allow to catch such issues in the future.
>
> CC: stable@dpdk.org
> Fixes: a3ee360f4440 ("eal: add hotplug add/remove device")
> Fixes: 244d5130719c ("eal: enable hotplug on multi-process")
>
> Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
> ---
>
> Version 2:
>
> * Fixed API callers.
> * Check for probe moved from 'rte_dev_probe' to 'local_dev_probe'.
>
> app/test-pmd/testpmd.c | 4 ++--
> drivers/net/failsafe/failsafe.c | 2 +-
> drivers/net/failsafe/failsafe_eal.c | 4 ++--
> drivers/net/failsafe/failsafe_ether.c | 2 +-
> drivers/net/vdev_netvsc/vdev_netvsc.c | 2 +-
> lib/librte_eal/common/eal_common_dev.c | 5 ++++-
> 6 files changed, 11 insertions(+), 8 deletions(-)
Any more thoughts on this patch? Or can it be merged?
Best regards, Ilya Maximets.
07/06/2019 10:32, David Marchand:
> On Thu, Jun 6, 2019 at 12:03 PM Ilya Maximets <i.maximets@samsung.com>
> wrote:
>
> > According to API, 'rte_dev_probe()' and 'rte_dev_remove()' must
> > return 0 or negative error code. Bus code returns positive values
> > if device wasn't recognized by any driver, so the result of
> > 'bus->plug/unplug()' must be converted. 'local_dev_probe()' and
> > 'local_dev_remove()' also has their internal API, so the conversion
> > should be done there.
> >
> > Positive on remove means that device not found by driver.
> >
>
> For backports, it is safer to add the check on > 0.
> The patch looks good to me.
>
> Reviewed-by: David Marchand <david.marchand@redhat.com>
I did not get your comment. Is it OK to get this v2?
What do you mean about backports?
07/06/2019 10:32, David Marchand:
> Thomas,
>
> - Calling unplug on a device that is not attached is a bit weird to me, all
> the more so since we have rte_dev_is_probed().
> But there might be users calling directly the bus unplug api and not the
> official api...
> Does this enter the ABI stability perimeter?
> If not, I would be for changing unplug api so that we only deal with 0 or <
> 0 on remove path.
Where is the positive value documented?
If it is only an undocumented usage, I tend to think it can be changed.
> On the plug side, is there a reason why we do not check for
> rte_dev_is_probed() and let the bus reply that the device is already probed?
A device can be re-probed to allow discovering new ports.
> Does it have something to do with representors ?
> Only guessing.
Yes, representors are a case of ports which can appear on a new probe.
> - On the plug side again, can't we have an indication from the buses that
> they have a driver that can handle the device rather than this odd (and
> historical) > 0 return code?
> This should not change the current behavior, just make the code a bit
> easier to understand.
The positive code is also used for white/blacklist.
And I think we may need to try probing in order to give a final answer,
in the general case.
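To illustrate why a bus may only know the answer after trying, here is a minimal model of a plug routine that offers the device to each driver in turn. It is an assumption-laden sketch: mock_probe_t, mock_bus_plug() and the three drivers are hypothetical, and real DPDK buses are organized differently. A positive return covers both cases mentioned above: the device is skipped by policy, or no driver claimed it.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical driver probe callback:
 *   0  : driver attached to the device
 *   >0 : driver does not handle this device
 *   <0 : driver tried and failed
 */
typedef int (*mock_probe_t)(int device_id);

int
mock_bus_plug(int device_id, bool blocked,
	      const mock_probe_t *drivers, size_t nb_drivers)
{
	size_t i;

	if (blocked)
		return 1; /* skipped by policy, not a hard error here */

	for (i = 0; i < nb_drivers; i++) {
		int ret = drivers[i](device_id);

		if (ret <= 0)
			return ret; /* attached (0) or hard error (< 0) */
		/* ret > 0: this driver declined, try the next one */
	}
	return 1; /* no driver claimed the device */
}

/* Example drivers used only to exercise the sketch. */
int drv_declines(int id) { (void)id; return 1; }
int drv_accepts_7(int id) { return (id == 7) ? 0 : 1; }
int drv_fails(int id) { (void)id; return -EIO; }
```

The caller cannot tell in advance which driver, if any, will accept the device; the positive value only emerges once every driver has been asked.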
On Wed, Jun 26, 2019 at 11:03 PM Thomas Monjalon <thomas@monjalon.net>
wrote:
> 07/06/2019 10:32, David Marchand:
> > On Thu, Jun 6, 2019 at 12:03 PM Ilya Maximets <i.maximets@samsung.com>
> > wrote:
> >
> > > According to API, 'rte_dev_probe()' and 'rte_dev_remove()' must
> > > return 0 or negative error code. Bus code returns positive values
> > > if device wasn't recognized by any driver, so the result of
> > > 'bus->plug/unplug()' must be converted. 'local_dev_probe()' and
> > > 'local_dev_remove()' also has their internal API, so the conversion
> > > should be done there.
> > >
> > > Positive on remove means that device not found by driver.
> > >
> >
> > For backports, it is safer to add the check on > 0.
> > The patch looks good to me.
> >
> > Reviewed-by: David Marchand <david.marchand@redhat.com>
>
> I did not get your comment. Is it OK to get this v2?
> What do you mean about backports?
>
>
Yes, this v2 is OK.
I wanted to dissociate it from my other comments, which would not be part of
the fix for stable.
07/06/2019 10:32, David Marchand:
> On Thu, Jun 6, 2019 at 12:03 PM Ilya Maximets <i.maximets@samsung.com>
> wrote:
>
> > According to API, 'rte_dev_probe()' and 'rte_dev_remove()' must
> > return 0 or negative error code. Bus code returns positive values
> > if device wasn't recognized by any driver, so the result of
> > 'bus->plug/unplug()' must be converted. 'local_dev_probe()' and
> > 'local_dev_remove()' also has their internal API, so the conversion
> > should be done there.
> >
> > Positive on remove means that device not found by driver.
> >
>
> For backports, it is safer to add the check on > 0.
> The patch looks good to me.
>
> Reviewed-by: David Marchand <david.marchand@redhat.com>
Applied, thanks
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -2361,7 +2361,7 @@ attach_port(char *identifier)
 		return;
 	}
 
-	if (rte_dev_probe(identifier) != 0) {
+	if (rte_dev_probe(identifier) < 0) {
 		TESTPMD_LOG(ERR, "Failed to attach port %s\n", identifier);
 		return;
 	}
@@ -2431,7 +2431,7 @@ detach_port_device(portid_t port_id)
 			port_flow_flush(port_id);
 	}
 
-	if (rte_dev_remove(dev) != 0) {
+	if (rte_dev_remove(dev) < 0) {
 		TESTPMD_LOG(ERR, "Failed to detach device %s\n", dev->name);
 		return;
 	}
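diff --git a/drivers/net/failsafe/failsafe.c b/drivers/net/failsafe/failsafe.c
--- a/drivers/net/failsafe/failsafe.c
+++ b/drivers/net/failsafe/failsafe.c
@@ -374,7 +374,7 @@ rte_pmd_failsafe_probe(struct rte_vdev_device *vdev)
 			}
 			if (!devargs_already_listed(&devargs)) {
 				ret = rte_dev_probe(devargs.name);
-				if (ret != 0) {
+				if (ret < 0) {
 					ERROR("Failed to probe devargs %s",
 					      devargs.name);
 					continue;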
diff --git a/drivers/net/failsafe/failsafe_eal.c b/drivers/net/failsafe/failsafe_eal.c
--- a/drivers/net/failsafe/failsafe_eal.c
+++ b/drivers/net/failsafe/failsafe_eal.c
@@ -48,7 +48,7 @@ fs_bus_init(struct rte_eth_dev *dev)
 			ret = rte_eal_hotplug_add(da->bus->name,
 						  da->name,
 						  da->args);
-			if (ret) {
+			if (ret < 0) {
 				ERROR("sub_device %d probe failed %s%s%s", i,
 				      rte_errno ? "(" : "",
 				      rte_errno ? strerror(rte_errno) : "",
@@ -147,7 +147,7 @@ fs_bus_uninit(struct rte_eth_dev *dev)
 
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_PROBED) {
 		sdev_ret = rte_dev_remove(sdev->dev);
-		if (sdev_ret) {
+		if (sdev_ret < 0) {
 			ERROR("Failed to remove requested device %s (err: %d)",
 			      sdev->dev->name, sdev_ret);
 			continue;
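diff --git a/drivers/net/failsafe/failsafe_ether.c b/drivers/net/failsafe/failsafe_ether.c
--- a/drivers/net/failsafe/failsafe_ether.c
+++ b/drivers/net/failsafe/failsafe_ether.c
@@ -284,7 +284,7 @@ fs_dev_remove(struct sub_device *sdev)
 		/* fallthrough */
 	case DEV_PROBED:
 		ret = rte_dev_remove(sdev->dev);
-		if (ret) {
+		if (ret < 0) {
 			ERROR("Bus detach failed for sub_device %u",
 			      SUB_ID(sdev));
 		} else {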
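diff --git a/drivers/net/vdev_netvsc/vdev_netvsc.c b/drivers/net/vdev_netvsc/vdev_netvsc.c
--- a/drivers/net/vdev_netvsc/vdev_netvsc.c
+++ b/drivers/net/vdev_netvsc/vdev_netvsc.c
@@ -633,7 +633,7 @@ vdev_netvsc_netvsc_probe(const struct if_nameindex *iface,
 		ctx->devname, ctx->devargs);
 	vdev_netvsc_foreach_iface(vdev_netvsc_device_probe, 0, ctx);
 	ret = rte_eal_hotplug_add("vdev", ctx->devname, ctx->devargs);
-	if (ret)
+	if (ret < 0)
 		goto error;
 	LIST_INSERT_HEAD(&vdev_netvsc_ctx_list, ctx, entry);
 	++vdev_netvsc_ctx_count;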
diff --git a/lib/librte_eal/common/eal_common_dev.c b/lib/librte_eal/common/eal_common_dev.c
--- a/lib/librte_eal/common/eal_common_dev.c
+++ b/lib/librte_eal/common/eal_common_dev.c
@@ -172,6 +172,9 @@ local_dev_probe(const char *devargs, struct rte_device **new_dev)
 	 */
 
 	ret = dev->bus->plug(dev);
+	if (ret > 0)
+		ret = -ENOTSUP;
+
 	if (ret && !rte_dev_is_probed(dev)) { /* if hasn't ever succeeded */
 		RTE_LOG(ERR, EAL, "Driver cannot attach the device (%s)\n",
 			dev->name);
@@ -319,7 +322,7 @@ local_dev_remove(struct rte_device *dev)
 	if (ret) {
 		RTE_LOG(ERR, EAL, "Driver cannot detach the device (%s)\n",
 			dev->name);
-		return ret;
+		return (ret < 0) ? ret : -ENOENT;
 	}
 
 	return 0;