[1/2] bus/pci: Fail rte_pci_probe if probing all whitelisted devices fail.

Message ID 1566934217-23824-1-git-send-email-nitin.katiyar@ericsson.com (mailing list archive)
State Rejected, archived
Delegated to: David Marchand
Series [1/2] bus/pci: Fail rte_pci_probe if probing all whitelisted devices fail.

Checks

Context                          Check    Description
ci/checkpatch                    success  coding style OK
ci/Intel-compilation             success  Compilation OK
ci/iol-Compile-Testing           success  Compile Testing PASS
ci/intel-Performance-Testing     success  Performance Testing PASS
ci/mellanox-Performance-Testing  success  Performance Testing PASS

Commit Message

Nitin Katiyar Aug. 27, 2019, 7:30 p.m. UTC
Even if a whitelist of devices is provided, rte_pci_probe() increments
the probed counter for every device present in the system. If the probe
fails for all the whitelisted devices, it still returns success because
the failed and probed counts don't match.

This patch increments the probed count only when devices are actually
probed.

Signed-off-by: Nitin Katiyar <nitin.katiyar@ericsson.com>
Signed-off-by: Venkatesan Pradeep <venkatesan.pradeep@ericsson.com>

---
 drivers/bus/pci/pci_common.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)
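
The counting problem described in the commit message can be sketched in isolation. This is a simplified model, not the DPDK code: `struct sim_dev`, `probe_count_all()` and `probe_count_attempted()` are hypothetical names, and the final comparison (fail only when something was probed and every probe failed) is an assumption inferred from the commit message, since the return statement is not part of the hunk shown. The -EEXIST handling from the real loop is left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a scanned PCI device; not a DPDK type. */
struct sim_dev {
	bool whitelisted; /* device was named in the whitelist */
	int probe_ret;    /* value a driver probe would return (< 0 on failure) */
};

/* Counting as the commit message describes the existing code:
 * every scanned device increments probed, even if it is never probed. */
int probe_count_all(const struct sim_dev *devs, size_t n)
{
	size_t probed = 0, failed = 0;

	for (size_t i = 0; i < n; i++) {
		probed++;
		if (devs[i].whitelisted && devs[i].probe_ret < 0)
			failed++;
	}
	/* Assumed final check: error only if all counted devices failed. */
	return (probed && probed == failed) ? -1 : 0;
}

/* Counting as the patch proposes: only devices that are actually
 * probed increment probed. */
int probe_count_attempted(const struct sim_dev *devs, size_t n)
{
	size_t probed = 0, failed = 0;

	for (size_t i = 0; i < n; i++) {
		if (!devs[i].whitelisted)
			continue;
		probed++;
		if (devs[i].probe_ret < 0)
			failed++;
	}
	return (probed && probed == failed) ? -1 : 0;
}
```

With one whitelisted device that fails to probe and two other devices on the bus, the first function counts probed = 3 and failed = 1 and reports success, while the second counts probed = 1 and failed = 1 and reports failure.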
  

Comments

Stephen Hemminger Aug. 27, 2019, 8:12 p.m. UTC | #1
On Wed, 28 Aug 2019 01:00:16 +0530
Nitin Katiyar <nitin.katiyar@ericsson.com> wrote:

> Even if a whitelist of devices is provided, rte_pci_probe() increments
> the probed counter for every device present in the system. If the probe
> fails for all the whitelisted devices, it still returns success because
> the failed and probed counts don't match.
> 
> This patch increments the probed count only when devices are actually
> probed.
> 
> Signed-off-by: Nitin Katiyar <nitin.katiyar@ericsson.com>
> Signed-off-by: Venkatesan Pradeep <venkatesan.pradeep@ericsson.com>

There are two differing interpretations of this.
The simple case, which is what your patch fixes, is where the user gives bad
arguments and no devices are present.

But the more complex case is where the devices show up later via hotplug
or another discovery mechanism. For example, on Hyper-V/Azure, SR-IOV PCI devices
can show up after the application is started. Your patch might break the
use case where an application is started before the VF is available.

More detailed example:

1. VM is started.
2. VF is taken offline for maintenance or migration.
3. DPDK application is started with whitelist option (no usable PCI found).
4. VF becomes available after maintenance.

Yes, this is a somewhat made-up order which is unlikely to happen in
real life. But there is nothing stopping it from happening.

I often recommend that customers use a whitelist because the typical appliance
scenario has a management interface, and you don't want DPDK interacting
with the VF of the management interface.

Therefore, from my point of view, this patch is a NO.
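
To make the startup scenario above concrete, here is a small model of what the patched counting would report at step 3. It is only a sketch under stated assumptions: `struct bus_dev` and `startup_result()` are hypothetical names, the final comparison is assumed rather than taken from the hunk, and real EAL behaviour also depends on bus scanning and hotplug handling that are not modelled here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one device visible on the bus at startup. */
struct bus_dev {
	bool whitelisted; /* named in the whitelist */
	bool probe_ok;    /* whether a driver probe would succeed */
};

/* Startup result with the patched counting: only whitelisted devices
 * that are actually probed are counted. */
int startup_result(const struct bus_dev *devs, size_t n)
{
	size_t probed = 0, failed = 0;

	for (size_t i = 0; i < n; i++) {
		if (!devs[i].whitelisted)
			continue;
		probed++;
		if (!devs[i].probe_ok)
			failed++;
	}
	/* Assumed final check: fail only if at least one device was
	 * probed and every probe failed. */
	return (probed && probed == failed) ? -1 : 0;
}
```

In this model the outcome depends on whether the offline VF is still visible on the bus. If only the non-whitelisted management interface is present, nothing is probed and the result is 0. If the whitelisted VF is present but its probe fails, the result is -1, which is the case where an application started ahead of the VF would now be refused.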
  
Nitin Katiyar Aug. 29, 2019, 3:47 a.m. UTC | #2
Hi,
Thanks for your comments. I am sorry, but I couldn't understand the scenario you mentioned.

If we are not probing a device, why should we increment the probed counter? The current implementation doesn't handle the scenario where every device of interest (as per the whitelist) fails to probe, and the code fails to catch that case. As a result, an application like OVS using DPDK comes up successfully although it doesn't have any physical device in a usable state.

Best regards,
Nitin
  
Stephen Hemminger Aug. 29, 2019, 3:33 p.m. UTC | #3
On Thu, 29 Aug 2019 03:47:02 +0000
Nitin Katiyar <nitin.katiyar@ericsson.com> wrote:

> Hi,
> Thanks for your comments. I am sorry, but I couldn't understand the scenario you mentioned.
> 
> If we are not probing a device, why should we increment the probed counter? The current implementation doesn't handle the scenario where every device of interest (as per the whitelist) fails to probe, and the code fails to catch that case. As a result, an application like OVS using DPDK comes up successfully although it doesn't have any physical device in a usable state.
> 
> Best regards,
> Nitin
> 

When the application starts there may be no PCI devices, but PCI devices
can arrive later via hotplug.
  

Patch

diff --git a/drivers/bus/pci/pci_common.c b/drivers/bus/pci/pci_common.c
index 6b46b4f..25d1002 100644
--- a/drivers/bus/pci/pci_common.c
+++ b/drivers/bus/pci/pci_common.c
@@ -300,15 +300,17 @@  static struct rte_devargs *pci_devargs_lookup(struct rte_pci_device *dev)
 		probe_all = 1;
 
 	FOREACH_DEVICE_ON_PCIBUS(dev) {
-		probed++;
-
 		devargs = dev->device.devargs;
 		/* probe all or only whitelisted devices */
-		if (probe_all)
+		if (probe_all) {
 			ret = pci_probe_all_drivers(dev);
+			probed++;
+		}
 		else if (devargs != NULL &&
-			devargs->policy == RTE_DEV_WHITELISTED)
+			devargs->policy == RTE_DEV_WHITELISTED) {
 			ret = pci_probe_all_drivers(dev);
+			probed++;
+		}
 		if (ret < 0) {
 			if (ret != -EEXIST) {
 				RTE_LOG(ERR, EAL, "Requested device "