Message ID | 20211026090610.10823-1-houssem.bouhlel@6wind.com (mailing list archive) |
---|---|
State | New |
Delegated to: | David Marchand |
Series | bus/pci: fix selection of default device NUMA node |
Context | Check | Description |
---|---|---|
ci/iol-aarch64-unit-testing | success | Testing PASS |
ci/iol-aarch64-compile-testing | success | Testing PASS |
ci/iol-intel-Functional | success | Functional Testing PASS |
ci/iol-intel-Performance | success | Performance Testing PASS |
ci/iol-mellanox-Performance | fail | Performance Testing issues |
ci/iol-x86_64-compile-testing | success | Testing PASS |
ci/iol-x86_64-unit-testing | success | Testing PASS |
ci/intel-Testing | success | Testing PASS |
ci/iol-broadcom-Functional | success | Functional Testing PASS |
ci/Intel-compilation | success | Compilation OK |
ci/iol-broadcom-Performance | success | Performance Testing PASS |
ci/github-robot: build | success | github build: passed |
ci/checkpatch | success | coding style OK |
On Tue, Oct 26, 2021 at 11:06:10AM +0200, Houssem Bouhlel wrote:
> There can be dev binding issue when no hugepages
> are allocated for socket 0.
> To avoid this, set device numa node value based on
> the first lcore instead of 0.
>
> Fixes: 831dba47bd36 ("bus/vmbus: add Hyper-V virtual bus support")

Sorry, the Fixes line is wrong. This is the correct one:
Fixes: 8a04cb612589 ("pci: set default numa node for broken systems")

> Cc: stable@dpdk.org
>
> Signed-off-by: Houssem Bouhlel <houssem.bouhlel@6wind.com>
> Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
> ---
>  drivers/bus/pci/pci_common.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/bus/pci/pci_common.c b/drivers/bus/pci/pci_common.c
> index f8fff2c98ebf..c70ab2373c79 100644
> --- a/drivers/bus/pci/pci_common.c
> +++ b/drivers/bus/pci/pci_common.c
> @@ -166,6 +166,7 @@ rte_pci_probe_one_driver(struct rte_pci_driver *dr,
>  		struct rte_pci_device *dev)
>  {
>  	int ret;
> +	unsigned int socket_id;
>  	bool already_probed;
>  	struct rte_pci_addr *loc;
>
> @@ -194,7 +195,8 @@ rte_pci_probe_one_driver(struct rte_pci_driver *dr,
>  		if (rte_socket_count() > 1)
>  			RTE_LOG(INFO, EAL, "Device %s is not NUMA-aware, defaulting socket to 0\n",
>  					dev->name);

One more comment (sorry, I should have done it before you send the mail):
We should move this log below, and use the socket_id instead of 0.

> -		dev->device.numa_node = 0;
> +		socket_id = rte_lcore_to_socket_id(rte_get_next_lcore(-1, 0, 0));
> +		dev->device.numa_node = socket_id;
>  	}
>
>  	already_probed = rte_dev_is_probed(&dev->device);
> --
> 2.30.2
>
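The review comment above asks for the log to be emitted after the default socket has been computed, so that the message reports the value actually assigned rather than a hard-coded 0. The following is a minimal standalone sketch of that ordering, not DPDK code: `struct lcore_map`, `first_lcore_socket()` and `default_numa_node()` are hypothetical stand-ins, with `first_lcore_socket()` modelling `rte_lcore_to_socket_id(rte_get_next_lcore(-1, 0, 0))` from the patch, and the message is written to a buffer instead of `RTE_LOG` so that it can be inspected.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the EAL lcore table (not a DPDK structure):
 * socket_of_lcore[i] is the socket of the i-th enabled lcore. */
struct lcore_map {
	const unsigned int *socket_of_lcore;
	unsigned int n_lcores;
};

/* Models rte_lcore_to_socket_id(rte_get_next_lcore(-1, 0, 0)):
 * the socket of the first enabled lcore. Falling back to 0 when no
 * lcore is listed is an assumption of this sketch. */
static unsigned int
first_lcore_socket(const struct lcore_map *map)
{
	return map->n_lcores > 0 ? map->socket_of_lcore[0] : 0;
}

/* Compute the default socket first, then build the log line from it,
 * so the message always matches the value that is returned.
 * The message is only produced on multi-socket systems, as in the hunk. */
static int
default_numa_node(const struct lcore_map *map, unsigned int socket_count,
		const char *dev_name, char *msg, size_t msg_len)
{
	unsigned int socket_id = first_lcore_socket(map);

	if (msg_len > 0)
		msg[0] = '\0';
	if (socket_count > 1)
		snprintf(msg, msg_len,
			"Device %s is not NUMA-aware, defaulting socket to %u\n",
			dev_name, socket_id);
	return (int)socket_id;
}
```

In the real function the same reordering would presumably mean assigning `socket_id` before the `rte_socket_count() > 1` check and passing it as a second format argument; that is a reading of the comment, not a posted revision.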
+CC David

On Tue, Oct 26, 2021 at 11:17:08AM +0200, Olivier Matz wrote:
> On Tue, Oct 26, 2021 at 11:06:10AM +0200, Houssem Bouhlel wrote:
> > There can be dev binding issue when no hugepages
> > are allocated for socket 0.
> > To avoid this, set device numa node value based on
> > the first lcore instead of 0.
> >
> > Fixes: 831dba47bd36 ("bus/vmbus: add Hyper-V virtual bus support")
>
> Sorry, the Fixes line is wrong. This is the correct one:
> Fixes: 8a04cb612589 ("pci: set default numa node for broken systems")
>
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: Houssem Bouhlel <houssem.bouhlel@6wind.com>
> > Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
> > ---
> >  drivers/bus/pci/pci_common.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/bus/pci/pci_common.c b/drivers/bus/pci/pci_common.c
> > index f8fff2c98ebf..c70ab2373c79 100644
> > --- a/drivers/bus/pci/pci_common.c
> > +++ b/drivers/bus/pci/pci_common.c
> > @@ -166,6 +166,7 @@ rte_pci_probe_one_driver(struct rte_pci_driver *dr,
> >  		struct rte_pci_device *dev)
> >  {
> >  	int ret;
> > +	unsigned int socket_id;
> >  	bool already_probed;
> >  	struct rte_pci_addr *loc;
> >
> > @@ -194,7 +195,8 @@ rte_pci_probe_one_driver(struct rte_pci_driver *dr,
> >  		if (rte_socket_count() > 1)
> >  			RTE_LOG(INFO, EAL, "Device %s is not NUMA-aware, defaulting socket to 0\n",
> >  					dev->name);
>
> One more comment (sorry, I should have done it before you send the mail):
> We should move this log below, and use the socket_id instead of 0.
>
> > -		dev->device.numa_node = 0;
> > +		socket_id = rte_lcore_to_socket_id(rte_get_next_lcore(-1, 0, 0));
> > +		dev->device.numa_node = socket_id;

After some offline discussions with David, some additional comments:

- a similar change may be needed in other bus drivers

- instead of setting the numa node to an existing socket, it can make
  more sense to keep its value to unknown (-1). This would however be a
  behavior change for pci bus, which returns 0 since 2015 for unknown
  cases. See:
  81f8d2317df2 ("eal/linux: fix socket value for undetermined numa node")
  8a04cb612589 ("pci: set default numa node for broken systems")

I'll tend to be in favor of using -1. Any other opinion?
Should we announce a behavior change in this case?

> >  	}
> >
> >  	already_probed = rte_dev_is_probed(&dev->device);
> > --
> > 2.30.2
> >
On Fri, Oct 29, 2021 at 10:45 AM Olivier Matz <olivier.matz@6wind.com> wrote:
>
> +CC David
>
> On Tue, Oct 26, 2021 at 11:17:08AM +0200, Olivier Matz wrote:
> > On Tue, Oct 26, 2021 at 11:06:10AM +0200, Houssem Bouhlel wrote:
> > > There can be dev binding issue when no hugepages
> > > are allocated for socket 0.
> > > To avoid this, set device numa node value based on
> > > the first lcore instead of 0.
> > >
> > > Fixes: 831dba47bd36 ("bus/vmbus: add Hyper-V virtual bus support")
> >
> > Sorry, the Fixes line is wrong. This is the correct one:
> > Fixes: 8a04cb612589 ("pci: set default numa node for broken systems")
> >
> > > Cc: stable@dpdk.org
> > >
> > > Signed-off-by: Houssem Bouhlel <houssem.bouhlel@6wind.com>
> > > Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
> > > ---
> > >  drivers/bus/pci/pci_common.c | 4 +++-
> > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/bus/pci/pci_common.c b/drivers/bus/pci/pci_common.c
> > > index f8fff2c98ebf..c70ab2373c79 100644
> > > --- a/drivers/bus/pci/pci_common.c
> > > +++ b/drivers/bus/pci/pci_common.c
> > > @@ -166,6 +166,7 @@ rte_pci_probe_one_driver(struct rte_pci_driver *dr,
> > >  		struct rte_pci_device *dev)
> > >  {
> > >  	int ret;
> > > +	unsigned int socket_id;
> > >  	bool already_probed;
> > >  	struct rte_pci_addr *loc;
> > >
> > > @@ -194,7 +195,8 @@ rte_pci_probe_one_driver(struct rte_pci_driver *dr,
> > >  		if (rte_socket_count() > 1)
> > >  			RTE_LOG(INFO, EAL, "Device %s is not NUMA-aware, defaulting socket to 0\n",
> > >  					dev->name);
> >
> > One more comment (sorry, I should have done it before you send the mail):
> > We should move this log below, and use the socket_id instead of 0.
> >
> > > -		dev->device.numa_node = 0;
> > > +		socket_id = rte_lcore_to_socket_id(rte_get_next_lcore(-1, 0, 0));
> > > +		dev->device.numa_node = socket_id;
>
> After some offline discussions with David, some additional comments:
>
> - a similar change may be needed in other bus drivers
>
> - instead of setting the numa node to an existing socket, it can make
>   more sense to keep its value to unknown (-1). This would however be a
>   behavior change for pci bus, which returns 0 since 2015 for unknown
>   cases. See:
>   81f8d2317df2 ("eal/linux: fix socket value for undetermined numa node")
>   8a04cb612589 ("pci: set default numa node for broken systems")
>
> I'll tend to be in favor of using -1. Any other opinion?
> Should we announce a behavior change in this case?

Good summary.
I copied some more people.

I am for -1 too (as a way to indicate "I don't know what this PCI
device affinity is").

It is dangerous to change now, and I think it is late for 21.11.
On Wed, Nov 03, 2021 at 09:36:49PM +0100, David Marchand wrote:
> On Fri, Oct 29, 2021 at 10:45 AM Olivier Matz <olivier.matz@6wind.com> wrote:
> >
> > +CC David
> >
> > On Tue, Oct 26, 2021 at 11:17:08AM +0200, Olivier Matz wrote:
> > > On Tue, Oct 26, 2021 at 11:06:10AM +0200, Houssem Bouhlel wrote:
> > > > There can be dev binding issue when no hugepages
> > > > are allocated for socket 0.
> > > > To avoid this, set device numa node value based on
> > > > the first lcore instead of 0.
> > > >
> > > > Fixes: 831dba47bd36 ("bus/vmbus: add Hyper-V virtual bus support")
> > >
> > > Sorry, the Fixes line is wrong. This is the correct one:
> > > Fixes: 8a04cb612589 ("pci: set default numa node for broken systems")
> > >
> > > > Cc: stable@dpdk.org
> > > >
> > > > Signed-off-by: Houssem Bouhlel <houssem.bouhlel@6wind.com>
> > > > Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
> > > > ---
> > > >  drivers/bus/pci/pci_common.c | 4 +++-
> > > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/bus/pci/pci_common.c b/drivers/bus/pci/pci_common.c
> > > > index f8fff2c98ebf..c70ab2373c79 100644
> > > > --- a/drivers/bus/pci/pci_common.c
> > > > +++ b/drivers/bus/pci/pci_common.c
> > > > @@ -166,6 +166,7 @@ rte_pci_probe_one_driver(struct rte_pci_driver *dr,
> > > >  		struct rte_pci_device *dev)
> > > >  {
> > > >  	int ret;
> > > > +	unsigned int socket_id;
> > > >  	bool already_probed;
> > > >  	struct rte_pci_addr *loc;
> > > >
> > > > @@ -194,7 +195,8 @@ rte_pci_probe_one_driver(struct rte_pci_driver *dr,
> > > >  		if (rte_socket_count() > 1)
> > > >  			RTE_LOG(INFO, EAL, "Device %s is not NUMA-aware, defaulting socket to 0\n",
> > > >  					dev->name);
> > >
> > > One more comment (sorry, I should have done it before you send the mail):
> > > We should move this log below, and use the socket_id instead of 0.
> > >
> > > > -		dev->device.numa_node = 0;
> > > > +		socket_id = rte_lcore_to_socket_id(rte_get_next_lcore(-1, 0, 0));
> > > > +		dev->device.numa_node = socket_id;
> >
> > After some offline discussions with David, some additional comments:
> >
> > - a similar change may be needed in other bus drivers
> >
> > - instead of setting the numa node to an existing socket, it can make
> >   more sense to keep its value to unknown (-1). This would however be a
> >   behavior change for pci bus, which returns 0 since 2015 for unknown
> >   cases. See:
> >   81f8d2317df2 ("eal/linux: fix socket value for undetermined numa node")
> >   8a04cb612589 ("pci: set default numa node for broken systems")
> >
> > I'll tend to be in favor of using -1. Any other opinion?
> > Should we announce a behavior change in this case?
>
> Good summary.
> I copied some more people.
>
> I am for -1 too (as a way to indicate "I don't know what this PCI
> device affinity is").
>
> It is dangerous to change now, and I think it is late for 21.11.

+1, we can make an announce and change this for next version.
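The thread weighs three ways of handling a device whose NUMA node cannot be determined: the historical default of socket 0, the socket of the first lcore (this patch), and leaving the value as unknown (-1). The sketch below only compares the outcomes of the three options and is not DPDK code: `enum default_policy`, `NUMA_NODE_UNKNOWN` and `resolve_numa_node()` are hypothetical names, and treating any negative detected value as "undetermined" is an assumption based on the "unknown (-1)" wording above.

```c
/* Hypothetical marker for an undetermined node; -1 as discussed in the thread. */
#define NUMA_NODE_UNKNOWN (-1)

/* The three options discussed in the thread (names are illustrative only). */
enum default_policy {
	POLICY_SOCKET_ZERO,   /* behaviour since 8a04cb612589: default to 0 */
	POLICY_FIRST_LCORE,   /* this patch: socket of the first lcore */
	POLICY_KEEP_UNKNOWN   /* proposed follow-up: leave the node as -1 */
};

/* Return the NUMA node a device would end up with under each policy.
 * A detected node >= 0 is always kept as is; only the undetermined
 * case depends on the policy. */
static int
resolve_numa_node(int detected, enum default_policy policy,
		unsigned int first_lcore_socket)
{
	if (detected >= 0)
		return detected;

	switch (policy) {
	case POLICY_SOCKET_ZERO:
		return 0;
	case POLICY_FIRST_LCORE:
		return (int)first_lcore_socket;
	case POLICY_KEEP_UNKNOWN:
		return NUMA_NODE_UNKNOWN;
	}
	return NUMA_NODE_UNKNOWN;
}
```

With all lcores on socket 1 and no hugepages on socket 0, the first policy yields 0 (the failing case described in the commit message), the second yields 1, and the third leaves the decision to whoever consumes the value, which is the behavior change the replies suggest announcing before making it.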
diff --git a/drivers/bus/pci/pci_common.c b/drivers/bus/pci/pci_common.c
index f8fff2c98ebf..c70ab2373c79 100644
--- a/drivers/bus/pci/pci_common.c
+++ b/drivers/bus/pci/pci_common.c
@@ -166,6 +166,7 @@ rte_pci_probe_one_driver(struct rte_pci_driver *dr,
 		struct rte_pci_device *dev)
 {
 	int ret;
+	unsigned int socket_id;
 	bool already_probed;
 	struct rte_pci_addr *loc;
 
@@ -194,7 +195,8 @@ rte_pci_probe_one_driver(struct rte_pci_driver *dr,
 		if (rte_socket_count() > 1)
 			RTE_LOG(INFO, EAL, "Device %s is not NUMA-aware, defaulting socket to 0\n",
 					dev->name);
-		dev->device.numa_node = 0;
+		socket_id = rte_lcore_to_socket_id(rte_get_next_lcore(-1, 0, 0));
+		dev->device.numa_node = socket_id;
 	}
 
 	already_probed = rte_dev_is_probed(&dev->device);