[dpdk-dev] eal/linux: fix negative value for undetermined numa_node

Message ID 1438306572-25434-1-git-send-email-cunming.liang@intel.com (mailing list archive)
State Accepted, archived
Headers

Commit Message

Cunming Liang July 31, 2015, 1:36 a.m. UTC
  The patch sets zero as the default value of pci device numa_node
if the socket could not be determined.
It provides the same default value as FreeBSD which has no NUMA support,
and makes the return value of rte_eth_dev_socket_id() be consistent
with the API description.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 lib/librte_eal/linuxapp/eal/eal_pci.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
  

Comments

Matthew Hall Aug. 1, 2015, 3:56 a.m. UTC | #1
I asked about this many months ago and was informed that "-1" is a "standard 
error value" that I should expect from these APIs when NUMA is not present. 
Now we're saying I have to change my code again to handle a zero value?

Also not sure how to tell the difference between no NUMA, something running on 
socket zero, and something with multiple sockets. Seems like we need a bit of 
thought about how the NUMA APIs should behave overall.

Matthew.

On Fri, Jul 31, 2015 at 09:36:12AM +0800, Cunming Liang wrote:
> The patch sets zero as the default value of pci device numa_node
> if the socket could not be determined.
> It provides the same default value as FreeBSD which has no NUMA support,
> and makes the return value of rte_eth_dev_socket_id() be consistent
> with the API description.
> 
> Signed-off-by: Cunming Liang <cunming.liang@intel.com>
  
Cunming Liang Aug. 3, 2015, 1:46 a.m. UTC | #2
Hi,

On 8/1/2015 11:56 AM, Matthew Hall wrote:
> I asked about this many months ago and was informed that "-1" is a "standard
> error value" that I should expect from these APIs when NUMA is not present.
> Now we're saying I have to change my code again to handle a zero value?
>
> Also not sure how to tell the difference between no NUMA, something running on
> socket zero, and something with multiple sockets. Seems like we need a bit of
> thought about how the NUMA APIs should behave overall.
>
> Matthew.
>
> On Fri, Jul 31, 2015 at 09:36:12AM +0800, Cunming Liang wrote:
>> The patch sets zero as the default value of pci device numa_node
>> if the socket could not be determined.
>> It provides the same default value as FreeBSD which has no NUMA support,
>> and makes the return value of rte_eth_dev_socket_id() be consistent
>> with the API description.
>>
>> Signed-off-by: Cunming Liang <cunming.liang@intel.com>
>>
>>
/*
  * Return the NUMA socket to which an Ethernet device is connected
  *
  * @param port_id
  *   The port identifier of the Ethernet device
  * @return
  *   The NUMA socket id to which the Ethernet device is connected or
  *   a default of zero if the socket could not be determined.
  *   -1 is returned is the port_id value is out of range.
  */
extern int rte_eth_dev_socket_id(uint8_t port_id);

According to the API definition, if the socket could not be determined, 
a default of zero will take.
The '-1' is returned when the port_id value is out of range.

To your concern, "difference between no NUMA, something running on 
socket zero, and something with multiple sockets.".
The latter two belongs to the same situation, that is the numa_node 
stores the NUMA id.
So in fact the concern is about using '-1' or '0' when there's no NUMA 
detect.
If we won't plan to redefine the API return value, the fix patch is 
reasonable.

Btw, if it returns '-1' when no NUMA is detected, what will you do, do 
condition check '-1' and then use node 0 instead ?
In that way, you can't distinguish '-'1 is out of range port_id error or 
no NUMA detection error.
If it is, why not follow the API definition.

/Steve
  
Matthew Hall Aug. 3, 2015, 5:04 a.m. UTC | #3
On Mon, Aug 03, 2015 at 09:46:54AM +0800, Liang, Cunming wrote:
> According to the API definition, if the socket could not be determined, a
> default of zero will take.
> The '-1' is returned when the port_id value is out of range.

Yes, but when I asked the exact same question and was told the documentation 
was wrong not the -1 return value.

> To your concern, "difference between no NUMA, something running on socket
> zero, and something with multiple sockets.".
> The latter two belongs to the same situation, that is the numa_node stores
> the NUMA id.
> So in fact the concern is about using '-1' or '0' when there's no NUMA
> detect.
> If we won't plan to redefine the API return value, the fix patch is
> reasonable.
> 
> Btw, if it returns '-1' when no NUMA is detected, what will you do, do
> condition check '-1' and then use node 0 instead ?
> In that way, you can't distinguish '-'1 is out of range port_id error or no
> NUMA detection error.

I asked that question also, and the answer I got was to use node 0 instead.

> If it is, why not follow the API definition.

Sure, if nobody objects like the last time I asked. But this will change the 
user behavior as I am looking for the -1 now.

> /Steve

Matthew.
  
Thomas Monjalon Aug. 3, 2015, 5:19 p.m. UTC | #4
2015-08-02 22:04, Matthew Hall:
> On Mon, Aug 03, 2015 at 09:46:54AM +0800, Liang, Cunming wrote:
> > According to the API definition, if the socket could not be determined, a
> > default of zero will take.
> > The '-1' is returned when the port_id value is out of range.
> 
> Yes, but when I asked the exact same question and was told the documentation 
> was wrong not the -1 return value.

It was an error.
The correct API, as defined, is to return 0 when there is no NUMA.
This patch now reverts this one:
	http://dpdk.org/browse/dpdk/commit/?id=a4ff75496a18
Thanks Matthew for helping to clarify it.

> Sure, if nobody objects like the last time I asked. But this will change the 
> user behavior as I am looking for the -1 now.

You're right. But this fix follows the defined API, so it's rather an API fix
than an API break.

Applied, thanks
  

Patch

diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c b/lib/librte_eal/linuxapp/eal/eal_pci.c
index 0e62f65..f573092 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
@@ -325,8 +325,8 @@  pci_scan_one(const char *dirname, uint16_t domain, uint8_t bus,
 	snprintf(filename, sizeof(filename), "%s/numa_node",
 		 dirname);
 	if (access(filename, R_OK) != 0) {
-		/* if no NUMA support just set node to -1 */
-		dev->numa_node = -1;
+		/* if no NUMA support, set default to 0 */
+		dev->numa_node = 0;
 	} else {
 		if (eal_parse_sysfs_value(filename, &tmp) < 0) {
 			free(dev);