[v2] ethdev: fix device init without socket-local memory
Commit Message
When allocating private data for an ethdev, the rte_zmalloc_socket call
used only allocates memory on the NUMA node/socket local to the device.
This means that, even if the user wants to, they can never use a remote
NIC without also having memory on that NIC's socket.

For example, if we change examples/skeleton/basicfwd.c to have
SOCKET_ID_ANY as the socket_id parameter for Rx and Tx rings, we should
be able to run the app cross-NUMA, e.g. as below, where the two PCI
devices are on socket 1, and core 1 is on socket 0:

  ./build/examples/dpdk-skeleton -l 1 --legacy-mem --socket-mem=1024,0 \
      -a a8:00.0 -a b8:00.0

However, this fails with the error:

  ETHDEV: failed to allocate private data
  PCI_BUS: Requested device 0000:a8:00.0 cannot be used

We can remove this restriction by falling back to a general rte_zmalloc
call when the rte_zmalloc_socket call fails. This should be safe to do
because the later ethdev calls to set up Rx/Tx queues all take a
socket_id parameter, which applications can use to enforce the
requirement for local-only memory for a device, if so desired. [If
device-local memory is present, it will be used as before; if it is not
present, the rte_eth_dev_configure call will now pass, but the subsequent
queue setup calls requesting local memory will fail.]
Fixes: e489007a411c ("ethdev: add generic create/destroy ethdev APIs")
Fixes: dcd5c8112bc3 ("ethdev: add PCI driver helpers")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Padraig Connolly <padraig.j.connolly@intel.com>
---
V2:
* Add warning printout in the case where we don't get device-local
memory, but we do get memory on another socket.
---
lib/ethdev/ethdev_driver.c | 20 +++++++++++++++-----
lib/ethdev/ethdev_pci.h | 20 +++++++++++++++++---
2 files changed, 32 insertions(+), 8 deletions(-)
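
The behaviour described in the commit message can be modelled outside DPDK. The block below is only a sketch under stated assumptions, not the patch itself: every model_* name and MODEL_* constant is hypothetical, calloc stands in for the DPDK heap, and the per-socket budget table loosely imitates --socket-mem=1024,0. It shows private data falling back to any socket with a warning, while queue setup still honours exactly the socket_id it is given.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

#define MODEL_SOCKET_ID_ANY (-1)	/* hypothetical stand-in for SOCKET_ID_ANY */
#define MODEL_NUM_SOCKETS 2

/* Memory budget per socket in arbitrary units: socket 0 has memory,
 * socket 1 has none (loosely imitating --socket-mem=1024,0). */
static size_t model_free_units[MODEL_NUM_SOCKETS] = { 1024, 0 };

/* Hypothetical allocator: zeroed memory from the requested socket, or from
 * the first socket with room when socket is MODEL_SOCKET_ID_ANY.
 * Returns NULL if no eligible socket can satisfy the request. */
static void *model_zmalloc_socket(size_t size, int socket, int *got_socket)
{
	int first = (socket == MODEL_SOCKET_ID_ANY) ? 0 : socket;
	int last = (socket == MODEL_SOCKET_ID_ANY) ? MODEL_NUM_SOCKETS - 1 : socket;

	assert(got_socket != NULL);
	for (int s = first; s <= last; s++) {
		void *p;

		if (s < 0 || s >= MODEL_NUM_SOCKETS || model_free_units[s] < size)
			continue;
		p = calloc(1, size);
		if (p == NULL)
			return NULL;
		model_free_units[s] -= size;
		*got_socket = s;
		return p;
	}
	return NULL;
}

/* Fallback pattern: try the device-local socket first, then any socket,
 * and warn when the memory obtained is not local to the device. */
static void *model_alloc_private(size_t size, int dev_socket, int *got_socket)
{
	void *p = model_zmalloc_socket(size, dev_socket, got_socket);

	if (p == NULL) {
		p = model_zmalloc_socket(size, MODEL_SOCKET_ID_ANY, got_socket);
		if (p == NULL) {
			fprintf(stderr, "failed to allocate private data\n");
			return NULL;
		}
		fprintf(stderr,
			"private data not allocated on local NUMA node %d\n",
			dev_socket);
	}
	return p;
}

/* Queue setup stays strict: no fallback, so a request for a socket
 * without memory fails. Returns 0 on success, -1 on failure. */
static int model_queue_setup(size_t ring_size, int socket_id)
{
	int got = MODEL_SOCKET_ID_ANY;
	void *ring = model_zmalloc_socket(ring_size, socket_id, &got);

	if (ring == NULL)
		return -1;
	/* The model does not keep the ring: hand the budget back. */
	model_free_units[got] += ring_size;
	free(ring);
	return 0;
}
```

With a device on socket 1, model_alloc_private(256, 1, &got) succeeds with got set to 0 and prints the warning; model_queue_setup(512, 1) then fails, while model_queue_setup(512, MODEL_SOCKET_ID_ANY) succeeds. This matches the split described in the commit message, where the application's socket_id choice at queue setup decides whether remote memory is acceptable.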
Comments
On 7/22/2024 11:02 AM, Bruce Richardson wrote:
> [...]
Reviewed-by: Ferruh Yigit <ferruh.yigit@amd.com>
Applied to dpdk-next-net/main, thanks.
diff --git a/lib/ethdev/ethdev_driver.c b/lib/ethdev/ethdev_driver.c
--- a/lib/ethdev/ethdev_driver.c
+++ b/lib/ethdev/ethdev_driver.c
@@ -303,15 +303,25 @@ rte_eth_dev_create(struct rte_device *device, const char *name,
 			return -ENODEV;
 		if (priv_data_size) {
+			/* try alloc private data on device-local node. */
 			ethdev->data->dev_private = rte_zmalloc_socket(
 				name, priv_data_size, RTE_CACHE_LINE_SIZE,
 				device->numa_node);
-			if (!ethdev->data->dev_private) {
-				RTE_ETHDEV_LOG_LINE(ERR,
-					"failed to allocate private data");
-				retval = -ENOMEM;
-				goto probe_failed;
+			/* fall back to alloc on any socket on failure */
+			if (ethdev->data->dev_private == NULL) {
+				ethdev->data->dev_private = rte_zmalloc(name,
+						priv_data_size, RTE_CACHE_LINE_SIZE);
+
+				if (ethdev->data->dev_private == NULL) {
+					RTE_ETHDEV_LOG_LINE(ERR, "failed to allocate private data");
+					retval = -ENOMEM;
+					goto probe_failed;
+				}
+				/* got memory, but not local, so issue warning */
+				RTE_ETHDEV_LOG_LINE(WARNING,
+						"Private data for ethdev '%s' not allocated on local NUMA node %d",
+						device->name, device->numa_node);
 			}
 		}
 	} else {
diff --git a/lib/ethdev/ethdev_pci.h b/lib/ethdev/ethdev_pci.h
--- a/lib/ethdev/ethdev_pci.h
+++ b/lib/ethdev/ethdev_pci.h
@@ -93,12 +93,26 @@ rte_eth_dev_pci_allocate(struct rte_pci_device *dev, size_t private_data_size)
 			return NULL;
 		if (private_data_size) {
+			/* Try and alloc the private-data structure on socket local to the device */
 			eth_dev->data->dev_private = rte_zmalloc_socket(name,
 				private_data_size, RTE_CACHE_LINE_SIZE,
 				dev->device.numa_node);
-			if (!eth_dev->data->dev_private) {
-				rte_eth_dev_release_port(eth_dev);
-				return NULL;
+
+			/* if cannot allocate memory on the socket local to the device
+			 * use rte_malloc to allocate memory on some other socket, if available.
+			 */
+			if (eth_dev->data->dev_private == NULL) {
+				eth_dev->data->dev_private = rte_zmalloc(name,
+						private_data_size, RTE_CACHE_LINE_SIZE);
+
+				if (eth_dev->data->dev_private == NULL) {
+					rte_eth_dev_release_port(eth_dev);
+					return NULL;
+				}
+				/* got memory, but not local, so issue warning */
+				RTE_ETHDEV_LOG_LINE(WARNING,
+						"Private data for ethdev '%s' not allocated on local NUMA node %d",
+						dev->device.name, dev->device.numa_node);
 			}
 		}
 	} else {