[4/8] net/mlx5: add sysfs check for Multiport E-Switch

Message ID 20231031142733.2009166-5-dsosnowski@nvidia.com (mailing list archive)
State Accepted, archived
Delegated to: Raslan Darawsheh
Series net/mlx5: add Multiport E-Switch support

Checks

Context        Check    Description
ci/checkpatch  success  coding style OK

Commit Message

Dariusz Sosnowski Oct. 31, 2023, 2:27 p.m. UTC
  This patch adds a check for whether Multiport E-Switch is enabled
on a given PCI device, using the Linux kernel sysfs interface.
This facility will be used in follow-up commits,
which add support for such a configuration to the mlx5 PMD.

MLNX_OFED mlx5_core kernel module versions which support
Multiport E-Switch expose this configuration through a sysfs
interface rather than through Devlink.
If such a version is used, then Multiport E-Switch can be enabled
(or its state can be probed) through a sysfs file under path:

  # <ifname> should be substituted with Linux interface name.
  /sys/class/net/<ifname>/compat/devlink/lag_port_select_mode

Writing "multiport_esw" to this file enables Multiport E-Switch.
If "multiport_esw" is read from this file, then
Multiport E-Switch is enabled.

If this file does not exist, or if writing "multiport_esw" to this file
raises an error, then Multiport E-Switch is not supported.

Signed-off-by: Dariusz Sosnowski <dsosnowski@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 drivers/net/mlx5/linux/mlx5_os.c | 69 ++++++++++++++++++++++++++++++++
 1 file changed, 69 insertions(+)
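
The probing rule from the commit message (read the sysfs file and compare its content with "multiport_esw") can be sketched in isolation as follows. This is a minimal illustration and not the code added by the patch: the helper name mpesw_probe_path, the MPESW_* macros and the choice of returned error codes are assumptions made for the example. Only the 16-character limit and the "multiport_esw" string come from the patch itself.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Same limit as SYSFS_MPESW_PARAM_MAX_LEN in the patch. */
#define MPESW_PARAM_MAX_LEN 16
/* Stringify the limit so it can be used as a scanf field width. */
#define MPESW_STR_(x) #x
#define MPESW_STR(x) MPESW_STR_(x)

/*
 * Hypothetical helper (not part of the patch): read the first
 * whitespace-delimited token of the file at `path` and report in
 * `*enabled` whether it equals "multiport_esw".
 * Returns 0 on success and a negative errno value otherwise.
 */
int
mpesw_probe_path(const char *path, int *enabled)
{
	char mode[MPESW_PARAM_MAX_LEN + 1];
	FILE *f;
	int n;

	assert(path != NULL && enabled != NULL);
	/* Keep a defined state in case of an error, as the patch does. */
	*enabled = 0;
	f = fopen(path, "r");
	if (f == NULL)
		return errno != 0 ? -errno : -EIO;
	/* The field width bounds the read to the size of `mode`. */
	n = fscanf(f, "%" MPESW_STR(MPESW_PARAM_MAX_LEN) "s", mode);
	fclose(f);
	if (n != 1)
		return -EIO;
	*enabled = (strcmp(mode, "multiport_esw") == 0);
	return 0;
}
```

A caller would pass the lag_port_select_mode path of the interface in question. A negative return value means the state could not be determined, which is different from a successful read of some other mode.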
  

Comments

Stephen Hemminger Oct. 31, 2023, 4:09 p.m. UTC | #1
On Tue, 31 Oct 2023 16:27:29 +0200
Dariusz Sosnowski <dsosnowski@nvidia.com> wrote:

> +		MKSTR(sysfs_if_path, "/sys/class/net/%s", ifname);
> +		if (mlx5_get_pci_addr(sysfs_if_path, &if_pci_addr))
> +			continue;
> +		if (pci_addr->domain != if_pci_addr.domain ||
> +		    pci_addr->bus != if_pci_addr.bus ||
> +		    pci_addr->devid != if_pci_addr.devid ||
> +		    pci_addr->function != if_pci_addr.function)
> +			continue;
> +		MKSTR(sysfs_mpesw_path,
> +		      "/sys/class/net/%s/compat/devlink/lag_port_select_mode", ifname);

There is a lot of DPDK code that reads sysfs, but EAL and each driver end up
coding their own way of handling this. It would be good to have common helpers in EAL.
  
Dariusz Sosnowski Oct. 31, 2023, 5:37 p.m. UTC | #2
Hi Stephen,

Thank you for your comment.

> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Tuesday, October 31, 2023 17:09
> To: Dariusz Sosnowski <dsosnowski@nvidia.com>
> Cc: Matan Azrad <matan@nvidia.com>; Slava Ovsiienko
> <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>; Suanming Mou
> <suanmingm@nvidia.com>; dev@dpdk.org; Raslan Darawsheh
> <rasland@nvidia.com>
> Subject: Re: [PATCH 4/8] net/mlx5: add sysfs check for Multiport E-Switch
> 
> On Tue, 31 Oct 2023 16:27:29 +0200
> Dariusz Sosnowski <dsosnowski@nvidia.com> wrote:
> 
> > +             MKSTR(sysfs_if_path, "/sys/class/net/%s", ifname);
> > +             if (mlx5_get_pci_addr(sysfs_if_path, &if_pci_addr))
> > +                     continue;
> > +             if (pci_addr->domain != if_pci_addr.domain ||
> > +                 pci_addr->bus != if_pci_addr.bus ||
> > +                 pci_addr->devid != if_pci_addr.devid ||
> > +                 pci_addr->function != if_pci_addr.function)
> > +                     continue;
> > +             MKSTR(sysfs_mpesw_path,
> > +                   "/sys/class/net/%s/compat/devlink/lag_port_select_mode", ifname);
> 
> There is a lot of DPDK code that reads sysfs, but EAL and each driver end up
> coding their own way of handling this. It would be good to have common
> helpers in EAL.
Agreed.

From a quick glance, I see that there are a few sysfs paths with which several drivers interact, e.g.:
- /sys/class/net
- /sys/bus/pci/devices
- /sys/devices
I think that introducing common sysfs utilities in DPDK (for example, some way of interacting with such common paths, or just of constructing sysfs paths) could be beneficial.
We can definitely look into it to see if it is viable.

Best regards,
Dariusz Sosnowski
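
As an illustration of the kind of common helper discussed in this thread, a path-construction utility for attributes under /sys/class/net could look roughly like the sketch below. This is not an existing EAL API: the function name sysfs_net_attr_path, its signature and its error codes are hypothetical and chosen only for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (no such function exists in EAL): build
 * "/sys/class/net/<ifname>/<attr>" into `buf`, which holds `len` bytes.
 * Returns 0 on success, -EINVAL for an unusable interface name or an
 * empty buffer, and -ENAMETOOLONG if the result does not fit.
 */
int
sysfs_net_attr_path(char *buf, size_t len, const char *ifname,
		    const char *attr)
{
	int n;

	assert(buf != NULL && ifname != NULL && attr != NULL);
	if (len == 0)
		return -EINVAL;
	/* An interface name is a single, non-empty path component. */
	if (ifname[0] == '\0' || strchr(ifname, '/') != NULL)
		return -EINVAL;
	n = snprintf(buf, len, "/sys/class/net/%s/%s", ifname, attr);
	if (n < 0)
		return -EIO;
	/* snprintf reports the untruncated length, so detect overflow. */
	if ((size_t)n >= len)
		return -ENAMETOOLONG;
	return 0;
}
```

With such a helper, the attribute probed by this patch would be requested as "compat/devlink/lag_port_select_mode", and truncation would be reported to the caller instead of silently producing a shortened path.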
  

Patch

diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index 2f08f2354e..7a656a7237 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -1931,6 +1931,75 @@  mlx5_device_bond_pci_match(const char *ibdev_name,
 	return pf;
 }
 
+#define SYSFS_MPESW_PARAM_MAX_LEN 16
+
+static __rte_unused int
+mlx5_sysfs_esw_multiport_get(struct ibv_device *ibv, struct rte_pci_addr *pci_addr, int *enabled)
+{
+	int nl_rdma;
+	unsigned int n_ports;
+	unsigned int i;
+	int ret;
+
+	/* Provide correct value to have defined enabled state in case of an error. */
+	*enabled = 0;
+	nl_rdma = mlx5_nl_init(NETLINK_RDMA, 0);
+	if (nl_rdma < 0)
+		return nl_rdma;
+	n_ports = mlx5_nl_portnum(nl_rdma, ibv->name);
+	if (!n_ports) {
+		ret = -rte_errno;
+		goto close_nl_rdma;
+	}
+	for (i = 1; i <= n_ports; ++i) {
+		unsigned int ifindex;
+		char ifname[IF_NAMESIZE + 1];
+		struct rte_pci_addr if_pci_addr;
+		char mpesw[SYSFS_MPESW_PARAM_MAX_LEN + 1];
+		FILE *sysfs;
+		int n;
+
+		ifindex = mlx5_nl_ifindex(nl_rdma, ibv->name, i);
+		if (!ifindex)
+			continue;
+		if (!if_indextoname(ifindex, ifname))
+			continue;
+		MKSTR(sysfs_if_path, "/sys/class/net/%s", ifname);
+		if (mlx5_get_pci_addr(sysfs_if_path, &if_pci_addr))
+			continue;
+		if (pci_addr->domain != if_pci_addr.domain ||
+		    pci_addr->bus != if_pci_addr.bus ||
+		    pci_addr->devid != if_pci_addr.devid ||
+		    pci_addr->function != if_pci_addr.function)
+			continue;
+		MKSTR(sysfs_mpesw_path,
+		      "/sys/class/net/%s/compat/devlink/lag_port_select_mode", ifname);
+		sysfs = fopen(sysfs_mpesw_path, "r");
+		if (!sysfs)
+			continue;
+		n = fscanf(sysfs, "%" RTE_STR(SYSFS_MPESW_PARAM_MAX_LEN) "s", mpesw);
+		fclose(sysfs);
+		if (n != 1)
+			continue;
+		ret = 0;
+		if (strcmp(mpesw, "multiport_esw") == 0) {
+			*enabled = 1;
+			break;
+		}
+		*enabled = 0;
+		break;
+	}
+	if (i > n_ports) {
+		DRV_LOG(DEBUG, "Unable to get Multiport E-Switch state by sysfs.");
+		rte_errno = ENOENT;
+		ret = -rte_errno;
+	}
+
+close_nl_rdma:
+	close(nl_rdma);
+	return ret;
+}
+
 /**
  * Register a PCI device within bonding.
  *