[v2] net/af_xdp: enable a sock path alongside use_cni

Message ID 20231204103101.2124374-1-mtahhan@redhat.com (mailing list archive)
State Superseded, archived
Delegated to: Ferruh Yigit
Series: [v2] net/af_xdp: enable a sock path alongside use_cni

Checks

Context                       Check    Description
ci/checkpatch                 warning  coding style issues
ci/loongarch-compilation      success  Compilation OK
ci/loongarch-unit-testing     success  Unit Testing PASS
ci/github-robot: build        success  github build: passed
ci/Intel-compilation          success  Compilation OK
ci/intel-Testing              success  Testing PASS
ci/intel-Functional           success  Functional PASS
ci/iol-unit-arm64-testing     success  Testing PASS
ci/iol-compile-amd64-testing  success  Testing PASS
ci/iol-intel-Performance      success  Performance Testing PASS
ci/iol-mellanox-Performance   success  Performance Testing PASS
ci/iol-unit-amd64-testing     success  Testing PASS
ci/iol-intel-Functional       success  Functional Testing PASS
ci/iol-compile-arm64-testing  success  Testing PASS
ci/iol-broadcom-Functional    success  Functional Testing PASS
ci/iol-sample-apps-testing    success  Testing PASS
ci/iol-broadcom-Performance   success  Performance Testing PASS

Commit Message

Maryam Tahhan Dec. 4, 2023, 10:31 a.m. UTC
With the original 'use_cni' implementation (which uses a
hardcoded socket path rather than a configurable one),
if a single pod requests multiple net devices
and these devices are from different pools, then
the container attempts to mount all the netdev UDSes
in the pod as /tmp/afxdp.sock. This means that at best
only one netdev will handshake correctly with the AF_XDP
DP. This patch addresses this by making the socket
path configurable alongside the 'use_cni' param.
Tested with the AF_XDP DP CNI PR 81.

v2:
* Rename sock_path to uds_path.
* Update documentation to reflect when CAP_BPF is needed.
* Fix testpmd arguments in the provided example for Pods.
* Use AF_XDP API to update the xskmap entry.

Signed-off-by: Maryam Tahhan <mtahhan@redhat.com>
---
 doc/guides/howto/af_xdp_cni.rst     | 24 ++++++-----
 drivers/net/af_xdp/rte_eth_af_xdp.c | 62 ++++++++++++++++++-----------
 2 files changed, 54 insertions(+), 32 deletions(-)
  

Comments

Koikkara Reeny, Shibin Dec. 4, 2023, 5:18 p.m. UTC | #1
Hi Maryam,

Apologies for asking this question a bit late.
The UDS socket name will always be afxdp.sock, and an additional directory is created between the socket name and the UDS base path (/tmp/afxdp_dp/<interface name>/afxdp.sock).

As per the command "--vdev net_af_xdp0,iface=<interface name>,use_cni=1,uds_path=/tmp/afxdp_dp/<interface name>/afxdp.sock",
we are already passing the interface name (iface=<interface name>). So can't we build the uds_path inside the program, i.e. uds_path = "/tmp/afxdp_dp/" + iface + "/afxdp.sock"?


If you check constants.go in the afxdp-plugins-for-kubernetes code [1], they still define these constants and use them to build the path [2].

[1] https://github.com/intel/afxdp-plugins-for-kubernetes/blob/main/constants/constants.go#L84 
[2] https://github.com/intel/afxdp-plugins-for-kubernetes/blob/main/internal/deviceplugin/poolManager_test.go#L78

If we can build the path in the program, then the user won't have to pass it as an additional argument.
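
For illustration only, the construction suggested above could be sketched roughly as follows. The helper name build_uds_path and the two constants are hypothetical; they mirror the path layout quoted in this mail and are not defined in the patch under review or in DPDK.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical constants mirroring the path layout quoted in this mail;
 * they are not defined in the patch under review. */
#define UDS_BASE_DIR  "/tmp/afxdp_dp"
#define UDS_SOCK_NAME "afxdp.sock"

/*
 * Build "<UDS_BASE_DIR>/<iface>/<UDS_SOCK_NAME>" into buf.
 * Returns 0 on success, -1 on bad input or if the result does not fit.
 */
static int
build_uds_path(char *buf, size_t len, const char *iface)
{
	int n;

	if (buf == NULL || len == 0 || iface == NULL || iface[0] == '\0')
		return -1;

	n = snprintf(buf, len, "%s/%s/%s", UDS_BASE_DIR, iface, UDS_SOCK_NAME);
	if (n < 0 || (size_t)n >= len)
		return -1; /* encoding error or truncated output */

	return 0;
}
```

The trade-off is that the base directory and socket name become compile-time assumptions inside the PMD.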

Regards,
Shibin

> -----Original Message-----
> From: Maryam Tahhan <mtahhan@redhat.com>
> Sent: Monday, December 4, 2023 10:31 AM
> To: ferruh.yigit@amd.com; stephen@networkplumber.org;
> lihuisong@huawei.com; fengchengwen@huawei.com;
> liuyonglong@huawei.com; Koikkara Reeny, Shibin
> <shibin.koikkara.reeny@intel.com>
> Cc: dev@dpdk.org; Tahhan, Maryam <mtahhan@redhat.com>
> Subject: [v2] net/af_xdp: enable a sock path alongside use_cni
> 
> With the original 'use_cni' implementation, (using a hardcoded socket rather
> than a configurable one), if a single pod is requesting multiple net devices
> and these devices are from different pools, then the container attempts to
> mount all the netdev UDSes in the pod as /tmp/afxdp.sock. Which means
> that at best only 1 netdev will handshake correctly with the AF_XDP DP. This
> patch addresses this by making the socket parameter configurable alongside
> the 'use_cni' param.
> Tested with the AF_XDP DP CNI PR 81.
> 
> v2:
> * Rename sock_path to uds_path.
> * Update documentation to reflect when CAP_BPF is needed.
> * Fix testpmd arguments in the provided example for Pods.
> * Use AF_XDP API to update the xskmap entry.
> 
> Signed-off-by: Maryam Tahhan <mtahhan@redhat.com>
> ---
>  doc/guides/howto/af_xdp_cni.rst     | 24 ++++++-----
>  drivers/net/af_xdp/rte_eth_af_xdp.c | 62 ++++++++++++++++++-----------
>  2 files changed, 54 insertions(+), 32 deletions(-)
> 
> diff --git a/doc/guides/howto/af_xdp_cni.rst
> b/doc/guides/howto/af_xdp_cni.rst index a1a6d5b99c..7829526b40 100644
> --- a/doc/guides/howto/af_xdp_cni.rst
> +++ b/doc/guides/howto/af_xdp_cni.rst
> @@ -38,9 +38,10 @@ The XSKMAP is a BPF map of AF_XDP sockets (XSK).
>  The client can then proceed with creating an AF_XDP socket  and inserting
> that socket into the XSKMAP pointed to by the descriptor.
> 
> -The EAL vdev argument ``use_cni`` is used to indicate that the user wishes -
> to run the PMD in unprivileged mode and to receive the XSKMAP file
> descriptor -from the CNI.
> +The EAL vdev arguments ``use_cni`` and ``uds_path`` are used to
> +indicate that the user wishes to run the PMD in unprivileged mode and
> +to receive the XSKMAP file descriptor from the CNI.
> +
>  When this flag is set,
>  the ``XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD`` libbpf flag  should be
> used when creating the socket @@ -49,7 +50,7 @@ Instead the loading is
> handled by the CNI.
> 
>  .. note::
> 
> -   The Unix Domain Socket file path appear in the end user is
> "/tmp/afxdp.sock".
> +   The Unix Domain Socket file path appears to the end user at
> "/tmp/afxdp_dp/<netdev>/afxdp.sock".
> 
> 
>  Prerequisites
> @@ -223,8 +224,7 @@ Howto run dpdk-testpmd with CNI plugin:
>           securityContext:
>            capabilities:
>               add:
> -               - CAP_NET_RAW
> -               - CAP_BPF
> +               - NET_RAW

Section 1.3 (Prerequisites) needs to be updated to match.


>           resources:
>             requests:
>               hugepages-2Mi: 2Gi
> @@ -239,14 +239,20 @@ Howto run dpdk-testpmd with CNI plugin:
> 
>    .. _pod.yaml: https://github.com/intel/afxdp-plugins-for-
> kubernetes/blob/v0.0.2/test/e2e/pod-1c1d.yaml
> 
> +.. note::
> +
> +   For Kernel versions older than 5.19 `CAP_BPF` is also required in
> +   the container capabilities stanza.
> +
>  * Run DPDK with a command like the following:
> 
>    .. code-block:: console
> 
>       kubectl exec -i <Pod name> --container <containers name> -- \
> -           /<Path>/dpdk-testpmd -l 0,1 --no-pci \
> -           --vdev=net_af_xdp0,use_cni=1,iface=<interface name> \
> -           -- --no-mlockall --in-memory
> +           /<Path>/dpdk-testpmd -l 0-2 --no-pci --main-lcore=2 \
> +           --vdev net_af_xdp0,iface=<interface
> name>,use_cni=1,uds_path=/tmp/afxdp_dp/<interface name>/afxdp.sock
> \
> +           --vdev net_af_xdp1,iface=e<interface
> name>,use_cni=1,uds_path=/tmp/afxdp_dp/<interface name>/afxdp.sock

There is a typo here: "iface=e<interface" should be "iface=<interface".

> \
> +           -- -i --a --nb-cores=2 --rxq=1 --txq=1
> + --forward-mode=macswap;
> 
>  For further reference please use the `e2e`_ test case in `AF_XDP Plugin for
> Kubernetes`_
> 
> diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c
> b/drivers/net/af_xdp/rte_eth_af_xdp.c
> index 353c8688ec..505ed6cf1e 100644
> --- a/drivers/net/af_xdp/rte_eth_af_xdp.c
> +++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
> @@ -88,7 +88,6 @@ RTE_LOG_REGISTER_DEFAULT(af_xdp_logtype,
> NOTICE);
>  #define UDS_MAX_CMD_LEN			64
>  #define UDS_MAX_CMD_RESP		128
>  #define UDS_XSK_MAP_FD_MSG		"/xsk_map_fd"
> -#define UDS_SOCK			"/tmp/afxdp.sock"
>  #define UDS_CONNECT_MSG			"/connect"
>  #define UDS_HOST_OK_MSG			"/host_ok"
>  #define UDS_HOST_NAK_MSG		"/host_nak"
> @@ -171,6 +170,7 @@ struct pmd_internals {
>  	bool custom_prog_configured;
>  	bool force_copy;
>  	bool use_cni;
> +	char uds_path[PATH_MAX];
>  	struct bpf_map *map;
> 
>  	struct rte_ether_addr eth_addr;
> @@ -191,6 +191,7 @@ struct pmd_process_private {
>  #define ETH_AF_XDP_BUDGET_ARG			"busy_budget"
>  #define ETH_AF_XDP_FORCE_COPY_ARG		"force_copy"
>  #define ETH_AF_XDP_USE_CNI_ARG			"use_cni"
> +#define ETH_AF_XDP_USE_CNI_UDS_PATH_ARG	"uds_path"
> 
>  static const char * const valid_arguments[] = {
>  	ETH_AF_XDP_IFACE_ARG,
> @@ -201,6 +202,7 @@ static const char * const valid_arguments[] = {
>  	ETH_AF_XDP_BUDGET_ARG,
>  	ETH_AF_XDP_FORCE_COPY_ARG,
>  	ETH_AF_XDP_USE_CNI_ARG,
> +	ETH_AF_XDP_USE_CNI_UDS_PATH_ARG,
>  	NULL
>  };
> 
> @@ -1351,7 +1353,7 @@ configure_preferred_busy_poll(struct
> pkt_rx_queue *rxq)  }
> 
>  static int
> -init_uds_sock(struct sockaddr_un *server)
> +init_uds_sock(struct sockaddr_un *server, const char *uds_path)
>  {
>  	int sock;
> 
> @@ -1362,7 +1364,7 @@ init_uds_sock(struct sockaddr_un *server)
>  	}
> 
>  	server->sun_family = AF_UNIX;
> -	strlcpy(server->sun_path, UDS_SOCK, sizeof(server->sun_path));
> +	strlcpy(server->sun_path, uds_path, sizeof(server->sun_path));
> 
>  	if (connect(sock, (struct sockaddr *)server, sizeof(struct
> sockaddr_un)) < 0) {
>  		close(sock);
> @@ -1382,7 +1384,7 @@ struct msg_internal {  };
> 
>  static int
> -send_msg(int sock, char *request, int *fd)
> +send_msg(int sock, char *request, int *fd, const char *uds_path)
>  {
>  	int snd;
>  	struct iovec iov;
> @@ -1393,7 +1395,7 @@ send_msg(int sock, char *request, int *fd)
> 
>  	memset(&dst, 0, sizeof(dst));
>  	dst.sun_family = AF_UNIX;
> -	strlcpy(dst.sun_path, UDS_SOCK, sizeof(dst.sun_path));
> +	strlcpy(dst.sun_path, uds_path, sizeof(dst.sun_path));
> 
>  	/* Initialize message header structure */
>  	memset(&msgh, 0, sizeof(msgh));
> @@ -1471,7 +1473,7 @@ read_msg(int sock, char *response, struct
> sockaddr_un *s, int *fd)
> 
>  static int
>  make_request_cni(int sock, struct sockaddr_un *server, char *request,
> -		 int *req_fd, char *response, int *out_fd)
> +		 int *req_fd, char *response, int *out_fd, const char
> *uds_path)
>  {
>  	int rval;
> 
> @@ -1483,7 +1485,7 @@ make_request_cni(int sock, struct sockaddr_un
> *server, char *request,
>  	if (req_fd == NULL)
>  		rval = write(sock, request, strlen(request));
>  	else
> -		rval = send_msg(sock, request, req_fd);
> +		rval = send_msg(sock, request, req_fd, uds_path);
> 
>  	if (rval < 0) {
>  		AF_XDP_LOG(ERR, "Write error %s\n", strerror(errno)); @@
> -1507,7 +1509,7 @@ check_response(char *response, char *exp_resp, long
> size)  }
> 
>  static int
> -get_cni_fd(char *if_name)
> +get_cni_fd(char *if_name, const char *uds_path)
>  {
>  	char request[UDS_MAX_CMD_LEN],
> response[UDS_MAX_CMD_RESP];
>  	char hostname[MAX_LONG_OPT_SZ],
> exp_resp[UDS_MAX_CMD_RESP]; @@ -1520,14 +1522,14 @@
> get_cni_fd(char *if_name)
>  		return -1;
> 
>  	memset(&server, 0, sizeof(server));
> -	sock = init_uds_sock(&server);
> +	sock = init_uds_sock(&server, uds_path);
>  	if (sock < 0)
>  		return -1;
> 
>  	/* Initiates handshake to CNI send: /connect,hostname */
>  	snprintf(request, sizeof(request), "%s,%s", UDS_CONNECT_MSG,
> hostname);
>  	memset(response, 0, sizeof(response));
> -	if (make_request_cni(sock, &server, request, NULL, response,
> &out_fd) < 0) {
> +	if (make_request_cni(sock, &server, request, NULL, response,
> &out_fd,
> +uds_path) < 0) {
>  		AF_XDP_LOG(ERR, "Error in processing cmd [%s]\n",
> request);
>  		goto err_close;
>  	}
> @@ -1541,7 +1543,7 @@ get_cni_fd(char *if_name)
>  	/* Request for "/version" */
>  	strlcpy(request, UDS_VERSION_MSG, UDS_MAX_CMD_LEN);
>  	memset(response, 0, sizeof(response));
> -	if (make_request_cni(sock, &server, request, NULL, response,
> &out_fd) < 0) {
> +	if (make_request_cni(sock, &server, request, NULL, response,
> &out_fd,
> +uds_path) < 0) {
>  		AF_XDP_LOG(ERR, "Error in processing cmd [%s]\n",
> request);
>  		goto err_close;
>  	}
> @@ -1549,7 +1551,7 @@ get_cni_fd(char *if_name)
>  	/* Request for file descriptor for netdev name*/
>  	snprintf(request, sizeof(request), "%s,%s",
> UDS_XSK_MAP_FD_MSG, if_name);
>  	memset(response, 0, sizeof(response));
> -	if (make_request_cni(sock, &server, request, NULL, response,
> &out_fd) < 0) {
> +	if (make_request_cni(sock, &server, request, NULL, response,
> &out_fd,
> +uds_path) < 0) {
>  		AF_XDP_LOG(ERR, "Error in processing cmd [%s]\n",
> request);
>  		goto err_close;
>  	}
> @@ -1571,7 +1573,7 @@ get_cni_fd(char *if_name)
>  	/* Initiate close connection */
>  	strlcpy(request, UDS_FIN_MSG, UDS_MAX_CMD_LEN);
>  	memset(response, 0, sizeof(response));
> -	if (make_request_cni(sock, &server, request, NULL, response,
> &out_fd) < 0) {
> +	if (make_request_cni(sock, &server, request, NULL, response,
> &out_fd,
> +uds_path) < 0) {
>  		AF_XDP_LOG(ERR, "Error in processing cmd [%s]\n",
> request);
>  		goto err_close;
>  	}
> @@ -1695,17 +1697,16 @@ xsk_configure(struct pmd_internals *internals,
> struct pkt_rx_queue *rxq,
>  	}
> 
>  	if (internals->use_cni) {
> -		int err, fd, map_fd;
> +		int err, map_fd;
> 
>  		/* get socket fd from CNI plugin */
> -		map_fd = get_cni_fd(internals->if_name);
> +		map_fd = get_cni_fd(internals->if_name, internals-
> >uds_path);
>  		if (map_fd < 0) {
>  			AF_XDP_LOG(ERR, "Failed to receive CNI plugin
> fd\n");
>  			goto out_xsk;
>  		}
> -		/* get socket fd */
> -		fd = xsk_socket__fd(rxq->xsk);
> -		err = bpf_map_update_elem(map_fd, &rxq-
> >xsk_queue_idx, &fd, 0);
> +
> +		err = xsk_socket__update_xskmap(rxq->xsk, map_fd);
>  		if (err) {
>  			AF_XDP_LOG(ERR, "Failed to insert unprivileged xsk
> in map.\n");
>  			goto out_xsk;
> @@ -2023,7 +2024,8 @@ xdp_get_channels_info(const char *if_name, int
> *max_queues,  static int  parse_parameters(struct rte_kvargs *kvlist, char
> *if_name, int *start_queue,
>  		 int *queue_cnt, int *shared_umem, char *prog_path,
> -		 int *busy_budget, int *force_copy, int *use_cni)
> +		 int *busy_budget, int *force_copy, int *use_cni,
> +		 char *uds_path)
>  {
>  	int ret;
> 
> @@ -2069,6 +2071,11 @@ parse_parameters(struct rte_kvargs *kvlist, char
> *if_name, int *start_queue,
>  	if (ret < 0)
>  		goto free_kvlist;
> 
> +	ret = rte_kvargs_process(kvlist,
> ETH_AF_XDP_USE_CNI_UDS_PATH_ARG,
> +				 &parse_prog_arg, uds_path);
> +	if (ret < 0)
> +		goto free_kvlist;
> +
>  free_kvlist:
>  	rte_kvargs_free(kvlist);
>  	return ret;
> @@ -2108,7 +2115,7 @@ static struct rte_eth_dev *  init_internals(struct
> rte_vdev_device *dev, const char *if_name,
>  	       int start_queue_idx, int queue_cnt, int shared_umem,
>  	       const char *prog_path, int busy_budget, int force_copy,
> -	       int use_cni)
> +		   int use_cni, const char *uds_path)
>  {
>  	const char *name = rte_vdev_device_name(dev);
>  	const unsigned int numa_node = dev->device.numa_node; @@ -
> 2138,6 +2145,7 @@ init_internals(struct rte_vdev_device *dev, const char
> *if_name,
>  	internals->shared_umem = shared_umem;
>  	internals->force_copy = force_copy;
>  	internals->use_cni = use_cni;
> +	strlcpy(internals->uds_path, uds_path, PATH_MAX);
> 
>  	if (xdp_get_channels_info(if_name, &internals->max_queue_cnt,
>  				  &internals->combined_queue_cnt)) { @@ -
> 2328,6 +2336,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev)
>  	int busy_budget = -1, ret;
>  	int force_copy = 0;
>  	int use_cni = 0;
> +	char uds_path[PATH_MAX] = {'\0'};
>  	struct rte_eth_dev *eth_dev = NULL;
>  	const char *name = rte_vdev_device_name(dev);
> 
> @@ -2370,7 +2379,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device
> *dev)
> 
>  	if (parse_parameters(kvlist, if_name, &xsk_start_queue_idx,
>  			     &xsk_queue_cnt, &shared_umem, prog_path,
> -			     &busy_budget, &force_copy, &use_cni) < 0) {
> +				 &busy_budget, &force_copy, &use_cni,
> uds_path) < 0) {
>  		AF_XDP_LOG(ERR, "Invalid kvargs value\n");
>  		return -EINVAL;
>  	}
> @@ -2387,6 +2396,12 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device
> *dev)
>  			return -EINVAL;
>  	}
> 
> +	if (use_cni && !strnlen(uds_path, PATH_MAX)) {
> +		AF_XDP_LOG(ERR, "When '%s' parameter is used, '%s' must
> also be provided\n",
> +			ETH_AF_XDP_USE_CNI_ARG,
> ETH_AF_XDP_USE_CNI_UDS_PATH_ARG);
> +			return -EINVAL;
> +	}
> +
>  	if (strlen(if_name) == 0) {
>  		AF_XDP_LOG(ERR, "Network interface must be
> specified\n");
>  		return -EINVAL;
> @@ -2410,7 +2425,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device
> *dev)
> 
>  	eth_dev = init_internals(dev, if_name, xsk_start_queue_idx,
>  				 xsk_queue_cnt, shared_umem, prog_path,
> -				 busy_budget, force_copy, use_cni);
> +				 busy_budget, force_copy, use_cni,
> uds_path);
>  	if (eth_dev == NULL) {
>  		AF_XDP_LOG(ERR, "Failed to init internals\n");
>  		return -1;
> @@ -2471,4 +2486,5 @@
> RTE_PMD_REGISTER_PARAM_STRING(net_af_xdp,
>  			      "xdp_prog=<string> "
>  			      "busy_budget=<int> "
>  			      "force_copy=<int> "
> -			      "use_cni=<int> ");
> +			      "use_cni=<int> "
> +			      "uds_path=<string> ");
> --
> 2.41.0
  
Maryam Tahhan Dec. 4, 2023, 6:41 p.m. UTC | #2
Hi Shibin

I'm not really sure what you are suggesting. Is it to make an assumption 
about the directory where the socket resides (i.e. hard-code it) and then 
try to build the full UDS path in DPDK?

Yes, the plugin is using constants at the moment for certain parts of the 
UDS path, but that's not to say that they won't become configurable 
later on. Someone may not want to use "/tmp/afxdp_dp/" as the base 
directory. Then we'd have to change DPDK's implementation again. These 
are not really things that are configured by hand; they are typically 
generated by initialization scripts. I would rather build this with the 
idea that things can change in the future without having to change the 
DPDK implementation again.
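
As a rough sketch of that idea, an initialization tool could generate each --vdev value and treat the base directory as an input rather than a fixed constant. The helper format_vdev_arg below is hypothetical (it is not part of DPDK or the plugin); only the argument layout is taken from the testpmd example in the patch.

```c
#include <stdio.h>

/*
 * Hypothetical launcher-side helper (not part of DPDK or the plugin):
 * format one --vdev value for the AF_XDP PMD. The UDS base directory
 * is passed in by the caller instead of being assumed.
 * Returns 0 on success, -1 on bad input or if the result does not fit.
 */
static int
format_vdev_arg(char *buf, size_t len, unsigned int idx,
		const char *iface, const char *uds_base_dir)
{
	int n;

	if (buf == NULL || len == 0 || iface == NULL || uds_base_dir == NULL)
		return -1;

	n = snprintf(buf, len,
		     "net_af_xdp%u,iface=%s,use_cni=1,uds_path=%s/%s/afxdp.sock",
		     idx, iface, uds_base_dir, iface);
	if (n < 0 || (size_t)n >= len)
		return -1; /* encoding error or truncated output */

	return 0;
}
```

With this split, a change of base directory on the plugin side only affects the tool that generates the arguments.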
BR
Maryam

  
Koikkara Reeny, Shibin Dec. 5, 2023, 10:29 a.m. UTC | #3
Hi Maryam,

Apologies for the confusion.

Judging by the afxdp-plugins-for-kubernetes code, udsPodPath (or udsSockDir [1]) and udsPodSock [1] are constants. Only the interface name changes, and we already pass the interface name on the command line. So my suggestion was to add logic that builds the socket path from those constants and the interface name.

Please correct me if I am wrong, but isn't that what afxdp-plugins-for-kubernetes itself does? [2]
This is only a suggestion.

[1] https://github.com/intel/afxdp-plugins-for-kubernetes/blob/main/constants/constants.go#L84
[2] https://github.com/intel/afxdp-plugins-for-kubernetes/blob/main/internal/deviceplugin/poolManager_test.go#L99
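
To make the suggestion concrete, here is a minimal sketch of how such a path could be built, assuming the base directory and socket name stay fixed at the values discussed in this thread. The macro names and the helper example_build_uds_path() are hypothetical and exist neither in DPDK nor in the plugin; this is an illustration, not a proposed implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical constants mirroring the values discussed in this thread;
 * they are not defined by DPDK. */
#define EXAMPLE_UDS_BASE_DIR  "/tmp/afxdp_dp"
#define EXAMPLE_UDS_SOCK_NAME "afxdp.sock"

/*
 * Build "<base dir>/<iface>/<sock name>" into out (capacity out_len).
 * Returns 0 on success, -1 if iface is empty or the result would not fit.
 */
int
example_build_uds_path(char *out, size_t out_len, const char *iface)
{
	int n;

	assert(out != NULL && iface != NULL);

	if (out_len == 0 || iface[0] == '\0')
		return -1;

	n = snprintf(out, out_len, "%s/%s/%s",
		     EXAMPLE_UDS_BASE_DIR, iface, EXAMPLE_UDS_SOCK_NAME);
	if (n < 0 || (size_t)n >= out_len)
		return -1; /* encoding error or truncated path */

	return 0;
}
```

The sketch only works for as long as the base directory and socket name really are constants on the plugin side; if either becomes configurable, the path has to come from the caller instead.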

I also added two small inline comments to the code earlier.


Regards,
Shibin
From: Maryam Tahhan <mtahhan@redhat.com>
Sent: Monday, December 4, 2023 6:41 PM
To: Koikkara Reeny, Shibin <shibin.koikkara.reeny@intel.com>; ferruh.yigit@amd.com; stephen@networkplumber.org; lihuisong@huawei.com; fengchengwen@huawei.com; liuyonglong@huawei.com
Cc: dev@dpdk.org
Subject: Re: [v2] net/af_xdp: enable a sock path alongside use_cni

Hi Shibin

I'm not really sure what you are suggesting. Is it to make an assumption about the directory where the socket resides (i.e. hard-code it) and then try to build the full UDS path in DPDK?

Yes, the plugin is using constants ATM for certain parts of the UDS path, but that's not to say that they won't become configurable later on. Someone may not want to use "/tmp/afxdp_dp/" as the base directory. Then we'd have to change DPDK's implementation again. These are not really things that are configured by hand; they are typically generated by initialization scripts. I would rather build this with the idea that things can change in the future without having to change the DPDK implementation again.
BR
Maryam

On 04/12/2023 17:18, Koikkara Reeny, Shibin wrote:

Hi Maryam,

Apologies for asking this question a bit late.

The UDS sock name will always be afxdp.sock, and an additional directory is created between the UDS base path and the sock name (/tmp/afxdp_dp/<interface name>/afxdp.sock).

As per the command " --vdev net_af_xdp0,iface=<interface name>,use_cni=1,uds_path=/tmp/afxdp_dp/<interface name>/afxdp.sock"

We are already passing the interface name (iface=<interface name>). So can't we create the uds_path inside the program, as uds_path = "/tmp/afxdp_dp/" + iface + "/afxdp.sock"?

If you check the afxdp-plugins-for-kubernetes code, constants.go [1] still has the constants, and they are used to create the path [2].

[1] https://github.com/intel/afxdp-plugins-for-kubernetes/blob/main/constants/constants.go#L84
[2] https://github.com/intel/afxdp-plugins-for-kubernetes/blob/main/internal/deviceplugin/poolManager_test.go#L78

If we can create the path in the program, the user won't have to pass an additional argument value.

Regards,
Shibin



  
Maryam Tahhan Dec. 5, 2023, 11:28 a.m. UTC | #4
Hi Shibin

As I've already explained in my previous email, they are constant ATM, 
however they will become configurable. I am implementing the Operator 
and it will make a lot of these "fixed" params configurable. My 
recommendation is not to try to generate the path in DPDK - as it's 
likely to be different in different k8s environments.

As I've also mentioned, the current patch means that I don't need to 
come back in 2 months and update DPDK to support n paths for the UDS 
(aka future-proofing).

Additionally - this is a side discussion as far as this patch goes. *The 
point of this patch is to fix the broken UDS behavior* and it has been 
tested in a full deployment scenario. @Shibin if you strongly feel that 
there's a better approach, then please go ahead, implement it, test it 
in a full deployment scenario and push it for review.

In general, allowing the AF_XDP params to be configurable rather than 
fixing/hardcoding anything in DPDK decouples the AF_XDP DP from DPDK so 
we don't have to keep coming back to make changes.
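
One practical consequence of accepting an arbitrary path: it still has to fit in sockaddr_un.sun_path, which is far smaller than PATH_MAX (108 bytes on Linux), and strlcpy() truncates silently when it does not. Below is a minimal sketch of an explicit length check, assuming the caller wants to reject over-long paths rather than connect to a truncated one. The helper example_fill_uds_addr() is hypothetical and not part of this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/*
 * Hypothetical helper, not part of the patch: copy a user-supplied UDS
 * path into addr, rejecting empty paths and paths that would be truncated.
 * Returns 0 on success, -1 otherwise.
 */
int
example_fill_uds_addr(struct sockaddr_un *addr, const char *uds_path)
{
	size_t len;

	assert(addr != NULL && uds_path != NULL);

	len = strlen(uds_path);
	if (len == 0 || len >= sizeof(addr->sun_path))
		return -1; /* empty, or no room for the terminating NUL */

	memset(addr, 0, sizeof(*addr));
	addr->sun_family = AF_UNIX;
	memcpy(addr->sun_path, uds_path, len + 1); /* includes the NUL */
	return 0;
}
```

A check like this could run once at probe time, next to the existing test that uds_path is non-empty when use_cni is set, so that a bad path is reported before any socket is opened.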

BR
Maryam


On 05/12/2023 10:29, Koikkara Reeny, Shibin wrote:
>
> Hi Maryam,
>
> Apologies for the confusion.
>
> Judging by the afxdp-plugins-for-kubernetes code, udsPodPath (or
> udsSockDir [1]) and udsPodSock [1] are constants. Only the interface
> name changes, and we already pass the interface name on the command
> line. So my suggestion was to add logic that builds the socket path
> from those constants and the interface name.
>
> Please correct me if I am wrong, but isn't that what
> afxdp-plugins-for-kubernetes itself does? [2]
>
> This is only a suggestion.
>
> [1] 
> https://github.com/intel/afxdp-plugins-for-kubernetes/blob/main/constants/constants.go#L84
>
> [2] 
> https://github.com/intel/afxdp-plugins-for-kubernetes/blob/main/internal/deviceplugin/poolManager_test.go#L99
>
> I also added two small inline comments to the code earlier.
>
> Regards,
>
> Shibin
>
> *From:*Maryam Tahhan <mtahhan@redhat.com>
> *Sent:* Monday, December 4, 2023 6:41 PM
> *To:* Koikkara Reeny, Shibin <shibin.koikkara.reeny@intel.com>; 
> ferruh.yigit@amd.com; stephen@networkplumber.org; 
> lihuisong@huawei.com; fengchengwen@huawei.com; liuyonglong@huawei.com
> *Cc:* dev@dpdk.org
> *Subject:* Re: [v2] net/af_xdp: enable a sock path alongside use_cni
>
> Hi Shibin
>
> I'm not really sure what you are suggesting. Is it to make an
> assumption about the directory where the socket resides (i.e. hard-code
> it) and then try to build the full UDS path in DPDK?
>
> Yes, the plugin is using constants ATM for certain parts of the UDS
> path, but that's not to say that they won't become configurable later
> on. Someone may not want to use "/tmp/afxdp_dp/" as the base
> directory. Then we'd have to change DPDK's implementation again. These
> are not really things that are configured by hand; they are typically
> generated by initialization scripts. I would rather build this with
> the idea that things can change in the future without having to
> change the DPDK implementation again.
>
> BR
> Maryam
>
> On 04/12/2023 17:18, Koikkara Reeny, Shibin wrote:
>
>     Hi Maryam,
>
>     Apologies for asking this question a bit late.
>
>     The UDS sock name will always be afxdp.sock, and an additional directory is created between the UDS base path and the sock name (/tmp/afxdp_dp/<interface name>/afxdp.sock).
>
>     As per the command " --vdev net_af_xdp0,iface=<interface name>,use_cni=1,uds_path=/tmp/afxdp_dp/<interface name>/afxdp.sock"
>
>     We are already passing the interface name (iface=<interface name>). So can't we create the uds_path inside the program, as uds_path = "/tmp/afxdp_dp/" + iface + "/afxdp.sock"?
>
>     If you check the afxdp-plugins-for-kubernetes code, constants.go [1] still has the constants, and they are used to create the path [2].
>
>     [1] https://github.com/intel/afxdp-plugins-for-kubernetes/blob/main/constants/constants.go#L84
>
>     [2] https://github.com/intel/afxdp-plugins-for-kubernetes/blob/main/internal/deviceplugin/poolManager_test.go#L78
>
>     If we can create the path in the program, the user won't have to pass an additional argument value.
>
>     Regards,
>
>     Shibin
>
>         -----Original Message-----
>
>         From: Maryam Tahhan <mtahhan@redhat.com>
>
>         Sent: Monday, December 4, 2023 10:31 AM
>
>         To: ferruh.yigit@amd.com; stephen@networkplumber.org;
>         lihuisong@huawei.com; fengchengwen@huawei.com;
>         liuyonglong@huawei.com; Koikkara Reeny, Shibin
>         <shibin.koikkara.reeny@intel.com>
>
>         Cc: dev@dpdk.org; Tahhan, Maryam <mtahhan@redhat.com>
>
>         Subject: [v2] net/af_xdp: enable a sock path alongside use_cni
>
>         With the original 'use_cni' implementation, (using a hardcoded socket rather
>
>         than a configurable one), if a single pod is requesting multiple net devices
>
>         and these devices are from different pools, then the container attempts to
>
>         mount all the netdev UDSes in the pod as /tmp/afxdp.sock. Which means
>
>         that at best only 1 netdev will handshake correctly with the AF_XDP DP. This
>
>         patch addresses this by making the socket parameter configurable alongside
>
>         the 'use_cni' param.
>
>         Tested with the AF_XDP DP CNI PR 81.
>
>         v2:
>
>         * Rename sock_path to uds_path.
>
>         * Update documentation to reflect when CAP_BPF is needed.
>
>         * Fix testpmd arguments in the provided example for Pods.
>
>         * Use AF_XDP API to update the xskmap entry.
>
>         Signed-off-by: Maryam Tahhan <mtahhan@redhat.com>
>
>         ---
>
>           doc/guides/howto/af_xdp_cni.rst     | 24 ++++++-----
>
>           drivers/net/af_xdp/rte_eth_af_xdp.c | 62 ++++++++++++++++++-----------
>
>           2 files changed, 54 insertions(+), 32 deletions(-)
>
>         diff --git a/doc/guides/howto/af_xdp_cni.rst
>
>         b/doc/guides/howto/af_xdp_cni.rst index a1a6d5b99c..7829526b40 100644
>
>         --- a/doc/guides/howto/af_xdp_cni.rst
>
>         +++ b/doc/guides/howto/af_xdp_cni.rst
>
>         @@ -38,9 +38,10 @@ The XSKMAP is a BPF map of AF_XDP sockets (XSK).
>
>           The client can then proceed with creating an AF_XDP socket  and inserting
>
>         that socket into the XSKMAP pointed to by the descriptor.
>
>         -The EAL vdev argument ``use_cni`` is used to indicate that the user wishes -
>
>         to run the PMD in unprivileged mode and to receive the XSKMAP file
>
>         descriptor -from the CNI.
>
>         +The EAL vdev arguments ``use_cni`` and ``uds_path`` are used to
>
>         +indicate that the user wishes to run the PMD in unprivileged mode and
>
>         +to receive the XSKMAP file descriptor from the CNI.
>
>         +
>
>           When this flag is set,
>
>           the ``XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD`` libbpf flag  should be
>
>         used when creating the socket @@ -49,7 +50,7 @@ Instead the loading is
>
>         handled by the CNI.
>
>           .. note::
>
>         -   The Unix Domain Socket file path appear in the end user is
>
>         "/tmp/afxdp.sock".
>
>         +   The Unix Domain Socket file path appears to the end user at
>
>         "/tmp/afxdp_dp/<netdev>/afxdp.sock".
>
>           Prerequisites
>
>         @@ -223,8 +224,7 @@ Howto run dpdk-testpmd with CNI plugin:
>
>                    securityContext:
>
>                     capabilities:
>
>                        add:
>
>         -               - CAP_NET_RAW
>
>         -               - CAP_BPF
>
>         +               - NET_RAW
>
>     Need to update the 1.3. Prerequisites.
>
>                    resources:
>
>                      requests:
>
>                        hugepages-2Mi: 2Gi
>
>         @@ -239,14 +239,20 @@ Howto run dpdk-testpmd with CNI plugin:
>
>             .. _pod.yaml:https://github.com/intel/afxdp-plugins-for-
>
>         kubernetes/blob/v0.0.2/test/e2e/pod-1c1d.yaml
>
>         +.. note::
>
>         +
>
>         +   For Kernel versions older than 5.19 `CAP_BPF` is also required in
>
>         +   the container capabilities stanza.
>
>         +
>
>           * Run DPDK with a command like the following:
>
>             .. code-block:: console
>
>                kubectl exec -i <Pod name> --container <containers name> -- \
>
>         -           /<Path>/dpdk-testpmd -l 0,1 --no-pci \
>
>         -           --vdev=net_af_xdp0,use_cni=1,iface=<interface name> \
>
>         -           -- --no-mlockall --in-memory
>
>         +           /<Path>/dpdk-testpmd -l 0-2 --no-pci --main-lcore=2 \
>
>         +           --vdev net_af_xdp0,iface=<interface
>
>         name>,use_cni=1,uds_path=/tmp/afxdp_dp/<interface name>/afxdp.sock
>
>         \
>
>         +           --vdev net_af_xdp1,iface=e<interface
>
>         name>,use_cni=1,uds_path=/tmp/afxdp_dp/<interface name>/afxdp.sock
>
>     There is a typo " iface=e<interface " == "iface=<interface"
>
>         \
>
>         +           -- -i --a --nb-cores=2 --rxq=1 --txq=1
>
>         + --forward-mode=macswap;
>
>           For further reference please use the `e2e`_ test case in `AF_XDP Plugin for
>
>         Kubernetes`_
>
>         diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c
>
>         b/drivers/net/af_xdp/rte_eth_af_xdp.c
>
>         index 353c8688ec..505ed6cf1e 100644
>
>         --- a/drivers/net/af_xdp/rte_eth_af_xdp.c
>
>         +++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
>
>         @@ -88,7 +88,6 @@ RTE_LOG_REGISTER_DEFAULT(af_xdp_logtype,
>
>         NOTICE);
>
>           #define UDS_MAX_CMD_LEN                 64
>
>           #define UDS_MAX_CMD_RESP        128
>
>           #define UDS_XSK_MAP_FD_MSG              "/xsk_map_fd"
>
>         -#define UDS_SOCK                "/tmp/afxdp.sock"
>
>           #define UDS_CONNECT_MSG                 "/connect"
>
>           #define UDS_HOST_OK_MSG                 "/host_ok"
>
>           #define UDS_HOST_NAK_MSG        "/host_nak"
>
>         @@ -171,6 +170,7 @@ struct pmd_internals {
>
>            bool custom_prog_configured;
>
>            bool force_copy;
>
>            bool use_cni;
>
>         + char uds_path[PATH_MAX];
>
>            struct bpf_map *map;
>
>            struct rte_ether_addr eth_addr;
>
>         @@ -191,6 +191,7 @@ struct pmd_process_private {
>
>           #define ETH_AF_XDP_BUDGET_ARG                  "busy_budget"
>
>           #define ETH_AF_XDP_FORCE_COPY_ARG              "force_copy"
>
>           #define ETH_AF_XDP_USE_CNI_ARG                 "use_cni"
>
>         +#define ETH_AF_XDP_USE_CNI_UDS_PATH_ARG "uds_path"
>
>           static const char * const valid_arguments[] = {
>
>            ETH_AF_XDP_IFACE_ARG,
>
>         @@ -201,6 +202,7 @@ static const char * const valid_arguments[] = {
>
>            ETH_AF_XDP_BUDGET_ARG,
>
>            ETH_AF_XDP_FORCE_COPY_ARG,
>
>            ETH_AF_XDP_USE_CNI_ARG,
>
>         + ETH_AF_XDP_USE_CNI_UDS_PATH_ARG,
>
>            NULL
>
>           };
>
>         @@ -1351,7 +1353,7 @@ configure_preferred_busy_poll(struct
>
>         pkt_rx_queue *rxq)  }
>
>           static int
>
>         -init_uds_sock(struct sockaddr_un *server)
>
>         +init_uds_sock(struct sockaddr_un *server, const char *uds_path)
>
>           {
>
>            int sock;
>
>         @@ -1362,7 +1364,7 @@ init_uds_sock(struct sockaddr_un *server)
>
>            }
>
>            server->sun_family = AF_UNIX;
>
>         - strlcpy(server->sun_path, UDS_SOCK, sizeof(server->sun_path));
>
>         + strlcpy(server->sun_path, uds_path, sizeof(server->sun_path));
>
>            if (connect(sock, (struct sockaddr *)server, sizeof(struct
>
>         sockaddr_un)) < 0) {
>
>                    close(sock);
>
>         @@ -1382,7 +1384,7 @@ struct msg_internal {  };
>
>           static int
>
>         -send_msg(int sock, char *request, int *fd)
>
>         +send_msg(int sock, char *request, int *fd, const char *uds_path)
>
>           {
>
>            int snd;
>
>            struct iovec iov;
>
>         @@ -1393,7 +1395,7 @@ send_msg(int sock, char *request, int *fd)
>
>            memset(&dst, 0, sizeof(dst));
>
>            dst.sun_family = AF_UNIX;
>
>         - strlcpy(dst.sun_path, UDS_SOCK, sizeof(dst.sun_path));
>
>         + strlcpy(dst.sun_path, uds_path, sizeof(dst.sun_path));
>
>            /* Initialize message header structure */
>
>            memset(&msgh, 0, sizeof(msgh));
>
>         @@ -1471,7 +1473,7 @@ read_msg(int sock, char *response, struct
>
>         sockaddr_un *s, int *fd)
>
>           static int
>
>           make_request_cni(int sock, struct sockaddr_un *server, char *request,
>
>         -          int *req_fd, char *response, int *out_fd)
>
>         +          int *req_fd, char *response, int *out_fd, const char
>
>         *uds_path)
>
>           {
>
>            int rval;
>
>         @@ -1483,7 +1485,7 @@ make_request_cni(int sock, struct sockaddr_un
>
>         *server, char *request,
>
>            if (req_fd == NULL)
>
>                    rval = write(sock, request, strlen(request));
>
>            else
>
>         -         rval = send_msg(sock, request, req_fd);
>
>         +         rval = send_msg(sock, request, req_fd, uds_path);
>
>            if (rval < 0) {
>
>                    AF_XDP_LOG(ERR, "Write error %s\n", strerror(errno)); @@
>
>         -1507,7 +1509,7 @@ check_response(char *response, char *exp_resp, long
>
>         size)  }
>
>           static int
>
>         -get_cni_fd(char *if_name)
>
>         +get_cni_fd(char *if_name, const char *uds_path)
>
>           {
>
>            char request[UDS_MAX_CMD_LEN],
>
>         response[UDS_MAX_CMD_RESP];
>
>            char hostname[MAX_LONG_OPT_SZ],
>
>         exp_resp[UDS_MAX_CMD_RESP]; @@ -1520,14 +1522,14 @@
>
>         get_cni_fd(char *if_name)
>
>                    return -1;
>
>            memset(&server, 0, sizeof(server));
>
>         - sock = init_uds_sock(&server);
>
>         + sock = init_uds_sock(&server, uds_path);
>
>            if (sock < 0)
>
>                    return -1;
>
>            /* Initiates handshake to CNI send: /connect,hostname */
>
>            snprintf(request, sizeof(request), "%s,%s", UDS_CONNECT_MSG,
>
>         hostname);
>
>            memset(response, 0, sizeof(response));
>
>         - if (make_request_cni(sock, &server, request, NULL, response,
>
>         &out_fd) < 0) {
>
>         + if (make_request_cni(sock, &server, request, NULL, response,
>
>         &out_fd,
>
>         +uds_path) < 0) {
>
>                    AF_XDP_LOG(ERR, "Error in processing cmd [%s]\n",
>
>         request);
>
>                    goto err_close;
>
>            }
>
>         @@ -1541,7 +1543,7 @@ get_cni_fd(char *if_name)
>
>            /* Request for "/version" */
>
>            strlcpy(request, UDS_VERSION_MSG, UDS_MAX_CMD_LEN);
>
>            memset(response, 0, sizeof(response));
>
>         - if (make_request_cni(sock, &server, request, NULL, response,
>
>         &out_fd) < 0) {
>
>         + if (make_request_cni(sock, &server, request, NULL, response,
>
>         &out_fd,
>
>         +uds_path) < 0) {
>
>                    AF_XDP_LOG(ERR, "Error in processing cmd [%s]\n",
>
>         request);
>
>                    goto err_close;
>
>            }
>
>         @@ -1549,7 +1551,7 @@ get_cni_fd(char *if_name)
>
>            /* Request for file descriptor for netdev name*/
>
>            snprintf(request, sizeof(request), "%s,%s",
>
>         UDS_XSK_MAP_FD_MSG, if_name);
>
>            memset(response, 0, sizeof(response));
>
>         - if (make_request_cni(sock, &server, request, NULL, response,
>
>         &out_fd) < 0) {
>
>         + if (make_request_cni(sock, &server, request, NULL, response,
>
>         &out_fd,
>
>         +uds_path) < 0) {
>
>                    AF_XDP_LOG(ERR, "Error in processing cmd [%s]\n",
>
>         request);
>
>                    goto err_close;
>
>            }
>
>         @@ -1571,7 +1573,7 @@ get_cni_fd(char *if_name)
>
>            /* Initiate close connection */
>
>            strlcpy(request, UDS_FIN_MSG, UDS_MAX_CMD_LEN);
>
>            memset(response, 0, sizeof(response));
>
>         - if (make_request_cni(sock, &server, request, NULL, response,
>
>         &out_fd) < 0) {
>
>         + if (make_request_cni(sock, &server, request, NULL, response,
>
>         &out_fd,
>
>         +uds_path) < 0) {
>
>                    AF_XDP_LOG(ERR, "Error in processing cmd [%s]\n",
>
>         request);
>
>                    goto err_close;
>
>            }
>
>         @@ -1695,17 +1697,16 @@ xsk_configure(struct pmd_internals *internals,
>
>         struct pkt_rx_queue *rxq,
>
>            }
>
>            if (internals->use_cni) {
>
>         -         int err, fd, map_fd;
>
>         +         int err, map_fd;
>
>                    /* get socket fd from CNI plugin */
>
>         -         map_fd = get_cni_fd(internals->if_name);
>
>         +         map_fd = get_cni_fd(internals->if_name, internals->uds_path);
>
>                    if (map_fd < 0) {
>
>                            AF_XDP_LOG(ERR, "Failed to receive CNI plugin
>
>         fd\n");
>
>                            goto out_xsk;
>
>                    }
>
>         -         /* get socket fd */
>
>         -         fd = xsk_socket__fd(rxq->xsk);
>
>         -         err = bpf_map_update_elem(map_fd, &rxq->xsk_queue_idx, &fd, 0);
>
>         +
>
>         +         err = xsk_socket__update_xskmap(rxq->xsk, map_fd);
>
>                    if (err) {
>
>                            AF_XDP_LOG(ERR, "Failed to insert unprivileged xsk
>
>         in map.\n");
>
>                            goto out_xsk;
>
>         @@ -2023,7 +2024,8 @@ xdp_get_channels_info(const char *if_name, int
>
>         *max_queues,  static int  parse_parameters(struct rte_kvargs *kvlist, char
>
>         *if_name, int *start_queue,
>
>                     int *queue_cnt, int *shared_umem, char *prog_path,
>
>         -          int *busy_budget, int *force_copy, int *use_cni)
>
>         +          int *busy_budget, int *force_copy, int *use_cni,
>
>         +          char *uds_path)
>
>           {
>
>            int ret;
>
>         @@ -2069,6 +2071,11 @@ parse_parameters(struct rte_kvargs *kvlist, char
>
>         *if_name, int *start_queue,
>
>            if (ret < 0)
>
>                    goto free_kvlist;
>
>         + ret = rte_kvargs_process(kvlist,
>
>         ETH_AF_XDP_USE_CNI_UDS_PATH_ARG,
>
>         +                         &parse_prog_arg, uds_path);
>
>         + if (ret < 0)
>
>         +         goto free_kvlist;
>
>         +
>
>           free_kvlist:
>
>            rte_kvargs_free(kvlist);
>
>            return ret;
>
>         @@ -2108,7 +2115,7 @@ static struct rte_eth_dev *  init_internals(struct
>
>         rte_vdev_device *dev, const char *if_name,
>
>                   int start_queue_idx, int queue_cnt, int shared_umem,
>
>                   const char *prog_path, int busy_budget, int force_copy,
>
>         -        int use_cni)
>
>         +            int use_cni, const char *uds_path)
>
>           {
>
>            const char *name = rte_vdev_device_name(dev);
>
>            const unsigned int numa_node = dev->device.numa_node; @@ -
>
>         2138,6 +2145,7 @@ init_internals(struct rte_vdev_device *dev, const char
>
>         *if_name,
>
>            internals->shared_umem = shared_umem;
>
>            internals->force_copy = force_copy;
>
>            internals->use_cni = use_cni;
>
>         + strlcpy(internals->uds_path, uds_path, PATH_MAX);
>
>            if (xdp_get_channels_info(if_name, &internals->max_queue_cnt,
>
>                                     &internals->combined_queue_cnt)) { @@ -
>
>         2328,6 +2336,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev)
>
>            int busy_budget = -1, ret;
>
>            int force_copy = 0;
>
>            int use_cni = 0;
>
>         + char uds_path[PATH_MAX] = {'\0'};
>
>            struct rte_eth_dev *eth_dev = NULL;
>
>            const char *name = rte_vdev_device_name(dev);
>
>         @@ -2370,7 +2379,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device
>
>         *dev)
>
>            if (parse_parameters(kvlist, if_name, &xsk_start_queue_idx,
>
>                                 &xsk_queue_cnt, &shared_umem, prog_path,
>
>         -                      &busy_budget, &force_copy, &use_cni) < 0) {
>
>         +                         &busy_budget, &force_copy, &use_cni,
>
>         uds_path) < 0) {
>
>                    AF_XDP_LOG(ERR, "Invalid kvargs value\n");
>
>                    return -EINVAL;
>
>            }
>
>         @@ -2387,6 +2396,12 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device
>
>         *dev)
>
>                            return -EINVAL;
>
>            }
>
>         + if (use_cni && !strnlen(uds_path, PATH_MAX)) {
>
>         +         AF_XDP_LOG(ERR, "When '%s' parameter is used, '%s' must
>
>         also be provided\n",
>
>         +                 ETH_AF_XDP_USE_CNI_ARG,
>
>         ETH_AF_XDP_USE_CNI_UDS_PATH_ARG);
>
>         +                 return -EINVAL;
>
>         + }
>
>         +
>
>            if (strlen(if_name) == 0) {
>
>                    AF_XDP_LOG(ERR, "Network interface must be
>
>         specified\n");
>
>                    return -EINVAL;
>
>         @@ -2410,7 +2425,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device
>
>         *dev)
>
>            eth_dev = init_internals(dev, if_name, xsk_start_queue_idx,
>
>                                    xsk_queue_cnt, shared_umem, prog_path,
>
>         -                         busy_budget, force_copy, use_cni);
>
>         +                         busy_budget, force_copy, use_cni,
>
>         uds_path);
>
>            if (eth_dev == NULL) {
>
>                    AF_XDP_LOG(ERR, "Failed to init internals\n");
>
>                    return -1;
>
>         @@ -2471,4 +2486,5 @@
>
>         RTE_PMD_REGISTER_PARAM_STRING(net_af_xdp,
>
>                                  "xdp_prog=<string> "
>
>                                  "busy_budget=<int> "
>
>                                  "force_copy=<int> "
>
>         -                       "use_cni=<int> ");
>
>         +                       "use_cni=<int> "
>
>         +                       "uds_path=<string> ");
>
>         --
>
>         2.41.0
>
  
Maryam Tahhan Dec. 5, 2023, 11:31 a.m. UTC | #5
On 04/12/2023 17:18, Koikkara Reeny, Shibin wrote:
>>   Prerequisites
>> @@ -223,8 +224,7 @@ Howto run dpdk-testpmd with CNI plugin:
>>            securityContext:
>>             capabilities:
>>                add:
>> -               - CAP_NET_RAW
>> -               - CAP_BPF
>> +               - NET_RAW
> Need to update the 1.3. Prerequisites.


Sorry, what are you referring to?

>
>
>>            resources:
>>              requests:
>>                hugepages-2Mi: 2Gi
>> @@ -239,14 +239,20 @@ Howto run dpdk-testpmd with CNI plugin:
>>
>>     .. _pod.yaml:https://github.com/intel/afxdp-plugins-for-
>> kubernetes/blob/v0.0.2/test/e2e/pod-1c1d.yaml
>>
>> +.. note::
>> +
>> +   For Kernel versions older than 5.19 `CAP_BPF` is also required in
>> +   the container capabilities stanza.
>> +
>>   * Run DPDK with a command like the following:
>>
>>     .. code-block:: console
>>
>>        kubectl exec -i <Pod name> --container <containers name> -- \
>> -           /<Path>/dpdk-testpmd -l 0,1 --no-pci \
>> -           --vdev=net_af_xdp0,use_cni=1,iface=<interface name> \
>> -           -- --no-mlockall --in-memory
>> +           /<Path>/dpdk-testpmd -l 0-2 --no-pci --main-lcore=2 \
>> +           --vdev net_af_xdp0,iface=<interface
>> name>,use_cni=1,uds_path=/tmp/afxdp_dp/<interface name>/afxdp.sock
>> \
>> +           --vdev net_af_xdp1,iface=e<interface
>> name>,use_cni=1,uds_path=/tmp/afxdp_dp/<interface name>/afxdp.sock
> There is a typo " iface=e<interface " == "iface=<interface"


Ack
  
Koikkara Reeny, Shibin Dec. 5, 2023, 11:49 a.m. UTC | #6
From: Maryam Tahhan <mtahhan@redhat.com>
Sent: Tuesday, December 5, 2023 11:31 AM
To: Koikkara Reeny, Shibin <shibin.koikkara.reeny@intel.com>; ferruh.yigit@amd.com; stephen@networkplumber.org; lihuisong@huawei.com; fengchengwen@huawei.com; liuyonglong@huawei.com
Cc: dev@dpdk.org
Subject: Re: [v2] net/af_xdp: enable a sock path alongside use_cni

On 04/12/2023 17:18, Koikkara Reeny, Shibin wrote:

 Prerequisites

@@ -223,8 +224,7 @@ Howto run dpdk-testpmd with CNI plugin:

          securityContext:

           capabilities:

              add:

-               - CAP_NET_RAW

-               - CAP_BPF

+               - NET_RAW

Need to update the 1.3. Prerequisites.



Sorry, what are you referring to?



You are removing CAP_NET_RAW and CAP_BPF, so section 1.3 (Prerequisites) of the doc [1] also needs to be updated.

[1] https://doc.dpdk.org/guides/howto/af_xdp_cni.html







          resources:

            requests:

              hugepages-2Mi: 2Gi

@@ -239,14 +239,20 @@ Howto run dpdk-testpmd with CNI plugin:



   .. _pod.yaml: https://github.com/intel/afxdp-plugins-for-

kubernetes/blob/v0.0.2/test/e2e/pod-1c1d.yaml



+.. note::

+

+   For Kernel versions older than 5.19 `CAP_BPF` is also required in

+   the container capabilities stanza.

+

 * Run DPDK with a command like the following:



   .. code-block:: console



      kubectl exec -i <Pod name> --container <containers name> -- \

-           /<Path>/dpdk-testpmd -l 0,1 --no-pci \

-           --vdev=net_af_xdp0,use_cni=1,iface=<interface name> \

-           -- --no-mlockall --in-memory

+           /<Path>/dpdk-testpmd -l 0-2 --no-pci --main-lcore=2 \

+           --vdev net_af_xdp0,iface=<interface

name>,use_cni=1,uds_path=/tmp/afxdp_dp/<interface name>/afxdp.sock

\

+           --vdev net_af_xdp1,iface=e<interface

name>,use_cni=1,uds_path=/tmp/afxdp_dp/<interface name>/afxdp.sock

There is a typo " iface=e<interface " == "iface=<interface"



Ack
  
Koikkara Reeny, Shibin Dec. 5, 2023, 11:54 a.m. UTC | #7
Hi Maryam,

My suggestion was based on those values being constant at the moment and on the current implementation in afxdp_plugin. If you think they will change in the future, then please go with whatever you think is best.

Regards,
Shibin

From: Maryam Tahhan <mtahhan@redhat.com>
Sent: Tuesday, December 5, 2023 11:29 AM
To: Koikkara Reeny, Shibin <shibin.koikkara.reeny@intel.com>; ferruh.yigit@amd.com; stephen@networkplumber.org; lihuisong@huawei.com; fengchengwen@huawei.com; liuyonglong@huawei.com
Cc: dev@dpdk.org
Subject: Re: [v2] net/af_xdp: enable a sock path alongside use_cni

Hi Shibin

As I've already explained in my previous email, they are constant ATM, however they will become configurable. I am implementing the Operator and it will make a lot of these "fixed" params configurable. My recommendation is not to try to generate the path in DPDK - as it's likely to be different in different k8s environments.

As I've also mentioned, the current patch means that I don't need to come back in 2 months and update DPDK to support n paths for the UDS (aka future-proofing).

Additionally - this is a side discussion as far as this patch goes. The point of this patch is to fix the broken UDS behavior and it has been tested in a full deployment scenario. @Shibin if you strongly feel that there's a better approach, then please go ahead, implement it, test it in a full deployment scenario and push it for review.

In general, allowing the AF_XDP params to be configurable rather than fixing/hardcoding anything in DPDK decouples the AF_XDP DP from DPDK so we don't have to keep coming back to make changes.
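
As a rough illustration of treating the UDS path as plain caller-supplied configuration, the sketch below fills a `sockaddr_un` from whatever path is passed in. It is not part of the patch: `fill_uds_addr` is a hypothetical helper name, and the explicit length check is an added assumption (the patch itself copies with `strlcpy`, which would silently truncate a path longer than `sun_path`).

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/*
 * Hypothetical helper: fill a sockaddr_un from a caller-supplied path.
 * Returns 0 on success, -EINVAL if no path was given, and
 * -ENAMETOOLONG if the path does not fit in sun_path.
 */
static int
fill_uds_addr(struct sockaddr_un *addr, const char *uds_path)
{
	size_t len;

	if (addr == NULL || uds_path == NULL || uds_path[0] == '\0')
		return -EINVAL;

	len = strlen(uds_path);
	if (len >= sizeof(addr->sun_path))
		return -ENAMETOOLONG;

	memset(addr, 0, sizeof(*addr));
	addr->sun_family = AF_UNIX;
	/* Copy the path including its terminating NUL byte. */
	memcpy(addr->sun_path, uds_path, len + 1);
	return 0;
}
```

Nothing in this helper knows about "/tmp/afxdp_dp/" or "afxdp.sock"; whatever generates the pod's arguments decides the location.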

BR
Maryam


On 05/12/2023 10:29, Koikkara Reeny, Shibin wrote:
Hi Maryam,

Apologies for making it confusing.

As per the afxdp-plugins-for-kubernetes code, udsPodPath (or udsSockDir [1]) and udsPodSock [1] appear to be constants that do not change. Only the interface name changes, and we are already passing the interface name on the command line. So I was suggesting we could add logic to build the sock path from these values.

Please correct me if I am wrong, but isn't that what afxdp-plugins-for-kubernetes itself does? [2]
This is only a suggestion.

[1] https://github.com/intel/afxdp-plugins-for-kubernetes/blob/main/constants/constants.go#L84
[2] https://github.com/intel/afxdp-plugins-for-kubernetes/blob/main/internal/deviceplugin/poolManager_test.go#L99

I had added 2 small comments earlier in the code.


Regards,
Shibin
From: Maryam Tahhan <mtahhan@redhat.com>
Sent: Monday, December 4, 2023 6:41 PM
To: Koikkara Reeny, Shibin <shibin.koikkara.reeny@intel.com>; ferruh.yigit@amd.com; stephen@networkplumber.org; lihuisong@huawei.com; fengchengwen@huawei.com; liuyonglong@huawei.com
Cc: dev@dpdk.org
Subject: Re: [v2] net/af_xdp: enable a sock path alongside use_cni

Hi Shibin

I'm not really sure what you are suggesting. Is it to make an assumption about the part of the path where the socket resides (aka hard code it) and then try to build the full UDS path in DPDK?

Yes, the plugin is using constants ATM for certain parts of the UDS path, but that's not to say that they won't become configurable later on. Someone may not want to use "/tmp/afxdp_dp/" as the base directory. Then we'd have to change DPDK's implementation again. These are not really things that are configured by hand; they are typically generated by initialization scripts. I would rather build this with the idea that things can change in the future without having to change the DPDK implementation again.
BR
Maryam

On 04/12/2023 17:18, Koikkara Reeny, Shibin wrote:

Hi Maryam,



Apologies for asking this question a bit late.

The UDS sock name will always be afxdp.sock, and an additional directory is created between the UDS base path and the sock name (/tmp/afxdp_dp/<interface name>/afxdp.sock).



As per the command " --vdev net_af_xdp0,iface=<interface name>,use_cni=1,uds_path=/tmp/afxdp_dp/<interface name>/afxdp.sock"

We are already passing the interface name (iface=<interface name>). So can't we create the uds_path inside the program: uds_path = "/tmp/afxdp_dp/" + iface + "/afxdp.sock"?





If you check constants.go in afxdp-plugins-for-kubernetes [1], the constants are still there, and they are used to create the path [2].



[1] https://github.com/intel/afxdp-plugins-for-kubernetes/blob/main/constants/constants.go#L84

[2] https://github.com/intel/afxdp-plugins-for-kubernetes/blob/main/internal/deviceplugin/poolManager_test.go#L78



If we can create the path in the program, then the user won't have to pass it as an argument value.
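
For illustration only, the path construction suggested here could be sketched roughly as follows. This is not code from the patch or from the plugin: `build_uds_path` and the two `AFXDP_DP_*` macros are hypothetical names, and the base directory and socket name are assumed to be the values quoted in this thread.

```c
#include <stddef.h>
#include <stdio.h>
#include <sys/un.h>

/* Assumed defaults, taken from the paths quoted in this thread. */
#define AFXDP_DP_BASE_DIR  "/tmp/afxdp_dp"
#define AFXDP_DP_SOCK_NAME "afxdp.sock"

/*
 * Hypothetical helper: write "<base>/<iface>/afxdp.sock" into buf.
 * Returns 0 on success, or -1 if an argument is missing or the result
 * would not fit in buf or in sockaddr_un.sun_path.
 */
static int
build_uds_path(char *buf, size_t len, const char *base, const char *iface)
{
	int n;

	if (buf == NULL || base == NULL || iface == NULL || iface[0] == '\0')
		return -1;

	n = snprintf(buf, len, "%s/%s/%s", base, iface, AFXDP_DP_SOCK_NAME);
	if (n < 0 || (size_t)n >= len)
		return -1;
	/* The path is ultimately copied into sun_path, so it must fit there too. */
	if ((size_t)n >= sizeof(((struct sockaddr_un *)0)->sun_path))
		return -1;
	return 0;
}
```

With these assumed defaults, `build_uds_path(buf, sizeof(buf), AFXDP_DP_BASE_DIR, "eth0")` yields "/tmp/afxdp_dp/eth0/afxdp.sock". The trade-off raised elsewhere in the thread still applies: the base directory and socket name are then baked into the caller rather than supplied through `uds_path`.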



Regards,

Shibin



-----Original Message-----

From: Maryam Tahhan <mtahhan@redhat.com>

Sent: Monday, December 4, 2023 10:31 AM

To: ferruh.yigit@amd.com; stephen@networkplumber.org;

lihuisong@huawei.com; fengchengwen@huawei.com;

liuyonglong@huawei.com; Koikkara Reeny, Shibin

<shibin.koikkara.reeny@intel.com>

Cc: dev@dpdk.org; Tahhan, Maryam <mtahhan@redhat.com>

Subject: [v2] net/af_xdp: enable a sock path alongside use_cni



With the original 'use_cni' implementation, (using a hardcoded socket rather

than a configurable one), if a single pod is requesting multiple net devices

and these devices are from different pools, then the container attempts to

mount all the netdev UDSes in the pod as /tmp/afxdp.sock. Which means

that at best only 1 netdev will handshake correctly with the AF_XDP DP. This

patch addresses this by making the socket parameter configurable alongside

the 'use_cni' param.

Tested with the AF_XDP DP CNI PR 81.



v2:

* Rename sock_path to uds_path.

* Update documentation to reflect when CAP_BPF is needed.

* Fix testpmd arguments in the provided example for Pods.

* Use AF_XDP API to update the xskmap entry.



Signed-off-by: Maryam Tahhan <mtahhan@redhat.com>

---

 doc/guides/howto/af_xdp_cni.rst     | 24 ++++++-----

 drivers/net/af_xdp/rte_eth_af_xdp.c | 62 ++++++++++++++++++-----------

 2 files changed, 54 insertions(+), 32 deletions(-)



diff --git a/doc/guides/howto/af_xdp_cni.rst

b/doc/guides/howto/af_xdp_cni.rst index a1a6d5b99c..7829526b40 100644

--- a/doc/guides/howto/af_xdp_cni.rst

+++ b/doc/guides/howto/af_xdp_cni.rst

@@ -38,9 +38,10 @@ The XSKMAP is a BPF map of AF_XDP sockets (XSK).

 The client can then proceed with creating an AF_XDP socket  and inserting

that socket into the XSKMAP pointed to by the descriptor.



-The EAL vdev argument ``use_cni`` is used to indicate that the user wishes -

to run the PMD in unprivileged mode and to receive the XSKMAP file

descriptor -from the CNI.

+The EAL vdev arguments ``use_cni`` and ``uds_path`` are used to

+indicate that the user wishes to run the PMD in unprivileged mode and

+to receive the XSKMAP file descriptor from the CNI.

+

 When this flag is set,

 the ``XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD`` libbpf flag  should be

used when creating the socket @@ -49,7 +50,7 @@ Instead the loading is

handled by the CNI.



 .. note::



-   The Unix Domain Socket file path appear in the end user is

"/tmp/afxdp.sock".

+   The Unix Domain Socket file path appears to the end user at

"/tmp/afxdp_dp/<netdev>/afxdp.sock".





 Prerequisites

@@ -223,8 +224,7 @@ Howto run dpdk-testpmd with CNI plugin:

          securityContext:

           capabilities:

              add:

-               - CAP_NET_RAW

-               - CAP_BPF

+               - NET_RAW



Need to update the 1.3. Prerequisites.





          resources:

            requests:

              hugepages-2Mi: 2Gi

@@ -239,14 +239,20 @@ Howto run dpdk-testpmd with CNI plugin:



   .. _pod.yaml: https://github.com/intel/afxdp-plugins-for-

kubernetes/blob/v0.0.2/test/e2e/pod-1c1d.yaml



+.. note::

+

+   For Kernel versions older than 5.19 `CAP_BPF` is also required in

+   the container capabilities stanza.

+

 * Run DPDK with a command like the following:



   .. code-block:: console



      kubectl exec -i <Pod name> --container <containers name> -- \

-           /<Path>/dpdk-testpmd -l 0,1 --no-pci \

-           --vdev=net_af_xdp0,use_cni=1,iface=<interface name> \

-           -- --no-mlockall --in-memory

+           /<Path>/dpdk-testpmd -l 0-2 --no-pci --main-lcore=2 \

+           --vdev net_af_xdp0,iface=<interface

name>,use_cni=1,uds_path=/tmp/afxdp_dp/<interface name>/afxdp.sock

\

+           --vdev net_af_xdp1,iface=e<interface

name>,use_cni=1,uds_path=/tmp/afxdp_dp/<interface name>/afxdp.sock



There is a typo " iface=e<interface " == "iface=<interface"



\

+           -- -i --a --nb-cores=2 --rxq=1 --txq=1

+ --forward-mode=macswap;



 For further reference please use the `e2e`_ test case in `AF_XDP Plugin for

Kubernetes`_



diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c

b/drivers/net/af_xdp/rte_eth_af_xdp.c

index 353c8688ec..505ed6cf1e 100644

--- a/drivers/net/af_xdp/rte_eth_af_xdp.c

+++ b/drivers/net/af_xdp/rte_eth_af_xdp.c

@@ -88,7 +88,6 @@ RTE_LOG_REGISTER_DEFAULT(af_xdp_logtype,

NOTICE);

 #define UDS_MAX_CMD_LEN                 64

 #define UDS_MAX_CMD_RESP        128

 #define UDS_XSK_MAP_FD_MSG              "/xsk_map_fd"

-#define UDS_SOCK                "/tmp/afxdp.sock"

 #define UDS_CONNECT_MSG                 "/connect"

 #define UDS_HOST_OK_MSG                 "/host_ok"

 #define UDS_HOST_NAK_MSG        "/host_nak"

@@ -171,6 +170,7 @@ struct pmd_internals {

  bool custom_prog_configured;

  bool force_copy;

  bool use_cni;

+ char uds_path[PATH_MAX];

  struct bpf_map *map;



  struct rte_ether_addr eth_addr;

@@ -191,6 +191,7 @@ struct pmd_process_private {

 #define ETH_AF_XDP_BUDGET_ARG                  "busy_budget"

 #define ETH_AF_XDP_FORCE_COPY_ARG              "force_copy"

 #define ETH_AF_XDP_USE_CNI_ARG                 "use_cni"

+#define ETH_AF_XDP_USE_CNI_UDS_PATH_ARG "uds_path"



 static const char * const valid_arguments[] = {

  ETH_AF_XDP_IFACE_ARG,

@@ -201,6 +202,7 @@ static const char * const valid_arguments[] = {

  ETH_AF_XDP_BUDGET_ARG,

  ETH_AF_XDP_FORCE_COPY_ARG,

  ETH_AF_XDP_USE_CNI_ARG,

+ ETH_AF_XDP_USE_CNI_UDS_PATH_ARG,

  NULL

 };



@@ -1351,7 +1353,7 @@ configure_preferred_busy_poll(struct

pkt_rx_queue *rxq)  }



 static int

-init_uds_sock(struct sockaddr_un *server)

+init_uds_sock(struct sockaddr_un *server, const char *uds_path)

 {

  int sock;



@@ -1362,7 +1364,7 @@ init_uds_sock(struct sockaddr_un *server)

  }



  server->sun_family = AF_UNIX;

- strlcpy(server->sun_path, UDS_SOCK, sizeof(server->sun_path));

+ strlcpy(server->sun_path, uds_path, sizeof(server->sun_path));



  if (connect(sock, (struct sockaddr *)server, sizeof(struct

sockaddr_un)) < 0) {

          close(sock);

@@ -1382,7 +1384,7 @@ struct msg_internal {  };



 static int

-send_msg(int sock, char *request, int *fd)

+send_msg(int sock, char *request, int *fd, const char *uds_path)

 {

  int snd;

  struct iovec iov;

@@ -1393,7 +1395,7 @@ send_msg(int sock, char *request, int *fd)



  memset(&dst, 0, sizeof(dst));

  dst.sun_family = AF_UNIX;

- strlcpy(dst.sun_path, UDS_SOCK, sizeof(dst.sun_path));

+ strlcpy(dst.sun_path, uds_path, sizeof(dst.sun_path));



  /* Initialize message header structure */

  memset(&msgh, 0, sizeof(msgh));

@@ -1471,7 +1473,7 @@ read_msg(int sock, char *response, struct

sockaddr_un *s, int *fd)



 static int

 make_request_cni(int sock, struct sockaddr_un *server, char *request,

-          int *req_fd, char *response, int *out_fd)

+          int *req_fd, char *response, int *out_fd, const char

*uds_path)

 {

  int rval;



@@ -1483,7 +1485,7 @@ make_request_cni(int sock, struct sockaddr_un

*server, char *request,

  if (req_fd == NULL)

          rval = write(sock, request, strlen(request));

  else

-         rval = send_msg(sock, request, req_fd);

+         rval = send_msg(sock, request, req_fd, uds_path);



  if (rval < 0) {

          AF_XDP_LOG(ERR, "Write error %s\n", strerror(errno)); @@

-1507,7 +1509,7 @@ check_response(char *response, char *exp_resp, long

size)  }



 static int

-get_cni_fd(char *if_name)

+get_cni_fd(char *if_name, const char *uds_path)

 {

  char request[UDS_MAX_CMD_LEN],

response[UDS_MAX_CMD_RESP];

  char hostname[MAX_LONG_OPT_SZ],

exp_resp[UDS_MAX_CMD_RESP]; @@ -1520,14 +1522,14 @@

get_cni_fd(char *if_name)

          return -1;



  memset(&server, 0, sizeof(server));

- sock = init_uds_sock(&server);

+ sock = init_uds_sock(&server, uds_path);

  if (sock < 0)

          return -1;



  /* Initiates handshake to CNI send: /connect,hostname */

  snprintf(request, sizeof(request), "%s,%s", UDS_CONNECT_MSG,

hostname);

  memset(response, 0, sizeof(response));

- if (make_request_cni(sock, &server, request, NULL, response,

&out_fd) < 0) {

+ if (make_request_cni(sock, &server, request, NULL, response,

&out_fd,

+uds_path) < 0) {

          AF_XDP_LOG(ERR, "Error in processing cmd [%s]\n",

request);

          goto err_close;

  }

@@ -1541,7 +1543,7 @@ get_cni_fd(char *if_name)

  /* Request for "/version" */

  strlcpy(request, UDS_VERSION_MSG, UDS_MAX_CMD_LEN);

  memset(response, 0, sizeof(response));

- if (make_request_cni(sock, &server, request, NULL, response,

&out_fd) < 0) {

+ if (make_request_cni(sock, &server, request, NULL, response,

&out_fd,

+uds_path) < 0) {

          AF_XDP_LOG(ERR, "Error in processing cmd [%s]\n",

request);

          goto err_close;

  }

@@ -1549,7 +1551,7 @@ get_cni_fd(char *if_name)

  /* Request for file descriptor for netdev name*/

  snprintf(request, sizeof(request), "%s,%s",

UDS_XSK_MAP_FD_MSG, if_name);

  memset(response, 0, sizeof(response));

- if (make_request_cni(sock, &server, request, NULL, response,

&out_fd) < 0) {

+ if (make_request_cni(sock, &server, request, NULL, response,

&out_fd,

+uds_path) < 0) {

          AF_XDP_LOG(ERR, "Error in processing cmd [%s]\n",

request);

          goto err_close;

  }

@@ -1571,7 +1573,7 @@ get_cni_fd(char *if_name)

  /* Initiate close connection */

  strlcpy(request, UDS_FIN_MSG, UDS_MAX_CMD_LEN);

  memset(response, 0, sizeof(response));

- if (make_request_cni(sock, &server, request, NULL, response,

&out_fd) < 0) {

+ if (make_request_cni(sock, &server, request, NULL, response,

&out_fd,

+uds_path) < 0) {

          AF_XDP_LOG(ERR, "Error in processing cmd [%s]\n",

request);

          goto err_close;

  }

@@ -1695,17 +1697,16 @@ xsk_configure(struct pmd_internals *internals,

struct pkt_rx_queue *rxq,

  }



  if (internals->use_cni) {

-         int err, fd, map_fd;

+         int err, map_fd;



          /* get socket fd from CNI plugin */

-         map_fd = get_cni_fd(internals->if_name);

+         map_fd = get_cni_fd(internals->if_name, internals->uds_path);

          if (map_fd < 0) {

                  AF_XDP_LOG(ERR, "Failed to receive CNI plugin

fd\n");

                  goto out_xsk;

          }

-         /* get socket fd */

-         fd = xsk_socket__fd(rxq->xsk);

-         err = bpf_map_update_elem(map_fd, &rxq->xsk_queue_idx, &fd, 0);

+

+         err = xsk_socket__update_xskmap(rxq->xsk, map_fd);

          if (err) {

                  AF_XDP_LOG(ERR, "Failed to insert unprivileged xsk

in map.\n");

                  goto out_xsk;

@@ -2023,7 +2024,8 @@ xdp_get_channels_info(const char *if_name, int

*max_queues,  static int  parse_parameters(struct rte_kvargs *kvlist, char

*if_name, int *start_queue,

           int *queue_cnt, int *shared_umem, char *prog_path,

-          int *busy_budget, int *force_copy, int *use_cni)

+          int *busy_budget, int *force_copy, int *use_cni,

+          char *uds_path)

 {

  int ret;



@@ -2069,6 +2071,11 @@ parse_parameters(struct rte_kvargs *kvlist, char

*if_name, int *start_queue,

  if (ret < 0)

          goto free_kvlist;



+ ret = rte_kvargs_process(kvlist,

ETH_AF_XDP_USE_CNI_UDS_PATH_ARG,

+                         &parse_prog_arg, uds_path);

+ if (ret < 0)

+         goto free_kvlist;

+

 free_kvlist:

  rte_kvargs_free(kvlist);

  return ret;

@@ -2108,7 +2115,7 @@ static struct rte_eth_dev *  init_internals(struct

rte_vdev_device *dev, const char *if_name,

         int start_queue_idx, int queue_cnt, int shared_umem,

         const char *prog_path, int busy_budget, int force_copy,

-        int use_cni)

+            int use_cni, const char *uds_path)

 {

  const char *name = rte_vdev_device_name(dev);

  const unsigned int numa_node = dev->device.numa_node; @@ -

2138,6 +2145,7 @@ init_internals(struct rte_vdev_device *dev, const char

*if_name,

  internals->shared_umem = shared_umem;

  internals->force_copy = force_copy;

  internals->use_cni = use_cni;

+ strlcpy(internals->uds_path, uds_path, PATH_MAX);

  if (xdp_get_channels_info(if_name, &internals->max_queue_cnt,
                           &internals->combined_queue_cnt)) {
@@ -2328,6 +2336,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev)
  int busy_budget = -1, ret;
  int force_copy = 0;
  int use_cni = 0;
+ char uds_path[PATH_MAX] = {'\0'};
  struct rte_eth_dev *eth_dev = NULL;
  const char *name = rte_vdev_device_name(dev);

@@ -2370,7 +2379,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev)

  if (parse_parameters(kvlist, if_name, &xsk_start_queue_idx,
                       &xsk_queue_cnt, &shared_umem, prog_path,
-                      &busy_budget, &force_copy, &use_cni) < 0) {
+                         &busy_budget, &force_copy, &use_cni, uds_path) < 0) {
          AF_XDP_LOG(ERR, "Invalid kvargs value\n");
          return -EINVAL;
  }
@@ -2387,6 +2396,12 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev)
                  return -EINVAL;
  }

+ if (use_cni && !strnlen(uds_path, PATH_MAX)) {
+         AF_XDP_LOG(ERR, "When '%s' parameter is used, '%s' must also be provided\n",
+                 ETH_AF_XDP_USE_CNI_ARG, ETH_AF_XDP_USE_CNI_UDS_PATH_ARG);
+                 return -EINVAL;
+ }
+
  if (strlen(if_name) == 0) {
          AF_XDP_LOG(ERR, "Network interface must be specified\n");
          return -EINVAL;
@@ -2410,7 +2425,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev)

  eth_dev = init_internals(dev, if_name, xsk_start_queue_idx,
                          xsk_queue_cnt, shared_umem, prog_path,
-                         busy_budget, force_copy, use_cni);
+                         busy_budget, force_copy, use_cni, uds_path);
  if (eth_dev == NULL) {
          AF_XDP_LOG(ERR, "Failed to init internals\n");
          return -1;
@@ -2471,4 +2486,5 @@ RTE_PMD_REGISTER_PARAM_STRING(net_af_xdp,
                        "xdp_prog=<string> "
                        "busy_budget=<int> "
                        "force_copy=<int> "
-                       "use_cni=<int> ");
+                       "use_cni=<int> "
+                       "uds_path=<string> ");
--
2.41.0
  
Loftus, Ciara Dec. 5, 2023, 1:43 p.m. UTC | #8
> -----Original Message-----
> From: Maryam Tahhan <mtahhan@redhat.com>
> Sent: Monday, December 4, 2023 10:31 AM
> To: ferruh.yigit@amd.com; stephen@networkplumber.org;
> lihuisong@huawei.com; fengchengwen@huawei.com;
> liuyonglong@huawei.com; Koikkara Reeny, Shibin
> <shibin.koikkara.reeny@intel.com>
> Cc: dev@dpdk.org; Tahhan, Maryam <mtahhan@redhat.com>
> Subject: [v2] net/af_xdp: enable a sock path alongside use_cni
> 
> With the original 'use_cni' implementation, (using a
> hardcoded socket rather than a configurable one),
> if a single pod is requesting multiple net devices
> and these devices are from different pools, then
> the container attempts to mount all the netdev UDSes
> in the pod as /tmp/afxdp.sock. Which means that at best
> only 1 netdev will handshake correctly with the AF_XDP
> DP. This patch addresses this by making the socket
> parameter configurable alongside the 'use_cni' param.
> Tested with the AF_XDP DP CNI PR 81.
> 
> v2:
> * Rename sock_path to uds_path.
> * Update documentation to reflect when CAP_BPF is needed.
> * Fix testpmd arguments in the provided example for Pods.
> * Use AF_XDP API to update the xskmap entry.
> 
> Signed-off-by: Maryam Tahhan <mtahhan@redhat.com>
> ---
>  doc/guides/howto/af_xdp_cni.rst     | 24 ++++++-----
>  drivers/net/af_xdp/rte_eth_af_xdp.c | 62 ++++++++++++++++++-----------
>  2 files changed, 54 insertions(+), 32 deletions(-)
> 
> diff --git a/doc/guides/howto/af_xdp_cni.rst
> b/doc/guides/howto/af_xdp_cni.rst
> index a1a6d5b99c..7829526b40 100644
> --- a/doc/guides/howto/af_xdp_cni.rst
> +++ b/doc/guides/howto/af_xdp_cni.rst
> @@ -38,9 +38,10 @@ The XSKMAP is a BPF map of AF_XDP sockets (XSK).
>  The client can then proceed with creating an AF_XDP socket
>  and inserting that socket into the XSKMAP pointed to by the descriptor.
> 
> -The EAL vdev argument ``use_cni`` is used to indicate that the user wishes
> -to run the PMD in unprivileged mode and to receive the XSKMAP file
> descriptor
> -from the CNI.
> +The EAL vdev arguments ``use_cni`` and ``uds_path`` are used to indicate that
> +the user wishes to run the PMD in unprivileged mode and to receive the
> XSKMAP
> +file descriptor from the CNI.
> +
>  When this flag is set,
>  the ``XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD`` libbpf flag
>  should be used when creating the socket
> @@ -49,7 +50,7 @@ Instead the loading is handled by the CNI.
> 
>  .. note::
> 
> -   The Unix Domain Socket file path appear in the end user is
> "/tmp/afxdp.sock".
> +   The Unix Domain Socket file path appears to the end user at
> "/tmp/afxdp_dp/<netdev>/afxdp.sock".
> 
> 
>  Prerequisites
> @@ -223,8 +224,7 @@ Howto run dpdk-testpmd with CNI plugin:
>           securityContext:
>            capabilities:
>               add:
> -               - CAP_NET_RAW
> -               - CAP_BPF
> +               - NET_RAW
>           resources:
>             requests:
>               hugepages-2Mi: 2Gi
> @@ -239,14 +239,20 @@ Howto run dpdk-testpmd with CNI plugin:
> 
>    .. _pod.yaml: https://github.com/intel/afxdp-plugins-for-
> kubernetes/blob/v0.0.2/test/e2e/pod-1c1d.yaml
> 
> +.. note::
> +
> +   For Kernel versions older than 5.19 `CAP_BPF` is also required in
> +   the container capabilities stanza.
> +
>  * Run DPDK with a command like the following:
> 
>    .. code-block:: console
> 
>       kubectl exec -i <Pod name> --container <containers name> -- \
> -           /<Path>/dpdk-testpmd -l 0,1 --no-pci \
> -           --vdev=net_af_xdp0,use_cni=1,iface=<interface name> \
> -           -- --no-mlockall --in-memory
> +           /<Path>/dpdk-testpmd -l 0-2 --no-pci --main-lcore=2 \
> +           --vdev net_af_xdp0,iface=<interface
> name>,use_cni=1,uds_path=/tmp/afxdp_dp/<interface name>/afxdp.sock \
> +           --vdev net_af_xdp1,iface=e<interface
> name>,use_cni=1,uds_path=/tmp/afxdp_dp/<interface name>/afxdp.sock \
> +           -- -i --a --nb-cores=2 --rxq=1 --txq=1 --forward-mode=macswap;
> 
>  For further reference please use the `e2e`_ test case in `AF_XDP Plugin for
> Kubernetes`_
> 
> diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c
> b/drivers/net/af_xdp/rte_eth_af_xdp.c
> index 353c8688ec..505ed6cf1e 100644
> --- a/drivers/net/af_xdp/rte_eth_af_xdp.c
> +++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
> @@ -88,7 +88,6 @@ RTE_LOG_REGISTER_DEFAULT(af_xdp_logtype, NOTICE);
>  #define UDS_MAX_CMD_LEN			64
>  #define UDS_MAX_CMD_RESP		128
>  #define UDS_XSK_MAP_FD_MSG		"/xsk_map_fd"
> -#define UDS_SOCK			"/tmp/afxdp.sock"
>  #define UDS_CONNECT_MSG			"/connect"
>  #define UDS_HOST_OK_MSG			"/host_ok"
>  #define UDS_HOST_NAK_MSG		"/host_nak"
> @@ -171,6 +170,7 @@ struct pmd_internals {
>  	bool custom_prog_configured;
>  	bool force_copy;
>  	bool use_cni;
> +	char uds_path[PATH_MAX];
>  	struct bpf_map *map;
> 
>  	struct rte_ether_addr eth_addr;
> @@ -191,6 +191,7 @@ struct pmd_process_private {
>  #define ETH_AF_XDP_BUDGET_ARG			"busy_budget"
>  #define ETH_AF_XDP_FORCE_COPY_ARG		"force_copy"
>  #define ETH_AF_XDP_USE_CNI_ARG			"use_cni"
> +#define ETH_AF_XDP_USE_CNI_UDS_PATH_ARG	"uds_path"
> 
>  static const char * const valid_arguments[] = {
>  	ETH_AF_XDP_IFACE_ARG,
> @@ -201,6 +202,7 @@ static const char * const valid_arguments[] = {
>  	ETH_AF_XDP_BUDGET_ARG,
>  	ETH_AF_XDP_FORCE_COPY_ARG,
>  	ETH_AF_XDP_USE_CNI_ARG,
> +	ETH_AF_XDP_USE_CNI_UDS_PATH_ARG,
>  	NULL
>  };
> 
> @@ -1351,7 +1353,7 @@ configure_preferred_busy_poll(struct
> pkt_rx_queue *rxq)
>  }
> 
>  static int
> -init_uds_sock(struct sockaddr_un *server)
> +init_uds_sock(struct sockaddr_un *server, const char *uds_path)
>  {
>  	int sock;
> 
> @@ -1362,7 +1364,7 @@ init_uds_sock(struct sockaddr_un *server)
>  	}
> 
>  	server->sun_family = AF_UNIX;
> -	strlcpy(server->sun_path, UDS_SOCK, sizeof(server->sun_path));
> +	strlcpy(server->sun_path, uds_path, sizeof(server->sun_path));
> 
>  	if (connect(sock, (struct sockaddr *)server, sizeof(struct
> sockaddr_un)) < 0) {
>  		close(sock);
> @@ -1382,7 +1384,7 @@ struct msg_internal {
>  };
> 
>  static int
> -send_msg(int sock, char *request, int *fd)
> +send_msg(int sock, char *request, int *fd, const char *uds_path)
>  {
>  	int snd;
>  	struct iovec iov;
> @@ -1393,7 +1395,7 @@ send_msg(int sock, char *request, int *fd)
> 
>  	memset(&dst, 0, sizeof(dst));
>  	dst.sun_family = AF_UNIX;
> -	strlcpy(dst.sun_path, UDS_SOCK, sizeof(dst.sun_path));
> +	strlcpy(dst.sun_path, uds_path, sizeof(dst.sun_path));
> 
>  	/* Initialize message header structure */
>  	memset(&msgh, 0, sizeof(msgh));
> @@ -1471,7 +1473,7 @@ read_msg(int sock, char *response, struct
> sockaddr_un *s, int *fd)
> 
>  static int
>  make_request_cni(int sock, struct sockaddr_un *server, char *request,
> -		 int *req_fd, char *response, int *out_fd)
> +		 int *req_fd, char *response, int *out_fd, const char
> *uds_path)
>  {
>  	int rval;
> 
> @@ -1483,7 +1485,7 @@ make_request_cni(int sock, struct sockaddr_un
> *server, char *request,
>  	if (req_fd == NULL)
>  		rval = write(sock, request, strlen(request));
>  	else
> -		rval = send_msg(sock, request, req_fd);
> +		rval = send_msg(sock, request, req_fd, uds_path);
> 
>  	if (rval < 0) {
>  		AF_XDP_LOG(ERR, "Write error %s\n", strerror(errno));
> @@ -1507,7 +1509,7 @@ check_response(char *response, char *exp_resp,
> long size)
>  }
> 
>  static int
> -get_cni_fd(char *if_name)
> +get_cni_fd(char *if_name, const char *uds_path)
>  {
>  	char request[UDS_MAX_CMD_LEN],
> response[UDS_MAX_CMD_RESP];
>  	char hostname[MAX_LONG_OPT_SZ],
> exp_resp[UDS_MAX_CMD_RESP];
> @@ -1520,14 +1522,14 @@ get_cni_fd(char *if_name)
>  		return -1;
> 
>  	memset(&server, 0, sizeof(server));
> -	sock = init_uds_sock(&server);
> +	sock = init_uds_sock(&server, uds_path);
>  	if (sock < 0)
>  		return -1;
> 
>  	/* Initiates handshake to CNI send: /connect,hostname */
>  	snprintf(request, sizeof(request), "%s,%s", UDS_CONNECT_MSG,
> hostname);
>  	memset(response, 0, sizeof(response));
> -	if (make_request_cni(sock, &server, request, NULL, response,
> &out_fd) < 0) {
> +	if (make_request_cni(sock, &server, request, NULL, response,
> &out_fd, uds_path) < 0) {
>  		AF_XDP_LOG(ERR, "Error in processing cmd [%s]\n", request);
>  		goto err_close;
>  	}
> @@ -1541,7 +1543,7 @@ get_cni_fd(char *if_name)
>  	/* Request for "/version" */
>  	strlcpy(request, UDS_VERSION_MSG, UDS_MAX_CMD_LEN);
>  	memset(response, 0, sizeof(response));
> -	if (make_request_cni(sock, &server, request, NULL, response,
> &out_fd) < 0) {
> +	if (make_request_cni(sock, &server, request, NULL, response,
> &out_fd, uds_path) < 0) {
>  		AF_XDP_LOG(ERR, "Error in processing cmd [%s]\n", request);
>  		goto err_close;
>  	}
> @@ -1549,7 +1551,7 @@ get_cni_fd(char *if_name)
>  	/* Request for file descriptor for netdev name*/
>  	snprintf(request, sizeof(request), "%s,%s", UDS_XSK_MAP_FD_MSG,
> if_name);
>  	memset(response, 0, sizeof(response));
> -	if (make_request_cni(sock, &server, request, NULL, response,
> &out_fd) < 0) {
> +	if (make_request_cni(sock, &server, request, NULL, response,
> &out_fd, uds_path) < 0) {
>  		AF_XDP_LOG(ERR, "Error in processing cmd [%s]\n", request);
>  		goto err_close;
>  	}
> @@ -1571,7 +1573,7 @@ get_cni_fd(char *if_name)
>  	/* Initiate close connection */
>  	strlcpy(request, UDS_FIN_MSG, UDS_MAX_CMD_LEN);
>  	memset(response, 0, sizeof(response));
> -	if (make_request_cni(sock, &server, request, NULL, response,
> &out_fd) < 0) {
> +	if (make_request_cni(sock, &server, request, NULL, response,
> &out_fd, uds_path) < 0) {
>  		AF_XDP_LOG(ERR, "Error in processing cmd [%s]\n", request);
>  		goto err_close;
>  	}
> @@ -1695,17 +1697,16 @@ xsk_configure(struct pmd_internals *internals,
> struct pkt_rx_queue *rxq,
>  	}
> 
>  	if (internals->use_cni) {
> -		int err, fd, map_fd;
> +		int err, map_fd;
> 
>  		/* get socket fd from CNI plugin */
> -		map_fd = get_cni_fd(internals->if_name);
> +		map_fd = get_cni_fd(internals->if_name, internals-
> >uds_path);
>  		if (map_fd < 0) {
>  			AF_XDP_LOG(ERR, "Failed to receive CNI plugin fd\n");
>  			goto out_xsk;
>  		}
> -		/* get socket fd */
> -		fd = xsk_socket__fd(rxq->xsk);
> -		err = bpf_map_update_elem(map_fd, &rxq->xsk_queue_idx,
> &fd, 0);
> +
> +		err = xsk_socket__update_xskmap(rxq->xsk, map_fd);
>  		if (err) {
>  			AF_XDP_LOG(ERR, "Failed to insert unprivileged xsk in
> map.\n");
>  			goto out_xsk;
> @@ -2023,7 +2024,8 @@ xdp_get_channels_info(const char *if_name, int
> *max_queues,
>  static int
>  parse_parameters(struct rte_kvargs *kvlist, char *if_name, int *start_queue,
>  		 int *queue_cnt, int *shared_umem, char *prog_path,
> -		 int *busy_budget, int *force_copy, int *use_cni)
> +		 int *busy_budget, int *force_copy, int *use_cni,
> +		 char *uds_path)
>  {
>  	int ret;
> 
> @@ -2069,6 +2071,11 @@ parse_parameters(struct rte_kvargs *kvlist, char
> *if_name, int *start_queue,
>  	if (ret < 0)
>  		goto free_kvlist;
> 
> +	ret = rte_kvargs_process(kvlist,
> ETH_AF_XDP_USE_CNI_UDS_PATH_ARG,
> +				 &parse_prog_arg, uds_path);
> +	if (ret < 0)
> +		goto free_kvlist;
> +
>  free_kvlist:
>  	rte_kvargs_free(kvlist);
>  	return ret;
> @@ -2108,7 +2115,7 @@ static struct rte_eth_dev *
>  init_internals(struct rte_vdev_device *dev, const char *if_name,
>  	       int start_queue_idx, int queue_cnt, int shared_umem,
>  	       const char *prog_path, int busy_budget, int force_copy,
> -	       int use_cni)
> +		   int use_cni, const char *uds_path)
>  {
>  	const char *name = rte_vdev_device_name(dev);
>  	const unsigned int numa_node = dev->device.numa_node;
> @@ -2138,6 +2145,7 @@ init_internals(struct rte_vdev_device *dev, const
> char *if_name,
>  	internals->shared_umem = shared_umem;
>  	internals->force_copy = force_copy;
>  	internals->use_cni = use_cni;
> +	strlcpy(internals->uds_path, uds_path, PATH_MAX);
> 
>  	if (xdp_get_channels_info(if_name, &internals->max_queue_cnt,
>  				  &internals->combined_queue_cnt)) {
> @@ -2328,6 +2336,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device
> *dev)
>  	int busy_budget = -1, ret;
>  	int force_copy = 0;
>  	int use_cni = 0;
> +	char uds_path[PATH_MAX] = {'\0'};
>  	struct rte_eth_dev *eth_dev = NULL;
>  	const char *name = rte_vdev_device_name(dev);
> 
> @@ -2370,7 +2379,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device
> *dev)
> 
>  	if (parse_parameters(kvlist, if_name, &xsk_start_queue_idx,
>  			     &xsk_queue_cnt, &shared_umem, prog_path,
> -			     &busy_budget, &force_copy, &use_cni) < 0) {
> +				 &busy_budget, &force_copy, &use_cni,
> uds_path) < 0) {
>  		AF_XDP_LOG(ERR, "Invalid kvargs value\n");
>  		return -EINVAL;
>  	}
> @@ -2387,6 +2396,12 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device
> *dev)
>  			return -EINVAL;
>  	}
> 
> +	if (use_cni && !strnlen(uds_path, PATH_MAX)) {
> +		AF_XDP_LOG(ERR, "When '%s' parameter is used, '%s' must
> also be provided\n",

Thanks for the patch Maryam.
Do we really need the use_cni devarg anymore if we must also always pair it with a uds_path string?
I am in favour of removing it if both yourself and Shibin think that makes sense too.

Ciara

> +			ETH_AF_XDP_USE_CNI_ARG,
> ETH_AF_XDP_USE_CNI_UDS_PATH_ARG);
> +			return -EINVAL;
> +	}
> +
>  	if (strlen(if_name) == 0) {
>  		AF_XDP_LOG(ERR, "Network interface must be specified\n");
>  		return -EINVAL;
> @@ -2410,7 +2425,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device
> *dev)
> 
>  	eth_dev = init_internals(dev, if_name, xsk_start_queue_idx,
>  				 xsk_queue_cnt, shared_umem, prog_path,
> -				 busy_budget, force_copy, use_cni);
> +				 busy_budget, force_copy, use_cni,
> uds_path);
>  	if (eth_dev == NULL) {
>  		AF_XDP_LOG(ERR, "Failed to init internals\n");
>  		return -1;
> @@ -2471,4 +2486,5 @@ RTE_PMD_REGISTER_PARAM_STRING(net_af_xdp,
>  			      "xdp_prog=<string> "
>  			      "busy_budget=<int> "
>  			      "force_copy=<int> "
> -			      "use_cni=<int> ");
> +			      "use_cni=<int> "
> +			      "uds_path=<string> ");
> --
> 2.41.0
  
Maryam Tahhan Dec. 5, 2023, 2:38 p.m. UTC | #9
On 05/12/2023 13:43, Loftus, Ciara wrote:
>> also be provided\n",
> Thanks for the patch Maryam.
> Do we really need the use_cni devarg anymore if we must also always pair it with a uds_path string?
> I am in favour of removing it if both yourself and Shibin think that makes sense too.
>
> Ciara

Hey Ciara

I'm happy to remove the use_cni arg in favour of just the uds_path

BR
Maryam
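
As an illustration of the idea agreed above, CNI mode could be inferred from the presence of a socket path instead of a separate flag. This is only a sketch: `cni_mode_requested()` is a hypothetical helper, not a function in rte_eth_af_xdp.c, and the thread does not show how a later revision implements the change.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of the driver): treat a non-empty
 * uds_path devarg as the request to run in CNI mode, so that no
 * separate use_cni flag is needed.
 */
static bool
cni_mode_requested(const char *uds_path)
{
	return uds_path != NULL && uds_path[0] != '\0';
}
```

With such a helper, the probe-time check that rejects `use_cni` without `uds_path` would no longer be needed, because the two conditions could not disagree.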
  
Koikkara Reeny, Shibin Dec. 5, 2023, 2:42 p.m. UTC | #10
Hi Ciara,

I agree.

Regards,
Shibin

  
Stephen Hemminger Dec. 5, 2023, 6:30 p.m. UTC | #11
On Mon,  4 Dec 2023 05:31:01 -0500
Maryam Tahhan <mtahhan@redhat.com> wrote:

> With the original 'use_cni' implementation, (using a
> hardcoded socket rather than a configurable one),
> if a single pod is requesting multiple net devices
> and these devices are from different pools, then
> the container attempts to mount all the netdev UDSes
> in the pod as /tmp/afxdp.sock. Which means that at best
> only 1 netdev will handshake correctly with the AF_XDP
> DP. This patch addresses this by making the socket
> parameter configurable alongside the 'use_cni' param.
> Tested with the AF_XDP DP CNI PR 81.
> 
> v2:
> * Rename sock_path to uds_path.
> * Update documentation to reflect when CAP_BPF is needed.
> * Fix testpmd arguments in the provided example for Pods.
> * Use AF_XDP API to update the xskmap entry.
> 
> Signed-off-by: Maryam Tahhan <mtahhan@redhat.com>

Why does the XDP PMD not use an abstract socket path?
Having an actual file visible in the file system can cause permission
and leftover-file issues that are not present with an abstract path.

If you use an abstract path, then when the last reference is gone (i.e. the server exits)
the path is removed. With regular paths, the file gets stuck in the
file system and has to be cleaned up.
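
The cleanup difference described above can be observed with a small experiment. The sketch below is generic Linux socket code, not driver code; the path `/tmp/rebind_demo.sock` and the abstract name `rebind_demo` are made up for illustration. It binds an address, closes the socket (the "server exits"), and then tries to bind the same address again.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Bind addr, close the socket, then bind addr again.
 * Returns 0 if the second bind works, otherwise the errno of the failure. */
static int
rebind_after_close(const struct sockaddr_un *addr, socklen_t len)
{
	int first, second, err = 0;

	first = socket(AF_UNIX, SOCK_STREAM, 0);
	if (first < 0)
		return errno;
	if (bind(first, (const struct sockaddr *)addr, len) < 0) {
		err = errno;
		close(first);
		return err;
	}
	close(first);	/* the "server" goes away */

	second = socket(AF_UNIX, SOCK_STREAM, 0);
	if (second < 0)
		return errno;
	if (bind(second, (const struct sockaddr *)addr, len) < 0)
		err = errno;
	close(second);
	return err;
}

/* Run the experiment for a pathname socket and for an abstract socket. */
static void
compare_rebind(int *pathname_err, int *abstract_err)
{
	const char *path = "/tmp/rebind_demo.sock";	/* illustrative path */
	static const char name[] = "rebind_demo";	/* illustrative name */
	struct sockaddr_un addr;

	/* Pathname socket: sun_path is a NUL-terminated file system path. */
	memset(&addr, 0, sizeof(addr));
	addr.sun_family = AF_UNIX;
	strncpy(addr.sun_path, path, sizeof(addr.sun_path) - 1);
	unlink(path);	/* start from a clean state */
	*pathname_err = rebind_after_close(&addr, sizeof(addr));
	unlink(path);	/* manual cleanup of the leftover file */

	/* Abstract socket: sun_path[0] is NUL and the name follows it. */
	memset(&addr, 0, sizeof(addr));
	addr.sun_family = AF_UNIX;
	memcpy(addr.sun_path + 1, name, sizeof(name) - 1);
	*abstract_err = rebind_after_close(&addr,
		(socklen_t)(offsetof(struct sockaddr_un, sun_path) +
			    1 + sizeof(name) - 1));
}
```

On Linux the pathname case is expected to report EADDRINUSE, because the socket file outlives the socket, while the abstract case is expected to succeed.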
  
Maryam Tahhan Dec. 6, 2023, 3 p.m. UTC | #12
On 05/12/2023 18:30, Stephen Hemminger wrote:
> On Mon,  4 Dec 2023 05:31:01 -0500
> Maryam Tahhan <mtahhan@redhat.com> wrote:
>
>> With the original 'use_cni' implementation, (using a
>> hardcoded socket rather than a configurable one),
>> if a single pod is requesting multiple net devices
>> and these devices are from different pools, then
>> the container attempts to mount all the netdev UDSes
>> in the pod as /tmp/afxdp.sock. Which means that at best
>> only 1 netdev will handshake correctly with the AF_XDP
>> DP. This patch addresses this by making the socket
>> parameter configurable alongside the 'use_cni' param.
>> Tested with the AF_XDP DP CNI PR 81.
>>
>> v2:
>> * Rename sock_path to uds_path.
>> * Update documentation to reflect when CAP_BPF is needed.
>> * Fix testpmd arguments in the provided example for Pods.
>> * Use AF_XDP API to update the xskmap entry.
>>
>> Signed-off-by: Maryam Tahhan <mtahhan@redhat.com>
> Why does XDP PMD not use abstract socket path?
> Having actual file visible in file system can cause permission
> and leftover file issues that are not present with abstract path.
>
> If you use abstract path then when last reference is gone (ie server exits)
> the path is removed. With regular paths, the file gets stuck in the
> file system and has to be cleaned up.
>
Hi Stephen

I've not seen abstract sockets being used in pod to pod communications 
in Kubernetes before. I would love to learn more if you have any 
references.

In the scenario mentioned above, the AF_XDP Device Plugin (a pod that is 
the K8s entity on the host provisioning and managing interfaces that want to 
use AF_XDP) manages access and cleanup for this UDS path, so there's no 
overhead on the XDP PMD side inside the DPDK pod. The AF_XDP DP uses 
file permissions on the host side to ensure that other processes/pods 
(that shouldn't) can't access the UDS path for a specific pod. The 
AF_XDP DP and the DPDK Pod will have different network namespaces. 
Additionally, the actual path (on the host) and the mounted volume path 
in the container are not the same; on the container side we set it up so 
that the path appears in a predictable location (which is the path 
mentioned above). I'm not sure I can do this with an abstract socket 
without picking a name/path that's entirely predictable on both ends?
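
For illustration, the predictable in-container location can be turned into a socket address as sketched below. `fill_uds_addr()` is a hypothetical helper, and the layout `<base_dir>/<netdev>/afxdp.sock` is taken from the documentation note in the patch. One point worth checking in any such code: `sun_path` is much smaller than PATH_MAX (108 bytes on Linux), and a plain strlcpy() into it truncates silently, so the sketch reports an error instead.

```c
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/*
 * Hypothetical helper: build "<base_dir>/<netdev>/afxdp.sock" in addr.
 * Returns 0 on success, -1 if the path does not fit in sun_path.
 */
static int
fill_uds_addr(struct sockaddr_un *addr, const char *base_dir,
	      const char *netdev)
{
	int len;

	memset(addr, 0, sizeof(*addr));
	addr->sun_family = AF_UNIX;

	len = snprintf(addr->sun_path, sizeof(addr->sun_path),
		       "%s/%s/afxdp.sock", base_dir, netdev);
	if (len < 0 || (size_t)len >= sizeof(addr->sun_path))
		return -1;	/* would have been truncated */

	return 0;
}
```

Calling `fill_uds_addr(&addr, "/tmp/afxdp_dp", "eth0")` yields the path `/tmp/afxdp_dp/eth0/afxdp.sock`; the base directory stays a parameter because, as noted elsewhere in the thread, it may become configurable in the plugin.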

Just for clarity on what this UDS is used for: this is the UDS that's 
mounted into the DPDK Pod/container as a volume, so that the XDP PMD 
can "handshake" with the AF_XDP device plugin to get an FD for the xskmap 
that the PMD can use when it creates the AF_XDP socket (all this to 
remove elevated privileges being granted to the container).

Thanks
Maryam
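
The mechanism that lets one process hand an open file descriptor to another over a Unix domain socket is SCM_RIGHTS ancillary data. The sketch below shows that mechanism in isolation; `send_fd()` and `recv_fd()` are illustrative names, not the driver's send_msg()/read_msg(), and it is an assumption that the xskmap FD travels this way, since the receive path is not shown in the thread.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized and aligned for exactly one file descriptor. */
union fd_ctrl {
	char buf[CMSG_SPACE(sizeof(int))];
	struct cmsghdr align;
};

/* Send fd over sock together with one byte of payload. Returns 0 or -1. */
static int
send_fd(int sock, int fd)
{
	char byte = 'F';
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union fd_ctrl ctrl;
	struct msghdr msg;
	struct cmsghdr *cmsg;

	memset(&ctrl, 0, sizeof(ctrl));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctrl.buf;
	msg.msg_controllen = sizeof(ctrl.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor from sock. Returns the new fd or -1. */
static int
recv_fd(int sock)
{
	char byte;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union fd_ctrl ctrl;
	struct msghdr msg;
	struct cmsghdr *cmsg;
	int fd;

	memset(&ctrl, 0, sizeof(ctrl));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctrl.buf;
	msg.msg_controllen = sizeof(ctrl.buf);

	if (recvmsg(sock, &msg, 0) <= 0)
		return -1;

	cmsg = CMSG_FIRSTHDR(&msg);
	if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;

	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}
```

The receiver gets its own descriptor number that refers to the same open file, which is why the unprivileged PMD can update a map that a privileged process created.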
  
Stephen Hemminger Dec. 8, 2023, 5:23 p.m. UTC | #13
On Mon, 4 Dec 2023 18:41:18 +0000
Maryam Tahhan <mtahhan@redhat.com> wrote:

> Hi Shibin
> 
> I'm not really sure what you are suggesting, is to make an assumption on 
> the path part where the socket resides (aka hard code it) and then try 
> to build the full UDS path in DPDK?
> 
> Yes the plugin is using constants ATM for certain parts of the UDS path, 
> but that's not to say that it's something that won't become configurable 
> later on. Someone may not want to use "/tmp/afxdp_dp/" as the base 
> directory. Then we'd have to change DPDK's implementation again. These 
> are not really things that are configured by hand and are generated by 
> initialization scripts (typically). I would rather build this with the 
> idea that things can change in the future without having to change the 
> DPDK implementation again.
> BR
> Maryam

In UNIX(7) man page:

       abstract
              an  abstract  socket  address  is distinguished (from a pathname
              socket) by the fact that sun_path[0] is a null byte ('\0').  The
              socket's address in this namespace is given  by  the  additional
              bytes  in  sun_path  that are covered by the specified length of
              the address structure.  (Null bytes in the name have no  special
              significance.)  The name has no connection with filesystem path‐
              names.   When the address of an abstract socket is returned, the
              returned addrlen  is  greater  than  sizeof(sa_family_t)  (i.e.,
              greater  than 2), and the name of the socket is contained in the
              first (addrlen - sizeof(sa_family_t)) bytes of sun_path.


Something like this:

diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c b/drivers/net/af_xdp/rte_eth_af_xdp.c
index 353c8688ec9c..f41632a9df5a 100644
--- a/drivers/net/af_xdp/rte_eth_af_xdp.c
+++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
@@ -1362,7 +1362,10 @@ init_uds_sock(struct sockaddr_un *server)
 	}
 
 	server->sun_family = AF_UNIX;
-	strlcpy(server->sun_path, UDS_SOCK, sizeof(server->sun_path));
+
+	/* Use an abstract socket (not in filesystem) */
+	memset(server->sun_path, '\0', sizeof(server->sun_path));
+	strlcpy(server->sun_path + 1, UDS_SOCK, sizeof(server->sun_path) - 1);
 
 	if (connect(sock, (struct sockaddr *)server, sizeof(struct sockaddr_un)) < 0) {
 		close(sock);
@@ -1393,7 +1396,10 @@ send_msg(int sock, char *request, int *fd)
 
 	memset(&dst, 0, sizeof(dst));
 	dst.sun_family = AF_UNIX;
-	strlcpy(dst.sun_path, UDS_SOCK, sizeof(dst.sun_path));
+
+	/* Use an abstract socket (not in filesystem) */
+	memset(dst.sun_path, '\0', sizeof(dst.sun_path));
+	strlcpy(dst.sun_path + 1, UDS_SOCK, sizeof(dst.sun_path) - 1);
 
 	/* Initialize message header structure */
 	memset(&msgh, 0, sizeof(msgh));
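
One detail follows from the man page excerpt quoted above: the abstract name is defined by the address length passed to bind() and connect(), and NUL bytes count. If `sizeof(struct sockaddr_un)` is passed as the length, the zero padding becomes part of the name, so both peers have to use the same convention. The sketch below is generic socket code with illustrative function names (not driver code) that makes the mismatch visible.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Fill addr with an abstract name and return the exact address length
 * (family + leading NUL + name bytes, no padding). */
static socklen_t
abstract_addr_exact(struct sockaddr_un *addr, const char *name)
{
	size_t n = strlen(name);

	memset(addr, 0, sizeof(*addr));
	addr->sun_family = AF_UNIX;
	if (n > sizeof(addr->sun_path) - 1)
		n = sizeof(addr->sun_path) - 1;
	memcpy(addr->sun_path + 1, name, n);
	return (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 1 + n);
}

/*
 * Bind a listener and connect a client to the same abstract name.
 * A "padded" side passes sizeof(struct sockaddr_un) as the length,
 * an unpadded side passes the exact length.
 * Returns 0 if the client connects, -1 otherwise.
 */
static int
abstract_connect(const char *name, int server_padded, int client_padded)
{
	struct sockaddr_un addr;
	socklen_t exact, slen, clen;
	int srv, cli, ret = -1;

	exact = abstract_addr_exact(&addr, name);
	slen = server_padded ? (socklen_t)sizeof(addr) : exact;
	clen = client_padded ? (socklen_t)sizeof(addr) : exact;

	srv = socket(AF_UNIX, SOCK_STREAM, 0);
	cli = socket(AF_UNIX, SOCK_STREAM, 0);
	if (srv >= 0 && cli >= 0 &&
	    bind(srv, (struct sockaddr *)&addr, slen) == 0 &&
	    listen(srv, 1) == 0 &&
	    connect(cli, (struct sockaddr *)&addr, clen) == 0)
		ret = 0;

	if (cli >= 0)
		close(cli);
	if (srv >= 0)
		close(srv);
	return ret;
}
```

Matching conventions on both sides connect; a padded server with an exact-length client (or the reverse) does not, because the two are addressing different names.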
  
Maryam Tahhan Dec. 11, 2023, 1:22 p.m. UTC | #14
On 08/12/2023 18:10, Maryam Tahhan wrote:
> Thanks Stephen,  I will have a look.
>
> I've seen a few places mention that abstract sockets are attached to 
> the network namespace of a process. For our use case the 2 processes 
> (pods) will have separate network namespaces. So I'm not sure it will 
> work. However, it should be easy to validate and I can give it a try 
> in a k8s environment for completeness. Otherwise all the pods would 
> need to be host networked which is not what we want at all.
>
> I was able to find a case where abstract sockets were used by 
> containerd (CVE-2020-15257) [1]. Our AF_XDP DP Pod is also host 
> networked and so it seems that we would be opening ourselves up to 
> similar issues,  in that a bad acting container could block containers 
> that actually want to use afxdp_dp by simply connecting to the DP and 
> just failing to handshake on all the abstract sockets it finds in the 
> host namespace.
>
> I will circle back on Mon re the first open, but considering that 
> containerd abandoned this approach, I'm not sure it's the way to go 
> for us. But let's cross that bridge after we have an answer to the 
> first issue.
>
Hi Stephen

Circling back, I built a simple example here [1] using kind. The 
abstract sockets don't work across network namespaces (which is our 
scenario with the Pods) and so will not be usable for what we are trying 
to do here.

The example creates a simple kind cluster. It builds a simple docker 
image that incorporates socat. Then it launches 2 pods:

- The first pod is the server (it will use socat to create an abstract 
socket).

- The second pod is the client (it will use socat to try to connect to 
the abstract socket).


The connection attempt in the client fails.


[1] https://github.com/maryamtahhan/ans-kind-example
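
The same behaviour can be approximated without a cluster. The sketch below is not the kind/socat example linked above; it is a stand-alone approximation in which a forked child moves into a fresh network namespace (standing in for a second pod) and then tries to reach an abstract socket bound by the parent. The function name and result codes are illustrative. The raw `SYS_unshare` syscall is used only to keep the sketch free of feature-test macros, and the child also creates a user namespace so that no root privileges are required; where that is not permitted, the result is reported as skipped.

```c
#include <errno.h>
#include <linux/sched.h>	/* CLONE_NEWUSER, CLONE_NEWNET */
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/syscall.h>
#include <sys/un.h>
#include <sys/wait.h>
#include <unistd.h>

enum {
	NETNS_REFUSED = 0,	/* connect failed with ECONNREFUSED */
	NETNS_CONNECTED = 1,	/* connect unexpectedly succeeded */
	NETNS_SKIPPED = 2,	/* could not create a new namespace */
	NETNS_ERROR = 3		/* any other failure */
};

static int
abstract_connect_from_new_netns(const char *name)
{
	struct sockaddr_un addr;
	socklen_t len;
	size_t n = strlen(name);
	int srv, status;
	pid_t pid;

	if (n == 0 || n > sizeof(addr.sun_path) - 1)
		return NETNS_ERROR;

	/* Abstract address: leading NUL, then the name, exact length. */
	memset(&addr, 0, sizeof(addr));
	addr.sun_family = AF_UNIX;
	memcpy(addr.sun_path + 1, name, n);
	len = (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 1 + n);

	/* "Server": listens in the current network namespace. */
	srv = socket(AF_UNIX, SOCK_STREAM, 0);
	if (srv < 0)
		return NETNS_ERROR;
	if (bind(srv, (struct sockaddr *)&addr, len) < 0 ||
	    listen(srv, 1) < 0) {
		close(srv);
		return NETNS_ERROR;
	}

	pid = fork();
	if (pid < 0) {
		close(srv);
		return NETNS_ERROR;
	}
	if (pid == 0) {
		/* "Client": enters a new network namespace first. */
		int cli;

		if (syscall(SYS_unshare, CLONE_NEWUSER | CLONE_NEWNET) < 0)
			_exit(NETNS_SKIPPED);
		cli = socket(AF_UNIX, SOCK_STREAM, 0);
		if (cli < 0)
			_exit(NETNS_ERROR);
		if (connect(cli, (struct sockaddr *)&addr, len) == 0)
			_exit(NETNS_CONNECTED);
		_exit(errno == ECONNREFUSED ? NETNS_REFUSED : NETNS_ERROR);
	}

	if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status)) {
		close(srv);
		return NETNS_ERROR;
	}
	close(srv);
	return WEXITSTATUS(status);
}
```

A pathname socket does not have this limitation as long as both sides can see the socket file, which is what the volume mount provides in the pod case.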
  

Patch

diff --git a/doc/guides/howto/af_xdp_cni.rst b/doc/guides/howto/af_xdp_cni.rst
index a1a6d5b99c..7829526b40 100644
--- a/doc/guides/howto/af_xdp_cni.rst
+++ b/doc/guides/howto/af_xdp_cni.rst
@@ -38,9 +38,10 @@  The XSKMAP is a BPF map of AF_XDP sockets (XSK).
 The client can then proceed with creating an AF_XDP socket
 and inserting that socket into the XSKMAP pointed to by the descriptor.
 
-The EAL vdev argument ``use_cni`` is used to indicate that the user wishes
-to run the PMD in unprivileged mode and to receive the XSKMAP file descriptor
-from the CNI.
+The EAL vdev arguments ``use_cni`` and ``uds_path`` are used to indicate that
+the user wishes to run the PMD in unprivileged mode and to receive the XSKMAP
+file descriptor from the CNI.
+
 When this flag is set,
 the ``XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD`` libbpf flag
 should be used when creating the socket
@@ -49,7 +50,7 @@  Instead the loading is handled by the CNI.
 
 .. note::
 
-   The Unix Domain Socket file path appear in the end user is "/tmp/afxdp.sock".
+   The Unix Domain Socket file path appears to the end user at "/tmp/afxdp_dp/<netdev>/afxdp.sock".
 
 
 Prerequisites
@@ -223,8 +224,7 @@  Howto run dpdk-testpmd with CNI plugin:
          securityContext:
           capabilities:
              add:
-               - CAP_NET_RAW
-               - CAP_BPF
+               - NET_RAW
          resources:
            requests:
              hugepages-2Mi: 2Gi
@@ -239,14 +239,20 @@  Howto run dpdk-testpmd with CNI plugin:
 
   .. _pod.yaml: https://github.com/intel/afxdp-plugins-for-kubernetes/blob/v0.0.2/test/e2e/pod-1c1d.yaml
 
+.. note::
+
+   For Kernel versions older than 5.19 `CAP_BPF` is also required in
+   the container capabilities stanza.
+
 * Run DPDK with a command like the following:
 
   .. code-block:: console
 
      kubectl exec -i <Pod name> --container <containers name> -- \
-           /<Path>/dpdk-testpmd -l 0,1 --no-pci \
-           --vdev=net_af_xdp0,use_cni=1,iface=<interface name> \
-           -- --no-mlockall --in-memory
+           /<Path>/dpdk-testpmd -l 0-2 --no-pci --main-lcore=2 \
+           --vdev net_af_xdp0,iface=<interface name>,use_cni=1,uds_path=/tmp/afxdp_dp/<interface name>/afxdp.sock \
+           --vdev net_af_xdp1,iface=<interface name>,use_cni=1,uds_path=/tmp/afxdp_dp/<interface name>/afxdp.sock \
+           -- -i --a --nb-cores=2 --rxq=1 --txq=1 --forward-mode=macswap;
 
 For further reference please use the `e2e`_ test case in `AF_XDP Plugin for Kubernetes`_
 
diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c b/drivers/net/af_xdp/rte_eth_af_xdp.c
index 353c8688ec..505ed6cf1e 100644
--- a/drivers/net/af_xdp/rte_eth_af_xdp.c
+++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
@@ -88,7 +88,6 @@  RTE_LOG_REGISTER_DEFAULT(af_xdp_logtype, NOTICE);
 #define UDS_MAX_CMD_LEN			64
 #define UDS_MAX_CMD_RESP		128
 #define UDS_XSK_MAP_FD_MSG		"/xsk_map_fd"
-#define UDS_SOCK			"/tmp/afxdp.sock"
 #define UDS_CONNECT_MSG			"/connect"
 #define UDS_HOST_OK_MSG			"/host_ok"
 #define UDS_HOST_NAK_MSG		"/host_nak"
@@ -171,6 +170,7 @@  struct pmd_internals {
 	bool custom_prog_configured;
 	bool force_copy;
 	bool use_cni;
+	char uds_path[PATH_MAX];
 	struct bpf_map *map;
 
 	struct rte_ether_addr eth_addr;
@@ -191,6 +191,7 @@  struct pmd_process_private {
 #define ETH_AF_XDP_BUDGET_ARG			"busy_budget"
 #define ETH_AF_XDP_FORCE_COPY_ARG		"force_copy"
 #define ETH_AF_XDP_USE_CNI_ARG			"use_cni"
+#define ETH_AF_XDP_USE_CNI_UDS_PATH_ARG	"uds_path"
 
 static const char * const valid_arguments[] = {
 	ETH_AF_XDP_IFACE_ARG,
@@ -201,6 +202,7 @@  static const char * const valid_arguments[] = {
 	ETH_AF_XDP_BUDGET_ARG,
 	ETH_AF_XDP_FORCE_COPY_ARG,
 	ETH_AF_XDP_USE_CNI_ARG,
+	ETH_AF_XDP_USE_CNI_UDS_PATH_ARG,
 	NULL
 };
 
@@ -1351,7 +1353,7 @@  configure_preferred_busy_poll(struct pkt_rx_queue *rxq)
 }
 
 static int
-init_uds_sock(struct sockaddr_un *server)
+init_uds_sock(struct sockaddr_un *server, const char *uds_path)
 {
 	int sock;
 
@@ -1362,7 +1364,7 @@  init_uds_sock(struct sockaddr_un *server)
 	}
 
 	server->sun_family = AF_UNIX;
-	strlcpy(server->sun_path, UDS_SOCK, sizeof(server->sun_path));
+	strlcpy(server->sun_path, uds_path, sizeof(server->sun_path));
 
 	if (connect(sock, (struct sockaddr *)server, sizeof(struct sockaddr_un)) < 0) {
 		close(sock);
@@ -1382,7 +1384,7 @@  struct msg_internal {
 };
 
 static int
-send_msg(int sock, char *request, int *fd)
+send_msg(int sock, char *request, int *fd, const char *uds_path)
 {
 	int snd;
 	struct iovec iov;
@@ -1393,7 +1395,7 @@  send_msg(int sock, char *request, int *fd)
 
 	memset(&dst, 0, sizeof(dst));
 	dst.sun_family = AF_UNIX;
-	strlcpy(dst.sun_path, UDS_SOCK, sizeof(dst.sun_path));
+	strlcpy(dst.sun_path, uds_path, sizeof(dst.sun_path));
 
 	/* Initialize message header structure */
 	memset(&msgh, 0, sizeof(msgh));
@@ -1471,7 +1473,7 @@  read_msg(int sock, char *response, struct sockaddr_un *s, int *fd)
 
 static int
 make_request_cni(int sock, struct sockaddr_un *server, char *request,
-		 int *req_fd, char *response, int *out_fd)
+		 int *req_fd, char *response, int *out_fd, const char *uds_path)
 {
 	int rval;
 
@@ -1483,7 +1485,7 @@  make_request_cni(int sock, struct sockaddr_un *server, char *request,
 	if (req_fd == NULL)
 		rval = write(sock, request, strlen(request));
 	else
-		rval = send_msg(sock, request, req_fd);
+		rval = send_msg(sock, request, req_fd, uds_path);
 
 	if (rval < 0) {
 		AF_XDP_LOG(ERR, "Write error %s\n", strerror(errno));
@@ -1507,7 +1509,7 @@  check_response(char *response, char *exp_resp, long size)
 }
 
 static int
-get_cni_fd(char *if_name)
+get_cni_fd(char *if_name, const char *uds_path)
 {
 	char request[UDS_MAX_CMD_LEN], response[UDS_MAX_CMD_RESP];
 	char hostname[MAX_LONG_OPT_SZ], exp_resp[UDS_MAX_CMD_RESP];
@@ -1520,14 +1522,14 @@  get_cni_fd(char *if_name)
 		return -1;
 
 	memset(&server, 0, sizeof(server));
-	sock = init_uds_sock(&server);
+	sock = init_uds_sock(&server, uds_path);
 	if (sock < 0)
 		return -1;
 
 	/* Initiates handshake to CNI send: /connect,hostname */
 	snprintf(request, sizeof(request), "%s,%s", UDS_CONNECT_MSG, hostname);
 	memset(response, 0, sizeof(response));
-	if (make_request_cni(sock, &server, request, NULL, response, &out_fd) < 0) {
+	if (make_request_cni(sock, &server, request, NULL, response, &out_fd, uds_path) < 0) {
 		AF_XDP_LOG(ERR, "Error in processing cmd [%s]\n", request);
 		goto err_close;
 	}
@@ -1541,7 +1543,7 @@  get_cni_fd(char *if_name)
 	/* Request for "/version" */
 	strlcpy(request, UDS_VERSION_MSG, UDS_MAX_CMD_LEN);
 	memset(response, 0, sizeof(response));
-	if (make_request_cni(sock, &server, request, NULL, response, &out_fd) < 0) {
+	if (make_request_cni(sock, &server, request, NULL, response, &out_fd, uds_path) < 0) {
 		AF_XDP_LOG(ERR, "Error in processing cmd [%s]\n", request);
 		goto err_close;
 	}
@@ -1549,7 +1551,7 @@  get_cni_fd(char *if_name)
 	/* Request for file descriptor for netdev name*/
 	snprintf(request, sizeof(request), "%s,%s", UDS_XSK_MAP_FD_MSG, if_name);
 	memset(response, 0, sizeof(response));
-	if (make_request_cni(sock, &server, request, NULL, response, &out_fd) < 0) {
+	if (make_request_cni(sock, &server, request, NULL, response, &out_fd, uds_path) < 0) {
 		AF_XDP_LOG(ERR, "Error in processing cmd [%s]\n", request);
 		goto err_close;
 	}
@@ -1571,7 +1573,7 @@  get_cni_fd(char *if_name)
 	/* Initiate close connection */
 	strlcpy(request, UDS_FIN_MSG, UDS_MAX_CMD_LEN);
 	memset(response, 0, sizeof(response));
-	if (make_request_cni(sock, &server, request, NULL, response, &out_fd) < 0) {
+	if (make_request_cni(sock, &server, request, NULL, response, &out_fd, uds_path) < 0) {
 		AF_XDP_LOG(ERR, "Error in processing cmd [%s]\n", request);
 		goto err_close;
 	}
@@ -1695,17 +1697,16 @@  xsk_configure(struct pmd_internals *internals, struct pkt_rx_queue *rxq,
 	}
 
 	if (internals->use_cni) {
-		int err, fd, map_fd;
+		int err, map_fd;
 
 		/* get socket fd from CNI plugin */
-		map_fd = get_cni_fd(internals->if_name);
+		map_fd = get_cni_fd(internals->if_name, internals->uds_path);
 		if (map_fd < 0) {
 			AF_XDP_LOG(ERR, "Failed to receive CNI plugin fd\n");
 			goto out_xsk;
 		}
-		/* get socket fd */
-		fd = xsk_socket__fd(rxq->xsk);
-		err = bpf_map_update_elem(map_fd, &rxq->xsk_queue_idx, &fd, 0);
+
+		err = xsk_socket__update_xskmap(rxq->xsk, map_fd);
 		if (err) {
 			AF_XDP_LOG(ERR, "Failed to insert unprivileged xsk in map.\n");
 			goto out_xsk;
@@ -2023,7 +2024,8 @@  xdp_get_channels_info(const char *if_name, int *max_queues,
 static int
 parse_parameters(struct rte_kvargs *kvlist, char *if_name, int *start_queue,
 		 int *queue_cnt, int *shared_umem, char *prog_path,
-		 int *busy_budget, int *force_copy, int *use_cni)
+		 int *busy_budget, int *force_copy, int *use_cni,
+		 char *uds_path)
 {
 	int ret;
 
@@ -2069,6 +2071,11 @@  parse_parameters(struct rte_kvargs *kvlist, char *if_name, int *start_queue,
 	if (ret < 0)
 		goto free_kvlist;
 
+	ret = rte_kvargs_process(kvlist, ETH_AF_XDP_USE_CNI_UDS_PATH_ARG,
+				 &parse_prog_arg, uds_path);
+	if (ret < 0)
+		goto free_kvlist;
+
 free_kvlist:
 	rte_kvargs_free(kvlist);
 	return ret;
@@ -2108,7 +2115,7 @@  static struct rte_eth_dev *
 init_internals(struct rte_vdev_device *dev, const char *if_name,
 	       int start_queue_idx, int queue_cnt, int shared_umem,
 	       const char *prog_path, int busy_budget, int force_copy,
-	       int use_cni)
+	       int use_cni, const char *uds_path)
 {
 	const char *name = rte_vdev_device_name(dev);
 	const unsigned int numa_node = dev->device.numa_node;
@@ -2138,6 +2145,7 @@  init_internals(struct rte_vdev_device *dev, const char *if_name,
 	internals->shared_umem = shared_umem;
 	internals->force_copy = force_copy;
 	internals->use_cni = use_cni;
+	strlcpy(internals->uds_path, uds_path, PATH_MAX);
 
 	if (xdp_get_channels_info(if_name, &internals->max_queue_cnt,
 				  &internals->combined_queue_cnt)) {
@@ -2328,6 +2336,7 @@  rte_pmd_af_xdp_probe(struct rte_vdev_device *dev)
 	int busy_budget = -1, ret;
 	int force_copy = 0;
 	int use_cni = 0;
+	char uds_path[PATH_MAX] = {'\0'};
 	struct rte_eth_dev *eth_dev = NULL;
 	const char *name = rte_vdev_device_name(dev);
 
@@ -2370,7 +2379,7 @@  rte_pmd_af_xdp_probe(struct rte_vdev_device *dev)
 
 	if (parse_parameters(kvlist, if_name, &xsk_start_queue_idx,
 			     &xsk_queue_cnt, &shared_umem, prog_path,
-			     &busy_budget, &force_copy, &use_cni) < 0) {
+			     &busy_budget, &force_copy, &use_cni, uds_path) < 0) {
 		AF_XDP_LOG(ERR, "Invalid kvargs value\n");
 		return -EINVAL;
 	}
@@ -2387,6 +2396,12 @@  rte_pmd_af_xdp_probe(struct rte_vdev_device *dev)
 			return -EINVAL;
 	}
 
+	if (use_cni && !strnlen(uds_path, PATH_MAX)) {
+		AF_XDP_LOG(ERR, "When '%s' parameter is used, '%s' must also be provided\n",
+			ETH_AF_XDP_USE_CNI_ARG, ETH_AF_XDP_USE_CNI_UDS_PATH_ARG);
+		return -EINVAL;
+	}
+
 	if (strlen(if_name) == 0) {
 		AF_XDP_LOG(ERR, "Network interface must be specified\n");
 		return -EINVAL;
@@ -2410,7 +2425,7 @@  rte_pmd_af_xdp_probe(struct rte_vdev_device *dev)
 
 	eth_dev = init_internals(dev, if_name, xsk_start_queue_idx,
 				 xsk_queue_cnt, shared_umem, prog_path,
-				 busy_budget, force_copy, use_cni);
+				 busy_budget, force_copy, use_cni, uds_path);
 	if (eth_dev == NULL) {
 		AF_XDP_LOG(ERR, "Failed to init internals\n");
 		return -1;
@@ -2471,4 +2486,5 @@  RTE_PMD_REGISTER_PARAM_STRING(net_af_xdp,
 			      "xdp_prog=<string> "
 			      "busy_budget=<int> "
 			      "force_copy=<int> "
-			      "use_cni=<int> ");
+			      "use_cni=<int> "
+			      "uds_path=<string> ");