From patchwork Thu Feb 29 13:21:22 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Maryam Tahhan
X-Patchwork-Id: 137493
X-Patchwork-Delegate: ferruh.yigit@amd.com
From: Maryam Tahhan
To: ferruh.yigit@amd.com, stephen@networkplumber.org, lihuisong@huawei.com,
 fengchengwen@huawei.com, liuyonglong@huawei.com, david.marchand@redhat.com,
 shibin.koikkara.reeny@intel.com, ciara.loftus@intel.com
Cc: dev@dpdk.org, Maryam Tahhan
Subject: [v11 3/3] net/af_xdp: support AF_XDP DP pinned maps
Date: Thu, 29 Feb 2024 08:21:22 -0500
Message-ID: <20240229132129.656166-4-mtahhan@redhat.com>
X-Mailer: git-send-email 2.41.0
In-Reply-To: <20240229132129.656166-1-mtahhan@redhat.com>
References: <20240229132129.656166-1-mtahhan@redhat.com>

Enable the AF_XDP PMD to retrieve the xskmap from a pinned eBPF map.
This map is expected to be pinned by an external entity like
the AF_XDP Device Plugin. This enables unprivileged pods to
create and use AF_XDP sockets.

Signed-off-by: Maryam Tahhan
---
 doc/guides/howto/af_xdp_dp.rst         | 35 ++++++++--
 doc/guides/nics/af_xdp.rst             | 34 ++++++++--
 doc/guides/rel_notes/release_24_03.rst | 10 +++
 drivers/net/af_xdp/rte_eth_af_xdp.c    | 93 ++++++++++++++++++++------
 4 files changed, 141 insertions(+), 31 deletions(-)

diff --git a/doc/guides/howto/af_xdp_dp.rst b/doc/guides/howto/af_xdp_dp.rst
index 4aa6b5499f..8b9b5ebbad 100644
--- a/doc/guides/howto/af_xdp_dp.rst
+++ b/doc/guides/howto/af_xdp_dp.rst
@@ -52,10 +52,21 @@ should be used when creating the socket
 to instruct libbpf not to load the default libbpf program on the netdev.
 Instead the loading is handled by the AF_XDP Device Plugin.
 
-The EAL vdev argument ``dp_path`` is used alongside the ``use_cni`` argument
-to explicitly tell the AF_XDP PMD where to find the UDS to interact with the
-AF_XDP Device Plugin. If this argument is not passed alongside the ``use_cni``
-argument then the AF_XDP PMD configures it internally.
+The EAL vdev argument ``use_pinned_map`` is used to indicate to the AF_XDP PMD to
+retrieve the XSKMAP fd from a pinned eBPF map. This map is expected to be pinned
+by an external entity like the AF_XDP Device Plugin. This enables unprivileged pods
+to create and use AF_XDP sockets. When this flag is set, the
+``XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD`` libbpf flag is used by the AF_XDP PMD when
+creating the AF_XDP socket.
+
+The EAL vdev argument ``dp_path`` is used alongside the ``use_cni`` or ``use_pinned_map``
+arguments to explicitly tell the AF_XDP PMD where to find either:
+
+1. The UDS to interact with the AF_XDP Device Plugin. OR
+2. The pinned xskmap to use when creating AF_XDP sockets.
+
+If this argument is not passed alongside the ``use_cni`` or ``use_pinned_map`` arguments then
+the AF_XDP PMD configures it internally to the `AF_XDP Device Plugin for Kubernetes`_ default.
 
 .. note::
 
@@ -312,8 +323,18 @@ Run dpdk-testpmd with the AF_XDP Device Plugin + CNI
       --no-mlockall --in-memory \
       -- -i --a --nb-cores=2 --rxq=1 --txq=1 --forward-mode=macswap;
 
+  Or
+
+  .. code-block:: console
+
+    kubectl exec -i --container -- \
+      //dpdk-testpmd -l 0,1 --no-pci \
+      --vdev=net_af_xdp0,use_pinned_map=1,iface=,dp_path="/tmp/afxdp_dp//xsks_map" \
+      --no-mlockall --in-memory \
+      -- -i --a --nb-cores=2 --rxq=1 --txq=1 --forward-mode=macswap;
+
   .. note::
 
-    If the ``dp_path`` parameter isn't explicitly set (like the example above)
-    the AF_XDP PMD will set the parameter value to
-    ``/tmp/afxdp_dp/<>/afxdp.sock``.
+    If the ``dp_path`` parameter isn't explicitly set with ``use_cni`` or ``use_pinned_map``
+    the AF_XDP PMD will set the parameter values to the `AF_XDP Device Plugin for Kubernetes`_
+    defaults.
diff --git a/doc/guides/nics/af_xdp.rst b/doc/guides/nics/af_xdp.rst
index 7f8651beda..940bbf60f2 100644
--- a/doc/guides/nics/af_xdp.rst
+++ b/doc/guides/nics/af_xdp.rst
@@ -171,13 +171,35 @@ enable the `AF_XDP Device Plugin for Kubernetes`_ with a DPDK application/pod.
    so enabling and disabling of the promiscuous mode through the DPDK application
    is also not supported.
 
+use_pinned_map
+~~~~~~~~~~~~~~
+
+The EAL vdev argument ``use_pinned_map`` is used to indicate that the user wishes to
+load a pinned xskmap mounted by `AF_XDP Device Plugin for Kubernetes`_ in the DPDK
+application/pod.
+
+.. _AF_XDP Device Plugin for Kubernetes: https://github.com/intel/afxdp-plugins-for-kubernetes
+
+.. code-block:: console
+
+   --vdev=net_af_xdp0,use_pinned_map=1
+
+.. note::
+
+    This feature can also be used with any external entity that can pin an eBPF map, not just
+    the `AF_XDP Device Plugin for Kubernetes`_.
+
 dp_path
 ~~~~~~~
 
-The EAL vdev argument ``dp_path`` is used alongside the ``use_cni`` argument
-to explicitly tell the AF_XDP PMD where to find the UDS to interact with the
-`AF_XDP Device Plugin for Kubernetes`_. If this argument is not passed
-alongside the ``use_cni`` argument then the AF_XDP PMD configures it internally.
+The EAL vdev argument ``dp_path`` is used alongside the ``use_cni`` or ``use_pinned_map``
+arguments to explicitly tell the AF_XDP PMD where to find either:
+
+1. The UDS to interact with the AF_XDP Device Plugin. OR
+2. The pinned xskmap to use when creating AF_XDP sockets.
+
+If this argument is not passed alongside the ``use_cni`` or ``use_pinned_map`` arguments then
+the AF_XDP PMD configures it internally to the `AF_XDP Device Plugin for Kubernetes`_ default.
 
 .. _AF_XDP Device Plugin for Kubernetes: https://github.com/intel/afxdp-plugins-for-kubernetes
 
@@ -185,6 +207,10 @@ alongside the ``use_cni`` argument then the AF_XDP PMD configures it internally.
 
    --vdev=net_af_xdp0,use_cni=1,dp_path="/tmp/afxdp_dp/<>/afxdp.sock"
 
+.. code-block:: console
+
+   --vdev=net_af_xdp0,use_pinned_map=1,dp_path="/tmp/afxdp_dp/<>/xsks_map"
+
 Limitations
 -----------
 
diff --git a/doc/guides/rel_notes/release_24_03.rst b/doc/guides/rel_notes/release_24_03.rst
index b2b1f2566f..95d9a0f842 100644
--- a/doc/guides/rel_notes/release_24_03.rst
+++ b/doc/guides/rel_notes/release_24_03.rst
@@ -146,6 +146,16 @@ New Features
   compatibility for any applications already using the ``use_cni`` vdev
   argument with the AF_XDP Device Plugin.
 
+* **Integrated AF_XDP PMD with AF_XDP Device Plugin eBPF map pinning support**.
+
+  The EAL vdev argument for the AF_XDP PMD ``use_pinned_map`` was added
+  to allow Kubernetes Pods to use AF_XDP with DPDK, and run with limited
+  privileges, without having to do a full handshake over a Unix Domain
+  Socket with the Device Plugin. This flag indicates that the AF_XDP PMD
+  will be used in unprivileged mode and will obtain the XSKMAP FD by calling
+  ``bpf_obj_get()`` for an xskmap pinned (by the AF_XDP DP) inside the
+  container.
+
 
 Removed Items
 -------------
diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c b/drivers/net/af_xdp/rte_eth_af_xdp.c
index 3fb0c6a3b9..f13bdb9017 100644
--- a/drivers/net/af_xdp/rte_eth_af_xdp.c
+++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
@@ -85,6 +85,7 @@ RTE_LOG_REGISTER_DEFAULT(af_xdp_logtype, NOTICE);
 
 #define DP_BASE_PATH	"/tmp/afxdp_dp"
 #define DP_UDS_SOCK	"afxdp.sock"
+#define DP_XSK_MAP	"xsks_map"
 #define MAX_LONG_OPT_SZ	64
 #define UDS_MAX_FD_NUM	2
 #define UDS_MAX_CMD_LEN	64
@@ -172,6 +173,7 @@ struct pmd_internals {
 	bool custom_prog_configured;
 	bool force_copy;
 	bool use_cni;
+	bool use_pinned_map;
 	char dp_path[PATH_MAX];
 	struct bpf_map *map;
 
@@ -193,6 +195,7 @@ struct pmd_process_private {
 #define ETH_AF_XDP_BUDGET_ARG	"busy_budget"
 #define ETH_AF_XDP_FORCE_COPY_ARG	"force_copy"
 #define ETH_AF_XDP_USE_CNI_ARG	"use_cni"
+#define ETH_AF_XDP_USE_PINNED_MAP_ARG	"use_pinned_map"
 #define ETH_AF_XDP_DP_PATH_ARG	"dp_path"
 
 static const char * const valid_arguments[] = {
@@ -204,6 +207,7 @@ static const char * const valid_arguments[] = {
 	ETH_AF_XDP_BUDGET_ARG,
 	ETH_AF_XDP_FORCE_COPY_ARG,
 	ETH_AF_XDP_USE_CNI_ARG,
+	ETH_AF_XDP_USE_PINNED_MAP_ARG,
 	ETH_AF_XDP_DP_PATH_ARG,
 	NULL
 };
@@ -1258,6 +1262,21 @@ xsk_umem_info *xdp_umem_configure(struct pmd_internals *internals,
 }
 #endif
 
+static int
+get_pinned_map(const char *dp_path, int *map_fd)
+{
+	*map_fd = bpf_obj_get(dp_path);
+	if (*map_fd < 0) {
+		AF_XDP_LOG(ERR, "Failed to find xsks_map in %s\n", dp_path);
+		return -1;
+	}
+
+	AF_XDP_LOG(INFO, "Successfully retrieved map %s with fd %d\n",
+				dp_path, *map_fd);
+
+	return 0;
+}
+
 static int
 load_custom_xdp_prog(const char *prog_path, int if_index, struct bpf_map **map)
 {
@@ -1644,7 +1663,7 @@ xsk_configure(struct pmd_internals *internals, struct pkt_rx_queue *rxq,
 #endif
 
 	/* Disable libbpf from loading XDP program */
-	if (internals->use_cni)
+	if (internals->use_cni || internals->use_pinned_map)
 		cfg.libbpf_flags |= XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD;
 
 	if (strnlen(internals->prog_path, PATH_MAX)) {
@@ -1698,14 +1717,23 @@ xsk_configure(struct pmd_internals *internals, struct pkt_rx_queue *rxq,
 		}
 	}
 
-	if (internals->use_cni) {
+	if (internals->use_cni || internals->use_pinned_map) {
 		int err, map_fd;
 
-		/* get socket fd from AF_XDP Device Plugin */
-		map_fd = uds_get_xskmap_fd(internals->if_name, internals->dp_path);
-		if (map_fd < 0) {
-			AF_XDP_LOG(ERR, "Failed to receive xskmap fd from AF_XDP Device Plugin\n");
-			goto out_xsk;
+		if (internals->use_cni) {
+			/* get socket fd from AF_XDP Device Plugin */
+			map_fd = uds_get_xskmap_fd(internals->if_name, internals->dp_path);
+			if (map_fd < 0) {
+				AF_XDP_LOG(ERR, "Failed to receive xskmap fd from AF_XDP Device Plugin\n");
+				goto out_xsk;
+			}
+		} else {
+			/* get xskmap fd from the pinned eBPF map */
+			err = get_pinned_map(internals->dp_path, &map_fd);
+			if (err < 0 || map_fd < 0) {
+				AF_XDP_LOG(ERR, "Failed to retrieve pinned map fd\n");
+				goto out_xsk;
+			}
 		}
 
 		err = xsk_socket__update_xskmap(rxq->xsk, map_fd);
@@ -2027,7 +2055,7 @@ static int
 parse_parameters(struct rte_kvargs *kvlist, char *if_name, int *start_queue,
 		 int *queue_cnt, int *shared_umem, char *prog_path,
 		 int *busy_budget, int *force_copy, int *use_cni,
-		 char *dp_path)
+		 int *use_pinned_map, char *dp_path)
 {
 	int ret;
 
@@ -2073,6 +2101,11 @@ parse_parameters(struct rte_kvargs *kvlist, char *if_name, int *start_queue,
 	if (ret < 0)
 		goto free_kvlist;
 
+	ret = rte_kvargs_process(kvlist, ETH_AF_XDP_USE_PINNED_MAP_ARG,
+				 &parse_integer_arg, use_pinned_map);
+	if (ret < 0)
+		goto free_kvlist;
+
 	ret = rte_kvargs_process(kvlist, ETH_AF_XDP_DP_PATH_ARG,
 				 &parse_prog_arg, dp_path);
 	if (ret < 0)
@@ -2117,7 +2150,7 @@ static struct rte_eth_dev *
 init_internals(struct rte_vdev_device *dev, const char *if_name,
 	       int start_queue_idx, int queue_cnt, int shared_umem,
 	       const char *prog_path, int busy_budget, int force_copy,
-	       int use_cni, const char *dp_path)
+	       int use_cni, int use_pinned_map, const char *dp_path)
 {
 	const char *name = rte_vdev_device_name(dev);
 	const unsigned int numa_node = dev->device.numa_node;
@@ -2147,6 +2180,7 @@ init_internals(struct rte_vdev_device *dev, const char *if_name,
 	internals->shared_umem = shared_umem;
 	internals->force_copy = force_copy;
 	internals->use_cni = use_cni;
+	internals->use_pinned_map = use_pinned_map;
 	strlcpy(internals->dp_path, dp_path, PATH_MAX);
 
 	if (xdp_get_channels_info(if_name, &internals->max_queue_cnt,
@@ -2206,7 +2240,7 @@ init_internals(struct rte_vdev_device *dev, const char *if_name,
 	eth_dev->data->dev_link = pmd_link;
 	eth_dev->data->mac_addrs = &internals->eth_addr;
 	eth_dev->data->dev_flags |= RTE_ETH_DEV_AUTOFILL_QUEUE_XSTATS;
-	if (!internals->use_cni)
+	if (!internals->use_cni && !internals->use_pinned_map)
 		eth_dev->dev_ops = &ops;
 	else
 		eth_dev->dev_ops = &ops_afxdp_dp;
@@ -2338,6 +2372,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev)
 	int busy_budget = -1, ret;
 	int force_copy = 0;
 	int use_cni = 0;
+	int use_pinned_map = 0;
 	char dp_path[PATH_MAX] = {'\0'};
 	struct rte_eth_dev *eth_dev = NULL;
 	const char *name = rte_vdev_device_name(dev);
@@ -2381,20 +2416,29 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev)
 
 	if (parse_parameters(kvlist, if_name, &xsk_start_queue_idx,
 			     &xsk_queue_cnt, &shared_umem, prog_path,
-			     &busy_budget, &force_copy, &use_cni, dp_path) < 0) {
+			     &busy_budget, &force_copy, &use_cni, &use_pinned_map,
+			     dp_path) < 0) {
 		AF_XDP_LOG(ERR, "Invalid kvargs value\n");
 		return -EINVAL;
 	}
 
-	if (use_cni && busy_budget > 0) {
+	if (use_cni && use_pinned_map) {
 		AF_XDP_LOG(ERR, "When '%s' parameter is used, '%s' parameter is not valid\n",
-			ETH_AF_XDP_USE_CNI_ARG, ETH_AF_XDP_BUDGET_ARG);
+			ETH_AF_XDP_USE_CNI_ARG, ETH_AF_XDP_USE_PINNED_MAP_ARG);
 		return -EINVAL;
 	}
 
-	if (use_cni && strnlen(prog_path, PATH_MAX)) {
-		AF_XDP_LOG(ERR, "When '%s' parameter is used, '%s' parameter is not valid\n",
-			ETH_AF_XDP_USE_CNI_ARG, ETH_AF_XDP_PROG_ARG);
+	if ((use_cni || use_pinned_map) && busy_budget > 0) {
+		AF_XDP_LOG(ERR, "When '%s' or '%s' parameter is used, '%s' parameter is not valid\n",
+			ETH_AF_XDP_USE_CNI_ARG, ETH_AF_XDP_USE_PINNED_MAP_ARG,
+			ETH_AF_XDP_BUDGET_ARG);
+		return -EINVAL;
+	}
+
+	if ((use_cni || use_pinned_map) && strnlen(prog_path, PATH_MAX)) {
+		AF_XDP_LOG(ERR, "When '%s' or '%s' parameter is used, '%s' parameter is not valid\n",
+			ETH_AF_XDP_USE_CNI_ARG, ETH_AF_XDP_USE_PINNED_MAP_ARG,
+			ETH_AF_XDP_PROG_ARG);
 		return -EINVAL;
 	}
 
@@ -2404,9 +2448,16 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev)
 			ETH_AF_XDP_DP_PATH_ARG, dp_path);
 	}
 
-	if (!use_cni && strnlen(dp_path, PATH_MAX)) {
-		AF_XDP_LOG(ERR, "'%s' parameter is set, but '%s' was not enabled\n",
-			ETH_AF_XDP_DP_PATH_ARG, ETH_AF_XDP_USE_CNI_ARG);
+	if (use_pinned_map && !strnlen(dp_path, PATH_MAX)) {
+		snprintf(dp_path, sizeof(dp_path), "%s/%s/%s", DP_BASE_PATH, if_name, DP_XSK_MAP);
+		AF_XDP_LOG(INFO, "'%s' parameter not provided, setting value to '%s'\n",
+			ETH_AF_XDP_DP_PATH_ARG, dp_path);
+	}
+
+	if ((!use_cni && !use_pinned_map) && strnlen(dp_path, PATH_MAX)) {
+		AF_XDP_LOG(ERR, "'%s' parameter is set, but '%s' or '%s' were not enabled\n",
+			ETH_AF_XDP_DP_PATH_ARG, ETH_AF_XDP_USE_CNI_ARG,
+			ETH_AF_XDP_USE_PINNED_MAP_ARG);
 		return -EINVAL;
 	}
 
@@ -2433,7 +2484,8 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev)
 
 	eth_dev = init_internals(dev, if_name, xsk_start_queue_idx,
 				 xsk_queue_cnt, shared_umem, prog_path,
-				 busy_budget, force_copy, use_cni, dp_path);
+				 busy_budget, force_copy, use_cni, use_pinned_map,
+				 dp_path);
 	if (eth_dev == NULL) {
 		AF_XDP_LOG(ERR, "Failed to init internals\n");
 		return -1;
@@ -2495,4 +2547,5 @@ RTE_PMD_REGISTER_PARAM_STRING(net_af_xdp,
 			      "busy_budget= "
 			      "force_copy= "
 			      "use_cni= "
+			      "use_pinned_map= "
 			      "dp_path= ");
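
Illustration only, not part of the patch: the probe-time rules for ``use_cni``, ``use_pinned_map`` and ``dp_path`` can be summarised in a small standalone sketch. The helper name ``resolve_dp_path`` is hypothetical, and the ``use_cni`` default is assumed from the documented ``/tmp/afxdp_dp/<>/afxdp.sock`` path. The three macros are copied from the driver.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

#define DP_BASE_PATH "/tmp/afxdp_dp"
#define DP_UDS_SOCK  "afxdp.sock"
#define DP_XSK_MAP   "xsks_map"

/*
 * Hypothetical helper (not in the patch): validate the use_cni /
 * use_pinned_map / dp_path combination and fill in the default dp_path.
 * Returns 0 on success, -EINVAL on an invalid combination or truncation.
 */
static int
resolve_dp_path(int use_cni, int use_pinned_map, const char *if_name,
		char *dp_path, size_t len)
{
	int n;

	/* The two modes are mutually exclusive. */
	if (use_cni && use_pinned_map)
		return -EINVAL;

	/* dp_path is only valid together with one of the two modes. */
	if (!use_cni && !use_pinned_map)
		return dp_path[0] != '\0' ? -EINVAL : 0;

	/* An explicitly provided dp_path is left untouched. */
	if (dp_path[0] != '\0')
		return 0;

	/* Otherwise fall back to the Device Plugin default for the mode. */
	n = snprintf(dp_path, len, "%s/%s/%s", DP_BASE_PATH, if_name,
		     use_cni ? DP_UDS_SOCK : DP_XSK_MAP);
	if (n < 0 || (size_t)n >= len)
		return -EINVAL;

	return 0;
}
```

With ``use_pinned_map=1``, an interface given the placeholder name ``eth0`` and no ``dp_path``, this sketch resolves the path to ``/tmp/afxdp_dp/eth0/xsks_map``. Unlike the driver, it also rejects a default path that would not fit in the buffer.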