Patch Detail
GET /api/patches/98589/?format=api
[dpdk-dev] [PATCH v1 1/1] vfio: add page-by-page mapping API

Project:      DPDK (dev@dpdk.org, list dev.dpdk.org)
Message ID:   <043fc2d53770da8248b9cd0214775f9d41f2e0fb.1631273229.git.anatoly.burakov@intel.com>
Date:         Fri, 10 Sep 2021 11:27:35 +0000
From:         Anatoly Burakov <anatoly.burakov@intel.com>
To:           dev@dpdk.org, Bruce Richardson <bruce.richardson@intel.com>,
              Ray Kinsella <mdr@ashroe.eu>, Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>,
              Narcisa Ana Maria Vasile <navasile@linux.microsoft.com>,
              Dmitry Malloy <dmitrym@microsoft.com>, Pallavi Kadam <pallavi.kadam@intel.com>
Cc:           xuan.ding@intel.com, ferruh.yigit@intel.com
Delegate:     David Marchand <david.marchand@redhat.com>
State:        superseded, archived
Check:        warning
Hash:         b6b4ff74d470831f28a93c5915e156c55ea1a56d
Series:       [v1,1/1] vfio: add page-by-page mapping API (series 18835, version 1)
Patch web:    http://patchwork.dpdk.org/project/dpdk/patch/043fc2d53770da8248b9cd0214775f9d41f2e0fb.1631273229.git.anatoly.burakov@intel.com/
Patch mbox:   http://patchwork.dpdk.org/project/dpdk/patch/043fc2d53770da8248b9cd0214775f9d41f2e0fb.1631273229.git.anatoly.burakov@intel.com/mbox/
Series mbox:  http://patchwork.dpdk.org/series/18835/mbox/
Comments:     http://patchwork.dpdk.org/api/patches/98589/comments/
Checks:       http://patchwork.dpdk.org/api/patches/98589/checks/
List archive: https://inbox.dpdk.org/dev/043fc2d53770da8248b9cd0214775f9d41f2e0fb.1631273229.git.anatoly.burakov@intel.com

Currently, there is no way to map memory for DMA in a way that allows
unmapping it partially later, because some IOMMUs do not support
partial unmapping. The existing workaround is to map each of these
segments separately, but this is inconvenient, so this commit adds a
proper API that does it.

This commit relies on the earlier infrastructure built out to support
"chunking", since the concept of a "chunk" is essentially the same as
a page size.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/eal/freebsd/eal.c      | 10 ++++
 lib/eal/include/rte_vfio.h | 33 ++++++++++++++
 lib/eal/linux/eal_vfio.c   | 93 +++++++++++++++++++++++++++++++-------
 lib/eal/version.map        |  3 ++
 lib/eal/windows/eal.c      | 10 ++++
 5 files changed, 133 insertions(+), 16 deletions(-)
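Before the diff itself, here is a minimal usage sketch of the API this patch
introduces. It is not part of the patch: the wrapper function, its arguments,
and the immediate unmap of the first page are illustrative only, and error
reporting is trimmed.

#include <stdint.h>
#include <rte_vfio.h>

/* Map a page-aligned buffer so individual pages can be released later,
 * even on IOMMUs that cannot split an existing mapping. */
static int
map_for_dma(uint64_t vaddr, uint64_t iova, uint64_t len, uint64_t pagesz)
{
	/* len must be a multiple of pagesz and pagesz a power of two,
	 * otherwise the call fails with rte_errno set to EINVAL */
	if (rte_vfio_container_dma_map_paged(RTE_VFIO_DEFAULT_CONTAINER_FD,
			vaddr, iova, len, pagesz) < 0)
		return -1;

	/* later, a single page can be unmapped on its own; a whole-segment
	 * rte_vfio_container_dma_map() cannot guarantee this when the IOMMU
	 * lacks partial-unmap support */
	return rte_vfio_container_dma_unmap(RTE_VFIO_DEFAULT_CONTAINER_FD,
			vaddr, iova, pagesz);
}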
diff --git a/lib/eal/freebsd/eal.c b/lib/eal/freebsd/eal.c
index 6cee5ae369..78e18f9765 100644
--- a/lib/eal/freebsd/eal.c
+++ b/lib/eal/freebsd/eal.c
@@ -1085,6 +1085,16 @@ rte_vfio_container_dma_map(__rte_unused int container_fd,
 	return -1;
 }
 
+int
+rte_vfio_container_dma_map_paged(__rte_unused int container_fd,
+		__rte_unused uint64_t vaddr,
+		__rte_unused uint64_t iova,
+		__rte_unused uint64_t len,
+		__rte_unused uint64_t pagesz)
+{
+	return -1;
+}
+
 int
 rte_vfio_container_dma_unmap(__rte_unused int container_fd,
 			__rte_unused uint64_t vaddr,
diff --git a/lib/eal/include/rte_vfio.h b/lib/eal/include/rte_vfio.h
index 2d90b36480..6afae2ccce 100644
--- a/lib/eal/include/rte_vfio.h
+++ b/lib/eal/include/rte_vfio.h
@@ -17,6 +17,8 @@ extern "C" {
 #include <stdbool.h>
 #include <stdint.h>
 
+#include <rte_compat.h>
+
 /*
  * determine if VFIO is present on the system
  */
@@ -331,6 +333,37 @@ int
 rte_vfio_container_dma_map(int container_fd, uint64_t vaddr,
 		uint64_t iova, uint64_t len);
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
+ *
+ * Perform DMA mapping for devices in a container, mapping memory page-by-page.
+ *
+ * @param container_fd
+ *   the specified container fd. Use RTE_VFIO_DEFAULT_CONTAINER_FD to
+ *   use the default container.
+ *
+ * @param vaddr
+ *   Starting virtual address of memory to be mapped.
+ *
+ * @param iova
+ *   Starting IOVA address of memory to be mapped.
+ *
+ * @param len
+ *   Length of memory segment being mapped.
+ *
+ * @param pagesz
+ *   Page size of the underlying memory.
+ *
+ * @return
+ *   0 if successful
+ *   <0 if failed
+ */
+__rte_experimental
+int
+rte_vfio_container_dma_map_paged(int container_fd, uint64_t vaddr,
+		uint64_t iova, uint64_t len, uint64_t pagesz);
+
 /**
  * Perform DMA unmapping for devices in a container.
  *
diff --git a/lib/eal/linux/eal_vfio.c b/lib/eal/linux/eal_vfio.c
index 657c89ca58..c791730251 100644
--- a/lib/eal/linux/eal_vfio.c
+++ b/lib/eal/linux/eal_vfio.c
@@ -1872,11 +1872,12 @@ vfio_dma_mem_map(struct vfio_config *vfio_cfg, uint64_t vaddr, uint64_t iova,
 
 static int
 container_dma_map(struct vfio_config *vfio_cfg, uint64_t vaddr, uint64_t iova,
-		uint64_t len)
+		uint64_t len, uint64_t pagesz)
 {
 	struct user_mem_map *new_map;
 	struct user_mem_maps *user_mem_maps;
 	bool has_partial_unmap;
+	uint64_t chunk_size;
 	int ret = 0;
 
 	user_mem_maps = &vfio_cfg->mem_maps;
@@ -1887,19 +1888,37 @@ container_dma_map(struct vfio_config *vfio_cfg, uint64_t vaddr, uint64_t iova,
 		ret = -1;
 		goto out;
 	}
-	/* map the entry */
-	if (vfio_dma_mem_map(vfio_cfg, vaddr, iova, len, 1)) {
-		/* technically, this will fail if there are currently no devices
-		 * plugged in, even if a device were added later, this mapping
-		 * might have succeeded. however, since we cannot verify if this
-		 * is a valid mapping without having a device attached, consider
-		 * this to be unsupported, because we can't just store any old
-		 * mapping and pollute list of active mappings willy-nilly.
-		 */
-		RTE_LOG(ERR, EAL, "Couldn't map new region for DMA\n");
-		ret = -1;
-		goto out;
+
+	/* technically, mapping will fail if there are currently no devices
+	 * plugged in, even if a device were added later, this mapping might
+	 * have succeeded. however, since we cannot verify if this is a valid
+	 * mapping without having a device attached, consider this to be
+	 * unsupported, because we can't just store any old mapping and pollute
+	 * list of active mappings willy-nilly.
+	 */
+
+	/* if page size was not specified, map the entire segment in one go */
+	if (pagesz == 0) {
+		if (vfio_dma_mem_map(vfio_cfg, vaddr, iova, len, 1)) {
+			RTE_LOG(ERR, EAL, "Couldn't map new region for DMA\n");
+			ret = -1;
+			goto out;
+		}
+	} else {
+		/* otherwise, do mappings page-by-page */
+		uint64_t offset;
+
+		for (offset = 0; offset < len; offset += pagesz) {
+			uint64_t va = vaddr + offset;
+			uint64_t io = iova + offset;
+			if (vfio_dma_mem_map(vfio_cfg, va, io, pagesz, 1)) {
+				RTE_LOG(ERR, EAL, "Couldn't map new region for DMA\n");
+				ret = -1;
+				goto out;
+			}
+		}
 	}
+
 	/* do we have partial unmap support? */
 	has_partial_unmap = vfio_cfg->vfio_iommu_type->partial_unmap;
 
@@ -1908,8 +1927,18 @@ container_dma_map(struct vfio_config *vfio_cfg, uint64_t vaddr, uint64_t iova,
 	new_map->addr = vaddr;
 	new_map->iova = iova;
 	new_map->len = len;
-	/* for IOMMU types supporting partial unmap, we don't need chunking */
-	new_map->chunk = has_partial_unmap ? 0 : len;
+
+	/*
+	 * Chunking essentially serves largely the same purpose as page sizes,
+	 * so for the purposes of this calculation, we treat them as the same.
+	 * The reason we have page sizes is because we want to map things in a
+	 * way that allows us to partially unmap later. Therefore, when IOMMU
+	 * supports partial unmap, page size is irrelevant and can be ignored.
+	 * For IOMMU that don't support partial unmap, page size is equivalent
+	 * to chunk size.
+	 */
+	chunk_size = pagesz == 0 ? len : pagesz;
+	new_map->chunk = has_partial_unmap ? 0 : chunk_size;
 
 	compact_user_maps(user_mem_maps);
 out:
@@ -2179,7 +2208,29 @@ rte_vfio_container_dma_map(int container_fd, uint64_t vaddr, uint64_t iova,
 		return -1;
 	}
 
-	return container_dma_map(vfio_cfg, vaddr, iova, len);
+	/* not having page size means we map entire segment */
+	return container_dma_map(vfio_cfg, vaddr, iova, len, 0);
+}
+
+int
+rte_vfio_container_dma_map_paged(int container_fd, uint64_t vaddr,
+		uint64_t iova, uint64_t len, uint64_t pagesz)
+{
+	struct vfio_config *vfio_cfg;
+
+	if (len == 0 || pagesz == 0 || !rte_is_power_of_2(pagesz) ||
+			(len % pagesz) != 0) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	vfio_cfg = get_vfio_cfg_by_container_fd(container_fd);
+	if (vfio_cfg == NULL) {
+		RTE_LOG(ERR, EAL, "Invalid VFIO container fd\n");
+		return -1;
+	}
+
+	return container_dma_map(vfio_cfg, vaddr, iova, len, pagesz);
 }
 
 int
@@ -2299,6 +2350,16 @@ rte_vfio_container_dma_map(__rte_unused int container_fd,
 	return -1;
 }
 
+int
+rte_vfio_container_dma_map_paged(__rte_unused int container_fd,
+		__rte_unused uint64_t vaddr,
+		__rte_unused uint64_t iova,
+		__rte_unused uint64_t len,
+		__rte_unused uint64_t pagesz)
+{
+	return -1;
+}
+
 int
 rte_vfio_container_dma_unmap(__rte_unused int container_fd,
 		__rte_unused uint64_t vaddr,
diff --git a/lib/eal/version.map b/lib/eal/version.map
index beeb986adc..eaa6b0bedf 100644
--- a/lib/eal/version.map
+++ b/lib/eal/version.map
@@ -426,6 +426,9 @@ EXPERIMENTAL {
 
 	# added in 21.08
 	rte_power_monitor_multi; # WINDOWS_NO_EXPORT
+
+	# added in 21.11
+	rte_vfio_container_dma_map_paged;
 };
 
 INTERNAL {
diff --git a/lib/eal/windows/eal.c b/lib/eal/windows/eal.c
index 3d8c520412..fcd6bc1894 100644
--- a/lib/eal/windows/eal.c
+++ b/lib/eal/windows/eal.c
@@ -459,6 +459,16 @@ rte_vfio_container_dma_map(__rte_unused int container_fd,
 	return -1;
 }
 
+int
+rte_vfio_container_dma_map_paged(__rte_unused int container_fd,
+		__rte_unused uint64_t vaddr,
+		__rte_unused uint64_t iova,
+		__rte_unused uint64_t len,
+		__rte_unused uint64_t pagesz)
+{
+	return -1;
+}
+
 int
 rte_vfio_container_dma_unmap(__rte_unused int container_fd,
 			__rte_unused uint64_t vaddr,
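A closing note on the "chunking" the commit message refers to: with this
patch, a mapping registered page-by-page records the page size as its chunk,
and user unmap requests are only honoured at chunk granularity unless the
IOMMU supports partial unmap. The sketch below is a simplified model of that
rule written for this page, not code from eal_vfio.c; the struct and helper
name are invented.

#include <stdbool.h>
#include <stdint.h>

/* Simplified model of a user mem map entry as this patch treats it:
 * chunk == 0 means the IOMMU can split mappings arbitrarily; otherwise
 * an unmap must start on a chunk boundary and cover whole chunks. */
struct mem_map_entry {
	uint64_t addr;
	uint64_t len;
	uint64_t chunk;
};

static bool
unmap_request_is_valid(const struct mem_map_entry *m,
		uint64_t addr, uint64_t len)
{
	if (addr < m->addr || addr + len > m->addr + m->len)
		return false;   /* outside the mapped region */
	if (m->chunk == 0)
		return true;    /* partial unmap supported by the IOMMU */
	return ((addr - m->addr) % m->chunk) == 0 && (len % m->chunk) == 0;
}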