From patchwork Tue Nov 13 17:54:47 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Anatoly Burakov X-Patchwork-Id: 48047 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id F021E4CA2; Tue, 13 Nov 2018 18:54:54 +0100 (CET) Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by dpdk.org (Postfix) with ESMTP id 26C991D7 for ; Tue, 13 Nov 2018 18:54:51 +0100 (CET) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by fmsmga106.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 13 Nov 2018 09:54:50 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.56,228,1539673200"; d="scan'208";a="104058631" Received: from irvmail001.ir.intel.com ([163.33.26.43]) by fmsmga002.fm.intel.com with ESMTP; 13 Nov 2018 09:54:49 -0800 Received: from sivswdev01.ir.intel.com (sivswdev01.ir.intel.com [10.237.217.45]) by irvmail001.ir.intel.com (8.14.3/8.13.6/MailSET/Hub) with ESMTP id wADHsmjo032027; Tue, 13 Nov 2018 17:54:48 GMT Received: from sivswdev01.ir.intel.com (localhost [127.0.0.1]) by sivswdev01.ir.intel.com with ESMTP id wADHsmjP001124; Tue, 13 Nov 2018 17:54:48 GMT Received: (from aburakov@localhost) by sivswdev01.ir.intel.com with LOCAL id wADHsmaU001116; Tue, 13 Nov 2018 17:54:48 GMT From: Anatoly Burakov To: dev@dpdk.org Cc: Bruce Richardson , przemyslawx.lal@intel.com, kuralamudhan.ramakrishnan@intel.com, ivan.coughlan@intel.com, tiwei.bie@intel.com, ray.kinsella@intel.com Date: Tue, 13 Nov 2018 17:54:47 +0000 Message-Id: <51e22e414880066b307ca1d0342fd20b1f3c3953.1542130721.git.anatoly.burakov@intel.com> X-Mailer: git-send-email 1.7.0.7 In-Reply-To: References: In-Reply-To: References: Subject: [dpdk-dev] [PATCH 19.02 1/2] memalloc: allow setting up segment list fd's X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Currently, only segment fd's for multi-file segments are supported, while for memfd-backed no-huge memory we need single-file segments mode. Add support for single-file segments in the internal API. Signed-off-by: Anatoly Burakov --- lib/librte_eal/bsdapp/eal/eal_memalloc.c | 6 ++++++ lib/librte_eal/common/eal_memalloc.h | 4 ++++ lib/librte_eal/linuxapp/eal/eal_memalloc.c | 16 ++++++++++++++++ 3 files changed, 26 insertions(+) diff --git a/lib/librte_eal/bsdapp/eal/eal_memalloc.c b/lib/librte_eal/bsdapp/eal/eal_memalloc.c index a5847f0bd..6893448db 100644 --- a/lib/librte_eal/bsdapp/eal/eal_memalloc.c +++ b/lib/librte_eal/bsdapp/eal/eal_memalloc.c @@ -61,6 +61,12 @@ eal_memalloc_set_seg_fd(int list_idx __rte_unused, int seg_idx __rte_unused, return -ENOTSUP; } +int +eal_memalloc_set_seg_list_fd(int list_idx __rte_unused, int fd __rte_unused) +{ + return -ENOTSUP; +} + int eal_memalloc_get_seg_fd_offset(int list_idx __rte_unused, int seg_idx __rte_unused, size_t *offset __rte_unused) diff --git a/lib/librte_eal/common/eal_memalloc.h b/lib/librte_eal/common/eal_memalloc.h index af917c2f9..b96c9c512 100644 --- a/lib/librte_eal/common/eal_memalloc.h +++ b/lib/librte_eal/common/eal_memalloc.h @@ -84,6 +84,10 @@ eal_memalloc_get_seg_fd(int list_idx, int seg_idx); int eal_memalloc_set_seg_fd(int list_idx, int seg_idx, int fd); +/* returns 0 or -errno */ +int +eal_memalloc_set_seg_list_fd(int list_idx, int fd); + int eal_memalloc_get_seg_fd_offset(int list_idx, int seg_idx, size_t *offset); diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c index 48b9c7360..5bda92717 100644 --- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c +++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c @@ -1529,6 +1529,10 @@ eal_memalloc_set_seg_fd(int list_idx, int seg_idx, int fd) { struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config; + /* single file segments mode doesn't support individual segment fd's */ + if (internal_config.single_file_segments) + return -ENOTSUP; + /* if list is not allocated, allocate it */ if (fd_list[list_idx].len == 0) { int len = mcfg->memsegs[list_idx].memseg_arr.len; @@ -1541,6 +1545,18 @@ eal_memalloc_set_seg_fd(int list_idx, int seg_idx, int fd) return 0; } +int +eal_memalloc_set_seg_list_fd(int list_idx, int fd) +{ + /* non-single file segment mode doesn't support segment list fd's */ + if (!internal_config.single_file_segments) + return -ENOTSUP; + + fd_list[list_idx].memseg_list_fd = fd; + + return 0; +} + int eal_memalloc_get_seg_fd(int list_idx, int seg_idx) { From patchwork Tue Nov 13 17:54:48 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Anatoly Burakov X-Patchwork-Id: 48046 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 187312B83; Tue, 13 Nov 2018 18:54:53 +0100 (CET) Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by dpdk.org (Postfix) with ESMTP id 53E6B1D7 for ; Tue, 13 Nov 2018 18:54:51 +0100 (CET) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by fmsmga106.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 13 Nov 2018 09:54:50 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.56,228,1539673200"; d="scan'208";a="104058629" Received: from irvmail001.ir.intel.com ([163.33.26.43]) by fmsmga002.fm.intel.com with ESMTP; 13 Nov 2018 09:54:49 -0800 Received: from sivswdev01.ir.intel.com (sivswdev01.ir.intel.com [10.237.217.45]) by irvmail001.ir.intel.com (8.14.3/8.13.6/MailSET/Hub) with ESMTP id wADHsmrh032030; Tue, 13 Nov 2018 17:54:48 GMT Received: from sivswdev01.ir.intel.com (localhost [127.0.0.1]) by sivswdev01.ir.intel.com with ESMTP id wADHsm42001131; Tue, 13 Nov 2018 17:54:48 GMT Received: (from aburakov@localhost) by sivswdev01.ir.intel.com with LOCAL id wADHsmhb001127; Tue, 13 Nov 2018 17:54:48 GMT From: Anatoly Burakov To: dev@dpdk.org Cc: przemyslawx.lal@intel.com, kuralamudhan.ramakrishnan@intel.com, ivan.coughlan@intel.com, tiwei.bie@intel.com, ray.kinsella@intel.com Date: Tue, 13 Nov 2018 17:54:48 +0000 Message-Id: <6b5e1de44f42e18a6a9326a06eb800c6bb8ddfdf.1542130721.git.anatoly.burakov@intel.com> X-Mailer: git-send-email 1.7.0.7 In-Reply-To: References: In-Reply-To: References: Subject: [dpdk-dev] [PATCH 19.02 2/2] mem: use memfd for no-huge mode X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" When running in no-huge mode, we anonymously allocate our memory. While this works for regular NICs and vdev's, it's not suitable for memory sharing scenarios such as virtio with vhost_user backend. To fix this, allocate no-huge memory using memfd, and register it with memalloc just like any other memseg fd. This will enable using rte_memseg_get_fd() API with --no-huge EAL flag. Signed-off-by: Anatoly Burakov --- lib/librte_eal/linuxapp/eal/eal_memory.c | 46 ++++++++++++++++++++++-- 1 file changed, 44 insertions(+), 2 deletions(-) diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c index 48b23ce19..8feac2c56 100644 --- a/lib/librte_eal/linuxapp/eal/eal_memory.c +++ b/lib/librte_eal/linuxapp/eal/eal_memory.c @@ -25,6 +25,7 @@ #include #include #include +#include #ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES #include #include @@ -1345,12 +1346,15 @@ eal_legacy_hugepage_init(void) /* hugetlbfs can be disabled */ if (internal_config.no_hugetlbfs) { struct rte_memseg_list *msl; + int n_segs, cur_seg, fd, memfd, flags; uint64_t page_sz; - int n_segs, cur_seg; /* nohuge mode is legacy mode */ internal_config.legacy_mem = 1; + /* nohuge mode is single-file segments mode */ + internal_config.single_file_segments = 1; + /* create a memseg list */ msl = &mcfg->memsegs[0]; @@ -1363,8 +1367,36 @@ eal_legacy_hugepage_init(void) return -1; } + /* set up parameters for anonymous mmap */ + fd = -1; + flags = MAP_PRIVATE | MAP_ANONYMOUS; + + /* create a memfd and store it in the segment fd table */ + memfd = memfd_create("nohuge", 0); + if (memfd < 0) { + RTE_LOG(ERR, EAL, "Cannot create memfd: %s\n", + strerror(errno)); + RTE_LOG(ERR, EAL, "Falling back to anonymous map\n"); + } else { + /* we got an fd - now resize it */ + if (ftruncate(memfd, internal_config.memory) < 0) { + RTE_LOG(ERR, EAL, "Cannot resize memfd: %s\n", + strerror(errno)); + RTE_LOG(ERR, EAL, "Falling back to anonymous map\n"); + close(memfd); + } else { + /* creating memfd-backed file was successful. + * we want changes to memfd to be visible to + * other processes (such as vhost backend), so + * map it as shared memory. + */ + RTE_LOG(DEBUG, EAL, "Using memfd for anonymous memory\n"); + fd = memfd; + flags = MAP_SHARED; + } + } addr = mmap(NULL, internal_config.memory, PROT_READ | PROT_WRITE, - MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); + flags, fd, 0); if (addr == MAP_FAILED) { RTE_LOG(ERR, EAL, "%s: mmap() failed: %s\n", __func__, strerror(errno)); @@ -1375,6 +1407,16 @@ eal_legacy_hugepage_init(void) msl->socket_id = 0; msl->len = internal_config.memory; + /* we're in single-file segments mode, so only the segment list + * fd needs to be set up. + */ + if (fd != -1) { + if (eal_memalloc_set_seg_list_fd(0, fd) < 0) { + RTE_LOG(ERR, EAL, "Cannot set up segment list fd\n"); + /* not a serious error, proceed */ + } + } + /* populate memsegs. each memseg is one page long */ for (cur_seg = 0; cur_seg < n_segs; cur_seg++) { arr = &msl->memseg_arr;