From patchwork Tue Nov 9 10:32:53 2021
X-Patchwork-Submitter: Dmitry Kozlyuk
X-Patchwork-Id: 104051
X-Patchwork-Delegate: rasland@nvidia.com
From: Dmitry Kozlyuk
To:
CC: Raslan Darawsheh, Matan Azrad, Viacheslav Ovsiienko
Date: Tue, 9 Nov 2021 12:32:53 +0200
Message-ID: <20211109103253.1938561-1-dkozlyuk@nvidia.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20211102065917.889267-1-dkozlyuk@nvidia.com>
References: <20211102065917.889267-1-dkozlyuk@nvidia.com>
Subject: [dpdk-dev] [PATCH v2] common/mlx5: fix external memory pool registration
List-Id: DPDK patches and discussions
Errors-To: dev-bounces@dpdk.org

Registration of packet mempools with RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF
was performed incorrectly: after population, chunks of such mempools
contain only memory for rte_mbuf structures, while the pointers to the
actual external memory are not yet filled. MR LKeys could not be
obtained for external memory addresses of such mempools. The Rx
datapath assumes all used mempools are registered and does not fall
back to dynamic MR creation in that case, so no packets could be
received.

Skip registration of extmem pools on population, because it is useless.
If used for Rx, they are registered at port start.
During registration, recognize such pools, inspect their mbufs, and
recover the pages they reside in. While MRs for these pages may have
already been created by rte_dev_dma_map(), they are not reused to avoid
synchronization on the Rx datapath in case these MRs are changed in the
database.

Fixes: 690b2a88c2f7 ("common/mlx5: add mempool registration facilities")

Signed-off-by: Dmitry Kozlyuk
Reviewed-by: Matan Azrad
Reviewed-by: Viacheslav Ovsiienko
---
v2: rebase on next-net-mlx

 drivers/common/mlx5/mlx5_common.c    |   4 +
 drivers/common/mlx5/mlx5_common_mr.c | 113 +++++++++++++++++++++++++--
 drivers/net/mlx5/mlx5_trigger.c      |   8 +-
 3 files changed, 117 insertions(+), 8 deletions(-)

diff --git a/drivers/common/mlx5/mlx5_common.c b/drivers/common/mlx5/mlx5_common.c
index 9c75e5f173..b9ed5ee676 100644
--- a/drivers/common/mlx5/mlx5_common.c
+++ b/drivers/common/mlx5/mlx5_common.c
@@ -390,9 +390,13 @@ mlx5_dev_mempool_event_cb(enum rte_mempool_event event, struct rte_mempool *mp,
 		       void *arg)
 {
 	struct mlx5_common_device *cdev = arg;
+	bool extmem = rte_pktmbuf_priv_flags(mp) &
+		      RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF;
 
 	switch (event) {
 	case RTE_MEMPOOL_EVENT_READY:
+		if (extmem)
+			break;
 		if (mlx5_dev_mempool_register(cdev, mp) < 0)
 			DRV_LOG(ERR,
 				"Failed to register new mempool %s for PD %p: %s",
diff --git a/drivers/common/mlx5/mlx5_common_mr.c b/drivers/common/mlx5/mlx5_common_mr.c
index 93c8dc9042..49feea4474 100644
--- a/drivers/common/mlx5/mlx5_common_mr.c
+++ b/drivers/common/mlx5/mlx5_common_mr.c
@@ -1292,6 +1292,105 @@ mlx5_range_from_mempool_chunk(struct rte_mempool *mp, void *opaque,
 	range->end = RTE_ALIGN_CEIL(range->start + memhdr->len, page_size);
 }
 
+/**
+ * Collect page-aligned memory ranges of the mempool.
+ */
+static int
+mlx5_mempool_get_chunks(struct rte_mempool *mp, struct mlx5_range **out,
+			unsigned int *out_n)
+{
+	struct mlx5_range *chunks;
+	unsigned int n;
+
+	n = mp->nb_mem_chunks;
+	chunks = calloc(sizeof(chunks[0]), n);
+	if (chunks == NULL)
+		return -1;
+	rte_mempool_mem_iter(mp, mlx5_range_from_mempool_chunk, chunks);
+	*out = chunks;
+	*out_n = n;
+	return 0;
+}
+
+struct mlx5_mempool_get_extmem_data {
+	struct mlx5_range *heap;
+	unsigned int heap_size;
+	int ret;
+};
+
+static void
+mlx5_mempool_get_extmem_cb(struct rte_mempool *mp, void *opaque,
+			   void *obj, unsigned int obj_idx)
+{
+	struct mlx5_mempool_get_extmem_data *data = opaque;
+	struct rte_mbuf *mbuf = obj;
+	uintptr_t addr = (uintptr_t)mbuf->buf_addr;
+	struct mlx5_range *seg, *heap;
+	struct rte_memseg_list *msl;
+	size_t page_size;
+	uintptr_t page_start;
+	unsigned int pos = 0, len = data->heap_size, delta;
+
+	RTE_SET_USED(mp);
+	RTE_SET_USED(obj_idx);
+	if (data->ret < 0)
+		return;
+	/* Binary search for an already visited page. */
+	while (len > 1) {
+		delta = len / 2;
+		if (addr < data->heap[pos + delta].start) {
+			len = delta;
+		} else {
+			pos += delta;
+			len -= delta;
+		}
+	}
+	if (data->heap != NULL) {
+		seg = &data->heap[pos];
+		if (seg->start <= addr && addr < seg->end)
+			return;
+	}
+	/* Determine the page boundaries and remember them. */
+	heap = realloc(data->heap, sizeof(heap[0]) * (data->heap_size + 1));
+	if (heap == NULL) {
+		free(data->heap);
+		data->heap = NULL;
+		data->ret = -1;
+		return;
+	}
+	data->heap = heap;
+	data->heap_size++;
+	seg = &heap[data->heap_size - 1];
+	msl = rte_mem_virt2memseg_list((void *)addr);
+	page_size = msl != NULL ? msl->page_sz : rte_mem_page_size();
+	page_start = RTE_PTR_ALIGN_FLOOR(addr, page_size);
+	seg->start = page_start;
+	seg->end = page_start + page_size;
+	/* Maintain the heap order. */
+	qsort(data->heap, data->heap_size, sizeof(heap[0]),
+	      mlx5_range_compare_start);
+}
+
+/**
+ * Recover pages of external memory as close as possible
+ * for a mempool with RTE_PKTMBUF_POOL_PINNED_EXT_BUF.
+ * Pages are stored in a heap for efficient search, for mbufs are many.
+ */
+static int
+mlx5_mempool_get_extmem(struct rte_mempool *mp, struct mlx5_range **out,
+			unsigned int *out_n)
+{
+	struct mlx5_mempool_get_extmem_data data;
+
+	memset(&data, 0, sizeof(data));
+	rte_mempool_obj_iter(mp, mlx5_mempool_get_extmem_cb, &data);
+	if (data.ret < 0)
+		return -1;
+	*out = data.heap;
+	*out_n = data.heap_size;
+	return 0;
+}
+
 /**
  * Get VA-contiguous ranges of the mempool memory.
  * Each range start and end is aligned to the system page size.
@@ -1311,13 +1410,15 @@ mlx5_get_mempool_ranges(struct rte_mempool *mp, struct mlx5_range **out,
 		       unsigned int *out_n)
 {
 	struct mlx5_range *chunks;
-	unsigned int chunks_n = mp->nb_mem_chunks, contig_n, i;
+	unsigned int chunks_n, contig_n, i;
+	int ret;
 
-	/* Collect page-aligned memory ranges of the mempool. */
-	chunks = calloc(sizeof(chunks[0]), chunks_n);
-	if (chunks == NULL)
-		return -1;
-	rte_mempool_mem_iter(mp, mlx5_range_from_mempool_chunk, chunks);
+	/* Collect the pool underlying memory. */
+	ret = (rte_pktmbuf_priv_flags(mp) & RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF) ?
+	      mlx5_mempool_get_extmem(mp, &chunks, &chunks_n) :
+	      mlx5_mempool_get_chunks(mp, &chunks, &chunks_n);
+	if (ret < 0)
+		return ret;
 	/* Merge adjacent chunks and place them at the beginning. */
 	qsort(chunks, chunks_n, sizeof(chunks[0]), mlx5_range_compare_start);
 	contig_n = 1;
diff --git a/drivers/net/mlx5/mlx5_trigger.c b/drivers/net/mlx5/mlx5_trigger.c
index a8dd07bdce..1952d68444 100644
--- a/drivers/net/mlx5/mlx5_trigger.c
+++ b/drivers/net/mlx5/mlx5_trigger.c
@@ -146,12 +146,16 @@ mlx5_rxq_mempool_register(struct mlx5_rxq_ctrl *rxq_ctrl)
 		return 0;
 	}
 	for (s = 0; s < rxq_ctrl->rxq.rxseg_n; s++) {
+		uint32_t flags;
+
 		mp = rxq_ctrl->rxq.rxseg[s].mp;
+		flags = rte_pktmbuf_priv_flags(mp);
 		ret = mlx5_mr_mempool_register(rxq_ctrl->sh->cdev, mp);
 		if (ret < 0 && rte_errno != EEXIST)
 			return ret;
-		rte_mempool_mem_iter(mp, mlx5_rxq_mempool_register_cb,
-				     &rxq_ctrl->rxq);
+		if ((flags & RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF) == 0)
+			rte_mempool_mem_iter(mp, mlx5_rxq_mempool_register_cb,
+					     &rxq_ctrl->rxq);
 	}
 	return 0;
 }