From patchwork Fri Nov 19 14:31:56 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Kozlyuk X-Patchwork-Id: 104537 X-Patchwork-Delegate: rasland@nvidia.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 2779FA0C45; Fri, 19 Nov 2021 15:32:18 +0100 (CET) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id D66F940143; Fri, 19 Nov 2021 15:32:16 +0100 (CET) Received: from NAM02-BN1-obe.outbound.protection.outlook.com (mail-bn1nam07on2077.outbound.protection.outlook.com [40.107.212.77]) by mails.dpdk.org (Postfix) with ESMTP id 2DA8B40140 for ; Fri, 19 Nov 2021 15:32:15 +0100 (CET) ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=aSrlNBaHPBcPoRVp0cWWQfgPuegHmSyWjtwzrYz9bpik5zZbe7IhuXfq6HdL4bxgU+bDz6XIDTOVSG60pMS387REoC8edKOtBY2bejfxj+V4zlBb5z7/BXkd6BN+lArUQrNYQLtCoqDs9fCPPQj+Qn7N979n3USN7mQ7zky6mDiw8r3+oO2GqHrZYp9jkKffB5wwj5kYLCAaIs3c5x7NfKDaghoYh7lT50GY+yp1SAFKYZrFlUYb2qVgj8+uGBaQYl/Q3Sfi5g5rugucypTtWnYHudENOTEFkRlciRqmLM9/9JsQUWMnHJmfw1ocTrbPifLPBROlYZMaHjnPLKgqgA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=tmXGst/NBrWEnLylHH6t11DiGYifbxYfVnXjVst4gBo=; b=mBmIBXjBdBjWe54GCdjWKOxnZcAiHjCEOexK4bpSc9BjeQHWT+g0pnlumqxuFJ+/ic55JuolSvgnN8jUccVt0itLICrseP1wfC+CKRbX/XYpoi8+JyegJ/UWjshb6BV7GG1onMYbM7Wumx+irlXTSWt2xabOZEe62HxbrO3nO3lxEZf4UCO/ZkfKAEmoUMYBC8b2uRB+wdHgeoAqThw6w2T7nGSFoNmUI4l2KNjvmOIzfl7p+lHviyCe9nx5hrEyS2AVg89czGdGvpPOsWGmfKB7uVQaO3s6LrZu+zcnInvBJVgURlFs8FC+3NZo6zEKjicyuWdHITzCXV3VOc17tQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 216.228.112.34) smtp.rcpttodomain=dpdk.org smtp.mailfrom=nvidia.com; dmarc=pass (p=quarantine sp=quarantine pct=100) action=none header.from=nvidia.com; dkim=none (message not signed); arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=tmXGst/NBrWEnLylHH6t11DiGYifbxYfVnXjVst4gBo=; b=QObXQT7N66abf2gPQoKUJBCFtz9+U4T1ZhJ7Got8mBfOaEOIcVoldGtNI/oPHsRvfQqsJxF+BTLpDc8qvEoo5IQ1PbX9z6BfVh8YAJsUfrj6cNhzK7A3IznV67gK0er2rI6lDqtcn0wLqcTS+tqrgT4+nm/JbhOzshfrHFdOrETPDmhwKN1iHnD3uvgAf8GwZh4wfrG01zqX0lL8kMHpYjFNZXfHtgggqcJ0oq6h6NS7bAEM02YMWZEv3edtyxhVElD6IUILpxypGJRaVPy68YEFkkgM6hGDOZKZ5TEdsghVa39qcZBWFvaMCeTvzNOE5sRB23R7oZowusItXntqNA== Received: from DM5PR21CA0023.namprd21.prod.outlook.com (2603:10b6:3:ac::33) by DM6PR12MB2764.namprd12.prod.outlook.com (2603:10b6:5:4a::30) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4690.27; Fri, 19 Nov 2021 14:32:12 +0000 Received: from DM6NAM11FT006.eop-nam11.prod.protection.outlook.com (2603:10b6:3:ac:cafe::6f) by DM5PR21CA0023.outlook.office365.com (2603:10b6:3:ac::33) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4734.13 via Frontend Transport; Fri, 19 Nov 2021 14:32:12 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 216.228.112.34) smtp.mailfrom=nvidia.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=nvidia.com; Received-SPF: Pass (protection.outlook.com: domain of nvidia.com designates 216.228.112.34 as permitted sender) receiver=protection.outlook.com; client-ip=216.228.112.34; helo=mail.nvidia.com; Received: from mail.nvidia.com (216.228.112.34) by DM6NAM11FT006.mail.protection.outlook.com (10.13.173.104) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384) id 15.20.4713.20 via Frontend Transport; Fri, 19 Nov 2021 14:32:11 +0000 Received: from nvidia.com (172.20.187.6) by HQMAIL107.nvidia.com (172.20.187.13) with Microsoft SMTP Server (TLS) id 15.0.1497.18; Fri, 19 Nov 2021 14:32:09 +0000 From: Dmitry Kozlyuk To: CC: Raslan Darawsheh , Michael Baum , Matan Azrad , Viacheslav Ovsiienko Subject: [PATCH v3] common/mlx5: fix mempool registration Date: Fri, 19 Nov 2021 16:31:56 +0200 Message-ID: <20211119143156.3040273-1-dkozlyuk@nvidia.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20211118152501.2936413-1-dkozlyuk@nvidia.com> References: <20211118152501.2936413-1-dkozlyuk@nvidia.com> MIME-Version: 1.0 X-Originating-IP: [172.20.187.6] X-ClientProxiedBy: HQMAIL105.nvidia.com (172.20.187.12) To HQMAIL107.nvidia.com (172.20.187.13) X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id: 2a9e9441-d316-402f-fca0-08d9ab695fd0 X-MS-TrafficTypeDiagnostic: DM6PR12MB2764: X-Microsoft-Antispam-PRVS: X-MS-Oob-TLC-OOBClassifiers: OLM:6430; X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: LaUFEz9tzsER1rjMGqTxJx3Lm74g9ZyobOJa5mg1EWmG/aIcM9usJLGaYsvPMGNxDGnvMKnZdZ3zBw6NBpqN7X+vLTkXt1TRI3e7OhSGoHeq/JJzlDWRUtB0LxzLgN0YZvihQ6Ew35qkYc9vCE60Ar2ZsZSCs6wE9cEjLfeD2phHLdeewVGwdEUJ3L2qnHuzaT0v8o9PuND5WK7IL6s5PtKwrkV+hWwy8+DeNQvcntvdgAfE+921I3yoEMA0JoKeuWUuOyNwwO/YSZYsOmiBX1VfaM3E3Vu476Zka9lzWNxYr61ExYCwD1eq/zF8h59W3h23+Q+3bSNerasaki8GecV/WyBwLZz1cHq1U3URtAse75Ip17pcF/06U3J8X8BNinuR+20PboBNrB/HMCOJiBCH99fVXLxxBg5mnx5uCY3DBXCI2Ogal7GhldVA3IWFmVaiV9JnP6dbs4LlpfVdh1fERVHrhBX6pB54mBpbcZLfJxfBnM4C7gMhkAd1h1eks8sh7XOcPnWdw10gA1rr+Q2ul1xQL4j2xqmdiTY7X2HnxJlKn1CAMYCVaxMjxR0YKyE1mAKihYG8xr1NaegKlYh++AOpfU7lbKy1ufGxFRLq4CMSdWev2TOhwgIkbFKfRLd97EYKl3PuFYab9XcLfga5fq9aONs9aHC1iF4zJSET+4aDK8JUeRpVVOpMggSk8g8IPOv7JCTHHn3AIuCsjg== X-Forefront-Antispam-Report: CIP:216.228.112.34; CTRY:US; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:mail.nvidia.com; PTR:schybrid03.nvidia.com; CAT:NONE; SFS:(4636009)(46966006)(36840700001)(36756003)(83380400001)(2906002)(356005)(2616005)(7636003)(316002)(5660300002)(54906003)(36906005)(8936002)(47076005)(4326008)(7696005)(16526019)(6666004)(26005)(30864003)(8676002)(55016002)(70206006)(1076003)(508600001)(186003)(86362001)(107886003)(336012)(6916009)(82310400003)(426003)(36860700001)(6286002)(70586007); DIR:OUT; SFP:1101; X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 19 Nov 2021 14:32:11.5434 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 2a9e9441-d316-402f-fca0-08d9ab695fd0 X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=43083d15-7273-40c1-b7db-39efd9ccc17a; Ip=[216.228.112.34]; Helo=[mail.nvidia.com] X-MS-Exchange-CrossTenant-AuthSource: DM6NAM11FT006.eop-nam11.prod.protection.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: DM6PR12MB2764 X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Mempool registration was not correctly processing mempools with RTE_PKTMBUF_F_PINEND_EXT_BUF flag set ("pinned mempools" for short), because it is not known at registration time whether the mempool is a pktmbuf one, and its elements may not yet be initialized to analyze them. Attempts had been made to recognize such pools, but there was no robust solution, only the owner of a mempool (the application or a device) knows its type. This patch extends common/mlx5 registration code to accept a hint that the mempool is a pinned one and uses this capability from net/mlx5 driver. 1. Remove all code assuming pktmbuf pool type or trying to recognize the type of a pool. 2. Register pinned mempools used for Rx and their external memory on port start. Populate the MR cache with all their MRs. 3. Change Tx slow path logic as follows: 3.1. Search the mempool database for a memory region (MR) by the mbuf pool and its buffer address. 3.2. If not MR for the address is found for the mempool, and the mempool contains only pinned external buffers, perform the mempool registration of the mempool and its external pinned memory. 3.3. Fall back to using page-based MRs in other cases (for example, a buffer with externally attached memory, but not from a pinned mempool). Fixes: 690b2a88c2f7 ("common/mlx5: add mempool registration facilities") Fixes: fec28ca0e3a9 ("net/mlx5: support mempool registration") Signed-off-by: Dmitry Kozlyuk Reviewed-by: Matan Azrad Reviewed-by: Viacheslav Ovsiienko --- v3: fix build with GCC on RHEL7 v2: 1) rebase on ToT 2) fix MR cache population drivers/common/mlx5/mlx5_common.c | 11 +- drivers/common/mlx5/mlx5_common_mp.c | 4 +- drivers/common/mlx5/mlx5_common_mp.h | 10 +- drivers/common/mlx5/mlx5_common_mr.c | 180 +++++++++++++++++++-------- drivers/common/mlx5/mlx5_common_mr.h | 15 +-- drivers/common/mlx5/version.map | 1 + drivers/net/mlx5/linux/mlx5_mp_os.c | 3 +- drivers/net/mlx5/mlx5_rxq.c | 2 +- drivers/net/mlx5/mlx5_trigger.c | 40 ++---- 9 files changed, 163 insertions(+), 103 deletions(-) diff --git a/drivers/common/mlx5/mlx5_common.c b/drivers/common/mlx5/mlx5_common.c index 66c2c08b7d..f1650f94c6 100644 --- a/drivers/common/mlx5/mlx5_common.c +++ b/drivers/common/mlx5/mlx5_common.c @@ -317,9 +317,9 @@ mlx5_dev_to_pci_str(const struct rte_device *dev, char *addr, size_t size) */ static int mlx5_dev_mempool_register(struct mlx5_common_device *cdev, - struct rte_mempool *mp) + struct rte_mempool *mp, bool is_extmem) { - return mlx5_mr_mempool_register(cdev, mp); + return mlx5_mr_mempool_register(cdev, mp, is_extmem); } /** @@ -353,7 +353,7 @@ mlx5_dev_mempool_register_cb(struct rte_mempool *mp, void *arg) struct mlx5_common_device *cdev = arg; int ret; - ret = mlx5_dev_mempool_register(cdev, mp); + ret = mlx5_dev_mempool_register(cdev, mp, false); if (ret < 0 && rte_errno != EEXIST) DRV_LOG(ERR, "Failed to register existing mempool %s for PD %p: %s", @@ -390,13 +390,10 @@ mlx5_dev_mempool_event_cb(enum rte_mempool_event event, struct rte_mempool *mp, void *arg) { struct mlx5_common_device *cdev = arg; - bool extmem = mlx5_mempool_is_extmem(mp); switch (event) { case RTE_MEMPOOL_EVENT_READY: - if (extmem) - break; - if (mlx5_dev_mempool_register(cdev, mp) < 0) + if (mlx5_dev_mempool_register(cdev, mp, false) < 0) DRV_LOG(ERR, "Failed to register new mempool %s for PD %p: %s", mp->name, cdev->pd, rte_strerror(rte_errno)); diff --git a/drivers/common/mlx5/mlx5_common_mp.c b/drivers/common/mlx5/mlx5_common_mp.c index 536d61f66c..a7a671b7c5 100644 --- a/drivers/common/mlx5/mlx5_common_mp.c +++ b/drivers/common/mlx5/mlx5_common_mp.c @@ -65,7 +65,8 @@ mlx5_mp_req_mr_create(struct mlx5_common_device *cdev, uintptr_t addr) */ int mlx5_mp_req_mempool_reg(struct mlx5_common_device *cdev, - struct rte_mempool *mempool, bool reg) + struct rte_mempool *mempool, bool reg, + bool is_extmem) { struct rte_mp_msg mp_req; struct rte_mp_msg *mp_res; @@ -82,6 +83,7 @@ mlx5_mp_req_mempool_reg(struct mlx5_common_device *cdev, MLX5_MP_REQ_MEMPOOL_UNREGISTER; mp_init_port_agnostic_msg(&mp_req, type); arg->mempool = mempool; + arg->is_extmem = is_extmem; arg->cdev = cdev; ret = rte_mp_request_sync(&mp_req, &mp_rep, &ts); if (ret) { diff --git a/drivers/common/mlx5/mlx5_common_mp.h b/drivers/common/mlx5/mlx5_common_mp.h index b1e3a41a20..4599ba8f92 100644 --- a/drivers/common/mlx5/mlx5_common_mp.h +++ b/drivers/common/mlx5/mlx5_common_mp.h @@ -37,9 +37,12 @@ struct mlx5_mp_arg_queue_id { struct mlx5_mp_arg_mr_manage { struct mlx5_common_device *cdev; + RTE_STD_C11 union { - struct rte_mempool *mempool; - /* MLX5_MP_REQ_MEMPOOL_(UN)REGISTER */ + struct { + struct rte_mempool *mempool; + bool is_extmem; + }; /* MLX5_MP_REQ_MEMPOOL_(UN)REGISTER */ uintptr_t addr; /* MLX5_MP_REQ_CREATE_MR */ }; }; @@ -134,7 +137,8 @@ __rte_internal int mlx5_mp_req_mr_create(struct mlx5_common_device *cdev, uintptr_t addr); __rte_internal int mlx5_mp_req_mempool_reg(struct mlx5_common_device *cdev, - struct rte_mempool *mempool, bool reg); + struct rte_mempool *mempool, bool reg, + bool is_extmem); __rte_internal int mlx5_mp_req_queue_state_modify(struct mlx5_mp_id *mp_id, struct mlx5_mp_arg_queue_state_modify *sm); diff --git a/drivers/common/mlx5/mlx5_common_mr.c b/drivers/common/mlx5/mlx5_common_mr.c index 70365e1466..47121d8ff7 100644 --- a/drivers/common/mlx5/mlx5_common_mr.c +++ b/drivers/common/mlx5/mlx5_common_mr.c @@ -47,6 +47,8 @@ struct mlx5_mempool_reg { struct mlx5_mempool_mr *mrs; /** Number of memory regions. */ unsigned int mrs_n; + /** Whether the MR were created for external pinned memory. */ + bool is_extmem; }; void @@ -1302,16 +1304,14 @@ static int mlx5_mempool_get_chunks(struct rte_mempool *mp, struct mlx5_range **out, unsigned int *out_n) { - struct mlx5_range *chunks; unsigned int n; DRV_LOG(DEBUG, "Collecting chunks of regular mempool %s", mp->name); n = mp->nb_mem_chunks; - chunks = calloc(sizeof(chunks[0]), n); - if (chunks == NULL) + *out = calloc(sizeof(**out), n); + if (*out == NULL) return -1; - rte_mempool_mem_iter(mp, mlx5_range_from_mempool_chunk, chunks); - *out = chunks; + rte_mempool_mem_iter(mp, mlx5_range_from_mempool_chunk, *out); *out_n = n; return 0; } @@ -1390,11 +1390,9 @@ mlx5_mempool_get_extmem(struct rte_mempool *mp, struct mlx5_range **out, mp->name); memset(&data, 0, sizeof(data)); rte_mempool_obj_iter(mp, mlx5_mempool_get_extmem_cb, &data); - if (data.ret < 0) - return -1; *out = data.heap; *out_n = data.heap_size; - return 0; + return data.ret; } /** @@ -1403,26 +1401,27 @@ mlx5_mempool_get_extmem(struct rte_mempool *mp, struct mlx5_range **out, * * @param[in] mp * Analyzed mempool. + * @param[in] is_extmem + * Whether the pool is contains only external pinned buffers. * @param[out] out * Receives the ranges, caller must release it with free(). - * @param[out] ount_n + * @param[out] out_n * Receives the number of @p out elements. * * @return * 0 on success, (-1) on failure. */ static int -mlx5_get_mempool_ranges(struct rte_mempool *mp, struct mlx5_range **out, - unsigned int *out_n) +mlx5_get_mempool_ranges(struct rte_mempool *mp, bool is_extmem, + struct mlx5_range **out, unsigned int *out_n) { struct mlx5_range *chunks; unsigned int chunks_n, contig_n, i; int ret; /* Collect the pool underlying memory. */ - ret = mlx5_mempool_is_extmem(mp) ? - mlx5_mempool_get_extmem(mp, &chunks, &chunks_n) : - mlx5_mempool_get_chunks(mp, &chunks, &chunks_n); + ret = is_extmem ? mlx5_mempool_get_extmem(mp, &chunks, &chunks_n) : + mlx5_mempool_get_chunks(mp, &chunks, &chunks_n); if (ret < 0) return ret; /* Merge adjacent chunks and place them at the beginning. */ @@ -1446,6 +1445,8 @@ mlx5_get_mempool_ranges(struct rte_mempool *mp, struct mlx5_range **out, * * @param[in] mp * Mempool to analyze. + * @param[in] is_extmem + * Whether the pool is contains only external pinned buffers. * @param[out] out * Receives memory ranges to register, aligned to the system page size. * The caller must release them with free(). @@ -1458,14 +1459,15 @@ mlx5_get_mempool_ranges(struct rte_mempool *mp, struct mlx5_range **out, * 0 on success, (-1) on failure. */ static int -mlx5_mempool_reg_analyze(struct rte_mempool *mp, struct mlx5_range **out, - unsigned int *out_n, bool *share_hugepage) +mlx5_mempool_reg_analyze(struct rte_mempool *mp, bool is_extmem, + struct mlx5_range **out, unsigned int *out_n, + bool *share_hugepage) { struct mlx5_range *ranges = NULL; unsigned int i, ranges_n = 0; struct rte_memseg_list *msl; - if (mlx5_get_mempool_ranges(mp, &ranges, &ranges_n) < 0) { + if (mlx5_get_mempool_ranges(mp, is_extmem, &ranges, &ranges_n) < 0) { DRV_LOG(ERR, "Cannot get address ranges for mempool %s", mp->name); return -1; @@ -1507,7 +1509,8 @@ mlx5_mempool_reg_analyze(struct rte_mempool *mp, struct mlx5_range **out, /** Create a registration object for the mempool. */ static struct mlx5_mempool_reg * -mlx5_mempool_reg_create(struct rte_mempool *mp, unsigned int mrs_n) +mlx5_mempool_reg_create(struct rte_mempool *mp, unsigned int mrs_n, + bool is_extmem) { struct mlx5_mempool_reg *mpr = NULL; @@ -1522,6 +1525,7 @@ mlx5_mempool_reg_create(struct rte_mempool *mp, unsigned int mrs_n) mpr->mp = mp; mpr->mrs = (struct mlx5_mempool_mr *)(mpr + 1); mpr->mrs_n = mrs_n; + mpr->is_extmem = is_extmem; return mpr; } @@ -1586,31 +1590,32 @@ mlx5_mempool_reg_detach(struct mlx5_mempool_reg *mpr) static int mlx5_mr_mempool_register_primary(struct mlx5_mr_share_cache *share_cache, - void *pd, struct rte_mempool *mp) + void *pd, struct rte_mempool *mp, + bool is_extmem) { struct mlx5_range *ranges = NULL; - struct mlx5_mempool_reg *mpr, *new_mpr; + struct mlx5_mempool_reg *mpr, *old_mpr, *new_mpr; unsigned int i, ranges_n; - bool share_hugepage; + bool share_hugepage, standalone = false; int ret = -1; /* Early check to avoid unnecessary creation of MRs. */ rte_rwlock_read_lock(&share_cache->rwlock); - mpr = mlx5_mempool_reg_lookup(share_cache, mp); + old_mpr = mlx5_mempool_reg_lookup(share_cache, mp); rte_rwlock_read_unlock(&share_cache->rwlock); - if (mpr != NULL) { + if (old_mpr != NULL && (!is_extmem || old_mpr->is_extmem)) { DRV_LOG(DEBUG, "Mempool %s is already registered for PD %p", mp->name, pd); rte_errno = EEXIST; goto exit; } - if (mlx5_mempool_reg_analyze(mp, &ranges, &ranges_n, + if (mlx5_mempool_reg_analyze(mp, is_extmem, &ranges, &ranges_n, &share_hugepage) < 0) { DRV_LOG(ERR, "Cannot get mempool %s memory ranges", mp->name); rte_errno = ENOMEM; goto exit; } - new_mpr = mlx5_mempool_reg_create(mp, ranges_n); + new_mpr = mlx5_mempool_reg_create(mp, ranges_n, is_extmem); if (new_mpr == NULL) { DRV_LOG(ERR, "Cannot create a registration object for mempool %s in PD %p", @@ -1670,6 +1675,12 @@ mlx5_mr_mempool_register_primary(struct mlx5_mr_share_cache *share_cache, /* Concurrent registration is not supposed to happen. */ rte_rwlock_write_lock(&share_cache->rwlock); mpr = mlx5_mempool_reg_lookup(share_cache, mp); + if (mpr == old_mpr && old_mpr != NULL) { + LIST_REMOVE(old_mpr, next); + standalone = mlx5_mempool_reg_detach(mpr); + /* No need to flush the cache: old MRs cannot be in use. */ + mpr = NULL; + } if (mpr == NULL) { mlx5_mempool_reg_attach(new_mpr); LIST_INSERT_HEAD(&share_cache->mempool_reg_list, new_mpr, next); @@ -1682,6 +1693,10 @@ mlx5_mr_mempool_register_primary(struct mlx5_mr_share_cache *share_cache, mlx5_mempool_reg_destroy(share_cache, new_mpr, true); rte_errno = EEXIST; goto exit; + } else if (old_mpr != NULL) { + DRV_LOG(DEBUG, "Mempool %s registration for PD %p updated for external memory", + mp->name, pd); + mlx5_mempool_reg_destroy(share_cache, old_mpr, standalone); } exit: free(ranges); @@ -1690,9 +1705,9 @@ mlx5_mr_mempool_register_primary(struct mlx5_mr_share_cache *share_cache, static int mlx5_mr_mempool_register_secondary(struct mlx5_common_device *cdev, - struct rte_mempool *mp) + struct rte_mempool *mp, bool is_extmem) { - return mlx5_mp_req_mempool_reg(cdev, mp, true); + return mlx5_mp_req_mempool_reg(cdev, mp, true, is_extmem); } /** @@ -1708,16 +1723,17 @@ mlx5_mr_mempool_register_secondary(struct mlx5_common_device *cdev, */ int mlx5_mr_mempool_register(struct mlx5_common_device *cdev, - struct rte_mempool *mp) + struct rte_mempool *mp, bool is_extmem) { if (mp->flags & RTE_MEMPOOL_F_NON_IO) return 0; switch (rte_eal_process_type()) { case RTE_PROC_PRIMARY: return mlx5_mr_mempool_register_primary(&cdev->mr_scache, - cdev->pd, mp); + cdev->pd, mp, + is_extmem); case RTE_PROC_SECONDARY: - return mlx5_mr_mempool_register_secondary(cdev, mp); + return mlx5_mr_mempool_register_secondary(cdev, mp, is_extmem); default: return -1; } @@ -1756,7 +1772,7 @@ static int mlx5_mr_mempool_unregister_secondary(struct mlx5_common_device *cdev, struct rte_mempool *mp) { - return mlx5_mp_req_mempool_reg(cdev, mp, false); + return mlx5_mp_req_mempool_reg(cdev, mp, false, false /* is_extmem */); } /** @@ -1868,6 +1884,65 @@ mlx5_lookup_mempool_regs(struct mlx5_mr_ctrl *mr_ctrl, return lkey; } +/** + * Populate cache with LKeys of all MRs used by the mempool. + * It is intended to be used to register Rx mempools in advance. + * + * @param mr_ctrl + * Per-queue MR control handle. + * @param mp + * Registered memory pool. + * + * @return + * 0 on success, (-1) on failure and rte_errno is set. + */ +int +mlx5_mr_mempool_populate_cache(struct mlx5_mr_ctrl *mr_ctrl, + struct rte_mempool *mp) +{ + struct mlx5_mr_share_cache *share_cache = + container_of(mr_ctrl->dev_gen_ptr, struct mlx5_mr_share_cache, + dev_gen); + struct mlx5_mr_btree *bt = &mr_ctrl->cache_bh; + struct mlx5_mempool_reg *mpr; + unsigned int i; + + /* + * Registration is valid after the lock is released, + * because the function is called after the mempool is registered. + */ + rte_rwlock_read_lock(&share_cache->rwlock); + mpr = mlx5_mempool_reg_lookup(share_cache, mp); + rte_rwlock_read_unlock(&share_cache->rwlock); + if (mpr == NULL) { + DRV_LOG(ERR, "Mempool %s is not registered", mp->name); + rte_errno = ENOENT; + return -1; + } + for (i = 0; i < mpr->mrs_n; i++) { + struct mlx5_mempool_mr *mr = &mpr->mrs[i]; + struct mr_cache_entry entry; + uint32_t lkey; + uint16_t idx; + + lkey = mr_btree_lookup(bt, &idx, (uintptr_t)mr->pmd_mr.addr); + if (lkey != UINT32_MAX) + continue; + if (bt->len == bt->size) + mr_btree_expand(bt, bt->size << 1); + entry.start = (uintptr_t)mr->pmd_mr.addr; + entry.end = entry.start + mr->pmd_mr.len; + entry.lkey = rte_cpu_to_be_32(mr->pmd_mr.lkey); + if (mr_btree_insert(bt, &entry) < 0) { + DRV_LOG(ERR, "Cannot insert cache entry for mempool %s MR %08x", + mp->name, entry.lkey); + rte_errno = EINVAL; + return -1; + } + } + return 0; +} + /** * Bottom-half lookup for the address from the mempool. * @@ -1909,6 +1984,8 @@ mlx5_mr_mempool2mr_bh(struct mlx5_mr_ctrl *mr_ctrl, uint32_t mlx5_mr_mb2mr_bh(struct mlx5_mr_ctrl *mr_ctrl, struct rte_mbuf *mb) { + struct rte_mempool *mp; + struct mlx5_mprq_buf *buf; uint32_t lkey; uintptr_t addr = (uintptr_t)mb->buf_addr; struct mlx5_mr_share_cache *share_cache = @@ -1917,27 +1994,26 @@ mlx5_mr_mb2mr_bh(struct mlx5_mr_ctrl *mr_ctrl, struct rte_mbuf *mb) struct mlx5_common_device *cdev = container_of(share_cache, struct mlx5_common_device, mr_scache); - if (cdev->config.mr_mempool_reg_en) { - struct rte_mempool *mp = NULL; - struct mlx5_mprq_buf *buf; - - if (!RTE_MBUF_HAS_EXTBUF(mb)) { - mp = mlx5_mb2mp(mb); - } else if (mb->shinfo->free_cb == mlx5_mprq_buf_free_cb) { - /* Recover MPRQ mempool. */ - buf = mb->shinfo->fcb_opaque; - mp = buf->mp; - } - if (mp != NULL) { - lkey = mlx5_mr_mempool2mr_bh(mr_ctrl, mp, addr); - /* - * Lookup can only fail on invalid input, e.g. "addr" - * is not from "mp" or "mp" has MEMPOOL_F_NON_IO set. - */ - if (lkey != UINT32_MAX) - return lkey; - } - /* Fallback for generic mechanism in corner cases. */ + /* Recover MPRQ mempool. */ + if (RTE_MBUF_HAS_EXTBUF(mb) && + mb->shinfo->free_cb == mlx5_mprq_buf_free_cb) { + buf = mb->shinfo->fcb_opaque; + mp = buf->mp; + } else { + mp = mlx5_mb2mp(mb); + } + lkey = mlx5_mr_mempool2mr_bh(mr_ctrl, mp, addr); + if (lkey != UINT32_MAX) + return lkey; + /* Register pinned external memory if the mempool is not used for Rx. */ + if (cdev->config.mr_mempool_reg_en && + (rte_pktmbuf_priv_flags(mp) & RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF)) { + if (mlx5_mr_mempool_register(cdev, mp, true) < 0) + return UINT32_MAX; + lkey = mlx5_mr_mempool2mr_bh(mr_ctrl, mp, addr); + MLX5_ASSERT(lkey != UINT32_MAX); + return lkey; } + /* Fallback to generic mechanism in corner cases. */ return mlx5_mr_addr2mr_bh(mr_ctrl, addr); } diff --git a/drivers/common/mlx5/mlx5_common_mr.h b/drivers/common/mlx5/mlx5_common_mr.h index de0cab29cc..cf384b6748 100644 --- a/drivers/common/mlx5/mlx5_common_mr.h +++ b/drivers/common/mlx5/mlx5_common_mr.h @@ -255,20 +255,15 @@ mlx5_os_set_reg_mr_cb(mlx5_reg_mr_t *reg_mr_cb, mlx5_dereg_mr_t *dereg_mr_cb); __rte_internal int mlx5_mr_mempool_register(struct mlx5_common_device *cdev, - struct rte_mempool *mp); + struct rte_mempool *mp, bool is_extmem); __rte_internal int mlx5_mr_mempool_unregister(struct mlx5_common_device *cdev, struct rte_mempool *mp); -/** Check if @p mp has buffers pinned in external memory. */ -static inline bool -mlx5_mempool_is_extmem(struct rte_mempool *mp) -{ - return (mp->private_data_size == - sizeof(struct rte_pktmbuf_pool_private)) && - (mp->elt_size >= sizeof(struct rte_mbuf)) && - (rte_pktmbuf_priv_flags(mp) & RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF); -} +__rte_internal +int +mlx5_mr_mempool_populate_cache(struct mlx5_mr_ctrl *mr_ctrl, + struct rte_mempool *mp); #endif /* RTE_PMD_MLX5_COMMON_MR_H_ */ diff --git a/drivers/common/mlx5/version.map b/drivers/common/mlx5/version.map index 8a62dc2782..34e86004a0 100644 --- a/drivers/common/mlx5/version.map +++ b/drivers/common/mlx5/version.map @@ -145,4 +145,5 @@ INTERNAL { mlx5_mr_mempool_unregister; mlx5_mp_req_mempool_reg; mlx5_mr_mempool2mr_bh; + mlx5_mr_mempool_populate_cache; }; diff --git a/drivers/net/mlx5/linux/mlx5_mp_os.c b/drivers/net/mlx5/linux/mlx5_mp_os.c index edc5203dd6..c448a3e9eb 100644 --- a/drivers/net/mlx5/linux/mlx5_mp_os.c +++ b/drivers/net/mlx5/linux/mlx5_mp_os.c @@ -48,7 +48,8 @@ mlx5_mp_os_handle_port_agnostic(const struct rte_mp_msg *mp_msg, return rte_mp_reply(&mp_res, peer); case MLX5_MP_REQ_MEMPOOL_REGISTER: mp_init_port_agnostic_msg(&mp_res, param->type); - res->result = mlx5_mr_mempool_register(mng->cdev, mng->mempool); + res->result = mlx5_mr_mempool_register(mng->cdev, mng->mempool, + mng->is_extmem); return rte_mp_reply(&mp_res, peer); case MLX5_MP_REQ_MEMPOOL_UNREGISTER: mp_init_port_agnostic_msg(&mp_res, param->type); diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c index 480f4f9f07..e406779faf 100644 --- a/drivers/net/mlx5/mlx5_rxq.c +++ b/drivers/net/mlx5/mlx5_rxq.c @@ -1458,7 +1458,7 @@ mlx5_mprq_alloc_mp(struct rte_eth_dev *dev) rte_errno = ENOMEM; return -rte_errno; } - ret = mlx5_mr_mempool_register(priv->sh->cdev, mp); + ret = mlx5_mr_mempool_register(priv->sh->cdev, mp, false); if (ret < 0 && rte_errno != EEXIST) { ret = rte_errno; DRV_LOG(ERR, "port %u failed to register a mempool for Multi-Packet RQ", diff --git a/drivers/net/mlx5/mlx5_trigger.c b/drivers/net/mlx5/mlx5_trigger.c index e2bfde19c7..68df3f8b4d 100644 --- a/drivers/net/mlx5/mlx5_trigger.c +++ b/drivers/net/mlx5/mlx5_trigger.c @@ -105,21 +105,6 @@ mlx5_txq_start(struct rte_eth_dev *dev) return -rte_errno; } -/** - * Translate the chunk address to MR key in order to put in into the cache. - */ -static void -mlx5_rxq_mempool_register_cb(struct rte_mempool *mp, void *opaque, - struct rte_mempool_memhdr *memhdr, - unsigned int idx) -{ - struct mlx5_rxq_data *rxq = opaque; - - RTE_SET_USED(mp); - RTE_SET_USED(idx); - mlx5_rx_addr2mr(rxq, (uintptr_t)memhdr->addr); -} - /** * Register Rx queue mempools and fill the Rx queue cache. * This function tolerates repeated mempool registration. @@ -139,24 +124,23 @@ mlx5_rxq_mempool_register(struct mlx5_rxq_ctrl *rxq_ctrl) mlx5_mr_flush_local_cache(&rxq_ctrl->rxq.mr_ctrl); /* MPRQ mempool is registered on creation, just fill the cache. */ - if (mlx5_rxq_mprq_enabled(&rxq_ctrl->rxq)) { - rte_mempool_mem_iter(rxq_ctrl->rxq.mprq_mp, - mlx5_rxq_mempool_register_cb, - &rxq_ctrl->rxq); - return 0; - } + if (mlx5_rxq_mprq_enabled(&rxq_ctrl->rxq)) + return mlx5_mr_mempool_populate_cache(&rxq_ctrl->rxq.mr_ctrl, + rxq_ctrl->rxq.mprq_mp); for (s = 0; s < rxq_ctrl->rxq.rxseg_n; s++) { - uint32_t flags; + bool is_extmem; mp = rxq_ctrl->rxq.rxseg[s].mp; - flags = mp != rxq_ctrl->rxq.mprq_mp ? - rte_pktmbuf_priv_flags(mp) : 0; - ret = mlx5_mr_mempool_register(rxq_ctrl->sh->cdev, mp); + is_extmem = (rte_pktmbuf_priv_flags(mp) & + RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF) != 0; + ret = mlx5_mr_mempool_register(rxq_ctrl->sh->cdev, mp, + is_extmem); if (ret < 0 && rte_errno != EEXIST) return ret; - if ((flags & RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF) == 0) - rte_mempool_mem_iter(mp, mlx5_rxq_mempool_register_cb, - &rxq_ctrl->rxq); + ret = mlx5_mr_mempool_populate_cache(&rxq_ctrl->rxq.mr_ctrl, + mp); + if (ret < 0) + return ret; } return 0; }