From patchwork Wed Sep 29 14:52:49 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Kozlyuk X-Patchwork-Id: 100005 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 02BC2A0547; Wed, 29 Sep 2021 16:53:39 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id CDBA641123; Wed, 29 Sep 2021 16:53:20 +0200 (CEST) Received: from AZHDRRW-EX02.NVIDIA.COM (azhdrrw-ex02.nvidia.com [20.64.145.131]) by mails.dpdk.org (Postfix) with ESMTP id 52B32410F8 for ; Wed, 29 Sep 2021 16:53:16 +0200 (CEST) Received: from NAM11-CO1-obe.outbound.protection.outlook.com (104.47.56.171) by mxs.oss.nvidia.com (10.13.234.37) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.858.15; Wed, 29 Sep 2021 07:53:14 -0700 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=O4nybuXwgzJ7jnfY/jGHJcyB88KRFM+hDldA+RXMW6gmUWTVbiTpO3Ss7xC9tfz6t15H8NP/TICq+WFs+Pu/Bo60uPiNV6ACiNn01XrxMDvPJ0Oflt4qD9I1Owng/mX5F5Jb5kAn4h+U633zAoZvHmE2lSqVhJGkat+202d8uBKm7edQmGxLoApxFKnDKQVSg/e+iXitme1TjlRcpu205siwEV0WkFEcz6XWbkRHUh4gbHkDJuXksfIjSSetl3MxJQDgFYXzqjy6ajiAnlQTRFGnblA/hYEeppsvxeCIyrihHF/vSWedo+/Oer/Hu0ao1+q49yT+7XWhg8DO52CgdA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version; bh=45En4/UTddHYiaYH5+wGO4vOR97JPbEpC7XWybNPFIQ=; b=fCRh7GmqjMPJE/SKRtlMpHXW8hx6NceXdYF8oWBlaGtkZHeQDPeV7TUlc/SfJmoTEP7yGyvt/k4+XXrEZU3UOfq3N9zd3KFYcjosXnHFYwSFuhKcCl1BV/faiJOxJw4S59SLV95M4IvLWI4zmT4FDVTrzlmzvp+KLDmD1Aaz1VNEpwOEx4+EEZtD0TIKGD5HfkInuwnOBPOSn8a8p0RBaNY1v8lViCuHfj7pYkezcAM0MXTBnjx34YaiAKLW3RKdiDi7FT4towP4347u9Su+kH8HonocikasGzNxyngIpRIx9+mneywdLW6Whe3zHj6leDYcHYyu0nNd0/F+Kcnxvg== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 216.228.112.32) smtp.rcpttodomain=dpdk.org smtp.mailfrom=nvidia.com; dmarc=pass (p=quarantine sp=none pct=100) action=none header.from=nvidia.com; dkim=none (message not signed); arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=45En4/UTddHYiaYH5+wGO4vOR97JPbEpC7XWybNPFIQ=; b=ZIR+E138aoSmxQIJJNc+XFf210EQihnnoQUVFPJ7olSNmIfs6/HRoZpL+h07Q8OKnim88NRe4B2fBhn12OQHOfRZML1bWk91E1CfcsFX3FQMQPFAeOQBvNNqnYcF26k80rBWy+kWSHWBinKHPzGOhwnfg/RrxFV5UVxp4/V4tq7Mg9bFe/2fXtbv3uDqlswnDD9LkOLcgcJVjcDyINMa8aseqmE6BBMch0Fe94CXho/GJ5kOLlgf5KYDC9Nu3JEul9E6Y61BXnr++x+MtxXpwdvYnSnXQwqPhbxmNbjNbylRThWmIDXZeMi8c75TQDeLtT+17ej+eAdU/J7pzGLHuw== Received: from DM5PR12CA0011.namprd12.prod.outlook.com (2603:10b6:4:1::21) by BN6PR1201MB0195.namprd12.prod.outlook.com (2603:10b6:405:53::23) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4566.15; Wed, 29 Sep 2021 14:53:13 +0000 Received: from DM6NAM11FT060.eop-nam11.prod.protection.outlook.com (2603:10b6:4:1:cafe::8c) by DM5PR12CA0011.outlook.office365.com (2603:10b6:4:1::21) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4566.13 via Frontend Transport; Wed, 29 Sep 2021 14:53:13 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 216.228.112.32) smtp.mailfrom=nvidia.com; dpdk.org; dkim=none (message not signed) header.d=none;dpdk.org; dmarc=pass action=none header.from=nvidia.com; Received-SPF: Pass (protection.outlook.com: domain of nvidia.com designates 216.228.112.32 as permitted sender) receiver=protection.outlook.com; client-ip=216.228.112.32; helo=mail.nvidia.com; Received: from mail.nvidia.com (216.228.112.32) by DM6NAM11FT060.mail.protection.outlook.com (10.13.173.63) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384) id 15.20.4566.14 via Frontend Transport; Wed, 29 Sep 2021 14:53:12 +0000 Received: from DRHQMAIL107.nvidia.com (10.27.9.16) by HQMAIL109.nvidia.com (172.20.187.15) with Microsoft SMTP Server (TLS) id 15.0.1497.18; Wed, 29 Sep 2021 07:53:12 -0700 Received: from nvidia.com (172.20.187.5) by DRHQMAIL107.nvidia.com (10.27.9.16) with Microsoft SMTP Server (TLS) id 15.0.1497.18; Wed, 29 Sep 2021 14:53:10 +0000 From: To: CC: Dmitry Kozlyuk , Matan Azrad , Viacheslav Ovsiienko Date: Wed, 29 Sep 2021 17:52:49 +0300 Message-ID: <20210929145249.2176811-5-dkozlyuk@nvidia.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210929145249.2176811-1-dkozlyuk@nvidia.com> References: <20210818090755.2419483-1-dkozlyuk@nvidia.com> <20210929145249.2176811-1-dkozlyuk@nvidia.com> MIME-Version: 1.0 X-Originating-IP: [172.20.187.5] X-ClientProxiedBy: HQMAIL111.nvidia.com (172.20.187.18) To DRHQMAIL107.nvidia.com (10.27.9.16) X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id: 39eb58c0-9362-48b0-7ec0-08d98358dc9b X-MS-TrafficTypeDiagnostic: BN6PR1201MB0195: X-Microsoft-Antispam-PRVS: X-MS-Oob-TLC-OOBClassifiers: OLM:2201; X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: v3oACQe6xjEf6AU9DsyTqOZkJd02p50QYPXUsL5bdzMqWjrTb2+OLCGzjCEZ9MrVTwyuESPr+j5R62FAEPXqu1cHLbCc4ndm+x7UFv/D2CHDKx/MK1+d/umF41m3NSwGJOt1dlB0t+yEFA/Gq5Ylt08kKcQn8FU1Q7XJURVXe2JHVkBCs6oexUNuGsKXGGeLnO5ecp4bvFfvpKb1t/mr+veg7I2bysAvqGug/fO2SxkvDZfHAyulWREV0oe2UcQbcZZRpiOcp982J3eIEQf91A2Oi/stdgx+9EH4eIgRp3kJgFBZsC10Ax37E5sRknDc03sBZacrWtAUSqXSeVYkn9uTu10dBf5OFpeZJOInauHUbcCexoO/tO8vVZpgjhcujow47oSDcVNOJze2K/VbuTshEMhFRejrXZeUWkciHOskHarA6ONX1cn2iIgnQy/96kbjN8X/pnpvc36pUPsKamLEYussMjMT/l4uAdQrHE2gxUlZywaRVjt9bU92kjzWwmN3QZXn/ICsj7JMq1LS+DaW8XEsnteFEQZ0ULr6nbVMDl2K0CuWxiWkqTxPg1jiywEF/c4vCgPWzoZa2vQLDGV7shz5TbsTAEhqK8fGKfv7hxWiUiSIOjdvVZEFrEZNP8O/Sfk4N6EIyLV6/FDDmwnpXCcrbSTn3ZxFLbpEEv5EuscG8ueWlK1RHAQdD/JU1Y7uQ/mgANiyjW/1JsmttlBhbbZLaBrASCGcy3M+7yU= X-Forefront-Antispam-Report: CIP:216.228.112.32; CTRY:US; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:mail.nvidia.com; PTR:schybrid01.nvidia.com; CAT:NONE; SFS:(4636009)(36840700001)(46966006)(30864003)(86362001)(2876002)(107886003)(6286002)(8676002)(70206006)(356005)(70586007)(36756003)(7636003)(7696005)(47076005)(83380400001)(6916009)(6666004)(26005)(16526019)(36860700001)(8936002)(508600001)(55016002)(186003)(426003)(336012)(2906002)(82310400003)(54906003)(2616005)(4326008)(5660300002)(1076003)(316002)(309714004); DIR:OUT; SFP:1101; X-MS-Exchange-CrossTenant-OriginalArrivalTime: 29 Sep 2021 14:53:12.9595 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 39eb58c0-9362-48b0-7ec0-08d98358dc9b X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=43083d15-7273-40c1-b7db-39efd9ccc17a; Ip=[216.228.112.32]; Helo=[mail.nvidia.com] X-MS-Exchange-CrossTenant-AuthSource: DM6NAM11FT060.eop-nam11.prod.protection.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: BN6PR1201MB0195 Subject: [dpdk-dev] [PATCH v2 4/4] net/mlx5: support mempool registration X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" From: Dmitry Kozlyuk When the first port in a given protection domain (PD) starts, install a mempool event callback for this PD and register all existing memory regions (MR) for it. When the last port in a PD closes, remove the callback and unregister all mempools for this PD. This behavior can be switched off with a new devarg: mr_mempool_reg_en. On TX slow path, i.e. when an MR key for the address of the buffer to send is not in the local cache, first try to retrieve it from the database of registered mempools. Supported are direct and indirect mbufs, as well as externally-attached ones from MLX5 MPRQ feature. Lookup in the database of non-mempool memory is used as the last resort. RX mempools are registered regardless of the devarg value. On RX data path only the local cache and the mempool database is used. If implicit mempool registration is disabled, these mempools are unregistered at port stop, releasing the MRs. Signed-off-by: Dmitry Kozlyuk Acked-by: Matan Azrad --- doc/guides/nics/mlx5.rst | 11 ++ doc/guides/rel_notes/release_21_11.rst | 6 + drivers/net/mlx5/linux/mlx5_mp_os.c | 44 +++++++ drivers/net/mlx5/linux/mlx5_os.c | 4 +- drivers/net/mlx5/mlx5.c | 152 +++++++++++++++++++++++++ drivers/net/mlx5/mlx5.h | 10 ++ drivers/net/mlx5/mlx5_mr.c | 120 +++++-------------- drivers/net/mlx5/mlx5_mr.h | 2 - drivers/net/mlx5/mlx5_rx.h | 21 ++-- drivers/net/mlx5/mlx5_rxq.c | 13 +++ drivers/net/mlx5/mlx5_trigger.c | 77 +++++++++++-- drivers/net/mlx5/windows/mlx5_os.c | 1 + 12 files changed, 345 insertions(+), 116 deletions(-) diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst index bae73f42d8..58d1c5b65c 100644 --- a/doc/guides/nics/mlx5.rst +++ b/doc/guides/nics/mlx5.rst @@ -1001,6 +1001,17 @@ Driver options Enabled by default. +- ``mr_mempool_reg_en`` parameter [int] + + A nonzero value enables implicit registration of DMA memory of all mempools + except those having ``MEMPOOL_F_NON_IO``. The effect is that when a packet + from a mempool is transmitted, its memory is already registered for DMA + in the PMD and no registration will happen on the data path. The tradeoff is + extra work on the creation of each mempool and increased HW resource use + if some mempools are not used with MLX5 devices. + + Enabled by default. + - ``representor`` parameter [list] This parameter can be used to instantiate DPDK Ethernet devices from diff --git a/doc/guides/rel_notes/release_21_11.rst b/doc/guides/rel_notes/release_21_11.rst index 873beda633..1fc09faf96 100644 --- a/doc/guides/rel_notes/release_21_11.rst +++ b/doc/guides/rel_notes/release_21_11.rst @@ -106,6 +106,12 @@ New Features * Added tests to validate packets hard expiry. * Added tests to verify tunnel header verification in IPsec inbound. +* **Updated Mellanox mlx5 driver.** + + Updated the Mellanox mlx5 driver with new features and improvements, including: + + * Added implicit mempool registration to avoid data path hiccups (opt-out). + Removed Items ------------- diff --git a/drivers/net/mlx5/linux/mlx5_mp_os.c b/drivers/net/mlx5/linux/mlx5_mp_os.c index 3a4aa766f8..d2ac375a47 100644 --- a/drivers/net/mlx5/linux/mlx5_mp_os.c +++ b/drivers/net/mlx5/linux/mlx5_mp_os.c @@ -20,6 +20,45 @@ #include "mlx5_tx.h" #include "mlx5_utils.h" +/** + * Handle a port-agnostic message. + * + * @return + * 0 on success, 1 when message is not port-agnostic, (-1) on error. + */ +static int +mlx5_mp_os_handle_port_agnostic(const struct rte_mp_msg *mp_msg, + const void *peer) +{ + struct rte_mp_msg mp_res; + struct mlx5_mp_param *res = (struct mlx5_mp_param *)mp_res.param; + const struct mlx5_mp_param *param = + (const struct mlx5_mp_param *)mp_msg->param; + const struct mlx5_mp_arg_mempool_reg *mpr; + struct mlx5_mp_id mp_id; + + switch (param->type) { + case MLX5_MP_REQ_MEMPOOL_REGISTER: + mlx5_mp_id_init(&mp_id, param->port_id); + mp_init_msg(&mp_id, &mp_res, param->type); + mpr = ¶m->args.mempool_reg; + res->result = mlx5_mr_mempool_register(mpr->share_cache, + mpr->pd, mpr->mempool, + NULL); + return rte_mp_reply(&mp_res, peer); + case MLX5_MP_REQ_MEMPOOL_UNREGISTER: + mlx5_mp_id_init(&mp_id, param->port_id); + mp_init_msg(&mp_id, &mp_res, param->type); + mpr = ¶m->args.mempool_reg; + res->result = mlx5_mr_mempool_unregister(mpr->share_cache, + mpr->mempool, NULL); + return rte_mp_reply(&mp_res, peer); + default: + return 1; + } + return -1; +} + int mlx5_mp_os_primary_handle(const struct rte_mp_msg *mp_msg, const void *peer) { @@ -34,6 +73,11 @@ mlx5_mp_os_primary_handle(const struct rte_mp_msg *mp_msg, const void *peer) int ret; MLX5_ASSERT(rte_eal_process_type() == RTE_PROC_PRIMARY); + /* Port-agnostic messages. */ + ret = mlx5_mp_os_handle_port_agnostic(mp_msg, peer); + if (ret <= 0) + return ret; + /* Port-specific messages. */ if (!rte_eth_dev_is_valid_port(param->port_id)) { rte_errno = ENODEV; DRV_LOG(ERR, "port %u invalid port ID", param->port_id); diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c index 470b16cb9a..78ed588746 100644 --- a/drivers/net/mlx5/linux/mlx5_os.c +++ b/drivers/net/mlx5/linux/mlx5_os.c @@ -1034,8 +1034,7 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev, err = mlx5_proc_priv_init(eth_dev); if (err) return NULL; - mp_id.port_id = eth_dev->data->port_id; - strlcpy(mp_id.name, MLX5_MP_NAME, RTE_MP_MAX_NAME_LEN); + mlx5_mp_id_init(&mp_id, eth_dev->data->port_id); /* Receive command fd from primary process */ err = mlx5_mp_req_verbs_cmd_fd(&mp_id); if (err < 0) @@ -2133,6 +2132,7 @@ mlx5_os_config_default(struct mlx5_dev_config *config) config->txqs_inline = MLX5_ARG_UNSET; config->vf_nl_en = 1; config->mr_ext_memseg_en = 1; + config->mr_mempool_reg_en = 1; config->mprq.max_memcpy_len = MLX5_MPRQ_MEMCPY_DEFAULT_LEN; config->mprq.min_rxqs_num = MLX5_MPRQ_MIN_RXQS; config->dv_esw_en = 1; diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c index f84e061fe7..a16be05f7c 100644 --- a/drivers/net/mlx5/mlx5.c +++ b/drivers/net/mlx5/mlx5.c @@ -178,6 +178,9 @@ /* Device parameter to configure allow or prevent duplicate rules pattern. */ #define MLX5_ALLOW_DUPLICATE_PATTERN "allow_duplicate_pattern" +/* Device parameter to configure implicit registration of mempool memory. */ +#define MLX5_MR_MEMPOOL_REG_EN "mr_mempool_reg_en" + /* Shared memory between primary and secondary processes. */ struct mlx5_shared_data *mlx5_shared_data; @@ -1085,6 +1088,141 @@ mlx5_alloc_rxtx_uars(struct mlx5_dev_ctx_shared *sh, return err; } +/** + * Unregister the mempool from the protection domain. + * + * @param sh + * Pointer to the device shared context. + * @param mp + * Mempool being unregistered. + */ +static void +mlx5_dev_ctx_shared_mempool_unregister(struct mlx5_dev_ctx_shared *sh, + struct rte_mempool *mp) +{ + struct mlx5_mp_id mp_id; + + mlx5_mp_id_init(&mp_id, 0); + if (mlx5_mr_mempool_unregister(&sh->share_cache, mp, &mp_id) < 0) + DRV_LOG(WARNING, "Failed to unregister mempool %s for PD %p: %s", + mp->name, sh->pd, rte_strerror(rte_errno)); +} + +/** + * rte_mempool_walk() callback to register mempools + * for the protection domain. + * + * @param mp + * The mempool being walked. + * @param arg + * Pointer to the device shared context. + */ +static void +mlx5_dev_ctx_shared_mempool_register_cb(struct rte_mempool *mp, void *arg) +{ + struct mlx5_dev_ctx_shared *sh = arg; + struct mlx5_mp_id mp_id; + int ret; + + mlx5_mp_id_init(&mp_id, 0); + ret = mlx5_mr_mempool_register(&sh->share_cache, sh->pd, mp, &mp_id); + if (ret < 0 && rte_errno != EEXIST) + DRV_LOG(ERR, "Failed to register existing mempool %s for PD %p: %s", + mp->name, sh->pd, rte_strerror(rte_errno)); +} + +/** + * rte_mempool_walk() callback to unregister mempools + * from the protection domain. + * + * @param mp + * The mempool being walked. + * @param arg + * Pointer to the device shared context. + */ +static void +mlx5_dev_ctx_shared_mempool_unregister_cb(struct rte_mempool *mp, void *arg) +{ + mlx5_dev_ctx_shared_mempool_unregister + ((struct mlx5_dev_ctx_shared *)arg, mp); +} + +/** + * Mempool life cycle callback for Ethernet devices. + * + * @param event + * Mempool life cycle event. + * @param mp + * Associated mempool. + * @param arg + * Pointer to a device shared context. + */ +static void +mlx5_dev_ctx_shared_mempool_event_cb(enum rte_mempool_event event, + struct rte_mempool *mp, void *arg) +{ + struct mlx5_dev_ctx_shared *sh = arg; + struct mlx5_mp_id mp_id; + + switch (event) { + case RTE_MEMPOOL_EVENT_READY: + mlx5_mp_id_init(&mp_id, 0); + if (mlx5_mr_mempool_register(&sh->share_cache, sh->pd, mp, + &mp_id) < 0) + DRV_LOG(ERR, "Failed to register new mempool %s for PD %p: %s", + mp->name, sh->pd, rte_strerror(rte_errno)); + break; + case RTE_MEMPOOL_EVENT_DESTROY: + mlx5_dev_ctx_shared_mempool_unregister(sh, mp); + break; + } +} + +/** + * Callback used when implicit mempool registration is disabled + * in order to track Rx mempool destruction. + * + * @param event + * Mempool life cycle event. + * @param mp + * An Rx mempool registered explicitly when the port is started. + * @param arg + * Pointer to a device shared context. + */ +static void +mlx5_dev_ctx_shared_rx_mempool_event_cb(enum rte_mempool_event event, + struct rte_mempool *mp, void *arg) +{ + struct mlx5_dev_ctx_shared *sh = arg; + + if (event == RTE_MEMPOOL_EVENT_DESTROY) + mlx5_dev_ctx_shared_mempool_unregister(sh, mp); +} + +int +mlx5_dev_ctx_shared_mempool_subscribe(struct rte_eth_dev *dev) +{ + struct mlx5_priv *priv = dev->data->dev_private; + struct mlx5_dev_ctx_shared *sh = priv->sh; + int ret; + + /* Check if we only need to track Rx mempool destruction. */ + if (!priv->config.mr_mempool_reg_en) { + ret = rte_mempool_event_callback_register + (mlx5_dev_ctx_shared_rx_mempool_event_cb, sh); + return ret == 0 || rte_errno == EEXIST ? 0 : ret; + } + /* Callback for this shared context may be already registered. */ + ret = rte_mempool_event_callback_register + (mlx5_dev_ctx_shared_mempool_event_cb, sh); + if (ret != 0 && rte_errno != EEXIST) + return ret; + /* Register mempools only once for this shared context. */ + if (ret == 0) + rte_mempool_walk(mlx5_dev_ctx_shared_mempool_register_cb, sh); + return 0; +} + /** * Allocate shared device context. If there is multiport device the * master and representors will share this context, if there is single @@ -1282,6 +1420,8 @@ mlx5_alloc_shared_dev_ctx(const struct mlx5_dev_spawn_data *spawn, void mlx5_free_shared_dev_ctx(struct mlx5_dev_ctx_shared *sh) { + int ret; + pthread_mutex_lock(&mlx5_dev_ctx_list_mutex); #ifdef RTE_LIBRTE_MLX5_DEBUG /* Check the object presence in the list. */ @@ -1302,6 +1442,15 @@ mlx5_free_shared_dev_ctx(struct mlx5_dev_ctx_shared *sh) MLX5_ASSERT(rte_eal_process_type() == RTE_PROC_PRIMARY); if (--sh->refcnt) goto exit; + /* Stop watching for mempool events and unregister all mempools. */ + ret = rte_mempool_event_callback_unregister + (mlx5_dev_ctx_shared_mempool_event_cb, sh); + if (ret < 0 && rte_errno == ENOENT) + ret = rte_mempool_event_callback_unregister + (mlx5_dev_ctx_shared_rx_mempool_event_cb, sh); + if (ret == 0) + rte_mempool_walk(mlx5_dev_ctx_shared_mempool_unregister_cb, + sh); /* Remove from memory callback device list. */ rte_rwlock_write_lock(&mlx5_shared_data->mem_event_rwlock); LIST_REMOVE(sh, mem_event_cb); @@ -1991,6 +2140,8 @@ mlx5_args_check(const char *key, const char *val, void *opaque) config->decap_en = !!tmp; } else if (strcmp(MLX5_ALLOW_DUPLICATE_PATTERN, key) == 0) { config->allow_duplicate_pattern = !!tmp; + } else if (strcmp(MLX5_MR_MEMPOOL_REG_EN, key) == 0) { + config->mr_mempool_reg_en = !!tmp; } else { DRV_LOG(WARNING, "%s: unknown parameter", key); rte_errno = EINVAL; @@ -2051,6 +2202,7 @@ mlx5_args(struct mlx5_dev_config *config, struct rte_devargs *devargs) MLX5_SYS_MEM_EN, MLX5_DECAP_EN, MLX5_ALLOW_DUPLICATE_PATTERN, + MLX5_MR_MEMPOOL_REG_EN, NULL, }; struct rte_kvargs *kvlist; diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h index e02714e231..89630d569b 100644 --- a/drivers/net/mlx5/mlx5.h +++ b/drivers/net/mlx5/mlx5.h @@ -155,6 +155,13 @@ struct mlx5_flow_dump_ack { /** Key string for IPC. */ #define MLX5_MP_NAME "net_mlx5_mp" +/** Initialize a multi-process ID. */ +static inline void +mlx5_mp_id_init(struct mlx5_mp_id *mp_id, uint16_t port_id) +{ + mp_id->port_id = port_id; + strlcpy(mp_id->name, MLX5_MP_NAME, RTE_MP_MAX_NAME_LEN); +} LIST_HEAD(mlx5_dev_list, mlx5_dev_ctx_shared); @@ -270,6 +277,8 @@ struct mlx5_dev_config { unsigned int dv_miss_info:1; /* restore packet after partial hw miss */ unsigned int allow_duplicate_pattern:1; /* Allow/Prevent the duplicate rules pattern. */ + unsigned int mr_mempool_reg_en:1; + /* Allow/prevent implicit mempool memory registration. */ struct { unsigned int enabled:1; /* Whether MPRQ is enabled. */ unsigned int stride_num_n; /* Number of strides. */ @@ -1497,6 +1506,7 @@ struct mlx5_dev_ctx_shared * mlx5_alloc_shared_dev_ctx(const struct mlx5_dev_spawn_data *spawn, const struct mlx5_dev_config *config); void mlx5_free_shared_dev_ctx(struct mlx5_dev_ctx_shared *sh); +int mlx5_dev_ctx_shared_mempool_subscribe(struct rte_eth_dev *dev); void mlx5_free_table_hash_list(struct mlx5_priv *priv); int mlx5_alloc_table_hash_list(struct mlx5_priv *priv); void mlx5_set_min_inline(struct mlx5_dev_spawn_data *spawn, diff --git a/drivers/net/mlx5/mlx5_mr.c b/drivers/net/mlx5/mlx5_mr.c index 44afda731f..55d27b50b9 100644 --- a/drivers/net/mlx5/mlx5_mr.c +++ b/drivers/net/mlx5/mlx5_mr.c @@ -65,30 +65,6 @@ mlx5_mr_mem_event_cb(enum rte_mem_event event_type, const void *addr, } } -/** - * Bottom-half of LKey search on Rx. - * - * @param rxq - * Pointer to Rx queue structure. - * @param addr - * Search key. - * - * @return - * Searched LKey on success, UINT32_MAX on no match. - */ -uint32_t -mlx5_rx_addr2mr_bh(struct mlx5_rxq_data *rxq, uintptr_t addr) -{ - struct mlx5_rxq_ctrl *rxq_ctrl = - container_of(rxq, struct mlx5_rxq_ctrl, rxq); - struct mlx5_mr_ctrl *mr_ctrl = &rxq->mr_ctrl; - struct mlx5_priv *priv = rxq_ctrl->priv; - - return mlx5_mr_addr2mr_bh(priv->sh->pd, &priv->mp_id, - &priv->sh->share_cache, mr_ctrl, addr, - priv->config.mr_ext_memseg_en); -} - /** * Bottom-half of LKey search on Tx. * @@ -128,9 +104,36 @@ mlx5_tx_addr2mr_bh(struct mlx5_txq_data *txq, uintptr_t addr) uint32_t mlx5_tx_mb2mr_bh(struct mlx5_txq_data *txq, struct rte_mbuf *mb) { + struct mlx5_txq_ctrl *txq_ctrl = + container_of(txq, struct mlx5_txq_ctrl, txq); + struct mlx5_mr_ctrl *mr_ctrl = &txq->mr_ctrl; + struct mlx5_priv *priv = txq_ctrl->priv; uintptr_t addr = (uintptr_t)mb->buf_addr; uint32_t lkey; + if (priv->config.mr_mempool_reg_en) { + struct rte_mempool *mp = NULL; + struct mlx5_mprq_buf *buf; + + if (!RTE_MBUF_HAS_EXTBUF(mb)) { + mp = mlx5_mb2mp(mb); + } else if (mb->shinfo->free_cb == mlx5_mprq_buf_free_cb) { + /* Recover MPRQ mempool. */ + buf = mb->shinfo->fcb_opaque; + mp = buf->mp; + } + if (mp != NULL) { + lkey = mlx5_mr_mempool2mr_bh(&priv->sh->share_cache, + mr_ctrl, mp, addr); + /* + * Lookup can only fail on invalid input, e.g. "addr" + * is not from "mp" or "mp" has MEMPOOL_F_NON_IO set. + */ + if (lkey != UINT32_MAX) + return lkey; + } + /* Fallback for generic mechanism in corner cases. */ + } lkey = mlx5_tx_addr2mr_bh(txq, addr); if (lkey == UINT32_MAX && rte_errno == ENXIO) { /* Mempool may have externally allocated memory. */ @@ -392,72 +395,3 @@ mlx5_tx_update_ext_mp(struct mlx5_txq_data *txq, uintptr_t addr, mlx5_mr_update_ext_mp(ETH_DEV(priv), mr_ctrl, mp); return mlx5_tx_addr2mr_bh(txq, addr); } - -/* Called during rte_mempool_mem_iter() by mlx5_mr_update_mp(). */ -static void -mlx5_mr_update_mp_cb(struct rte_mempool *mp __rte_unused, void *opaque, - struct rte_mempool_memhdr *memhdr, - unsigned mem_idx __rte_unused) -{ - struct mr_update_mp_data *data = opaque; - struct rte_eth_dev *dev = data->dev; - struct mlx5_priv *priv = dev->data->dev_private; - - uint32_t lkey; - - /* Stop iteration if failed in the previous walk. */ - if (data->ret < 0) - return; - /* Register address of the chunk and update local caches. */ - lkey = mlx5_mr_addr2mr_bh(priv->sh->pd, &priv->mp_id, - &priv->sh->share_cache, data->mr_ctrl, - (uintptr_t)memhdr->addr, - priv->config.mr_ext_memseg_en); - if (lkey == UINT32_MAX) - data->ret = -1; -} - -/** - * Register entire memory chunks in a Mempool. - * - * @param dev - * Pointer to Ethernet device. - * @param mr_ctrl - * Pointer to per-queue MR control structure. - * @param mp - * Pointer to registering Mempool. - * - * @return - * 0 on success, -1 on failure. - */ -int -mlx5_mr_update_mp(struct rte_eth_dev *dev, struct mlx5_mr_ctrl *mr_ctrl, - struct rte_mempool *mp) -{ - struct mr_update_mp_data data = { - .dev = dev, - .mr_ctrl = mr_ctrl, - .ret = 0, - }; - uint32_t flags = rte_pktmbuf_priv_flags(mp); - - if (flags & RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF) { - /* - * The pinned external buffer should be registered for DMA - * operations by application. The mem_list of the pool contains - * the list of chunks with mbuf structures w/o built-in data - * buffers and DMA actually does not happen there, no need - * to create MR for these chunks. - */ - return 0; - } - DRV_LOG(DEBUG, "Port %u Rx queue registering mp %s " - "having %u chunks.", dev->data->port_id, - mp->name, mp->nb_mem_chunks); - rte_mempool_mem_iter(mp, mlx5_mr_update_mp_cb, &data); - if (data.ret < 0 && rte_errno == ENXIO) { - /* Mempool may have externally allocated memory. */ - return mlx5_mr_update_ext_mp(dev, mr_ctrl, mp); - } - return data.ret; -} diff --git a/drivers/net/mlx5/mlx5_mr.h b/drivers/net/mlx5/mlx5_mr.h index 4a7fab6df2..c984e777b5 100644 --- a/drivers/net/mlx5/mlx5_mr.h +++ b/drivers/net/mlx5/mlx5_mr.h @@ -22,7 +22,5 @@ void mlx5_mr_mem_event_cb(enum rte_mem_event event_type, const void *addr, size_t len, void *arg); -int mlx5_mr_update_mp(struct rte_eth_dev *dev, struct mlx5_mr_ctrl *mr_ctrl, - struct rte_mempool *mp); #endif /* RTE_PMD_MLX5_MR_H_ */ diff --git a/drivers/net/mlx5/mlx5_rx.h b/drivers/net/mlx5/mlx5_rx.h index 3f2b99fb65..3952e64d50 100644 --- a/drivers/net/mlx5/mlx5_rx.h +++ b/drivers/net/mlx5/mlx5_rx.h @@ -275,13 +275,11 @@ uint16_t mlx5_rx_burst_vec(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t mlx5_rx_burst_mprq_vec(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n); -/* mlx5_mr.c */ - -uint32_t mlx5_rx_addr2mr_bh(struct mlx5_rxq_data *rxq, uintptr_t addr); +static int mlx5_rxq_mprq_enabled(struct mlx5_rxq_data *rxq); /** - * Query LKey from a packet buffer for Rx. No need to flush local caches for Rx - * as mempool is pre-configured and static. + * Query LKey from a packet buffer for Rx. No need to flush local caches + * as the Rx mempool database entries are valid for the lifetime of the queue. * * @param rxq * Pointer to Rx queue structure. @@ -290,11 +288,14 @@ uint32_t mlx5_rx_addr2mr_bh(struct mlx5_rxq_data *rxq, uintptr_t addr); * * @return * Searched LKey on success, UINT32_MAX on no match. + * This function always succeeds on valid input. */ static __rte_always_inline uint32_t mlx5_rx_addr2mr(struct mlx5_rxq_data *rxq, uintptr_t addr) { struct mlx5_mr_ctrl *mr_ctrl = &rxq->mr_ctrl; + struct mlx5_rxq_ctrl *rxq_ctrl; + struct rte_mempool *mp; uint32_t lkey; /* Linear search on MR cache array. */ @@ -302,8 +303,14 @@ mlx5_rx_addr2mr(struct mlx5_rxq_data *rxq, uintptr_t addr) MLX5_MR_CACHE_N, addr); if (likely(lkey != UINT32_MAX)) return lkey; - /* Take slower bottom-half (Binary Search) on miss. */ - return mlx5_rx_addr2mr_bh(rxq, addr); + /* + * Slower search in the mempool database on miss. + * During queue creation rxq->sh is not yet set, so we use rxq_ctrl. + */ + rxq_ctrl = container_of(rxq, struct mlx5_rxq_ctrl, rxq); + mp = mlx5_rxq_mprq_enabled(rxq) ? rxq->mprq_mp : rxq->mp; + return mlx5_mr_mempool2mr_bh(&rxq_ctrl->priv->sh->share_cache, + mr_ctrl, mp, addr); } #define mlx5_rx_mb2mr(rxq, mb) mlx5_rx_addr2mr(rxq, (uintptr_t)((mb)->buf_addr)) diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c index abd8ce7989..4696f0b851 100644 --- a/drivers/net/mlx5/mlx5_rxq.c +++ b/drivers/net/mlx5/mlx5_rxq.c @@ -1165,6 +1165,7 @@ mlx5_mprq_alloc_mp(struct rte_eth_dev *dev) unsigned int strd_sz_n = 0; unsigned int i; unsigned int n_ibv = 0; + int ret; if (!mlx5_mprq_enabled(dev)) return 0; @@ -1244,6 +1245,16 @@ mlx5_mprq_alloc_mp(struct rte_eth_dev *dev) rte_errno = ENOMEM; return -rte_errno; } + ret = mlx5_mr_mempool_register(&priv->sh->share_cache, priv->sh->pd, + mp, &priv->mp_id); + if (ret < 0 && rte_errno != EEXIST) { + ret = rte_errno; + DRV_LOG(ERR, "port %u failed to register a mempool for Multi-Packet RQ", + dev->data->port_id); + rte_mempool_free(mp); + rte_errno = ret; + return -rte_errno; + } priv->mprq_mp = mp; exit: /* Set mempool for each Rx queue. */ @@ -1446,6 +1457,8 @@ mlx5_rxq_new(struct rte_eth_dev *dev, uint16_t idx, uint16_t desc, /* rte_errno is already set. */ goto error; } + /* Rx queues don't use this pointer, but we want a valid structure. */ + tmpl->rxq.mr_ctrl.dev_gen_ptr = &priv->sh->share_cache.dev_gen; tmpl->socket = socket; if (dev->data->dev_conf.intr_conf.rxq) tmpl->irq = 1; diff --git a/drivers/net/mlx5/mlx5_trigger.c b/drivers/net/mlx5/mlx5_trigger.c index 54173bfacb..3cbf5816a1 100644 --- a/drivers/net/mlx5/mlx5_trigger.c +++ b/drivers/net/mlx5/mlx5_trigger.c @@ -105,6 +105,59 @@ mlx5_txq_start(struct rte_eth_dev *dev) return -rte_errno; } +/** + * Translate the chunk address to MR key in order to put in into the cache. + */ +static void +mlx5_rxq_mempool_register_cb(struct rte_mempool *mp, void *opaque, + struct rte_mempool_memhdr *memhdr, + unsigned int idx) +{ + struct mlx5_rxq_data *rxq = opaque; + + RTE_SET_USED(mp); + RTE_SET_USED(idx); + mlx5_rx_addr2mr(rxq, (uintptr_t)memhdr->addr); +} + +/** + * Register Rx queue mempools and fill the Rx queue cache. + * This function tolerates repeated mempool registration. + * + * @param[in] rxq_ctrl + * Rx queue control data. + * + * @return + * 0 on success, (-1) on failure and rte_errno is set. + */ +static int +mlx5_rxq_mempool_register(struct mlx5_rxq_ctrl *rxq_ctrl) +{ + struct mlx5_priv *priv = rxq_ctrl->priv; + struct rte_mempool *mp; + uint32_t s; + int ret = 0; + + mlx5_mr_flush_local_cache(&rxq_ctrl->rxq.mr_ctrl); + /* MPRQ mempool is registered on creation, just fill the cache. */ + if (mlx5_rxq_mprq_enabled(&rxq_ctrl->rxq)) { + rte_mempool_mem_iter(rxq_ctrl->rxq.mprq_mp, + mlx5_rxq_mempool_register_cb, + &rxq_ctrl->rxq); + return 0; + } + for (s = 0; s < rxq_ctrl->rxq.rxseg_n; s++) { + mp = rxq_ctrl->rxq.rxseg[s].mp; + ret = mlx5_mr_mempool_register(&priv->sh->share_cache, + priv->sh->pd, mp, &priv->mp_id); + if (ret < 0 && rte_errno != EEXIST) + return ret; + rte_mempool_mem_iter(mp, mlx5_rxq_mempool_register_cb, + &rxq_ctrl->rxq); + } + return 0; +} + /** * Stop traffic on Rx queues. * @@ -152,18 +205,13 @@ mlx5_rxq_start(struct rte_eth_dev *dev) if (!rxq_ctrl) continue; if (rxq_ctrl->type == MLX5_RXQ_TYPE_STANDARD) { - /* Pre-register Rx mempools. */ - if (mlx5_rxq_mprq_enabled(&rxq_ctrl->rxq)) { - mlx5_mr_update_mp(dev, &rxq_ctrl->rxq.mr_ctrl, - rxq_ctrl->rxq.mprq_mp); - } else { - uint32_t s; - - for (s = 0; s < rxq_ctrl->rxq.rxseg_n; s++) - mlx5_mr_update_mp - (dev, &rxq_ctrl->rxq.mr_ctrl, - rxq_ctrl->rxq.rxseg[s].mp); - } + /* + * Pre-register the mempools. Regardless of whether + * the implicit registration is enabled or not, + * Rx mempool destruction is tracked to free MRs. + */ + if (mlx5_rxq_mempool_register(rxq_ctrl) < 0) + goto error; ret = rxq_alloc_elts(rxq_ctrl); if (ret) goto error; @@ -1124,6 +1172,11 @@ mlx5_dev_start(struct rte_eth_dev *dev) dev->data->port_id, strerror(rte_errno)); goto error; } + if (mlx5_dev_ctx_shared_mempool_subscribe(dev) != 0) { + DRV_LOG(ERR, "port %u failed to subscribe for mempool life cycle: %s", + dev->data->port_id, rte_strerror(rte_errno)); + goto error; + } rte_wmb(); dev->tx_pkt_burst = mlx5_select_tx_function(dev); dev->rx_pkt_burst = mlx5_select_rx_function(dev); diff --git a/drivers/net/mlx5/windows/mlx5_os.c b/drivers/net/mlx5/windows/mlx5_os.c index 26fa927039..149253d174 100644 --- a/drivers/net/mlx5/windows/mlx5_os.c +++ b/drivers/net/mlx5/windows/mlx5_os.c @@ -1116,6 +1116,7 @@ mlx5_os_net_probe(struct rte_device *dev) dev_config.txqs_inline = MLX5_ARG_UNSET; dev_config.vf_nl_en = 0; dev_config.mr_ext_memseg_en = 1; + dev_config.mr_mempool_reg_en = 1; dev_config.mprq.max_memcpy_len = MLX5_MPRQ_MEMCPY_DEFAULT_LEN; dev_config.mprq.min_rxqs_num = MLX5_MPRQ_MIN_RXQS; dev_config.dv_esw_en = 0;