From patchwork Wed Oct 20 03:19:37 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Rongwei Liu X-Patchwork-Id: 102334 X-Patchwork-Delegate: rasland@nvidia.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 2F7EDA0C45; Wed, 20 Oct 2021 05:20:11 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 56D9241164; Wed, 20 Oct 2021 05:20:05 +0200 (CEST) Received: from NAM10-MW2-obe.outbound.protection.outlook.com (mail-mw2nam10on2070.outbound.protection.outlook.com [40.107.94.70]) by mails.dpdk.org (Postfix) with ESMTP id C6D1440142 for ; Wed, 20 Oct 2021 05:20:03 +0200 (CEST) ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=GpR7m6SfFMEc+KNTdNI173Aobmfi44gOvRvBpw30WMFFvLgtzRkNQjbUgdgdpnDzOlMwjX206w0GMHUXinRpTDVgsKdAWb8jSRgmvPSonYTExB0iyoNb5vjmsnvhZfAgm9vYgFnxPVsmxN+VIfAGKbeDnRaemaYRSXKFPb0jy6vsG844u8EcT5H1B42oNDWKyCu9v6aS3Y6OHAmHm47s/FxF9vdo4Aw4wp6he7YbTfecMWEsjz/z2yKsbzJjY2NxeOBeadh6aoZ/Joiuw2ln8dQqz/EDX+zUdNU4lqS4vD9vqWIRF98ZsFD0Am/yBd8dK4DaB5JhHokyTVllvC8MsA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=zPy5c36WJKX5dtU7rtIFTiDmuvq5qExk0W+A8FGJuLI=; b=XyPYrIXdm+XHkgcCvSn0m8U6tYt1wKFjC2MWohwH1YIG4tZ7t/oHUt9CPSLfgpsl+YZJm3KkqT1IUf84dSP6GfF9RahdzipIzPZ2sv0pHfXybi7kLuj3AP6tvOMbH76VVyJoqriZfu02KHB0/QlcaJHLJmKqap1EVWe045tZTCixjDL+yuyuJsOZMbT/ifGNcRYa4ZxB06XwUUHennV3OEwDXOkHQ/VtGPyNYVRxKC6M58KpOd+SWRnEBHDfKqjsiaVEQweS/aRMHSaWbM57jirNqcjxPwnlGeckqTUAXXBzt3r/DINARug2NNHNTOVl9OvmMUF9XB+GYuGkNUHeEg== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 216.228.112.34) smtp.rcpttodomain=dpdk.org smtp.mailfrom=nvidia.com; dmarc=pass (p=quarantine sp=quarantine pct=100) action=none header.from=nvidia.com; dkim=none (message not signed); arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=zPy5c36WJKX5dtU7rtIFTiDmuvq5qExk0W+A8FGJuLI=; b=KlUFwq2OQSJLF8C9rDPFkK4ax2+RrkRk7u1ibbJHZqLCUi8YHiKMCp3fjp7Ak5zd34qppnSpll8gNyu8onVtOQow8uLaYRfBenQnOc36nLvbfb2amfdRJtPPTYkQrmuLqI2C0murVMLNEEyM4wdobgEk/tKhngt6jOs3T4T1LD0BV4mpW90nh1PFTXMFS7vDzbsQgU9HNTVK6/9s7eO+tKMxbn8eP1ji4+ms1Ka7RQdfFLjyfYfUYqtEFjdjXk3Lz03nMQX+ZjLGRbzs/QYYI1deTsCxZ7wnzwHWC7uczRXT/rLp8tp89di67zCl6OgztynC2KLTXjFDiXydIOGDrg== Received: from BN9P221CA0010.NAMP221.PROD.OUTLOOK.COM (2603:10b6:408:10a::13) by MN2PR12MB3437.namprd12.prod.outlook.com (2603:10b6:208:c3::20) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4608.16; Wed, 20 Oct 2021 03:20:02 +0000 Received: from BN8NAM11FT063.eop-nam11.prod.protection.outlook.com (2603:10b6:408:10a:cafe::6f) by BN9P221CA0010.outlook.office365.com (2603:10b6:408:10a::13) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4608.18 via Frontend Transport; Wed, 20 Oct 2021 03:20:02 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 216.228.112.34) smtp.mailfrom=nvidia.com; dpdk.org; dkim=none (message not signed) header.d=none;dpdk.org; dmarc=pass action=none header.from=nvidia.com; Received-SPF: Pass (protection.outlook.com: domain of nvidia.com designates 216.228.112.34 as permitted sender) receiver=protection.outlook.com; client-ip=216.228.112.34; helo=mail.nvidia.com; Received: from mail.nvidia.com (216.228.112.34) by BN8NAM11FT063.mail.protection.outlook.com (10.13.177.110) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384) id 15.20.4608.15 via Frontend Transport; Wed, 20 Oct 2021 03:20:01 +0000 Received: from nvidia.com (172.20.187.6) by HQMAIL107.nvidia.com (172.20.187.13) with Microsoft SMTP Server (TLS) id 15.0.1497.18; Wed, 20 Oct 2021 03:19:55 +0000 From: Rongwei Liu To: , , , , Ray Kinsella CC: , , Jiawei Wang Date: Wed, 20 Oct 2021 06:19:37 +0300 Message-ID: <20211020031938.3190843-2-rongweil@nvidia.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20211020031938.3190843-1-rongweil@nvidia.com> References: <20211020031938.3190843-1-rongweil@nvidia.com> MIME-Version: 1.0 X-Originating-IP: [172.20.187.6] X-ClientProxiedBy: HQMAIL111.nvidia.com (172.20.187.18) To HQMAIL107.nvidia.com (172.20.187.13) X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id: f284f11e-bb67-455d-d7c0-08d99378810d X-MS-TrafficTypeDiagnostic: MN2PR12MB3437: X-LD-Processed: 43083d15-7273-40c1-b7db-39efd9ccc17a,ExtAddr X-Microsoft-Antispam-PRVS: X-MS-Oob-TLC-OOBClassifiers: OLM:7219; X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: Kn2UDS/y4oV8jco5dLJ2Wnttrt+tcR+wPQNP8rIF2F60NUvi3/b9vEMXHmDT1PL8+mz0JFH9zTn3njefrFE8DlM4vqNZ+JHNmJFfau/BF0bAS3bFYc6g3JgJbQQG0KgbsPfWzujYIqciH+/15bQSYoxAlAiSFeLJ9SxRCbvgU/hgtA/FG+c+YLZeRKO0X5jN66oMdxyNO9CtCF+sZI6TgvAzY50uvVuq4wTzMQtC+JCacD9pExlU3rChLQwxwmMGA1tir8ChhsnpWYw4vx5PVVrQqiL4RSMRZA8tJ9MMHjy4Dt4x0x9WrK1IPExrKg7FGV0ZzuPCX4rGCZItYwo9ZtrVaVsyI6CAMMdBOCAJOGC/+/gf4fQ/7Uo8SIzu2yRAWkVKpDQjTWk6/zWp1/BlXbcr/hZy9V+jT4PfA98v4UX++7UDHumHGedrVMkmc5zLnZCxE1ASe5X+DLV9d5VLr3331WC6jIiXbXG/spSoSMjFtn0lFNQIBueV1DvmoT45oldaeq7n0GFU4fT9KjinuHKkAhuA9VHbp0jHTfESAQurgjkI7giQqvwdXUyr4kCdVv4zqhySF6FQ7XykjIN+VoA3FSOzL+qhFNvu8paGGL+RlYH+Gx6d6XwSx5rlzapU8nI55AgUjoxT/LYMsFJ3V+FlV8taYKwsNDQ3J643ppXf43Xi6BWlM1xWRG3GYm5GdINYTeYoQ8MjtKjejrN31w== X-Forefront-Antispam-Report: CIP:216.228.112.34; CTRY:US; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:mail.nvidia.com; PTR:schybrid03.nvidia.com; CAT:NONE; SFS:(4636009)(36840700001)(46966006)(110136005)(54906003)(336012)(36906005)(316002)(6286002)(6666004)(1076003)(26005)(2616005)(70206006)(70586007)(186003)(8676002)(8936002)(426003)(82310400003)(4326008)(16526019)(7636003)(2906002)(86362001)(7696005)(36756003)(5660300002)(356005)(55016002)(508600001)(47076005)(36860700001)(83380400001)(107886003); DIR:OUT; SFP:1101; X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 20 Oct 2021 03:20:01.5835 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: f284f11e-bb67-455d-d7c0-08d99378810d X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=43083d15-7273-40c1-b7db-39efd9ccc17a; Ip=[216.228.112.34]; Helo=[mail.nvidia.com] X-MS-Exchange-CrossTenant-AuthSource: BN8NAM11FT063.eop-nam11.prod.protection.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: MN2PR12MB3437 Subject: [dpdk-dev] [PATCH 1/2] common/mlx5: support lag context query X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Added a new api mlx5_devx_cmd_query_lag() to query lag property from firmware including state/affinity/mode etc. Signed-off-by: Jiawei Wang Signed-off-by: Rongwei Liu Acked-by: Matan Azrad --- drivers/common/mlx5/mlx5_devx_cmds.c | 40 +++++++++++++++++++++++++ drivers/common/mlx5/mlx5_devx_cmds.h | 13 ++++++++ drivers/common/mlx5/mlx5_prm.h | 45 +++++++++++++++++++++++++++- drivers/common/mlx5/version.map | 1 + 4 files changed, 98 insertions(+), 1 deletion(-) diff --git a/drivers/common/mlx5/mlx5_devx_cmds.c b/drivers/common/mlx5/mlx5_devx_cmds.c index 6538bce57b..fb7c8e986f 100644 --- a/drivers/common/mlx5/mlx5_devx_cmds.c +++ b/drivers/common/mlx5/mlx5_devx_cmds.c @@ -2800,3 +2800,43 @@ mlx5_devx_cmd_create_crypto_login_obj(void *ctx, crypto_login_obj->id = MLX5_GET(general_obj_out_cmd_hdr, out, obj_id); return crypto_login_obj; } + +/** + * Query LAG context. + * + * @param[in] ctx + * Pointer to ibv_context, returned from mlx5dv_open_device. + * @param[out] lag_ctx + * Pointer to struct mlx5_devx_lag_context, to be set by the routine. + * + * @return + * 0 on success, a negative value otherwise. + */ +int +mlx5_devx_cmd_query_lag(void *ctx, + struct mlx5_devx_lag_context *lag_ctx) +{ + uint32_t in[MLX5_ST_SZ_DW(query_lag_in)] = {0}; + uint32_t out[MLX5_ST_SZ_DW(query_lag_out)] = {0}; + void *lctx; + int rc; + + MLX5_SET(query_lag_in, in, opcode, MLX5_CMD_OP_QUERY_LAG); + rc = mlx5_glue->devx_general_cmd(ctx, in, sizeof(in), out, sizeof(out)); + if (rc) + goto error; + lctx = MLX5_ADDR_OF(query_lag_out, out, context); + lag_ctx->fdb_selection_mode = MLX5_GET(lag_context, lctx, + fdb_selection_mode); + lag_ctx->port_select_mode = MLX5_GET(lag_context, lctx, + port_select_mode); + lag_ctx->lag_state = MLX5_GET(lag_context, lctx, lag_state); + lag_ctx->tx_remap_affinity_2 = MLX5_GET(lag_context, lctx, + tx_remap_affinity_2); + lag_ctx->tx_remap_affinity_1 = MLX5_GET(lag_context, lctx, + tx_remap_affinity_1); + return 0; +error: + rc = (rc > 0) ? -rc : rc; + return rc; +} diff --git a/drivers/common/mlx5/mlx5_devx_cmds.h b/drivers/common/mlx5/mlx5_devx_cmds.h index 6948cadd37..5e4f3b749e 100644 --- a/drivers/common/mlx5/mlx5_devx_cmds.h +++ b/drivers/common/mlx5/mlx5_devx_cmds.h @@ -197,6 +197,15 @@ struct mlx5_hca_attr { uint32_t umr_indirect_mkey_disabled:1; }; +/* LAG Context. */ +struct mlx5_devx_lag_context { + uint32_t fdb_selection_mode:1; + uint32_t port_select_mode:3; + uint32_t lag_state:3; + uint32_t tx_remap_affinity_1:4; + uint32_t tx_remap_affinity_2:4; +}; + struct mlx5_devx_wq_attr { uint32_t wq_type:4; uint32_t wq_signature:1; @@ -681,4 +690,8 @@ struct mlx5_devx_obj * mlx5_devx_cmd_create_crypto_login_obj(void *ctx, struct mlx5_devx_crypto_login_attr *attr); +__rte_internal +int +mlx5_devx_cmd_query_lag(void *ctx, + struct mlx5_devx_lag_context *lag_ctx); #endif /* RTE_PMD_MLX5_DEVX_CMDS_H_ */ diff --git a/drivers/common/mlx5/mlx5_prm.h b/drivers/common/mlx5/mlx5_prm.h index 54e62aa153..eab80eaead 100644 --- a/drivers/common/mlx5/mlx5_prm.h +++ b/drivers/common/mlx5/mlx5_prm.h @@ -1048,6 +1048,7 @@ enum { MLX5_CMD_OP_DEALLOC_PD = 0x801, MLX5_CMD_OP_ACCESS_REGISTER = 0x805, MLX5_CMD_OP_ALLOC_TRANSPORT_DOMAIN = 0x816, + MLX5_CMD_OP_QUERY_LAG = 0x842, MLX5_CMD_OP_CREATE_TIR = 0x900, MLX5_CMD_OP_MODIFY_TIR = 0x901, MLX5_CMD_OP_CREATE_SQ = 0X904, @@ -1507,7 +1508,8 @@ struct mlx5_ifc_cmd_hca_cap_bits { u8 uar_4k[0x1]; u8 reserved_at_241[0x9]; u8 uar_sz[0x6]; - u8 reserved_at_250[0x8]; + u8 port_selection_cap[0x1]; + u8 reserved_at_251[0x7]; u8 log_pg_sz[0x8]; u8 bf[0x1]; u8 driver_version[0x1]; @@ -1974,6 +1976,14 @@ struct mlx5_ifc_query_nic_vport_context_in_bits { u8 reserved_at_68[0x18]; }; +/* + * lag_tx_port_affinity: 0 auto-selection, 1 PF1, 2 PF2 vice versa. + * Each TIS binds to one PF by setting lag_tx_port_affinity (>0). + * Once LAG enabled, we create multiple TISs and bind each one to + * different PFs, then TIS[i] gets affinity i+1 and goes to PF i+1. + */ +#define MLX5_IFC_LAG_MAP_TIS_AFFINITY(index, num) ((num) ? \ + (index) % (num) + 1 : 0) struct mlx5_ifc_tisc_bits { u8 strict_lag_tx_port_affinity[0x1]; u8 reserved_at_1[0x3]; @@ -2007,6 +2017,39 @@ struct mlx5_ifc_query_tis_in_bits { u8 reserved_at_60[0x20]; }; +/* port_select_mode definition. */ +enum mlx5_lag_mode_type { + MLX5_LAG_MODE_TIS = 0, + MLX5_LAG_MODE_HASH = 1, +}; + +struct mlx5_ifc_lag_context_bits { + u8 fdb_selection_mode[0x1]; + u8 reserved_at_1[0x14]; + u8 port_select_mode[0x3]; + u8 reserved_at_18[0x5]; + u8 lag_state[0x3]; + u8 reserved_at_20[0x14]; + u8 tx_remap_affinity_2[0x4]; + u8 reserved_at_38[0x4]; + u8 tx_remap_affinity_1[0x4]; +}; + +struct mlx5_ifc_query_lag_in_bits { + u8 opcode[0x10]; + u8 uid[0x10]; + u8 reserved_at_20[0x10]; + u8 op_mod[0x10]; + u8 reserved_at_40[0x40]; +}; + +struct mlx5_ifc_query_lag_out_bits { + u8 status[0x8]; + u8 reserved_at_8[0x18]; + u8 syndrome[0x20]; + struct mlx5_ifc_lag_context_bits context; +}; + struct mlx5_ifc_alloc_transport_domain_out_bits { u8 status[0x8]; u8 reserved_at_8[0x18]; diff --git a/drivers/common/mlx5/version.map b/drivers/common/mlx5/version.map index d3c5040aac..95f8bddb94 100644 --- a/drivers/common/mlx5/version.map +++ b/drivers/common/mlx5/version.map @@ -53,6 +53,7 @@ INTERNAL { mlx5_devx_cmd_modify_virtq; mlx5_devx_cmd_qp_query_tis_td; mlx5_devx_cmd_query_hca_attr; + mlx5_devx_cmd_query_lag; mlx5_devx_cmd_query_parse_samples; mlx5_devx_cmd_query_virtio_q_counters; # WINDOWS_NO_EXPORT mlx5_devx_cmd_query_virtq; From patchwork Wed Oct 20 03:19:38 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Rongwei Liu X-Patchwork-Id: 102335 X-Patchwork-Delegate: rasland@nvidia.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id BE5D9A0C45; Wed, 20 Oct 2021 05:20:16 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 5DD4941155; Wed, 20 Oct 2021 05:20:08 +0200 (CEST) Received: from NAM12-MW2-obe.outbound.protection.outlook.com (mail-mw2nam12on2060.outbound.protection.outlook.com [40.107.244.60]) by mails.dpdk.org (Postfix) with ESMTP id EA3DE40683 for ; Wed, 20 Oct 2021 05:20:06 +0200 (CEST) ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=LKujMdiv+Ebhk7PEQc/Bqqh1KsSMMs7lHTtTGRb6VXUt5LFqXgzis8Ekgl6nDZsYDe5OWsLr8wwnU2Kv49RgFgtBal0vyhpQAQfaRy0w1I7CyFY5VGIaGqBlAh2rdMnoYsLD4c1TMFuTGdLokNkDPK8i0likCJhXuCsOGrQWr/q2OqDTSwouSEONwBlUQ4mn2GeNT5gmECueEpKuiZ/EVGNz134/ro/xUxgCaMEcasngzjPmfDhWOwvQSIk0hYolaoZaySJmRQrLjE8xIGIILg9lu/CSfKpWn4Cw8THLNa4I+R7IH9ogvioZV6cWP3LhWTHNbW5zO6Tr7ayUIjI5xg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=gfoPk5CCxZhVWvoWEzosfHYufoOI7OM1E0/R+IddS8Y=; b=cCJKMG0cf+KAH+mV2K60c+f2iwvpzM0XtDCO5cAKVKn2MrhFWytnFYOIAm4Iq65KGkJfG9vGYYUtjooXntbveELEF1VviUjIDrlRCl7d+nkiWNSw58ApHE2vFWUNxOUaYQy7pDU16cOL3dqHgz8sBk9XUmjIk+IqwGU5gSG8pRYj10NgsGOvC6bmlLdl9OUP1pvUf3dA+DzqPsJRC9bF2ky1Ws3kghs01HIQHat9mG702A2B1qe2W6z+N6jKDC/4Cu6oXSzitACUtYvIf2bogJNF5TG1nCnmsilI+dDIKSZCVrb2VmiJy8Wx5Lg/3+ZgQjGHhVMSosnT+uE1HurYWg== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 216.228.112.34) smtp.rcpttodomain=dpdk.org smtp.mailfrom=nvidia.com; dmarc=pass (p=quarantine sp=quarantine pct=100) action=none header.from=nvidia.com; dkim=none (message not signed); arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=gfoPk5CCxZhVWvoWEzosfHYufoOI7OM1E0/R+IddS8Y=; b=IX6yWURwkB/TD1QwK+JMqEFnJBCGfjtGdJSK9oQ6gAISY6snEhLoPAWWSl6pJrhMEg2Umf6bX1XULVx9rARHngEAq/T35uuHBJs+aU7V5WWqhS1aIBqm9aMJwANB7cj/7fLOq858U0NJaCmHNCrpIIyzYUCoDJuo/Y4NqjZBqI5r/dZUZ9UDB6vMpdWEkb/BglJ1jfkM6KOkcJLdq6m2w+HNB7+SA7qvNaSUw6KKpsd7/ltKph23u4E2Xc6ad8kW4PzYYJSG/fyip4s8914Da8dSCrdU7xVOr66Kb2XFDAAFL8IPLtOcynHeyhYgGDKSTdTSmg0+ee5Pvd/jH0nklQ== Received: from BN9P222CA0020.NAMP222.PROD.OUTLOOK.COM (2603:10b6:408:10c::25) by DM6PR12MB3066.namprd12.prod.outlook.com (2603:10b6:5:11a::20) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4608.18; Wed, 20 Oct 2021 03:20:05 +0000 Received: from BN8NAM11FT048.eop-nam11.prod.protection.outlook.com (2603:10b6:408:10c:cafe::d2) by BN9P222CA0020.outlook.office365.com (2603:10b6:408:10c::25) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4608.15 via Frontend Transport; Wed, 20 Oct 2021 03:20:04 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 216.228.112.34) smtp.mailfrom=nvidia.com; dpdk.org; dkim=none (message not signed) header.d=none;dpdk.org; dmarc=pass action=none header.from=nvidia.com; Received-SPF: Pass (protection.outlook.com: domain of nvidia.com designates 216.228.112.34 as permitted sender) receiver=protection.outlook.com; client-ip=216.228.112.34; helo=mail.nvidia.com; Received: from mail.nvidia.com (216.228.112.34) by BN8NAM11FT048.mail.protection.outlook.com (10.13.177.117) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384) id 15.20.4628.16 via Frontend Transport; Wed, 20 Oct 2021 03:20:04 +0000 Received: from nvidia.com (172.20.187.6) by HQMAIL107.nvidia.com (172.20.187.13) with Microsoft SMTP Server (TLS) id 15.0.1497.18; Wed, 20 Oct 2021 03:20:00 +0000 From: Rongwei Liu To: , , , CC: , Date: Wed, 20 Oct 2021 06:19:38 +0300 Message-ID: <20211020031938.3190843-3-rongweil@nvidia.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20211020031938.3190843-1-rongweil@nvidia.com> References: <20211020031938.3190843-1-rongweil@nvidia.com> MIME-Version: 1.0 X-Originating-IP: [172.20.187.6] X-ClientProxiedBy: HQMAIL111.nvidia.com (172.20.187.18) To HQMAIL107.nvidia.com (172.20.187.13) X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id: 1b3f4a02-127a-4d25-c264-08d9937882cf X-MS-TrafficTypeDiagnostic: DM6PR12MB3066: X-LD-Processed: 43083d15-7273-40c1-b7db-39efd9ccc17a,ExtAddr X-Microsoft-Antispam-PRVS: X-MS-Oob-TLC-OOBClassifiers: OLM:2733; X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: TBCvt1g/ilpqKeYbcAdxnAakCLH4k1cDzVMncAs/GnpwirrC3M4XgKqyLiFyYNP521RiK2L54MvbtFR2aYeHCac7nxgwP7Rrikmoy/Os1GooBVtb6pXd2L8lepNRLiwtCWPRXk2t/fnsfeSqeY1dzGrPauYWfuDlUW81Qm0S5CK+6ZKfqZYb0f2JNe1/f780Iqn1L3VY2NwE/YOZZrWS9eyOq60XIAJa4HZMQy7GEkbMFCZVWSrSDSJoeQMMm5aUMTGm21CeKmkmQLy7ySmWHYCR8grkjNxVoT+OdR8ZRMrcJyfLxijvuKSIEuv+gp666SmvFSl6vkxMoOzO2PBz9VvtQr8/CkqJfDXEQnkejx92eIDIFNAVlwMt08z1C79LkkQl5EXKRoFCzux7nCbcy1Fx2lYxH9pfzoH1/hgLe0H513VDX1MvQjC3BW+AvjJlPdiXz8nqvqqrp0zMsMigofEvsbwcf2RSa5fy0zOqhmRvJ3vpO/KFl8FalJmv1bdrRHhbyQw+DyWEMJgcQytYEbRSp9kgkINGCegznpECCydkVAmN7j2Z8dD8ig5aSQ3lq3tgkJE9iE3rb1Qv/eJ0cD5q9W4LGO9MDP+9PdKFLVynZKkRvjzMXJvXLVqyKKxW9yJZTh30CrtA+UckK1mIebxL0a9Jw2OTktGEpqHF5KKqeqsQLs7NnsYVozetaQ1ogRt0JHg7uc3jTqdlgpSAVXr1C0MkzQc2GT7XMxV7/fc= X-Forefront-Antispam-Report: CIP:216.228.112.34; CTRY:US; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:mail.nvidia.com; PTR:schybrid03.nvidia.com; CAT:NONE; SFS:(4636009)(46966006)(36840700001)(2906002)(8936002)(7696005)(2616005)(107886003)(336012)(1076003)(30864003)(26005)(47076005)(7636003)(36756003)(55016002)(356005)(110136005)(316002)(36906005)(83380400001)(70586007)(5660300002)(4326008)(54906003)(36860700001)(86362001)(82310400003)(8676002)(70206006)(186003)(6286002)(16526019)(426003)(6666004)(508600001)(309714004); DIR:OUT; SFP:1101; X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 20 Oct 2021 03:20:04.6849 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 1b3f4a02-127a-4d25-c264-08d9937882cf X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=43083d15-7273-40c1-b7db-39efd9ccc17a; Ip=[216.228.112.34]; Helo=[mail.nvidia.com] X-MS-Exchange-CrossTenant-AuthSource: BN8NAM11FT048.eop-nam11.prod.protection.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: DM6PR12MB3066 Subject: [dpdk-dev] [PATCH 2/2] net/mlx5: set txq affinity in round-robin X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Previously, we set txq affinity to 0 and let firmware to perform round-robin when bonding. Firmware uses a global counter to assign txq affinity to different physical ports accord to remainder after division. There are three dis-advantages: 1. The global counter is shared between kernel and dpdk. 2. After restarting pmd or port, the previous counter value is reused, so the new affinity is unpredictable. 3. There is no way to get what affinity is set by firmware. In this update, we will create several TISs up to the number of bonding ports and bind each TIS to one PF port. For each port, it will start to pick up TIS using its port index. Upper layer application can quickly calculate each txq's affinity without querying. At DPDK layer, when creating txq with 2 bonding ports, the affinity is set like: port 0: 1-->2-->1-->2 port 1: 2-->1-->2-->1 port 2: 1-->2-->1-->2 Note: Only applicable to DevX api. This affinity subjects to HW hash. Signed-off-by: Rongwei Liu Acked-by: Matan Azrad --- doc/guides/nics/mlx5.rst | 4 ++ drivers/net/mlx5/linux/mlx5_os.c | 2 +- drivers/net/mlx5/mlx5.c | 81 ++++++++++++++++++++++++++++---- drivers/net/mlx5/mlx5.h | 10 +++- drivers/net/mlx5/mlx5_devx.c | 37 ++++++++++++++- drivers/net/mlx5/mlx5_txpp.c | 4 +- 6 files changed, 124 insertions(+), 14 deletions(-) diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst index bae73f42d8..d817caedac 100644 --- a/doc/guides/nics/mlx5.rst +++ b/doc/guides/nics/mlx5.rst @@ -464,6 +464,10 @@ Limitations - In order to achieve best insertion rate, application should manage the flows per lcore. - Better to disable memory reclaim by setting ``reclaim_mem_mode`` to 0 to accelerate the flow object allocation and release with cache. +- HW hashed bonding + + - TXQ affinity subjects to HW hash once enabled. + Statistics ---------- diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c index a823d26beb..1d7fa7dc6c 100644 --- a/drivers/net/mlx5/linux/mlx5_os.c +++ b/drivers/net/mlx5/linux/mlx5_os.c @@ -928,7 +928,6 @@ mlx5_representor_match(struct mlx5_dev_spawn_data *spawn, return false; } - /** * Spawn an Ethernet device from Verbs information. * @@ -1707,6 +1706,7 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev, */ MLX5_ASSERT(spawn->ifindex); priv->if_index = spawn->ifindex; + priv->lag_affinity_idx = sh->refcnt - 1; eth_dev->data->dev_private = priv; priv->dev_data = eth_dev->data; eth_dev->data->mac_addrs = priv->mac; diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c index e28cc461b9..e049a367f0 100644 --- a/drivers/net/mlx5/mlx5.c +++ b/drivers/net/mlx5/mlx5.c @@ -1118,6 +1118,68 @@ mlx5_alloc_rxtx_uars(struct mlx5_dev_ctx_shared *sh, return err; } +/** + * Set up multiple TISs with different affinities according to + * number of bonding ports + * + * @param priv + * Pointer of shared context. + * + * @return + * Zero on success, -1 otherwise. + */ +static int +mlx5_setup_tis(struct mlx5_dev_ctx_shared *sh) +{ + int i; + struct mlx5_devx_lag_context lag_ctx = { 0 }; + struct mlx5_devx_tis_attr tis_attr = { 0 }; + + tis_attr.transport_domain = sh->td->id; + if (sh->bond.n_port) { + if (!mlx5_devx_cmd_query_lag(sh->ctx, &lag_ctx)) { + sh->lag.tx_remap_affinity[0] = + lag_ctx.tx_remap_affinity_1; + sh->lag.tx_remap_affinity[1] = + lag_ctx.tx_remap_affinity_2; + sh->lag.affinity_mode = lag_ctx.port_select_mode; + } else { + DRV_LOG(ERR, "Failed to query lag affinity."); + return -1; + } + if (sh->lag.affinity_mode == MLX5_LAG_MODE_TIS) { + for (i = 0; i < sh->bond.n_port; i++) { + tis_attr.lag_tx_port_affinity = + MLX5_IFC_LAG_MAP_TIS_AFFINITY(i, + sh->bond.n_port); + sh->tis[i] = mlx5_devx_cmd_create_tis(sh->ctx, + &tis_attr); + if (!sh->tis[i]) { + DRV_LOG(ERR, "Failed to TIS %d/%d for bonding device" + " %s.", i, sh->bond.n_port, + sh->ibdev_name); + return -1; + } + } + DRV_LOG(DEBUG, "LAG number of ports : %d, affinity_1 & 2 : pf%d & %d.\n", + sh->bond.n_port, lag_ctx.tx_remap_affinity_1, + lag_ctx.tx_remap_affinity_2); + return 0; + } + if (sh->lag.affinity_mode == MLX5_LAG_MODE_HASH) + DRV_LOG(INFO, "Device %s enabled HW hash based LAG.", + sh->ibdev_name); + } + tis_attr.lag_tx_port_affinity = 0; + sh->tis[0] = mlx5_devx_cmd_create_tis(sh->ctx, &tis_attr); + if (!sh->tis[0]) { + DRV_LOG(ERR, "Failed to TIS 0 for bonding device" + " %s.", sh->ibdev_name); + return -1; + } + return 0; +} + /** * Allocate shared device context. If there is multiport device the * master and representors will share this context, if there is single @@ -1145,7 +1207,6 @@ mlx5_alloc_shared_dev_ctx(const struct mlx5_dev_spawn_data *spawn, struct mlx5_dev_ctx_shared *sh; int err = 0; uint32_t i; - struct mlx5_devx_tis_attr tis_attr = { 0 }; MLX5_ASSERT(spawn); /* Secondary process should not create the shared context. */ @@ -1216,9 +1277,7 @@ mlx5_alloc_shared_dev_ctx(const struct mlx5_dev_spawn_data *spawn, err = ENOMEM; goto error; } - tis_attr.transport_domain = sh->td->id; - sh->tis = mlx5_devx_cmd_create_tis(sh->ctx, &tis_attr); - if (!sh->tis) { + if (mlx5_setup_tis(sh)) { DRV_LOG(ERR, "TIS allocation failure"); err = ENOMEM; goto error; @@ -1282,10 +1341,13 @@ mlx5_alloc_shared_dev_ctx(const struct mlx5_dev_spawn_data *spawn, MLX5_ASSERT(sh); if (sh->share_cache.cache.table) mlx5_mr_btree_free(&sh->share_cache.cache); - if (sh->tis) - claim_zero(mlx5_devx_cmd_destroy(sh->tis)); if (sh->td) claim_zero(mlx5_devx_cmd_destroy(sh->td)); + i = 0; + do { + if (sh->tis[i]) + claim_zero(mlx5_devx_cmd_destroy(sh->tis[i])); + } while (++i < (uint32_t)sh->bond.n_port); if (sh->devx_rx_uar) mlx5_glue->devx_free_uar(sh->devx_rx_uar); if (sh->tx_uar) @@ -1310,6 +1372,7 @@ mlx5_alloc_shared_dev_ctx(const struct mlx5_dev_spawn_data *spawn, void mlx5_free_shared_dev_ctx(struct mlx5_dev_ctx_shared *sh) { + int i = 0; pthread_mutex_lock(&mlx5_dev_ctx_list_mutex); #ifdef RTE_LIBRTE_MLX5_DEBUG /* Check the object presence in the list. */ @@ -1361,8 +1424,10 @@ mlx5_free_shared_dev_ctx(struct mlx5_dev_ctx_shared *sh) } if (sh->pd) claim_zero(mlx5_os_dealloc_pd(sh->pd)); - if (sh->tis) - claim_zero(mlx5_devx_cmd_destroy(sh->tis)); + do { + if (sh->tis[i]) + claim_zero(mlx5_devx_cmd_destroy(sh->tis[i])); + } while (++i < sh->bond.n_port); if (sh->td) claim_zero(mlx5_devx_cmd_destroy(sh->td)); if (sh->devx_rx_uar) diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h index a15f86616d..7ff5feaf4a 100644 --- a/drivers/net/mlx5/mlx5.h +++ b/drivers/net/mlx5/mlx5.h @@ -1111,6 +1111,12 @@ struct mlx5_aso_ct_pools_mng { struct mlx5_aso_sq aso_sq; /* ASO queue objects. */ }; +/* LAG attr. */ +struct mlx5_lag { + uint8_t tx_remap_affinity[16]; /* The PF port number of affinity */ + uint8_t affinity_mode; /* TIS or hash based affinity */ +}; + /* * Shared Infiniband device context for Master/Representors * which belong to same IB device with multiple IB ports. @@ -1178,8 +1184,9 @@ struct mlx5_dev_ctx_shared { struct rte_intr_handle intr_handle; /* Interrupt handler for device. */ struct rte_intr_handle intr_handle_devx; /* DEVX interrupt handler. */ void *devx_comp; /* DEVX async comp obj. */ - struct mlx5_devx_obj *tis; /* TIS object. */ + struct mlx5_devx_obj *tis[16]; /* TIS object. */ struct mlx5_devx_obj *td; /* Transport domain. */ + struct mlx5_lag lag; /* LAG attributes */ void *tx_uar; /* Tx/packet pacing shared UAR. */ struct mlx5_flex_parser_profiles fp[MLX5_FLEX_PARSER_MAX]; /* Flex parser profiles information. */ @@ -1445,6 +1452,7 @@ struct mlx5_priv { uint32_t rss_shared_actions; /* RSS shared actions. */ struct mlx5_devx_obj *q_counters; /* DevX queue counter object. */ uint32_t counter_set_id; /* Queue counter ID to set in DevX objects. */ + uint32_t lag_affinity_idx; /* LAG mode queue 0 affinity starting. */ }; #define PORT_ID(priv) ((priv)->dev_data->port_id) diff --git a/drivers/net/mlx5/mlx5_devx.c b/drivers/net/mlx5/mlx5_devx.c index a1db53577a..bff81c7df2 100644 --- a/drivers/net/mlx5/mlx5_devx.c +++ b/drivers/net/mlx5/mlx5_devx.c @@ -888,6 +888,37 @@ mlx5_devx_drop_action_destroy(struct rte_eth_dev *dev) rte_errno = ENOTSUP; } +/** + * Select TXQ TIS number. + * + * @param dev + * Pointer to Ethernet device. + * @param queue_idx + * Queue index in DPDK Tx queue array. + * + * @return + * > 0 on success, a negative errno value otherwise. + */ +static uint32_t +mlx5_get_txq_tis_num(struct rte_eth_dev *dev, uint16_t queue_idx) +{ + struct mlx5_priv *priv = dev->data->dev_private; + int tis_idx; + + if (priv->sh->bond.n_port && priv->sh->lag.affinity_mode == + MLX5_LAG_MODE_TIS) { + tis_idx = (priv->lag_affinity_idx + queue_idx) % + priv->sh->bond.n_port; + DRV_LOG(INFO, "port %d txq %d gets affinity %d and maps to PF %d.", + dev->data->port_id, queue_idx, tis_idx + 1, + priv->sh->lag.tx_remap_affinity[tis_idx]); + } else { + tis_idx = 0; + } + MLX5_ASSERT(priv->sh->tis[tis_idx]); + return priv->sh->tis[tis_idx]->id; +} + /** * Create the Tx hairpin queue object. * @@ -935,7 +966,8 @@ mlx5_txq_obj_hairpin_new(struct rte_eth_dev *dev, uint16_t idx) attr.wq_attr.log_hairpin_num_packets = attr.wq_attr.log_hairpin_data_sz - MLX5_HAIRPIN_QUEUE_STRIDE; - attr.tis_num = priv->sh->tis->id; + + attr.tis_num = mlx5_get_txq_tis_num(dev, idx); tmpl->sq = mlx5_devx_cmd_create_sq(priv->sh->ctx, &attr); if (!tmpl->sq) { DRV_LOG(ERR, @@ -992,14 +1024,15 @@ mlx5_txq_create_devx_sq_resources(struct rte_eth_dev *dev, uint16_t idx, .allow_swp = !!priv->config.swp, .cqn = txq_obj->cq_obj.cq->id, .tis_lst_sz = 1, - .tis_num = priv->sh->tis->id, .wq_attr = (struct mlx5_devx_wq_attr){ .pd = priv->sh->pdn, .uar_page = mlx5_os_get_devx_uar_page_id(priv->sh->tx_uar), }, .ts_format = mlx5_ts_format_conv(priv->sh->sq_ts_format), + .tis_num = mlx5_get_txq_tis_num(dev, idx), }; + /* Create Send Queue object with DevX. */ return mlx5_devx_sq_create(priv->sh->ctx, &txq_obj->sq_obj, log_desc_n, &sq_attr, priv->sh->numa_node); diff --git a/drivers/net/mlx5/mlx5_txpp.c b/drivers/net/mlx5/mlx5_txpp.c index 2be7e71f89..6e874fa090 100644 --- a/drivers/net/mlx5/mlx5_txpp.c +++ b/drivers/net/mlx5/mlx5_txpp.c @@ -230,7 +230,7 @@ mlx5_txpp_create_rearm_queue(struct mlx5_dev_ctx_shared *sh) .cd_master = 1, .state = MLX5_SQC_STATE_RST, .tis_lst_sz = 1, - .tis_num = sh->tis->id, + .tis_num = sh->tis[0]->id, .wq_attr = (struct mlx5_devx_wq_attr){ .pd = sh->pdn, .uar_page = mlx5_os_get_devx_uar_page_id(sh->tx_uar), @@ -433,7 +433,7 @@ mlx5_txpp_create_clock_queue(struct mlx5_dev_ctx_shared *sh) /* Create send queue object for Clock Queue. */ if (sh->txpp.test) { sq_attr.tis_lst_sz = 1; - sq_attr.tis_num = sh->tis->id; + sq_attr.tis_num = sh->tis[0]->id; sq_attr.non_wire = 0; sq_attr.static_sq_wq = 1; } else {