From patchwork Tue Feb 28 16:43:06 2023
X-Patchwork-Submitter: Alexander Kozyrev <akozyrev@nvidia.com>
X-Patchwork-Id: 124576
X-Patchwork-Delegate: rasland@nvidia.com
From: Alexander Kozyrev <akozyrev@nvidia.com>
To: dev@dpdk.org
Subject: [PATCH 1/5] common/mlx5: detect enhanced CQE compression capability
Date: Tue, 28 Feb 2023 18:43:06 +0200
Message-ID: <20230228164310.807594-2-akozyrev@nvidia.com>
In-Reply-To: <20230228164310.807594-1-akozyrev@nvidia.com>
References: <20230228164310.807594-1-akozyrev@nvidia.com>
X-Mailer: git-send-email 2.18.2
Enhanced CQE Compression is designed to improve latency and reduce
software (CPU) utilization. Check the HCA capabilities to see whether
Enhanced CQE Compression is supported. Either Basic or Enhanced CQE
Compression can be set as the CQE Compression Layout; Enhanced CQE
Compression can be selected only if it is supported by the FW.

Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Viacheslav Ovsiienko
---
 drivers/common/mlx5/mlx5_devx_cmds.c |  3 +++
 drivers/common/mlx5/mlx5_devx_cmds.h |  2 ++
 drivers/common/mlx5/mlx5_prm.h       | 30 +++++++++++++++++++++++-----
 3 files changed, 30 insertions(+), 5 deletions(-)

diff --git a/drivers/common/mlx5/mlx5_devx_cmds.c b/drivers/common/mlx5/mlx5_devx_cmds.c
index bfc6e09eac..8b5582701f 100644
--- a/drivers/common/mlx5/mlx5_devx_cmds.c
+++ b/drivers/common/mlx5/mlx5_devx_cmds.c
@@ -1001,6 +1001,8 @@ mlx5_devx_cmd_query_hca_attr(void *ctx,
                                                  mini_cqe_resp_flow_tag);
     attr->mini_cqe_resp_l3_l4_tag = MLX5_GET(cmd_hca_cap, hcattr,
                                              mini_cqe_resp_l3_l4_tag);
+    attr->enhanced_cqe_compression = MLX5_GET(cmd_hca_cap, hcattr,
+                                              enhanced_cqe_compression);
     attr->umr_indirect_mkey_disabled =
         MLX5_GET(cmd_hca_cap, hcattr, umr_indirect_mkey_disabled);
     attr->umr_modify_entity_size_disabled =
@@ -2059,6 +2061,7 @@ mlx5_devx_cmd_create_cq(void *ctx, struct mlx5_devx_cq_attr *attr)
     MLX5_SET(cqc, cqctx, c_eqn, attr->eqn);
     MLX5_SET(cqc, cqctx, uar_page, attr->uar_page_id);
     MLX5_SET(cqc, cqctx, cqe_comp_en, !!attr->cqe_comp_en);
+    MLX5_SET(cqc, cqctx, cqe_comp_layout, !!attr->cqe_comp_layout);
     MLX5_SET(cqc, cqctx, mini_cqe_res_format, attr->mini_cqe_res_format);
     MLX5_SET(cqc, cqctx, mini_cqe_res_format_ext,
              attr->mini_cqe_res_format_ext);
diff --git a/drivers/common/mlx5/mlx5_devx_cmds.h b/drivers/common/mlx5/mlx5_devx_cmds.h
index 8e68eeaf37..44b627daf2 100644
--- a/drivers/common/mlx5/mlx5_devx_cmds.h
+++ b/drivers/common/mlx5/mlx5_devx_cmds.h
@@ -244,6 +244,7 @@ struct mlx5_hca_attr {
     uint32_t cqe_compression:1;
     uint32_t mini_cqe_resp_flow_tag:1;
     uint32_t mini_cqe_resp_l3_l4_tag:1;
+    uint32_t enhanced_cqe_compression:1;
     uint32_t pkt_integrity_match:1; /* 1 if HW supports integrity item */
     struct mlx5_hca_qos_attr qos;
     struct mlx5_hca_vdpa_attr vdpa;
@@ -468,6 +469,7 @@ struct mlx5_devx_cq_attr {
     uint32_t cqe_comp_en:1;
     uint32_t mini_cqe_res_format:2;
     uint32_t mini_cqe_res_format_ext:2;
+    uint32_t cqe_comp_layout:2;
     uint32_t log_cq_size:5;
     uint32_t log_page_size:5;
     uint32_t uar_page_id;
diff --git a/drivers/common/mlx5/mlx5_prm.h b/drivers/common/mlx5/mlx5_prm.h
index 54766c2d65..aa291f19a6 100644
--- a/drivers/common/mlx5/mlx5_prm.h
+++ b/drivers/common/mlx5/mlx5_prm.h
@@ -1715,10 +1715,28 @@ struct mlx5_ifc_cmd_hca_cap_bits {
     u8 max_geneve_tlv_options[0x8];
     u8 reserved_at_568[0x3];
     u8 max_geneve_tlv_option_data_len[0x5];
-    u8 reserved_at_570[0x49];
+    u8 flex_parser_header_modify[0x1];
+    u8 reserved_at_571[0x2];
+    u8 log_max_guaranteed_connections[0x5];
+    u8 driver_version_before_init_hca[0x1];
+    u8 adv_virtualization[0x1];
+    u8 reserved_at_57a[0x1];
+    u8 log_max_dct_connections[0x5];
+    u8 log_max_atomic_size_qp[0x8];
+    u8 reserved_at_587[0x3];
+    u8 log_max_dci_stream_channels[0x5];
+    u8 reserved_at_58f[0x3];
+    u8 log_max_dci_errored_streams[0x5];
+    u8 log_max_atomic_dize_dc[0x8];
+    u8 max_multi_user_ggroup_size[0x10];
+    u8 enhanced_cqe_compression[0x1];
+    u8 reserved_at_5b0[0x1];
+    u8 crossing_vhca_mkey[0x1];
+    u8 log_max_dek[0x5];
+    u8 reserved_at_5b7[0x1];
     u8 mini_cqe_resp_l3_l4_tag[0x1];
     u8 mini_cqe_resp_flow_tag[0x1];
-    u8 enhanced_cqe_compression[0x1];
+    u8 reserved_at_5ba[0x1];
     u8 mini_cqe_resp_stride_index[0x1];
     u8 cqe_128_always[0x1];
     u8 cqe_compression_128[0x1];
@@ -3042,7 +3060,7 @@ struct mlx5_ifc_cqc_bits {
     u8 as_notify[0x1];
     u8 initiator_src_dct[0x1];
     u8 dbr_umem_valid[0x1];
-    u8 reserved_at_7[0x1];
+    u8 ext_element[0x1];
     u8 cqe_sz[0x3];
     u8 cc[0x1];
     u8 reserved_at_c[0x1];
@@ -3052,8 +3070,10 @@ struct mlx5_ifc_cqc_bits {
     u8 cqe_comp_en[0x1];
     u8 mini_cqe_res_format[0x2];
     u8 st[0x4];
-    u8 reserved_at_18[0x1];
-    u8 cqe_comp_layout[0x7];
+    u8 always_armed_cq[0x1];
+    u8 ext_element_type[0x3];
+    u8 reserved_at_1c[0x2];
+    u8 cqe_comp_layout[0x2];
     u8 dbr_umem_id[0x20];
     u8 reserved_at_40[0x14];
     u8 page_offset[0x6];
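As an illustration of the gating described in this patch, a minimal
sketch (not part of the patch) of how a driver could select the layout
from the queried capability. The MLX5_CQE_COMP_LAYOUT_* values and the
helper name are hypothetical assumptions; only the two struct fields
come from this series:

    /* Hypothetical layout values: 0 = basic, 1 = enhanced (assumption). */
    enum {
        MLX5_CQE_COMP_LAYOUT_BASIC = 0,
        MLX5_CQE_COMP_LAYOUT_ENHANCED = 1,
    };

    static void
    cq_attr_choose_layout(const struct mlx5_hca_attr *attr,
                          struct mlx5_devx_cq_attr *cq_attr)
    {
        cq_attr->cqe_comp_en = 1;
        /* Enhanced CQE Compression only when the FW reports support. */
        cq_attr->cqe_comp_layout = attr->enhanced_cqe_compression ?
                                   MLX5_CQE_COMP_LAYOUT_ENHANCED :
                                   MLX5_CQE_COMP_LAYOUT_BASIC;
    }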
From patchwork Tue Feb 28 16:43:07 2023
X-Patchwork-Submitter: Alexander Kozyrev <akozyrev@nvidia.com>
X-Patchwork-Id: 124577
X-Patchwork-Delegate: rasland@nvidia.com
From: Alexander Kozyrev <akozyrev@nvidia.com>
To: dev@dpdk.org
Subject: [PATCH 2/5] common/mlx5: add CQE validity iteration count
Date: Tue, 28 Feb 2023 18:43:07 +0200
Message-ID: <20230228164310.807594-3-akozyrev@nvidia.com>
In-Reply-To: <20230228164310.807594-1-akozyrev@nvidia.com>
References: <20230228164310.807594-1-akozyrev@nvidia.com>
The validity iteration count replaces the owner bit as the indication
that a new CQE has been written to the buffer. On iteration k over the
CQ buffer, only entries with iteration_count == k should be treated as
new CQEs or mini CQE arrays. The validity iteration count is used when
Enhanced CQE Compression is selected. Add this CQE field and the
method to check it.

Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Viacheslav Ovsiienko
---
 drivers/common/mlx5/mlx5_common.h      | 57 ++++++++++++++++++++++----
 drivers/common/mlx5/mlx5_common_devx.c |  4 +-
 drivers/common/mlx5/mlx5_prm.h         | 12 ++++--
 3 files changed, 62 insertions(+), 11 deletions(-)

diff --git a/drivers/common/mlx5/mlx5_common.h b/drivers/common/mlx5/mlx5_common.h
index f8d07d6c6b..9fb85ddefb 100644
--- a/drivers/common/mlx5/mlx5_common.h
+++ b/drivers/common/mlx5/mlx5_common.h
@@ -180,7 +180,26 @@ enum mlx5_cqe_status {
 };
 
 /**
- * Check whether CQE is valid.
+ * Check whether CQE has an error opcode.
+ *
+ * @param op_code
+ *   Opcode to check.
+ *
+ * @return
+ *   The CQE status.
+ */
+static __rte_always_inline enum mlx5_cqe_status
+check_cqe_error(const uint8_t op_code)
+{
+    rte_io_rmb();
+    if (unlikely(op_code == MLX5_CQE_RESP_ERR ||
+                 op_code == MLX5_CQE_REQ_ERR))
+        return MLX5_CQE_STATUS_ERR;
+    return MLX5_CQE_STATUS_SW_OWN;
+}
+
+/**
+ * Check whether CQE is valid using owner bit.
  *
  * @param cqe
  *   Pointer to CQE.
@@ -201,13 +220,37 @@ check_cqe(volatile struct mlx5_cqe *cqe, const uint16_t cqes_n,
     const uint8_t op_owner = MLX5_CQE_OWNER(op_own);
     const uint8_t op_code = MLX5_CQE_OPCODE(op_own);
 
-    if (unlikely((op_owner != (!!(idx))) || (op_code == MLX5_CQE_INVALID)))
+    if (unlikely((op_owner != (!!(idx))) ||
+                 (op_code == MLX5_CQE_INVALID)))
         return MLX5_CQE_STATUS_HW_OWN;
-    rte_io_rmb();
-    if (unlikely(op_code == MLX5_CQE_RESP_ERR ||
-                 op_code == MLX5_CQE_REQ_ERR))
-        return MLX5_CQE_STATUS_ERR;
-    return MLX5_CQE_STATUS_SW_OWN;
+    return check_cqe_error(op_code);
+}
+
+/**
+ * Check whether CQE is valid using validity iteration count.
+ *
+ * @param cqe
+ *   Pointer to CQE.
+ * @param cqes_n
+ *   Log 2 of completion queue size.
+ * @param ci
+ *   Consumer index.
+ *
+ * @return
+ *   The CQE status.
+ */
+static __rte_always_inline enum mlx5_cqe_status
+check_cqe_iteration(volatile struct mlx5_cqe *cqe, const uint16_t cqes_n,
+                    const uint32_t ci)
+{
+    const uint8_t op_own = cqe->op_own;
+    const uint8_t op_code = MLX5_CQE_OPCODE(op_own);
+    const uint8_t vic = ci >> cqes_n;
+
+    if (unlikely((cqe->validity_iteration_count != vic) ||
+                 (op_code == MLX5_CQE_INVALID)))
+        return MLX5_CQE_STATUS_HW_OWN;
+    return check_cqe_error(op_code);
 }
 
 /*
diff --git a/drivers/common/mlx5/mlx5_common_devx.c b/drivers/common/mlx5/mlx5_common_devx.c
index 5f53996b72..431d8361ce 100644
--- a/drivers/common/mlx5/mlx5_common_devx.c
+++ b/drivers/common/mlx5/mlx5_common_devx.c
@@ -41,8 +41,10 @@ mlx5_cq_init(struct mlx5_devx_cq *cq_obj, uint16_t cq_size)
     volatile struct mlx5_cqe *cqe = cq_obj->cqes;
     uint16_t i;
 
-    for (i = 0; i < cq_size; i++, cqe++)
+    for (i = 0; i < cq_size; i++, cqe++) {
         cqe->op_own = (MLX5_CQE_INVALID << 4) | MLX5_CQE_OWNER_MASK;
+        cqe->validity_iteration_count = MLX5_CQE_VIC_INIT;
+    }
 }
 
 /**
diff --git a/drivers/common/mlx5/mlx5_prm.h b/drivers/common/mlx5/mlx5_prm.h
index aa291f19a6..a52feba7e4 100644
--- a/drivers/common/mlx5/mlx5_prm.h
+++ b/drivers/common/mlx5/mlx5_prm.h
@@ -26,12 +26,18 @@
 /* Get CQE opcode. */
 #define MLX5_CQE_OPCODE(op_own) (((op_own) & 0xf0) >> 4)
 
+/* Get CQE number of mini CQEs. */
+#define MLX5_CQE_NUM_MINIS(op_own) (((op_own) & 0xf0) >> 4)
+
 /* Get CQE solicited event. */
 #define MLX5_CQE_SE(op_own) (((op_own) >> 1) & 1)
 
 /* Invalidate a CQE. */
 #define MLX5_CQE_INVALIDATE (MLX5_CQE_INVALID << 4)
 
+/* Initialize CQE validity iteration count. */
+#define MLX5_CQE_VIC_INIT 0xffu
+
 /* Hardware index widths. */
 #define MLX5_CQ_INDEX_WIDTH 24
 #define MLX5_WQ_INDEX_WIDTH 16
@@ -442,7 +448,7 @@ struct mlx5_cqe {
     uint64_t timestamp;
     uint32_t sop_drop_qpn;
     uint16_t wqe_counter;
-    uint8_t rsvd5;
+    uint8_t validity_iteration_count;
     uint8_t op_own;
 };
 
@@ -450,7 +456,7 @@ struct mlx5_cqe_ts {
     uint64_t timestamp;
     uint32_t sop_drop_qpn;
     uint16_t wqe_counter;
-    uint8_t rsvd5;
+    uint8_t validity_iteration_count;
     uint8_t op_own;
 };
 
@@ -5041,8 +5047,8 @@ struct mlx5_mini_cqe8 {
     };
     struct {
         uint16_t wqe_counter;
+        uint8_t validity_iteration_count;
         uint8_t s_wqe_opcode;
-        uint8_t reserved;
     } s_wqe_info;
     };
     union {
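To see what the new field buys, here is a small standalone example
(not from the patch) of the index arithmetic used by
check_cqe_iteration() in the hunks above:

    #include <stdint.h>
    #include <stdio.h>

    int
    main(void)
    {
        const uint16_t cqes_n = 8; /* log2 of a 256-entry CQ */
        uint32_t ci;

        /* Ring slot and the expected validity iteration count (VIC). */
        for (ci = 254; ci < 259; ci++)
            printf("ci=%u slot=%u expected vic=%u\n", ci,
                   ci & ((1u << cqes_n) - 1),
                   (unsigned int)(uint8_t)(ci >> cqes_n));
        /* vic is 0 for ci 254..255 and 1 for ci 256..258: each full pass
         * over the ring bumps the expected count, so a stale CQE left
         * from the previous pass is never mistaken for a new one, and
         * MLX5_CQE_VIC_INIT (0xff) marks never-written entries. */
        return 0;
    }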
From patchwork Tue Feb 28 16:43:08 2023
X-Patchwork-Submitter: Alexander Kozyrev <akozyrev@nvidia.com>
X-Patchwork-Id: 124575
X-Patchwork-Delegate: rasland@nvidia.com
From: Alexander Kozyrev <akozyrev@nvidia.com>
To: dev@dpdk.org
Subject: [PATCH 3/5] net/mlx5: support enhanced CQE compression in Rx burst
Date: Tue, 28 Feb 2023 18:43:08 +0200
Message-ID: <20230228164310.807594-4-akozyrev@nvidia.com>
In-Reply-To: <20230228164310.807594-1-akozyrev@nvidia.com>
References: <20230228164310.807594-1-akozyrev@nvidia.com>
Enhanced CQE compression changes the structure of the compression
block and the number of miniCQEs per miniCQE array. Adapt to these
changes in the datapath by defining a new parsing mechanism for a
miniCQE array:

1. The title CQE is no longer marked as the compressed one. It needs
   to be copied for parsing of the subsequent miniCQE arrays.
2. Mini CQE arrays now consist of up to 7 miniCQEs and a control
   block. The control block contains the number of miniCQEs in the
   array as well as an indication that this CQE is compressed (see the
   sketch after this list).
3. The invalidation of reserved CQEs between miniCQE arrays is no
   longer needed.
4. The owner_bit is replaced by the validity_iteration_count for all
   CQEs.
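A minimal sketch of the control-block parsing described in point 2
(the helper name is hypothetical; the macros are the ones added in the
previous patch):

    /* Length of a compressed session, derived from its title CQE. */
    static inline uint32_t
    session_cqe_count(volatile struct mlx5_cqe *cqe, int enhanced)
    {
        if (enhanced)
            /* Control block: miniCQE count is NUM_MINIS + 1. */
            return MLX5_CQE_NUM_MINIS(cqe->op_own) + 1U;
        /* Basic layout: the count is carried in byte_cnt instead. */
        return rte_be_to_cpu_32(cqe->byte_cnt);
    }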
Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Viacheslav Ovsiienko
---
 drivers/net/mlx5/mlx5_rx.c  | 175 +++++++++++++++++++++++-------------
 drivers/net/mlx5/mlx5_rx.h  |  12 +--
 drivers/net/mlx5/mlx5_rxq.c |   5 +-
 3 files changed, 123 insertions(+), 69 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_rx.c b/drivers/net/mlx5/mlx5_rx.c
index 99a08ef5f1..d2eb732cf1 100644
--- a/drivers/net/mlx5/mlx5_rx.c
+++ b/drivers/net/mlx5/mlx5_rx.c
@@ -39,7 +39,8 @@ rxq_cq_to_pkt_type(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cqe,
 
 static __rte_always_inline int
 mlx5_rx_poll_len(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cqe,
-                 uint16_t cqe_cnt, volatile struct mlx5_mini_cqe8 **mcqe,
+                 uint16_t cqe_n, uint16_t cqe_mask,
+                 volatile struct mlx5_mini_cqe8 **mcqe,
                  uint16_t *skip_cnt, bool mprq);
 
 static __rte_always_inline uint32_t
@@ -297,15 +298,22 @@ int mlx5_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
     const unsigned int cqe_num = 1 << rxq->cqe_n;
     const unsigned int cqe_mask = cqe_num - 1;
     const uint16_t idx = rxq->cq_ci & cqe_num;
+    const uint8_t vic = rxq->cq_ci >> rxq->cqe_n;
     volatile struct mlx5_cqe *cqe = &(*rxq->cqes)[rxq->cq_ci & cqe_mask];
 
     if (unlikely(rxq->cqes == NULL)) {
         rte_errno = EINVAL;
         return -rte_errno;
     }
-    pmc->addr = &cqe->op_own;
-    pmc->opaque[CLB_VAL_IDX] = !!idx;
-    pmc->opaque[CLB_MSK_IDX] = MLX5_CQE_OWNER_MASK;
+    if (rxq->cqe_comp_layout) {
+        pmc->addr = &cqe->validity_iteration_count;
+        pmc->opaque[CLB_VAL_IDX] = vic;
+        pmc->opaque[CLB_MSK_IDX] = MLX5_CQE_VIC_INIT;
+    } else {
+        pmc->addr = &cqe->op_own;
+        pmc->opaque[CLB_VAL_IDX] = !!idx;
+        pmc->opaque[CLB_MSK_IDX] = MLX5_CQE_OWNER_MASK;
+    }
     pmc->fn = mlx5_monitor_callback;
     pmc->size = sizeof(uint8_t);
     return 0;
@@ -593,6 +601,10 @@ mlx5_rx_err_handle(struct mlx5_rxq_data *rxq, uint8_t vec,
  *   Pointer to RX queue.
  * @param cqe
  *   CQE to process.
+ * @param cqe_n
+ *   Completion queue count.
+ * @param cqe_mask
+ *   Completion queue mask.
  * @param[out] mcqe
  *   Store pointer to mini-CQE if compressed. Otherwise, the pointer is not
  *   written.
@@ -608,13 +620,13 @@ mlx5_rx_err_handle(struct mlx5_rxq_data *rxq, uint8_t vec,
  */
 static inline int
 mlx5_rx_poll_len(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cqe,
-                 uint16_t cqe_cnt, volatile struct mlx5_mini_cqe8 **mcqe,
+                 uint16_t cqe_n, uint16_t cqe_mask,
+                 volatile struct mlx5_mini_cqe8 **mcqe,
                  uint16_t *skip_cnt, bool mprq)
 {
     struct rxq_zip *zip = &rxq->zip;
-    uint16_t cqe_n = cqe_cnt + 1;
     int len = 0, ret = 0;
-    uint16_t idx, end;
+    uint32_t idx, end;
 
     do {
         len = 0;
@@ -623,39 +635,47 @@ mlx5_rx_poll_len(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cqe,
             volatile struct mlx5_mini_cqe8 (*mc)[8] =
                 (volatile struct mlx5_mini_cqe8 (*)[8])
                 (uintptr_t)(&(*rxq->cqes)[zip->ca &
-                                          cqe_cnt].pkt_info);
+                                          cqe_mask].pkt_info);
             len = rte_be_to_cpu_32((*mc)[zip->ai & 7].byte_cnt &
-                                   rxq->byte_mask);
+                                    rxq->byte_mask);
             *mcqe = &(*mc)[zip->ai & 7];
-            if ((++zip->ai & 7) == 0) {
-                /* Invalidate consumed CQEs */
-                idx = zip->ca;
-                end = zip->na;
-                while (idx != end) {
-                    (*rxq->cqes)[idx & cqe_cnt].op_own =
-                        MLX5_CQE_INVALIDATE;
-                    ++idx;
+            if (rxq->cqe_comp_layout) {
+                zip->ai++;
+                if (unlikely(rxq->zip.ai == rxq->zip.cqe_cnt)) {
+                    rxq->cq_ci = zip->cq_ci;
+                    zip->ai = 0;
                 }
-                /*
-                 * Increment consumer index to skip the number
-                 * of CQEs consumed. Hardware leaves holes in
-                 * the CQ ring for software use.
-                 */
-                zip->ca = zip->na;
-                zip->na += 8;
-            }
-            if (unlikely(rxq->zip.ai == rxq->zip.cqe_cnt)) {
-                /* Invalidate the rest */
-                idx = zip->ca;
-                end = zip->cq_ci;
-
-                while (idx != end) {
-                    (*rxq->cqes)[idx & cqe_cnt].op_own =
-                        MLX5_CQE_INVALIDATE;
-                    ++idx;
+            } else {
+                if ((++zip->ai & 7) == 0) {
+                    /* Invalidate consumed CQEs */
+                    idx = zip->ca;
+                    end = zip->na;
+                    while (idx != end) {
+                        (*rxq->cqes)[idx & cqe_mask].op_own =
+                            MLX5_CQE_INVALIDATE;
+                        ++idx;
+                    }
+                    /*
+                     * Increment consumer index to skip the number
+                     * of CQEs consumed. Hardware leaves holes in
+                     * the CQ ring for software use.
+                     */
+                    zip->ca = zip->na;
+                    zip->na += 8;
+                }
+                if (unlikely(rxq->zip.ai == rxq->zip.cqe_cnt)) {
+                    /* Invalidate the rest */
+                    idx = zip->ca;
+                    end = zip->cq_ci;
+
+                    while (idx != end) {
+                        (*rxq->cqes)[idx & cqe_mask].op_own =
+                            MLX5_CQE_INVALIDATE;
+                        ++idx;
+                    }
+                    rxq->cq_ci = zip->cq_ci;
+                    zip->ai = 0;
                 }
-                rxq->cq_ci = zip->cq_ci;
-                zip->ai = 0;
             }
         /*
         * No compressed data, get next CQE and verify if it is
         * compressed.
         */
         } else {
             int8_t op_own;
             uint32_t cq_ci;
 
-            ret = check_cqe(cqe, cqe_n, rxq->cq_ci);
+            ret = (rxq->cqe_comp_layout) ?
+                check_cqe_iteration(cqe, rxq->cqe_n, rxq->cq_ci) :
+                check_cqe(cqe, cqe_n, rxq->cq_ci);
             if (unlikely(ret != MLX5_CQE_STATUS_SW_OWN)) {
                 if (unlikely(ret == MLX5_CQE_STATUS_ERR ||
                              rxq->err_state)) {
@@ -685,16 +707,18 @@ mlx5_rx_poll_len(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cqe,
              * actual CQE boundary (not pointing to the middle
              * of compressed CQE session).
              */
-            cq_ci = rxq->cq_ci + 1;
+            cq_ci = rxq->cq_ci + !rxq->cqe_comp_layout;
             op_own = cqe->op_own;
             if (MLX5_CQE_FORMAT(op_own) == MLX5_COMPRESSED) {
                 volatile struct mlx5_mini_cqe8 (*mc)[8] =
                     (volatile struct mlx5_mini_cqe8 (*)[8])
                     (uintptr_t)(&(*rxq->cqes)
-                        [cq_ci & cqe_cnt].pkt_info);
+                        [cq_ci & cqe_mask].pkt_info);
                 /* Fix endianness. */
-                zip->cqe_cnt = rte_be_to_cpu_32(cqe->byte_cnt);
+                zip->cqe_cnt = rxq->cqe_comp_layout ?
+                    (MLX5_CQE_NUM_MINIS(op_own) + 1U) :
+                    rte_be_to_cpu_32(cqe->byte_cnt);
                 /*
                  * Current mini array position is the one
                  * returned by check_cqe64().
                  *
                  * If completion comprises several mini arrays,
                  * as a special case the second one is located
                  * 7 CQEs after the initial CQE instead of 8
                  * for subsequent ones.
-                 */
+                 */
                 zip->ca = cq_ci;
                 zip->na = zip->ca + 7;
                 /* Compute the next non compressed CQE. */
                 zip->cq_ci = rxq->cq_ci + zip->cqe_cnt;
                 /* Get packet size to return. */
                len = rte_be_to_cpu_32((*mc)[0].byte_cnt &
-                                       rxq->byte_mask);
+                                        rxq->byte_mask);
                 *mcqe = &(*mc)[0];
-                zip->ai = 1;
-                /* Prefetch all to be invalidated */
-                idx = zip->ca;
-                end = zip->cq_ci;
-                while (idx != end) {
-                    rte_prefetch0(&(*rxq->cqes)[(idx) &
-                                  cqe_cnt]);
-                    ++idx;
+                if (rxq->cqe_comp_layout) {
+                    if (MLX5_CQE_NUM_MINIS(op_own))
+                        zip->ai = 1;
+                    else
+                        rxq->cq_ci = zip->cq_ci;
+                } else {
+                    zip->ai = 1;
+                    /* Prefetch all to be invalidated */
+                    idx = zip->ca;
+                    end = zip->cq_ci;
+                    while (idx != end) {
+                        rte_prefetch0(&(*rxq->cqes)[(idx) & cqe_mask]);
+                        ++idx;
+                    }
                 }
             } else {
-                rxq->cq_ci = cq_ci;
+                ++rxq->cq_ci;
                 len = rte_be_to_cpu_32(cqe->byte_cnt);
+                if (rxq->cqe_comp_layout) {
+                    volatile struct mlx5_cqe *next;
+
+                    next = &(*rxq->cqes)[rxq->cq_ci & cqe_mask];
+                    ret = check_cqe_iteration(next, rxq->cqe_n, rxq->cq_ci);
+                    if (ret != MLX5_CQE_STATUS_SW_OWN ||
+                        MLX5_CQE_FORMAT(next->op_own) == MLX5_COMPRESSED)
+                        rte_memcpy(&rxq->title_cqe,
+                                   (const void *)(uintptr_t)cqe,
+                                   sizeof(struct mlx5_cqe));
+                }
             }
         }
         if (unlikely(rxq->err_state)) {
@@ -732,7 +773,7 @@ mlx5_rx_poll_len(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cqe,
             rxq->err_state = MLX5_RXQ_ERR_STATE_NO_ERROR;
             return len & MLX5_ERROR_CQE_MASK;
         }
-        cqe = &(*rxq->cqes)[rxq->cq_ci & cqe_cnt];
+        cqe = &(*rxq->cqes)[rxq->cq_ci & cqe_mask];
         ++rxq->stats.idropped;
         (*skip_cnt) += mprq ? (len & MLX5_MPRQ_STRIDE_NUM_MASK) >>
             MLX5_MPRQ_STRIDE_NUM_SHIFT : 1;
@@ -875,20 +916,22 @@ uint16_t
 mlx5_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
 {
     struct mlx5_rxq_data *rxq = dpdk_rxq;
-    const unsigned int wqe_cnt = (1 << rxq->elts_n) - 1;
-    const unsigned int cqe_cnt = (1 << rxq->cqe_n) - 1;
+    const uint32_t wqe_n = 1 << rxq->elts_n;
+    const uint32_t wqe_mask = wqe_n - 1;
+    const uint32_t cqe_n = 1 << rxq->cqe_n;
+    const uint32_t cqe_mask = cqe_n - 1;
     const unsigned int sges_n = rxq->sges_n;
     struct rte_mbuf *pkt = NULL;
     struct rte_mbuf *seg = NULL;
     volatile struct mlx5_cqe *cqe =
-        &(*rxq->cqes)[rxq->cq_ci & cqe_cnt];
+        &(*rxq->cqes)[rxq->cq_ci & cqe_mask];
     unsigned int i = 0;
     unsigned int rq_ci = rxq->rq_ci << sges_n;
     int len = 0; /* keep its value across iterations. */
    while (pkts_n) {
         uint16_t skip_cnt;
-        unsigned int idx = rq_ci & wqe_cnt;
+        unsigned int idx = rq_ci & wqe_mask;
         volatile struct mlx5_wqe_data_seg *wqe =
             &((volatile struct mlx5_wqe_data_seg *)rxq->wqes)[idx];
         struct rte_mbuf *rep = (*rxq->elts)[idx];
@@ -925,8 +968,8 @@ mlx5_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
             break;
         }
         if (!pkt) {
-            cqe = &(*rxq->cqes)[rxq->cq_ci & cqe_cnt];
-            len = mlx5_rx_poll_len(rxq, cqe, cqe_cnt, &mcqe, &skip_cnt, false);
+            cqe = &(*rxq->cqes)[rxq->cq_ci & cqe_mask];
+            len = mlx5_rx_poll_len(rxq, cqe, cqe_n, cqe_mask, &mcqe, &skip_cnt, false);
             if (unlikely(len & MLX5_ERROR_CQE_MASK)) {
                 if (len == MLX5_CRITICAL_ERROR_CQE_RET) {
                     rte_mbuf_raw_free(rep);
@@ -936,10 +979,10 @@ mlx5_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
                 rq_ci >>= sges_n;
                 rq_ci += skip_cnt;
                 rq_ci <<= sges_n;
-                idx = rq_ci & wqe_cnt;
+                idx = rq_ci & wqe_mask;
                 wqe = &((volatile struct mlx5_wqe_data_seg *)rxq->wqes)[idx];
                 seg = (*rxq->elts)[idx];
-                cqe = &(*rxq->cqes)[rxq->cq_ci & cqe_cnt];
+                cqe = &(*rxq->cqes)[rxq->cq_ci & cqe_mask];
                 len = len & ~MLX5_ERROR_CQE_MASK;
             }
             if (len == 0) {
@@ -949,6 +992,8 @@ mlx5_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
             pkt = seg;
             MLX5_ASSERT(len >= (rxq->crc_present << 2));
             pkt->ol_flags &= RTE_MBUF_F_EXTERNAL;
+            if (rxq->cqe_comp_layout && mcqe)
+                cqe = &rxq->title_cqe;
             rxq_cq_to_mbuf(rxq, pkt, cqe, mcqe);
             if (rxq->crc_present)
                 len -= RTE_ETHER_CRC_LEN;
@@ -1138,8 +1183,10 @@ mlx5_rx_burst_mprq(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
     struct mlx5_rxq_data *rxq = dpdk_rxq;
     const uint32_t strd_n = RTE_BIT32(rxq->log_strd_num);
     const uint32_t strd_sz = RTE_BIT32(rxq->log_strd_sz);
-    const uint32_t cq_mask = (1 << rxq->cqe_n) - 1;
-    const uint32_t wq_mask = (1 << rxq->elts_n) - 1;
+    const uint32_t cqe_n = 1 << rxq->cqe_n;
+    const uint32_t cq_mask = cqe_n - 1;
+    const uint32_t wqe_n = 1 << rxq->elts_n;
+    const uint32_t wq_mask = wqe_n - 1;
     volatile struct mlx5_cqe *cqe = &(*rxq->cqes)[rxq->cq_ci & cq_mask];
     unsigned int i = 0;
     uint32_t rq_ci = rxq->rq_ci;
@@ -1166,7 +1213,7 @@ mlx5_rx_burst_mprq(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
             buf = (*rxq->mprq_bufs)[rq_ci & wq_mask];
         }
         cqe = &(*rxq->cqes)[rxq->cq_ci & cq_mask];
-        ret = mlx5_rx_poll_len(rxq, cqe, cq_mask, &mcqe, &skip_cnt, true);
+        ret = mlx5_rx_poll_len(rxq, cqe, cqe_n, cq_mask, &mcqe, &skip_cnt, true);
         if (unlikely(ret & MLX5_ERROR_CQE_MASK)) {
             if (ret == MLX5_CRITICAL_ERROR_CQE_RET) {
                 rq_ci = rxq->rq_ci;
@@ -1201,6 +1248,8 @@ mlx5_rx_burst_mprq(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
         consumed_strd += strd_cnt;
         if (byte_cnt & MLX5_MPRQ_FILLER_MASK)
             continue;
+        if (rxq->cqe_comp_layout && mcqe)
+            cqe = &rxq->title_cqe;
         strd_idx = rte_be_to_cpu_16(mcqe == NULL ?
                     cqe->wqe_counter :
                     mcqe->stride_idx);
diff --git a/drivers/net/mlx5/mlx5_rx.h b/drivers/net/mlx5/mlx5_rx.h
index 6b42e27c89..143685c6ab 100644
--- a/drivers/net/mlx5/mlx5_rx.h
+++ b/drivers/net/mlx5/mlx5_rx.h
@@ -41,11 +41,11 @@ struct mlx5_rxq_stats {
 
 /* Compressed CQE context. */
 struct rxq_zip {
+    uint16_t cqe_cnt; /* Number of CQEs. */
     uint16_t ai; /* Array index. */
-    uint16_t ca; /* Current array index. */
-    uint16_t na; /* Next array index. */
-    uint16_t cq_ci; /* The next CQE. */
-    uint32_t cqe_cnt; /* Number of CQEs. */
+    uint32_t ca; /* Current array index. */
+    uint32_t na; /* Next array index. */
+    uint32_t cq_ci; /* The next CQE. */
 };
 
 /* Get pointer to the first stride. */
@@ -100,6 +100,8 @@ struct mlx5_rxq_data {
     unsigned int mcqe_format:3; /* CQE compression format. */
     unsigned int shared:1; /* Shared RXQ. */
     unsigned int delay_drop:1; /* Enable delay drop. */
+    unsigned int cqe_comp_layout:1; /* CQE Compression Layout. */
+    unsigned int cq_ci:24;
     volatile uint32_t *rq_db;
     volatile uint32_t *cq_db;
     uint16_t port_id;
@@ -107,7 +109,6 @@ struct mlx5_rxq_data {
     uint32_t rq_ci;
     uint16_t consumed_strd; /* Number of consumed strides in WQE. */
     uint32_t rq_pi;
-    uint32_t cq_ci;
     uint16_t rq_repl_thresh; /* Threshold for buffer replenishment. */
     uint32_t byte_mask;
     union {
@@ -119,6 +120,7 @@ struct mlx5_rxq_data {
     uint16_t mprq_max_memcpy_len; /* Maximum size of packet to memcpy. */
     volatile void *wqes;
     volatile struct mlx5_cqe(*cqes)[];
+    struct mlx5_cqe title_cqe; /* Title CQE for CQE compression. */
     struct rte_mbuf *(*elts)[];
     struct mlx5_mprq_buf *(*mprq_bufs)[];
     struct rte_mempool *mp;
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index 81aa3f074a..6e99c4dde4 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -444,12 +444,15 @@ rxq_sync_cq(struct mlx5_rxq_data *rxq)
             continue;
         }
         /* Compute the next non compressed CQE. */
-        rxq->cq_ci += rte_be_to_cpu_32(cqe->byte_cnt);
+        rxq->cq_ci += rxq->cqe_comp_layout ?
+            (MLX5_CQE_NUM_MINIS(cqe->op_own) + 1U) :
+            rte_be_to_cpu_32(cqe->byte_cnt);
     } while (--i);
     /* Move all CQEs to HW ownership, including possible MiniCQEs. */
     for (i = 0; i < cqe_n; i++) {
         cqe = &(*rxq->cqes)[i];
+        cqe->validity_iteration_count = MLX5_CQE_VIC_INIT;
         cqe->op_own = MLX5_CQE_INVALIDATE;
     }
     /* Resync CQE and WQE (WQ in RESET state). */
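The title CQE handling of point 1 can be summarized with a sketch (not
the literal patch code; the helper name is hypothetical, the calls
mirror mlx5_rx_poll_len() above):

    /* Save the last uncompressed CQE so later miniCQE arrays can be
     * expanded from it even after its ring slot is overwritten. */
    static inline void
    save_title_cqe(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cur,
                   uint32_t cqe_mask)
    {
        volatile struct mlx5_cqe *next =
            &(*rxq->cqes)[rxq->cq_ci & cqe_mask];
        int ret = check_cqe_iteration(next, rxq->cqe_n, rxq->cq_ci);

        /* Copy when the next entry is not ready yet or when it starts
         * a compressed session. */
        if (ret != MLX5_CQE_STATUS_SW_OWN ||
            MLX5_CQE_FORMAT(next->op_own) == MLX5_COMPRESSED)
            rte_memcpy(&rxq->title_cqe, (const void *)(uintptr_t)cur,
                       sizeof(struct mlx5_cqe));
    }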
From patchwork Tue Feb 28 16:43:09 2023
X-Patchwork-Submitter: Alexander Kozyrev <akozyrev@nvidia.com>
X-Patchwork-Id: 124579
X-Patchwork-Delegate: rasland@nvidia.com
From: Alexander Kozyrev <akozyrev@nvidia.com>
To: dev@dpdk.org
Subject: [PATCH 4/5] net/mlx5: support enhanced CQE zipping in vector Rx burst
Date: Tue, 28 Feb 2023 18:43:09 +0200
Message-ID: <20230228164310.807594-5-akozyrev@nvidia.com>
In-Reply-To: <20230228164310.807594-1-akozyrev@nvidia.com>
References: <20230228164310.807594-1-akozyrev@nvidia.com>
Add Enhanced CQE compression support to the vectorized Rx burst
routines, adopting the same algorithm the scalar Rx burst routines use
today:

1. Retrieve the validity_iteration_count from CQEs and use it, instead
   of the owner_bit, to check whether a CQE is ready to be processed.
2. Do not invalidate reserved CQEs between miniCQE arrays.
3. Copy the title packet from the last processed uncompressed CQE,
   since it is needed later to build packets from zipped CQEs.
4. Skip the regular CQE processing and go straight to the CQE unzip
   function when the very first CQE is compressed, to save CPU time
   (see the sketch below).
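A sketch of the early-unzip shortcut from point 4 (illustrative only;
the helper name is hypothetical, while rxq_cq_decompress_v() is the
real routine patched below):

    static inline uint16_t
    maybe_unzip_first(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cq,
                      struct rte_mbuf **elts)
    {
        /* A compressed first CQE means the whole batch begins inside a
         * miniCQE session: decompress it right away instead of first
         * scanning for uncompressed CQEs. */
        if (rxq->cqe_comp_layout &&
            MLX5_CQE_FORMAT(cq->op_own) == MLX5_COMPRESSED)
            return rxq_cq_decompress_v(rxq, cq, elts);
        return 0;
    }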
Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Viacheslav Ovsiienko
---
 drivers/net/mlx5/mlx5_rx.h               |   1 +
 drivers/net/mlx5/mlx5_rxtx_vec.c         |  24 ++++-
 drivers/net/mlx5/mlx5_rxtx_vec_altivec.h | 108 ++++++++++++++-------
 drivers/net/mlx5/mlx5_rxtx_vec_neon.h    |  91 +++++++++++------
 drivers/net/mlx5/mlx5_rxtx_vec_sse.h     |  94 ++++++++++++------
 5 files changed, 232 insertions(+), 86 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_rx.h b/drivers/net/mlx5/mlx5_rx.h
index 143685c6ab..8b87adad36 100644
--- a/drivers/net/mlx5/mlx5_rx.h
+++ b/drivers/net/mlx5/mlx5_rx.h
@@ -122,6 +122,7 @@ struct mlx5_rxq_data {
     volatile struct mlx5_cqe(*cqes)[];
     struct mlx5_cqe title_cqe; /* Title CQE for CQE compression. */
     struct rte_mbuf *(*elts)[];
+    struct rte_mbuf title_pkt; /* Title packet for CQE compression. */
     struct mlx5_mprq_buf *(*mprq_bufs)[];
     struct rte_mempool *mp;
     struct rte_mempool *mprq_mp; /* Mempool for Multi-Packet RQ. */
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.c b/drivers/net/mlx5/mlx5_rxtx_vec.c
index 667475a93e..2363d7ed27 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec.c
+++ b/drivers/net/mlx5/mlx5_rxtx_vec.c
@@ -290,13 +290,14 @@ rxq_burst_v(struct mlx5_rxq_data *rxq, struct rte_mbuf **pkts,
     const uint16_t q_mask = q_n - 1;
     const uint16_t e_n = 1 << rxq->elts_n;
     const uint16_t e_mask = e_n - 1;
-    volatile struct mlx5_cqe *cq;
+    volatile struct mlx5_cqe *cq, *next;
     struct rte_mbuf **elts;
     uint64_t comp_idx = MLX5_VPMD_DESCS_PER_LOOP;
     uint16_t nocmp_n = 0;
     uint16_t rcvd_pkt = 0;
     unsigned int cq_idx = rxq->cq_ci & q_mask;
     unsigned int elts_idx;
+    int ret;
 
     MLX5_ASSERT(rxq->sges_n == 0);
     MLX5_ASSERT(rxq->cqe_n == rxq->elts_n);
@@ -342,6 +343,15 @@ rxq_burst_v(struct mlx5_rxq_data *rxq, struct rte_mbuf **pkts,
     rxq->cq_ci += nocmp_n;
     rxq->rq_pi += nocmp_n;
     rcvd_pkt += nocmp_n;
+    /* Copy title packet for future compressed sessions. */
+    if (rxq->cqe_comp_layout) {
+        next = &(*rxq->cqes)[rxq->cq_ci & q_mask];
+        ret = check_cqe_iteration(next, rxq->cqe_n, rxq->cq_ci);
+        if (ret != MLX5_CQE_STATUS_SW_OWN ||
+            MLX5_CQE_FORMAT(next->op_own) == MLX5_COMPRESSED)
+            rte_memcpy(&rxq->title_pkt, elts[nocmp_n - 1],
+                       sizeof(struct rte_mbuf));
+    }
     /* Decompress the last CQE if compressed. */
     if (comp_idx < MLX5_VPMD_DESCS_PER_LOOP) {
         MLX5_ASSERT(comp_idx == (nocmp_n % MLX5_VPMD_DESCS_PER_LOOP));
@@ -431,7 +441,7 @@ rxq_burst_mprq_v(struct mlx5_rxq_data *rxq, struct rte_mbuf **pkts,
     const uint32_t strd_n = RTE_BIT32(rxq->log_strd_num);
     const uint32_t elts_n = wqe_n * strd_n;
     const uint32_t elts_mask = elts_n - 1;
-    volatile struct mlx5_cqe *cq;
+    volatile struct mlx5_cqe *cq, *next;
     struct rte_mbuf **elts;
     uint64_t comp_idx = MLX5_VPMD_DESCS_PER_LOOP;
     uint16_t nocmp_n = 0;
@@ -439,6 +449,7 @@ rxq_burst_mprq_v(struct mlx5_rxq_data *rxq, struct rte_mbuf **pkts,
     uint16_t cp_pkt = 0;
     unsigned int cq_idx = rxq->cq_ci & q_mask;
     unsigned int elts_idx;
+    int ret;
 
     MLX5_ASSERT(rxq->sges_n == 0);
     cq = &(*rxq->cqes)[cq_idx];
@@ -482,6 +493,15 @@ rxq_burst_mprq_v(struct mlx5_rxq_data *rxq, struct rte_mbuf **pkts,
     MLX5_ASSERT(nocmp_n <= pkts_n);
     cp_pkt = rxq_copy_mprq_mbuf_v(rxq, pkts, nocmp_n);
     rcvd_pkt += cp_pkt;
+    /* Copy title packet for future compressed sessions. */
+    if (rxq->cqe_comp_layout) {
+        next = &(*rxq->cqes)[rxq->cq_ci & q_mask];
+        ret = check_cqe_iteration(next, rxq->cqe_n, rxq->cq_ci);
+        if (ret != MLX5_CQE_STATUS_SW_OWN ||
+            MLX5_CQE_FORMAT(next->op_own) == MLX5_COMPRESSED)
+            rte_memcpy(&rxq->title_pkt, elts[nocmp_n - 1],
+                       sizeof(struct rte_mbuf));
+    }
     /* Decompress the last CQE if compressed. */
*/ if (comp_idx < MLX5_VPMD_DESCS_PER_LOOP) { MLX5_ASSERT(comp_idx == (nocmp_n % MLX5_VPMD_DESCS_PER_LOOP)); diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h b/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h index 204d17a8f2..14ffff26f4 100644 --- a/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h +++ b/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h @@ -76,8 +76,10 @@ static inline uint16_t rxq_cq_decompress_v(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cq, struct rte_mbuf **elts) { - volatile struct mlx5_mini_cqe8 *mcq = (void *)&(cq + 1)->pkt_info; - struct rte_mbuf *t_pkt = elts[0]; /* Title packet is pre-built. */ + volatile struct mlx5_mini_cqe8 *mcq = + (void *)&(cq + !rxq->cqe_comp_layout)->pkt_info; + /* Title packet is pre-built. */ + struct rte_mbuf *t_pkt = rxq->cqe_comp_layout ? &rxq->title_pkt : elts[0]; const __vector unsigned char zero = (__vector unsigned char){0}; /* Mask to shuffle from extracted mini CQE to mbuf. */ const __vector unsigned char shuf_mask1 = (__vector unsigned char){ @@ -93,8 +95,10 @@ rxq_cq_decompress_v(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cq, -1, -1, /* skip vlan_tci */ 11, 10, 9, 8}; /* bswap32, rss */ /* Restore the compressed count. Must be 16 bits. */ - const uint16_t mcqe_n = t_pkt->data_len + - (rxq->crc_present * RTE_ETHER_CRC_LEN); + uint16_t mcqe_n = (rxq->cqe_comp_layout) ? + (MLX5_CQE_NUM_MINIS(cq->op_own) + 1) : + t_pkt->data_len + (rxq->crc_present * RTE_ETHER_CRC_LEN); + uint16_t pkts_n = mcqe_n; const __vector unsigned char rearm = (__vector unsigned char)vec_vsx_ld(0, (signed int const *)&t_pkt->rearm_data); @@ -132,6 +136,9 @@ rxq_cq_decompress_v(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cq, * D. store rx_descriptor_fields1. * E. store flow tag (rte_flow mark). */ +cycle: + if (rxq->cqe_comp_layout) + rte_prefetch0((void *)(cq + mcqe_n)); for (pos = 0; pos < mcqe_n; ) { __vector unsigned char mcqe1, mcqe2; __vector unsigned char rxdf1, rxdf2; @@ -154,9 +161,10 @@ rxq_cq_decompress_v(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cq, const __vector unsigned long shmax = {64, 64}; #endif - for (i = 0; i < MLX5_VPMD_DESCS_PER_LOOP; ++i) - if (likely(pos + i < mcqe_n)) - rte_prefetch0((void *)(cq + pos + i)); + if (!rxq->cqe_comp_layout) + for (i = 0; i < MLX5_VPMD_DESCS_PER_LOOP; ++i) + if (likely(pos + i < mcqe_n)) + rte_prefetch0((void *)(cq + pos + i)); /* A.1 load mCQEs into a 128bit register. */ mcqe1 = (__vector unsigned char)vec_vsx_ld(0, (signed int const *)&mcq[pos % 8]); @@ -488,25 +496,43 @@ rxq_cq_decompress_v(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cq, pos += MLX5_VPMD_DESCS_PER_LOOP; /* Move to next CQE and invalidate consumed CQEs. */ - if (!(pos & 0x7) && pos < mcqe_n) { - if (pos + 8 < mcqe_n) - rte_prefetch0((void *)(cq + pos + 8)); - mcq = (void *)&(cq + pos)->pkt_info; - for (i = 0; i < 8; ++i) - cq[inv++].op_own = MLX5_CQE_INVALIDATE; + if (!rxq->cqe_comp_layout) { + if (!(pos & 0x7) && pos < mcqe_n) { + if (pos + 8 < mcqe_n) + rte_prefetch0((void *)(cq + pos + 8)); + mcq = (void *)&(cq + pos)->pkt_info; + for (i = 0; i < 8; ++i) + cq[inv++].op_own = MLX5_CQE_INVALIDATE; + } } } - /* Invalidate the rest of CQEs. */ - for (; inv < mcqe_n; ++inv) - cq[inv].op_own = MLX5_CQE_INVALIDATE; + if (rxq->cqe_comp_layout) { + int ret; + /* Keep unzipping if the next CQE is the miniCQE array. 
*/ + cq = &cq[mcqe_n]; + ret = check_cqe_iteration(cq, rxq->cqe_n, rxq->cq_ci + pkts_n); + if (ret == MLX5_CQE_STATUS_SW_OWN && + MLX5_CQE_FORMAT(cq->op_own) == MLX5_COMPRESSED) { + pos = 0; + elts = &elts[mcqe_n]; + mcq = (void *)cq; + mcqe_n = MLX5_CQE_NUM_MINIS(cq->op_own) + 1; + pkts_n += mcqe_n; + goto cycle; + } + } else { + /* Invalidate the rest of CQEs. */ + for (; inv < pkts_n; ++inv) + cq[inv].op_own = MLX5_CQE_INVALIDATE; + } #ifdef MLX5_PMD_SOFT_COUNTERS - rxq->stats.ipackets += mcqe_n; + rxq->stats.ipackets += pkts_n; rxq->stats.ibytes += rcvd_byte; #endif - return mcqe_n; + return pkts_n; } /** @@ -787,9 +813,13 @@ rxq_cq_process_v(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cq, uint64_t n = 0; uint64_t comp_idx = MLX5_VPMD_DESCS_PER_LOOP; uint16_t nocmp_n = 0; - unsigned int ownership = !!(rxq->cq_ci & (q_mask + 1)); + const uint8_t vic = rxq->cq_ci >> rxq->cqe_n; + unsigned int own = !(rxq->cq_ci & (q_mask + 1)); const __vector unsigned char zero = (__vector unsigned char){0}; const __vector unsigned char ones = vec_splat_u8(-1); + const __vector unsigned char vic_check = + (__vector unsigned char)(__vector unsigned long){ + 0x00ff000000ff0000LL, 0x00ff000000ff0000LL}; const __vector unsigned char owner_check = (__vector unsigned char)(__vector unsigned long){ 0x0100000001000000LL, 0x0100000001000000LL}; @@ -837,7 +867,16 @@ rxq_cq_process_v(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cq, (__vector unsigned short){0, 0, 0, 0, 0xffff, 0xffff, 0, 0}; const __vector unsigned short cqe_sel_mask2 = (__vector unsigned short){0, 0, 0xffff, 0, 0, 0, 0, 0}; - + const __vector unsigned char validity = (__vector unsigned char){ + 0, 0, vic, 0, + 0, 0, vic, 0, + 0, 0, vic, 0, + 0, 0, vic, 0}; + const __vector unsigned char ownership = (__vector unsigned char){ + 0, 0, 0, own, + 0, 0, 0, own, + 0, 0, 0, own, + 0, 0, 0, own}; /* * A. load first Qword (8bytes) in one loop. * B. copy 4 mbuf pointers from elts ring to returning pkts. @@ -848,7 +887,7 @@ rxq_cq_process_v(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cq, * uint8_t pkt_info; * uint8_t flow_tag[3]; * uint16_t byte_cnt; - * uint8_t rsvd4; + * uint8_t validity_iteration_count; * uint8_t op_own; * uint16_t hdr_type_etc; * uint16_t vlan_info; @@ -1082,17 +1121,25 @@ rxq_cq_process_v(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cq, *(__vector unsigned char *) &pkts[pos]->pkt_len = pkt_mb0; - /* E.2 flip owner bit to mark CQEs from last round. */ - owner_mask = (__vector unsigned char) - vec_and((__vector unsigned long)op_own, - (__vector unsigned long)owner_check); - if (ownership) + /* E.2 mask out CQEs belonging to HW. 
*/ + if (rxq->cqe_comp_layout) { + owner_mask = (__vector unsigned char) + vec_and((__vector unsigned long)op_own, + (__vector unsigned long)vic_check); + owner_mask = (__vector unsigned char) + vec_cmpeq((__vector unsigned int)owner_mask, + (__vector unsigned int)validity); owner_mask = (__vector unsigned char) vec_xor((__vector unsigned long)owner_mask, + (__vector unsigned long)ones); + } else { + owner_mask = (__vector unsigned char) + vec_and((__vector unsigned long)op_own, (__vector unsigned long)owner_check); - owner_mask = (__vector unsigned char) - vec_cmpeq((__vector unsigned int)owner_mask, - (__vector unsigned int)owner_check); + owner_mask = (__vector unsigned char) + vec_cmpeq((__vector unsigned int)owner_mask, + (__vector unsigned int)ownership); + } owner_mask = (__vector unsigned char) vec_packs((__vector unsigned int)owner_mask, (__vector unsigned int)zero); @@ -1174,7 +1221,8 @@ rxq_cq_process_v(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cq, (__vector unsigned long)mask); /* D.3 check error in opcode. */ - adj = (comp_idx != MLX5_VPMD_DESCS_PER_LOOP && comp_idx == n); + adj = (!rxq->cqe_comp_layout && + comp_idx != MLX5_VPMD_DESCS_PER_LOOP && comp_idx == n); mask = (__vector unsigned char)(__vector unsigned long){ (adj * sizeof(uint16_t) * 8), 0}; lshift = vec_splat((__vector unsigned long)mask, 0); diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h index 41b9cf5444..75e8ed7e5a 100644 --- a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h +++ b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h @@ -71,8 +71,10 @@ static inline uint16_t rxq_cq_decompress_v(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cq, struct rte_mbuf **elts) { - volatile struct mlx5_mini_cqe8 *mcq = (void *)&(cq + 1)->pkt_info; - struct rte_mbuf *t_pkt = elts[0]; /* Title packet is pre-built. */ + volatile struct mlx5_mini_cqe8 *mcq = + (void *)&(cq + !rxq->cqe_comp_layout)->pkt_info; + /* Title packet is pre-built. */ + struct rte_mbuf *t_pkt = rxq->cqe_comp_layout ? &rxq->title_pkt : elts[0]; unsigned int pos; unsigned int i; unsigned int inv = 0; @@ -92,8 +94,10 @@ rxq_cq_decompress_v(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cq, 11, 10, 9, 8 /* hash.rss, bswap32 */ }; /* Restore the compressed count. Must be 16 bits. */ - const uint16_t mcqe_n = t_pkt->data_len + - (rxq->crc_present * RTE_ETHER_CRC_LEN); + uint16_t mcqe_n = (rxq->cqe_comp_layout) ? + (MLX5_CQE_NUM_MINIS(cq->op_own) + 1) : + t_pkt->data_len + (rxq->crc_present * RTE_ETHER_CRC_LEN); + uint16_t pkts_n = mcqe_n; const uint64x2_t rearm = vld1q_u64((void *)&t_pkt->rearm_data); const uint32x4_t rxdf_mask = { @@ -131,6 +135,9 @@ rxq_cq_decompress_v(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cq, * D. store rx_descriptor_fields1. * E. store flow tag (rte_flow mark). */ +cycle: + if (rxq->cqe_comp_layout) + rte_prefetch0((void *)(cq + mcqe_n)); for (pos = 0; pos < mcqe_n; ) { uint8_t *p = (void *)&mcq[pos % 8]; uint8_t *e0 = (void *)&elts[pos]->rearm_data; @@ -145,9 +152,10 @@ rxq_cq_decompress_v(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cq, sizeof(uint16_t) * 8) : 0); #endif - for (i = 0; i < MLX5_VPMD_DESCS_PER_LOOP; ++i) - if (likely(pos + i < mcqe_n)) - rte_prefetch0((void *)(cq + pos + i)); + if (!rxq->cqe_comp_layout) + for (i = 0; i < MLX5_VPMD_DESCS_PER_LOOP; ++i) + if (likely(pos + i < mcqe_n)) + rte_prefetch0((void *)(cq + pos + i)); __asm__ volatile ( /* A.1 load mCQEs into a 128bit register. 
*/ "ld1 {v16.16b - v17.16b}, [%[mcq]] \n\t" @@ -354,22 +362,40 @@ rxq_cq_decompress_v(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cq, } pos += MLX5_VPMD_DESCS_PER_LOOP; /* Move to next CQE and invalidate consumed CQEs. */ - if (!(pos & 0x7) && pos < mcqe_n) { - if (pos + 8 < mcqe_n) - rte_prefetch0((void *)(cq + pos + 8)); - mcq = (void *)&(cq + pos)->pkt_info; - for (i = 0; i < 8; ++i) - cq[inv++].op_own = MLX5_CQE_INVALIDATE; + if (!rxq->cqe_comp_layout) { + if (!(pos & 0x7) && pos < mcqe_n) { + if (pos + 8 < mcqe_n) + rte_prefetch0((void *)(cq + pos + 8)); + mcq = (void *)&(cq + pos)->pkt_info; + for (i = 0; i < 8; ++i) + cq[inv++].op_own = MLX5_CQE_INVALIDATE; + } + } + } + if (rxq->cqe_comp_layout) { + int ret; + /* Keep unzipping if the next CQE is the miniCQE array. */ + cq = &cq[mcqe_n]; + ret = check_cqe_iteration(cq, rxq->cqe_n, rxq->cq_ci + pkts_n); + if (ret == MLX5_CQE_STATUS_SW_OWN && + MLX5_CQE_FORMAT(cq->op_own) == MLX5_COMPRESSED) { + pos = 0; + elts = &elts[mcqe_n]; + mcq = (void *)cq; + mcqe_n = MLX5_CQE_NUM_MINIS(cq->op_own) + 1; + pkts_n += mcqe_n; + goto cycle; } + } else { + /* Invalidate the rest of CQEs. */ + for (; inv < pkts_n; ++inv) + cq[inv].op_own = MLX5_CQE_INVALIDATE; } - /* Invalidate the rest of CQEs. */ - for (; inv < mcqe_n; ++inv) - cq[inv].op_own = MLX5_CQE_INVALIDATE; #ifdef MLX5_PMD_SOFT_COUNTERS - rxq->stats.ipackets += mcqe_n; + rxq->stats.ipackets += pkts_n; rxq->stats.ibytes += rcvd_byte; #endif - return mcqe_n; + return pkts_n; } /** @@ -528,7 +554,9 @@ rxq_cq_process_v(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cq, uint64_t n = 0; uint64_t comp_idx = MLX5_VPMD_DESCS_PER_LOOP; uint16_t nocmp_n = 0; + const uint16x4_t validity = vdup_n_u16((rxq->cq_ci >> rxq->cqe_n) << 8); const uint16x4_t ownership = vdup_n_u16(!(rxq->cq_ci & (q_mask + 1))); + const uint16x4_t vic_check = vcreate_u16(0xff00ff00ff00ff00); const uint16x4_t owner_check = vcreate_u16(0x0001000100010001); const uint16x4_t opcode_check = vcreate_u16(0x00f000f000f000f0); const uint16x4_t format_check = vcreate_u16(0x000c000c000c000c); @@ -547,7 +575,7 @@ rxq_cq_process_v(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cq, const uint8x16_t cqe_shuf_m = { 28, 29, /* hdr_type_etc */ 0, /* pkt_info */ - -1, /* null */ + 62, /* validity_iteration_count */ 47, 46, /* byte_cnt, bswap16 */ 31, 30, /* vlan_info, bswap16 */ 15, 14, 13, 12, /* rx_hash_res, bswap32 */ @@ -564,10 +592,10 @@ rxq_cq_process_v(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cq, }; /* Mask to generate 16B owner vector. */ const uint8x8_t owner_shuf_m = { - 63, -1, /* 4th CQE */ - 47, -1, /* 3rd CQE */ - 31, -1, /* 2nd CQE */ - 15, -1 /* 1st CQE */ + 63, 51, /* 4th CQE */ + 47, 35, /* 3rd CQE */ + 31, 19, /* 2nd CQE */ + 15, 3 /* 1st CQE */ }; /* Mask to generate a vector having packet_type/ol_flags. */ const uint8x16_t ptype_shuf_m = { @@ -600,7 +628,7 @@ rxq_cq_process_v(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cq, * struct { * uint16_t hdr_type_etc; * uint8_t pkt_info; - * uint8_t rsvd; + * uint8_t validity_iteration_count; * uint16_t byte_cnt; * uint16_t vlan_info; * uint32_t rx_has_res; @@ -748,9 +776,15 @@ rxq_cq_process_v(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cq, "v16", "v17", "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25"); - /* D.2 flip owner bit to mark CQEs from last round. */ - owner_mask = vand_u16(op_own, owner_check); - owner_mask = vceq_u16(owner_mask, ownership); + /* D.2 mask out CQEs belonging to HW. 
*/ + if (rxq->cqe_comp_layout) { + owner_mask = vand_u16(op_own, vic_check); + owner_mask = vceq_u16(owner_mask, validity); + owner_mask = vmvn_u16(owner_mask); + } else { + owner_mask = vand_u16(op_own, owner_check); + owner_mask = vceq_u16(owner_mask, ownership); + } /* D.3 get mask for invalidated CQEs. */ opcode = vand_u16(op_own, opcode_check); invalid_mask = vceq_u16(opcode_check, opcode); @@ -780,7 +814,8 @@ rxq_cq_process_v(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cq, -1UL >> (n * sizeof(uint16_t) * 8) : 0); invalid_mask = vorr_u16(invalid_mask, mask); /* D.3 check error in opcode. */ - adj = (comp_idx != MLX5_VPMD_DESCS_PER_LOOP && comp_idx == n); + adj = (!rxq->cqe_comp_layout && + comp_idx != MLX5_VPMD_DESCS_PER_LOOP && comp_idx == n); mask = vcreate_u16(adj ? -1UL >> ((n + 1) * sizeof(uint16_t) * 8) : -1UL); mini_mask = vand_u16(invalid_mask, mask); diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h index ab69af0c55..b282f8b8e6 100644 --- a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h +++ b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h @@ -73,8 +73,9 @@ static inline uint16_t rxq_cq_decompress_v(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cq, struct rte_mbuf **elts) { - volatile struct mlx5_mini_cqe8 *mcq = (void *)(cq + 1); - struct rte_mbuf *t_pkt = elts[0]; /* Title packet is pre-built. */ + volatile struct mlx5_mini_cqe8 *mcq = (void *)(cq + !rxq->cqe_comp_layout); + /* Title packet is pre-built. */ + struct rte_mbuf *t_pkt = rxq->cqe_comp_layout ? &rxq->title_pkt : elts[0]; unsigned int pos; unsigned int i; unsigned int inv = 0; @@ -92,8 +93,10 @@ rxq_cq_decompress_v(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cq, -1, -1, 14, 15, /* pkt_len, bswap16 */ -1, -1, -1, -1 /* skip packet_type */); /* Restore the compressed count. Must be 16 bits. */ - const uint16_t mcqe_n = t_pkt->data_len + - (rxq->crc_present * RTE_ETHER_CRC_LEN); + uint16_t mcqe_n = (rxq->cqe_comp_layout) ? + (MLX5_CQE_NUM_MINIS(cq->op_own) + 1) : + t_pkt->data_len + (rxq->crc_present * RTE_ETHER_CRC_LEN); + uint16_t pkts_n = mcqe_n; const __m128i rearm = _mm_loadu_si128((__m128i *)&t_pkt->rearm_data); const __m128i rxdf = @@ -124,6 +127,9 @@ rxq_cq_decompress_v(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cq, * D. store rx_descriptor_fields1. * E. store flow tag (rte_flow mark). */ +cycle: + if (rxq->cqe_comp_layout) + rte_prefetch0((void *)(cq + mcqe_n)); for (pos = 0; pos < mcqe_n; ) { __m128i mcqe1, mcqe2; __m128i rxdf1, rxdf2; @@ -131,9 +137,10 @@ rxq_cq_decompress_v(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cq, __m128i byte_cnt, invalid_mask; #endif - for (i = 0; i < MLX5_VPMD_DESCS_PER_LOOP; ++i) - if (likely(pos + i < mcqe_n)) - rte_prefetch0((void *)(cq + pos + i)); + if (!rxq->cqe_comp_layout) + for (i = 0; i < MLX5_VPMD_DESCS_PER_LOOP; ++i) + if (likely(pos + i < mcqe_n)) + rte_prefetch0((void *)(cq + pos + i)); /* A.1 load mCQEs into a 128bit register. */ mcqe1 = _mm_loadu_si128((__m128i *)&mcq[pos % 8]); mcqe2 = _mm_loadu_si128((__m128i *)&mcq[pos % 8 + 2]); @@ -344,22 +351,40 @@ rxq_cq_decompress_v(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cq, } pos += MLX5_VPMD_DESCS_PER_LOOP; /* Move to next CQE and invalidate consumed CQEs. 
*/ - if (!(pos & 0x7) && pos < mcqe_n) { - if (pos + 8 < mcqe_n) - rte_prefetch0((void *)(cq + pos + 8)); - mcq = (void *)(cq + pos); - for (i = 0; i < 8; ++i) - cq[inv++].op_own = MLX5_CQE_INVALIDATE; + if (!rxq->cqe_comp_layout) { + if (!(pos & 0x7) && pos < mcqe_n) { + if (pos + 8 < mcqe_n) + rte_prefetch0((void *)(cq + pos + 8)); + mcq = (void *)(cq + pos); + for (i = 0; i < 8; ++i) + cq[inv++].op_own = MLX5_CQE_INVALIDATE; + } + } + } + if (rxq->cqe_comp_layout) { + int ret; + /* Keep unzipping if the next CQE is the miniCQE array. */ + cq = &cq[mcqe_n]; + ret = check_cqe_iteration(cq, rxq->cqe_n, rxq->cq_ci + pkts_n); + if (ret == MLX5_CQE_STATUS_SW_OWN && + MLX5_CQE_FORMAT(cq->op_own) == MLX5_COMPRESSED) { + pos = 0; + elts = &elts[mcqe_n]; + mcq = (void *)cq; + mcqe_n = MLX5_CQE_NUM_MINIS(cq->op_own) + 1; + pkts_n += mcqe_n; + goto cycle; } + } else { + /* Invalidate the rest of CQEs. */ + for (; inv < pkts_n; ++inv) + cq[inv].op_own = MLX5_CQE_INVALIDATE; } - /* Invalidate the rest of CQEs. */ - for (; inv < mcqe_n; ++inv) - cq[inv].op_own = MLX5_CQE_INVALIDATE; #ifdef MLX5_PMD_SOFT_COUNTERS - rxq->stats.ipackets += mcqe_n; + rxq->stats.ipackets += pkts_n; rxq->stats.ibytes += rcvd_byte; #endif - return mcqe_n; + return pkts_n; } /** @@ -527,7 +552,9 @@ rxq_cq_process_v(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cq, uint64_t n = 0; uint64_t comp_idx = MLX5_VPMD_DESCS_PER_LOOP; uint16_t nocmp_n = 0; - unsigned int ownership = !!(rxq->cq_ci & (q_mask + 1)); + const uint8_t vic = rxq->cq_ci >> rxq->cqe_n; + const uint8_t own = !(rxq->cq_ci & (q_mask + 1)); + const __m128i vic_check = _mm_set1_epi64x(0x00ff000000ff0000LL); const __m128i owner_check = _mm_set1_epi64x(0x0100000001000000LL); const __m128i opcode_check = _mm_set1_epi64x(0xf0000000f0000000LL); const __m128i format_check = _mm_set1_epi64x(0x0c0000000c000000LL); @@ -541,6 +568,16 @@ rxq_cq_process_v(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cq, 12, 13, 8, 9, 4, 5, 0, 1); #endif + const __m128i validity = + _mm_set_epi8(0, vic, 0, 0, + 0, vic, 0, 0, + 0, vic, 0, 0, + 0, vic, 0, 0); + const __m128i ownership = + _mm_set_epi8(own, 0, 0, 0, + own, 0, 0, 0, + own, 0, 0, 0, + own, 0, 0, 0); /* Mask to shuffle from extracted CQE to mbuf. */ const __m128i shuf_mask = _mm_set_epi8(-1, 3, 2, 1, /* fdir.hi */ @@ -573,7 +610,7 @@ rxq_cq_process_v(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cq, * uint8_t pkt_info; * uint8_t flow_tag[3]; * uint16_t byte_cnt; - * uint8_t rsvd4; + * uint8_t validity_iteration_count; * uint8_t op_own; * uint16_t hdr_type_etc; * uint16_t vlan_info; @@ -689,11 +726,15 @@ rxq_cq_process_v(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cq, /* D.1 fill in mbuf - rx_descriptor_fields1. */ _mm_storeu_si128((void *)&pkts[pos + 1]->pkt_len, pkt_mb1); _mm_storeu_si128((void *)&pkts[pos]->pkt_len, pkt_mb0); - /* E.2 flip owner bit to mark CQEs from last round. */ - owner_mask = _mm_and_si128(op_own, owner_check); - if (ownership) - owner_mask = _mm_xor_si128(owner_mask, owner_check); - owner_mask = _mm_cmpeq_epi32(owner_mask, owner_check); + /* E.2 mask out CQEs belonging to HW. */ + if (rxq->cqe_comp_layout) { + owner_mask = _mm_and_si128(op_own, vic_check); + owner_mask = _mm_cmpeq_epi32(owner_mask, validity); + owner_mask = _mm_xor_si128(owner_mask, ones); + } else { + owner_mask = _mm_and_si128(op_own, owner_check); + owner_mask = _mm_cmpeq_epi32(owner_mask, ownership); + } owner_mask = _mm_packs_epi32(owner_mask, zero); /* E.3 get mask for invalidated CQEs. 
*/ opcode = _mm_and_si128(op_own, opcode_check); @@ -729,7 +770,8 @@ mask = _mm_sll_epi64(ones, mask); invalid_mask = _mm_or_si128(invalid_mask, mask); /* D.3 check error in opcode. */ - adj = (comp_idx != MLX5_VPMD_DESCS_PER_LOOP && comp_idx == n); + adj = (!rxq->cqe_comp_layout && + comp_idx != MLX5_VPMD_DESCS_PER_LOOP && comp_idx == n); mask = _mm_set_epi64x(0, adj * sizeof(uint16_t) * 8); mini_mask = _mm_sll_epi64(invalid_mask, mask); opcode = _mm_cmpeq_epi32(resp_err_check, opcode);
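All three vectorized decompression routines above (Altivec, NEON, SSE) adopt the same session-chaining control flow. The runnable toy model below captures it under stated simplifications: plain struct fields stand in for the op_own bit tests (MLX5_CQE_FORMAT, MLX5_CQE_NUM_MINIS) and for the check_cqe_iteration() ownership check; the function and type names are illustrative, not the driver's.

#include <stdint.h>
#include <stdio.h>

/* Toy title CQE for a compressed session: just the facts the chaining
 * logic reads, flattened into plain members instead of op_own bits. */
struct toy_cqe {
	int sw_owned;        /* check_cqe_iteration() == SW_OWN */
	int compressed;      /* MLX5_CQE_FORMAT(op_own) == MLX5_COMPRESSED */
	unsigned int mcqe_n; /* MLX5_CQE_NUM_MINIS(op_own) + 1 */
};

/* Consume one miniCQE session, then keep unzipping while the CQE right
 * after it starts another compressed, software-owned session. Consumed
 * CQEs are not invalidated; the position simply steps over them. */
static unsigned int
unzip_sessions(const struct toy_cqe *cq, unsigned int cq_len)
{
	unsigned int pkts_n = 0, pos = 0;

	while (pos < cq_len && cq[pos].sw_owned && cq[pos].compressed) {
		unsigned int mcqe_n = cq[pos].mcqe_n;

		/* The SIMD mbuf-building body would run here. */
		pkts_n += mcqe_n;
		pos += mcqe_n;
	}
	return pkts_n;
}

int
main(void)
{
	struct toy_cqe cq[16] = {{0}};

	/* Two chained sessions of 8 and 4 miniCQEs, then a HW-owned CQE. */
	cq[0] = (struct toy_cqe){1, 1, 8};
	cq[8] = (struct toy_cqe){1, 1, 4};
	printf("%u packets\n", unzip_sessions(cq, 16)); /* -> 12 packets */
	return 0;
}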
From patchwork Tue Feb 28 16:43:10 2023
X-Patchwork-Submitter: Alexander Kozyrev
X-Patchwork-Id: 124578
X-Patchwork-Delegate: rasland@nvidia.com
From: Alexander Kozyrev
Subject: [PATCH 5/5] net/mlx5: enable enhanced CQE compression
Date: Tue, 28 Feb 2023 18:43:10 +0200
Message-ID: <20230228164310.807594-6-akozyrev@nvidia.com>
In-Reply-To: <20230228164310.807594-1-akozyrev@nvidia.com>
References: <20230228164310.807594-1-akozyrev@nvidia.com>
Extend the rxq_cqe_comp_en devarg to let the user enable the Enhanced CQE Compression layout. Setting the 8th bit turns it on; for example, rxq_cqe_comp_en=0x84 selects the L3/L4 Header miniCQE format together with the Enhanced CQE Compression layout. Enhanced CQE Compression can be enabled only if the FW supports it. Create the CQ with the proper CQE compression layout based on the reported capabilities. (A short sketch decoding this bit layout follows the diff below.)

Signed-off-by: Alexander Kozyrev
Acked-by: Viacheslav Ovsiienko
--- doc/guides/nics/mlx5.rst | 14 ++++++++++---- doc/guides/rel_notes/release_23_03.rst | 1 + drivers/net/mlx5/mlx5.c | 17 ++++++++++++++--- drivers/net/mlx5/mlx5.h | 1 + drivers/net/mlx5/mlx5_devx.c | 2 ++ 5 files changed, 28 insertions(+), 7 deletions(-) diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst index 0929f3ead0..29eedd7a35 100644 --- a/doc/guides/nics/mlx5.rst +++ b/doc/guides/nics/mlx5.rst @@ -681,14 +681,20 @@ for an additional list of options shared with other mlx5 drivers. Multi-Packet Rx queue configuration: Hash RSS format is used in case MPRQ is disabled, Checksum format is used in case MPRQ is enabled. - Specifying 2 as a ``rxq_cqe_comp_en`` value selects Flow Tag format for - better compression rate in case of RTE Flow Mark traffic. - Specifying 3 as a ``rxq_cqe_comp_en`` value selects Checksum format. - Specifying 4 as a ``rxq_cqe_comp_en`` value selects L3/L4 Header format for + The lower 3 bits define the CQE compression format: + Specifying 2 in these bits of the ``rxq_cqe_comp_en`` parameter selects + Flow Tag format for better compression rate in case of RTE Flow Mark traffic. + Specifying 3 in these bits selects Checksum format. + Specifying 4 in these bits selects L3/L4 Header format for better compression rate in case of mixed TCP/UDP and IPv4/IPv6 traffic. CQE compression format selection requires DevX to be enabled. If there is no DevX enabled/supported the value is reset to 1 by default. + The 8th bit defines the CQE compression layout. + Setting this bit to 1 turns Enhanced CQE Compression Layout on. + Enhanced CQE Compression is designed for better latency and SW utilization. + This bit is ignored if only the Basic CQE compression layout is supported.
+ Supported on: - x86_64 with ConnectX-4, ConnectX-4 Lx, ConnectX-5, ConnectX-6, ConnectX-6 Dx, diff --git a/doc/guides/rel_notes/release_23_03.rst b/doc/guides/rel_notes/release_23_03.rst index 49c18617a5..de151b2b2f 100644 --- a/doc/guides/rel_notes/release_23_03.rst +++ b/doc/guides/rel_notes/release_23_03.rst @@ -155,6 +155,7 @@ New Features * **Updated NVIDIA mlx5 driver.** * Added support for matching on ICMPv6 ID and sequence fields. + * Added support for Enhanced CQE Compression layout. * **Updated Wangxun ngbe driver.** diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c index 6bf522ae9d..41b1b12b91 100644 --- a/drivers/net/mlx5/mlx5.c +++ b/drivers/net/mlx5/mlx5.c @@ -384,6 +384,8 @@ static const struct mlx5_indexed_pool_config mlx5_ipool_cfg[] = { #define MLX5_FLOW_TABLE_HLIST_ARRAY_SIZE 1024 +#define MLX5_RXQ_ENH_CQE_COMP_MASK 0x80 + /** * Decide whether representor ID is a HPF(host PF) port on BF2. * @@ -2461,14 +2463,16 @@ mlx5_port_args_check_handler(const char *key, const char *val, void *opaque) return -rte_errno; } if (strcmp(MLX5_RXQ_CQE_COMP_EN, key) == 0) { - if (tmp > MLX5_CQE_RESP_FORMAT_L34H_STRIDX) { + if ((tmp & ~MLX5_RXQ_ENH_CQE_COMP_MASK) > + MLX5_CQE_RESP_FORMAT_L34H_STRIDX) { DRV_LOG(ERR, "invalid CQE compression " "format parameter"); rte_errno = EINVAL; return -rte_errno; } config->cqe_comp = !!tmp; - config->cqe_comp_fmt = tmp; + config->cqe_comp_fmt = tmp & ~MLX5_RXQ_ENH_CQE_COMP_MASK; + config->enh_cqe_comp = !!(tmp & MLX5_RXQ_ENH_CQE_COMP_MASK); } else if (strcmp(MLX5_RXQ_PKT_PAD_EN, key) == 0) { config->hw_padding = !!tmp; } else if (strcmp(MLX5_RX_MPRQ_EN, key) == 0) { @@ -2640,7 +2644,13 @@ mlx5_port_args_config(struct mlx5_priv *priv, struct mlx5_kvargs_ctrl *mkvlist, "L3/L4 Header CQE compression format isn't supported."); config->cqe_comp = 0; } - DRV_LOG(DEBUG, "Rx CQE compression is %ssupported.", + if (config->enh_cqe_comp && !hca_attr->enhanced_cqe_compression) { + DRV_LOG(WARNING, + "Enhanced CQE compression isn't supported."); + config->enh_cqe_comp = 0; + } + DRV_LOG(DEBUG, "%sRx CQE compression is %ssupported.", + config->enh_cqe_comp ? "Enhanced " : "", config->cqe_comp ? "" : "not "); if ((config->std_delay_drop || config->hp_delay_drop) && !dev_cap->rq_delay_drop_en) { @@ -2662,6 +2672,7 @@ mlx5_port_args_config(struct mlx5_priv *priv, struct mlx5_kvargs_ctrl *mkvlist, DRV_LOG(DEBUG, "\"rxq_pkt_pad_en\" is %u.", config->hw_padding); DRV_LOG(DEBUG, "\"rxq_cqe_comp_en\" is %u.", config->cqe_comp); DRV_LOG(DEBUG, "\"cqe_comp_fmt\" is %u.", config->cqe_comp_fmt); + DRV_LOG(DEBUG, "\"enh_cqe_comp\" is %u.", config->enh_cqe_comp); DRV_LOG(DEBUG, "\"rx_vec_en\" is %u.", config->rx_vec_en); DRV_LOG(DEBUG, "Standard \"delay_drop\" is %u.", config->std_delay_drop); diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h index 4d1af3089e..29e12cf4a7 100644 --- a/drivers/net/mlx5/mlx5.h +++ b/drivers/net/mlx5/mlx5.h @@ -271,6 +271,7 @@ struct mlx5_port_config { unsigned int hw_vlan_insert:1; /* VLAN insertion in WQE is supported. */ unsigned int hw_padding:1; /* End alignment padding is supported. */ unsigned int cqe_comp:1; /* CQE compression is enabled. */ + unsigned int enh_cqe_comp:1; /* Enhanced CQE compression is enabled. */ unsigned int cqe_comp_fmt:3; /* CQE compression format. */ unsigned int rx_vec_en:1; /* Rx vector is enabled. */ unsigned int std_delay_drop:1; /* Enable standard Rxq delay drop. 
*/ diff --git a/drivers/net/mlx5/mlx5_devx.c b/drivers/net/mlx5/mlx5_devx.c index d02cedb202..4369d2557e 100644 --- a/drivers/net/mlx5/mlx5_devx.c +++ b/drivers/net/mlx5/mlx5_devx.c @@ -372,6 +372,8 @@ mlx5_rxq_create_devx_cq_resources(struct mlx5_rxq_priv *rxq) if (priv->config.cqe_comp && !rxq_data->hw_timestamp && !rxq_data->lro) { cq_attr.cqe_comp_en = 1u; + cq_attr.cqe_comp_layout = priv->config.enh_cqe_comp; + rxq_data->cqe_comp_layout = cq_attr.cqe_comp_layout; rxq_data->mcqe_format = priv->config.cqe_comp_fmt; rxq_data->byte_mask = UINT32_MAX; switch (priv->config.cqe_comp_fmt) {
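To make the bit layout concrete, the decoding performed in mlx5_port_args_check_handler() above boils down to the following standalone sketch; the mask mirrors MLX5_RXQ_ENH_CQE_COMP_MASK from the patch, and anything else here is illustrative.

#include <stdint.h>
#include <stdio.h>

#define ENH_CQE_COMP_MASK 0x80 /* 8th bit: Enhanced layout on/off. */

int
main(void)
{
	uint32_t tmp = 0x84; /* rxq_cqe_comp_en=0x84 from the example. */
	unsigned int cqe_comp = !!tmp;                        /* compression enabled */
	unsigned int cqe_comp_fmt = tmp & ~ENH_CQE_COMP_MASK; /* low bits: format */
	unsigned int enh_cqe_comp = !!(tmp & ENH_CQE_COMP_MASK);

	/* 0x84 -> format 4 (L3/L4 Header miniCQE), Enhanced layout on. */
	printf("comp=%u fmt=%u enhanced=%u\n",
	       cqe_comp, cqe_comp_fmt, enh_cqe_comp);
	return 0;
}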