Message ID | 20211021104142.2649060-1-xuemingl@nvidia.com (mailing list archive)
Headers |
From: Xueming Li <xuemingl@nvidia.com>
To: dev@dpdk.org, Zhang Yuying <yuying.zhang@intel.com>, Li Xiaoyun <xiaoyun.li@intel.com>
Cc: xuemingl@nvidia.com, Jerin Jacob <jerinjacobk@gmail.com>, Ferruh Yigit <ferruh.yigit@intel.com>, Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>, Viacheslav Ovsiienko <viacheslavo@nvidia.com>, Thomas Monjalon <thomas@monjalon.net>, Lior Margalit <lmargalit@nvidia.com>, Ananyev Konstantin <konstantin.ananyev@intel.com>, Ajit Khaparde <ajit.khaparde@broadcom.com>
Date: Thu, 21 Oct 2021 18:41:35 +0800
Subject: [dpdk-dev] [PATCH v13 0/7] ethdev: introduce shared Rx queue
Message-ID: <20211021104142.2649060-1-xuemingl@nvidia.com>
In-Reply-To: <20210727034204.20649-1-xuemingl@nvidia.com>
References: <20210727034204.20649-1-xuemingl@nvidia.com>
List-Id: DPDK patches and discussions <dev.dpdk.org>
Series | ethdev: introduce shared Rx queue
Message
Xueming Li
Oct. 21, 2021, 10:41 a.m. UTC
In the current DPDK framework, all Rx queues are pre-loaded with mbufs for
incoming packets. When the number of representors scales out in a switch
domain, the memory consumption becomes significant. Furthermore, polling
all ports leads to high cache-miss rates, high latency and low throughput.

This patch set introduces the shared Rx queue. A PF and representors in the
same Rx domain and switch domain can share an Rx queue set by specifying a
non-zero share group value in the Rx queue configuration.

All ports that share an Rx queue actually share the hardware descriptor
queue and feed all Rx queues from one descriptor supply, so memory is saved.

Polling any queue that uses the same shared Rx queue receives packets from
all member ports. The source port is identified by mbuf->port.

Multiple groups are supported via the group ID. The number of queues per
port in a shared group should be identical, and queue indexes are 1:1
mapped within a shared group. An example with two share groups:
  Group1, 4 shared Rx queues per member port: PF, repr0, repr1
  Group2, 2 shared Rx queues per member port: repr2, repr3, ... repr127
Poll the first port of each group:
  core  port  queue
  0     0     0
  1     0     1
  2     0     2
  3     0     3
  4     2     0
  5     2     1

A shared Rx queue must be polled on a single thread or core. If both PF0
and representor0 joined the same share group, pf0rxq0 cannot be polled on
core1 while rep0rxq0 is polled on core2. Actually, polling one port within
a share group is sufficient, since polling any port in the group returns
packets for all ports in the group.

There was some discussion about aggregating member ports of the same group
into a dummy port, and there are several ways to achieve it. Since it is
optional, more feedback and requirements need to be collected from users
before making a decision later.

v1:
- initial version
v2:
- add testpmd patches
v3:
- change common forwarding api to macro for performance, thanks Jerin.
- save global variable accessed in forwarding to flowstream to minimize
  cache miss
- combined patches for each forwarding engine
- support multiple groups in testpmd "--share-rxq" parameter
- new api to aggregate shared rxq group
v4:
- spelling fixes
- remove shared-rxq support for all forwarding engines
- add dedicate shared-rxq forwarding engine
v5:
- fix grammars
- remove aggregate api and leave it for later discussion
- add release notes
- add deployment example
v6:
- replace RxQ offload flag with device offload capability flag
- add Rx domain
- RxQ is shared when share group > 0
- update testpmd accordingly
v7:
- fix testpmd share group id allocation
- change rx_domain to 16bits
v8:
- add new patch for testpmd to show device Rx domain ID and capability
- new share_qid in RxQ configuration
v9:
- fix some spelling
v10:
- add device capability name api
v11:
- remove macro from device capability name list
v12:
- rephrase
- in forwarding core check, add global flag and RxQ enabled check
v13:
- update imports of new forwarding engine
- rephrase

Xueming Li (7):
  ethdev: introduce shared Rx queue
  ethdev: get device capability name as string
  app/testpmd: dump device capability and Rx domain info
  app/testpmd: new parameter to enable shared Rx queue
  app/testpmd: dump port info for shared Rx queue
  app/testpmd: force shared Rx queue polled on same core
  app/testpmd: add forwarding engine for shared Rx queue

 app/test-pmd/config.c                       | 141 +++++++++++++++++-
 app/test-pmd/meson.build                    |   1 +
 app/test-pmd/parameters.c                   |  13 ++
 app/test-pmd/shared_rxq_fwd.c               | 115 ++++++++++++++
 app/test-pmd/testpmd.c                      |  26 +++-
 app/test-pmd/testpmd.h                      |   5 +
 app/test-pmd/util.c                         |   3 +
 doc/guides/nics/features.rst                |  13 ++
 doc/guides/nics/features/default.ini        |   1 +
 .../prog_guide/switch_representation.rst    |  11 ++
 doc/guides/rel_notes/release_21_11.rst      |   6 +
 doc/guides/testpmd_app_ug/run_app.rst       |   9 ++
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |   5 +-
 lib/ethdev/rte_ethdev.c                     |  33 ++++
 lib/ethdev/rte_ethdev.h                     |  38 +++++
 lib/ethdev/version.map                      |   1 +
 16 files changed, 415 insertions(+), 6 deletions(-)
 create mode 100644 app/test-pmd/shared_rxq_fwd.c
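For context when reading the cover letter above, the following is a minimal
usage sketch rather than code taken from the series: it relies on the
capability flag and rte_eth_rxconf fields the patches describe
(RTE_ETH_DEV_CAPA_RXQ_SHARE, share_group, share_qid), while the queue depth,
burst size and helper names are illustrative assumptions.

#include <errno.h>
#include <rte_ethdev.h>
#include <rte_mbuf.h>

/* Join a port's Rx queue to a share group; assumes rte_eth_dev_configure()
 * has already been called for the port. */
static int
setup_shared_rxq(uint16_t port_id, uint16_t queue_id, uint16_t share_group,
		 struct rte_mempool *mp)
{
	struct rte_eth_dev_info info;
	struct rte_eth_rxconf rxconf;
	int ret;

	ret = rte_eth_dev_info_get(port_id, &info);
	if (ret != 0)
		return ret;

	/* Only ports advertising the capability can join a share group. */
	if ((info.dev_capa & RTE_ETH_DEV_CAPA_RXQ_SHARE) == 0)
		return -ENOTSUP;

	rxconf = info.default_rxconf;
	rxconf.share_group = share_group; /* non-zero value enables sharing */
	rxconf.share_qid = queue_id;      /* 1:1 queue mapping inside the group */

	return rte_eth_rx_queue_setup(port_id, queue_id, 512,
				      rte_eth_dev_socket_id(port_id),
				      &rxconf, mp);
}

/* Polling any member port of the group returns packets of all member
 * ports; the originating port is recovered from mbuf->port. */
static void
poll_shared_rxq(uint16_t any_member_port, uint16_t queue_id)
{
	struct rte_mbuf *pkts[32];
	uint16_t nb, i;

	nb = rte_eth_rx_burst(any_member_port, queue_id, pkts, 32);
	for (i = 0; i < nb; i++) {
		uint16_t src_port = pkts[i]->port; /* actual source port */

		/* ... hand the packet to per-port processing here ... */
		(void)src_port;
		rte_pktmbuf_free(pkts[i]);
	}
}

Calling setup_shared_rxq() with the same share_group value on the PF and on
each representor would place queue queue_id of all of them into one shared
set; poll_shared_rxq() on any single member then drains traffic for the whole
group, matching the core/port/queue mapping shown in the example above.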
Comments
On 10/21/2021 11:41 AM, Xueming Li wrote:
> In the current DPDK framework, all Rx queues are pre-loaded with mbufs for
> incoming packets. When the number of representors scales out in a switch
> domain, the memory consumption becomes significant. Furthermore, polling
> all ports leads to high cache-miss rates, high latency and low throughput.
> [...]
>
> Xueming Li (7):
>   ethdev: introduce shared Rx queue
>   ethdev: get device capability name as string
>   app/testpmd: dump device capability and Rx domain info
>   app/testpmd: new parameter to enable shared Rx queue
>   app/testpmd: dump port info for shared Rx queue
>   app/testpmd: force shared Rx queue polled on same core
>   app/testpmd: add forwarding engine for shared Rx queue

This patch is changing some common ethdev structs for a use case I am not
sure how common it is; I would like to see more reviews from more vendors,
but we didn't get them, so at this stage I will proceed based on Andrew's
review.

Since only nvidia will be able to test this feature in this release, can
you please make sure the nvidia test report contains this feature? To be
sure the feature is tested by at least one vendor.

Series applied to dpdk-next-net/main, thanks.
On Fri, 2021-10-22 at 00:41 +0100, Ferruh Yigit wrote:
> On 10/21/2021 11:41 AM, Xueming Li wrote:
> > [...]
>
> This patch is changing some common ethdev structs for a use case I am not
> sure how common it is; I would like to see more reviews from more vendors,
> but we didn't get them, so at this stage I will proceed based on Andrew's
> review.
>
> Since only nvidia will be able to test this feature in this release, can
> you please make sure the nvidia test report contains this feature? To be
> sure the feature is tested by at least one vendor.
>
> Series applied to dpdk-next-net/main, thanks.

Hi Ferruh,

Thanks very much for your help!

+Raslan, Ali
Let's make sure the test report contains this feature.

Best Regards,
Xueming Li
On 21-10-21 at 12:41, Xueming Li wrote:
> In the current DPDK framework, all Rx queues are pre-loaded with mbufs for
> incoming packets. When the number of representors scales out in a switch
> domain, the memory consumption becomes significant. Furthermore, polling
> all ports leads to high cache-miss rates, high latency and low throughput.
> [...]
>
> Xueming Li (7):
>   ethdev: introduce shared Rx queue
>   ethdev: get device capability name as string
>   app/testpmd: dump device capability and Rx domain info
>   app/testpmd: new parameter to enable shared Rx queue
>   app/testpmd: dump port info for shared Rx queue
>   app/testpmd: force shared Rx queue polled on same core
>   app/testpmd: add forwarding engine for shared Rx queue

Hi all,

Sorry to jump in this late, but I think this solves only a consequence of
another "problem": the fact that the mbuf descriptor is coupled with the
buffer. You might want to consider another approach that does not require
an API change.

The problem (partially solved by this patch) is that you will "touch" many
descriptors (the rte_mbuf itself) if you have many queues, or even a few
queues with quite large rings. Those descriptors will all likely be out of
cache when you access them.

However, as we demonstrated with mlx5 (see https://packetmill.io/), you can
build a descriptor from scratch out of the NIC hardware ring that points to
the underlying buffer in an indirect way. This descriptor can be taken from
the thread-local buffer pool, so you keep only as many mbuf descriptors in
flight as your burst size. That probably even defeats what this patch can
do, as you can use as few as 32 descriptors per thread for any number of
queues of any size.

What that solution does not solve is the need to poll many different
queues. I think that is orthogonal: with NICs getting smarter, we are going
to have many rules sending traffic to per-application, per-priority queues
anyway, maybe even per-microflow.

To solve this we would need a kind of queue bitmask set in hardware to
indicate which queues to poll instead of trying all of them. Maybe this can
be done through a firmware update? It is a feature we will want in the
future in any case.

The shared Rx queue is surely an easy fix for the polling itself, but one
problem with it is that it will lead to scattered batches. We will get
batches of packets from all ports that will surely take different code
paths for anything above forwarding, breaking the benefit of batching (this
can also lead to up to a 50% performance penalty due to interleaved bursts,
see https://people.kth.se/~dejanko/documents/publications/ordermatters-nsdi22.pdf).

Cheers,
Tom
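To make the scattered-batch concern above concrete, here is a minimal
sketch, an assumption about how an application might cope rather than
something proposed in the series, that splits a mixed burst read from a
shared Rx queue back into per-port sub-bursts using mbuf->port; BURST_SIZE
and the helper name are illustrative.

#include <rte_ethdev.h>
#include <rte_mbuf.h>

#define BURST_SIZE 32

/* Re-group a burst read from a shared Rx queue into per-port sub-bursts,
 * so that later stages still see homogeneous batches per source port. */
static void
forward_shared_burst(uint16_t shared_port, uint16_t queue_id)
{
	struct rte_mbuf *pkts[BURST_SIZE];
	struct rte_mbuf *per_port[RTE_MAX_ETHPORTS][BURST_SIZE];
	uint16_t per_port_nb[RTE_MAX_ETHPORTS] = {0};
	uint16_t nb, i, p;

	nb = rte_eth_rx_burst(shared_port, queue_id, pkts, BURST_SIZE);

	/* Sort packets by their real source port (mbuf->port). */
	for (i = 0; i < nb; i++) {
		p = pkts[i]->port;
		per_port[p][per_port_nb[p]++] = pkts[i];
	}

	/* Process each homogeneous sub-burst; here the packets are simply freed. */
	for (p = 0; p < RTE_MAX_ETHPORTS; p++) {
		for (i = 0; i < per_port_nb[p]; i++)
			rte_pktmbuf_free(per_port[p][i]);
	}
}

The extra pass restores homogeneous per-port batches for the stages above
forwarding, at the cost of an additional scan over each burst.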