From patchwork Sat Oct 9 01:53:49 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Elena Agostini X-Patchwork-Id: 100838 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 9ECD1A0C43; Fri, 8 Oct 2021 19:44:42 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id EDD974114E; Fri, 8 Oct 2021 19:43:52 +0200 (CEST) Received: from NAM11-BN8-obe.outbound.protection.outlook.com (mail-bn8nam11on2082.outbound.protection.outlook.com [40.107.236.82]) by mails.dpdk.org (Postfix) with ESMTP id 8847B41142 for ; Fri, 8 Oct 2021 19:43:49 +0200 (CEST) ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=nu3kfsr/+gxQGf9miZHy3y+YsRwJT7jaEW9BNIskEhaqixrxYwzUf881VUwpChVtPJvc1Ebf0R9FTlO20uoI7SYo6wEDyjU6hpzWrKnZg7/L02eA4Mc55C1awu94dcODsDjo8VJoBu+ZSZsm7P2xp/JiiJhNP7L4eVCgoKxBE2GFLqOdcPUfAa3HLGpLShNSJyVdGTxLw49YRrahynqOKju4lSUsyj4kyux0Oe2edYKYltrTQ+S1+XinXmwqutJdlDDFBgfy/Up70mNcF7JP/3GAydF6ynAESjdxi0sQitrPD6i6Z/kb3puf0zBcSeHQD3u6jp9O/pj80IySn5A/8A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=OxtiHhow+oMMccuwlq9p6l5sIJT9EIjR/wpVQcG82Kw=; b=TD1GLD5F6H4/WK8qkVLEpmgCVYyKRb5tfkacU/ucSGfsXQv6BHnClrAp3INR9I3Y4qOSMr6WeA3OUav/QjDzcpZXQYmXBUKb+TRV8S/49y9fAMlWNyhfc7bUUpCCXiGDanFt1nLwjikg9ni6LJxg1D5mp3AZeNPlIHOjnQ5MTX+Cgox76Y803u81fyIl036GZAkhg8iNQxCBwTIooWj+32UN/qmk2lr8zLthomYO9ejHbYK83E7LM9XPPNf9hqrmK3vdNW8SZWNOuPBsGvzvuoVGYOFu2ciosgnKV1jNuLIrf5LXlsGZI+sLZFDXzW0FbIilLJq/AiFwZn/Ezf/waw== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 216.228.112.34) smtp.rcpttodomain=dpdk.org smtp.mailfrom=nvidia.com; dmarc=pass (p=quarantine sp=none pct=100) action=none header.from=nvidia.com; dkim=none (message not signed); arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=OxtiHhow+oMMccuwlq9p6l5sIJT9EIjR/wpVQcG82Kw=; b=KDM6lgQn5LoCtFvo+/bHBgXapfBuG479xHW14A+4+vgksFb2dtge6u7fewsAvlT/E2x+pf8UQBV9+vxwWUW0vgR8zoS/uyq0RjVjH5EycscdOiAnNW7/kQRd6PsrmpIdWC9dHSnWfB/v6hBixjW2TruUSwI7QldvZJ7nldBQ/iS9f7ugF9KSRZSdNoTq2z1MCitHQfpiGSJnB8zbJevp+9vhm5WvkvDn6mLx05XaP7I+QOpW/J3MvKCCVLRdGC1CBUO25jEakxQz+/QlVdTNSfRRe3qog0SoJ0KTGeaZrXnkvlpHwKeQJzdL7nQc9BGFLJERC3AXGX55MD474eG6Gw== Received: from MW4PR03CA0130.namprd03.prod.outlook.com (2603:10b6:303:8c::15) by CH0PR12MB5122.namprd12.prod.outlook.com (2603:10b6:610:bd::12) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4587.22; Fri, 8 Oct 2021 17:43:47 +0000 Received: from CO1NAM11FT047.eop-nam11.prod.protection.outlook.com (2603:10b6:303:8c:cafe::eb) by MW4PR03CA0130.outlook.office365.com (2603:10b6:303:8c::15) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4587.18 via Frontend Transport; Fri, 8 Oct 2021 17:43:47 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 216.228.112.34) smtp.mailfrom=nvidia.com; dpdk.org; dkim=none (message not signed) header.d=none;dpdk.org; dmarc=pass action=none header.from=nvidia.com; Received-SPF: Pass (protection.outlook.com: domain of nvidia.com designates 216.228.112.34 as permitted sender) receiver=protection.outlook.com; client-ip=216.228.112.34; helo=mail.nvidia.com; Received: from mail.nvidia.com (216.228.112.34) by CO1NAM11FT047.mail.protection.outlook.com (10.13.174.132) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384) id 15.20.4587.18 via Frontend Transport; Fri, 8 Oct 2021 17:43:47 +0000 Received: from nvidia.com (172.20.187.6) by HQMAIL107.nvidia.com (172.20.187.13) with Microsoft SMTP Server (TLS) id 15.0.1497.18; Fri, 8 Oct 2021 17:43:43 +0000 From: To: CC: Elena Agostini Date: Sat, 9 Oct 2021 01:53:49 +0000 Message-ID: <20211009015349.9694-10-eagostini@nvidia.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20211009015349.9694-1-eagostini@nvidia.com> References: <20210602203531.2288645-1-thomas@monjalon.net> <20211009015349.9694-1-eagostini@nvidia.com> MIME-Version: 1.0 X-Originating-IP: [172.20.187.6] X-ClientProxiedBy: HQMAIL111.nvidia.com (172.20.187.18) To HQMAIL107.nvidia.com (172.20.187.13) X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id: 9878705d-b820-4bd2-0a94-08d98a832e51 X-MS-TrafficTypeDiagnostic: CH0PR12MB5122: X-Microsoft-Antispam-PRVS: X-MS-Oob-TLC-OOBClassifiers: OLM:6430; X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: vC5baE00yaFCYcQPz4GGTviXD4KHj6MKEcJpe0EkJBi32nsyg4ATReJQQpfXVKdbIiJC0DIUFCh2NRCBEzgeel+u0JB54LUB6iklKCN+ndHQmrWh5zKLOBMAJs1Z6vk65DjV0MWy7rpDeWDlph+mTdIqdDf/q7yLt9rP+NdOljl4hsXWt/HruaxggmrLw1LH2AZAikZ/ngATM2Va2pcimlc2tOojPbmHG6AmDkiJ4kx7Jr9hl6BGmfdl0xQzAhUxzGRwKwLTcjlvyr9yyhu6kN85FmLvyMGsDqOVgAG0Lx3+W3QWsKiCB1hbVYH3giZItlutXgLmo2bjGQ8WQfd5qm1e3uxTOlqhppuNEI0WqtZ4TyzKoGQGA1Yv6rc8QB5/PXxZNDgXm/ooEBkLfJeAaHO4Gb2YXxaIpOMNhZV4axPWwVoxX/HGcROqvMjQX2mf4SFKwoQUcmYclLA52KRFtkc7eBwauJlQ6pKNAVCtwHKKJVFK86xioiapdkfiL1hSPEFmUa/Zj6Rz3TnOv/CfFy9zODe+zf2c1dfXj56Y5w610Lys1hiQXp45mj6BpBte8DEswxqI+fLOiCDI5Aq7ITaR6g1+zO7lin8gukbbdYIVXmD+FQO58ZKEPXocEkXrge2ZARNJTfoQzk4M1+V9k7mB9aKnKStzOgzS3jOZRkZR64NtmZ222unBWbEg5P5SE4dbaOn6jX1Wv/KxNQw5zQ== X-Forefront-Antispam-Report: CIP:216.228.112.34; CTRY:US; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:mail.nvidia.com; PTR:schybrid03.nvidia.com; CAT:NONE; SFS:(4636009)(36840700001)(46966006)(36756003)(86362001)(83380400001)(6286002)(47076005)(336012)(55016002)(36860700001)(107886003)(82310400003)(356005)(7696005)(1076003)(7636003)(426003)(2906002)(316002)(2876002)(4326008)(5660300002)(26005)(186003)(16526019)(508600001)(70206006)(70586007)(6916009)(8936002)(2616005)(8676002); DIR:OUT; SFP:1101; X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 08 Oct 2021 17:43:47.0408 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 9878705d-b820-4bd2-0a94-08d98a832e51 X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=43083d15-7273-40c1-b7db-39efd9ccc17a; Ip=[216.228.112.34]; Helo=[mail.nvidia.com] X-MS-Exchange-CrossTenant-AuthSource: CO1NAM11FT047.eop-nam11.prod.protection.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: CH0PR12MB5122 Subject: [dpdk-dev] [PATCH v3 9/9] doc: add CUDA example in GPU guide X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" From: Elena Agostini Signed-off-by: Elena Agostini --- doc/guides/prog_guide/gpudev.rst | 122 +++++++++++++++++++++++++++++++ 1 file changed, 122 insertions(+) diff --git a/doc/guides/prog_guide/gpudev.rst b/doc/guides/prog_guide/gpudev.rst index cbaec5a1e4..1baf0c6772 100644 --- a/doc/guides/prog_guide/gpudev.rst +++ b/doc/guides/prog_guide/gpudev.rst @@ -102,3 +102,125 @@ the list of mbuf payload addresses where received packet have been stored. The ``rte_gpu_comm_*()`` functions are responsible to create a list of packets that can be populated with receive mbuf payload addresses and communicated to the task running on the GPU. + + +CUDA Example +------------ + +In the example below, there is a pseudo-code to give an example +about how to use functions in this library in case of a CUDA application. + +.. code-block:: c + + ////////////////////////////////////////////////////////////////////////// + ///// gpudev library + CUDA functions + ////////////////////////////////////////////////////////////////////////// + #define GPU_PAGE_SHIFT 16 + #define GPU_PAGE_SIZE (1UL << GPU_PAGE_SHIFT) + + int main() { + struct rte_gpu_flag quit_flag; + struct rte_gpu_comm_list *comm_list; + int nb_rx = 0; + int comm_list_entry = 0; + struct rte_mbuf * rx_mbufs[max_rx_mbufs]; + cudaStream_t cstream; + struct rte_mempool *mpool_payload, *mpool_header; + struct rte_pktmbuf_extmem ext_mem; + int16_t dev_id; + int16_t port_id = 0; + + /** Initialize CUDA objects (cstream, context, etc..). */ + /** Use gpudev library to register a new CUDA context if any */ + /** Let's assume the application wants to use the default context of the GPU device 0 */ + + dev_id = 0; + + /** + * Create an external memory mempool using memory allocated on the GPU. + */ + ext_mem.elt_size = mbufs_headroom_size; + ext_mem.buf_len = RTE_ALIGN_CEIL(mbufs_num * ext_mem.elt_size, GPU_PAGE_SIZE); + ext_mem.buf_iova = RTE_BAD_IOVA; + ext_mem.buf_ptr = rte_gpu_malloc(dev_id, ext_mem.buf_len, 0); + rte_extmem_register(ext_mem.buf_ptr, ext_mem.buf_len, NULL, ext_mem.buf_iova, GPU_PAGE_SIZE); + rte_dev_dma_map(rte_eth_devices[port_id].device, ext_mem.buf_ptr, ext_mem.buf_iova, ext_mem.buf_len); + mpool_payload = rte_pktmbuf_pool_create_extbuf("gpu_mempool", mbufs_num, + 0, 0, ext_mem.elt_size, + rte_socket_id(), &ext_mem, 1); + + /** + * Create CPU - device communication flag. With this flag, the CPU can tell to the CUDA kernel + * to exit from the main loop. + */ + rte_gpu_comm_create_flag(dev_id, &quit_flag, RTE_GPU_COMM_FLAG_CPU); + rte_gpu_comm_set_flag(&quit_flag , 0); + + /** + * Create CPU - device communication list. Each entry of this list will be populated by the CPU + * with a new set of received mbufs that the CUDA kernel has to process. + */ + comm_list = rte_gpu_comm_create_list(dev_id, num_entries); + + /** A very simple CUDA kernel with just 1 CUDA block and RTE_GPU_COMM_LIST_PKTS_MAX CUDA threads. */ + cuda_kernel_packet_processing<<<1, RTE_GPU_COMM_LIST_PKTS_MAX, 0, cstream>>>(quit_flag->ptr, comm_list, num_entries, ...); + + /** + * For simplicity, the CPU here receives only 2 bursts of mbufs. + * In a real application, network activity and device processing should overlap. + */ + nb_rx = rte_eth_rx_burst(port_id, queue_id, &(rx_mbufs[0]), max_rx_mbufs); + rte_gpu_comm_populate_list_pkts(comm_list[0], rx_mbufs, nb_rx); + nb_rx = rte_eth_rx_burst(port_id, queue_id, &(rx_mbufs[0]), max_rx_mbufs); + rte_gpu_comm_populate_list_pkts(comm_list[1], rx_mbufs, nb_rx); + + /** + * CPU waits for the completion of the packets' processing on the CUDA kernel + * and then it does a cleanup of the received mbufs. + */ + while(rte_gpu_comm_cleanup_list(comm_list[0])); + while(rte_gpu_comm_cleanup_list(comm_list[1])); + + /** CPU notifies the CUDA kernel that it has to terminate */ + rte_gpu_comm_set_flag(&quit_flag, 1); + + /** gpudev objects cleanup/destruction */ + /** CUDA cleanup */ + + rte_gpu_free(dev_id, ext_mem.buf_len); + + /** DPDK cleanup */ + + return 0; + } + + ////////////////////////////////////////////////////////////////////////// + ///// CUDA kernel + ////////////////////////////////////////////////////////////////////////// + + void cuda_kernel(uint32_t * quit_flag_ptr, struct rte_gpu_comm_list *comm_list, int comm_list_entries) { + int comm_list_index = 0; + struct rte_gpu_comm_pkt *pkt_list = NULL; + + /** Do some pre-processing operations. */ + + /** GPU kernel keeps checking this flag to know if it has to quit or wait for more packets. */ + while(*quit_flag_ptr == 0) + { + if(comm_list[comm_list_index]->status != RTE_GPU_COMM_LIST_READY) + continue; + + if(threadIdx.x < comm_list[comm_list_index]->num_pkts) + { + /** Each CUDA thread processes a different packet. */ + packet_processing(comm_list[comm_list_index]->addr, comm_list[comm_list_index]->size, ..); + } + __threadfence(); + __syncthreads(); + + /** Wait for new packets on the next communication list entry. */ + comm_list_index = (comm_list_index+1) % comm_list_entries; + } + + /** Do some post-processing operations. */ + }