From patchwork Sat Jan 8 00:20:03 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Elena Agostini
X-Patchwork-Id: 105692
X-Patchwork-Delegate: thomas@monjalon.net
CC: Elena Agostini
Subject: [PATCH v2 3/3] gpu/cuda: mem alloc aligned memory
Date: Sat, 8 Jan 2022 00:20:03 +0000
Message-ID: <20220108002003.21153-3-eagostini@nvidia.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20220108002003.21153-1-eagostini@nvidia.com>
References: <20220104014721.1799-1-eagostini@nvidia.com>
 <20220108002003.21153-1-eagostini@nvidia.com>
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: DPDK patches and discussions
Errors-To: dev-bounces@dpdk.org

From: Elena Agostini

Implement aligned GPU memory allocation in GPU CUDA driver.

Changelog:
- cuda_mem_alloc parameters order

Signed-off-by: Elena Agostini
---
 drivers/gpu/cuda/cuda.c | 21 ++++++++++++++++-----
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/cuda/cuda.c b/drivers/gpu/cuda/cuda.c
index 882df08e56..dc8d3d3b5a 100644
--- a/drivers/gpu/cuda/cuda.c
+++ b/drivers/gpu/cuda/cuda.c
@@ -139,8 +139,10 @@ typedef uintptr_t cuda_ptr_key;
 /* Single entry of the memory list */
 struct mem_entry {
 	CUdeviceptr ptr_d;
+	CUdeviceptr ptr_orig_d;
 	void *ptr_h;
 	size_t size;
+	size_t size_orig;
 	struct rte_gpu *dev;
 	CUcontext ctx;
 	cuda_ptr_key pkey;
@@ -569,7 +571,7 @@ cuda_dev_info_get(struct rte_gpu *dev, struct rte_gpu_info *info)
  */
 
 static int
-cuda_mem_alloc(struct rte_gpu *dev, size_t size, void **ptr)
+cuda_mem_alloc(struct rte_gpu *dev, size_t size, unsigned int align, void **ptr)
 {
 	CUresult res;
 	const char *err_string;
@@ -610,8 +612,10 @@ cuda_mem_alloc(struct rte_gpu *dev, size_t size, void **ptr)
 
 	/* Allocate memory */
 	mem_alloc_list_tail->size = size;
-	res = pfn_cuMemAlloc(&(mem_alloc_list_tail->ptr_d),
-			mem_alloc_list_tail->size);
+	mem_alloc_list_tail->size_orig = size + align;
+
+	res = pfn_cuMemAlloc(&(mem_alloc_list_tail->ptr_orig_d),
+			mem_alloc_list_tail->size_orig);
 	if (res != 0) {
 		pfn_cuGetErrorString(res, &(err_string));
 		rte_cuda_log(ERR, "cuCtxSetCurrent current failed with %s",
@@ -620,6 +624,13 @@ cuda_mem_alloc(struct rte_gpu *dev, size_t size, void **ptr)
 		return -rte_errno;
 	}
 
+
+	/* Align memory address */
+	mem_alloc_list_tail->ptr_d = mem_alloc_list_tail->ptr_orig_d;
+	if (align && ((uintptr_t)mem_alloc_list_tail->ptr_d) % align)
+		mem_alloc_list_tail->ptr_d += (align -
+				(((uintptr_t)mem_alloc_list_tail->ptr_d) % align));
+
 	/* GPUDirect RDMA attribute required */
 	res = pfn_cuPointerSetAttribute(&flag,
 			CU_POINTER_ATTRIBUTE_SYNC_MEMOPS,
@@ -634,7 +645,6 @@ cuda_mem_alloc(struct rte_gpu *dev, size_t size, void **ptr)
 
 	mem_alloc_list_tail->pkey = get_hash_from_ptr((void *)mem_alloc_list_tail->ptr_d);
 	mem_alloc_list_tail->ptr_h = NULL;
-	mem_alloc_list_tail->size = size;
 	mem_alloc_list_tail->dev = dev;
 	mem_alloc_list_tail->ctx = (CUcontext)((uintptr_t)dev->mpshared->info.context);
 	mem_alloc_list_tail->mtype = GPU_MEM;
@@ -761,6 +771,7 @@ cuda_mem_register(struct rte_gpu *dev, size_t size, void *ptr)
 	mem_alloc_list_tail->dev = dev;
 	mem_alloc_list_tail->ctx = (CUcontext)((uintptr_t)dev->mpshared->info.context);
 	mem_alloc_list_tail->mtype = CPU_REGISTERED;
+	mem_alloc_list_tail->ptr_orig_d = mem_alloc_list_tail->ptr_d;
 
 	/* Restore original ctx as current ctx */
 	res = pfn_cuCtxSetCurrent(current_ctx);
@@ -796,7 +807,7 @@ cuda_mem_free(struct rte_gpu *dev, void *ptr)
 	}
 
 	if (mem_item->mtype == GPU_MEM) {
-		res = pfn_cuMemFree(mem_item->ptr_d);
+		res = pfn_cuMemFree(mem_item->ptr_orig_d);
 		if (res != 0) {
 			pfn_cuGetErrorString(res, &(err_string));
 			rte_cuda_log(ERR, "cuMemFree current failed with %s",
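
For readers following the pointer arithmetic, the over-allocate-and-align scheme used by this patch can be sketched on host memory. This is only an illustration under stated assumptions: `struct aligned_buf`, `aligned_buf_alloc()` and `aligned_buf_free()` are hypothetical names that do not exist in the driver, and `malloc()`/`free()` stand in for `pfn_cuMemAlloc()`/`pfn_cuMemFree()`. The sketch allocates `size + align` bytes, rounds the returned address up to the next multiple of `align`, and keeps the original address so that it, and not the aligned one, is passed to the deallocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical host-memory stand-in for struct mem_entry (not driver code). */
struct aligned_buf {
	void *ptr;        /* aligned address handed to the caller (ptr_d) */
	void *ptr_orig;   /* address returned by the allocator (ptr_orig_d) */
	size_t size;      /* size requested by the caller */
	size_t size_orig; /* size actually allocated: size + align */
};

/* Allocate size bytes whose start address is a multiple of align.
 * align == 0 means "no alignment requested", as in the patch.
 * Returns 0 on success, -1 on allocation failure.
 */
static int
aligned_buf_alloc(struct aligned_buf *b, size_t size, unsigned int align)
{
	uintptr_t addr;

	b->size = size;
	b->size_orig = size + align; /* worst-case padding is align - 1 bytes */

	b->ptr_orig = malloc(b->size_orig); /* stand-in for pfn_cuMemAlloc() */
	if (b->ptr_orig == NULL)
		return -1;

	/* Round up to the next multiple of align, if not already aligned */
	addr = (uintptr_t)b->ptr_orig;
	if (align && addr % align)
		addr += align - (addr % align);
	b->ptr = (void *)addr;

	return 0;
}

/* Release the buffer through the ORIGINAL pointer, never the aligned one. */
static void
aligned_buf_free(struct aligned_buf *b)
{
	free(b->ptr_orig); /* stand-in for pfn_cuMemFree() */
	b->ptr = NULL;
	b->ptr_orig = NULL;
}
```

Because the padding never exceeds `align - 1` bytes, the aligned range of `size` bytes always fits inside the `size + align` bytes that were allocated. A caller would invoke `aligned_buf_alloc(&b, 1000, 256)`, use `b.ptr`, and later call `aligned_buf_free(&b)`.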