From patchwork Fri Sep 8 13:17:35 2023
X-Patchwork-Submitter: Artemy Kovalyov
X-Patchwork-Id: 131305
X-Patchwork-Delegate: david.marchand@redhat.com
From: Artemy Kovalyov
To:
CC: Thomas Monjalon, Ophir Munk, Anatoly Burakov, Morten Brørup,
 Stephen Hemminger
Subject: [PATCH v4 1/2] eal: fix memory initialization deadlock
Date: Fri, 8 Sep 2023 16:17:35 +0300
Message-ID: <20230908131737.1714750-2-artemyko@nvidia.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20230908131737.1714750-1-artemyko@nvidia.com>
References: <20230830103303.2428995-1-artemyko@nvidia.com>
 <20230908131737.1714750-1-artemyko@nvidia.com>
MIME-Version: 1.0
The issue arose due to changes in the DPDK read-write lock implementation.
Following these changes, the RW-lock no longer supports recursion, meaning
that a single thread must not take a read lock it already holds. The problem
shows up during initialization: rte_eal_init() acquires the
memory_hotplug_lock, and later call sequences acquire it again without
releasing it first:

* rte_eal_memory_init() -> eal_memalloc_init() -> rte_memseg_list_walk()
* rte_eal_memory_init() -> rte_eal_hugepage_init() ->
  eal_dynmem_hugepage_init() -> rte_memseg_list_walk()

This introduces the risk of a deadlock when a concurrent write lock is
requested on the same memory_hotplug_lock. Address it locally by replacing
rte_memseg_list_walk() with rte_memseg_list_walk_thread_unsafe() on these
paths.

Bugzilla ID: 1277
Fixes: 832cecc03d77 ("rwlock: prevent readers from starving writers")
Cc: stable@dpdk.org

Signed-off-by: Artemy Kovalyov
---
 lib/eal/common/eal_common_dynmem.c   | 5 ++++-
 lib/eal/include/generic/rte_rwlock.h | 4 ++++
 lib/eal/linux/eal_memalloc.c         | 7 +++++--
 3 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/lib/eal/common/eal_common_dynmem.c b/lib/eal/common/eal_common_dynmem.c
index bdbbe23..95da55d 100644
--- a/lib/eal/common/eal_common_dynmem.c
+++ b/lib/eal/common/eal_common_dynmem.c
@@ -251,7 +251,10 @@
 	 */
 	memset(&dummy, 0, sizeof(dummy));
 	dummy.hugepage_sz = hpi->hugepage_sz;
-	if (rte_memseg_list_walk(hugepage_count_walk, &dummy) < 0)
+	/* memory_hotplug_lock is held during initialization, so it's
+	 * safe to call thread-unsafe version.
+	 */
+	if (rte_memseg_list_walk_thread_unsafe(hugepage_count_walk, &dummy) < 0)
 		return -1;
 
 	for (i = 0; i < RTE_DIM(dummy.num_pages); i++) {
diff --git a/lib/eal/include/generic/rte_rwlock.h b/lib/eal/include/generic/rte_rwlock.h
index 9e083bb..c98fc7d 100644
--- a/lib/eal/include/generic/rte_rwlock.h
+++ b/lib/eal/include/generic/rte_rwlock.h
@@ -80,6 +80,10 @@
 /**
  * Take a read lock. Loop until the lock is held.
  *
+ * @note The RW lock isn't recursive, so calling this function on the same
+ * lock twice without releasing it could potentially result in a deadlock
+ * scenario when a write lock is involved.
+ *
  * @param rwl
  *   A pointer to a rwlock structure.
  */
diff --git a/lib/eal/linux/eal_memalloc.c b/lib/eal/linux/eal_memalloc.c
index f8b1588..9853ec7 100644
--- a/lib/eal/linux/eal_memalloc.c
+++ b/lib/eal/linux/eal_memalloc.c
@@ -1740,7 +1740,10 @@ struct rte_memseg *
 		eal_get_internal_configuration();
 
 	if (rte_eal_process_type() == RTE_PROC_SECONDARY)
-		if (rte_memseg_list_walk(secondary_msl_create_walk, NULL) < 0)
+		/* memory_hotplug_lock is held during initialization, so it's
+		 * safe to call thread-unsafe version.
+		 */
+		if (rte_memseg_list_walk_thread_unsafe(secondary_msl_create_walk, NULL) < 0)
 			return -1;
 
 	if (rte_eal_process_type() == RTE_PROC_PRIMARY &&
 			internal_conf->in_memory) {
@@ -1778,7 +1781,7 @@ struct rte_memseg *
 	}
 
 	/* initialize all of the fd lists */
-	if (rte_memseg_list_walk(fd_list_create_walk, NULL))
+	if (rte_memseg_list_walk_thread_unsafe(fd_list_create_walk, NULL))
 		return -1;
 	return 0;
 }
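
For readers unfamiliar with the hazard described in the commit message and in
the @note added to rte_rwlock.h, here is a minimal standalone sketch (not part
of the patch) of how a nested read lock deadlocks once a writer is queued. It
uses POSIX rwlocks with glibc's writer-preference attribute as a stand-in for
the non-recursive rte_rwlock_t; all names in it are illustrative only.

/* Illustration only: nested rdlock on a writer-preferring, non-recursive
 * rwlock deadlocks as soon as a writer queues between the two acquisitions.
 */
#define _GNU_SOURCE
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_rwlock_t lock;

static void *writer(void *arg)
{
	(void)arg;
	sleep(1);                     /* let main() take its first read lock */
	pthread_rwlock_wrlock(&lock); /* queued writer now blocks new readers */
	pthread_rwlock_unlock(&lock);
	return NULL;
}

int main(void)
{
	pthread_rwlockattr_t attr;
	pthread_t tid;

	pthread_rwlockattr_init(&attr);
	/* glibc-specific writer preference, used here as a rough analogue of
	 * the behaviour introduced by commit 832cecc03d77
	 */
	pthread_rwlockattr_setkind_np(&attr,
			PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP);
	pthread_rwlock_init(&lock, &attr);

	pthread_create(&tid, NULL, writer, NULL);

	pthread_rwlock_rdlock(&lock); /* outer lock, held across initialization */
	sleep(2);                     /* give the writer time to queue up */
	pthread_rwlock_rdlock(&lock); /* nested read lock: blocks forever here */

	printf("never reached while the writer is queued\n");
	return 0;
}

Built with cc -pthread, the second rdlock never returns once the writer has
queued. This is the same interleaving that calling the thread-unsafe walk
variants under the already-held memory_hotplug_lock avoids during init.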