From patchwork Wed Jan 24 13:45:11 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Rahul Gupta
X-Patchwork-Id: 136106
X-Patchwork-Delegate: david.marchand@redhat.com
From: Rahul Gupta
To: dev@dpdk.org, thomas@monjalon.net, bruce.richardson@intel.com,
 dmitry.kozliuk@gmail.com, stephen@networkplumber.org
Cc: sovaradh@linux.microsoft.com, okaya@kernel.org,
 sujithsankar@microsoft.com, sowmini.varadhan@microsoft.com,
 krathinavel@microsoft.com, rahulrgupta27@gmail.com, Rahul Gupta
Subject: [dpdk-dev] [PATCH v4] eal: refactor rte_eal_init into sub-functions
Date: Wed, 24 Jan 2024 05:45:11 -0800
Message-Id: <1706103911-6907-1-git-send-email-rahulgupt@linux.microsoft.com>
X-Mailer: git-send-email 1.8.3.1
List-Id: DPDK patches and discussions

From: Rahul Gupta

This patch continues the discussion from the following email:
(https://inbox.dpdk.org/dev/20231110172523.GA17466@microsoft.com/)

Initialization requires rte_eal_init() plus rte_pktmbuf_pool_create(),
which together can take 500-600 ms:
a) For many devices, FLR (function-level reset) may take a significant
   chunk of time (200-250 ms in our use case). This FLR is triggered
   during device probe in rte_eal_init().
b) rte_pktmbuf_pool_create() can take up to 300-350 ms for applications
   that require a large amount of memory.

This cost is incurred on each restart (which happens in our use case
during binary updates for servicing).
This patch provides an optimization using pthreads that applications
can use and which can save 200-230 ms.

In this patch, rte_eal_init() is refactored into two parts:
a) The first part is the dependent code, i.e. it is a prerequisite of
   the FLR and mempool creation, so it must be executed before any
   pthreads are started. It is named rte_eal_init_setup().
b) The second part is the independent code, i.e. it can execute in a
   pthread in parallel with mempool creation. It is named
   rte_eal_init_async_setup().

Existing applications require no changes unless they wish to leverage
the optimization.
If an application wants to leverage this optimization, it needs to call
rte_eal_init_async() instead of rte_eal_init(). It can then create a
thread using rte_eal_remote_launch() to schedule a task that it would
like to do in parallel with rte_eal_init_async_setup(); this task can be
mbuf pool creation using rte_pktmbuf_pool_create().

After this, if the next operations require completion of the above task,
the user can call rte_eal_init_wait_async_setup_complete(); to just
check the status of that thread, use rte_eal_init_async_setup_done().

---
v2: Address Stephen Hemminger's comment
---
v3: Address support for single lcore
---
v4: Address Bruce Richardson's and Stephen Hemminger's comments
    Existing applications need not make any changes if the bootup
    optimization is not needed.

 app/test-pmd/testpmd.c    |  24 ++++++++-
 lib/eal/include/rte_eal.h | 107 ++++++++++++++++++++++++++++++++++++++
 lib/eal/linux/eal.c       |  62 ++++++++++++++++++++--
 lib/eal/version.map       |   7 +++
 4 files changed, 196 insertions(+), 4 deletions(-)

diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 9e4e99e53b..c8eb194f64 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -4531,6 +4531,8 @@ main(int argc, char** argv)
 	portid_t port_id;
 	uint16_t count;
 	int ret;
+	int lcore_id;
+	int main_lcore_id;
 
 #ifdef RTE_EXEC_ENV_WINDOWS
 	signal(SIGINT, signal_handler);
@@ -4550,11 +4552,31 @@ main(int argc, char** argv)
 		rte_exit(EXIT_FAILURE, "Cannot register log type");
 	rte_log_set_level(testpmd_logtype, RTE_LOG_DEBUG);
 
-	diag = rte_eal_init(argc, argv);
+	diag = rte_eal_init_async(argc, argv);
 	if (diag < 0)
 		rte_exit(EXIT_FAILURE, "Cannot init EAL: %s\n",
 			 rte_strerror(rte_errno));
 
+	main_lcore_id = rte_get_main_lcore();
+	lcore_id = rte_get_next_lcore(main_lcore_id, 0, 1);
+	/* Gives status of rte_eal_init_async() */
+	if (main_lcore_id != lcore_id)
+		while (rte_eal_init_async_setup_done(lcore_id) == 0)
+			;
+
+	/*
+	 * Use rte_eal_init_wait_async_setup_complete() to get the return value of
+	 * rte_eal_init_async().
+	 * Or
+	 * if the testpmd application does not want to know the progress/status of
+	 * rte_eal_init_async() and just wants to wait until it finishes,
+	 * then use the following function.
+	 */
+	ret = rte_eal_init_wait_async_setup_complete();
+	if (ret < 0)
+		rte_exit(EXIT_FAILURE, "Cannot init EAL: "
+			 "rte_eal_init_async() failed: %s\n",
+			 strerror(ret));
 	/* allocate port structures, and init them */
 	init_port();
 
diff --git a/lib/eal/include/rte_eal.h b/lib/eal/include/rte_eal.h
index c2256f832e..6d7044b632 100644
--- a/lib/eal/include/rte_eal.h
+++ b/lib/eal/include/rte_eal.h
@@ -111,6 +111,113 @@ int rte_eal_iopl_init(void);
  */
 int rte_eal_init(int argc, char **argv);
 
+/**
+ * Initialize the Environment Abstraction Layer (EAL).
+ *
+ * This function is to be executed on the MAIN lcore only, as soon
+ * as possible in the application's main() function.
+ * It puts the WORKER lcores in the WAIT state.
+ *
+ * @param argc
+ *   A non-negative value. If it is greater than 0, the array members
+ *   for argv[0] through argv[argc] (non-inclusive) shall contain pointers
+ *   to strings.
+ * @param argv
+ *   An array of strings. The contents of the array, as well as the strings
+ *   which are pointed to by the array, may be modified by this function.
+ *   The program name pointer argv[0] is copied into the last parsed argv
+ *   so that argv[0] is still the same after deducing the parsed arguments.
+ * @return
+ *   - On success, the number of parsed arguments, which is greater or
+ *     equal to zero. After the call to rte_eal_init_async(),
+ *     all arguments argv[x] with x < ret may have been modified by this
+ *     function call and should not be further interpreted by the
+ *     application. The EAL does not take any ownership of the memory used
+ *     for either the argv array, or its members.
+ *   - On failure, -1 and rte_errno is set to a value indicating the cause
+ *     for failure. In some instances, the application will need to be
+ *     restarted as part of clearing the issue.
+ *
+ *   Error codes returned via rte_errno:
+ *     EACCES indicates a permissions issue.
+ *
+ *     EAGAIN indicates either a bus or system resource was not available,
+ *            setup may be attempted again.
+ *
+ *     EALREADY indicates that the rte_eal_init_async function has already been
+ *              called, and cannot be called again.
+ *
+ *     EFAULT indicates the tailq configuration name was not found in
+ *            memory configuration.
+ *
+ *     EINVAL indicates invalid parameters were passed as argv/argc.
+ *
+ *     ENOMEM indicates failure likely caused by an out-of-memory condition.
+ *
+ *     ENODEV indicates memory setup issues.
+ *
+ *     ENOTSUP indicates that the EAL cannot initialize on this system.
+ *
+ *     EPROTO indicates that the PCI bus is either not present, or is not
+ *            readable by the eal.
+ *
+ *     ENOEXEC indicates that a service core failed to launch successfully.
+ */
+int rte_eal_init_async(int argc, char **argv);
+
+/**
+ * Initialize the Environment Abstraction Layer (EAL): Initial setup
+ *
+ * It is called from rte_eal_init() on the MAIN lcore only and must NOT be called
+ * directly by the user application.
+ * The driver-dependent code is in this function, i.e. this function must complete
+ * successfully before any other EAL library function is called.
+ *
+ * The return value is the same as for rte_eal_init().
+ */
+
+__rte_experimental
+int rte_eal_init_setup(int argc, char **argv);
+
+/**
+ * Initialize the Environment Abstraction Layer (EAL): FLR and probe device
+ *
+ * It is launched by rte_eal_init_async() and must NOT be called directly by the user application.
+ * It is launched on the next available worker lcore.
+ * The initialisation needed for memory pool creation is already complete when this function runs,
+ * so this code can be executed in parallel with non-device-related operations
+ * like mbuf pool creation.
+ *
+ * The return value is the same as for rte_eal_init().
+ */
+__rte_experimental
+int rte_eal_init_async_setup(__attribute__((unused)) void *arg);
+
+/**
+ * Initialize the Environment Abstraction Layer (EAL): Indication of rte_eal_init() completion
+ *
+ * It waits for rte_eal_init_async() to finish. It MUST be called by the application
+ * when a thread join is needed. Typically the application calls this function after
+ * it has performed all device-independent operations (like mbuf pool creation) on the initial lcore.
+ *
+ * The return value is the same as for rte_eal_init().
+ */
+__rte_experimental
+int rte_eal_init_wait_async_setup_complete(void);
+
+/**
+ * Initialize the Environment Abstraction Layer (EAL): Indication of rte_eal_init() completion
+ *
+ * It shows the status of rte_eal_init_async(), i.e. whether it is still executing or has completed.
+ * It MUST be called by the application.
+ * Typically an application calls this function when it wants to know the status of
+ * rte_eal_init_async() (i.e. the FLR and probe thread).
+ *
+ * Returns non-zero if the given lcore is in the WAIT state (setup finished), 0 otherwise.
+ */
+__rte_experimental
+int rte_eal_init_async_setup_done(int lcore_id);
+
 /**
  * Clean up the Environment Abstraction Layer (EAL)
  *
diff --git a/lib/eal/linux/eal.c b/lib/eal/linux/eal.c
index fd422f1f62..e86aac5980 100644
--- a/lib/eal/linux/eal.c
+++ b/lib/eal/linux/eal.c
@@ -962,9 +962,8 @@ eal_worker_thread_create(unsigned int lcore_id)
 	return ret;
 }
 
-/* Launch threads, called at application init(). */
-int
-rte_eal_init(int argc, char **argv)
+__rte_experimental
+int rte_eal_init_setup(int argc, char **argv)
 {
 	int i, fctret, ret;
 	static RTE_ATOMIC(uint32_t) run_once;
@@ -1268,7 +1267,15 @@ rte_eal_init(int argc, char **argv)
 	 */
 	rte_eal_mp_remote_launch(sync_func, NULL, SKIP_MAIN);
 	rte_eal_mp_wait_lcore();
+	return fctret;
+}
 
+__rte_experimental int
+rte_eal_init_async_setup(__attribute__((unused)) void *arg)
+{
+	int ret = 0;
+	struct internal_config *internal_conf =
+		eal_get_internal_configuration();
 	/* initialize services so vdevs register service during bus_probe. */
 	ret = rte_service_init();
 	if (ret) {
@@ -1322,6 +1329,55 @@ rte_eal_init(int argc, char **argv)
 
 	eal_mcfg_complete();
 
+	return 0;
+}
+
+/*
+ * Waits until the function executing on the given lcore finishes.
+ * Returns the value returned by the function executing on that lcore.
+ */
+__rte_experimental int
+rte_eal_init_wait_async_setup_complete(void)
+{
+	int lcore_id = -1;
+	lcore_id = rte_lcore_id();
+	lcore_id = rte_get_next_lcore(lcore_id, 0, 1);
+	int ret = rte_eal_wait_lcore(lcore_id);
+	return ret;
+}
+
+/*
+ * Returns the current status of execution on a given lcore.
+ */
+__rte_experimental int
+rte_eal_init_async_setup_done(int lcore_id)
+{
+	int ret = (lcore_config[lcore_id].state);
+	return (ret == WAIT);
+}
+
+/* Launch threads, called at application init(). */
+int
+rte_eal_init(int argc, char **argv)
+{
+	int fctret = rte_eal_init_setup(argc, argv);
+	if (fctret < 0)
+		return fctret;
+	return rte_eal_init_async_setup(NULL);
+}
+
+/* Launch threads, called at application init(). */
+__rte_experimental int
+rte_eal_init_async(int argc, char **argv)
+{
+	int lcore_id;
+	int fctret = rte_eal_init_setup(argc, argv); /* initial lcore */
+	if (fctret < 0)
+		return fctret;
+	lcore_id = rte_lcore_id();
+	lcore_id = rte_get_next_lcore(lcore_id, 0, 1);
+	/* running on a worker lcore */
+	rte_eal_remote_launch(rte_eal_init_async_setup, NULL, lcore_id);
 	return fctret;
 }
 
diff --git a/lib/eal/version.map b/lib/eal/version.map
index 5e0cd47c82..5e7ccb67c4 100644
--- a/lib/eal/version.map
+++ b/lib/eal/version.map
@@ -393,6 +393,13 @@ EXPERIMENTAL {
 	# added in 23.07
 	rte_memzone_max_get;
 	rte_memzone_max_set;
+
+	# added in 24.01
+	rte_eal_init_async;
+	rte_eal_init_setup;
+	rte_eal_init_async_setup;
+	rte_eal_init_async_setup_done;
+	rte_eal_init_wait_async_setup_complete;
 };
 
 INTERNAL {
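
The call order described in the commit message (start the probe phase on another thread, create the mbuf pool in the meantime, then join) can be sketched without DPDK as a plain pthread program. This is only an illustrative sketch under stated assumptions: probe_phase(), create_pool() and split_init() are hypothetical stand-ins that are not part of this patch or of DPDK. They play the roles of rte_eal_init_async_setup(), rte_pktmbuf_pool_create() and the application's init path respectively.

```c
#include <pthread.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the FLR/probe phase, i.e. the role that
 * rte_eal_init_async_setup() plays in the patch. It reports its result
 * through the int that arg points to (0 means success in this sketch).
 */
static void *probe_phase(void *arg)
{
	int *status = arg;

	*status = 0;
	return NULL;
}

/* Hypothetical stand-in for rte_pktmbuf_pool_create(); 0 means success. */
static int create_pool(void)
{
	return 0;
}

/*
 * Mirrors the call order from the commit message:
 *  1. launch the probe phase on another thread
 *     (what rte_eal_init_async() does after the common setup),
 *  2. create the pool on the calling thread in the meantime,
 *  3. join before any device is touched
 *     (the role of rte_eal_init_wait_async_setup_complete()).
 * Returns 0 when both phases succeeded, -1 otherwise.
 */
static int split_init(void)
{
	pthread_t tid;
	int probe_status = -1;
	int pool_status;

	if (pthread_create(&tid, NULL, probe_phase, &probe_status) != 0)
		return -1;

	pool_status = create_pool();

	/* The join orders the worker's write to probe_status before this read. */
	if (pthread_join(tid, NULL) != 0)
		return -1;

	return (probe_status == 0 && pool_status == 0) ? 0 : -1;
}
```

Calling split_init() from main() (built with -pthread) returns 0 when both phases succeed. With this ordering the wall-clock cost of the two phases is roughly the longer of the two rather than their sum, which is the saving the commit message describes.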