From patchwork Fri Aug 23 09:29:54 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Burakov, Anatoly" X-Patchwork-Id: 143362 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id BC0E945843; Fri, 23 Aug 2024 11:30:02 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 8EEE6402E6; Fri, 23 Aug 2024 11:30:02 +0200 (CEST) Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.19]) by mails.dpdk.org (Postfix) with ESMTP id C10CF402CA for ; Fri, 23 Aug 2024 11:30:00 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1724405401; x=1755941401; h=from:to:subject:date:message-id:mime-version: content-transfer-encoding; bh=MtS8Y88pKGY+hd8je3nFO88cNRxkQvf59FDS8SZYH1Q=; b=JKAMzSQ9PsL7AfnLFdlZ/Gz29Hbtq/gT1ccdL/AtkIjIRbU0fB87QVSA S0Gw6HmFFwLWf7zpFoMRMsNbCZ9AoBy/KCCMr/7saOEDlW0GCliuxKzhZ NRiQDuGGaovj00fNE99zASBjdaqv6f/iCqCgHWI6yduTWfv88klcX2Pwi nnY9TrftyTjAhslpqwz8wNtxIrMRtTyXfzwRW4wUpD582rAKpMBowC48x pSfOWqmdPN6ugeA/iH16qsIUDluxVFfgfDVyAZio1hvInq8C6HqSGn0l3 7Zl6WLlUUM81oSfr8ICUCmBwFLNrhypblgDEMVog2HrrRyL+atMGiBtNI w==; X-CSE-ConnectionGUID: pBIS7WJ8T3CEpg8/NSO3dg== X-CSE-MsgGUID: keiuV3TJQZq/AUfiEMszgQ== X-IronPort-AV: E=McAfee;i="6700,10204,11172"; a="22727266" X-IronPort-AV: E=Sophos;i="6.10,169,1719903600"; d="scan'208";a="22727266" Received: from fmviesa006.fm.intel.com ([10.60.135.146]) by orvoesa111.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 23 Aug 2024 02:29:59 -0700 X-CSE-ConnectionGUID: +jrbMLJ1TCOtT9FomWog6g== X-CSE-MsgGUID: Wh/Lrck4T2aI/cTSHPUe1g== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.10,169,1719903600"; d="scan'208";a="61417092" Received: from silpixa00401119.ir.intel.com ([10.55.129.167]) by fmviesa006.fm.intel.com with ESMTP; 23 Aug 2024 02:29:59 -0700 From: Anatoly Burakov To: dev@dpdk.org Subject: [PATCH v1 1/3] fbarray: rename tests to be more meaningful Date: Fri, 23 Aug 2024 10:29:54 +0100 Message-ID: <47e02f706d284a2e2a49db51ce75081e62aee393.1724405387.git.anatoly.burakov@intel.com> X-Mailer: git-send-email 2.43.5 MIME-Version: 1.0 X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Some tests reference internal implementation details of fbarray, as well as equivocate between mask and index. Most other test names are not very descriptive. Rename them, and adjust comments to explain things in terms of what the tests actually do, instead of referring to internal implementation details. Also, add more tests that fill up exactly one mask, with and without neighbouring set bits. Signed-off-by: Anatoly Burakov --- app/test/test_fbarray.c | 99 ++++++++++++++++++++++++++++------------- 1 file changed, 67 insertions(+), 32 deletions(-) diff --git a/app/test/test_fbarray.c b/app/test/test_fbarray.c index 09f6907fb1..6ca509b898 100644 --- a/app/test/test_fbarray.c +++ b/app/test/test_fbarray.c @@ -104,7 +104,7 @@ static int first_msk_test_setup(void) return init_aligned(); } -static int cross_msk_test_setup(void) +static int contig_test_setup(void) { /* put all within second and third mask */ param.start = 70; @@ -112,7 +112,7 @@ static int cross_msk_test_setup(void) return init_aligned(); } -static int multi_msk_test_setup(void) +static int large_contig_test_setup(void) { /* put all within first and last mask */ param.start = 3; @@ -128,15 +128,39 @@ static int last_msk_test_setup(void) return init_aligned(); } +static int full_index_test_setup(void) +{ + /* fill entire index */ + param.start = 0; + param.end = FBARRAY_TEST_LEN - 1; + return init_aligned(); +} + static int full_msk_test_setup(void) { - /* fill entire mask */ + /* fill one mask */ param.start = 0; - param.end = FBARRAY_TEST_LEN - 1; + param.end = 63; return init_aligned(); } -static int lookahead_test_setup(void) +static int full_msk_contig_fwd_test_setup(void) +{ + /* fill one mask plus one item */ + param.start = 64; + param.end = 128; + return init_aligned(); +} + +static int full_msk_contig_rev_test_setup(void) +{ + /* fill one mask plus one item */ + param.start = 63; + param.end = 127; + return init_aligned(); +} + +static int cross_msk_test_setup(void) { /* set index 64 as used */ param.start = 64; @@ -144,7 +168,7 @@ static int lookahead_test_setup(void) return init_aligned(); } -static int lookbehind_test_setup(void) +static int cross_msk_rev_test_setup(void) { /* set index 63 as used */ param.start = 63; @@ -160,6 +184,13 @@ static int unaligned_test_setup(void) return init_unaligned(); } +static int full_unaligned_test_setup(void) +{ + unaligned.start = 0; + unaligned.end = FBARRAY_UNALIGNED_TEST_LEN - 1; + return init_unaligned(); +} + static int test_invalid(void) { struct rte_fbarray dummy; @@ -786,7 +817,7 @@ static int test_empty(void) return TEST_SUCCESS; } -static int test_lookahead(void) +static int test_cross_msk(void) { int ret; @@ -801,7 +832,7 @@ static int test_lookahead(void) return TEST_SUCCESS; } -static int test_lookbehind(void) +static int test_cross_rev_msk(void) { int ret, free_len = 2; @@ -816,19 +847,19 @@ static int test_lookbehind(void) return TEST_SUCCESS; } -static int test_lookahead_mask(void) +static int test_broken_run(void) { /* - * There is a certain type of lookahead behavior we want to test here, - * namely masking of bits that were scanned with lookahead but that we - * know do not match our criteria. This is achieved in following steps: + * There is a certain type of search behavior we want to test here, + * namely starting cross-mask runs and failing to find them. This is + * achieved when these conditions happen: * * 0. Look for a big enough chunk of free space (say, 62 elements) - * 1. Trigger lookahead by breaking a run somewhere inside mask 0 - * (indices 0-63) - * 2. Fail lookahead by breaking the run somewhere inside mask 1 - * (indices 64-127) - * 3. Ensure that we can still find free space in mask 1 afterwards + * 1. Break a run somewhere inside mask 0 (indices 0-63) but leave + * some free elements at the end of mask 0 to start a run + * 2. Break the run somewhere inside mask 1 (indices 64-127) + * 3. Ensure that we can still find a free space run right after the + * second broken run */ /* break run on first mask */ @@ -842,19 +873,19 @@ static int test_lookahead_mask(void) return TEST_SUCCESS; } -static int test_lookbehind_mask(void) +static int test_rev_broken_run(void) { /* - * There is a certain type of lookbehind behavior we want to test here, - * namely masking of bits that were scanned with lookbehind but that we - * know do not match our criteria. This is achieved in two steps: + * There is a certain type of search behavior we want to test here, + * namely starting cross-mask runs and failing to find them. This is + * achieved when these conditions happen: * * 0. Look for a big enough chunk of free space (say, 62 elements) - * 1. Trigger lookbehind by breaking a run somewhere inside mask 2 - * (indices 128-191) - * 2. Fail lookbehind by breaking the run somewhere inside mask 1 - * (indices 64-127) - * 3. Ensure that we can still find free space in mask 1 afterwards + * 1. Break a run somewhere inside mask 2 (indices 128-191) but leave + * some free elements at the beginning of mask 2 to start a run + * 2. Break the run somewhere inside mask 1 (indices 64-127) + * 3. Ensure that we can still find free space N elements down from + * our last broken run (inside mask 0 in this case) */ /* break run on mask 2 */ @@ -876,18 +907,22 @@ static struct unit_test_suite fbarray_test_suite = { TEST_CASE(test_invalid), TEST_CASE(test_basic), TEST_CASE_ST(first_msk_test_setup, reset_aligned, test_find), - TEST_CASE_ST(cross_msk_test_setup, reset_aligned, test_find), - TEST_CASE_ST(multi_msk_test_setup, reset_aligned, test_find), + TEST_CASE_ST(contig_test_setup, reset_aligned, test_find), + TEST_CASE_ST(large_contig_test_setup, reset_aligned, test_find), TEST_CASE_ST(last_msk_test_setup, reset_aligned, test_find), TEST_CASE_ST(full_msk_test_setup, reset_aligned, test_find), + TEST_CASE_ST(full_msk_contig_fwd_test_setup, reset_aligned, test_find), + TEST_CASE_ST(full_msk_contig_rev_test_setup, reset_aligned, test_find), + TEST_CASE_ST(full_index_test_setup, reset_aligned, test_find), /* empty test does not need setup */ TEST_CASE_ST(NULL, reset_aligned, test_empty), - TEST_CASE_ST(lookahead_test_setup, reset_aligned, test_lookahead), - TEST_CASE_ST(lookbehind_test_setup, reset_aligned, test_lookbehind), + TEST_CASE_ST(cross_msk_test_setup, reset_aligned, test_cross_msk), + TEST_CASE_ST(cross_msk_rev_test_setup, reset_aligned, test_cross_rev_msk), /* setup for these tests is more complex so do it in test func */ - TEST_CASE_ST(NULL, reset_aligned, test_lookahead_mask), - TEST_CASE_ST(NULL, reset_aligned, test_lookbehind_mask), + TEST_CASE_ST(NULL, reset_aligned, test_broken_run), + TEST_CASE_ST(NULL, reset_aligned, test_rev_broken_run), TEST_CASE_ST(unaligned_test_setup, reset_unaligned, test_find_unaligned), + TEST_CASE_ST(full_unaligned_test_setup, reset_unaligned, test_find_unaligned), TEST_CASES_END() } }; From patchwork Fri Aug 23 09:29:55 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Burakov, Anatoly" X-Patchwork-Id: 143363 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 0B3CD45843; Fri, 23 Aug 2024 11:30:10 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 84093432E6; Fri, 23 Aug 2024 11:30:04 +0200 (CEST) Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.19]) by mails.dpdk.org (Postfix) with ESMTP id 899FD402CA for ; Fri, 23 Aug 2024 11:30:01 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1724405401; x=1755941401; h=from:to:subject:date:message-id:in-reply-to:references: mime-version:content-transfer-encoding; bh=t0j3XvlV9G2eQOuNKC20nAHt7cBfmkeDOPNH8diB6q4=; b=fQHiZ6TGjgyU8ooB0FvENmWvqXY8c1tDwkbxmVhJic9ebQ0YvwG7lF8i U1TRLpbbTIsm/9Hd7sTALWc5s2Z22lPQ1n3ySUS2CV70IwCXr/QqtMxwB Xl3zaJyI5+R1H6qIRUzSw44FWFpN7a+Yl7FkII5n/Ou+HvS6mBveFd1gH 1T96Whfxbl0WNZNgTh3VDQ/e+K9nJZRGaIhkoO5TQi6VlvtPjAMMzu2pG YUp4eVoRtWuWNe2JgjT792Wiy4gks8JlQVBqho5NnxxNBdNOnmNDxKH3P hXIbAKgMUB40qDSTDmQliebvU9h7qy+QBzOwQoRHYJqDjZz2kXl7ZRr04 A==; X-CSE-ConnectionGUID: ZZtrv1mIRzaMGgpoXYxCLw== X-CSE-MsgGUID: mKk2W5O4RcCGcPo3f4oYxQ== X-IronPort-AV: E=McAfee;i="6700,10204,11172"; a="22727270" X-IronPort-AV: E=Sophos;i="6.10,169,1719903600"; d="scan'208";a="22727270" Received: from fmviesa006.fm.intel.com ([10.60.135.146]) by orvoesa111.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 23 Aug 2024 02:30:01 -0700 X-CSE-ConnectionGUID: wQtY6ai/SYiT4MurUYtKWA== X-CSE-MsgGUID: club+aB4Q9Oc5LHOBp0agQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.10,169,1719903600"; d="scan'208";a="61417098" Received: from silpixa00401119.ir.intel.com ([10.55.129.167]) by fmviesa006.fm.intel.com with ESMTP; 23 Aug 2024 02:30:00 -0700 From: Anatoly Burakov To: dev@dpdk.org, Tyler Retzlaff Subject: [PATCH v1 2/3] fbarray: rework find_next_n to flatten the loop Date: Fri, 23 Aug 2024 10:29:55 +0100 Message-ID: X-Mailer: git-send-email 2.43.5 In-Reply-To: <47e02f706d284a2e2a49db51ce75081e62aee393.1724405387.git.anatoly.burakov@intel.com> References: <47e02f706d284a2e2a49db51ce75081e62aee393.1724405387.git.anatoly.burakov@intel.com> MIME-Version: 1.0 X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Currently, find_next_n() is implemented as a nested loop due to lookahead functionality. This is not very efficient because when doing lookahead, we essentially scan some of the indices twice, and in general the lookahead functionality has been a source of bugs because it is overcomplicated. The bit ignore feature on lookahead is also unnecessary because we don't really win anything by ignoring bits we have already scanned, as they would not trigger any matches anyway. This patch reworks find_next_n() to flatten the loop and remove the lookahead and bit-ignore functionality, instead replacing it with state machine-like behavior. This makes the code simpler to reason about. Signed-off-by: Anatoly Burakov --- lib/eal/common/eal_common_fbarray.c | 195 +++++++++++++--------------- 1 file changed, 93 insertions(+), 102 deletions(-) diff --git a/lib/eal/common/eal_common_fbarray.c b/lib/eal/common/eal_common_fbarray.c index 22b43073c6..d38a43d8d1 100644 --- a/lib/eal/common/eal_common_fbarray.c +++ b/lib/eal/common/eal_common_fbarray.c @@ -117,9 +117,11 @@ find_next_n(const struct rte_fbarray *arr, unsigned int start, unsigned int n, { const struct used_mask *msk = get_used_mask(arr->data, arr->elt_sz, arr->len); - unsigned int msk_idx, lookahead_idx, first, first_mod; + unsigned int msk_idx, first, first_mod; unsigned int last, last_mod; - uint64_t last_msk, ignore_msk; + uint64_t last_msk, first_msk; + unsigned int run_start, left = 0; + bool run_started = false; /* * mask only has granularity of MASK_ALIGN, but start may not be aligned @@ -128,7 +130,7 @@ find_next_n(const struct rte_fbarray *arr, unsigned int start, unsigned int n, */ first = MASK_LEN_TO_IDX(start); first_mod = MASK_LEN_TO_MOD(start); - ignore_msk = ~((1ULL << first_mod) - 1); + first_msk = ~((1ULL << first_mod) - 1); /* array length may not be aligned, so calculate ignore mask for last * mask index. @@ -137,131 +139,120 @@ find_next_n(const struct rte_fbarray *arr, unsigned int start, unsigned int n, last_mod = MASK_LEN_TO_MOD(arr->len); last_msk = ~(UINT64_MAX << last_mod); + left = n; + for (msk_idx = first; msk_idx < msk->n_masks; msk_idx++) { - uint64_t cur_msk, lookahead_msk; - unsigned int run_start, clz, left; - bool found = false; + unsigned int s_idx, clz, need; + uint64_t cur_msk, tmp_msk; + /* - * The process of getting n consecutive bits for arbitrary n is - * a bit involved, but here it is in a nutshell: + * In order to find N consecutive bits for arbitrary N, we need + * to be aware of the following: * - * 1. let n be the number of consecutive bits we're looking for - * 2. check if n can fit in one mask, and if so, do n-1 - * rshift-ands to see if there is an appropriate run inside - * our current mask - * 2a. if we found a run, bail out early - * 2b. if we didn't find a run, proceed - * 3. invert the mask and count leading zeroes (that is, count - * how many consecutive set bits we had starting from the - * end of current mask) as k - * 3a. if k is 0, continue to next mask - * 3b. if k is not 0, we have a potential run - * 4. to satisfy our requirements, next mask must have n-k - * consecutive set bits right at the start, so we will do - * (n-k-1) rshift-ands and check if first bit is set. + * 1. To find N number of consecutive bits within a mask, we + * need to do N-1 rshift-ands and see if we still have set + * bits anywhere in the mask + * 2. N may be larger than mask size, in which case we need to + * do a search in multiple consecutive masks + * 3. For multi-mask search to be meaningful, we need to anchor + * our searches, i.e. first we find a run of M bits at the + * end of current mask, then we look for N-M bits at the + * beginning of next mask (or multiple masks) * - * Step 4 will need to be repeated if (n-k) > MASK_ALIGN until - * we either run out of masks, lose the run, or find what we - * were looking for. + * With all of the above, the algorihm looks as follows: + * + * 1. let N be the number of consecutive bits we're looking for + * 2. if we already started a run, check if we can continue it + * by looking for remainder of N at the beginning of current + * mask + * 3. if we lost a run or if we never had a run, we look for N + * bits anywhere within the current mask (up to mask size, + * we can finish this run in the next mask if N > mask size) + * 4. if we didn't find anything up to this point, check if any + * topmost bits of the mask are set (meaning we can start a + * run and finish it in the next mask) + * 5. at any point in steps 2-4, we may do an early exit due to + * finding what we were looking for, or continue searching + * further */ cur_msk = msk->data[msk_idx]; - left = n; /* if we're looking for free spaces, invert the mask */ if (!used) cur_msk = ~cur_msk; - /* combine current ignore mask with last index ignore mask */ + /* first and last mask may not be aligned */ + if (msk_idx == first) + cur_msk &= first_msk; if (msk_idx == last) - ignore_msk &= last_msk; + cur_msk &= last_msk; - /* if we have an ignore mask, ignore once */ - if (ignore_msk) { - cur_msk &= ignore_msk; - ignore_msk = 0; - } + /* do we have an active previous run? */ + if (run_started) { + /* figure out how many consecutive bits we need here */ + need = RTE_MIN(left, MASK_ALIGN); - /* if n can fit in within a single mask, do a search */ - if (n <= MASK_ALIGN) { - uint64_t tmp_msk = cur_msk; - unsigned int s_idx; - for (s_idx = 0; s_idx < n - 1; s_idx++) + /* see if we get a run of needed length */ + tmp_msk = cur_msk; + for (s_idx = 0; s_idx < need - 1; s_idx++) tmp_msk &= tmp_msk >> 1ULL; - /* we found what we were looking for */ - if (tmp_msk != 0) { - run_start = rte_ctz64(tmp_msk); - return MASK_GET_IDX(msk_idx, run_start); + + /* if first bit is set, we keep the run */ + if (tmp_msk & 1) { + left -= need; + + /* did we find what we were looking for? */ + if (left == 0) + return run_start; + + /* keep looking */ + continue; } + /* we lost the run, reset */ + run_started = false; + left = n; } - /* - * we didn't find our run within the mask, or n > MASK_ALIGN, - * so we're going for plan B. - */ + /* if we're here, we either lost the run or never had it */ + + /* figure out how many consecutive bits we need here */ + need = RTE_MIN(left, MASK_ALIGN); + + /* do a search */ + tmp_msk = cur_msk; + for (s_idx = 0; s_idx < need - 1; s_idx++) + tmp_msk &= tmp_msk >> 1ULL; + + /* have we found something? */ + if (tmp_msk != 0) { + /* figure out where the run started */ + run_start = MASK_GET_IDX(msk_idx, rte_ctz64(tmp_msk)); + run_started = true; + left -= need; + + /* do we need to look further? */ + if (left == 0) + return run_start; + + /* we need to keep looking */ + continue; + } + + /* we didn't find our run within current mask, go for plan B. */ /* count leading zeroes on inverted mask */ - if (~cur_msk == 0) - clz = sizeof(cur_msk) * 8; - else - clz = rte_clz64(~cur_msk); + clz = rte_clz64(~cur_msk); - /* if there aren't any runs at the end either, just continue */ + /* if there aren't any set bits at the end, just continue */ if (clz == 0) continue; - /* we have a partial run at the end, so try looking ahead */ - run_start = MASK_ALIGN - clz; + /* we have a partial run at the end */ + run_start = MASK_GET_IDX(msk_idx, MASK_ALIGN - clz); + run_started = true; left -= clz; - for (lookahead_idx = msk_idx + 1; lookahead_idx < msk->n_masks; - lookahead_idx++) { - unsigned int s_idx, need; - uint64_t first_bit = 1; - - lookahead_msk = msk->data[lookahead_idx]; - - /* if we're looking for free space, invert the mask */ - if (!used) - lookahead_msk = ~lookahead_msk; - - /* figure out how many consecutive bits we need here */ - need = RTE_MIN(left, MASK_ALIGN); - - /* count number of shifts we performed */ - for (s_idx = 0; s_idx < need - 1; s_idx++) { - lookahead_msk &= lookahead_msk >> 1ULL; - /* did we lose the run yet? */ - if ((lookahead_msk & first_bit) == 0) - break; - } - - /* if first bit is not set, we've lost the run */ - if ((lookahead_msk & first_bit) == 0) { - /* - * we've scanned this far, so we know there are - * no runs in the space we've lookahead-scanned - * as well, so skip that on next iteration. - */ - ignore_msk = ~((1ULL << (s_idx + 1)) - 1); - /* outer loop will increment msk_idx so add 1 */ - msk_idx = lookahead_idx - 1; - break; - } - - left -= need; - - /* check if we've found what we were looking for */ - if (left == 0) { - found = true; - break; - } - } - - /* we didn't find anything, so continue */ - if (!found) - continue; - - return MASK_GET_IDX(msk_idx, run_start); + /* we'll figure this out in the next iteration */ } /* we didn't find anything */ rte_errno = used ? ENOENT : ENOSPC; From patchwork Fri Aug 23 09:29:56 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Burakov, Anatoly" X-Patchwork-Id: 143364 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 11BF545843; Fri, 23 Aug 2024 11:30:16 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 9B74E432F4; Fri, 23 Aug 2024 11:30:05 +0200 (CEST) Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.19]) by mails.dpdk.org (Postfix) with ESMTP id 018B1432A2 for ; Fri, 23 Aug 2024 11:30:02 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1724405403; x=1755941403; h=from:to:subject:date:message-id:in-reply-to:references: mime-version:content-transfer-encoding; bh=iebN4f2Y0BA+Ds5Mw3y2x/SMLjGglFrpI3mA1UrxZVQ=; b=esJvQLttYdgVdMakDpvnxWET86Q7PckYUcSDA1i6ClyUf+XcNoL1KZ/l mjGy4QWdP2rQ8gbCZtOlh8ya3/iUbWlH0RJCHlx5mQRgY94Aw8wDH8uOj eEe/BsYufTJBDNnSEiw0xk0jX6R8B12bXnyyOo06CYATi5pxsiD3mqWoh 9HYCYeSgMy+9C/r8MFkV1wh45tL/Eyx3pjdYgjM/u9DPnbhSw466t2HwM ub/lssQ0OgE/XDvySq7aEfUnCfXCb/9oKcfNj82ViehAoo9rHQObbNOH4 Hmur3FNhpd9ZeJgc2wsHS1wrdE8kLBfBIBh9LmvJ64HETyylosi16aAwQ w==; X-CSE-ConnectionGUID: 8nYF4ylAR3K74Y0nCy3chg== X-CSE-MsgGUID: MOHUr2AwTvqlU+BZJSO6Vg== X-IronPort-AV: E=McAfee;i="6700,10204,11172"; a="22727277" X-IronPort-AV: E=Sophos;i="6.10,169,1719903600"; d="scan'208";a="22727277" Received: from fmviesa006.fm.intel.com ([10.60.135.146]) by orvoesa111.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 23 Aug 2024 02:30:02 -0700 X-CSE-ConnectionGUID: m2F6h08sSqiK1nYrBVySpA== X-CSE-MsgGUID: MVWB5lpYSXqnESxq/MXLOg== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.10,169,1719903600"; d="scan'208";a="61417112" Received: from silpixa00401119.ir.intel.com ([10.55.129.167]) by fmviesa006.fm.intel.com with ESMTP; 23 Aug 2024 02:30:01 -0700 From: Anatoly Burakov To: dev@dpdk.org, Tyler Retzlaff Subject: [PATCH v1 3/3] fbarray: rework find_prev_n to flatten the loop Date: Fri, 23 Aug 2024 10:29:56 +0100 Message-ID: <08803db4a09d57f0b81b452bc9520f16692b250b.1724405387.git.anatoly.burakov@intel.com> X-Mailer: git-send-email 2.43.5 In-Reply-To: <47e02f706d284a2e2a49db51ce75081e62aee393.1724405387.git.anatoly.burakov@intel.com> References: <47e02f706d284a2e2a49db51ce75081e62aee393.1724405387.git.anatoly.burakov@intel.com> MIME-Version: 1.0 X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Currently, find_prev_n() is implemented as a nested loop due to lookbehind functionality. This is not very efficient because when doing lookbehind, we essentially scan some of the indices twice, and in general the lookbehind functionality has been a source of bugs because it is overcomplicated. The bit ignore feature on lookbehind is also unnecessary because we don't really win anything by ignoring bits we have already scanned, as they would not trigger any matches anyway. This patch reworks find_prev_n() to flatten the loop and remove the lookbehind and bit-ignore functionality, instead replacing it with state machine-like behavior. This makes the code simpler to reason about. Signed-off-by: Anatoly Burakov --- lib/eal/common/eal_common_fbarray.c | 223 +++++++++++++--------------- 1 file changed, 101 insertions(+), 122 deletions(-) diff --git a/lib/eal/common/eal_common_fbarray.c b/lib/eal/common/eal_common_fbarray.c index d38a43d8d1..a2a19af14a 100644 --- a/lib/eal/common/eal_common_fbarray.c +++ b/lib/eal/common/eal_common_fbarray.c @@ -382,164 +382,143 @@ find_prev_n(const struct rte_fbarray *arr, unsigned int start, unsigned int n, { const struct used_mask *msk = get_used_mask(arr->data, arr->elt_sz, arr->len); - unsigned int msk_idx, lookbehind_idx, first, first_mod; - uint64_t ignore_msk; + /* we're going backwards so we need negative space */ + int64_t msk_idx; + unsigned int first, first_mod; + uint64_t first_msk; + unsigned int run_end, left; + bool run_started = false; /* * mask only has granularity of MASK_ALIGN, but start may not be aligned * on that boundary, so construct a special mask to exclude anything we - * don't want to see to avoid confusing ctz. + * don't want to see to avoid confusing clz. this "first" mask is + * actually our last because we're going backwards, so no second mask + * is required like in find_next_n case. */ first = MASK_LEN_TO_IDX(start); first_mod = MASK_LEN_TO_MOD(start); /* we're going backwards, so mask must start from the top */ - ignore_msk = first_mod == MASK_ALIGN - 1 ? + first_msk = first_mod == MASK_ALIGN - 1 ? UINT64_MAX : /* prevent overflow */ ~(UINT64_MAX << (first_mod + 1)); + left = n; + /* go backwards, include zero */ - msk_idx = first; - do { - uint64_t cur_msk, lookbehind_msk; - unsigned int run_start, run_end, ctz, left; - bool found = false; + for (msk_idx = first; msk_idx >= 0; msk_idx--) { + unsigned int s_idx, ctz, need; + uint64_t cur_msk, tmp_msk; + /* - * The process of getting n consecutive bits from the top for - * arbitrary n is a bit involved, but here it is in a nutshell: + * In order to find N consecutive bits for arbitrary N, we need + * to be aware of the following: * - * 1. let n be the number of consecutive bits we're looking for - * 2. check if n can fit in one mask, and if so, do n-1 - * lshift-ands to see if there is an appropriate run inside - * our current mask - * 2a. if we found a run, bail out early - * 2b. if we didn't find a run, proceed - * 3. invert the mask and count trailing zeroes (that is, count - * how many consecutive set bits we had starting from the - * start of current mask) as k - * 3a. if k is 0, continue to next mask - * 3b. if k is not 0, we have a potential run - * 4. to satisfy our requirements, next mask must have n-k - * consecutive set bits at the end, so we will do (n-k-1) - * lshift-ands and check if last bit is set. + * 1. To find N number of consecutive bits within a mask, we + * need to do N-1 lshift-ands and see if we still have set + * bits anywhere in the mask + * 2. N may be larger than mask size, in which case we need to + * do a search in multiple consecutive masks + * 3. For multi-mask search to be meaningful, we need to anchor + * our searches, i.e. first we find a run of M bits at the + * beginning of current mask, then we look for N-M bits at + * the end of previous mask (or multiple masks) * - * Step 4 will need to be repeated if (n-k) > MASK_ALIGN until - * we either run out of masks, lose the run, or find what we - * were looking for. + * With all of the above, the algorihm looks as follows: + * + * 1. let N be the number of consecutive bits we're looking for + * 2. if we already started a run, check if we can continue it + * by looking for remainder of N at the end of current mask + * 3. if we lost a run or if we never had a run, we look for N + * bits anywhere within the current mask (up to mask size, + * we can finish this run in the previous mask if N > mask + * size) + * 4. if we didn't find anything up to this point, check if any + * first bits of the mask are set (meaning we can start a + * run and finish it in the previous mask) + * 5. at any point in steps 2-4, we may do an early exit due to + * finding what we were looking for, or continue searching + * further */ cur_msk = msk->data[msk_idx]; - left = n; /* if we're looking for free spaces, invert the mask */ if (!used) cur_msk = ~cur_msk; - /* if we have an ignore mask, ignore once */ - if (ignore_msk) { - cur_msk &= ignore_msk; - ignore_msk = 0; - } + /* first mask may not be aligned */ + if (msk_idx == first) + cur_msk &= first_msk; - /* if n can fit in within a single mask, do a search */ - if (n <= MASK_ALIGN) { - uint64_t tmp_msk = cur_msk; - unsigned int s_idx; - for (s_idx = 0; s_idx < n - 1; s_idx++) + /* do we have an active previous run? */ + if (run_started) { + uint64_t last_bit = 0x1ULL << (MASK_ALIGN - 1); + + /* figure out how many consecutive bits we need here */ + need = RTE_MIN(left, MASK_ALIGN); + + /* see if we get a run of needed length */ + tmp_msk = cur_msk; + for (s_idx = 0; s_idx < need - 1; s_idx++) tmp_msk &= tmp_msk << 1ULL; - /* we found what we were looking for */ - if (tmp_msk != 0) { - /* clz will give us offset from end of mask, and - * we only get the end of our run, not start, - * so adjust result to point to where start - * would have been. - */ - run_start = MASK_ALIGN - - rte_clz64(tmp_msk) - n; - return MASK_GET_IDX(msk_idx, run_start); + + /* if last bit is set, we keep the run */ + if (tmp_msk & last_bit) { + left -= need; + + /* did we find what we were looking for? */ + if (left == 0) + return run_end - n; + + /* keep looking */ + continue; } + /* we lost the run, reset */ + run_started = false; + left = n; } - /* - * we didn't find our run within the mask, or n > MASK_ALIGN, - * so we're going for plan B. - */ + /* if we're here, we either lost the run or never had it */ + + /* figure out how many consecutive bits we need here */ + need = RTE_MIN(left, MASK_ALIGN); + + /* do a search */ + tmp_msk = cur_msk; + for (s_idx = 0; s_idx < need - 1; s_idx++) + tmp_msk &= tmp_msk << 1ULL; + + /* have we found something? */ + if (tmp_msk != 0) { + /* figure out where the run started */ + run_end = MASK_GET_IDX(msk_idx, MASK_ALIGN - rte_clz64(tmp_msk)); + run_started = true; + left -= need; + + /* do we need to look further? */ + if (left == 0) + return run_end - n; + + /* we need to keep looking */ + continue; + } + + /* we didn't find our run within current mask, go for plan B. */ /* count trailing zeroes on inverted mask */ - if (~cur_msk == 0) - ctz = sizeof(cur_msk) * 8; - else - ctz = rte_ctz64(~cur_msk); + ctz = rte_ctz64(~cur_msk); - /* if there aren't any runs at the start either, just - * continue - */ + /* if there aren't any set bits at the beginning, just continue */ if (ctz == 0) continue; - /* we have a partial run at the start, so try looking behind */ + /* we have a partial run at the beginning */ run_end = MASK_GET_IDX(msk_idx, ctz); + run_started = true; left -= ctz; - /* go backwards, include zero */ - lookbehind_idx = msk_idx - 1; - - /* we can't lookbehind as we've run out of masks, so stop */ - if (msk_idx == 0) - break; - - do { - const uint64_t last_bit = 1ULL << (MASK_ALIGN - 1); - unsigned int s_idx, need; - - lookbehind_msk = msk->data[lookbehind_idx]; - - /* if we're looking for free space, invert the mask */ - if (!used) - lookbehind_msk = ~lookbehind_msk; - - /* figure out how many consecutive bits we need here */ - need = RTE_MIN(left, MASK_ALIGN); - - /* count number of shifts we performed */ - for (s_idx = 0; s_idx < need - 1; s_idx++) { - lookbehind_msk &= lookbehind_msk << 1ULL; - /* did we lose the run yet? */ - if ((lookbehind_msk & last_bit) == 0) - break; - } - - /* if last bit is not set, we've lost the run */ - if ((lookbehind_msk & last_bit) == 0) { - /* - * we've scanned this far, so we know there are - * no runs in the space we've lookbehind-scanned - * as well, so skip that on next iteration. - */ - ignore_msk = ~(UINT64_MAX << (MASK_ALIGN - s_idx - 1)); - /* outer loop will decrement msk_idx so add 1 */ - msk_idx = lookbehind_idx + 1; - break; - } - - left -= need; - - /* check if we've found what we were looking for */ - if (left == 0) { - found = true; - break; - } - } while ((lookbehind_idx--) != 0); /* decrement after check to - * include zero - */ - - /* we didn't find anything, so continue */ - if (!found) - continue; - - /* we've found what we were looking for, but we only know where - * the run ended, so calculate start position. - */ - return run_end - n; - } while (msk_idx-- != 0); /* decrement after check to include zero */ + /* we'll figure this out in the next iteration */ + } /* we didn't find anything */ rte_errno = used ? ENOENT : ENOSPC; return -1;