[v3] eal: fix eal init may failed when too much continuous memsegs under legacy mode

Message ID 20230526034127.18231-1-changfengnan@bytedance.com (mailing list archive)
State Superseded, archived
Delegated to: Thomas Monjalon
Headers
Series [v3] eal: fix eal init may failed when too much continuous memsegs under legacy mode |

Checks

Context Check Description
ci/checkpatch warning coding style issues
ci/loongarch-compilation success Compilation OK
ci/loongarch-unit-testing success Unit Testing PASS
ci/github-robot: build success github build: passed
ci/iol-mellanox-Performance success Performance Testing PASS
ci/iol-aarch64-unit-testing success Testing PASS
ci/iol-unit-testing success Testing PASS
ci/iol-abi-testing success Testing PASS
ci/Intel-compilation success Compilation OK
ci/intel-Testing success Testing PASS
ci/iol-x86_64-compile-testing success Testing PASS
ci/iol-aarch64-compile-testing success Testing PASS
ci/iol-intel-Functional success Functional Testing PASS
ci/iol-intel-Performance success Performance Testing PASS
ci/iol-broadcom-Functional success Functional Testing PASS
ci/iol-testing success Testing PASS
ci/iol-x86_64-unit-testing success Testing PASS
ci/iol-broadcom-Performance success Performance Testing PASS
ci/intel-Functional success Functional PASS

Commit Message

Fengnan Chang May 26, 2023, 3:41 a.m. UTC
  Under legacy mode, if the number of continuous memsegs greater
than RTE_MAX_MEMSEG_PER_LIST, eal init will failed even though
another memseg list is empty, because only one memseg list used
to check in remap_needed_hugepages.
Fix this by make remap_segment return how many segments mapped,
remap_segment try to map most contiguous segments it can, if
exceed it's capbility, remap_needed_hugepages will continue to
map other left pages.

For example:
hugepage configure:
cat /sys/devices/system/node/node*/hugepages/hugepages-2048kB/nr_hugepages
10241
10239

startup log:
EAL: Detected memory type: socket_id:0 hugepage_sz:2097152
EAL: Detected memory type: socket_id:1 hugepage_sz:2097152
EAL: Creating 4 segment lists: n_segs:8192 socket_id:0 hugepage_sz:2097152
EAL: Creating 4 segment lists: n_segs:8192 socket_id:1 hugepage_sz:2097152
EAL: Requesting 13370 pages of size 2MB from socket 0
EAL: Requesting 7110 pages of size 2MB from socket 1
EAL: Attempting to map 14220M on socket 1
EAL: Allocated 14220M on socket 1
EAL: Attempting to map 26740M on socket 0
EAL: Could not find space for memseg. Please increase 32768 and/or 65536 in
configuration.
EAL: Couldn't remap hugepage files into memseg lists
EAL: FATAL: Cannot init memory
EAL: Cannot init memory

Signed-off-by: Fengnan Chang <changfengnan@bytedance.com>
Signed-off-by: Lin Li <lilintjpu@bytedance.com>
Signed-off-by: Burakov Anatoly <anatoly.burakov@intel.com>
---
 lib/eal/linux/eal_memory.c | 55 +++++++++++++++++++++++++-------------
 1 file changed, 36 insertions(+), 19 deletions(-)
  

Comments

Anatoly Burakov May 26, 2023, 2:41 p.m. UTC | #1
On 5/26/2023 4:41 AM, Fengnan Chang wrote:
> Under legacy mode, if the number of continuous memsegs greater
> than RTE_MAX_MEMSEG_PER_LIST, eal init will failed even though
> another memseg list is empty, because only one memseg list used
> to check in remap_needed_hugepages.
> Fix this by make remap_segment return how many segments mapped,
> remap_segment try to map most contiguous segments it can, if
> exceed it's capbility, remap_needed_hugepages will continue to
> map other left pages.
> 
> For example:
> hugepage configure:
> cat /sys/devices/system/node/node*/hugepages/hugepages-2048kB/nr_hugepages
> 10241
> 10239
> 
> startup log:
> EAL: Detected memory type: socket_id:0 hugepage_sz:2097152
> EAL: Detected memory type: socket_id:1 hugepage_sz:2097152
> EAL: Creating 4 segment lists: n_segs:8192 socket_id:0 hugepage_sz:2097152
> EAL: Creating 4 segment lists: n_segs:8192 socket_id:1 hugepage_sz:2097152
> EAL: Requesting 13370 pages of size 2MB from socket 0
> EAL: Requesting 7110 pages of size 2MB from socket 1
> EAL: Attempting to map 14220M on socket 1
> EAL: Allocated 14220M on socket 1
> EAL: Attempting to map 26740M on socket 0
> EAL: Could not find space for memseg. Please increase 32768 and/or 65536 in
> configuration.
> EAL: Couldn't remap hugepage files into memseg lists
> EAL: FATAL: Cannot init memory
> EAL: Cannot init memory
> 
> Signed-off-by: Fengnan Chang <changfengnan@bytedance.com>
> Signed-off-by: Lin Li <lilintjpu@bytedance.com>
> Signed-off-by: Burakov Anatoly <anatoly.burakov@intel.com>

Hi,

Thanks for taking my suggested implementation on board!

> ---
>   lib/eal/linux/eal_memory.c | 55 +++++++++++++++++++++++++-------------
>   1 file changed, 36 insertions(+), 19 deletions(-)
> 
> diff --git a/lib/eal/linux/eal_memory.c b/lib/eal/linux/eal_memory.c
> index 60fc8cc6ca..085defdee5 100644
> --- a/lib/eal/linux/eal_memory.c
> +++ b/lib/eal/linux/eal_memory.c
> @@ -681,6 +681,7 @@ remap_segment(struct hugepage_file *hugepages, int seg_start, int seg_end)
>   
>   	/* find free space in memseg lists */
>   	for (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS; msl_idx++) {
> +		int free_len;
>   		bool empty;
>   		msl = &mcfg->memsegs[msl_idx];
>   		arr = &msl->memseg_arr;
> @@ -692,24 +693,28 @@ remap_segment(struct hugepage_file *hugepages, int seg_start, int seg_end)
>   
>   		/* leave space for a hole if array is not empty */
>   		empty = arr->count == 0;
> -		ms_idx = rte_fbarray_find_next_n_free(arr, 0,
> -				seg_len + (empty ? 0 : 1));
> -
> -		/* memseg list is full? */
> -		if (ms_idx < 0)
> +		/* find start of the biggest contiguous block and its size */
> +		ms_idx = rte_fbarray_find_biggest_free(arr, 0);
> +		free_len = rte_fbarray_find_contig_free(arr, ms_idx);
> +		if (free_len < 0)
>   			continue;

Technically, ms_idx can return -1, which should not be passed to 
rte_fbarray_find_contig_free() because the index value it accepts is 
unsigned (meaning, -1 will translate to UINT32_MAX). This *would* cause 
failure at parameter checking, so in practice it's not a bug, but I'm 
pretty sure code analyzers will complain about it, so the control flow 
needs to be changed somewhat.

Specifically, we should check for `ms_idx < 0` and continue, before 
using it in `rte_fbarray_find_contig_free()`.

> -
>   		/* leave some space between memsegs, they are not IOVA
>   		 * contiguous, so they shouldn't be VA contiguous either.
>   		 */
> -		if (!empty)
> +		if (!empty) {
>   			ms_idx++;
> +			free_len--;
> +		}
> +		printf("free len %d seg_len %d ms_idx %d\n", free_len, seg_len, ms_idx);

Debugging leftover?

> +		/* we might not get all of the space we wanted */
> +		free_len = RTE_MIN(seg_len, free_len);
> +		seg_end = seg_start + free_len;
> +		seg_len = seg_end - seg_start;
>   		break;
>   	}
>   	if (msl_idx == RTE_MAX_MEMSEG_LISTS) {
> -		RTE_LOG(ERR, EAL, "Could not find space for memseg. Please increase %s and/or %s in configuration.\n",
> -				RTE_STR(RTE_MAX_MEMSEG_PER_TYPE),
> -				RTE_STR(RTE_MAX_MEM_MB_PER_TYPE));
> +		RTE_LOG(ERR, EAL, "Could not find space for memseg. Please increase RTE_MAX_MEMSEG_PER_LIST "
> +				"RTE_MAX_MEMSEG_PER_TYPE and/or RTE_MAX_MEM_MB_PER_TYPE in configuration.\n");

I don't think this change should be part of this patch, as this is 
fixing a separate issue. If we are to fix it, it should be a separate patch.

With the above changes,

Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
  
Anatoly Burakov May 26, 2023, 3:32 p.m. UTC | #2
On 5/26/2023 3:41 PM, Burakov, Anatoly wrote:
> On 5/26/2023 4:41 AM, Fengnan Chang wrote:
>> Under legacy mode, if the number of continuous memsegs greater
>> than RTE_MAX_MEMSEG_PER_LIST, eal init will failed even though
>> another memseg list is empty, because only one memseg list used
>> to check in remap_needed_hugepages.
>> Fix this by make remap_segment return how many segments mapped,
>> remap_segment try to map most contiguous segments it can, if
>> exceed it's capbility, remap_needed_hugepages will continue to
>> map other left pages.
>>
>> For example:
>> hugepage configure:
>> cat 
>> /sys/devices/system/node/node*/hugepages/hugepages-2048kB/nr_hugepages
>> 10241
>> 10239
>>
>> startup log:
>> EAL: Detected memory type: socket_id:0 hugepage_sz:2097152
>> EAL: Detected memory type: socket_id:1 hugepage_sz:2097152
>> EAL: Creating 4 segment lists: n_segs:8192 socket_id:0 
>> hugepage_sz:2097152
>> EAL: Creating 4 segment lists: n_segs:8192 socket_id:1 
>> hugepage_sz:2097152
>> EAL: Requesting 13370 pages of size 2MB from socket 0
>> EAL: Requesting 7110 pages of size 2MB from socket 1
>> EAL: Attempting to map 14220M on socket 1
>> EAL: Allocated 14220M on socket 1
>> EAL: Attempting to map 26740M on socket 0
>> EAL: Could not find space for memseg. Please increase 32768 and/or 
>> 65536 in
>> configuration.
>> EAL: Couldn't remap hugepage files into memseg lists
>> EAL: FATAL: Cannot init memory
>> EAL: Cannot init memory
>>
>> Signed-off-by: Fengnan Chang <changfengnan@bytedance.com>
>> Signed-off-by: Lin Li <lilintjpu@bytedance.com>
>> Signed-off-by: Burakov Anatoly <anatoly.burakov@intel.com>
> 
> Hi,
> 
> Thanks for taking my suggested implementation on board!
> 
>> ---
>>   lib/eal/linux/eal_memory.c | 55 +++++++++++++++++++++++++-------------
>>   1 file changed, 36 insertions(+), 19 deletions(-)
>>
>> diff --git a/lib/eal/linux/eal_memory.c b/lib/eal/linux/eal_memory.c
>> index 60fc8cc6ca..085defdee5 100644
>> --- a/lib/eal/linux/eal_memory.c
>> +++ b/lib/eal/linux/eal_memory.c
>> @@ -681,6 +681,7 @@ remap_segment(struct hugepage_file *hugepages, int 
>> seg_start, int seg_end)
>>       /* find free space in memseg lists */
>>       for (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS; msl_idx++) {
>> +        int free_len;
>>           bool empty;
>>           msl = &mcfg->memsegs[msl_idx];
>>           arr = &msl->memseg_arr;
>> @@ -692,24 +693,28 @@ remap_segment(struct hugepage_file *hugepages, 
>> int seg_start, int seg_end)
>>           /* leave space for a hole if array is not empty */
>>           empty = arr->count == 0;
>> -        ms_idx = rte_fbarray_find_next_n_free(arr, 0,
>> -                seg_len + (empty ? 0 : 1));
>> -
>> -        /* memseg list is full? */
>> -        if (ms_idx < 0)
>> +        /* find start of the biggest contiguous block and its size */
>> +        ms_idx = rte_fbarray_find_biggest_free(arr, 0);
>> +        free_len = rte_fbarray_find_contig_free(arr, ms_idx);
>> +        if (free_len < 0)
>>               continue;

Missed this.

When we're splitting up segments, we're looking for contiguous free 
areas that are at least two memsegs long, so if this memseg is full but 
contains segments that were split up, there will be a bunch of holes in 
it, and free_len will be 1 (because every hole will be 1 segment long by 
definition). So, we should not accept anything that is at least two 
segments long.
  
Fengnan Chang May 29, 2023, 11:20 a.m. UTC | #3
Burakov, Anatoly <anatoly.burakov@intel.com> 于2023年5月26日周五 23:33写道:
>
> On 5/26/2023 3:41 PM, Burakov, Anatoly wrote:
> > On 5/26/2023 4:41 AM, Fengnan Chang wrote:
> >> Under legacy mode, if the number of continuous memsegs greater
> >> than RTE_MAX_MEMSEG_PER_LIST, eal init will failed even though
> >> another memseg list is empty, because only one memseg list used
> >> to check in remap_needed_hugepages.
> >> Fix this by make remap_segment return how many segments mapped,
> >> remap_segment try to map most contiguous segments it can, if
> >> exceed it's capbility, remap_needed_hugepages will continue to
> >> map other left pages.
> >>
> >> For example:
> >> hugepage configure:
> >> cat
> >> /sys/devices/system/node/node*/hugepages/hugepages-2048kB/nr_hugepages
> >> 10241
> >> 10239
> >>
> >> startup log:
> >> EAL: Detected memory type: socket_id:0 hugepage_sz:2097152
> >> EAL: Detected memory type: socket_id:1 hugepage_sz:2097152
> >> EAL: Creating 4 segment lists: n_segs:8192 socket_id:0
> >> hugepage_sz:2097152
> >> EAL: Creating 4 segment lists: n_segs:8192 socket_id:1
> >> hugepage_sz:2097152
> >> EAL: Requesting 13370 pages of size 2MB from socket 0
> >> EAL: Requesting 7110 pages of size 2MB from socket 1
> >> EAL: Attempting to map 14220M on socket 1
> >> EAL: Allocated 14220M on socket 1
> >> EAL: Attempting to map 26740M on socket 0
> >> EAL: Could not find space for memseg. Please increase 32768 and/or
> >> 65536 in
> >> configuration.
> >> EAL: Couldn't remap hugepage files into memseg lists
> >> EAL: FATAL: Cannot init memory
> >> EAL: Cannot init memory
> >>
> >> Signed-off-by: Fengnan Chang <changfengnan@bytedance.com>
> >> Signed-off-by: Lin Li <lilintjpu@bytedance.com>
> >> Signed-off-by: Burakov Anatoly <anatoly.burakov@intel.com>
> >
> > Hi,
> >
> > Thanks for taking my suggested implementation on board!
> >
> >> ---
> >>   lib/eal/linux/eal_memory.c | 55 +++++++++++++++++++++++++-------------
> >>   1 file changed, 36 insertions(+), 19 deletions(-)
> >>
> >> diff --git a/lib/eal/linux/eal_memory.c b/lib/eal/linux/eal_memory.c
> >> index 60fc8cc6ca..085defdee5 100644
> >> --- a/lib/eal/linux/eal_memory.c
> >> +++ b/lib/eal/linux/eal_memory.c
> >> @@ -681,6 +681,7 @@ remap_segment(struct hugepage_file *hugepages, int
> >> seg_start, int seg_end)
> >>       /* find free space in memseg lists */
> >>       for (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS; msl_idx++) {
> >> +        int free_len;
> >>           bool empty;
> >>           msl = &mcfg->memsegs[msl_idx];
> >>           arr = &msl->memseg_arr;
> >> @@ -692,24 +693,28 @@ remap_segment(struct hugepage_file *hugepages,
> >> int seg_start, int seg_end)
> >>           /* leave space for a hole if array is not empty */
> >>           empty = arr->count == 0;
> >> -        ms_idx = rte_fbarray_find_next_n_free(arr, 0,
> >> -                seg_len + (empty ? 0 : 1));
> >> -
> >> -        /* memseg list is full? */
> >> -        if (ms_idx < 0)
> >> +        /* find start of the biggest contiguous block and its size */
> >> +        ms_idx = rte_fbarray_find_biggest_free(arr, 0);
> >> +        free_len = rte_fbarray_find_contig_free(arr, ms_idx);
> >> +        if (free_len < 0)
> >>               continue;
>
> Missed this.
>
> When we're splitting up segments, we're looking for contiguous free
> areas that are at least two memsegs long, so if this memseg is full but
> contains segments that were split up, there will be a bunch of holes in
> it, and free_len will be 1 (because every hole will be 1 segment long by
> definition). So, we should not accept anything that is at least two
> segments long.

Got it, thanks for your reminder.

>
> --
> Thanks,
> Anatoly
>
  

Patch

diff --git a/lib/eal/linux/eal_memory.c b/lib/eal/linux/eal_memory.c
index 60fc8cc6ca..085defdee5 100644
--- a/lib/eal/linux/eal_memory.c
+++ b/lib/eal/linux/eal_memory.c
@@ -681,6 +681,7 @@  remap_segment(struct hugepage_file *hugepages, int seg_start, int seg_end)
 
 	/* find free space in memseg lists */
 	for (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS; msl_idx++) {
+		int free_len;
 		bool empty;
 		msl = &mcfg->memsegs[msl_idx];
 		arr = &msl->memseg_arr;
@@ -692,24 +693,28 @@  remap_segment(struct hugepage_file *hugepages, int seg_start, int seg_end)
 
 		/* leave space for a hole if array is not empty */
 		empty = arr->count == 0;
-		ms_idx = rte_fbarray_find_next_n_free(arr, 0,
-				seg_len + (empty ? 0 : 1));
-
-		/* memseg list is full? */
-		if (ms_idx < 0)
+		/* find start of the biggest contiguous block and its size */
+		ms_idx = rte_fbarray_find_biggest_free(arr, 0);
+		free_len = rte_fbarray_find_contig_free(arr, ms_idx);
+		if (free_len < 0)
 			continue;
-
 		/* leave some space between memsegs, they are not IOVA
 		 * contiguous, so they shouldn't be VA contiguous either.
 		 */
-		if (!empty)
+		if (!empty) {
 			ms_idx++;
+			free_len--;
+		}
+		printf("free len %d seg_len %d ms_idx %d\n", free_len, seg_len, ms_idx);
+		/* we might not get all of the space we wanted */
+		free_len = RTE_MIN(seg_len, free_len);
+		seg_end = seg_start + free_len;
+		seg_len = seg_end - seg_start;
 		break;
 	}
 	if (msl_idx == RTE_MAX_MEMSEG_LISTS) {
-		RTE_LOG(ERR, EAL, "Could not find space for memseg. Please increase %s and/or %s in configuration.\n",
-				RTE_STR(RTE_MAX_MEMSEG_PER_TYPE),
-				RTE_STR(RTE_MAX_MEM_MB_PER_TYPE));
+		RTE_LOG(ERR, EAL, "Could not find space for memseg. Please increase RTE_MAX_MEMSEG_PER_LIST "
+				"RTE_MAX_MEMSEG_PER_TYPE and/or RTE_MAX_MEM_MB_PER_TYPE in configuration.\n");
 		return -1;
 	}
 
@@ -787,7 +792,7 @@  remap_segment(struct hugepage_file *hugepages, int seg_start, int seg_end)
 	}
 	RTE_LOG(DEBUG, EAL, "Allocated %" PRIu64 "M on socket %i\n",
 			(seg_len * page_sz) >> 20, socket_id);
-	return 0;
+	return seg_len;
 }
 
 static uint64_t
@@ -1022,10 +1027,16 @@  remap_needed_hugepages(struct hugepage_file *hugepages, int n_pages)
 		if (new_memseg) {
 			/* if this isn't the first time, remap segment */
 			if (cur_page != 0) {
-				ret = remap_segment(hugepages, seg_start_page,
-						cur_page);
-				if (ret != 0)
-					return -1;
+				int n_remapped = 0;
+				int n_needed = cur_page - seg_start_page;
+				while (n_remapped < n_needed) {
+					ret = remap_segment(hugepages, seg_start_page,
+							cur_page);
+					if (ret < 0)
+						return -1;
+					n_remapped += ret;
+					seg_start_page += ret;
+				}
 			}
 			/* remember where we started */
 			seg_start_page = cur_page;
@@ -1034,10 +1045,16 @@  remap_needed_hugepages(struct hugepage_file *hugepages, int n_pages)
 	}
 	/* we were stopped, but we didn't remap the last segment, do it now */
 	if (cur_page != 0) {
-		ret = remap_segment(hugepages, seg_start_page,
-				cur_page);
-		if (ret != 0)
-			return -1;
+		int n_remapped = 0;
+		int n_needed = cur_page - seg_start_page;
+		while (n_remapped < n_needed) {
+			ret = remap_segment(hugepages, seg_start_page,
+					cur_page);
+			if (ret < 0)
+				return -1;
+			n_remapped += ret;
+			seg_start_page += ret;
+		}
 	}
 	return 0;
 }