test: make hugepage check more robust under Linux

Message ID: 20210317144409.288346-1-aconole@redhat.com (mailing list archive)
State: Changes Requested, archived
Delegated to: David Marchand
Series: test: make hugepage check more robust under Linux

Checks

Context                      Check    Description
ci/checkpatch                success  coding style OK
ci/Intel-compilation         success  Compilation OK
ci/travis-robot              fail     travis build: failed
ci/github-robot              fail     github build: failed
ci/iol-abi-testing           success  Testing PASS
ci/intel-Testing             success  Testing PASS
ci/iol-testing               warning  Testing issues
ci/iol-intel-Performance     success  Performance Testing PASS
ci/iol-mellanox-Performance  success  Performance Testing PASS

Commit Message

Aaron Conole March 17, 2021, 2:44 p.m. UTC
The hugepage test really needs to check multiple things on Linux:

1. Are hugepages reserved in the system?

2. Is the hugepage mountpoint available so that we can allocate them?

3. Do we have permissions to write into the hugepage mountpoint?

The existing hugepage check only verifies the first.  On some setups,
a non-root user won't have access to the mountpoint for hugepages to
be allocated and that needs to be reflected in the test as well.  Add
such checks for Linux to give a more robust check when running test suites.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 app/test/has-hugepage.sh | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)
  

Comments

Thomas Monjalon March 17, 2021, 2:57 p.m. UTC | #1
17/03/2021 15:44, Aaron Conole:
> The hugepage test really needs to check multiple things on Linux:
> 
> 1. Are hugepages reserved in the system?
> 
> 2. Is the hugepage mountpoint available so that we can allocate them?
> 
> 3. Do we have permissions to write into the hugepage mountpoint?
> 
> The existing hugepage check only verifies the first.  On some setups,
> a non-root user won't have access to the mountpoint for hugepages to
> be allocated and that needs to be reflected in the test as well.  Add
> such checks for Linux OS to give a more check when running test suites.

Requirements 2 & 3 are optional.
You don't need a mount point if using the option --in-memory.

[...]
> +	perm=""

perm= should do the same.

> +	for mount in `mount | grep hugetlbfs | awk '{ print $3; }'`; do

Please prefer $() syntax.
Are spaces in awk required?

> +		test ! -w $mount/. || perm="$mount"

Why /. ?

> +	done
> +	if [ "$perm" = "" -o "$nr_hugepages" = "0" ]; then

= "" can be replaced with -z
"0" can be simply 0

> +		echo 0
> +	else
> +		echo $nr_hugepages
> +	fi
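
For illustration, the review comments above could be folded into the check
roughly as follows. This is a minimal sketch, not the submitted patch: the
function name has_hugepage and its two optional file arguments are
hypothetical (added so the logic can be exercised without real hugepages),
and reading /proc/mounts with awk is swapped in for the original
"mount | grep hugetlbfs | awk" pipeline.

```shell
#!/bin/sh
# Sketch only: has_hugepage() and its parameters are hypothetical.
# $1: file holding the reserved hugepage count
# $2: file in /proc/mounts format (device, mount point, fs type, ...)
has_hugepage() {
	nr_file=${1:-/proc/sys/vm/nr_hugepages}
	mounts_file=${2:-/proc/mounts}

	# Treat an unreadable or empty counter as "no hugepages".
	nr_hugepages=$(cat "$nr_file" 2>/dev/null || echo 0)
	nr_hugepages=${nr_hugepages:-0}

	# Remember a hugetlbfs mount point the current user can write to.
	perm=
	for mount in $(awk '$3 == "hugetlbfs" { print $2 }' "$mounts_file" 2>/dev/null); do
		test ! -w "$mount" || perm=$mount
	done

	if [ -z "$perm" ] || [ "$nr_hugepages" = 0 ]; then
		echo 0
	else
		echo "$nr_hugepages"
	fi
}
```

Called without arguments, has_hugepage prints the reserved hugepage count
when a writable hugetlbfs mount exists, and 0 otherwise.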
  
David Marchand March 19, 2021, 1:41 p.m. UTC | #2
On Wed, Mar 17, 2021 at 3:44 PM Aaron Conole <aconole@redhat.com> wrote:
> diff --git a/app/test/has-hugepage.sh b/app/test/has-hugepage.sh
> index d600fad319..1c3cfb665a 100755
> --- a/app/test/has-hugepage.sh
> +++ b/app/test/has-hugepage.sh
> @@ -3,7 +3,17 @@
>  # Copyright 2020 Mellanox Technologies, Ltd
>
>  if [ "$(uname)" = "Linux" ] ; then
> -       cat /proc/sys/vm/nr_hugepages || echo 0
> +       nr_hugepages=$(cat /proc/sys/vm/nr_hugepages)
> +       # Need to check if we have permissions to access hugepages
> +       perm=""
> +       for mount in `mount | grep hugetlbfs | awk '{ print $3; }'`; do
> +               test ! -w $mount/. || perm="$mount"
> +       done
> +       if [ "$perm" = "" -o "$nr_hugepages" = "0" ]; then
> +               echo 0
> +       else
> +               echo $nr_hugepages
> +       fi
>  elif [ "$(uname)" = "FreeBSD" ] ; then
>         echo 1 # assume FreeBSD always has hugepages
>  else

The check is evaluated first when configuring as normal user and not
reevaluated when running the tests as root in Travis/GHA, so the tests
are started with no hugepage.

Then, in Travis and GHA, we can see (sigbus o_O) crashes in no-huge mode.
I tried to reproduce but did not manage.
  
Aaron Conole March 19, 2021, 2:34 p.m. UTC | #3
David Marchand <david.marchand@redhat.com> writes:

> On Wed, Mar 17, 2021 at 3:44 PM Aaron Conole <aconole@redhat.com> wrote:
>> diff --git a/app/test/has-hugepage.sh b/app/test/has-hugepage.sh
>> index d600fad319..1c3cfb665a 100755
>> --- a/app/test/has-hugepage.sh
>> +++ b/app/test/has-hugepage.sh
>> @@ -3,7 +3,17 @@
>>  # Copyright 2020 Mellanox Technologies, Ltd
>>
>>  if [ "$(uname)" = "Linux" ] ; then
>> -       cat /proc/sys/vm/nr_hugepages || echo 0
>> +       nr_hugepages=$(cat /proc/sys/vm/nr_hugepages)
>> +       # Need to check if we have permissions to access hugepages
>> +       perm=""
>> +       for mount in `mount | grep hugetlbfs | awk '{ print $3; }'`; do
>> +               test ! -w $mount/. || perm="$mount"
>> +       done
>> +       if [ "$perm" = "" -o "$nr_hugepages" = "0" ]; then
>> +               echo 0
>> +       else
>> +               echo $nr_hugepages
>> +       fi
>>  elif [ "$(uname)" = "FreeBSD" ] ; then
>>         echo 1 # assume FreeBSD always has hugepages
>>  else
>
> The check is evaluated first when configuring as normal user and not
> reevaluated when running the tests as root in Travis/GHA, so the tests
> are started with no hugepage.
>
> Then, in Travis and GHA, we can see (sigbus o_O) crashes in no-huge mode.
> I tried to reproduce but did not manage.

I noticed that as well.  I am equally as confused, but I'm working on
it (along with folding in Thomas' suggestions).
  
Aaron Conole April 6, 2021, 12:33 p.m. UTC | #4
Thomas Monjalon <thomas@monjalon.net> writes:

> 17/03/2021 15:44, Aaron Conole:
>> The hugepage test really needs to check multiple things on Linux:
>> 
>> 1. Are hugepages reserved in the system?
>> 
>> 2. Is the hugepage mountpoint available so that we can allocate them?
>> 
>> 3. Do we have permissions to write into the hugepage mountpoint?
>> 
>> The existing hugepage check only verifies the first.  On some setups,
>> a non-root user won't have access to the mountpoint for hugepages to
>> be allocated and that needs to be reflected in the test as well.  Add
>> such checks for Linux OS to give a more check when running test suites.
>
> Requirements 2 & 3 are optional.
> You don't need a mount point if using the option --in-memory.

That's true, but it seems to break a few of the unit tests without.
I'll clarify the commit message.

Additionally, I thought it would be simple to just incorporate your
suggestions - but it seems that meson / ninja doesn't have cascading
dependencies the way 'make' does (or, I haven't figured out from the
syntax how to do that) - a 'run_command' gets resolved at configure
time and it doesn't seem that we can make a run_target depend on another
run_target since dependencies are on file outputs.  Maybe we do some
kind of trickery here where we write a file that the build script reads?

I am trying to figure out how best to accomplish this - suggestions
welcome.

> [...]
>> +	perm=""
>
> perm= should do the same.
>
>> +	for mount in `mount | grep hugetlbfs | awk '{ print $3; }'`; do
>
> Please prefer $() syntax.

Okay

> Are spaces in awk required?

I'm not sure - I don't think so.

>> +		test ! -w $mount/. || perm="$mount"
>
> Why /. ?

Habit.  I will remove it.

>> +	done
>> +	if [ "$perm" = "" -o "$nr_hugepages" = "0" ]; then
>
> = "" can be replaced with -z
> "0" can be simply 0

Done.

>> +		echo 0
>> +	else
>> +		echo $nr_hugepages
>> +	fi
  
Bruce Richardson April 6, 2021, 12:58 p.m. UTC | #5
On Tue, Apr 06, 2021 at 08:33:07AM -0400, Aaron Conole wrote:
> Thomas Monjalon <thomas@monjalon.net> writes:
> 
> > 17/03/2021 15:44, Aaron Conole:
> >> The hugepage test really needs to check multiple things on Linux:
> >> 
> >> 1. Are hugepages reserved in the system?
> >> 
> >> 2. Is the hugepage mountpoint available so that we can allocate them?
> >> 
> >> 3. Do we have permissions to write into the hugepage mountpoint?
> >> 
> >> The existing hugepage check only verifies the first.  On some setups,
> >> a non-root user won't have access to the mountpoint for hugepages to
> >> be allocated and that needs to be reflected in the test as well.  Add
> >> such checks for Linux OS to give a more check when running test suites.
> >
> > Requirements 2 & 3 are optional.
> > You don't need a mount point if using the option --in-memory.
> 
> That's true, but it seems to break a few of the unit tests without.
> I'll clarify the commit message.
> 
> Additionally, I thought it would be simple to just incorporate your
> suggestions - but it seems that meson / ninja doesn't have cascading
> dependencies the way 'make' does (or, I haven't figured out from the
> syntax how to do that) - a 'run_command' gets resolved at configure
> time and it doesn't seem that we can make a run_target depend on another
> run_target since dependencies are on file outputs.  Maybe we do some
> kind of trickery here where we write a file that the build script reads?
> 
> I am trying to figure out how best to accomplish this - suggestions
> welcome.
> 
Sorry that I'm late to this thread. Can you perhaps explain what you mean
by cascading dependencies in this instance, or what you are trying to do
exactly that is not supported?

In terms of output files that build script reads, yes that is possible -
the file just needs to be in the standard .d format that gcc produces and
then ninja can use that to track file dependencies. The script
"call-sphinx-build.py" does this, for example, and doc/guides/meson.build
tells meson/ninja to look for dependencies in file ".html.d" generated by
that script.

/Bruce
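
For reference, the .d format mentioned above is a single Makefile-style
rule: the output name, a colon, then the files it depends on. A minimal
sketch of a script emitting such a file is shown below; write_depfile and
the file names in the usage note are hypothetical, and whether ninja can
usefully track the hugepage state this way is an open assumption.

```shell
#!/bin/sh
# Sketch only: write_depfile() is a hypothetical helper.
# $1: target (output) name, $2: depfile to write, rest: dependencies.
write_depfile() {
	target=$1
	depfile=$2
	shift 2

	# Emit "target: dep1 dep2 ..." on one line, as gcc -MD would.
	printf '%s:' "$target" > "$depfile"
	for dep in "$@"; do
		printf ' %s' "$dep" >> "$depfile"
	done
	printf '\n' >> "$depfile"
}
```

For example, "write_depfile hugepage.stamp hugepage.stamp.d a.txt b.txt"
writes the line "hugepage.stamp: a.txt b.txt". Meson's custom_target()
accepts a depfile argument naming such a file.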
  
Aaron Conole April 6, 2021, 2:20 p.m. UTC | #6
Bruce Richardson <bruce.richardson@intel.com> writes:

> On Tue, Apr 06, 2021 at 08:33:07AM -0400, Aaron Conole wrote:
>> Thomas Monjalon <thomas@monjalon.net> writes:
>> 
>> > 17/03/2021 15:44, Aaron Conole:
>> >> The hugepage test really needs to check multiple things on Linux:
>> >> 
>> >> 1. Are hugepages reserved in the system?
>> >> 
>> >> 2. Is the hugepage mountpoint available so that we can allocate them?
>> >> 
>> >> 3. Do we have permissions to write into the hugepage mountpoint?
>> >> 
>> >> The existing hugepage check only verifies the first.  On some setups,
>> >> a non-root user won't have access to the mountpoint for hugepages to
>> >> be allocated and that needs to be reflected in the test as well.  Add
>> >> such checks for Linux OS to give a more check when running test suites.
>> >
>> > Requirements 2 & 3 are optional.
>> > You don't need a mount point if using the option --in-memory.
>> 
>> That's true, but it seems to break a few of the unit tests without.
>> I'll clarify the commit message.
>> 
>> Additionally, I thought it would be simple to just incorporate your
>> suggestions - but it seems that meson / ninja doesn't have cascading
>> dependencies the way 'make' does (or, I haven't figured out from the
>> syntax how to do that) - a 'run_command' gets resolved at configure
>> time and it doesn't seem that we can make a run_target depend on another
>> run_target since dependencies are on file outputs.  Maybe we do some
>> kind of trickery here where we write a file that the build script reads?
>> 
>> I am trying to figure out how best to accomplish this - suggestions
>> welcome.
>> 
> Sorry that I'm late to this thread. Can you perhaps explain what you mean
> by cascading dependencies in this instance, or what you are trying to do
> exactly that is not supported?

I want to conditionally invoke the test suite with the hugepage tests,
and support the case that the machine has hugepages enabled, but not
accessible.

Right now, if a user runs:

  meson build && ninja -C build test

with hugepages allocated as a non-root user, they will see 'FAIL'
messages.  This isn't very friendly, since the user would be confused.

Right now, hugepage detection is done only at configure time (the
'meson' step), and then the target is always run.

For now, I will continue modifying the below script, but that will be a
detection at configure time, still.  So the case where the user runs
'meson' when they have hugepages, but those hugepages go away and then
they run 'ninja -C build test' will still be FAIL instead of SKIP (maybe
we need a more descriptive error when a FAIL is caused by missing hugepages?)
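
One way to move the detection from configure time to run time is a small
wrapper around the test command. The sketch below is an assumption, not
something proposed in this thread: run_or_skip is a hypothetical helper, and
it relies on Meson's default exitcode protocol, which reports a test that
exits with status 77 as skipped.

```shell
#!/bin/sh
# Sketch only: run_or_skip() is a hypothetical helper, not part of DPDK.
# $1: command that prints the usable hugepage count (for example the
#     has-hugepage.sh script); remaining arguments: the test command.
run_or_skip() {
	check=$1
	shift

	# Re-evaluate hugepage availability each time the test is started.
	nr=$($check 2>/dev/null || echo 0)
	if [ "${nr:-0}" = 0 ]; then
		echo "SKIP: no usable hugepages" >&2
		return 77	# reported as "skipped" by meson test
	fi

	# Hugepages are usable: run the real test and pass on its status.
	"$@"
}
```

With this, a run as a user without access to the hugepage mount would show
SKIP rather than FAIL, even if hugepages were present when meson was run.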

> In terms of output files that build script reads, yes that is possible -
> the file just needs to be in the standard .d format that gcc produces and
> then ninja can use that to track file dependencies. The script
> "call-sphinx-build.py" does this, for example, and doc/guides/meson.build
> tells meson/ninja to look for dependencies in file ".html.d" generated by
> that script.
>
> /Bruce
  
Bruce Richardson April 6, 2021, 2:50 p.m. UTC | #7
On Tue, Apr 06, 2021 at 10:20:37AM -0400, Aaron Conole wrote:
> Bruce Richardson <bruce.richardson@intel.com> writes:
> 
> > On Tue, Apr 06, 2021 at 08:33:07AM -0400, Aaron Conole wrote:
> >> Thomas Monjalon <thomas@monjalon.net> writes:
> >> 
> >> > 17/03/2021 15:44, Aaron Conole:
> >> >> The hugepage test really needs to check multiple things on Linux:
> >> >> 
> >> >> 1. Are hugepages reserved in the system?
> >> >> 
> >> >> 2. Is the hugepage mountpoint available so that we can allocate them?
> >> >> 
> >> >> 3. Do we have permissions to write into the hugepage mountpoint?
> >> >> 
> >> >> The existing hugepage check only verifies the first.  On some setups,
> >> >> a non-root user won't have access to the mountpoint for hugepages to
> >> >> be allocated and that needs to be reflected in the test as well.  Add
> >> >> such checks for Linux OS to give a more check when running test suites.
> >> >
> >> > Requirements 2 & 3 are optional.
> >> > You don't need a mount point if using the option --in-memory.
> >> 
> >> That's true, but it seems to break a few of the unit tests without.
> >> I'll clarify the commit message.
> >> 
> >> Additionally, I thought it would be simple to just incorporate your
> >> suggestions - but it seems that meson / ninja doesn't have cascading
> >> dependencies the way 'make' does (or, I haven't figured out from the
> >> syntax how to do that) - a 'run_command' gets resolved at configure
> >> time and it doesn't seem that we can make a run_target depend on another
> >> run_target since dependencies are on file outputs.  Maybe we do some
> >> kind of trickery here where we write a file that the build script reads?
> >> 
> >> I am trying to figure out how best to accomplish this - suggestions
> >> welcome.
> >> 
> > Sorry that I'm late to this thread. Can you perhaps explain what you mean
> > by cascading dependencies in this instance, or what you are trying to do
> > exactly that is not supported?
> 
> I want to conditionally invoke the test suite with the hugepage tests,
> and support the case that the machine has hugepages enabled, but not
> accessible.
> 
> Right now, if a user runs:
> 
>   meson build && ninja -C build test
> 
> with hugepages allocated as a non-root user, they will see 'FAIL'
> messages.  This isn't very friendly, since the user would be confused.
> 
> Right now, hugepage detection is done only at configure time (the
> 'meson' step), and then the target is always run.
> 
> For now, I will continue modifying the below script, but that will be a
> detection at configure time, still.  So the case where the user runs
> 'meson' when they have hugepages, but those hugepages go away and then
> they run 'ninja -C build test' will still be FAIL instead of SKIP (maybe
> we need a more descriptive error when FAIL due to hugepages happen?)
> 

This seems to me like the test binary itself should be checking the
presence of hugepages, and reporting skips if necessary. It's not just when
run through ninja that this functionality would be useful.
  
Aaron Conole April 9, 2021, 3:06 p.m. UTC | #8
Bruce Richardson <bruce.richardson@intel.com> writes:

> On Tue, Apr 06, 2021 at 10:20:37AM -0400, Aaron Conole wrote:
>> Bruce Richardson <bruce.richardson@intel.com> writes:
>> 
>> > On Tue, Apr 06, 2021 at 08:33:07AM -0400, Aaron Conole wrote:
>> >> Thomas Monjalon <thomas@monjalon.net> writes:
>> >> 
>> >> > 17/03/2021 15:44, Aaron Conole:
>> >> >> The hugepage test really needs to check multiple things on Linux:
>> >> >> 
>> >> >> 1. Are hugepages reserved in the system?
>> >> >> 
>> >> >> 2. Is the hugepage mountpoint available so that we can allocate them?
>> >> >> 
>> >> >> 3. Do we have permissions to write into the hugepage mountpoint?
>> >> >> 
>> >> >> The existing hugepage check only verifies the first.  On some setups,
>> >> >> a non-root user won't have access to the mountpoint for hugepages to
>> >> >> be allocated and that needs to be reflected in the test as well.  Add
>> >> >> such checks for Linux OS to give a more check when running test suites.
>> >> >
>> >> > Requirements 2 & 3 are optional.
>> >> > You don't need a mount point if using the option --in-memory.
>> >> 
>> >> That's true, but it seems to break a few of the unit tests without.
>> >> I'll clarify the commit message.
>> >> 
>> >> Additionally, I thought it would be simple to just incorporate your
>> >> suggestions - but it seems that meson / ninja doesn't have cascading
>> >> dependencies the way 'make' does (or, I haven't figured out from the
>> >> syntax how to do that) - a 'run_command' gets resolved at configure
>> >> time and it doesn't seem that we can make a run_target depend on another
>> >> run_target since dependencies are on file outputs.  Maybe we do some
>> >> kind of trickery here where we write a file that the build script reads?
>> >> 
>> >> I am trying to figure out how best to accomplish this - suggestions
>> >> welcome.
>> >> 
>> > Sorry that I'm late to this thread. Can you perhaps explain what you mean
>> > by cascading dependencies in this instance, or what you are trying to do
>> > exactly that is not supported?
>> 
>> I want to conditionally invoke the test suite with the hugepage tests,
>> and support the case that the machine has hugepages enabled, but not
>> accessible.
>> 
>> Right now, if a user runs:
>> 
>>   meson build && ninja -C build test
>> 
>> with hugepages allocated as a non-root user, they will see 'FAIL'
>> messages.  This isn't very friendly, since the user would be confused.
>> 
>> Right now, hugepage detection is done only at configure time (the
>> 'meson' step), and then the target is always run.
>> 
>> For now, I will continue modifying the below script, but that will be a
>> detection at configure time, still.  So the case where the user runs
>> 'meson' when they have hugepages, but those hugepages go away and then
>> they run 'ninja -C build test' will still be FAIL instead of SKIP (maybe
>> we need a more descriptive error when FAIL due to hugepages happen?)
>> 
>
> This seems to me like the test binary itself should be checking the
> presence of hugepages, and reporting skips if necessary. It's not just when
> run through ninja that this functionality would be useful.

Either way, there needs to be a rework - if we do it in the test binary,
then the tests that require hugepages need to be reworked so that they
correctly detect lack of hugepage support before starting.  If we keep
that knowledge in the meson system, then we need to change the way we
call the test binary script to support a more robust detection.

I guess, I don't care too much which one is the one we choose.  My $.02
opinion is that we already have most of the logic and whatnot done in
the build system, so I'd prefer to do as small a change as possible
(leaving that logic in the meson system).  Then again, maybe it makes
more sense to just rip the bandaid off and move it all into the test
framework.

WDYT?
  
Thomas Monjalon April 9, 2021, 3:33 p.m. UTC | #9
09/04/2021 17:06, Aaron Conole:
> Bruce Richardson <bruce.richardson@intel.com> writes:
> 
> > On Tue, Apr 06, 2021 at 10:20:37AM -0400, Aaron Conole wrote:
> >> Bruce Richardson <bruce.richardson@intel.com> writes:
> >> 
> >> > On Tue, Apr 06, 2021 at 08:33:07AM -0400, Aaron Conole wrote:
> >> >> Thomas Monjalon <thomas@monjalon.net> writes:
> >> >> 
> >> >> > 17/03/2021 15:44, Aaron Conole:
> >> >> >> The hugepage test really needs to check multiple things on Linux:
> >> >> >> 
> >> >> >> 1. Are hugepages reserved in the system?
> >> >> >> 
> >> >> >> 2. Is the hugepage mountpoint available so that we can allocate them?
> >> >> >> 
> >> >> >> 3. Do we have permissions to write into the hugepage mountpoint?
> >> >> >> 
> >> >> >> The existing hugepage check only verifies the first.  On some setups,
> >> >> >> a non-root user won't have access to the mountpoint for hugepages to
> >> >> >> be allocated and that needs to be reflected in the test as well.  Add
> >> >> >> such checks for Linux OS to give a more check when running test suites.
> >> >> >
> >> >> > Requirements 2 & 3 are optional.
> >> >> > You don't need a mount point if using the option --in-memory.
> >> >> 
> >> >> That's true, but it seems to break a few of the unit tests without.
> >> >> I'll clarify the commit message.
> >> >> 
> >> >> Additionally, I thought it would be simple to just incorporate your
> >> >> suggestions - but it seems that meson / ninja doesn't have cascading
> >> >> dependencies the way 'make' does (or, I haven't figured out from the
> >> >> syntax how to do that) - a 'run_command' gets resolved at configure
> >> >> time and it doesn't seem that we can make a run_target depend on another
> >> >> run_target since dependencies are on file outputs.  Maybe we do some
> >> >> kind of trickery here where we write a file that the build script reads?
> >> >> 
> >> >> I am trying to figure out how best to accomplish this - suggestions
> >> >> welcome.
> >> >> 
> >> > Sorry that I'm late to this thread. Can you perhaps explain what you mean
> >> > by cascading dependencies in this instance, or what you are trying to do
> >> > exactly that is not supported?
> >> 
> >> I want to conditionally invoke the test suite with the hugepage tests,
> >> and support the case that the machine has hugepages enabled, but not
> >> accessible.
> >> 
> >> Right now, if a user runs:
> >> 
> >>   meson build && ninja -C build test
> >> 
> >> with hugepages allocated as a non-root user, they will see 'FAIL'
> >> messages.  This isn't very friendly, since the user would be confused.
> >> 
> >> Right now, hugepage detection is done only at configure time (the
> >> 'meson' step), and then the target is always run.
> >> 
> >> For now, I will continue modifying the below script, but that will be a
> >> detection at configure time, still.  So the case where the user runs
> >> 'meson' when they have hugepages, but those hugepages go away and then
> >> they run 'ninja -C build test' will still be FAIL instead of SKIP (maybe
> >> we need a more descriptive error when FAIL due to hugepages happen?)
> >> 
> >
> > This seems to me like the test binary itself should be checking the
> > presence of hugepages, and reporting skips if necessary. It's not just when
> > run through ninja that this functionality would be useful.
> 
> Either way, there needs to be a rework - if we do it in the test binary,
> then the tests that require hugepages need to be worked so that they
> correctly detect lack of hugepage support before starting.  If we keep
> that knowledge in the meson system, then we need to change the way we
> call the test binary script to support a more robust detection.
> 
> I guess, I don't care too much which one is the one we choose.  My $.02
> opinion is that we already have most of the logic and whatnot done in
> the build system, so I'd prefer to do as small a change as possible
> (leaving that logic in the meson system).  Then again, maybe it makes
> more sense to just rip the bandaid off and move it all into the test
> framework.
> 
> WDYT?

I think the test application should adapt to its environment.
If no hugepage, then mark the tests requiring hugepages as skipped.
For the other tests, we could use --in-memory.
  
Bruce Richardson April 9, 2021, 3:40 p.m. UTC | #10
On Fri, Apr 09, 2021 at 11:06:20AM -0400, Aaron Conole wrote:
> Bruce Richardson <bruce.richardson@intel.com> writes:
> 
> > On Tue, Apr 06, 2021 at 10:20:37AM -0400, Aaron Conole wrote:
> >> Bruce Richardson <bruce.richardson@intel.com> writes:
> >> 
> >> > On Tue, Apr 06, 2021 at 08:33:07AM -0400, Aaron Conole wrote:
> >> >> Thomas Monjalon <thomas@monjalon.net> writes:
> >> >> 
> >> >> > 17/03/2021 15:44, Aaron Conole:
> >> >> >> The hugepage test really needs to check multiple things on Linux:
> >> >> >> 
> >> >> >> 1. Are hugepages reserved in the system?
> >> >> >> 
> >> >> >> 2. Is the hugepage mountpoint available so that we can allocate them?
> >> >> >> 
> >> >> >> 3. Do we have permissions to write into the hugepage mountpoint?
> >> >> >> 
> >> >> >> The existing hugepage check only verifies the first.  On some setups,
> >> >> >> a non-root user won't have access to the mountpoint for hugepages to
> >> >> >> be allocated and that needs to be reflected in the test as well.  Add
> >> >> >> such checks for Linux OS to give a more check when running test suites.
> >> >> >
> >> >> > Requirements 2 & 3 are optional.
> >> >> > You don't need a mount point if using the option --in-memory.
> >> >> 
> >> >> That's true, but it seems to break a few of the unit tests without.
> >> >> I'll clarify the commit message.
> >> >> 
> >> >> Additionally, I thought it would be simple to just incorporate your
> >> >> suggestions - but it seems that meson / ninja doesn't have cascading
> >> >> dependencies the way 'make' does (or, I haven't figured out from the
> >> >> syntax how to do that) - a 'run_command' gets resolved at configure
> >> >> time and it doesn't seem that we can make a run_target depend on another
> >> >> run_target since dependencies are on file outputs.  Maybe we do some
> >> >> kind of trickery here where we write a file that the build script reads?
> >> >> 
> >> >> I am trying to figure out how best to accomplish this - suggestions
> >> >> welcome.
> >> >> 
> >> > Sorry that I'm late to this thread. Can you perhaps explain what you mean
> >> > by cascading dependencies in this instance, or what you are trying to do
> >> > exactly that is not supported?
> >> 
> >> I want to conditionally invoke the test suite with the hugepage tests,
> >> and support the case that the machine has hugepages enabled, but not
> >> accessible.
> >> 
> >> Right now, if a user runs:
> >> 
> >>   meson build && ninja -C build test
> >> 
> >> with hugepages allocated as a non-root user, they will see 'FAIL'
> >> messages.  This isn't very friendly, since the user would be confused.
> >> 
> >> Right now, hugepage detection is done only at configure time (the
> >> 'meson' step), and then the target is always run.
> >> 
> >> For now, I will continue modifying the below script, but that will be a
> >> detection at configure time, still.  So the case where the user runs
> >> 'meson' when they have hugepages, but those hugepages go away and then
> >> they run 'ninja -C build test' will still be FAIL instead of SKIP (maybe
> >> we need a more descriptive error when FAIL due to hugepages happen?)
> >> 
> >
> > This seems to me like the test binary itself should be checking the
> > presence of hugepages, and reporting skips if necessary. It's not just when
> > run through ninja that this functionality would be useful.
> 
> Either way, there needs to be a rework - if we do it in the test binary,
> then the tests that require hugepages need to be worked so that they
> correctly detect lack of hugepage support before starting.  If we keep
> that knowledge in the meson system, then we need to change the way we
> call the test binary script to support a more robust detection.
> 
For putting the detection in the app, it should just be a one-time
detection at init, right? There aren't differing hugepage requirements per
test? If not, then any tests requiring hugepages just need to
check a global var and return skipped if hugepages are unavailable.

/Bruce
  
David Marchand April 12, 2021, 11:33 a.m. UTC | #11
On Fri, Apr 9, 2021 at 5:33 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > This seems to me like the test binary itself should be checking the
> > > presence of hugepages, and reporting skips if necessary. It's not just when
> > > run through ninja that this functionality would be useful.
> >
> > Either way, there needs to be a rework - if we do it in the test binary,
> > then the tests that require hugepages need to be worked so that they
> > correctly detect lack of hugepage support before starting.  If we keep
> > that knowledge in the meson system, then we need to change the way we
> > call the test binary script to support a more robust detection.
> >
> > I guess, I don't care too much which one is the one we choose.  My $.02
> > opinion is that we already have most of the logic and whatnot done in
> > the build system, so I'd prefer to do as small a change as possible
> > (leaving that logic in the meson system).  Then again, maybe it makes
> > more sense to just rip the bandaid off and move it all into the test
> > framework.
> >
> > WDYT?
>
> I think the test application should adapt to its environment.
> If no hugepage, then mark the tests requiring hugepages as skipped.
> For the other tests, we could use --in-memory.

There are tests that rely on or exercise multi-process (mp) support.
They would have to be identified and skipped if we want to go with --in-memory.
  
Stephen Hemminger June 29, 2023, 4:30 p.m. UTC | #12
On Fri, 19 Mar 2021 10:34:02 -0400
Aaron Conole <aconole@redhat.com> wrote:

> > The check is evaluated first when configuring as normal user and not
> > reevaluated when running the tests as root in Travis/GHA, so the tests
> > are started with no hugepage.
> >
> > Then, in Travis and GHA, we can see (sigbus o_O) crashes in no-huge mode.
> > I tried to reproduce but did not manage.  
> 
> I noticed that as well.  I am equally as confused, but I'm working on
> it (along with folding in Thomas' suggestions).

Marking this patch as "Changes requested".
Please resubmit if more work is needed.
  

Patch

diff --git a/app/test/has-hugepage.sh b/app/test/has-hugepage.sh
index d600fad319..1c3cfb665a 100755
--- a/app/test/has-hugepage.sh
+++ b/app/test/has-hugepage.sh
@@ -3,7 +3,17 @@ 
 # Copyright 2020 Mellanox Technologies, Ltd
 
 if [ "$(uname)" = "Linux" ] ; then
-	cat /proc/sys/vm/nr_hugepages || echo 0
+	nr_hugepages=$(cat /proc/sys/vm/nr_hugepages)
+	# Need to check if we have permissions to access hugepages
+	perm=""
+	for mount in `mount | grep hugetlbfs | awk '{ print $3; }'`; do
+		test ! -w $mount/. || perm="$mount"
+	done
+	if [ "$perm" = "" -o "$nr_hugepages" = "0" ]; then
+		echo 0
+	else
+		echo $nr_hugepages
+	fi
 elif [ "$(uname)" = "FreeBSD" ] ; then
 	echo 1 # assume FreeBSD always has hugepages
 else