[v2] test: fix process dup fd close

Message ID 20190902094939.20482-1-kkanas@marvell.com (mailing list archive)
State Superseded, archived
Delegated to: David Marchand
Headers
Series [v2] test: fix process dup fd close |

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/Intel-compilation success Compilation OK
ci/iol-dpdk_compile_spdk success Compile Testing PASS
ci/iol-dpdk_compile_ovs success Compile Testing PASS
ci/iol-dpdk_compile success Compile Testing PASS
ci/intel-Performance success Performance Testing PASS
ci/mellanox-Performance success Performance Testing PASS

Commit Message

Krzysztof Kanas Sept. 2, 2019, 9:49 a.m. UTC
  From: Krzysztof Kanas <kkanas@marvell.com>

process_dup was intending to close it's own fd's but failed to do so

Fixes: af75078fece3 ("first public release")

Signed-off-by: Krzysztof Kanas <kkanas@marvell.com>
---
v2:
* remove unnecessary commit msg information

 app/test/process.h | 40 +++++++++++++++++++++++++++++++++++-----
 1 file changed, 35 insertions(+), 5 deletions(-)
  

Comments

Krzysztof Kanas Sept. 23, 2019, 11:32 a.m. UTC | #1
Ping..

On 19-09-02 11:49, kkanas@marvell.com wrote:
> From: Krzysztof Kanas <kkanas@marvell.com>
> 
> process_dup was intending to close it's own fd's but failed to do so
> 
> Fixes: af75078fece3 ("first public release")
> 
> Signed-off-by: Krzysztof Kanas <kkanas@marvell.com>
> ---
> v2:
> * remove unnecessary commit msg information
> 
>  app/test/process.h | 40 +++++++++++++++++++++++++++++++++++-----
>  1 file changed, 35 insertions(+), 5 deletions(-)
> 
> diff --git a/app/test/process.h b/app/test/process.h
> index 128ce41219a1..2a6428f104e1 100644
> --- a/app/test/process.h
> +++ b/app/test/process.h
> @@ -11,6 +11,7 @@
>  #include <stdlib.h> /* NULL */
>  #include <string.h> /* strerror */
>  #include <unistd.h> /* readlink */
> +#include <dirent.h>
>  #include <sys/wait.h>
>  
>  #include <rte_string_fns.h> /* strlcpy */
> @@ -40,8 +41,12 @@ process_dup(const char *const argv[], int numargs, const char *env_value)
>  {
>  	int num;
>  	char *argv_cpy[numargs + 1];
> -	int i, fd, status;
> +	int i, fd, fdir, status;
> +	struct dirent *dirent = NULL;
> +	const char *procdir = "/proc/self/fd/";
>  	char path[32];
> +	char *endptr;
> +	DIR *dir = NULL;
>  #ifdef RTE_LIBRTE_PDUMP
>  	pthread_t thread;
>  #endif
> @@ -58,11 +63,36 @@ process_dup(const char *const argv[], int numargs, const char *env_value)
>  
>  		/* close all open file descriptors, check /proc/self/fd to only
>  		 * call close on open fds. Exclude fds 0, 1 and 2*/
> -		for (fd = getdtablesize(); fd > 2; fd-- ) {
> -			snprintf(path, sizeof(path), "/proc/" exe "/fd/%d", fd);
> -			if (access(path, F_OK) == 0)
> -				close(fd);
> +		dir = opendir(procdir);
> +
> +		if (dir == NULL) {
> +			rte_panic("Error opening %s: %s\n", procdir,
> +					strerror(errno));
> +		}
> +
> +		fdir = dirfd(dir);
> +		if (fdir < 0) {
> +			status = errno;
> +			closedir(dir);
> +			rte_panic("Error %d obtaining fd for dir %s: %s\n",
> +					fdir, procdir, strerror(status));
>  		}
> +
> +		while ((dirent = readdir(dir)) != NULL) {
> +			errno = 0;
> +			fd = strtol(dirent->d_name, &endptr, 10);
> +			if (errno != 0 || dirent->d_name == endptr) {
> +				printf("Error converint name fd %d %s:\n",
> +					fd, dirent->d_name);
> +				continue;
> +			}
> +
> +			if (fd == fdir || fd <= 2)
> +				continue;
> +
> +			close(fd);
> +		}
> +
>  		printf("Running binary with argv[]:");
>  		for (i = 0; i < num; i++)
>  			printf("'%s' ", argv_cpy[i]);
> -- 
> 2.21.0
>
  
David Marchand Oct. 30, 2019, 9:06 a.m. UTC | #2
On Mon, Sep 2, 2019 at 11:50 AM <kkanas@marvell.com> wrote:
>
> From: Krzysztof Kanas <kkanas@marvell.com>
>
> process_dup was intending to close it's own fd's but failed to do so

A bit hard to digest, what is the problem that you want to fix?

Thanks.
  
Krzysztof Kanas Nov. 4, 2019, 7:52 a.m. UTC | #3
On 19-10-30 10:06, David Marchand wrote:
> External Email
> 
> ----------------------------------------------------------------------
> On Mon, Sep 2, 2019 at 11:50 AM <kkanas@marvell.com> wrote:
> >
> > From: Krzysztof Kanas <kkanas@marvell.com>
> >
> > process_dup was intending to close it's own fd's but failed to do so
> 
> A bit hard to digest, what is the problem that you want to fix?
> 
> Thanks.

I don't recall the exact test name but I think it was test_eal_flags 
that is included in the meson test suite for fast tests.
This test was timing out on ARM64.

Using strace (-f -c IIRC) showed that the process is spending most of 
the time in access syscall trying to close not existing fd's.

-               for (fd = getdtablesize(); fd > 2; fd-- ) {
-                       snprintf(path, sizeof(path), "/proc/" exe "/fd/%d", fd);
-                       if (access(path, F_OK) == 0)
-                               close(fd);

So the simplest way to do so is to close only the opened fd's not
blindly try every possible one.

Also I think the code was wrong in another way as it tried to iterate 
over /proc/ *exe* /fd/%d/ when it should iterate over /proc/ *self* 
/fd/, so either the comment bove the for loop was wrong or the code was 
incorrect.


> 
> 
> -- 
> David Marchand
>
  
David Marchand Nov. 6, 2019, 2:36 p.m. UTC | #4
On Mon, Nov 4, 2019 at 8:52 AM Krzysztof Kanas <kkanas@marvell.com> wrote:
>
> On 19-10-30 10:06, David Marchand wrote:
> > External Email
> >
> > ----------------------------------------------------------------------
> > On Mon, Sep 2, 2019 at 11:50 AM <kkanas@marvell.com> wrote:
> > >
> > > From: Krzysztof Kanas <kkanas@marvell.com>
> > >
> > > process_dup was intending to close it's own fd's but failed to do so
> >
> > A bit hard to digest, what is the problem that you want to fix?
> >
> > Thanks.
>
> I don't recall the exact test name but I think it was test_eal_flags
> that is included in the meson test suite for fast tests.
> This test was timing out on ARM64.
>
> Using strace (-f -c IIRC) showed that the process is spending most of
> the time in access syscall trying to close not existing fd's.
>
> -               for (fd = getdtablesize(); fd > 2; fd-- ) {
> -                       snprintf(path, sizeof(path), "/proc/" exe "/fd/%d", fd);
> -                       if (access(path, F_OK) == 0)
> -                               close(fd);
>
> So the simplest way to do so is to close only the opened fd's not
> blindly try every possible one.
>
> Also I think the code was wrong in another way as it tried to iterate
> over /proc/ *exe* /fd/%d/ when it should iterate over /proc/ *self*
> /fd/, so either the comment bove the for loop was wrong or the code was
> incorrect.

Thanks.
I agree that this loop is terrible.
I have some comments on the patch.
  
David Marchand Nov. 6, 2019, 2:58 p.m. UTC | #5
On Mon, Sep 2, 2019 at 11:50 AM <kkanas@marvell.com> wrote:
>
> From: Krzysztof Kanas <kkanas@marvell.com>
>
> process_dup was intending to close it's own fd's but failed to do so

Please, explain how you caught the issue and what the impact was on your system.

>
> Fixes: af75078fece3 ("first public release")
Cc: stable@dpdk.org

>
> Signed-off-by: Krzysztof Kanas <kkanas@marvell.com>
> ---
> v2:
> * remove unnecessary commit msg information
>
>  app/test/process.h | 40 +++++++++++++++++++++++++++++++++++-----
>  1 file changed, 35 insertions(+), 5 deletions(-)
>
> diff --git a/app/test/process.h b/app/test/process.h
> index 128ce41219a1..2a6428f104e1 100644
> --- a/app/test/process.h
> +++ b/app/test/process.h
> @@ -11,6 +11,7 @@
>  #include <stdlib.h> /* NULL */
>  #include <string.h> /* strerror */
>  #include <unistd.h> /* readlink */
> +#include <dirent.h>
>  #include <sys/wait.h>
>
>  #include <rte_string_fns.h> /* strlcpy */
> @@ -40,8 +41,12 @@ process_dup(const char *const argv[], int numargs, const char *env_value)
>  {
>         int num;
>         char *argv_cpy[numargs + 1];
> -       int i, fd, status;
> +       int i, fd, fdir, status;
> +       struct dirent *dirent = NULL;

No need to initialise to NULL.

> +       const char *procdir = "/proc/self/fd/";

self is a Linux thing.
This won't work on FreeBSD.


>         char path[32];
> +       char *endptr;
> +       DIR *dir = NULL;

Idem, no need to initialize.

>  #ifdef RTE_LIBRTE_PDUMP
>         pthread_t thread;
>  #endif
> @@ -58,11 +63,36 @@ process_dup(const char *const argv[], int numargs, const char *env_value)
>
>                 /* close all open file descriptors, check /proc/self/fd to only
>                  * call close on open fds. Exclude fds 0, 1 and 2*/
> -               for (fd = getdtablesize(); fd > 2; fd-- ) {
> -                       snprintf(path, sizeof(path), "/proc/" exe "/fd/%d", fd);
> -                       if (access(path, F_OK) == 0)
> -                               close(fd);
> +               dir = opendir(procdir);
> +

Remove empty line.


> +               if (dir == NULL) {
> +                       rte_panic("Error opening %s: %s\n", procdir,
> +                                       strerror(errno));
> +               }
> +
> +               fdir = dirfd(dir);
> +               if (fdir < 0) {
> +                       status = errno;

No need to set status, this (forked) process will panic.


> +                       closedir(dir);
> +                       rte_panic("Error %d obtaining fd for dir %s: %s\n",
> +                                       fdir, procdir, strerror(status));
>                 }
> +
> +               while ((dirent = readdir(dir)) != NULL) {
> +                       errno = 0;
> +                       fd = strtol(dirent->d_name, &endptr, 10);
> +                       if (errno != 0 || dirent->d_name == endptr) {

If you want to validate that the entire string is valid:

if (errno != 0 || endptr[0] != '\0')

> +                               printf("Error converint name fd %d %s:\n",
> +                                       fd, dirent->d_name);

converting


> +                               continue;
> +                       }
> +
> +                       if (fd == fdir || fd <= 2)
> +                               continue;
> +
> +                       close(fd);
> +               }

Missing closedir().

> +
>                 printf("Running binary with argv[]:");
>                 for (i = 0; i < num; i++)
>                         printf("'%s' ", argv_cpy[i]);
> --
> 2.21.0
>
  
Krzysztof Kanas Nov. 8, 2019, 10:21 a.m. UTC | #6
Hi David,

Thanks for review, hopefully this patch will addresses most of the sutff.
Rest I will address here.

> 
> > +       const char *procdir = "/proc/self/fd/";
> 
> self is a Linux thing.
> This won't work on FreeBSD.

IMHO original code didn't worked on FreeBSD as well.
I have created function to adress in third patch

> No need to set status, this (forked) process will panic.
I am using status to save errno as it value can be overwritten on unlikely
event of closedir failure.

---
v3 changes:
  add missing closedir
  fix strtol string validation test
  removed empty line
  remove unecssary initalization
  add check for '.' and '..' to avoid errors in parsing entries
  fix commit message
  add FreeBSD function for closing files
v2 changes:
  fix commit message

Krzysztof Kanas (3):
  test: fix timeout in flags autotest
  test: move close files to separate function
  test: fix FreeBSD file closing function

 process.h |  183 ++++++++++++++++++++++++++++++++++++++++++++++----------------
 1 file changed, 138 insertions(+), 45 deletions(-)
  
David Marchand Nov. 8, 2019, 11:05 a.m. UTC | #7
On Fri, Nov 8, 2019 at 11:21 AM <kkanas@marvell.com> wrote:
>
> Hi David,
>
> Thanks for review, hopefully this patch will addresses most of the sutff.
> Rest I will address here.
>
> >
> > > +       const char *procdir = "/proc/self/fd/";
> >
> > self is a Linux thing.
> > This won't work on FreeBSD.
>
> IMHO original code didn't worked on FreeBSD as well.
> I have created function to adress in third patch

Indeed...

Well, wait.
Why do we need to close those file descriptors?
FreeBSD has been like this for quite some time.

Can't we just remove this code?
Maybe someone from Intel has an idea of why it was like this?

Adding Bruce.
  
David Marchand Nov. 8, 2019, 11:11 a.m. UTC | #8
On Fri, Nov 8, 2019 at 12:05 PM David Marchand
<david.marchand@redhat.com> wrote:
>
> On Fri, Nov 8, 2019 at 11:21 AM <kkanas@marvell.com> wrote:
> >
> > Hi David,
> >
> > Thanks for review, hopefully this patch will addresses most of the sutff.
> > Rest I will address here.
> >
> > >
> > > > +       const char *procdir = "/proc/self/fd/";
> > >
> > > self is a Linux thing.
> > > This won't work on FreeBSD.
> >
> > IMHO original code didn't worked on FreeBSD as well.
> > I have created function to adress in third patch
>
> Indeed...
>
> Well, wait.
> Why do we need to close those file descriptors?
> FreeBSD has been like this for quite some time.
>
> Can't we just remove this code?
> Maybe someone from Intel has an idea of why it was like this?

Great...

        /* get file for config (fd is always 3) */
        snprintf(path, sizeof(path), "/proc/self/fd/%d", 3);

And this only applies to linux (getting the current file prefix).
  
David Marchand Nov. 8, 2019, 1:45 p.m. UTC | #9
On Fri, Nov 8, 2019 at 12:05 PM David Marchand
<david.marchand@redhat.com> wrote:
>
> On Fri, Nov 8, 2019 at 11:21 AM <kkanas@marvell.com> wrote:
> >
> > Hi David,
> >
> > Thanks for review, hopefully this patch will addresses most of the sutff.
> > Rest I will address here.
> >
> > >
> > > > +       const char *procdir = "/proc/self/fd/";
> > >
> > > self is a Linux thing.
> > > This won't work on FreeBSD.
> >
> > IMHO original code didn't worked on FreeBSD as well.
> > I have created function to adress in third patch
>
> Indeed...
>
> Well, wait.
> Why do we need to close those file descriptors?
> FreeBSD has been like this for quite some time.

We don't know what this is used for.

We know it does not work on FreeBSD, but this does not seem to be a problem.
Introducing something more on FreeBSD is a risk with no actual benefit
at first sight.

Either we take only the first patch under a #ifdef EXEC_ENV_LINUX or
we leave this as is.

Krzystof, is this a problem for you if we postpone and investigate
further for 20.02?
  
Krzysztof Kanas Nov. 12, 2019, 8:26 a.m. UTC | #10
On 19-11-08 14:45, David Marchand wrote:
> External Email
> 
> ----------------------------------------------------------------------
> On Fri, Nov 8, 2019 at 12:05 PM David Marchand
> <david.marchand@redhat.com> wrote:
> >
> > On Fri, Nov 8, 2019 at 11:21 AM <kkanas@marvell.com> wrote:
> > >
> > > Hi David,
> > >
> > > Thanks for review, hopefully this patch will addresses most of the sutff.
> > > Rest I will address here.
> > >
> > > >
> > > > > +       const char *procdir = "/proc/self/fd/";
> > > >
> > > > self is a Linux thing.
> > > > This won't work on FreeBSD.
> > >
> > > IMHO original code didn't worked on FreeBSD as well.
> > > I have created function to adress in third patch
> >
> > Indeed...
> >
> > Well, wait.
> > Why do we need to close those file descriptors?
> > FreeBSD has been like this for quite some time.
> 
> We don't know what this is used for.
> 
> We know it does not work on FreeBSD, but this does not seem to be a problem.
> Introducing something more on FreeBSD is a risk with no actual benefit
> at first sight.
Ok.

> 
> Either we take only the first patch under a #ifdef EXEC_ENV_LINUX or
> we leave this as is.
I would like have that because ARM64 is main platform for me and this 
makes the test pass, and fixes the difference between the comment and 
the code.
> 
> Krzystof, is this a problem for you if we postpone and investigate
> further for 20.02?
Not a problem.  I would like to have fix for EXEC_ENV_LINUX for now.  
Rest, FreeBSD case and the decisions weather this whole code should be 
deleted can be made later.
> 
> 
> -- 
> David Marchand
>
  
David Marchand Nov. 12, 2019, 8:34 p.m. UTC | #11
On Tue, Nov 12, 2019 at 9:29 AM Krzysztof Kanas <kkanas@marvell.com> wrote:
>
> On 19-11-08 14:45, David Marchand wrote:
> > External Email
> >
> > ----------------------------------------------------------------------
> > On Fri, Nov 8, 2019 at 12:05 PM David Marchand
> > <david.marchand@redhat.com> wrote:
> > >
> > > On Fri, Nov 8, 2019 at 11:21 AM <kkanas@marvell.com> wrote:
> > > >
> > > > Hi David,
> > > >
> > > > Thanks for review, hopefully this patch will addresses most of the sutff.
> > > > Rest I will address here.
> > > >
> > > > >
> > > > > > +       const char *procdir = "/proc/self/fd/";
> > > > >
> > > > > self is a Linux thing.
> > > > > This won't work on FreeBSD.
> > > >
> > > > IMHO original code didn't worked on FreeBSD as well.
> > > > I have created function to adress in third patch
> > >
> > > Indeed...
> > >
> > > Well, wait.
> > > Why do we need to close those file descriptors?
> > > FreeBSD has been like this for quite some time.
> >
> > We don't know what this is used for.
> >
> > We know it does not work on FreeBSD, but this does not seem to be a problem.
> > Introducing something more on FreeBSD is a risk with no actual benefit
> > at first sight.
> Ok.
>
> >
> > Either we take only the first patch under a #ifdef EXEC_ENV_LINUX or
> > we leave this as is.
> I would like have that because ARM64 is main platform for me and this
> makes the test pass, and fixes the difference between the comment and
> the code.
> >
> > Krzystof, is this a problem for you if we postpone and investigate
> > further for 20.02?
> Not a problem.  I would like to have fix for EXEC_ENV_LINUX for now.
> Rest, FreeBSD case and the decisions weather this whole code should be
> deleted can be made later.

In my tests of the original code, I never saw anything but access -1 ENOENT.
Anyway, just sent a v4, your v3 patch, with just cosmetics and
commitlog updated.
  
Krzysztof Kanas Nov. 13, 2019, 1:35 p.m. UTC | #12
On 12/11/2019 21:34, David Marchand wrote:
> On Tue, Nov 12, 2019 at 9:29 AM Krzysztof Kanas <kkanas@marvell.com> wrote:
>> On 19-11-08 14:45, David Marchand wrote:
>>> External Email
>>>
>>> ----------------------------------------------------------------------
>>> On Fri, Nov 8, 2019 at 12:05 PM David Marchand
>>> <david.marchand@redhat.com> wrote:
>>>> On Fri, Nov 8, 2019 at 11:21 AM <kkanas@marvell.com> wrote:
>>>>> Hi David,
>>>>>
>>>>> Thanks for review, hopefully this patch will addresses most of the sutff.
>>>>> Rest I will address here.
>>>>>
>>>>>>> +       const char *procdir = "/proc/self/fd/";
>>>>>> self is a Linux thing.
>>>>>> This won't work on FreeBSD.
>>>>> IMHO original code didn't worked on FreeBSD as well.
>>>>> I have created function to adress in third patch
>>>> Indeed...
>>>>
>>>> Well, wait.
>>>> Why do we need to close those file descriptors?
>>>> FreeBSD has been like this for quite some time.
>>> We don't know what this is used for.
>>>
>>> We know it does not work on FreeBSD, but this does not seem to be a problem.
>>> Introducing something more on FreeBSD is a risk with no actual benefit
>>> at first sight.
>> Ok.
>>
>>> Either we take only the first patch under a #ifdef EXEC_ENV_LINUX or
>>> we leave this as is.
>> I would like have that because ARM64 is main platform for me and this
>> makes the test pass, and fixes the difference between the comment and
>> the code.
>>> Krzystof, is this a problem for you if we postpone and investigate
>>> further for 20.02?
>> Not a problem.  I would like to have fix for EXEC_ENV_LINUX for now.
>> Rest, FreeBSD case and the decisions weather this whole code should be
>> deleted can be made later.
> In my tests of the original code, I never saw anything but access -1 ENOENT.
> Anyway, just sent a v4, your v3 patch, with just cosmetics and
> commitlog updated.

ACK. Tested, the v4 patch works fine.

>
>
  

Patch

diff --git a/app/test/process.h b/app/test/process.h
index 128ce41219a1..2a6428f104e1 100644
--- a/app/test/process.h
+++ b/app/test/process.h
@@ -11,6 +11,7 @@ 
 #include <stdlib.h> /* NULL */
 #include <string.h> /* strerror */
 #include <unistd.h> /* readlink */
+#include <dirent.h>
 #include <sys/wait.h>
 
 #include <rte_string_fns.h> /* strlcpy */
@@ -40,8 +41,12 @@  process_dup(const char *const argv[], int numargs, const char *env_value)
 {
 	int num;
 	char *argv_cpy[numargs + 1];
-	int i, fd, status;
+	int i, fd, fdir, status;
+	struct dirent *dirent = NULL;
+	const char *procdir = "/proc/self/fd/";
 	char path[32];
+	char *endptr;
+	DIR *dir = NULL;
 #ifdef RTE_LIBRTE_PDUMP
 	pthread_t thread;
 #endif
@@ -58,11 +63,36 @@  process_dup(const char *const argv[], int numargs, const char *env_value)
 
 		/* close all open file descriptors, check /proc/self/fd to only
 		 * call close on open fds. Exclude fds 0, 1 and 2*/
-		for (fd = getdtablesize(); fd > 2; fd-- ) {
-			snprintf(path, sizeof(path), "/proc/" exe "/fd/%d", fd);
-			if (access(path, F_OK) == 0)
-				close(fd);
+		dir = opendir(procdir);
+
+		if (dir == NULL) {
+			rte_panic("Error opening %s: %s\n", procdir,
+					strerror(errno));
+		}
+
+		fdir = dirfd(dir);
+		if (fdir < 0) {
+			status = errno;
+			closedir(dir);
+			rte_panic("Error %d obtaining fd for dir %s: %s\n",
+					fdir, procdir, strerror(status));
 		}
+
+		while ((dirent = readdir(dir)) != NULL) {
+			errno = 0;
+			fd = strtol(dirent->d_name, &endptr, 10);
+			if (errno != 0 || dirent->d_name == endptr) {
+				printf("Error converint name fd %d %s:\n",
+					fd, dirent->d_name);
+				continue;
+			}
+
+			if (fd == fdir || fd <= 2)
+				continue;
+
+			close(fd);
+		}
+
 		printf("Running binary with argv[]:");
 		for (i = 0; i < num; i++)
 			printf("'%s' ", argv_cpy[i]);