timer: fix resource leak in finalize

Message ID 1556737217-24338-1-git-send-email-erik.g.carrillo@intel.com (mailing list archive)
State Superseded, archived
Headers
Series timer: fix resource leak in finalize |

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/mellanox-Performance-Testing success Performance Testing PASS
ci/intel-Performance-Testing success Performance Testing PASS
ci/Intel-compilation success Compilation OK

Commit Message

Carrillo, Erik G May 1, 2019, 7 p.m. UTC
  The finalize function should free the memzone created in the init
function, rather than freeing the allocation the memzone references,
otherwise a memzone descriptor can be leaked.

Fixes: c0749f7096c7 ("timer: allow management in shared memory")

Signed-off-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
---
 lib/librte_timer/rte_timer.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)
  

Comments

Anatoly Burakov May 2, 2019, 9:18 a.m. UTC | #1
On 01-May-19 8:00 PM, Erik Gabriel Carrillo wrote:
> The finalize function should free the memzone created in the init
> function, rather than freeing the allocation the memzone references,
> otherwise a memzone descriptor can be leaked.
> 
> Fixes: c0749f7096c7 ("timer: allow management in shared memory")
> 
> Signed-off-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
> ---
>   lib/librte_timer/rte_timer.c | 12 ++++++++++--
>   1 file changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/librte_timer/rte_timer.c b/lib/librte_timer/rte_timer.c
> index eb46009..fb7a87e 100644
> --- a/lib/librte_timer/rte_timer.c
> +++ b/lib/librte_timer/rte_timer.c
> @@ -60,6 +60,7 @@ struct rte_timer_data {
>   };
>   
>   #define RTE_MAX_DATA_ELS 64
> +static const struct rte_memzone *rte_timer_data_mz;
>   static struct rte_timer_data *rte_timer_data_arr;
>   static const uint32_t default_data_id;
>   static uint32_t rte_timer_subsystem_initialized;
> @@ -164,6 +165,7 @@ rte_timer_subsystem_init_v1905(void)
>   		if (mz == NULL)
>   			return -EEXIST;
>   
> +		rte_timer_data_mz = mz;
>   		rte_timer_data_arr = mz->addr;
>   
>   		rte_timer_data_arr[default_data_id].internal_flags |=
> @@ -180,6 +182,7 @@ rte_timer_subsystem_init_v1905(void)
>   	if (mz == NULL)
>   		return -ENOMEM;
>   
> +	rte_timer_data_mz = mz;
>   	rte_timer_data_arr = mz->addr;
>   
>   	for (i = 0; i < RTE_MAX_DATA_ELS; i++) {
> @@ -205,8 +208,13 @@ BIND_DEFAULT_SYMBOL(rte_timer_subsystem_init, _v1905, 19.05);
>   void __rte_experimental
>   rte_timer_subsystem_finalize(void)
>   {
> -	if (rte_timer_data_arr)
> -		rte_free(rte_timer_data_arr);
> +	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
> +		return;
> +
> +	if (!rte_timer_subsystem_initialized)
> +		return;
> +
> +	rte_memzone_free(rte_timer_data_mz);

The patch is a correct fix, but the whole idea of this looks dangerous 
to me.

If we exit the primary while secondaries are still running, wouldn't it 
basically pull out timer data from under secondaries' feet?

>   
>   	rte_timer_subsystem_initialized = 0;
>   }
>
  
Carrillo, Erik G May 2, 2019, 12:19 p.m. UTC | #2
> -----Original Message-----
> From: Burakov, Anatoly
> Sent: Thursday, May 2, 2019 4:18 AM
> To: Carrillo, Erik G <erik.g.carrillo@intel.com>; rsanford@akamai.com;
> thomas@monjalon.net
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] timer: fix resource leak in finalize
> 
> On 01-May-19 8:00 PM, Erik Gabriel Carrillo wrote:
> > The finalize function should free the memzone created in the init
> > function, rather than freeing the allocation the memzone references,
> > otherwise a memzone descriptor can be leaked.
> >
> > Fixes: c0749f7096c7 ("timer: allow management in shared memory")
> >
> > Signed-off-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
> > ---
> >   lib/librte_timer/rte_timer.c | 12 ++++++++++--
> >   1 file changed, 10 insertions(+), 2 deletions(-)
> >
> > diff --git a/lib/librte_timer/rte_timer.c
> > b/lib/librte_timer/rte_timer.c index eb46009..fb7a87e 100644
> > --- a/lib/librte_timer/rte_timer.c
> > +++ b/lib/librte_timer/rte_timer.c
> > @@ -60,6 +60,7 @@ struct rte_timer_data {
> >   };
> >
> >   #define RTE_MAX_DATA_ELS 64
> > +static const struct rte_memzone *rte_timer_data_mz;
> >   static struct rte_timer_data *rte_timer_data_arr;
> >   static const uint32_t default_data_id;
> >   static uint32_t rte_timer_subsystem_initialized; @@ -164,6 +165,7 @@
> > rte_timer_subsystem_init_v1905(void)
> >   		if (mz == NULL)
> >   			return -EEXIST;
> >
> > +		rte_timer_data_mz = mz;
> >   		rte_timer_data_arr = mz->addr;
> >
> >   		rte_timer_data_arr[default_data_id].internal_flags |= @@ -
> 180,6
> > +182,7 @@ rte_timer_subsystem_init_v1905(void)
> >   	if (mz == NULL)
> >   		return -ENOMEM;
> >
> > +	rte_timer_data_mz = mz;
> >   	rte_timer_data_arr = mz->addr;
> >
> >   	for (i = 0; i < RTE_MAX_DATA_ELS; i++) { @@ -205,8 +208,13 @@
> > BIND_DEFAULT_SYMBOL(rte_timer_subsystem_init, _v1905, 19.05);
> >   void __rte_experimental
> >   rte_timer_subsystem_finalize(void)
> >   {
> > -	if (rte_timer_data_arr)
> > -		rte_free(rte_timer_data_arr);
> > +	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
> > +		return;
> > +
> > +	if (!rte_timer_subsystem_initialized)
> > +		return;
> > +
> > +	rte_memzone_free(rte_timer_data_mz);
> 
> The patch is a correct fix, but the whole idea of this looks dangerous to me.
> 
> If we exit the primary while secondaries are still running, wouldn't it basically
> pull out timer data from under secondaries' feet?
> 

Ah yes - that’s right.  Perhaps it would be better to maintain a reference count of some sort such that the last process to exit could cause the memzone_free.

Thanks,
Erik

> >
> >   	rte_timer_subsystem_initialized = 0;
> >   }
> >
> 
> 
> --
> Thanks,
> Anatoly
  
Anatoly Burakov May 2, 2019, 1:03 p.m. UTC | #3
On 02-May-19 1:19 PM, Carrillo, Erik G wrote:
> 
> 
>> -----Original Message-----
>> From: Burakov, Anatoly
>> Sent: Thursday, May 2, 2019 4:18 AM
>> To: Carrillo, Erik G <erik.g.carrillo@intel.com>; rsanford@akamai.com;
>> thomas@monjalon.net
>> Cc: dev@dpdk.org
>> Subject: Re: [dpdk-dev] [PATCH] timer: fix resource leak in finalize
>>
>> On 01-May-19 8:00 PM, Erik Gabriel Carrillo wrote:
>>> The finalize function should free the memzone created in the init
>>> function, rather than freeing the allocation the memzone references,
>>> otherwise a memzone descriptor can be leaked.
>>>
>>> Fixes: c0749f7096c7 ("timer: allow management in shared memory")
>>>
>>> Signed-off-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
>>> ---
>>>    lib/librte_timer/rte_timer.c | 12 ++++++++++--
>>>    1 file changed, 10 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/lib/librte_timer/rte_timer.c
>>> b/lib/librte_timer/rte_timer.c index eb46009..fb7a87e 100644
>>> --- a/lib/librte_timer/rte_timer.c
>>> +++ b/lib/librte_timer/rte_timer.c
>>> @@ -60,6 +60,7 @@ struct rte_timer_data {
>>>    };
>>>
>>>    #define RTE_MAX_DATA_ELS 64
>>> +static const struct rte_memzone *rte_timer_data_mz;
>>>    static struct rte_timer_data *rte_timer_data_arr;
>>>    static const uint32_t default_data_id;
>>>    static uint32_t rte_timer_subsystem_initialized; @@ -164,6 +165,7 @@
>>> rte_timer_subsystem_init_v1905(void)
>>>    		if (mz == NULL)
>>>    			return -EEXIST;
>>>
>>> +		rte_timer_data_mz = mz;
>>>    		rte_timer_data_arr = mz->addr;
>>>
>>>    		rte_timer_data_arr[default_data_id].internal_flags |= @@ -
>> 180,6
>>> +182,7 @@ rte_timer_subsystem_init_v1905(void)
>>>    	if (mz == NULL)
>>>    		return -ENOMEM;
>>>
>>> +	rte_timer_data_mz = mz;
>>>    	rte_timer_data_arr = mz->addr;
>>>
>>>    	for (i = 0; i < RTE_MAX_DATA_ELS; i++) { @@ -205,8 +208,13 @@
>>> BIND_DEFAULT_SYMBOL(rte_timer_subsystem_init, _v1905, 19.05);
>>>    void __rte_experimental
>>>    rte_timer_subsystem_finalize(void)
>>>    {
>>> -	if (rte_timer_data_arr)
>>> -		rte_free(rte_timer_data_arr);
>>> +	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
>>> +		return;
>>> +
>>> +	if (!rte_timer_subsystem_initialized)
>>> +		return;
>>> +
>>> +	rte_memzone_free(rte_timer_data_mz);
>>
>> The patch is a correct fix, but the whole idea of this looks dangerous to me.
>>
>> If we exit the primary while secondaries are still running, wouldn't it basically
>> pull out timer data from under secondaries' feet?
>>
> 
> Ah yes - that’s right.  Perhaps it would be better to maintain a reference count of some sort such that the last process to exit could cause the memzone_free.
> 

It feels like a hack, to be honest. A process can crash or exit without 
calling rte_eal_cleanup(), which will lead to a memory leak due to 
refcount being stuck at a value that's not representing reality. It will 
be saf-er than current approach, but still not ideal.

However, i guess it's a good compromise, if i were to choose between a 
memory leak and a segfault :D I wonder if there is a better approach.

> Thanks,
> Erik
> 
>>>
>>>    	rte_timer_subsystem_initialized = 0;
>>>    }
>>>
>>
>>
>> --
>> Thanks,
>> Anatoly
  
Carrillo, Erik G May 2, 2019, 1:48 p.m. UTC | #4
> -----Original Message-----
> From: Burakov, Anatoly
> Sent: Thursday, May 2, 2019 8:04 AM
> To: Carrillo, Erik G <erik.g.carrillo@intel.com>; rsanford@akamai.com;
> thomas@monjalon.net
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] timer: fix resource leak in finalize
> 
> On 02-May-19 1:19 PM, Carrillo, Erik G wrote:
> >
> >
> >> -----Original Message-----
> >> From: Burakov, Anatoly
> >> Sent: Thursday, May 2, 2019 4:18 AM
> >> To: Carrillo, Erik G <erik.g.carrillo@intel.com>;
> >> rsanford@akamai.com; thomas@monjalon.net
> >> Cc: dev@dpdk.org
> >> Subject: Re: [dpdk-dev] [PATCH] timer: fix resource leak in finalize
> >>
> >> On 01-May-19 8:00 PM, Erik Gabriel Carrillo wrote:
> >>> The finalize function should free the memzone created in the init
> >>> function, rather than freeing the allocation the memzone references,
> >>> otherwise a memzone descriptor can be leaked.
> >>>
> >>> Fixes: c0749f7096c7 ("timer: allow management in shared memory")
> >>>
> >>> Signed-off-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
> >>> ---
> >>>    lib/librte_timer/rte_timer.c | 12 ++++++++++--
> >>>    1 file changed, 10 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/lib/librte_timer/rte_timer.c
> >>> b/lib/librte_timer/rte_timer.c index eb46009..fb7a87e 100644
> >>> --- a/lib/librte_timer/rte_timer.c
> >>> +++ b/lib/librte_timer/rte_timer.c
> >>> @@ -60,6 +60,7 @@ struct rte_timer_data {
> >>>    };
> >>>
> >>>    #define RTE_MAX_DATA_ELS 64
> >>> +static const struct rte_memzone *rte_timer_data_mz;
> >>>    static struct rte_timer_data *rte_timer_data_arr;
> >>>    static const uint32_t default_data_id;
> >>>    static uint32_t rte_timer_subsystem_initialized; @@ -164,6 +165,7
> >>> @@
> >>> rte_timer_subsystem_init_v1905(void)
> >>>    		if (mz == NULL)
> >>>    			return -EEXIST;
> >>>
> >>> +		rte_timer_data_mz = mz;
> >>>    		rte_timer_data_arr = mz->addr;
> >>>
> >>>    		rte_timer_data_arr[default_data_id].internal_flags |= @@ -
> >> 180,6
> >>> +182,7 @@ rte_timer_subsystem_init_v1905(void)
> >>>    	if (mz == NULL)
> >>>    		return -ENOMEM;
> >>>
> >>> +	rte_timer_data_mz = mz;
> >>>    	rte_timer_data_arr = mz->addr;
> >>>
> >>>    	for (i = 0; i < RTE_MAX_DATA_ELS; i++) { @@ -205,8 +208,13 @@
> >>> BIND_DEFAULT_SYMBOL(rte_timer_subsystem_init, _v1905, 19.05);
> >>>    void __rte_experimental
> >>>    rte_timer_subsystem_finalize(void)
> >>>    {
> >>> -	if (rte_timer_data_arr)
> >>> -		rte_free(rte_timer_data_arr);
> >>> +	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
> >>> +		return;
> >>> +
> >>> +	if (!rte_timer_subsystem_initialized)
> >>> +		return;
> >>> +
> >>> +	rte_memzone_free(rte_timer_data_mz);
> >>
> >> The patch is a correct fix, but the whole idea of this looks dangerous to
> me.
> >>
> >> If we exit the primary while secondaries are still running, wouldn't
> >> it basically pull out timer data from under secondaries' feet?
> >>
> >
> > Ah yes - that’s right.  Perhaps it would be better to maintain a reference
> count of some sort such that the last process to exit could cause the
> memzone_free.
> >
> 
> It feels like a hack, to be honest. A process can crash or exit without calling
> rte_eal_cleanup(), which will lead to a memory leak due to refcount being
> stuck at a value that's not representing reality. It will be saf-er than current
> approach, but still not ideal.
> 
> However, i guess it's a good compromise, if i were to choose between a
> memory leak and a segfault :D I wonder if there is a better approach.

Ok, I will take a look at that approach then as a first step, since a process can already crash before calling rte_eal_cleanup(), which would result in leaks anyway.

Thanks,
Erik

> 
> > Thanks,
> > Erik
> >
> >>>
> >>>    	rte_timer_subsystem_initialized = 0;
> >>>    }
> >>>
> >>
> >>
> >> --
> >> Thanks,
> >> Anatoly
> 
> 
> --
> Thanks,
> Anatoly
  

Patch

diff --git a/lib/librte_timer/rte_timer.c b/lib/librte_timer/rte_timer.c
index eb46009..fb7a87e 100644
--- a/lib/librte_timer/rte_timer.c
+++ b/lib/librte_timer/rte_timer.c
@@ -60,6 +60,7 @@  struct rte_timer_data {
 };
 
 #define RTE_MAX_DATA_ELS 64
+static const struct rte_memzone *rte_timer_data_mz;
 static struct rte_timer_data *rte_timer_data_arr;
 static const uint32_t default_data_id;
 static uint32_t rte_timer_subsystem_initialized;
@@ -164,6 +165,7 @@  rte_timer_subsystem_init_v1905(void)
 		if (mz == NULL)
 			return -EEXIST;
 
+		rte_timer_data_mz = mz;
 		rte_timer_data_arr = mz->addr;
 
 		rte_timer_data_arr[default_data_id].internal_flags |=
@@ -180,6 +182,7 @@  rte_timer_subsystem_init_v1905(void)
 	if (mz == NULL)
 		return -ENOMEM;
 
+	rte_timer_data_mz = mz;
 	rte_timer_data_arr = mz->addr;
 
 	for (i = 0; i < RTE_MAX_DATA_ELS; i++) {
@@ -205,8 +208,13 @@  BIND_DEFAULT_SYMBOL(rte_timer_subsystem_init, _v1905, 19.05);
 void __rte_experimental
 rte_timer_subsystem_finalize(void)
 {
-	if (rte_timer_data_arr)
-		rte_free(rte_timer_data_arr);
+	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+		return;
+
+	if (!rte_timer_subsystem_initialized)
+		return;
+
+	rte_memzone_free(rte_timer_data_mz);
 
 	rte_timer_subsystem_initialized = 0;
 }