[v3] app/test: fix stack overflow in lpm6_perf_autotest

Message ID 1734535286-27267-1-git-send-email-andremue@linux.microsoft.com (mailing list archive)
State New
Delegated to: Thomas Monjalon
Series [v3] app/test: fix stack overflow in lpm6_perf_autotest

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/loongarch-compilation success Compilation OK
ci/loongarch-unit-testing success Unit Testing PASS
ci/Intel-compilation success Compilation OK
ci/intel-Testing success Testing PASS
ci/intel-Functional success Functional PASS
ci/github-robot: build success github build: passed
ci/iol-intel-Performance success Performance Testing PASS
ci/iol-mellanox-Performance success Performance Testing PASS
ci/iol-marvell-Functional success Functional Testing PASS
ci/iol-broadcom-Performance success Performance Testing PASS
ci/iol-unit-amd64-testing success Testing PASS
ci/iol-unit-arm64-testing success Testing PASS
ci/iol-intel-Functional success Functional Testing PASS
ci/iol-compile-arm64-testing success Testing PASS
ci/iol-sample-apps-testing success Testing PASS
ci/iol-abi-testing success Testing PASS
ci/iol-compile-amd64-testing success Testing PASS

Commit Message

Andre Muezerie Dec. 18, 2024, 3:21 p.m. UTC
Test lpm6_perf_autotest was hitting a stack overflow on Windows
with both MSVC and Clang.

The fix is to move some of the data from the stack to the heap.

Signed-off-by: Andre Muezerie <andremue@linux.microsoft.com>
---
 app/test/test_lpm6_perf.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)
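
As a self-contained illustration of the pattern the patch applies (not the literal diff, which is quoted in full below), a minimal sketch could look like the following. NUM_ENTRIES_EXAMPLE and batch_buffers_example() are placeholder names introduced here for illustration; the real test uses its own NUM_IPS_ENTRIES and the TEST_LPM_ASSERT macro, and the header providing struct rte_ipv6_addr is assumed:

    /*
     * Minimal sketch of what the fix does: the two large per-lookup arrays
     * move from the stack to the DPDK heap, the allocations are checked,
     * and the buffers are released before the test tears down the table.
     */
    #include <stdint.h>
    #include <rte_malloc.h>
    #include <rte_ip6.h>      /* struct rte_ipv6_addr (header name assumed) */

    #define NUM_ENTRIES_EXAMPLE 100000   /* placeholder batch size */

    static int
    batch_buffers_example(void)
    {
        struct rte_ipv6_addr *ip_batch = rte_calloc("ip_batch",
                NUM_ENTRIES_EXAMPLE, sizeof(*ip_batch), 0);
        int32_t *next_hops = rte_calloc("next_hops",
                NUM_ENTRIES_EXAMPLE, sizeof(*next_hops), 0);

        if (ip_batch == NULL || next_hops == NULL) {
            /* rte_free() accepts NULL, so a single cleanup path suffices */
            rte_free(next_hops);
            rte_free(ip_batch);
            return -1;
        }

        /* ... fill ip_batch and run the bulk lookups being measured ... */

        rte_free(next_hops);
        rte_free(ip_batch);
        return 0;
    }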
  

Comments

Bruce Richardson March 5, 2025, 10:08 a.m. UTC | #1
On Wed, Dec 18, 2024 at 07:21:26AM -0800, Andre Muezerie wrote:
> Test lpm6_perf_autotest was hitting a stack overflow on Windows
> with both MSVC and Clang.
> 
> The fix is to move some of the data from the stack to the heap.
> 
> Signed-off-by: Andre Muezerie <andremue@linux.microsoft.com>
> ---
>  app/test/test_lpm6_perf.c | 13 +++++++++++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/app/test/test_lpm6_perf.c b/app/test/test_lpm6_perf.c
> index 1860a99ed6..59df0a958a 100644
> --- a/app/test/test_lpm6_perf.c
> +++ b/app/test/test_lpm6_perf.c
> @@ -10,6 +10,7 @@
>  
>  #include <rte_cycles.h>
>  #include <rte_random.h>
> +#include <rte_malloc.h>
>  #include <rte_memory.h>
>  #include <rte_lpm6.h>
>  
> @@ -117,8 +118,13 @@ test_lpm6_perf(void)
>  	total_time = 0;
>  	count = 0;
>  
> -	struct rte_ipv6_addr ip_batch[NUM_IPS_ENTRIES];
> -	int32_t next_hops[NUM_IPS_ENTRIES];
> +	struct rte_ipv6_addr *ip_batch = rte_calloc("ip_batch",
> +			NUM_IPS_ENTRIES, sizeof(struct rte_ipv6_addr), 0);
> +	TEST_LPM_ASSERT(ip_batch != NULL);
> +
> +	int32_t *next_hops = rte_calloc("next_hops",
> +			NUM_IPS_ENTRIES, sizeof(int32_t), 0);
> +	TEST_LPM_ASSERT(next_hops != NULL);
>  

While I don't think we need to use the "rte_" versions of allocation, this
is still an ok fix - and I see that in v1 regular malloc was used.

With either calloc or rte_calloc used.

Acked-by: Bruce Richardson <bruce.richardson@intel.com>


>  	for (i = 0; i < NUM_IPS_ENTRIES; i++)
>  		ip_batch[i] = large_ips_table[i].ip;
> @@ -153,6 +159,9 @@ test_lpm6_perf(void)
>  	printf("Average LPM Delete: %g cycles\n",
>  			(double)total_time / NUM_ROUTE_ENTRIES);
>  
> +	rte_free(next_hops);
> +	rte_free(ip_batch);
> +
>  	rte_lpm6_delete_all(lpm);
>  	rte_lpm6_free(lpm);
>  
> -- 
> 2.47.0.vfs.0.3
>
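
On the allocator question raised above, a hypothetical sketch of the plain C-library variant (roughly what the "regular malloc" approach in v1 amounts to; not what was merged in v3, which uses rte_calloc/rte_free) might be the following. The 16-byte stand-in struct only keeps the sketch self-contained:

    /* Hypothetical libc variant of the same stack-to-heap fix. */
    #include <stdint.h>
    #include <stdlib.h>

    #define NUM_ENTRIES_EXAMPLE 100000   /* placeholder batch size */

    struct ipv6_addr_example { uint8_t bytes[16]; };  /* stand-in for struct rte_ipv6_addr */

    static int
    batch_buffers_libc_example(void)
    {
        struct ipv6_addr_example *ip_batch =
                calloc(NUM_ENTRIES_EXAMPLE, sizeof(*ip_batch));
        int32_t *next_hops = calloc(NUM_ENTRIES_EXAMPLE, sizeof(*next_hops));

        if (ip_batch == NULL || next_hops == NULL) {
            /* free() accepts NULL, so both paths can share this cleanup */
            free(next_hops);
            free(ip_batch);
            return -1;
        }

        /* ... lookups ... */

        free(next_hops);
        free(ip_batch);
        return 0;
    }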
  
Andre Muezerie March 5, 2025, 2:45 p.m. UTC | #2
On Wed, Mar 05, 2025 at 10:08:33AM +0000, Bruce Richardson wrote:
> On Wed, Dec 18, 2024 at 07:21:26AM -0800, Andre Muezerie wrote:
> > Test lpm6_perf_autotest was hitting a stack overflow on Windows
> > with both MSVC and Clang.
> > 
> > The fix is to move some of the data from the stack to the heap.
> > 
> > Signed-off-by: Andre Muezerie <andremue@linux.microsoft.com>
> > ---
> >  app/test/test_lpm6_perf.c | 13 +++++++++++--
> >  1 file changed, 11 insertions(+), 2 deletions(-)
> > 
> > diff --git a/app/test/test_lpm6_perf.c b/app/test/test_lpm6_perf.c
> > index 1860a99ed6..59df0a958a 100644
> > --- a/app/test/test_lpm6_perf.c
> > +++ b/app/test/test_lpm6_perf.c
> > @@ -10,6 +10,7 @@
> >  
> >  #include <rte_cycles.h>
> >  #include <rte_random.h>
> > +#include <rte_malloc.h>
> >  #include <rte_memory.h>
> >  #include <rte_lpm6.h>
> >  
> > @@ -117,8 +118,13 @@ test_lpm6_perf(void)
> >  	total_time = 0;
> >  	count = 0;
> >  
> > -	struct rte_ipv6_addr ip_batch[NUM_IPS_ENTRIES];
> > -	int32_t next_hops[NUM_IPS_ENTRIES];
> > +	struct rte_ipv6_addr *ip_batch = rte_calloc("ip_batch",
> > +			NUM_IPS_ENTRIES, sizeof(struct rte_ipv6_addr), 0);
> > +	TEST_LPM_ASSERT(ip_batch != NULL);
> > +
> > +	int32_t *next_hops = rte_calloc("next_hops",
> > +			NUM_IPS_ENTRIES, sizeof(int32_t), 0);
> > +	TEST_LPM_ASSERT(next_hops != NULL);
> >  
> 
> While I don't think we need to use the "rte_" versions of allocation, this
> is still an ok fix - and I see that in v1 regular malloc was used.

The reviewers had different opinions on which allocator function
should be used; that's why it was changed. Thanks for being flexible on this.

> 
> With either calloc or rte_calloc used.
> 
> Acked-by: Bruce Richardson <bruce.richardson@intel.com>
> 
> 
> >  	for (i = 0; i < NUM_IPS_ENTRIES; i++)
> >  		ip_batch[i] = large_ips_table[i].ip;
> > @@ -153,6 +159,9 @@ test_lpm6_perf(void)
> >  	printf("Average LPM Delete: %g cycles\n",
> >  			(double)total_time / NUM_ROUTE_ENTRIES);
> >  
> > +	rte_free(next_hops);
> > +	rte_free(ip_batch);
> > +
> >  	rte_lpm6_delete_all(lpm);
> >  	rte_lpm6_free(lpm);
> >  
> > -- 
> > 2.47.0.vfs.0.3
> >
  
Medvedkin, Vladimir March 5, 2025, 8:04 p.m. UTC | #3
Hi Andre,

On 05/03/2025 14:45, Andre Muezerie wrote:
> On Wed, Mar 05, 2025 at 10:08:33AM +0000, Bruce Richardson wrote:
>> On Wed, Dec 18, 2024 at 07:21:26AM -0800, Andre Muezerie wrote:
>>> Test lpm6_perf_autotest was hitting a stack overflow on Windows
>>> with both MSVC and Clang.
>>>
>>> The fix is to move some of the data from the stack to the heap.
>>>
>>> Signed-off-by: Andre Muezerie <andremue@linux.microsoft.com>
>>> ---
>>>   app/test/test_lpm6_perf.c | 13 +++++++++++--
>>>   1 file changed, 11 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/app/test/test_lpm6_perf.c b/app/test/test_lpm6_perf.c
>>> index 1860a99ed6..59df0a958a 100644
>>> --- a/app/test/test_lpm6_perf.c
>>> +++ b/app/test/test_lpm6_perf.c
>>> @@ -10,6 +10,7 @@
>>>   
>>>   #include <rte_cycles.h>
>>>   #include <rte_random.h>
>>> +#include <rte_malloc.h>
>>>   #include <rte_memory.h>
>>>   #include <rte_lpm6.h>
>>>   
>>> @@ -117,8 +118,13 @@ test_lpm6_perf(void)
>>>   	total_time = 0;
>>>   	count = 0;
>>>   
>>> -	struct rte_ipv6_addr ip_batch[NUM_IPS_ENTRIES];
>>> -	int32_t next_hops[NUM_IPS_ENTRIES];
>>> +	struct rte_ipv6_addr *ip_batch = rte_calloc("ip_batch",
>>> +			NUM_IPS_ENTRIES, sizeof(struct rte_ipv6_addr), 0);
>>> +	TEST_LPM_ASSERT(ip_batch != NULL);
>>> +
>>> +	int32_t *next_hops = rte_calloc("next_hops",
>>> +			NUM_IPS_ENTRIES, sizeof(int32_t), 0);
>>> +	TEST_LPM_ASSERT(next_hops != NULL);
>>>   
>> While I don't think we need to use the "rte_" versions of allocation, this
>> is still an ok fix - and I see that in v1 regular malloc was used.
> The reviewers had different opinions on which allocator function
> should be used, that's why it was changed. Thanks for being flexible on this.

As I mentioned earlier, I think it's better to use the memory allocated 
from hugepages here since we are measuring performance.

Acked-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>

>
>> With either calloc or rte_calloc used.
>>
>> Acked-by: Bruce Richardson <bruce.richardson@intel.com>
>>
>>
>>>   	for (i = 0; i < NUM_IPS_ENTRIES; i++)
>>>   		ip_batch[i] = large_ips_table[i].ip;
>>> @@ -153,6 +159,9 @@ test_lpm6_perf(void)
>>>   	printf("Average LPM Delete: %g cycles\n",
>>>   			(double)total_time / NUM_ROUTE_ENTRIES);
>>>   
>>> +	rte_free(next_hops);
>>> +	rte_free(ip_batch);
>>> +
>>>   	rte_lpm6_delete_all(lpm);
>>>   	rte_lpm6_free(lpm);
>>>   
>>> -- 
>>> 2.47.0.vfs.0.3
>>>
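
Illustrative only, and not part of the patch: rte_calloc already draws from the hugepage-backed DPDK heap, which is the property referred to above. If placement mattered even more for the measurement, a hypothetical helper could additionally request cache-line alignment and NUMA-local memory using standard DPDK APIs (rte_calloc_socket, rte_socket_id, RTE_CACHE_LINE_SIZE); the header providing struct rte_ipv6_addr is assumed:

    #include <rte_malloc.h>
    #include <rte_lcore.h>
    #include <rte_common.h>
    #include <rte_ip6.h>   /* struct rte_ipv6_addr (header name assumed) */

    /*
     * Hypothetical helper: allocate the lookup batch from the hugepage-backed
     * heap, cache-line aligned, on the calling lcore's NUMA socket.
     */
    static struct rte_ipv6_addr *
    alloc_ip_batch_local(size_t n)
    {
        return rte_calloc_socket("ip_batch", n, sizeof(struct rte_ipv6_addr),
                RTE_CACHE_LINE_SIZE, rte_socket_id());
    }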
  

Patch

diff --git a/app/test/test_lpm6_perf.c b/app/test/test_lpm6_perf.c
index 1860a99ed6..59df0a958a 100644
--- a/app/test/test_lpm6_perf.c
+++ b/app/test/test_lpm6_perf.c
@@ -10,6 +10,7 @@ 
 
 #include <rte_cycles.h>
 #include <rte_random.h>
+#include <rte_malloc.h>
 #include <rte_memory.h>
 #include <rte_lpm6.h>
 
@@ -117,8 +118,13 @@  test_lpm6_perf(void)
 	total_time = 0;
 	count = 0;
 
-	struct rte_ipv6_addr ip_batch[NUM_IPS_ENTRIES];
-	int32_t next_hops[NUM_IPS_ENTRIES];
+	struct rte_ipv6_addr *ip_batch = rte_calloc("ip_batch",
+			NUM_IPS_ENTRIES, sizeof(struct rte_ipv6_addr), 0);
+	TEST_LPM_ASSERT(ip_batch != NULL);
+
+	int32_t *next_hops = rte_calloc("next_hops",
+			NUM_IPS_ENTRIES, sizeof(int32_t), 0);
+	TEST_LPM_ASSERT(next_hops != NULL);
 
 	for (i = 0; i < NUM_IPS_ENTRIES; i++)
 		ip_batch[i] = large_ips_table[i].ip;
@@ -153,6 +159,9 @@  test_lpm6_perf(void)
 	printf("Average LPM Delete: %g cycles\n",
 			(double)total_time / NUM_ROUTE_ENTRIES);
 
+	rte_free(next_hops);
+	rte_free(ip_batch);
+
 	rte_lpm6_delete_all(lpm);
 	rte_lpm6_free(lpm);