[RFC,1/8] net/ena: fix warnings related to rte_memcpy and gcc-12

Message ID 20220607171746.461772-2-stephen@networkplumber.org (mailing list archive)
State Rejected, archived
Delegated to: David Marchand
Headers
Series Gcc-12 warning fixes |

Checks

Context Check Description
ci/checkpatch success coding style OK

Commit Message

Stephen Hemminger June 7, 2022, 5:17 p.m. UTC
  Rte_memcpy is not needed for small objects only used on control
path. Regular memcpy is as fast or faster and there is more
robust since static analysis etc knows what it does.

In this driver it was redefining all memcpy as rte_memcpy
which is even worse.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 drivers/net/ena/base/ena_plat_dpdk.h | 10 +---------
 drivers/net/ena/ena_ethdev.c         |  8 ++++----
 drivers/net/ena/ena_rss.c            |  2 +-
 3 files changed, 6 insertions(+), 14 deletions(-)
  

Comments

Michal Krawczyk June 8, 2022, 12:29 p.m. UTC | #1
wt., 7 cze 2022 o 19:17 Stephen Hemminger <stephen@networkplumber.org>
napisał(a):
>
> Rte_memcpy is not needed for small objects only used on control
> path. Regular memcpy is as fast or faster and there is more
> robust since static analysis etc knows what it does.
>
> In this driver it was redefining all memcpy as rte_memcpy
> which is even worse.

Hi Stephen,

I would like to shed some light on why we're redefining all the memcpy
as rte_memcpy. The ENA HAL is unmodifiable, as it's shared across many
platforms and we cannot simply adjust it for the DPDK. We can use the
ena_plat_dpdk.h to change the ena_com (HAL) definitions, and that's
what we're doing with memcpy. It's being used on the data path for the
Tx, to copy the bounce buffers. Following the recommendations in [1]
plus the results from [2], we wanted to make use of the optimized
memcpy on the ENA's data path as well to reduce the CPU time spent in
the PMD. I'm worried that removing rte_memcpy from the ena_plat_dpdk.h
will result in some performance degradation for the ENA data path.
However I understand your concerns for the control path and I'm ok
with it.

[1] https://doc.dpdk.org/guides/prog_guide/writing_efficient_code.html#memory
[2] https://www.intel.com/content/www/us/en/developer/articles/technical/performance-optimization-of-memcpy-in-dpdk.html

Thanks,
Michal

>
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> ---
>  drivers/net/ena/base/ena_plat_dpdk.h | 10 +---------
>  drivers/net/ena/ena_ethdev.c         |  8 ++++----
>  drivers/net/ena/ena_rss.c            |  2 +-
>  3 files changed, 6 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/net/ena/base/ena_plat_dpdk.h b/drivers/net/ena/base/ena_plat_dpdk.h
> index 8f2b3a87c2ab..caea763e3eca 100644
> --- a/drivers/net/ena/base/ena_plat_dpdk.h
> +++ b/drivers/net/ena/base/ena_plat_dpdk.h
> @@ -26,7 +26,6 @@
>  #include <rte_spinlock.h>
>
>  #include <sys/time.h>
> -#include <rte_memcpy.h>
>
>  typedef uint64_t u64;
>  typedef uint32_t u32;
> @@ -67,14 +66,7 @@ typedef uint64_t dma_addr_t;
>  #define ENA_UDELAY(x) rte_delay_us_block(x)
>
>  #define ENA_TOUCH(x) ((void)(x))
> -/* Redefine memcpy with caution: rte_memcpy can be simply aliased to memcpy, so
> - * make the redefinition only if it's safe (and beneficial) to do so.
> - */
> -#if defined(RTE_ARCH_X86) || defined(RTE_ARCH_ARM64_MEMCPY) || \
> -       defined(RTE_ARCH_ARM_NEON_MEMCPY)
> -#undef memcpy
> -#define memcpy rte_memcpy
> -#endif
> +
>  #define wmb rte_wmb
>  #define rmb rte_rmb
>  #define mb rte_mb
> diff --git a/drivers/net/ena/ena_ethdev.c b/drivers/net/ena/ena_ethdev.c
> index 68768cab7077..5f87429606e6 100644
> --- a/drivers/net/ena/ena_ethdev.c
> +++ b/drivers/net/ena/ena_ethdev.c
> @@ -481,7 +481,7 @@ ENA_PROXY_DESC(ena_com_get_dev_basic_stats, ENA_MP_DEV_STATS_GET,
>         ENA_TOUCH(rsp);
>         ENA_TOUCH(ena_dev);
>         if (stats != &adapter->basic_stats)
> -               rte_memcpy(stats, &adapter->basic_stats, sizeof(*stats));
> +               memcpy(stats, &adapter->basic_stats, sizeof(*stats));
>  }),
>         struct ena_com_dev *ena_dev, struct ena_admin_basic_stats *stats);
>
> @@ -496,7 +496,7 @@ ENA_PROXY_DESC(ena_com_get_eni_stats, ENA_MP_ENI_STATS_GET,
>         ENA_TOUCH(rsp);
>         ENA_TOUCH(ena_dev);
>         if (stats != (struct ena_admin_eni_stats *)&adapter->eni_stats)
> -               rte_memcpy(stats, &adapter->eni_stats, sizeof(*stats));
> +               memcpy(stats, &adapter->eni_stats, sizeof(*stats));
>  }),
>         struct ena_com_dev *ena_dev, struct ena_admin_eni_stats *stats);
>
> @@ -538,8 +538,8 @@ ENA_PROXY_DESC(ena_com_indirect_table_get, ENA_MP_IND_TBL_GET,
>         ENA_TOUCH(rsp);
>         ENA_TOUCH(ena_dev);
>         if (ind_tbl != adapter->indirect_table)
> -               rte_memcpy(ind_tbl, adapter->indirect_table,
> -                          sizeof(adapter->indirect_table));
> +               memcpy(ind_tbl, adapter->indirect_table,
> +                      sizeof(adapter->indirect_table));
>  }),
>         struct ena_com_dev *ena_dev, u32 *ind_tbl);
>
> diff --git a/drivers/net/ena/ena_rss.c b/drivers/net/ena/ena_rss.c
> index b6c4f76e3820..c723d3f5fca1 100644
> --- a/drivers/net/ena/ena_rss.c
> +++ b/drivers/net/ena/ena_rss.c
> @@ -59,7 +59,7 @@ void ena_rss_key_fill(void *key, size_t size)
>                 key_generated = true;
>         }
>
> -       rte_memcpy(key, default_key, size);
> +       memcpy(key, default_key, size);
>  }
>
>  int ena_rss_reta_update(struct rte_eth_dev *dev,
> --
> 2.35.1
>
  
Stephen Hemminger June 8, 2022, 3:32 p.m. UTC | #2
On Wed, 8 Jun 2022 14:29:58 +0200
Michał Krawczyk <mk@semihalf.com> wrote:

> wt., 7 cze 2022 o 19:17 Stephen Hemminger <stephen@networkplumber.org>
> napisał(a):
> >
> > Rte_memcpy is not needed for small objects only used on control
> > path. Regular memcpy is as fast or faster and there is more
> > robust since static analysis etc knows what it does.
> >
> > In this driver it was redefining all memcpy as rte_memcpy
> > which is even worse.  
> 
> Hi Stephen,
> 
> I would like to shed some light on why we're redefining all the memcpy
> as rte_memcpy. The ENA HAL is unmodifiable, as it's shared across many
> platforms and we cannot simply adjust it for the DPDK. We can use the
> ena_plat_dpdk.h to change the ena_com (HAL) definitions, and that's
> what we're doing with memcpy. It's being used on the data path for the
> Tx, to copy the bounce buffers. Following the recommendations in [1]
> plus the results from [2], we wanted to make use of the optimized
> memcpy on the ENA's data path as well to reduce the CPU time spent in
> the PMD. I'm worried that removing rte_memcpy from the ena_plat_dpdk.h
> will result in some performance degradation for the ENA data path.
> However I understand your concerns for the control path and I'm ok
> with it.
> 
> [1] https://doc.dpdk.org/guides/prog_guide/writing_efficient_code.html#memory
> [2] https://www.intel.com/content/www/us/en/developer/articles/technical/performance-optimization-of-memcpy-in-dpdk.html
> 
> Thanks,
> Michal
> 


I admit to having little sympathy unfixable for base/ style code.
You could have just replaced memcpy() in their with an abstraction layer
like other drivers.

The full gcc-12 warnings are:

913/2989] Compiling C object drivers/libtmp_rte_net_ena.a.p/net_ena_ena_rss.c.o
In file included from /usr/lib/gcc/x86_64-linux-gnu/12/include/immintrin.h:43,
                 from /usr/lib/gcc/x86_64-linux-gnu/12/include/x86intrin.h:32,
                 from ../lib/eal/x86/include/rte_vect.h:31,
                 from ../lib/eal/x86/include/rte_memcpy.h:17,
                 from ../lib/mempool/rte_mempool.h:46,
                 from ../lib/mbuf/rte_mbuf.h:38,
                 from ../lib/net/rte_ether.h:22,
                 from ../drivers/net/ena/ena_ethdev.h:10,
                 from ../drivers/net/ena/ena_rss.c:6:
In function ‘_mm256_loadu_si256’,
    inlined from ‘rte_mov32’ at ../lib/eal/x86/include/rte_memcpy.h:346:9,
    inlined from ‘rte_mov128’ at ../lib/eal/x86/include/rte_memcpy.h:369:2,
    inlined from ‘rte_memcpy_generic’ at ../lib/eal/x86/include/rte_memcpy.h:445:4,
    inlined from ‘rte_memcpy’ at ../lib/eal/x86/include/rte_memcpy.h:853:10,
    inlined from ‘ena_rss_key_fill’ at ../drivers/net/ena/ena_rss.c:62:2:
/usr/lib/gcc/x86_64-linux-gnu/12/include/avxintrin.h:929:10: warning: array subscript ‘__m256i_u[1]’ is partly outside array bounds of ‘uint8_t[40]’ {aka ‘unsigned char[40]’} [-Warray-bounds]
  929 |   return *__P;
      |          ^~~~
../drivers/net/ena/ena_rss.c: In function ‘ena_rss_key_fill’:
../drivers/net/ena/ena_rss.c:51:24: note: at offset 32 into object ‘default_key’ of size 40
   51 |         static uint8_t default_key[ENA_HASH_KEY_SIZE];
      |                        ^~~~~~~~~~~
In function ‘_mm256_loadu_si256’,
    inlined from ‘rte_mov32’ at ../lib/eal/x86/include/rte_memcpy.h:346:9,
    inlined from ‘rte_mov128’ at ../lib/eal/x86/include/rte_memcpy.h:370:2,
    inlined from ‘rte_memcpy_generic’ at ../lib/eal/x86/include/rte_memcpy.h:445:4,
    inlined from ‘rte_memcpy’ at ../lib/eal/x86/include/rte_memcpy.h:853:10,
    inlined from ‘ena_rss_key_fill’ at ../drivers/net/ena/ena_rss.c:62:2:
/usr/lib/gcc/x86_64-linux-gnu/12/include/avxintrin.h:929:10: warning: array subscript 2 is outside array bounds of ‘uint8_t[40]’ {aka ‘unsigned char[40]’} [-Warray-bounds]
  929 |   return *__P;
      |          ^~~~
../drivers/net/ena/ena_rss.c: In function ‘ena_rss_key_fill’:
../drivers/net/ena/ena_rss.c:51:24: note: at offset 64 into object ‘default_key’ of size 40
   51 |         static uint8_t default_key[ENA_HASH_KEY_SIZE];
      |                        ^~~~~~~~~~~
In function ‘_mm256_loadu_si256’,
    inlined from ‘rte_mov32’ at ../lib/eal/x86/include/rte_memcpy.h:346:9,
    inlined from ‘rte_mov128’ at ../lib/eal/x86/include/rte_memcpy.h:371:2,
    inlined from ‘rte_memcpy_generic’ at ../lib/eal/x86/include/rte_memcpy.h:445:4,
    inlined from ‘rte_memcpy’ at ../lib/eal/x86/include/rte_memcpy.h:853:10,
    inlined from ‘ena_rss_key_fill’ at ../drivers/net/ena/ena_rss.c:62:2:
/usr/lib/gcc/x86_64-linux-gnu/12/include/avxintrin.h:929:10: warning: array subscript 3 is outside array bounds of ‘uint8_t[40]’ {aka ‘unsigned char[40]’} [-Warray-bounds]
  929 |   return *__P;
      |          ^~~~
../drivers/net/ena/ena_rss.c: In function ‘ena_rss_key_fill’:
../drivers/net/ena/ena_rss.c:51:24: note: at offset 96 into object ‘default_key’ of size 40
   51 |         static uint8_t default_key[ENA_HASH_KEY_SIZE];
      |                        ^~~~~~~~~~~
In function ‘_mm256_loadu_si256’,
    inlined from ‘rte_mov32’ at ../lib/eal/x86/include/rte_memcpy.h:346:9,
    inlined from ‘rte_mov64’ at ../lib/eal/x86/include/rte_memcpy.h:358:2,
    inlined from ‘rte_memcpy_generic’ at ../lib/eal/x86/include/rte_memcpy.h:452:4,
    inlined from ‘rte_memcpy’ at ../lib/eal/x86/include/rte_memcpy.h:853:10,
    inlined from ‘ena_rss_key_fill’ at ../drivers/net/ena/ena_rss.c:62:2:
/usr/lib/gcc/x86_64-linux-gnu/12/include/avxintrin.h:929:10: warning: array subscript ‘__m256i_u[1]’ is partly outside array bounds of ‘const void[40]’ [-Warray-bounds]
  929 |   return *__P;
      |          ^~~~
../drivers/net/ena/ena_rss.c: In function ‘ena_rss_key_fill’:
../drivers/net/ena/ena_rss.c:51:24: note: at offset 32 into object ‘default_key’ of size 40
   51 |         static uint8_t default_key[ENA_HASH_KEY_SIZE];
      |                        ^~~~~~~~~~~
../drivers/net/ena/ena_rss.c:51:24: note: at offset [33, 40] into object ‘default_key’ of size 40
../drivers/net/ena/ena_rss.c:51:24: note: at offset 160 into object ‘default_key’ of size 40
../drivers/net/ena/ena_rss.c:51:24: note: at offset 32 into object ‘default_key’ of size 40
In function ‘_mm256_loadu_si256’,
    inlined from ‘rte_mov32’ at ../lib/eal/x86/include/rte_memcpy.h:346:9,
    inlined from ‘rte_memcpy_generic’ at ../lib/eal/x86/include/rte_memcpy.h:457:4,
    inlined from ‘rte_memcpy’ at ../lib/eal/x86/include/rte_memcpy.h:853:10,
    inlined from ‘ena_rss_key_fill’ at ../drivers/net/ena/ena_rss.c:62:2:
/usr/lib/gcc/x86_64-linux-gnu/12/include/avxintrin.h:929:10: warning: array subscript [2, 288230376151711745] is outside array bounds of ‘const void[40]’ [-Warray-bounds]
  929 |   return *__P;
      |          ^~~~
../drivers/net/ena/ena_rss.c: In function ‘ena_rss_key_fill’:
../drivers/net/ena/ena_rss.c:51:24: note: object ‘default_key’ of size 40
   51 |         static uint8_t default_key[ENA_HASH_KEY_SIZE];
      |                        ^~~~~~~~~~~
../drivers/net/ena/ena_rss.c:51:24: note: at offset [1, 40] into object ‘default_key’ of size 40
../drivers/net/ena/ena_rss.c:51:24: note: at offset [128, 192] into object ‘default_key’ of size 40
../drivers/net/ena/ena_rss.c:51:24: note: object ‘default_key’ of size 40
../drivers/net/ena/ena_rss.c:51:24: note: at offset [1, 40] into object ‘default_key’ of size 40
../drivers/net/ena/ena_rss.c:51:24: note: at offset [128, 192] into object ‘default_key’ of size 40
../drivers/net/ena/ena_rss.c:51:24: note: object ‘default_key’ of size 40
In function ‘_mm256_loadu_si256’,
    inlined from ‘rte_mov32’ at ../lib/eal/x86/include/rte_memcpy.h:346:9,
    inlined from ‘rte_memcpy_generic’ at ../lib/eal/x86/include/rte_memcpy.h:458:4,
    inlined from ‘rte_memcpy’ at ../lib/eal/x86/include/rte_memcpy.h:853:10,
    inlined from ‘ena_rss_key_fill’ at ../drivers/net/ena/ena_rss.c:62:2:
/usr/lib/gcc/x86_64-linux-gnu/12/include/avxintrin.h:929:10: warning: array subscript [2, 288230376151711746] is outside array bounds of ‘const void[40]’ [-Warray-bounds]
  929 |   return *__P;
      |          ^~~~
../drivers/net/ena/ena_rss.c: In function ‘ena_rss_key_fill’:
../drivers/net/ena/ena_rss.c:51:24: note: at offset [1, 40] into object ‘default_key’ of size 40
   51 |         static uint8_t default_key[ENA_HASH_KEY_SIZE];
      |                        ^~~~~~~~~~~
../drivers/net/ena/ena_rss.c:51:24: note: at offset [2, 40] into object ‘default_key’ of size 40
../drivers/net/ena/ena_rss.c:51:24: note: at offset [129, 193] into object ‘default_key’ of size 40
../drivers/net/ena/ena_rss.c:51:24: note: at offset [1, 40] into object ‘default_key’ of size 40
../drivers/net/ena/ena_rss.c:51:24: note: at offset [2, 40] into object ‘default_key’ of size 40
../drivers/net/ena/ena_rss.c:51:24: note: at offset [129, 193] into object ‘default_key’ of size 40
../drivers/net/ena/ena_rss.c:51:24: note: at offset [1, 40] into object ‘default_key’ of size 40
In function ‘_mm256_loadu_si256’,
    inlined from ‘rte_mov32’ at ../lib/eal/x86/include/rte_memcpy.h:346:9,
    inlined from ‘rte_memcpy_generic’ at ../lib/eal/x86/include/rte_memcpy.h:438:3,
    inlined from ‘rte_memcpy’ at ../lib/eal/x86/include/rte_memcpy.h:853:10,
    inlined from ‘ena_rss_key_fill’ at ../drivers/net/ena/ena_rss.c:62:2:
/usr/lib/gcc/x86_64-linux-gnu/12/include/avxintrin.h:929:10: warning: array subscript ‘__m256i_u[0]’ is partly outside array bounds of ‘uint8_t[40]’ {aka ‘unsigned char[40]’} [-Warray-bounds]
  929 |   return *__P;
      |          ^~~~
../drivers/net/ena/ena_rss.c: In function ‘ena_rss_key_fill’:
../drivers/net/ena/ena_rss.c:51:24: note: at offset [17, 32] into object ‘default_key’ of size 40
   51 |         static uint8_t default_key[ENA_HASH_KEY_SIZE];
      |                        ^~~~~~~~~~~
  
Michal Krawczyk June 8, 2022, 7:18 p.m. UTC | #3
śr., 8 cze 2022 o 17:32 Stephen Hemminger <stephen@networkplumber.org>
napisał(a):
>
> On Wed, 8 Jun 2022 14:29:58 +0200
> Michał Krawczyk <mk@semihalf.com> wrote:
>
> > wt., 7 cze 2022 o 19:17 Stephen Hemminger <stephen@networkplumber.org>
> > napisał(a):
> > >
> > > Rte_memcpy is not needed for small objects only used on control
> > > path. Regular memcpy is as fast or faster and there is more
> > > robust since static analysis etc knows what it does.
> > >
> > > In this driver it was redefining all memcpy as rte_memcpy
> > > which is even worse.
> >
> > Hi Stephen,
> >
> > I would like to shed some light on why we're redefining all the memcpy
> > as rte_memcpy. The ENA HAL is unmodifiable, as it's shared across many
> > platforms and we cannot simply adjust it for the DPDK. We can use the
> > ena_plat_dpdk.h to change the ena_com (HAL) definitions, and that's
> > what we're doing with memcpy. It's being used on the data path for the
> > Tx, to copy the bounce buffers. Following the recommendations in [1]
> > plus the results from [2], we wanted to make use of the optimized
> > memcpy on the ENA's data path as well to reduce the CPU time spent in
> > the PMD. I'm worried that removing rte_memcpy from the ena_plat_dpdk.h
> > will result in some performance degradation for the ENA data path.
> > However I understand your concerns for the control path and I'm ok
> > with it.
> >
> > [1] https://doc.dpdk.org/guides/prog_guide/writing_efficient_code.html#memory
> > [2] https://www.intel.com/content/www/us/en/developer/articles/technical/performance-optimization-of-memcpy-in-dpdk.html
> >
> > Thanks,
> > Michal
> >
>
>
> I admit to having little sympathy unfixable for base/ style code.
> You could have just replaced memcpy() in their with an abstraction layer
> like other drivers.
>

We'll probably end up with the solution you're suggesting. For now
let's remove the memcpy redefinition at all to suppress the warnings.

Acked-by: Michal Krawczyk <mk@semiahalf.com>
  
Stephen Hemminger June 8, 2022, 8:52 p.m. UTC | #4
On Wed, 8 Jun 2022 21:18:15 +0200
Michał Krawczyk <mk@semihalf.com> wrote:

> śr., 8 cze 2022 o 17:32 Stephen Hemminger <stephen@networkplumber.org>
> napisał(a):
> >
> > On Wed, 8 Jun 2022 14:29:58 +0200
> > Michał Krawczyk <mk@semihalf.com> wrote:
> >  
> > > wt., 7 cze 2022 o 19:17 Stephen Hemminger <stephen@networkplumber.org>
> > > napisał(a):  
> > > >
> > > > Rte_memcpy is not needed for small objects only used on control
> > > > path. Regular memcpy is as fast or faster and there is more
> > > > robust since static analysis etc knows what it does.
> > > >
> > > > In this driver it was redefining all memcpy as rte_memcpy
> > > > which is even worse.  
> > >
> > > Hi Stephen,
> > >
> > > I would like to shed some light on why we're redefining all the memcpy
> > > as rte_memcpy. The ENA HAL is unmodifiable, as it's shared across many
> > > platforms and we cannot simply adjust it for the DPDK. We can use the
> > > ena_plat_dpdk.h to change the ena_com (HAL) definitions, and that's
> > > what we're doing with memcpy. It's being used on the data path for the
> > > Tx, to copy the bounce buffers. Following the recommendations in [1]
> > > plus the results from [2], we wanted to make use of the optimized
> > > memcpy on the ENA's data path as well to reduce the CPU time spent in
> > > the PMD. I'm worried that removing rte_memcpy from the ena_plat_dpdk.h
> > > will result in some performance degradation for the ENA data path.
> > > However I understand your concerns for the control path and I'm ok
> > > with it.
> > >
> > > [1] https://doc.dpdk.org/guides/prog_guide/writing_efficient_code.html#memory
> > > [2] https://www.intel.com/content/www/us/en/developer/articles/technical/performance-optimization-of-memcpy-in-dpdk.html
> > >
> > > Thanks,
> > > Michal
> > >  
> >
> >
> > I admit to having little sympathy unfixable for base/ style code.
> > You could have just replaced memcpy() in their with an abstraction layer
> > like other drivers.
> >  
> 
> We'll probably end up with the solution you're suggesting. For now
> let's remove the memcpy redefinition at all to suppress the warnings.
> 
> Acked-by: Michal Krawczyk <mk@semiahalf.com>

Lets see if we can fix rte_memcpy() on x86 first.

It seems to me that rte_memcpy() should be an inline that only handles variable
size data, and use __builtin_memcpy() automatically for fixed size values.
  

Patch

diff --git a/drivers/net/ena/base/ena_plat_dpdk.h b/drivers/net/ena/base/ena_plat_dpdk.h
index 8f2b3a87c2ab..caea763e3eca 100644
--- a/drivers/net/ena/base/ena_plat_dpdk.h
+++ b/drivers/net/ena/base/ena_plat_dpdk.h
@@ -26,7 +26,6 @@ 
 #include <rte_spinlock.h>
 
 #include <sys/time.h>
-#include <rte_memcpy.h>
 
 typedef uint64_t u64;
 typedef uint32_t u32;
@@ -67,14 +66,7 @@  typedef uint64_t dma_addr_t;
 #define ENA_UDELAY(x) rte_delay_us_block(x)
 
 #define ENA_TOUCH(x) ((void)(x))
-/* Redefine memcpy with caution: rte_memcpy can be simply aliased to memcpy, so
- * make the redefinition only if it's safe (and beneficial) to do so.
- */
-#if defined(RTE_ARCH_X86) || defined(RTE_ARCH_ARM64_MEMCPY) || \
-	defined(RTE_ARCH_ARM_NEON_MEMCPY)
-#undef memcpy
-#define memcpy rte_memcpy
-#endif
+
 #define wmb rte_wmb
 #define rmb rte_rmb
 #define mb rte_mb
diff --git a/drivers/net/ena/ena_ethdev.c b/drivers/net/ena/ena_ethdev.c
index 68768cab7077..5f87429606e6 100644
--- a/drivers/net/ena/ena_ethdev.c
+++ b/drivers/net/ena/ena_ethdev.c
@@ -481,7 +481,7 @@  ENA_PROXY_DESC(ena_com_get_dev_basic_stats, ENA_MP_DEV_STATS_GET,
 	ENA_TOUCH(rsp);
 	ENA_TOUCH(ena_dev);
 	if (stats != &adapter->basic_stats)
-		rte_memcpy(stats, &adapter->basic_stats, sizeof(*stats));
+		memcpy(stats, &adapter->basic_stats, sizeof(*stats));
 }),
 	struct ena_com_dev *ena_dev, struct ena_admin_basic_stats *stats);
 
@@ -496,7 +496,7 @@  ENA_PROXY_DESC(ena_com_get_eni_stats, ENA_MP_ENI_STATS_GET,
 	ENA_TOUCH(rsp);
 	ENA_TOUCH(ena_dev);
 	if (stats != (struct ena_admin_eni_stats *)&adapter->eni_stats)
-		rte_memcpy(stats, &adapter->eni_stats, sizeof(*stats));
+		memcpy(stats, &adapter->eni_stats, sizeof(*stats));
 }),
 	struct ena_com_dev *ena_dev, struct ena_admin_eni_stats *stats);
 
@@ -538,8 +538,8 @@  ENA_PROXY_DESC(ena_com_indirect_table_get, ENA_MP_IND_TBL_GET,
 	ENA_TOUCH(rsp);
 	ENA_TOUCH(ena_dev);
 	if (ind_tbl != adapter->indirect_table)
-		rte_memcpy(ind_tbl, adapter->indirect_table,
-			   sizeof(adapter->indirect_table));
+		memcpy(ind_tbl, adapter->indirect_table,
+		       sizeof(adapter->indirect_table));
 }),
 	struct ena_com_dev *ena_dev, u32 *ind_tbl);
 
diff --git a/drivers/net/ena/ena_rss.c b/drivers/net/ena/ena_rss.c
index b6c4f76e3820..c723d3f5fca1 100644
--- a/drivers/net/ena/ena_rss.c
+++ b/drivers/net/ena/ena_rss.c
@@ -59,7 +59,7 @@  void ena_rss_key_fill(void *key, size_t size)
 		key_generated = true;
 	}
 
-	rte_memcpy(key, default_key, size);
+	memcpy(key, default_key, size);
 }
 
 int ena_rss_reta_update(struct rte_eth_dev *dev,