[v2] eal/arm: remove CASP constraints for GCC

Message ID 20211105085712.3220-1-pbhagavatula@marvell.com (mailing list archive)
State Accepted, archived
Delegated to: David Marchand
Series [v2] eal/arm: remove CASP constraints for GCC

Checks

Context                          Check    Description
ci/checkpatch                    warning  coding style issues
ci/github-robot: build           success  github build: passed
ci/Intel-compilation             success  Compilation OK
ci/intel-Testing                 success  Testing PASS
ci/iol-mellanox-Performance      success  Performance Testing PASS
ci/iol-broadcom-Performance      success  Performance Testing PASS
ci/iol-broadcom-Functional       success  Functional Testing PASS
ci/iol-intel-Performance         success  Performance Testing PASS
ci/iol-intel-Functional          success  Functional Testing PASS
ci/iol-aarch64-compile-testing   success  Testing PASS
ci/iol-x86_64-compile-testing    success  Testing PASS
ci/iol-x86_64-unit-testing       success  Testing PASS
ci/iol-aarch64-unit-testing      success  Testing PASS
ci/iol-abi-testing               success  Testing PASS

Commit Message

Pavan Nikhilesh Bhagavatula Nov. 5, 2021, 8:57 a.m. UTC
  From: Pavan Nikhilesh <pbhagavatula@marvell.com>

GCC now assigns even register pairs for CASP; the fix has also been
backported to all stable releases of older GCC versions.
Removing the manual register allocation allows GCC to inline the
functions and pick optimal registers for performing CASP.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
---
 v2 Changes:
 - Remove unnecessary LSE_PREAMBLE for GCC (Ruifeng).

 lib/eal/arm/include/rte_atomic_64.h | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)

--
2.17.1
  

Comments

Ruifeng Wang Nov. 8, 2021, 7:15 a.m. UTC | #1
> -----Original Message-----
> From: pbhagavatula@marvell.com <pbhagavatula@marvell.com>
> Sent: Friday, November 5, 2021 4:57 PM
> To: Ruifeng Wang <Ruifeng.Wang@arm.com>; david.marchand@redhat.com;
> jerinj@marvell.com
> Cc: dev@dpdk.org; Pavan Nikhilesh <pbhagavatula@marvell.com>
> Subject: [dpdk-dev] [PATCH v2] eal/arm: remove CASP constraints for GCC
> 
> From: Pavan Nikhilesh <pbhagavatula@marvell.com>
> 
> GCC now assigns even register pairs for CASP, the fix has also been
> backported to all stable releases of older GCC versions.
> Removing the manual register allocation allows GCC to inline the functions
> and pick optimal registers for performing CASP.
> 
> Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
> ---
>  v2 Changes:
>  - Remove unnecessary LSE_PREAMBLE for GCC (Ruifeng).
> 
>  lib/eal/arm/include/rte_atomic_64.h | 21 ++++++++++++++-------
>  1 file changed, 14 insertions(+), 7 deletions(-)
> 
Acked-by: Ruifeng Wang <ruifeng.wang@arm.com>
  
David Marchand Nov. 16, 2021, 2:56 p.m. UTC | #2
On Mon, Nov 8, 2021 at 8:15 AM Ruifeng Wang <Ruifeng.Wang@arm.com> wrote:
>
> > -----Original Message-----
> > From: pbhagavatula@marvell.com <pbhagavatula@marvell.com>
> > Sent: Friday, November 5, 2021 4:57 PM
> > To: Ruifeng Wang <Ruifeng.Wang@arm.com>; david.marchand@redhat.com;
> > jerinj@marvell.com
> > Cc: dev@dpdk.org; Pavan Nikhilesh <pbhagavatula@marvell.com>
> > Subject: [dpdk-dev] [PATCH v2] eal/arm: remove CASP constraints for GCC
> >
> > From: Pavan Nikhilesh <pbhagavatula@marvell.com>
> >
> > GCC now assigns even register pairs for CASP, the fix has also been
> > backported to all stable releases of older GCC versions.
> > Removing the manual register allocation allows GCC to inline the functions
> > and pick optimal registers for performing CASP.
> >
> > Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
> Acked-by: Ruifeng Wang <ruifeng.wang@arm.com>

Patch lgtm, but it is too late to merge it in 21.11.

It is in EAL, and is an optimisation of the 128-bit CAS operation on ARM.
This is used by the stack library and mempool.
There might be other impacts I did not think of.


Do you have links to bugs or commits for the mentioned fix on the gcc side?
This will help when we get reports from users with compilers without the fix.


Thanks.
  
Pavan Nikhilesh Bhagavatula Jan. 20, 2022, 3:32 p.m. UTC | #3
>On Mon, Nov 8, 2021 at 8:15 AM Ruifeng Wang <Ruifeng.Wang@arm.com> wrote:
>>
>> > -----Original Message-----
>> > From: pbhagavatula@marvell.com <pbhagavatula@marvell.com>
>> > Sent: Friday, November 5, 2021 4:57 PM
>> > To: Ruifeng Wang <Ruifeng.Wang@arm.com>; david.marchand@redhat.com;
>> > jerinj@marvell.com
>> > Cc: dev@dpdk.org; Pavan Nikhilesh <pbhagavatula@marvell.com>
>> > Subject: [dpdk-dev] [PATCH v2] eal/arm: remove CASP constraints for GCC
>> >
>> > From: Pavan Nikhilesh <pbhagavatula@marvell.com>
>> >
>> > GCC now assigns even register pairs for CASP, the fix has also been
>> > backported to all stable releases of older GCC versions.
>> > Removing the manual register allocation allows GCC to inline the functions
>> > and pick optimal registers for performing CASP.
>> >
>> > Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
>> Acked-by: Ruifeng Wang <ruifeng.wang@arm.com>
>
>Patch lgtm but it is late for merging in 21.11.
>
>It is in EAL, and is an optimisation of the 128 bits cas operation on ARM.
>This is used by the stack library and mempool.
>There might be other impacts I did not think of.
>
>
>Do you have links to bugs or commits for the mentioned fix on gcc side?

Here is the gcc git commit that fixes this.

https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=563cc649beaf11d707c422e5f4e9e5cdacb818c3

>This will help when we get reports from users with compilers without the fix.
>
>
>Thanks.
>
>--
>David Marchand
  
David Marchand Feb. 11, 2022, 7:53 a.m. UTC | #4
On Tue, Nov 16, 2021 at 3:56 PM David Marchand
<david.marchand@redhat.com> wrote:
> > > GCC now assigns even register pairs for CASP, the fix has also been
> > > backported to all stable releases of older GCC versions.
> > > Removing the manual register allocation allows GCC to inline the functions
> > > and pick optimal registers for performing CASP.
> > >
> > > Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
> > Acked-by: Ruifeng Wang <ruifeng.wang@arm.com>
>

I added a reference to the gcc commit and applied, thanks.
  

Patch

diff --git a/lib/eal/arm/include/rte_atomic_64.h b/lib/eal/arm/include/rte_atomic_64.h
index fa6f334c0d..6047911507 100644
--- a/lib/eal/arm/include/rte_atomic_64.h
+++ b/lib/eal/arm/include/rte_atomic_64.h
@@ -46,12 +46,8 @@  rte_atomic_thread_fence(int memorder)
 /*------------------------ 128 bit atomic operations -------------------------*/

 #if defined(__ARM_FEATURE_ATOMICS) || defined(RTE_ARM_FEATURE_ATOMICS)
-#if defined(RTE_CC_CLANG)
-#define __LSE_PREAMBLE	".arch armv8-a+lse\n"
-#else
-#define __LSE_PREAMBLE	""
-#endif

+#if defined(RTE_CC_CLANG)
 #define __ATOMIC128_CAS_OP(cas_op_name, op_string)                          \
 static __rte_noinline void                                                  \
 cas_op_name(rte_int128_t *dst, rte_int128_t *old, rte_int128_t updated)     \
@@ -65,7 +61,7 @@  cas_op_name(rte_int128_t *dst, rte_int128_t *old, rte_int128_t updated)     \
 	register uint64_t x2 __asm("x2") = (uint64_t)updated.val[0];        \
 	register uint64_t x3 __asm("x3") = (uint64_t)updated.val[1];        \
 	asm volatile(                                                       \
-		__LSE_PREAMBLE						    \
+		".arch armv8-a+lse\n"                                       \
 		op_string " %[old0], %[old1], %[upd0], %[upd1], [%[dst]]"   \
 		: [old0] "+r" (x0),                                         \
 		[old1] "+r" (x1)                                            \
@@ -76,13 +72,24 @@  cas_op_name(rte_int128_t *dst, rte_int128_t *old, rte_int128_t updated)     \
 	old->val[0] = x0;                                                   \
 	old->val[1] = x1;                                                   \
 }
+#else
+#define __ATOMIC128_CAS_OP(cas_op_name, op_string)                          \
+static __rte_always_inline void                                             \
+cas_op_name(rte_int128_t *dst, rte_int128_t *old, rte_int128_t updated)     \
+{                                                                           \
+	asm volatile(                                                       \
+		op_string " %[old], %H[old], %[upd], %H[upd], [%[dst]]"     \
+		: [old] "+r"(old->int128)                                   \
+		: [upd] "r"(updated.int128), [dst] "r"(dst)                 \
+		: "memory");                                                \
+}
+#endif

 __ATOMIC128_CAS_OP(__cas_128_relaxed, "casp")
 __ATOMIC128_CAS_OP(__cas_128_acquire, "caspa")
 __ATOMIC128_CAS_OP(__cas_128_release, "caspl")
 __ATOMIC128_CAS_OP(__cas_128_acq_rel, "caspal")

-#undef __LSE_PREAMBLE
 #undef __ATOMIC128_CAS_OP

 #endif
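
Illustrative note (not part of the patch): for readers unfamiliar with CASP,
the following is a minimal, single-threaded C model of what each __cas_128_*
helper is expected to do, and of how a caller can derive success from it.
It is a sketch under stated assumptions, not DPDK code: model_int128_t is a
hypothetical stand-in for rte_int128_t, model_cas_128 and
model_cmp_exchange_128 are illustrative names, and nothing here is atomic.
In the GCC variant above, %[old] and %H[old] name the lower and higher
numbered registers of the pair that holds the 128-bit operand, so the
compiler, rather than the source, chooses the pair.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for rte_int128_t: two 64-bit halves aliasing one
 * 128-bit integer. Requires a 64-bit GCC or Clang for __int128. */
typedef union {
	uint64_t val[2];
	__extension__ __int128 int128;
} model_int128_t;

/* Single-threaded model of one CASP: NOT atomic, for illustration only.
 * Memory is compared with *old; on a match the new value is stored.
 * Either way, *old receives the value that was read from memory. */
static void
model_cas_128(model_int128_t *dst, model_int128_t *old, model_int128_t updated)
{
	assert(dst != NULL && old != NULL);

	model_int128_t observed = *dst;         /* value currently in memory */

	if (observed.int128 == old->int128)
		*dst = updated;                 /* match: store the new value */
	*old = observed;                        /* always report what was read */
}

/* Assumed caller pattern: success means the observed value equals the
 * expected one. On failure, *exp is refreshed with the observed value. */
static bool
model_cmp_exchange_128(model_int128_t *dst, model_int128_t *exp,
		       const model_int128_t *src)
{
	model_int128_t expected = *exp;
	model_int128_t observed = expected;

	model_cas_128(dst, &observed, *src);

	bool ok = (observed.int128 == expected.int128);
	if (!ok)
		*exp = observed;
	return ok;
}
```

Because the expected value is refreshed on failure, a retry loop built on
this pattern can reuse it without reloading from memory first.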