diff mbox series

[v2] eal/arm: remove CASP constraints for GCC

Message ID 20211105085712.3220-1-pbhagavatula@marvell.com (mailing list archive)
State New
Delegated to: David Marchand

Checks

Context                          Check    Description
ci/iol-x86_64-unit-testing       fail     Testing issues
ci/iol-aarch64-compile-testing   success  Testing PASS
ci/iol-intel-Functional          success  Functional Testing PASS
ci/iol-x86_64-compile-testing    success  Testing PASS
ci/iol-intel-Performance         success  Performance Testing PASS
ci/iol-aarch64-unit-testing      success  Testing PASS
ci/intel-Testing                 success  Testing PASS
ci/Intel-compilation             success  Compilation OK
ci/iol-broadcom-Performance      success  Performance Testing PASS
ci/iol-broadcom-Functional       success  Functional Testing PASS
ci/iol-mellanox-Performance      success  Performance Testing PASS
ci/github-robot: build           success  github build: passed
ci/checkpatch                    warning  coding style issues

Commit Message

Pavan Nikhilesh Bhagavatula Nov. 5, 2021, 8:57 a.m. UTC
From: Pavan Nikhilesh <pbhagavatula@marvell.com>

GCC now assigns even register pairs for CASP, and the fix has also been
backported to all stable releases of older GCC versions.
Removing the manual register allocation allows GCC to inline the
functions and pick optimal registers for performing CASP.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
---
 v2 Changes:
 - Remove unnecessary LSE_PREAMBLE for GCC (Ruifeng).

 lib/eal/arm/include/rte_atomic_64.h | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)

--
2.17.1

Comments

Ruifeng Wang Nov. 8, 2021, 7:15 a.m. UTC | #1
> -----Original Message-----
> From: pbhagavatula@marvell.com <pbhagavatula@marvell.com>
> Sent: Friday, November 5, 2021 4:57 PM
> To: Ruifeng Wang <Ruifeng.Wang@arm.com>; david.marchand@redhat.com;
> jerinj@marvell.com
> Cc: dev@dpdk.org; Pavan Nikhilesh <pbhagavatula@marvell.com>
> Subject: [dpdk-dev] [PATCH v2] eal/arm: remove CASP constraints for GCC
> 
> From: Pavan Nikhilesh <pbhagavatula@marvell.com>
> 
> GCC now assigns even register pairs for CASP, the fix has also been
> backported to all stable releases of older GCC versions.
> Removing the manual register allocation allows GCC to inline the functions
> and pick optimal registers for performing CASP.
> 
> Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
> ---
>  v2 Changes:
>  - Remove unnecessary LSE_PREAMBLE for GCC (Ruifeng).
> 
>  lib/eal/arm/include/rte_atomic_64.h | 21 ++++++++++++++-------
>  1 file changed, 14 insertions(+), 7 deletions(-)
> 
Acked-by: Ruifeng Wang <ruifeng.wang@arm.com>

David Marchand Nov. 16, 2021, 2:56 p.m. UTC | #2
On Mon, Nov 8, 2021 at 8:15 AM Ruifeng Wang <Ruifeng.Wang@arm.com> wrote:
>
> > -----Original Message-----
> > From: pbhagavatula@marvell.com <pbhagavatula@marvell.com>
> > Sent: Friday, November 5, 2021 4:57 PM
> > To: Ruifeng Wang <Ruifeng.Wang@arm.com>; david.marchand@redhat.com;
> > jerinj@marvell.com
> > Cc: dev@dpdk.org; Pavan Nikhilesh <pbhagavatula@marvell.com>
> > Subject: [dpdk-dev] [PATCH v2] eal/arm: remove CASP constraints for GCC
> >
> > From: Pavan Nikhilesh <pbhagavatula@marvell.com>
> >
> > GCC now assigns even register pairs for CASP, the fix has also been
> > backported to all stable releases of older GCC versions.
> > Removing the manual register allocation allows GCC to inline the functions
> > and pick optimal registers for performing CASP.
> >
> > Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
> Acked-by: Ruifeng Wang <ruifeng.wang@arm.com>

Patch lgtm but it is late for merging in 21.11.

It is in EAL, and is an optimisation of the 128-bit CAS operation on ARM.
This is used by the stack library and mempool.
There might be other impacts I did not think of.


Do you have links to bugs or commits for the mentioned fix on the GCC side?
This will help when we get reports from users with compilers without the fix.


Thanks.

Pavan Nikhilesh Bhagavatula Jan. 20, 2022, 3:32 p.m. UTC | #3
>On Mon, Nov 8, 2021 at 8:15 AM Ruifeng Wang
><Ruifeng.Wang@arm.com> wrote:
>>
>> > -----Original Message-----
>> > From: pbhagavatula@marvell.com <pbhagavatula@marvell.com>
>> > Sent: Friday, November 5, 2021 4:57 PM
>> > To: Ruifeng Wang <Ruifeng.Wang@arm.com>;
>david.marchand@redhat.com;
>> > jerinj@marvell.com
>> > Cc: dev@dpdk.org; Pavan Nikhilesh <pbhagavatula@marvell.com>
>> > Subject: [dpdk-dev] [PATCH v2] eal/arm: remove CASP constraints
>for GCC
>> >
>> > From: Pavan Nikhilesh <pbhagavatula@marvell.com>
>> >
>> > GCC now assigns even register pairs for CASP, the fix has also been
>> > backported to all stable releases of older GCC versions.
>> > Removing the manual register allocation allows GCC to inline the
>functions
>> > and pick optimal registers for performing CASP.
>> >
>> > Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
>> Acked-by: Ruifeng Wang <ruifeng.wang@arm.com>
>
>Patch lgtm but it is late for merging in 21.11.
>
>It is in EAL, and is an optimisation of the 128 bits cas operation on ARM.
>This is used by the stack library and mempool.
>There might be other impacts I did not think of.
>
>
>Do you have links to bugs or commits for the mentioned fix on gcc
>side?

Here is the gcc git commit that fixes this.

https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=563cc649beaf11d707c422e5f4e9e5cdacb818c3

>This will help when we get reports from users with compilers without
>the fix.
>
>
>Thanks.
>
>--
>David Marchand

Patch

diff --git a/lib/eal/arm/include/rte_atomic_64.h b/lib/eal/arm/include/rte_atomic_64.h
index fa6f334c0d..6047911507 100644
--- a/lib/eal/arm/include/rte_atomic_64.h
+++ b/lib/eal/arm/include/rte_atomic_64.h
@@ -46,12 +46,8 @@  rte_atomic_thread_fence(int memorder)
 /*------------------------ 128 bit atomic operations -------------------------*/

 #if defined(__ARM_FEATURE_ATOMICS) || defined(RTE_ARM_FEATURE_ATOMICS)
-#if defined(RTE_CC_CLANG)
-#define __LSE_PREAMBLE	".arch armv8-a+lse\n"
-#else
-#define __LSE_PREAMBLE	""
-#endif

+#if defined(RTE_CC_CLANG)
 #define __ATOMIC128_CAS_OP(cas_op_name, op_string)                          \
 static __rte_noinline void                                                  \
 cas_op_name(rte_int128_t *dst, rte_int128_t *old, rte_int128_t updated)     \
@@ -65,7 +61,7 @@  cas_op_name(rte_int128_t *dst, rte_int128_t *old, rte_int128_t updated)     \
 	register uint64_t x2 __asm("x2") = (uint64_t)updated.val[0];        \
 	register uint64_t x3 __asm("x3") = (uint64_t)updated.val[1];        \
 	asm volatile(                                                       \
-		__LSE_PREAMBLE						    \
+		".arch armv8-a+lse\n"                                       \
 		op_string " %[old0], %[old1], %[upd0], %[upd1], [%[dst]]"   \
 		: [old0] "+r" (x0),                                         \
 		[old1] "+r" (x1)                                            \
@@ -76,13 +72,24 @@  cas_op_name(rte_int128_t *dst, rte_int128_t *old, rte_int128_t updated)     \
 	old->val[0] = x0;                                                   \
 	old->val[1] = x1;                                                   \
 }
+#else
+#define __ATOMIC128_CAS_OP(cas_op_name, op_string)                          \
+static __rte_always_inline void                                             \
+cas_op_name(rte_int128_t *dst, rte_int128_t *old, rte_int128_t updated)     \
+{                                                                           \
+	asm volatile(                                                       \
+		op_string " %[old], %H[old], %[upd], %H[upd], [%[dst]]"     \
+		: [old] "+r"(old->int128)                                   \
+		: [upd] "r"(updated.int128), [dst] "r"(dst)                 \
+		: "memory");                                                \
+}
+#endif

 __ATOMIC128_CAS_OP(__cas_128_relaxed, "casp")
 __ATOMIC128_CAS_OP(__cas_128_acquire, "caspa")
 __ATOMIC128_CAS_OP(__cas_128_release, "caspl")
 __ATOMIC128_CAS_OP(__cas_128_acq_rel, "caspal")

-#undef __LSE_PREAMBLE
 #undef __ATOMIC128_CAS_OP

 #endif
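
The behaviour of the GCC branch added by this patch can be sketched outside DPDK as follows. This is only an illustration under stated assumptions: sketch_int128_t is a hypothetical stand-in for rte_int128_t, and sketch_cas_128 and sketch_cmp_exchange_128 are made-up names, not DPDK API. The inline assembly path is compiled only for GCC on AArch64 with LSE enabled and assumes a GCC that contains the even register pair fix. Every other build uses a plain, non-atomic reference model that shows the same in/out contract for `old` but must not be used for real synchronisation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for rte_int128_t: two 64-bit halves that alias
 * one 128-bit integer. __int128 gives the union 16-byte alignment. */
typedef union {
	uint64_t val[2];
	__extension__ __int128 int128;
} sketch_int128_t;

/* Compare *dst with *old; if equal, store updated into *dst.
 * On return *old always holds the value that was observed in *dst. */
static inline void
sketch_cas_128(sketch_int128_t *dst, sketch_int128_t *old,
	       sketch_int128_t updated)
{
#if defined(__aarch64__) && defined(__ARM_FEATURE_ATOMICS) && \
	!defined(__clang__)
	/* Same operand pattern as the GCC branch of the patch: the compiler
	 * picks the register pairs, and %H names the second register. */
	__asm__ volatile(
		"caspal %[old], %H[old], %[upd], %H[upd], [%[dst]]"
		: [old] "+r"(old->int128)
		: [upd] "r"(updated.int128), [dst] "r"(dst)
		: "memory");
#else
	/* Assumption: non-atomic reference model, for illustration only. */
	sketch_int128_t cur = *dst;

	if (cur.int128 == old->int128)
		*dst = updated;
	*old = cur;
#endif
}

/* Compare-exchange wrapper: returns true on success; on failure *exp is
 * refreshed with the value currently stored in *dst. */
static inline bool
sketch_cmp_exchange_128(sketch_int128_t *dst, sketch_int128_t *exp,
			sketch_int128_t src)
{
	sketch_int128_t expected = *exp;
	sketch_int128_t old = expected;
	bool ok;

	sketch_cas_128(dst, &old, src);
	ok = (old.int128 == expected.int128);
	if (!ok)
		*exp = old;
	return ok;
}
```

Because the operands are passed as whole 128-bit values rather than pinned to x0-x3, the helper can be inlined into its caller, which is the optimisation the commit message describes.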