[v3,2/2] ring: use wfe to wait for ring tail update on aarch64

Message ID: 20210425055653.1509261-3-ruifeng.wang@arm.com (mailing list archive)
State: Superseded, archived
Delegated to: David Marchand
Series: [v3,1/2] spinlock: use wfe to reduce contention on aarch64

Checks

Context                      Check    Description
ci/checkpatch                success  coding style OK
ci/iol-intel-Functional      success  Functional Testing PASS
ci/iol-intel-Performance     success  Performance Testing PASS
ci/iol-abi-testing           success  Testing PASS
ci/iol-testing               success  Testing PASS
ci/github-robot              success  github build: passed
ci/Intel-compilation         success  Compilation OK
ci/intel-Testing             success  Testing PASS
ci/iol-mellanox-Performance  success  Performance Testing PASS

Commit Message

Ruifeng Wang April 25, 2021, 5:56 a.m. UTC
  Instead of polling for tail to be updated, use wfe instruction.

Signed-off-by: Gavin Hu <gavin.hu@arm.com>
Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
 lib/ring/rte_ring_c11_pvt.h     | 4 ++--
 lib/ring/rte_ring_generic_pvt.h | 3 +--
 2 files changed, 3 insertions(+), 4 deletions(-)
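
For readers without the DPDK tree at hand, the change can be sketched in portable C as follows. This is a minimal sketch, not the DPDK implementation: `cpu_relax`, `wait_until_equal_32`, `struct headtail`, and `update_tail` are hypothetical stand-ins for `rte_pause`, `rte_wait_until_equal_32`, `struct rte_ring_headtail`, and `__rte_ring_update_tail`. It assumes GCC or Clang for the `__atomic` builtins, and the helper body is only a plain spin loop; the wfe-based wait that the patch targets on aarch64 is not shown.

```c
#include <stdint.h>

/* Hypothetical stand-in for rte_pause(): a spin-wait hint to the CPU. */
static inline void cpu_relax(void)
{
#if defined(__x86_64__) || defined(__i386__)
	__builtin_ia32_pause();
#elif defined(__aarch64__)
	__asm__ volatile("yield" ::: "memory");
#endif
}

/*
 * Hypothetical generic fallback: spin until *addr equals expected.
 * An architecture-specific version could sleep in wfe instead of spinning.
 */
static inline void
wait_until_equal_32(uint32_t *addr, uint32_t expected, int memorder)
{
	while (__atomic_load_n(addr, memorder) != expected)
		cpu_relax();
}

/* Simplified stand-in for struct rte_ring_headtail. */
struct headtail {
	uint32_t head;
	uint32_t tail;
};

/* Mirrors the shape of the patched C11 variant of the tail update. */
static inline void
update_tail(struct headtail *ht, uint32_t old_val, uint32_t new_val,
	    int single)
{
	/* With multiple producers/consumers, wait for the preceding ones. */
	if (!single)
		wait_until_equal_32(&ht->tail, old_val, __ATOMIC_RELAXED);

	__atomic_store_n(&ht->tail, new_val, __ATOMIC_RELEASE);
}
```

The call site keeps the same semantics as the old `while (...) rte_pause();` loop. Only the waiting strategy moves behind the helper, which is what lets an architecture substitute a cheaper wait.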
  

Comments

Jerin Jacob April 26, 2021, 5:38 a.m. UTC | #1
On Sun, Apr 25, 2021 at 11:27 AM Ruifeng Wang <ruifeng.wang@arm.com> wrote:
>
> Instead of polling for tail to be updated, use wfe instruction.
>
> Signed-off-by: Gavin Hu <gavin.hu@arm.com>
> Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
> Reviewed-by: Steve Capper <steve.capper@arm.com>
> Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>

Acked-by: Jerin Jacob <jerinj@marvell.com>



> ---
>  lib/ring/rte_ring_c11_pvt.h     | 4 ++--
>  lib/ring/rte_ring_generic_pvt.h | 3 +--
>  2 files changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/lib/ring/rte_ring_c11_pvt.h b/lib/ring/rte_ring_c11_pvt.h
> index 759192f4c4..37e0b2afd6 100644
> --- a/lib/ring/rte_ring_c11_pvt.h
> +++ b/lib/ring/rte_ring_c11_pvt.h
> @@ -2,6 +2,7 @@
>   *
>   * Copyright (c) 2017,2018 HXT-semitech Corporation.
>   * Copyright (c) 2007-2009 Kip Macy kmacy@freebsd.org
> + * Copyright (c) 2021 Arm Limited
>   * All rights reserved.
>   * Derived from FreeBSD's bufring.h
>   * Used as BSD-3 Licensed with permission from Kip Macy.
> @@ -21,8 +22,7 @@ __rte_ring_update_tail(struct rte_ring_headtail *ht, uint32_t old_val,
>          * we need to wait for them to complete
>          */
>         if (!single)
> -               while (unlikely(ht->tail != old_val))
> -                       rte_pause();
> +               rte_wait_until_equal_32(&ht->tail, old_val, __ATOMIC_RELAXED);
>
>         __atomic_store_n(&ht->tail, new_val, __ATOMIC_RELEASE);
>  }
> diff --git a/lib/ring/rte_ring_generic_pvt.h b/lib/ring/rte_ring_generic_pvt.h
> index 532deb5e7a..c95ad7e12c 100644
> --- a/lib/ring/rte_ring_generic_pvt.h
> +++ b/lib/ring/rte_ring_generic_pvt.h
> @@ -23,8 +23,7 @@ __rte_ring_update_tail(struct rte_ring_headtail *ht, uint32_t old_val,
>          * we need to wait for them to complete
>          */
>         if (!single)
> -               while (unlikely(ht->tail != old_val))
> -                       rte_pause();
> +               rte_wait_until_equal_32(&ht->tail, old_val, __ATOMIC_RELAXED);
>
>         ht->tail = new_val;
>  }
> --
> 2.25.1
>
  
Stephen Hemminger April 28, 2021, 5:17 p.m. UTC | #2
On Sun, 25 Apr 2021 05:56:53 +0000
Ruifeng Wang <ruifeng.wang@arm.com> wrote:

> Instead of polling for tail to be updated, use wfe instruction.
> 
> Signed-off-by: Gavin Hu <gavin.hu@arm.com>
> Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
> Reviewed-by: Steve Capper <steve.capper@arm.com>
> Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>

Looks ok to me, but it does raise an interesting question.
Shouldn't the original code have been using an atomic load to look at ht->tail?

This is another place where "volatile considered harmful" applies.
  
Ruifeng Wang April 29, 2021, 2:35 p.m. UTC | #3
> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Thursday, April 29, 2021 1:17 AM
> To: Ruifeng Wang <Ruifeng.Wang@arm.com>
> Cc: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; Konstantin
> Ananyev <konstantin.ananyev@intel.com>; dev@dpdk.org;
> david.marchand@redhat.com; thomas@monjalon.net; jerinj@marvell.com;
> nd <nd@arm.com>; Gavin Hu <Gavin.Hu@arm.com>; Steve Capper
> <Steve.Capper@arm.com>; Ola Liljedahl <Ola.Liljedahl@arm.com>
> Subject: Re: [dpdk-dev] [PATCH v3 2/2] ring: use wfe to wait for ring tail
> update on aarch64
> 
> On Sun, 25 Apr 2021 05:56:53 +0000
> Ruifeng Wang <ruifeng.wang@arm.com> wrote:
> 
> > Instead of polling for tail to be updated, use wfe instruction.
> >
> > Signed-off-by: Gavin Hu <gavin.hu@arm.com>
> > Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
> > Reviewed-by: Steve Capper <steve.capper@arm.com>
> > Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
> > Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> > Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
> 
> Looks ok to me, but it does raise an interesting question.
> Shouldn't the original code have been using an atomic load to look at ht->tail?
> 
> This is another place where "volatile considered harmful" applies.

Do you mean 'volatile' should be removed from rte_wait_until_equal_xxx parameters?
  
Stephen Hemminger April 29, 2021, 3:05 p.m. UTC | #4
On Thu, 29 Apr 2021 14:35:35 +0000
Ruifeng Wang <Ruifeng.Wang@arm.com> wrote:

> > -----Original Message-----
> > From: Stephen Hemminger <stephen@networkplumber.org>
> > Sent: Thursday, April 29, 2021 1:17 AM
> > To: Ruifeng Wang <Ruifeng.Wang@arm.com>
> > Cc: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; Konstantin
> > Ananyev <konstantin.ananyev@intel.com>; dev@dpdk.org;
> > david.marchand@redhat.com; thomas@monjalon.net; jerinj@marvell.com;
> > nd <nd@arm.com>; Gavin Hu <Gavin.Hu@arm.com>; Steve Capper
> > <Steve.Capper@arm.com>; Ola Liljedahl <Ola.Liljedahl@arm.com>
> > Subject: Re: [dpdk-dev] [PATCH v3 2/2] ring: use wfe to wait for ring tail
> > update on aarch64
> > 
> > On Sun, 25 Apr 2021 05:56:53 +0000
> > Ruifeng Wang <ruifeng.wang@arm.com> wrote:
> >   
> > > Instead of polling for tail to be updated, use wfe instruction.
> > >
> > > Signed-off-by: Gavin Hu <gavin.hu@arm.com>
> > > Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
> > > Reviewed-by: Steve Capper <steve.capper@arm.com>
> > > Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
> > > Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> > > Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>  
> > 
> > Looks ok to me, but it does raise an interesting question.
> > Shouldn't the original code have been using an atomic load to look at ht->tail?
> > 
> > This is another place where "volatile considered harmful" applies.
> 
> Do you mean 'volatile' should be removed from rte_wait_until_equal_xxx parameters?
> 

I meant that all accesses to tail should go through the C11 atomic builtins. At that point,
the volatile qualifier on the data structure elements no longer matters.
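
A minimal sketch of that idea, assuming GCC or Clang `__atomic` builtins: the `tail` field below is a plain `uint32_t` without `volatile`, and every access to it goes through an atomic load or store. `struct headtail`, `publish_tail`, and `entries_ready` are hypothetical names used for illustration only, not DPDK APIs.

```c
#include <stdint.h>

/* Hypothetical ring index pair; note that no field is volatile. */
struct headtail {
	uint32_t head;
	uint32_t tail;
};

/* Producer side: make entries up to new_tail visible to consumers. */
static inline void
publish_tail(struct headtail *prod, uint32_t new_tail)
{
	__atomic_store_n(&prod->tail, new_tail, __ATOMIC_RELEASE);
}

/* Consumer side: count entries between cons_head and the producer tail. */
static inline uint32_t
entries_ready(struct headtail *prod, uint32_t cons_head)
{
	/* Acquire pairs with the release store in publish_tail(). */
	uint32_t tail = __atomic_load_n(&prod->tail, __ATOMIC_ACQUIRE);

	return tail - cons_head; /* unsigned wraparound is intended */
}
```

Because each access names its memory order explicitly, the required visibility and ordering are stated at the point of use rather than implied by a qualifier on the field.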
  
Ruifeng Wang May 7, 2021, 8:25 a.m. UTC | #5
> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Thursday, April 29, 2021 11:06 PM
> To: Ruifeng Wang <Ruifeng.Wang@arm.com>
> Cc: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; Konstantin
> Ananyev <konstantin.ananyev@intel.com>; dev@dpdk.org;
> david.marchand@redhat.com; thomas@monjalon.net; jerinj@marvell.com;
> nd <nd@arm.com>; Gavin Hu <Gavin.Hu@arm.com>; Steve Capper
> <Steve.Capper@arm.com>; Ola Liljedahl <Ola.Liljedahl@arm.com>
> Subject: Re: [dpdk-dev] [PATCH v3 2/2] ring: use wfe to wait for ring tail
> update on aarch64
> 
> On Thu, 29 Apr 2021 14:35:35 +0000
> Ruifeng Wang <Ruifeng.Wang@arm.com> wrote:
> 
> > > -----Original Message-----
> > > From: Stephen Hemminger <stephen@networkplumber.org>
> > > Sent: Thursday, April 29, 2021 1:17 AM
> > > To: Ruifeng Wang <Ruifeng.Wang@arm.com>
> > > Cc: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; Konstantin
> > > Ananyev <konstantin.ananyev@intel.com>; dev@dpdk.org;
> > > david.marchand@redhat.com; thomas@monjalon.net; jerinj@marvell.com;
> > > nd <nd@arm.com>; Gavin Hu <Gavin.Hu@arm.com>; Steve Capper
> > > <Steve.Capper@arm.com>; Ola Liljedahl <Ola.Liljedahl@arm.com>
> > > Subject: Re: [dpdk-dev] [PATCH v3 2/2] ring: use wfe to wait for
> > > ring tail update on aarch64
> > >
> > > On Sun, 25 Apr 2021 05:56:53 +0000
> > > Ruifeng Wang <ruifeng.wang@arm.com> wrote:
> > >
> > > > Instead of polling for tail to be updated, use wfe instruction.
> > > >
> > > > Signed-off-by: Gavin Hu <gavin.hu@arm.com>
> > > > Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
> > > > Reviewed-by: Steve Capper <steve.capper@arm.com>
> > > > Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
> > > > Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> > > > Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
> > >
> > > Looks ok to me, but it does raise an interesting question.
> > > Shouldn't the original code have been using an atomic load to look at ht->tail?
> > >
> > > This is another place where "volatile considered harmful" applies.
> >
> > Do you mean 'volatile' should be removed from rte_wait_until_equal_xxx parameters?
> >
> 
> I meant that all accesses to tail should go through the C11 atomic builtins. At that point,
> the volatile qualifier on the data structure elements no longer matters.

Agreed. If synchronization is ensured by using the C11 atomic builtins, 'volatile' on the elements can be removed.
  

Patch

diff --git a/lib/ring/rte_ring_c11_pvt.h b/lib/ring/rte_ring_c11_pvt.h
index 759192f4c4..37e0b2afd6 100644
--- a/lib/ring/rte_ring_c11_pvt.h
+++ b/lib/ring/rte_ring_c11_pvt.h
@@ -2,6 +2,7 @@ 
  *
  * Copyright (c) 2017,2018 HXT-semitech Corporation.
  * Copyright (c) 2007-2009 Kip Macy kmacy@freebsd.org
+ * Copyright (c) 2021 Arm Limited
  * All rights reserved.
  * Derived from FreeBSD's bufring.h
  * Used as BSD-3 Licensed with permission from Kip Macy.
@@ -21,8 +22,7 @@  __rte_ring_update_tail(struct rte_ring_headtail *ht, uint32_t old_val,
 	 * we need to wait for them to complete
 	 */
 	if (!single)
-		while (unlikely(ht->tail != old_val))
-			rte_pause();
+		rte_wait_until_equal_32(&ht->tail, old_val, __ATOMIC_RELAXED);
 
 	__atomic_store_n(&ht->tail, new_val, __ATOMIC_RELEASE);
 }
diff --git a/lib/ring/rte_ring_generic_pvt.h b/lib/ring/rte_ring_generic_pvt.h
index 532deb5e7a..c95ad7e12c 100644
--- a/lib/ring/rte_ring_generic_pvt.h
+++ b/lib/ring/rte_ring_generic_pvt.h
@@ -23,8 +23,7 @@  __rte_ring_update_tail(struct rte_ring_headtail *ht, uint32_t old_val,
 	 * we need to wait for them to complete
 	 */
 	if (!single)
-		while (unlikely(ht->tail != old_val))
-			rte_pause();
+		rte_wait_until_equal_32(&ht->tail, old_val, __ATOMIC_RELAXED);
 
 	ht->tail = new_val;
 }