[dpdk-dev,v5,18/19] ring: add sched_yield to avoid spin forever
Commit Message
Add a sched_yield() syscall if the thread spins for too long, waiting other thread to finish its operations on the ring.
That gives pre-empted thread a chance to proceed and finish with ring enqnue/dequeue operation.
The purpose is to reduce contention on the ring. By ring_perf_test, it doesn't shows additional perf penalty.
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
v5 changes:
add RTE_RING_PAUSE_REP to config file
v4 changes:
update and add more comments on sched_yield()
v3 changes:
new patch adding sched_yield() in rte_ring to avoid long spin
config/common_bsdapp | 1 +
config/common_linuxapp | 1 +
lib/librte_ring/rte_ring.h | 31 +++++++++++++++++++++++++++----
3 files changed, 29 insertions(+), 4 deletions(-)
Comments
Hi,
On 02/12/2015 09:16 AM, Cunming Liang wrote:
> Add a sched_yield() syscall if the thread spins for too long, waiting other thread to finish its operations on the ring.
> That gives pre-empted thread a chance to proceed and finish with ring enqnue/dequeue operation.
> The purpose is to reduce contention on the ring. By ring_perf_test, it doesn't shows additional perf penalty.
>
> Signed-off-by: Cunming Liang <cunming.liang@intel.com>
> ---
> v5 changes:
> add RTE_RING_PAUSE_REP to config file
>
> v4 changes:
> update and add more comments on sched_yield()
>
> v3 changes:
> new patch adding sched_yield() in rte_ring to avoid long spin
>
> config/common_bsdapp | 1 +
> config/common_linuxapp | 1 +
> lib/librte_ring/rte_ring.h | 31 +++++++++++++++++++++++++++----
> 3 files changed, 29 insertions(+), 4 deletions(-)
>
> diff --git a/config/common_bsdapp b/config/common_bsdapp
> index 57bacb8..52c5143 100644
> --- a/config/common_bsdapp
> +++ b/config/common_bsdapp
> @@ -234,6 +234,7 @@ CONFIG_RTE_PMD_PACKET_PREFETCH=y
> CONFIG_RTE_LIBRTE_RING=y
> CONFIG_RTE_LIBRTE_RING_DEBUG=n
> CONFIG_RTE_RING_SPLIT_PROD_CONS=n
> +CONFIG_RTE_RING_PAUSE_REP=n
Maybe it's better to use CONFIG_RTE_RING_PAUSE_REP=0 instead?
If I understand well, it has to be set to an integer value to
enable it, am I correct?
Thanks,
Olivier
Hi,
> -----Original Message-----
> From: Olivier MATZ [mailto:olivier.matz@6wind.com]
> Sent: Thursday, February 12, 2015 7:16 PM
> To: Liang, Cunming; dev@dpdk.org
> Cc: Ananyev, Konstantin
> Subject: Re: [PATCH v5 18/19] ring: add sched_yield to avoid spin forever
>
> Hi,
>
> On 02/12/2015 09:16 AM, Cunming Liang wrote:
> > Add a sched_yield() syscall if the thread spins for too long, waiting other thread
> to finish its operations on the ring.
> > That gives pre-empted thread a chance to proceed and finish with ring
> enqnue/dequeue operation.
> > The purpose is to reduce contention on the ring. By ring_perf_test, it doesn't
> shows additional perf penalty.
> >
> > Signed-off-by: Cunming Liang <cunming.liang@intel.com>
> > ---
> > v5 changes:
> > add RTE_RING_PAUSE_REP to config file
> >
> > v4 changes:
> > update and add more comments on sched_yield()
> >
> > v3 changes:
> > new patch adding sched_yield() in rte_ring to avoid long spin
> >
> > config/common_bsdapp | 1 +
> > config/common_linuxapp | 1 +
> > lib/librte_ring/rte_ring.h | 31 +++++++++++++++++++++++++++----
> > 3 files changed, 29 insertions(+), 4 deletions(-)
> >
> > diff --git a/config/common_bsdapp b/config/common_bsdapp
> > index 57bacb8..52c5143 100644
> > --- a/config/common_bsdapp
> > +++ b/config/common_bsdapp
> > @@ -234,6 +234,7 @@ CONFIG_RTE_PMD_PACKET_PREFETCH=y
> > CONFIG_RTE_LIBRTE_RING=y
> > CONFIG_RTE_LIBRTE_RING_DEBUG=n
> > CONFIG_RTE_RING_SPLIT_PROD_CONS=n
> > +CONFIG_RTE_RING_PAUSE_REP=n
>
> Maybe it's better to use CONFIG_RTE_RING_PAUSE_REP=0 instead?
> If I understand well, it has to be set to an integer value to
> enable it, am I correct?
[LCM] If RTE_RING_PAUSE_REP=N (no define), by default will use 0. If it's set to 'y'(=1), will issue yield in the most frequent rate.
It also can set as integer to assign any number. All cases works for this configure.
One point is in configure file, just demonstrate the default way to use it.
It can't prevent to use anything unexpected. Except we rule the n & y illegal for this option.
The meaningful value of it can write in the doc.
>
> Thanks,
> Olivier
> -----Original Message-----
> From: Liang, Cunming
> Sent: Thursday, February 12, 2015 1:05 PM
> To: Olivier MATZ; dev@dpdk.org
> Cc: Ananyev, Konstantin
> Subject: RE: [PATCH v5 18/19] ring: add sched_yield to avoid spin forever
>
> Hi,
>
> > -----Original Message-----
> > From: Olivier MATZ [mailto:olivier.matz@6wind.com]
> > Sent: Thursday, February 12, 2015 7:16 PM
> > To: Liang, Cunming; dev@dpdk.org
> > Cc: Ananyev, Konstantin
> > Subject: Re: [PATCH v5 18/19] ring: add sched_yield to avoid spin forever
> >
> > Hi,
> >
> > On 02/12/2015 09:16 AM, Cunming Liang wrote:
> > > Add a sched_yield() syscall if the thread spins for too long, waiting other thread
> > to finish its operations on the ring.
> > > That gives pre-empted thread a chance to proceed and finish with ring
> > enqnue/dequeue operation.
> > > The purpose is to reduce contention on the ring. By ring_perf_test, it doesn't
> > shows additional perf penalty.
> > >
> > > Signed-off-by: Cunming Liang <cunming.liang@intel.com>
> > > ---
> > > v5 changes:
> > > add RTE_RING_PAUSE_REP to config file
> > >
> > > v4 changes:
> > > update and add more comments on sched_yield()
> > >
> > > v3 changes:
> > > new patch adding sched_yield() in rte_ring to avoid long spin
> > >
> > > config/common_bsdapp | 1 +
> > > config/common_linuxapp | 1 +
> > > lib/librte_ring/rte_ring.h | 31 +++++++++++++++++++++++++++----
> > > 3 files changed, 29 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/config/common_bsdapp b/config/common_bsdapp
> > > index 57bacb8..52c5143 100644
> > > --- a/config/common_bsdapp
> > > +++ b/config/common_bsdapp
> > > @@ -234,6 +234,7 @@ CONFIG_RTE_PMD_PACKET_PREFETCH=y
> > > CONFIG_RTE_LIBRTE_RING=y
> > > CONFIG_RTE_LIBRTE_RING_DEBUG=n
> > > CONFIG_RTE_RING_SPLIT_PROD_CONS=n
> > > +CONFIG_RTE_RING_PAUSE_REP=n
> >
> > Maybe it's better to use CONFIG_RTE_RING_PAUSE_REP=0 instead?
> > If I understand well, it has to be set to an integer value to
> > enable it, am I correct?
> [LCM] If RTE_RING_PAUSE_REP=N (no define), by default will use 0. If it's set to 'y'(=1), will issue yield in the most frequent rate.
> It also can set as integer to assign any number. All cases works for this configure.
> One point is in configure file, just demonstrate the default way to use it.
> It can't prevent to use anything unexpected. Except we rule the n & y illegal for this option.
> The meaningful value of it can write in the doc.
I also think that it is better avoid 'y' and 'n' (though as you said it would work), but use integers for all cases.
Less confusion for the users.
Konstantin
> >
> > Thanks,
> > Olivier
On Thu, Feb 12, 2015 at 01:08:43PM +0000, Ananyev, Konstantin wrote:
>
>
> > -----Original Message-----
> > From: Liang, Cunming
> > Sent: Thursday, February 12, 2015 1:05 PM
> > To: Olivier MATZ; dev@dpdk.org
> > Cc: Ananyev, Konstantin
> > Subject: RE: [PATCH v5 18/19] ring: add sched_yield to avoid spin forever
> >
> > Hi,
> >
> > > -----Original Message-----
> > > From: Olivier MATZ [mailto:olivier.matz@6wind.com]
> > > Sent: Thursday, February 12, 2015 7:16 PM
> > > To: Liang, Cunming; dev@dpdk.org
> > > Cc: Ananyev, Konstantin
> > > Subject: Re: [PATCH v5 18/19] ring: add sched_yield to avoid spin forever
> > >
> > > Hi,
> > >
> > > On 02/12/2015 09:16 AM, Cunming Liang wrote:
> > > > Add a sched_yield() syscall if the thread spins for too long, waiting other thread
> > > to finish its operations on the ring.
> > > > That gives pre-empted thread a chance to proceed and finish with ring
> > > enqnue/dequeue operation.
> > > > The purpose is to reduce contention on the ring. By ring_perf_test, it doesn't
> > > shows additional perf penalty.
> > > >
> > > > Signed-off-by: Cunming Liang <cunming.liang@intel.com>
> > > > ---
> > > > v5 changes:
> > > > add RTE_RING_PAUSE_REP to config file
> > > >
> > > > v4 changes:
> > > > update and add more comments on sched_yield()
> > > >
> > > > v3 changes:
> > > > new patch adding sched_yield() in rte_ring to avoid long spin
> > > >
> > > > config/common_bsdapp | 1 +
> > > > config/common_linuxapp | 1 +
> > > > lib/librte_ring/rte_ring.h | 31 +++++++++++++++++++++++++++----
> > > > 3 files changed, 29 insertions(+), 4 deletions(-)
> > > >
> > > > diff --git a/config/common_bsdapp b/config/common_bsdapp
> > > > index 57bacb8..52c5143 100644
> > > > --- a/config/common_bsdapp
> > > > +++ b/config/common_bsdapp
> > > > @@ -234,6 +234,7 @@ CONFIG_RTE_PMD_PACKET_PREFETCH=y
> > > > CONFIG_RTE_LIBRTE_RING=y
> > > > CONFIG_RTE_LIBRTE_RING_DEBUG=n
> > > > CONFIG_RTE_RING_SPLIT_PROD_CONS=n
> > > > +CONFIG_RTE_RING_PAUSE_REP=n
> > >
> > > Maybe it's better to use CONFIG_RTE_RING_PAUSE_REP=0 instead?
> > > If I understand well, it has to be set to an integer value to
> > > enable it, am I correct?
> > [LCM] If RTE_RING_PAUSE_REP=N (no define), by default will use 0. If it's set to 'y'(=1), will issue yield in the most frequent rate.
> > It also can set as integer to assign any number. All cases works for this configure.
> > One point is in configure file, just demonstrate the default way to use it.
> > It can't prevent to use anything unexpected. Except we rule the n & y illegal for this option.
> > The meaningful value of it can write in the doc.
>
> I also think that it is better avoid 'y' and 'n' (though as you said it would work), but use integers for all cases.
> Less confusion for the users.
> Konstantin
>
+1
Also expand "REP" to "REP_COUNT" to make it clear that it's a numeric value.
/Bruce
@@ -234,6 +234,7 @@ CONFIG_RTE_PMD_PACKET_PREFETCH=y
CONFIG_RTE_LIBRTE_RING=y
CONFIG_RTE_LIBRTE_RING_DEBUG=n
CONFIG_RTE_RING_SPLIT_PROD_CONS=n
+CONFIG_RTE_RING_PAUSE_REP=n
#
# Compile librte_mempool
@@ -242,6 +242,7 @@ CONFIG_RTE_PMD_PACKET_PREFETCH=y
CONFIG_RTE_LIBRTE_RING=y
CONFIG_RTE_LIBRTE_RING_DEBUG=n
CONFIG_RTE_RING_SPLIT_PROD_CONS=n
+CONFIG_RTE_RING_PAUSE_REP=n
#
# Compile librte_mempool
@@ -127,6 +127,11 @@ struct rte_ring_debug_stats {
#define RTE_RING_NAMESIZE 32 /**< The maximum length of a ring name. */
#define RTE_RING_MZ_PREFIX "RG_"
+#ifndef RTE_RING_PAUSE_REP
+#define RTE_RING_PAUSE_REP 0 /**< yield after pause num of times,
+ * no yield if RTE_RING_PAUSE_REP not defined. */
+#endif
+
/**
* An RTE ring structure.
*
@@ -410,7 +415,7 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
uint32_t cons_tail, free_entries;
const unsigned max = n;
int success;
- unsigned i;
+ unsigned i, rep = 0;
uint32_t mask = r->prod.mask;
int ret;
@@ -468,9 +473,18 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
* If there are other enqueues in progress that preceded us,
* we need to wait for them to complete
*/
- while (unlikely(r->prod.tail != prod_head))
+ while (unlikely(r->prod.tail != prod_head)) {
rte_pause();
+ /* Set RTE_RING_PAUSE_REP to avoid spin too long waiting for
+ * other thread finish. It gives pre-empted thread a chance
+ * to proceed and finish with ring denqnue operation. */
+ if (RTE_RING_PAUSE_REP &&
+ ++rep == RTE_RING_PAUSE_REP) {
+ rep = 0;
+ sched_yield();
+ }
+ }
r->prod.tail = prod_next;
return ret;
}
@@ -589,7 +603,7 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
uint32_t cons_next, entries;
const unsigned max = n;
int success;
- unsigned i;
+ unsigned i, rep = 0;
uint32_t mask = r->prod.mask;
/* move cons.head atomically */
@@ -634,9 +648,18 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
* If there are other dequeues in progress that preceded us,
* we need to wait for them to complete
*/
- while (unlikely(r->cons.tail != cons_head))
+ while (unlikely(r->cons.tail != cons_head)) {
rte_pause();
+ /* Set RTE_RING_PAUSE_REP to avoid spin too long waiting for
+ * other thread finish. It gives pre-empted thread a chance
+ * to proceed and finish with ring denqnue operation. */
+ if (RTE_RING_PAUSE_REP &&
+ ++rep == RTE_RING_PAUSE_REP) {
+ rep = 0;
+ sched_yield();
+ }
+ }
__RING_STAT_ADD(r, deq_success, n);
r->cons.tail = cons_next;