[v2,4/4] eventdev: relax smp barriers with c11 atomics
Commit Message
The implementation-specific opaque data is shared between arm and cancel
operations. The state flag acts as a guard variable to make sure the
update of the opaque data is synchronized. This patch uses C11 atomics with
explicit one-way memory barriers instead of the full barriers
rte_smp_wmb()/rte_smp_rmb() to synchronize the opaque data between timer
arm and cancel threads.
Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
---
v2:
1. Removed implementation-specific opaque data cleanup code.
2. Replaced thread fence with atomic ACQUIRE/RELEASE ordering on state access.
lib/librte_eventdev/rte_event_timer_adapter.c | 55 ++++++++++++++++++---------
lib/librte_eventdev/rte_event_timer_adapter.h | 2 +-
2 files changed, 38 insertions(+), 19 deletions(-)
Comments
> -----Original Message-----
> From: Phil Yang <phil.yang@arm.com>
> Sent: Thursday, July 2, 2020 12:27 AM
> To: Carrillo, Erik G <erik.g.carrillo@intel.com>; dev@dpdk.org
> Cc: jerinj@marvell.com; Honnappa.Nagarahalli@arm.com;
> drc@linux.vnet.ibm.com; Ruifeng.Wang@arm.com;
> Dharmik.Thakkar@arm.com; nd@arm.com
> Subject: [PATCH v2 4/4] eventdev: relax smp barriers with c11 atomics
>
> The implementation-specific opaque data is shared between arm and cancel
> operations. The state flag acts as a guard variable to make sure the update of
> opaque data is synchronized. This patch uses c11 atomics with explicit one
> way memory barrier instead of full barriers rte_smp_w/rmb() to synchronize
> the opaque data between timer arm and cancel threads.
>
> Signed-off-by: Phil Yang <phil.yang@arm.com>
> Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
On Fri, Jul 3, 2020 at 2:00 AM Carrillo, Erik G
<erik.g.carrillo@intel.com> wrote:
>
> > -----Original Message-----
> > From: Phil Yang <phil.yang@arm.com>
> > Sent: Thursday, July 2, 2020 12:27 AM
> > To: Carrillo, Erik G <erik.g.carrillo@intel.com>; dev@dpdk.org
> > Cc: jerinj@marvell.com; Honnappa.Nagarahalli@arm.com;
> > drc@linux.vnet.ibm.com; Ruifeng.Wang@arm.com;
> > Dharmik.Thakkar@arm.com; nd@arm.com
> > Subject: [PATCH v2 4/4] eventdev: relax smp barriers with c11 atomics
> >
> > The implementation-specific opaque data is shared between arm and cancel
> > operations. The state flag acts as a guard variable to make sure the update of
> > opaque data is synchronized. This patch uses c11 atomics with explicit one
> > way memory barrier instead of full barriers rte_smp_w/rmb() to synchronize
> > the opaque data between timer arm and cancel threads.
> >
> > Signed-off-by: Phil Yang <phil.yang@arm.com>
> > Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
> > Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> Acked-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
Series applied to dpdk-next-eventdev/master. Thanks.
02/07/2020 07:26, Phil Yang:
> The implementation-specific opaque data is shared between arm and cancel
> operations. The state flag acts as a guard variable to make sure the
> update of opaque data is synchronized. This patch uses c11 atomics with
> explicit one way memory barrier instead of full barriers rte_smp_w/rmb()
> to synchronize the opaque data between timer arm and cancel threads.
I think we should write C11 (uppercase).
Please, in your explanations, try to be more specific.
Naming fields may help to make things clear.
[...]
> --- a/lib/librte_eventdev/rte_event_timer_adapter.h
> +++ b/lib/librte_eventdev/rte_event_timer_adapter.h
> @@ -467,7 +467,7 @@ struct rte_event_timer {
> * - op: RTE_EVENT_OP_NEW
> * - event_type: RTE_EVENT_TYPE_TIMER
> */
> - volatile enum rte_event_timer_state state;
> + enum rte_event_timer_state state;
> /**< State of the event timer. */
Why do you remove the volatile keyword?
It is not explained in the commit log.
This change is triggering a warning in the ABI check:
http://mails.dpdk.org/archives/test-report/2020-July/140440.html
Moving from volatile to non-volatile is probably not an issue.
I expect the code generated for the volatile case to work the same
in non-volatile case. Do you confirm?
In any case, we need an explanation and an ABI check exception.
> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Monday, July 6, 2020 6:04 PM
> To: Phil Yang <Phil.Yang@arm.com>
> Cc: erik.g.carrillo@intel.com; dev@dpdk.org; jerinj@marvell.com; Honnappa
> Nagarahalli <Honnappa.Nagarahalli@arm.com>; drc@linux.vnet.ibm.com;
> Ruifeng Wang <Ruifeng.Wang@arm.com>; Dharmik Thakkar
> <Dharmik.Thakkar@arm.com>; nd <nd@arm.com>;
> david.marchand@redhat.com; mdr@ashroe.eu; Neil Horman
> <nhorman@tuxdriver.com>; Dodji Seketeli <dodji@redhat.com>
> Subject: Re: [dpdk-dev] [PATCH v2 4/4] eventdev: relax smp barriers with c11
> atomics
>
> 02/07/2020 07:26, Phil Yang:
> > The implementation-specific opaque data is shared between arm and
> cancel
> > operations. The state flag acts as a guard variable to make sure the
> > update of opaque data is synchronized. This patch uses c11 atomics with
> > explicit one way memory barrier instead of full barriers rte_smp_w/rmb()
> > to synchronize the opaque data between timer arm and cancel threads.
>
> I think we should write C11 (uppercase).
Agreed.
I will change it in the next version.
>
> Please, in your explanations, try to be more specific.
> Naming fields may help to make things clear.
OK. Thanks.
>
> [...]
> > --- a/lib/librte_eventdev/rte_event_timer_adapter.h
> > +++ b/lib/librte_eventdev/rte_event_timer_adapter.h
> > @@ -467,7 +467,7 @@ struct rte_event_timer {
> > * - op: RTE_EVENT_OP_NEW
> > * - event_type: RTE_EVENT_TYPE_TIMER
> > */
> > - volatile enum rte_event_timer_state state;
> > + enum rte_event_timer_state state;
> > /**< State of the event timer. */
>
> Why do you remove the volatile keyword?
> It is not explained in the commit log.
By using the C11 atomic operations, it will generate the same instructions for non-volatile and volatile version.
Please check the sample code here: https://gcc.godbolt.org/z/8x5rWs
>
> This change is triggering a warning in the ABI check:
> http://mails.dpdk.org/archives/test-report/2020-July/140440.html
> Moving from volatile to non-volatile is probably not an issue.
> I expect the code generated for the volatile case to work the same
> in non-volatile case. Do you confirm?
They generate the same instructions, so either way will work.
Do I need to revert it to the volatile version?
Thanks,
Phil
>
> In any case, we need an explanation and an ABI check exception.
>
06/07/2020 17:32, Phil Yang:
> From: Thomas Monjalon <thomas@monjalon.net>
> > 02/07/2020 07:26, Phil Yang:
> > > --- a/lib/librte_eventdev/rte_event_timer_adapter.h
> > > +++ b/lib/librte_eventdev/rte_event_timer_adapter.h
> > > @@ -467,7 +467,7 @@ struct rte_event_timer {
> > > * - op: RTE_EVENT_OP_NEW
> > > * - event_type: RTE_EVENT_TYPE_TIMER
> > > */
> > > - volatile enum rte_event_timer_state state;
> > > + enum rte_event_timer_state state;
> > > /**< State of the event timer. */
> >
> > Why do you remove the volatile keyword?
> > It is not explained in the commit log.
> By using the C11 atomic operations, it will generate the same instructions for non-volatile and volatile version.
> Please check the sample code here: https://gcc.godbolt.org/z/8x5rWs
>
> > This change is triggering a warning in the ABI check:
> > http://mails.dpdk.org/archives/test-report/2020-July/140440.html
> > Moving from volatile to non-volatile is probably not an issue.
> > I expect the code generated for the volatile case to work the same
> > in non-volatile case. Do you confirm?
> They generate the same instructions, so either way will work.
> Do I need to revert it to the volatile version?
Either you revert, or you add explanation in the commit log
+ exception in libabigail.abignore
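The exact suppression entry DPDK added is not shown in this thread; assuming libabigail's suppression-specification syntax, an exception for the dropped CV qualifier on the `state` member could look roughly like this:

```
; Ignore the removal of the volatile qualifier on rte_event_timer.state;
; C11 atomic accesses make the qualifier unnecessary (sketch, not the
; entry actually committed).
[suppress_type]
        type_kind = struct
        name = rte_event_timer
```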
@@ -629,7 +629,8 @@ swtim_callback(struct rte_timer *tim)
sw->expired_timers[sw->n_expired_timers++] = tim;
sw->stats.evtim_exp_count++;
- evtim->state = RTE_EVENT_TIMER_NOT_ARMED;
+ __atomic_store_n(&evtim->state, RTE_EVENT_TIMER_NOT_ARMED,
+ __ATOMIC_RELEASE);
}
if (event_buffer_batch_ready(&sw->buffer)) {
@@ -1020,6 +1021,7 @@ __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
int n_lcores;
/* Timer list for this lcore is not in use. */
uint16_t exp_state = 0;
+ enum rte_event_timer_state n_state;
#ifdef RTE_LIBRTE_EVENTDEV_DEBUG
/* Check that the service is running. */
@@ -1060,30 +1062,36 @@ __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
}
for (i = 0; i < nb_evtims; i++) {
- /* Don't modify the event timer state in these cases */
- if (evtims[i]->state == RTE_EVENT_TIMER_ARMED) {
+ n_state = __atomic_load_n(&evtims[i]->state, __ATOMIC_ACQUIRE);
+ if (n_state == RTE_EVENT_TIMER_ARMED) {
rte_errno = EALREADY;
break;
- } else if (!(evtims[i]->state == RTE_EVENT_TIMER_NOT_ARMED ||
- evtims[i]->state == RTE_EVENT_TIMER_CANCELED)) {
+ } else if (!(n_state == RTE_EVENT_TIMER_NOT_ARMED ||
+ n_state == RTE_EVENT_TIMER_CANCELED)) {
rte_errno = EINVAL;
break;
}
ret = check_timeout(evtims[i], adapter);
if (unlikely(ret == -1)) {
- evtims[i]->state = RTE_EVENT_TIMER_ERROR_TOOLATE;
+ __atomic_store_n(&evtims[i]->state,
+ RTE_EVENT_TIMER_ERROR_TOOLATE,
+ __ATOMIC_RELAXED);
rte_errno = EINVAL;
break;
} else if (unlikely(ret == -2)) {
- evtims[i]->state = RTE_EVENT_TIMER_ERROR_TOOEARLY;
+ __atomic_store_n(&evtims[i]->state,
+ RTE_EVENT_TIMER_ERROR_TOOEARLY,
+ __ATOMIC_RELAXED);
rte_errno = EINVAL;
break;
}
if (unlikely(check_destination_event_queue(evtims[i],
adapter) < 0)) {
- evtims[i]->state = RTE_EVENT_TIMER_ERROR;
+ __atomic_store_n(&evtims[i]->state,
+ RTE_EVENT_TIMER_ERROR,
+ __ATOMIC_RELAXED);
rte_errno = EINVAL;
break;
}
@@ -1099,13 +1107,18 @@ __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
SINGLE, lcore_id, NULL, evtims[i]);
if (ret < 0) {
/* tim was in RUNNING or CONFIG state */
- evtims[i]->state = RTE_EVENT_TIMER_ERROR;
+ __atomic_store_n(&evtims[i]->state,
+ RTE_EVENT_TIMER_ERROR,
+ __ATOMIC_RELEASE);
break;
}
- rte_smp_wmb();
EVTIM_LOG_DBG("armed an event timer");
- evtims[i]->state = RTE_EVENT_TIMER_ARMED;
+ /* RELEASE ordering guarantees the adapter-specific value
+ * changes are observed before the update of state.
+ */
+ __atomic_store_n(&evtims[i]->state, RTE_EVENT_TIMER_ARMED,
+ __ATOMIC_RELEASE);
}
if (i < nb_evtims)
@@ -1132,6 +1145,7 @@ swtim_cancel_burst(const struct rte_event_timer_adapter *adapter,
struct rte_timer *timp;
uint64_t opaque;
struct swtim *sw = swtim_pmd_priv(adapter);
+ enum rte_event_timer_state n_state;
#ifdef RTE_LIBRTE_EVENTDEV_DEBUG
/* Check that the service is running. */
@@ -1143,16 +1157,18 @@ swtim_cancel_burst(const struct rte_event_timer_adapter *adapter,
for (i = 0; i < nb_evtims; i++) {
/* Don't modify the event timer state in these cases */
- if (evtims[i]->state == RTE_EVENT_TIMER_CANCELED) {
+ /* ACQUIRE ordering guarantees the implementation-specific
+ * opaque data is accessed under the correct state.
+ */
+ n_state = __atomic_load_n(&evtims[i]->state, __ATOMIC_ACQUIRE);
+ if (n_state == RTE_EVENT_TIMER_CANCELED) {
rte_errno = EALREADY;
break;
- } else if (evtims[i]->state != RTE_EVENT_TIMER_ARMED) {
+ } else if (n_state != RTE_EVENT_TIMER_ARMED) {
rte_errno = EINVAL;
break;
}
- rte_smp_rmb();
-
opaque = evtims[i]->impl_opaque[0];
timp = (struct rte_timer *)(uintptr_t)opaque;
RTE_ASSERT(timp != NULL);
@@ -1166,9 +1182,12 @@ swtim_cancel_burst(const struct rte_event_timer_adapter *adapter,
rte_mempool_put(sw->tim_pool, (void **)timp);
- evtims[i]->state = RTE_EVENT_TIMER_CANCELED;
-
- rte_smp_wmb();
+ /* The RELEASE ordering here pairs with the ACQUIRE ordering
+ * above to make sure the state update is observed between
+ * threads.
+ */
+ __atomic_store_n(&evtims[i]->state, RTE_EVENT_TIMER_CANCELED,
+ __ATOMIC_RELEASE);
}
return i;
@@ -467,7 +467,7 @@ struct rte_event_timer {
* - op: RTE_EVENT_OP_NEW
* - event_type: RTE_EVENT_TYPE_TIMER
*/
- volatile enum rte_event_timer_state state;
+ enum rte_event_timer_state state;
/**< State of the event timer. */
uint64_t timeout_ticks;
/**< Expiry timer ticks expressed in number of *timer_ticks_ns* from