[v2,3/4] timer: fix function to stop all timers
Commit Message
This API can deadlock because it tries to acquire the same spinlock
in a nested manner.
In timer_del(), if the previous owner lcore and the current owner lcore
are different, the function tries to acquire the lock even though the
caller of timer_del() already holds that same lock.
This patch removes the nested lock acquisition.
Fixes: 821c51267bcd63a ("timer: add function to stop all timers in a list")
Cc: stable@dpdk.org
Signed-off-by: Naga Harish K S V <s.v.naga.harish.k@intel.com>
---
lib/timer/rte_timer.c | 13 ++++---------
1 file changed, 4 insertions(+), 9 deletions(-)
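To illustrate the failure mode described in the commit message, here is a minimal sketch of a nested acquisition on a non-recursive spinlock. It does not use the DPDK API: demo_spinlock_t, demo_trylock(), inner_del() and the two outer_stop_all_*() helpers are hypothetical names built on C11 atomic_flag, and a trylock is used in place of a blocking lock so that the example reports the problem instead of hanging.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a non-recursive spinlock (not the DPDK type). */
typedef struct {
	atomic_flag locked;
} demo_spinlock_t;

#define DEMO_SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* Returns true if the lock was taken, false if it was already held. */
static bool demo_trylock(demo_spinlock_t *l)
{
	return !atomic_flag_test_and_set_explicit(&l->locked,
						  memory_order_acquire);
}

static void demo_unlock(demo_spinlock_t *l)
{
	atomic_flag_clear_explicit(&l->locked, memory_order_release);
}

/* Models the inner helper: it takes the list lock on its own. */
static bool inner_del(demo_spinlock_t *list_lock)
{
	assert(list_lock != NULL);
	if (!demo_trylock(list_lock))
		return false; /* a blocking lock would spin forever here */
	/* ... unlink the element from the list ... */
	demo_unlock(list_lock);
	return true;
}

/* Models the old caller: holds the list lock across the inner call. */
static bool outer_stop_all_nested(demo_spinlock_t *list_lock)
{
	bool ok;

	if (!demo_trylock(list_lock))
		return false;
	ok = inner_del(list_lock); /* nested acquisition of the same lock */
	demo_unlock(list_lock);
	return ok;
}

/* Models the patched caller: the inner helper is the only lock taker. */
static bool outer_stop_all_fixed(demo_spinlock_t *list_lock)
{
	return inner_del(list_lock);
}
```

With a lock initialized from DEMO_SPINLOCK_INIT, outer_stop_all_nested() returns false because the inner acquisition cannot succeed while the caller holds the lock, whereas outer_stop_all_fixed() returns true.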
Comments
Hi Harish,
> -----Original Message-----
> From: Naga Harish K, S V <s.v.naga.harish.k@intel.com>
> Sent: Wednesday, August 10, 2022 2:10 AM
> To: Carrillo, Erik G <erik.g.carrillo@intel.com>
> Cc: dev@dpdk.org; stable@dpdk.org
> Subject: [PATCH v2 3/4] timer: fix function to stop all timers
>
> There is a possibility of deadlock in this API, as same spinlock is tried to be
> acquired in nested manner.
>
> In timer_del function, if the previous owner and current owner lcore are
It might be clearer to say something like:
"If the lcore that is stopping the timer is different from the lcore that owns the timer, the timer list lock is acquired in timer_del(), even if local_is_locked is true. Because the same lock was already acquired in rte_timer_stop_all(), the thread will hang."
Thanks,
Erik
> different, the lock is tried to be acquired even though the same lock is
> already acquired by the caller of timer_del function.
>
> This patch removes the acquisition of nested locking.
>
> Fixes: 821c51267bcd63a ("timer: add function to stop all timers in a list")
> Cc: stable@dpdk.org
>
> Signed-off-by: Naga Harish K S V <s.v.naga.harish.k@intel.com>
> ---
On Wed, 10 Aug 2022 19:29:36 +0000
"Carrillo, Erik G" <erik.g.carrillo@intel.com> wrote:
> Hi Harish,
>
> > -----Original Message-----
> > From: Naga Harish K, S V <s.v.naga.harish.k@intel.com>
> > Sent: Wednesday, August 10, 2022 2:10 AM
> > To: Carrillo, Erik G <erik.g.carrillo@intel.com>
> > Cc: dev@dpdk.org; stable@dpdk.org
> > Subject: [PATCH v2 3/4] timer: fix function to stop all timers
> >
> > There is a possibility of deadlock in this API, as same spinlock is tried to be
> > acquired in nested manner.
> >
> > In timer_del function, if the previous owner and current owner lcore are
>
> It might be clearer to say something like:
>
> "If the lcore that is stopping the timer is different from the lcore that owns the timer, the timer list lock is acquired in timer_del(), even if local_is_locked is true. Because the same lock was already acquired in rte_timer_stop_all(), the thread will hang."
>
Yes, the timer owner flag acts like a lock, and this is an AB-BA deadlock.
Hi Gabe,
> -----Original Message-----
> From: Carrillo, Erik G <erik.g.carrillo@intel.com>
> Sent: Thursday, August 11, 2022 1:00 AM
> To: Naga Harish K, S V <s.v.naga.harish.k@intel.com>
> Cc: dev@dpdk.org; stable@dpdk.org
> Subject: RE: [PATCH v2 3/4] timer: fix function to stop all timers
>
> Hi Harish,
>
> > -----Original Message-----
> > From: Naga Harish K, S V <s.v.naga.harish.k@intel.com>
> > Sent: Wednesday, August 10, 2022 2:10 AM
> > To: Carrillo, Erik G <erik.g.carrillo@intel.com>
> > Cc: dev@dpdk.org; stable@dpdk.org
> > Subject: [PATCH v2 3/4] timer: fix function to stop all timers
> >
> > There is a possibility of deadlock in this API, as same spinlock is
> > tried to be acquired in nested manner.
> >
> > In timer_del function, if the previous owner and current owner lcore
> > are
>
> It might be clearer to say something like:
>
> "If the lcore that is stopping the timer is different from the lcore that owns
> the timer, the timer list lock is acquired in timer_del(), even if local_is_locked
> is true. Because the same lock was already acquired in rte_timer_stop_all(),
> the thread will hang."
>
I have incorporated the suggested commit message in v3 of the patch.
> Thanks,
> Erik
>
> > different, the lock is tried to be acquired even though the same lock
> > is already acquired by the caller of timer_del function.
> >
> > This patch removes the acquisition of nested locking.
> >
> > Fixes: 821c51267bcd63a ("timer: add function to stop all timers in a
> > list")
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: Naga Harish K S V <s.v.naga.harish.k@intel.com>
> > ---
@@ -580,7 +580,7 @@ rte_timer_reset_sync(struct rte_timer *tim, uint64_t ticks,
 }
 static int
-__rte_timer_stop(struct rte_timer *tim, int local_is_locked,
+__rte_timer_stop(struct rte_timer *tim,
 		 struct rte_timer_data *timer_data)
 {
 	union rte_timer_status prev_status, status;
@@ -602,7 +602,7 @@ __rte_timer_stop(struct rte_timer *tim, int local_is_locked,
 	/* remove it from list */
 	if (prev_status.state == RTE_TIMER_PENDING) {
-		timer_del(tim, prev_status, local_is_locked, priv_timer);
+		timer_del(tim, prev_status, 0, priv_timer);
 		__TIMER_STAT_ADD(priv_timer, pending, -1);
 	}
@@ -631,7 +631,7 @@ rte_timer_alt_stop(uint32_t timer_data_id, struct rte_timer *tim)
 	TIMER_DATA_VALID_GET_OR_ERR_RET(timer_data_id, timer_data, -EINVAL);
-	return __rte_timer_stop(tim, 0, timer_data);
+	return __rte_timer_stop(tim, timer_data);
 }
 /* loop until rte_timer_stop() succeed */
@@ -987,21 +987,16 @@ rte_timer_stop_all(uint32_t timer_data_id, unsigned int *walk_lcores,
 		walk_lcore = walk_lcores[i];
 		priv_timer = &timer_data->priv_timer[walk_lcore];
-		rte_spinlock_lock(&priv_timer->list_lock);
-
 		for (tim = priv_timer->pending_head.sl_next[0];
 		     tim != NULL;
 		     tim = next_tim) {
 			next_tim = tim->sl_next[0];
-			/* Call timer_stop with lock held */
-			__rte_timer_stop(tim, 1, timer_data);
+			__rte_timer_stop(tim, timer_data);
 			if (f)
 				f(tim, f_arg);
 		}
-
-		rte_spinlock_unlock(&priv_timer->list_lock);
 	}
 	return 0;
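The loop in the last hunk reads the next pointer before stopping the timer and before invoking the callback, so the walk does not depend on the element it has just handed off. A minimal sketch of that traversal pattern follows. It is not the DPDK code: struct demo_node, demo_stop_all() and demo_poison_cb() are hypothetical, and a plain singly linked list stands in for level 0 of the pending skip list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list element standing in for a pending timer. */
struct demo_node {
	struct demo_node *next;
	int stopped;
};

typedef void (*demo_cb)(struct demo_node *n, void *arg);

/*
 * Stop every element in the list and return how many were visited.
 * The next pointer is saved first because the callback may unlink,
 * reuse, or free the element it receives.
 */
static int demo_stop_all(struct demo_node *head, demo_cb f, void *f_arg)
{
	struct demo_node *n, *next;
	int count = 0;

	for (n = head; n != NULL; n = next) {
		next = n->next; /* saved before n is handed to the callback */
		n->stopped = 1;
		if (f)
			f(n, f_arg);
		count++;
	}
	return count;
}

/* Example callback: clobbers the link and counts its invocations. */
static void demo_poison_cb(struct demo_node *n, void *arg)
{
	assert(n != NULL);
	n->next = NULL;
	if (arg != NULL)
		(*(int *)arg)++;
}
```

Calling demo_stop_all() on a three-element list with demo_poison_cb() still visits all three elements, even though the callback overwrites each link, because the walk never rereads the next pointer of an element after handing it off.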