[v7,2/3] timer: fix function to stop all timers

Message ID 20220914153319.1887248-2-s.v.naga.harish.k@intel.com (mailing list archive)
State Accepted, archived
Delegated to: Thomas Monjalon
Series [v7,1/3] eventdev/timer: add periodic event timer support

Checks

Context         Check     Description
ci/checkpatch   warning   coding style issues

Commit Message

Naga Harish K, S V Sept. 14, 2022, 3:33 p.m. UTC
There is a possibility of deadlock in this API,
as the same spinlock is acquired in a nested manner.

If the lcore that is stopping the timer is different from the lcore
that owns the timer, the timer list lock is acquired in timer_del(),
even if local_is_locked is true. Because the same lock was already
acquired in rte_timer_stop_all(), the thread will hang.

This patch removes the nested lock acquisition.

Fixes: 821c51267bcd63a ("timer: add function to stop all timers in a list")
Cc: stable@dpdk.org

Signed-off-by: Naga Harish K S V <s.v.naga.harish.k@intel.com>
---
 lib/timer/rte_timer.c | 13 ++++---------
 1 file changed, 4 insertions(+), 9 deletions(-)
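
The nested acquisition described in the commit message can be illustrated with a minimal, self-contained sketch. It does not use DPDK: `toy_trylock`, `toy_timer_del`, `stop_all_old`, `stop_all_new` and `check_scenarios` are hypothetical names, and the locking condition in `toy_timer_del` is an assumption modeled on the description above (the lock is taken whenever the caller is not the owning lcore, regardless of local_is_locked). A try-lock stands in for the spinlock so that the would-be hang is reported instead of actually spinning forever.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a non-recursive spinlock (not rte_spinlock_t). */
static bool toy_trylock(atomic_flag *l)
{
	/* test_and_set returns the previous value: true means already held */
	return !atomic_flag_test_and_set(l);
}

static void toy_unlock(atomic_flag *l)
{
	atomic_flag_clear(l);
}

/*
 * Assumed model of the timer_del() locking decision: take the list lock
 * if the caller is not the owner lcore, or if the caller says the lock
 * is not already held. Returns false where a real spinlock would hang.
 */
static bool toy_timer_del(atomic_flag *list_lock, unsigned int owner,
			  unsigned int self, int local_is_locked)
{
	if (owner != self || !local_is_locked) {
		if (!toy_trylock(list_lock))
			return false; /* nested acquisition: would spin forever */
		/* ... the timer would be unlinked from the list here ... */
		toy_unlock(list_lock);
	}
	return true;
}

/* Pre-patch shape: the caller holds the list lock around the stop call. */
static bool stop_all_old(atomic_flag *list_lock, unsigned int owner,
			 unsigned int self)
{
	bool ok;

	if (!toy_trylock(list_lock))
		return false;
	ok = toy_timer_del(list_lock, owner, self, 1);
	toy_unlock(list_lock);
	return ok;
}

/* Post-patch shape: no outer lock; the callee locks for itself. */
static bool stop_all_new(atomic_flag *list_lock, unsigned int owner,
			 unsigned int self)
{
	return toy_timer_del(list_lock, owner, self, 0);
}

/* Walks through the three scenarios; lcore ids 0 and 1 are arbitrary. */
static void check_scenarios(void)
{
	atomic_flag lock = ATOMIC_FLAG_INIT;

	assert(stop_all_old(&lock, 0, 0));  /* same lcore: no nested lock */
	assert(!stop_all_old(&lock, 1, 0)); /* different lcore: would hang */
	assert(stop_all_new(&lock, 1, 0));  /* patched flow: single lock */
}
```

Calling `check_scenarios()` from a `main()` exercises all three cases. Only the second one, where the stopping lcore differs from the owning lcore while the outer lock is held, hits the nested acquisition.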
  

Comments

Jerin Jacob Sept. 15, 2022, 6:41 a.m. UTC | #1
On Wed, Sep 14, 2022 at 9:03 PM Naga Harish K S V
<s.v.naga.harish.k@intel.com> wrote:
>
> There is a possibility of deadlock in this API,
> as same spinlock is tried to be acquired in nested manner.
>
> If the lcore that is stopping the timer is different from the lcore
> that owns the timer, the timer list lock is acquired in timer_del(),
> even if local_is_locked is true. Because the same lock was already
> acquired in rte_timer_stop_all(), the thread will hang.
>
> This patch removes the acquisition of nested lock.
>
> Fixes: 821c51267bcd63a ("timer: add function to stop all timers in a list")
> Cc: stable@dpdk.org
>
> Signed-off-by: Naga Harish K S V <s.v.naga.harish.k@intel.com>
> ---
>  lib/timer/rte_timer.c | 13 ++++---------

Since this change is in lib/timer, delegating this patch to @Thomas Monjalon.
  
Naga Harish K, S V Sept. 16, 2022, 4:40 a.m. UTC | #2
> -----Original Message-----
> From: Jerin Jacob <jerinjacobk@gmail.com>
> Sent: Thursday, September 15, 2022 12:12 PM
> To: Naga Harish K, S V <s.v.naga.harish.k@intel.com>; Thomas Monjalon
> <thomas@monjalon.net>
> Cc: jerinj@marvell.com; dev@dpdk.org; Carrillo, Erik G
> <erik.g.carrillo@intel.com>; pbhagavatula@marvell.com;
> sthotton@marvell.com; stable@dpdk.org
> Subject: Re: [PATCH v7 2/3] timer: fix function to stop all timers
> 
> On Wed, Sep 14, 2022 at 9:03 PM Naga Harish K S V
> <s.v.naga.harish.k@intel.com> wrote:
> >
> > There is a possibility of deadlock in this API, as same spinlock is
> > tried to be acquired in nested manner.
> >
> > If the lcore that is stopping the timer is different from the lcore
> > that owns the timer, the timer list lock is acquired in timer_del(),
> > even if local_is_locked is true. Because the same lock was already
> > acquired in rte_timer_stop_all(), the thread will hang.
> >
> > This patch removes the acquisition of nested lock.
> >
> > Fixes: 821c51267bcd63a ("timer: add function to stop all timers in a
> > list")
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: Naga Harish K S V <s.v.naga.harish.k@intel.com>
Acked-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>

Added missing ack

> > ---
> >  lib/timer/rte_timer.c | 13 ++++---------
> 
> Since this change in lib/timer. Delegating this patch to @Thomas Monjalon
  
Naga Harish K, S V Sept. 26, 2022, 5:21 a.m. UTC | #3
Hi Thomas,
    Did you get a chance to review this patch?
Without this patch, the periodic event timer tests for the SW timer adapter hang.

-Harish

  
Thomas Monjalon Oct. 5, 2022, 12:59 p.m. UTC | #4
> > On Wed, Sep 14, 2022 at 9:03 PM Naga Harish K S V
> > <s.v.naga.harish.k@intel.com> wrote:
> > >
> > > There is a possibility of deadlock in this API, as same spinlock is
> > > tried to be acquired in nested manner.
> > >
> > > If the lcore that is stopping the timer is different from the lcore
> > > that owns the timer, the timer list lock is acquired in timer_del(),
> > > even if local_is_locked is true. Because the same lock was already
> > > acquired in rte_timer_stop_all(), the thread will hang.
> > >
> > > This patch removes the acquisition of nested lock.
> > >
> > > Fixes: 821c51267bcd63a ("timer: add function to stop all timers in a
> > > list")
> > > Cc: stable@dpdk.org
> > >
> > > Signed-off-by: Naga Harish K S V <s.v.naga.harish.k@intel.com>
> Acked-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
> 
> Added missing ack

Applied, thanks.
  

Patch

diff --git a/lib/timer/rte_timer.c b/lib/timer/rte_timer.c
index 9994813d0d..85d67573eb 100644
--- a/lib/timer/rte_timer.c
+++ b/lib/timer/rte_timer.c
@@ -580,7 +580,7 @@  rte_timer_reset_sync(struct rte_timer *tim, uint64_t ticks,
 }
 
 static int
-__rte_timer_stop(struct rte_timer *tim, int local_is_locked,
+__rte_timer_stop(struct rte_timer *tim,
 		 struct rte_timer_data *timer_data)
 {
 	union rte_timer_status prev_status, status;
@@ -602,7 +602,7 @@  __rte_timer_stop(struct rte_timer *tim, int local_is_locked,
 
 	/* remove it from list */
 	if (prev_status.state == RTE_TIMER_PENDING) {
-		timer_del(tim, prev_status, local_is_locked, priv_timer);
+		timer_del(tim, prev_status, 0, priv_timer);
 		__TIMER_STAT_ADD(priv_timer, pending, -1);
 	}
 
@@ -631,7 +631,7 @@  rte_timer_alt_stop(uint32_t timer_data_id, struct rte_timer *tim)
 
 	TIMER_DATA_VALID_GET_OR_ERR_RET(timer_data_id, timer_data, -EINVAL);
 
-	return __rte_timer_stop(tim, 0, timer_data);
+	return __rte_timer_stop(tim, timer_data);
 }
 
 /* loop until rte_timer_stop() succeed */
@@ -987,21 +987,16 @@  rte_timer_stop_all(uint32_t timer_data_id, unsigned int *walk_lcores,
 		walk_lcore = walk_lcores[i];
 		priv_timer = &timer_data->priv_timer[walk_lcore];
 
-		rte_spinlock_lock(&priv_timer->list_lock);
-
 		for (tim = priv_timer->pending_head.sl_next[0];
 		     tim != NULL;
 		     tim = next_tim) {
 			next_tim = tim->sl_next[0];
 
-			/* Call timer_stop with lock held */
-			__rte_timer_stop(tim, 1, timer_data);
+			__rte_timer_stop(tim, timer_data);
 
 			if (f)
 				f(tim, f_arg);
 		}
-
-		rte_spinlock_unlock(&priv_timer->list_lock);
 	}
 
 	return 0;