net/mlx5: fix memory regions release deadlock
Commit Message
When the memory callback list is created, a callback function that
manages memory regions (MRs) is registered. This callback iterates over
the shared device list and takes the lock of each shared device, to
prevent parallel access to that device's MR list.
When the PMD releases memory regions while the list exists, the callback
function maps the MRs while holding this lock.
When a shared device is closed and all its MRs are freed, the same lock
is taken.
Freeing the MRs calls rte_free(), which may trigger the memory callback.
The MR release wrongly took the lock before the shared device was
removed from the callback list, which caused the deadlock.
To fix this, first remove the shared device from the callback list and
only then release the memory regions.
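
The ordering issue described above can be illustrated with a minimal,
single-threaded sketch. This is not the mlx5 code: struct shared_dev,
mem_event_cb(), mr_release() and both close_*() helpers are hypothetical
stand-ins, and the non-recursive lock is modelled as a flag that records
a re-acquisition attempt instead of actually blocking.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a shared device; not the mlx5 structure. */
struct shared_dev {
	bool lock_held;      /* models the non-recursive per-device MR lock */
	bool on_cb_list;     /* models membership in the memory callback list */
	bool would_deadlock; /* set if the lock is requested while already held */
};

/* Models the memory event callback: only listed devices are visited. */
static void mem_event_cb(struct shared_dev *sh)
{
	if (!sh->on_cb_list)
		return;
	if (sh->lock_held) {
		/* A real non-recursive lock would block forever here. */
		sh->would_deadlock = true;
		return;
	}
	sh->lock_held = true;
	/* ... MR list would be updated here ... */
	sh->lock_held = false;
}

/* Models MR release: takes the lock, then frees memory. */
static void mr_release(struct shared_dev *sh)
{
	sh->lock_held = true;
	mem_event_cb(sh); /* models the free triggering the callback */
	sh->lock_held = false;
}

/* Old order: release MRs first, then remove from the callback list. */
static bool close_release_first(void)
{
	struct shared_dev sh = { false, true, false };

	mr_release(&sh);
	sh.on_cb_list = false;
	return sh.would_deadlock;
}

/* New order: remove from the callback list first, then release MRs. */
static bool close_remove_first(void)
{
	struct shared_dev sh = { false, true, false };

	sh.on_cb_list = false;
	mr_release(&sh);
	return sh.would_deadlock;
}
```

In this model, close_release_first() reports the re-entrant lock request
and close_remove_first() does not, because the callback no longer visits
a device that has already left the list.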
Fixes: 0e3d0525b2f2 ("net/mlx5: fix memory event callback list")
Cc: viacheslavo@mellanox.com
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@mellanox.com>
---
drivers/net/mlx5/mlx5.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
@@ -679,12 +679,12 @@ struct mlx5_flow_id_pool *
MLX5_ASSERT(rte_eal_process_type() == RTE_PROC_PRIMARY);
if (--sh->refcnt)
goto exit;
- /* Release created Memory Regions. */
- mlx5_mr_release(sh);
/* Remove from memory callback device list. */
rte_rwlock_write_lock(&mlx5_shared_data->mem_event_rwlock);
LIST_REMOVE(sh, mem_event_cb);
rte_rwlock_write_unlock(&mlx5_shared_data->mem_event_rwlock);
+ /* Release created Memory Regions. */
+ mlx5_mr_release(sh);
/* Remove context from the global device list. */
LIST_REMOVE(sh, next);
/*