common/mlx5: fix MR lookup for non-contiguous mempool
Commit Message
Memory region (MR) lookup by address inside mempool MRs
was not accounting for the upper bound of an MR.
For mempools covered by multiple MRs this could return
a wrong MR LKey, typically resulting in an unrecoverable
TxQ failure:
mlx5_net: Cannot change Tx QP state to INIT Invalid argument
Corresponding message from /var/log/dpdk_mlx5_port_X_txq_Y_index_Z*:
Unexpected CQE error syndrome 0x04 CQN = 128 SQN = 4848
wqe_counter = 0 wq_ci = 9 cq_ci = 122
This is likely to happen with --legacy-mem and IOVA-as-PA,
because EAL intentionally maps pages at non-adjacent PA
to non-adjacent VA in this mode, and MLX5 PMD works with VA.
Fixes: 690b2a88c2f7 ("common/mlx5: add mempool registration facilities")
Cc: stable@dpdk.org
Reported-by: Wang Yunjian <wangyunjian@huawei.com>
Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
drivers/common/mlx5/mlx5_common_mr.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
Comments
Hi,
> -----Original Message-----
> From: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
> Sent: Friday, January 14, 2022 12:52 PM
> To: dev@dpdk.org
> Cc: stable@dpdk.org; Wang Yunjian <wangyunjian@huawei.com>; Slava
> Ovsiienko <viacheslavo@nvidia.com>; Matan Azrad <matan@nvidia.com>
> Subject: [PATCH] common/mlx5: fix MR lookup for non-contiguous mempool
Patch applied to next-net-mlx,
Kindest regards,
Raslan Darawsheh
@@ -1834,12 +1834,13 @@ mlx5_mempool_reg_addr2mr(struct mlx5_mempool_reg *mpr, uintptr_t addr,
 	for (i = 0; i < mpr->mrs_n; i++) {
 		const struct mlx5_pmd_mr *mr = &mpr->mrs[i].pmd_mr;
-		uintptr_t mr_addr = (uintptr_t)mr->addr;
+		uintptr_t mr_start = (uintptr_t)mr->addr;
+		uintptr_t mr_end = mr_start + mr->len;
-		if (mr_addr <= addr) {
+		if (mr_start <= addr && addr < mr_end) {
 			lkey = rte_cpu_to_be_32(mr->lkey);
-			entry->start = mr_addr;
-			entry->end = mr_addr + mr->len;
+			entry->start = mr_start;
+			entry->end = mr_end;
 			entry->lkey = lkey;
 			break;
 		}
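
The effect of the added upper-bound check can be reproduced outside the driver.
The following is a minimal standalone sketch, not the PMD code: `struct region`,
`NO_LKEY`, `lookup_lower_bound_only()`, `lookup_bounded()` and the addresses in
`example_regions` are hypothetical names and values chosen for illustration.
It assumes two MRs separated by a gap in VA, as can happen with a
non-contiguous mempool.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one registered MR; not the driver's struct. */
struct region {
	uintptr_t addr; /* start virtual address */
	size_t len;     /* length in bytes */
	uint32_t lkey;  /* key that must accompany addresses in this MR */
};

/* Hypothetical "not found" marker for this sketch. */
#define NO_LKEY UINT32_MAX

/* Two MRs with a gap between them (illustrative addresses). */
const struct region example_regions[] = {
	{ 0x1000, 0x1000, 0x11 }, /* covers [0x1000, 0x2000) */
	{ 0x5000, 0x1000, 0x22 }, /* covers [0x5000, 0x6000) */
};

/* Old behaviour: only the lower bound is tested, so the first MR that
 * starts at or below addr wins, even if addr lies past its end. */
uint32_t
lookup_lower_bound_only(const struct region *r, size_t n, uintptr_t addr)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (r[i].addr <= addr)
			return r[i].lkey;
	}
	return NO_LKEY;
}

/* Fixed behaviour: addr must fall inside the half-open range
 * [start, end) of the MR. */
uint32_t
lookup_bounded(const struct region *r, size_t n, uintptr_t addr)
{
	size_t i;

	for (i = 0; i < n; i++) {
		uintptr_t start = r[i].addr;
		uintptr_t end = start + r[i].len;

		if (start <= addr && addr < end)
			return r[i].lkey;
	}
	return NO_LKEY;
}

/* Exercises both lookups on an address inside the second MR. */
void
check_example(void)
{
	/* 0x5800 belongs to the second MR, but the old check picks the first. */
	assert(lookup_lower_bound_only(example_regions, 2, 0x5800) == 0x11);
	assert(lookup_bounded(example_regions, 2, 0x5800) == 0x22);
	/* An address in the gap matches no MR once the upper bound is checked. */
	assert(lookup_bounded(example_regions, 2, 0x3000) == NO_LKEY);
}
```

Calling `check_example()` from a test program exercises both variants. In this
sketch the lower-bound-only lookup hands back the key of the wrong region for
any address above the first region's start, which corresponds to the wrong
LKey described in the commit message.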