net/mlx5: fix error packets drop in the regular Rx
Checks
Commit Message
When packet gets received with error it is reported in CQE
structure and PMD analyzes the error syndrome and provides
two options - either reset the entire queue for the critical
errors, or just ignore the packet.
The non-vectorized rx_burst did not ignore the non-critical
error packets, and in case of packet length exceeding the
mbuf data buffer length it took the next element in the queue
WQE ring, resulting in CQE/WQE consume indices synchronization
lost.
Fixes: aa67ed308458 ("net/mlx5: ignore non-critical syndromes for Rx queue")
Cc: stable@dpdk.org
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
drivers/net/mlx5/mlx5_rx.c | 19 ++++++++++++-------
1 file changed, 12 insertions(+), 7 deletions(-)
Comments
> -----Original Message-----
> From: Slava Ovsiienko <viacheslavo@nvidia.com>
> Sent: Tuesday, February 20, 2024 12:45
> To: dev@dpdk.org
> Cc: Matan Azrad <matan@nvidia.com>; Raslan Darawsheh
> <rasland@nvidia.com>; Ori Kam <orika@nvidia.com>; Dariusz Sosnowski
> <dsosnowski@nvidia.com>; stable@dpdk.org
> Subject: [PATCH] net/mlx5: fix error packets drop in the regular Rx
>
> When packet gets received with error it is reported in CQE structure and PMD
> analyzes the error syndrome and provides two options - either reset the entire
> queue for the critical errors, or just ignore the packet.
>
> The non-vectorized rx_burst did not ignore the non-critical error packets, and
> in case of packet length exceeding the mbuf data buffer length it took the next
> element in the queue WQE ring, resulting in CQE/WQE consume indices
> synchronization lost.
>
> Fixes: aa67ed308458 ("net/mlx5: ignore non-critical syndromes for Rx
> queue")
> Cc: stable@dpdk.org
>
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Dariusz Sosnowski <dsosnowski@nvidia.com>
Best regards,
Dariusz Sosnowski
Hi,
> -----Original Message-----
> From: Slava Ovsiienko <viacheslavo@nvidia.com>
> Sent: Tuesday, February 20, 2024 1:45 PM
> To: dev@dpdk.org
> Cc: Matan Azrad <matan@nvidia.com>; Raslan Darawsheh
> <rasland@nvidia.com>; Ori Kam <orika@nvidia.com>; Dariusz Sosnowski
> <dsosnowski@nvidia.com>; stable@dpdk.org
> Subject: [PATCH] net/mlx5: fix error packets drop in the regular Rx
>
> When packet gets received with error it is reported in CQE structure and PMD
> analyzes the error syndrome and provides two options - either reset the entire
> queue for the critical errors, or just ignore the packet.
>
> The non-vectorized rx_burst did not ignore the non-critical error packets, and
> in case of packet length exceeding the mbuf data buffer length it took the next
> element in the queue WQE ring, resulting in CQE/WQE consume indices
> synchronization lost.
>
> Fixes: aa67ed308458 ("net/mlx5: ignore non-critical syndromes for Rx
> queue")
> Cc: stable@dpdk.org
>
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Patch applied to next-net-mlx,
Kindest regards,
Raslan Darawsheh
@@ -613,7 +613,8 @@ mlx5_rx_err_handle(struct mlx5_rxq_data *rxq, uint8_t vec,
* @param mprq
* Indication if it is called from MPRQ.
* @return
- * 0 in case of empty CQE, MLX5_REGULAR_ERROR_CQE_RET in case of error CQE,
+ * 0 in case of empty CQE,
+ * MLX5_REGULAR_ERROR_CQE_RET in case of error CQE,
* MLX5_CRITICAL_ERROR_CQE_RET in case of error CQE lead to Rx queue reset,
* otherwise the packet size in regular RxQ,
* and striding byte count format in mprq case.
@@ -697,6 +698,11 @@ mlx5_rx_poll_len(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cqe,
if (ret == MLX5_RECOVERY_ERROR_RET ||
ret == MLX5_RECOVERY_COMPLETED_RET)
return MLX5_CRITICAL_ERROR_CQE_RET;
+ if (!mprq && ret == MLX5_RECOVERY_IGNORE_RET) {
+ *skip_cnt = 1;
+ ++rxq->cq_ci;
+ return MLX5_ERROR_CQE_MASK;
+ }
} else {
return 0;
}
@@ -971,19 +977,18 @@ mlx5_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
cqe = &(*rxq->cqes)[rxq->cq_ci & cqe_mask];
len = mlx5_rx_poll_len(rxq, cqe, cqe_n, cqe_mask, &mcqe, &skip_cnt, false);
if (unlikely(len & MLX5_ERROR_CQE_MASK)) {
+ /* We drop packets with non-critical errors */
+ rte_mbuf_raw_free(rep);
if (len == MLX5_CRITICAL_ERROR_CQE_RET) {
- rte_mbuf_raw_free(rep);
rq_ci = rxq->rq_ci << sges_n;
break;
}
+ /* Skip specified amount of error CQEs packets */
rq_ci >>= sges_n;
rq_ci += skip_cnt;
rq_ci <<= sges_n;
- idx = rq_ci & wqe_mask;
- wqe = &((volatile struct mlx5_wqe_data_seg *)rxq->wqes)[idx];
- seg = (*rxq->elts)[idx];
- cqe = &(*rxq->cqes)[rxq->cq_ci & cqe_mask];
- len = len & ~MLX5_ERROR_CQE_MASK;
+ MLX5_ASSERT(!pkt);
+ continue;
}
if (len == 0) {
rte_mbuf_raw_free(rep);