net/mlx5: fix mbufs overflow in vectorized MPRQ
Commit Message
Changing the allocation scheme to improve mbuf locality caused mbuf
overruns in some cases. Restore the previous replenish logic:
calculate the number of unused mbufs and replenish at most that many.
Mark the last 4 mbufs as fake mbufs to prevent overflowing into consumed
mbufs in the future. Keep the consumed index and the produced index 4 mbufs
apart for this purpose.
Replenish mbufs only when the consumed index is within the replenish
threshold of the produced index, in order to retain cache locality for
the vectorized MPRQ routine.
Fixes: 5c68764377 ("net/mlx5: improve vectorized MPRQ descriptors locality")
Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
---
drivers/net/mlx5/mlx5_rxtx_vec.c | 17 ++++++++++++-----
1 file changed, 12 insertions(+), 5 deletions(-)
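For readers following the index arithmetic, the count calculation described in the commit message can be modelled outside the driver roughly as below. This is only a sketch under assumptions, not the mlx5 code: `replenish_count`, `DESCS_PER_LOOP`, and the parameter names are hypothetical stand-ins for the driver fields, the ring size is assumed to be a power of two, and the indexes are assumed to be free-running 32-bit counters.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for MLX5_VPMD_DESCS_PER_LOOP. */
#define DESCS_PER_LOOP 4u

/*
 * Return how many mbufs to request, or 0 when no replenish is due.
 * Assumes elts_n is a power of two and repl_thresh > DESCS_PER_LOOP.
 */
static uint32_t
replenish_count(uint32_t elts_n, uint32_t elts_ci, uint32_t rq_pi,
		uint32_t repl_thresh)
{
	/* Unsigned subtraction keeps the distance correct across wraparound. */
	const uint32_t distance = elts_ci - rq_pi;
	const uint32_t unused = elts_n - distance;
	const uint32_t elts_idx = elts_ci & (elts_n - 1);
	uint32_t n;

	assert((elts_n & (elts_n - 1)) == 0);
	assert(repl_thresh > DESCS_PER_LOOP);
	/* Replenish only when enough slots are unused and the indexes are close. */
	if (unused < repl_thresh || distance > repl_thresh)
		return 0;
	/* Keep DESCS_PER_LOOP slots apart and do not cross the ring end. */
	n = unused - DESCS_PER_LOOP;
	if (n > elts_n - elts_idx)
		n = elts_n - elts_idx;
	/* Limit the request to the threshold value. */
	if (n > repl_thresh)
		n = repl_thresh;
	return n;
}
```

With a 64-entry ring and a threshold of 16, `replenish_count(64, 64, 56, 16)` returns 16, while `replenish_count(64, 60, 58, 16)` returns 4 because only 4 slots remain before the ring end.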
Comments
> -----Original Message-----
> From: Alexander Kozyrev <akozyrev@nvidia.com>
> Sent: Saturday, November 21, 2020 5:43
> To: dev@dpdk.org
> Cc: Raslan Darawsheh <rasland@nvidia.com>; Slava Ovsiienko
> <viacheslavo@nvidia.com>; Matan Azrad <matan@nvidia.com>
> Subject: [PATCH] net/mlx5: fix mbufs overflow in vectorized MPRQ
>
> Changing the allocation scheme to improve mbufs locality caused mbufs
> overrun in some cases. Revert the previous replenish logic back.
> Calculate a number of unused mbufs and replenish max this number of mbufs.
>
> Mark the last 4 mbufs as fake mbufs to prevent overflowing into consumed
> mbufs in the future. Keep the consumed index and the produced index 4
> mbufs apart for this purpose.
>
> Replenish some mbufs only in case the consumed index is within the replenish
> threshold of the produced index in order to retain the cache locality for the
> vectorized MPRQ routine.
>
> Fixes: 5c68764377 ("net/mlx5: improve vectorized MPRQ descriptors locality")
>
> Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Hi,
> -----Original Message-----
> From: Alexander Kozyrev <akozyrev@nvidia.com>
> Sent: Saturday, November 21, 2020 5:43 AM
> To: dev@dpdk.org
> Cc: Raslan Darawsheh <rasland@nvidia.com>; Slava Ovsiienko
> <viacheslavo@nvidia.com>; Matan Azrad <matan@nvidia.com>
> Subject: [PATCH] net/mlx5: fix mbufs overflow in vectorized MPRQ
>
> Changing the allocation scheme to improve mbufs locality caused mbufs
> overrun in some cases. Revert the previous replenish logic back.
> Calculate a number of unused mbufs and replenish max this number of
> mbufs.
>
> Mark the last 4 mbufs as fake mbufs to prevent overflowing into consumed
> mbufs in the future. Keep the consumed index and the produced index 4
> mbufs apart for this purpose.
>
> Replenish some mbufs only in case the consumed index is within the
> replenish threshold of the produced index in order to retain the cache
> locality for the vectorized MPRQ routine.
>
> Fixes: 5c68764377 ("net/mlx5: improve vectorized MPRQ descriptors
> locality")
>
> Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
> ---
> drivers/net/mlx5/mlx5_rxtx_vec.c | 17 ++++++++++++-----
> 1 file changed, 12 insertions(+), 5 deletions(-)
>
Patch applied to next-net-mlx,
Kindest regards,
Raslan Darawsheh
@@ -145,22 +145,29 @@ mlx5_rx_mprq_replenish_bulk_mbuf(struct mlx5_rxq_data *rxq)
 	const uint32_t strd_n = 1 << rxq->strd_num_n;
 	const uint32_t elts_n = wqe_n * strd_n;
 	const uint32_t wqe_mask = elts_n - 1;
-	uint32_t n = rxq->elts_ci - rxq->rq_pi;
+	uint32_t n = elts_n - (rxq->elts_ci - rxq->rq_pi);
 	uint32_t elts_idx = rxq->elts_ci & wqe_mask;
 	struct rte_mbuf **elts = &(*rxq->elts)[elts_idx];
+	unsigned int i;
-	if (n <= rxq->rq_repl_thresh) {
-		MLX5_ASSERT(n + MLX5_VPMD_RX_MAX_BURST >=
-			    MLX5_VPMD_RXQ_RPLNSH_THRESH(elts_n));
+	if (n >= rxq->rq_repl_thresh &&
+	    rxq->elts_ci - rxq->rq_pi <= rxq->rq_repl_thresh) {
+		MLX5_ASSERT(n >= MLX5_VPMD_RXQ_RPLNSH_THRESH(elts_n));
 		MLX5_ASSERT(MLX5_VPMD_RXQ_RPLNSH_THRESH(elts_n) >
 			    MLX5_VPMD_DESCS_PER_LOOP);
 		/* Not to cross queue end. */
-		n = RTE_MIN(n + MLX5_VPMD_RX_MAX_BURST, elts_n - elts_idx);
+		n = RTE_MIN(n - MLX5_VPMD_DESCS_PER_LOOP, elts_n - elts_idx);
+		/* Limit replenish number to threshold value. */
+		n = RTE_MIN(n, rxq->rq_repl_thresh);
 		if (rte_mempool_get_bulk(rxq->mp, (void *)elts, n) < 0) {
 			rxq->stats.rx_nombuf += n;
 			return;
 		}
 		rxq->elts_ci += n;
+		/* Prevent overflowing into consumed mbufs. */
+		elts_idx = rxq->elts_ci & wqe_mask;
+		for (i = 0; i < MLX5_VPMD_DESCS_PER_LOOP; ++i)
+			(*rxq->elts)[elts_idx + i] = &rxq->fake_mbuf;
 	}
 }
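The fake-mbuf marking at the end of the hunk can be illustrated in isolation with the sketch below. Everything in it is a hypothetical stand-in rather than driver code: `struct buf` replaces `struct rte_mbuf`, `struct ring` replaces the Rx queue, and `RING_SIZE` is an arbitrary power of two. It also assumes the pointer array carries `DESCS_PER_LOOP` extra trailing slots so that `idx + i` stays in bounds near the ring end; the patch itself does not show how the array is sized.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 16u      /* hypothetical ring size, power of two */
#define DESCS_PER_LOOP 4u  /* stand-in for MLX5_VPMD_DESCS_PER_LOOP */

struct buf { int id; };    /* stand-in for struct rte_mbuf */

struct ring {
	/* Assumed padding: extra slots keep idx + i inside the array. */
	struct buf *elts[RING_SIZE + DESCS_PER_LOOP];
	struct buf fake;   /* stand-in for rxq->fake_mbuf */
	uint32_t elts_ci;  /* free-running index, masked on use */
};

/* Point the DESCS_PER_LOOP entries starting at elts_ci to the fake buffer. */
static void
mark_fake_tail(struct ring *r)
{
	uint32_t idx;
	unsigned int i;

	assert(r != NULL);
	idx = r->elts_ci & (RING_SIZE - 1);
	for (i = 0; i < DESCS_PER_LOOP; ++i)
		r->elts[idx + i] = &r->fake;
}
```

The point of the sentinel is that a vectorized loop reading `DESCS_PER_LOOP` pointers at a time sees a harmless placeholder past the last replenished entry instead of a stale pointer to an already consumed buffer.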