[v3] net/mlx5: fix RSS hash for non-RSS CQE zipping

Message ID: 20241130003925.2228586-1-akozyrev@nvidia.com
State: Awaiting Upstream
Delegated to: Raslan Darawsheh
Series: [v3] net/mlx5: fix RSS hash for non-RSS CQE zipping

Checks

Context                         Check    Description
ci/checkpatch                   success  coding style OK
ci/loongarch-compilation        success  Compilation OK
ci/loongarch-unit-testing       success  Unit Testing PASS
ci/Intel-compilation            success  Compilation OK
ci/intel-Testing                success  Testing PASS
ci/intel-Functional             success  Functional PASS
ci/github-robot: build          success  github build: passed
ci/iol-compile-amd64-testing    success  Testing PASS
ci/iol-mellanox-Performance     success  Performance Testing PASS
ci/iol-sample-apps-testing      success  Testing PASS

Commit Message

Alexander Kozyrev Nov. 30, 2024, 12:39 a.m. UTC
Take the RSS hash value from the title packet
before it gets overwritten by the decompression routine.
Set the RSS hash flag in the packet mbuf if RSS is enabled
in case of non-RSS CQE zipping format.

Fixes: 54c2d46 ("net/mlx5: support flow tag and packet header miniCQEs")
Cc: stable@dpdk.org

Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
---
 drivers/net/mlx5/mlx5_rxtx_vec_altivec.h | 32 +++++++++++++-----------
 drivers/net/mlx5/mlx5_rxtx_vec_neon.h    | 18 ++++++-------
 drivers/net/mlx5/mlx5_rxtx_vec_sse.h     | 18 ++++++-------
 3 files changed, 35 insertions(+), 33 deletions(-)
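
The two steps in the commit message can be illustrated with a scalar sketch. This is not the driver code: the struct and function names below are hypothetical, the flag value is a stand-in for RTE_MBUF_F_RX_RSS_HASH, and only the multiply-by-enable idiom and the read-before-overwrite order are taken from the patch. Per comment #3 in this thread, the author considers the fix incorrect without firmware changes, so read this as an illustration of what the patch does, not as a validated fix.

```c
#include <stdint.h>

/* Stand-in for DPDK's RTE_MBUF_F_RX_RSS_HASH; the value is an assumption. */
#define SKETCH_RX_RSS_HASH (UINT64_C(1) << 1)

/* Hypothetical, reduced views of the fields the patch reads and writes. */
struct sketch_rxq {
	unsigned int rss_hash : 1; /* 1 when RSS is enabled on the queue */
};

struct sketch_mbuf {
	uint64_t ol_flags;
	uint32_t hash_rss; /* plays the role of mbuf->hash.rss */
};

/* Hash to propagate: the title packet's hash if RSS is on, otherwise 0. */
static inline uint32_t
sketch_title_hash(const struct sketch_rxq *rxq, const struct sketch_mbuf *title)
{
	return rxq->rss_hash * title->hash_rss;
}

/* Offload flag to pre-set: the RSS hash flag if RSS is on, otherwise 0. */
static inline uint32_t
sketch_rss_flag(const struct sketch_rxq *rxq)
{
	return (uint32_t)(rxq->rss_hash * SKETCH_RX_RSS_HASH);
}

/*
 * Fill a batch of decompressed packets. The title packet may alias pkts[0],
 * so its hash is captured once before any packet in the batch is written.
 */
static inline void
sketch_fill(const struct sketch_rxq *rxq, const struct sketch_mbuf *title,
	    struct sketch_mbuf *pkts, unsigned int n)
{
	const uint32_t hash = sketch_title_hash(rxq, title);
	const uint32_t flag = sketch_rss_flag(rxq);
	unsigned int i;

	for (i = 0; i < n; i++) {
		pkts[i].hash_rss = hash;
		pkts[i].ol_flags |= flag;
	}
}
```

With RSS disabled both products are zero, which matches the `hash.rss = 0` assignments the patch replaces; with RSS enabled every packet in the batch receives the title packet's hash and the hash flag.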
  

Comments

Dariusz Sosnowski Dec. 2, 2024, 2:27 p.m. UTC | #1
> -----Original Message-----
> From: Alexander Kozyrev <akozyrev@nvidia.com>
> Sent: Saturday, November 30, 2024 01:39
> To: dev@dpdk.org
> Cc: stable@dpdk.org; Raslan Darawsheh <rasland@nvidia.com>; Slava Ovsiienko
> <viacheslavo@nvidia.com>; Matan Azrad <matan@nvidia.com>; Dariusz
> Sosnowski <dsosnowski@nvidia.com>; Bing Zhao <bingz@nvidia.com>;
> Suanming Mou <suanmingm@nvidia.com>
> Subject: [PATCH v3] net/mlx5: fix RSS hash for non-RSS CQE zipping
> 
> Take the RSS hash value from the title packet before it gets overwritten by the
> decompression routine.
> Set the RSS hash flag in the packet mbuf if RSS is enabled in case of non-RSS CQE
> zipping format.
> 
> Fixes: 54c2d46 ("net/mlx5: support flow tag and packet header miniCQEs")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>

Acked-by: Dariusz Sosnowski <dsosnowski@nvidia.com>

Best regards,
Dariusz Sosnowski
  
Raslan Darawsheh Jan. 19, 2025, 11:47 a.m. UTC | #2
Hi,


> From: Alexander Kozyrev <akozyrev@nvidia.com>
> Sent: Saturday, November 30, 2024 2:39 AM
> To: dev@dpdk.org
> Cc: stable@dpdk.org; Raslan Darawsheh; Slava Ovsiienko; Matan Azrad; Dariusz Sosnowski; Bing Zhao; Suanming Mou
> Subject: [PATCH v3] net/mlx5: fix RSS hash for non-RSS CQE zipping
>
> Take the RSS hash value from the title packet
> before it gets overwritten by the decompression routine.
> Set the RSS hash flag in the packet mbuf if RSS is enabled
> in case of non-RSS CQE zipping format.
>
> Fixes: 54c2d46 ("net/mlx5: support flow tag and packet header miniCQEs")
> Cc: stable@dpdk.org
>
> Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>

Sending this reply to the correct patch version.

Patch applied to next-net-mlx.

Kindest regards,
Raslan Darawsheh
  
Alexander Kozyrev Jan. 19, 2025, 1:09 p.m. UTC | #3
Raslan, please revert this patch. I rejected it last week. This fix is incorrect without the FW changes.

Regards,
Alex
  

Patch

diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h b/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h
index 240987d03d..0f48298def 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h
@@ -82,6 +82,7 @@  rxq_cq_decompress_v(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cq,
 		(void *)&(cq + !rxq->cqe_comp_layout)->pkt_info;
 	/* Title packet is pre-built. */
 	struct rte_mbuf *t_pkt = rxq->cqe_comp_layout ? &rxq->title_pkt : elts[0];
+	const uint32_t hash_rss = rxq->rss_hash * t_pkt->hash.rss;
 	const __vector unsigned char zero = (__vector unsigned char){0};
 	/* Mask to shuffle from extracted mini CQE to mbuf. */
 	const __vector unsigned char shuf_mask1 = (__vector unsigned char){
@@ -113,8 +114,18 @@  rxq_cq_decompress_v(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cq,
 	const __vector unsigned short rxdf_sel_mask =
 		(__vector unsigned short){
 			0xffff, 0xffff, 0, 0, 0, 0xffff, 0, 0};
-	__vector unsigned char ol_flags = (__vector unsigned char){0};
-	__vector unsigned char ol_flags_mask = (__vector unsigned char){0};
+	__vector unsigned char ol_flags =
+			(__vector unsigned char)(__vector unsigned int) {
+				rxq->rss_hash * RTE_MBUF_F_RX_RSS_HASH,
+				rxq->rss_hash * RTE_MBUF_F_RX_RSS_HASH,
+				rxq->rss_hash * RTE_MBUF_F_RX_RSS_HASH,
+				rxq->rss_hash * RTE_MBUF_F_RX_RSS_HASH};
+	__vector unsigned char ol_flags_mask =
+			(__vector unsigned char)(__vector unsigned int) {
+				rxq->rss_hash * RTE_MBUF_F_RX_RSS_HASH,
+				rxq->rss_hash * RTE_MBUF_F_RX_RSS_HASH,
+				rxq->rss_hash * RTE_MBUF_F_RX_RSS_HASH,
+				rxq->rss_hash * RTE_MBUF_F_RX_RSS_HASH};
 	unsigned int pos;
 	unsigned int i;
 	unsigned int inv = 0;
@@ -440,12 +451,6 @@  rxq_cq_decompress_v(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cq,
 						pkt_info) & (1 << 6));
 				}
 			}
-			const __vector unsigned char hash_mask =
-				(__vector unsigned char)(__vector unsigned int) {
-					RTE_MBUF_F_RX_RSS_HASH,
-					RTE_MBUF_F_RX_RSS_HASH,
-					RTE_MBUF_F_RX_RSS_HASH,
-					RTE_MBUF_F_RX_RSS_HASH};
 			const __vector unsigned char rearm_flags =
 				(__vector unsigned char)(__vector unsigned int) {
 				(uint32_t)t_pkt->ol_flags,
@@ -453,9 +458,6 @@  rxq_cq_decompress_v(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cq,
 				(uint32_t)t_pkt->ol_flags,
 				(uint32_t)t_pkt->ol_flags};
 
-			ol_flags_mask = (__vector unsigned char)
-				vec_or((__vector unsigned long)ol_flags_mask,
-				(__vector unsigned long)hash_mask);
 			ol_flags = (__vector unsigned char)
 				vec_or((__vector unsigned long)ol_flags,
 				(__vector unsigned long)
@@ -470,10 +472,10 @@  rxq_cq_decompress_v(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cq,
 				((__vector unsigned int)ol_flags)[2];
 			elts[pos + 3]->ol_flags =
 				((__vector unsigned int)ol_flags)[3];
-			elts[pos]->hash.rss = 0;
-			elts[pos + 1]->hash.rss = 0;
-			elts[pos + 2]->hash.rss = 0;
-			elts[pos + 3]->hash.rss = 0;
+			elts[pos]->hash.rss = hash_rss;
+			elts[pos + 1]->hash.rss = hash_rss;
+			elts[pos + 2]->hash.rss = hash_rss;
+			elts[pos + 3]->hash.rss = hash_rss;
 		}
 		if (rxq->dynf_meta) {
 			int32_t offs = rxq->flow_meta_offset;
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
index dc1d30753d..462819cb4a 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
@@ -78,6 +78,7 @@  rxq_cq_decompress_v(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cq,
 		(void *)&(cq + !rxq->cqe_comp_layout)->pkt_info;
 	/* Title packet is pre-built. */
 	struct rte_mbuf *t_pkt = rxq->cqe_comp_layout ? &rxq->title_pkt : elts[0];
+	const uint32_t hash_rss = rxq->rss_hash * t_pkt->hash.rss;
 	unsigned int pos;
 	unsigned int i;
 	unsigned int inv = 0;
@@ -117,8 +118,10 @@  rxq_cq_decompress_v(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cq,
 		rxq->crc_present * RTE_ETHER_CRC_LEN, 0,
 		0, 0
 	};
-	uint32x4_t ol_flags = {0, 0, 0, 0};
-	uint32x4_t ol_flags_mask = {0, 0, 0, 0};
+	uint32x4_t ol_flags =
+		vdupq_n_u32(rxq->rss_hash * RTE_MBUF_F_RX_RSS_HASH);
+	uint32x4_t ol_flags_mask =
+		vdupq_n_u32(rxq->rss_hash * RTE_MBUF_F_RX_RSS_HASH);
 #ifdef MLX5_PMD_SOFT_COUNTERS
 	uint32_t rcvd_byte = 0;
 #endif
@@ -326,22 +329,19 @@  rxq_cq_decompress_v(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cq,
 						pkt_info) & (1 << 6));
 				}
 			}
-			const uint32x4_t hash_flags =
-				vdupq_n_u32(RTE_MBUF_F_RX_RSS_HASH);
 			const uint32x4_t rearm_flags =
 				vdupq_n_u32((uint32_t)t_pkt->ol_flags);
 
-			ol_flags_mask = vorrq_u32(ol_flags_mask, hash_flags);
 			ol_flags = vorrq_u32(ol_flags,
 					vbicq_u32(rearm_flags, ol_flags_mask));
 			elts[pos]->ol_flags = vgetq_lane_u32(ol_flags, 3);
 			elts[pos + 1]->ol_flags = vgetq_lane_u32(ol_flags, 2);
 			elts[pos + 2]->ol_flags = vgetq_lane_u32(ol_flags, 1);
 			elts[pos + 3]->ol_flags = vgetq_lane_u32(ol_flags, 0);
-			elts[pos]->hash.rss = 0;
-			elts[pos + 1]->hash.rss = 0;
-			elts[pos + 2]->hash.rss = 0;
-			elts[pos + 3]->hash.rss = 0;
+			elts[pos]->hash.rss = hash_rss;
+			elts[pos + 1]->hash.rss = hash_rss;
+			elts[pos + 2]->hash.rss = hash_rss;
+			elts[pos + 3]->hash.rss = hash_rss;
 		}
 		if (rxq->dynf_meta) {
 			int32_t offs = rxq->flow_meta_offset;
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
index 81a177fce7..fc1b436b72 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
@@ -78,6 +78,7 @@  rxq_cq_decompress_v(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cq,
 	volatile struct mlx5_mini_cqe8 *mcq = (void *)(cq + !rxq->cqe_comp_layout);
 	/* Title packet is pre-built. */
 	struct rte_mbuf *t_pkt = rxq->cqe_comp_layout ? &rxq->title_pkt : elts[0];
+	const uint32_t hash_rss = rxq->rss_hash * t_pkt->hash.rss;
 	unsigned int pos;
 	unsigned int i;
 	unsigned int inv = 0;
@@ -108,8 +109,10 @@  rxq_cq_decompress_v(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cq,
 			      0,
 			      rxq->crc_present * RTE_ETHER_CRC_LEN,
 			      0, 0);
-	__m128i ol_flags = _mm_setzero_si128();
-	__m128i ol_flags_mask = _mm_setzero_si128();
+	__m128i ol_flags =
+		_mm_set1_epi32(rxq->rss_hash * RTE_MBUF_F_RX_RSS_HASH);
+	__m128i ol_flags_mask =
+		_mm_set1_epi32(rxq->rss_hash * RTE_MBUF_F_RX_RSS_HASH);
 #ifdef MLX5_PMD_SOFT_COUNTERS
 	const __m128i zero = _mm_setzero_si128();
 	const __m128i ones = _mm_cmpeq_epi32(zero, zero);
@@ -310,12 +313,9 @@  rxq_cq_decompress_v(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cq,
 						pkt_info) & (1 << 6));
 				}
 			}
-			const __m128i hash_flags =
-				_mm_set1_epi32(RTE_MBUF_F_RX_RSS_HASH);
 			const __m128i rearm_flags =
 				_mm_set1_epi32((uint32_t)t_pkt->ol_flags);
 
-			ol_flags_mask = _mm_or_si128(ol_flags_mask, hash_flags);
 			ol_flags = _mm_or_si128(ol_flags,
 				_mm_andnot_si128(ol_flags_mask, rearm_flags));
 			elts[pos]->ol_flags =
@@ -326,10 +326,10 @@  rxq_cq_decompress_v(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cq,
 				_mm_extract_epi32(ol_flags, 2);
 			elts[pos + 3]->ol_flags =
 				_mm_extract_epi32(ol_flags, 3);
-			elts[pos]->hash.rss = 0;
-			elts[pos + 1]->hash.rss = 0;
-			elts[pos + 2]->hash.rss = 0;
-			elts[pos + 3]->hash.rss = 0;
+			elts[pos]->hash.rss = hash_rss;
+			elts[pos + 1]->hash.rss = hash_rss;
+			elts[pos + 2]->hash.rss = hash_rss;
+			elts[pos + 3]->hash.rss = hash_rss;
 		}
 		if (rxq->dynf_meta) {
 			int32_t offs = rxq->flow_meta_offset;
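
The flag merge that all three hunks share can be written as one scalar expression. In the SSE path, `_mm_andnot_si128(ol_flags_mask, rearm_flags)` computes `~ol_flags_mask & rearm_flags`; in the NEON path, `vbicq_u32(rearm_flags, ol_flags_mask)` computes `rearm_flags & ~ol_flags_mask`. The sketch below models a single 32-bit lane. The function names and the flag value are hypothetical stand-ins, and it ignores any other bits the surrounding driver code may OR into the two vectors.

```c
#include <stdint.h>

/* Stand-in for the low 32 bits of RTE_MBUF_F_RX_RSS_HASH (assumed value). */
#define SKETCH_RSS_FLAG UINT32_C(0x2)

/*
 * One lane of the merge: bits covered by the mask come from ol_flags,
 * all other bits are inherited from the title packet's flags.
 */
static inline uint32_t
sketch_merge_flags(uint32_t ol_flags, uint32_t ol_flags_mask,
		   uint32_t title_flags)
{
	return ol_flags | (title_flags & ~ol_flags_mask);
}

/* After the patch: value and mask both start from the conditional flag. */
static inline uint32_t
sketch_flags_patched(unsigned int rss_enabled, uint32_t title_flags)
{
	const uint32_t init = (rss_enabled ? 1u : 0u) * SKETCH_RSS_FLAG;

	return sketch_merge_flags(init, init, title_flags);
}

/* Before the patch: the hash bit was always masked and never set. */
static inline uint32_t
sketch_flags_unpatched(uint32_t title_flags)
{
	return sketch_merge_flags(0, SKETCH_RSS_FLAG, title_flags);
}
```

In this model the unpatched path always clears the hash bit, while the patched path sets it when RSS is enabled and otherwise leaves the title packet's bit untouched.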