common/mlx5: fix misalignment issue detected by ASan

Message ID: 20241112082126.40349-1-shperetz@nvidia.com
State: Accepted, archived
Delegated to: Raslan Darawsheh
Series: common/mlx5: fix misalignment issue detected by ASan

Checks

Context                       Check    Description
ci/checkpatch                 success  coding style OK
ci/loongarch-compilation      success  Compilation OK
ci/loongarch-unit-testing     success  Unit Testing PASS
ci/Intel-compilation          success  Compilation OK
ci/intel-Testing              success  Testing PASS
ci/iol-intel-Performance      success  Performance Testing PASS
ci/iol-mellanox-Performance   success  Performance Testing PASS
ci/intel-Functional           success  Functional PASS
ci/iol-broadcom-Performance   success  Performance Testing PASS
ci/github-robot: build        success  github build: passed
ci/iol-intel-Functional       success  Functional Testing PASS
ci/iol-unit-amd64-testing     success  Testing PASS
ci/iol-marvell-Functional     success  Functional Testing PASS
ci/iol-compile-amd64-testing  success  Testing PASS
ci/iol-sample-apps-testing    warning  Testing issues
ci/iol-unit-arm64-testing     success  Testing PASS
ci/iol-compile-arm64-testing  success  Testing PASS

Commit Message

Shani Peretz Nov. 12, 2024, 8:21 a.m. UTC
ASan reported a runtime error due to misalignment
involving three structures.

The first issue arises when accessing
l_inconst->cache[MLX5_LIST_GLOBAL]->h.
If struct mlx5_list_cache is not properly aligned, the pointer gc,
assigned to l_inconst->cache[MLX5_LIST_GLOBAL], could be misaligned.
To address this, the __rte_aligned(16) attribute was added to
struct mlx5_list, which holds struct mlx5_list_inconst, ensuring
that the entire mlx5_list structure, including mlx5_list_cache,
is aligned to 16 bytes.
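
As a rough illustration of what an alignment attribute does to a type,
the sketch below uses hypothetical structs (demo_plain, demo_aligned)
and a local DEMO_ALIGNED macro standing in for __rte_aligned on
GCC/Clang. None of these names come from the driver, and the layout is
not the real mlx5_list. It only shows that the attribute raises the
type's alignment and pads its size to a multiple of 16, so an object
placed directly after it starts on a 16-byte boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for DPDK's __rte_aligned() on GCC/Clang (assumption). */
#define DEMO_ALIGNED(a) __attribute__((__aligned__(a)))

/* Hypothetical structs for illustration; NOT the mlx5 definitions. */
struct demo_plain {
	uint64_t a;
	uint32_t b;
	uint64_t c;
};

struct DEMO_ALIGNED(16) demo_aligned {
	uint64_t a;
	uint32_t b;
	uint64_t c;
};

/* The attribute raises the alignment and pads the size accordingly. */
static_assert(_Alignof(struct demo_aligned) == 16, "alignment is 16");
static_assert(sizeof(struct demo_aligned) % 16 == 0, "size is padded");

/* Return 1 when ptr is a multiple of align bytes, 0 otherwise. */
static int demo_is_aligned(const void *ptr, size_t align)
{
	return ((uintptr_t)ptr % align) == 0;
}

/*
 * Return 1 when the address right after an aligned header (hdr + 1)
 * is itself 16-byte aligned, which is what a trailing object needs.
 */
static int demo_trailer_is_aligned(void)
{
	static struct demo_aligned storage[2];

	return demo_is_aligned(&storage[0] + 1, 16);
}
```

Calling demo_trailer_is_aligned() is expected to return 1 with the
attribute in place. Without it, the result would depend on the natural
size of the struct on the target ABI.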

To resolve misalignment issues with struct mlx5_flow_handle,
the initialization of resources for the ipool ensures that
the ipool size (cfg.size) is rounded up to an 8-byte boundary.
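
The rounding can be sketched as follows. demo_align_mul_ceil is a
local helper that mirrors the arithmetic expected from
RTE_ALIGN_MUL_CEIL (an assumption; it is not the DPDK macro itself),
and the 52-byte entry size is a made-up value, not the real
sizeof(struct mlx5_flow_handle). The sketch also assumes entries are
stored back to back, so the offset of entry idx is entry_size * idx.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Round v up to the next multiple of mul.
 * Local stand-in for RTE_ALIGN_MUL_CEIL (assumption).
 */
static size_t demo_align_mul_ceil(size_t v, size_t mul)
{
	assert(mul != 0);
	return ((v + mul - 1) / mul) * mul;
}

/*
 * Byte offset of entry idx, assuming entries are laid out
 * back to back in one buffer (assumption about the pool).
 */
static size_t demo_entry_offset(size_t entry_size, size_t idx)
{
	return entry_size * idx;
}
```

With a hypothetical 52-byte entry, demo_entry_offset(52, 1) is 52,
which is not a multiple of 8, so the second entry would start
misaligned. After rounding, demo_align_mul_ceil(52, 8) gives 56 and
every entry offset stays on an 8-byte boundary.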

The error in assigning values to actions[i] was due to potential
padding or misalignment in struct mlx5_modification_cmd.
To prevent such issues, the __rte_packed attribute was added to
the two anonymous unions in struct mlx5_modification_cmd, ensuring
that they are packed without extra padding, which helps avoid
misaligned memory accesses.
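
A minimal sketch of the effect of packing, assuming GCC/Clang
semantics for the packed attribute. DEMO_PACKED, demo_cmd and
demo_holder are hypothetical names, and the layout is only loosely
modeled on the command structure, not copied from mlx5_prm.h. Packing
the unions lowers the required alignment of the enclosing struct to
1 byte while keeping its size, so the compiler no longer assumes a
4-byte aligned address when it accesses the members.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for DPDK's __rte_packed on GCC/Clang (assumption). */
#define DEMO_PACKED __attribute__((__packed__))

/* Hypothetical command layout; NOT the real mlx5 structure. */
struct demo_cmd {
	union {
		uint32_t data0;
		uint8_t bytes0[4];
	} DEMO_PACKED;
	union {
		uint32_t data1;
		uint8_t bytes1[4];
	} DEMO_PACKED;
};

/* Packing keeps the size but drops the alignment requirement. */
static_assert(sizeof(struct demo_cmd) == 8, "size is unchanged");
static_assert(_Alignof(struct demo_cmd) == 1, "alignment is 1");

/* With alignment 1, the command can sit at an odd offset. */
struct demo_holder {
	uint8_t tag;
	struct demo_cmd cmd;
};

static_assert(offsetof(struct demo_holder, cmd) == 1, "odd offset");

/* Store and reload a value through the oddly placed command. */
static uint32_t demo_roundtrip(uint32_t value)
{
	struct demo_holder h;

	h.tag = 0;
	h.cmd.data0 = value;
	h.cmd.data1 = ~value;
	return h.cmd.data0;
}
```

demo_roundtrip() returns the value it was given. The point is that the
member accesses are valid at an odd address, which would not hold for
the unpacked layout.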

Two performance degradation tests were conducted.
The following results compare this commit to the most recent
commit in mlnx_dpdk_22.11 at that time (b69408ae453).

Before the ASan misalignment fix (average kflows/sec):
Insertion - 4461.269, Deletion - 7799.9992
After:
Insertion - 4579.0642, Deletion - 7913.0034

Fixes: 9a4c36880704 ("common/mlx5: optimize cache list object memory")
Cc: suanmingm@nvidia.com

Signed-off-by: Shani Peretz <shperetz@nvidia.com>
Acked-by: Bing Zhao <bingz@nvidia.com>
---
 drivers/common/mlx5/mlx5_common_utils.h | 2 +-
 drivers/common/mlx5/mlx5_prm.h          | 4 ++--
 drivers/net/mlx5/mlx5.c                 | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)
  

Comments

Raslan Darawsheh Nov. 13, 2024, 1:39 p.m. UTC | #1
Hi,

Patch applied to next-net-mlx,

Kindest regards,
Raslan Darawsheh
  

Patch

diff --git a/drivers/common/mlx5/mlx5_common_utils.h b/drivers/common/mlx5/mlx5_common_utils.h
index c5eff7a0bf..9139bc6829 100644
--- a/drivers/common/mlx5/mlx5_common_utils.h
+++ b/drivers/common/mlx5/mlx5_common_utils.h
@@ -131,7 +131,7 @@  struct mlx5_list_inconst {
  * For huge amount of entries, please consider hash list.
  *
  */
-struct mlx5_list {
+struct __rte_aligned(16) mlx5_list {
 	struct mlx5_list_const l_const;
 	struct mlx5_list_inconst l_inconst;
 };
diff --git a/drivers/common/mlx5/mlx5_prm.h b/drivers/common/mlx5/mlx5_prm.h
index 359f02f17c..5d73751182 100644
--- a/drivers/common/mlx5/mlx5_prm.h
+++ b/drivers/common/mlx5/mlx5_prm.h
@@ -914,7 +914,7 @@  struct mlx5_modification_cmd {
 			unsigned int field:12;
 			unsigned int action_type:4;
 		};
-	};
+	} __rte_packed;
 	union {
 		uint32_t data1;
 		uint8_t data[4];
@@ -925,7 +925,7 @@  struct mlx5_modification_cmd {
 			unsigned int dst_field:12;
 			unsigned int rsvd4:4;
 		};
-	};
+	} __rte_packed;
 };
 
 typedef uint64_t u64;
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 52b90e6ff3..6e4473e2f4 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -907,7 +907,7 @@  mlx5_flow_ipool_create(struct mlx5_dev_ctx_shared *sh)
 		 */
 		case MLX5_IPOOL_MLX5_FLOW:
 			cfg.size = sh->config.dv_flow_en ?
-				sizeof(struct mlx5_flow_handle) :
+				RTE_ALIGN_MUL_CEIL(sizeof(struct mlx5_flow_handle), 8) :
 				MLX5_FLOW_HANDLE_VERBS_SIZE;
 			break;
 #if defined(HAVE_IBV_FLOW_DV_SUPPORT) || !defined(HAVE_INFINIBAND_VERBS_H)