[2/2] net/i40e: fix risk in Rx descriptor read in scalar path

Message ID 20210906033201.1789796-3-ruifeng.wang@arm.com (mailing list archive)
State Superseded, archived
Delegated to: Qi Zhang
Series i40e Rx descriptor loads ordering

Checks

Context                          Check    Description
ci/checkpatch                    success  coding style OK
ci/github-robot: build           success  github build: passed
ci/iol-broadcom-Performance      success  Performance Testing PASS
ci/iol-broadcom-Functional       success  Functional Testing PASS
ci/iol-aarch64-compile-testing   success  Testing PASS
ci/iol-intel-Functional          success  Functional Testing PASS
ci/iol-intel-Performance         fail     Performance Testing issues
ci/iol-x86_64-unit-testing       success  Testing PASS
ci/iol-x86_64-compile-testing    success  Testing PASS
ci/iol-mellanox-Performance      success  Performance Testing PASS
ci/Intel-compilation             success  Compilation OK
ci/intel-Testing                 success  Testing PASS

Commit Message

Ruifeng Wang Sept. 6, 2021, 3:32 a.m. UTC
  Rx descriptor is 16B/32B in size and consists of multiple words.
The word that includes DD field should be read first. Read result
with DD bit set indicates the rest part in a descriptor is valid.

In functions for simple Rx, the descriptor is not read atomically
in whole. On weaker ordered systems like aarch64, read of the word
that includes DD field could be reordered after read of other words.
In this case, some words could be invalid data.

Read barrier is inserted between read of the word with DD field
and read of other words. The barrier ensures what fetched is correct
descriptor data.

Fixes: 7b0cf70135d1 ("net/i40e: support ARM platform")
Cc: stable@dpdk.org

Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
---
The change should not impact performance on x86, as an acquire fence
emits no instruction on x86 (it only constrains the compiler).

 drivers/net/i40e/i40e_rxtx.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)
  

Comments

Honnappa Nagarahalli Sept. 14, 2021, 6:06 p.m. UTC | #1
<snip>

> 
> Rx descriptor is 16B/32B in size and consists of multiple words.
> The word that includes DD field should be read first. Read result with DD bit
> set indicates the rest part in a descriptor is valid.
Suggest rewording as follows:
The Rx descriptor is 16B/32B in size. If the DD bit is set, the rest of the descriptor words hold valid values. Hence, the word containing the DD bit must be read before the rest of the descriptor words.

> 
> In functions for simple Rx, the descriptor is not read atomically in whole. On
> weaker ordered systems like aarch64, read of the word that includes DD field
> could be reordered after read of other words.
> In this case, some words could be invalid data.
Since the entire descriptor is not read atomically, on systems with relaxed memory ordering such as aarch64, the read of the word containing the DD field could be reordered after the reads of the other words.

> 
> Read barrier is inserted between read of the word with DD field and read of
> other words. The barrier ensures what fetched is correct descriptor data.
Suggest capturing the performance impact, so it is clearly documented.

> 
> Fixes: 7b0cf70135d1 ("net/i40e: support ARM platform")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
With the above comments,
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>

> ---
> The change should not impact performance on x86 as acquire fence is ignored
> on x86.
> 
>  drivers/net/i40e/i40e_rxtx.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
> index 8329cbdd4e..c4cd6b6b60 100644
> --- a/drivers/net/i40e/i40e_rxtx.c
> +++ b/drivers/net/i40e/i40e_rxtx.c
> @@ -746,6 +746,12 @@ i40e_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
>  			break;
>  		}
> 
> +		/**
> +		 * Use acquire fence to ensure that qword1 which includes DD
> +		 * bit is loaded before loading of other descriptor words.
> +		 */
> +		rte_atomic_thread_fence(__ATOMIC_ACQUIRE);
> +
>  		rxd = *rxdp;
>  		nb_hold++;
>  		rxe = &sw_ring[rx_id];
> @@ -862,6 +868,12 @@ i40e_recv_scattered_pkts(void *rx_queue,
>  			break;
>  		}
> 
> +		/**
> +		 * Use acquire fence to ensure that qword1 which includes DD
> +		 * bit is loaded before loading of other descriptor words.
> +		 */
> +		rte_atomic_thread_fence(__ATOMIC_ACQUIRE);
> +
>  		rxd = *rxdp;
>  		nb_hold++;
>  		rxe = &sw_ring[rx_id];
> --
> 2.25.1
  

Patch

diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
index 8329cbdd4e..c4cd6b6b60 100644
--- a/drivers/net/i40e/i40e_rxtx.c
+++ b/drivers/net/i40e/i40e_rxtx.c
@@ -746,6 +746,12 @@  i40e_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
 			break;
 		}
 
+		/**
+		 * Use acquire fence to ensure that qword1 which includes DD
+		 * bit is loaded before loading of other descriptor words.
+		 */
+		rte_atomic_thread_fence(__ATOMIC_ACQUIRE);
+
 		rxd = *rxdp;
 		nb_hold++;
 		rxe = &sw_ring[rx_id];
@@ -862,6 +868,12 @@  i40e_recv_scattered_pkts(void *rx_queue,
 			break;
 		}
 
+		/**
+		 * Use acquire fence to ensure that qword1 which includes DD
+		 * bit is loaded before loading of other descriptor words.
+		 */
+		rte_atomic_thread_fence(__ATOMIC_ACQUIRE);
+
 		rxd = *rxdp;
 		nb_hold++;
 		rxe = &sw_ring[rx_id];
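
Illustration

The ordering that the commit message describes can be sketched outside
the driver. The snippet below is only a minimal model, not the i40e code:
struct demo_rx_desc, DEMO_DD_BIT and demo_read_desc() are hypothetical
names, the two-word layout is an assumption made for brevity, and the GCC
builtin __atomic_thread_fence() stands in for rte_atomic_thread_fence().
It shows the word holding the DD bit being loaded first, then an acquire
fence, then the loads of the remaining descriptor words.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical two-word descriptor; NOT the real i40e layout. */
struct demo_rx_desc {
	uint64_t qword0;	/* payload word, valid only once DD is set */
	uint64_t qword1;	/* status word; bit 0 stands in for the DD bit */
};

#define DEMO_DD_BIT UINT64_C(1)

/*
 * Copy the descriptor at rxdp into *out and return true if its DD bit
 * is set; otherwise leave *out untouched and return false.
 */
static bool
demo_read_desc(struct demo_rx_desc *rxdp, struct demo_rx_desc *out)
{
	uint64_t qword1;

	assert(rxdp != NULL && out != NULL);

	/* Load the word that carries the DD bit first. */
	qword1 = __atomic_load_n(&rxdp->qword1, __ATOMIC_RELAXED);
	if (!(qword1 & DEMO_DD_BIT))
		return false;

	/*
	 * Acquire fence: the loads below may not be reordered before
	 * the load of qword1 above.
	 */
	__atomic_thread_fence(__ATOMIC_ACQUIRE);

	out->qword0 = rxdp->qword0;
	out->qword1 = qword1;
	return true;
}
```

A caller would poll demo_read_desc() on the next ring entry and consume
*out only when it returns true. Without the fence, a weakly ordered CPU
would be allowed to satisfy the qword0 load before the qword1 load, which
is the hazard the patch addresses.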