[dpdk-dev,1/2] eal/arm64: modify I/O device memory barriers
Commit Message
Instead of using system-wide 'dsb' instruction for IO barriers, 'dmb' is
sufficient and could bring better performance. Using 'dmb' with Outer
Shareable Domain option is also consistent with linux kernel.
Cc: Thomas Speier <tspeier@qti.qualcomm.com>
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Thomas Speier <tspeier@qti.qualcomm.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
---
lib/librte_eal/common/include/arch/arm/rte_atomic_64.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
Comments
-----Original Message-----
> Date: Tue, 26 Dec 2017 20:28:23 -0800
> From: Yongseok Koh <yskoh@mellanox.com>
> To: adrien.mazarguil@6wind.com, nelio.laranjeiro@6wind.com,
> jerin.jacob@caviumnetworks.com, jianbo.liu@arm.com
> CC: dev@dpdk.org, Yongseok Koh <yskoh@mellanox.com>, Thomas Speier
> <tspeier@qti.qualcomm.com>
> Subject: [PATCH 1/2] eal/arm64: modify I/O device memory barriers
> X-Mailer: git-send-email 2.11.0
>
> Instead of using system-wide 'dsb' instruction for IO barriers, 'dmb' is
> sufficient and could bring better performance. Using 'dmb' with Outer
> Shareable Domain option is also consistent with linux kernel.
>
> Cc: Thomas Speier <tspeier@qti.qualcomm.com>
>
> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> Acked-by: Thomas Speier <tspeier@qti.qualcomm.com>
> Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> ---
> lib/librte_eal/common/include/arch/arm/rte_atomic_64.h | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
> index 0b70d6209..8dcce6054 100644
> --- a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
> +++ b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
> @@ -58,11 +58,11 @@ extern "C" {
>
> #define rte_smp_rmb() dmb(ishld)
>
> -#define rte_io_mb() rte_mb()
> +#define rte_io_mb() dmb(osh)
>
> -#define rte_io_wmb() rte_wmb()
> +#define rte_io_wmb() dmb(oshst)
>
> -#define rte_io_rmb() rte_rmb()
> +#define rte_io_rmb() dmb(oshld)
>
> #ifdef __cplusplus
> }
> --
> 2.11.0
>
The 12/26/2017 20:28, Yongseok Koh wrote:
> Instead of using system-wide 'dsb' instruction for IO barriers, 'dmb' is
> sufficient and could bring better performance. Using 'dmb' with Outer
> Shareable Domain option is also consistent with linux kernel.
But in the kernel, dsb is used for I/O barriers.
https://github.com/torvalds/linux/blob/master/arch/arm64/include/asm/io.h#L109
Would you consider adding dma_*mb?
https://github.com/torvalds/linux/blob/master/arch/arm64/include/asm/barrier.h#L40
> On Jan 7, 2018, at 5:55 PM, Jianbo Liu <Jianbo.Liu@arm.com> wrote:
>
> The 12/26/2017 20:28, Yongseok Koh wrote:
>> Instead of using system-wide 'dsb' instruction for IO barriers, 'dmb' is
>> sufficient and could bring better performance. Using 'dmb' with Outer
>> Shareable Domain option is also consistent with linux kernel.
>
> But in the kernel, dsb is used for I/O barriers.
> Would you consider adding dma_*mb?
Right. I'll send out a patchset, which adds rte_dma_rmb/wmb() today.
Thanks
Yongseok
This patchset is to introduce DMA memory barriers, which could be more
efficient for coherent memory between I/O device and CPU, especially for
ARMv8.
Yongseok Koh (8):
eal: introduce DMA memory barriers
eal/x86: define DMA memory barriers
eal/ppc64: define DMA device memory barriers
eal/armv7: define DMA memory barriers
eal/arm64: define DMA memory barriers
net/mlx5: remove unnecessary memory barrier
net/mlx5: replace IO memory barrier with DMA memory barrier
net/mlx5: fix synchronization on polling Rx completions
drivers/net/mlx5/mlx5_rxq.c | 1 -
drivers/net/mlx5/mlx5_rxtx.c | 5 +-
drivers/net/mlx5/mlx5_rxtx.h | 2 +-
drivers/net/mlx5/mlx5_rxtx_vec.h | 2 +-
drivers/net/mlx5/mlx5_rxtx_vec_neon.h | 53 ++++++++++++----------
drivers/net/mlx5/mlx5_rxtx_vec_sse.h | 2 +-
.../common/include/arch/arm/rte_atomic_32.h | 4 ++
.../common/include/arch/arm/rte_atomic_64.h | 4 ++
.../common/include/arch/ppc_64/rte_atomic.h | 4 ++
.../common/include/arch/x86/rte_atomic.h | 4 ++
lib/librte_eal/common/include/generic/rte_atomic.h | 18 ++++++++
11 files changed, 70 insertions(+), 29 deletions(-)
This patchset is to introduce DMA memory barriers, which could be more
efficient for coherent memory between I/O device and CPU, especially for
ARMv8.
v3:
* add more detailed comments about the new memory barriers.
v2:
* introduce DMA memory barriers.
Yongseok Koh (8):
eal: introduce DMA memory barriers
eal/x86: define DMA memory barriers
eal/ppc64: define DMA memory barriers
eal/armv7: define DMA memory barriers
eal/arm64: define DMA memory barriers
net/mlx5: remove unnecessary memory barrier
net/mlx5: replace IO memory barrier with DMA memory barrier
net/mlx5: fix synchronization on polling Rx completions
drivers/net/mlx5/mlx5_rxq.c | 1 -
drivers/net/mlx5/mlx5_rxtx.c | 5 +-
drivers/net/mlx5/mlx5_rxtx.h | 2 +-
drivers/net/mlx5/mlx5_rxtx_vec.h | 2 +-
drivers/net/mlx5/mlx5_rxtx_vec_neon.h | 53 ++++++++++++----------
drivers/net/mlx5/mlx5_rxtx_vec_sse.h | 2 +-
.../common/include/arch/arm/rte_atomic_32.h | 4 ++
.../common/include/arch/arm/rte_atomic_64.h | 4 ++
.../common/include/arch/ppc_64/rte_atomic.h | 4 ++
.../common/include/arch/x86/rte_atomic.h | 4 ++
lib/librte_eal/common/include/generic/rte_atomic.h | 52 +++++++++++++++++++++
11 files changed, 104 insertions(+), 29 deletions(-)
This patchset is to introduce coherent I/O memory barriers, which could be more
efficient for coherent memory between I/O device and CPU, especially for ARMv8.
v4:
* rename barriers to "coherent I/O memory barrier".
* Make groups for various barriers in Doxygen doc.
v3:
* add more detailed comments about the new memory barriers.
v2:
* introduce DMA memory barriers.
Yongseok Koh (9):
eal: add Doxygen grouping for memory barriers
eal: introduce coherent I/O memory barriers
eal/x86: define coherent I/O memory barriers
eal/ppc64: define coherent I/O memory barriers
eal/armv7: define coherent I/O memory barriers
eal/arm64: define coherent I/O memory barriers
net/mlx5: remove unnecessary memory barrier
net/mlx5: replace I/O memory barrier with coherent version
net/mlx5: fix synchronization on polling Rx completions
drivers/net/mlx5/mlx5_rxq.c | 1 -
drivers/net/mlx5/mlx5_rxtx.c | 5 +-
drivers/net/mlx5/mlx5_rxtx.h | 2 +-
drivers/net/mlx5/mlx5_rxtx_vec.h | 2 +-
drivers/net/mlx5/mlx5_rxtx_vec_neon.h | 53 ++++++++++++----------
drivers/net/mlx5/mlx5_rxtx_vec_sse.h | 2 +-
.../common/include/arch/arm/rte_atomic_32.h | 4 ++
.../common/include/arch/arm/rte_atomic_64.h | 4 ++
.../common/include/arch/ppc_64/rte_atomic.h | 4 ++
.../common/include/arch/x86/rte_atomic.h | 4 ++
lib/librte_eal/common/include/generic/rte_atomic.h | 51 +++++++++++++++++++++
11 files changed, 103 insertions(+), 29 deletions(-)
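To illustrate where a coherent I/O read barrier fits when polling Rx completions, here is a minimal sketch. It is not taken from the mlx5 driver or from this patchset: the `fake_cqe` layout, the `sketch_cio_rmb()` name and `poll_completion()` are hypothetical. Mapping the barrier to `dmb oshld` on arm64 is an assumption that mirrors patch 1/2, and the fallback to `__atomic_thread_fence()` (a GCC/Clang builtin) on other targets is there only so the sketch builds elsewhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical coherent I/O read barrier. On arm64 this is assumed to be
 * a load-only dmb in the Outer Shareable domain; on other targets it falls
 * back to a compiler builtin fence so that the sketch still compiles. */
#if defined(__aarch64__)
#define sketch_cio_rmb() __asm__ volatile("dmb oshld" : : : "memory")
#else
#define sketch_cio_rmb() __atomic_thread_fence(__ATOMIC_ACQUIRE)
#endif

/* Hypothetical completion entry living in coherent memory. */
struct fake_cqe {
	volatile uint32_t byte_cnt; /* payload field, written by the device first */
	volatile uint8_t owner;     /* ownership flag, written by the device last */
};

/* Return the byte count if the entry belongs to software, or -1 if the
 * device has not completed it yet. */
static inline int64_t
poll_completion(const struct fake_cqe *cqe, uint8_t expected_owner)
{
	assert(cqe != NULL);
	if (cqe->owner != expected_owner)
		return -1;
	/* Order the ownership load before the loads of the other fields. */
	sketch_cio_rmb();
	return (int64_t)cqe->byte_cnt;
}
```

The point of the sketch is the ordering: the ownership flag is read first, and the barrier keeps the payload reads from being satisfied earlier than that check.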
25/01/2018 22:02, Yongseok Koh:
> This patchset is to introduce coherent I/O memory barriers, which could be more
> efficient for coherent memory between I/O device and CPU, especially for ARMv8.
>
> v4:
> * rename barriers to "coherent I/O memory barrier".
> * Make groups for various barriers in Doxygen doc.
>
> v3:
> * add more detailed comments about the new memory barriers.
>
> v2:
> * introduce DMA memory barriers.
>
> Yongseok Koh (9):
> eal: add Doxygen grouping for memory barriers
> eal: introduce coherent I/O memory barriers
> eal/x86: define coherent I/O memory barriers
> eal/ppc64: define coherent I/O memory barriers
> eal/armv7: define coherent I/O memory barriers
> eal/arm64: define coherent I/O memory barriers
> net/mlx5: remove unnecessary memory barrier
> net/mlx5: replace I/O memory barrier with coherent version
> net/mlx5: fix synchronization on polling Rx completions
Applied, thanks
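diff --git a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
index 0b70d6209..8dcce6054 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
@@ -58,11 +58,11 @@ extern "C" {

 #define rte_smp_rmb() dmb(ishld)

-#define rte_io_mb() rte_mb()
+#define rte_io_mb() dmb(osh)

-#define rte_io_wmb() rte_wmb()
+#define rte_io_wmb() dmb(oshst)

-#define rte_io_rmb() rte_rmb()
+#define rte_io_rmb() dmb(oshld)

 #ifdef __cplusplus
 }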
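For readers less familiar with these macros, the change can be sketched roughly as follows. This is a minimal sketch and not the DPDK header: the `sketch_` names, the `fake_nic` structure and `post_descriptor()` are hypothetical. Mapping the old I/O write barrier to `dsb st` is an assumption based on the commit message's description of the previous behaviour, and the fallback to `__atomic_thread_fence()` (a GCC/Clang builtin) on non-arm64 targets exists only so the example compiles elsewhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the dsb()/dmb() helpers used in the diff.
 * On arm64 they emit the real instruction; on other targets they fall back
 * to a full compiler builtin fence so that the sketch still builds. */
#if defined(__aarch64__)
#define sketch_dsb(opt) __asm__ volatile("dsb " #opt : : : "memory")
#define sketch_dmb(opt) __asm__ volatile("dmb " #opt : : : "memory")
#else
#define sketch_dsb(opt) __atomic_thread_fence(__ATOMIC_SEQ_CST)
#define sketch_dmb(opt) __atomic_thread_fence(__ATOMIC_SEQ_CST)
#endif

/* Before the patch (assumed): the I/O write barrier is a system-wide dsb. */
#define sketch_io_wmb_old() sketch_dsb(st)

/* After the patch: dmb restricted to the Outer Shareable domain. */
#define sketch_io_mb()  sketch_dmb(osh)
#define sketch_io_wmb() sketch_dmb(oshst)
#define sketch_io_rmb() sketch_dmb(oshld)

/* Hypothetical device model: a descriptor in memory plus a doorbell. */
struct fake_nic {
	volatile uint32_t desc;     /* descriptor the device would fetch */
	volatile uint32_t doorbell; /* write that tells the device to look */
};

/* Write the descriptor, order it before the doorbell, then ring. */
static inline void
post_descriptor(struct fake_nic *nic, uint32_t value)
{
	assert(nic != NULL);
	nic->desc = value;
	sketch_io_wmb(); /* descriptor store is ordered before the doorbell store */
	nic->doorbell = 1;
}
```

Whether `dmb` is sufficient for a particular device access depends on the memory type involved, which is the question raised in the review comments in this thread.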