[v1,1/5] lib/eal: implement the family of rte bit operation APIs

Message ID 1571125801-45773-2-git-send-email-joyce.kong@arm.com (mailing list archive)
State Superseded, archived
Headers
Series implement common rte bit operation APIs in PMDs |

Checks

Context Check Description
ci/iol-compilation success Compile Testing PASS
ci/iol-intel-Performance success Performance Testing PASS
ci/checkpatch success coding style OK
ci/Intel-compilation fail Compilation issues
ci/iol-mellanox-Performance success Performance Testing PASS

Commit Message

Joyce Kong Oct. 15, 2019, 7:49 a.m. UTC
  There are a lot functions of bit operations scattered and
duplicated in PMDs, consolidating them into a common API
family is necessary. Furthermore, the bit operation is
mostly applied to the IO devices, so use __ATOMIC_ACQ_REL
to ensure the ordering.

Signed-off-by: Joyce Kong <joyce.kong@arm.com>
---
 lib/librte_eal/common/Makefile             |  1 +
 lib/librte_eal/common/include/rte_bitops.h | 56 ++++++++++++++++++++++++++++++
 lib/librte_eal/common/meson.build          |  1 +
 3 files changed, 58 insertions(+)
 create mode 100644 lib/librte_eal/common/include/rte_bitops.h
  

Comments

Stephen Hemminger Oct. 15, 2019, 4:53 p.m. UTC | #1
On Tue, 15 Oct 2019 15:49:57 +0800
Joyce Kong <joyce.kong@arm.com> wrote:

> +static inline void
> +rte_set_bit(unsigned int nr, unsigned long *addr)
> +{
> +	__atomic_fetch_or(addr, (1UL << nr), __ATOMIC_ACQ_REL);
> +}
> +
> +static inline void
> +rte_clear_bit(int nr, unsigned long *addr)
> +{
> +	__atomic_fetch_and(addr, ~(1UL << nr), __ATOMIC_ACQ_REL);
> +}
> +
> +static inline int
> +rte_test_bit(int nr, unsigned long *addr)
> +{
> +	int res;
> +	rte_mb();
> +	res = ((*addr) & (1UL << nr)) != 0;
> +	rte_mb();
> +
> +	return res;
> +}
> +
> +static inline int
> +rte_test_and_set_bit(int nr, unsigned long *addr)
> +{
> +	unsigned long mask = (1UL << nr);
> +
> +	return __atomic_fetch_or(addr, mask, __ATOMIC_ACQ_REL) & mask;
> +}
> +
> +static inline int
> +rte_test_and_clear_bit(int nr, unsigned long *addr)
> +{
> +	unsigned long mask = (1UL << nr);
> +
> +	return __atomic_fetch_and(addr, ~mask, __ATOMIC_ACQ_REL) & mask;
> +}

These functions need to be part of API, and have doxygen comments?
  
Jerin Jacob Oct. 16, 2019, 7:54 a.m. UTC | #2
On Tue, Oct 15, 2019 at 1:20 PM Joyce Kong <joyce.kong@arm.com> wrote:
>
> There are a lot functions of bit operations scattered and
> duplicated in PMDs, consolidating them into a common API
> family is necessary. Furthermore, the bit operation is
> mostly applied to the IO devices, so use __ATOMIC_ACQ_REL
> to ensure the ordering.
>
> Signed-off-by: Joyce Kong <joyce.kong@arm.com>
> ---
>  lib/librte_eal/common/Makefile             |  1 +
>  lib/librte_eal/common/include/rte_bitops.h | 56 ++++++++++++++++++++++++++++++
>  lib/librte_eal/common/meson.build          |  1 +
> +
> +static inline void
> +rte_set_bit(unsigned int nr, unsigned long *addr)
> +{
> +       __atomic_fetch_or(addr, (1UL << nr), __ATOMIC_ACQ_REL);
> +}

If it is specific for IO the IMO, it makes sense call the API to
rte_io_set_bit() like rte_io_rmb
and change the header file to rte_io_bitops.h.

The barries are only needed for IO operations. Explicitly is not
conveying it in API name
would call for using it for normal cases.

Other option could be to introduce, generic and IO specific bit
operations operations
separately.
  
Stephen Hemminger Oct. 16, 2019, 7:05 p.m. UTC | #3
On Tue, 15 Oct 2019 15:49:57 +0800
Joyce Kong <joyce.kong@arm.com> wrote:

> There are a lot functions of bit operations scattered and
> duplicated in PMDs, consolidating them into a common API
> family is necessary. Furthermore, the bit operation is
> mostly applied to the IO devices, so use __ATOMIC_ACQ_REL
> to ensure the ordering.
> 
> Signed-off-by: Joyce Kong <joyce.kong@arm.com>
	'include/rte_common.h',

Patchwork reports several build failures for this patch set.

/tmp/UB1604-64_K4.4.0_Clang3.8.0/x86_64-native-linuxapp-clang/62c86b2c1091439598f2f1688566632c/dpdk/x86_64-native-linuxapp-clang/lib/librte_pmd_bnx2x.a(bnx2x.o): In function `bnx2x_set_storm_rx_mode':
/tmp/UB1604-64_K4.4.0_Clang3.8.0/x86_64-native-linuxapp-clang/62c86b2c1091439598f2f1688566632c/dpdk/drivers/net/bnx2x/bnx2x.c:(.text+0x1602): undefined reference to `ret_set_bit'
  
Morten Brørup Oct. 17, 2019, 1:32 p.m. UTC | #4
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Joyce Kong
> Sent: Tuesday, October 15, 2019 9:50 AM
> 
> There are a lot functions of bit operations scattered and
> duplicated in PMDs, consolidating them into a common API
> family is necessary. Furthermore, the bit operation is
> mostly applied to the IO devices, so use __ATOMIC_ACQ_REL
> to ensure the ordering.

Good initiative.

> 
> Signed-off-by: Joyce Kong <joyce.kong@arm.com>
> ---
>  lib/librte_eal/common/Makefile             |  1 +
>  lib/librte_eal/common/include/rte_bitops.h | 56
> ++++++++++++++++++++++++++++++
>  lib/librte_eal/common/meson.build          |  1 +
>  3 files changed, 58 insertions(+)
>  create mode 100644 lib/librte_eal/common/include/rte_bitops.h
> 
> diff --git a/lib/librte_eal/common/Makefile
> b/lib/librte_eal/common/Makefile
> index a00d4fc..8586ca8 100644
> --- a/lib/librte_eal/common/Makefile
> +++ b/lib/librte_eal/common/Makefile
> @@ -18,6 +18,7 @@ INC += rte_malloc.h rte_keepalive.h rte_time.h
>  INC += rte_service.h rte_service_component.h
>  INC += rte_bitmap.h rte_vfio.h rte_hypervisor.h rte_test.h
>  INC += rte_reciprocal.h rte_fbarray.h rte_uuid.h
> +INC += rte_bitops.h
> 
>  GENERIC_INC := rte_atomic.h rte_byteorder.h rte_cycles.h
> rte_prefetch.h
>  GENERIC_INC += rte_memcpy.h rte_cpuflags.h
> diff --git a/lib/librte_eal/common/include/rte_bitops.h
> b/lib/librte_eal/common/include/rte_bitops.h
> new file mode 100644
> index 0000000..4d7c5a3
> --- /dev/null
> +++ b/lib/librte_eal/common/include/rte_bitops.h
> @@ -0,0 +1,56 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2019 Arm Corporation
> + */
> +
> +#ifndef _RTE_BITOPS_H_
> +#define _RTE_BITOPS_H_
> +
> +/**
> + * @file
> + * Bit Operations
> + *
> + * This file defines a generic API for bit operations.
> + */
> +
> +#include <stdint.h>
> +#include <rte_atomic.h>
> +
> +static inline void
> +rte_set_bit(unsigned int nr, unsigned long *addr)
> +{
> +	__atomic_fetch_or(addr, (1UL << nr), __ATOMIC_ACQ_REL);
> +}
> +
> +static inline void
> +rte_clear_bit(int nr, unsigned long *addr)
> +{
> +	__atomic_fetch_and(addr, ~(1UL << nr), __ATOMIC_ACQ_REL);
> +}
> +
> +static inline int
> +rte_test_bit(int nr, unsigned long *addr)
> +{
> +	int res;
> +	rte_mb();
> +	res = ((*addr) & (1UL << nr)) != 0;
> +	rte_mb();
> +
> +	return res;
> +}

Why does rte_test_bit() not use any of the __atomic_xx functions instead? E.g.:

static inline int
rte_test_bit(int nr, unsigned long *addr)
{
	return __atomic_load_n(addr, __ATOMIC_ACQUIRE);
}

> +
> +static inline int
> +rte_test_and_set_bit(int nr, unsigned long *addr)
> +{
> +	unsigned long mask = (1UL << nr);
> +
> +	return __atomic_fetch_or(addr, mask, __ATOMIC_ACQ_REL) & mask;
> +}
> +
> +static inline int
> +rte_test_and_clear_bit(int nr, unsigned long *addr)
> +{
> +	unsigned long mask = (1UL << nr);
> +
> +	return __atomic_fetch_and(addr, ~mask, __ATOMIC_ACQ_REL) & mask;
> +}
> +#endif /* _RTE_BITOPS_H_ */
> diff --git a/lib/librte_eal/common/meson.build
> b/lib/librte_eal/common/meson.build
> index 386577c..a277cdf 100644
> --- a/lib/librte_eal/common/meson.build
> +++ b/lib/librte_eal/common/meson.build
> @@ -52,6 +52,7 @@ common_headers = files(
>  	'include/rte_alarm.h',
>  	'include/rte_branch_prediction.h',
>  	'include/rte_bus.h',
> +	'include/rte_bitops.h',
>  	'include/rte_bitmap.h',
>  	'include/rte_class.h',
>  	'include/rte_common.h',
> --
> 2.7.4
> 

These functions use unsigned long as the type of their value, like they do in the PMDs.

However, a generic bit operations library should preferably work with multiple types, like the __atomic_xx functions. Or use an well defined uint_NN_t type. Or have individually named functions for each type size, e.g. rte_set_bit_32() and rte_set_bit_64().


Med venlig hilsen / kind regards
- Morten Brørup
  
Joyce Kong Oct. 18, 2019, 8:58 a.m. UTC | #5
Hi Morten,

> -----Original Message-----
> From: Morten Brørup <mb@smartsharesystems.com>
> Sent: Thursday, October 17, 2019 9:32 PM
> To: Joyce Kong (Arm Technology China) <Joyce.Kong@arm.com>;
> dev@dpdk.org
> Cc: nd <nd@arm.com>; thomas@monjalon.net; jerinj@marvell.com;
> ravi1.kumar@amd.com; xuanziyang2@huawei.com;
> cloud.wangxiaoyun@huawei.com; zhouguoyang@huawei.com;
> rmody@marvell.com; shshaikh@marvell.com; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; Gavin Hu (Arm Technology China)
> <Gavin.Hu@arm.com>; Stephen Hemminger
> <stephen@networkplumber.org>
> Subject: RE: [dpdk-dev] [PATCH v1 1/5] lib/eal: implement the family of rte
> bitoperation APIs
> 
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Joyce Kong
> > Sent: Tuesday, October 15, 2019 9:50 AM
> >
> > There are a lot functions of bit operations scattered and duplicated
> > in PMDs, consolidating them into a common API family is necessary.
> > Furthermore, the bit operation is mostly applied to the IO devices, so
> > use __ATOMIC_ACQ_REL to ensure the ordering.
> 
> Good initiative.
> 
> >
> > Signed-off-by: Joyce Kong <joyce.kong@arm.com>
> > ---
> >  lib/librte_eal/common/Makefile             |  1 +
> >  lib/librte_eal/common/include/rte_bitops.h | 56
> > ++++++++++++++++++++++++++++++
> >  lib/librte_eal/common/meson.build          |  1 +
> >  3 files changed, 58 insertions(+)
> >  create mode 100644 lib/librte_eal/common/include/rte_bitops.h
> >
> > diff --git a/lib/librte_eal/common/Makefile
> > b/lib/librte_eal/common/Makefile index a00d4fc..8586ca8 100644
> > --- a/lib/librte_eal/common/Makefile
> > +++ b/lib/librte_eal/common/Makefile
> > @@ -18,6 +18,7 @@ INC += rte_malloc.h rte_keepalive.h rte_time.h  INC
> > += rte_service.h rte_service_component.h  INC += rte_bitmap.h
> > rte_vfio.h rte_hypervisor.h rte_test.h  INC += rte_reciprocal.h
> > rte_fbarray.h rte_uuid.h
> > +INC += rte_bitops.h
> >
> >  GENERIC_INC := rte_atomic.h rte_byteorder.h rte_cycles.h
> > rte_prefetch.h  GENERIC_INC += rte_memcpy.h rte_cpuflags.h diff --git
> > a/lib/librte_eal/common/include/rte_bitops.h
> > b/lib/librte_eal/common/include/rte_bitops.h
> > new file mode 100644
> > index 0000000..4d7c5a3
> > --- /dev/null
> > +++ b/lib/librte_eal/common/include/rte_bitops.h
> > @@ -0,0 +1,56 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright(c) 2019 Arm Corporation
> > + */
> > +
> > +#ifndef _RTE_BITOPS_H_
> > +#define _RTE_BITOPS_H_
> > +
> > +/**
> > + * @file
> > + * Bit Operations
> > + *
> > + * This file defines a generic API for bit operations.
> > + */
> > +
> > +#include <stdint.h>
> > +#include <rte_atomic.h>
> > +
> > +static inline void
> > +rte_set_bit(unsigned int nr, unsigned long *addr) {
> > +	__atomic_fetch_or(addr, (1UL << nr), __ATOMIC_ACQ_REL); }
> > +
> > +static inline void
> > +rte_clear_bit(int nr, unsigned long *addr) {
> > +	__atomic_fetch_and(addr, ~(1UL << nr), __ATOMIC_ACQ_REL); }
> > +
> > +static inline int
> > +rte_test_bit(int nr, unsigned long *addr) {
> > +	int res;
> > +	rte_mb();
> > +	res = ((*addr) & (1UL << nr)) != 0;
> > +	rte_mb();
> > +
> > +	return res;
> > +}
> 
> Why does rte_test_bit() not use any of the __atomic_xx functions instead?
> E.g.:
> 
> static inline int
> rte_test_bit(int nr, unsigned long *addr) {
> 	return __atomic_load_n(addr, __ATOMIC_ACQUIRE); }
> 
You re right, it's better to use __atomic_xx here to keep the consistent with other APIs.

> > +
> > +static inline int
> > +rte_test_and_set_bit(int nr, unsigned long *addr) {
> > +	unsigned long mask = (1UL << nr);
> > +
> > +	return __atomic_fetch_or(addr, mask, __ATOMIC_ACQ_REL) &
> mask; }
> > +
> > +static inline int
> > +rte_test_and_clear_bit(int nr, unsigned long *addr) {
> > +	unsigned long mask = (1UL << nr);
> > +
> > +	return __atomic_fetch_and(addr, ~mask, __ATOMIC_ACQ_REL) &
> mask; }
> > +#endif /* _RTE_BITOPS_H_ */
> > diff --git a/lib/librte_eal/common/meson.build
> > b/lib/librte_eal/common/meson.build
> > index 386577c..a277cdf 100644
> > --- a/lib/librte_eal/common/meson.build
> > +++ b/lib/librte_eal/common/meson.build
> > @@ -52,6 +52,7 @@ common_headers = files(
> >  	'include/rte_alarm.h',
> >  	'include/rte_branch_prediction.h',
> >  	'include/rte_bus.h',
> > +	'include/rte_bitops.h',
> >  	'include/rte_bitmap.h',
> >  	'include/rte_class.h',
> >  	'include/rte_common.h',
> > --
> > 2.7.4
> >
> 
> These functions use unsigned long as the type of their value, like they do in
> the PMDs.
> 
> However, a generic bit operations library should preferably work with
> multiple types, like the __atomic_xx functions. Or use an well defined
> uint_NN_t type. Or have individually named functions for each type size, e.g.
> rte_set_bit_32() and rte_set_bit_64().
> 
Good suggestion! And will do this in next version.

> Med venlig hilsen / kind regards
> - Morten Brørup
  
Joyce Kong Oct. 18, 2019, 9 a.m. UTC | #6
> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Wednesday, October 16, 2019 12:54 AM
> To: Joyce Kong (Arm Technology China) <Joyce.Kong@arm.com>
> Cc: dev@dpdk.org; nd <nd@arm.com>; thomas@monjalon.net;
> jerinj@marvell.com; ravi1.kumar@amd.com; xuanziyang2@huawei.com;
> cloud.wangxiaoyun@huawei.com; zhouguoyang@huawei.com;
> rmody@marvell.com; shshaikh@marvell.com; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; Gavin Hu (Arm Technology China)
> <Gavin.Hu@arm.com>
> Subject: Re: [dpdk-dev] [PATCH v1 1/5] lib/eal: implement the family of rte
> bit operation APIs
> 
> On Tue, 15 Oct 2019 15:49:57 +0800
> Joyce Kong <joyce.kong@arm.com> wrote:
> 
> > +static inline void
> > +rte_set_bit(unsigned int nr, unsigned long *addr) {
> > +	__atomic_fetch_or(addr, (1UL << nr), __ATOMIC_ACQ_REL); }
> > +
> > +static inline void
> > +rte_clear_bit(int nr, unsigned long *addr) {
> > +	__atomic_fetch_and(addr, ~(1UL << nr), __ATOMIC_ACQ_REL); }
> > +
> > +static inline int
> > +rte_test_bit(int nr, unsigned long *addr) {
> > +	int res;
> > +	rte_mb();
> > +	res = ((*addr) & (1UL << nr)) != 0;
> > +	rte_mb();
> > +
> > +	return res;
> > +}
> > +
> > +static inline int
> > +rte_test_and_set_bit(int nr, unsigned long *addr) {
> > +	unsigned long mask = (1UL << nr);
> > +
> > +	return __atomic_fetch_or(addr, mask, __ATOMIC_ACQ_REL) &
> mask; }
> > +
> > +static inline int
> > +rte_test_and_clear_bit(int nr, unsigned long *addr) {
> > +	unsigned long mask = (1UL << nr);
> > +
> > +	return __atomic_fetch_and(addr, ~mask, __ATOMIC_ACQ_REL) &
> mask; }
> 
> These functions need to be part of API, and have doxygen comments?

Will add doxygen comments in next version.
  
Joyce Kong Oct. 18, 2019, 9:02 a.m. UTC | #7
> -----Original Message-----
> From: Jerin Jacob <jerinjacobk@gmail.com>
> Sent: Wednesday, October 16, 2019 3:54 PM
> To: Joyce Kong (Arm Technology China) <Joyce.Kong@arm.com>
> Cc: dpdk-dev <dev@dpdk.org>; nd <nd@arm.com>; thomas@monjalon.net;
> jerinj@marvell.com; ravi1.kumar@amd.com; Ziyang Xuan
> <xuanziyang2@huawei.com>; Xiaoyun Wang
> <cloud.wangxiaoyun@huawei.com>; Guoyang Zhou
> <zhouguoyang@huawei.com>; Rasesh Mody <rmody@marvell.com>;
> Shahed Shaikh <shshaikh@marvell.com>; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; Gavin Hu (Arm Technology China)
> <Gavin.Hu@arm.com>
> Subject: Re: [dpdk-dev] [PATCH v1 1/5] lib/eal: implement the family of rte
> bit operation APIs
> 
> On Tue, Oct 15, 2019 at 1:20 PM Joyce Kong <joyce.kong@arm.com> wrote:
> >
> > There are a lot functions of bit operations scattered and duplicated
> > in PMDs, consolidating them into a common API family is necessary.
> > Furthermore, the bit operation is mostly applied to the IO devices, so
> > use __ATOMIC_ACQ_REL to ensure the ordering.
> >
> > Signed-off-by: Joyce Kong <joyce.kong@arm.com>
> > ---
> >  lib/librte_eal/common/Makefile             |  1 +
> >  lib/librte_eal/common/include/rte_bitops.h | 56
> ++++++++++++++++++++++++++++++
> >  lib/librte_eal/common/meson.build          |  1 +
> > +
> > +static inline void
> > +rte_set_bit(unsigned int nr, unsigned long *addr) {
> > +       __atomic_fetch_or(addr, (1UL << nr), __ATOMIC_ACQ_REL); }
> 
> If it is specific for IO the IMO, it makes sense call the API to
> rte_io_set_bit() like rte_io_rmb
> and change the header file to rte_io_bitops.h.
> 
> The barries are only needed for IO operations. Explicitly is not conveying it in
> API name would call for using it for normal cases.
> 
> Other option could be to introduce, generic and IO specific bit operations
> operations separately.

Would do some related changes in next version.
  
Joyce Kong Oct. 23, 2019, 3:07 a.m. UTC | #8
> > > -----Original Message-----
> > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Joyce Kong
> > > Sent: Tuesday, October 15, 2019 9:50 AM
> > >
> > > There are a lot functions of bit operations scattered and duplicated
> > > in PMDs, consolidating them into a common API family is necessary.
> > > Furthermore, the bit operation is mostly applied to the IO devices,
> > > so use __ATOMIC_ACQ_REL to ensure the ordering.
> >
> > Good initiative.
> >
> > >
> > > Signed-off-by: Joyce Kong <joyce.kong@arm.com>
> > > ---
> > >  lib/librte_eal/common/Makefile             |  1 +
> > >  lib/librte_eal/common/include/rte_bitops.h | 56
> > > ++++++++++++++++++++++++++++++
> > >  lib/librte_eal/common/meson.build          |  1 +
> > >  3 files changed, 58 insertions(+)
> > >  create mode 100644 lib/librte_eal/common/include/rte_bitops.h
> > >
> > > diff --git a/lib/librte_eal/common/Makefile
> > > b/lib/librte_eal/common/Makefile index a00d4fc..8586ca8 100644
> > > --- a/lib/librte_eal/common/Makefile
> > > +++ b/lib/librte_eal/common/Makefile
> > > @@ -18,6 +18,7 @@ INC += rte_malloc.h rte_keepalive.h rte_time.h
> > > INC
> > > += rte_service.h rte_service_component.h  INC += rte_bitmap.h
> > > rte_vfio.h rte_hypervisor.h rte_test.h  INC += rte_reciprocal.h
> > > rte_fbarray.h rte_uuid.h
> > > +INC += rte_bitops.h
> > >
> > >  GENERIC_INC := rte_atomic.h rte_byteorder.h rte_cycles.h
> > > rte_prefetch.h  GENERIC_INC += rte_memcpy.h rte_cpuflags.h diff
> > > --git a/lib/librte_eal/common/include/rte_bitops.h
> > > b/lib/librte_eal/common/include/rte_bitops.h
> > > new file mode 100644
> > > index 0000000..4d7c5a3
> > > --- /dev/null
> > > +++ b/lib/librte_eal/common/include/rte_bitops.h
> > > @@ -0,0 +1,56 @@
> > > +/* SPDX-License-Identifier: BSD-3-Clause
> > > + * Copyright(c) 2019 Arm Corporation  */
> > > +
> > > +#ifndef _RTE_BITOPS_H_
> > > +#define _RTE_BITOPS_H_
> > > +
> > > +/**
> > > + * @file
> > > + * Bit Operations
> > > + *
> > > + * This file defines a generic API for bit operations.
> > > + */
> > > +
> > > +#include <stdint.h>
> > > +#include <rte_atomic.h>
> > > +
> > > +static inline void
> > > +rte_set_bit(unsigned int nr, unsigned long *addr) {
> > > +	__atomic_fetch_or(addr, (1UL << nr), __ATOMIC_ACQ_REL); }
> > > +
> > > +static inline void
> > > +rte_clear_bit(int nr, unsigned long *addr) {
> > > +	__atomic_fetch_and(addr, ~(1UL << nr), __ATOMIC_ACQ_REL); }
> > > +
> > > +static inline int
> > > +rte_test_bit(int nr, unsigned long *addr) {
> > > +	int res;
> > > +	rte_mb();
> > > +	res = ((*addr) & (1UL << nr)) != 0;
> > > +	rte_mb();
> > > +
> > > +	return res;
> > > +}
> >
> > Why does rte_test_bit() not use any of the __atomic_xx functions instead?
> > E.g.:
> >
> > static inline int
> > rte_test_bit(int nr, unsigned long *addr) {
> > 	return __atomic_load_n(addr, __ATOMIC_ACQUIRE); }
> >
> You re right, it's better to use __atomic_xx here to keep the consistent with
> other APIs.
> 
> > > +
> > > +static inline int
> > > +rte_test_and_set_bit(int nr, unsigned long *addr) {
> > > +	unsigned long mask = (1UL << nr);
> > > +
> > > +	return __atomic_fetch_or(addr, mask, __ATOMIC_ACQ_REL) &
> > mask; }
> > > +
> > > +static inline int
> > > +rte_test_and_clear_bit(int nr, unsigned long *addr) {
> > > +	unsigned long mask = (1UL << nr);
> > > +
> > > +	return __atomic_fetch_and(addr, ~mask, __ATOMIC_ACQ_REL) &
> > mask; }
> > > +#endif /* _RTE_BITOPS_H_ */
> > > diff --git a/lib/librte_eal/common/meson.build
> > > b/lib/librte_eal/common/meson.build
> > > index 386577c..a277cdf 100644
> > > --- a/lib/librte_eal/common/meson.build
> > > +++ b/lib/librte_eal/common/meson.build
> > > @@ -52,6 +52,7 @@ common_headers = files(
> > >  	'include/rte_alarm.h',
> > >  	'include/rte_branch_prediction.h',
> > >  	'include/rte_bus.h',
> > > +	'include/rte_bitops.h',
> > >  	'include/rte_bitmap.h',
> > >  	'include/rte_class.h',
> > >  	'include/rte_common.h',
> > > --
> > > 2.7.4
> > >
> >
> > These functions use unsigned long as the type of their value, like
> > they do in the PMDs.
> >
> > However, a generic bit operations library should preferably work with
> > multiple types, like the __atomic_xx functions. Or use an well defined
> > uint_NN_t type. Or have individually named functions for each type size,
> e.g.
> > rte_set_bit_32() and rte_set_bit_64().
> >
> Good suggestion! And will do this in next version.

The PMDs which use the common API now are all 32bit operation, so change
the definition to uint_32_t type instead of individually naming functions for
each type size.
  
Joyce Kong Oct. 23, 2019, 3:12 a.m. UTC | #9
> > On Tue, Oct 15, 2019 at 1:20 PM Joyce Kong <joyce.kong@arm.com> wrote:
> > >
> > > There are a lot functions of bit operations scattered and duplicated
> > > in PMDs, consolidating them into a common API family is necessary.
> > > Furthermore, the bit operation is mostly applied to the IO devices,
> > > so use __ATOMIC_ACQ_REL to ensure the ordering.
> > >
> > > Signed-off-by: Joyce Kong <joyce.kong@arm.com>
> > > ---
> > >  lib/librte_eal/common/Makefile             |  1 +
> > >  lib/librte_eal/common/include/rte_bitops.h | 56
> > ++++++++++++++++++++++++++++++
> > >  lib/librte_eal/common/meson.build          |  1 +
> > > +
> > > +static inline void
> > > +rte_set_bit(unsigned int nr, unsigned long *addr) {
> > > +       __atomic_fetch_or(addr, (1UL << nr), __ATOMIC_ACQ_REL); }
> >
> > If it is specific for IO the IMO, it makes sense call the API to
> > rte_io_set_bit() like rte_io_rmb
> > and change the header file to rte_io_bitops.h.
> >
> > The barries are only needed for IO operations. Explicitly is not
> > conveying it in API name would call for using it for normal cases.
> >
> > Other option could be to introduce, generic and IO specific bit
> > operations operations separately.
> 
> Would do some related changes in next version.

As bit operations are mostly applied to IO devices, change the header file
to rte_io_bitops.h to introduce IO specific bit operations now. And do this
change in v2.
  
Morten Brørup Oct. 23, 2019, 7:45 a.m. UTC | #10
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Joyce Kong (Arm
> Technology China)
> Sent: Wednesday, October 23, 2019 5:08 AM
> 
> > > > -----Original Message-----
> > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Joyce Kong
> > > > Sent: Tuesday, October 15, 2019 9:50 AM
> > > >
> > > > There are a lot functions of bit operations scattered and
> duplicated
> > > > in PMDs, consolidating them into a common API family is
> necessary.
> > > > Furthermore, the bit operation is mostly applied to the IO
> devices,
> > > > so use __ATOMIC_ACQ_REL to ensure the ordering.
> > >
> > > Good initiative.
> > >
> > > >
> > > > Signed-off-by: Joyce Kong <joyce.kong@arm.com>
> > > > ---
> > > >  lib/librte_eal/common/Makefile             |  1 +
> > > >  lib/librte_eal/common/include/rte_bitops.h | 56
> > > > ++++++++++++++++++++++++++++++
> > > >  lib/librte_eal/common/meson.build          |  1 +
> > > >  3 files changed, 58 insertions(+)
> > > >  create mode 100644 lib/librte_eal/common/include/rte_bitops.h
> > > >
> > > > diff --git a/lib/librte_eal/common/Makefile
> > > > b/lib/librte_eal/common/Makefile index a00d4fc..8586ca8 100644
> > > > --- a/lib/librte_eal/common/Makefile
> > > > +++ b/lib/librte_eal/common/Makefile
> > > > @@ -18,6 +18,7 @@ INC += rte_malloc.h rte_keepalive.h rte_time.h
> > > > INC
> > > > += rte_service.h rte_service_component.h  INC += rte_bitmap.h
> > > > rte_vfio.h rte_hypervisor.h rte_test.h  INC += rte_reciprocal.h
> > > > rte_fbarray.h rte_uuid.h
> > > > +INC += rte_bitops.h
> > > >
> > > >  GENERIC_INC := rte_atomic.h rte_byteorder.h rte_cycles.h
> > > > rte_prefetch.h  GENERIC_INC += rte_memcpy.h rte_cpuflags.h diff
> > > > --git a/lib/librte_eal/common/include/rte_bitops.h
> > > > b/lib/librte_eal/common/include/rte_bitops.h
> > > > new file mode 100644
> > > > index 0000000..4d7c5a3
> > > > --- /dev/null
> > > > +++ b/lib/librte_eal/common/include/rte_bitops.h
> > > > @@ -0,0 +1,56 @@
> > > > +/* SPDX-License-Identifier: BSD-3-Clause
> > > > + * Copyright(c) 2019 Arm Corporation  */
> > > > +
> > > > +#ifndef _RTE_BITOPS_H_
> > > > +#define _RTE_BITOPS_H_
> > > > +
> > > > +/**
> > > > + * @file
> > > > + * Bit Operations
> > > > + *
> > > > + * This file defines a generic API for bit operations.
> > > > + */
> > > > +
> > > > +#include <stdint.h>
> > > > +#include <rte_atomic.h>
> > > > +
> > > > +static inline void
> > > > +rte_set_bit(unsigned int nr, unsigned long *addr) {
> > > > +	__atomic_fetch_or(addr, (1UL << nr), __ATOMIC_ACQ_REL); }
> > > > +
> > > > +static inline void
> > > > +rte_clear_bit(int nr, unsigned long *addr) {
> > > > +	__atomic_fetch_and(addr, ~(1UL << nr), __ATOMIC_ACQ_REL); }
> > > > +
> > > > +static inline int
> > > > +rte_test_bit(int nr, unsigned long *addr) {
> > > > +	int res;
> > > > +	rte_mb();
> > > > +	res = ((*addr) & (1UL << nr)) != 0;
> > > > +	rte_mb();
> > > > +
> > > > +	return res;
> > > > +}
> > >
> > > Why does rte_test_bit() not use any of the __atomic_xx functions
> instead?
> > > E.g.:
> > >
> > > static inline int
> > > rte_test_bit(int nr, unsigned long *addr) {
> > > 	return __atomic_load_n(addr, __ATOMIC_ACQUIRE); }
> > >
> > You re right, it's better to use __atomic_xx here to keep the
> consistent with
> > other APIs.
> >
> > > > +
> > > > +static inline int
> > > > +rte_test_and_set_bit(int nr, unsigned long *addr) {
> > > > +	unsigned long mask = (1UL << nr);
> > > > +
> > > > +	return __atomic_fetch_or(addr, mask, __ATOMIC_ACQ_REL) &
> > > mask; }
> > > > +
> > > > +static inline int
> > > > +rte_test_and_clear_bit(int nr, unsigned long *addr) {
> > > > +	unsigned long mask = (1UL << nr);
> > > > +
> > > > +	return __atomic_fetch_and(addr, ~mask, __ATOMIC_ACQ_REL) &
> > > mask; }
> > > > +#endif /* _RTE_BITOPS_H_ */
> > > > diff --git a/lib/librte_eal/common/meson.build
> > > > b/lib/librte_eal/common/meson.build
> > > > index 386577c..a277cdf 100644
> > > > --- a/lib/librte_eal/common/meson.build
> > > > +++ b/lib/librte_eal/common/meson.build
> > > > @@ -52,6 +52,7 @@ common_headers = files(
> > > >  	'include/rte_alarm.h',
> > > >  	'include/rte_branch_prediction.h',
> > > >  	'include/rte_bus.h',
> > > > +	'include/rte_bitops.h',
> > > >  	'include/rte_bitmap.h',
> > > >  	'include/rte_class.h',
> > > >  	'include/rte_common.h',
> > > > --
> > > > 2.7.4
> > > >
> > >
> > > These functions use unsigned long as the type of their value, like
> > > they do in the PMDs.
> > >
> > > However, a generic bit operations library should preferably work
> with
> > > multiple types, like the __atomic_xx functions. Or use an well
> defined
> > > uint_NN_t type. Or have individually named functions for each type
> size,
> > e.g.
> > > rte_set_bit_32() and rte_set_bit_64().
> > >
> > Good suggestion! And will do this in next version.
> 
> The PMDs which use the common API now are all 32bit operation, so
> change
> the definition to uint_32_t type instead of individually naming
> functions for
> each type size.

Unless you are certain that all current and future I/O devices only need 32 bit, it should provide variants for different types, like the rte_atomic_xxx API.

There might also be a need to support both big and little endian byte ordering? Perhaps the CPU uses a different byte ordering than the I/O device being accessed through this API. I don't know; I'm only providing half baked feedback on this point.
  
Honnappa Nagarahalli Oct. 23, 2019, 5:30 p.m. UTC | #11
> 
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Joyce Kong (Arm
> > Technology China)
> > Sent: Wednesday, October 23, 2019 5:08 AM
> >
> > > > > -----Original Message-----
> > > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Joyce Kong
> > > > > Sent: Tuesday, October 15, 2019 9:50 AM
> > > > >
> > > > > There are a lot functions of bit operations scattered and
> > duplicated
> > > > > in PMDs, consolidating them into a common API family is
> > necessary.
> > > > > Furthermore, the bit operation is mostly applied to the IO
> > devices,
> > > > > so use __ATOMIC_ACQ_REL to ensure the ordering.
> > > >
> > > > Good initiative.
> > > >
> > > > >
> > > > > Signed-off-by: Joyce Kong <joyce.kong@arm.com>
> > > > > ---
> > > > >  lib/librte_eal/common/Makefile             |  1 +
> > > > >  lib/librte_eal/common/include/rte_bitops.h | 56
> > > > > ++++++++++++++++++++++++++++++
> > > > >  lib/librte_eal/common/meson.build          |  1 +
> > > > >  3 files changed, 58 insertions(+)  create mode 100644
> > > > > lib/librte_eal/common/include/rte_bitops.h
> > > > >
> > > > > diff --git a/lib/librte_eal/common/Makefile
> > > > > b/lib/librte_eal/common/Makefile index a00d4fc..8586ca8 100644
> > > > > --- a/lib/librte_eal/common/Makefile
> > > > > +++ b/lib/librte_eal/common/Makefile
> > > > > @@ -18,6 +18,7 @@ INC += rte_malloc.h rte_keepalive.h rte_time.h
> > > > > INC
> > > > > += rte_service.h rte_service_component.h  INC += rte_bitmap.h
> > > > > rte_vfio.h rte_hypervisor.h rte_test.h  INC += rte_reciprocal.h
> > > > > rte_fbarray.h rte_uuid.h
> > > > > +INC += rte_bitops.h
> > > > >
> > > > >  GENERIC_INC := rte_atomic.h rte_byteorder.h rte_cycles.h
> > > > > rte_prefetch.h  GENERIC_INC += rte_memcpy.h rte_cpuflags.h diff
> > > > > --git a/lib/librte_eal/common/include/rte_bitops.h
> > > > > b/lib/librte_eal/common/include/rte_bitops.h
> > > > > new file mode 100644
> > > > > index 0000000..4d7c5a3
> > > > > --- /dev/null
> > > > > +++ b/lib/librte_eal/common/include/rte_bitops.h
> > > > > @@ -0,0 +1,56 @@
> > > > > +/* SPDX-License-Identifier: BSD-3-Clause
> > > > > + * Copyright(c) 2019 Arm Corporation  */
> > > > > +
> > > > > +#ifndef _RTE_BITOPS_H_
> > > > > +#define _RTE_BITOPS_H_
> > > > > +
> > > > > +/**
> > > > > + * @file
> > > > > + * Bit Operations
> > > > > + *
> > > > > + * This file defines a generic API for bit operations.
> > > > > + */
> > > > > +
> > > > > +#include <stdint.h>
> > > > > +#include <rte_atomic.h>
> > > > > +
> > > > > +static inline void
> > > > > +rte_set_bit(unsigned int nr, unsigned long *addr) {
> > > > > +	__atomic_fetch_or(addr, (1UL << nr), __ATOMIC_ACQ_REL); }
> > > > > +
> > > > > +static inline void
> > > > > +rte_clear_bit(int nr, unsigned long *addr) {
> > > > > +	__atomic_fetch_and(addr, ~(1UL << nr),
> __ATOMIC_ACQ_REL); }
> > > > > +
> > > > > +static inline int
> > > > > +rte_test_bit(int nr, unsigned long *addr) {
> > > > > +	int res;
> > > > > +	rte_mb();
> > > > > +	res = ((*addr) & (1UL << nr)) != 0;
> > > > > +	rte_mb();
> > > > > +
> > > > > +	return res;
> > > > > +}
> > > >
> > > > Why does rte_test_bit() not use any of the __atomic_xx functions
> > instead?
> > > > E.g.:
> > > >
> > > > static inline int
> > > > rte_test_bit(int nr, unsigned long *addr) {
> > > > 	return __atomic_load_n(addr, __ATOMIC_ACQUIRE); }
> > > >
> > > You re right, it's better to use __atomic_xx here to keep the
> > consistent with
> > > other APIs.
> > >
> > > > > +
> > > > > +static inline int
> > > > > +rte_test_and_set_bit(int nr, unsigned long *addr) {
> > > > > +	unsigned long mask = (1UL << nr);
> > > > > +
> > > > > +	return __atomic_fetch_or(addr, mask, __ATOMIC_ACQ_REL) &
> > > > mask; }
> > > > > +
> > > > > +static inline int
> > > > > +rte_test_and_clear_bit(int nr, unsigned long *addr) {
> > > > > +	unsigned long mask = (1UL << nr);
> > > > > +
> > > > > +	return __atomic_fetch_and(addr, ~mask, __ATOMIC_ACQ_REL)
> &
> > > > mask; }
> > > > > +#endif /* _RTE_BITOPS_H_ */
> > > > > diff --git a/lib/librte_eal/common/meson.build
> > > > > b/lib/librte_eal/common/meson.build
> > > > > index 386577c..a277cdf 100644
> > > > > --- a/lib/librte_eal/common/meson.build
> > > > > +++ b/lib/librte_eal/common/meson.build
> > > > > @@ -52,6 +52,7 @@ common_headers = files(
> > > > >  	'include/rte_alarm.h',
> > > > >  	'include/rte_branch_prediction.h',
> > > > >  	'include/rte_bus.h',
> > > > > +	'include/rte_bitops.h',
> > > > >  	'include/rte_bitmap.h',
> > > > >  	'include/rte_class.h',
> > > > >  	'include/rte_common.h',
> > > > > --
> > > > > 2.7.4
> > > > >
> > > >
> > > > These functions use unsigned long as the type of their value, like
> > > > they do in the PMDs.
> > > >
> > > > However, a generic bit operations library should preferably work
> > with
> > > > multiple types, like the __atomic_xx functions. Or use an well
> > defined
> > > > uint_NN_t type. Or have individually named functions for each type
> > size,
> > > e.g.
> > > > rte_set_bit_32() and rte_set_bit_64().
> > > >
> > > Good suggestion! And will do this in next version.
> >
> > The PMDs which use the common API now are all 32bit operation, so
> > change the definition to uint_32_t type instead of individually naming
> > functions for each type size.
> 
> Unless you are certain that all current and future I/O devices only need 32 bit,
> it should provide variants for different types, like the rte_atomic_xxx API.
Why not do these using macros? The __atomic_xxx APIs anyway work with multiple types. Then we do not have to provide variants for all sizes.

> 
> There might also be a need to support both big and little endian byte ordering?
> Perhaps the CPU uses a different byte ordering than the I/O device being
> accessed through this API. I don't know; I'm only providing half baked feedback
> on this point.
  
Gavin Hu Oct. 24, 2019, 3:38 a.m. UTC | #12
> -----Original Message-----
> From: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
> Sent: Thursday, October 24, 2019 1:30 AM
> To: Morten Brørup <mb@smartsharesystems.com>; Joyce Kong (Arm
> Technology China) <Joyce.Kong@arm.com>; dev@dpdk.org
> Cc: nd <nd@arm.com>; thomas@monjalon.net; jerinj@marvell.com;
> ravi1.kumar@amd.com; xuanziyang2@huawei.com;
> cloud.wangxiaoyun@huawei.com; zhouguoyang@huawei.com;
> rmody@marvell.com; shshaikh@marvell.com; Gavin Hu (Arm Technology
> China) <Gavin.Hu@arm.com>; Stephen Hemminger
> <stephen@networkplumber.org>; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; nd <nd@arm.com>
> Subject: RE: [dpdk-dev] [PATCH v1 1/5] lib/eal: implement the family of rte
> bitoperation APIs
> 
> >
> > > -----Original Message-----
> > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Joyce Kong
> (Arm
> > > Technology China)
> > > Sent: Wednesday, October 23, 2019 5:08 AM
> > >
> > > > > > -----Original Message-----
> > > > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Joyce
> Kong
> > > > > > Sent: Tuesday, October 15, 2019 9:50 AM
> > > > > >
> > > > > > There are a lot functions of bit operations scattered and
> > > duplicated
> > > > > > in PMDs, consolidating them into a common API family is
> > > necessary.
> > > > > > Furthermore, the bit operation is mostly applied to the IO
> > > devices,
> > > > > > so use __ATOMIC_ACQ_REL to ensure the ordering.
> > > > >
> > > > > Good initiative.
> > > > >
> > > > > >
> > > > > > Signed-off-by: Joyce Kong <joyce.kong@arm.com>
> > > > > > ---
> > > > > >  lib/librte_eal/common/Makefile             |  1 +
> > > > > >  lib/librte_eal/common/include/rte_bitops.h | 56
> > > > > > ++++++++++++++++++++++++++++++
> > > > > >  lib/librte_eal/common/meson.build          |  1 +
> > > > > >  3 files changed, 58 insertions(+)  create mode 100644
> > > > > > lib/librte_eal/common/include/rte_bitops.h
> > > > > >
> > > > > > diff --git a/lib/librte_eal/common/Makefile
> > > > > > b/lib/librte_eal/common/Makefile index a00d4fc..8586ca8 100644
> > > > > > --- a/lib/librte_eal/common/Makefile
> > > > > > +++ b/lib/librte_eal/common/Makefile
> > > > > > @@ -18,6 +18,7 @@ INC += rte_malloc.h rte_keepalive.h
> rte_time.h
> > > > > > INC
> > > > > > += rte_service.h rte_service_component.h  INC += rte_bitmap.h
> > > > > > rte_vfio.h rte_hypervisor.h rte_test.h  INC += rte_reciprocal.h
> > > > > > rte_fbarray.h rte_uuid.h
> > > > > > +INC += rte_bitops.h
> > > > > >
> > > > > >  GENERIC_INC := rte_atomic.h rte_byteorder.h rte_cycles.h
> > > > > > rte_prefetch.h  GENERIC_INC += rte_memcpy.h rte_cpuflags.h diff
> > > > > > --git a/lib/librte_eal/common/include/rte_bitops.h
> > > > > > b/lib/librte_eal/common/include/rte_bitops.h
> > > > > > new file mode 100644
> > > > > > index 0000000..4d7c5a3
> > > > > > --- /dev/null
> > > > > > +++ b/lib/librte_eal/common/include/rte_bitops.h
> > > > > > @@ -0,0 +1,56 @@
> > > > > > +/* SPDX-License-Identifier: BSD-3-Clause
> > > > > > + * Copyright(c) 2019 Arm Corporation  */
> > > > > > +
> > > > > > +#ifndef _RTE_BITOPS_H_
> > > > > > +#define _RTE_BITOPS_H_
> > > > > > +
> > > > > > +/**
> > > > > > + * @file
> > > > > > + * Bit Operations
> > > > > > + *
> > > > > > + * This file defines a generic API for bit operations.
> > > > > > + */
> > > > > > +
> > > > > > +#include <stdint.h>
> > > > > > +#include <rte_atomic.h>
> > > > > > +
> > > > > > +static inline void
> > > > > > +rte_set_bit(unsigned int nr, unsigned long *addr) {
> > > > > > +__atomic_fetch_or(addr, (1UL << nr), __ATOMIC_ACQ_REL); }
> > > > > > +
> > > > > > +static inline void
> > > > > > +rte_clear_bit(int nr, unsigned long *addr) {
> > > > > > +__atomic_fetch_and(addr, ~(1UL << nr),
> > __ATOMIC_ACQ_REL); }
> > > > > > +
> > > > > > +static inline int
> > > > > > +rte_test_bit(int nr, unsigned long *addr) {
> > > > > > +int res;
> > > > > > +rte_mb();
> > > > > > +res = ((*addr) & (1UL << nr)) != 0;
> > > > > > +rte_mb();
> > > > > > +
> > > > > > +return res;
> > > > > > +}
> > > > >
> > > > > Why does rte_test_bit() not use any of the __atomic_xx functions
> > > instead?
> > > > > E.g.:
> > > > >
> > > > > static inline int
> > > > > rte_test_bit(int nr, unsigned long *addr) {
> > > > > return __atomic_load_n(addr, __ATOMIC_ACQUIRE); }
> > > > >
> > > > You re right, it's better to use __atomic_xx here to keep the
> > > consistent with
> > > > other APIs.
> > > >
> > > > > > +
> > > > > > +static inline int
> > > > > > +rte_test_and_set_bit(int nr, unsigned long *addr) {
> > > > > > +unsigned long mask = (1UL << nr);
> > > > > > +
> > > > > > +return __atomic_fetch_or(addr, mask, __ATOMIC_ACQ_REL) &
> > > > > mask; }
> > > > > > +
> > > > > > +static inline int
> > > > > > +rte_test_and_clear_bit(int nr, unsigned long *addr) {
> > > > > > +unsigned long mask = (1UL << nr);
> > > > > > +
> > > > > > +return __atomic_fetch_and(addr, ~mask, __ATOMIC_ACQ_REL)
> > &
> > > > > mask; }
> > > > > > +#endif /* _RTE_BITOPS_H_ */
> > > > > > diff --git a/lib/librte_eal/common/meson.build
> > > > > > b/lib/librte_eal/common/meson.build
> > > > > > index 386577c..a277cdf 100644
> > > > > > --- a/lib/librte_eal/common/meson.build
> > > > > > +++ b/lib/librte_eal/common/meson.build
> > > > > > @@ -52,6 +52,7 @@ common_headers = files(
> > > > > >  'include/rte_alarm.h',
> > > > > >  'include/rte_branch_prediction.h',
> > > > > >  'include/rte_bus.h',
> > > > > > +'include/rte_bitops.h',
> > > > > >  'include/rte_bitmap.h',
> > > > > >  'include/rte_class.h',
> > > > > >  'include/rte_common.h',
> > > > > > --
> > > > > > 2.7.4
> > > > > >
> > > > >
> > > > > These functions use unsigned long as the type of their value, like
> > > > > they do in the PMDs.
> > > > >
> > > > > However, a generic bit operations library should preferably work
> > > with
> > > > > multiple types, like the __atomic_xx functions. Or use an well
> > > defined
> > > > > uint_NN_t type. Or have individually named functions for each type
> > > size,
> > > > e.g.
> > > > > rte_set_bit_32() and rte_set_bit_64().
> > > > >
> > > > Good suggestion! And will do this in next version.
> > >
> > > The PMDs which use the common API now are all 32bit operation, so
> > > change the definition to uint_32_t type instead of individually naming
> > > functions for each type size.
> >
> > Unless you are certain that all current and future I/O devices only need 32
> bit,
> > it should provide variants for different types, like the rte_atomic_xxx API.
> Why not do these using macros? The __atomic_xxx APIs anyway work with
> multiple types. Then we do not have to provide variants for all sizes.

We really come to the point for the community to give a guideline: how to generalize APIs to support multiple-sized arguments. 
Looks like macros was disliked by the community, for readability and debuggability reasons.
Besides macros, there are an alternative: _Generic https://gcc.gnu.org/onlinedocs/gccint/GENERIC.html, but it is not supported by older gcc(<4.9), this made a hard requirement for gcc/clang.

We have to compromise over all these: code duplication, readability and debuggability.
/Gavin
> >
> > There might also be a need to support both big and little endian byte
> ordering?
> > Perhaps the CPU uses a different byte ordering than the I/O device being
> > accessed through this API. I don't know; I'm only providing half baked
> feedback
> > on this point.
>
  
Honnappa Nagarahalli Nov. 1, 2019, 1:48 p.m. UTC | #13
> >
> > >
> > > > -----Original Message-----
> > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Joyce Kong
> > (Arm
> > > > Technology China)
> > > > Sent: Wednesday, October 23, 2019 5:08 AM
> > > >
> > > > > > > -----Original Message-----
> > > > > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Joyce
> > Kong
> > > > > > > Sent: Tuesday, October 15, 2019 9:50 AM
> > > > > > >
> > > > > > > There are a lot functions of bit operations scattered and
> > > > duplicated
> > > > > > > in PMDs, consolidating them into a common API family is
> > > > necessary.
> > > > > > > Furthermore, the bit operation is mostly applied to the IO
> > > > devices,
> > > > > > > so use __ATOMIC_ACQ_REL to ensure the ordering.
> > > > > >
> > > > > > Good initiative.
> > > > > >
> > > > > > >
> > > > > > > Signed-off-by: Joyce Kong <joyce.kong@arm.com>
> > > > > > > ---
> > > > > > >  lib/librte_eal/common/Makefile             |  1 +
> > > > > > >  lib/librte_eal/common/include/rte_bitops.h | 56
> > > > > > > ++++++++++++++++++++++++++++++
> > > > > > >  lib/librte_eal/common/meson.build          |  1 +
> > > > > > >  3 files changed, 58 insertions(+)  create mode 100644
> > > > > > > lib/librte_eal/common/include/rte_bitops.h
> > > > > > >
> > > > > > > diff --git a/lib/librte_eal/common/Makefile
> > > > > > > b/lib/librte_eal/common/Makefile index a00d4fc..8586ca8
> > > > > > > 100644
> > > > > > > --- a/lib/librte_eal/common/Makefile
> > > > > > > +++ b/lib/librte_eal/common/Makefile
> > > > > > > @@ -18,6 +18,7 @@ INC += rte_malloc.h rte_keepalive.h
> > rte_time.h
> > > > > > > INC
> > > > > > > += rte_service.h rte_service_component.h  INC +=
> > > > > > > +rte_bitmap.h
> > > > > > > rte_vfio.h rte_hypervisor.h rte_test.h  INC +=
> > > > > > > rte_reciprocal.h rte_fbarray.h rte_uuid.h
> > > > > > > +INC += rte_bitops.h
> > > > > > >
> > > > > > >  GENERIC_INC := rte_atomic.h rte_byteorder.h rte_cycles.h
> > > > > > > rte_prefetch.h  GENERIC_INC += rte_memcpy.h rte_cpuflags.h
> > > > > > > diff --git a/lib/librte_eal/common/include/rte_bitops.h
> > > > > > > b/lib/librte_eal/common/include/rte_bitops.h
> > > > > > > new file mode 100644
> > > > > > > index 0000000..4d7c5a3
> > > > > > > --- /dev/null
> > > > > > > +++ b/lib/librte_eal/common/include/rte_bitops.h
> > > > > > > @@ -0,0 +1,56 @@
> > > > > > > +/* SPDX-License-Identifier: BSD-3-Clause
> > > > > > > + * Copyright(c) 2019 Arm Corporation  */
> > > > > > > +
> > > > > > > +#ifndef _RTE_BITOPS_H_
> > > > > > > +#define _RTE_BITOPS_H_
> > > > > > > +
> > > > > > > +/**
> > > > > > > + * @file
> > > > > > > + * Bit Operations
> > > > > > > + *
> > > > > > > + * This file defines a generic API for bit operations.
> > > > > > > + */
> > > > > > > +
> > > > > > > +#include <stdint.h>
> > > > > > > +#include <rte_atomic.h>
> > > > > > > +
> > > > > > > +static inline void
> > > > > > > +rte_set_bit(unsigned int nr, unsigned long *addr) {
> > > > > > > +__atomic_fetch_or(addr, (1UL << nr), __ATOMIC_ACQ_REL); }
> > > > > > > +
> > > > > > > +static inline void
> > > > > > > +rte_clear_bit(int nr, unsigned long *addr) {
> > > > > > > +__atomic_fetch_and(addr, ~(1UL << nr),
> > > __ATOMIC_ACQ_REL); }
> > > > > > > +
> > > > > > > +static inline int
> > > > > > > +rte_test_bit(int nr, unsigned long *addr) { int res;
> > > > > > > +rte_mb(); res = ((*addr) & (1UL << nr)) != 0; rte_mb();
> > > > > > > +
> > > > > > > +return res;
> > > > > > > +}
> > > > > >
> > > > > > Why does rte_test_bit() not use any of the __atomic_xx
> > > > > > functions
> > > > instead?
> > > > > > E.g.:
> > > > > >
> > > > > > static inline int
> > > > > > rte_test_bit(int nr, unsigned long *addr) { return
> > > > > > __atomic_load_n(addr, __ATOMIC_ACQUIRE); }
> > > > > >
> > > > > You re right, it's better to use __atomic_xx here to keep the
> > > > consistent with
> > > > > other APIs.
> > > > >
> > > > > > > +
> > > > > > > +static inline int
> > > > > > > +rte_test_and_set_bit(int nr, unsigned long *addr) {
> > > > > > > +unsigned long mask = (1UL << nr);
> > > > > > > +
> > > > > > > +return __atomic_fetch_or(addr, mask, __ATOMIC_ACQ_REL) &
> > > > > > mask; }
> > > > > > > +
> > > > > > > +static inline int
> > > > > > > +rte_test_and_clear_bit(int nr, unsigned long *addr) {
> > > > > > > +unsigned long mask = (1UL << nr);
> > > > > > > +
> > > > > > > +return __atomic_fetch_and(addr, ~mask, __ATOMIC_ACQ_REL)
> > > &
> > > > > > mask; }
> > > > > > > +#endif /* _RTE_BITOPS_H_ */
> > > > > > > diff --git a/lib/librte_eal/common/meson.build
> > > > > > > b/lib/librte_eal/common/meson.build
> > > > > > > index 386577c..a277cdf 100644
> > > > > > > --- a/lib/librte_eal/common/meson.build
> > > > > > > +++ b/lib/librte_eal/common/meson.build
> > > > > > > @@ -52,6 +52,7 @@ common_headers = files(
> > > > > > > 'include/rte_alarm.h',  'include/rte_branch_prediction.h',
> > > > > > >  'include/rte_bus.h',
> > > > > > > +'include/rte_bitops.h',
> > > > > > >  'include/rte_bitmap.h',
> > > > > > >  'include/rte_class.h',
> > > > > > >  'include/rte_common.h',
> > > > > > > --
> > > > > > > 2.7.4
> > > > > > >
> > > > > >
> > > > > > These functions use unsigned long as the type of their value,
> > > > > > like they do in the PMDs.
> > > > > >
> > > > > > However, a generic bit operations library should preferably
> > > > > > work
> > > > with
> > > > > > multiple types, like the __atomic_xx functions. Or use an well
> > > > defined
> > > > > > uint_NN_t type. Or have individually named functions for each
> > > > > > type
> > > > size,
> > > > > e.g.
> > > > > > rte_set_bit_32() and rte_set_bit_64().
> > > > > >
> > > > > Good suggestion! And will do this in next version.
> > > >
> > > > The PMDs which use the common API now are all 32bit operation, so
> > > > change the definition to uint_32_t type instead of individually
> > > > naming functions for each type size.
> > >
> > > Unless you are certain that all current and future I/O devices only
> > > need 32
> > bit,
> > > it should provide variants for different types, like the rte_atomic_xxx API.
> > Why not do these using macros? The __atomic_xxx APIs anyway work with
> > multiple types. Then we do not have to provide variants for all sizes.
> 
> We really come to the point for the community to give a guideline: how to
> generalize APIs to support multiple-sized arguments.
> Looks like macros was disliked by the community, for readability and
> debuggability reasons.
IMO, it should not be considered as a blanket ban on using macros. It should be considered case by case basis. For ex: I do not see a point in writing the same API for 32b/64b/128b especially when the APIs are one liners.

> Besides macros, there are an alternative: _Generic
> https://gcc.gnu.org/onlinedocs/gccint/GENERIC.html, but it is not supported
> by older gcc(<4.9), this made a hard requirement for gcc/clang.
> 
> We have to compromise over all these: code duplication, readability and
> debuggability.
> /Gavin
> > >
> > > There might also be a need to support both big and little endian
> > > byte
> > ordering?
> > > Perhaps the CPU uses a different byte ordering than the I/O device
> > > being accessed through this API. I don't know; I'm only providing
> > > half baked
> > feedback
> > > on this point.
> >
>
  
Gavin Hu Nov. 3, 2019, 3:45 p.m. UTC | #14
Hi Honnappa,
> -----Original Message-----
> From: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
> Sent: Friday, November 1, 2019 9:48 PM
> To: Gavin Hu (Arm Technology China) <Gavin.Hu@arm.com>; Morten
> Brørup <mb@smartsharesystems.com>; Joyce Kong (Arm Technology China)
> <Joyce.Kong@arm.com>; dev@dpdk.org
> Cc: nd <nd@arm.com>; thomas@monjalon.net; jerinj@marvell.com;
> ravi1.kumar@amd.com; xuanziyang2@huawei.com;
> cloud.wangxiaoyun@huawei.com; zhouguoyang@huawei.com;
> rmody@marvell.com; shshaikh@marvell.com; Stephen Hemminger
> <stephen@networkplumber.org>; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; nd <nd@arm.com>
> Subject: RE: [dpdk-dev] [PATCH v1 1/5] lib/eal: implement the family of rte
> bitoperation APIs
> 
> > >
> > > >
> > > > > -----Original Message-----
> > > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Joyce Kong
> > > (Arm
> > > > > Technology China)
> > > > > Sent: Wednesday, October 23, 2019 5:08 AM
> > > > >
> > > > > > > > -----Original Message-----
> > > > > > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Joyce
> > > Kong
> > > > > > > > Sent: Tuesday, October 15, 2019 9:50 AM
> > > > > > > >
> > > > > > > > There are a lot functions of bit operations scattered and
> > > > > duplicated
> > > > > > > > in PMDs, consolidating them into a common API family is
> > > > > necessary.
> > > > > > > > Furthermore, the bit operation is mostly applied to the IO
> > > > > devices,
> > > > > > > > so use __ATOMIC_ACQ_REL to ensure the ordering.
> > > > > > >
> > > > > > > Good initiative.
> > > > > > >
> > > > > > > >
> > > > > > > > Signed-off-by: Joyce Kong <joyce.kong@arm.com>
> > > > > > > > ---
> > > > > > > >  lib/librte_eal/common/Makefile             |  1 +
> > > > > > > >  lib/librte_eal/common/include/rte_bitops.h | 56
> > > > > > > > ++++++++++++++++++++++++++++++
> > > > > > > >  lib/librte_eal/common/meson.build          |  1 +
> > > > > > > >  3 files changed, 58 insertions(+)  create mode 100644
> > > > > > > > lib/librte_eal/common/include/rte_bitops.h
> > > > > > > >
> > > > > > > > diff --git a/lib/librte_eal/common/Makefile
> > > > > > > > b/lib/librte_eal/common/Makefile index a00d4fc..8586ca8
> > > > > > > > 100644
> > > > > > > > --- a/lib/librte_eal/common/Makefile
> > > > > > > > +++ b/lib/librte_eal/common/Makefile
> > > > > > > > @@ -18,6 +18,7 @@ INC += rte_malloc.h rte_keepalive.h
> > > rte_time.h
> > > > > > > > INC
> > > > > > > > += rte_service.h rte_service_component.h  INC +=
> > > > > > > > +rte_bitmap.h
> > > > > > > > rte_vfio.h rte_hypervisor.h rte_test.h  INC +=
> > > > > > > > rte_reciprocal.h rte_fbarray.h rte_uuid.h
> > > > > > > > +INC += rte_bitops.h
> > > > > > > >
> > > > > > > >  GENERIC_INC := rte_atomic.h rte_byteorder.h rte_cycles.h
> > > > > > > > rte_prefetch.h  GENERIC_INC += rte_memcpy.h rte_cpuflags.h
> > > > > > > > diff --git a/lib/librte_eal/common/include/rte_bitops.h
> > > > > > > > b/lib/librte_eal/common/include/rte_bitops.h
> > > > > > > > new file mode 100644
> > > > > > > > index 0000000..4d7c5a3
> > > > > > > > --- /dev/null
> > > > > > > > +++ b/lib/librte_eal/common/include/rte_bitops.h
> > > > > > > > @@ -0,0 +1,56 @@
> > > > > > > > +/* SPDX-License-Identifier: BSD-3-Clause
> > > > > > > > + * Copyright(c) 2019 Arm Corporation  */
> > > > > > > > +
> > > > > > > > +#ifndef _RTE_BITOPS_H_
> > > > > > > > +#define _RTE_BITOPS_H_
> > > > > > > > +
> > > > > > > > +/**
> > > > > > > > + * @file
> > > > > > > > + * Bit Operations
> > > > > > > > + *
> > > > > > > > + * This file defines a generic API for bit operations.
> > > > > > > > + */
> > > > > > > > +
> > > > > > > > +#include <stdint.h>
> > > > > > > > +#include <rte_atomic.h>
> > > > > > > > +
> > > > > > > > +static inline void
> > > > > > > > +rte_set_bit(unsigned int nr, unsigned long *addr) {
> > > > > > > > +__atomic_fetch_or(addr, (1UL << nr), __ATOMIC_ACQ_REL); }
> > > > > > > > +
> > > > > > > > +static inline void
> > > > > > > > +rte_clear_bit(int nr, unsigned long *addr) {
> > > > > > > > +__atomic_fetch_and(addr, ~(1UL << nr),
> > > > __ATOMIC_ACQ_REL); }
> > > > > > > > +
> > > > > > > > +static inline int
> > > > > > > > +rte_test_bit(int nr, unsigned long *addr) { int res;
> > > > > > > > +rte_mb(); res = ((*addr) & (1UL << nr)) != 0; rte_mb();
> > > > > > > > +
> > > > > > > > +return res;
> > > > > > > > +}
> > > > > > >
> > > > > > > Why does rte_test_bit() not use any of the __atomic_xx
> > > > > > > functions
> > > > > instead?
> > > > > > > E.g.:
> > > > > > >
> > > > > > > static inline int
> > > > > > > rte_test_bit(int nr, unsigned long *addr) { return
> > > > > > > __atomic_load_n(addr, __ATOMIC_ACQUIRE); }
> > > > > > >
> > > > > > You re right, it's better to use __atomic_xx here to keep the
> > > > > consistent with
> > > > > > other APIs.
> > > > > >
> > > > > > > > +
> > > > > > > > +static inline int
> > > > > > > > +rte_test_and_set_bit(int nr, unsigned long *addr) {
> > > > > > > > +unsigned long mask = (1UL << nr);
> > > > > > > > +
> > > > > > > > +return __atomic_fetch_or(addr, mask, __ATOMIC_ACQ_REL) &
> > > > > > > mask; }
> > > > > > > > +
> > > > > > > > +static inline int
> > > > > > > > +rte_test_and_clear_bit(int nr, unsigned long *addr) {
> > > > > > > > +unsigned long mask = (1UL << nr);
> > > > > > > > +
> > > > > > > > +return __atomic_fetch_and(addr, ~mask, __ATOMIC_ACQ_REL)
> > > > &
> > > > > > > mask; }
> > > > > > > > +#endif /* _RTE_BITOPS_H_ */
> > > > > > > > diff --git a/lib/librte_eal/common/meson.build
> > > > > > > > b/lib/librte_eal/common/meson.build
> > > > > > > > index 386577c..a277cdf 100644
> > > > > > > > --- a/lib/librte_eal/common/meson.build
> > > > > > > > +++ b/lib/librte_eal/common/meson.build
> > > > > > > > @@ -52,6 +52,7 @@ common_headers = files(
> > > > > > > > 'include/rte_alarm.h',  'include/rte_branch_prediction.h',
> > > > > > > >  'include/rte_bus.h',
> > > > > > > > +'include/rte_bitops.h',
> > > > > > > >  'include/rte_bitmap.h',
> > > > > > > >  'include/rte_class.h',
> > > > > > > >  'include/rte_common.h',
> > > > > > > > --
> > > > > > > > 2.7.4
> > > > > > > >
> > > > > > >
> > > > > > > These functions use unsigned long as the type of their value,
> > > > > > > like they do in the PMDs.
> > > > > > >
> > > > > > > However, a generic bit operations library should preferably
> > > > > > > work
> > > > > with
> > > > > > > multiple types, like the __atomic_xx functions. Or use an well
> > > > > defined
> > > > > > > uint_NN_t type. Or have individually named functions for each
> > > > > > > type
> > > > > size,
> > > > > > e.g.
> > > > > > > rte_set_bit_32() and rte_set_bit_64().
> > > > > > >
> > > > > > Good suggestion! And will do this in next version.
> > > > >
> > > > > The PMDs which use the common API now are all 32bit operation, so
> > > > > change the definition to uint_32_t type instead of individually
> > > > > naming functions for each type size.
> > > >
> > > > Unless you are certain that all current and future I/O devices only
> > > > need 32
> > > bit,
> > > > it should provide variants for different types, like the rte_atomic_xxx
> API.
> > > Why not do these using macros? The __atomic_xxx APIs anyway work
> with
> > > multiple types. Then we do not have to provide variants for all sizes.
> >
> > We really come to the point for the community to give a guideline: how to
> > generalize APIs to support multiple-sized arguments.
> > Looks like macros was disliked by the community, for readability and
> > debuggability reasons.
> IMO, it should not be considered as a blanket ban on using macros. It should
> be considered case by case basis. For ex: I do not see a point in writing the
> same API for 32b/64b/128b especially when the APIs are one liners.
Jerin and Morten have different opinions, they thought the MACRO based scheme only as of the last resort. 
Another argument is the API familiarity(similar to rte io read APIs).
Joyce made a new version and let's see how the community balance the duplication and other considerations. 
/Gavin
> 
> > Besides macros, there are an alternative: _Generic
> > https://gcc.gnu.org/onlinedocs/gccint/GENERIC.html, but it is not
> supported
> > by older gcc(<4.9), this made a hard requirement for gcc/clang.
> >
> > We have to compromise over all these: code duplication, readability and
> > debuggability.
> > /Gavin
> > > >
> > > > There might also be a need to support both big and little endian
> > > > byte
> > > ordering?
> > > > Perhaps the CPU uses a different byte ordering than the I/O device
> > > > being accessed through this API. I don't know; I'm only providing
> > > > half baked
> > > feedback
> > > > on this point.
> > >
> >
>
  

Patch

diff --git a/lib/librte_eal/common/Makefile b/lib/librte_eal/common/Makefile
index a00d4fc..8586ca8 100644
--- a/lib/librte_eal/common/Makefile
+++ b/lib/librte_eal/common/Makefile
@@ -18,6 +18,7 @@  INC += rte_malloc.h rte_keepalive.h rte_time.h
 INC += rte_service.h rte_service_component.h
 INC += rte_bitmap.h rte_vfio.h rte_hypervisor.h rte_test.h
 INC += rte_reciprocal.h rte_fbarray.h rte_uuid.h
+INC += rte_bitops.h
 
 GENERIC_INC := rte_atomic.h rte_byteorder.h rte_cycles.h rte_prefetch.h
 GENERIC_INC += rte_memcpy.h rte_cpuflags.h
diff --git a/lib/librte_eal/common/include/rte_bitops.h b/lib/librte_eal/common/include/rte_bitops.h
new file mode 100644
index 0000000..4d7c5a3
--- /dev/null
+++ b/lib/librte_eal/common/include/rte_bitops.h
@@ -0,0 +1,56 @@ 
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2019 Arm Corporation
+ */
+
+#ifndef _RTE_BITOPS_H_
+#define _RTE_BITOPS_H_
+
+/**
+ * @file
+ * Bit Operations
+ *
+ * This file defines a generic API for bit operations.
+ */
+
+#include <stdint.h>
+#include <rte_atomic.h>
+
+static inline void
+rte_set_bit(unsigned int nr, unsigned long *addr)
+{
+	__atomic_fetch_or(addr, (1UL << nr), __ATOMIC_ACQ_REL);
+}
+
+static inline void
+rte_clear_bit(int nr, unsigned long *addr)
+{
+	__atomic_fetch_and(addr, ~(1UL << nr), __ATOMIC_ACQ_REL);
+}
+
+static inline int
+rte_test_bit(int nr, unsigned long *addr)
+{
+	int res;
+	rte_mb();
+	res = ((*addr) & (1UL << nr)) != 0;
+	rte_mb();
+
+	return res;
+}
+
+static inline int
+rte_test_and_set_bit(int nr, unsigned long *addr)
+{
+	unsigned long mask = (1UL << nr);
+
+	return __atomic_fetch_or(addr, mask, __ATOMIC_ACQ_REL) & mask;
+}
+
+static inline int
+rte_test_and_clear_bit(int nr, unsigned long *addr)
+{
+	unsigned long mask = (1UL << nr);
+
+	return __atomic_fetch_and(addr, ~mask, __ATOMIC_ACQ_REL) & mask;
+}
+#endif /* _RTE_BITOPS_H_ */
diff --git a/lib/librte_eal/common/meson.build b/lib/librte_eal/common/meson.build
index 386577c..a277cdf 100644
--- a/lib/librte_eal/common/meson.build
+++ b/lib/librte_eal/common/meson.build
@@ -52,6 +52,7 @@  common_headers = files(
 	'include/rte_alarm.h',
 	'include/rte_branch_prediction.h',
 	'include/rte_bus.h',
+	'include/rte_bitops.h',
 	'include/rte_bitmap.h',
 	'include/rte_class.h',
 	'include/rte_common.h',