[v6,0/6] Improve EAL bit operations API

Message ID 20240910083139.699291-1-mattias.ronnblom@ericsson.com (mailing list archive)

Mattias Rönnblom Sept. 10, 2024, 8:31 a.m. UTC
This patch set represents an attempt to improve and extend the RTE
bitops API, in particular for functions that operate on individual
bits.

All new functionality is exposed to the user as generic selection
macros, delegating the actual work to private (__-marked) static
inline functions. Public functions (e.g., rte_bit_set32()) would just
be bloating the API. Such generic selection macros will here be
referred to as "functions", although technically they are not.
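
As a rough illustration of that pattern (not code from the patch set),
a generic selection macro can dispatch on the pointer type to
size-specific static inline helpers. The ex_ and __ex_ names below are
hypothetical and only mimic the naming scheme described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical private helpers; illustrative names, not the patch's. */
static inline bool
__ex_bit_test32(const uint32_t *addr, unsigned int nr)
{
	return (*addr & (UINT32_C(1) << nr)) != 0;
}

static inline bool
__ex_bit_test64(const uint64_t *addr, unsigned int nr)
{
	return (*addr & (UINT64_C(1) << nr)) != 0;
}

static inline void
__ex_bit_set32(uint32_t *addr, unsigned int nr)
{
	*addr |= UINT32_C(1) << nr;
}

static inline void
__ex_bit_set64(uint64_t *addr, unsigned int nr)
{
	*addr |= UINT64_C(1) << nr;
}

/* The public "functions": the pointer type selects the helper. */
#define ex_bit_test(addr, nr) \
	_Generic((addr), \
		uint32_t *: __ex_bit_test32, \
		const uint32_t *: __ex_bit_test32, \
		uint64_t *: __ex_bit_test64, \
		const uint64_t *: __ex_bit_test64)(addr, nr)

#define ex_bit_set(addr, nr) \
	_Generic((addr), \
		uint32_t *: __ex_bit_set32, \
		uint64_t *: __ex_bit_set64)(addr, nr)

/* Small self-check exercising both word sizes. */
static int
ex_generic_demo(void)
{
	uint32_t w32 = 0;
	uint64_t w64 = 0;

	ex_bit_set(&w32, 5);
	ex_bit_set(&w64, 40);

	assert(ex_bit_test(&w32, 5));
	assert(!ex_bit_test(&w32, 6));
	assert(ex_bit_test(&w64, 40));
	assert(w64 == UINT64_C(1) << 40);

	return 0;
}
```

In this sketch, passing a pointer to any other type (e.g., a
uint16_t *) fails to compile, since no association matches.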

The legacy <rte_bitops.h> rte_bit_relaxed_*() functions are replaced
with two new families:

rte_bit_[test|set|clear|assign|flip]() which provide no memory
ordering or atomicity guarantees, but do provide the best
performance. The performance degradation resulting from the use of
volatile (e.g., forcing loads and stores to actually occur and in the
number specified) and atomic (e.g., LOCK-prefixed instructions on x86)
may be significant. rte_bit_[test|set|clear|assign|flip]() may be
used with volatile word pointers, in which case they guarantee
that the program-level accesses actually occur.
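
One way to picture the volatile support is a generic selection macro
with separate associations for plain and volatile-qualified pointers.
This is only a sketch under that assumption; the ex_ names are
hypothetical and the patch's actual implementation may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Plain variant: the compiler is free to merge or elide accesses. */
static inline void
__ex_bit_flip32(uint32_t *addr, unsigned int nr)
{
	*addr ^= UINT32_C(1) << nr;
}

/* Volatile variant: the load and the store must actually be issued. */
static inline void
__ex_v_bit_flip32(volatile uint32_t *addr, unsigned int nr)
{
	*addr ^= UINT32_C(1) << nr;
}

#define ex_bit_flip(addr, nr) \
	_Generic((addr), \
		uint32_t *: __ex_bit_flip32, \
		volatile uint32_t *: __ex_v_bit_flip32)(addr, nr)

static int
ex_volatile_demo(void)
{
	uint32_t plain = 0;
	volatile uint32_t reg = 0;

	ex_bit_flip(&plain, 2);
	ex_bit_flip(&reg, 2);
	assert(plain == UINT32_C(4));
	assert(reg == UINT32_C(4));

	/* Flipping the same bit again clears it. */
	ex_bit_flip(&reg, 2);
	assert(reg == 0);

	return 0;
}
```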

rte_bit_atomic_*() which provide atomic bit-level operations,
including the possibility of specifying memory ordering constraints
(or the lack thereof).

The atomic functions take non-_Atomic pointers, to be flexible, just
like the GCC builtins and default <rte_stdatomic.h>. The issue with
_Atomic APIs is that it may well be the case that the user wants to
perform both non-atomic and atomic operations on the same word.
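
To make the point concrete, here is a minimal sketch built directly on
the GCC/Clang __atomic builtins, which accept ordinary (non-_Atomic)
pointers. The ex_ helper names and the int memory order parameter are
assumptions made for illustration, not the API added by this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Atomically set bit nr; the caller picks the memory ordering. */
static inline void
ex_bit_atomic_set32(uint32_t *addr, unsigned int nr, int memory_order)
{
	__atomic_fetch_or(addr, UINT32_C(1) << nr, memory_order);
}

/* Atomically load the word and test bit nr. */
static inline bool
ex_bit_atomic_test32(const uint32_t *addr, unsigned int nr,
		     int memory_order)
{
	return (__atomic_load_n(addr, memory_order) &
		(UINT32_C(1) << nr)) != 0;
}

static int
ex_atomic_demo(void)
{
	/* An ordinary word, not declared _Atomic. */
	uint32_t flags = 0;

	/* Non-atomic access, e.g., during single-threaded setup. */
	flags |= UINT32_C(1) << 0;

	/* Atomic access to the very same word later on. */
	ex_bit_atomic_set32(&flags, 3, __ATOMIC_RELEASE);

	assert(ex_bit_atomic_test32(&flags, 3, __ATOMIC_ACQUIRE));
	assert(!ex_bit_atomic_test32(&flags, 1, __ATOMIC_RELAXED));
	assert(flags == UINT32_C(0x9));

	return 0;
}
```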

Having _Atomic-marked addresses would complicate supporting atomic
bit-level operations in the bitset API (proposed in a different RFC
patchset), and potentially in other APIs depending on RTE bitops for
atomic bit-level ops. Either one needs two bitset variants, one
_Atomic bitset and one non-atomic one, or the bitset code needs to
cast the non-_Atomic pointer to an _Atomic one. Having a separate
_Atomic bitset would be bloat, and would also prevent the user from
doing atomic operations against a bit set in some situations, while
operating on the same objects in a non-atomic manner in other
situations (e.g., at times when MT safety is not a concern).

Unlike rte_bit_relaxed_*(), individual bits are represented by bool,
not uint32_t or uint64_t. The author found the use of such large types
confusing, and also failed to see any performance benefits.

A set of functions, rte_bit_assign() and rte_bit_atomic_assign(), is
added to assign a particular boolean value to a particular bit.
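
The intended semantics of such an assign operation can be sketched as
follows for the non-atomic, 32-bit case. The ex_bit_assign32() name is
hypothetical; the series exposes the operation through its generic
macros instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Set bit nr if value is true, clear it otherwise. */
static inline void
ex_bit_assign32(uint32_t *addr, unsigned int nr, bool value)
{
	const uint32_t mask = UINT32_C(1) << nr;

	if (value)
		*addr |= mask;
	else
		*addr &= ~mask;
}

static int
ex_assign_demo(void)
{
	uint32_t word = UINT32_C(0xf0);

	ex_bit_assign32(&word, 0, true);   /* 0xf0 -> 0xf1 */
	assert(word == UINT32_C(0xf1));

	ex_bit_assign32(&word, 4, false);  /* 0xf1 -> 0xe1 */
	assert(word == UINT32_C(0xe1));

	/* Assigning the value a bit already has leaves the word as is. */
	ex_bit_assign32(&word, 0, true);
	assert(word == UINT32_C(0xe1));

	return 0;
}
```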

All new functions have properly documented semantics.

All new functions operate on both 32 and 64-bit words, with type
checking.

_Generic allows the user code to be a little more compact. Having a
type-generic atomic test/set/clear/assign bit API also seems
consistent with the "core" (word-size) atomics API, which is generic
(both GCC builtins and <rte_stdatomic.h> are).

The _Generic versions avoid having explicit unsigned long versions of
all functions. If you have an unsigned long, it's safe to use the
generic version (e.g., rte_bit_set()) and _Generic will pick the right
function, provided long is either 32 or 64 bit on your platform (which
it is on all DPDK-supported ABIs).

The generic rte_bit_set() is a macro, and not a function, but
nevertheless has been given a lower-case name. That's how C11 does it
(for atomics and other _Generic-based interfaces), and how
<rte_stdatomic.h> does it. Its address
can't be taken, but it does not evaluate its parameters more than
once.
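
The single-evaluation property follows from how generic selection
works: the controlling expression of _Generic is used only for its
type and is never evaluated, so each argument is evaluated exactly
once, in the call to the selected function. A small sketch with a
hypothetical ex_bit_set() macro:

```c
#include <assert.h>
#include <stdint.h>

static inline void
__ex_bit_set32(uint32_t *addr, unsigned int nr)
{
	*addr |= UINT32_C(1) << nr;
}

static inline void
__ex_bit_set64(uint64_t *addr, unsigned int nr)
{
	*addr |= UINT64_C(1) << nr;
}

/* (addr) inside _Generic is inspected for its type only. */
#define ex_bit_set(addr, nr) \
	_Generic((addr), \
		uint32_t *: __ex_bit_set32, \
		uint64_t *: __ex_bit_set64)(addr, nr)

static int
ex_single_eval_demo(void)
{
	uint32_t words[2] = { 0, 0 };
	uint32_t *p = words;

	/* The side effect p++ must happen exactly once. */
	ex_bit_set(p++, 3);

	assert(p == &words[1]);
	assert(words[0] == UINT32_C(1) << 3);
	assert(words[1] == 0);

	return 0;
}
```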

C++ doesn't support generic selection. In C++ translation units the
_Generic macros are replaced with overloaded functions, implemented by
means of a huge, complicated C macro mess.

Mattias Rönnblom (6):
  dpdk: do not force C linkage on include file dependencies
  eal: extend bit manipulation functionality
  eal: add unit tests for bit operations
  eal: add atomic bit operations
  eal: add unit tests for atomic bit access functions
  eal: extend bitops to handle volatile pointers

 app/test/packet_burst_generator.h             |   8 +-
 app/test/test_bitops.c                        | 416 +++++++++-
 app/test/virtual_pmd.h                        |   4 +-
 doc/guides/rel_notes/release_24_11.rst        |  17 +
 drivers/bus/auxiliary/bus_auxiliary_driver.h  |   8 +-
 drivers/bus/cdx/bus_cdx_driver.h              |   8 +-
 drivers/bus/dpaa/include/fsl_qman.h           |   8 +-
 drivers/bus/fslmc/bus_fslmc_driver.h          |   8 +-
 drivers/bus/pci/bus_pci_driver.h              |   8 +-
 drivers/bus/pci/rte_bus_pci.h                 |   8 +-
 drivers/bus/platform/bus_platform_driver.h    |   8 +-
 drivers/bus/vdev/bus_vdev_driver.h            |   8 +-
 drivers/bus/vmbus/bus_vmbus_driver.h          |   8 +-
 drivers/bus/vmbus/rte_bus_vmbus.h             |   8 +-
 drivers/dma/cnxk/cnxk_dma_event_dp.h          |   8 +-
 drivers/dma/ioat/ioat_hw_defs.h               |   4 +-
 drivers/event/dlb2/rte_pmd_dlb2.h             |   8 +-
 drivers/mempool/dpaa2/rte_dpaa2_mempool.h     |   6 +-
 drivers/net/avp/rte_avp_fifo.h                |   8 +-
 drivers/net/bonding/rte_eth_bond.h            |   4 +-
 drivers/net/i40e/rte_pmd_i40e.h               |   8 +-
 drivers/net/mlx5/mlx5_trace.h                 |   8 +-
 drivers/net/ring/rte_eth_ring.h               |   4 +-
 drivers/net/vhost/rte_eth_vhost.h             |   8 +-
 drivers/raw/ifpga/afu_pmd_core.h              |   8 +-
 drivers/raw/ifpga/afu_pmd_he_hssi.h           |   6 +-
 drivers/raw/ifpga/afu_pmd_he_lpbk.h           |   6 +-
 drivers/raw/ifpga/afu_pmd_he_mem.h            |   6 +-
 drivers/raw/ifpga/afu_pmd_n3000.h             |   6 +-
 drivers/raw/ifpga/rte_pmd_afu.h               |   4 +-
 drivers/raw/ifpga/rte_pmd_ifpga.h             |   4 +-
 examples/ethtool/lib/rte_ethtool.h            |   8 +-
 examples/qos_sched/main.h                     |   4 +-
 examples/vm_power_manager/channel_manager.h   |   8 +-
 lib/acl/rte_acl_osdep.h                       |   8 +-
 lib/bbdev/rte_bbdev.h                         |   8 +-
 lib/bbdev/rte_bbdev_op.h                      |   8 +-
 lib/bbdev/rte_bbdev_pmd.h                     |   8 +-
 lib/bpf/bpf_def.h                             |   8 +-
 lib/compressdev/rte_comp.h                    |   4 +-
 lib/compressdev/rte_compressdev.h             |   6 +-
 lib/compressdev/rte_compressdev_internal.h    |   8 +-
 lib/compressdev/rte_compressdev_pmd.h         |   8 +-
 lib/cryptodev/cryptodev_pmd.h                 |   8 +-
 lib/cryptodev/cryptodev_trace.h               |   8 +-
 lib/cryptodev/rte_crypto.h                    |   8 +-
 lib/cryptodev/rte_crypto_asym.h               |   8 +-
 lib/cryptodev/rte_crypto_sym.h                |   8 +-
 lib/cryptodev/rte_cryptodev.h                 |   8 +-
 lib/cryptodev/rte_cryptodev_trace_fp.h        |   4 +-
 lib/dispatcher/rte_dispatcher.h               |   8 +-
 lib/dmadev/rte_dmadev.h                       |   8 +
 lib/eal/arm/include/rte_atomic_32.h           |   4 +-
 lib/eal/arm/include/rte_atomic_64.h           |   8 +-
 lib/eal/arm/include/rte_byteorder.h           |   8 +-
 lib/eal/arm/include/rte_cpuflags_32.h         |   8 +-
 lib/eal/arm/include/rte_cpuflags_64.h         |   8 +-
 lib/eal/arm/include/rte_cycles_32.h           |   4 +-
 lib/eal/arm/include/rte_cycles_64.h           |   4 +-
 lib/eal/arm/include/rte_io.h                  |   8 +-
 lib/eal/arm/include/rte_io_64.h               |   8 +-
 lib/eal/arm/include/rte_memcpy_32.h           |   8 +-
 lib/eal/arm/include/rte_memcpy_64.h           |   8 +-
 lib/eal/arm/include/rte_pause.h               |   8 +-
 lib/eal/arm/include/rte_pause_32.h            |   6 +-
 lib/eal/arm/include/rte_pause_64.h            |   8 +-
 lib/eal/arm/include/rte_power_intrinsics.h    |   8 +-
 lib/eal/arm/include/rte_prefetch_32.h         |   8 +-
 lib/eal/arm/include/rte_prefetch_64.h         |   8 +-
 lib/eal/arm/include/rte_rwlock.h              |   4 +-
 lib/eal/arm/include/rte_spinlock.h            |   6 +-
 lib/eal/freebsd/include/rte_os.h              |   8 +-
 lib/eal/include/bus_driver.h                  |   8 +-
 lib/eal/include/dev_driver.h                  |   6 +-
 lib/eal/include/eal_trace_internal.h          |   8 +-
 lib/eal/include/generic/rte_atomic.h          |   8 +
 lib/eal/include/generic/rte_byteorder.h       |   8 +
 lib/eal/include/generic/rte_cpuflags.h        |   8 +
 lib/eal/include/generic/rte_cycles.h          |   8 +
 lib/eal/include/generic/rte_io.h              |   8 +
 lib/eal/include/generic/rte_memcpy.h          |   8 +
 lib/eal/include/generic/rte_pause.h           |   8 +
 .../include/generic/rte_power_intrinsics.h    |   8 +
 lib/eal/include/generic/rte_prefetch.h        |   8 +
 lib/eal/include/generic/rte_rwlock.h          |   8 +-
 lib/eal/include/generic/rte_spinlock.h        |   8 +
 lib/eal/include/generic/rte_vect.h            |   8 +
 lib/eal/include/rte_alarm.h                   |   4 +-
 lib/eal/include/rte_bitmap.h                  |   8 +-
 lib/eal/include/rte_bitops.h                  | 768 +++++++++++++++++-
 lib/eal/include/rte_bus.h                     |   8 +-
 lib/eal/include/rte_class.h                   |   4 +-
 lib/eal/include/rte_common.h                  |   8 +-
 lib/eal/include/rte_dev.h                     |   8 +-
 lib/eal/include/rte_devargs.h                 |   8 +-
 lib/eal/include/rte_eal_trace.h               |   4 +-
 lib/eal/include/rte_errno.h                   |   4 +-
 lib/eal/include/rte_fbarray.h                 |   8 +-
 lib/eal/include/rte_keepalive.h               |   6 +-
 lib/eal/include/rte_mcslock.h                 |   8 +-
 lib/eal/include/rte_memory.h                  |   8 +-
 lib/eal/include/rte_pci_dev_features.h        |   4 +-
 lib/eal/include/rte_pflock.h                  |   8 +-
 lib/eal/include/rte_random.h                  |   4 +-
 lib/eal/include/rte_seqcount.h                |   8 +-
 lib/eal/include/rte_seqlock.h                 |   8 +-
 lib/eal/include/rte_service.h                 |   8 +-
 lib/eal/include/rte_service_component.h       |   4 +-
 lib/eal/include/rte_stdatomic.h               |   5 +-
 lib/eal/include/rte_string_fns.h              |  17 +-
 lib/eal/include/rte_tailq.h                   |   6 +-
 lib/eal/include/rte_ticketlock.h              |   8 +-
 lib/eal/include/rte_time.h                    |   6 +-
 lib/eal/include/rte_trace.h                   |   8 +-
 lib/eal/include/rte_trace_point.h             |   8 +-
 lib/eal/include/rte_trace_point_register.h    |   8 +-
 lib/eal/include/rte_uuid.h                    |   8 +-
 lib/eal/include/rte_version.h                 |   6 +-
 lib/eal/include/rte_vfio.h                    |   8 +-
 lib/eal/linux/include/rte_os.h                |   8 +-
 lib/eal/loongarch/include/rte_atomic.h        |   6 +-
 lib/eal/loongarch/include/rte_byteorder.h     |   4 +-
 lib/eal/loongarch/include/rte_cpuflags.h      |   8 +-
 lib/eal/loongarch/include/rte_cycles.h        |   4 +-
 lib/eal/loongarch/include/rte_io.h            |   4 +-
 lib/eal/loongarch/include/rte_memcpy.h        |   4 +-
 lib/eal/loongarch/include/rte_pause.h         |   8 +-
 .../loongarch/include/rte_power_intrinsics.h  |   8 +-
 lib/eal/loongarch/include/rte_prefetch.h      |   8 +-
 lib/eal/loongarch/include/rte_rwlock.h        |   4 +-
 lib/eal/loongarch/include/rte_spinlock.h      |   6 +-
 lib/eal/ppc/include/rte_atomic.h              |   6 +-
 lib/eal/ppc/include/rte_byteorder.h           |   6 +-
 lib/eal/ppc/include/rte_cpuflags.h            |   8 +-
 lib/eal/ppc/include/rte_cycles.h              |   8 +-
 lib/eal/ppc/include/rte_io.h                  |   4 +-
 lib/eal/ppc/include/rte_memcpy.h              |   4 +-
 lib/eal/ppc/include/rte_pause.h               |   8 +-
 lib/eal/ppc/include/rte_power_intrinsics.h    |   8 +-
 lib/eal/ppc/include/rte_prefetch.h            |   8 +-
 lib/eal/ppc/include/rte_rwlock.h              |   4 +-
 lib/eal/ppc/include/rte_spinlock.h            |   8 +-
 lib/eal/riscv/include/rte_atomic.h            |   8 +-
 lib/eal/riscv/include/rte_byteorder.h         |   8 +-
 lib/eal/riscv/include/rte_cpuflags.h          |   8 +-
 lib/eal/riscv/include/rte_cycles.h            |   4 +-
 lib/eal/riscv/include/rte_io.h                |   4 +-
 lib/eal/riscv/include/rte_memcpy.h            |   4 +-
 lib/eal/riscv/include/rte_pause.h             |   8 +-
 lib/eal/riscv/include/rte_power_intrinsics.h  |   8 +-
 lib/eal/riscv/include/rte_prefetch.h          |   8 +-
 lib/eal/riscv/include/rte_rwlock.h            |   4 +-
 lib/eal/riscv/include/rte_spinlock.h          |   6 +-
 lib/eal/windows/include/pthread.h             |   6 +-
 lib/eal/windows/include/regex.h               |   8 +-
 lib/eal/windows/include/rte_windows.h         |   8 +-
 lib/eal/x86/include/rte_atomic.h              |  25 +-
 lib/eal/x86/include/rte_byteorder.h           |  16 +-
 lib/eal/x86/include/rte_cpuflags.h            |   8 +-
 lib/eal/x86/include/rte_cycles.h              |   8 +-
 lib/eal/x86/include/rte_io.h                  |   8 +-
 lib/eal/x86/include/rte_pause.h               |   7 +-
 lib/eal/x86/include/rte_power_intrinsics.h    |   8 +-
 lib/eal/x86/include/rte_prefetch.h            |   8 +-
 lib/eal/x86/include/rte_rwlock.h              |   6 +-
 lib/eal/x86/include/rte_spinlock.h            |   9 +-
 lib/ethdev/ethdev_driver.h                    |   8 +-
 lib/ethdev/ethdev_pci.h                       |   8 +-
 lib/ethdev/ethdev_trace.h                     |   8 +-
 lib/ethdev/ethdev_vdev.h                      |   8 +-
 lib/ethdev/rte_cman.h                         |   4 +-
 lib/ethdev/rte_dev_info.h                     |   4 +-
 lib/ethdev/rte_ethdev.h                       |   8 +-
 lib/ethdev/rte_ethdev_trace_fp.h              |   4 +-
 lib/eventdev/event_timer_adapter_pmd.h        |   4 +-
 lib/eventdev/eventdev_pmd.h                   |   8 +-
 lib/eventdev/eventdev_pmd_pci.h               |   8 +-
 lib/eventdev/eventdev_pmd_vdev.h              |   8 +-
 lib/eventdev/eventdev_trace.h                 |   8 +-
 lib/eventdev/rte_event_crypto_adapter.h       |   8 +-
 lib/eventdev/rte_event_eth_rx_adapter.h       |   8 +-
 lib/eventdev/rte_event_eth_tx_adapter.h       |   8 +-
 lib/eventdev/rte_event_ring.h                 |   8 +-
 lib/eventdev/rte_event_timer_adapter.h        |   8 +-
 lib/eventdev/rte_eventdev.h                   |   8 +-
 lib/eventdev/rte_eventdev_trace_fp.h          |   4 +-
 lib/graph/rte_graph_model_mcore_dispatch.h    |   8 +-
 lib/graph/rte_graph_worker.h                  |   6 +-
 lib/gso/rte_gso.h                             |   6 +-
 lib/hash/rte_fbk_hash.h                       |   8 +-
 lib/hash/rte_hash_crc.h                       |   8 +-
 lib/hash/rte_jhash.h                          |   8 +-
 lib/hash/rte_thash.h                          |   8 +-
 lib/hash/rte_thash_gfni.h                     |   8 +-
 lib/ip_frag/rte_ip_frag.h                     |   8 +-
 lib/ipsec/rte_ipsec.h                         |   8 +-
 lib/log/rte_log.h                             |   8 +-
 lib/lpm/rte_lpm.h                             |   8 +-
 lib/member/rte_member.h                       |   8 +-
 lib/member/rte_member_sketch.h                |   6 +-
 lib/member/rte_member_sketch_avx512.h         |   8 +-
 lib/member/rte_member_x86.h                   |   4 +-
 lib/member/rte_xxh64_avx512.h                 |   6 +-
 lib/mempool/mempool_trace.h                   |   8 +-
 lib/mempool/rte_mempool_trace_fp.h            |   4 +-
 lib/meter/rte_meter.h                         |   8 +-
 lib/mldev/mldev_utils.h                       |   8 +-
 lib/mldev/rte_mldev_core.h                    |   8 +-
 lib/mldev/rte_mldev_pmd.h                     |   8 +-
 lib/net/rte_ether.h                           |   8 +-
 lib/net/rte_net.h                             |   8 +-
 lib/net/rte_sctp.h                            |   8 +-
 lib/node/rte_node_eth_api.h                   |   8 +-
 lib/node/rte_node_ip4_api.h                   |   8 +-
 lib/node/rte_node_ip6_api.h                   |   6 +-
 lib/node/rte_node_udp4_input_api.h            |   8 +-
 lib/pci/rte_pci.h                             |   8 +-
 lib/pdcp/rte_pdcp.h                           |   8 +-
 lib/pipeline/rte_pipeline.h                   |   8 +-
 lib/pipeline/rte_port_in_action.h             |   8 +-
 lib/pipeline/rte_swx_ctl.h                    |   8 +-
 lib/pipeline/rte_swx_extern.h                 |   8 +-
 lib/pipeline/rte_swx_ipsec.h                  |   8 +-
 lib/pipeline/rte_swx_pipeline.h               |   8 +-
 lib/pipeline/rte_swx_pipeline_spec.h          |   8 +-
 lib/pipeline/rte_table_action.h               |   8 +-
 lib/port/rte_port.h                           |   8 +-
 lib/port/rte_port_ethdev.h                    |   8 +-
 lib/port/rte_port_eventdev.h                  |   8 +-
 lib/port/rte_port_fd.h                        |   8 +-
 lib/port/rte_port_frag.h                      |   8 +-
 lib/port/rte_port_ras.h                       |   8 +-
 lib/port/rte_port_ring.h                      |   8 +-
 lib/port/rte_port_sched.h                     |   8 +-
 lib/port/rte_port_source_sink.h               |   8 +-
 lib/port/rte_port_sym_crypto.h                |   8 +-
 lib/port/rte_swx_port.h                       |   8 +-
 lib/port/rte_swx_port_ethdev.h                |   8 +-
 lib/port/rte_swx_port_fd.h                    |   8 +-
 lib/port/rte_swx_port_ring.h                  |   8 +-
 lib/port/rte_swx_port_source_sink.h           |   8 +-
 lib/rawdev/rte_rawdev.h                       |   6 +-
 lib/rawdev/rte_rawdev_pmd.h                   |   8 +-
 lib/rcu/rte_rcu_qsbr.h                        |   8 +-
 lib/regexdev/rte_regexdev.h                   |   8 +-
 lib/ring/rte_ring.h                           |   6 +-
 lib/ring/rte_ring_core.h                      |   8 +-
 lib/ring/rte_ring_elem.h                      |   8 +-
 lib/ring/rte_ring_hts.h                       |   4 +-
 lib/ring/rte_ring_peek.h                      |   4 +-
 lib/ring/rte_ring_peek_zc.h                   |   4 +-
 lib/ring/rte_ring_rts.h                       |   4 +-
 lib/sched/rte_approx.h                        |   8 +-
 lib/sched/rte_pie.h                           |   8 +-
 lib/sched/rte_red.h                           |   8 +-
 lib/sched/rte_sched.h                         |   8 +-
 lib/sched/rte_sched_common.h                  |   6 +-
 lib/security/rte_security.h                   |   8 +-
 lib/security/rte_security_driver.h            |   6 +-
 lib/stack/rte_stack.h                         |   8 +-
 lib/table/rte_lru.h                           |  12 +-
 lib/table/rte_lru_arm64.h                     |   8 +-
 lib/table/rte_lru_x86.h                       |   8 -
 lib/table/rte_swx_hash_func.h                 |   8 +-
 lib/table/rte_swx_keycmp.h                    |   8 +-
 lib/table/rte_swx_table.h                     |   8 +-
 lib/table/rte_swx_table_em.h                  |   8 +-
 lib/table/rte_swx_table_learner.h             |   8 +-
 lib/table/rte_swx_table_selector.h            |   8 +-
 lib/table/rte_swx_table_wm.h                  |   8 +-
 lib/table/rte_table.h                         |   8 +-
 lib/table/rte_table_acl.h                     |   8 +-
 lib/table/rte_table_array.h                   |   8 +-
 lib/table/rte_table_hash.h                    |   8 +-
 lib/table/rte_table_hash_cuckoo.h             |   8 +-
 lib/table/rte_table_hash_func.h               |  12 +-
 lib/table/rte_table_lpm.h                     |   8 +-
 lib/table/rte_table_lpm_ipv6.h                |   8 +-
 lib/table/rte_table_stub.h                    |   8 +-
 lib/telemetry/rte_telemetry.h                 |   8 +-
 lib/vhost/rte_vdpa.h                          |   8 +-
 lib/vhost/rte_vhost.h                         |   8 +-
 lib/vhost/rte_vhost_async.h                   |   8 +-
 lib/vhost/rte_vhost_crypto.h                  |   4 +-
 lib/vhost/vdpa_driver.h                       |   8 +-
 285 files changed, 2264 insertions(+), 998 deletions(-)