diff mbox series

[v2,1/5] baseband/acc100: introduce PMD for ACC101

Message ID 1651083423-33202-2-git-send-email-nicolas.chautru@intel.com (mailing list archive)
State Superseded
Delegated to: akhil goyal
Headers show
Series drivers/baseband: PMD to support ACC101 device | expand

Checks

Context Check Description
ci/checkpatch success coding style OK

Commit Message

Chautru, Nicolas April 27, 2022, 6:16 p.m. UTC
Support for ACC101 as a derivative of ACC100.
Reusing existing code when possible.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 doc/guides/bbdevs/acc101.rst             | 237 +++++++++++++++++++++++++++++++
 doc/guides/bbdevs/features/acc101.ini    |  13 ++
 doc/guides/bbdevs/index.rst              |   1 +
 doc/guides/rel_notes/release_22_07.rst   |   4 +
 drivers/baseband/acc100/rte_acc100_pmd.c | 194 ++++++++++++++++++++++++-
 drivers/baseband/acc100/rte_acc100_pmd.h |   6 +
 drivers/baseband/acc100/rte_acc101_pmd.h |  61 ++++++++
 7 files changed, 511 insertions(+), 5 deletions(-)
 create mode 100644 doc/guides/bbdevs/acc101.rst
 create mode 100644 doc/guides/bbdevs/features/acc101.ini
 create mode 100644 drivers/baseband/acc100/rte_acc101_pmd.h

Comments

Tom Rix May 8, 2022, 1:02 p.m. UTC | #1
This is a good start reusing code, but I think it needs to do more reuse.

These cards should be very close and likely represent a family of cards.

On 4/27/22 11:16 AM, Nicolas Chautru wrote:
> Support for ACC101 as a derivative of ACC100.
> Reusing existing code when possible.
>
> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> ---
>   doc/guides/bbdevs/acc101.rst             | 237 +++++++++++++++++++++++++++++++
>   doc/guides/bbdevs/features/acc101.ini    |  13 ++
>   doc/guides/bbdevs/index.rst              |   1 +
>   doc/guides/rel_notes/release_22_07.rst   |   4 +
>   drivers/baseband/acc100/rte_acc100_pmd.c | 194 ++++++++++++++++++++++++-
>   drivers/baseband/acc100/rte_acc100_pmd.h |   6 +
>   drivers/baseband/acc100/rte_acc101_pmd.h |  61 ++++++++
>   7 files changed, 511 insertions(+), 5 deletions(-)
>   create mode 100644 doc/guides/bbdevs/acc101.rst
>   create mode 100644 doc/guides/bbdevs/features/acc101.ini
>   create mode 100644 drivers/baseband/acc100/rte_acc101_pmd.h
>
> diff --git a/doc/guides/bbdevs/acc101.rst b/doc/guides/bbdevs/acc101.rst
> new file mode 100644
> index 0000000..46c310b
> --- /dev/null
> +++ b/doc/guides/bbdevs/acc101.rst
> @@ -0,0 +1,237 @@
> +..  SPDX-License-Identifier: BSD-3-Clause
> +    Copyright(c) 2020 Intel Corporation
> +
> +Intel(R) ACC101 5G/4G FEC Poll Mode Driver
> +==========================================
> +
> +The BBDEV ACC101 5G/4G FEC poll mode driver (PMD) supports an
> +implementation of a VRAN FEC wireless acceleration function.
> +This device is also known as Mount Cirrus.
> +This is a follow-up to Mount Bryce (ACC100) and includes fixes, improved
> +feature set for error scenarios and performance capacity increase.

includes fixes, better error handling and increased performance.

A quick look at acc100.rst and the bulk of acc101.rst looks the same.

Consider a user of the acc100 is upgrading to acc101, they will

want to know what is the same and what has changed and test accordingly.

These two documents should be combined.

> +
> +Features
> +--------
> +
> +ACC101 5G/4G FEC PMD supports the following features:
> +
> +- LDPC Encode in the DL (5GNR)
> +- LDPC Decode in the UL (5GNR)
> +- Turbo Encode in the DL (4G)
> +- Turbo Decode in the UL (4G)
> +- 16 VFs per PF (physical device)
> +- Maximum of 128 queues per VF
> +- PCIe Gen-3 x16 Interface
> +- MSI
> +- SR-IOV
> +
> +ACC101 5G/4G FEC PMD supports the following BBDEV capabilities:
> +
> +* For the LDPC encode operation:
> +   - ``RTE_BBDEV_LDPC_CRC_24B_ATTACH`` :  set to attach CRC24B to CB(s)
> +   - ``RTE_BBDEV_LDPC_RATE_MATCH`` :  if set then do not do Rate Match bypass
> +   - ``RTE_BBDEV_LDPC_INTERLEAVER_BYPASS`` : if set then bypass interleaver
> +
> +* For the LDPC decode operation:
> +   - ``RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK`` :  check CRC24B from CB(s)
> +   - ``RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE`` :  disable early termination
> +   - ``RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP`` :  drops CRC24B bits appended while decoding
> +   - ``RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE`` :  provides an input for HARQ combining
> +   - ``RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE`` :  provides an input for HARQ combining
> +   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE`` :  HARQ memory input is internal
> +   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE`` :  HARQ memory output is internal
> +   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK`` :  loopback data to/from HARQ memory
> +   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_FILLERS`` :  HARQ memory includes the fillers bits
> +   - ``RTE_BBDEV_LDPC_DEC_SCATTER_GATHER`` :  supports scatter-gather for input/output data
> +   - ``RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION`` :  supports compression of the HARQ input/output
> +   - ``RTE_BBDEV_LDPC_LLR_COMPRESSION`` :  supports LLR input compression
> +
> +* For the turbo encode operation:
> +   - ``RTE_BBDEV_TURBO_CRC_24B_ATTACH`` :  set to attach CRC24B to CB(s)
> +   - ``RTE_BBDEV_TURBO_RATE_MATCH`` :  if set then do not do Rate Match bypass
> +   - ``RTE_BBDEV_TURBO_ENC_INTERRUPTS`` :  set for encoder dequeue interrupts
> +   - ``RTE_BBDEV_TURBO_RV_INDEX_BYPASS`` :  set to bypass RV index
> +   - ``RTE_BBDEV_TURBO_ENC_SCATTER_GATHER`` :  supports scatter-gather for input/output data
> +
> +* For the turbo decode operation:
> +   - ``RTE_BBDEV_TURBO_CRC_TYPE_24B`` :  check CRC24B from CB(s)
> +   - ``RTE_BBDEV_TURBO_SUBBLOCK_DEINTERLEAVE`` :  perform subblock de-interleave
> +   - ``RTE_BBDEV_TURBO_DEC_INTERRUPTS`` :  set for decoder dequeue interrupts
> +   - ``RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN`` :  set if negative LLR encoder i/p is supported
> +   - ``RTE_BBDEV_TURBO_POS_LLR_1_BIT_IN`` :  set if positive LLR encoder i/p is supported
> +   - ``RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP`` :  keep CRC24B bits appended while decoding
> +   - ``RTE_BBDEV_TURBO_DEC_CRC_24B_DROP`` : option to drop the code block CRC after decoding
> +   - ``RTE_BBDEV_TURBO_EARLY_TERMINATION`` :  set early termination feature
> +   - ``RTE_BBDEV_TURBO_DEC_SCATTER_GATHER`` :  supports scatter-gather for input/output data
> +   - ``RTE_BBDEV_TURBO_HALF_ITERATION_EVEN`` :  set half iteration granularity
> +
> +Installation
> +------------
> +
> +Section 3 of the DPDK manual provides instructions on installing and compiling DPDK.
> +
> +DPDK requires hugepages to be configured as detailed in section 2 of the DPDK manual.
> +The bbdev test application has been tested with a configuration 40 x 1GB hugepages. The
> +hugepage configuration of a server may be examined using:
> +
> +.. code-block:: console
> +
> +   grep Huge* /proc/meminfo
> +
> +
> +Initialization
> +--------------
> +
> +When the device first powers up, its PCI Physical Functions (PF) can be listed through this command:
> +
> +.. code-block:: console
> +
> +  sudo lspci -vd8086:57c4
> +
> +The physical and virtual functions are compatible with Linux UIO drivers:
> +``vfio`` and ``igb_uio``. However, in order to work the ACC101 5G/4G
> +FEC device first needs to be bound to one of these linux drivers through DPDK.
> +
> +
> +Bind PF UIO driver(s)
> +~~~~~~~~~~~~~~~~~~~~~
> +
> +Install the DPDK igb_uio driver, bind it with the PF PCI device ID and use
> +``lspci`` to confirm the PF device is under use by ``igb_uio`` DPDK UIO driver.
> +
> +The igb_uio driver may be bound to the PF PCI device using one of two methods:
> +
> +
> +1. PCI functions (physical or virtual, depending on the use case) can be bound to
> +the UIO driver by repeating this command for every function.
> +
> +.. code-block:: console
> +
> +  cd <dpdk-top-level-directory>
> +  insmod ./build/kmod/igb_uio.ko
> +  echo "8086 57c4" > /sys/bus/pci/drivers/igb_uio/new_id
> +  lspci -vd8086:57c4
> +
> +
> +2. Another way to bind PF with DPDK UIO driver is by using the ``dpdk-devbind.py`` tool
> +
> +.. code-block:: console
> +
> +  cd <dpdk-top-level-directory>
> +  ./usertools/dpdk-devbind.py -b igb_uio 0000:06:00.0
> +
> +where the PCI device ID (example: 0000:06:00.0) is obtained using lspci -vd8086:57c4
> +
> +
> +In a similar way the ACC101 5G/4G FEC PF may be bound with vfio-pci as any PCIe device.
> +
> +
> +Enable Virtual Functions
> +~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +Now, it should be visible in the printouts that PCI PF is under igb_uio control
> +"``Kernel driver in use: igb_uio``"
> +
> +To show the number of available VFs on the device, read ``sriov_totalvfs`` file..
> +
> +.. code-block:: console
> +
> +  cat /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/sriov_totalvfs
> +
> +  where 0000\:<b>\:<d>.<f> is the PCI device ID
> +
> +
> +To enable VFs via igb_uio, echo the number of virtual functions intended to
> +enable to ``max_vfs`` file..
> +
> +.. code-block:: console
> +
> +  echo <num-of-vfs> > /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/max_vfs
> +
> +
> +Afterwards, all VFs must be bound to appropriate UIO drivers as required, same
> +way it was done with the physical function previously.
> +
> +Enabling SR-IOV via vfio driver is pretty much the same, except that the file
> +name is different:
> +
> +.. code-block:: console
> +
> +  echo <num-of-vfs> > /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/sriov_numvfs
> +
> +
> +Configure the VFs through PF
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +The PCI virtual functions must be configured before working or getting assigned
> +to VMs/Containers. The configuration involves allocating the number of hardware
> +queues, priorities, load balance, bandwidth and other settings necessary for the
> +device to perform FEC functions.
> +
> +This configuration needs to be executed at least once after reboot or PCI FLR and can
> +be achieved by using the function ``acc101_configure()``, which sets up the
> +parameters defined in ``acc100_conf`` structure.
> +
> +Test Application
> +----------------
> +
> +BBDEV provides a test application, ``test-bbdev.py`` and range of test data for testing
> +the functionality of ACC101 5G/4G FEC encode and decode, depending on the device's
> +capabilities. The test application is located under app->test-bbdev folder and has the
> +following options:
> +
> +.. code-block:: console
> +
> +  "-p", "--testapp-path": specifies path to the bbdev test app.
> +  "-e", "--eal-params"	: EAL arguments which are passed to the test app.
> +  "-t", "--timeout"	: Timeout in seconds (default=300).
> +  "-c", "--test-cases"	: Defines test cases to run. Run all if not specified.
> +  "-v", "--test-vector"	: Test vector path (default=dpdk_path+/app/test-bbdev/test_vectors/bbdev_null.data).
> +  "-n", "--num-ops"	: Number of operations to process on device (default=32).
> +  "-b", "--burst-size"	: Operations enqueue/dequeue burst size (default=32).
> +  "-s", "--snr"		: SNR in dB used when generating LLRs for bler tests.
> +  "-s", "--iter_max"	: Number of iterations for LDPC decoder.
> +  "-l", "--num-lcores"	: Number of lcores to run (default=16).
> +  "-i", "--init-device" : Initialise PF device with default values.
> +
> +
> +To execute the test application tool using simple decode or encode data,
> +type one of the following:
> +
> +.. code-block:: console
> +
> +  ./test-bbdev.py -c validation -n 64 -b 1 -v ./ldpc_dec_default.data
> +  ./test-bbdev.py -c validation -n 64 -b 1 -v ./ldpc_enc_default.data
> +
> +
> +The test application ``test-bbdev.py``, supports the ability to configure the PF device with
> +a default set of values, if the "-i" or "- -init-device" option is included. The default values
> +are defined in test_bbdev_perf.c.
> +
> +
> +Test Vectors
> +~~~~~~~~~~~~
> +
> +In addition to the simple LDPC decoder and LDPC encoder tests, bbdev also provides
> +a range of additional tests under the test_vectors folder, which may be useful. The results
> +of these tests will depend on the ACC101 5G/4G FEC capabilities which may cause some
> +testcases to be skipped, but no failure should be reported.
> +
> +
> +Alternate Baseband Device configuration tool
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +On top of the embedded configuration feature supported in test-bbdev using "- -init-device"
> +option mentioned above, there is also a tool available to perform that device configuration
> +using a companion application.
> +The ``pf_bb_config`` application notably enables then to run bbdev-test from the VF
> +and not only limited to the PF as captured above.
> +
> +See for more details: https://github.com/intel/pf-bb-config
> +
> +Specifically for the BBDEV ACC101 PMD, the command below can be used:
> +
> +.. code-block:: console
> +
> +  ./pf_bb_config ACC101 -c acc101/acc101_config_4vf_4g5g.cfg
> +  ./test-bbdev.py -e="-c 0xff0 -a${VF_PCI_ADDR}" -c validation -l 1 -v ./ldpc_dec_default.data
> \ No newline at end of file
> diff --git a/doc/guides/bbdevs/features/acc101.ini b/doc/guides/bbdevs/features/acc101.ini
> new file mode 100644
> index 0000000..0e2c21a
> --- /dev/null
> +++ b/doc/guides/bbdevs/features/acc101.ini
> @@ -0,0 +1,13 @@
> +;
> +; Supported features of the 'acc101' bbdev driver.
> +;
> +; Refer to default.ini for the full list of available PMD features.
> +;
> +[Features]
> +Turbo Decoder (4G)     = Y
> +Turbo Encoder (4G)     = Y
> +LDPC Decoder (5G)      = Y
> +LDPC Encoder (5G)      = Y
> +LLR/HARQ Compression   = Y
> +External DDR Access    = Y
> +HW Accelerated         = Y
This is the same as acc100.ini, why do we need 2 ?
> diff --git a/doc/guides/bbdevs/index.rst b/doc/guides/bbdevs/index.rst
> index cedd706..e76883c 100644
> --- a/doc/guides/bbdevs/index.rst
> +++ b/doc/guides/bbdevs/index.rst
> @@ -14,4 +14,5 @@ Baseband Device Drivers
>       fpga_lte_fec
>       fpga_5gnr_fec
>       acc100
> +    acc101
>       la12xx
> diff --git a/doc/guides/rel_notes/release_22_07.rst b/doc/guides/rel_notes/release_22_07.rst
> index 42a5f2d..ef9906b 100644
> --- a/doc/guides/rel_notes/release_22_07.rst
> +++ b/doc/guides/rel_notes/release_22_07.rst
> @@ -55,6 +55,10 @@ New Features
>        Also, make sure to start the actual text at the margin.
>        =======================================================
>   
> +* **Added Intel ACC101 baseband PMD.**
> +
> +  * Added a new baseband PMD for Intel ACC101 device (Mount Cirrus).
> +  * See the :doc:`../bbdevs/acc101` for more details.
>   
>   Removed Items
>   -------------
> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
> index de7e4bc..fca27ef 100644
> --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> @@ -22,6 +22,7 @@
>   #include <rte_bbdev.h>
>   #include <rte_bbdev_pmd.h>
>   #include "rte_acc100_pmd.h"
> +#include "rte_acc101_pmd.h"
>   
>   #ifdef RTE_LIBRTE_BBDEV_DEBUG
>   RTE_LOG_REGISTER_DEFAULT(acc100_logtype, DEBUG);
> @@ -1286,6 +1287,12 @@
>   			RTE_BBDEV_TURBO_HALF_ITERATION_EVEN);
>   }
>   
> +static inline bool
> +is_acc100(struct acc100_queue *q)
> +{
> +	return (q->d->device_variant == ACC100_VARIANT);
> +}
> +
>   /* Fill in a frame control word for LDPC decoding. */
>   static inline void
>   acc100_fcw_ld_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_ld *fcw,
> @@ -1412,6 +1419,139 @@
>   	}
>   }
>   
> +/* Convert offset to harq index for harq_layout structure */
> +static inline uint32_t hq_index(uint32_t offset)
> +{
> +	return (offset >> ACC100_HARQ_OFFSET_SHIFT) & ACC100_HARQ_OFFSET_MASK;
> +}
> +
> +/* Fill in a frame control word for LDPC decoding for ACC101 */
> +static inline void
> +acc101_fcw_ld_fill(struct rte_bbdev_dec_op *op, struct acc100_fcw_ld *fcw,
> +		union acc100_harq_layout_data *harq_layout)
This looks extremely similar to the acc100*, why isn't this combined ?
> +{
> +	uint16_t harq_out_length, harq_in_length, ncb_p, k0_p, parity_offset;
> +	uint32_t harq_index;
> +	uint32_t l;
> +
> +	fcw->qm = op->ldpc_dec.q_m;
> +	fcw->nfiller = op->ldpc_dec.n_filler;
> +	fcw->BG = (op->ldpc_dec.basegraph - 1);
> +	fcw->Zc = op->ldpc_dec.z_c;
> +	fcw->ncb = op->ldpc_dec.n_cb;
> +	fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_dec.basegraph,
> +			op->ldpc_dec.rv_index);
> +	if (op->ldpc_dec.code_block_mode == RTE_BBDEV_CODE_BLOCK)
> +		fcw->rm_e = op->ldpc_dec.cb_params.e;
> +	else
> +		fcw->rm_e = (op->ldpc_dec.tb_params.r <
> +				op->ldpc_dec.tb_params.cab) ?
> +						op->ldpc_dec.tb_params.ea :
> +						op->ldpc_dec.tb_params.eb;
> +
> +	if (unlikely(check_bit(op->ldpc_dec.op_flags,
> +			RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE) &&
> +			(op->ldpc_dec.harq_combined_input.length == 0))) {
> +		rte_bbdev_log(WARNING, "Null HARQ input size provided");
> +		/* Disable HARQ input in that case to carry forward */
> +		op->ldpc_dec.op_flags ^= RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE;
> +	}
> +
> +	fcw->hcin_en = check_bit(op->ldpc_dec.op_flags,
> +			RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE);
> +	fcw->hcout_en = check_bit(op->ldpc_dec.op_flags,
> +			RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE);
> +	fcw->crc_select = check_bit(op->ldpc_dec.op_flags,
> +			RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK);
> +	fcw->bypass_dec = check_bit(op->ldpc_dec.op_flags,
> +			RTE_BBDEV_LDPC_DECODE_BYPASS);
> +	fcw->bypass_intlv = check_bit(op->ldpc_dec.op_flags,
> +			RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS);
> +	if (op->ldpc_dec.q_m == 1) {
> +		fcw->bypass_intlv = 1;
> +		fcw->qm = 2;
> +	}
> +	fcw->hcin_decomp_mode = check_bit(op->ldpc_dec.op_flags,
> +			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
> +	fcw->hcout_comp_mode = check_bit(op->ldpc_dec.op_flags,
> +			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
> +	fcw->llr_pack_mode = check_bit(op->ldpc_dec.op_flags,
> +			RTE_BBDEV_LDPC_LLR_COMPRESSION);
> +	harq_index = hq_index(op->ldpc_dec.harq_combined_output.offset);
> +	if (fcw->hcin_en > 0) {
> +		harq_in_length = op->ldpc_dec.harq_combined_input.length;
> +		if (fcw->hcin_decomp_mode > 0)
> +			harq_in_length = harq_in_length * 8 / 6;
> +		harq_in_length = RTE_MIN(harq_in_length, op->ldpc_dec.n_cb
> +				- op->ldpc_dec.n_filler);
> +		/* Alignment on next 64B - Already enforced from HC output */
> +		harq_in_length = RTE_ALIGN_FLOOR(harq_in_length, 64);
> +		fcw->hcin_size0 = harq_in_length;
> +		fcw->hcin_offset = 0;
> +		fcw->hcin_size1 = 0;
> +	} else {
> +		fcw->hcin_size0 = 0;
> +		fcw->hcin_offset = 0;
> +		fcw->hcin_size1 = 0;
> +	}
> +
> +	fcw->itmax = op->ldpc_dec.iter_max;
> +	fcw->itstop = check_bit(op->ldpc_dec.op_flags,
> +			RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE);
> +	fcw->synd_precoder = fcw->itstop;
> +	/*
> +	 * These are all implicitly set
> +	 * fcw->synd_post = 0;
> +	 * fcw->so_en = 0;
> +	 * fcw->so_bypass_rm = 0;
> +	 * fcw->so_bypass_intlv = 0;
> +	 * fcw->dec_convllr = 0;
> +	 * fcw->hcout_convllr = 0;
> +	 * fcw->hcout_size1 = 0;
> +	 * fcw->so_it = 0;
> +	 * fcw->hcout_offset = 0;
> +	 * fcw->negstop_th = 0;
> +	 * fcw->negstop_it = 0;
> +	 * fcw->negstop_en = 0;
> +	 * fcw->gain_i = 1;
> +	 * fcw->gain_h = 1;
> +	 */
> +	if (fcw->hcout_en > 0) {
> +		parity_offset = (op->ldpc_dec.basegraph == 1 ? 20 : 8)
> +			* op->ldpc_dec.z_c - op->ldpc_dec.n_filler;
> +		k0_p = (fcw->k0 > parity_offset) ?
> +				fcw->k0 - op->ldpc_dec.n_filler : fcw->k0;
> +		ncb_p = fcw->ncb - op->ldpc_dec.n_filler;
> +		l = RTE_MIN(k0_p + fcw->rm_e, INT16_MAX);
> +		harq_out_length = (uint16_t) fcw->hcin_size0;
> +		harq_out_length = RTE_MAX(harq_out_length, l);
> +		/* Cannot exceed the pruned Ncb circular buffer */
> +		harq_out_length = RTE_MIN(harq_out_length, ncb_p);
> +		/* Alignment on next 64B */
> +		harq_out_length = RTE_ALIGN_CEIL(harq_out_length, 64);
> +		fcw->hcout_size0 = harq_out_length;
> +		fcw->hcout_size1 = 0;
> +		fcw->hcout_offset = 0;
> +		harq_layout[harq_index].offset = fcw->hcout_offset;
> +		harq_layout[harq_index].size0 = fcw->hcout_size0;
> +	} else {
> +		fcw->hcout_size0 = 0;
> +		fcw->hcout_size1 = 0;
> +		fcw->hcout_offset = 0;
> +	}
> +}
> +
> +static inline void
> +acc10x_fcw_ld_fill(struct rte_bbdev_dec_op *op, struct acc100_fcw_ld *fcw,
> +		union acc100_harq_layout_data *harq_layout, struct acc100_queue *q)
> +{
> +	if (is_acc100(q))
consider having a function table in the private data so the call can be 
made without this if-check
> +		return acc100_fcw_ld_fill(op, fcw, harq_layout);
> +	else
> +		return acc101_fcw_ld_fill(op, fcw, harq_layout);
> +}
> +
> +
>   /**
>    * Fills descriptor with data pointers of one block type.
>    *
> @@ -2960,7 +3100,7 @@
>   		struct acc100_fcw_ld *fcw;
>   		uint32_t seg_total_left;
>   		fcw = &desc->req.fcw_ld;
> -		acc100_fcw_ld_fill(op, fcw, harq_layout);
> +		acc10x_fcw_ld_fill(op, fcw, harq_layout, q);
>   
>   		/* Special handling when overusing mbuf */
>   		if (fcw->rm_e < ACC100_MAX_E_MBUF)
> @@ -3027,7 +3167,7 @@
>   	desc = q->ring_addr + desc_idx;
>   	uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
>   	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
> -	acc100_fcw_ld_fill(op, &desc->req.fcw_ld, harq_layout);
> +	acc10x_fcw_ld_fill(op, &desc->req.fcw_ld, harq_layout, q);
>   
>   	input = op->ldpc_dec.input.data;
>   	h_output_head = h_output = op->ldpc_dec.hard_output.data;
> @@ -4139,9 +4279,17 @@
>   	dev->dequeue_ldpc_enc_ops = acc100_dequeue_ldpc_enc;
>   	dev->dequeue_ldpc_dec_ops = acc100_dequeue_ldpc_dec;
>   
> -	((struct acc100_device *) dev->data->dev_private)->pf_device =
> -			!strcmp(drv->driver.name,
> -					RTE_STR(ACC100PF_DRIVER_NAME));
> +	if ((!strcmp(drv->driver.name, RTE_STR(ACC100PF_DRIVER_NAME))) ||
> +			(!strcmp(drv->driver.name, RTE_STR(ACC100VF_DRIVER_NAME)))) {
> +		((struct acc100_device *) dev->data->dev_private)->pf_device =
> +				!strcmp(drv->driver.name, RTE_STR(ACC100PF_DRIVER_NAME));
> +		((struct acc100_device *) dev->data->dev_private)->device_variant = ACC100_VARIANT;
> +	} else {
> +		((struct acc100_device *) dev->data->dev_private)->pf_device =
> +				!strcmp(drv->driver.name, RTE_STR(ACC101PF_DRIVER_NAME));
> +		((struct acc100_device *) dev->data->dev_private)->device_variant = ACC101_VARIANT;
> +	}
> +
>   	((struct acc100_device *) dev->data->dev_private)->mmio_base =
>   			pci_dev->mem_resource[0].addr;
>   
> @@ -4251,6 +4399,42 @@ static int acc100_pci_remove(struct rte_pci_device *pci_dev)
>   RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
>   RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
>   
> +/* ACC101 PCI PF address map */
> +static struct rte_pci_id pci_id_acc101_pf_map[] = {
> +	{
> +		RTE_PCI_DEVICE(RTE_ACC101_VENDOR_ID, RTE_ACC101_PF_DEVICE_ID)
> +	},
> +	{.device_id = 0},
> +};
> +
> +/* ACC101 PCI VF address map */
> +static struct rte_pci_id pci_id_acc101_vf_map[] = {
> +	{
> +		RTE_PCI_DEVICE(RTE_ACC101_VENDOR_ID, RTE_ACC101_VF_DEVICE_ID)
> +	},
> +	{.device_id = 0},
> +};
> +
> +
> +static struct rte_pci_driver acc101_pci_pf_driver = {
> +		.probe = acc100_pci_probe,
> +		.remove = acc100_pci_remove,
> +		.id_table = pci_id_acc101_pf_map,
> +		.drv_flags = RTE_PCI_DRV_NEED_MAPPING
> +};
> +
> +static struct rte_pci_driver acc101_pci_vf_driver = {
> +		.probe = acc100_pci_probe,
> +		.remove = acc100_pci_remove,
> +		.id_table = pci_id_acc101_vf_map,
> +		.drv_flags = RTE_PCI_DRV_NEED_MAPPING
> +};
> +
> +RTE_PMD_REGISTER_PCI(ACC101PF_DRIVER_NAME, acc101_pci_pf_driver);
> +RTE_PMD_REGISTER_PCI_TABLE(ACC101PF_DRIVER_NAME, pci_id_acc101_pf_map);
> +RTE_PMD_REGISTER_PCI(ACC101VF_DRIVER_NAME, acc101_pci_vf_driver);
> +RTE_PMD_REGISTER_PCI_TABLE(ACC101VF_DRIVER_NAME, pci_id_acc101_vf_map);
> +
>   /*
>    * Workaround implementation to fix the power on status of some 5GUL engines
>    * This requires DMA permission if ported outside DPDK
> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
> index cbcece2..6438031 100644
> --- a/drivers/baseband/acc100/rte_acc100_pmd.h
> +++ b/drivers/baseband/acc100/rte_acc100_pmd.h
> @@ -22,6 +22,9 @@
>   #define rte_bbdev_log_debug(fmt, ...)
>   #endif
>   
> +#define ACC100_VARIANT 0
> +#define ACC101_VARIANT 1
> +
>   /* ACC100 PF and VF driver names */
>   #define ACC100PF_DRIVER_NAME           intel_acc100_pf
>   #define ACC100VF_DRIVER_NAME           intel_acc100_vf
> @@ -67,6 +70,8 @@
>   #define ACC100_HARQ_LAYOUT             (64*1024*1024)
>   /* Assume offset for HARQ in memory */
>   #define ACC100_HARQ_OFFSET             (32*1024)
> +#define ACC100_HARQ_OFFSET_SHIFT       15
> +#define ACC100_HARQ_OFFSET_MASK        0x7ffffff
>   /* Mask used to calculate an index in an Info Ring array (not a byte offset) */
>   #define ACC100_INFO_RING_MASK          (ACC100_INFO_RING_NUM_ENTRIES-1)
>   /* Number of Virtual Functions ACC100 supports */
> @@ -590,6 +595,7 @@ struct acc100_device {
>   	uint16_t q_assigned_bit_map[ACC100_NUM_QGRPS];
>   	bool pf_device; /**< True if this is a PF ACC100 device */
>   	bool configured; /**< True if this ACC100 device is configured */
> +	uint16_t device_variant;  /**< Device variant */
this is not needed, check the pci id
>   };
>   
>   /**
> diff --git a/drivers/baseband/acc100/rte_acc101_pmd.h b/drivers/baseband/acc100/rte_acc101_pmd.h
> new file mode 100644
> index 0000000..efab400
> --- /dev/null
> +++ b/drivers/baseband/acc100/rte_acc101_pmd.h

New files need license, copyrights.

This file looks very similar to rte_acc100_pmd.h

The common parts should be in only one file, maybe a rte_acc10x_pmd.h

> @@ -0,0 +1,61 @@
> +/* ACC101 PF and VF driver names */
> +#define ACC101PF_DRIVER_NAME           intel_acc101_pf
> +#define ACC101VF_DRIVER_NAME           intel_acc101_vf

this maybe changes to intel_acc10x_pr/vf ?

Tom

> +
> +/* ACC101 PCI vendor & device IDs */
> +#define RTE_ACC101_VENDOR_ID           (0x8086)
> +#define RTE_ACC101_PF_DEVICE_ID        (0x57c4)
> +#define RTE_ACC101_VF_DEVICE_ID        (0x57c5)
> +
> +/* Define as 1 to use only a single FEC engine */
> +#ifndef RTE_ACC101_SINGLE_FEC
> +#define RTE_ACC101_SINGLE_FEC 0
> +#endif
> +
> +/* Values used in writing to the registers */
> +#define ACC101_REG_IRQ_EN_ALL          0x1FF83FF  /* Enable all interrupts */
> +
> +/* Number of Virtual Functions ACC101 supports */
> +#define ACC101_NUM_VFS                  16
> +#define ACC101_NUM_QGRPS                8
> +#define ACC101_NUM_AQS                  16
> +/* All ACC101 Registers alignment are 32bits = 4B */
> +#define ACC101_BYTES_IN_WORD                 4
> +
> +#define ACC101_GRP_ID_SHIFT    10 /* Queue Index Hierarchy */
> +#define ACC101_VF_ID_SHIFT     4  /* Queue Index Hierarchy */
> +#define ACC101_VF_OFFSET_QOS   16 /* offset in Memory specific to QoS Mon */
> +#define ACC101_TMPL_PRI_0      0x03020100
> +#define ACC101_TMPL_PRI_1      0x07060504
> +#define ACC101_TMPL_PRI_2      0x0b0a0908
> +#define ACC101_TMPL_PRI_3      0x0f0e0d0c
> +#define ACC101_WORDS_IN_ARAM_SIZE (128 * 1024 / 4)
> +
> +#define ACC101_NUM_TMPL       32
> +/* Mapping of signals for the available engines */
> +#define ACC101_SIG_UL_5G      0
> +#define ACC101_SIG_UL_5G_LAST 8
> +#define ACC101_SIG_DL_5G      13
> +#define ACC101_SIG_DL_5G_LAST 15
> +#define ACC101_SIG_UL_4G      16
> +#define ACC101_SIG_UL_4G_LAST 19
> +#define ACC101_SIG_DL_4G      27
> +#define ACC101_SIG_DL_4G_LAST 31
> +#define ACC101_NUM_ACCS       5
> +#define ACC101_PF_VAL         2
> +
> +/* ACC101 Configuration */
> +#define ACC101_CFG_DMA_ERROR    0x3D7
> +#define ACC101_CFG_AXI_CACHE    0x11
> +#define ACC101_CFG_QMGR_HI_P    0x0F0F
> +#define ACC101_CFG_PCI_AXI      0xC003
> +#define ACC101_CFG_PCI_BRIDGE   0x40006033
> +#define ACC101_ENGINE_OFFSET    0x1000
> +#define ACC101_LONG_WAIT        1000
> +#define ACC101_GPEX_AXIMAP_NUM  17
> +#define ACC101_CLOCK_GATING_EN  0x30000
> +#define ACC101_DMA_INBOUND      0x104
> +/* DDR Size per VF - 512MB by default
> + * Can be increased up to 4 GB with single PF/VF
> + */
> +#define ACC101_HARQ_DDR         (512 * 1)
Chautru, Nicolas May 9, 2022, 9:23 p.m. UTC | #2
Hi Tom, 

> -----Original Message-----
> From: Tom Rix <trix@redhat.com>
> Sent: Sunday, May 8, 2022 6:03 AM
> To: Chautru, Nicolas <nicolas.chautru@intel.com>; dev@dpdk.org;
> gakhil@marvell.com
> Cc: thomas@monjalon.net; Kinsella, Ray <ray.kinsella@intel.com>; Richardson,
> Bruce <bruce.richardson@intel.com>; hemant.agrawal@nxp.com; Zhang,
> Mingshan <mingshan.zhang@intel.com>; david.marchand@redhat.com
> Subject: Re: [PATCH v2 1/5] baseband/acc100: introduce PMD for ACC101
> 
> This is a good start reusing code, but I think it needs to do more reuse.
> 
> These cards should be very close and likely represent a family of cards.
> 
> On 4/27/22 11:16 AM, Nicolas Chautru wrote:
> > Support for ACC101 as a derivative of ACC100.
> > Reusing existing code when possible.
> >
> > Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> > ---
> >   doc/guides/bbdevs/acc101.rst             | 237
> +++++++++++++++++++++++++++++++
> >   doc/guides/bbdevs/features/acc101.ini    |  13 ++
> >   doc/guides/bbdevs/index.rst              |   1 +
> >   doc/guides/rel_notes/release_22_07.rst   |   4 +
> >   drivers/baseband/acc100/rte_acc100_pmd.c | 194
> ++++++++++++++++++++++++-
> >   drivers/baseband/acc100/rte_acc100_pmd.h |   6 +
> >   drivers/baseband/acc100/rte_acc101_pmd.h |  61 ++++++++
> >   7 files changed, 511 insertions(+), 5 deletions(-)
> >   create mode 100644 doc/guides/bbdevs/acc101.rst
> >   create mode 100644 doc/guides/bbdevs/features/acc101.ini
> >   create mode 100644 drivers/baseband/acc100/rte_acc101_pmd.h
> >
> > diff --git a/doc/guides/bbdevs/acc101.rst
> > b/doc/guides/bbdevs/acc101.rst new file mode 100644 index
> > 0000000..46c310b
> > --- /dev/null
> > +++ b/doc/guides/bbdevs/acc101.rst
> > @@ -0,0 +1,237 @@
> > +..  SPDX-License-Identifier: BSD-3-Clause
> > +    Copyright(c) 2020 Intel Corporation
> > +
> > +Intel(R) ACC101 5G/4G FEC Poll Mode Driver
> > +==========================================
> > +
> > +The BBDEV ACC101 5G/4G FEC poll mode driver (PMD) supports an
> > +implementation of a VRAN FEC wireless acceleration function.
> > +This device is also known as Mount Cirrus.
> > +This is a follow-up to Mount Bryce (ACC100) and includes fixes,
> > +improved feature set for error scenarios and performance capacity increase.
> 
> includes fixes, better error handling and increased performance.
> 
> A quick look at acc100.rst and the bulk of acc101.rst looks the same.
> 
> Consider a user of the acc100 is upgrading to acc101, they will
> 
> want to know what is the same and what has changed and test accordingly.
> 
> These two documents should be combined.
> 

Well in term of documentation, for the users it helps to be able to follow steps as they are for a given variant. 
As opposed to have to have multiple options through the document when using ACC100 vs ACC101.
Except if they are other objections, I would see this more useful for the user as is and less source of errors. 


> > +
> > +Features
> > +--------
> > +
> > +ACC101 5G/4G FEC PMD supports the following features:
> > +
> > +- LDPC Encode in the DL (5GNR)
> > +- LDPC Decode in the UL (5GNR)
> > +- Turbo Encode in the DL (4G)
> > +- Turbo Decode in the UL (4G)
> > +- 16 VFs per PF (physical device)
> > +- Maximum of 128 queues per VF
> > +- PCIe Gen-3 x16 Interface
> > +- MSI
> > +- SR-IOV
> > +
> > +ACC101 5G/4G FEC PMD supports the following BBDEV capabilities:
> > +
> > +* For the LDPC encode operation:
> > +   - ``RTE_BBDEV_LDPC_CRC_24B_ATTACH`` :  set to attach CRC24B to CB(s)
> > +   - ``RTE_BBDEV_LDPC_RATE_MATCH`` :  if set then do not do Rate Match
> bypass
> > +   - ``RTE_BBDEV_LDPC_INTERLEAVER_BYPASS`` : if set then bypass
> > +interleaver
> > +
> > +* For the LDPC decode operation:
> > +   - ``RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK`` :  check CRC24B from CB(s)
> > +   - ``RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE`` :  disable early
> termination
> > +   - ``RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP`` :  drops CRC24B bits
> appended while decoding
> > +   - ``RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE`` :  provides an input for
> HARQ combining
> > +   - ``RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE`` :  provides an input
> for HARQ combining
> > +   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE`` :  HARQ
> memory input is internal
> > +   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE`` :  HARQ
> memory output is internal
> > +   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK`` :
> loopback data to/from HARQ memory
> > +   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_FILLERS`` :  HARQ
> memory includes the fillers bits
> > +   - ``RTE_BBDEV_LDPC_DEC_SCATTER_GATHER`` :  supports scatter-gather
> for input/output data
> > +   - ``RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION`` :  supports
> compression of the HARQ input/output
> > +   - ``RTE_BBDEV_LDPC_LLR_COMPRESSION`` :  supports LLR input
> > +compression
> > +
> > +* For the turbo encode operation:
> > +   - ``RTE_BBDEV_TURBO_CRC_24B_ATTACH`` :  set to attach CRC24B to
> CB(s)
> > +   - ``RTE_BBDEV_TURBO_RATE_MATCH`` :  if set then do not do Rate Match
> bypass
> > +   - ``RTE_BBDEV_TURBO_ENC_INTERRUPTS`` :  set for encoder dequeue
> interrupts
> > +   - ``RTE_BBDEV_TURBO_RV_INDEX_BYPASS`` :  set to bypass RV index
> > +   - ``RTE_BBDEV_TURBO_ENC_SCATTER_GATHER`` :  supports
> > +scatter-gather for input/output data
> > +
> > +* For the turbo decode operation:
> > +   - ``RTE_BBDEV_TURBO_CRC_TYPE_24B`` :  check CRC24B from CB(s)
> > +   - ``RTE_BBDEV_TURBO_SUBBLOCK_DEINTERLEAVE`` :  perform subblock
> de-interleave
> > +   - ``RTE_BBDEV_TURBO_DEC_INTERRUPTS`` :  set for decoder dequeue
> interrupts
> > +   - ``RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN`` :  set if negative LLR encoder
> i/p is supported
> > +   - ``RTE_BBDEV_TURBO_POS_LLR_1_BIT_IN`` :  set if positive LLR encoder
> i/p is supported
> > +   - ``RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP`` :  keep CRC24B bits
> appended while decoding
> > +   - ``RTE_BBDEV_TURBO_DEC_CRC_24B_DROP`` : option to drop the code
> block CRC after decoding
> > +   - ``RTE_BBDEV_TURBO_EARLY_TERMINATION`` :  set early termination
> feature
> > +   - ``RTE_BBDEV_TURBO_DEC_SCATTER_GATHER`` :  supports scatter-gather
> for input/output data
> > +   - ``RTE_BBDEV_TURBO_HALF_ITERATION_EVEN`` :  set half iteration
> > +granularity
> > +
> > +Installation
> > +------------
> > +
> > +Section 3 of the DPDK manual provides instructions on installing and
> compiling DPDK.
> > +
> > +DPDK requires hugepages to be configured as detailed in section 2 of the
> DPDK manual.
> > +The bbdev test application has been tested with a configuration 40 x
> > +1GB hugepages. The hugepage configuration of a server may be examined
> using:
> > +
> > +.. code-block:: console
> > +
> > +   grep Huge* /proc/meminfo
> > +
> > +
> > +Initialization
> > +--------------
> > +
> > +When the device first powers up, its PCI Physical Functions (PF) can be listed
> through this command:
> > +
> > +.. code-block:: console
> > +
> > +  sudo lspci -vd8086:57c4
> > +
> > +The physical and virtual functions are compatible with Linux UIO drivers:
> > +``vfio`` and ``igb_uio``. However, in order to work the ACC101 5G/4G
> > +FEC device first needs to be bound to one of these linux drivers through
> DPDK.
> > +
> > +
> > +Bind PF UIO driver(s)
> > +~~~~~~~~~~~~~~~~~~~~~
> > +
> > +Install the DPDK igb_uio driver, bind it with the PF PCI device ID
> > +and use ``lspci`` to confirm the PF device is under use by ``igb_uio`` DPDK
> UIO driver.
> > +
> > +The igb_uio driver may be bound to the PF PCI device using one of two
> methods:
> > +
> > +
> > +1. PCI functions (physical or virtual, depending on the use case) can
> > +be bound to the UIO driver by repeating this command for every function.
> > +
> > +.. code-block:: console
> > +
> > +  cd <dpdk-top-level-directory>
> > +  insmod ./build/kmod/igb_uio.ko
> > +  echo "8086 57c4" > /sys/bus/pci/drivers/igb_uio/new_id
> > +  lspci -vd8086:57c4
> > +
> > +
> > +2. Another way to bind PF with DPDK UIO driver is by using the
> > +``dpdk-devbind.py`` tool
> > +
> > +.. code-block:: console
> > +
> > +  cd <dpdk-top-level-directory>
> > +  ./usertools/dpdk-devbind.py -b igb_uio 0000:06:00.0
> > +
> > +where the PCI device ID (example: 0000:06:00.0) is obtained using
> > +lspci -vd8086:57c4
> > +
> > +
> > +In a similar way the ACC101 5G/4G FEC PF may be bound with vfio-pci as any
> PCIe device.
> > +
> > +
> > +Enable Virtual Functions
> > +~~~~~~~~~~~~~~~~~~~~~~~~
> > +
> > +Now, it should be visible in the printouts that PCI PF is under
> > +igb_uio control "``Kernel driver in use: igb_uio``"
> > +
> > +To show the number of available VFs on the device, read ``sriov_totalvfs``
> file..
> > +
> > +.. code-block:: console
> > +
> > +  cat /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/sriov_totalvfs
> > +
> > +  where 0000\:<b>\:<d>.<f> is the PCI device ID
> > +
> > +
> > +To enable VFs via igb_uio, echo the number of virtual functions
> > +intended to enable to ``max_vfs`` file..
> > +
> > +.. code-block:: console
> > +
> > +  echo <num-of-vfs> > /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/max_vfs
> > +
> > +
> > +Afterwards, all VFs must be bound to appropriate UIO drivers as
> > +required, same way it was done with the physical function previously.
> > +
> > +Enabling SR-IOV via vfio driver is pretty much the same, except that
> > +the file name is different:
> > +
> > +.. code-block:: console
> > +
> > +  echo <num-of-vfs> >
> > + /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/sriov_numvfs
> > +
> > +
> > +Configure the VFs through PF
> > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > +
> > +The PCI virtual functions must be configured before working or
> > +getting assigned to VMs/Containers. The configuration involves
> > +allocating the number of hardware queues, priorities, load balance,
> > +bandwidth and other settings necessary for the device to perform FEC
> functions.
> > +
> > +This configuration needs to be executed at least once after reboot or
> > +PCI FLR and can be achieved by using the function
> > +``acc101_configure()``, which sets up the parameters defined in
> ``acc100_conf`` structure.
> > +
> > +Test Application
> > +----------------
> > +
> > +BBDEV provides a test application, ``test-bbdev.py`` and range of
> > +test data for testing the functionality of ACC101 5G/4G FEC encode
> > +and decode, depending on the device's capabilities. The test
> > +application is located under app->test-bbdev folder and has the following
> options:
> > +
> > +.. code-block:: console
> > +
> > +  "-p", "--testapp-path": specifies path to the bbdev test app.
> > +  "-e", "--eal-params"	: EAL arguments which are passed to the test app.
> > +  "-t", "--timeout"	: Timeout in seconds (default=300).
> > +  "-c", "--test-cases"	: Defines test cases to run. Run all if not specified.
> > +  "-v", "--test-vector"	: Test vector path (default=dpdk_path+/app/test-
> bbdev/test_vectors/bbdev_null.data).
> > +  "-n", "--num-ops"	: Number of operations to process on device
> (default=32).
> > +  "-b", "--burst-size"	: Operations enqueue/dequeue burst size (default=32).
> > +  "-s", "--snr"		: SNR in dB used when generating LLRs for bler tests.
> > +  "-s", "--iter_max"	: Number of iterations for LDPC decoder.
> > +  "-l", "--num-lcores"	: Number of lcores to run (default=16).
> > +  "-i", "--init-device" : Initialise PF device with default values.
> > +
> > +
> > +To execute the test application tool using simple decode or encode
> > +data, type one of the following:
> > +
> > +.. code-block:: console
> > +
> > +  ./test-bbdev.py -c validation -n 64 -b 1 -v ./ldpc_dec_default.data
> > + ./test-bbdev.py -c validation -n 64 -b 1 -v ./ldpc_enc_default.data
> > +
> > +
> > +The test application ``test-bbdev.py``, supports the ability to
> > +configure the PF device with a default set of values, if the "-i" or
> > +"- -init-device" option is included. The default values are defined in
> test_bbdev_perf.c.
> > +
> > +
> > +Test Vectors
> > +~~~~~~~~~~~~
> > +
> > +In addition to the simple LDPC decoder and LDPC encoder tests, bbdev
> > +also provides a range of additional tests under the test_vectors
> > +folder, which may be useful. The results of these tests will depend
> > +on the ACC101 5G/4G FEC capabilities which may cause some testcases to be
> skipped, but no failure should be reported.
> > +
> > +
> > +Alternate Baseband Device configuration tool
> > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > +
> > +On top of the embedded configuration feature supported in test-bbdev using
> "- -init-device"
> > +option mentioned above, there is also a tool available to perform
> > +that device configuration using a companion application.
> > +The ``pf_bb_config`` application notably enables then to run
> > +bbdev-test from the VF and not only limited to the PF as captured above.
> > +
> > +See for more details: https://github.com/intel/pf-bb-config
> > +
> > +Specifically for the BBDEV ACC101 PMD, the command below can be used:
> > +
> > +.. code-block:: console
> > +
> > +  ./pf_bb_config ACC101 -c acc101/acc101_config_4vf_4g5g.cfg
> > + ./test-bbdev.py -e="-c 0xff0 -a${VF_PCI_ADDR}" -c validation -l 1 -v
> > + ./ldpc_dec_default.data
> > \ No newline at end of file
> > diff --git a/doc/guides/bbdevs/features/acc101.ini
> > b/doc/guides/bbdevs/features/acc101.ini
> > new file mode 100644
> > index 0000000..0e2c21a
> > --- /dev/null
> > +++ b/doc/guides/bbdevs/features/acc101.ini
> > @@ -0,0 +1,13 @@
> > +;
> > +; Supported features of the 'acc101' bbdev driver.
> > +;
> > +; Refer to default.ini for the full list of available PMD features.
> > +;
> > +[Features]
> > +Turbo Decoder (4G)     = Y
> > +Turbo Encoder (4G)     = Y
> > +LDPC Decoder (5G)      = Y
> > +LDPC Encoder (5G)      = Y
> > +LLR/HARQ Compression   = Y
> > +External DDR Access    = Y
> > +HW Accelerated         = Y
> This is the same as acc100.ini, why do we need 2 ?

This is a different product, needs to be consistent. 

> > diff --git a/doc/guides/bbdevs/index.rst b/doc/guides/bbdevs/index.rst
> > index cedd706..e76883c 100644
> > --- a/doc/guides/bbdevs/index.rst
> > +++ b/doc/guides/bbdevs/index.rst
> > @@ -14,4 +14,5 @@ Baseband Device Drivers
> >       fpga_lte_fec
> >       fpga_5gnr_fec
> >       acc100
> > +    acc101
> >       la12xx
> > diff --git a/doc/guides/rel_notes/release_22_07.rst
> > b/doc/guides/rel_notes/release_22_07.rst
> > index 42a5f2d..ef9906b 100644
> > --- a/doc/guides/rel_notes/release_22_07.rst
> > +++ b/doc/guides/rel_notes/release_22_07.rst
> > @@ -55,6 +55,10 @@ New Features
> >        Also, make sure to start the actual text at the margin.
> >        =======================================================
> >
> > +* **Added Intel ACC101 baseband PMD.**
> > +
> > +  * Added a new baseband PMD for Intel ACC101 device (Mount Cirrus).
> > +  * See the :doc:`../bbdevs/acc101` for more details.
> >
> >   Removed Items
> >   -------------
> > diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
> > b/drivers/baseband/acc100/rte_acc100_pmd.c
> > index de7e4bc..fca27ef 100644
> > --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> > +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> > @@ -22,6 +22,7 @@
> >   #include <rte_bbdev.h>
> >   #include <rte_bbdev_pmd.h>
> >   #include "rte_acc100_pmd.h"
> > +#include "rte_acc101_pmd.h"
> >
> >   #ifdef RTE_LIBRTE_BBDEV_DEBUG
> >   RTE_LOG_REGISTER_DEFAULT(acc100_logtype, DEBUG); @@ -1286,6
> +1287,12
> > @@
> >   			RTE_BBDEV_TURBO_HALF_ITERATION_EVEN);
> >   }
> >
> > +static inline bool
> > +is_acc100(struct acc100_queue *q)
> > +{
> > +	return (q->d->device_variant == ACC100_VARIANT); }
> > +
> >   /* Fill in a frame control word for LDPC decoding. */
> >   static inline void
> >   acc100_fcw_ld_fill(const struct rte_bbdev_dec_op *op, struct
> > acc100_fcw_ld *fcw, @@ -1412,6 +1419,139 @@
> >   	}
> >   }
> >
> > +/* Convert offset to harq index for harq_layout structure */ static
> > +inline uint32_t hq_index(uint32_t offset) {
> > +	return (offset >> ACC100_HARQ_OFFSET_SHIFT) &
> > +ACC100_HARQ_OFFSET_MASK; }
> > +
> > +/* Fill in a frame control word for LDPC decoding for ACC101 */
> > +static inline void acc101_fcw_ld_fill(struct rte_bbdev_dec_op *op,
> > +struct acc100_fcw_ld *fcw,
> > +		union acc100_harq_layout_data *harq_layout)
> This looks extremely similar to the acc100*, why isn't this combined ?
> > +{
> > +	uint16_t harq_out_length, harq_in_length, ncb_p, k0_p, parity_offset;
> > +	uint32_t harq_index;
> > +	uint32_t l;
> > +
> > +	fcw->qm = op->ldpc_dec.q_m;
> > +	fcw->nfiller = op->ldpc_dec.n_filler;
> > +	fcw->BG = (op->ldpc_dec.basegraph - 1);
> > +	fcw->Zc = op->ldpc_dec.z_c;
> > +	fcw->ncb = op->ldpc_dec.n_cb;
> > +	fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_dec.basegraph,
> > +			op->ldpc_dec.rv_index);
> > +	if (op->ldpc_dec.code_block_mode == RTE_BBDEV_CODE_BLOCK)
> > +		fcw->rm_e = op->ldpc_dec.cb_params.e;
> > +	else
> > +		fcw->rm_e = (op->ldpc_dec.tb_params.r <
> > +				op->ldpc_dec.tb_params.cab) ?
> > +						op->ldpc_dec.tb_params.ea :
> > +						op->ldpc_dec.tb_params.eb;
> > +
> > +	if (unlikely(check_bit(op->ldpc_dec.op_flags,
> > +			RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE) &&
> > +			(op->ldpc_dec.harq_combined_input.length == 0))) {
> > +		rte_bbdev_log(WARNING, "Null HARQ input size provided");
> > +		/* Disable HARQ input in that case to carry forward */
> > +		op->ldpc_dec.op_flags ^=
> RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE;
> > +	}
> > +
> > +	fcw->hcin_en = check_bit(op->ldpc_dec.op_flags,
> > +			RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE);
> > +	fcw->hcout_en = check_bit(op->ldpc_dec.op_flags,
> > +			RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE);
> > +	fcw->crc_select = check_bit(op->ldpc_dec.op_flags,
> > +			RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK);
> > +	fcw->bypass_dec = check_bit(op->ldpc_dec.op_flags,
> > +			RTE_BBDEV_LDPC_DECODE_BYPASS);
> > +	fcw->bypass_intlv = check_bit(op->ldpc_dec.op_flags,
> > +			RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS);
> > +	if (op->ldpc_dec.q_m == 1) {
> > +		fcw->bypass_intlv = 1;
> > +		fcw->qm = 2;
> > +	}
> > +	fcw->hcin_decomp_mode = check_bit(op->ldpc_dec.op_flags,
> > +			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
> > +	fcw->hcout_comp_mode = check_bit(op->ldpc_dec.op_flags,
> > +			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
> > +	fcw->llr_pack_mode = check_bit(op->ldpc_dec.op_flags,
> > +			RTE_BBDEV_LDPC_LLR_COMPRESSION);
> > +	harq_index = hq_index(op->ldpc_dec.harq_combined_output.offset);
> > +	if (fcw->hcin_en > 0) {
> > +		harq_in_length = op->ldpc_dec.harq_combined_input.length;
> > +		if (fcw->hcin_decomp_mode > 0)
> > +			harq_in_length = harq_in_length * 8 / 6;
> > +		harq_in_length = RTE_MIN(harq_in_length, op-
> >ldpc_dec.n_cb
> > +				- op->ldpc_dec.n_filler);
> > +		/* Alignment on next 64B - Already enforced from HC output */
> > +		harq_in_length = RTE_ALIGN_FLOOR(harq_in_length, 64);
> > +		fcw->hcin_size0 = harq_in_length;
> > +		fcw->hcin_offset = 0;
> > +		fcw->hcin_size1 = 0;
> > +	} else {
> > +		fcw->hcin_size0 = 0;
> > +		fcw->hcin_offset = 0;
> > +		fcw->hcin_size1 = 0;
> > +	}
> > +
> > +	fcw->itmax = op->ldpc_dec.iter_max;
> > +	fcw->itstop = check_bit(op->ldpc_dec.op_flags,
> > +			RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE);
> > +	fcw->synd_precoder = fcw->itstop;
> > +	/*
> > +	 * These are all implicitly set
> > +	 * fcw->synd_post = 0;
> > +	 * fcw->so_en = 0;
> > +	 * fcw->so_bypass_rm = 0;
> > +	 * fcw->so_bypass_intlv = 0;
> > +	 * fcw->dec_convllr = 0;
> > +	 * fcw->hcout_convllr = 0;
> > +	 * fcw->hcout_size1 = 0;
> > +	 * fcw->so_it = 0;
> > +	 * fcw->hcout_offset = 0;
> > +	 * fcw->negstop_th = 0;
> > +	 * fcw->negstop_it = 0;
> > +	 * fcw->negstop_en = 0;
> > +	 * fcw->gain_i = 1;
> > +	 * fcw->gain_h = 1;
> > +	 */
> > +	if (fcw->hcout_en > 0) {
> > +		parity_offset = (op->ldpc_dec.basegraph == 1 ? 20 : 8)
> > +			* op->ldpc_dec.z_c - op->ldpc_dec.n_filler;
> > +		k0_p = (fcw->k0 > parity_offset) ?
> > +				fcw->k0 - op->ldpc_dec.n_filler : fcw->k0;
> > +		ncb_p = fcw->ncb - op->ldpc_dec.n_filler;
> > +		l = RTE_MIN(k0_p + fcw->rm_e, INT16_MAX);
> > +		harq_out_length = (uint16_t) fcw->hcin_size0;
> > +		harq_out_length = RTE_MAX(harq_out_length, l);
> > +		/* Cannot exceed the pruned Ncb circular buffer */
> > +		harq_out_length = RTE_MIN(harq_out_length, ncb_p);
> > +		/* Alignment on next 64B */
> > +		harq_out_length = RTE_ALIGN_CEIL(harq_out_length, 64);
> > +		fcw->hcout_size0 = harq_out_length;
> > +		fcw->hcout_size1 = 0;
> > +		fcw->hcout_offset = 0;
> > +		harq_layout[harq_index].offset = fcw->hcout_offset;
> > +		harq_layout[harq_index].size0 = fcw->hcout_size0;
> > +	} else {
> > +		fcw->hcout_size0 = 0;
> > +		fcw->hcout_size1 = 0;
> > +		fcw->hcout_offset = 0;
> > +	}
> > +}
> > +
> > +static inline void
> > +acc10x_fcw_ld_fill(struct rte_bbdev_dec_op *op, struct acc100_fcw_ld
> *fcw,
> > +		union acc100_harq_layout_data *harq_layout, struct
> acc100_queue *q)
> > +{
> > +	if (is_acc100(q))
> consider having a function table in the private data so the call can be made
> without this if-check
> > +		return acc100_fcw_ld_fill(op, fcw, harq_layout);
> > +	else
> > +		return acc101_fcw_ld_fill(op, fcw, harq_layout); }
> > +
> > +
> >   /**
> >    * Fills descriptor with data pointers of one block type.
> >    *
> > @@ -2960,7 +3100,7 @@
> >   		struct acc100_fcw_ld *fcw;
> >   		uint32_t seg_total_left;
> >   		fcw = &desc->req.fcw_ld;
> > -		acc100_fcw_ld_fill(op, fcw, harq_layout);
> > +		acc10x_fcw_ld_fill(op, fcw, harq_layout, q);
> >
> >   		/* Special handling when overusing mbuf */
> >   		if (fcw->rm_e < ACC100_MAX_E_MBUF) @@ -3027,7 +3167,7
> @@
> >   	desc = q->ring_addr + desc_idx;
> >   	uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
> >   	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
> > -	acc100_fcw_ld_fill(op, &desc->req.fcw_ld, harq_layout);
> > +	acc10x_fcw_ld_fill(op, &desc->req.fcw_ld, harq_layout, q);
> >
> >   	input = op->ldpc_dec.input.data;
> >   	h_output_head = h_output = op->ldpc_dec.hard_output.data; @@
> > -4139,9 +4279,17 @@
> >   	dev->dequeue_ldpc_enc_ops = acc100_dequeue_ldpc_enc;
> >   	dev->dequeue_ldpc_dec_ops = acc100_dequeue_ldpc_dec;
> >
> > -	((struct acc100_device *) dev->data->dev_private)->pf_device =
> > -			!strcmp(drv->driver.name,
> > -					RTE_STR(ACC100PF_DRIVER_NAME));
> > +	if ((!strcmp(drv->driver.name, RTE_STR(ACC100PF_DRIVER_NAME)))
> ||
> > +			(!strcmp(drv->driver.name,
> RTE_STR(ACC100VF_DRIVER_NAME)))) {
> > +		((struct acc100_device *) dev->data->dev_private)->pf_device
> =
> > +				!strcmp(drv->driver.name,
> RTE_STR(ACC100PF_DRIVER_NAME));
> > +		((struct acc100_device *) dev->data->dev_private)-
> >device_variant = ACC100_VARIANT;
> > +	} else {
> > +		((struct acc100_device *) dev->data->dev_private)->pf_device
> =
> > +				!strcmp(drv->driver.name,
> RTE_STR(ACC101PF_DRIVER_NAME));
> > +		((struct acc100_device *) dev->data->dev_private)-
> >device_variant = ACC101_VARIANT;
> > +	}
> > +
> >   	((struct acc100_device *) dev->data->dev_private)->mmio_base =
> >   			pci_dev->mem_resource[0].addr;
> >
> > @@ -4251,6 +4399,42 @@ static int acc100_pci_remove(struct
> rte_pci_device *pci_dev)
> >   RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME,
> acc100_pci_vf_driver);
> >   RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME,
> > pci_id_acc100_vf_map);
> >
> > +/* ACC101 PCI PF address map */
> > +static struct rte_pci_id pci_id_acc101_pf_map[] = {
> > +	{
> > +		RTE_PCI_DEVICE(RTE_ACC101_VENDOR_ID,
> RTE_ACC101_PF_DEVICE_ID)
> > +	},
> > +	{.device_id = 0},
> > +};
> > +
> > +/* ACC101 PCI VF address map */
> > +static struct rte_pci_id pci_id_acc101_vf_map[] = {
> > +	{
> > +		RTE_PCI_DEVICE(RTE_ACC101_VENDOR_ID,
> RTE_ACC101_VF_DEVICE_ID)
> > +	},
> > +	{.device_id = 0},
> > +};
> > +
> > +
> > +static struct rte_pci_driver acc101_pci_pf_driver = {
> > +		.probe = acc100_pci_probe,
> > +		.remove = acc100_pci_remove,
> > +		.id_table = pci_id_acc101_pf_map,
> > +		.drv_flags = RTE_PCI_DRV_NEED_MAPPING };
> > +
> > +static struct rte_pci_driver acc101_pci_vf_driver = {
> > +		.probe = acc100_pci_probe,
> > +		.remove = acc100_pci_remove,
> > +		.id_table = pci_id_acc101_vf_map,
> > +		.drv_flags = RTE_PCI_DRV_NEED_MAPPING };
> > +
> > +RTE_PMD_REGISTER_PCI(ACC101PF_DRIVER_NAME,
> acc101_pci_pf_driver);
> > +RTE_PMD_REGISTER_PCI_TABLE(ACC101PF_DRIVER_NAME,
> > +pci_id_acc101_pf_map);
> RTE_PMD_REGISTER_PCI(ACC101VF_DRIVER_NAME,
> > +acc101_pci_vf_driver);
> > +RTE_PMD_REGISTER_PCI_TABLE(ACC101VF_DRIVER_NAME,
> > +pci_id_acc101_vf_map);
> > +
> >   /*
> >    * Workaround implementation to fix the power on status of some 5GUL
> engines
> >    * This requires DMA permission if ported outside DPDK diff --git
> > a/drivers/baseband/acc100/rte_acc100_pmd.h
> > b/drivers/baseband/acc100/rte_acc100_pmd.h
> > index cbcece2..6438031 100644
> > --- a/drivers/baseband/acc100/rte_acc100_pmd.h
> > +++ b/drivers/baseband/acc100/rte_acc100_pmd.h
> > @@ -22,6 +22,9 @@
> >   #define rte_bbdev_log_debug(fmt, ...)
> >   #endif
> >
> > +#define ACC100_VARIANT 0
> > +#define ACC101_VARIANT 1
> > +
> >   /* ACC100 PF and VF driver names */
> >   #define ACC100PF_DRIVER_NAME           intel_acc100_pf
> >   #define ACC100VF_DRIVER_NAME           intel_acc100_vf
> > @@ -67,6 +70,8 @@
> >   #define ACC100_HARQ_LAYOUT             (64*1024*1024)
> >   /* Assume offset for HARQ in memory */
> >   #define ACC100_HARQ_OFFSET             (32*1024)
> > +#define ACC100_HARQ_OFFSET_SHIFT       15
> > +#define ACC100_HARQ_OFFSET_MASK        0x7ffffff
> >   /* Mask used to calculate an index in an Info Ring array (not a byte offset) */
> >   #define ACC100_INFO_RING_MASK
> (ACC100_INFO_RING_NUM_ENTRIES-1)
> >   /* Number of Virtual Functions ACC100 supports */ @@ -590,6 +595,7
> > @@ struct acc100_device {
> >   	uint16_t q_assigned_bit_map[ACC100_NUM_QGRPS];
> >   	bool pf_device; /**< True if this is a PF ACC100 device */
> >   	bool configured; /**< True if this ACC100 device is configured */
> > +	uint16_t device_variant;  /**< Device variant */
> this is not needed, check the pci id

That device_variant is sent once during probing then we can reuse that flexibly through the code.  

> >   };
> >
> >   /**
> > diff --git a/drivers/baseband/acc100/rte_acc101_pmd.h
> > b/drivers/baseband/acc100/rte_acc101_pmd.h
> > new file mode 100644
> > index 0000000..efab400
> > --- /dev/null
> > +++ b/drivers/baseband/acc100/rte_acc101_pmd.h
> 
> New files need license, copyrights.

Thanks!!

> 
> This file looks very similar to rte_acc100_pmd.h

Actually the configuration of the device differs and could diverge further in the future. This is limited to the part that would be specific to the device configuration.
Already kept to extremely reduced set. Doing it more would be artificial and source of possible errors. 

> 
> The common parts should be in only one file, maybe a rte_acc10x_pmd.h
> 
> > @@ -0,0 +1,61 @@
> > +/* ACC101 PF and VF driver names */
> > +#define ACC101PF_DRIVER_NAME           intel_acc101_pf
> > +#define ACC101VF_DRIVER_NAME           intel_acc101_vf
> 
> this maybe changes to intel_acc10x_pr/vf ?

That string would be different from the 2 products on purpose even if under the bonnet we try to reuse code as much as possible. 

> 
> Tom
> 
> > +
> > +/* ACC101 PCI vendor & device IDs */
> > +#define RTE_ACC101_VENDOR_ID           (0x8086)
> > +#define RTE_ACC101_PF_DEVICE_ID        (0x57c4)
> > +#define RTE_ACC101_VF_DEVICE_ID        (0x57c5)
> > +
> > +/* Define as 1 to use only a single FEC engine */ #ifndef
> > +RTE_ACC101_SINGLE_FEC #define RTE_ACC101_SINGLE_FEC 0 #endif
> > +
> > +/* Values used in writing to the registers */
> > +#define ACC101_REG_IRQ_EN_ALL          0x1FF83FF  /* Enable all interrupts
> */
> > +
> > +/* Number of Virtual Functions ACC101 supports */
> > +#define ACC101_NUM_VFS                  16
> > +#define ACC101_NUM_QGRPS                8
> > +#define ACC101_NUM_AQS                  16
> > +/* All ACC101 Registers alignment are 32bits = 4B */
> > +#define ACC101_BYTES_IN_WORD                 4
> > +
> > +#define ACC101_GRP_ID_SHIFT    10 /* Queue Index Hierarchy */
> > +#define ACC101_VF_ID_SHIFT     4  /* Queue Index Hierarchy */
> > +#define ACC101_VF_OFFSET_QOS   16 /* offset in Memory specific to QoS
> Mon */
> > +#define ACC101_TMPL_PRI_0      0x03020100
> > +#define ACC101_TMPL_PRI_1      0x07060504
> > +#define ACC101_TMPL_PRI_2      0x0b0a0908
> > +#define ACC101_TMPL_PRI_3      0x0f0e0d0c
> > +#define ACC101_WORDS_IN_ARAM_SIZE (128 * 1024 / 4)
> > +
> > +#define ACC101_NUM_TMPL       32
> > +/* Mapping of signals for the available engines */
> > +#define ACC101_SIG_UL_5G      0
> > +#define ACC101_SIG_UL_5G_LAST 8
> > +#define ACC101_SIG_DL_5G      13
> > +#define ACC101_SIG_DL_5G_LAST 15
> > +#define ACC101_SIG_UL_4G      16
> > +#define ACC101_SIG_UL_4G_LAST 19
> > +#define ACC101_SIG_DL_4G      27
> > +#define ACC101_SIG_DL_4G_LAST 31
> > +#define ACC101_NUM_ACCS       5
> > +#define ACC101_PF_VAL         2
> > +
> > +/* ACC101 Configuration */
> > +#define ACC101_CFG_DMA_ERROR    0x3D7
> > +#define ACC101_CFG_AXI_CACHE    0x11
> > +#define ACC101_CFG_QMGR_HI_P    0x0F0F
> > +#define ACC101_CFG_PCI_AXI      0xC003
> > +#define ACC101_CFG_PCI_BRIDGE   0x40006033
> > +#define ACC101_ENGINE_OFFSET    0x1000
> > +#define ACC101_LONG_WAIT        1000
> > +#define ACC101_GPEX_AXIMAP_NUM  17
> > +#define ACC101_CLOCK_GATING_EN  0x30000
> > +#define ACC101_DMA_INBOUND      0x104
> > +/* DDR Size per VF - 512MB by default
> > + * Can be increased up to 4 GB with single PF/VF  */
> > +#define ACC101_HARQ_DDR         (512 * 1)
Thomas Monjalon May 10, 2022, 8:52 a.m. UTC | #3
09/05/2022 23:23, Chautru, Nicolas:
> From: Tom Rix <trix@redhat.com>
> > A quick look at acc100.rst and the bulk of acc101.rst looks the same.
> > Consider a user of the acc100 is upgrading to acc101, they will
> > want to know what is the same and what has changed and test accordingly.
> > These two documents should be combined.
> 
> Well in term of documentation, for the users it helps to be able to follow steps as they are for a given variant. 
> As opposed to have to have multiple options through the document when using ACC100 vs ACC101.
> Except if they are other objections, I would see this more useful for the user as is and less source of errors. 

I agree with Tom, it is more useful to highlight the differences
with some notes and keep a common tutorial.
Tom Rix May 10, 2022, 11:55 a.m. UTC | #4
On 5/9/22 2:23 PM, Chautru, Nicolas wrote:
> Hi Tom,
>
>> -----Original Message-----
>> From: Tom Rix <trix@redhat.com>
>> Sent: Sunday, May 8, 2022 6:03 AM
>> To: Chautru, Nicolas <nicolas.chautru@intel.com>; dev@dpdk.org;
>> gakhil@marvell.com
>> Cc: thomas@monjalon.net; Kinsella, Ray <ray.kinsella@intel.com>; Richardson,
>> Bruce <bruce.richardson@intel.com>; hemant.agrawal@nxp.com; Zhang,
>> Mingshan <mingshan.zhang@intel.com>; david.marchand@redhat.com
>> Subject: Re: [PATCH v2 1/5] baseband/acc100: introduce PMD for ACC101
>>
>> This is a good start reusing code, but I think it needs to do more reuse.
>>
>> These cards should be very close and likely represent a family of cards.
>>
>> On 4/27/22 11:16 AM, Nicolas Chautru wrote:
>>> Support for ACC101 as a derivative of ACC100.
>>> Reusing existing code when possible.
>>>
>>> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
>>> ---
>>>    doc/guides/bbdevs/acc101.rst             | 237
>> +++++++++++++++++++++++++++++++
>>>    doc/guides/bbdevs/features/acc101.ini    |  13 ++
>>>    doc/guides/bbdevs/index.rst              |   1 +
>>>    doc/guides/rel_notes/release_22_07.rst   |   4 +
>>>    drivers/baseband/acc100/rte_acc100_pmd.c | 194
>> ++++++++++++++++++++++++-
>>>    drivers/baseband/acc100/rte_acc100_pmd.h |   6 +
>>>    drivers/baseband/acc100/rte_acc101_pmd.h |  61 ++++++++
>>>    7 files changed, 511 insertions(+), 5 deletions(-)
>>>    create mode 100644 doc/guides/bbdevs/acc101.rst
>>>    create mode 100644 doc/guides/bbdevs/features/acc101.ini
>>>    create mode 100644 drivers/baseband/acc100/rte_acc101_pmd.h
>>>
>>> diff --git a/doc/guides/bbdevs/acc101.rst
>>> b/doc/guides/bbdevs/acc101.rst new file mode 100644 index
>>> 0000000..46c310b
>>> --- /dev/null
>>> +++ b/doc/guides/bbdevs/acc101.rst
>>> @@ -0,0 +1,237 @@
>>> +..  SPDX-License-Identifier: BSD-3-Clause
>>> +    Copyright(c) 2020 Intel Corporation
>>> +
>>> +Intel(R) ACC101 5G/4G FEC Poll Mode Driver
>>> +==========================================
>>> +
>>> +The BBDEV ACC101 5G/4G FEC poll mode driver (PMD) supports an
>>> +implementation of a VRAN FEC wireless acceleration function.
>>> +This device is also known as Mount Cirrus.
>>> +This is a follow-up to Mount Bryce (ACC100) and includes fixes,
>>> +improved feature set for error scenarios and performance capacity increase.
>> includes fixes, better error handling and increased performance.
>>
>> A quick look at acc100.rst and the bulk of acc101.rst looks the same.
>>
>> Consider a user of the acc100 is upgrading to acc101, they will
>>
>> want to know what is the same and what has changed and test accordingly.
>>
>> These two documents should be combined.
>>
> Well in term of documentation, for the users it helps to be able to follow steps as they are for a given variant.
> As opposed to have to have multiple options through the document when using ACC100 vs ACC101.
> Except if they are other objections, I would see this more useful for the user as is and less source of errors.
>
My perspective is having existing acc100 users that are upgrading and/or 
having to support both acc100 and acc101

for a very long time.  In the first case users of existing acc100 users 
will want to know only the parts that have changed.

In the second, later changes that are common to both acc100 and acc101 
and later accXXX will be have to fixed in

multiple places.  As myself or someone at Red Hat will be on the hook 
for both, I would prefer if the refactoring

of the common parts acc100,acc101 were in good shape over the expediency 
of having acc101 sooner.

>>> +
>>> +Features
>>> +--------
>>> +

>>> index 0000000..0e2c21a
>>> --- /dev/null
>>> +++ b/doc/guides/bbdevs/features/acc101.ini
>>> @@ -0,0 +1,13 @@
>>> +;
>>> +; Supported features of the 'acc101' bbdev driver.
>>> +;
>>> +; Refer to default.ini for the full list of available PMD features.
>>> +;
>>> +[Features]
>>> +Turbo Decoder (4G)     = Y
>>> +Turbo Encoder (4G)     = Y
>>> +LDPC Decoder (5G)      = Y
>>> +LDPC Encoder (5G)      = Y
>>> +LLR/HARQ Compression   = Y
>>> +External DDR Access    = Y
>>> +HW Accelerated         = Y
>> This is the same as acc100.ini, why do we need 2 ?
> This is a different product, needs to be consistent.
ok
>
>>> diff --git a/doc/guides/bbdevs/index.rst b/doc/guides/bbdevs/index.rst
>>> index cedd706..e76883c 100644
>>> --- a/doc/guides/bbdevs/index.rst
>>> +++ b/doc/guides/bbdevs/index.rst
>>> @@ -14,4 +14,5 @@ Baseband Device Drivers
>>>        fpga_lte_fec
>>>        fpga_5gnr_fec
>>>        acc100
>>> +    acc101
>>>        la12xx
>>> diff --git a/doc/guides/rel_notes/release_22_07.rst
>>> b/doc/guides/rel_notes/release_22_07.rst
>>> index 42a5f2d..ef9906b 100644
>>> --- a/doc/guides/rel_notes/release_22_07.rst
>>> +++ b/doc/guides/rel_notes/release_22_07.rst
>>> @@ -55,6 +55,10 @@ New Features
>>>         Also, make sure to start the actual text at the margin.
>>>         =======================================================
>>>
>>> +* **Added Intel ACC101 baseband PMD.**
>>> +
>>> +  * Added a new baseband PMD for Intel ACC101 device (Mount Cirrus).
>>> +  * See the :doc:`../bbdevs/acc101` for more details.
>>>
>>>    Removed Items
>>>    -------------
>>> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
>>> b/drivers/baseband/acc100/rte_acc100_pmd.c
>>> index de7e4bc..fca27ef 100644
>>> --- a/drivers/baseband/acc100/rte_acc100_pmd.c
>>> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
>>> @@ -22,6 +22,7 @@
>>>    #include <rte_bbdev.h>
>>>    #include <rte_bbdev_pmd.h>
>>>    #include "rte_acc100_pmd.h"
>>> +#include "rte_acc101_pmd.h"
>>>
>>>    #ifdef RTE_LIBRTE_BBDEV_DEBUG
>>>    RTE_LOG_REGISTER_DEFAULT(acc100_logtype, DEBUG); @@ -1286,6
>> +1287,12
>>> @@
>>>    			RTE_BBDEV_TURBO_HALF_ITERATION_EVEN);
>>>    }
>>>
>>> +static inline bool
>>> +is_acc100(struct acc100_queue *q)
>>> +{
>>> +	return (q->d->device_variant == ACC100_VARIANT); }
>>> +
>>>    /* Fill in a frame control word for LDPC decoding. */
>>>    static inline void
>>>    acc100_fcw_ld_fill(const struct rte_bbdev_dec_op *op, struct
>>> acc100_fcw_ld *fcw, @@ -1412,6 +1419,139 @@
>>>    	}
>>>    }
>>>
>>> +/* Convert offset to harq index for harq_layout structure */ static
>>> +inline uint32_t hq_index(uint32_t offset) {
>>> +	return (offset >> ACC100_HARQ_OFFSET_SHIFT) &
>>> +ACC100_HARQ_OFFSET_MASK; }
>>> +
>>> +/* Fill in a frame control word for LDPC decoding for ACC101 */
>>> +static inline void acc101_fcw_ld_fill(struct rte_bbdev_dec_op *op,
>>> +struct acc100_fcw_ld *fcw,
>>> +		union acc100_harq_layout_data *harq_layout)
>> This looks extremely similar to the acc100*, why isn't this combined ?

Please answer.

Functions that look similar should be combined.

This gets to doing more to refactor the parts that are common between 
acc100 and acc101.

>>> +{
>>> +	uint16_t harq_out_length, harq_in_length, ncb_p, k0_p, parity_offset;
>>> +	uint32_t harq_index;
>>> +	uint32_t l;
>>> +
>>> +	fcw->qm = op->ldpc_dec.q_m;
>>> +	fcw->nfiller = op->ldpc_dec.n_filler;
>>> +	fcw->BG = (op->ldpc_dec.basegraph - 1);
>>> +	fcw->Zc = op->ldpc_dec.z_c;
>>> +	fcw->ncb = op->ldpc_dec.n_cb;
>>> +	fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_dec.basegraph,
>>> +			op->ldpc_dec.rv_index);
>>> +	if (op->ldpc_dec.code_block_mode == RTE_BBDEV_CODE_BLOCK)
>>> +		fcw->rm_e = op->ldpc_dec.cb_params.e;
>>> +	else
>>> +		fcw->rm_e = (op->ldpc_dec.tb_params.r <
>>> +				op->ldpc_dec.tb_params.cab) ?
>>> +						op->ldpc_dec.tb_params.ea :
>>> +						op->ldpc_dec.tb_params.eb;
>>> +
>>> +	if (unlikely(check_bit(op->ldpc_dec.op_flags,
>>> +			RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE) &&
>>> +			(op->ldpc_dec.harq_combined_input.length == 0))) {
>>> +		rte_bbdev_log(WARNING, "Null HARQ input size provided");
>>> +		/* Disable HARQ input in that case to carry forward */
>>> +		op->ldpc_dec.op_flags ^=
>> RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE;
>>> +	}
>>> +
>>> +	fcw->hcin_en = check_bit(op->ldpc_dec.op_flags,
>>> +			RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE);
>>> +	fcw->hcout_en = check_bit(op->ldpc_dec.op_flags,
>>> +			RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE);
>>> +	fcw->crc_select = check_bit(op->ldpc_dec.op_flags,
>>> +			RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK);
>>> +	fcw->bypass_dec = check_bit(op->ldpc_dec.op_flags,
>>> +			RTE_BBDEV_LDPC_DECODE_BYPASS);
>>> +	fcw->bypass_intlv = check_bit(op->ldpc_dec.op_flags,
>>> +			RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS);
>>> +	if (op->ldpc_dec.q_m == 1) {
>>> +		fcw->bypass_intlv = 1;
>>> +		fcw->qm = 2;
>>> +	}
>>> +	fcw->hcin_decomp_mode = check_bit(op->ldpc_dec.op_flags,
>>> +			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
>>> +	fcw->hcout_comp_mode = check_bit(op->ldpc_dec.op_flags,
>>> +			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
>>> +	fcw->llr_pack_mode = check_bit(op->ldpc_dec.op_flags,
>>> +			RTE_BBDEV_LDPC_LLR_COMPRESSION);
>>> +	harq_index = hq_index(op->ldpc_dec.harq_combined_output.offset);
>>> +	if (fcw->hcin_en > 0) {
>>> +		harq_in_length = op->ldpc_dec.harq_combined_input.length;
>>> +		if (fcw->hcin_decomp_mode > 0)
>>> +			harq_in_length = harq_in_length * 8 / 6;
>>> +		harq_in_length = RTE_MIN(harq_in_length, op-
>>> ldpc_dec.n_cb
>>> +				- op->ldpc_dec.n_filler);
>>> +		/* Alignment on next 64B - Already enforced from HC output */
>>> +		harq_in_length = RTE_ALIGN_FLOOR(harq_in_length, 64);
>>> +		fcw->hcin_size0 = harq_in_length;
>>> +		fcw->hcin_offset = 0;
>>> +		fcw->hcin_size1 = 0;
>>> +	} else {
>>> +		fcw->hcin_size0 = 0;
>>> +		fcw->hcin_offset = 0;
>>> +		fcw->hcin_size1 = 0;
>>> +	}
>>> +
>>> +	fcw->itmax = op->ldpc_dec.iter_max;
>>> +	fcw->itstop = check_bit(op->ldpc_dec.op_flags,
>>> +			RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE);
>>> +	fcw->synd_precoder = fcw->itstop;
>>> +	/*
>>> +	 * These are all implicitly set
>>> +	 * fcw->synd_post = 0;
>>> +	 * fcw->so_en = 0;
>>> +	 * fcw->so_bypass_rm = 0;
>>> +	 * fcw->so_bypass_intlv = 0;
>>> +	 * fcw->dec_convllr = 0;
>>> +	 * fcw->hcout_convllr = 0;
>>> +	 * fcw->hcout_size1 = 0;
>>> +	 * fcw->so_it = 0;
>>> +	 * fcw->hcout_offset = 0;
>>> +	 * fcw->negstop_th = 0;
>>> +	 * fcw->negstop_it = 0;
>>> +	 * fcw->negstop_en = 0;
>>> +	 * fcw->gain_i = 1;
>>> +	 * fcw->gain_h = 1;
>>> +	 */
>>> +	if (fcw->hcout_en > 0) {
>>> +		parity_offset = (op->ldpc_dec.basegraph == 1 ? 20 : 8)
>>> +			* op->ldpc_dec.z_c - op->ldpc_dec.n_filler;
>>> +		k0_p = (fcw->k0 > parity_offset) ?
>>> +				fcw->k0 - op->ldpc_dec.n_filler : fcw->k0;
>>> +		ncb_p = fcw->ncb - op->ldpc_dec.n_filler;
>>> +		l = RTE_MIN(k0_p + fcw->rm_e, INT16_MAX);
>>> +		harq_out_length = (uint16_t) fcw->hcin_size0;
>>> +		harq_out_length = RTE_MAX(harq_out_length, l);
>>> +		/* Cannot exceed the pruned Ncb circular buffer */
>>> +		harq_out_length = RTE_MIN(harq_out_length, ncb_p);
>>> +		/* Alignment on next 64B */
>>> +		harq_out_length = RTE_ALIGN_CEIL(harq_out_length, 64);
>>> +		fcw->hcout_size0 = harq_out_length;
>>> +		fcw->hcout_size1 = 0;
>>> +		fcw->hcout_offset = 0;
>>> +		harq_layout[harq_index].offset = fcw->hcout_offset;
>>> +		harq_layout[harq_index].size0 = fcw->hcout_size0;
>>> +	} else {
>>> +		fcw->hcout_size0 = 0;
>>> +		fcw->hcout_size1 = 0;
>>> +		fcw->hcout_offset = 0;
>>> +	}
>>> +}
>>> +
>>> +static inline void
>>> +acc10x_fcw_ld_fill(struct rte_bbdev_dec_op *op, struct acc100_fcw_ld
>> *fcw,
>>> +		union acc100_harq_layout_data *harq_layout, struct
>> acc100_queue *q)
>>> +{
>>> +	if (is_acc100(q))
>> consider having a function table in the private data so the call can be made
>> without this if-check
Please answer.
>>> +		return acc100_fcw_ld_fill(op, fcw, harq_layout);
>>> +	else
>>> +		return acc101_fcw_ld_fill(op, fcw, harq_layout); }
>>> +
>>> +
>>>    /**
>>>     * Fills descriptor with data pointers of one block type.
>>>     *
>>> @@ -2960,7 +3100,7 @@
>>>    		struct acc100_fcw_ld *fcw;
>>>    		uint32_t seg_total_left;
>>>    		fcw = &desc->req.fcw_ld;
>>> -		acc100_fcw_ld_fill(op, fcw, harq_layout);
>>> +		acc10x_fcw_ld_fill(op, fcw, harq_layout, q);
>>>
>>>    		/* Special handling when overusing mbuf */
>>>    		if (fcw->rm_e < ACC100_MAX_E_MBUF) @@ -3027,7 +3167,7
>> @@
>>>    	desc = q->ring_addr + desc_idx;
>>>    	uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
>>>    	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
>>> -	acc100_fcw_ld_fill(op, &desc->req.fcw_ld, harq_layout);
>>> +	acc10x_fcw_ld_fill(op, &desc->req.fcw_ld, harq_layout, q);
>>>
>>>    	input = op->ldpc_dec.input.data;
>>>    	h_output_head = h_output = op->ldpc_dec.hard_output.data; @@
>>> -4139,9 +4279,17 @@
>>>    	dev->dequeue_ldpc_enc_ops = acc100_dequeue_ldpc_enc;
>>>    	dev->dequeue_ldpc_dec_ops = acc100_dequeue_ldpc_dec;
>>>
>>> -	((struct acc100_device *) dev->data->dev_private)->pf_device =
>>> -			!strcmp(drv->driver.name,
>>> -					RTE_STR(ACC100PF_DRIVER_NAME));
>>> +	if ((!strcmp(drv->driver.name, RTE_STR(ACC100PF_DRIVER_NAME)))
>> ||
>>> +			(!strcmp(drv->driver.name,
>> RTE_STR(ACC100VF_DRIVER_NAME)))) {
>>> +		((struct acc100_device *) dev->data->dev_private)->pf_device
>> =
>>> +				!strcmp(drv->driver.name,
>> RTE_STR(ACC100PF_DRIVER_NAME));
>>> +		((struct acc100_device *) dev->data->dev_private)-
>>> device_variant = ACC100_VARIANT;
>>> +	} else {
>>> +		((struct acc100_device *) dev->data->dev_private)->pf_device
>> =
>>> +				!strcmp(drv->driver.name,
>> RTE_STR(ACC101PF_DRIVER_NAME));
>>> +		((struct acc100_device *) dev->data->dev_private)-
>>> device_variant = ACC101_VARIANT;
>>> +	}
>>> +
>>>    	((struct acc100_device *) dev->data->dev_private)->mmio_base =
>>>    			pci_dev->mem_resource[0].addr;
>>>
>>> @@ -4251,6 +4399,42 @@ static int acc100_pci_remove(struct
>> rte_pci_device *pci_dev)
>>>    RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME,
>> acc100_pci_vf_driver);
>>>    RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME,
>>> pci_id_acc100_vf_map);
>>>
>>> +/* ACC101 PCI PF address map */
>>> +static struct rte_pci_id pci_id_acc101_pf_map[] = {
>>> +	{
>>> +		RTE_PCI_DEVICE(RTE_ACC101_VENDOR_ID,
>> RTE_ACC101_PF_DEVICE_ID)
>>> +	},
>>> +	{.device_id = 0},
>>> +};
>>> +
>>> +/* ACC101 PCI VF address map */
>>> +static struct rte_pci_id pci_id_acc101_vf_map[] = {
>>> +	{
>>> +		RTE_PCI_DEVICE(RTE_ACC101_VENDOR_ID,
>> RTE_ACC101_VF_DEVICE_ID)
>>> +	},
>>> +	{.device_id = 0},
>>> +};
>>> +
>>> +
>>> +static struct rte_pci_driver acc101_pci_pf_driver = {
>>> +		.probe = acc100_pci_probe,
>>> +		.remove = acc100_pci_remove,
>>> +		.id_table = pci_id_acc101_pf_map,
>>> +		.drv_flags = RTE_PCI_DRV_NEED_MAPPING };
>>> +
>>> +static struct rte_pci_driver acc101_pci_vf_driver = {
>>> +		.probe = acc100_pci_probe,
>>> +		.remove = acc100_pci_remove,
>>> +		.id_table = pci_id_acc101_vf_map,
>>> +		.drv_flags = RTE_PCI_DRV_NEED_MAPPING };
>>> +
>>> +RTE_PMD_REGISTER_PCI(ACC101PF_DRIVER_NAME,
>> acc101_pci_pf_driver);
>>> +RTE_PMD_REGISTER_PCI_TABLE(ACC101PF_DRIVER_NAME,
>>> +pci_id_acc101_pf_map);
>> RTE_PMD_REGISTER_PCI(ACC101VF_DRIVER_NAME,
>>> +acc101_pci_vf_driver);
>>> +RTE_PMD_REGISTER_PCI_TABLE(ACC101VF_DRIVER_NAME,
>>> +pci_id_acc101_vf_map);
>>> +
>>>    /*
>>>     * Workaround implementation to fix the power on status of some 5GUL
>> engines
>>>     * This requires DMA permission if ported outside DPDK diff --git
>>> a/drivers/baseband/acc100/rte_acc100_pmd.h
>>> b/drivers/baseband/acc100/rte_acc100_pmd.h
>>> index cbcece2..6438031 100644
>>> --- a/drivers/baseband/acc100/rte_acc100_pmd.h
>>> +++ b/drivers/baseband/acc100/rte_acc100_pmd.h
>>> @@ -22,6 +22,9 @@
>>>    #define rte_bbdev_log_debug(fmt, ...)
>>>    #endif
>>>
>>> +#define ACC100_VARIANT 0
>>> +#define ACC101_VARIANT 1
>>> +
>>>    /* ACC100 PF and VF driver names */
>>>    #define ACC100PF_DRIVER_NAME           intel_acc100_pf
>>>    #define ACC100VF_DRIVER_NAME           intel_acc100_vf
>>> @@ -67,6 +70,8 @@
>>>    #define ACC100_HARQ_LAYOUT             (64*1024*1024)
>>>    /* Assume offset for HARQ in memory */
>>>    #define ACC100_HARQ_OFFSET             (32*1024)
>>> +#define ACC100_HARQ_OFFSET_SHIFT       15
>>> +#define ACC100_HARQ_OFFSET_MASK        0x7ffffff
>>>    /* Mask used to calculate an index in an Info Ring array (not a byte offset) */
>>>    #define ACC100_INFO_RING_MASK
>> (ACC100_INFO_RING_NUM_ENTRIES-1)
>>>    /* Number of Virtual Functions ACC100 supports */ @@ -590,6 +595,7
>>> @@ struct acc100_device {
>>>    	uint16_t q_assigned_bit_map[ACC100_NUM_QGRPS];
>>>    	bool pf_device; /**< True if this is a PF ACC100 device */
>>>    	bool configured; /**< True if this ACC100 device is configured */
>>> +	uint16_t device_variant;  /**< Device variant */
>> this is not needed, check the pci id
> That device_variant is sent once during probing then we can reuse that flexibly through the code.

In the same way the pci id is set once.

Generally if there is a way to use existing data to figure something 
out, then use the existing data.

Special purpose data increases the size and complexity as well as 
destabilizing the api

>   
>
>>>    };
>>>
>>>    /**
>>> diff --git a/drivers/baseband/acc100/rte_acc101_pmd.h
>>> b/drivers/baseband/acc100/rte_acc101_pmd.h
>>> new file mode 100644
>>> index 0000000..efab400
>>> --- /dev/null
>>> +++ b/drivers/baseband/acc100/rte_acc101_pmd.h
>> New files need license, copyrights.
> Thanks!!
>
>> This file looks very similar to rte_acc100_pmd.h
> Actually the configuration of the device differs and could diverge further in the future. This is limited to the part that would be specific to the device configuration.
> Already kept to extremely reduced set. Doing it more would be artificial and source of possible errors.
>
>> The common parts should be in only one file, maybe a rte_acc10x_pmd.h
>>
>>> @@ -0,0 +1,61 @@
>>> +/* ACC101 PF and VF driver names */
>>> +#define ACC101PF_DRIVER_NAME           intel_acc101_pf
>>> +#define ACC101VF_DRIVER_NAME           intel_acc101_vf
>> this maybe changes to intel_acc10x_pr/vf ?
> That string would be different from the 2 products on purpose even if under the bonnet we try to reuse code as much as possible.

These two devices could be run from the same driver.

I am used to linux kernel drivers that handle multiple devices with the 
same driver.

The reuse in the linux kernel is about 95%.

The reuse here is about 30%, it should be at least 80%

Tom

>
>> Tom
>>
>>> +
>>> +/* ACC101 PCI vendor & device IDs */
>>> +#define RTE_ACC101_VENDOR_ID           (0x8086)
>>> +#define RTE_ACC101_PF_DEVICE_ID        (0x57c4)
>>> +#define RTE_ACC101_VF_DEVICE_ID        (0x57c5)
>>> +
>>> +/* Define as 1 to use only a single FEC engine */ #ifndef
>>> +RTE_ACC101_SINGLE_FEC #define RTE_ACC101_SINGLE_FEC 0 #endif
>>> +
>>> +/* Values used in writing to the registers */
>>> +#define ACC101_REG_IRQ_EN_ALL          0x1FF83FF  /* Enable all interrupts
>> */
>>> +
>>> +/* Number of Virtual Functions ACC101 supports */
>>> +#define ACC101_NUM_VFS                  16
>>> +#define ACC101_NUM_QGRPS                8
>>> +#define ACC101_NUM_AQS                  16
>>> +/* All ACC101 Registers alignment are 32bits = 4B */
>>> +#define ACC101_BYTES_IN_WORD                 4
>>> +
>>> +#define ACC101_GRP_ID_SHIFT    10 /* Queue Index Hierarchy */
>>> +#define ACC101_VF_ID_SHIFT     4  /* Queue Index Hierarchy */
>>> +#define ACC101_VF_OFFSET_QOS   16 /* offset in Memory specific to QoS
>> Mon */
>>> +#define ACC101_TMPL_PRI_0      0x03020100
>>> +#define ACC101_TMPL_PRI_1      0x07060504
>>> +#define ACC101_TMPL_PRI_2      0x0b0a0908
>>> +#define ACC101_TMPL_PRI_3      0x0f0e0d0c
>>> +#define ACC101_WORDS_IN_ARAM_SIZE (128 * 1024 / 4)
>>> +
>>> +#define ACC101_NUM_TMPL       32
>>> +/* Mapping of signals for the available engines */
>>> +#define ACC101_SIG_UL_5G      0
>>> +#define ACC101_SIG_UL_5G_LAST 8
>>> +#define ACC101_SIG_DL_5G      13
>>> +#define ACC101_SIG_DL_5G_LAST 15
>>> +#define ACC101_SIG_UL_4G      16
>>> +#define ACC101_SIG_UL_4G_LAST 19
>>> +#define ACC101_SIG_DL_4G      27
>>> +#define ACC101_SIG_DL_4G_LAST 31
>>> +#define ACC101_NUM_ACCS       5
>>> +#define ACC101_PF_VAL         2
>>> +
>>> +/* ACC101 Configuration */
>>> +#define ACC101_CFG_DMA_ERROR    0x3D7
>>> +#define ACC101_CFG_AXI_CACHE    0x11
>>> +#define ACC101_CFG_QMGR_HI_P    0x0F0F
>>> +#define ACC101_CFG_PCI_AXI      0xC003
>>> +#define ACC101_CFG_PCI_BRIDGE   0x40006033
>>> +#define ACC101_ENGINE_OFFSET    0x1000
>>> +#define ACC101_LONG_WAIT        1000
>>> +#define ACC101_GPEX_AXIMAP_NUM  17
>>> +#define ACC101_CLOCK_GATING_EN  0x30000
>>> +#define ACC101_DMA_INBOUND      0x104
>>> +/* DDR Size per VF - 512MB by default
>>> + * Can be increased up to 4 GB with single PF/VF  */
>>> +#define ACC101_HARQ_DDR         (512 * 1)
Chautru, Nicolas May 23, 2022, 5:53 p.m. UTC | #5
Hi Tom, 
I realize I had not replied to that one

> -----Original Message-----
> From: Tom Rix <trix@redhat.com>
> Sent: Tuesday, May 10, 2022 4:56 AM
> To: Chautru, Nicolas <nicolas.chautru@intel.com>; dev@dpdk.org;
> gakhil@marvell.com
> Cc: thomas@monjalon.net; Kinsella, Ray <ray.kinsella@intel.com>;
> Richardson, Bruce <bruce.richardson@intel.com>;
> hemant.agrawal@nxp.com; Zhang, Mingshan <mingshan.zhang@intel.com>;
> david.marchand@redhat.com
> Subject: Re: [PATCH v2 1/5] baseband/acc100: introduce PMD for ACC101
> 
> 
> On 5/9/22 2:23 PM, Chautru, Nicolas wrote:
> > Hi Tom,
> >
> >> -----Original Message-----
> >> From: Tom Rix <trix@redhat.com>
> >> Sent: Sunday, May 8, 2022 6:03 AM
> >> To: Chautru, Nicolas <nicolas.chautru@intel.com>; dev@dpdk.org;
> >> gakhil@marvell.com
> >> Cc: thomas@monjalon.net; Kinsella, Ray <ray.kinsella@intel.com>;
> >> Richardson, Bruce <bruce.richardson@intel.com>;
> >> hemant.agrawal@nxp.com; Zhang, Mingshan
> <mingshan.zhang@intel.com>;
> >> david.marchand@redhat.com
> >> Subject: Re: [PATCH v2 1/5] baseband/acc100: introduce PMD for ACC101
> >>
> >> This is a good start reusing code, but I think it needs to do more reuse.
> >>
> >> These cards should be very close and likely represent a family of cards.
> >>
> >> On 4/27/22 11:16 AM, Nicolas Chautru wrote:
> >>> Support for ACC101 as a derivative of ACC100.
> >>> Reusing existing code when possible.
> >>>
> >>> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> >>> ---
> >>>    doc/guides/bbdevs/acc101.rst             | 237
> >> +++++++++++++++++++++++++++++++
> >>>    doc/guides/bbdevs/features/acc101.ini    |  13 ++
> >>>    doc/guides/bbdevs/index.rst              |   1 +
> >>>    doc/guides/rel_notes/release_22_07.rst   |   4 +
> >>>    drivers/baseband/acc100/rte_acc100_pmd.c | 194
> >> ++++++++++++++++++++++++-
> >>>    drivers/baseband/acc100/rte_acc100_pmd.h |   6 +
> >>>    drivers/baseband/acc100/rte_acc101_pmd.h |  61 ++++++++
> >>>    7 files changed, 511 insertions(+), 5 deletions(-)
> >>>    create mode 100644 doc/guides/bbdevs/acc101.rst
> >>>    create mode 100644 doc/guides/bbdevs/features/acc101.ini
> >>>    create mode 100644 drivers/baseband/acc100/rte_acc101_pmd.h
> >>>
> >>> diff --git a/doc/guides/bbdevs/acc101.rst
> >>> b/doc/guides/bbdevs/acc101.rst new file mode 100644 index
> >>> 0000000..46c310b
> >>> --- /dev/null
> >>> +++ b/doc/guides/bbdevs/acc101.rst
> >>> @@ -0,0 +1,237 @@
> >>> +..  SPDX-License-Identifier: BSD-3-Clause
> >>> +    Copyright(c) 2020 Intel Corporation
> >>> +
> >>> +Intel(R) ACC101 5G/4G FEC Poll Mode Driver
> >>> +==========================================
> >>> +
> >>> +The BBDEV ACC101 5G/4G FEC poll mode driver (PMD) supports an
> >>> +implementation of a VRAN FEC wireless acceleration function.
> >>> +This device is also known as Mount Cirrus.
> >>> +This is a follow-up to Mount Bryce (ACC100) and includes fixes,
> >>> +improved feature set for error scenarios and performance capacity
> increase.
> >> includes fixes, better error handling and increased performance.
> >>
> >> A quick look at acc100.rst and the bulk of acc101.rst looks the same.
> >>
> >> Consider a user of the acc100 is upgrading to acc101, they will
> >>
> >> want to know what is the same and what has changed and test
> accordingly.
> >>
> >> These two documents should be combined.
> >>
> > Well in term of documentation, for the users it helps to be able to follow
> steps as they are for a given variant.
> > As opposed to have to have multiple options through the document when
> using ACC100 vs ACC101.
> > Except if they are other objections, I would see this more useful for the
> user as is and less source of errors.
> >
> My perspective is having existing acc100 users that are upgrading and/or
> having to support both acc100 and acc101
> 
> for a very long time.  In the first case users of existing acc100 users will want
> to know only the parts that have changed.
> 
> In the second, later changes that are common to both acc100 and acc101 and
> later accXXX will be have to fixed in
> 
> multiple places.  As myself or someone at Red Hat will be on the hook for
> both, I would prefer if the refactoring
> 
> of the common parts acc100,acc101 were in good shape over the expediency
> of having acc101 sooner.

Will merge documentation


> 
> >>> +
> >>> +Features
> >>> +--------
> >>> +
> 
> >>> index 0000000..0e2c21a
> >>> --- /dev/null
> >>> +++ b/doc/guides/bbdevs/features/acc101.ini
> >>> @@ -0,0 +1,13 @@
> >>> +;
> >>> +; Supported features of the 'acc101' bbdev driver.
> >>> +;
> >>> +; Refer to default.ini for the full list of available PMD features.
> >>> +;
> >>> +[Features]
> >>> +Turbo Decoder (4G)     = Y
> >>> +Turbo Encoder (4G)     = Y
> >>> +LDPC Decoder (5G)      = Y
> >>> +LDPC Encoder (5G)      = Y
> >>> +LLR/HARQ Compression   = Y
> >>> +External DDR Access    = Y
> >>> +HW Accelerated         = Y
> >> This is the same as acc100.ini, why do we need 2 ?
> > This is a different product, needs to be consistent.
> ok
> >
> >>> diff --git a/doc/guides/bbdevs/index.rst
> >>> b/doc/guides/bbdevs/index.rst index cedd706..e76883c 100644
> >>> --- a/doc/guides/bbdevs/index.rst
> >>> +++ b/doc/guides/bbdevs/index.rst
> >>> @@ -14,4 +14,5 @@ Baseband Device Drivers
> >>>        fpga_lte_fec
> >>>        fpga_5gnr_fec
> >>>        acc100
> >>> +    acc101
> >>>        la12xx
> >>> diff --git a/doc/guides/rel_notes/release_22_07.rst
> >>> b/doc/guides/rel_notes/release_22_07.rst
> >>> index 42a5f2d..ef9906b 100644
> >>> --- a/doc/guides/rel_notes/release_22_07.rst
> >>> +++ b/doc/guides/rel_notes/release_22_07.rst
> >>> @@ -55,6 +55,10 @@ New Features
> >>>         Also, make sure to start the actual text at the margin.
> >>>         =======================================================
> >>>
> >>> +* **Added Intel ACC101 baseband PMD.**
> >>> +
> >>> +  * Added a new baseband PMD for Intel ACC101 device (Mount Cirrus).
> >>> +  * See the :doc:`../bbdevs/acc101` for more details.
> >>>
> >>>    Removed Items
> >>>    -------------
> >>> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
> >>> b/drivers/baseband/acc100/rte_acc100_pmd.c
> >>> index de7e4bc..fca27ef 100644
> >>> --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> >>> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> >>> @@ -22,6 +22,7 @@
> >>>    #include <rte_bbdev.h>
> >>>    #include <rte_bbdev_pmd.h>
> >>>    #include "rte_acc100_pmd.h"
> >>> +#include "rte_acc101_pmd.h"
> >>>
> >>>    #ifdef RTE_LIBRTE_BBDEV_DEBUG
> >>>    RTE_LOG_REGISTER_DEFAULT(acc100_logtype, DEBUG); @@ -1286,6
> >> +1287,12
> >>> @@
> >>>    			RTE_BBDEV_TURBO_HALF_ITERATION_EVEN);
> >>>    }
> >>>
> >>> +static inline bool
> >>> +is_acc100(struct acc100_queue *q)
> >>> +{
> >>> +	return (q->d->device_variant == ACC100_VARIANT); }
> >>> +
> >>>    /* Fill in a frame control word for LDPC decoding. */
> >>>    static inline void
> >>>    acc100_fcw_ld_fill(const struct rte_bbdev_dec_op *op, struct
> >>> acc100_fcw_ld *fcw, @@ -1412,6 +1419,139 @@
> >>>    	}
> >>>    }
> >>>
> >>> +/* Convert offset to harq index for harq_layout structure */ static
> >>> +inline uint32_t hq_index(uint32_t offset) {
> >>> +	return (offset >> ACC100_HARQ_OFFSET_SHIFT) &
> >>> +ACC100_HARQ_OFFSET_MASK; }
> >>> +
> >>> +/* Fill in a frame control word for LDPC decoding for ACC101 */
> >>> +static inline void acc101_fcw_ld_fill(struct rte_bbdev_dec_op *op,
> >>> +struct acc100_fcw_ld *fcw,
> >>> +		union acc100_harq_layout_data *harq_layout)
> >> This looks extremely similar to the acc100*, why isn't this combined ?
> 
> Please answer.
> 
> Functions that look similar should be combined.

Most of the code is common except that very function since they are HW differences  which would need to be managed differently including through future patches and/or maintenance.
Hopefully you would agree that this difference is kept very small (one function). 


> 
> This gets to doing more to refactor the parts that are common between
> acc100 and acc101.
> 
> >>> +{
> >>> +	uint16_t harq_out_length, harq_in_length, ncb_p, k0_p,
> parity_offset;
> >>> +	uint32_t harq_index;
> >>> +	uint32_t l;
> >>> +
> >>> +	fcw->qm = op->ldpc_dec.q_m;
> >>> +	fcw->nfiller = op->ldpc_dec.n_filler;
> >>> +	fcw->BG = (op->ldpc_dec.basegraph - 1);
> >>> +	fcw->Zc = op->ldpc_dec.z_c;
> >>> +	fcw->ncb = op->ldpc_dec.n_cb;
> >>> +	fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_dec.basegraph,
> >>> +			op->ldpc_dec.rv_index);
> >>> +	if (op->ldpc_dec.code_block_mode == RTE_BBDEV_CODE_BLOCK)
> >>> +		fcw->rm_e = op->ldpc_dec.cb_params.e;
> >>> +	else
> >>> +		fcw->rm_e = (op->ldpc_dec.tb_params.r <
> >>> +				op->ldpc_dec.tb_params.cab) ?
> >>> +						op->ldpc_dec.tb_params.ea :
> >>> +						op->ldpc_dec.tb_params.eb;
> >>> +
> >>> +	if (unlikely(check_bit(op->ldpc_dec.op_flags,
> >>> +			RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE) &&
> >>> +			(op->ldpc_dec.harq_combined_input.length == 0))) {
> >>> +		rte_bbdev_log(WARNING, "Null HARQ input size provided");
> >>> +		/* Disable HARQ input in that case to carry forward */
> >>> +		op->ldpc_dec.op_flags ^=
> >> RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE;
> >>> +	}
> >>> +
> >>> +	fcw->hcin_en = check_bit(op->ldpc_dec.op_flags,
> >>> +			RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE);
> >>> +	fcw->hcout_en = check_bit(op->ldpc_dec.op_flags,
> >>> +			RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE);
> >>> +	fcw->crc_select = check_bit(op->ldpc_dec.op_flags,
> >>> +			RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK);
> >>> +	fcw->bypass_dec = check_bit(op->ldpc_dec.op_flags,
> >>> +			RTE_BBDEV_LDPC_DECODE_BYPASS);
> >>> +	fcw->bypass_intlv = check_bit(op->ldpc_dec.op_flags,
> >>> +			RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS);
> >>> +	if (op->ldpc_dec.q_m == 1) {
> >>> +		fcw->bypass_intlv = 1;
> >>> +		fcw->qm = 2;
> >>> +	}
> >>> +	fcw->hcin_decomp_mode = check_bit(op->ldpc_dec.op_flags,
> >>> +			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
> >>> +	fcw->hcout_comp_mode = check_bit(op->ldpc_dec.op_flags,
> >>> +			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
> >>> +	fcw->llr_pack_mode = check_bit(op->ldpc_dec.op_flags,
> >>> +			RTE_BBDEV_LDPC_LLR_COMPRESSION);
> >>> +	harq_index = hq_index(op-
> >ldpc_dec.harq_combined_output.offset);
> >>> +	if (fcw->hcin_en > 0) {
> >>> +		harq_in_length = op-
> >ldpc_dec.harq_combined_input.length;
> >>> +		if (fcw->hcin_decomp_mode > 0)
> >>> +			harq_in_length = harq_in_length * 8 / 6;
> >>> +		harq_in_length = RTE_MIN(harq_in_length, op-
> >>> ldpc_dec.n_cb
> >>> +				- op->ldpc_dec.n_filler);
> >>> +		/* Alignment on next 64B - Already enforced from HC output
> */
> >>> +		harq_in_length = RTE_ALIGN_FLOOR(harq_in_length, 64);
> >>> +		fcw->hcin_size0 = harq_in_length;
> >>> +		fcw->hcin_offset = 0;
> >>> +		fcw->hcin_size1 = 0;
> >>> +	} else {
> >>> +		fcw->hcin_size0 = 0;
> >>> +		fcw->hcin_offset = 0;
> >>> +		fcw->hcin_size1 = 0;
> >>> +	}
> >>> +
> >>> +	fcw->itmax = op->ldpc_dec.iter_max;
> >>> +	fcw->itstop = check_bit(op->ldpc_dec.op_flags,
> >>> +			RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE);
> >>> +	fcw->synd_precoder = fcw->itstop;
> >>> +	/*
> >>> +	 * These are all implicitly set
> >>> +	 * fcw->synd_post = 0;
> >>> +	 * fcw->so_en = 0;
> >>> +	 * fcw->so_bypass_rm = 0;
> >>> +	 * fcw->so_bypass_intlv = 0;
> >>> +	 * fcw->dec_convllr = 0;
> >>> +	 * fcw->hcout_convllr = 0;
> >>> +	 * fcw->hcout_size1 = 0;
> >>> +	 * fcw->so_it = 0;
> >>> +	 * fcw->hcout_offset = 0;
> >>> +	 * fcw->negstop_th = 0;
> >>> +	 * fcw->negstop_it = 0;
> >>> +	 * fcw->negstop_en = 0;
> >>> +	 * fcw->gain_i = 1;
> >>> +	 * fcw->gain_h = 1;
> >>> +	 */
> >>> +	if (fcw->hcout_en > 0) {
> >>> +		parity_offset = (op->ldpc_dec.basegraph == 1 ? 20 : 8)
> >>> +			* op->ldpc_dec.z_c - op->ldpc_dec.n_filler;
> >>> +		k0_p = (fcw->k0 > parity_offset) ?
> >>> +				fcw->k0 - op->ldpc_dec.n_filler : fcw->k0;
> >>> +		ncb_p = fcw->ncb - op->ldpc_dec.n_filler;
> >>> +		l = RTE_MIN(k0_p + fcw->rm_e, INT16_MAX);
> >>> +		harq_out_length = (uint16_t) fcw->hcin_size0;
> >>> +		harq_out_length = RTE_MAX(harq_out_length, l);
> >>> +		/* Cannot exceed the pruned Ncb circular buffer */
> >>> +		harq_out_length = RTE_MIN(harq_out_length, ncb_p);
> >>> +		/* Alignment on next 64B */
> >>> +		harq_out_length = RTE_ALIGN_CEIL(harq_out_length, 64);
> >>> +		fcw->hcout_size0 = harq_out_length;
> >>> +		fcw->hcout_size1 = 0;
> >>> +		fcw->hcout_offset = 0;
> >>> +		harq_layout[harq_index].offset = fcw->hcout_offset;
> >>> +		harq_layout[harq_index].size0 = fcw->hcout_size0;
> >>> +	} else {
> >>> +		fcw->hcout_size0 = 0;
> >>> +		fcw->hcout_size1 = 0;
> >>> +		fcw->hcout_offset = 0;
> >>> +	}
> >>> +}
> >>> +
> >>> +static inline void
> >>> +acc10x_fcw_ld_fill(struct rte_bbdev_dec_op *op, struct
> >>> +acc100_fcw_ld
> >> *fcw,
> >>> +		union acc100_harq_layout_data *harq_layout, struct
> >> acc100_queue *q)
> >>> +{
> >>> +	if (is_acc100(q))
> >> consider having a function table in the private data so the call can
> >> be made without this if-check
> Please answer.

Answered offline I believe. This was left for consideration but I tend to believe that both options are similar, this was slightly less convoluted.

> >>> +		return acc100_fcw_ld_fill(op, fcw, harq_layout);
> >>> +	else
> >>> +		return acc101_fcw_ld_fill(op, fcw, harq_layout); }
> >>> +
> >>> +
> >>>    /**
> >>>     * Fills descriptor with data pointers of one block type.
> >>>     *
> >>> @@ -2960,7 +3100,7 @@
> >>>    		struct acc100_fcw_ld *fcw;
> >>>    		uint32_t seg_total_left;
> >>>    		fcw = &desc->req.fcw_ld;
> >>> -		acc100_fcw_ld_fill(op, fcw, harq_layout);
> >>> +		acc10x_fcw_ld_fill(op, fcw, harq_layout, q);
> >>>
> >>>    		/* Special handling when overusing mbuf */
> >>>    		if (fcw->rm_e < ACC100_MAX_E_MBUF) @@ -3027,7 +3167,7
> >> @@
> >>>    	desc = q->ring_addr + desc_idx;
> >>>    	uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
> >>>    	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
> >>> -	acc100_fcw_ld_fill(op, &desc->req.fcw_ld, harq_layout);
> >>> +	acc10x_fcw_ld_fill(op, &desc->req.fcw_ld, harq_layout, q);
> >>>
> >>>    	input = op->ldpc_dec.input.data;
> >>>    	h_output_head = h_output = op->ldpc_dec.hard_output.data; @@
> >>> -4139,9 +4279,17 @@
> >>>    	dev->dequeue_ldpc_enc_ops = acc100_dequeue_ldpc_enc;
> >>>    	dev->dequeue_ldpc_dec_ops = acc100_dequeue_ldpc_dec;
> >>>
> >>> -	((struct acc100_device *) dev->data->dev_private)->pf_device =
> >>> -			!strcmp(drv->driver.name,
> >>> -					RTE_STR(ACC100PF_DRIVER_NAME));
> >>> +	if ((!strcmp(drv->driver.name, RTE_STR(ACC100PF_DRIVER_NAME)))
> >> ||
> >>> +			(!strcmp(drv->driver.name,
> >> RTE_STR(ACC100VF_DRIVER_NAME)))) {
> >>> +		((struct acc100_device *) dev->data->dev_private)->pf_device
> >> =
> >>> +				!strcmp(drv->driver.name,
> >> RTE_STR(ACC100PF_DRIVER_NAME));
> >>> +		((struct acc100_device *) dev->data->dev_private)-
> >>> device_variant = ACC100_VARIANT;
> >>> +	} else {
> >>> +		((struct acc100_device *) dev->data->dev_private)->pf_device
> >> =
> >>> +				!strcmp(drv->driver.name,
> >> RTE_STR(ACC101PF_DRIVER_NAME));
> >>> +		((struct acc100_device *) dev->data->dev_private)-
> >>> device_variant = ACC101_VARIANT;
> >>> +	}
> >>> +
> >>>    	((struct acc100_device *) dev->data->dev_private)->mmio_base =
> >>>    			pci_dev->mem_resource[0].addr;
> >>>
> >>> @@ -4251,6 +4399,42 @@ static int acc100_pci_remove(struct
> >> rte_pci_device *pci_dev)
> >>>    RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME,
> >> acc100_pci_vf_driver);
> >>>    RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME,
> >>> pci_id_acc100_vf_map);
> >>>
> >>> +/* ACC101 PCI PF address map */
> >>> +static struct rte_pci_id pci_id_acc101_pf_map[] = {
> >>> +	{
> >>> +		RTE_PCI_DEVICE(RTE_ACC101_VENDOR_ID,
> >> RTE_ACC101_PF_DEVICE_ID)
> >>> +	},
> >>> +	{.device_id = 0},
> >>> +};
> >>> +
> >>> +/* ACC101 PCI VF address map */
> >>> +static struct rte_pci_id pci_id_acc101_vf_map[] = {
> >>> +	{
> >>> +		RTE_PCI_DEVICE(RTE_ACC101_VENDOR_ID,
> >> RTE_ACC101_VF_DEVICE_ID)
> >>> +	},
> >>> +	{.device_id = 0},
> >>> +};
> >>> +
> >>> +
> >>> +static struct rte_pci_driver acc101_pci_pf_driver = {
> >>> +		.probe = acc100_pci_probe,
> >>> +		.remove = acc100_pci_remove,
> >>> +		.id_table = pci_id_acc101_pf_map,
> >>> +		.drv_flags = RTE_PCI_DRV_NEED_MAPPING };
> >>> +
> >>> +static struct rte_pci_driver acc101_pci_vf_driver = {
> >>> +		.probe = acc100_pci_probe,
> >>> +		.remove = acc100_pci_remove,
> >>> +		.id_table = pci_id_acc101_vf_map,
> >>> +		.drv_flags = RTE_PCI_DRV_NEED_MAPPING };
> >>> +
> >>> +RTE_PMD_REGISTER_PCI(ACC101PF_DRIVER_NAME,
> >> acc101_pci_pf_driver);
> >>> +RTE_PMD_REGISTER_PCI_TABLE(ACC101PF_DRIVER_NAME,
> >>> +pci_id_acc101_pf_map);
> >> RTE_PMD_REGISTER_PCI(ACC101VF_DRIVER_NAME,
> >>> +acc101_pci_vf_driver);
> >>> +RTE_PMD_REGISTER_PCI_TABLE(ACC101VF_DRIVER_NAME,
> >>> +pci_id_acc101_vf_map);
> >>> +
> >>>    /*
> >>>     * Workaround implementation to fix the power on status of some
> >>> 5GUL
> >> engines
> >>>     * This requires DMA permission if ported outside DPDK diff --git
> >>> a/drivers/baseband/acc100/rte_acc100_pmd.h
> >>> b/drivers/baseband/acc100/rte_acc100_pmd.h
> >>> index cbcece2..6438031 100644
> >>> --- a/drivers/baseband/acc100/rte_acc100_pmd.h
> >>> +++ b/drivers/baseband/acc100/rte_acc100_pmd.h
> >>> @@ -22,6 +22,9 @@
> >>>    #define rte_bbdev_log_debug(fmt, ...)
> >>>    #endif
> >>>
> >>> +#define ACC100_VARIANT 0
> >>> +#define ACC101_VARIANT 1
> >>> +
> >>>    /* ACC100 PF and VF driver names */
> >>>    #define ACC100PF_DRIVER_NAME           intel_acc100_pf
> >>>    #define ACC100VF_DRIVER_NAME           intel_acc100_vf
> >>> @@ -67,6 +70,8 @@
> >>>    #define ACC100_HARQ_LAYOUT             (64*1024*1024)
> >>>    /* Assume offset for HARQ in memory */
> >>>    #define ACC100_HARQ_OFFSET             (32*1024)
> >>> +#define ACC100_HARQ_OFFSET_SHIFT       15
> >>> +#define ACC100_HARQ_OFFSET_MASK        0x7ffffff
> >>>    /* Mask used to calculate an index in an Info Ring array (not a byte
> offset) */
> >>>    #define ACC100_INFO_RING_MASK
> >> (ACC100_INFO_RING_NUM_ENTRIES-1)
> >>>    /* Number of Virtual Functions ACC100 supports */ @@ -590,6
> >>> +595,7 @@ struct acc100_device {
> >>>    	uint16_t q_assigned_bit_map[ACC100_NUM_QGRPS];
> >>>    	bool pf_device; /**< True if this is a PF ACC100 device */
> >>>    	bool configured; /**< True if this ACC100 device is configured
> >>> */
> >>> +	uint16_t device_variant;  /**< Device variant */
> >> this is not needed, check the pci id
> > That device_variant is sent once during probing then we can reuse that
> flexibly through the code.
> 
> In the same way the pci id is set once.
> 
> Generally if there is a way to use existing data to figure something out, then
> use the existing data.
> 
> Special purpose data increases the size and complexity as well as
> destabilizing the api

In that very case it helps to keep in one place I believe, the device ids would be different from PF and VF.
The API is not impacted here, only underlying structure within the PMD. 

> 
> >
> >
> >>>    };
> >>>
> >>>    /**
> >>> diff --git a/drivers/baseband/acc100/rte_acc101_pmd.h
> >>> b/drivers/baseband/acc100/rte_acc101_pmd.h
> >>> new file mode 100644
> >>> index 0000000..efab400
> >>> --- /dev/null
> >>> +++ b/drivers/baseband/acc100/rte_acc101_pmd.h
> >> New files need license, copyrights.
> > Thanks!!
> >
> >> This file looks very similar to rte_acc100_pmd.h
> > Actually the configuration of the device differs and could diverge further in
> the future. This is limited to the part that would be specific to the device
> configuration.
> > Already kept to extremely reduced set. Doing it more would be artificial
> and source of possible errors.
> >
> >> The common parts should be in only one file, maybe a rte_acc10x_pmd.h
> >>
> >>> @@ -0,0 +1,61 @@
> >>> +/* ACC101 PF and VF driver names */
> >>> +#define ACC101PF_DRIVER_NAME           intel_acc101_pf
> >>> +#define ACC101VF_DRIVER_NAME           intel_acc101_vf
> >> this maybe changes to intel_acc10x_pr/vf ?
> > That string would be different from the 2 products on purpose even if
> under the bonnet we try to reuse code as much as possible.
> 
> These two devices could be run from the same driver.
> 
> I am used to linux kernel drivers that handle multiple devices with the same
> driver.
> 
> The reuse in the linux kernel is about 95%.
> 
> The reuse here is about 30%, it should be at least 80%

Where did you get this 30% number? I believe the reuse above is already >80% of the PMD code. (The companion configure function being outside of the PMD per se). 

Thanks
Nic

> 
> Tom
> 
> >
> >> Tom
> >>
> >>> +
> >>> +/* ACC101 PCI vendor & device IDs */
> >>> +#define RTE_ACC101_VENDOR_ID           (0x8086)
> >>> +#define RTE_ACC101_PF_DEVICE_ID        (0x57c4)
> >>> +#define RTE_ACC101_VF_DEVICE_ID        (0x57c5)
> >>> +
> >>> +/* Define as 1 to use only a single FEC engine */ #ifndef
> >>> +RTE_ACC101_SINGLE_FEC #define RTE_ACC101_SINGLE_FEC 0 #endif
> >>> +
> >>> +/* Values used in writing to the registers */
> >>> +#define ACC101_REG_IRQ_EN_ALL          0x1FF83FF  /* Enable all
> interrupts
> >> */
> >>> +
> >>> +/* Number of Virtual Functions ACC101 supports */
> >>> +#define ACC101_NUM_VFS                  16
> >>> +#define ACC101_NUM_QGRPS                8
> >>> +#define ACC101_NUM_AQS                  16
> >>> +/* All ACC101 Registers alignment are 32bits = 4B */
> >>> +#define ACC101_BYTES_IN_WORD                 4
> >>> +
> >>> +#define ACC101_GRP_ID_SHIFT    10 /* Queue Index Hierarchy */
> >>> +#define ACC101_VF_ID_SHIFT     4  /* Queue Index Hierarchy */
> >>> +#define ACC101_VF_OFFSET_QOS   16 /* offset in Memory specific to
> QoS
> >> Mon */
> >>> +#define ACC101_TMPL_PRI_0      0x03020100
> >>> +#define ACC101_TMPL_PRI_1      0x07060504
> >>> +#define ACC101_TMPL_PRI_2      0x0b0a0908
> >>> +#define ACC101_TMPL_PRI_3      0x0f0e0d0c
> >>> +#define ACC101_WORDS_IN_ARAM_SIZE (128 * 1024 / 4)
> >>> +
> >>> +#define ACC101_NUM_TMPL       32
> >>> +/* Mapping of signals for the available engines */
> >>> +#define ACC101_SIG_UL_5G      0
> >>> +#define ACC101_SIG_UL_5G_LAST 8
> >>> +#define ACC101_SIG_DL_5G      13
> >>> +#define ACC101_SIG_DL_5G_LAST 15
> >>> +#define ACC101_SIG_UL_4G      16
> >>> +#define ACC101_SIG_UL_4G_LAST 19
> >>> +#define ACC101_SIG_DL_4G      27
> >>> +#define ACC101_SIG_DL_4G_LAST 31
> >>> +#define ACC101_NUM_ACCS       5
> >>> +#define ACC101_PF_VAL         2
> >>> +
> >>> +/* ACC101 Configuration */
> >>> +#define ACC101_CFG_DMA_ERROR    0x3D7
> >>> +#define ACC101_CFG_AXI_CACHE    0x11
> >>> +#define ACC101_CFG_QMGR_HI_P    0x0F0F
> >>> +#define ACC101_CFG_PCI_AXI      0xC003
> >>> +#define ACC101_CFG_PCI_BRIDGE   0x40006033
> >>> +#define ACC101_ENGINE_OFFSET    0x1000
> >>> +#define ACC101_LONG_WAIT        1000
> >>> +#define ACC101_GPEX_AXIMAP_NUM  17
> >>> +#define ACC101_CLOCK_GATING_EN  0x30000
> >>> +#define ACC101_DMA_INBOUND      0x104
> >>> +/* DDR Size per VF - 512MB by default
> >>> + * Can be increased up to 4 GB with single PF/VF  */
> >>> +#define ACC101_HARQ_DDR         (512 * 1)
diff mbox series

Patch

diff --git a/doc/guides/bbdevs/acc101.rst b/doc/guides/bbdevs/acc101.rst
new file mode 100644
index 0000000..46c310b
--- /dev/null
+++ b/doc/guides/bbdevs/acc101.rst
@@ -0,0 +1,237 @@ 
+..  SPDX-License-Identifier: BSD-3-Clause
+    Copyright(c) 2020 Intel Corporation
+
+Intel(R) ACC101 5G/4G FEC Poll Mode Driver
+==========================================
+
+The BBDEV ACC101 5G/4G FEC poll mode driver (PMD) supports an
+implementation of a VRAN FEC wireless acceleration function.
+This device is also known as Mount Cirrus.
+This is a follow-up to Mount Bryce (ACC100) and includes fixes, improved
+feature set for error scenarios and performance capacity increase.
+
+Features
+--------
+
+ACC101 5G/4G FEC PMD supports the following features:
+
+- LDPC Encode in the DL (5GNR)
+- LDPC Decode in the UL (5GNR)
+- Turbo Encode in the DL (4G)
+- Turbo Decode in the UL (4G)
+- 16 VFs per PF (physical device)
+- Maximum of 128 queues per VF
+- PCIe Gen-3 x16 Interface
+- MSI
+- SR-IOV
+
+ACC101 5G/4G FEC PMD supports the following BBDEV capabilities:
+
+* For the LDPC encode operation:
+   - ``RTE_BBDEV_LDPC_CRC_24B_ATTACH`` :  set to attach CRC24B to CB(s)
+   - ``RTE_BBDEV_LDPC_RATE_MATCH`` :  if set then do not do Rate Match bypass
+   - ``RTE_BBDEV_LDPC_INTERLEAVER_BYPASS`` : if set then bypass interleaver
+
+* For the LDPC decode operation:
+   - ``RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK`` :  check CRC24B from CB(s)
+   - ``RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE`` :  disable early termination
+   - ``RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP`` :  drops CRC24B bits appended while decoding
+   - ``RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE`` :  provides an input for HARQ combining
+   - ``RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE`` :  provides an input for HARQ combining
+   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE`` :  HARQ memory input is internal
+   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE`` :  HARQ memory output is internal
+   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK`` :  loopback data to/from HARQ memory
+   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_FILLERS`` :  HARQ memory includes the fillers bits
+   - ``RTE_BBDEV_LDPC_DEC_SCATTER_GATHER`` :  supports scatter-gather for input/output data
+   - ``RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION`` :  supports compression of the HARQ input/output
+   - ``RTE_BBDEV_LDPC_LLR_COMPRESSION`` :  supports LLR input compression
+
+* For the turbo encode operation:
+   - ``RTE_BBDEV_TURBO_CRC_24B_ATTACH`` :  set to attach CRC24B to CB(s)
+   - ``RTE_BBDEV_TURBO_RATE_MATCH`` :  if set then do not do Rate Match bypass
+   - ``RTE_BBDEV_TURBO_ENC_INTERRUPTS`` :  set for encoder dequeue interrupts
+   - ``RTE_BBDEV_TURBO_RV_INDEX_BYPASS`` :  set to bypass RV index
+   - ``RTE_BBDEV_TURBO_ENC_SCATTER_GATHER`` :  supports scatter-gather for input/output data
+
+* For the turbo decode operation:
+   - ``RTE_BBDEV_TURBO_CRC_TYPE_24B`` :  check CRC24B from CB(s)
+   - ``RTE_BBDEV_TURBO_SUBBLOCK_DEINTERLEAVE`` :  perform subblock de-interleave
+   - ``RTE_BBDEV_TURBO_DEC_INTERRUPTS`` :  set for decoder dequeue interrupts
+   - ``RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN`` :  set if negative LLR encoder i/p is supported
+   - ``RTE_BBDEV_TURBO_POS_LLR_1_BIT_IN`` :  set if positive LLR encoder i/p is supported
+   - ``RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP`` :  keep CRC24B bits appended while decoding
+   - ``RTE_BBDEV_TURBO_DEC_CRC_24B_DROP`` : option to drop the code block CRC after decoding
+   - ``RTE_BBDEV_TURBO_EARLY_TERMINATION`` :  set early termination feature
+   - ``RTE_BBDEV_TURBO_DEC_SCATTER_GATHER`` :  supports scatter-gather for input/output data
+   - ``RTE_BBDEV_TURBO_HALF_ITERATION_EVEN`` :  set half iteration granularity
+
+Installation
+------------
+
+Section 3 of the DPDK manual provides instructions on installing and compiling DPDK.
+
+DPDK requires hugepages to be configured as detailed in section 2 of the DPDK manual.
+The bbdev test application has been tested with a configuration 40 x 1GB hugepages. The
+hugepage configuration of a server may be examined using:
+
+.. code-block:: console
+
+   grep Huge* /proc/meminfo
+
+
+Initialization
+--------------
+
+When the device first powers up, its PCI Physical Functions (PF) can be listed through this command:
+
+.. code-block:: console
+
+  sudo lspci -vd8086:57c4
+
+The physical and virtual functions are compatible with Linux UIO drivers:
+``vfio`` and ``igb_uio``. However, in order to work the ACC101 5G/4G
+FEC device first needs to be bound to one of these linux drivers through DPDK.
+
+
+Bind PF UIO driver(s)
+~~~~~~~~~~~~~~~~~~~~~
+
+Install the DPDK igb_uio driver, bind it with the PF PCI device ID and use
+``lspci`` to confirm the PF device is under use by ``igb_uio`` DPDK UIO driver.
+
+The igb_uio driver may be bound to the PF PCI device using one of two methods:
+
+
+1. PCI functions (physical or virtual, depending on the use case) can be bound to
+the UIO driver by repeating this command for every function.
+
+.. code-block:: console
+
+  cd <dpdk-top-level-directory>
+  insmod ./build/kmod/igb_uio.ko
+  echo "8086 57c4" > /sys/bus/pci/drivers/igb_uio/new_id
+  lspci -vd8086:57c4
+
+
+2. Another way to bind PF with DPDK UIO driver is by using the ``dpdk-devbind.py`` tool
+
+.. code-block:: console
+
+  cd <dpdk-top-level-directory>
+  ./usertools/dpdk-devbind.py -b igb_uio 0000:06:00.0
+
+where the PCI device ID (example: 0000:06:00.0) is obtained using lspci -vd8086:57c4
+
+
+In a similar way the ACC101 5G/4G FEC PF may be bound with vfio-pci as any PCIe device.
+
+
+Enable Virtual Functions
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+Now, it should be visible in the printouts that PCI PF is under igb_uio control
+"``Kernel driver in use: igb_uio``"
+
+To show the number of available VFs on the device, read ``sriov_totalvfs`` file..
+
+.. code-block:: console
+
+  cat /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/sriov_totalvfs
+
+  where 0000\:<b>\:<d>.<f> is the PCI device ID
+
+
+To enable VFs via igb_uio, echo the number of virtual functions intended to
+enable to ``max_vfs`` file..
+
+.. code-block:: console
+
+  echo <num-of-vfs> > /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/max_vfs
+
+
+Afterwards, all VFs must be bound to appropriate UIO drivers as required, same
+way it was done with the physical function previously.
+
+Enabling SR-IOV via vfio driver is pretty much the same, except that the file
+name is different:
+
+.. code-block:: console
+
+  echo <num-of-vfs> > /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/sriov_numvfs
+
+
+Configure the VFs through PF
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The PCI virtual functions must be configured before working or getting assigned
+to VMs/Containers. The configuration involves allocating the number of hardware
+queues, priorities, load balance, bandwidth and other settings necessary for the
+device to perform FEC functions.
+
+This configuration needs to be executed at least once after reboot or PCI FLR and can
+be achieved by using the function ``acc101_configure()``, which sets up the
+parameters defined in ``acc100_conf`` structure.
+
+Test Application
+----------------
+
+BBDEV provides a test application, ``test-bbdev.py`` and range of test data for testing
+the functionality of ACC101 5G/4G FEC encode and decode, depending on the device's
+capabilities. The test application is located under app->test-bbdev folder and has the
+following options:
+
+.. code-block:: console
+
+  "-p", "--testapp-path": specifies path to the bbdev test app.
+  "-e", "--eal-params"	: EAL arguments which are passed to the test app.
+  "-t", "--timeout"	: Timeout in seconds (default=300).
+  "-c", "--test-cases"	: Defines test cases to run. Run all if not specified.
+  "-v", "--test-vector"	: Test vector path (default=dpdk_path+/app/test-bbdev/test_vectors/bbdev_null.data).
+  "-n", "--num-ops"	: Number of operations to process on device (default=32).
+  "-b", "--burst-size"	: Operations enqueue/dequeue burst size (default=32).
+  "-s", "--snr"		: SNR in dB used when generating LLRs for bler tests.
+  "-s", "--iter_max"	: Number of iterations for LDPC decoder.
+  "-l", "--num-lcores"	: Number of lcores to run (default=16).
+  "-i", "--init-device" : Initialise PF device with default values.
+
+
+To execute the test application tool using simple decode or encode data,
+type one of the following:
+
+.. code-block:: console
+
+  ./test-bbdev.py -c validation -n 64 -b 1 -v ./ldpc_dec_default.data
+  ./test-bbdev.py -c validation -n 64 -b 1 -v ./ldpc_enc_default.data
+
+
+The test application ``test-bbdev.py``, supports the ability to configure the PF device with
+a default set of values, if the "-i" or "- -init-device" option is included. The default values
+are defined in test_bbdev_perf.c.
+
+
+Test Vectors
+~~~~~~~~~~~~
+
+In addition to the simple LDPC decoder and LDPC encoder tests, bbdev also provides
+a range of additional tests under the test_vectors folder, which may be useful. The results
+of these tests will depend on the ACC101 5G/4G FEC capabilities which may cause some
+testcases to be skipped, but no failure should be reported.
+
+
+Alternate Baseband Device configuration tool
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+On top of the embedded configuration feature supported in test-bbdev using "- -init-device"
+option mentioned above, there is also a tool available to perform that device configuration
+using a companion application.
+The ``pf_bb_config`` application notably enables then to run bbdev-test from the VF
+and not only limited to the PF as captured above.
+
+See for more details: https://github.com/intel/pf-bb-config
+
+Specifically for the BBDEV ACC101 PMD, the command below can be used:
+
+.. code-block:: console
+
+  ./pf_bb_config ACC101 -c acc101/acc101_config_4vf_4g5g.cfg
+  ./test-bbdev.py -e="-c 0xff0 -a${VF_PCI_ADDR}" -c validation -l 1 -v ./ldpc_dec_default.data
\ No newline at end of file
diff --git a/doc/guides/bbdevs/features/acc101.ini b/doc/guides/bbdevs/features/acc101.ini
new file mode 100644
index 0000000..0e2c21a
--- /dev/null
+++ b/doc/guides/bbdevs/features/acc101.ini
@@ -0,0 +1,13 @@ 
+;
+; Supported features of the 'acc101' bbdev driver.
+;
+; Refer to default.ini for the full list of available PMD features.
+;
+[Features]
+Turbo Decoder (4G)     = Y
+Turbo Encoder (4G)     = Y
+LDPC Decoder (5G)      = Y
+LDPC Encoder (5G)      = Y
+LLR/HARQ Compression   = Y
+External DDR Access    = Y
+HW Accelerated         = Y
diff --git a/doc/guides/bbdevs/index.rst b/doc/guides/bbdevs/index.rst
index cedd706..e76883c 100644
--- a/doc/guides/bbdevs/index.rst
+++ b/doc/guides/bbdevs/index.rst
@@ -14,4 +14,5 @@  Baseband Device Drivers
     fpga_lte_fec
     fpga_5gnr_fec
     acc100
+    acc101
     la12xx
diff --git a/doc/guides/rel_notes/release_22_07.rst b/doc/guides/rel_notes/release_22_07.rst
index 42a5f2d..ef9906b 100644
--- a/doc/guides/rel_notes/release_22_07.rst
+++ b/doc/guides/rel_notes/release_22_07.rst
@@ -55,6 +55,10 @@  New Features
      Also, make sure to start the actual text at the margin.
      =======================================================
 
+* **Added Intel ACC101 baseband PMD.**
+
+  * Added a new baseband PMD for Intel ACC101 device (Mount Cirrus).
+  * See the :doc:`../bbdevs/acc101` for more details.
 
 Removed Items
 -------------
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index de7e4bc..fca27ef 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -22,6 +22,7 @@ 
 #include <rte_bbdev.h>
 #include <rte_bbdev_pmd.h>
 #include "rte_acc100_pmd.h"
+#include "rte_acc101_pmd.h"
 
 #ifdef RTE_LIBRTE_BBDEV_DEBUG
 RTE_LOG_REGISTER_DEFAULT(acc100_logtype, DEBUG);
@@ -1286,6 +1287,12 @@ 
 			RTE_BBDEV_TURBO_HALF_ITERATION_EVEN);
 }
 
+static inline bool
+is_acc100(struct acc100_queue *q)
+{
+	return (q->d->device_variant == ACC100_VARIANT);
+}
+
 /* Fill in a frame control word for LDPC decoding. */
 static inline void
 acc100_fcw_ld_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_ld *fcw,
@@ -1412,6 +1419,139 @@ 
 	}
 }
 
+/* Convert offset to harq index for harq_layout structure */
+static inline uint32_t hq_index(uint32_t offset)
+{
+	return (offset >> ACC100_HARQ_OFFSET_SHIFT) & ACC100_HARQ_OFFSET_MASK;
+}
+
+/* Fill in a frame control word for LDPC decoding for ACC101 */
+static inline void
+acc101_fcw_ld_fill(struct rte_bbdev_dec_op *op, struct acc100_fcw_ld *fcw,
+		union acc100_harq_layout_data *harq_layout)
+{
+	uint16_t harq_out_length, harq_in_length, ncb_p, k0_p, parity_offset;
+	uint32_t harq_index;
+	uint32_t l;
+
+	fcw->qm = op->ldpc_dec.q_m;
+	fcw->nfiller = op->ldpc_dec.n_filler;
+	fcw->BG = (op->ldpc_dec.basegraph - 1);
+	fcw->Zc = op->ldpc_dec.z_c;
+	fcw->ncb = op->ldpc_dec.n_cb;
+	fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_dec.basegraph,
+			op->ldpc_dec.rv_index);
+	if (op->ldpc_dec.code_block_mode == RTE_BBDEV_CODE_BLOCK)
+		fcw->rm_e = op->ldpc_dec.cb_params.e;
+	else
+		fcw->rm_e = (op->ldpc_dec.tb_params.r <
+				op->ldpc_dec.tb_params.cab) ?
+						op->ldpc_dec.tb_params.ea :
+						op->ldpc_dec.tb_params.eb;
+
+	if (unlikely(check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE) &&
+			(op->ldpc_dec.harq_combined_input.length == 0))) {
+		rte_bbdev_log(WARNING, "Null HARQ input size provided");
+		/* Disable HARQ input in that case to carry forward */
+		op->ldpc_dec.op_flags ^= RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE;
+	}
+
+	fcw->hcin_en = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE);
+	fcw->hcout_en = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE);
+	fcw->crc_select = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK);
+	fcw->bypass_dec = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_DECODE_BYPASS);
+	fcw->bypass_intlv = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS);
+	if (op->ldpc_dec.q_m == 1) {
+		fcw->bypass_intlv = 1;
+		fcw->qm = 2;
+	}
+	fcw->hcin_decomp_mode = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
+	fcw->hcout_comp_mode = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
+	fcw->llr_pack_mode = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_LLR_COMPRESSION);
+	harq_index = hq_index(op->ldpc_dec.harq_combined_output.offset);
+	if (fcw->hcin_en > 0) {
+		harq_in_length = op->ldpc_dec.harq_combined_input.length;
+		if (fcw->hcin_decomp_mode > 0)
+			harq_in_length = harq_in_length * 8 / 6;
+		harq_in_length = RTE_MIN(harq_in_length, op->ldpc_dec.n_cb
+				- op->ldpc_dec.n_filler);
+		/* Alignment on next 64B - Already enforced from HC output */
+		harq_in_length = RTE_ALIGN_FLOOR(harq_in_length, 64);
+		fcw->hcin_size0 = harq_in_length;
+		fcw->hcin_offset = 0;
+		fcw->hcin_size1 = 0;
+	} else {
+		fcw->hcin_size0 = 0;
+		fcw->hcin_offset = 0;
+		fcw->hcin_size1 = 0;
+	}
+
+	fcw->itmax = op->ldpc_dec.iter_max;
+	fcw->itstop = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE);
+	fcw->synd_precoder = fcw->itstop;
+	/*
+	 * These are all implicitly set
+	 * fcw->synd_post = 0;
+	 * fcw->so_en = 0;
+	 * fcw->so_bypass_rm = 0;
+	 * fcw->so_bypass_intlv = 0;
+	 * fcw->dec_convllr = 0;
+	 * fcw->hcout_convllr = 0;
+	 * fcw->hcout_size1 = 0;
+	 * fcw->so_it = 0;
+	 * fcw->hcout_offset = 0;
+	 * fcw->negstop_th = 0;
+	 * fcw->negstop_it = 0;
+	 * fcw->negstop_en = 0;
+	 * fcw->gain_i = 1;
+	 * fcw->gain_h = 1;
+	 */
+	if (fcw->hcout_en > 0) {
+		parity_offset = (op->ldpc_dec.basegraph == 1 ? 20 : 8)
+			* op->ldpc_dec.z_c - op->ldpc_dec.n_filler;
+		k0_p = (fcw->k0 > parity_offset) ?
+				fcw->k0 - op->ldpc_dec.n_filler : fcw->k0;
+		ncb_p = fcw->ncb - op->ldpc_dec.n_filler;
+		l = RTE_MIN(k0_p + fcw->rm_e, INT16_MAX);
+		harq_out_length = (uint16_t) fcw->hcin_size0;
+		harq_out_length = RTE_MAX(harq_out_length, l);
+		/* Cannot exceed the pruned Ncb circular buffer */
+		harq_out_length = RTE_MIN(harq_out_length, ncb_p);
+		/* Alignment on next 64B */
+		harq_out_length = RTE_ALIGN_CEIL(harq_out_length, 64);
+		fcw->hcout_size0 = harq_out_length;
+		fcw->hcout_size1 = 0;
+		fcw->hcout_offset = 0;
+		harq_layout[harq_index].offset = fcw->hcout_offset;
+		harq_layout[harq_index].size0 = fcw->hcout_size0;
+	} else {
+		fcw->hcout_size0 = 0;
+		fcw->hcout_size1 = 0;
+		fcw->hcout_offset = 0;
+	}
+}
+
+static inline void
+acc10x_fcw_ld_fill(struct rte_bbdev_dec_op *op, struct acc100_fcw_ld *fcw,
+		union acc100_harq_layout_data *harq_layout, struct acc100_queue *q)
+{
+	if (is_acc100(q))
+		return acc100_fcw_ld_fill(op, fcw, harq_layout);
+	else
+		return acc101_fcw_ld_fill(op, fcw, harq_layout);
+}
+
+
 /**
  * Fills descriptor with data pointers of one block type.
  *
@@ -2960,7 +3100,7 @@ 
 		struct acc100_fcw_ld *fcw;
 		uint32_t seg_total_left;
 		fcw = &desc->req.fcw_ld;
-		acc100_fcw_ld_fill(op, fcw, harq_layout);
+		acc10x_fcw_ld_fill(op, fcw, harq_layout, q);
 
 		/* Special handling when overusing mbuf */
 		if (fcw->rm_e < ACC100_MAX_E_MBUF)
@@ -3027,7 +3167,7 @@ 
 	desc = q->ring_addr + desc_idx;
 	uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
 	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
-	acc100_fcw_ld_fill(op, &desc->req.fcw_ld, harq_layout);
+	acc10x_fcw_ld_fill(op, &desc->req.fcw_ld, harq_layout, q);
 
 	input = op->ldpc_dec.input.data;
 	h_output_head = h_output = op->ldpc_dec.hard_output.data;
@@ -4139,9 +4279,17 @@ 
 	dev->dequeue_ldpc_enc_ops = acc100_dequeue_ldpc_enc;
 	dev->dequeue_ldpc_dec_ops = acc100_dequeue_ldpc_dec;
 
-	((struct acc100_device *) dev->data->dev_private)->pf_device =
-			!strcmp(drv->driver.name,
-					RTE_STR(ACC100PF_DRIVER_NAME));
+	if ((!strcmp(drv->driver.name, RTE_STR(ACC100PF_DRIVER_NAME))) ||
+			(!strcmp(drv->driver.name, RTE_STR(ACC100VF_DRIVER_NAME)))) {
+		((struct acc100_device *) dev->data->dev_private)->pf_device =
+				!strcmp(drv->driver.name, RTE_STR(ACC100PF_DRIVER_NAME));
+		((struct acc100_device *) dev->data->dev_private)->device_variant = ACC100_VARIANT;
+	} else {
+		((struct acc100_device *) dev->data->dev_private)->pf_device =
+				!strcmp(drv->driver.name, RTE_STR(ACC101PF_DRIVER_NAME));
+		((struct acc100_device *) dev->data->dev_private)->device_variant = ACC101_VARIANT;
+	}
+
 	((struct acc100_device *) dev->data->dev_private)->mmio_base =
 			pci_dev->mem_resource[0].addr;
 
@@ -4251,6 +4399,42 @@  static int acc100_pci_remove(struct rte_pci_device *pci_dev)
 RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
 RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
 
+/* ACC101 PCI PF address map */
+static struct rte_pci_id pci_id_acc101_pf_map[] = {
+	{
+		RTE_PCI_DEVICE(RTE_ACC101_VENDOR_ID, RTE_ACC101_PF_DEVICE_ID)
+	},
+	{.device_id = 0},
+};
+
+/* ACC101 PCI VF address map */
+static struct rte_pci_id pci_id_acc101_vf_map[] = {
+	{
+		RTE_PCI_DEVICE(RTE_ACC101_VENDOR_ID, RTE_ACC101_VF_DEVICE_ID)
+	},
+	{.device_id = 0},
+};
+
+
+static struct rte_pci_driver acc101_pci_pf_driver = {
+		.probe = acc100_pci_probe,
+		.remove = acc100_pci_remove,
+		.id_table = pci_id_acc101_pf_map,
+		.drv_flags = RTE_PCI_DRV_NEED_MAPPING
+};
+
+static struct rte_pci_driver acc101_pci_vf_driver = {
+		.probe = acc100_pci_probe,
+		.remove = acc100_pci_remove,
+		.id_table = pci_id_acc101_vf_map,
+		.drv_flags = RTE_PCI_DRV_NEED_MAPPING
+};
+
+RTE_PMD_REGISTER_PCI(ACC101PF_DRIVER_NAME, acc101_pci_pf_driver);
+RTE_PMD_REGISTER_PCI_TABLE(ACC101PF_DRIVER_NAME, pci_id_acc101_pf_map);
+RTE_PMD_REGISTER_PCI(ACC101VF_DRIVER_NAME, acc101_pci_vf_driver);
+RTE_PMD_REGISTER_PCI_TABLE(ACC101VF_DRIVER_NAME, pci_id_acc101_vf_map);
+
 /*
  * Workaround implementation to fix the power on status of some 5GUL engines
  * This requires DMA permission if ported outside DPDK
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index cbcece2..6438031 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -22,6 +22,9 @@ 
 #define rte_bbdev_log_debug(fmt, ...)
 #endif
 
+#define ACC100_VARIANT 0
+#define ACC101_VARIANT 1
+
 /* ACC100 PF and VF driver names */
 #define ACC100PF_DRIVER_NAME           intel_acc100_pf
 #define ACC100VF_DRIVER_NAME           intel_acc100_vf
@@ -67,6 +70,8 @@ 
 #define ACC100_HARQ_LAYOUT             (64*1024*1024)
 /* Assume offset for HARQ in memory */
 #define ACC100_HARQ_OFFSET             (32*1024)
+#define ACC100_HARQ_OFFSET_SHIFT       15
+#define ACC100_HARQ_OFFSET_MASK        0x7ffffff
 /* Mask used to calculate an index in an Info Ring array (not a byte offset) */
 #define ACC100_INFO_RING_MASK          (ACC100_INFO_RING_NUM_ENTRIES-1)
 /* Number of Virtual Functions ACC100 supports */
@@ -590,6 +595,7 @@  struct acc100_device {
 	uint16_t q_assigned_bit_map[ACC100_NUM_QGRPS];
 	bool pf_device; /**< True if this is a PF ACC100 device */
 	bool configured; /**< True if this ACC100 device is configured */
+	uint16_t device_variant;  /**< Device variant */
 };
 
 /**
diff --git a/drivers/baseband/acc100/rte_acc101_pmd.h b/drivers/baseband/acc100/rte_acc101_pmd.h
new file mode 100644
index 0000000..efab400
--- /dev/null
+++ b/drivers/baseband/acc100/rte_acc101_pmd.h
@@ -0,0 +1,61 @@ 
+/* ACC101 PF and VF driver names */
+#define ACC101PF_DRIVER_NAME           intel_acc101_pf
+#define ACC101VF_DRIVER_NAME           intel_acc101_vf
+
+/* ACC101 PCI vendor & device IDs */
+#define RTE_ACC101_VENDOR_ID           (0x8086)
+#define RTE_ACC101_PF_DEVICE_ID        (0x57c4)
+#define RTE_ACC101_VF_DEVICE_ID        (0x57c5)
+
+/* Define as 1 to use only a single FEC engine */
+#ifndef RTE_ACC101_SINGLE_FEC
+#define RTE_ACC101_SINGLE_FEC 0
+#endif
+
+/* Values used in writing to the registers */
+#define ACC101_REG_IRQ_EN_ALL          0x1FF83FF  /* Enable all interrupts */
+
+/* Number of Virtual Functions ACC101 supports */
+#define ACC101_NUM_VFS                  16
+#define ACC101_NUM_QGRPS                8
+#define ACC101_NUM_AQS                  16
+/* All ACC101 Registers alignment are 32bits = 4B */
+#define ACC101_BYTES_IN_WORD                 4
+
+#define ACC101_GRP_ID_SHIFT    10 /* Queue Index Hierarchy */
+#define ACC101_VF_ID_SHIFT     4  /* Queue Index Hierarchy */
+#define ACC101_VF_OFFSET_QOS   16 /* offset in Memory specific to QoS Mon */
+#define ACC101_TMPL_PRI_0      0x03020100
+#define ACC101_TMPL_PRI_1      0x07060504
+#define ACC101_TMPL_PRI_2      0x0b0a0908
+#define ACC101_TMPL_PRI_3      0x0f0e0d0c
+#define ACC101_WORDS_IN_ARAM_SIZE (128 * 1024 / 4)
+
+#define ACC101_NUM_TMPL       32
+/* Mapping of signals for the available engines */
+#define ACC101_SIG_UL_5G      0
+#define ACC101_SIG_UL_5G_LAST 8
+#define ACC101_SIG_DL_5G      13
+#define ACC101_SIG_DL_5G_LAST 15
+#define ACC101_SIG_UL_4G      16
+#define ACC101_SIG_UL_4G_LAST 19
+#define ACC101_SIG_DL_4G      27
+#define ACC101_SIG_DL_4G_LAST 31
+#define ACC101_NUM_ACCS       5
+#define ACC101_PF_VAL         2
+
+/* ACC101 Configuration */
+#define ACC101_CFG_DMA_ERROR    0x3D7
+#define ACC101_CFG_AXI_CACHE    0x11
+#define ACC101_CFG_QMGR_HI_P    0x0F0F
+#define ACC101_CFG_PCI_AXI      0xC003
+#define ACC101_CFG_PCI_BRIDGE   0x40006033
+#define ACC101_ENGINE_OFFSET    0x1000
+#define ACC101_LONG_WAIT        1000
+#define ACC101_GPEX_AXIMAP_NUM  17
+#define ACC101_CLOCK_GATING_EN  0x30000
+#define ACC101_DMA_INBOUND      0x104
+/* DDR Size per VF - 512MB by default
+ * Can be increased up to 4 GB with single PF/VF
+ */
+#define ACC101_HARQ_DDR         (512 * 1)