From patchwork Fri Jul 8 00:01:34 2022
X-Patchwork-Submitter: "Chautru, Nicolas"
X-Patchwork-Id: 113813
X-Patchwork-Delegate: gakhil@marvell.com
From: Nicolas Chautru
To: dev@dpdk.org, thomas@monjalon.net, gakhil@marvell.com,
 hemant.agrawal@nxp.com, trix@redhat.com
Cc: maxime.coquelin@redhat.com, mdr@ashroe.eu, bruce.richardson@intel.com,
 david.marchand@redhat.com, stephen@networkplumber.org, Nicolas Chautru
Subject: [PATCH v1 01/10] baseband/acc200: introduce PMD for ACC200
Date: Thu, 7 Jul 2022 17:01:34 -0700
Message-Id: <1657238503-143836-2-git-send-email-nicolas.chautru@intel.com>
In-Reply-To: <1657238503-143836-1-git-send-email-nicolas.chautru@intel.com>
References: <1657238503-143836-1-git-send-email-nicolas.chautru@intel.com>
List-Id: DPDK patches and discussions

This patch introduces stubs for the device driver for the ACC200
integrated vRAN accelerator on SPR-EEC.

Signed-off-by: Nicolas Chautru
---
 MAINTAINERS                              |   3 +
 doc/guides/bbdevs/acc200.rst             | 244 +++++++++++++++++++++++++++++++
 doc/guides/bbdevs/index.rst              |   1 +
 drivers/baseband/acc200/acc200_pmd.h     |  38 +++++
 drivers/baseband/acc200/meson.build      |   6 +
 drivers/baseband/acc200/rte_acc200_pmd.c | 179 +++++++++++++++++++++++
 drivers/baseband/acc200/version.map      |   3 +
 drivers/baseband/meson.build             |   1 +
 8 files changed, 475 insertions(+)
 create mode 100644 doc/guides/bbdevs/acc200.rst
 create mode 100644 drivers/baseband/acc200/acc200_pmd.h
 create mode 100644 drivers/baseband/acc200/meson.build
 create mode 100644 drivers/baseband/acc200/rte_acc200_pmd.c
 create mode 100644 drivers/baseband/acc200/version.map
diff --git a/MAINTAINERS b/MAINTAINERS
index 1652e08..73284a1 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1337,6 +1337,9 @@ F: doc/guides/bbdevs/features/fpga_5gnr_fec.ini
 F: drivers/baseband/acc100/
 F: doc/guides/bbdevs/acc100.rst
 F: doc/guides/bbdevs/features/acc100.ini
+F: drivers/baseband/acc200/
+F: doc/guides/bbdevs/acc200.rst
+F: doc/guides/bbdevs/features/acc200.ini
 
 Null baseband
 M: Nicolas Chautru
diff --git a/doc/guides/bbdevs/acc200.rst b/doc/guides/bbdevs/acc200.rst
new file mode 100644
index 0000000..3a4dd55
--- /dev/null
+++ b/doc/guides/bbdevs/acc200.rst
@@ -0,0 +1,244 @@
+.. SPDX-License-Identifier: BSD-3-Clause
+   Copyright(c) 2022 Intel Corporation
+
+Intel(R) ACC200 vRAN Dedicated Accelerator Poll Mode Driver
+===========================================================
+
+The Intel(R) vRAN Dedicated Accelerator ACC200 peripheral enables cost-effective 4G
+and 5G next-generation virtualized Radio Access Network (vRAN) solutions, integrated on
+the Sapphire Rapids EEC Intel(R) 7 based Xeon(R) multi-core server processor.
+
+Features
+--------
+
+The ACC200 includes a 5G Low Density Parity Check (LDPC) encoder/decoder, rate match/dematch,
+Hybrid Automatic Repeat Request (HARQ) with access to DDR memory for buffer management, a 4G
+Turbo encoder/decoder, a Fast Fourier Transform (FFT) block providing DFT/iDFT processing offload
+for the 5G Sounding Reference Signal (SRS), a Queue Manager (QMGR), and a DMA subsystem.
+There is no dedicated on-card memory for HARQ; the device instead uses coherent memory
+on the CPU side.
+
+These correspond to the following features exposed by the PMD:
+
+- LDPC Encode in the Downlink (5GNR)
+- LDPC Decode in the Uplink (5GNR)
+- Turbo Encode in the Downlink (4G)
+- Turbo Decode in the Uplink (4G)
+- FFT processing
+- SR-IOV with 16 VFs per PF
+- Maximum of 256 queues per VF
+- MSI
+
+ACC200 PMD supports the following BBDEV capabilities:
+
+* For the LDPC encode operation:
+   - ``RTE_BBDEV_LDPC_CRC_24B_ATTACH`` : set to attach CRC24B to CB(s)
+   - ``RTE_BBDEV_LDPC_RATE_MATCH`` : if set then do not do Rate Match bypass
+   - ``RTE_BBDEV_LDPC_INTERLEAVER_BYPASS`` : if set then bypass interleaver
+
+* For the LDPC decode operation:
+   - ``RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK`` : check CRC24B from CB(s)
+   - ``RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP`` : drops CRC24B bits appended while decoding
+   - ``RTE_BBDEV_LDPC_CRC_TYPE_24A_CHECK`` : check CRC24A from CB(s)
+   - ``RTE_BBDEV_LDPC_CRC_TYPE_16_CHECK`` : check CRC16 from CB(s)
+   - ``RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE`` : provides an input for HARQ combining
+   - ``RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE`` : provides an output for HARQ combining
+   - ``RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE`` : disable early termination
+   - ``RTE_BBDEV_LDPC_DEC_SCATTER_GATHER`` : supports scatter-gather for input/output data
+   - ``RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION`` : supports compression of the HARQ input/output
+   - ``RTE_BBDEV_LDPC_LLR_COMPRESSION`` : supports LLR input compression
+
+* For the turbo encode operation:
+   - ``RTE_BBDEV_TURBO_CRC_24B_ATTACH`` : set to attach CRC24B to CB(s)
+   - ``RTE_BBDEV_TURBO_RATE_MATCH`` : if set then do not do Rate Match bypass
+   - ``RTE_BBDEV_TURBO_ENC_INTERRUPTS`` : set for encoder dequeue interrupts
+   - ``RTE_BBDEV_TURBO_RV_INDEX_BYPASS`` : set to bypass RV index
+   - ``RTE_BBDEV_TURBO_ENC_SCATTER_GATHER`` : supports scatter-gather for input/output data
+
+* For the turbo decode operation:
+   - ``RTE_BBDEV_TURBO_CRC_TYPE_24B`` : check CRC24B from CB(s)
+   - ``RTE_BBDEV_TURBO_SUBBLOCK_DEINTERLEAVE`` : perform subblock de-interleave
+   - ``RTE_BBDEV_TURBO_DEC_INTERRUPTS`` : set for decoder dequeue interrupts
+   - ``RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN`` : set if negative LLR input is supported
+   - ``RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP`` : keep CRC24B bits appended while decoding
+   - ``RTE_BBDEV_TURBO_DEC_CRC_24B_DROP`` : option to drop the code block CRC after decoding
+   - ``RTE_BBDEV_TURBO_EARLY_TERMINATION`` : set early termination feature
+   - ``RTE_BBDEV_TURBO_DEC_SCATTER_GATHER`` : supports scatter-gather for input/output data
+   - ``RTE_BBDEV_TURBO_HALF_ITERATION_EVEN`` : set half iteration granularity
+   - ``RTE_BBDEV_TURBO_SOFT_OUTPUT`` : set the APP LLR soft output
+   - ``RTE_BBDEV_TURBO_EQUALIZER`` : set the turbo equalizer feature
+   - ``RTE_BBDEV_TURBO_SOFT_OUT_SATURATE`` : set the soft output saturation
+   - ``RTE_BBDEV_TURBO_CONTINUE_CRC_MATCH`` : set to run an extra odd iteration after CRC match
+   - ``RTE_BBDEV_TURBO_NEG_LLR_1_BIT_SOFT_OUT`` : set if negative APP LLR output supported
+   - ``RTE_BBDEV_TURBO_MAP_DEC`` : supports flexible parallel MAP engine decoding
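+
+Applications should not hard-code the capability lists above; they can be
+queried at runtime through the generic bbdev API. A minimal sketch follows
+(illustrative only, not part of the driver code in this patch; the helper
+name is hypothetical):
+
+.. code-block:: c
+
+   #include <stdbool.h>
+   #include <rte_bbdev.h>
+
+   /* Return true if bbdev device dev_id advertises the given
+    * RTE_BBDEV_LDPC_* capability flag for the LDPC decode operation.
+    */
+   static bool
+   bbdev_has_ldpc_dec_flag(uint16_t dev_id, uint32_t flag)
+   {
+       struct rte_bbdev_info info;
+       const struct rte_bbdev_op_cap *cap;
+
+       if (rte_bbdev_info_get(dev_id, &info) != 0)
+           return false;
+       /* The driver capability list is terminated by RTE_BBDEV_OP_NONE. */
+       for (cap = info.drv.capabilities; cap->type != RTE_BBDEV_OP_NONE; cap++)
+           if (cap->type == RTE_BBDEV_OP_LDPC_DEC)
+               return (cap->cap.ldpc_dec.capability_flags & flag) != 0;
+       return false;
+   }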
+
+Installation
+------------
+
+Section 3 of the DPDK manual provides instructions on installing and compiling DPDK.
+
+DPDK requires hugepages to be configured, as detailed in section 2 of the DPDK manual.
+The bbdev test application has been tested with a configuration of 40 x 1GB hugepages. The
+hugepage configuration of a server may be examined using:
+
+.. code-block:: console
+
+  grep Huge* /proc/meminfo
+
+
+Initialization
+--------------
+
+When the device first powers up, its PCI Physical Functions (PF) can be listed through this
+command for ACC200:
+
+.. code-block:: console
+
+  sudo lspci -vd8086:57c0
+
+The physical and virtual functions are compatible with Linux UIO drivers:
+``vfio`` and ``igb_uio``. However, in order to work, the 5G/4G
+FEC device first needs to be bound to one of these Linux drivers through DPDK.
+
+
+Bind PF UIO driver(s)
+~~~~~~~~~~~~~~~~~~~~~
+
+Install the DPDK igb_uio driver, bind it with the PF PCI device ID and use
+``lspci`` to confirm that the PF device is in use by the ``igb_uio`` DPDK UIO driver.
+
+The igb_uio driver may be bound to the PF PCI device using one of two methods for ACC200:
+
+1. PCI functions (physical or virtual, depending on the use case) can be bound to
+the UIO driver by repeating this command for every function.
+
+.. code-block:: console
+
+  cd <dpdk_path>
+  insmod ./build/kmod/igb_uio.ko
+  echo "8086 57c0" > /sys/bus/pci/drivers/igb_uio/new_id
+  lspci -vd8086:57c0
+
+2. Another way to bind the PF with the DPDK UIO driver is by using the
+``dpdk-devbind.py`` tool:
+
+.. code-block:: console
+
+  cd <dpdk_path>
+  ./usertools/dpdk-devbind.py -b igb_uio 0000:f7:00.0
+
+where the PCI device ID (example: 0000:f7:00.0) is obtained using ``lspci -vd8086:57c0``.
+
+In a similar way the PF may be bound with vfio-pci as for any PCIe device.
+
+
+Enable Virtual Functions
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+Now, it should be visible in the printouts that the PCI PF is under igb_uio control
+"``Kernel driver in use: igb_uio``"
+
+To show the number of available VFs on the device, read the ``sriov_totalvfs`` file:
+
+.. code-block:: console
+
+  cat /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/sriov_totalvfs
+
+  where 0000\:<b>\:<d>.<f> is the PCI device ID
+
+To enable VFs via igb_uio, echo the number of virtual functions intended to
+be enabled to the ``max_vfs`` file:
+
+.. code-block:: console
+
+  echo <num-vfs> > /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/max_vfs
+
+
+Afterwards, all VFs must be bound to appropriate UIO drivers as required, the same
+way it was done with the physical function previously.
+
+Enabling SR-IOV via the vfio driver is much the same, except that the file
+name is different:
+
+.. code-block:: console
+
+  echo <num-vfs> > /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/sriov_numvfs
+
+
+Configure the VFs through PF
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The PCI virtual functions must be configured before working or getting assigned
+to VMs/Containers. The configuration involves allocating the number of hardware
+queues, priorities, load balance, bandwidth and other settings necessary for the
+device to perform FEC functions.
+
+This configuration needs to be executed at least once after reboot or PCI FLR and can
+be achieved by using the function ``rte_acc200_configure()``,
+which sets up the parameters defined in the compatible ``acc200_conf`` structure.
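+
+For illustration only (not part of this patch; the exact structure fields and
+function prototype for ACC200 are defined in later commits of this series, and
+the names below are assumptions modeled on the ACC100 PMD's
+``rte_acc100_configure()``), a one-off PF-side configuration could look like:
+
+.. code-block:: c
+
+   #include <rte_acc200_cfg.h>  /* assumed header exposing rte_acc200_configure() */
+
+   static int
+   configure_acc200_pf(void)
+   {
+       struct acc200_conf conf = {0};
+
+       conf.pf_mode_en = 0;       /* assumed field: queues are used from the VFs */
+       conf.num_vf_bundles = 16;  /* assumed field: one bundle per VF */
+
+       /* PCI address of the PF as reported by lspci / dpdk-devbind.py. */
+       return rte_acc200_configure("0000:f7:00.0", &conf);
+   }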
+
+Test Application
+----------------
+
+BBDEV provides a test application, ``test-bbdev.py``, and a range of test data for testing
+the functionality of the device, depending on the device's
+capabilities. The test application is located under the app/test-bbdev folder and has the
+following options:
+
+.. code-block:: console
+
+  "-p", "--testapp-path": specifies path to the bbdev test app.
+  "-e", "--eal-params"  : EAL arguments which are passed to the test app.
+  "-t", "--timeout"     : Timeout in seconds (default=300).
+  "-c", "--test-cases"  : Defines test cases to run. Run all if not specified.
+  "-v", "--test-vector" : Test vector path (default=dpdk_path+/app/test-bbdev/test_vectors/bbdev_null.data).
+  "-n", "--num-ops"     : Number of operations to process on device (default=32).
+  "-b", "--burst-size"  : Operations enqueue/dequeue burst size (default=32).
+  "-s", "--snr"         : SNR in dB used when generating LLRs for bler tests.
+  "-s", "--iter_max"    : Number of iterations for LDPC decoder.
+  "-l", "--num-lcores"  : Number of lcores to run (default=16).
+  "-i", "--init-device" : Initialise PF device with default values.
+
+
+To execute the test application tool using simple decode or encode data,
+type one of the following:
+
+.. code-block:: console
+
+  ./test-bbdev.py -c validation -n 64 -b 1 -v ./ldpc_dec_default.data
+  ./test-bbdev.py -c validation -n 64 -b 1 -v ./ldpc_enc_default.data
+
+
+The test application ``test-bbdev.py`` supports the ability to configure the PF device with
+a default set of values, if the ``-i`` or ``--init-device`` option is included. The default values
+are defined in test_bbdev_perf.c.
+
+
+Test Vectors
+~~~~~~~~~~~~
+
+In addition to the simple LDPC decoder and LDPC encoder tests, bbdev also provides
+a range of additional tests under the test_vectors folder, which may be useful. The results
+of these tests will depend on the device capabilities, which may cause some
+test cases to be skipped, but no failure should be reported.
+
+
+Alternate Baseband Device configuration tool
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+On top of the embedded configuration feature supported in test-bbdev using the
+``--init-device`` option mentioned above, there is also a tool available to perform that
+device configuration using a companion application.
+The ``pf_bb_config`` application notably makes it possible to run bbdev-test from the VF,
+and not only from the PF as captured above.
+
+See for more details: https://github.com/intel/pf-bb-config
+
+Specifically for the BBDEV ACC200 PMD, the command below can be used:
+
+.. code-block:: console
+
+  ./pf_bb_config ACC200 -c ./acc200/acc200_config_vf_5g.cfg
+  ./test-bbdev.py -e="-c 0xff0 -a${VF_PCI_ADDR}" -c validation -n 64 -b 64 -l 1 -v ./ldpc_dec_default.data
diff --git a/doc/guides/bbdevs/index.rst b/doc/guides/bbdevs/index.rst
index cedd706..4e9dea8 100644
--- a/doc/guides/bbdevs/index.rst
+++ b/doc/guides/bbdevs/index.rst
@@ -14,4 +14,5 @@ Baseband Device Drivers
     fpga_lte_fec
     fpga_5gnr_fec
     acc100
+    acc200
     la12xx
diff --git a/drivers/baseband/acc200/acc200_pmd.h b/drivers/baseband/acc200/acc200_pmd.h
new file mode 100644
index 0000000..a22ca67
--- /dev/null
+++ b/drivers/baseband/acc200/acc200_pmd.h
@@ -0,0 +1,38 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2021 Intel Corporation
+ */
+
+#ifndef _RTE_ACC200_PMD_H_
+#define _RTE_ACC200_PMD_H_
+
+/* Helper macro for logging */
+#define rte_bbdev_log(level, fmt, ...) \
+	rte_log(RTE_LOG_ ## level, acc200_logtype, fmt "\n", \
+		##__VA_ARGS__)
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+#define rte_bbdev_log_debug(fmt, ...) \
+	rte_bbdev_log(DEBUG, "acc200_pmd: " fmt, \
+		##__VA_ARGS__)
+#else
+#define rte_bbdev_log_debug(fmt, ...)
+#endif
+
+/* ACC200 PF and VF driver names */
+#define ACC200PF_DRIVER_NAME   intel_acc200_pf
+#define ACC200VF_DRIVER_NAME   intel_acc200_vf
+
+/* ACC200 PCI vendor & device IDs */
+#define RTE_ACC200_VENDOR_ID        (0x8086)
+#define RTE_ACC200_PF_DEVICE_ID     (0x57C0)
+#define RTE_ACC200_VF_DEVICE_ID     (0x57C1)
+
+/* Private data structure for each ACC200 device */
+struct acc200_device {
+	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
+	uint32_t ddr_size; /* Size in kB */
+	bool pf_device; /**< True if this is a PF ACC200 device */
+	bool configured; /**< True if this ACC200 device is configured */
+};
+
+#endif /* _RTE_ACC200_PMD_H_ */
diff --git a/drivers/baseband/acc200/meson.build b/drivers/baseband/acc200/meson.build
new file mode 100644
index 0000000..7b47bc6
--- /dev/null
+++ b/drivers/baseband/acc200/meson.build
@@ -0,0 +1,6 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2021 Intel Corporation
+
+deps += ['bbdev', 'bus_vdev', 'ring', 'pci', 'bus_pci']
+
+sources = files('rte_acc200_pmd.c')
diff --git a/drivers/baseband/acc200/rte_acc200_pmd.c b/drivers/baseband/acc200/rte_acc200_pmd.c
new file mode 100644
index 0000000..4103e48
--- /dev/null
+++ b/drivers/baseband/acc200/rte_acc200_pmd.c
@@ -0,0 +1,179 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2021 Intel Corporation
+ */
+
+#include
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#ifdef RTE_BBDEV_OFFLOAD_COST
+#include
+#endif
+
+#include
+#include
+#include "acc200_pmd.h"
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+RTE_LOG_REGISTER_DEFAULT(acc200_logtype, DEBUG);
+#else
+RTE_LOG_REGISTER_DEFAULT(acc200_logtype, NOTICE);
+#endif
+
+/* Free memory used for software rings */
+static int
+acc200_dev_close(struct rte_bbdev *dev)
+{
+	RTE_SET_USED(dev);
+	return 0;
+}
+
+
+static const struct rte_bbdev_ops acc200_bbdev_ops = {
+	.close = acc200_dev_close,
+};
+
+/* ACC200 PCI PF address map */
+static struct rte_pci_id pci_id_acc200_pf_map[] = {
+	{
+		RTE_PCI_DEVICE(RTE_ACC200_VENDOR_ID, RTE_ACC200_PF_DEVICE_ID)
+	},
+	{.device_id = 0},
+};
+
+/* ACC200 PCI VF address map */
+static struct rte_pci_id pci_id_acc200_vf_map[] = {
+	{
+		RTE_PCI_DEVICE(RTE_ACC200_VENDOR_ID, RTE_ACC200_VF_DEVICE_ID)
+	},
+	{.device_id = 0},
+};
+
+/* Initialization Function */
+static void
+acc200_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv)
+{
+	struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
+
+	dev->dev_ops = &acc200_bbdev_ops;
+
+	((struct acc200_device *) dev->data->dev_private)->pf_device =
+			!strcmp(drv->driver.name,
+					RTE_STR(ACC200PF_DRIVER_NAME));
+	((struct acc200_device *) dev->data->dev_private)->mmio_base =
+			pci_dev->mem_resource[0].addr;
+
+	rte_bbdev_log_debug("Init device %s [%s] @ vaddr %p paddr %#"PRIx64"",
+			drv->driver.name, dev->data->name,
+			(void *)pci_dev->mem_resource[0].addr,
+			pci_dev->mem_resource[0].phys_addr);
+}
+
+static int acc200_pci_probe(struct rte_pci_driver *pci_drv,
+	struct rte_pci_device *pci_dev)
+{
+	struct rte_bbdev *bbdev = NULL;
+	char dev_name[RTE_BBDEV_NAME_MAX_LEN];
+
+	if (pci_dev == NULL) {
+		rte_bbdev_log(ERR, "NULL PCI device");
+		return -EINVAL;
+	}
+
+	rte_pci_device_name(&pci_dev->addr, dev_name, sizeof(dev_name));
+
+	/* Allocate memory to be used privately by drivers */
+	bbdev = rte_bbdev_allocate(pci_dev->device.name);
+	if (bbdev == NULL)
+		return -ENODEV;
+
+	/* allocate device private memory */
+	bbdev->data->dev_private = rte_zmalloc_socket(dev_name,
+			sizeof(struct acc200_device), RTE_CACHE_LINE_SIZE,
+			pci_dev->device.numa_node);
+
+	if (bbdev->data->dev_private == NULL) {
+		rte_bbdev_log(CRIT,
+				"Allocate of %zu bytes for device \"%s\" failed",
+				sizeof(struct acc200_device), dev_name);
+		rte_bbdev_release(bbdev);
+		return -ENOMEM;
+	}
+
+	/* Fill HW specific part of device structure */
+	bbdev->device = &pci_dev->device;
+	bbdev->intr_handle = pci_dev->intr_handle;
+	bbdev->data->socket_id = pci_dev->device.numa_node;
+
+	/* Invoke ACC200 device initialization function */
+	acc200_bbdev_init(bbdev, pci_drv);
+
+	rte_bbdev_log_debug("Initialised bbdev %s (id = %u)",
+			dev_name, bbdev->data->dev_id);
+	return 0;
+}
+
+static int acc200_pci_remove(struct rte_pci_device *pci_dev)
+{
+	struct rte_bbdev *bbdev;
+	int ret;
+	uint8_t dev_id;
+
+	if (pci_dev == NULL)
+		return -EINVAL;
+
+	/* Find device */
+	bbdev = rte_bbdev_get_named_dev(pci_dev->device.name);
+	if (bbdev == NULL) {
+		rte_bbdev_log(CRIT,
+				"Couldn't find HW dev \"%s\" to uninitialise it",
+				pci_dev->device.name);
+		return -ENODEV;
+	}
+	dev_id = bbdev->data->dev_id;
+
+	/* free device private memory before close */
+	rte_free(bbdev->data->dev_private);
+
+	/* Close device */
+	ret = rte_bbdev_close(dev_id);
+	if (ret < 0)
+		rte_bbdev_log(ERR,
+				"Device %i failed to close during uninit: %i",
+				dev_id, ret);
+
+	/* release bbdev from library */
+	rte_bbdev_release(bbdev);
+
+	rte_bbdev_log_debug("Destroyed bbdev = %u", dev_id);
+
+	return 0;
+}
+
+static struct rte_pci_driver acc200_pci_pf_driver = {
+	.probe = acc200_pci_probe,
+	.remove = acc200_pci_remove,
+	.id_table = pci_id_acc200_pf_map,
+	.drv_flags = RTE_PCI_DRV_NEED_MAPPING
+};
+
+static struct rte_pci_driver acc200_pci_vf_driver = {
+	.probe = acc200_pci_probe,
+	.remove = acc200_pci_remove,
+	.id_table = pci_id_acc200_vf_map,
+	.drv_flags = RTE_PCI_DRV_NEED_MAPPING
+};
+
+RTE_PMD_REGISTER_PCI(ACC200PF_DRIVER_NAME, acc200_pci_pf_driver);
+RTE_PMD_REGISTER_PCI_TABLE(ACC200PF_DRIVER_NAME, pci_id_acc200_pf_map);
+RTE_PMD_REGISTER_PCI(ACC200VF_DRIVER_NAME, acc200_pci_vf_driver);
+RTE_PMD_REGISTER_PCI_TABLE(ACC200VF_DRIVER_NAME, pci_id_acc200_vf_map);
diff --git a/drivers/baseband/acc200/version.map b/drivers/baseband/acc200/version.map
new file mode 100644
index 0000000..c2e0723
--- /dev/null
+++ b/drivers/baseband/acc200/version.map
@@ -0,0 +1,3 @@
+DPDK_22 {
+	local: *;
+};
diff --git a/drivers/baseband/meson.build b/drivers/baseband/meson.build
index 686e98b..343f83a 100644
--- a/drivers/baseband/meson.build
+++ b/drivers/baseband/meson.build
@@ -7,6 +7,7 @@ endif
 drivers = [
         'acc100',
+        'acc200',
         'fpga_5gnr_fec',
         'fpga_lte_fec',
         'la12xx',

From patchwork Fri Jul 8 00:01:35 2022
X-Patchwork-Submitter: "Chautru, Nicolas"
X-Patchwork-Id: 113814
X-Patchwork-Delegate: gakhil@marvell.com
From: Nicolas Chautru
To: dev@dpdk.org, thomas@monjalon.net, gakhil@marvell.com,
 hemant.agrawal@nxp.com, trix@redhat.com
Cc: maxime.coquelin@redhat.com, mdr@ashroe.eu, bruce.richardson@intel.com,
 david.marchand@redhat.com, stephen@networkplumber.org, Nicolas Chautru
Subject: [PATCH v1 02/10] baseband/acc200: add HW register definitions
Date: Thu, 7 Jul 2022 17:01:35 -0700
Message-Id: <1657238503-143836-3-git-send-email-nicolas.chautru@intel.com>
In-Reply-To: <1657238503-143836-1-git-send-email-nicolas.chautru@intel.com>
References: <1657238503-143836-1-git-send-email-nicolas.chautru@intel.com>
List-Id: DPDK patches and discussions

Add the list of registers and the structures used to access the device.
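For context (illustration only, not part of the diff below): the driver
reaches these registers by adding the enum offset to the BAR0 mapping kept in
``acc200_device.mmio_base``. A sketch using the generic DPDK MMIO helpers,
where the acc200_reg_* helper names are hypothetical:

    #include <rte_byteorder.h>
    #include <rte_common.h>
    #include <rte_io.h>

    /* Write a 32-bit value to a register offset from the PF/VF enums. */
    static inline void
    acc200_reg_write(struct acc200_device *d, uint32_t offset, uint32_t value)
    {
    	rte_write32(rte_cpu_to_le_32(value),
    			RTE_PTR_ADD(d->mmio_base, offset));
    }

    /* Read back a 32-bit register value. */
    static inline uint32_t
    acc200_reg_read(struct acc200_device *d, uint32_t offset)
    {
    	return rte_le_to_cpu_32(
    			rte_read32(RTE_PTR_ADD(d->mmio_base, offset)));
    }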
Signed-off-by: Nicolas Chautru --- drivers/baseband/acc200/acc200_pf_enum.h | 468 ++++++++++++++++++++++++ drivers/baseband/acc200/acc200_pmd.h | 588 +++++++++++++++++++++++++++++++ drivers/baseband/acc200/acc200_vf_enum.h | 89 +++++ drivers/baseband/acc200/rte_acc200_pmd.c | 2 + 4 files changed, 1147 insertions(+) create mode 100644 drivers/baseband/acc200/acc200_pf_enum.h create mode 100644 drivers/baseband/acc200/acc200_vf_enum.h diff --git a/drivers/baseband/acc200/acc200_pf_enum.h b/drivers/baseband/acc200/acc200_pf_enum.h new file mode 100644 index 0000000..e8d7001 --- /dev/null +++ b/drivers/baseband/acc200/acc200_pf_enum.h @@ -0,0 +1,468 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2021 Intel Corporation + */ + +#ifndef ACC200_PF_ENUM_H +#define ACC200_PF_ENUM_H + +/* + * ACC200 Register mapping on PF BAR0 + * This is automatically generated from RDL, format may change with new RDL + * Release. + * Variable names are as is + */ +enum { + HWPfQmgrEgressQueuesTemplate = 0x0007FC00, + HWPfQmgrIngressAq = 0x00080000, + HWPfQmgrArbQAvail = 0x00A00010, + HWPfQmgrArbQBlock = 0x00A00020, + HWPfQmgrAqueueDropNotifEn = 0x00A00024, + HWPfQmgrAqueueDisableNotifEn = 0x00A00028, + HWPfQmgrSoftReset = 0x00A00038, + HWPfQmgrInitStatus = 0x00A0003C, + HWPfQmgrAramWatchdogCount = 0x00A00040, + HWPfQmgrAramWatchdogCounterEn = 0x00A00044, + HWPfQmgrAxiWatchdogCount = 0x00A00048, + HWPfQmgrAxiWatchdogCounterEn = 0x00A0004C, + HWPfQmgrProcessWatchdogCount = 0x00A00060, + HWPfQmgrProcessWatchdogCounterEn = 0x00A00054, + HWPfQmgrProcessWatchdogCounter = 0x00A00060, + HWPfQmgrMsiOverflowUpperVf = 0x00A00080, + HWPfQmgrMsiOverflowLowerVf = 0x00A00084, + HWPfQmgrMsiWatchdogOverflow = 0x00A00088, + HWPfQmgrMsiOverflowEnable = 0x00A0008C, + HWPfQmgrDebugAqPointerMemGrp = 0x00A00100, + HWPfQmgrDebugOutputArbQFifoGrp = 0x00A00140, + HWPfQmgrDebugMsiFifoGrp = 0x00A00180, + HWPfQmgrDebugAxiWdTimeoutMsiFifo = 0x00A001C0, + HWPfQmgrDebugProcessWdTimeoutMsiFifo = 0x00A001C4, + HWPfQmgrDepthLog2Grp = 0x00A00200, + HWPfQmgrTholdGrp = 0x00A00300, + HWPfQmgrGrpTmplateReg0Indx = 0x00A00600, + HWPfQmgrGrpTmplateReg1Indx = 0x00A00700, + HWPfQmgrGrpTmplateReg2indx = 0x00A00800, + HWPfQmgrGrpTmplateReg3Indx = 0x00A00900, + HWPfQmgrGrpTmplateReg4Indx = 0x00A00A00, + HWPfQmgrVfBaseAddr = 0x00A01000, + HWPfQmgrUl4GWeightRrVf = 0x00A02000, + HWPfQmgrDl4GWeightRrVf = 0x00A02100, + HWPfQmgrUl5GWeightRrVf = 0x00A02200, + HWPfQmgrDl5GWeightRrVf = 0x00A02300, + HWPfQmgrMldWeightRrVf = 0x00A02400, + HWPfQmgrArbQDepthGrp = 0x00A02F00, + HWPfQmgrGrpFunction0 = 0x00A02F40, + HWPfQmgrGrpFunction1 = 0x00A02F44, + HWPfQmgrGrpPriority = 0x00A02F48, + HWPfQmgrWeightSync = 0x00A03000, + HWPfQmgrAqEnableVf = 0x00A10000, + HWPfQmgrAqResetVf = 0x00A20000, + HWPfQmgrRingSizeVf = 0x00A20004, + HWPfQmgrGrpDepthLog20Vf = 0x00A20008, + HWPfQmgrGrpDepthLog21Vf = 0x00A2000C, + HWPfQmgrGrpFunction0Vf = 0x00A20010, + HWPfQmgrGrpFunction1Vf = 0x00A20014, + HWPfFabricM2iBufferReg = 0x00B30000, + HWPfFabricI2Mcore_reg_g0 = 0x00B31000, + HWPfFabricI2Mcore_weight_g0 = 0x00B31004, + HWPfFabricI2Mbuffer_g0 = 0x00B31008, + HWPfFabricI2Mcore_reg_g1 = 0x00B31010, + HWPfFabricI2Mcore_weight_g1 = 0x00B31014, + HWPfFabricI2Mbuffer_g1 = 0x00B31018, + HWPfFabricI2Mcore_reg_g2 = 0x00B31020, + HWPfFabricI2Mcore_weight_g2 = 0x00B31024, + HWPfFabricI2Mbuffer_g2 = 0x00B31028, + HWPfFabricI2Mcore_reg_g3 = 0x00B31030, + HWPfFabricI2Mcore_weight_g3 = 0x00B31034, + HWPfFabricI2Mbuffer_g3 = 0x00B31038, + HWPfFabricI2Mdma_weight = 0x00B31044, + HWPfFecUl5gCntrlReg 
= 0x00B40000, + HWPfFecUl5gI2MThreshReg = 0x00B40004, + HWPfFecUl5gVersionReg = 0x00B40100, + HWPfFecUl5gFcwStatusReg = 0x00B40104, + HWPfFecUl5gWarnReg = 0x00B40108, + HwPfFecUl5gIbDebugReg = 0x00B40200, + HwPfFecUl5gObLlrDebugReg = 0x00B40204, + HwPfFecUl5gObHarqDebugReg = 0x00B40208, + HwPfFecUl5g1CntrlReg = 0x00B41000, + HwPfFecUl5g1I2MThreshReg = 0x00B41004, + HwPfFecUl5g1VersionReg = 0x00B41100, + HwPfFecUl5g1FcwStatusReg = 0x00B41104, + HwPfFecUl5g1WarnReg = 0x00B41108, + HwPfFecUl5g1IbDebugReg = 0x00B41200, + HwPfFecUl5g1ObLlrDebugReg = 0x00B41204, + HwPfFecUl5g1ObHarqDebugReg = 0x00B41208, + HwPfFecUl5g2CntrlReg = 0x00B42000, + HwPfFecUl5g2I2MThreshReg = 0x00B42004, + HwPfFecUl5g2VersionReg = 0x00B42100, + HwPfFecUl5g2FcwStatusReg = 0x00B42104, + HwPfFecUl5g2WarnReg = 0x00B42108, + HwPfFecUl5g2IbDebugReg = 0x00B42200, + HwPfFecUl5g2ObLlrDebugReg = 0x00B42204, + HwPfFecUl5g2ObHarqDebugReg = 0x00B42208, + HwPfFecUl5g3CntrlReg = 0x00B43000, + HwPfFecUl5g3I2MThreshReg = 0x00B43004, + HwPfFecUl5g3VersionReg = 0x00B43100, + HwPfFecUl5g3FcwStatusReg = 0x00B43104, + HwPfFecUl5g3WarnReg = 0x00B43108, + HwPfFecUl5g3IbDebugReg = 0x00B43200, + HwPfFecUl5g3ObLlrDebugReg = 0x00B43204, + HwPfFecUl5g3ObHarqDebugReg = 0x00B43208, + HwPfFecUl5g4CntrlReg = 0x00B44000, + HwPfFecUl5g4I2MThreshReg = 0x00B44004, + HwPfFecUl5g4VersionReg = 0x00B44100, + HwPfFecUl5g4FcwStatusReg = 0x00B44104, + HwPfFecUl5g4WarnReg = 0x00B44108, + HwPfFecUl5g4IbDebugReg = 0x00B44200, + HwPfFecUl5g4ObLlrDebugReg = 0x00B44204, + HwPfFecUl5g4ObHarqDebugReg = 0x00B44208, + HwPfFecUl5g5CntrlReg = 0x00B45000, + HwPfFecUl5g5I2MThreshReg = 0x00B45004, + HwPfFecUl5g5VersionReg = 0x00B45100, + HwPfFecUl5g5FcwStatusReg = 0x00B45104, + HwPfFecUl5g5WarnReg = 0x00B45108, + HwPfFecUl5g5IbDebugReg = 0x00B45200, + HwPfFecUl5g5ObLlrDebugReg = 0x00B45204, + HwPfFecUl5g5ObHarqDebugReg = 0x00B45208, + HwPfFecUl5g6CntrlReg = 0x00B46000, + HwPfFecUl5g6I2MThreshReg = 0x00B46004, + HwPfFecUl5g6VersionReg = 0x00B46100, + HwPfFecUl5g6FcwStatusReg = 0x00B46104, + HwPfFecUl5g6WarnReg = 0x00B46108, + HwPfFecUl5g6IbDebugReg = 0x00B46200, + HwPfFecUl5g6ObLlrDebugReg = 0x00B46204, + HwPfFecUl5g6ObHarqDebugReg = 0x00B46208, + HwPfFecUl5g7CntrlReg = 0x00B47000, + HwPfFecUl5g7I2MThreshReg = 0x00B47004, + HwPfFecUl5g7VersionReg = 0x00B47100, + HwPfFecUl5g7FcwStatusReg = 0x00B47104, + HwPfFecUl5g7WarnReg = 0x00B47108, + HwPfFecUl5g7IbDebugReg = 0x00B47200, + HwPfFecUl5g7ObLlrDebugReg = 0x00B47204, + HwPfFecUl5g7ObHarqDebugReg = 0x00B47208, + HwPfFecUl5g8CntrlReg = 0x00B48000, + HwPfFecUl5g8I2MThreshReg = 0x00B48004, + HwPfFecUl5g8VersionReg = 0x00B48100, + HwPfFecUl5g8FcwStatusReg = 0x00B48104, + HwPfFecUl5g8WarnReg = 0x00B48108, + HwPfFecUl5g8IbDebugReg = 0x00B48200, + HwPfFecUl5g8ObLlrDebugReg = 0x00B48204, + HwPfFecUl5g8ObHarqDebugReg = 0x00B48208, + HWPfFecDl5gCntrlReg = 0x00B4F000, + HWPfFecDl5gI2MThreshReg = 0x00B4F004, + HWPfFecDl5gVersionReg = 0x00B4F100, + HWPfFecDl5gFcwStatusReg = 0x00B4F104, + HWPfFecDl5gWarnReg = 0x00B4F108, + HWPfFecUlVersionReg = 0x00B50000, + HWPfFecUlControlReg = 0x00B50004, + HWPfFecUlStatusReg = 0x00B50008, + HWPfFftConfig0 = 0x00B58004, + HWPfFftConfig1 = 0x00B58008, + HWPfFftRamPageAccess = 0x00B5800C, + HWPfFftRamOff = 0x00B58800, + HWPfFecDlVersionReg = 0x00B5F000, + HWPfFecDlClusterConfigReg = 0x00B5F004, + HWPfFecDlBurstThres = 0x00B5F00C, + HWPfFecDlClusterStatusReg0 = 0x00B5F040, + HWPfFecDlClusterStatusReg1 = 0x00B5F044, + HWPfFecDlClusterStatusReg2 = 0x00B5F048, + HWPfFecDlClusterStatusReg3 = 0x00B5F04C, + 
HWPfFecDlClusterStatusReg4 = 0x00B5F050, + HWPfFecDlClusterStatusReg5 = 0x00B5F054, + HWPfDmaConfig0Reg = 0x00B80000, + HWPfDmaConfig1Reg = 0x00B80004, + HWPfDmaQmgrAddrReg = 0x00B80008, + HWPfDmaSoftResetReg = 0x00B8000C, + HWPfDmaAxcacheReg = 0x00B80010, + HWPfDmaVersionReg = 0x00B80014, + HWPfDmaFrameThreshold = 0x00B80018, + HWPfDmaTimestampLo = 0x00B8001C, + HWPfDmaTimestampHi = 0x00B80020, + HWPfDmaAxiStatus = 0x00B80028, + HWPfDmaAxiControl = 0x00B8002C, + HWPfDmaNoQmgr = 0x00B80030, + HWPfDmaQosScale = 0x00B80034, + HWPfDmaQmanen = 0x00B80040, + HWPfDmaFftModeThld = 0x00B80054, + HWPfDmaQmgrQosBase = 0x00B80060, + HWPfDmaFecClkGatingEnable = 0x00B80080, + HWPfDmaPmEnable = 0x00B80084, + HWPfDmaQosEnable = 0x00B80088, + HWPfDmaHarqWeightedRrFrameThreshold = 0x00B800B0, + HWPfDmaDataSmallWeightedRrFrameThresh = 0x00B800B4, + HWPfDmaDataLargeWeightedRrFrameThresh = 0x00B800B8, + HWPfDmaInboundCbMaxSize = 0x00B800BC, + HWPfDmaInboundDrainDataSize = 0x00B800C0, + HWPfDmaEngineTypeSmall = 0x00B800C4, + HWPfDma5gdlIbThld = 0x00B800C8, + HWPfDma4gdlIbThld = 0x00B800CC, + HWPfDmafftIbThld = 0x00B800D0, + HWPfDmaVfDdrBaseRw = 0x00B80400, + HWPfDmaCmplTmOutCnt = 0x00B80800, + HWPfDmaProcTmOutCnt = 0x00B80804, + HWPfDmaStatusRrespBresp = 0x00B80810, + HWPfDmaCfgRrespBresp = 0x00B80814, + HWPfDmaStatusMemParErr = 0x00B80818, + HWPfDmaCfgMemParErrEn = 0x00B8081C, + HWPfDmaStatusDmaHwErr = 0x00B80820, + HWPfDmaCfgDmaHwErrEn = 0x00B80824, + HWPfDmaStatusFecCoreErr = 0x00B80828, + HWPfDmaCfgFecCoreErrEn = 0x00B8082C, + HWPfDmaStatusFcwDescrErr = 0x00B80830, + HWPfDmaCfgFcwDescrErrEn = 0x00B80834, + HWPfDmaStatusBlockTransmit = 0x00B80838, + HWPfDmaBlockOnErrEn = 0x00B8083C, + HWPfDmaStatusFlushDma = 0x00B80840, + HWPfDmaFlushDmaOnErrEn = 0x00B80844, + HWPfDmaStatusSdoneFifoFull = 0x00B80848, + HWPfDmaStatusDescriptorErrLoVf = 0x00B8084C, + HWPfDmaStatusDescriptorErrHiVf = 0x00B80850, + HWPfDmaStatusFcwErrLoVf = 0x00B80854, + HWPfDmaStatusFcwErrHiVf = 0x00B80858, + HWPfDmaStatusDataErrLoVf = 0x00B8085C, + HWPfDmaStatusDataErrHiVf = 0x00B80860, + HWPfDmaCfgMsiEnSoftwareErr = 0x00B80864, + HWPfDmaDescriptorSignatuture = 0x00B80868, + HWPfDmaFcwSignature = 0x00B8086C, + HWPfDmaErrorDetectionEn = 0x00B80870, + HWPfDmaErrCntrlFifoDebug = 0x00B8087C, + HWPfDmaStatusToutData = 0x00B80880, + HWPfDmaStatusToutDesc = 0x00B80884, + HWPfDmaStatusToutUnexpData = 0x00B80888, + HWPfDmaStatusToutUnexpDesc = 0x00B8088C, + HWPfDmaStatusToutProcess = 0x00B80890, + HWPfDmaConfigCtoutOutDataEn = 0x00B808A0, + HWPfDmaConfigCtoutOutDescrEn = 0x00B808A4, + HWPfDmaConfigUnexpComplDataEn = 0x00B808A8, + HWPfDmaConfigUnexpComplDescrEn = 0x00B808AC, + HWPfDmaConfigPtoutOutEn = 0x00B808B0, + HWPfDmaFec5GulDescBaseLoRegVf = 0x00B88020, + HWPfDmaFec5GulDescBaseHiRegVf = 0x00B88024, + HWPfDmaFec5GulRespPtrLoRegVf = 0x00B88028, + HWPfDmaFec5GulRespPtrHiRegVf = 0x00B8802C, + HWPfDmaFec5GdlDescBaseLoRegVf = 0x00B88040, + HWPfDmaFec5GdlDescBaseHiRegVf = 0x00B88044, + HWPfDmaFec5GdlRespPtrLoRegVf = 0x00B88048, + HWPfDmaFec5GdlRespPtrHiRegVf = 0x00B8804C, + HWPfDmaFec4GulDescBaseLoRegVf = 0x00B88060, + HWPfDmaFec4GulDescBaseHiRegVf = 0x00B88064, + HWPfDmaFec4GulRespPtrLoRegVf = 0x00B88068, + HWPfDmaFec4GulRespPtrHiRegVf = 0x00B8806C, + HWPfDmaFec4GdlDescBaseLoRegVf = 0x00B88080, + HWPfDmaFec4GdlDescBaseHiRegVf = 0x00B88084, + HWPfDmaFec4GdlRespPtrLoRegVf = 0x00B88088, + HWPfDmaFec4GdlRespPtrHiRegVf = 0x00B8808C, + HWPDmaFftDescBaseLoRegVf = 0x00B880A0, + HWPDmaFftDescBaseHiRegVf = 0x00B880A4, + HWPDmaFftRespPtrLoRegVf = 0x00B880A8, + 
HWPDmaFftRespPtrHiRegVf = 0x00B880AC, + HWPfQosmonACntrlReg = 0x00B90000, + HWPfQosmonAEvalOverflow0 = 0x00B90008, + HWPfQosmonAEvalOverflow1 = 0x00B9000C, + HWPfQosmonADivTerm = 0x00B90010, + HWPfQosmonATickTerm = 0x00B90014, + HWPfQosmonAEvalTerm = 0x00B90018, + HWPfQosmonAAveTerm = 0x00B9001C, + HWPfQosmonAForceEccErr = 0x00B90020, + HWPfQosmonAEccErrDetect = 0x00B90024, + HWPfQosmonAIterationConfig0Low = 0x00B90060, + HWPfQosmonAIterationConfig0High = 0x00B90064, + HWPfQosmonAIterationConfig1Low = 0x00B90068, + HWPfQosmonAIterationConfig1High = 0x00B9006C, + HWPfQosmonAIterationConfig2Low = 0x00B90070, + HWPfQosmonAIterationConfig2High = 0x00B90074, + HWPfQosmonAIterationConfig3Low = 0x00B90078, + HWPfQosmonAIterationConfig3High = 0x00B9007C, + HWPfQosmonAEvalMemAddr = 0x00B90080, + HWPfQosmonAEvalMemData = 0x00B90084, + HWPfQosmonAXaction = 0x00B900C0, + HWPfQosmonARemThres1Vf = 0x00B90400, + HWPfQosmonAThres2Vf = 0x00B90404, + HWPfQosmonAWeiFracVf = 0x00B90408, + HWPfQosmonARrWeiVf = 0x00B9040C, + HWPfPermonACntrlRegVf = 0x00B98000, + HWPfPermonACountVf = 0x00B98008, + HWPfPermonAKCntLoVf = 0x00B98010, + HWPfPermonAKCntHiVf = 0x00B98014, + HWPfPermonADeltaCntLoVf = 0x00B98020, + HWPfPermonADeltaCntHiVf = 0x00B98024, + HWPfPermonAVersionReg = 0x00B9C000, + HWPfPermonACbControlFec = 0x00B9C0F0, + HWPfPermonADltTimerLoFec = 0x00B9C0F4, + HWPfPermonADltTimerHiFec = 0x00B9C0F8, + HWPfPermonACbCountFec = 0x00B9C100, + HWPfPermonAAccExecTimerLoFec = 0x00B9C104, + HWPfPermonAAccExecTimerHiFec = 0x00B9C108, + HWPfPermonAExecTimerMinFec = 0x00B9C200, + HWPfPermonAExecTimerMaxFec = 0x00B9C204, + HWPfPermonAControlBusMon = 0x00B9C400, + HWPfPermonAConfigBusMon = 0x00B9C404, + HWPfPermonASkipCountBusMon = 0x00B9C408, + HWPfPermonAMinLatBusMon = 0x00B9C40C, + HWPfPermonAMaxLatBusMon = 0x00B9C500, + HWPfPermonATotalLatLowBusMon = 0x00B9C504, + HWPfPermonATotalLatUpperBusMon = 0x00B9C508, + HWPfPermonATotalReqCntBusMon = 0x00B9C50C, + HWPfQosmonBCntrlReg = 0x00BA0000, + HWPfQosmonBEvalOverflow0 = 0x00BA0008, + HWPfQosmonBEvalOverflow1 = 0x00BA000C, + HWPfQosmonBDivTerm = 0x00BA0010, + HWPfQosmonBTickTerm = 0x00BA0014, + HWPfQosmonBEvalTerm = 0x00BA0018, + HWPfQosmonBAveTerm = 0x00BA001C, + HWPfQosmonBForceEccErr = 0x00BA0020, + HWPfQosmonBEccErrDetect = 0x00BA0024, + HWPfQosmonBIterationConfig0Low = 0x00BA0060, + HWPfQosmonBIterationConfig0High = 0x00BA0064, + HWPfQosmonBIterationConfig1Low = 0x00BA0068, + HWPfQosmonBIterationConfig1High = 0x00BA006C, + HWPfQosmonBIterationConfig2Low = 0x00BA0070, + HWPfQosmonBIterationConfig2High = 0x00BA0074, + HWPfQosmonBIterationConfig3Low = 0x00BA0078, + HWPfQosmonBIterationConfig3High = 0x00BA007C, + HWPfQosmonBEvalMemAddr = 0x00BA0080, + HWPfQosmonBEvalMemData = 0x00BA0084, + HWPfQosmonBXaction = 0x00BA00C0, + HWPfQosmonBRemThres1Vf = 0x00BA0400, + HWPfQosmonBThres2Vf = 0x00BA0404, + HWPfQosmonBWeiFracVf = 0x00BA0408, + HWPfQosmonBRrWeiVf = 0x00BA040C, + HWPfPermonBCntrlRegVf = 0x00BA8000, + HWPfPermonBCountVf = 0x00BA8008, + HWPfPermonBKCntLoVf = 0x00BA8010, + HWPfPermonBKCntHiVf = 0x00BA8014, + HWPfPermonBDeltaCntLoVf = 0x00BA8020, + HWPfPermonBDeltaCntHiVf = 0x00BA8024, + HWPfPermonBVersionReg = 0x00BAC000, + HWPfPermonBCbControlFec = 0x00BAC0F0, + HWPfPermonBDltTimerLoFec = 0x00BAC0F4, + HWPfPermonBDltTimerHiFec = 0x00BAC0F8, + HWPfPermonBCbCountFec = 0x00BAC100, + HWPfPermonBAccExecTimerLoFec = 0x00BAC104, + HWPfPermonBAccExecTimerHiFec = 0x00BAC108, + HWPfPermonBExecTimerMinFec = 0x00BAC200, + HWPfPermonBExecTimerMaxFec = 0x00BAC204, + 
HWPfPermonBControlBusMon = 0x00BAC400, + HWPfPermonBConfigBusMon = 0x00BAC404, + HWPfPermonBSkipCountBusMon = 0x00BAC408, + HWPfPermonBMinLatBusMon = 0x00BAC40C, + HWPfPermonBMaxLatBusMon = 0x00BAC500, + HWPfPermonBTotalLatLowBusMon = 0x00BAC504, + HWPfPermonBTotalLatUpperBusMon = 0x00BAC508, + HWPfPermonBTotalReqCntBusMon = 0x00BAC50C, + HWPfQosmonCCntrlReg = 0x00BB0000, + HWPfQosmonCEvalOverflow0 = 0x00BB0008, + HWPfQosmonCEvalOverflow1 = 0x00BB000C, + HWPfQosmonCDivTerm = 0x00BB0010, + HWPfQosmonCTickTerm = 0x00BB0014, + HWPfQosmonCEvalTerm = 0x00BB0018, + HWPfQosmonCAveTerm = 0x00BB001C, + HWPfQosmonCForceEccErr = 0x00BB0020, + HWPfQosmonCEccErrDetect = 0x00BB0024, + HWPfQosmonCIterationConfig0Low = 0x00BB0060, + HWPfQosmonCIterationConfig0High = 0x00BB0064, + HWPfQosmonCIterationConfig1Low = 0x00BB0068, + HWPfQosmonCIterationConfig1High = 0x00BB006C, + HWPfQosmonCIterationConfig2Low = 0x00BB0070, + HWPfQosmonCIterationConfig2High = 0x00BB0074, + HWPfQosmonCIterationConfig3Low = 0x00BB0078, + HWPfQosmonCIterationConfig3High = 0x00BB007C, + HWPfQosmonCEvalMemAddr = 0x00BB0080, + HWPfQosmonCEvalMemData = 0x00BB0084, + HWPfQosmonCXaction = 0x00BB00C0, + HWPfQosmonCRemThres1Vf = 0x00BB0400, + HWPfQosmonCThres2Vf = 0x00BB0404, + HWPfQosmonCWeiFracVf = 0x00BB0408, + HWPfQosmonCRrWeiVf = 0x00BB040C, + HWPfPermonCCntrlRegVf = 0x00BB8000, + HWPfPermonCCountVf = 0x00BB8008, + HWPfPermonCKCntLoVf = 0x00BB8010, + HWPfPermonCKCntHiVf = 0x00BB8014, + HWPfPermonCDeltaCntLoVf = 0x00BB8020, + HWPfPermonCDeltaCntHiVf = 0x00BB8024, + HWPfPermonCVersionReg = 0x00BBC000, + HWPfPermonCCbControlFec = 0x00BBC0F0, + HWPfPermonCDltTimerLoFec = 0x00BBC0F4, + HWPfPermonCDltTimerHiFec = 0x00BBC0F8, + HWPfPermonCCbCountFec = 0x00BBC100, + HWPfPermonCAccExecTimerLoFec = 0x00BBC104, + HWPfPermonCAccExecTimerHiFec = 0x00BBC108, + HWPfPermonCExecTimerMinFec = 0x00BBC200, + HWPfPermonCExecTimerMaxFec = 0x00BBC204, + HWPfPermonCControlBusMon = 0x00BBC400, + HWPfPermonCConfigBusMon = 0x00BBC404, + HWPfPermonCSkipCountBusMon = 0x00BBC408, + HWPfPermonCMinLatBusMon = 0x00BBC40C, + HWPfPermonCMaxLatBusMon = 0x00BBC500, + HWPfPermonCTotalLatLowBusMon = 0x00BBC504, + HWPfPermonCTotalLatUpperBusMon = 0x00BBC508, + HWPfPermonCTotalReqCntBusMon = 0x00BBC50C, + HWPfHiVfToPfDbellVf = 0x00C80000, + HWPfHiPfToVfDbellVf = 0x00C80008, + HWPfHiInfoRingBaseLoVf = 0x00C80010, + HWPfHiInfoRingBaseHiVf = 0x00C80014, + HWPfHiInfoRingPointerVf = 0x00C80018, + HWPfHiInfoRingIntWrEnVf = 0x00C80020, + HWPfHiInfoRingPf2VfWrEnVf = 0x00C80024, + HWPfHiMsixVectorMapperVf = 0x00C80060, + HWPfHiModuleVersionReg = 0x00C84000, + HWPfHiIosf2axiErrLogReg = 0x00C84004, + HWPfHiHardResetReg = 0x00C84008, + HWPfHi5GHardResetReg = 0x00C8400C, + HWPfHiInfoRingBaseLoRegPf = 0x00C84014, + HWPfHiInfoRingBaseHiRegPf = 0x00C84018, + HWPfHiInfoRingPointerRegPf = 0x00C8401C, + HWPfHiInfoRingIntWrEnRegPf = 0x00C84020, + HWPfHiInfoRingVf2pfLoWrEnReg = 0x00C84024, + HWPfHiInfoRingVf2pfHiWrEnReg = 0x00C84028, + HWPfHiLogParityErrStatusReg = 0x00C8402C, + HWPfHiLogDataParityErrorVfStatusLo = 0x00C84030, + HWPfHiLogDataParityErrorVfStatusHi = 0x00C84034, + HWPfHiBlockTransmitOnErrorEn = 0x00C84038, + HWPfHiCfgMsiIntWrEnRegPf = 0x00C84040, + HWPfHiCfgMsiVf2pfLoWrEnReg = 0x00C84044, + HWPfHiCfgMsiVf2pfHighWrEnReg = 0x00C84048, + HWPfHiMsixVectorMapperPf = 0x00C84060, + HWPfHiApbWrWaitTime = 0x00C84100, + HWPfHiXCounterMaxValue = 0x00C84104, + HWPfHiPfMode = 0x00C84108, + HWPfHiClkGateHystReg = 0x00C8410C, + HWPfHiSnoopBitsReg = 0x00C84110, + HWPfHiMsiDropEnableReg = 
0x00C84114, + HWPfHiMsiStatReg = 0x00C84120, + HWPfHiFifoOflStatReg = 0x00C84124, + HWPfHiSectionPowerGatingReq = 0x00C84128, + HWPfHiSectionPowerGatingAck = 0x00C8412C, + HWPfHiSectionPowerGatingWaitCounter = 0x00C84130, + HWPfHiHiDebugReg = 0x00C841F4, + HWPfHiDebugMemSnoopMsiFifo = 0x00C841F8, + HWPfHiDebugMemSnoopInputFifo = 0x00C841FC, + HWPfHiMsixMappingConfig = 0x00C84200, + HWPfHiJunkReg = 0x00C8FF00, + HWPfHiMSIXBaseLoRegPf = 0x00D20000, + HWPfHiMSIXBaseHiRegPf = 0x00D20004, + HWPfHiMSIXBaseDataRegPf = 0x00D20008, + HWPfHiMSIXBaseMaskRegPf = 0x00D2000c, + HWPfHiMSIXPBABaseLoRegPf = 0x00E01000, +}; + +/* TIP PF Interrupt numbers */ +enum { + ACC200_PF_INT_QMGR_AQ_OVERFLOW = 0, + ACC200_PF_INT_DOORBELL_VF_2_PF = 1, + ACC200_PF_INT_ILLEGAL_FORMAT = 2, + ACC200_PF_INT_QMGR_DISABLED_ACCESS = 3, + ACC200_PF_INT_QMGR_AQ_OVERTHRESHOLD = 4, + ACC200_PF_INT_DMA_DL_DESC_IRQ = 5, + ACC200_PF_INT_DMA_UL_DESC_IRQ = 6, + ACC200_PF_INT_DMA_FFT_DESC_IRQ = 7, + ACC200_PF_INT_DMA_UL5G_DESC_IRQ = 8, + ACC200_PF_INT_DMA_DL5G_DESC_IRQ = 9, + ACC200_PF_INT_DMA_MLD_DESC_IRQ = 10, + ACC200_PF_INT_ARAM_ECC_1BIT_ERR = 11, + ACC200_PF_INT_PARITY_ERR = 12, + ACC200_PF_INT_QMGR_ERR = 13, + ACC200_PF_INT_INT_REQ_OVERFLOW = 14, + ACC200_PF_INT_APB_TIMEOUT = 15, +}; + +#endif /* ACC200_PF_ENUM_H */ diff --git a/drivers/baseband/acc200/acc200_pmd.h b/drivers/baseband/acc200/acc200_pmd.h index a22ca67..b420524 100644 --- a/drivers/baseband/acc200/acc200_pmd.h +++ b/drivers/baseband/acc200/acc200_pmd.h @@ -5,6 +5,9 @@ #ifndef _RTE_ACC200_PMD_H_ #define _RTE_ACC200_PMD_H_ +#include "acc200_pf_enum.h" +#include "acc200_vf_enum.h" + /* Helper macro for logging */ #define rte_bbdev_log(level, fmt, ...) \ rte_log(RTE_LOG_ ## level, acc200_logtype, fmt "\n", \ @@ -27,6 +30,591 @@ #define RTE_ACC200_PF_DEVICE_ID (0x57C0) #define RTE_ACC200_VF_DEVICE_ID (0x57C1) +/* Define as 1 to use only a single FEC engine */ +#ifndef RTE_ACC200_SINGLE_FEC +#define RTE_ACC200_SINGLE_FEC 0 +#endif + +/* Values used in filling in descriptors */ +#define ACC200_DMA_DESC_TYPE 2 +#define ACC200_DMA_CODE_BLK_MODE 0 +#define ACC200_DMA_BLKID_FCW 1 +#define ACC200_DMA_BLKID_IN 2 +#define ACC200_DMA_BLKID_OUT_ENC 1 +#define ACC200_DMA_BLKID_OUT_HARD 1 +#define ACC200_DMA_BLKID_OUT_SOFT 2 +#define ACC200_DMA_BLKID_OUT_HARQ 3 +#define ACC200_DMA_BLKID_IN_HARQ 3 + +/* Values used in filling in decode FCWs */ +#define ACC200_FCW_TD_VER 1 +#define ACC200_FCW_TD_EXT_COLD_REG_EN 1 +#define ACC200_FCW_TD_AUTOMAP 0x0f +#define ACC200_FCW_TD_RVIDX_0 2 +#define ACC200_FCW_TD_RVIDX_1 26 +#define ACC200_FCW_TD_RVIDX_2 50 +#define ACC200_FCW_TD_RVIDX_3 74 +#define ACC200_MAX_PF_MSIX (256+32) +#define ACC200_MAX_VF_MSIX (256+7) + +/* Values used in writing to the registers */ +#define ACC200_REG_IRQ_EN_ALL 0x1FF83FF /* Enable all interrupts */ + +/* ACC200 Specific Dimensioning */ +#define ACC200_SIZE_64MBYTE (64*1024*1024) +/* Number of elements in an Info Ring */ +#define ACC200_INFO_RING_NUM_ENTRIES 1024 +/* Number of elements in HARQ layout memory + * 128M x 32kB = 4GB addressable memory + */ +#define ACC200_HARQ_LAYOUT (128 * 1024 * 1024) +/* Assume offset for HARQ in memory */ +#define ACC200_HARQ_OFFSET (32 * 1024) +#define ACC200_HARQ_OFFSET_SHIFT 15 +#define ACC200_HARQ_OFFSET_MASK 0x7ffffff +/* Mask used to calculate an index in an Info Ring array (not a byte offset) */ +#define ACC200_INFO_RING_MASK (ACC200_INFO_RING_NUM_ENTRIES-1) +/* Number of Virtual Functions ACC200 supports */ +#define ACC200_NUM_VFS 16 +#define ACC200_NUM_QGRPS 16 +#define 
ACC200_NUM_QGRPS_PER_WORD 8 +#define ACC200_NUM_AQS 16 +#define MAX_ENQ_BATCH_SIZE 255 +/* All ACC200 Registers alignment are 32bits = 4B */ +#define ACC200_BYTES_IN_WORD 4 +#define ACC200_MAX_E_MBUF 64000 +#define ACC200_ALGO_SPA 0 +#define ACC200_ALGO_MSA 1 + +#define ACC200_GRP_ID_SHIFT 10 /* Queue Index Hierarchy */ +#define ACC200_VF_ID_SHIFT 4 /* Queue Index Hierarchy */ +#define ACC200_VF_OFFSET_QOS 16 /* offset in Memory specific to QoS Mon */ +#define ACC200_TMPL_PRI_0 0x03020100 +#define ACC200_TMPL_PRI_1 0x07060504 +#define ACC200_TMPL_PRI_2 0x0b0a0908 +#define ACC200_TMPL_PRI_3 0x0f0e0d0c +#define ACC200_QUEUE_ENABLE 0x80000000 /* Bit to mark Queue as Enabled */ +#define ACC200_WORDS_IN_ARAM_SIZE (256 * 1024 / 4) +#define ACC200_FDONE 0x80000000 +#define ACC200_SDONE 0x40000000 + +#define ACC200_NUM_TMPL 32 +/* Mapping of signals for the available engines */ +#define ACC200_SIG_UL_5G 0 +#define ACC200_SIG_UL_5G_LAST 4 +#define ACC200_SIG_DL_5G 10 +#define ACC200_SIG_DL_5G_LAST 11 +#define ACC200_SIG_UL_4G 12 +#define ACC200_SIG_UL_4G_LAST 16 +#define ACC200_SIG_DL_4G 21 +#define ACC200_SIG_DL_4G_LAST 23 +#define ACC200_SIG_FFT 24 +#define ACC200_SIG_FFT_LAST 24 + +#define ACC200_NUM_ACCS 5 /* FIXMEFFT */ +#define ACC200_ACCMAP_0 0 +#define ACC200_ACCMAP_1 2 +#define ACC200_ACCMAP_2 1 +#define ACC200_ACCMAP_3 3 +#define ACC200_ACCMAP_4 4 +#define ACC200_PF_VAL 2 + +/* max number of iterations to allocate memory block for all rings */ +#define ACC200_SW_RING_MEM_ALLOC_ATTEMPTS 5 +#define ACC200_MAX_QUEUE_DEPTH 1024 +#define ACC200_DMA_MAX_NUM_POINTERS 14 +#define ACC200_DMA_MAX_NUM_POINTERS_IN 7 +#define ACC200_DMA_DESC_PADDING 8 +#define ACC200_FCW_PADDING 12 +#define ACC200_DESC_FCW_OFFSET 192 +#define ACC200_DESC_SIZE 256 +#define ACC200_DESC_OFFSET (ACC200_DESC_SIZE / 64) +#define ACC200_FCW_TE_BLEN 32 +#define ACC200_FCW_TD_BLEN 24 +#define ACC200_FCW_LE_BLEN 32 +#define ACC200_FCW_LD_BLEN 36 +#define ACC200_FCW_FFT_BLEN 28 +#define ACC200_5GUL_SIZE_0 16 +#define ACC200_5GUL_SIZE_1 40 +#define ACC200_5GUL_OFFSET_0 36 +#define ACC200_COMPANION_PTRS 8 + +#define ACC200_FCW_VER 2 +#define ACC200_MUX_5GDL_DESC 6 +#define ACC200_CMP_ENC_SIZE 20 +#define ACC200_CMP_DEC_SIZE 24 +#define ACC200_ENC_OFFSET (32) +#define ACC200_DEC_OFFSET (80) +#define ACC200_HARQ_OFFSET_THRESHOLD 1024 +#define ACC200_LIMIT_DL_MUX_BITS 534 + +/* Constants from K0 computation from 3GPP 38.212 Table 5.4.2.1-2 */ +#define ACC200_N_ZC_1 66 /* N = 66 Zc for BG 1 */ +#define ACC200_N_ZC_2 50 /* N = 50 Zc for BG 2 */ +#define ACC200_K0_1_1 17 /* K0 fraction numerator for rv 1 and BG 1 */ +#define ACC200_K0_1_2 13 /* K0 fraction numerator for rv 1 and BG 2 */ +#define ACC200_K0_2_1 33 /* K0 fraction numerator for rv 2 and BG 1 */ +#define ACC200_K0_2_2 25 /* K0 fraction numerator for rv 2 and BG 2 */ +#define ACC200_K0_3_1 56 /* K0 fraction numerator for rv 3 and BG 1 */ +#define ACC200_K0_3_2 43 /* K0 fraction numerator for rv 3 and BG 2 */ + +/* ACC200 Configuration */ +#define ACC200_FABRIC_MODE 0x8000103 +#define ACC200_CFG_DMA_ERROR 0x3DF +#define ACC200_CFG_AXI_CACHE 0x11 +#define ACC200_CFG_QMGR_HI_P 0x0F0F +#define ACC200_ENGINE_OFFSET 0x1000 +#define ACC200_RESET_HI 0x20100 +#define ACC200_RESET_LO 0x20000 +#define ACC200_RESET_HARD 0x1FF +#define ACC200_ENGINES_MAX 9 +#define ACC200_LONG_WAIT 1000 +#define ACC200_GPEX_AXIMAP_NUM 17 +#define ACC200_CLOCK_GATING_EN 0x30000 +#define ACC200_MS_IN_US (1000) +#define ACC200_FFT_CFG_0 0x2001 +#define ACC200_FFT_RAM_EN 0x80008000 +#define ACC200_FFT_RAM_DIS 
0x0 +#define ACC200_FFT_RAM_SIZE 512 +#define ACC200_CLK_EN 0x00010A01 +#define ACC200_CLK_DIS 0x01F10A01 +#define ACC200_PG_MASK_0 0x1F +#define ACC200_PG_MASK_1 0xF +#define ACC200_PG_MASK_2 0x1 +#define ACC200_PG_MASK_3 0x0 +#define ACC200_PG_MASK_FFT 1 +#define ACC200_PG_MASK_4GUL 4 +#define ACC200_PG_MASK_5GUL 8 +#define ACC200_STATUS_WAIT 10 +#define ACC200_STATUS_TO 100 + +/* ACC200 DMA Descriptor triplet */ +struct acc200_dma_triplet { + uint64_t address; + uint32_t blen:20, + res0:4, + last:1, + dma_ext:1, + res1:2, + blkid:4; +} __rte_packed; + +/* ACC200 DMA Response Descriptor */ +union acc200_dma_rsp_desc { + uint32_t val; + struct { + uint32_t crc_status:1, + synd_ok:1, + dma_err:1, + neg_stop:1, + fcw_err:1, + output_truncat:1, + input_err:1, + timestampEn:1, + iterCountFrac:8, + iter_cnt:8, + rsrvd3:6, + sdone:1, + fdone:1; + uint32_t add_info_0; + uint32_t add_info_1; + }; +}; + + +/* ACC200 Queue Manager Enqueue PCI Register */ +union acc200_enqueue_reg_fmt { + uint32_t val; + struct { + uint32_t num_elem:8, + addr_offset:3, + rsrvd:1, + req_elem_addr:20; + }; +}; + +/* FEC 4G Uplink Frame Control Word */ +struct __rte_packed acc200_fcw_td { + uint8_t fcw_ver:4, + num_maps:4; + uint8_t filler:6, + rsrvd0:1, + bypass_sb_deint:1; + uint16_t k_pos; + uint16_t k_neg; + uint8_t c_neg; + uint8_t c; + uint32_t ea; + uint32_t eb; + uint8_t cab; + uint8_t k0_start_col; + uint8_t rsrvd1; + uint8_t code_block_mode:1, + turbo_crc_type:1, + rsrvd2:3, + bypass_teq:1, + soft_output_en:1, + ext_td_cold_reg_en:1; + union { /* External Cold register */ + uint32_t ext_td_cold_reg; + struct { + uint32_t min_iter:4, + max_iter:4, + ext_scale:5, + rsrvd3:3, + early_stop_en:1, + sw_soft_out_dis:1, + sw_et_cont:1, + sw_soft_out_saturation:1, + half_iter_on:1, + raw_decoder_input_on:1, /* Unused */ + rsrvd4:10; + }; + }; +}; + +/* FEC 5GNR Uplink Frame Control Word */ +struct __rte_packed acc200_fcw_ld { + uint32_t FCWversion:4, + qm:4, + nfiller:11, + BG:1, + Zc:9, + cnu_algo:1, + synd_precoder:1, + synd_post:1; + uint32_t ncb:16, + k0:16; + uint32_t rm_e:24, + hcin_en:1, + hcout_en:1, + crc_select:1, + bypass_dec:1, + bypass_intlv:1, + so_en:1, + so_bypass_rm:1, + so_bypass_intlv:1; + uint32_t hcin_offset:16, + hcin_size0:16; + uint32_t hcin_size1:16, + hcin_decomp_mode:3, + llr_pack_mode:1, + hcout_comp_mode:3, + res2:1, + dec_convllr:4, + hcout_convllr:4; + uint32_t itmax:7, + itstop:1, + so_it:7, + res3:1, + hcout_offset:16; + uint32_t hcout_size0:16, + hcout_size1:16; + uint32_t gain_i:8, + gain_h:8, + negstop_th:16; + uint32_t negstop_it:7, + negstop_en:1, + tb_crc_select:2, + res4:2, + tb_trailer_size:20; +}; + +/* FEC 4G Downlink Frame Control Word */ +struct __rte_packed acc200_fcw_te { + uint16_t k_neg; + uint16_t k_pos; + uint8_t c_neg; + uint8_t c; + uint8_t filler; + uint8_t cab; + uint32_t ea:17, + rsrvd0:15; + uint32_t eb:17, + rsrvd1:15; + uint16_t ncb_neg; + uint16_t ncb_pos; + uint8_t rv_idx0:2, + rsrvd2:2, + rv_idx1:2, + rsrvd3:2; + uint8_t bypass_rv_idx0:1, + bypass_rv_idx1:1, + bypass_rm:1, + rsrvd4:5; + uint8_t rsrvd5:1, + rsrvd6:3, + code_block_crc:1, + rsrvd7:3; + uint8_t code_block_mode:1, + rsrvd8:7; + uint64_t rsrvd9; +}; + +/* FEC 5GNR Downlink Frame Control Word */ +struct __rte_packed acc200_fcw_le { + uint32_t FCWversion:4, + qm:4, + nfiller:11, + BG:1, + Zc:9, + res0:3; + uint32_t ncb:16, + k0:16; + uint32_t rm_e:24, + res1:2, + crc_select:1, + res2:1, + bypass_intlv:1, + res3:3; + uint32_t res4_a:12, + mcb_count:3, + res4_b:17; + uint32_t res5; + uint32_t res6; + 
uint32_t res7; + uint32_t res8; +}; + +/* FFT Frame Control Word */ +struct __rte_packed acc200_fcw_fft { + uint32_t in_frame_size:16, + leading_pad_size:16; + uint32_t out_frame_size:16, + leading_depad_size:16; + uint32_t cs_window_sel; + uint32_t cs_window_sel2:16, + cs_enable_bmap:16; + uint32_t num_antennas:8, + idft_size:8, + dft_size:8, + cs_offset:8; + uint32_t idft_shift:8, + dft_shift:8, + cs_multiplier:16; + uint32_t bypass:2, + res:30; +}; + +struct __rte_packed acc200_pad_ptr { + void *op_addr; + uint64_t pad1; /* pad to 64 bits */ +}; + +struct __rte_packed acc200_ptrs { + struct acc200_pad_ptr ptr[ACC200_COMPANION_PTRS]; +}; + +/* ACC200 DMA Request Descriptor */ +struct __rte_packed acc200_dma_req_desc { + union { + struct{ + uint32_t type:4, + rsrvd0:26, + sdone:1, + fdone:1; + uint32_t ib_ant_offset:16, + res2:12, + num_ant:4; + uint32_t ob_ant_offset:16, + ob_cyc_offset:12, + num_cs:4; + uint32_t pass_param:8, + sdone_enable:1, + irq_enable:1, + timeStampEn:1, + res0:5, + numCBs:4, + res1:4, + m2dlen:4, + d2mlen:4; + }; + struct{ + uint32_t word0; + uint32_t word1; + uint32_t word2; + uint32_t word3; + }; + }; + struct acc200_dma_triplet data_ptrs[ACC200_DMA_MAX_NUM_POINTERS]; + + /* Virtual addresses used to retrieve SW context info */ + union { + void *op_addr; + uint64_t pad1; /* pad to 64 bits */ + }; + /* + * Stores additional information needed for driver processing: + * - last_desc_in_batch - flag used to mark last descriptor (CB) + * in batch + * - cbs_in_tb - stores information about total number of Code Blocks + * in currently processed Transport Block + */ + union { + struct { + union { + struct acc200_fcw_ld fcw_ld; + struct acc200_fcw_td fcw_td; + struct acc200_fcw_le fcw_le; + struct acc200_fcw_te fcw_te; + struct acc200_fcw_fft fcw_fft; + uint32_t pad2[ACC200_FCW_PADDING]; + }; + uint32_t last_desc_in_batch :8, + cbs_in_tb:8, + pad4 : 16; + }; + uint64_t pad3[ACC200_DMA_DESC_PADDING]; /* pad to 64 bits */ + }; +}; + +/* ACC200 DMA Descriptor */ +union acc200_dma_desc { + struct acc200_dma_req_desc req; + union acc200_dma_rsp_desc rsp; + uint64_t atom_hdr; +}; + + +/* Union describing Info Ring entry */ +union acc200_harq_layout_data { + uint32_t val; + struct { + uint16_t offset; + uint16_t size0; + }; +} __rte_packed; + + +/* Union describing Info Ring entry */ +union acc200_info_ring_data { + uint32_t val; + struct { + union { + uint16_t detailed_info; + struct { + uint16_t aq_id: 4; + uint16_t qg_id: 4; + uint16_t vf_id: 6; + uint16_t reserved: 2; + }; + }; + uint16_t int_nb: 7; + uint16_t msi_0: 1; + uint16_t vf2pf: 6; + uint16_t loop: 1; + uint16_t valid: 1; + }; +} __rte_packed; + +struct acc200_registry_addr { + unsigned int dma_ring_dl5g_hi; + unsigned int dma_ring_dl5g_lo; + unsigned int dma_ring_ul5g_hi; + unsigned int dma_ring_ul5g_lo; + unsigned int dma_ring_dl4g_hi; + unsigned int dma_ring_dl4g_lo; + unsigned int dma_ring_ul4g_hi; + unsigned int dma_ring_ul4g_lo; + unsigned int dma_ring_fft_hi; + unsigned int dma_ring_fft_lo; + unsigned int ring_size; + unsigned int info_ring_hi; + unsigned int info_ring_lo; + unsigned int info_ring_en; + unsigned int info_ring_ptr; + unsigned int tail_ptrs_dl5g_hi; + unsigned int tail_ptrs_dl5g_lo; + unsigned int tail_ptrs_ul5g_hi; + unsigned int tail_ptrs_ul5g_lo; + unsigned int tail_ptrs_dl4g_hi; + unsigned int tail_ptrs_dl4g_lo; + unsigned int tail_ptrs_ul4g_hi; + unsigned int tail_ptrs_ul4g_lo; + unsigned int tail_ptrs_fft_hi; + unsigned int tail_ptrs_fft_lo; + unsigned int depth_log0_offset; + unsigned 
int depth_log1_offset; + unsigned int qman_group_func; + unsigned int hi_mode; + unsigned int pmon_ctrl_a; + unsigned int pmon_ctrl_b; + unsigned int pmon_ctrl_c; +}; + +/* Structure holding registry addresses for PF */ +static const struct acc200_registry_addr pf_reg_addr = { + .dma_ring_dl5g_hi = HWPfDmaFec5GdlDescBaseHiRegVf, + .dma_ring_dl5g_lo = HWPfDmaFec5GdlDescBaseLoRegVf, + .dma_ring_ul5g_hi = HWPfDmaFec5GulDescBaseHiRegVf, + .dma_ring_ul5g_lo = HWPfDmaFec5GulDescBaseLoRegVf, + .dma_ring_dl4g_hi = HWPfDmaFec4GdlDescBaseHiRegVf, + .dma_ring_dl4g_lo = HWPfDmaFec4GdlDescBaseLoRegVf, + .dma_ring_ul4g_hi = HWPfDmaFec4GulDescBaseHiRegVf, + .dma_ring_ul4g_lo = HWPfDmaFec4GulDescBaseLoRegVf, + .dma_ring_fft_hi = HWPDmaFftDescBaseHiRegVf, + .dma_ring_fft_lo = HWPDmaFftDescBaseLoRegVf, + .ring_size = HWPfQmgrRingSizeVf, + .info_ring_hi = HWPfHiInfoRingBaseHiRegPf, + .info_ring_lo = HWPfHiInfoRingBaseLoRegPf, + .info_ring_en = HWPfHiInfoRingIntWrEnRegPf, + .info_ring_ptr = HWPfHiInfoRingPointerRegPf, + .tail_ptrs_dl5g_hi = HWPfDmaFec5GdlRespPtrHiRegVf, + .tail_ptrs_dl5g_lo = HWPfDmaFec5GdlRespPtrLoRegVf, + .tail_ptrs_ul5g_hi = HWPfDmaFec5GulRespPtrHiRegVf, + .tail_ptrs_ul5g_lo = HWPfDmaFec5GulRespPtrLoRegVf, + .tail_ptrs_dl4g_hi = HWPfDmaFec4GdlRespPtrHiRegVf, + .tail_ptrs_dl4g_lo = HWPfDmaFec4GdlRespPtrLoRegVf, + .tail_ptrs_ul4g_hi = HWPfDmaFec4GulRespPtrHiRegVf, + .tail_ptrs_ul4g_lo = HWPfDmaFec4GulRespPtrLoRegVf, + .tail_ptrs_fft_hi = HWPDmaFftRespPtrHiRegVf, + .tail_ptrs_fft_lo = HWPDmaFftRespPtrLoRegVf, + .depth_log0_offset = HWPfQmgrGrpDepthLog20Vf, + .depth_log1_offset = HWPfQmgrGrpDepthLog21Vf, + .qman_group_func = HWPfQmgrGrpFunction0, + .hi_mode = HWPfHiMsixVectorMapperPf, + .pmon_ctrl_a = HWPfPermonACntrlRegVf, + .pmon_ctrl_b = HWPfPermonBCntrlRegVf, + .pmon_ctrl_c = HWPfPermonCCntrlRegVf, +}; + +/* Structure holding registry addresses for VF */ +static const struct acc200_registry_addr vf_reg_addr = { + .dma_ring_dl5g_hi = HWVfDmaFec5GdlDescBaseHiRegVf, + .dma_ring_dl5g_lo = HWVfDmaFec5GdlDescBaseLoRegVf, + .dma_ring_ul5g_hi = HWVfDmaFec5GulDescBaseHiRegVf, + .dma_ring_ul5g_lo = HWVfDmaFec5GulDescBaseLoRegVf, + .dma_ring_dl4g_hi = HWVfDmaFec4GdlDescBaseHiRegVf, + .dma_ring_dl4g_lo = HWVfDmaFec4GdlDescBaseLoRegVf, + .dma_ring_ul4g_hi = HWVfDmaFec4GulDescBaseHiRegVf, + .dma_ring_ul4g_lo = HWVfDmaFec4GulDescBaseLoRegVf, + .dma_ring_fft_hi = HWVfDmaFftDescBaseHiRegVf, + .dma_ring_fft_lo = HWVfDmaFftDescBaseLoRegVf, + .ring_size = HWVfQmgrRingSizeVf, + .info_ring_hi = HWVfHiInfoRingBaseHiVf, + .info_ring_lo = HWVfHiInfoRingBaseLoVf, + .info_ring_en = HWVfHiInfoRingIntWrEnVf, + .info_ring_ptr = HWVfHiInfoRingPointerVf, + .tail_ptrs_dl5g_hi = HWVfDmaFec5GdlRespPtrHiRegVf, + .tail_ptrs_dl5g_lo = HWVfDmaFec5GdlRespPtrLoRegVf, + .tail_ptrs_ul5g_hi = HWVfDmaFec5GulRespPtrHiRegVf, + .tail_ptrs_ul5g_lo = HWVfDmaFec5GulRespPtrLoRegVf, + .tail_ptrs_dl4g_hi = HWVfDmaFec4GdlRespPtrHiRegVf, + .tail_ptrs_dl4g_lo = HWVfDmaFec4GdlRespPtrLoRegVf, + .tail_ptrs_ul4g_hi = HWVfDmaFec4GulRespPtrHiRegVf, + .tail_ptrs_ul4g_lo = HWVfDmaFec4GulRespPtrLoRegVf, + .tail_ptrs_fft_hi = HWVfDmaFftRespPtrHiRegVf, + .tail_ptrs_fft_lo = HWVfDmaFftRespPtrLoRegVf, + .depth_log0_offset = HWVfQmgrGrpDepthLog20Vf, + .depth_log1_offset = HWVfQmgrGrpDepthLog21Vf, + .qman_group_func = HWVfQmgrGrpFunction0Vf, + .hi_mode = HWVfHiMsixVectorMapperVf, + .pmon_ctrl_a = HWVfPmACntrlRegVf, + .pmon_ctrl_b = HWVfPmBCntrlRegVf, + .pmon_ctrl_c = HWVfPmCCntrlRegVf, +}; + + /* Private data structure for each ACC200 device */ struct 
acc200_device { void *mmio_base; /**< Base address of MMIO registers (BAR0) */ diff --git a/drivers/baseband/acc200/acc200_vf_enum.h b/drivers/baseband/acc200/acc200_vf_enum.h new file mode 100644 index 0000000..616edb6 --- /dev/null +++ b/drivers/baseband/acc200/acc200_vf_enum.h @@ -0,0 +1,89 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2021 Intel Corporation + */ + +#ifndef ACC200_VF_ENUM_H +#define ACC200_VF_ENUM_H + +/* + * ACC200 Register mapping on VF BAR0 + * This is automatically generated from RDL, format may change with new RDL + */ +enum { + HWVfQmgrIngressAq = 0x00000000, + HWVfHiVfToPfDbellVf = 0x00000800, + HWVfHiPfToVfDbellVf = 0x00000808, + HWVfHiInfoRingBaseLoVf = 0x00000810, + HWVfHiInfoRingBaseHiVf = 0x00000814, + HWVfHiInfoRingPointerVf = 0x00000818, + HWVfHiInfoRingIntWrEnVf = 0x00000820, + HWVfHiInfoRingPf2VfWrEnVf = 0x00000824, + HWVfHiMsixVectorMapperVf = 0x00000860, + HWVfDmaFec5GulDescBaseLoRegVf = 0x00000920, + HWVfDmaFec5GulDescBaseHiRegVf = 0x00000924, + HWVfDmaFec5GulRespPtrLoRegVf = 0x00000928, + HWVfDmaFec5GulRespPtrHiRegVf = 0x0000092C, + HWVfDmaFec5GdlDescBaseLoRegVf = 0x00000940, + HWVfDmaFec5GdlDescBaseHiRegVf = 0x00000944, + HWVfDmaFec5GdlRespPtrLoRegVf = 0x00000948, + HWVfDmaFec5GdlRespPtrHiRegVf = 0x0000094C, + HWVfDmaFec4GulDescBaseLoRegVf = 0x00000960, + HWVfDmaFec4GulDescBaseHiRegVf = 0x00000964, + HWVfDmaFec4GulRespPtrLoRegVf = 0x00000968, + HWVfDmaFec4GulRespPtrHiRegVf = 0x0000096C, + HWVfDmaFec4GdlDescBaseLoRegVf = 0x00000980, + HWVfDmaFec4GdlDescBaseHiRegVf = 0x00000984, + HWVfDmaFec4GdlRespPtrLoRegVf = 0x00000988, + HWVfDmaFec4GdlRespPtrHiRegVf = 0x0000098C, + HWVfDmaFftDescBaseLoRegVf = 0x000009A0, + HWVfDmaFftDescBaseHiRegVf = 0x000009A4, + HWVfDmaFftRespPtrLoRegVf = 0x000009A8, + HWVfDmaFftRespPtrHiRegVf = 0x000009AC, + HWVfQmgrAqResetVf = 0x00000E00, + HWVfQmgrRingSizeVf = 0x00000E04, + HWVfQmgrGrpDepthLog20Vf = 0x00000E08, + HWVfQmgrGrpDepthLog21Vf = 0x00000E0C, + HWVfQmgrGrpFunction0Vf = 0x00000E10, + HWVfQmgrGrpFunction1Vf = 0x00000E14, + HWVfPmACntrlRegVf = 0x00000F40, + HWVfPmACountVf = 0x00000F48, + HWVfPmAKCntLoVf = 0x00000F50, + HWVfPmAKCntHiVf = 0x00000F54, + HWVfPmADeltaCntLoVf = 0x00000F60, + HWVfPmADeltaCntHiVf = 0x00000F64, + HWVfPmBCntrlRegVf = 0x00000F80, + HWVfPmBCountVf = 0x00000F88, + HWVfPmBKCntLoVf = 0x00000F90, + HWVfPmBKCntHiVf = 0x00000F94, + HWVfPmBDeltaCntLoVf = 0x00000FA0, + HWVfPmBDeltaCntHiVf = 0x00000FA4, + HWVfPmCCntrlRegVf = 0x00000FC0, + HWVfPmCCountVf = 0x00000FC8, + HWVfPmCKCntLoVf = 0x00000FD0, + HWVfPmCKCntHiVf = 0x00000FD4, + HWVfPmCDeltaCntLoVf = 0x00000FE0, + HWVfPmCDeltaCntHiVf = 0x00000FE4 +}; + +/* TIP VF Interrupt numbers */ +enum { + ACC200_VF_INT_QMGR_AQ_OVERFLOW = 0, + ACC200_VF_INT_DOORBELL_PF_2_VF = 1, + ACC200_VF_INT_ILLEGAL_FORMAT = 2, + ACC200_VF_INT_QMGR_DISABLED_ACCESS = 3, + ACC200_VF_INT_QMGR_AQ_OVERTHRESHOLD = 4, + ACC200_VF_INT_DMA_DL_DESC_IRQ = 5, + ACC200_VF_INT_DMA_UL_DESC_IRQ = 6, + ACC200_VF_INT_DMA_FFT_DESC_IRQ = 7, + ACC200_VF_INT_DMA_UL5G_DESC_IRQ = 8, + ACC200_VF_INT_DMA_DL5G_DESC_IRQ = 9, + ACC200_VF_INT_DMA_MLD_DESC_IRQ = 10, +}; + +/* TIP VF2PF Comms */ +enum { + ACC200_VF2PF_STATUS_REQUEST = 0, + ACC200_VF2PF_USING_VF = 1, +}; + +#endif /* ACC200_VF_ENUM_H */ diff --git a/drivers/baseband/acc200/rte_acc200_pmd.c b/drivers/baseband/acc200/rte_acc200_pmd.c index 4103e48..70b6cc5 100644 --- a/drivers/baseband/acc200/rte_acc200_pmd.c +++ b/drivers/baseband/acc200/rte_acc200_pmd.c @@ -34,6 +34,8 @@ acc200_dev_close(struct rte_bbdev *dev) { RTE_SET_USED(dev); 
+ /* Ensure all in flight HW transactions are completed */ + usleep(ACC200_LONG_WAIT); return 0; } From patchwork Fri Jul 8 00:01:36 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Chautru, Nicolas" X-Patchwork-Id: 113815 X-Patchwork-Delegate: gakhil@marvell.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 159FAA0543; Fri, 8 Jul 2022 02:16:28 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 5249042826; Fri, 8 Jul 2022 02:16:09 +0200 (CEST) Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by mails.dpdk.org (Postfix) with ESMTP id 343A640041 for ; Fri, 8 Jul 2022 02:16:06 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1657239366; x=1688775366; h=from:to:cc:subject:date:message-id:in-reply-to: references; bh=4Tt08NPmJjgBCMcuYTy9FvBdHOok+xlE5efOSBB/9uM=; b=jDeQhhNIphWF7Y8SM/J4IuXf8o9QglfYBkIibIFdZDFaywajW2ncH3Il cqIub0+1EJxJdatY8ksqadrWq8KR1z8TlnZict6q0KJt4m+73Cn4p88ue hEjGo+xMtxya2oz+K0QT6IewL5NaBw8v4Hi1ZLV+MWCte/Ij18qAF16VM 9Y1FCpvBO5VgGafBEMzJh/iGrPFu7L7wQ/jz3thNN1twQ35Qjg5YK2O2O z9KzsphVTRqK0SFE8+WabWMmLvuwJ0t184OFVRAhTPfNq8CwBXmx2DP1+ ISFExn3PXMiwq9JGC39sVV0PiKiohgoSKHoMIZGytMsmXgbW6UqttCmCI A==; X-IronPort-AV: E=McAfee;i="6400,9594,10401"; a="264563078" X-IronPort-AV: E=Sophos;i="5.92,253,1650956400"; d="scan'208";a="264563078" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 07 Jul 2022 17:16:03 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.92,253,1650956400"; d="scan'208";a="591387530" Received: from skx-5gnr-sc12-4.sc.intel.com ([172.25.69.210]) by orsmga007.jf.intel.com with ESMTP; 07 Jul 2022 17:16:03 -0700 From: Nicolas Chautru To: dev@dpdk.org, thomas@monjalon.net, gakhil@marvell.com, hemant.agrawal@nxp.com, trix@redhat.com Cc: maxime.coquelin@redhat.com, mdr@ashroe.eu, bruce.richardson@intel.com, david.marchand@redhat.com, stephen@networkplumber.org, Nicolas Chautru Subject: [PATCH v1 03/10] baseband/acc200: add info get function Date: Thu, 7 Jul 2022 17:01:36 -0700 Message-Id: <1657238503-143836-4-git-send-email-nicolas.chautru@intel.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1657238503-143836-1-git-send-email-nicolas.chautru@intel.com> References: <1657238503-143836-1-git-send-email-nicolas.chautru@intel.com> X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Add support for info_get to allow the application to query the device. Only a null capability is exposed at this stage.
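For reference, a minimal usage sketch (illustrative only, not part of this patch; assumes <rte_bbdev.h> and <stdio.h> are included and dev_id refers to a probed ACC200 bbdev device):

	struct rte_bbdev_info info;

	/* Invokes the PMD info_get hook added here and fills the driver info */
	if (rte_bbdev_info_get(dev_id, &info) == 0)
		printf("%s: max queues %u\n", info.drv.driver_name,
				info.drv.max_num_queues);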
Signed-off-by: Nicolas Chautru --- drivers/baseband/acc200/acc200_pmd.h | 2 + drivers/baseband/acc200/rte_acc200_cfg.h | 94 ++++++++++ drivers/baseband/acc200/rte_acc200_pmd.c | 256 +++++++++++++++++++++++++++++++ 3 files changed, 352 insertions(+) create mode 100644 drivers/baseband/acc200/rte_acc200_cfg.h diff --git a/drivers/baseband/acc200/acc200_pmd.h b/drivers/baseband/acc200/acc200_pmd.h index b420524..91e0798 100644 --- a/drivers/baseband/acc200/acc200_pmd.h +++ b/drivers/baseband/acc200/acc200_pmd.h @@ -7,6 +7,7 @@ #include "acc200_pf_enum.h" #include "acc200_vf_enum.h" +#include "rte_acc200_cfg.h" /* Helper macro for logging */ #define rte_bbdev_log(level, fmt, ...) \ @@ -619,6 +620,7 @@ struct acc200_registry_addr { struct acc200_device { void *mmio_base; /**< Base address of MMIO registers (BAR0) */ uint32_t ddr_size; /* Size in kB */ + struct rte_acc200_conf acc200_conf; /* ACC200 Initial configuration */ bool pf_device; /**< True if this is a PF ACC200 device */ bool configured; /**< True if this ACC200 device is configured */ }; diff --git a/drivers/baseband/acc200/rte_acc200_cfg.h b/drivers/baseband/acc200/rte_acc200_cfg.h new file mode 100644 index 0000000..fcccfbf --- /dev/null +++ b/drivers/baseband/acc200/rte_acc200_cfg.h @@ -0,0 +1,94 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2021 Intel Corporation + */ + +#ifndef _RTE_ACC200_CFG_H_ +#define _RTE_ACC200_CFG_H_ + +/** + * @file rte_acc200_cfg.h + * + * Functions for configuring ACC200 HW, exposed directly to applications. + * Configuration related to encoding/decoding is done through the + * librte_bbdev library. + * + * @warning + * @b EXPERIMENTAL: this API may change without prior notice + */ + +#include <stdint.h> +#include <stdbool.h> + +#ifdef __cplusplus +extern "C" { +#endif +/**< Number of Virtual Functions ACC200 supports */ +#define RTE_ACC200_NUM_VFS 16 + +/** + * Definition of Queue Topology for ACC200 Configuration + * Some level of detail is abstracted out to expose a clean interface + * given that comprehensive flexibility is not required + */ +struct rte_acc200_queue_topology { + /** Number of QGroups in incremental order of priority */ + uint16_t num_qgroups; + /** + * All QGroups have the same number of AQs here. + * Note: Could be made a 16-array if more flexibility is really + * required + */ + uint16_t num_aqs_per_groups; + /** + * Depth of the AQs is the same for all QGroups here. Log2 Enum : 2^N + * Note: Could be made a 16-array if more flexibility is really + * required + */ + uint16_t aq_depth_log2; + /** + * Index of the first Queue Group Index - assuming contiguity + * Initialized as -1 + */ + int8_t first_qgroup_index; +}; + +/** + * Definition of Arbitration related parameters for ACC200 Configuration + */ +struct rte_acc200_arbitration { + /** Default Weight for VF Fairness Arbitration */ + uint16_t round_robin_weight; + uint32_t gbr_threshold1; /**< Guaranteed Bitrate Threshold 1 */ + uint32_t gbr_threshold2; /**< Guaranteed Bitrate Threshold 2 */ +}; + +/** + * Structure to pass ACC200 configuration. + * Note: all VF Bundles will have the same configuration. + */ +struct rte_acc200_conf { + bool pf_mode_en; /**< 1 if PF is used for dataplane, 0 for VFs */ + /** 1 if input '1' bit is represented by a positive LLR value, 0 if '1' + * bit is represented by a negative value. + */ + bool input_pos_llr_1_bit; + /** 1 if output '1' bit is represented by a positive value, 0 if '1' + * bit is represented by a negative value.
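+ * For example, with the 8-bit LLRs exposed by this PMD, this selects whether a decoded '1' maps to a positive (e.g. +127) or a negative (e.g. -127) soft value (illustrative values, not from the original patch).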
+ */ + bool output_pos_llr_1_bit; + uint16_t num_vf_bundles; /**< Number of VF bundles to setup */ + /** Queue topology for each operation type */ + struct rte_acc200_queue_topology q_ul_4g; + struct rte_acc200_queue_topology q_dl_4g; + struct rte_acc200_queue_topology q_ul_5g; + struct rte_acc200_queue_topology q_dl_5g; + struct rte_acc200_queue_topology q_fft; + /** Arbitration configuration for each operation type */ + struct rte_acc200_arbitration arb_ul_4g[RTE_ACC200_NUM_VFS]; + struct rte_acc200_arbitration arb_dl_4g[RTE_ACC200_NUM_VFS]; + struct rte_acc200_arbitration arb_ul_5g[RTE_ACC200_NUM_VFS]; + struct rte_acc200_arbitration arb_dl_5g[RTE_ACC200_NUM_VFS]; + struct rte_acc200_arbitration arb_fft[RTE_ACC200_NUM_VFS]; +}; + +#endif /* _RTE_ACC200_CFG_H_ */ diff --git a/drivers/baseband/acc200/rte_acc200_pmd.c b/drivers/baseband/acc200/rte_acc200_pmd.c index 70b6cc5..ce72654 100644 --- a/drivers/baseband/acc200/rte_acc200_pmd.c +++ b/drivers/baseband/acc200/rte_acc200_pmd.c @@ -29,6 +29,207 @@ RTE_LOG_REGISTER_DEFAULT(acc200_logtype, NOTICE); #endif +/* Read a register of a ACC200 device */ +static inline uint32_t +acc200_reg_read(struct acc200_device *d, uint32_t offset) +{ + + void *reg_addr = RTE_PTR_ADD(d->mmio_base, offset); + uint32_t ret = *((volatile uint32_t *)(reg_addr)); + return rte_le_to_cpu_32(ret); +} + +/* Calculate the offset of the enqueue register */ +static inline uint32_t +queue_offset(bool pf_device, uint8_t vf_id, uint8_t qgrp_id, uint16_t aq_id) +{ + if (pf_device) + return ((vf_id << 12) + (qgrp_id << 7) + (aq_id << 3) + + HWPfQmgrIngressAq); + else + return ((qgrp_id << 7) + (aq_id << 3) + + HWVfQmgrIngressAq); +} + +enum {UL_4G = 0, UL_5G, DL_4G, DL_5G, FFT, NUM_ACC}; + +/* Return the queue topology for a Queue Group Index */ +static inline void +qtopFromAcc(struct rte_acc200_queue_topology **qtop, int acc_enum, + struct rte_acc200_conf *acc200_conf) +{ + struct rte_acc200_queue_topology *p_qtop; + p_qtop = NULL; + switch (acc_enum) { + case UL_4G: + p_qtop = &(acc200_conf->q_ul_4g); + break; + case UL_5G: + p_qtop = &(acc200_conf->q_ul_5g); + break; + case DL_4G: + p_qtop = &(acc200_conf->q_dl_4g); + break; + case DL_5G: + p_qtop = &(acc200_conf->q_dl_5g); + break; + case FFT: + p_qtop = &(acc200_conf->q_fft); + break; + default: + /* NOTREACHED */ + rte_bbdev_log(ERR, "Unexpected error evaluating qtopFromAcc %d", + acc_enum); + break; + } + *qtop = p_qtop; +} + +static void +initQTop(struct rte_acc200_conf *acc200_conf) +{ + acc200_conf->q_ul_4g.num_aqs_per_groups = 0; + acc200_conf->q_ul_4g.num_qgroups = 0; + acc200_conf->q_ul_4g.first_qgroup_index = -1; + acc200_conf->q_ul_5g.num_aqs_per_groups = 0; + acc200_conf->q_ul_5g.num_qgroups = 0; + acc200_conf->q_ul_5g.first_qgroup_index = -1; + acc200_conf->q_dl_4g.num_aqs_per_groups = 0; + acc200_conf->q_dl_4g.num_qgroups = 0; + acc200_conf->q_dl_4g.first_qgroup_index = -1; + acc200_conf->q_dl_5g.num_aqs_per_groups = 0; + acc200_conf->q_dl_5g.num_qgroups = 0; + acc200_conf->q_dl_5g.first_qgroup_index = -1; + acc200_conf->q_fft.num_aqs_per_groups = 0; + acc200_conf->q_fft.num_qgroups = 0; + acc200_conf->q_fft.first_qgroup_index = -1; +} + +static inline void +updateQtop(uint8_t acc, uint8_t qg, struct rte_acc200_conf *acc200_conf, + struct acc200_device *d) { + uint32_t reg; + struct rte_acc200_queue_topology *q_top = NULL; + qtopFromAcc(&q_top, acc, acc200_conf); + if (unlikely(q_top == NULL)) + return; + uint16_t aq; + q_top->num_qgroups++; + if (q_top->first_qgroup_index == -1) { + 
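+ /* First QGroup found for this engine: record its index, then count the AQs actually enabled in the group (the last AQ is probed first as a shortcut for the fully-enabled case). */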
q_top->first_qgroup_index = qg; + /* Can be optimized to assume all are enabled by default */ + reg = acc200_reg_read(d, queue_offset(d->pf_device, + 0, qg, ACC200_NUM_AQS - 1)); + if (reg & ACC200_QUEUE_ENABLE) { + q_top->num_aqs_per_groups = ACC200_NUM_AQS; + return; + } + q_top->num_aqs_per_groups = 0; + for (aq = 0; aq < ACC200_NUM_AQS; aq++) { + reg = acc200_reg_read(d, queue_offset(d->pf_device, + 0, qg, aq)); + if (reg & ACC200_QUEUE_ENABLE) + q_top->num_aqs_per_groups++; + } + } +} + +/* Fetch configuration enabled for the PF/VF using MMIO Read (slow) */ +static inline void +fetch_acc200_config(struct rte_bbdev *dev) +{ + struct acc200_device *d = dev->data->dev_private; + struct rte_acc200_conf *acc200_conf = &d->acc200_conf; + const struct acc200_registry_addr *reg_addr; + uint8_t acc, qg; + uint32_t reg_aq, reg_len0, reg_len1, reg0, reg1; + uint32_t reg_mode, idx; + + /* No need to retrieve the configuration if it is already done */ + if (d->configured) + return; + + /* Choose correct registry addresses for the device type */ + if (d->pf_device) + reg_addr = &pf_reg_addr; + else + reg_addr = &vf_reg_addr; + + d->ddr_size = 0; + + /* Single VF Bundle by VF */ + acc200_conf->num_vf_bundles = 1; + initQTop(acc200_conf); + + struct rte_acc200_queue_topology *q_top = NULL; + int qman_func_id[ACC200_NUM_ACCS] = {ACC200_ACCMAP_0, ACC200_ACCMAP_1, + ACC200_ACCMAP_2, ACC200_ACCMAP_3, ACC200_ACCMAP_4}; + reg0 = acc200_reg_read(d, reg_addr->qman_group_func); + reg1 = acc200_reg_read(d, reg_addr->qman_group_func + 4); + for (qg = 0; qg < ACC200_NUM_QGRPS; qg++) { + reg_aq = acc200_reg_read(d, + queue_offset(d->pf_device, 0, qg, 0)); + if (reg_aq & ACC200_QUEUE_ENABLE) { + if (qg < ACC200_NUM_QGRPS_PER_WORD) + idx = (reg0 >> (qg * 4)) & 0x7; + else + idx = (reg1 >> ((qg - + ACC200_NUM_QGRPS_PER_WORD) * 4)) & 0x7; + if (idx < ACC200_NUM_ACCS) { + acc = qman_func_id[idx]; + updateQtop(acc, qg, acc200_conf, d); + } + } + } + + /* Check the depth of the AQs */ + reg_len0 = acc200_reg_read(d, reg_addr->depth_log0_offset); + reg_len1 = acc200_reg_read(d, reg_addr->depth_log1_offset); + for (acc = 0; acc < NUM_ACC; acc++) { + qtopFromAcc(&q_top, acc, acc200_conf); + if (q_top->first_qgroup_index < ACC200_NUM_QGRPS_PER_WORD) + q_top->aq_depth_log2 = (reg_len0 >> + (q_top->first_qgroup_index * 4)) + & 0xF; + else + q_top->aq_depth_log2 = (reg_len1 >> + ((q_top->first_qgroup_index - + ACC200_NUM_QGRPS_PER_WORD) * 4)) + & 0xF; + } + + /* Read PF mode */ + if (d->pf_device) { + reg_mode = acc200_reg_read(d, HWPfHiPfMode); + acc200_conf->pf_mode_en = (reg_mode == ACC200_PF_VAL) ? 1 : 0; + } else { + reg_mode = acc200_reg_read(d, reg_addr->hi_mode); + acc200_conf->pf_mode_en = reg_mode & 1; + } + + rte_bbdev_log_debug( + "%s Config LLR SIGN IN/OUT %s %s QG %u %u %u %u %u AQ %u %u %u %u %u Len %u %u %u %u %u\n", + (d->pf_device) ? "PF" : "VF", + (acc200_conf->input_pos_llr_1_bit) ? "POS" : "NEG", + (acc200_conf->output_pos_llr_1_bit) ?
"POS" : "NEG", + acc200_conf->q_ul_4g.num_qgroups, + acc200_conf->q_dl_4g.num_qgroups, + acc200_conf->q_ul_5g.num_qgroups, + acc200_conf->q_dl_5g.num_qgroups, + acc200_conf->q_fft.num_qgroups, + acc200_conf->q_ul_4g.num_aqs_per_groups, + acc200_conf->q_dl_4g.num_aqs_per_groups, + acc200_conf->q_ul_5g.num_aqs_per_groups, + acc200_conf->q_dl_5g.num_aqs_per_groups, + acc200_conf->q_fft.num_aqs_per_groups, + acc200_conf->q_ul_4g.aq_depth_log2, + acc200_conf->q_dl_4g.aq_depth_log2, + acc200_conf->q_ul_5g.aq_depth_log2, + acc200_conf->q_dl_5g.aq_depth_log2, + acc200_conf->q_fft.aq_depth_log2); +} + /* Free memory used for software rings */ static int acc200_dev_close(struct rte_bbdev *dev) @@ -39,9 +240,57 @@ return 0; } +/* Get ACC200 device info */ +static void +acc200_dev_info_get(struct rte_bbdev *dev, + struct rte_bbdev_driver_info *dev_info) +{ + struct acc200_device *d = dev->data->dev_private; + int i; + static const struct rte_bbdev_op_cap bbdev_capabilities[] = { + RTE_BBDEV_END_OF_CAPABILITIES_LIST() + }; + + static struct rte_bbdev_queue_conf default_queue_conf; + default_queue_conf.socket = dev->data->socket_id; + default_queue_conf.queue_size = ACC200_MAX_QUEUE_DEPTH; + + dev_info->driver_name = dev->device->driver->name; + + /* Read and save the populated config from ACC200 registers */ + fetch_acc200_config(dev); + + /* Exposed number of queues */ + dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0; + dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] = 0; + dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = 0; + dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] = 0; + dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] = 0; + dev_info->num_queues[RTE_BBDEV_OP_FFT] = 0; + dev_info->queue_priority[RTE_BBDEV_OP_TURBO_DEC] = 0; + dev_info->queue_priority[RTE_BBDEV_OP_TURBO_ENC] = 0; + dev_info->queue_priority[RTE_BBDEV_OP_LDPC_DEC] = 0; + dev_info->queue_priority[RTE_BBDEV_OP_LDPC_ENC] = 0; + dev_info->queue_priority[RTE_BBDEV_OP_FFT] = 0; + dev_info->max_num_queues = 0; + for (i = RTE_BBDEV_OP_NONE; i <= RTE_BBDEV_OP_FFT; i++) + dev_info->max_num_queues += dev_info->num_queues[i]; + dev_info->queue_size_lim = ACC200_MAX_QUEUE_DEPTH; + dev_info->hardware_accelerated = true; + dev_info->max_dl_queue_priority = + d->acc200_conf.q_dl_4g.num_qgroups - 1; + dev_info->max_ul_queue_priority = + d->acc200_conf.q_ul_4g.num_qgroups - 1; + dev_info->default_queue_conf = default_queue_conf; + dev_info->cpu_flag_reqs = NULL; + dev_info->min_alignment = 1; + dev_info->capabilities = bbdev_capabilities; + dev_info->harq_buffer_size = 0; +} static const struct rte_bbdev_ops acc200_bbdev_ops = { .close = acc200_dev_close, + .info_get = acc200_dev_info_get, }; /* ACC200 PCI PF address map */ @@ -60,6 +309,13 @@ {.device_id = 0}, }; +/* Read flag value 0/1 from bitmap */ +static inline bool +check_bit(uint32_t bitmap, uint32_t bitmask) +{ + return bitmap & bitmask; +} + /* Initialization Function */ static void acc200_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv) From patchwork Fri Jul 8 00:01:37 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Chautru, Nicolas" X-Patchwork-Id: 113816 X-Patchwork-Delegate: gakhil@marvell.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 7FB3EA0543; Fri, 8 Jul 2022 02:16:35 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org 
(Postfix) with ESMTP id 3CEB04282E; Fri, 8 Jul 2022 02:16:10 +0200 (CEST) Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by mails.dpdk.org (Postfix) with ESMTP id 7C54840A7B for ; Fri, 8 Jul 2022 02:16:06 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1657239366; x=1688775366; h=from:to:cc:subject:date:message-id:in-reply-to: references; bh=fHeTvnQ79x1ALOavbyqJ6tKrFn3Pgc+402TadMQ1ieM=; b=hguhzIrGBoTXw5UCm9hw3OhftjwIMTRcFzo//3PvRztwhIs2TPuXWxN2 zxL+kUFcsONaesRNJKb0173YJIGif0IIdQaLuaV1EdLzNlDILuqbrWqAz RYokMmE70H0NMil9RVOMY9Izrg5m8C0xPYTRKbjtJqvGJ2zlG3dvfcGhS tizpoUWrLyC8qNCDs0cx7ypKo0QPOGnn7H3BZZsalvAOYZ9sAjkzcbJyG aiFNUrgCWjX4WOR1nb5VMleVBSsMaPVqy2Q5ANGB4y/ik0b1QH5H0xSuM hh20/xjQbuZufq0tc9DzWUGQzfu4dMfVk81ckSe0nZ9jwcLh4kM3XxvuW A==; X-IronPort-AV: E=McAfee;i="6400,9594,10401"; a="264563082" X-IronPort-AV: E=Sophos;i="5.92,253,1650956400"; d="scan'208";a="264563082" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 07 Jul 2022 17:16:04 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.92,253,1650956400"; d="scan'208";a="591387534" Received: from skx-5gnr-sc12-4.sc.intel.com ([172.25.69.210]) by orsmga007.jf.intel.com with ESMTP; 07 Jul 2022 17:16:04 -0700 From: Nicolas Chautru To: dev@dpdk.org, thomas@monjalon.net, gakhil@marvell.com, hemant.agrawal@nxp.com, trix@redhat.com Cc: maxime.coquelin@redhat.com, mdr@ashroe.eu, bruce.richardson@intel.com, david.marchand@redhat.com, stephen@networkplumber.org, Nicolas Chautru Subject: [PATCH v1 04/10] baseband/acc200: add queue configuration Date: Thu, 7 Jul 2022 17:01:37 -0700 Message-Id: <1657238503-143836-5-git-send-email-nicolas.chautru@intel.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1657238503-143836-1-git-send-email-nicolas.chautru@intel.com> References: <1657238503-143836-1-git-send-email-nicolas.chautru@intel.com> X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Adding functions to create and configure queues for the device. Signed-off-by: Nicolas Chautru --- drivers/baseband/acc200/acc200_pmd.h | 62 ++++ drivers/baseband/acc200/rte_acc200_pmd.c | 506 ++++++++++++++++++++++++++++++- 2 files changed, 567 insertions(+), 1 deletion(-) diff --git a/drivers/baseband/acc200/acc200_pmd.h b/drivers/baseband/acc200/acc200_pmd.h index 91e0798..47ad00e 100644 --- a/drivers/baseband/acc200/acc200_pmd.h +++ b/drivers/baseband/acc200/acc200_pmd.h @@ -615,14 +615,76 @@ struct acc200_registry_addr { .pmon_ctrl_c = HWVfPmCCntrlRegVf, }; +/* Structure associated with each queue.
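+ * One such structure backs each bbdev queue: it maps the queue onto a hardware atomic queue (AQ) within a queue group and tracks its slice of the software descriptor ring.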
*/ +struct __rte_cache_aligned acc200_queue { + union acc200_dma_desc *ring_addr; /* Virtual address of sw ring */ + rte_iova_t ring_addr_iova; /* IOVA address of software ring */ + uint32_t sw_ring_head; /* software ring head */ + uint32_t sw_ring_tail; /* software ring tail */ + /* software ring size (descriptors, not bytes) */ + uint32_t sw_ring_depth; + /* mask used to wrap enqueued descriptors on the sw ring */ + uint32_t sw_ring_wrap_mask; + /* Virtual address of companion ring */ + struct acc200_ptrs *companion_ring_addr; + /* MMIO register used to enqueue descriptors */ + void *mmio_reg_enqueue; + uint8_t vf_id; /* VF ID (max = 63) */ + uint8_t qgrp_id; /* Queue Group ID */ + uint16_t aq_id; /* Atomic Queue ID */ + uint16_t aq_depth; /* Depth of atomic queue */ + uint32_t aq_enqueued; /* Count how many "batches" have been enqueued */ + uint32_t aq_dequeued; /* Count how many "batches" have been dequeued */ + uint32_t irq_enable; /* Enable ops dequeue interrupts if set to 1 */ + struct rte_mempool *fcw_mempool; /* FCW mempool */ + enum rte_bbdev_op_type op_type; /* Type of this Queue: TE or TD */ + /* Internal Buffers for loopback input */ + uint8_t *lb_in; + uint8_t *lb_out; + rte_iova_t lb_in_addr_iova; + rte_iova_t lb_out_addr_iova; + struct acc200_device *d; +}; /* Private data structure for each ACC200 device */ struct acc200_device { void *mmio_base; /**< Base address of MMIO registers (BAR0) */ + void *sw_rings_base; /* Base addr of un-aligned memory for sw rings */ + void *sw_rings; /* 64MBs of 64MB aligned memory for sw rings */ + rte_iova_t sw_rings_iova; /* IOVA address of sw_rings */ + /* Virtual address of the info memory routed to the this function under + * operation, whether it is PF or VF. + * HW may DMA information data at this location asynchronously + */ + union acc200_info_ring_data *info_ring; + + union acc200_harq_layout_data *harq_layout; + /* Virtual Info Ring head */ + uint16_t info_ring_head; + /* Number of bytes available for each queue in device, depending on + * how many queues are enabled with configure() + */ + uint32_t sw_ring_size; uint32_t ddr_size; /* Size in kB */ + uint32_t *tail_ptrs; /* Base address of response tail pointer buffer */ + rte_iova_t tail_ptr_iova; /* IOVA address of tail pointers */ + /* Max number of entries available for each queue in device, depending + * on how many queues are enabled with configure() + */ + uint32_t sw_ring_max_depth; struct rte_acc200_conf acc200_conf; /* ACC200 Initial configuration */ + /* Bitmap capturing which Queues have already been assigned */ + uint16_t q_assigned_bit_map[ACC200_NUM_QGRPS]; bool pf_device; /**< True if this is a PF ACC200 device */ bool configured; /**< True if this ACC200 device is configured */ }; +/** + * Structure with details about RTE_BBDEV_EVENT_DEQUEUE event. It's passed to + * the callback function. 
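+ * It lets the interrupt handler recover which queue triggered the dequeue event.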
+ */ +struct acc200_deq_intr_details { + uint16_t queue_id; +}; + #endif /* _RTE_ACC200_PMD_H_ */ diff --git a/drivers/baseband/acc200/rte_acc200_pmd.c b/drivers/baseband/acc200/rte_acc200_pmd.c index ce72654..ec082f1 100644 --- a/drivers/baseband/acc200/rte_acc200_pmd.c +++ b/drivers/baseband/acc200/rte_acc200_pmd.c @@ -29,6 +29,22 @@ RTE_LOG_REGISTER_DEFAULT(acc200_logtype, NOTICE); #endif +/* Write to MMIO register address */ +static inline void +mmio_write(void *addr, uint32_t value) +{ + *((volatile uint32_t *)(addr)) = rte_cpu_to_le_32(value); +} + +/* Write a register of a ACC200 device */ +static inline void +acc200_reg_write(struct acc200_device *d, uint32_t offset, uint32_t value) +{ + void *reg_addr = RTE_PTR_ADD(d->mmio_base, offset); + mmio_write(reg_addr, value); + usleep(ACC200_LONG_WAIT); +} + /* Read a register of a ACC200 device */ static inline uint32_t acc200_reg_read(struct acc200_device *d, uint32_t offset) @@ -39,6 +55,22 @@ return rte_le_to_cpu_32(ret); } +/* Basic Implementation of Log2 for exact 2^N */ +static inline uint32_t +log2_basic(uint32_t value) +{ + return (value == 0) ? 0 : rte_bsf32(value); +} + +/* Calculate memory alignment offset assuming alignment is 2^N */ +static inline uint32_t +calc_mem_alignment_offset(void *unaligned_virt_mem, uint32_t alignment) +{ + rte_iova_t unaligned_phy_mem = rte_malloc_virt2iova(unaligned_virt_mem); + return (uint32_t)(alignment - + (unaligned_phy_mem & (alignment-1))); +} + /* Calculate the offset of the enqueue register */ static inline uint32_t queue_offset(bool pf_device, uint8_t vf_id, uint8_t qgrp_id, uint16_t aq_id) @@ -230,16 +262,484 @@ acc200_conf->q_fft.aq_depth_log2); } +static void +free_base_addresses(void **base_addrs, int size) +{ + int i; + for (i = 0; i < size; i++) + rte_free(base_addrs[i]); +} + +static inline uint32_t +get_desc_len(void) +{ + return sizeof(union acc200_dma_desc); +} + +/* Allocate the 2 * 64MB block for the sw rings */ +static int +alloc_2x64mb_sw_rings_mem(struct rte_bbdev *dev, struct acc200_device *d, + int socket) +{ + uint32_t sw_ring_size = ACC200_SIZE_64MBYTE; + d->sw_rings_base = rte_zmalloc_socket(dev->device->driver->name, + 2 * sw_ring_size, RTE_CACHE_LINE_SIZE, socket); + if (d->sw_rings_base == NULL) { + rte_bbdev_log(ERR, "Failed to allocate memory for %s:%u", + dev->device->driver->name, + dev->data->dev_id); + return -ENOMEM; + } + uint32_t next_64mb_align_offset = calc_mem_alignment_offset( + d->sw_rings_base, ACC200_SIZE_64MBYTE); + d->sw_rings = RTE_PTR_ADD(d->sw_rings_base, next_64mb_align_offset); + d->sw_rings_iova = rte_malloc_virt2iova(d->sw_rings_base) + + next_64mb_align_offset; + d->sw_ring_size = ACC200_MAX_QUEUE_DEPTH * get_desc_len(); + d->sw_ring_max_depth = ACC200_MAX_QUEUE_DEPTH; + + return 0; +} + +/* Attempt to allocate minimised memory space for sw rings */ +static void +alloc_sw_rings_min_mem(struct rte_bbdev *dev, struct acc200_device *d, + uint16_t num_queues, int socket) +{ + rte_iova_t sw_rings_base_iova, next_64mb_align_addr_iova; + uint32_t next_64mb_align_offset; + rte_iova_t sw_ring_iova_end_addr; + void *base_addrs[ACC200_SW_RING_MEM_ALLOC_ATTEMPTS]; + void *sw_rings_base; + int i = 0; + uint32_t q_sw_ring_size = ACC200_MAX_QUEUE_DEPTH * get_desc_len(); + uint32_t dev_sw_ring_size = q_sw_ring_size * num_queues; + /* Free first in case this is a reconfiguration */ + rte_free(d->sw_rings_base); + + /* Find an aligned block of memory to store sw rings */ + while (i < ACC200_SW_RING_MEM_ALLOC_ATTEMPTS) { + /* + * sw_ring allocated memory is 
guaranteed to be aligned to + * q_sw_ring_size at the condition that the requested size is + * less than the page size + */ + sw_rings_base = rte_zmalloc_socket( + dev->device->driver->name, + dev_sw_ring_size, q_sw_ring_size, socket); + + if (sw_rings_base == NULL) { + rte_bbdev_log(ERR, + "Failed to allocate memory for %s:%u", + dev->device->driver->name, + dev->data->dev_id); + break; + } + + sw_rings_base_iova = rte_malloc_virt2iova(sw_rings_base); + next_64mb_align_offset = calc_mem_alignment_offset( + sw_rings_base, ACC200_SIZE_64MBYTE); + next_64mb_align_addr_iova = sw_rings_base_iova + + next_64mb_align_offset; + sw_ring_iova_end_addr = sw_rings_base_iova + dev_sw_ring_size; + + /* Check if the end of the sw ring memory block is before the + * start of next 64MB aligned mem address + */ + if (sw_ring_iova_end_addr < next_64mb_align_addr_iova) { + d->sw_rings_iova = sw_rings_base_iova; + d->sw_rings = sw_rings_base; + d->sw_rings_base = sw_rings_base; + d->sw_ring_size = q_sw_ring_size; + d->sw_ring_max_depth = ACC200_MAX_QUEUE_DEPTH; + break; + } + /* Store the address of the unaligned mem block */ + base_addrs[i] = sw_rings_base; + i++; + } + + /* Free all unaligned blocks of mem allocated in the loop */ + free_base_addresses(base_addrs, i); +} + +/* Allocate 64MB memory used for all software rings */ +static int +acc200_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id) +{ + uint32_t phys_low, phys_high, value; + struct acc200_device *d = dev->data->dev_private; + const struct acc200_registry_addr *reg_addr; + + if (d->pf_device && !d->acc200_conf.pf_mode_en) { + rte_bbdev_log(NOTICE, + "%s has PF mode disabled. This PF can't be used.", + dev->data->name); + return -ENODEV; + } + if (!d->pf_device && d->acc200_conf.pf_mode_en) { + rte_bbdev_log(NOTICE, + "%s has PF mode enabled. 
This VF can't be used.", + dev->data->name); + return -ENODEV; + } + + alloc_sw_rings_min_mem(dev, d, num_queues, socket_id); + + /* If minimal memory space approach failed, then allocate + * the 2 * 64MB block for the sw rings + */ + if (d->sw_rings == NULL) + alloc_2x64mb_sw_rings_mem(dev, d, socket_id); + + if (d->sw_rings == NULL) { + rte_bbdev_log(NOTICE, + "Failure allocating sw_rings memory"); + return -ENODEV; + } + + /* Configure ACC200 with the base address for DMA descriptor rings + * Same descriptor rings used for UL and DL DMA Engines + * Note : Assuming only VF0 bundle is used for PF mode + */ + phys_high = (uint32_t)(d->sw_rings_iova >> 32); + phys_low = (uint32_t)(d->sw_rings_iova & ~(ACC200_SIZE_64MBYTE-1)); + + /* Choose correct registry addresses for the device type */ + if (d->pf_device) + reg_addr = &pf_reg_addr; + else + reg_addr = &vf_reg_addr; + + /* Read the populated cfg from ACC200 registers */ + fetch_acc200_config(dev); + + /* Start Pmon */ + for (value = 0; value <= 2; value++) { + acc200_reg_write(d, reg_addr->pmon_ctrl_a, value); + acc200_reg_write(d, reg_addr->pmon_ctrl_b, value); + acc200_reg_write(d, reg_addr->pmon_ctrl_c, value); + } + + /* Release AXI from PF */ + if (d->pf_device) + acc200_reg_write(d, HWPfDmaAxiControl, 1); + + acc200_reg_write(d, reg_addr->dma_ring_ul5g_hi, phys_high); + acc200_reg_write(d, reg_addr->dma_ring_ul5g_lo, phys_low); + acc200_reg_write(d, reg_addr->dma_ring_dl5g_hi, phys_high); + acc200_reg_write(d, reg_addr->dma_ring_dl5g_lo, phys_low); + acc200_reg_write(d, reg_addr->dma_ring_ul4g_hi, phys_high); + acc200_reg_write(d, reg_addr->dma_ring_ul4g_lo, phys_low); + acc200_reg_write(d, reg_addr->dma_ring_dl4g_hi, phys_high); + acc200_reg_write(d, reg_addr->dma_ring_dl4g_lo, phys_low); + acc200_reg_write(d, reg_addr->dma_ring_fft_hi, phys_high); + acc200_reg_write(d, reg_addr->dma_ring_fft_lo, phys_low); + /* + * Configure Ring Size to the max queue ring size + * (used for wrapping purpose) + */ + value = log2_basic(d->sw_ring_size / 64); + acc200_reg_write(d, reg_addr->ring_size, value); + + /* Configure tail pointer for use when SDONE enabled */ + if (d->tail_ptrs == NULL) + d->tail_ptrs = rte_zmalloc_socket( + dev->device->driver->name, + ACC200_NUM_QGRPS * ACC200_NUM_AQS * sizeof(uint32_t), + RTE_CACHE_LINE_SIZE, socket_id); + if (d->tail_ptrs == NULL) { + rte_bbdev_log(ERR, "Failed to allocate tail ptr for %s:%u", + dev->device->driver->name, + dev->data->dev_id); + rte_free(d->sw_rings); + return -ENOMEM; + } + d->tail_ptr_iova = rte_malloc_virt2iova(d->tail_ptrs); + + phys_high = (uint32_t)(d->tail_ptr_iova >> 32); + phys_low = (uint32_t)(d->tail_ptr_iova); + acc200_reg_write(d, reg_addr->tail_ptrs_ul5g_hi, phys_high); + acc200_reg_write(d, reg_addr->tail_ptrs_ul5g_lo, phys_low); + acc200_reg_write(d, reg_addr->tail_ptrs_dl5g_hi, phys_high); + acc200_reg_write(d, reg_addr->tail_ptrs_dl5g_lo, phys_low); + acc200_reg_write(d, reg_addr->tail_ptrs_ul4g_hi, phys_high); + acc200_reg_write(d, reg_addr->tail_ptrs_ul4g_lo, phys_low); + acc200_reg_write(d, reg_addr->tail_ptrs_dl4g_hi, phys_high); + acc200_reg_write(d, reg_addr->tail_ptrs_dl4g_lo, phys_low); + acc200_reg_write(d, reg_addr->tail_ptrs_fft_hi, phys_high); + acc200_reg_write(d, reg_addr->tail_ptrs_fft_lo, phys_low); + + if (d->harq_layout == NULL) + d->harq_layout = rte_zmalloc_socket("HARQ Layout", + ACC200_HARQ_LAYOUT * sizeof(*d->harq_layout), + RTE_CACHE_LINE_SIZE, dev->data->socket_id); + if (d->harq_layout == NULL) { + rte_bbdev_log(ERR, "Failed to allocate 
harq_layout for %s:%u", + dev->device->driver->name, + dev->data->dev_id); + rte_free(d->sw_rings); + return -ENOMEM; + } + + /* Mark as configured properly */ + d->configured = true; + + rte_bbdev_log_debug( + "ACC200 (%s) configured sw_rings = %p, sw_rings_iova = %#" + PRIx64, dev->data->name, d->sw_rings, d->sw_rings_iova); + + return 0; +} + /* Free memory used for software rings */ static int acc200_dev_close(struct rte_bbdev *dev) { - RTE_SET_USED(dev); + struct acc200_device *d = dev->data->dev_private; + if (d->sw_rings_base != NULL) { + rte_free(d->tail_ptrs); + rte_free(d->sw_rings_base); + rte_free(d->harq_layout); + d->sw_rings_base = NULL; + d->tail_ptrs = NULL; + d->harq_layout = NULL; + } /* Ensure all in flight HW transactions are completed */ usleep(ACC200_LONG_WAIT); return 0; } +/** + * Report a ACC200 queue index which is free + * Return 0 to 16k for a valid queue_idx or -1 when no queue is available + * Note : Only supporting VF0 Bundle for PF mode + */ +static int +acc200_find_free_queue_idx(struct rte_bbdev *dev, + const struct rte_bbdev_queue_conf *conf) +{ + struct acc200_device *d = dev->data->dev_private; + int op_2_acc[6] = {0, UL_4G, DL_4G, UL_5G, DL_5G, FFT}; + int acc = op_2_acc[conf->op_type]; + struct rte_acc200_queue_topology *qtop = NULL; + + qtopFromAcc(&qtop, acc, &(d->acc200_conf)); + if (qtop == NULL) + return -1; + /* Identify matching QGroup Index which are sorted in priority order */ + uint16_t group_idx = qtop->first_qgroup_index; + group_idx += conf->priority; + if (group_idx >= ACC200_NUM_QGRPS || + conf->priority >= qtop->num_qgroups) { + rte_bbdev_log(INFO, "Invalid Priority on %s, priority %u", + dev->data->name, conf->priority); + return -1; + } + /* Find a free AQ_idx */ + uint16_t aq_idx; + for (aq_idx = 0; aq_idx < qtop->num_aqs_per_groups; aq_idx++) { + if (((d->q_assigned_bit_map[group_idx] >> aq_idx) & 0x1) == 0) { + /* Mark the Queue as assigned */ + d->q_assigned_bit_map[group_idx] |= (1 << aq_idx); + /* Report the AQ Index */ + return (group_idx << ACC200_GRP_ID_SHIFT) + aq_idx; + } + } + rte_bbdev_log(INFO, "Failed to find free queue on %s, priority %u", + dev->data->name, conf->priority); + return -1; +} + +/* Setup ACC200 queue */ +static int +acc200_queue_setup(struct rte_bbdev *dev, uint16_t queue_id, + const struct rte_bbdev_queue_conf *conf) +{ + struct acc200_device *d = dev->data->dev_private; + struct acc200_queue *q; + int16_t q_idx; + + if (d == NULL) { + rte_bbdev_log(ERR, "Undefined device"); + return -ENODEV; + } + /* Allocate the queue data structure. */ + q = rte_zmalloc_socket(dev->device->driver->name, sizeof(*q), + RTE_CACHE_LINE_SIZE, conf->socket); + if (q == NULL) { + rte_bbdev_log(ERR, "Failed to allocate queue memory"); + return -ENOMEM; + } + + q->d = d; + q->ring_addr = RTE_PTR_ADD(d->sw_rings, (d->sw_ring_size * queue_id)); + q->ring_addr_iova = d->sw_rings_iova + (d->sw_ring_size * queue_id); + + /* Prepare the Ring with default descriptor format */ + union acc200_dma_desc *desc = NULL; + unsigned int desc_idx, b_idx; + int fcw_len = (conf->op_type == RTE_BBDEV_OP_LDPC_ENC ? + ACC200_FCW_LE_BLEN : (conf->op_type == RTE_BBDEV_OP_TURBO_DEC ? + ACC200_FCW_TD_BLEN : (conf->op_type == RTE_BBDEV_OP_LDPC_DEC ? 
+ ACC200_FCW_LD_BLEN : ACC200_FCW_FFT_BLEN))); + + for (desc_idx = 0; desc_idx < d->sw_ring_max_depth; desc_idx++) { + desc = q->ring_addr + desc_idx; + desc->req.word0 = ACC200_DMA_DESC_TYPE; + desc->req.word1 = 0; /**< Timestamp */ + desc->req.word2 = 0; + desc->req.word3 = 0; + uint64_t fcw_offset = (desc_idx << 8) + ACC200_DESC_FCW_OFFSET; + desc->req.data_ptrs[0].address = q->ring_addr_iova + fcw_offset; + desc->req.data_ptrs[0].blen = fcw_len; + desc->req.data_ptrs[0].blkid = ACC200_DMA_BLKID_FCW; + desc->req.data_ptrs[0].last = 0; + desc->req.data_ptrs[0].dma_ext = 0; + for (b_idx = 1; b_idx < ACC200_DMA_MAX_NUM_POINTERS - 1; + b_idx++) { + desc->req.data_ptrs[b_idx].blkid = ACC200_DMA_BLKID_IN; + desc->req.data_ptrs[b_idx].last = 1; + desc->req.data_ptrs[b_idx].dma_ext = 0; + b_idx++; + desc->req.data_ptrs[b_idx].blkid = + ACC200_DMA_BLKID_OUT_ENC; + desc->req.data_ptrs[b_idx].last = 1; + desc->req.data_ptrs[b_idx].dma_ext = 0; + } + /* Preset some fields of LDPC FCW */ + desc->req.fcw_ld.FCWversion = ACC200_FCW_VER; + desc->req.fcw_ld.gain_i = 1; + desc->req.fcw_ld.gain_h = 1; + } + + q->lb_in = rte_zmalloc_socket(dev->device->driver->name, + RTE_CACHE_LINE_SIZE, + RTE_CACHE_LINE_SIZE, conf->socket); + if (q->lb_in == NULL) { + rte_bbdev_log(ERR, "Failed to allocate lb_in memory"); + rte_free(q); + return -ENOMEM; + } + q->lb_in_addr_iova = rte_malloc_virt2iova(q->lb_in); + q->lb_out = rte_zmalloc_socket(dev->device->driver->name, + RTE_CACHE_LINE_SIZE, + RTE_CACHE_LINE_SIZE, conf->socket); + if (q->lb_out == NULL) { + rte_bbdev_log(ERR, "Failed to allocate lb_out memory"); + rte_free(q->lb_in); + rte_free(q); + return -ENOMEM; + } + q->lb_out_addr_iova = rte_malloc_virt2iova(q->lb_out); + q->companion_ring_addr = rte_zmalloc_socket(dev->device->driver->name, + d->sw_ring_max_depth * sizeof(*q->companion_ring_addr), + RTE_CACHE_LINE_SIZE, conf->socket); + if (q->companion_ring_addr == NULL) { + rte_bbdev_log(ERR, "Failed to allocate companion_ring memory"); + rte_free(q->lb_in); + rte_free(q->lb_out); + rte_free(q); + return -ENOMEM; + } + + /* + * Software queue ring wraps synchronously with the HW when it reaches + * the boundary of the maximum allocated queue size, no matter what the + * sw queue size is. This wrapping is guarded by setting the wrap_mask + * to represent the maximum queue size as allocated at the time when + * the device has been setup (in configure()). + * + * The queue depth is set to the queue size value (conf->queue_size). + * This limits the occupancy of the queue at any point of time, so that + * the queue does not get swamped with enqueue requests. 
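+ * + * For example (illustrative values): with sw_ring_max_depth = 1024 and a conf->queue_size of 256, the wrap mask is 1023 while at most 256 operations can be outstanding on the queue at any one time.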
+ */ + q->sw_ring_depth = conf->queue_size; + q->sw_ring_wrap_mask = d->sw_ring_max_depth - 1; + + q->op_type = conf->op_type; + + q_idx = acc200_find_free_queue_idx(dev, conf); + if (q_idx == -1) { + rte_free(q->companion_ring_addr); + rte_free(q->lb_in); + rte_free(q->lb_out); + rte_free(q); + return -1; + } + + q->qgrp_id = (q_idx >> ACC200_GRP_ID_SHIFT) & 0xF; + q->vf_id = (q_idx >> ACC200_VF_ID_SHIFT) & 0x3F; + q->aq_id = q_idx & 0xF; + q->aq_depth = 0; + if (conf->op_type == RTE_BBDEV_OP_TURBO_DEC) + q->aq_depth = (1 << d->acc200_conf.q_ul_4g.aq_depth_log2); + else if (conf->op_type == RTE_BBDEV_OP_TURBO_ENC) + q->aq_depth = (1 << d->acc200_conf.q_dl_4g.aq_depth_log2); + else if (conf->op_type == RTE_BBDEV_OP_LDPC_DEC) + q->aq_depth = (1 << d->acc200_conf.q_ul_5g.aq_depth_log2); + else if (conf->op_type == RTE_BBDEV_OP_LDPC_ENC) + q->aq_depth = (1 << d->acc200_conf.q_dl_5g.aq_depth_log2); + else if (conf->op_type == RTE_BBDEV_OP_FFT) + q->aq_depth = (1 << d->acc200_conf.q_fft.aq_depth_log2); + + q->mmio_reg_enqueue = RTE_PTR_ADD(d->mmio_base, + queue_offset(d->pf_device, + q->vf_id, q->qgrp_id, q->aq_id)); + + rte_bbdev_log_debug( + "Setup dev%u q%u: qgrp_id=%u, vf_id=%u, aq_id=%u, aq_depth=%u, mmio_reg_enqueue=%p base %p\n", + dev->data->dev_id, queue_id, q->qgrp_id, q->vf_id, + q->aq_id, q->aq_depth, q->mmio_reg_enqueue, + d->mmio_base); + + dev->data->queues[queue_id].queue_private = q; + return 0; +} + + +static int +acc200_queue_stop(struct rte_bbdev *dev, uint16_t queue_id) +{ + struct acc200_queue *q; + q = dev->data->queues[queue_id].queue_private; + rte_bbdev_log(INFO, "Queue Stop %d H/T/D %d %d %x OpType %d", + queue_id, q->sw_ring_head, q->sw_ring_tail, + q->sw_ring_depth, q->op_type); + /* ignore all operations in flight and clear counters */ + q->sw_ring_tail = q->sw_ring_head; + q->aq_enqueued = 0; + q->aq_dequeued = 0; + dev->data->queues[queue_id].queue_stats.enqueued_count = 0; + dev->data->queues[queue_id].queue_stats.dequeued_count = 0; + dev->data->queues[queue_id].queue_stats.enqueue_err_count = 0; + dev->data->queues[queue_id].queue_stats.dequeue_err_count = 0; + dev->data->queues[queue_id].queue_stats.enqueue_warn_count = 0; + dev->data->queues[queue_id].queue_stats.dequeue_warn_count = 0; + return 0; +} + +/* Release ACC200 queue */ +static int +acc200_queue_release(struct rte_bbdev *dev, uint16_t q_id) +{ + struct acc200_device *d = dev->data->dev_private; + struct acc200_queue *q = dev->data->queues[q_id].queue_private; + + if (q != NULL) { + /* Mark the Queue as un-assigned */ + d->q_assigned_bit_map[q->qgrp_id] &= (0xFFFFFFFF - + (1 << q->aq_id)); + rte_free(q->companion_ring_addr); + rte_free(q->lb_in); + rte_free(q->lb_out); + rte_free(q); + dev->data->queues[q_id].queue_private = NULL; + } + + return 0; +} + /* Get ACC200 device info */ static void acc200_dev_info_get(struct rte_bbdev *dev, @@ -289,8 +789,12 @@ } static const struct rte_bbdev_ops acc200_bbdev_ops = { + .setup_queues = acc200_setup_queues, .close = acc200_dev_close, .info_get = acc200_dev_info_get, + .queue_setup = acc200_queue_setup, + .queue_release = acc200_queue_release, + .queue_stop = acc200_queue_stop, }; /* ACC200 PCI PF address map */ From patchwork Fri Jul 8 00:01:38 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Chautru, Nicolas" X-Patchwork-Id: 113817 X-Patchwork-Delegate: gakhil@marvell.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from 
mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 803A3A0543; Fri, 8 Jul 2022 02:16:46 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 7EB6642B72; Fri, 8 Jul 2022 02:16:11 +0200 (CEST) Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by mails.dpdk.org (Postfix) with ESMTP id DE83540041 for ; Fri, 8 Jul 2022 02:16:06 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1657239367; x=1688775367; h=from:to:cc:subject:date:message-id:in-reply-to: references; bh=UVWjXr2NJpXBQE8CTt8PqXBMLEQDDzGckFbe/adSlZg=; b=jJoBdqsN++zIe9V518oby5Vg0HfFjfUgZ8mZteRUCSHBJxHvNm9iKSm/ osz7WqdnrJbYFg0GFNFilEEjyyuQpy6EWb2et/SglKgNf3b5o1nB8JTsH 23rZGqa1yHFWPxDykz6q1pdok59UIO+Q1csOVaueqBdy4cD5WAJ/cCLwf XKDA2pSTAAat9gEHkk9cGEf3yuPX8KiHX/bcQEkMJg6eNZhxJJRTy70uP qpm61sffs2hV6vGCFI/UGlZ4cHyx1UaHjmCisW8Jy9WXb/Kv04BBOd6U1 rbB1wXiHFXQcKUo+4vKaUhQXfPho9+zDw2lbPaTAUL+/IIE+w0Bd4EYRK g==; X-IronPort-AV: E=McAfee;i="6400,9594,10401"; a="264563084" X-IronPort-AV: E=Sophos;i="5.92,253,1650956400"; d="scan'208";a="264563084" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 07 Jul 2022 17:16:04 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.92,253,1650956400"; d="scan'208";a="591387538" Received: from skx-5gnr-sc12-4.sc.intel.com ([172.25.69.210]) by orsmga007.jf.intel.com with ESMTP; 07 Jul 2022 17:16:04 -0700 From: Nicolas Chautru To: dev@dpdk.org, thomas@monjalon.net, gakhil@marvell.com, hemant.agrawal@nxp.com, trix@redhat.com Cc: maxime.coquelin@redhat.com, mdr@ashroe.eu, bruce.richardson@intel.com, david.marchand@redhat.com, stephen@networkplumber.org, Nicolas Chautru Subject: [PATCH v1 05/10] baseband/acc200: add LDPC processing functions Date: Thu, 7 Jul 2022 17:01:38 -0700 Message-Id: <1657238503-143836-6-git-send-email-nicolas.chautru@intel.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1657238503-143836-1-git-send-email-nicolas.chautru@intel.com> References: <1657238503-143836-1-git-send-email-nicolas.chautru@intel.com> X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Adding LDPC encode and decode processing functions. 
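For context, a minimal sketch of driving these new paths from an application (illustrative only, not part of this patch; assumes <rte_bbdev.h> is included and the ops array comes from a bbdev op pool with data mbufs attached):

	uint16_t n_enq, n_deq = 0;

	/* Submit a burst of operations to an LDPC encode queue */
	n_enq = rte_bbdev_enqueue_ldpc_enc_ops(dev_id, queue_id, ops, burst_sz);
	/* Poll until the device has returned every accepted operation */
	while (n_deq < n_enq)
		n_deq += rte_bbdev_dequeue_ldpc_enc_ops(dev_id, queue_id,
				&ops[n_deq], n_enq - n_deq);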
Signed-off-by: Nicolas Chautru --- drivers/baseband/acc200/rte_acc200_pmd.c | 2116 +++++++++++++++++++++++++++++- 1 file changed, 2112 insertions(+), 4 deletions(-) diff --git a/drivers/baseband/acc200/rte_acc200_pmd.c b/drivers/baseband/acc200/rte_acc200_pmd.c index ec082f1..42cf2c8 100644 --- a/drivers/baseband/acc200/rte_acc200_pmd.c +++ b/drivers/baseband/acc200/rte_acc200_pmd.c @@ -697,15 +697,50 @@ return 0; } +static inline void +acc200_print_op(struct rte_bbdev_dec_op *op, enum rte_bbdev_op_type op_type, + uint16_t index) +{ + if (op == NULL) + return; + if (op_type == RTE_BBDEV_OP_LDPC_DEC) + rte_bbdev_log(INFO, + " Op 5GUL %d %d %d %d %d %d %d %d %d %d %d %d", + index, + op->ldpc_dec.basegraph, op->ldpc_dec.z_c, + op->ldpc_dec.n_cb, op->ldpc_dec.q_m, + op->ldpc_dec.n_filler, op->ldpc_dec.cb_params.e, + op->ldpc_dec.op_flags, op->ldpc_dec.rv_index, + op->ldpc_dec.iter_max, op->ldpc_dec.iter_count, + op->ldpc_dec.harq_combined_input.length + ); + else if (op_type == RTE_BBDEV_OP_LDPC_ENC) { + struct rte_bbdev_enc_op *op_dl = (struct rte_bbdev_enc_op *) op; + rte_bbdev_log(INFO, + " Op 5GDL %d %d %d %d %d %d %d %d %d", + index, + op_dl->ldpc_enc.basegraph, op_dl->ldpc_enc.z_c, + op_dl->ldpc_enc.n_cb, op_dl->ldpc_enc.q_m, + op_dl->ldpc_enc.n_filler, op_dl->ldpc_enc.cb_params.e, + op_dl->ldpc_enc.op_flags, op_dl->ldpc_enc.rv_index + ); + } +} static int acc200_queue_stop(struct rte_bbdev *dev, uint16_t queue_id) { struct acc200_queue *q; + struct rte_bbdev_dec_op *op; + uint16_t i; q = dev->data->queues[queue_id].queue_private; rte_bbdev_log(INFO, "Queue Stop %d H/T/D %d %d %x OpType %d", queue_id, q->sw_ring_head, q->sw_ring_tail, q->sw_ring_depth, q->op_type); + for (i = 0; i < q->sw_ring_depth; ++i) { + op = (q->ring_addr + i)->req.op_addr; + acc200_print_op(op, q->op_type, i); + } /* ignore all operations in flight and clear counters */ q->sw_ring_tail = q->sw_ring_head; q->aq_enqueued = 0; @@ -748,6 +783,43 @@ struct acc200_device *d = dev->data->dev_private; int i; static const struct rte_bbdev_op_cap bbdev_capabilities[] = { + { + .type = RTE_BBDEV_OP_LDPC_ENC, + .cap.ldpc_enc = { + .capability_flags = + RTE_BBDEV_LDPC_RATE_MATCH | + RTE_BBDEV_LDPC_CRC_24B_ATTACH | + RTE_BBDEV_LDPC_INTERLEAVER_BYPASS, + .num_buffers_src = + RTE_BBDEV_LDPC_MAX_CODE_BLOCKS, + .num_buffers_dst = + RTE_BBDEV_LDPC_MAX_CODE_BLOCKS, + } + }, + { + .type = RTE_BBDEV_OP_LDPC_DEC, + .cap.ldpc_dec = { + .capability_flags = + RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK | + RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP | + RTE_BBDEV_LDPC_CRC_TYPE_24A_CHECK | + RTE_BBDEV_LDPC_CRC_TYPE_16_CHECK | + RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE | + RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE | + RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE | + RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS | + RTE_BBDEV_LDPC_DEC_SCATTER_GATHER | + RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION | + RTE_BBDEV_LDPC_LLR_COMPRESSION, + .llr_size = 8, + .llr_decimals = 1, + .num_buffers_src = + RTE_BBDEV_LDPC_MAX_CODE_BLOCKS, + .num_buffers_hard_out = + RTE_BBDEV_LDPC_MAX_CODE_BLOCKS, + .num_buffers_soft_out = 0, + } + }, RTE_BBDEV_END_OF_CAPABILITIES_LIST() }; @@ -764,13 +836,15 @@ dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0; dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] = 0; dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = 0; - dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] = 0; - dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] = 0; + dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] = d->acc200_conf.q_ul_5g.num_aqs_per_groups * + d->acc200_conf.q_ul_5g.num_qgroups; + dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] = 
d->acc200_conf.q_dl_5g.num_aqs_per_groups * + d->acc200_conf.q_dl_5g.num_qgroups; dev_info->num_queues[RTE_BBDEV_OP_FFT] = 0; dev_info->queue_priority[RTE_BBDEV_OP_TURBO_DEC] = 0; dev_info->queue_priority[RTE_BBDEV_OP_TURBO_ENC] = 0; - dev_info->queue_priority[RTE_BBDEV_OP_LDPC_DEC] = 0; - dev_info->queue_priority[RTE_BBDEV_OP_LDPC_ENC] = 0; + dev_info->queue_priority[RTE_BBDEV_OP_LDPC_DEC] = d->acc200_conf.q_ul_5g.num_qgroups; + dev_info->queue_priority[RTE_BBDEV_OP_LDPC_ENC] = d->acc200_conf.q_dl_5g.num_qgroups; dev_info->queue_priority[RTE_BBDEV_OP_FFT] = 0; dev_info->max_num_queues = 0; for (i = RTE_BBDEV_OP_NONE; i <= RTE_BBDEV_OP_FFT; i++) @@ -820,6 +894,2036 @@ return bitmap & bitmask; } +static inline char * +mbuf_append(struct rte_mbuf *m_head, struct rte_mbuf *m, uint16_t len) +{ + if (unlikely(len > rte_pktmbuf_tailroom(m))) + return NULL; + + char *tail = (char *)m->buf_addr + m->data_off + m->data_len; + m->data_len = (uint16_t)(m->data_len + len); + m_head->pkt_len = (m_head->pkt_len + len); + return tail; +} + +/* Compute value of k0. + * Based on 3GPP 38.212 Table 5.4.2.1-2 + * Starting position of different redundancy versions, k0 + */ +static inline uint16_t +get_k0(uint16_t n_cb, uint16_t z_c, uint8_t bg, uint8_t rv_index) +{ + if (rv_index == 0) + return 0; + uint16_t n = (bg == 1 ? ACC200_N_ZC_1 : ACC200_N_ZC_2) * z_c; + if (n_cb == n) { + if (rv_index == 1) + return (bg == 1 ? ACC200_K0_1_1 : ACC200_K0_1_2) * z_c; + else if (rv_index == 2) + return (bg == 1 ? ACC200_K0_2_1 : ACC200_K0_2_2) * z_c; + else + return (bg == 1 ? ACC200_K0_3_1 : ACC200_K0_3_2) * z_c; + } + /* LBRM case - includes a division by N */ + if (unlikely(z_c == 0)) + return 0; + if (rv_index == 1) + return (((bg == 1 ? ACC200_K0_1_1 : ACC200_K0_1_2) * n_cb) + / n) * z_c; + else if (rv_index == 2) + return (((bg == 1 ? ACC200_K0_2_1 : ACC200_K0_2_2) * n_cb) + / n) * z_c; + else + return (((bg == 1 ? ACC200_K0_3_1 : ACC200_K0_3_2) * n_cb) + / n) * z_c; +} + +/* Fill in a frame control word for LDPC encoding. */ +static inline void +acc200_fcw_le_fill(const struct rte_bbdev_enc_op *op, + struct acc200_fcw_le *fcw, int num_cb, uint32_t default_e) +{ + fcw->qm = op->ldpc_enc.q_m; + fcw->nfiller = op->ldpc_enc.n_filler; + fcw->BG = (op->ldpc_enc.basegraph - 1); + fcw->Zc = op->ldpc_enc.z_c; + fcw->ncb = op->ldpc_enc.n_cb; + fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_enc.basegraph, + op->ldpc_enc.rv_index); + fcw->rm_e = (default_e == 0) ? op->ldpc_enc.cb_params.e : default_e; + fcw->crc_select = check_bit(op->ldpc_enc.op_flags, + RTE_BBDEV_LDPC_CRC_24B_ATTACH); + fcw->bypass_intlv = check_bit(op->ldpc_enc.op_flags, + RTE_BBDEV_LDPC_INTERLEAVER_BYPASS); + fcw->mcb_count = num_cb; +} + +/* Convert offset to harq index for harq_layout structure */ +static inline uint32_t hq_index(uint32_t offset) +{ + return (offset >> ACC200_HARQ_OFFSET_SHIFT) & ACC200_HARQ_OFFSET_MASK; +} + +/* Fill in a frame control word for LDPC decoding. 
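+ * The FCW carries the per-operation decode parameters (modulation order, lifting size Zc, circular buffer length ncb, starting position k0, HARQ input/output sizes) that the hardware reads alongside the DMA descriptor.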
*/ +static inline void +acc200_fcw_ld_fill(struct rte_bbdev_dec_op *op, struct acc200_fcw_ld *fcw, + union acc200_harq_layout_data *harq_layout) +{ + uint16_t harq_out_length, harq_in_length, ncb_p, k0_p, parity_offset; + uint32_t harq_index; + uint32_t l; + bool harq_prun = false; + + fcw->qm = op->ldpc_dec.q_m; + fcw->nfiller = op->ldpc_dec.n_filler; + fcw->BG = (op->ldpc_dec.basegraph - 1); + fcw->Zc = op->ldpc_dec.z_c; + fcw->ncb = op->ldpc_dec.n_cb; + fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_dec.basegraph, + op->ldpc_dec.rv_index); + if (op->ldpc_dec.code_block_mode == RTE_BBDEV_CODE_BLOCK) + fcw->rm_e = op->ldpc_dec.cb_params.e; + else + fcw->rm_e = (op->ldpc_dec.tb_params.r < + op->ldpc_dec.tb_params.cab) ? + op->ldpc_dec.tb_params.ea : + op->ldpc_dec.tb_params.eb; + + if (unlikely(check_bit(op->ldpc_dec.op_flags, + RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE) && + (op->ldpc_dec.harq_combined_input.length == 0))) { + rte_bbdev_log(WARNING, "Null HARQ input size provided"); + /* Disable HARQ input in that case to carry forward */ + op->ldpc_dec.op_flags ^= RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE; + } + if (unlikely(fcw->rm_e == 0)) { + rte_bbdev_log(WARNING, "Null E input provided"); + fcw->rm_e = 2; + } + + fcw->hcin_en = check_bit(op->ldpc_dec.op_flags, + RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE); + fcw->hcout_en = check_bit(op->ldpc_dec.op_flags, + RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE); + fcw->crc_select = check_bit(op->ldpc_dec.op_flags, + RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK); + fcw->bypass_dec = 0; + fcw->bypass_intlv = check_bit(op->ldpc_dec.op_flags, + RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS); + if (op->ldpc_dec.q_m == 1) { + fcw->bypass_intlv = 1; + fcw->qm = 2; + } + fcw->hcin_decomp_mode = check_bit(op->ldpc_dec.op_flags, + RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION); + fcw->hcout_comp_mode = check_bit(op->ldpc_dec.op_flags, + RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION); + fcw->llr_pack_mode = check_bit(op->ldpc_dec.op_flags, + RTE_BBDEV_LDPC_LLR_COMPRESSION); + harq_index = hq_index(op->ldpc_dec.harq_combined_output.offset); +#ifdef ACC200_EXT_MEM + /* Limit cases when HARQ pruning is valid */ + harq_prun = ((op->ldpc_dec.harq_combined_output.offset % + ACC200_HARQ_OFFSET) == 0); +#endif + if (fcw->hcin_en > 0) { + harq_in_length = op->ldpc_dec.harq_combined_input.length; + if (fcw->hcin_decomp_mode > 0) + harq_in_length = harq_in_length * 8 / 6; + harq_in_length = RTE_MIN(harq_in_length, op->ldpc_dec.n_cb + - op->ldpc_dec.n_filler); + harq_in_length = RTE_ALIGN_CEIL(harq_in_length, 64); + if ((harq_layout[harq_index].offset > 0) & harq_prun) { + rte_bbdev_log_debug("HARQ IN offset unexpected for now\n"); + fcw->hcin_size0 = harq_layout[harq_index].size0; + fcw->hcin_offset = harq_layout[harq_index].offset; + fcw->hcin_size1 = harq_in_length - + harq_layout[harq_index].offset; + } else { + fcw->hcin_size0 = harq_in_length; + fcw->hcin_offset = 0; + fcw->hcin_size1 = 0; + } + } else { + fcw->hcin_size0 = 0; + fcw->hcin_offset = 0; + fcw->hcin_size1 = 0; + } + + fcw->itmax = op->ldpc_dec.iter_max; + fcw->itstop = check_bit(op->ldpc_dec.op_flags, + RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE); + fcw->cnu_algo = ACC200_ALGO_MSA; + fcw->synd_precoder = fcw->itstop; + /* + * These are all implicitly set + * fcw->synd_post = 0; + * fcw->so_en = 0; + * fcw->so_bypass_rm = 0; + * fcw->so_bypass_intlv = 0; + * fcw->dec_convllr = 0; + * fcw->hcout_convllr = 0; + * fcw->hcout_size1 = 0; + * fcw->so_it = 0; + * fcw->hcout_offset = 0; + * fcw->negstop_th = 0; + * fcw->negstop_it = 0; + * fcw->negstop_en = 0; + * fcw->gain_i 
= 1; + * fcw->gain_h = 1; + */ + if (fcw->hcout_en > 0) { + parity_offset = (op->ldpc_dec.basegraph == 1 ? 20 : 8) + * op->ldpc_dec.z_c - op->ldpc_dec.n_filler; + k0_p = (fcw->k0 > parity_offset) ? + fcw->k0 - op->ldpc_dec.n_filler : fcw->k0; + ncb_p = fcw->ncb - op->ldpc_dec.n_filler; + l = k0_p + fcw->rm_e; + harq_out_length = (uint16_t) fcw->hcin_size0; + harq_out_length = RTE_MIN(RTE_MAX(harq_out_length, l), ncb_p); + harq_out_length = RTE_ALIGN_CEIL(harq_out_length, 64); + if ((k0_p > fcw->hcin_size0 + ACC200_HARQ_OFFSET_THRESHOLD) && + harq_prun) { + fcw->hcout_size0 = (uint16_t) fcw->hcin_size0; + fcw->hcout_offset = k0_p & 0xFFC0; + fcw->hcout_size1 = harq_out_length - fcw->hcout_offset; + } else { + fcw->hcout_size0 = harq_out_length; + fcw->hcout_size1 = 0; + fcw->hcout_offset = 0; + } + harq_layout[harq_index].offset = fcw->hcout_offset; + harq_layout[harq_index].size0 = fcw->hcout_size0; + } else { + fcw->hcout_size0 = 0; + fcw->hcout_size1 = 0; + fcw->hcout_offset = 0; + } + + fcw->tb_crc_select = 0; + if (check_bit(op->ldpc_dec.op_flags, + RTE_BBDEV_LDPC_CRC_TYPE_24A_CHECK)) + fcw->tb_crc_select = 2; + if (check_bit(op->ldpc_dec.op_flags, + RTE_BBDEV_LDPC_CRC_TYPE_16_CHECK)) + fcw->tb_crc_select = 1; +} + +/** + * Fills descriptor with data pointers of one block type. + * + * @param desc + * Pointer to DMA descriptor. + * @param input + * Pointer to pointer to input data which will be encoded. It can be changed + * and points to next segment in scatter-gather case. + * @param offset + * Input offset in rte_mbuf structure. It is used for calculating the point + * where data is starting. + * @param cb_len + * Length of currently processed Code Block + * @param seg_total_left + * It indicates how many bytes still left in segment (mbuf) for further + * processing. + * @param op_flags + * Store information about device capabilities + * @param next_triplet + * Index for ACC200 DMA Descriptor triplet + * @param scattergather + * Flag to support scatter-gather for the mbuf + * + * @return + * Returns index of next triplet on success, other value if lengths of + * pkt and processed cb do not match. + * + */ +static inline int +acc200_dma_fill_blk_type_in(struct acc200_dma_req_desc *desc, + struct rte_mbuf **input, uint32_t *offset, uint32_t cb_len, + uint32_t *seg_total_left, int next_triplet, + bool scattergather) +{ + uint32_t part_len; + struct rte_mbuf *m = *input; + if (scattergather) + part_len = (*seg_total_left < cb_len) ? + *seg_total_left : cb_len; + else + part_len = cb_len; + cb_len -= part_len; + *seg_total_left -= part_len; + + desc->data_ptrs[next_triplet].address = + rte_pktmbuf_iova_offset(m, *offset); + desc->data_ptrs[next_triplet].blen = part_len; + desc->data_ptrs[next_triplet].blkid = ACC200_DMA_BLKID_IN; + desc->data_ptrs[next_triplet].last = 0; + desc->data_ptrs[next_triplet].dma_ext = 0; + *offset += part_len; + next_triplet++; + + while (cb_len > 0) { + if (next_triplet < ACC200_DMA_MAX_NUM_POINTERS_IN && m->next != NULL) { + + m = m->next; + *seg_total_left = rte_pktmbuf_data_len(m); + part_len = (*seg_total_left < cb_len) ? 
+				*seg_total_left :
+				cb_len;
+			desc->data_ptrs[next_triplet].address =
+					rte_pktmbuf_iova_offset(m, 0);
+			desc->data_ptrs[next_triplet].blen = part_len;
+			desc->data_ptrs[next_triplet].blkid =
+					ACC200_DMA_BLKID_IN;
+			desc->data_ptrs[next_triplet].last = 0;
+			desc->data_ptrs[next_triplet].dma_ext = 0;
+			cb_len -= part_len;
+			*seg_total_left -= part_len;
+			/* Initializing offset for next segment (mbuf) */
+			*offset = part_len;
+			next_triplet++;
+		} else {
+			rte_bbdev_log(ERR,
+				"Some data still left for processing: "
+				"data_left: %u, next_triplet: %u, next_mbuf: %p",
+				cb_len, next_triplet, m->next);
+			return -EINVAL;
+		}
+	}
+	/* Storing new mbuf as it could be changed in scatter-gather case */
+	*input = m;
+
+	return next_triplet;
+}
+
+/* Fills descriptor with data pointers of one block type.
+ * Returns index of next triplet.
+ */
+static inline int
+acc200_dma_fill_blk_type(struct acc200_dma_req_desc *desc,
+		struct rte_mbuf *mbuf, uint32_t offset,
+		uint32_t len, int next_triplet, int blk_id)
+{
+	desc->data_ptrs[next_triplet].address =
+			rte_pktmbuf_iova_offset(mbuf, offset);
+	desc->data_ptrs[next_triplet].blen = len;
+	desc->data_ptrs[next_triplet].blkid = blk_id;
+	desc->data_ptrs[next_triplet].last = 0;
+	desc->data_ptrs[next_triplet].dma_ext = 0;
+	next_triplet++;
+
+	return next_triplet;
+}
+
+static inline void
+acc200_header_init(struct acc200_dma_req_desc *desc)
+{
+	desc->word0 = ACC200_DMA_DESC_TYPE;
+	desc->word1 = 0; /**< Timestamp could be disabled */
+	desc->word2 = 0;
+	desc->word3 = 0;
+	desc->numCBs = 1;
+}
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+/* Check if any input data is unexpectedly left for processing */
+static inline int
+check_mbuf_total_left(uint32_t mbuf_total_left)
+{
+	if (mbuf_total_left == 0)
+		return 0;
+	rte_bbdev_log(ERR,
+		"Some data still left for processing: mbuf_total_left = %u",
+		mbuf_total_left);
+	return -EINVAL;
+}
+#endif
+
+static inline int
+acc200_dma_desc_le_fill(struct rte_bbdev_enc_op *op,
+		struct acc200_dma_req_desc *desc, struct rte_mbuf **input,
+		struct rte_mbuf *output, uint32_t *in_offset,
+		uint32_t *out_offset, uint32_t *out_length,
+		uint32_t *mbuf_total_left, uint32_t *seg_total_left)
+{
+	int next_triplet = 1; /* FCW already done */
+	uint16_t K, in_length_in_bits, in_length_in_bytes;
+	struct rte_bbdev_op_ldpc_enc *enc = &op->ldpc_enc;
+
+	acc200_header_init(desc);
+	K = (enc->basegraph == 1 ?
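+/*
+ * A worked sketch of the payload sizing computed below (hypothetical
+ * helper, not part of the patch): K is 22*Zc for basegraph 1 and 10*Zc
+ * for basegraph 2, minus filler bits and, when CRC attachment is
+ * requested, the 24 CRC bits appended by the device, converted to bytes.
+ *
+ *	static inline uint16_t
+ *	example_ldpc_enc_in_bytes(uint8_t bg, uint16_t zc,
+ *			uint16_t n_filler, bool crc_attach)
+ *	{
+ *		uint16_t k = (bg == 1 ? 22 : 10) * zc;
+ *		uint16_t bits = k - n_filler - (crc_attach ? 24 : 0);
+ *
+ *		return bits >> 3;
+ *	}
+ *
+ * E.g. bg = 1, zc = 384, n_filler = 0 with CRC attached gives
+ * (8448 - 24) / 8 = 1053 bytes read from the host per code block.
+ */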
22 : 10) * enc->z_c; + in_length_in_bits = K - enc->n_filler; + if ((enc->op_flags & RTE_BBDEV_LDPC_CRC_24A_ATTACH) || + (enc->op_flags & RTE_BBDEV_LDPC_CRC_24B_ATTACH)) + in_length_in_bits -= 24; + in_length_in_bytes = in_length_in_bits >> 3; + + if (unlikely((*mbuf_total_left == 0) || + (*mbuf_total_left < in_length_in_bytes))) { + rte_bbdev_log(ERR, + "Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u", + *mbuf_total_left, in_length_in_bytes); + return -1; + } + + next_triplet = acc200_dma_fill_blk_type_in(desc, input, in_offset, + in_length_in_bytes, + seg_total_left, next_triplet, + check_bit(op->ldpc_enc.op_flags, + RTE_BBDEV_LDPC_ENC_SCATTER_GATHER)); + if (unlikely(next_triplet < 0)) { + rte_bbdev_log(ERR, + "Mismatch between data to process and mbuf data length in bbdev_op: %p", + op); + return -1; + } + desc->data_ptrs[next_triplet - 1].last = 1; + desc->m2dlen = next_triplet; + *mbuf_total_left -= in_length_in_bytes; + + /* Set output length */ + /* Integer round up division by 8 */ + *out_length = (enc->cb_params.e + 7) >> 3; + + next_triplet = acc200_dma_fill_blk_type(desc, output, *out_offset, + *out_length, next_triplet, ACC200_DMA_BLKID_OUT_ENC); + op->ldpc_enc.output.length += *out_length; + *out_offset += *out_length; + desc->data_ptrs[next_triplet - 1].last = 1; + desc->data_ptrs[next_triplet - 1].dma_ext = 0; + desc->d2mlen = next_triplet - desc->m2dlen; + + desc->op_addr = op; + + return 0; +} + +static inline int +acc200_dma_desc_ld_fill(struct rte_bbdev_dec_op *op, + struct acc200_dma_req_desc *desc, + struct rte_mbuf **input, struct rte_mbuf *h_output, + uint32_t *in_offset, uint32_t *h_out_offset, + uint32_t *h_out_length, uint32_t *mbuf_total_left, + uint32_t *seg_total_left, + struct acc200_fcw_ld *fcw) +{ + struct rte_bbdev_op_ldpc_dec *dec = &op->ldpc_dec; + int next_triplet = 1; /* FCW already done */ + uint32_t input_length; + uint16_t output_length, crc24_overlap = 0; + uint16_t sys_cols, K, h_p_size, h_np_size; + bool h_comp = check_bit(dec->op_flags, + RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION); + + acc200_header_init(desc); + + if (check_bit(op->ldpc_dec.op_flags, + RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP)) + crc24_overlap = 24; + + /* Compute some LDPC BG lengths */ + input_length = fcw->rm_e; + if (check_bit(op->ldpc_dec.op_flags, + RTE_BBDEV_LDPC_LLR_COMPRESSION)) + input_length = (input_length * 3 + 3) / 4; + sys_cols = (dec->basegraph == 1) ? 
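+/*
+ * Note on the 3/4 scaling used here and for the HARQ buffers below:
+ * with 6-bit compression four values pack into three bytes, so a
+ * compressed buffer holds ceil(3 * len / 4) bytes, which the driver
+ * computes as (len * 3 + 3) / 4. A sketch (hypothetical helper):
+ *
+ *	static inline uint32_t
+ *	example_compressed_len(uint32_t len)
+ *	{
+ *		return (len * 3 + 3) / 4;
+ *	}
+ *
+ * E.g. 100 LLRs compress into (300 + 3) / 4 = 75 bytes.
+ */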
22 : 10;
+	K = sys_cols * dec->z_c;
+	output_length = K - dec->n_filler - crc24_overlap;
+
+	if (unlikely((*mbuf_total_left == 0) ||
+			(*mbuf_total_left < input_length))) {
+		rte_bbdev_log(ERR,
+				"Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
+				*mbuf_total_left, input_length);
+		return -1;
+	}
+
+	next_triplet = acc200_dma_fill_blk_type_in(desc, input,
+			in_offset, input_length,
+			seg_total_left, next_triplet,
+			check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_DEC_SCATTER_GATHER));
+
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+
+	if (check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
+		if (op->ldpc_dec.harq_combined_input.data == 0) {
+			rte_bbdev_log(ERR, "HARQ input is not defined");
+			return -1;
+		}
+		h_p_size = fcw->hcin_size0 + fcw->hcin_size1;
+		if (h_comp)
+			h_p_size = (h_p_size * 3 + 3) / 4;
+		acc200_dma_fill_blk_type(
+				desc,
+				op->ldpc_dec.harq_combined_input.data,
+				op->ldpc_dec.harq_combined_input.offset,
+				h_p_size,
+				next_triplet,
+				ACC200_DMA_BLKID_IN_HARQ);
+		next_triplet++;
+	}
+
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->m2dlen = next_triplet;
+	*mbuf_total_left -= input_length;
+
+	next_triplet = acc200_dma_fill_blk_type(desc, h_output,
+			*h_out_offset, output_length >> 3, next_triplet,
+			ACC200_DMA_BLKID_OUT_HARD);
+
+	if (check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
+		if (op->ldpc_dec.harq_combined_output.data == 0) {
+			rte_bbdev_log(ERR, "HARQ output is not defined");
+			return -1;
+		}
+
+		/* Pruned size of the HARQ */
+		h_p_size = fcw->hcout_size0 + fcw->hcout_size1;
+		/* Non-Pruned size of the HARQ */
+		h_np_size = fcw->hcout_offset > 0 ?
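+/*
+ * (Sizing note for the expressions below, as read from the code: the DMA
+ * moves the pruned span h_p_size = hcout_size0 + hcout_size1, while the
+ * length reported back in harq_combined_output.length is the un-pruned
+ * span hcout_offset + hcout_size1 whenever an offset is in effect, so the
+ * application still sees the full circular-buffer extent.)
+ */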
+ fcw->hcout_offset + fcw->hcout_size1 : + h_p_size; + if (h_comp) { + h_np_size = (h_np_size * 3 + 3) / 4; + h_p_size = (h_p_size * 3 + 3) / 4; + } + dec->harq_combined_output.length = h_np_size; + acc200_dma_fill_blk_type( + desc, + dec->harq_combined_output.data, + dec->harq_combined_output.offset, + h_p_size, + next_triplet, + ACC200_DMA_BLKID_OUT_HARQ); + + next_triplet++; + } + + *h_out_length = output_length >> 3; + dec->hard_output.length += *h_out_length; + *h_out_offset += *h_out_length; + desc->data_ptrs[next_triplet - 1].last = 1; + desc->d2mlen = next_triplet - desc->m2dlen; + + desc->op_addr = op; + + return 0; +} + +static inline void +acc200_dma_desc_ld_update(struct rte_bbdev_dec_op *op, + struct acc200_dma_req_desc *desc, + struct rte_mbuf *input, struct rte_mbuf *h_output, + uint32_t *in_offset, uint32_t *h_out_offset, + uint32_t *h_out_length, + union acc200_harq_layout_data *harq_layout) +{ + int next_triplet = 1; /* FCW already done */ + desc->data_ptrs[next_triplet].address = + rte_pktmbuf_iova_offset(input, *in_offset); + next_triplet++; + + if (check_bit(op->ldpc_dec.op_flags, + RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) { + struct rte_bbdev_op_data hi = op->ldpc_dec.harq_combined_input; + desc->data_ptrs[next_triplet].address = + rte_pktmbuf_iova_offset(hi.data, hi.offset); + next_triplet++; + } + + desc->data_ptrs[next_triplet].address = + rte_pktmbuf_iova_offset(h_output, *h_out_offset); + *h_out_length = desc->data_ptrs[next_triplet].blen; + next_triplet++; + + if (check_bit(op->ldpc_dec.op_flags, + RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) { + /* Adjust based on previous operation */ + struct rte_bbdev_dec_op *prev_op = desc->op_addr; + op->ldpc_dec.harq_combined_output.length = + prev_op->ldpc_dec.harq_combined_output.length; + uint32_t harq_idx = hq_index( + op->ldpc_dec.harq_combined_output.offset); + uint32_t prev_harq_idx = hq_index( + prev_op->ldpc_dec.harq_combined_output.offset); + harq_layout[harq_idx].val = harq_layout[prev_harq_idx].val; + struct rte_bbdev_op_data ho = + op->ldpc_dec.harq_combined_output; + desc->data_ptrs[next_triplet].address = + rte_pktmbuf_iova_offset(ho.data, ho.offset); + next_triplet++; + } + + op->ldpc_dec.hard_output.length += *h_out_length; + desc->op_addr = op; +} + + +/* Enqueue a number of operations to HW and update software rings */ +static inline void +acc200_dma_enqueue(struct acc200_queue *q, uint16_t n, + struct rte_bbdev_stats *queue_stats) +{ + union acc200_enqueue_reg_fmt enq_req; +#ifdef RTE_BBDEV_OFFLOAD_COST + uint64_t start_time = 0; + queue_stats->acc_offload_cycles = 0; +#else + RTE_SET_USED(queue_stats); +#endif + + enq_req.val = 0; + /* Setting offset, 100b for 256 DMA Desc */ + enq_req.addr_offset = ACC200_DESC_OFFSET; + + /* Split ops into batches */ + do { + union acc200_dma_desc *desc; + uint16_t enq_batch_size; + uint64_t offset; + rte_iova_t req_elem_addr; + + enq_batch_size = RTE_MIN(n, MAX_ENQ_BATCH_SIZE); + + /* Set flag on last descriptor in a batch */ + desc = q->ring_addr + ((q->sw_ring_head + enq_batch_size - 1) & + q->sw_ring_wrap_mask); + desc->req.last_desc_in_batch = 1; + + /* Calculate the 1st descriptor's address */ + offset = ((q->sw_ring_head & q->sw_ring_wrap_mask) * + sizeof(union acc200_dma_desc)); + req_elem_addr = q->ring_addr_iova + offset; + + /* Fill enqueue struct */ + enq_req.num_elem = enq_batch_size; + /* low 6 bits are not needed */ + enq_req.req_elem_addr = (uint32_t)(req_elem_addr >> 6); + +#ifdef RTE_LIBRTE_BBDEV_DEBUG + rte_memdump(stderr, "Req sdone", desc, sizeof(*desc)); 
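+/*
+ * Sketch of the doorbell payload built below (hypothetical helper;
+ * assumes the 256-byte descriptor size implied by the desc_idx << 8
+ * arithmetic used elsewhere in this file). Descriptors are at least
+ * 64-byte aligned, so the low 6 address bits carry no information and
+ * only req_elem_addr >> 6 plus the element count reach the QMGR MMIO
+ * register.
+ *
+ *	static inline uint32_t
+ *	example_doorbell_word(rte_iova_t ring_iova, uint16_t head,
+ *			uint16_t wrap_mask)
+ *	{
+ *		rte_iova_t req = ring_iova +
+ *				((rte_iova_t)(head & wrap_mask) << 8);
+ *		return (uint32_t)(req >> 6);
+ *	}
+ */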
+#endif + rte_bbdev_log_debug( + "Enqueue %u reqs (phys %#"PRIx64") to reg %p\n", + enq_batch_size, + req_elem_addr, + (void *)q->mmio_reg_enqueue); + + rte_wmb(); + +#ifdef RTE_BBDEV_OFFLOAD_COST + /* Start time measurement for enqueue function offload. */ + start_time = rte_rdtsc_precise(); +#endif + rte_bbdev_log(DEBUG, "Debug : MMIO Enqueue"); + mmio_write(q->mmio_reg_enqueue, enq_req.val); + +#ifdef RTE_BBDEV_OFFLOAD_COST + queue_stats->acc_offload_cycles += + rte_rdtsc_precise() - start_time; +#endif + + q->aq_enqueued++; + q->sw_ring_head += enq_batch_size; + n -= enq_batch_size; + + } while (n); + + +} + +#ifdef RTE_LIBRTE_BBDEV_DEBUG + +/* Validates LDPC encoder parameters */ +static inline int +validate_ldpc_enc_op(struct rte_bbdev_enc_op *op) +{ + struct rte_bbdev_op_ldpc_enc *ldpc_enc = &op->ldpc_enc; + + /* Check Zc is valid value */ + if ((ldpc_enc->z_c > 384) || (ldpc_enc->z_c < 2)) { + rte_bbdev_log(ERR, + "Zc (%u) is out of range", + ldpc_enc->z_c); + return -1; + } + if (ldpc_enc->z_c > 256) { + if ((ldpc_enc->z_c % 32) != 0) { + rte_bbdev_log(ERR, "Invalid Zc %d", ldpc_enc->z_c); + return -1; + } + } else if (ldpc_enc->z_c > 128) { + if ((ldpc_enc->z_c % 16) != 0) { + rte_bbdev_log(ERR, "Invalid Zc %d", ldpc_enc->z_c); + return -1; + } + } else if (ldpc_enc->z_c > 64) { + if ((ldpc_enc->z_c % 8) != 0) { + rte_bbdev_log(ERR, "Invalid Zc %d", ldpc_enc->z_c); + return -1; + } + } else if (ldpc_enc->z_c > 32) { + if ((ldpc_enc->z_c % 4) != 0) { + rte_bbdev_log(ERR, "Invalid Zc %d", ldpc_enc->z_c); + return -1; + } + } else if (ldpc_enc->z_c > 16) { + if ((ldpc_enc->z_c % 2) != 0) { + rte_bbdev_log(ERR, "Invalid Zc %d", ldpc_enc->z_c); + return -1; + } + } + return 0; +} + +/* Validates LDPC decoder parameters */ +static inline int +validate_ldpc_dec_op(struct rte_bbdev_dec_op *op) +{ + struct rte_bbdev_op_ldpc_dec *ldpc_dec = &op->ldpc_dec; + /* Check Zc is valid value */ + if ((ldpc_dec->z_c > 384) || (ldpc_dec->z_c < 2)) { + rte_bbdev_log(ERR, + "Zc (%u) is out of range", + ldpc_dec->z_c); + return -1; + } + if (ldpc_dec->z_c > 256) { + if ((ldpc_dec->z_c % 32) != 0) { + rte_bbdev_log(ERR, "Invalid Zc %d", ldpc_dec->z_c); + return -1; + } + } else if (ldpc_dec->z_c > 128) { + if ((ldpc_dec->z_c % 16) != 0) { + rte_bbdev_log(ERR, "Invalid Zc %d", ldpc_dec->z_c); + return -1; + } + } else if (ldpc_dec->z_c > 64) { + if ((ldpc_dec->z_c % 8) != 0) { + rte_bbdev_log(ERR, "Invalid Zc %d", ldpc_dec->z_c); + return -1; + } + } else if (ldpc_dec->z_c > 32) { + if ((ldpc_dec->z_c % 4) != 0) { + rte_bbdev_log(ERR, "Invalid Zc %d", ldpc_dec->z_c); + return -1; + } + } else if (ldpc_dec->z_c > 16) { + if ((ldpc_dec->z_c % 2) != 0) { + rte_bbdev_log(ERR, "Invalid Zc %d", ldpc_dec->z_c); + return -1; + } + } + return 0; +} + +#endif + +/* Enqueue one encode operations for ACC200 device in CB mode + * multiplexed on the same descriptor + */ +static inline int +enqueue_ldpc_enc_n_op_cb(struct acc200_queue *q, struct rte_bbdev_enc_op **ops, + uint16_t total_enqueued_descs, int16_t num) +{ + union acc200_dma_desc *desc = NULL; + uint32_t out_length; + struct rte_mbuf *output_head, *output; + int i, next_triplet; + uint16_t in_length_in_bytes; + struct rte_bbdev_op_ldpc_enc *enc = &ops[0]->ldpc_enc; + +#ifdef RTE_LIBRTE_BBDEV_DEBUG + if (validate_ldpc_enc_op(ops[0]) == -1) { + rte_bbdev_log(ERR, "LDPC encoder validation failed"); + return -EINVAL; + } +#endif + + uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_descs) + & q->sw_ring_wrap_mask); + desc = q->ring_addr + desc_idx; + 
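+/*
+ * The Zc range/modulus ladder in the two validators above accepts
+ * exactly the 3GPP TS 38.212 lifting sizes Zc = a * 2^j with
+ * a in {2, 3, 5, 7, 9, 11, 13, 15}. A compact equivalent check
+ * (hypothetical helper, same acceptance set):
+ *
+ *	static inline int
+ *	example_zc_valid(uint16_t zc)
+ *	{
+ *		uint16_t a = zc;
+ *
+ *		if (zc < 2 || zc > 384)
+ *			return 0;
+ *		while ((a & 1) == 0)
+ *			a >>= 1;
+ *		return a <= 15;
+ *	}
+ */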
acc200_fcw_le_fill(ops[0], &desc->req.fcw_le, num, 0); + + /** This could be done at polling */ + acc200_header_init(&desc->req); + desc->req.numCBs = num; + + in_length_in_bytes = ops[0]->ldpc_enc.input.data->data_len; + out_length = (enc->cb_params.e + 7) >> 3; + desc->req.m2dlen = 1 + num; + desc->req.d2mlen = num; + next_triplet = 1; + + for (i = 0; i < num; i++) { + desc->req.data_ptrs[next_triplet].address = + rte_pktmbuf_iova_offset(ops[i]->ldpc_enc.input.data, 0); + desc->req.data_ptrs[next_triplet].blen = in_length_in_bytes; + next_triplet++; + desc->req.data_ptrs[next_triplet].address = + rte_pktmbuf_iova_offset( + ops[i]->ldpc_enc.output.data, 0); + desc->req.data_ptrs[next_triplet].blen = out_length; + next_triplet++; + ops[i]->ldpc_enc.output.length = out_length; + output_head = output = ops[i]->ldpc_enc.output.data; + mbuf_append(output_head, output, out_length); + output->data_len = out_length; + } + + desc->req.op_addr = ops[0]; + /* Keep track of pointers even when multiplexed in single descriptor */ + struct acc200_ptrs *context_ptrs = q->companion_ring_addr + desc_idx; + for (i = 0; i < num; i++) + context_ptrs->ptr[i].op_addr = ops[i]; + +#ifdef RTE_LIBRTE_BBDEV_DEBUG + rte_memdump(stderr, "FCW", &desc->req.fcw_le, + sizeof(desc->req.fcw_le) - 8); + rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc)); +#endif + + /* One CB (one op) was successfully prepared to enqueue */ + return num; +} + +/* Enqueue one encode operations for ACC200 device for a partial TB + * all codes blocks have same configuration multiplexed on the same descriptor + */ +static inline void +enqueue_ldpc_enc_part_tb(struct acc200_queue *q, struct rte_bbdev_enc_op *op, + uint16_t total_enqueued_descs, int16_t num_cbs, uint32_t e, + uint16_t in_len_B, uint32_t out_len_B, uint32_t *in_offset, + uint32_t *out_offset) +{ + + union acc200_dma_desc *desc = NULL; + struct rte_mbuf *output_head, *output; + int i, next_triplet; + struct rte_bbdev_op_ldpc_enc *enc = &op->ldpc_enc; + + + uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_descs) + & q->sw_ring_wrap_mask); + desc = q->ring_addr + desc_idx; + acc200_fcw_le_fill(op, &desc->req.fcw_le, num_cbs, e); + + /** This could be done at polling */ + acc200_header_init(&desc->req); + desc->req.numCBs = num_cbs; + + desc->req.m2dlen = 1 + num_cbs; + desc->req.d2mlen = num_cbs; + next_triplet = 1; + + for (i = 0; i < num_cbs; i++) { + desc->req.data_ptrs[next_triplet].address = + rte_pktmbuf_iova_offset(enc->input.data, + *in_offset); + *in_offset += in_len_B; + desc->req.data_ptrs[next_triplet].blen = in_len_B; + next_triplet++; + desc->req.data_ptrs[next_triplet].address = + rte_pktmbuf_iova_offset( + enc->output.data, *out_offset); + *out_offset += out_len_B; + desc->req.data_ptrs[next_triplet].blen = out_len_B; + next_triplet++; + enc->output.length += out_len_B; + output_head = output = enc->output.data; + mbuf_append(output_head, output, out_len_B); + } + +#ifdef RTE_LIBRTE_BBDEV_DEBUG + rte_memdump(stderr, "FCW", &desc->req.fcw_le, + sizeof(desc->req.fcw_le) - 8); + rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc)); +#endif + +} + +/* Enqueue one encode operations for ACC200 device in CB mode */ +static inline int +enqueue_ldpc_enc_one_op_cb(struct acc200_queue *q, struct rte_bbdev_enc_op *op, + uint16_t total_enqueued_cbs) +{ + union acc200_dma_desc *desc = NULL; + int ret; + uint32_t in_offset, out_offset, out_length, mbuf_total_left, + seg_total_left; + struct rte_mbuf *input, *output_head, *output; + +#ifdef RTE_LIBRTE_BBDEV_DEBUG + if 
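+/*
+ * Multiplexing note for enqueue_ldpc_enc_n_op_cb() above: one descriptor
+ * carries up to ACC200_MUX_5GDL_DESC code blocks sharing a single FCW,
+ * but desc->req.op_addr can hold only one op pointer, so every op is
+ * mirrored in the companion ring and recovered at dequeue time, as the
+ * dequeue path later in this file does:
+ *
+ *	ref_op[0] = desc->req.op_addr;
+ *	context_ptrs = q->companion_ring_addr + desc_idx;
+ *	for (i = 1; i < desc->req.numCBs; i++)
+ *		ref_op[i] = context_ptrs->ptr[i].op_addr;
+ */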
(validate_ldpc_enc_op(op) == -1) { + rte_bbdev_log(ERR, "LDPC encoder validation failed"); + return -EINVAL; + } +#endif + + uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs) + & q->sw_ring_wrap_mask); + desc = q->ring_addr + desc_idx; + acc200_fcw_le_fill(op, &desc->req.fcw_le, 1, 0); + + input = op->ldpc_enc.input.data; + output_head = output = op->ldpc_enc.output.data; + in_offset = op->ldpc_enc.input.offset; + out_offset = op->ldpc_enc.output.offset; + out_length = 0; + mbuf_total_left = op->ldpc_enc.input.length; + seg_total_left = rte_pktmbuf_data_len(op->ldpc_enc.input.data) + - in_offset; + + ret = acc200_dma_desc_le_fill(op, &desc->req, &input, output, + &in_offset, &out_offset, &out_length, &mbuf_total_left, + &seg_total_left); + + if (unlikely(ret < 0)) + return ret; + + mbuf_append(output_head, output, out_length); + +#ifdef RTE_LIBRTE_BBDEV_DEBUG + rte_memdump(stderr, "FCW", &desc->req.fcw_le, + sizeof(desc->req.fcw_le) - 8); + rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc)); + + if (check_mbuf_total_left(mbuf_total_left) != 0) + return -EINVAL; +#endif + /* One CB (one op) was successfully prepared to enqueue */ + return 1; +} + +/* Enqueue one encode operations for ACC200 device in TB mode. + * returns the number of descs used + */ +static inline int +enqueue_ldpc_enc_one_op_tb(struct acc200_queue *q, struct rte_bbdev_enc_op *op, + uint16_t enq_descs, uint8_t cbs_in_tb) +{ + uint8_t num_a, num_b; + uint16_t desc_idx; + uint8_t r = op->ldpc_enc.tb_params.r; + uint8_t cab = op->ldpc_enc.tb_params.cab; + union acc200_dma_desc *desc; + uint16_t init_enq_descs = enq_descs; + uint16_t input_len_B = ((op->ldpc_enc.basegraph == 1 ? 22 : 10) * + op->ldpc_enc.z_c) >> 3; + if (check_bit(op->ldpc_enc.op_flags, RTE_BBDEV_LDPC_CRC_24B_ATTACH)) + input_len_B -= 3; + + if (r < cab) { + num_a = cab - r; + num_b = cbs_in_tb - cab; + } else { + num_a = 0; + num_b = cbs_in_tb - r; + } + uint32_t in_offset = 0, out_offset = 0; + + while (num_a > 0) { + uint32_t e = op->ldpc_enc.tb_params.ea; + uint32_t out_len_B = (e + 7) >> 3; + uint8_t enq = RTE_MIN(num_a, ACC200_MUX_5GDL_DESC); + num_a -= enq; + enqueue_ldpc_enc_part_tb(q, op, enq_descs, enq, e, input_len_B, + out_len_B, &in_offset, &out_offset); + enq_descs++; + } + while (num_b > 0) { + uint32_t e = op->ldpc_enc.tb_params.eb; + uint32_t out_len_B = (e + 7) >> 3; + uint8_t enq = RTE_MIN(num_b, ACC200_MUX_5GDL_DESC); + num_b -= enq; + enqueue_ldpc_enc_part_tb(q, op, enq_descs, enq, e, input_len_B, + out_len_B, &in_offset, &out_offset); + enq_descs++; + } + + uint16_t return_descs = enq_descs - init_enq_descs; + /* Keep total number of CBs in first TB */ + desc_idx = ((q->sw_ring_head + init_enq_descs) + & q->sw_ring_wrap_mask); + desc = q->ring_addr + desc_idx; + desc->req.cbs_in_tb = return_descs; /** Actual number of descriptors */ + desc->req.op_addr = op; + + /* Set SDone on last CB descriptor for TB mode. 
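+ *
+ * Worked example of the TB split above (hypothetical numbers): with
+ * cbs_in_tb = 6, cab = 2 and r = 0, the first loop emits the two code
+ * blocks rate-matched with ea and the second loop the four blocks using
+ * eb, each batch capped at ACC200_MUX_5GDL_DESC blocks per descriptor;
+ * the first descriptor records the number of descriptors actually used
+ * and the last one is flagged here.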
*/ + desc_idx = ((q->sw_ring_head + enq_descs - 1) + & q->sw_ring_wrap_mask); + desc = q->ring_addr + desc_idx; + desc->req.sdone_enable = 1; + desc->req.irq_enable = q->irq_enable; + desc->req.op_addr = op; + return return_descs; +} + +/** Enqueue one decode operations for ACC200 device in CB mode */ +static inline int +enqueue_ldpc_dec_one_op_cb(struct acc200_queue *q, struct rte_bbdev_dec_op *op, + uint16_t total_enqueued_cbs, bool same_op) +{ + int ret, hq_len; + if (op->ldpc_dec.cb_params.e == 0) + return -EINVAL; + +#ifdef RTE_LIBRTE_BBDEV_DEBUG + if (validate_ldpc_dec_op(op) == -1) { + rte_bbdev_log(ERR, "LDPC decoder validation failed"); + return -EINVAL; + } +#endif + + union acc200_dma_desc *desc; + uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs) + & q->sw_ring_wrap_mask); + desc = q->ring_addr + desc_idx; + struct rte_mbuf *input, *h_output_head, *h_output; + uint32_t in_offset, h_out_offset, mbuf_total_left, h_out_length = 0; + input = op->ldpc_dec.input.data; + h_output_head = h_output = op->ldpc_dec.hard_output.data; + in_offset = op->ldpc_dec.input.offset; + h_out_offset = op->ldpc_dec.hard_output.offset; + mbuf_total_left = op->ldpc_dec.input.length; +#ifdef RTE_LIBRTE_BBDEV_DEBUG + if (unlikely(input == NULL)) { + rte_bbdev_log(ERR, "Invalid mbuf pointer"); + return -EFAULT; + } +#endif + union acc200_harq_layout_data *harq_layout = q->d->harq_layout; + + if (same_op) { + union acc200_dma_desc *prev_desc; + desc_idx = ((q->sw_ring_head + total_enqueued_cbs - 1) + & q->sw_ring_wrap_mask); + prev_desc = q->ring_addr + desc_idx; + uint8_t *prev_ptr = (uint8_t *) prev_desc; + uint8_t *new_ptr = (uint8_t *) desc; + /* Copy first 4 words and BDESCs */ + rte_memcpy(new_ptr, prev_ptr, ACC200_5GUL_SIZE_0); + rte_memcpy(new_ptr + ACC200_5GUL_OFFSET_0, + prev_ptr + ACC200_5GUL_OFFSET_0, + ACC200_5GUL_SIZE_1); + desc->req.op_addr = prev_desc->req.op_addr; + /* Copy FCW */ + rte_memcpy(new_ptr + ACC200_DESC_FCW_OFFSET, + prev_ptr + ACC200_DESC_FCW_OFFSET, + ACC200_FCW_LD_BLEN); + acc200_dma_desc_ld_update(op, &desc->req, input, h_output, + &in_offset, &h_out_offset, + &h_out_length, harq_layout); + } else { + struct acc200_fcw_ld *fcw; + uint32_t seg_total_left; + fcw = &desc->req.fcw_ld; + acc200_fcw_ld_fill(op, fcw, harq_layout); + + /* Special handling when using mbuf or not */ + if (check_bit(op->ldpc_dec.op_flags, + RTE_BBDEV_LDPC_DEC_SCATTER_GATHER)) + seg_total_left = rte_pktmbuf_data_len(input) + - in_offset; + else + seg_total_left = fcw->rm_e; + + ret = acc200_dma_desc_ld_fill(op, &desc->req, &input, h_output, + &in_offset, &h_out_offset, + &h_out_length, &mbuf_total_left, + &seg_total_left, fcw); + if (unlikely(ret < 0)) + return ret; + } + + /* Hard output */ + mbuf_append(h_output_head, h_output, h_out_length); + if (op->ldpc_dec.harq_combined_output.length > 0) { + /* Push the HARQ output into host memory */ + struct rte_mbuf *hq_output_head, *hq_output; + hq_output_head = op->ldpc_dec.harq_combined_output.data; + hq_output = op->ldpc_dec.harq_combined_output.data; + hq_len = op->ldpc_dec.harq_combined_output.length; + if (unlikely(!mbuf_append(hq_output_head, hq_output, + hq_len))) { + rte_bbdev_log(ERR, "HARQ output mbuf issue %d %d\n", + hq_output->buf_len, + hq_len); + return -1; + } + } + +#ifdef RTE_LIBRTE_BBDEV_DEBUG + rte_memdump(stderr, "FCW", &desc->req.fcw_ld, + sizeof(desc->req.fcw_ld) - 8); + rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc)); +#endif + + /* One CB (one op) was successfully prepared to enqueue */ + return 1; +} + + +/* Enqueue 
one decode operations for ACC200 device in TB mode */ +static inline int +enqueue_ldpc_dec_one_op_tb(struct acc200_queue *q, struct rte_bbdev_dec_op *op, + uint16_t total_enqueued_cbs, uint8_t cbs_in_tb) +{ + union acc200_dma_desc *desc = NULL; + union acc200_dma_desc *desc_first = NULL; + int ret; + uint8_t r, c; + uint32_t in_offset, h_out_offset, + h_out_length, mbuf_total_left, seg_total_left; + struct rte_mbuf *input, *h_output_head, *h_output; + uint16_t current_enqueued_cbs = 0; + uint16_t sys_cols, trail_len = 0; + +#ifdef RTE_LIBRTE_BBDEV_DEBUG + if (validate_ldpc_dec_op(op) == -1) { + rte_bbdev_log(ERR, "LDPC decoder validation failed"); + return -EINVAL; + } +#endif + + uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs) + & q->sw_ring_wrap_mask); + desc = q->ring_addr + desc_idx; + desc_first = desc; + uint64_t fcw_offset = (desc_idx << 8) + ACC200_DESC_FCW_OFFSET; + union acc200_harq_layout_data *harq_layout = q->d->harq_layout; + acc200_fcw_ld_fill(op, &desc->req.fcw_ld, harq_layout); + + input = op->ldpc_dec.input.data; + h_output_head = h_output = op->ldpc_dec.hard_output.data; + in_offset = op->ldpc_dec.input.offset; + h_out_offset = op->ldpc_dec.hard_output.offset; + h_out_length = 0; + mbuf_total_left = op->ldpc_dec.input.length; + c = op->ldpc_dec.tb_params.c; + r = op->ldpc_dec.tb_params.r; + if (check_bit(op->ldpc_dec.op_flags, + RTE_BBDEV_LDPC_CRC_TYPE_24A_CHECK)) { + sys_cols = (op->ldpc_dec.basegraph == 1) ? 22 : 10; + trail_len = sys_cols * op->ldpc_dec.z_c - + op->ldpc_dec.n_filler - 24; + } + + while (mbuf_total_left > 0 && r < c) { + if (check_bit(op->ldpc_dec.op_flags, + RTE_BBDEV_LDPC_DEC_SCATTER_GATHER)) + seg_total_left = rte_pktmbuf_data_len(input) + - in_offset; + else + seg_total_left = op->ldpc_dec.input.length; + /* Set up DMA descriptor */ + desc_idx = ((q->sw_ring_head + total_enqueued_cbs) + & q->sw_ring_wrap_mask); + desc = q->ring_addr + desc_idx; + fcw_offset = (desc_idx << 8) + ACC200_DESC_FCW_OFFSET; + desc->req.data_ptrs[0].address = q->ring_addr_iova + + fcw_offset; + desc->req.data_ptrs[0].blen = ACC200_FCW_LD_BLEN; + rte_memcpy(&desc->req.fcw_ld, &desc_first->req.fcw_ld, + ACC200_FCW_LD_BLEN); + desc->req.fcw_ld.tb_trailer_size = (c - r - 1) * trail_len; + + ret = acc200_dma_desc_ld_fill(op, &desc->req, &input, + h_output, &in_offset, &h_out_offset, + &h_out_length, + &mbuf_total_left, &seg_total_left, + &desc->req.fcw_ld); + + if (unlikely(ret < 0)) + return ret; + + /* Hard output */ + mbuf_append(h_output_head, h_output, h_out_length); + + /* Set total number of CBs in TB */ + desc->req.cbs_in_tb = cbs_in_tb; +#ifdef RTE_LIBRTE_BBDEV_DEBUG + rte_memdump(stderr, "FCW", &desc->req.fcw_td, + sizeof(desc->req.fcw_td) - 8); + rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc)); +#endif + if (check_bit(op->ldpc_dec.op_flags, + RTE_BBDEV_LDPC_DEC_SCATTER_GATHER) + && (seg_total_left == 0)) { + /* Go to the next mbuf */ + input = input->next; + in_offset = 0; + h_output = h_output->next; + h_out_offset = 0; + } + total_enqueued_cbs++; + current_enqueued_cbs++; + r++; + } + +#ifdef RTE_LIBRTE_BBDEV_DEBUG + if (check_mbuf_total_left(mbuf_total_left) != 0) + return -EINVAL; +#endif + /* Set SDone on last CB descriptor for TB mode */ + desc->req.sdone_enable = 1; + desc->req.irq_enable = q->irq_enable; + + return current_enqueued_cbs; +} + +/* Calculates number of CBs in processed encoder TB based on 'r' and input + * length. 
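+ *
+ * E.g. (hypothetical numbers) basegraph 1, z_c = 64, n_filler = 0 and
+ * CRC24B attached: k = 1408 bits, so each code block consumes
+ * (1408 - 24) / 8 = 173 bytes, and an input of length 519 with r = 0,
+ * c = 3 yields cbs_in_tb = 3.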
+ */ +static inline uint8_t +get_num_cbs_in_tb_ldpc_enc(struct rte_bbdev_op_ldpc_enc *ldpc_enc) +{ + uint8_t c, r, crc24_bits = 0; + uint16_t k = (ldpc_enc->basegraph == 1 ? 22 : 10) * ldpc_enc->z_c + - ldpc_enc->n_filler; + uint8_t cbs_in_tb = 0; + int32_t length; + + length = ldpc_enc->input.length; + r = ldpc_enc->tb_params.r; + c = ldpc_enc->tb_params.c; + crc24_bits = 0; + if (check_bit(ldpc_enc->op_flags, RTE_BBDEV_LDPC_CRC_24B_ATTACH)) + crc24_bits = 24; + while (length > 0 && r < c) { + length -= (k - crc24_bits) >> 3; + r++; + cbs_in_tb++; + } + + return cbs_in_tb; +} + +/* Calculates number of CBs in processed encoder TB based on 'r' and input + * length. + */ +static inline uint8_t +get_num_cbs_in_tb_enc(struct rte_bbdev_op_turbo_enc *turbo_enc) +{ + uint8_t c, c_neg, r, crc24_bits = 0; + uint16_t k, k_neg, k_pos; + uint8_t cbs_in_tb = 0; + int32_t length; + + length = turbo_enc->input.length; + r = turbo_enc->tb_params.r; + c = turbo_enc->tb_params.c; + c_neg = turbo_enc->tb_params.c_neg; + k_neg = turbo_enc->tb_params.k_neg; + k_pos = turbo_enc->tb_params.k_pos; + crc24_bits = 0; + if (check_bit(turbo_enc->op_flags, RTE_BBDEV_TURBO_CRC_24B_ATTACH)) + crc24_bits = 24; + while (length > 0 && r < c) { + k = (r < c_neg) ? k_neg : k_pos; + length -= (k - crc24_bits) >> 3; + r++; + cbs_in_tb++; + } + + return cbs_in_tb; +} + +/* Calculates number of CBs in processed decoder TB based on 'r' and input + * length. + */ +static inline uint16_t +get_num_cbs_in_tb_dec(struct rte_bbdev_op_turbo_dec *turbo_dec) +{ + uint8_t c, c_neg, r = 0; + uint16_t kw, k, k_neg, k_pos, cbs_in_tb = 0; + int32_t length; + + length = turbo_dec->input.length; + r = turbo_dec->tb_params.r; + c = turbo_dec->tb_params.c; + c_neg = turbo_dec->tb_params.c_neg; + k_neg = turbo_dec->tb_params.k_neg; + k_pos = turbo_dec->tb_params.k_pos; + while (length > 0 && r < c) { + k = (r < c_neg) ? k_neg : k_pos; + kw = RTE_ALIGN_CEIL(k + 4, 32) * 3; + length -= kw; + r++; + cbs_in_tb++; + } + + return cbs_in_tb; +} + +/* Calculates number of CBs in processed decoder TB based on 'r' and input + * length. + */ +static inline uint16_t +get_num_cbs_in_tb_ldpc_dec(struct rte_bbdev_op_ldpc_dec *ldpc_dec) +{ + uint16_t r, cbs_in_tb = 0; + int32_t length = ldpc_dec->input.length; + r = ldpc_dec->tb_params.r; + while (length > 0 && r < ldpc_dec->tb_params.c) { + length -= (r < ldpc_dec->tb_params.cab) ? 
+ ldpc_dec->tb_params.ea : + ldpc_dec->tb_params.eb; + r++; + cbs_in_tb++; + } + return cbs_in_tb; +} + +static inline void +acc200_enqueue_status(struct rte_bbdev_queue_data *q_data, + enum rte_bbdev_enqueue_status status) +{ + q_data->enqueue_status = status; + q_data->queue_stats.enqueue_status_count[status]++; + rte_bbdev_log(WARNING, "Enqueue Status: %s %#"PRIx64"", + rte_bbdev_enqueue_status_str(status), + q_data->queue_stats.enqueue_status_count[status]); +} + +static inline void +acc200_enqueue_invalid(struct rte_bbdev_queue_data *q_data) +{ + acc200_enqueue_status(q_data, RTE_BBDEV_ENQ_STATUS_INVALID_OP); +} + +static inline void +acc200_enqueue_ring_full(struct rte_bbdev_queue_data *q_data) +{ + acc200_enqueue_status(q_data, RTE_BBDEV_ENQ_STATUS_RING_FULL); +} + +static inline void +acc200_enqueue_queue_full(struct rte_bbdev_queue_data *q_data) +{ + acc200_enqueue_status(q_data, RTE_BBDEV_ENQ_STATUS_QUEUE_FULL); +} + +/* Number of available descriptor in ring to enqueue */ +static uint32_t +acc200_ring_avail_enq(struct acc200_queue *q) +{ + return (q->sw_ring_depth - 1 + q->sw_ring_tail - q->sw_ring_head) % q->sw_ring_depth; +} + +/* Number of available descriptor in ring to dequeue */ +static uint32_t +acc200_ring_avail_deq(struct acc200_queue *q) +{ + return (q->sw_ring_depth + q->sw_ring_head - q->sw_ring_tail) % q->sw_ring_depth; +} + +/* Check we can mux encode operations with common FCW */ +static inline int16_t +check_mux(struct rte_bbdev_enc_op **ops, uint16_t num) { + uint16_t i; + if (num <= 1) + return 1; + for (i = 1; i < num; ++i) { + /* Only mux compatible code blocks */ + if (memcmp((uint8_t *)(&ops[i]->ldpc_enc) + ACC200_ENC_OFFSET, + (uint8_t *)(&ops[0]->ldpc_enc) + + ACC200_ENC_OFFSET, + ACC200_CMP_ENC_SIZE) != 0) + return i; + } + /* Avoid multiplexing small inbound size frames */ + int Kp = (ops[0]->ldpc_enc.basegraph == 1 ? 22 : 10) * + ops[0]->ldpc_enc.z_c - ops[0]->ldpc_enc.n_filler; + if (Kp <= ACC200_LIMIT_DL_MUX_BITS) + return 1; + return num; +} + +/** Enqueue encode operations for ACC200 device in CB mode. */ +static inline uint16_t +acc200_enqueue_ldpc_enc_cb(struct rte_bbdev_queue_data *q_data, + struct rte_bbdev_enc_op **ops, uint16_t num) +{ + struct acc200_queue *q = q_data->queue_private; + int32_t avail = acc200_ring_avail_enq(q); + uint16_t i = 0; + union acc200_dma_desc *desc; + int ret, desc_idx = 0; + int16_t enq, left = num; + + while (left > 0) { + if (unlikely(avail < 1)) { + acc200_enqueue_ring_full(q_data); + break; + } + avail--; + enq = RTE_MIN(left, ACC200_MUX_5GDL_DESC); + enq = check_mux(&ops[i], enq); + if (enq > 1) { + ret = enqueue_ldpc_enc_n_op_cb(q, &ops[i], + desc_idx, enq); + if (ret < 0) { + acc200_enqueue_invalid(q_data); + break; + } + i += enq; + } else { + ret = enqueue_ldpc_enc_one_op_cb(q, ops[i], desc_idx); + if (ret < 0) { + acc200_enqueue_invalid(q_data); + break; + } + i++; + } + desc_idx++; + left = num - i; + } + + if (unlikely(i == 0)) + return 0; /* Nothing to enqueue */ + + /* Set SDone in last CB in enqueued ops for CB mode*/ + desc = q->ring_addr + ((q->sw_ring_head + desc_idx - 1) + & q->sw_ring_wrap_mask); + desc->req.sdone_enable = 1; + desc->req.irq_enable = q->irq_enable; + + acc200_dma_enqueue(q, desc_idx, &q_data->queue_stats); + + /* Update stats */ + q_data->queue_stats.enqueued_count += i; + q_data->queue_stats.enqueue_err_count += num - i; + + return i; +} + +/* Enqueue LDPC encode operations for ACC200 device in TB mode. 
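+ *
+ * (Ring-occupancy note for the helpers above: acc200_ring_avail_enq()
+ * deliberately keeps one slot unused, (depth - 1 + tail - head) % depth,
+ * so a full ring stays distinguishable from an empty one; e.g. with
+ * depth = 1024, head = 1023 and tail = 0 there are 0 slots left to
+ * enqueue but 1023 descriptors pending dequeue.)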
*/ +static uint16_t +acc200_enqueue_ldpc_enc_tb(struct rte_bbdev_queue_data *q_data, + struct rte_bbdev_enc_op **ops, uint16_t num) +{ + struct acc200_queue *q = q_data->queue_private; + int32_t avail = acc200_ring_avail_enq(q); + uint16_t i, enqueued_descs = 0; + uint8_t cbs_in_tb; + int descs_used; + + for (i = 0; i < num; ++i) { + cbs_in_tb = get_num_cbs_in_tb_ldpc_enc(&ops[i]->ldpc_enc); + /* Check if there are available space for further processing */ + if (unlikely((avail - cbs_in_tb < 0) || (cbs_in_tb == 0))) { + acc200_enqueue_ring_full(q_data); + break; + } + + descs_used = enqueue_ldpc_enc_one_op_tb(q, ops[i], + enqueued_descs, cbs_in_tb); + if (descs_used < 0) { + acc200_enqueue_invalid(q_data); + break; + } + enqueued_descs += descs_used; + avail -= descs_used; + } + if (unlikely(enqueued_descs == 0)) + return 0; /* Nothing to enqueue */ + + acc200_dma_enqueue(q, enqueued_descs, &q_data->queue_stats); + + /* Update stats */ + q_data->queue_stats.enqueued_count += i; + q_data->queue_stats.enqueue_err_count += num - i; + + return i; +} + +/* Check room in AQ for the enqueues batches into Qmgr */ +static int32_t +acc200_aq_avail(struct rte_bbdev_queue_data *q_data, uint16_t num_ops) +{ + struct acc200_queue *q = q_data->queue_private; + int32_t aq_avail = q->aq_depth - + ((q->aq_enqueued - q->aq_dequeued + + ACC200_MAX_QUEUE_DEPTH) % ACC200_MAX_QUEUE_DEPTH) + - (num_ops >> 7); + if (aq_avail <= 0) + acc200_enqueue_queue_full(q_data); + return aq_avail; +} + +/* Enqueue encode operations for ACC200 device. */ +static uint16_t +acc200_enqueue_ldpc_enc(struct rte_bbdev_queue_data *q_data, + struct rte_bbdev_enc_op **ops, uint16_t num) +{ + struct acc200_queue *q = q_data->queue_private; + int32_t aq_avail = acc200_ring_avail_enq(q); + if (unlikely((aq_avail <= 0) || (num == 0))) + return 0; + if (ops[0]->ldpc_enc.code_block_mode == RTE_BBDEV_TRANSPORT_BLOCK) + return acc200_enqueue_ldpc_enc_tb(q_data, ops, num); + else + return acc200_enqueue_ldpc_enc_cb(q_data, ops, num); +} + +/* Check we can mux encode operations with common FCW */ +static inline bool +cmp_ldpc_dec_op(struct rte_bbdev_dec_op **ops) { + /* Only mux compatible code blocks */ + if (memcmp((uint8_t *)(&ops[0]->ldpc_dec) + ACC200_DEC_OFFSET, + (uint8_t *)(&ops[1]->ldpc_dec) + + ACC200_DEC_OFFSET, ACC200_CMP_DEC_SIZE) != 0) { + return false; + } else + return true; +} + + +/* Enqueue decode operations for ACC200 device in TB mode */ +static uint16_t +acc200_enqueue_ldpc_dec_tb(struct rte_bbdev_queue_data *q_data, + struct rte_bbdev_dec_op **ops, uint16_t num) +{ + struct acc200_queue *q = q_data->queue_private; + int32_t avail = acc200_ring_avail_enq(q); + uint16_t i, enqueued_cbs = 0; + uint8_t cbs_in_tb; + int ret; + + for (i = 0; i < num; ++i) { + cbs_in_tb = get_num_cbs_in_tb_ldpc_dec(&ops[i]->ldpc_dec); + /* Check if there are available space for further processing */ + if (unlikely((avail - cbs_in_tb < 0) || + (cbs_in_tb == 0))) + break; + avail -= cbs_in_tb; + + ret = enqueue_ldpc_dec_one_op_tb(q, ops[i], + enqueued_cbs, cbs_in_tb); + if (ret <= 0) + break; + enqueued_cbs += ret; + } + + acc200_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats); + + /* Update stats */ + q_data->queue_stats.enqueued_count += i; + q_data->queue_stats.enqueue_err_count += num - i; + return i; +} + +/* Enqueue decode operations for ACC200 device in CB mode */ +static uint16_t +acc200_enqueue_ldpc_dec_cb(struct rte_bbdev_queue_data *q_data, + struct rte_bbdev_dec_op **ops, uint16_t num) +{ + struct acc200_queue *q = 
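+/*
+ * Atomic-queue headroom example for acc200_aq_avail() above, assuming
+ * ACC200_MAX_QUEUE_DEPTH is 1024 (hypothetical numbers): aq_depth = 16,
+ * aq_enqueued = 40 and aq_dequeued = 30 give an in-flight count of
+ * (40 - 30 + 1024) % 1024 = 10; a request for num_ops = 128 reserves a
+ * further 128 >> 7 = 1, leaving aq_avail = 16 - 10 - 1 = 5 batches of
+ * room in the QMGR.
+ */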
q_data->queue_private; + int32_t avail = acc200_ring_avail_enq(q); + uint16_t i; + union acc200_dma_desc *desc; + int ret; + bool same_op = false; + for (i = 0; i < num; ++i) { + /* Check if there are available space for further processing */ + if (unlikely(avail < 1)) { + acc200_enqueue_ring_full(q_data); + break; + } + avail -= 1; +#ifdef ACC200_DESC_OPTIMIZATION + if (i > 0) + same_op = cmp_ldpc_dec_op(&ops[i-1]); +#endif + rte_bbdev_log(INFO, "Op %d %d %d %d %d %d %d %d %d %d %d %d\n", + i, ops[i]->ldpc_dec.op_flags, ops[i]->ldpc_dec.rv_index, + ops[i]->ldpc_dec.iter_max, ops[i]->ldpc_dec.iter_count, + ops[i]->ldpc_dec.basegraph, ops[i]->ldpc_dec.z_c, + ops[i]->ldpc_dec.n_cb, ops[i]->ldpc_dec.q_m, + ops[i]->ldpc_dec.n_filler, ops[i]->ldpc_dec.cb_params.e, + same_op); + ret = enqueue_ldpc_dec_one_op_cb(q, ops[i], i, same_op); + if (ret < 0) { + acc200_enqueue_invalid(q_data); + break; + } + } + + if (unlikely(i == 0)) + return 0; /* Nothing to enqueue */ + + /* Set SDone in last CB in enqueued ops for CB mode*/ + desc = q->ring_addr + ((q->sw_ring_head + i - 1) + & q->sw_ring_wrap_mask); + + desc->req.sdone_enable = 1; + desc->req.irq_enable = q->irq_enable; + + acc200_dma_enqueue(q, i, &q_data->queue_stats); + + /* Update stats */ + q_data->queue_stats.enqueued_count += i; + q_data->queue_stats.enqueue_err_count += num - i; + return i; +} + +/* Enqueue decode operations for ACC200 device. */ +static uint16_t +acc200_enqueue_ldpc_dec(struct rte_bbdev_queue_data *q_data, + struct rte_bbdev_dec_op **ops, uint16_t num) +{ + int32_t aq_avail = acc200_aq_avail(q_data, num); + if (unlikely((aq_avail <= 0) || (num == 0))) + return 0; + if (ops[0]->ldpc_dec.code_block_mode == RTE_BBDEV_TRANSPORT_BLOCK) + return acc200_enqueue_ldpc_dec_tb(q_data, ops, num); + else + return acc200_enqueue_ldpc_dec_cb(q_data, ops, num); +} + + +/* Dequeue one encode operations from ACC200 device in CB mode + */ +static inline int +dequeue_enc_one_op_cb(struct acc200_queue *q, struct rte_bbdev_enc_op **ref_op, + uint16_t *dequeued_ops, uint32_t *aq_dequeued, + uint16_t *dequeued_descs) +{ + union acc200_dma_desc *desc, atom_desc; + union acc200_dma_rsp_desc rsp; + struct rte_bbdev_enc_op *op; + int i; + int desc_idx = ((q->sw_ring_tail + *dequeued_descs) + & q->sw_ring_wrap_mask); + desc = q->ring_addr + desc_idx; + atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc, + __ATOMIC_RELAXED); + + /* Check fdone bit */ + if (!(atom_desc.rsp.val & ACC200_FDONE)) + return -1; + + rsp.val = atom_desc.rsp.val; + rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val); + + /* Dequeue */ + op = desc->req.op_addr; + + /* Clearing status, it will be set based on response */ + op->status = 0; + + op->status |= ((rsp.input_err) + ? (1 << RTE_BBDEV_DATA_ERROR) : 0); + op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0); + op->status |= ((rsp.fcw_err) ? 
(1 << RTE_BBDEV_DRV_ERROR) : 0); + + if (desc->req.last_desc_in_batch) { + (*aq_dequeued)++; + desc->req.last_desc_in_batch = 0; + } + desc->rsp.val = ACC200_DMA_DESC_TYPE; + desc->rsp.add_info_0 = 0; /*Reserved bits */ + desc->rsp.add_info_1 = 0; /*Reserved bits */ + + ref_op[0] = op; + struct acc200_ptrs *context_ptrs = q->companion_ring_addr + desc_idx; + for (i = 1 ; i < desc->req.numCBs; i++) + ref_op[i] = context_ptrs->ptr[i].op_addr; + + /* One op was successfully dequeued */ + (*dequeued_descs)++; + *dequeued_ops += desc->req.numCBs; + return desc->req.numCBs; +} + +/* Dequeue one LDPC encode operations from ACC200 device in TB mode + * That operation may cover multiple descriptors + */ +static inline int +dequeue_enc_one_op_tb(struct acc200_queue *q, struct rte_bbdev_enc_op **ref_op, + uint16_t *dequeued_ops, uint32_t *aq_dequeued, + uint16_t *dequeued_descs) +{ + union acc200_dma_desc *desc, *last_desc, atom_desc; + union acc200_dma_rsp_desc rsp; + struct rte_bbdev_enc_op *op; + uint8_t i = 0; + uint16_t current_dequeued_descs = 0, descs_in_tb; + + desc = q->ring_addr + ((q->sw_ring_tail + *dequeued_descs) + & q->sw_ring_wrap_mask); + atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc, + __ATOMIC_RELAXED); + + /* Check fdone bit */ + if (!(atom_desc.rsp.val & ACC200_FDONE)) + return -1; + + /* Get number of CBs in dequeued TB */ + descs_in_tb = desc->req.cbs_in_tb; + /* Get last CB */ + last_desc = q->ring_addr + ((q->sw_ring_tail + + *dequeued_descs + descs_in_tb - 1) + & q->sw_ring_wrap_mask); + /* Check if last CB in TB is ready to dequeue (and thus + * the whole TB) - checking sdone bit. If not return. + */ + atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc, + __ATOMIC_RELAXED); + if (!(atom_desc.rsp.val & ACC200_SDONE)) + return -1; + + /* Dequeue */ + op = desc->req.op_addr; + + /* Clearing status, it will be set based on response */ + op->status = 0; + + while (i < descs_in_tb) { + desc = q->ring_addr + ((q->sw_ring_tail + + *dequeued_descs) + & q->sw_ring_wrap_mask); + atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc, + __ATOMIC_RELAXED); + rsp.val = atom_desc.rsp.val; + rte_bbdev_log_debug("Resp. desc %p: %x", desc, + rsp.val); + + op->status |= ((rsp.input_err) + ? (1 << RTE_BBDEV_DATA_ERROR) : 0); + op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0); + op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0); + + if (desc->req.last_desc_in_batch) { + (*aq_dequeued)++; + desc->req.last_desc_in_batch = 0; + } + desc->rsp.val = ACC200_DMA_DESC_TYPE; + desc->rsp.add_info_0 = 0; + desc->rsp.add_info_1 = 0; + (*dequeued_descs)++; + current_dequeued_descs++; + i++; + } + + *ref_op = op; + (*dequeued_ops)++; + return current_dequeued_descs; +} + +/* Dequeue one decode operation from ACC200 device in CB mode */ +static inline int +dequeue_dec_one_op_cb(struct rte_bbdev_queue_data *q_data, + struct acc200_queue *q, struct rte_bbdev_dec_op **ref_op, + uint16_t dequeued_cbs, uint32_t *aq_dequeued) +{ + union acc200_dma_desc *desc, atom_desc; + union acc200_dma_rsp_desc rsp; + struct rte_bbdev_dec_op *op; + + desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs) + & q->sw_ring_wrap_mask); + atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc, + __ATOMIC_RELAXED); + + /* Check fdone bit */ + if (!(atom_desc.rsp.val & ACC200_FDONE)) + return -1; + + rsp.val = atom_desc.rsp.val; + rte_bbdev_log_debug("Resp. 
desc %p: %x\n", desc, rsp.val); + + /* Dequeue */ + op = desc->req.op_addr; + + /* Clearing status, it will be set based on response */ + op->status = 0; + op->status |= ((rsp.input_err) + ? (1 << RTE_BBDEV_DATA_ERROR) : 0); + op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0); + op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0); + if (op->status != 0) { + /* These errors are not expected */ + q_data->queue_stats.dequeue_err_count++; + } + + /* CRC invalid if error exists */ + if (!op->status) + op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR; + op->turbo_dec.iter_count = (uint8_t) rsp.iter_cnt; + /* Check if this is the last desc in batch (Atomic Queue) */ + if (desc->req.last_desc_in_batch) { + (*aq_dequeued)++; + desc->req.last_desc_in_batch = 0; + } + desc->rsp.val = ACC200_DMA_DESC_TYPE; + desc->rsp.add_info_0 = 0; + desc->rsp.add_info_1 = 0; + *ref_op = op; + + /* One CB (op) was successfully dequeued */ + return 1; +} + +/* Dequeue one decode operations from ACC200 device in CB mode */ +static inline int +dequeue_ldpc_dec_one_op_cb(struct rte_bbdev_queue_data *q_data, + struct acc200_queue *q, struct rte_bbdev_dec_op **ref_op, + uint16_t dequeued_cbs, uint32_t *aq_dequeued) +{ + union acc200_dma_desc *desc, atom_desc; + union acc200_dma_rsp_desc rsp; + struct rte_bbdev_dec_op *op; + + desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs) + & q->sw_ring_wrap_mask); + atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc, + __ATOMIC_RELAXED); + + /* Check fdone bit */ + if (!(atom_desc.rsp.val & ACC200_FDONE)) + return -1; + + rsp.val = atom_desc.rsp.val; + rte_bbdev_log_debug("Resp. desc %p: %x %x %x\n", desc, + rsp.val, desc->rsp.add_info_0, + desc->rsp.add_info_1); + + /* Dequeue */ + op = desc->req.op_addr; + + /* Clearing status, it will be set based on response */ + op->status = 0; + op->status |= rsp.input_err << RTE_BBDEV_DATA_ERROR; + op->status |= rsp.dma_err << RTE_BBDEV_DRV_ERROR; + op->status |= rsp.fcw_err << RTE_BBDEV_DRV_ERROR; + if (op->status != 0) + q_data->queue_stats.dequeue_err_count++; + + op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR; + if (op->ldpc_dec.hard_output.length > 0 && !rsp.synd_ok) + op->status |= 1 << RTE_BBDEV_SYNDROME_ERROR; + + if (check_bit(op->ldpc_dec.op_flags, + RTE_BBDEV_LDPC_CRC_TYPE_24A_CHECK) || + check_bit(op->ldpc_dec.op_flags, + RTE_BBDEV_LDPC_CRC_TYPE_16_CHECK)) { + if (desc->rsp.add_info_1 != 0) + op->status |= 1 << RTE_BBDEV_CRC_ERROR; + } + + op->ldpc_dec.iter_count = (uint8_t) rsp.iter_cnt; + + /* Check if this is the last desc in batch (Atomic Queue) */ + if (desc->req.last_desc_in_batch) { + (*aq_dequeued)++; + desc->req.last_desc_in_batch = 0; + } + + desc->rsp.val = ACC200_DMA_DESC_TYPE; + desc->rsp.add_info_0 = 0; + desc->rsp.add_info_1 = 0; + + *ref_op = op; + + /* One CB (op) was successfully dequeued */ + return 1; +} + +/* Dequeue one decode operations from ACC200 device in TB mode. 
*/ +static inline int +dequeue_dec_one_op_tb(struct acc200_queue *q, struct rte_bbdev_dec_op **ref_op, + uint16_t dequeued_cbs, uint32_t *aq_dequeued) +{ + union acc200_dma_desc *desc, *last_desc, atom_desc; + union acc200_dma_rsp_desc rsp; + struct rte_bbdev_dec_op *op; + uint8_t cbs_in_tb = 1, cb_idx = 0; + uint32_t tb_crc_check = 0; + + desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs) + & q->sw_ring_wrap_mask); + atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc, + __ATOMIC_RELAXED); + + /* Check fdone bit */ + if (!(atom_desc.rsp.val & ACC200_FDONE)) + return -1; + + /* Dequeue */ + op = desc->req.op_addr; + + /* Get number of CBs in dequeued TB */ + cbs_in_tb = desc->req.cbs_in_tb; + /* Get last CB */ + last_desc = q->ring_addr + ((q->sw_ring_tail + + dequeued_cbs + cbs_in_tb - 1) + & q->sw_ring_wrap_mask); + /* Check if last CB in TB is ready to dequeue (and thus + * the whole TB) - checking sdone bit. If not return. + */ + atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc, + __ATOMIC_RELAXED); + if (!(atom_desc.rsp.val & ACC200_SDONE)) + return -1; + + /* Clearing status, it will be set based on response */ + op->status = 0; + + /* Read remaining CBs if exists */ + while (cb_idx < cbs_in_tb) { + desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs) + & q->sw_ring_wrap_mask); + atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc, + __ATOMIC_RELAXED); + rsp.val = atom_desc.rsp.val; + rte_bbdev_log_debug("Resp. desc %p: %x %x %x", desc, + rsp.val, desc->rsp.add_info_0, + desc->rsp.add_info_1); + + op->status |= ((rsp.input_err) + ? (1 << RTE_BBDEV_DATA_ERROR) : 0); + op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0); + op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0); + + if (check_bit(op->ldpc_dec.op_flags, + RTE_BBDEV_LDPC_CRC_TYPE_24A_CHECK)) + tb_crc_check ^= desc->rsp.add_info_1; + + /* CRC invalid if error exists */ + if (!op->status) + op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR; + op->turbo_dec.iter_count = RTE_MAX((uint8_t) rsp.iter_cnt, + op->turbo_dec.iter_count); + + /* Check if this is the last desc in batch (Atomic Queue) */ + if (desc->req.last_desc_in_batch) { + (*aq_dequeued)++; + desc->req.last_desc_in_batch = 0; + } + desc->rsp.val = ACC200_DMA_DESC_TYPE; + desc->rsp.add_info_0 = 0; + desc->rsp.add_info_1 = 0; + dequeued_cbs++; + cb_idx++; + } + + if (check_bit(op->ldpc_dec.op_flags, + RTE_BBDEV_LDPC_CRC_TYPE_24A_CHECK)) { + rte_bbdev_log_debug("TB-CRC Check %x\n", tb_crc_check); + if (tb_crc_check > 0) + op->status |= 1 << RTE_BBDEV_CRC_ERROR; + } + + *ref_op = op; + + return cb_idx; +} + +/* Dequeue LDPC encode operations from ACC200 device. 
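+ *
+ * (On the TB-mode CRC handling above: each code-block response carries
+ * a CRC status word in add_info_1; dequeue_dec_one_op_tb() folds these
+ * together with XOR into tb_crc_check and raises RTE_BBDEV_CRC_ERROR
+ * on the op whenever the aggregate is non-zero.)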
*/ +static uint16_t +acc200_dequeue_ldpc_enc(struct rte_bbdev_queue_data *q_data, + struct rte_bbdev_enc_op **ops, uint16_t num) +{ + struct acc200_queue *q = q_data->queue_private; + uint32_t avail = acc200_ring_avail_deq(q); + uint32_t aq_dequeued = 0; + uint16_t i, dequeued_ops = 0, dequeued_descs = 0; + int ret; + struct rte_bbdev_enc_op *op; + if (avail == 0) + return 0; + op = (q->ring_addr + (q->sw_ring_tail & + q->sw_ring_wrap_mask))->req.op_addr; +#ifdef RTE_LIBRTE_BBDEV_DEBUG + if (unlikely(ops == NULL || q == NULL || op == NULL)) + return 0; +#endif + int cbm = op->ldpc_enc.code_block_mode; + + for (i = 0; i < avail; i++) { + if (cbm == RTE_BBDEV_TRANSPORT_BLOCK) + ret = dequeue_enc_one_op_tb(q, &ops[dequeued_ops], + &dequeued_ops, &aq_dequeued, + &dequeued_descs); + else + ret = dequeue_enc_one_op_cb(q, &ops[dequeued_ops], + &dequeued_ops, &aq_dequeued, + &dequeued_descs); + if (ret < 0) + break; + if (dequeued_ops >= num) + break; + } + + q->aq_dequeued += aq_dequeued; + q->sw_ring_tail += dequeued_descs; + + /* Update enqueue stats */ + q_data->queue_stats.dequeued_count += dequeued_ops; + + return dequeued_ops; +} + +/* Dequeue decode operations from ACC200 device. */ +static uint16_t +acc200_dequeue_ldpc_dec(struct rte_bbdev_queue_data *q_data, + struct rte_bbdev_dec_op **ops, uint16_t num) +{ + struct acc200_queue *q = q_data->queue_private; + uint16_t dequeue_num; + uint32_t avail = acc200_ring_avail_deq(q); + uint32_t aq_dequeued = 0; + uint16_t i; + uint16_t dequeued_cbs = 0; + struct rte_bbdev_dec_op *op; + int ret; + +#ifdef RTE_LIBRTE_BBDEV_DEBUG + if (unlikely(ops == 0 && q == NULL)) + return 0; +#endif + + dequeue_num = RTE_MIN(avail, num); + + for (i = 0; i < dequeue_num; ++i) { + op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs) + & q->sw_ring_wrap_mask))->req.op_addr; + if (op->ldpc_dec.code_block_mode == RTE_BBDEV_TRANSPORT_BLOCK) + ret = dequeue_dec_one_op_tb(q, &ops[i], dequeued_cbs, + &aq_dequeued); + else + ret = dequeue_ldpc_dec_one_op_cb( + q_data, q, &ops[i], dequeued_cbs, + &aq_dequeued); + + if (ret <= 0) + break; + dequeued_cbs += ret; + } + + q->aq_dequeued += aq_dequeued; + q->sw_ring_tail += dequeued_cbs; + + /* Update enqueue stats */ + q_data->queue_stats.dequeued_count += i; + + return i; +} + /* Initialization Function */ static void acc200_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv) @@ -827,6 +2931,10 @@ struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device); dev->dev_ops = &acc200_bbdev_ops; + dev->enqueue_ldpc_enc_ops = acc200_enqueue_ldpc_enc; + dev->enqueue_ldpc_dec_ops = acc200_enqueue_ldpc_dec; + dev->dequeue_ldpc_enc_ops = acc200_dequeue_ldpc_enc; + dev->dequeue_ldpc_dec_ops = acc200_dequeue_ldpc_dec; ((struct acc200_device *) dev->data->dev_private)->pf_device = !strcmp(drv->driver.name, From patchwork Fri Jul 8 00:01:39 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Chautru, Nicolas" X-Patchwork-Id: 113818 X-Patchwork-Delegate: gakhil@marvell.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 2D121A0543; Fri, 8 Jul 2022 02:16:54 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 7703742B76; Fri, 8 Jul 2022 02:16:12 +0200 (CEST) Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by mails.dpdk.org (Postfix) with 
ESMTP id F1F5F4069D for ; Fri, 8 Jul 2022 02:16:06 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1657239367; x=1688775367; h=from:to:cc:subject:date:message-id:in-reply-to: references; bh=YIXhaajQd73ShmjdGw4ETSvVdYz37kJKsAmL8i/95uI=; b=nekKO8SCDapTa2YXsdGyBHSH/U3spUbE2ixa1YFYU2kM1a1lOXPv2NuN Wy8LabSmkybwN8ZJ+sYh3ZDUFDLxu5awf00louUoRqN0UpcqCy17iMkdn 5OP+JAbt8JeVzPCyH/HV4FxrLDOriUjJBlC1Jz3bMs6XLlk+JWqX6Nybg nUJqI9KIswh2VFFSWA5vTkFmS/6XA8xgT9oFHRgrk2Q5CW2/d774DCUrZ SzoQf8Gz1Fzk2V548gdqzFYR9w17v9Ozim/JOxu1NGCDBFQQdFSG8iUcA u54D0dq/W6/8WCTQMhwQZh6QxNsgsHAzGQdS4XcxemWk/8Yc0xikkA+lH w==; X-IronPort-AV: E=McAfee;i="6400,9594,10401"; a="264563088" X-IronPort-AV: E=Sophos;i="5.92,253,1650956400"; d="scan'208";a="264563088" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 07 Jul 2022 17:16:04 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.92,253,1650956400"; d="scan'208";a="591387544" Received: from skx-5gnr-sc12-4.sc.intel.com ([172.25.69.210]) by orsmga007.jf.intel.com with ESMTP; 07 Jul 2022 17:16:04 -0700 From: Nicolas Chautru To: dev@dpdk.org, thomas@monjalon.net, gakhil@marvell.com, hemant.agrawal@nxp.com, trix@redhat.com Cc: maxime.coquelin@redhat.com, mdr@ashroe.eu, bruce.richardson@intel.com, david.marchand@redhat.com, stephen@networkplumber.org, Nicolas Chautru Subject: [PATCH v1 06/10] baseband/acc200: add LTE processing functions Date: Thu, 7 Jul 2022 17:01:39 -0700 Message-Id: <1657238503-143836-7-git-send-email-nicolas.chautru@intel.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1657238503-143836-1-git-send-email-nicolas.chautru@intel.com> References: <1657238503-143836-1-git-send-email-nicolas.chautru@intel.com> X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Add functions and capability for 4G FEC Signed-off-by: Nicolas Chautru --- drivers/baseband/acc200/rte_acc200_pmd.c | 1244 +++++++++++++++++++++++++++++- 1 file changed, 1235 insertions(+), 9 deletions(-) diff --git a/drivers/baseband/acc200/rte_acc200_pmd.c b/drivers/baseband/acc200/rte_acc200_pmd.c index 42cf2c8..003a2a3 100644 --- a/drivers/baseband/acc200/rte_acc200_pmd.c +++ b/drivers/baseband/acc200/rte_acc200_pmd.c @@ -784,6 +784,46 @@ int i; static const struct rte_bbdev_op_cap bbdev_capabilities[] = { { + .type = RTE_BBDEV_OP_TURBO_DEC, + .cap.turbo_dec = { + .capability_flags = + RTE_BBDEV_TURBO_SUBBLOCK_DEINTERLEAVE | + RTE_BBDEV_TURBO_CRC_TYPE_24B | + RTE_BBDEV_TURBO_EQUALIZER | + RTE_BBDEV_TURBO_SOFT_OUT_SATURATE | + RTE_BBDEV_TURBO_HALF_ITERATION_EVEN | + RTE_BBDEV_TURBO_CONTINUE_CRC_MATCH | + RTE_BBDEV_TURBO_SOFT_OUTPUT | + RTE_BBDEV_TURBO_EARLY_TERMINATION | + RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN | + RTE_BBDEV_TURBO_NEG_LLR_1_BIT_SOFT_OUT | + RTE_BBDEV_TURBO_MAP_DEC | + RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP | + RTE_BBDEV_TURBO_DEC_SCATTER_GATHER, + .max_llr_modulus = INT8_MAX, + .num_buffers_src = + RTE_BBDEV_TURBO_MAX_CODE_BLOCKS, + .num_buffers_hard_out = + RTE_BBDEV_TURBO_MAX_CODE_BLOCKS, + .num_buffers_soft_out = + RTE_BBDEV_TURBO_MAX_CODE_BLOCKS, + } + }, + { + .type = RTE_BBDEV_OP_TURBO_ENC, + .cap.turbo_enc = { + .capability_flags = + RTE_BBDEV_TURBO_CRC_24B_ATTACH | + RTE_BBDEV_TURBO_RV_INDEX_BYPASS | + RTE_BBDEV_TURBO_RATE_MATCH | + RTE_BBDEV_TURBO_ENC_SCATTER_GATHER, + .num_buffers_src = + 
RTE_BBDEV_TURBO_MAX_CODE_BLOCKS, + .num_buffers_dst = + RTE_BBDEV_TURBO_MAX_CODE_BLOCKS, + } + }, + { .type = RTE_BBDEV_OP_LDPC_ENC, .cap.ldpc_enc = { .capability_flags = @@ -834,15 +874,17 @@ /* Exposed number of queues */ dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0; - dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] = 0; - dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = 0; + dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] = d->acc200_conf.q_ul_4g.num_aqs_per_groups * + d->acc200_conf.q_ul_4g.num_qgroups; + dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = d->acc200_conf.q_dl_4g.num_aqs_per_groups * + d->acc200_conf.q_dl_4g.num_qgroups; dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] = d->acc200_conf.q_ul_5g.num_aqs_per_groups * d->acc200_conf.q_ul_5g.num_qgroups; dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] = d->acc200_conf.q_dl_5g.num_aqs_per_groups * d->acc200_conf.q_dl_5g.num_qgroups; dev_info->num_queues[RTE_BBDEV_OP_FFT] = 0; - dev_info->queue_priority[RTE_BBDEV_OP_TURBO_DEC] = 0; - dev_info->queue_priority[RTE_BBDEV_OP_TURBO_ENC] = 0; + dev_info->queue_priority[RTE_BBDEV_OP_TURBO_DEC] = d->acc200_conf.q_ul_4g.num_qgroups; + dev_info->queue_priority[RTE_BBDEV_OP_TURBO_ENC] = d->acc200_conf.q_dl_4g.num_qgroups; dev_info->queue_priority[RTE_BBDEV_OP_LDPC_DEC] = d->acc200_conf.q_ul_5g.num_qgroups; dev_info->queue_priority[RTE_BBDEV_OP_LDPC_ENC] = d->acc200_conf.q_dl_5g.num_qgroups; dev_info->queue_priority[RTE_BBDEV_OP_FFT] = 0; @@ -906,6 +948,58 @@ return tail; } +/* Fill in a frame control word for turbo encoding. */ +static inline void +acc200_fcw_te_fill(const struct rte_bbdev_enc_op *op, struct acc200_fcw_te *fcw) +{ + fcw->code_block_mode = op->turbo_enc.code_block_mode; + if (fcw->code_block_mode == RTE_BBDEV_TRANSPORT_BLOCK) { + fcw->k_neg = op->turbo_enc.tb_params.k_neg; + fcw->k_pos = op->turbo_enc.tb_params.k_pos; + fcw->c_neg = op->turbo_enc.tb_params.c_neg; + fcw->c = op->turbo_enc.tb_params.c; + fcw->ncb_neg = op->turbo_enc.tb_params.ncb_neg; + fcw->ncb_pos = op->turbo_enc.tb_params.ncb_pos; + + if (check_bit(op->turbo_enc.op_flags, + RTE_BBDEV_TURBO_RATE_MATCH)) { + fcw->bypass_rm = 0; + fcw->cab = op->turbo_enc.tb_params.cab; + fcw->ea = op->turbo_enc.tb_params.ea; + fcw->eb = op->turbo_enc.tb_params.eb; + } else { + /* E is set to the encoding output size when RM is + * bypassed. + */ + fcw->bypass_rm = 1; + fcw->cab = fcw->c_neg; + fcw->ea = 3 * fcw->k_neg + 12; + fcw->eb = 3 * fcw->k_pos + 12; + } + } else { /* For CB mode */ + fcw->k_pos = op->turbo_enc.cb_params.k; + fcw->ncb_pos = op->turbo_enc.cb_params.ncb; + + if (check_bit(op->turbo_enc.op_flags, + RTE_BBDEV_TURBO_RATE_MATCH)) { + fcw->bypass_rm = 0; + fcw->eb = op->turbo_enc.cb_params.e; + } else { + /* E is set to the encoding output size when RM is + * bypassed. + */ + fcw->bypass_rm = 1; + fcw->eb = 3 * fcw->k_pos + 12; + } + } + + fcw->bypass_rv_idx1 = check_bit(op->turbo_enc.op_flags, + RTE_BBDEV_TURBO_RV_INDEX_BYPASS); + fcw->code_block_crc = check_bit(op->turbo_enc.op_flags, + RTE_BBDEV_TURBO_CRC_24B_ATTACH); + fcw->rv_idx1 = op->turbo_enc.rv_index; +} + /* Compute value of k0. * Based on 3GPP 38.212 Table 5.4.2.1-2 * Starting position of different redundancy versions, k0 @@ -958,6 +1052,70 @@ fcw->mcb_count = num_cb; } +/* Fill in a frame control word for turbo decoding. 
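+ *
+ * (Sizing note carried over from acc200_fcw_te_fill() above: with rate
+ * matching bypassed the turbo encoder emits its three constituent
+ * streams plus four tail bits each, E = 3 * K + 12; e.g. K = 6144
+ * gives E = 18444 bits per code block.)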
*/ +static inline void +acc200_fcw_td_fill(const struct rte_bbdev_dec_op *op, struct acc200_fcw_td *fcw) +{ + fcw->fcw_ver = 1; + fcw->num_maps = ACC200_FCW_TD_AUTOMAP; + fcw->bypass_sb_deint = !check_bit(op->turbo_dec.op_flags, + RTE_BBDEV_TURBO_SUBBLOCK_DEINTERLEAVE); + if (op->turbo_dec.code_block_mode == RTE_BBDEV_TRANSPORT_BLOCK) { + /* FIXME for TB block */ + fcw->k_pos = op->turbo_dec.tb_params.k_pos; + fcw->k_neg = op->turbo_dec.tb_params.k_neg; + } else { + fcw->k_pos = op->turbo_dec.cb_params.k; + fcw->k_neg = op->turbo_dec.cb_params.k; + } + fcw->c = 1; + fcw->c_neg = 1; + if (check_bit(op->turbo_dec.op_flags, RTE_BBDEV_TURBO_SOFT_OUTPUT)) { + fcw->soft_output_en = 1; + fcw->sw_soft_out_dis = 0; + fcw->sw_et_cont = check_bit(op->turbo_dec.op_flags, + RTE_BBDEV_TURBO_CONTINUE_CRC_MATCH); + fcw->sw_soft_out_saturation = check_bit(op->turbo_dec.op_flags, + RTE_BBDEV_TURBO_SOFT_OUT_SATURATE); + if (check_bit(op->turbo_dec.op_flags, + RTE_BBDEV_TURBO_EQUALIZER)) { + fcw->bypass_teq = 0; + fcw->ea = op->turbo_dec.cb_params.e; + fcw->eb = op->turbo_dec.cb_params.e; + if (op->turbo_dec.rv_index == 0) + fcw->k0_start_col = ACC200_FCW_TD_RVIDX_0; + else if (op->turbo_dec.rv_index == 1) + fcw->k0_start_col = ACC200_FCW_TD_RVIDX_1; + else if (op->turbo_dec.rv_index == 2) + fcw->k0_start_col = ACC200_FCW_TD_RVIDX_2; + else + fcw->k0_start_col = ACC200_FCW_TD_RVIDX_3; + } else { + fcw->bypass_teq = 1; + fcw->eb = 64; /* avoid undefined value */ + } + } else { + fcw->soft_output_en = 0; + fcw->sw_soft_out_dis = 1; + fcw->bypass_teq = 0; + } + + fcw->code_block_mode = 1; /* FIXME */ + fcw->turbo_crc_type = check_bit(op->turbo_dec.op_flags, + RTE_BBDEV_TURBO_CRC_TYPE_24B); + + fcw->ext_td_cold_reg_en = 1; + fcw->raw_decoder_input_on = 0; + fcw->max_iter = RTE_MAX((uint8_t) op->turbo_dec.iter_max, 2); + fcw->min_iter = 2; + fcw->half_iter_on = !check_bit(op->turbo_dec.op_flags, + RTE_BBDEV_TURBO_HALF_ITERATION_EVEN); + + fcw->early_stop_en = check_bit(op->turbo_dec.op_flags, + RTE_BBDEV_TURBO_EARLY_TERMINATION) & !fcw->soft_output_en; + fcw->ext_scale = 0xF; +} + /* Convert offset to harq index for harq_layout structure */ static inline uint32_t hq_index(uint32_t offset) { @@ -1240,6 +1398,89 @@ static inline uint32_t hq_index(uint32_t offset) #endif static inline int +acc200_dma_desc_te_fill(struct rte_bbdev_enc_op *op, + struct acc200_dma_req_desc *desc, struct rte_mbuf **input, + struct rte_mbuf *output, uint32_t *in_offset, + uint32_t *out_offset, uint32_t *out_length, + uint32_t *mbuf_total_left, uint32_t *seg_total_left, uint8_t r) +{ + int next_triplet = 1; /* FCW already done */ + uint32_t e, ea, eb, length; + uint16_t k, k_neg, k_pos; + uint8_t cab, c_neg; + + desc->word0 = ACC200_DMA_DESC_TYPE; + desc->word1 = 0; /**< Timestamp could be disabled */ + desc->word2 = 0; + desc->word3 = 0; + desc->numCBs = 1; + + if (op->turbo_enc.code_block_mode == RTE_BBDEV_TRANSPORT_BLOCK) { + ea = op->turbo_enc.tb_params.ea; + eb = op->turbo_enc.tb_params.eb; + cab = op->turbo_enc.tb_params.cab; + k_neg = op->turbo_enc.tb_params.k_neg; + k_pos = op->turbo_enc.tb_params.k_pos; + c_neg = op->turbo_enc.tb_params.c_neg; + e = (r < cab) ? ea : eb; + k = (r < c_neg) ? 
k_neg : k_pos; + } else { + e = op->turbo_enc.cb_params.e; + k = op->turbo_enc.cb_params.k; + } + + if (check_bit(op->turbo_enc.op_flags, RTE_BBDEV_TURBO_CRC_24B_ATTACH)) + length = (k - 24) >> 3; + else + length = k >> 3; + + if (unlikely((*mbuf_total_left == 0) || (*mbuf_total_left < length))) { + rte_bbdev_log(ERR, + "Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u", + *mbuf_total_left, length); + return -1; + } + + next_triplet = acc200_dma_fill_blk_type_in(desc, input, in_offset, + length, seg_total_left, next_triplet, + check_bit(op->turbo_enc.op_flags, + RTE_BBDEV_TURBO_ENC_SCATTER_GATHER)); + if (unlikely(next_triplet < 0)) { + rte_bbdev_log(ERR, + "Mismatch between data to process and mbuf data length in bbdev_op: %p", + op); + return -1; + } + desc->data_ptrs[next_triplet - 1].last = 1; + desc->m2dlen = next_triplet; + *mbuf_total_left -= length; + + /* Set output length */ + if (check_bit(op->turbo_enc.op_flags, RTE_BBDEV_TURBO_RATE_MATCH)) + /* Integer round up division by 8 */ + *out_length = (e + 7) >> 3; + else + *out_length = (k >> 3) * 3 + 2; + + next_triplet = acc200_dma_fill_blk_type(desc, output, *out_offset, + *out_length, next_triplet, ACC200_DMA_BLKID_OUT_ENC); + if (unlikely(next_triplet < 0)) { + rte_bbdev_log(ERR, + "Mismatch between data to process and mbuf data length in bbdev_op: %p", + op); + return -1; + } + op->turbo_enc.output.length += *out_length; + *out_offset += *out_length; + desc->data_ptrs[next_triplet - 1].last = 1; + desc->d2mlen = next_triplet - desc->m2dlen; + + desc->op_addr = op; + + return 0; +} + +static inline int acc200_dma_desc_le_fill(struct rte_bbdev_enc_op *op, struct acc200_dma_req_desc *desc, struct rte_mbuf **input, struct rte_mbuf *output, uint32_t *in_offset, @@ -1299,6 +1540,122 @@ static inline uint32_t hq_index(uint32_t offset) } static inline int +acc200_dma_desc_td_fill(struct rte_bbdev_dec_op *op, + struct acc200_dma_req_desc *desc, struct rte_mbuf **input, + struct rte_mbuf *h_output, struct rte_mbuf *s_output, + uint32_t *in_offset, uint32_t *h_out_offset, + uint32_t *s_out_offset, uint32_t *h_out_length, + uint32_t *s_out_length, uint32_t *mbuf_total_left, + uint32_t *seg_total_left, uint8_t r) +{ + int next_triplet = 1; /* FCW already done */ + uint16_t k; + uint16_t crc24_overlap = 0; + uint32_t e, kw; + + desc->word0 = ACC200_DMA_DESC_TYPE; + desc->word1 = 0; /**< Timestamp could be disabled */ + desc->word2 = 0; + desc->word3 = 0; + desc->numCBs = 1; + + if (op->turbo_dec.code_block_mode == RTE_BBDEV_TRANSPORT_BLOCK) { + k = (r < op->turbo_dec.tb_params.c_neg) + ? op->turbo_dec.tb_params.k_neg + : op->turbo_dec.tb_params.k_pos; + e = (r < op->turbo_dec.tb_params.cab) + ? op->turbo_dec.tb_params.ea + : op->turbo_dec.tb_params.eb; + } else { + k = op->turbo_dec.cb_params.k; + e = op->turbo_dec.cb_params.e; + } + + if ((op->turbo_dec.code_block_mode == RTE_BBDEV_TRANSPORT_BLOCK) + && !check_bit(op->turbo_dec.op_flags, + RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP)) + crc24_overlap = 24; + + /* Calculates circular buffer size. + * According to 3gpp 36.212 section 5.1.4.2 + * Kw = 3 * Kpi, + * where: + * Kpi = nCol * nRow + * where nCol is 32 and nRow can be calculated from: + * D =< nCol * nRow + * where D is the size of each output from turbo encoder block (k + 4). 
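
As a worked example of the computation described above (standalone C; RTE_ALIGN_CEIL(x, 32) rounds x up to the next multiple of 32):

    #include <stdint.h>
    #include <stdio.h>

    // Circular buffer size for the turbo decoder input:
    // D = k + 4, Kpi = 32 * ceil(D / 32), Kw = 3 * Kpi.
    static uint32_t turbo_dec_kw(uint16_t k)
    {
            uint32_t d = (uint32_t)k + 4;
            uint32_t kpi = (d + 31) & ~(uint32_t)31; // align up to 32
            return 3 * kpi;
    }

    int main(void)
    {
            // k = 6144: D = 6148, Kpi = 6176, Kw = 18528 LLRs
            printf("Kw(6144) = %u\n", turbo_dec_kw(6144));
            return 0;
    }
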
+ */ + kw = RTE_ALIGN_CEIL(k + 4, 32) * 3; + + if (unlikely((*mbuf_total_left == 0) || (*mbuf_total_left < kw))) { + rte_bbdev_log(ERR, + "Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u", + *mbuf_total_left, kw); + return -1; + } + + next_triplet = acc200_dma_fill_blk_type_in(desc, input, in_offset, kw, + seg_total_left, next_triplet, + check_bit(op->turbo_dec.op_flags, + RTE_BBDEV_TURBO_DEC_SCATTER_GATHER)); + if (unlikely(next_triplet < 0)) { + rte_bbdev_log(ERR, + "Mismatch between data to process and mbuf data length in bbdev_op: %p", + op); + return -1; + } + desc->data_ptrs[next_triplet - 1].last = 1; + desc->m2dlen = next_triplet; + *mbuf_total_left -= kw; + *h_out_length = ((k - crc24_overlap) >> 3); + next_triplet = acc200_dma_fill_blk_type( + desc, h_output, *h_out_offset, + *h_out_length, next_triplet, ACC200_DMA_BLKID_OUT_HARD); + if (unlikely(next_triplet < 0)) { + rte_bbdev_log(ERR, + "Mismatch between data to process and mbuf data length in bbdev_op: %p", + op); + return -1; + } + + op->turbo_dec.hard_output.length += *h_out_length; + *h_out_offset += *h_out_length; + + /* Soft output */ + if (check_bit(op->turbo_dec.op_flags, RTE_BBDEV_TURBO_SOFT_OUTPUT)) { + if (op->turbo_dec.soft_output.data == 0) { + rte_bbdev_log(ERR, "Soft output is not defined"); + return -1; + } + if (check_bit(op->turbo_dec.op_flags, + RTE_BBDEV_TURBO_EQUALIZER)) + *s_out_length = e; + else + *s_out_length = (k * 3) + 12; + + next_triplet = acc200_dma_fill_blk_type(desc, s_output, + *s_out_offset, *s_out_length, next_triplet, + ACC200_DMA_BLKID_OUT_SOFT); + if (unlikely(next_triplet < 0)) { + rte_bbdev_log(ERR, + "Mismatch between data to process and mbuf data length in bbdev_op: %p", + op); + return -1; + } + + op->turbo_dec.soft_output.length += *s_out_length; + *s_out_offset += *s_out_length; + } + + desc->data_ptrs[next_triplet - 1].last = 1; + desc->d2mlen = next_triplet - desc->m2dlen; + + desc->op_addr = op; + + return 0; +} + +static inline int acc200_dma_desc_ld_fill(struct rte_bbdev_dec_op *op, struct acc200_dma_req_desc *desc, struct rte_mbuf **input, struct rte_mbuf *h_output, @@ -1545,6 +1902,144 @@ static inline uint32_t hq_index(uint32_t offset) } #ifdef RTE_LIBRTE_BBDEV_DEBUG +/* Validates turbo encoder parameters */ +static inline int +validate_enc_op(struct rte_bbdev_enc_op *op) +{ + struct rte_bbdev_op_turbo_enc *turbo_enc = &op->turbo_enc; + struct rte_bbdev_op_enc_turbo_cb_params *cb = NULL; + struct rte_bbdev_op_enc_turbo_tb_params *tb = NULL; + uint16_t kw, kw_neg, kw_pos; + + if (op->mempool == NULL) { + rte_bbdev_log(ERR, "Invalid mempool pointer"); + return -1; + } + if (turbo_enc->input.data == NULL) { + rte_bbdev_log(ERR, "Invalid input pointer"); + return -1; + } + if (turbo_enc->output.data == NULL) { + rte_bbdev_log(ERR, "Invalid output pointer"); + return -1; + } + if (turbo_enc->rv_index > 3) { + rte_bbdev_log(ERR, + "rv_index (%u) is out of range 0 <= value <= 3", + turbo_enc->rv_index); + return -1; + } + if (turbo_enc->code_block_mode != RTE_BBDEV_TRANSPORT_BLOCK && + turbo_enc->code_block_mode != RTE_BBDEV_CODE_BLOCK) { + rte_bbdev_log(ERR, + "code_block_mode (%u) is out of range 0 <= value <= 1", + turbo_enc->code_block_mode); + return -1; + } + + if (turbo_enc->code_block_mode == RTE_BBDEV_TRANSPORT_BLOCK) { + tb = &turbo_enc->tb_params; + if ((tb->k_neg < RTE_BBDEV_TURBO_MIN_CB_SIZE + || tb->k_neg > RTE_BBDEV_TURBO_MAX_CB_SIZE) + && tb->c_neg > 0) { + rte_bbdev_log(ERR, + "k_neg (%u) is out of range %u <= value <= %u", + tb->k_neg, 
RTE_BBDEV_TURBO_MIN_CB_SIZE, + RTE_BBDEV_TURBO_MAX_CB_SIZE); + return -1; + } + if (tb->k_pos < RTE_BBDEV_TURBO_MIN_CB_SIZE + || tb->k_pos > RTE_BBDEV_TURBO_MAX_CB_SIZE) { + rte_bbdev_log(ERR, + "k_pos (%u) is out of range %u <= value <= %u", + tb->k_pos, RTE_BBDEV_TURBO_MIN_CB_SIZE, + RTE_BBDEV_TURBO_MAX_CB_SIZE); + return -1; + } + if (tb->c_neg > (RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1)) + rte_bbdev_log(ERR, + "c_neg (%u) is out of range 0 <= value <= %u", + tb->c_neg, + RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1); + if (tb->c < 1 || tb->c > RTE_BBDEV_TURBO_MAX_CODE_BLOCKS) { + rte_bbdev_log(ERR, + "c (%u) is out of range 1 <= value <= %u", + tb->c, RTE_BBDEV_TURBO_MAX_CODE_BLOCKS); + return -1; + } + if (tb->cab > tb->c) { + rte_bbdev_log(ERR, + "cab (%u) is greater than c (%u)", + tb->cab, tb->c); + return -1; + } + if ((tb->ea < RTE_BBDEV_TURBO_MIN_CB_SIZE || (tb->ea % 2)) + && tb->r < tb->cab) { + rte_bbdev_log(ERR, + "ea (%u) is less than %u or it is not even", + tb->ea, RTE_BBDEV_TURBO_MIN_CB_SIZE); + return -1; + } + if ((tb->eb < RTE_BBDEV_TURBO_MIN_CB_SIZE || (tb->eb % 2)) + && tb->c > tb->cab) { + rte_bbdev_log(ERR, + "eb (%u) is less than %u or it is not even", + tb->eb, RTE_BBDEV_TURBO_MIN_CB_SIZE); + return -1; + } + + kw_neg = 3 * RTE_ALIGN_CEIL(tb->k_neg + 4, + RTE_BBDEV_TURBO_C_SUBBLOCK); + if (tb->ncb_neg < tb->k_neg || tb->ncb_neg > kw_neg) { + rte_bbdev_log(ERR, + "ncb_neg (%u) is out of range (%u) k_neg <= value <= (%u) kw_neg", + tb->ncb_neg, tb->k_neg, kw_neg); + return -1; + } + + kw_pos = 3 * RTE_ALIGN_CEIL(tb->k_pos + 4, + RTE_BBDEV_TURBO_C_SUBBLOCK); + if (tb->ncb_pos < tb->k_pos || tb->ncb_pos > kw_pos) { + rte_bbdev_log(ERR, + "ncb_pos (%u) is out of range (%u) k_pos <= value <= (%u) kw_pos", + tb->ncb_pos, tb->k_pos, kw_pos); + return -1; + } + if (tb->r > (tb->c - 1)) { + rte_bbdev_log(ERR, + "r (%u) is greater than c - 1 (%u)", + tb->r, tb->c - 1); + return -1; + } + } else { + cb = &turbo_enc->cb_params; + if (cb->k < RTE_BBDEV_TURBO_MIN_CB_SIZE + || cb->k > RTE_BBDEV_TURBO_MAX_CB_SIZE) { + rte_bbdev_log(ERR, + "k (%u) is out of range %u <= value <= %u", + cb->k, RTE_BBDEV_TURBO_MIN_CB_SIZE, + RTE_BBDEV_TURBO_MAX_CB_SIZE); + return -1; + } + + if (cb->e < RTE_BBDEV_TURBO_MIN_CB_SIZE || (cb->e % 2)) { + rte_bbdev_log(ERR, + "e (%u) is less than %u or it is not even", + cb->e, RTE_BBDEV_TURBO_MIN_CB_SIZE); + return -1; + } + + kw = RTE_ALIGN_CEIL(cb->k + 4, RTE_BBDEV_TURBO_C_SUBBLOCK) * 3; + if (cb->ncb < cb->k || cb->ncb > kw) { + rte_bbdev_log(ERR, + "ncb (%u) is out of range (%u) k <= value <= (%u) kw", + cb->ncb, cb->k, kw); + return -1; + } + } + + return 0; +} /* Validates LDPC encoder parameters */ static inline int @@ -1631,6 +2126,59 @@ static inline uint32_t hq_index(uint32_t offset) #endif +/* Enqueue one encode operations for ACC200 device in CB mode */ +static inline int +enqueue_enc_one_op_cb(struct acc200_queue *q, struct rte_bbdev_enc_op *op, + uint16_t total_enqueued_cbs) +{ + union acc200_dma_desc *desc = NULL; + int ret; + uint32_t in_offset, out_offset, out_length, mbuf_total_left, + seg_total_left; + struct rte_mbuf *input, *output_head, *output; + +#ifdef RTE_LIBRTE_BBDEV_DEBUG + /* Validate op structure */ + if (validate_enc_op(op) == -1) { + rte_bbdev_log(ERR, "Turbo encoder validation failed"); + return -EINVAL; + } +#endif + + uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs) + & q->sw_ring_wrap_mask); + desc = q->ring_addr + desc_idx; + acc200_fcw_te_fill(op, &desc->req.fcw_te); + + input = op->turbo_enc.input.data; + output_head 
= output = op->turbo_enc.output.data; + in_offset = op->turbo_enc.input.offset; + out_offset = op->turbo_enc.output.offset; + out_length = 0; + mbuf_total_left = op->turbo_enc.input.length; + seg_total_left = rte_pktmbuf_data_len(op->turbo_enc.input.data) + - in_offset; + + ret = acc200_dma_desc_te_fill(op, &desc->req, &input, output, + &in_offset, &out_offset, &out_length, &mbuf_total_left, + &seg_total_left, 0); + + if (unlikely(ret < 0)) + return ret; + + mbuf_append(output_head, output, out_length); + +#ifdef RTE_LIBRTE_BBDEV_DEBUG + rte_memdump(stderr, "FCW", &desc->req.fcw_te, + sizeof(desc->req.fcw_te) - 8); + rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc)); + if (check_mbuf_total_left(mbuf_total_left) != 0) + return -EINVAL; +#endif + /* One CB (one op) was successfully prepared to enqueue */ + return 1; +} + /* Enqueue one encode operations for ACC200 device in CB mode * multiplexed on the same descriptor */ @@ -1807,12 +2355,98 @@ static inline uint32_t hq_index(uint32_t offset) return 1; } -/* Enqueue one encode operations for ACC200 device in TB mode. - * returns the number of descs used - */ + +/* Enqueue one encode operations for ACC200 device in TB mode. */ static inline int -enqueue_ldpc_enc_one_op_tb(struct acc200_queue *q, struct rte_bbdev_enc_op *op, - uint16_t enq_descs, uint8_t cbs_in_tb) +enqueue_enc_one_op_tb(struct acc200_queue *q, struct rte_bbdev_enc_op *op, + uint16_t total_enqueued_cbs, uint8_t cbs_in_tb) +{ + union acc200_dma_desc *desc = NULL; + int ret; + uint8_t r, c; + uint32_t in_offset, out_offset, out_length, mbuf_total_left, + seg_total_left; + struct rte_mbuf *input, *output_head, *output; + uint16_t current_enqueued_cbs = 0; + +#ifdef RTE_LIBRTE_BBDEV_DEBUG + /* Validate op structure */ + if (validate_enc_op(op) == -1) { + rte_bbdev_log(ERR, "Turbo encoder validation failed"); + return -EINVAL; + } +#endif + + uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs) + & q->sw_ring_wrap_mask); + desc = q->ring_addr + desc_idx; + uint64_t fcw_offset = (desc_idx << 8) + ACC200_DESC_FCW_OFFSET; + acc200_fcw_te_fill(op, &desc->req.fcw_te); + + input = op->turbo_enc.input.data; + output_head = output = op->turbo_enc.output.data; + in_offset = op->turbo_enc.input.offset; + out_offset = op->turbo_enc.output.offset; + out_length = 0; + mbuf_total_left = op->turbo_enc.input.length; + + c = op->turbo_enc.tb_params.c; + r = op->turbo_enc.tb_params.r; + + while (mbuf_total_left > 0 && r < c) { + seg_total_left = rte_pktmbuf_data_len(input) - in_offset; + /* Set up DMA descriptor */ + desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs) + & q->sw_ring_wrap_mask); + desc->req.data_ptrs[0].address = q->ring_addr_iova + fcw_offset; + desc->req.data_ptrs[0].blen = ACC200_FCW_TE_BLEN; + + ret = acc200_dma_desc_te_fill(op, &desc->req, &input, output, + &in_offset, &out_offset, &out_length, + &mbuf_total_left, &seg_total_left, r); + if (unlikely(ret < 0)) + return ret; + mbuf_append(output_head, output, out_length); + + /* Set total number of CBs in TB */ + desc->req.cbs_in_tb = cbs_in_tb; +#ifdef RTE_LIBRTE_BBDEV_DEBUG + rte_memdump(stderr, "FCW", &desc->req.fcw_te, + sizeof(desc->req.fcw_te) - 8); + rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc)); +#endif + + if (seg_total_left == 0) { + /* Go to the next mbuf */ + input = input->next; + in_offset = 0; + output = output->next; + out_offset = 0; + } + + total_enqueued_cbs++; + current_enqueued_cbs++; + r++; + } + +#ifdef RTE_LIBRTE_BBDEV_DEBUG + if (check_mbuf_total_left(mbuf_total_left) != 0) + 
return -EINVAL; +#endif + + /* Set SDone on last CB descriptor for TB mode. */ + desc->req.sdone_enable = 1; + desc->req.irq_enable = q->irq_enable; + + return current_enqueued_cbs; +} + +/* Enqueue one encode operations for ACC200 device in TB mode. + * returns the number of descs used + */ +static inline int +enqueue_ldpc_enc_one_op_tb(struct acc200_queue *q, struct rte_bbdev_enc_op *op, + uint16_t enq_descs, uint8_t cbs_in_tb) { uint8_t num_a, num_b; uint16_t desc_idx; @@ -1871,6 +2505,213 @@ static inline uint32_t hq_index(uint32_t offset) return return_descs; } +#ifdef RTE_LIBRTE_BBDEV_DEBUG +/* Validates turbo decoder parameters */ +static inline int +validate_dec_op(struct rte_bbdev_dec_op *op) +{ + struct rte_bbdev_op_turbo_dec *turbo_dec = &op->turbo_dec; + struct rte_bbdev_op_dec_turbo_cb_params *cb = NULL; + struct rte_bbdev_op_dec_turbo_tb_params *tb = NULL; + + if (op->mempool == NULL) { + rte_bbdev_log(ERR, "Invalid mempool pointer"); + return -1; + } + if (turbo_dec->input.data == NULL) { + rte_bbdev_log(ERR, "Invalid input pointer"); + return -1; + } + if (turbo_dec->hard_output.data == NULL) { + rte_bbdev_log(ERR, "Invalid hard_output pointer"); + return -1; + } + if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_SOFT_OUTPUT) && + turbo_dec->soft_output.data == NULL) { + rte_bbdev_log(ERR, "Invalid soft_output pointer"); + return -1; + } + if (turbo_dec->rv_index > 3) { + rte_bbdev_log(ERR, + "rv_index (%u) is out of range 0 <= value <= 3", + turbo_dec->rv_index); + return -1; + } + if (turbo_dec->iter_min < 1) { + rte_bbdev_log(ERR, + "iter_min (%u) is less than 1", + turbo_dec->iter_min); + return -1; + } + if (turbo_dec->iter_max <= 2) { + rte_bbdev_log(ERR, + "iter_max (%u) is less than or equal to 2", + turbo_dec->iter_max); + return -1; + } + if (turbo_dec->iter_min > turbo_dec->iter_max) { + rte_bbdev_log(ERR, + "iter_min (%u) is greater than iter_max (%u)", + turbo_dec->iter_min, turbo_dec->iter_max); + return -1; + } + if (turbo_dec->code_block_mode != RTE_BBDEV_TRANSPORT_BLOCK && + turbo_dec->code_block_mode != RTE_BBDEV_CODE_BLOCK) { + rte_bbdev_log(ERR, + "code_block_mode (%u) is out of range 0 <= value <= 1", + turbo_dec->code_block_mode); + return -1; + } + + if (turbo_dec->code_block_mode == RTE_BBDEV_TRANSPORT_BLOCK) { + tb = &turbo_dec->tb_params; + if ((tb->k_neg < RTE_BBDEV_TURBO_MIN_CB_SIZE + || tb->k_neg > RTE_BBDEV_TURBO_MAX_CB_SIZE) + && tb->c_neg > 0) { + rte_bbdev_log(ERR, + "k_neg (%u) is out of range %u <= value <= %u", + tb->k_neg, RTE_BBDEV_TURBO_MIN_CB_SIZE, + RTE_BBDEV_TURBO_MAX_CB_SIZE); + return -1; + } + if ((tb->k_pos < RTE_BBDEV_TURBO_MIN_CB_SIZE + || tb->k_pos > RTE_BBDEV_TURBO_MAX_CB_SIZE) + && tb->c > tb->c_neg) { + rte_bbdev_log(ERR, + "k_pos (%u) is out of range %u <= value <= %u", + tb->k_pos, RTE_BBDEV_TURBO_MIN_CB_SIZE, + RTE_BBDEV_TURBO_MAX_CB_SIZE); + return -1; + } + if (tb->c_neg > (RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1)) + rte_bbdev_log(ERR, + "c_neg (%u) is out of range 0 <= value <= %u", + tb->c_neg, + RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1); + if (tb->c < 1 || tb->c > RTE_BBDEV_TURBO_MAX_CODE_BLOCKS) { + rte_bbdev_log(ERR, + "c (%u) is out of range 1 <= value <= %u", + tb->c, RTE_BBDEV_TURBO_MAX_CODE_BLOCKS); + return -1; + } + if (tb->cab > tb->c) { + rte_bbdev_log(ERR, + "cab (%u) is greater than c (%u)", + tb->cab, tb->c); + return -1; + } + if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_EQUALIZER) && + (tb->ea < RTE_BBDEV_TURBO_MIN_CB_SIZE + || (tb->ea % 2)) + && tb->cab > 0) { + rte_bbdev_log(ERR, + "ea (%u) is less 
than %u or it is not even", + tb->ea, RTE_BBDEV_TURBO_MIN_CB_SIZE); + return -1; + } + if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_EQUALIZER) && + (tb->eb < RTE_BBDEV_TURBO_MIN_CB_SIZE + || (tb->eb % 2)) + && tb->c > tb->cab) { + rte_bbdev_log(ERR, + "eb (%u) is less than %u or it is not even", + tb->eb, RTE_BBDEV_TURBO_MIN_CB_SIZE); + } + } else { + cb = &turbo_dec->cb_params; + if (cb->k < RTE_BBDEV_TURBO_MIN_CB_SIZE + || cb->k > RTE_BBDEV_TURBO_MAX_CB_SIZE) { + rte_bbdev_log(ERR, + "k (%u) is out of range %u <= value <= %u", + cb->k, RTE_BBDEV_TURBO_MIN_CB_SIZE, + RTE_BBDEV_TURBO_MAX_CB_SIZE); + return -1; + } + if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_EQUALIZER) && + (cb->e < RTE_BBDEV_TURBO_MIN_CB_SIZE || + (cb->e % 2))) { + rte_bbdev_log(ERR, + "e (%u) is less than %u or it is not even", + cb->e, RTE_BBDEV_TURBO_MIN_CB_SIZE); + return -1; + } + } + + return 0; +} +#endif + +/** Enqueue one decode operations for ACC200 device in CB mode */ +static inline int +enqueue_dec_one_op_cb(struct acc200_queue *q, struct rte_bbdev_dec_op *op, + uint16_t total_enqueued_cbs) +{ + union acc200_dma_desc *desc = NULL; + int ret; + uint32_t in_offset, h_out_offset, s_out_offset, s_out_length, + h_out_length, mbuf_total_left, seg_total_left; + struct rte_mbuf *input, *h_output_head, *h_output, + *s_output_head, *s_output; + +#ifdef RTE_LIBRTE_BBDEV_DEBUG + /* Validate op structure */ + if (validate_dec_op(op) == -1) { + rte_bbdev_log(ERR, "Turbo decoder validation failed"); + return -EINVAL; + } +#endif + + uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs) + & q->sw_ring_wrap_mask); + desc = q->ring_addr + desc_idx; + acc200_fcw_td_fill(op, &desc->req.fcw_td); + + input = op->turbo_dec.input.data; + h_output_head = h_output = op->turbo_dec.hard_output.data; + s_output_head = s_output = op->turbo_dec.soft_output.data; + in_offset = op->turbo_dec.input.offset; + h_out_offset = op->turbo_dec.hard_output.offset; + s_out_offset = op->turbo_dec.soft_output.offset; + h_out_length = s_out_length = 0; + mbuf_total_left = op->turbo_dec.input.length; + seg_total_left = rte_pktmbuf_data_len(input) - in_offset; + +#ifdef RTE_LIBRTE_BBDEV_DEBUG + if (unlikely(input == NULL)) { + rte_bbdev_log(ERR, "Invalid mbuf pointer"); + return -EFAULT; + } +#endif + + /* Set up DMA descriptor */ + desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs) + & q->sw_ring_wrap_mask); + + ret = acc200_dma_desc_td_fill(op, &desc->req, &input, h_output, + s_output, &in_offset, &h_out_offset, &s_out_offset, + &h_out_length, &s_out_length, &mbuf_total_left, + &seg_total_left, 0); + + if (unlikely(ret < 0)) + return ret; + + /* Hard output */ + mbuf_append(h_output_head, h_output, h_out_length); + + /* Soft output */ + if (check_bit(op->turbo_dec.op_flags, RTE_BBDEV_TURBO_SOFT_OUTPUT)) + mbuf_append(s_output_head, s_output, s_out_length); + +#ifdef RTE_LIBRTE_BBDEV_DEBUG + rte_memdump(stderr, "FCW", &desc->req.fcw_td, + sizeof(desc->req.fcw_td)); + rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc)); +#endif + + /* One CB (one op) was successfully prepared to enqueue */ + return 1; +} + /** Enqueue one decode operations for ACC200 device in CB mode */ static inline int enqueue_ldpc_dec_one_op_cb(struct acc200_queue *q, struct rte_bbdev_dec_op *op, @@ -2084,6 +2925,108 @@ static inline uint32_t hq_index(uint32_t offset) return current_enqueued_cbs; } +/* Enqueue one decode operations for ACC200 device in TB mode */ +static inline int +enqueue_dec_one_op_tb(struct acc200_queue *q, struct rte_bbdev_dec_op 
*op, + uint16_t total_enqueued_cbs, uint8_t cbs_in_tb) +{ + union acc200_dma_desc *desc = NULL; + int ret; + uint8_t r, c; + uint32_t in_offset, h_out_offset, s_out_offset, s_out_length, + h_out_length, mbuf_total_left, seg_total_left; + struct rte_mbuf *input, *h_output_head, *h_output, + *s_output_head, *s_output; + uint16_t current_enqueued_cbs = 0; + +#ifdef RTE_LIBRTE_BBDEV_DEBUG + /* Validate op structure */ + if (validate_dec_op(op) == -1) { + rte_bbdev_log(ERR, "Turbo decoder validation failed"); + return -EINVAL; + } +#endif + + uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs) + & q->sw_ring_wrap_mask); + desc = q->ring_addr + desc_idx; + uint64_t fcw_offset = (desc_idx << 8) + ACC200_DESC_FCW_OFFSET; + acc200_fcw_td_fill(op, &desc->req.fcw_td); + + input = op->turbo_dec.input.data; + h_output_head = h_output = op->turbo_dec.hard_output.data; + s_output_head = s_output = op->turbo_dec.soft_output.data; + in_offset = op->turbo_dec.input.offset; + h_out_offset = op->turbo_dec.hard_output.offset; + s_out_offset = op->turbo_dec.soft_output.offset; + h_out_length = s_out_length = 0; + mbuf_total_left = op->turbo_dec.input.length; + c = op->turbo_dec.tb_params.c; + r = op->turbo_dec.tb_params.r; + + while (mbuf_total_left > 0 && r < c) { + + seg_total_left = rte_pktmbuf_data_len(input) - in_offset; + + /* Set up DMA descriptor */ + desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs) + & q->sw_ring_wrap_mask); + desc->req.data_ptrs[0].address = q->ring_addr_iova + fcw_offset; + desc->req.data_ptrs[0].blen = ACC200_FCW_TD_BLEN; + ret = acc200_dma_desc_td_fill(op, &desc->req, &input, + h_output, s_output, &in_offset, &h_out_offset, + &s_out_offset, &h_out_length, &s_out_length, + &mbuf_total_left, &seg_total_left, r); + + if (unlikely(ret < 0)) + return ret; + + /* Hard output */ + mbuf_append(h_output_head, h_output, h_out_length); + + /* Soft output */ + if (check_bit(op->turbo_dec.op_flags, + RTE_BBDEV_TURBO_SOFT_OUTPUT)) + mbuf_append(s_output_head, s_output, s_out_length); + + /* Set total number of CBs in TB */ + desc->req.cbs_in_tb = cbs_in_tb; +#ifdef RTE_LIBRTE_BBDEV_DEBUG + rte_memdump(stderr, "FCW", &desc->req.fcw_td, + sizeof(desc->req.fcw_td) - 8); + rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc)); +#endif + + if (seg_total_left == 0) { + /* Go to the next mbuf */ + input = input->next; + in_offset = 0; + h_output = h_output->next; + h_out_offset = 0; + + if (check_bit(op->turbo_dec.op_flags, + RTE_BBDEV_TURBO_SOFT_OUTPUT)) { + s_output = s_output->next; + s_out_offset = 0; + } + } + + total_enqueued_cbs++; + current_enqueued_cbs++; + r++; + } + +#ifdef RTE_LIBRTE_BBDEV_DEBUG + if (check_mbuf_total_left(mbuf_total_left) != 0) + return -EINVAL; +#endif + /* Set SDone on last CB descriptor for TB mode */ + desc->req.sdone_enable = 1; + desc->req.irq_enable = q->irq_enable; + + return current_enqueued_cbs; +} + /* Calculates number of CBs in processed encoder TB based on 'r' and input * length. */ @@ -2230,6 +3173,49 @@ static inline uint32_t hq_index(uint32_t offset) return (q->sw_ring_depth + q->sw_ring_head - q->sw_ring_tail) % q->sw_ring_depth; } +/* Enqueue encode operations for ACC200 device in CB mode. 
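
A note on the TB-mode helpers above: one descriptor is consumed per code block, and the loop advances to the next mbuf segment once seg_total_left reaches zero, so a transport block is naturally supplied as a chained mbuf with one segment per code block. A hedged application-side sketch of building such an input (assumes an initialized EAL and an existing pktmbuf mempool mp; the per-CB length is illustrative and error-path mbuf frees are elided):

    #include <rte_mbuf.h>

    // Build a chained mbuf carrying cbs_in_tb code blocks,
    // one code block per segment.
    static struct rte_mbuf *
    build_tb_input(struct rte_mempool *mp, uint8_t cbs_in_tb,
                    uint16_t cb_len)
    {
            struct rte_mbuf *head = rte_pktmbuf_alloc(mp);
            uint8_t r;

            if (head == NULL || rte_pktmbuf_append(head, cb_len) == NULL)
                    return NULL;
            for (r = 1; r < cbs_in_tb; r++) {
                    struct rte_mbuf *seg = rte_pktmbuf_alloc(mp);
                    if (seg == NULL ||
                                    rte_pktmbuf_append(seg, cb_len) == NULL ||
                                    rte_pktmbuf_chain(head, seg) != 0)
                            return NULL;
            }
            return head; // head->pkt_len now covers the whole TB
    }
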
*/ +static uint16_t +acc200_enqueue_enc_cb(struct rte_bbdev_queue_data *q_data, + struct rte_bbdev_enc_op **ops, uint16_t num) +{ + struct acc200_queue *q = q_data->queue_private; + int32_t avail = acc200_ring_avail_enq(q); + uint16_t i; + union acc200_dma_desc *desc; + int ret; + + for (i = 0; i < num; ++i) { + /* Check if there are available space for further processing */ + if (unlikely(avail - 1 < 0)) { + acc200_enqueue_ring_full(q_data); + break; + } + avail -= 1; + + ret = enqueue_enc_one_op_cb(q, ops[i], i); + if (ret < 0) { + acc200_enqueue_invalid(q_data); + break; + } + } + + if (unlikely(i == 0)) + return 0; /* Nothing to enqueue */ + + /* Set SDone in last CB in enqueued ops for CB mode*/ + desc = q->ring_addr + ((q->sw_ring_head + i - 1) + & q->sw_ring_wrap_mask); + desc->req.sdone_enable = 1; + desc->req.irq_enable = q->irq_enable; + + acc200_dma_enqueue(q, i, &q_data->queue_stats); + + /* Update stats */ + q_data->queue_stats.enqueued_count += i; + q_data->queue_stats.enqueue_err_count += num - i; + return i; +} + /* Check we can mux encode operations with common FCW */ static inline int16_t check_mux(struct rte_bbdev_enc_op **ops, uint16_t num) { @@ -2310,6 +3296,45 @@ static inline uint32_t hq_index(uint32_t offset) return i; } +/* Enqueue encode operations for ACC200 device in TB mode. */ +static uint16_t +acc200_enqueue_enc_tb(struct rte_bbdev_queue_data *q_data, + struct rte_bbdev_enc_op **ops, uint16_t num) +{ + struct acc200_queue *q = q_data->queue_private; + int32_t avail = acc200_ring_avail_enq(q); + uint16_t i, enqueued_cbs = 0; + uint8_t cbs_in_tb; + int ret; + + for (i = 0; i < num; ++i) { + cbs_in_tb = get_num_cbs_in_tb_enc(&ops[i]->turbo_enc); + /* Check if there are available space for further processing */ + if (unlikely((avail - cbs_in_tb < 0) || (cbs_in_tb == 0))) { + acc200_enqueue_ring_full(q_data); + break; + } + avail -= cbs_in_tb; + + ret = enqueue_enc_one_op_tb(q, ops[i], enqueued_cbs, cbs_in_tb); + if (ret <= 0) { + acc200_enqueue_invalid(q_data); + break; + } + enqueued_cbs += ret; + } + if (unlikely(enqueued_cbs == 0)) + return 0; /* Nothing to enqueue */ + + acc200_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats); + + /* Update stats */ + q_data->queue_stats.enqueued_count += i; + q_data->queue_stats.enqueue_err_count += num - i; + + return i; +} + /* Enqueue LDPC encode operations for ACC200 device in TB mode. */ static uint16_t acc200_enqueue_ldpc_enc_tb(struct rte_bbdev_queue_data *q_data, @@ -2366,6 +3391,20 @@ static inline uint32_t hq_index(uint32_t offset) /* Enqueue encode operations for ACC200 device. */ static uint16_t +acc200_enqueue_enc(struct rte_bbdev_queue_data *q_data, + struct rte_bbdev_enc_op **ops, uint16_t num) +{ + int32_t aq_avail = acc200_aq_avail(q_data, num); + if (unlikely((aq_avail <= 0) || (num == 0))) + return 0; + if (ops[0]->turbo_enc.code_block_mode == RTE_BBDEV_TRANSPORT_BLOCK) + return acc200_enqueue_enc_tb(q_data, ops, num); + else + return acc200_enqueue_enc_cb(q_data, ops, num); +} + +/* Enqueue encode operations for ACC200 device. 
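
Since the enqueue entry point above may accept fewer operations than requested once the ring or atomic queue fills up, the caller is expected to resubmit the remainder. A hedged host-side sketch using the public bbdev API (dev_id and queue_id are assumed to identify an already configured device and queue; the loop busy-waits, and a real application would bound the retries):

    #include <rte_bbdev.h>

    // Enqueue a full burst of encode ops, retrying the tail
    // that the PMD could not accept.
    static void
    enqueue_all_enc(uint16_t dev_id, uint16_t queue_id,
                    struct rte_bbdev_enc_op **ops, uint16_t num)
    {
            uint16_t done = 0;

            while (done < num)
                    done += rte_bbdev_enqueue_enc_ops(dev_id, queue_id,
                                    &ops[done], num - done);
    }
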
*/ +static uint16_t acc200_enqueue_ldpc_enc(struct rte_bbdev_queue_data *q_data, struct rte_bbdev_enc_op **ops, uint16_t num) { @@ -2379,6 +3418,47 @@ static inline uint32_t hq_index(uint32_t offset) return acc200_enqueue_ldpc_enc_cb(q_data, ops, num); } + +/* Enqueue decode operations for ACC200 device in CB mode */ +static uint16_t +acc200_enqueue_dec_cb(struct rte_bbdev_queue_data *q_data, + struct rte_bbdev_dec_op **ops, uint16_t num) +{ + struct acc200_queue *q = q_data->queue_private; + int32_t avail = acc200_ring_avail_enq(q); + uint16_t i; + union acc200_dma_desc *desc; + int ret; + + for (i = 0; i < num; ++i) { + /* Check if there are available space for further processing */ + if (unlikely(avail - 1 < 0)) + break; + avail -= 1; + + ret = enqueue_dec_one_op_cb(q, ops[i], i); + if (ret < 0) + break; + } + + if (unlikely(i == 0)) + return 0; /* Nothing to enqueue */ + + /* Set SDone in last CB in enqueued ops for CB mode*/ + desc = q->ring_addr + ((q->sw_ring_head + i - 1) + & q->sw_ring_wrap_mask); + desc->req.sdone_enable = 1; + desc->req.irq_enable = q->irq_enable; + + acc200_dma_enqueue(q, i, &q_data->queue_stats); + + /* Update stats */ + q_data->queue_stats.enqueued_count += i; + q_data->queue_stats.enqueue_err_count += num - i; + + return i; +} + /* Check we can mux encode operations with common FCW */ static inline bool cmp_ldpc_dec_op(struct rte_bbdev_dec_op **ops) { @@ -2480,6 +3560,58 @@ static inline uint32_t hq_index(uint32_t offset) return i; } + +/* Enqueue decode operations for ACC200 device in TB mode */ +static uint16_t +acc200_enqueue_dec_tb(struct rte_bbdev_queue_data *q_data, + struct rte_bbdev_dec_op **ops, uint16_t num) +{ + struct acc200_queue *q = q_data->queue_private; + int32_t avail = acc200_ring_avail_enq(q); + uint16_t i, enqueued_cbs = 0; + uint8_t cbs_in_tb; + int ret; + + for (i = 0; i < num; ++i) { + cbs_in_tb = get_num_cbs_in_tb_dec(&ops[i]->turbo_dec); + /* Check if there are available space for further processing */ + if (unlikely((avail - cbs_in_tb < 0) || (cbs_in_tb == 0))) { + acc200_enqueue_ring_full(q_data); + break; + } + avail -= cbs_in_tb; + + ret = enqueue_dec_one_op_tb(q, ops[i], enqueued_cbs, cbs_in_tb); + if (ret <= 0) { + acc200_enqueue_invalid(q_data); + break; + } + enqueued_cbs += ret; + } + + acc200_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats); + + /* Update stats */ + q_data->queue_stats.enqueued_count += i; + q_data->queue_stats.enqueue_err_count += num - i; + + return i; +} + +/* Enqueue decode operations for ACC200 device. */ +static uint16_t +acc200_enqueue_dec(struct rte_bbdev_queue_data *q_data, + struct rte_bbdev_dec_op **ops, uint16_t num) +{ + int32_t aq_avail = acc200_aq_avail(q_data, num); + if (unlikely((aq_avail <= 0) || (num == 0))) + return 0; + if (ops[0]->turbo_dec.code_block_mode == RTE_BBDEV_TRANSPORT_BLOCK) + return acc200_enqueue_dec_tb(q_data, ops, num); + else + return acc200_enqueue_dec_cb(q_data, ops, num); +} + /* Enqueue decode operations for ACC200 device. */ static uint16_t acc200_enqueue_ldpc_dec(struct rte_bbdev_queue_data *q_data, @@ -2833,6 +3965,51 @@ static inline uint32_t hq_index(uint32_t offset) return cb_idx; } +/* Dequeue encode operations from ACC200 device. 
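
The dequeue entry points below mirror this pattern on the completion side: they return however many operations have finished, so the application polls until the expected count is recovered. A hedged sketch under the same assumptions as above (unbounded spin; production code would add a timeout or use the interrupt support added later in this series):

    #include <rte_bbdev.h>

    // Poll a queue until num decode ops have completed.
    static void
    dequeue_all_dec(uint16_t dev_id, uint16_t queue_id,
                    struct rte_bbdev_dec_op **ops, uint16_t num)
    {
            uint16_t done = 0;

            while (done < num)
                    done += rte_bbdev_dequeue_dec_ops(dev_id, queue_id,
                                    &ops[done], num - done);
    }
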
*/ +static uint16_t +acc200_dequeue_enc(struct rte_bbdev_queue_data *q_data, + struct rte_bbdev_enc_op **ops, uint16_t num) +{ + struct acc200_queue *q = q_data->queue_private; + uint32_t avail = acc200_ring_avail_deq(q); + uint32_t aq_dequeued = 0; + uint16_t i, dequeued_ops = 0, dequeued_descs = 0; + int ret; + struct rte_bbdev_enc_op *op; + if (avail == 0) + return 0; + op = (q->ring_addr + (q->sw_ring_tail & + q->sw_ring_wrap_mask))->req.op_addr; +#ifdef RTE_LIBRTE_BBDEV_DEBUG + if (unlikely(ops == NULL || q == NULL || op == NULL)) + return 0; +#endif + int cbm = op->turbo_enc.code_block_mode; + + for (i = 0; i < num; i++) { + if (cbm == RTE_BBDEV_TRANSPORT_BLOCK) + ret = dequeue_enc_one_op_tb(q, &ops[dequeued_ops], + &dequeued_ops, &aq_dequeued, + &dequeued_descs); + else + ret = dequeue_enc_one_op_cb(q, &ops[dequeued_ops], + &dequeued_ops, &aq_dequeued, + &dequeued_descs); + if (ret < 0) + break; + if (dequeued_ops >= num) + break; + } + + q->aq_dequeued += aq_dequeued; + q->sw_ring_tail += dequeued_descs; + + /* Update dequeue stats */ + q_data->queue_stats.dequeued_count += dequeued_ops; + + return dequeued_ops; +} + /* Dequeue LDPC encode operations from ACC200 device. */ static uint16_t acc200_dequeue_ldpc_enc(struct rte_bbdev_queue_data *q_data, @@ -2880,6 +4057,51 @@ static inline uint32_t hq_index(uint32_t offset) /* Dequeue decode operations from ACC200 device. */ static uint16_t +acc200_dequeue_dec(struct rte_bbdev_queue_data *q_data, + struct rte_bbdev_dec_op **ops, uint16_t num) +{ + struct acc200_queue *q = q_data->queue_private; + uint16_t dequeue_num; + uint32_t avail = acc200_ring_avail_deq(q); + uint32_t aq_dequeued = 0; + uint16_t i; + uint16_t dequeued_cbs = 0; + struct rte_bbdev_dec_op *op; + int ret; + +#ifdef RTE_LIBRTE_BBDEV_DEBUG + if (unlikely(ops == NULL || q == NULL)) + return 0; +#endif + + dequeue_num = (avail < num) ? avail : num; + + for (i = 0; i < dequeue_num; ++i) { + op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs) + & q->sw_ring_wrap_mask))->req.op_addr; + if (op->turbo_dec.code_block_mode == RTE_BBDEV_TRANSPORT_BLOCK) + ret = dequeue_dec_one_op_tb(q, &ops[i], dequeued_cbs, + &aq_dequeued); + else + ret = dequeue_dec_one_op_cb(q_data, q, &ops[i], + dequeued_cbs, &aq_dequeued); + + if (ret <= 0) + break; + dequeued_cbs += ret; + } + + q->aq_dequeued += aq_dequeued; + q->sw_ring_tail += dequeued_cbs; + + /* Update dequeue stats */ + q_data->queue_stats.dequeued_count += i; + + return i; +} +
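
Once dequeued, op->status carries the response flags unpacked by the helpers above, one bit per error class, and the turbo decoder additionally reports the iteration count it ran. A hedged post-processing sketch:

    #include <rte_bbdev_op.h>

    // Inspect a completed turbo decode op: status is a bitmask of
    // RTE_BBDEV_* error bits, iter_count the iterations performed.
    static int
    check_dec_result(const struct rte_bbdev_dec_op *op)
    {
            if (op->status & (1 << RTE_BBDEV_CRC_ERROR))
                    return -1; // CB still failing CRC at iter_max
            if (op->status & (1 << RTE_BBDEV_DRV_ERROR))
                    return -2; // descriptor/DMA level failure
            return op->turbo_dec.iter_count; // success
    }

+/* Dequeue decode operations from ACC200 device. 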
*/ +static uint16_t acc200_dequeue_ldpc_dec(struct rte_bbdev_queue_data *q_data, struct rte_bbdev_dec_op **ops, uint16_t num) { @@ -2931,6 +4153,10 @@ static inline uint32_t hq_index(uint32_t offset) struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device); dev->dev_ops = &acc200_bbdev_ops; + dev->enqueue_enc_ops = acc200_enqueue_enc; + dev->enqueue_dec_ops = acc200_enqueue_dec; + dev->dequeue_enc_ops = acc200_dequeue_enc; + dev->dequeue_dec_ops = acc200_dequeue_dec; dev->enqueue_ldpc_enc_ops = acc200_enqueue_ldpc_enc; dev->enqueue_ldpc_dec_ops = acc200_enqueue_ldpc_dec; dev->dequeue_ldpc_enc_ops = acc200_dequeue_ldpc_enc; From patchwork Fri Jul 8 00:01:40 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Chautru, Nicolas" X-Patchwork-Id: 113819 X-Patchwork-Delegate: gakhil@marvell.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 9DC54A0543; Fri, 8 Jul 2022 02:17:03 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 107444284D; Fri, 8 Jul 2022 02:16:14 +0200 (CEST) Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by mails.dpdk.org (Postfix) with ESMTP id 41E7240A7B for ; Fri, 8 Jul 2022 02:16:07 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1657239367; x=1688775367; h=from:to:cc:subject:date:message-id:in-reply-to: references; bh=o+sh+mls/ygA45OUdH7SaRAzHiH5S4ZXKMp+pqZ4wqo=; b=H4m6YYxDjv7B0ZV0tW020c9ZFyl9fNDglEpk8wx3gtnkC8arHVGSWcXX aWB2+MHSh8q49NK4L1u09r5dJUKy+n/JwJLjXo6EUUhqzPxMs7GAg+p9z 4TX8KBslRAxuY+44z07poqYhsG9rxZs7i3Uj7BL5RLGehnsFPsJea0h1x 64rlSyaiuTODBJcFuPSuWKWoSSSc+AKe+V9XZqLLq9IB2oe4Yedf4Xixp yqP61980dB2GU4XMuqpQx3zv/izdQH0HH4YEsqRuiDolrfVsgAQL24602 54QjC6w7MbT/8GijLSLKfJGSiRXHrygKNfFacryxn8f56keKCP+bPu7Ug Q==; X-IronPort-AV: E=McAfee;i="6400,9594,10401"; a="264563089" X-IronPort-AV: E=Sophos;i="5.92,253,1650956400"; d="scan'208";a="264563089" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 07 Jul 2022 17:16:05 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.92,253,1650956400"; d="scan'208";a="591387547" Received: from skx-5gnr-sc12-4.sc.intel.com ([172.25.69.210]) by orsmga007.jf.intel.com with ESMTP; 07 Jul 2022 17:16:04 -0700 From: Nicolas Chautru To: dev@dpdk.org, thomas@monjalon.net, gakhil@marvell.com, hemant.agrawal@nxp.com, trix@redhat.com Cc: maxime.coquelin@redhat.com, mdr@ashroe.eu, bruce.richardson@intel.com, david.marchand@redhat.com, stephen@networkplumber.org, Nicolas Chautru Subject: [PATCH v1 07/10] baseband/acc200: add support for FFT operations Date: Thu, 7 Jul 2022 17:01:40 -0700 Message-Id: <1657238503-143836-8-git-send-email-nicolas.chautru@intel.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1657238503-143836-1-git-send-email-nicolas.chautru@intel.com> References: <1657238503-143836-1-git-send-email-nicolas.chautru@intel.com> X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Add functions and capability for FFT processing Signed-off-by: Nicolas Chautru --- drivers/baseband/acc200/rte_acc200_pmd.c | 272 ++++++++++++++++++++++++++++++- 1 file changed, 
270 insertions(+), 2 deletions(-) diff --git a/drivers/baseband/acc200/rte_acc200_pmd.c b/drivers/baseband/acc200/rte_acc200_pmd.c index 003a2a3..36c5561 100644 --- a/drivers/baseband/acc200/rte_acc200_pmd.c +++ b/drivers/baseband/acc200/rte_acc200_pmd.c @@ -860,6 +860,21 @@ .num_buffers_soft_out = 0, } }, + { + .type = RTE_BBDEV_OP_FFT, + .cap.fft = { + .capability_flags = + RTE_BBDEV_FFT_WINDOWING | + RTE_BBDEV_FFT_CS_ADJUSTMENT | + RTE_BBDEV_FFT_DFT_BYPASS | + RTE_BBDEV_FFT_IDFT_BYPASS | + RTE_BBDEV_FFT_WINDOWING_BYPASS, + .num_buffers_src = + RTE_BBDEV_LDPC_MAX_CODE_BLOCKS, + .num_buffers_dst = + RTE_BBDEV_LDPC_MAX_CODE_BLOCKS, + } + }, RTE_BBDEV_END_OF_CAPABILITIES_LIST() }; @@ -882,12 +897,13 @@ d->acc200_conf.q_ul_5g.num_qgroups; dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] = d->acc200_conf.q_dl_5g.num_aqs_per_groups * d->acc200_conf.q_dl_5g.num_qgroups; - dev_info->num_queues[RTE_BBDEV_OP_FFT] = 0; + dev_info->num_queues[RTE_BBDEV_OP_FFT] = d->acc200_conf.q_fft.num_aqs_per_groups * + d->acc200_conf.q_fft.num_qgroups; dev_info->queue_priority[RTE_BBDEV_OP_TURBO_DEC] = d->acc200_conf.q_ul_4g.num_qgroups; dev_info->queue_priority[RTE_BBDEV_OP_TURBO_ENC] = d->acc200_conf.q_dl_4g.num_qgroups; dev_info->queue_priority[RTE_BBDEV_OP_LDPC_DEC] = d->acc200_conf.q_ul_5g.num_qgroups; dev_info->queue_priority[RTE_BBDEV_OP_LDPC_ENC] = d->acc200_conf.q_dl_5g.num_qgroups; - dev_info->queue_priority[RTE_BBDEV_OP_FFT] = 0; + dev_info->queue_priority[RTE_BBDEV_OP_FFT] = d->acc200_conf.q_fft.num_qgroups; dev_info->max_num_queues = 0; for (i = RTE_BBDEV_OP_NONE; i <= RTE_BBDEV_OP_FFT; i++) dev_info->max_num_queues += dev_info->num_queues[i]; @@ -2124,6 +2140,21 @@ static inline uint32_t hq_index(uint32_t offset) return 0; } + +/* Validates FFT op parameters */ +static inline int +validate_fft_op(struct rte_bbdev_fft_op *op) +{ + struct rte_bbdev_op_fft *fft = &op->fft; + struct rte_mbuf *input; + input = fft->base_input.data; + if (unlikely(input == NULL)) { + rte_bbdev_log(ERR, "Invalid mbuf pointer"); + return -EFAULT; + } + return 0; +} + #endif /* Enqueue one encode operations for ACC200 device in CB mode */ @@ -4146,6 +4177,241 @@ static inline uint32_t hq_index(uint32_t offset) return i; } +/* Fill in a frame control word for FFT processing. 
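
The FCW fill routine that follows maps the generic FFT operation fields onto the device frame control word. For orientation, a hedged sketch of the producer side, populating an FFT operation before enqueue (field and flag names as used by this patch; the values are purely illustrative, and op together with its input/output mbufs is assumed already allocated):

    #include <rte_bbdev_op.h>
    #include <rte_mbuf.h>

    // Describe a 512-point DFT with iDFT and windowing bypassed
    // (the FCW bypass field below would encode this as 2).
    static void
    setup_fft_op(struct rte_bbdev_fft_op *op, struct rte_mbuf *in,
                    struct rte_mbuf *out)
    {
            op->fft.base_input.data = in;
            op->fft.base_input.offset = 0;
            op->fft.base_output.data = out;
            op->fft.base_output.offset = 0;
            op->fft.input_sequence_size = 512;  // samples per antenna in
            op->fft.output_sequence_size = 512; // samples per antenna out
            op->fft.dft_log2 = 9;               // 2^9 = 512-point DFT
            op->fft.num_antennas_log2 = 3;      // 8 antennas
            op->fft.op_flags = RTE_BBDEV_FFT_IDFT_BYPASS |
                            RTE_BBDEV_FFT_WINDOWING_BYPASS;
    }
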
*/ +static inline void +acc200_fcw_fft_fill(struct rte_bbdev_fft_op *op, struct acc200_fcw_fft *fcw) +{ + fcw->in_frame_size = op->fft.input_sequence_size; + fcw->leading_pad_size = op->fft.input_leading_padding; + fcw->out_frame_size = op->fft.output_sequence_size; + fcw->leading_depad_size = op->fft.output_leading_depadding; + fcw->cs_window_sel = op->fft.window_index[0] + + (op->fft.window_index[1] << 8) + + (op->fft.window_index[2] << 16) + + (op->fft.window_index[3] << 24); + fcw->cs_window_sel2 = op->fft.window_index[4] + + (op->fft.window_index[5] << 8); + fcw->cs_enable_bmap = op->fft.cs_bitmap; + fcw->num_antennas = op->fft.num_antennas_log2; + fcw->idft_size = op->fft.idft_log2; + fcw->dft_size = op->fft.dft_log2; + fcw->cs_offset = op->fft.cs_time_adjustment; + fcw->idft_shift = op->fft.idft_shift; + fcw->dft_shift = op->fft.dft_shift; + fcw->cs_multiplier = op->fft.ncs_reciprocal; + if (check_bit(op->fft.op_flags, + RTE_BBDEV_FFT_IDFT_BYPASS)) { + if (check_bit(op->fft.op_flags, + RTE_BBDEV_FFT_WINDOWING_BYPASS)) + fcw->bypass = 2; + else + fcw->bypass = 1; + } else if (check_bit(op->fft.op_flags, + RTE_BBDEV_FFT_DFT_BYPASS)) + fcw->bypass = 3; + else + fcw->bypass = 0; +} + +static inline int +acc200_dma_desc_fft_fill(struct rte_bbdev_fft_op *op, + struct acc200_dma_req_desc *desc, + struct rte_mbuf *input, struct rte_mbuf *output, + uint32_t *in_offset, uint32_t *out_offset) +{ + /* FCW already done */ + acc200_header_init(desc); + desc->data_ptrs[1].address = + rte_pktmbuf_iova_offset(input, *in_offset); + desc->data_ptrs[1].blen = op->fft.input_sequence_size * 4; + desc->data_ptrs[1].blkid = ACC200_DMA_BLKID_IN; + desc->data_ptrs[1].last = 1; + desc->data_ptrs[1].dma_ext = 0; + desc->data_ptrs[2].address = + rte_pktmbuf_iova_offset(output, *out_offset); + desc->data_ptrs[2].blen = op->fft.output_sequence_size * 4; + desc->data_ptrs[2].blkid = ACC200_DMA_BLKID_OUT_HARD; + desc->data_ptrs[2].last = 1; + desc->data_ptrs[2].dma_ext = 0; + desc->m2dlen = 2; + desc->d2mlen = 1; + desc->ib_ant_offset = op->fft.input_sequence_size; + desc->num_ant = op->fft.num_antennas_log2 - 3; + int num_cs = 0, i; + for (i = 0; i < 12; i++) + if (check_bit(op->fft.cs_bitmap, 1 << i)) + num_cs++; + desc->num_cs = num_cs; + desc->ob_cyc_offset = op->fft.output_sequence_size; + desc->ob_ant_offset = op->fft.output_sequence_size * num_cs; + desc->op_addr = op; + return 0; +} + + +/** Enqueue one FFT operation for ACC200 device*/ +static inline int +enqueue_fft_one_op(struct acc200_queue *q, struct rte_bbdev_fft_op *op, + uint16_t total_enqueued_cbs) +{ +#ifdef RTE_LIBRTE_BBDEV_DEBUG + if (validate_fft_op(op) == -EFAULT) { + rte_bbdev_log(ERR, "FFT op validation failed"); + return -EINVAL; + } +#endif + union acc200_dma_desc *desc; + uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs) + & q->sw_ring_wrap_mask); + desc = q->ring_addr + desc_idx; + struct rte_mbuf *input, *output; + uint32_t in_offset, out_offset; + input = op->fft.base_input.data; + output = op->fft.base_output.data; + in_offset = op->fft.base_input.offset; + out_offset = op->fft.base_output.offset; +#ifdef RTE_LIBRTE_BBDEV_DEBUG + if (unlikely(input == NULL)) { + rte_bbdev_log(ERR, "Invalid mbuf pointer"); + return -EFAULT; + } +#endif + struct acc200_fcw_fft *fcw; + fcw = &desc->req.fcw_fft; + acc200_fcw_fft_fill(op, fcw); + acc200_dma_desc_fft_fill(op, &desc->req, input, output, + &in_offset, &out_offset); +#ifdef RTE_LIBRTE_BBDEV_DEBUG + rte_memdump(stderr, "FCW", &desc->req.fcw_fft, + sizeof(desc->req.fcw_fft)); + 
rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc)); +#endif + return 1; +} + +/* Enqueue decode operations for ACC200 device. */ +static uint16_t +acc200_enqueue_fft(struct rte_bbdev_queue_data *q_data, + struct rte_bbdev_fft_op **ops, uint16_t num) +{ + int32_t aq_avail = acc200_aq_avail(q_data, num); + if (unlikely((aq_avail <= 0) || (num == 0))) + return 0; + struct acc200_queue *q = q_data->queue_private; + int32_t avail = acc200_ring_avail_enq(q); + uint16_t i; + union acc200_dma_desc *desc; + int ret; + for (i = 0; i < num; ++i) { + /* Check if there are available space for further processing */ + if (unlikely(avail < 1)) + break; + avail -= 1; + ret = enqueue_fft_one_op(q, ops[i], i); + if (ret < 0) + break; + } + + if (unlikely(i == 0)) + return 0; /* Nothing to enqueue */ + + /* Set SDone in last CB in enqueued ops for CB mode*/ + desc = q->ring_addr + ((q->sw_ring_head + i - 1) + & q->sw_ring_wrap_mask); + + desc->req.sdone_enable = 1; + desc->req.irq_enable = q->irq_enable; + acc200_dma_enqueue(q, i, &q_data->queue_stats); + + /* Update stats */ + q_data->queue_stats.enqueued_count += i; + q_data->queue_stats.enqueue_err_count += num - i; + return i; +} + + +/* Dequeue one FFT operations from ACC200 device */ +static inline int +dequeue_fft_one_op(struct rte_bbdev_queue_data *q_data, + struct acc200_queue *q, struct rte_bbdev_fft_op **ref_op, + uint16_t dequeued_cbs, uint32_t *aq_dequeued) +{ + union acc200_dma_desc *desc, atom_desc; + union acc200_dma_rsp_desc rsp; + struct rte_bbdev_fft_op *op; + + desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs) + & q->sw_ring_wrap_mask); + atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc, + __ATOMIC_RELAXED); + + /* Check fdone bit */ + if (!(atom_desc.rsp.val & ACC200_FDONE)) + return -1; + + rsp.val = atom_desc.rsp.val; +#ifdef RTE_LIBRTE_BBDEV_DEBUG + rte_memdump(stderr, "Resp", &desc->rsp.val, + sizeof(desc->rsp.val)); +#endif + /* Dequeue */ + op = desc->req.op_addr; + + /* Clearing status, it will be set based on response */ + op->status = 0; + op->status |= rsp.input_err << RTE_BBDEV_DATA_ERROR; + op->status |= rsp.dma_err << RTE_BBDEV_DRV_ERROR; + op->status |= rsp.fcw_err << RTE_BBDEV_DRV_ERROR; + if (op->status != 0) + q_data->queue_stats.dequeue_err_count++; + + /* Check if this is the last desc in batch (Atomic Queue) */ + if (desc->req.last_desc_in_batch) { + (*aq_dequeued)++; + desc->req.last_desc_in_batch = 0; + } + desc->rsp.val = ACC200_DMA_DESC_TYPE; + desc->rsp.add_info_0 = 0; + *ref_op = op; + /* One CB (op) was successfully dequeued */ + return 1; +} + + +/* Dequeue FFT operations from ACC200 device. 
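The response unpacking in dequeue_fft_one_op() above folds the three hardware error flags into the generic status bitmask. A hedged consumer-side sketch of interpreting a completed FFT operation:

    #include <rte_bbdev_op.h>

    // The driver sets RTE_BBDEV_DATA_ERROR for input errors and
    // RTE_BBDEV_DRV_ERROR for DMA or FCW errors.
    static int
    check_fft_result(const struct rte_bbdev_fft_op *op)
    {
            if (op->status & (1 << RTE_BBDEV_DATA_ERROR))
                    return -1; // bad input samples
            if (op->status & (1 << RTE_BBDEV_DRV_ERROR))
                    return -2; // descriptor or engine failure
            return 0;
    }

+/* Dequeue FFT operations from ACC200 device. 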
*/ +static uint16_t +acc200_dequeue_fft(struct rte_bbdev_queue_data *q_data, + struct rte_bbdev_fft_op **ops, uint16_t num) +{ + struct acc200_queue *q = q_data->queue_private; + uint16_t dequeue_num, i, dequeued_cbs = 0; + uint32_t avail = acc200_ring_avail_deq(q); + uint32_t aq_dequeued = 0; + int ret; + +#ifdef RTE_LIBRTE_BBDEV_DEBUG + if (unlikely(ops == 0 && q == NULL)) + return 0; +#endif + + dequeue_num = RTE_MIN(avail, num); + + for (i = 0; i < dequeue_num; ++i) { + ret = dequeue_fft_one_op( + q_data, q, &ops[i], dequeued_cbs, + &aq_dequeued); + if (ret <= 0) + break; + dequeued_cbs += ret; + } + + q->aq_dequeued += aq_dequeued; + q->sw_ring_tail += dequeued_cbs; + /* Update enqueue stats */ + q_data->queue_stats.dequeued_count += i; + return i; +} + /* Initialization Function */ static void acc200_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv) @@ -4161,6 +4427,8 @@ static inline uint32_t hq_index(uint32_t offset) dev->enqueue_ldpc_dec_ops = acc200_enqueue_ldpc_dec; dev->dequeue_ldpc_enc_ops = acc200_dequeue_ldpc_enc; dev->dequeue_ldpc_dec_ops = acc200_dequeue_ldpc_dec; + dev->enqueue_fft_ops = acc200_enqueue_fft; + dev->dequeue_fft_ops = acc200_dequeue_fft; ((struct acc200_device *) dev->data->dev_private)->pf_device = !strcmp(drv->driver.name, From patchwork Fri Jul 8 00:01:41 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Chautru, Nicolas" X-Patchwork-Id: 113821 X-Patchwork-Delegate: gakhil@marvell.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id A47DBA0543; Fri, 8 Jul 2022 02:17:19 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 04DFD42B7D; Fri, 8 Jul 2022 02:16:17 +0200 (CEST) Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by mails.dpdk.org (Postfix) with ESMTP id C948640041 for ; Fri, 8 Jul 2022 02:16:07 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1657239367; x=1688775367; h=from:to:cc:subject:date:message-id:in-reply-to: references; bh=KT6Wc1LkSTCw3drjzga5uSrEnznJC5RNSkveCj1Gxp4=; b=lvOm7ujY7F8K5ng4011Fr8peUroiehhQrAxpFUDjgaALLcSCwPfqOg/6 B0QiAkS7HLuhbqSVTDLJzu7j6yqoP1rpkVujkIHHI/dVbGtRTO8eVC3dN QOgGWk7+NYc8JInn/T8X/bMKua4p3tHSvbMf/IC3IgNZEVaCgjotXDAYy IkKhQkglWTIfB6/CfX9dTytVzFvASrCsVVECVOKwVRlLdQVEpSWdghsFq NfZlPwBGYvy2o9UGOTlqMbQd2AAwW0sNw0dTi4rckUOa7fa2YoRhz8tzC M+XhwcJh5e3X1SUsWPCeVqOVaVA5tv3JrT4YhYWGmD0jCL21+ior/pzoa w==; X-IronPort-AV: E=McAfee;i="6400,9594,10401"; a="264563091" X-IronPort-AV: E=Sophos;i="5.92,253,1650956400"; d="scan'208";a="264563091" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 07 Jul 2022 17:16:05 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.92,253,1650956400"; d="scan'208";a="591387552" Received: from skx-5gnr-sc12-4.sc.intel.com ([172.25.69.210]) by orsmga007.jf.intel.com with ESMTP; 07 Jul 2022 17:16:05 -0700 From: Nicolas Chautru To: dev@dpdk.org, thomas@monjalon.net, gakhil@marvell.com, hemant.agrawal@nxp.com, trix@redhat.com Cc: maxime.coquelin@redhat.com, mdr@ashroe.eu, bruce.richardson@intel.com, david.marchand@redhat.com, stephen@networkplumber.org, Nicolas Chautru Subject: [PATCH v1 08/10] baseband/acc200: support interrupt Date: Thu, 7 Jul 2022 17:01:41 -0700 
Message-Id: <1657238503-143836-9-git-send-email-nicolas.chautru@intel.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1657238503-143836-1-git-send-email-nicolas.chautru@intel.com> References: <1657238503-143836-1-git-send-email-nicolas.chautru@intel.com> X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Add support for capabilities and functions for MSI/MSI-X interrupts and the underlying information ring. Signed-off-by: Nicolas Chautru --- drivers/baseband/acc200/rte_acc200_pmd.c | 370 ++++++++++++++++++++++++++++++- 1 file changed, 368 insertions(+), 2 deletions(-) diff --git a/drivers/baseband/acc200/rte_acc200_pmd.c b/drivers/baseband/acc200/rte_acc200_pmd.c index 36c5561..ecfbc7a 100644 --- a/drivers/baseband/acc200/rte_acc200_pmd.c +++ b/drivers/baseband/acc200/rte_acc200_pmd.c @@ -363,6 +363,217 @@ free_base_addresses(base_addrs, i); } +/* + * Find queue_id of a device queue based on details from the Info Ring. + * If a queue isn't found, UINT16_MAX is returned. + */ +static inline uint16_t +get_queue_id_from_ring_info(struct rte_bbdev_data *data, + const union acc200_info_ring_data ring_data) +{ + uint16_t queue_id; + + for (queue_id = 0; queue_id < data->num_queues; ++queue_id) { + struct acc200_queue *acc200_q = + data->queues[queue_id].queue_private; + if (acc200_q != NULL && acc200_q->aq_id == ring_data.aq_id && + acc200_q->qgrp_id == ring_data.qg_id && + acc200_q->vf_id == ring_data.vf_id) + return queue_id; + } + + return UINT16_MAX; +} + +/* Checks the Info Ring for unexpected events and logs and clears them */ +static inline void +acc200_check_ir(struct acc200_device *acc200_dev) +{ + volatile union acc200_info_ring_data *ring_data; + uint16_t info_ring_head = acc200_dev->info_ring_head; + if (acc200_dev->info_ring == NULL) + return; + + ring_data = acc200_dev->info_ring + (acc200_dev->info_ring_head & + ACC200_INFO_RING_MASK); + + while (ring_data->valid) { + if ((ring_data->int_nb < ACC200_PF_INT_DMA_DL_DESC_IRQ) || ( + ring_data->int_nb > + ACC200_PF_INT_DMA_DL5G_DESC_IRQ)) { + rte_bbdev_log(WARNING, "InfoRing: ITR:%d Info:0x%x", + ring_data->int_nb, ring_data->detailed_info); + /* Initialize Info Ring entry and move forward */ + ring_data->val = 0; + } + info_ring_head++; + ring_data = acc200_dev->info_ring + + (info_ring_head & ACC200_INFO_RING_MASK); + } +} + +/* Checks PF Info Ring to find the interrupt cause and handles it accordingly */ +static inline void +acc200_pf_interrupt_handler(struct rte_bbdev *dev) +{ + struct acc200_device *acc200_dev = dev->data->dev_private; + volatile union acc200_info_ring_data *ring_data; + struct acc200_deq_intr_details deq_intr_det; + + ring_data = acc200_dev->info_ring + (acc200_dev->info_ring_head & + ACC200_INFO_RING_MASK); + + while (ring_data->valid) { + + rte_bbdev_log_debug( + "ACC200 PF Interrupt received, Info Ring data: 0x%x -> %d", + ring_data->val, ring_data->int_nb); + + switch (ring_data->int_nb) { + case ACC200_PF_INT_DMA_DL_DESC_IRQ: + case ACC200_PF_INT_DMA_UL_DESC_IRQ: + case ACC200_PF_INT_DMA_FFT_DESC_IRQ: + case ACC200_PF_INT_DMA_UL5G_DESC_IRQ: + case ACC200_PF_INT_DMA_DL5G_DESC_IRQ: + deq_intr_det.queue_id = get_queue_id_from_ring_info( + dev->data, *ring_data); + if (deq_intr_det.queue_id == UINT16_MAX) { + rte_bbdev_log(ERR, + "Couldn't find queue: aq_id: %u, qg_id: %u, vf_id: %u", + ring_data->aq_id, + ring_data->qg_id, + ring_data->vf_id); + 
return; + } + rte_bbdev_pmd_callback_process(dev, + RTE_BBDEV_EVENT_DEQUEUE, &deq_intr_det); + break; + default: + rte_bbdev_pmd_callback_process(dev, + RTE_BBDEV_EVENT_ERROR, NULL); + break; + } + + /* Initialize Info Ring entry and move forward */ + ring_data->val = 0; + ++acc200_dev->info_ring_head; + ring_data = acc200_dev->info_ring + + (acc200_dev->info_ring_head & + ACC200_INFO_RING_MASK); + } +} + +/* Checks VF Info Ring to find the interrupt cause and handles it accordingly */ +static inline void +acc200_vf_interrupt_handler(struct rte_bbdev *dev) +{ + struct acc200_device *acc200_dev = dev->data->dev_private; + volatile union acc200_info_ring_data *ring_data; + struct acc200_deq_intr_details deq_intr_det; + + ring_data = acc200_dev->info_ring + (acc200_dev->info_ring_head & + ACC200_INFO_RING_MASK); + + while (ring_data->valid) { + + rte_bbdev_log_debug( + "ACC200 VF Interrupt received, Info Ring data: 0x%x\n", + ring_data->val); + + switch (ring_data->int_nb) { + case ACC200_VF_INT_DMA_DL_DESC_IRQ: + case ACC200_VF_INT_DMA_UL_DESC_IRQ: + case ACC200_VF_INT_DMA_FFT_DESC_IRQ: + case ACC200_VF_INT_DMA_UL5G_DESC_IRQ: + case ACC200_VF_INT_DMA_DL5G_DESC_IRQ: + /* VFs are not aware of their vf_id - it's set to 0 in + * queue structures. + */ + ring_data->vf_id = 0; + deq_intr_det.queue_id = get_queue_id_from_ring_info( + dev->data, *ring_data); + if (deq_intr_det.queue_id == UINT16_MAX) { + rte_bbdev_log(ERR, + "Couldn't find queue: aq_id: %u, qg_id: %u", + ring_data->aq_id, + ring_data->qg_id); + return; + } + rte_bbdev_pmd_callback_process(dev, + RTE_BBDEV_EVENT_DEQUEUE, &deq_intr_det); + break; + default: + rte_bbdev_pmd_callback_process(dev, + RTE_BBDEV_EVENT_ERROR, NULL); + break; + } + + /* Initialize Info Ring entry and move forward */ + ring_data->valid = 0; + ++acc200_dev->info_ring_head; + ring_data = acc200_dev->info_ring + (acc200_dev->info_ring_head + & ACC200_INFO_RING_MASK); + } +} + +/* Interrupt handler triggered by ACC200 dev for handling specific interrupt */ +static void +acc200_dev_interrupt_handler(void *cb_arg) +{ + struct rte_bbdev *dev = cb_arg; + struct acc200_device *acc200_dev = dev->data->dev_private; + + /* Read info ring */ + if (acc200_dev->pf_device) + acc200_pf_interrupt_handler(dev); + else + acc200_vf_interrupt_handler(dev); +} + +/* Allocate and setup inforing */ +static int +allocate_info_ring(struct rte_bbdev *dev) +{ + struct acc200_device *d = dev->data->dev_private; + const struct acc200_registry_addr *reg_addr; + rte_iova_t info_ring_iova; + uint32_t phys_low, phys_high; + + if (d->info_ring != NULL) + return 0; /* Already configured */ + + /* Choose correct registry addresses for the device type */ + if (d->pf_device) + reg_addr = &pf_reg_addr; + else + reg_addr = &vf_reg_addr; + /* Allocate InfoRing */ + if (d->info_ring == NULL) + d->info_ring = rte_zmalloc_socket("Info Ring", + ACC200_INFO_RING_NUM_ENTRIES * + sizeof(*d->info_ring), RTE_CACHE_LINE_SIZE, + dev->data->socket_id); + if (d->info_ring == NULL) { + rte_bbdev_log(ERR, + "Failed to allocate Info Ring for %s:%u", + dev->device->driver->name, + dev->data->dev_id); + return -ENOMEM; + } + info_ring_iova = rte_malloc_virt2iova(d->info_ring); + + /* Setup Info Ring */ + phys_high = (uint32_t)(info_ring_iova >> 32); + phys_low = (uint32_t)(info_ring_iova); + acc200_reg_write(d, reg_addr->info_ring_hi, phys_high); + acc200_reg_write(d, reg_addr->info_ring_lo, phys_low); + acc200_reg_write(d, reg_addr->info_ring_en, ACC200_REG_IRQ_EN_ALL); + d->info_ring_head = (acc200_reg_read(d, 
reg_addr->info_ring_ptr) & + 0xFFF) / sizeof(union acc200_info_ring_data); + return 0; +} + + /* Allocate 64MB memory used for all software rings */ static int acc200_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id) @@ -370,6 +581,7 @@ uint32_t phys_low, phys_high, value; struct acc200_device *d = dev->data->dev_private; const struct acc200_registry_addr *reg_addr; + int ret; if (d->pf_device && !d->acc200_conf.pf_mode_en) { rte_bbdev_log(NOTICE, @@ -470,6 +682,14 @@ acc200_reg_write(d, reg_addr->tail_ptrs_fft_hi, phys_high); acc200_reg_write(d, reg_addr->tail_ptrs_fft_lo, phys_low); + ret = allocate_info_ring(dev); + if (ret < 0) { + rte_bbdev_log(ERR, "Failed to allocate info_ring for %s:%u", + dev->device->driver->name, + dev->data->dev_id); + /* Continue */ + } + if (d->harq_layout == NULL) d->harq_layout = rte_zmalloc_socket("HARQ Layout", ACC200_HARQ_LAYOUT * sizeof(*d->harq_layout), @@ -492,17 +712,121 @@ return 0; } +static int +acc200_intr_enable(struct rte_bbdev *dev) +{ + int ret; + struct acc200_device *d = dev->data->dev_private; + /* + * MSI/MSI-X are supported + * Option controlled by vfio-intr through EAL parameter + */ + if (rte_intr_type_get(dev->intr_handle) == RTE_INTR_HANDLE_VFIO_MSI) { + + ret = allocate_info_ring(dev); + if (ret < 0) { + rte_bbdev_log(ERR, + "Couldn't allocate info ring for device: %s", + dev->data->name); + return ret; + } + ret = rte_intr_enable(dev->intr_handle); + if (ret < 0) { + rte_bbdev_log(ERR, + "Couldn't enable interrupts for device: %s", + dev->data->name); + rte_free(d->info_ring); + return ret; + } + ret = rte_intr_callback_register(dev->intr_handle, + acc200_dev_interrupt_handler, dev); + if (ret < 0) { + rte_bbdev_log(ERR, + "Couldn't register interrupt callback for device: %s", + dev->data->name); + rte_free(d->info_ring); + return ret; + } + + return 0; + } else if (rte_intr_type_get(dev->intr_handle) == RTE_INTR_HANDLE_VFIO_MSIX) { + + ret = allocate_info_ring(dev); + if (ret < 0) { + rte_bbdev_log(ERR, + "Couldn't allocate info ring for device: %s", + dev->data->name); + return ret; + } + + int i, max_queues; + struct acc200_device *acc200_dev = dev->data->dev_private; + + if (acc200_dev->pf_device) + max_queues = ACC200_MAX_PF_MSIX; + else + max_queues = ACC200_MAX_VF_MSIX; + + if (rte_intr_efd_enable(dev->intr_handle, max_queues)) { + rte_bbdev_log(ERR, "Failed to create fds for %u queues", + dev->data->num_queues); + return -1; + } + + for (i = 0; i < max_queues; ++i) { + if (rte_intr_efds_index_set(dev->intr_handle, i, + rte_intr_fd_get(dev->intr_handle))) + return -rte_errno; + } + + if (rte_intr_vec_list_alloc(dev->intr_handle, "intr_vec", + dev->data->num_queues)) { + rte_bbdev_log(ERR, "Failed to allocate %u vectors", + dev->data->num_queues); + return -ENOMEM; + } + + ret = rte_intr_enable(dev->intr_handle); + + if (ret < 0) { + rte_bbdev_log(ERR, + "Couldn't enable interrupts for device: %s", + dev->data->name); + rte_free(d->info_ring); + return ret; + } + ret = rte_intr_callback_register(dev->intr_handle, + acc200_dev_interrupt_handler, dev); + if (ret < 0) { + rte_bbdev_log(ERR, + "Couldn't register interrupt callback for device: %s", + dev->data->name); + rte_free(d->info_ring); + return ret; + } + + return 0; + } + + rte_bbdev_log(ERR, "ACC200 (%s) supports only VFIO MSI/MSI-X interrupts\n", + dev->data->name); + return -ENOTSUP; +} + /* Free memory used for software rings */ static int acc200_dev_close(struct rte_bbdev *dev) { struct acc200_device *d = dev->data->dev_private; + 
acc200_check_ir(d); if (d->sw_rings_base != NULL) { rte_free(d->tail_ptrs); + rte_free(d->info_ring); rte_free(d->sw_rings_base); rte_free(d->harq_layout); d->sw_rings_base = NULL; d->tail_ptrs = NULL; + d->info_ring = NULL; d->harq_layout = NULL; } /* Ensure all in flight HW transactions are completed */ @@ -795,6 +1119,7 @@ RTE_BBDEV_TURBO_CONTINUE_CRC_MATCH | RTE_BBDEV_TURBO_SOFT_OUTPUT | RTE_BBDEV_TURBO_EARLY_TERMINATION | + RTE_BBDEV_TURBO_DEC_INTERRUPTS | RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN | RTE_BBDEV_TURBO_NEG_LLR_1_BIT_SOFT_OUT | RTE_BBDEV_TURBO_MAP_DEC | @@ -816,6 +1141,7 @@ RTE_BBDEV_TURBO_CRC_24B_ATTACH | RTE_BBDEV_TURBO_RV_INDEX_BYPASS | RTE_BBDEV_TURBO_RATE_MATCH | + RTE_BBDEV_TURBO_ENC_INTERRUPTS | RTE_BBDEV_TURBO_ENC_SCATTER_GATHER, .num_buffers_src = RTE_BBDEV_TURBO_MAX_CODE_BLOCKS, @@ -829,7 +1155,8 @@ .capability_flags = RTE_BBDEV_LDPC_RATE_MATCH | RTE_BBDEV_LDPC_CRC_24B_ATTACH | - RTE_BBDEV_LDPC_INTERLEAVER_BYPASS, + RTE_BBDEV_LDPC_INTERLEAVER_BYPASS | + RTE_BBDEV_LDPC_ENC_INTERRUPTS, .num_buffers_src = RTE_BBDEV_LDPC_MAX_CODE_BLOCKS, .num_buffers_dst = @@ -850,7 +1177,8 @@ RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS | RTE_BBDEV_LDPC_DEC_SCATTER_GATHER | RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION | - RTE_BBDEV_LDPC_LLR_COMPRESSION, + RTE_BBDEV_LDPC_LLR_COMPRESSION | + RTE_BBDEV_LDPC_DEC_INTERRUPTS, .llr_size = 8, .llr_decimals = 1, .num_buffers_src = @@ -918,15 +1246,46 @@ dev_info->min_alignment = 1; dev_info->capabilities = bbdev_capabilities; dev_info->harq_buffer_size = 0; + + acc200_check_ir(d); +} + +static int +acc200_queue_intr_enable(struct rte_bbdev *dev, uint16_t queue_id) +{ + struct acc200_queue *q = dev->data->queues[queue_id].queue_private; + + if (rte_intr_type_get(dev->intr_handle) != RTE_INTR_HANDLE_VFIO_MSI && + rte_intr_type_get(dev->intr_handle) != RTE_INTR_HANDLE_VFIO_MSIX) + return -ENOTSUP; + + q->irq_enable = 1; + return 0; +} + +static int +acc200_queue_intr_disable(struct rte_bbdev *dev, uint16_t queue_id) +{ + struct acc200_queue *q = dev->data->queues[queue_id].queue_private; + + if (rte_intr_type_get(dev->intr_handle) != RTE_INTR_HANDLE_VFIO_MSI && + rte_intr_type_get(dev->intr_handle) != RTE_INTR_HANDLE_VFIO_MSIX) + return -ENOTSUP; + + q->irq_enable = 0; + return 0; } static const struct rte_bbdev_ops acc200_bbdev_ops = { .setup_queues = acc200_setup_queues, + .intr_enable = acc200_intr_enable, .close = acc200_dev_close, .info_get = acc200_dev_info_get, .queue_setup = acc200_queue_setup, .queue_release = acc200_queue_release, .queue_stop = acc200_queue_stop, + .queue_intr_enable = acc200_queue_intr_enable, + .queue_intr_disable = acc200_queue_intr_disable }; /* ACC200 PCI PF address map */ @@ -3821,6 +4180,7 @@ static inline uint32_t hq_index(uint32_t offset) if (op->status != 0) { /* These errors are not expected */ q_data->queue_stats.dequeue_err_count++; + acc200_check_ir(q->d); } /* CRC invalid if error exists */ @@ -3890,6 +4250,9 @@ static inline uint32_t hq_index(uint32_t offset) op->ldpc_dec.iter_count = (uint8_t) rsp.iter_cnt; + if (op->status & (1 << RTE_BBDEV_DRV_ERROR)) + acc200_check_ir(q->d); + /* Check if this is the last desc in batch (Atomic Queue) */ if (desc->req.last_desc_in_batch) { (*aq_dequeued)++; @@ -4365,6 +4728,9 @@ static inline uint32_t hq_index(uint32_t offset) if (op->status != 0) q_data->queue_stats.dequeue_err_count++; + if (op->status & (1 << RTE_BBDEV_DRV_ERROR)) + acc200_check_ir(q->d); + /* Check if this is the last desc in batch (Atomic Queue) */ if (desc->req.last_desc_in_batch) { (*aq_dequeued)++;
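For context before the next patch: together with the acc200_intr_enable() hook above, the new queue_intr_enable/queue_intr_disable ops let an application move a queue from polling to interrupt-driven dequeue through the generic bbdev API, provided the device is bound with the VFIO MSI or MSI-X interrupt mode this PMD requires. A minimal sketch, not part of the patch: the helper names are hypothetical, and it assumes (as bbdev-test does) that queue_id is the leading field of the details structure delivered as ret_param.

#include <stdio.h>
#include <rte_common.h>
#include <rte_bbdev.h>

/* Sketch: consume the RTE_BBDEV_EVENT_DEQUEUE events that
 * acc200_dev_interrupt_handler() raises via rte_bbdev_pmd_callback_process().
 */
static void
dequeue_event_cb(uint16_t dev_id, enum rte_bbdev_event_type event,
		void *cb_arg, void *ret_param)
{
	/* The PMD passes the resolved queue details; queue_id comes first. */
	uint16_t queue_id = *(uint16_t *) ret_param;

	RTE_SET_USED(cb_arg);
	if (event == RTE_BBDEV_EVENT_DEQUEUE)
		printf("Dequeue ready on dev %u queue %u\n", dev_id, queue_id);
}

/* Sketch: register for dequeue events and arm interrupts on one queue. */
static int
enable_dequeue_interrupts(uint16_t dev_id, uint16_t queue_id)
{
	int ret = rte_bbdev_callback_register(dev_id, RTE_BBDEV_EVENT_DEQUEUE,
			dequeue_event_cb, NULL);
	if (ret < 0)
		return ret;
	return rte_bbdev_queue_intr_enable(dev_id, queue_id);
}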
From patchwork Fri Jul 8 00:01:42 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Chautru, Nicolas" X-Patchwork-Id: 113820 X-Patchwork-Delegate: gakhil@marvell.com From: Nicolas Chautru To: dev@dpdk.org, thomas@monjalon.net, gakhil@marvell.com, hemant.agrawal@nxp.com, trix@redhat.com Cc: maxime.coquelin@redhat.com, mdr@ashroe.eu, bruce.richardson@intel.com, david.marchand@redhat.com, stephen@networkplumber.org, Nicolas Chautru Subject: [PATCH v1 09/10] baseband/acc200: add device status and vf2pf comms Date: Thu, 7 Jul 2022 17:01:42 -0700 Message-Id: <1657238503-143836-10-git-send-email-nicolas.chautru@intel.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1657238503-143836-1-git-send-email-nicolas.chautru@intel.com> References: <1657238503-143836-1-git-send-email-nicolas.chautru@intel.com> X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions Errors-To: dev-bounces@dpdk.org Add support to expose the device status seen from the host through vf2pf mailbox communication.
Signed-off-by: Nicolas Chautru --- drivers/baseband/acc200/rte_acc200_pmd.c | 28 ++++++++++++++++++++++++++++ 1 file changed, 28 insertions(+) diff --git a/drivers/baseband/acc200/rte_acc200_pmd.c b/drivers/baseband/acc200/rte_acc200_pmd.c index ecfbc7a..856ea1c 100644 --- a/drivers/baseband/acc200/rte_acc200_pmd.c +++ b/drivers/baseband/acc200/rte_acc200_pmd.c @@ -262,6 +262,31 @@ acc200_conf->q_fft.aq_depth_log2); } +static inline void +acc200_vf2pf(struct acc200_device *d, unsigned int payload) +{ + acc200_reg_write(d, HWVfHiVfToPfDbellVf, payload); +} + +/* Request device status information */ +static inline uint32_t +acc200_device_status(struct rte_bbdev *dev) +{ + struct acc200_device *d = dev->data->dev_private; + uint32_t reg, time_out = 0; + + if (d->pf_device) + return RTE_BBDEV_DEV_NOT_SUPPORTED; + acc200_vf2pf(d, ACC200_VF2PF_STATUS_REQUEST); + reg = acc200_reg_read(d, HWVfHiPfToVfDbellVf); + while ((time_out < ACC200_STATUS_TO) && (reg == RTE_BBDEV_DEV_NOSTATUS)) { + usleep(ACC200_STATUS_WAIT); /* Wait for VF->PF->VF Comms */ + reg = acc200_reg_read(d, HWVfHiPfToVfDbellVf); + time_out++; + } + return reg; +} + static void free_base_addresses(void **base_addrs, int size) { @@ -704,6 +729,7 @@ /* Mark as configured properly */ d->configured = true; + acc200_vf2pf(d, ACC200_VF2PF_USING_VF); rte_bbdev_log_debug( "ACC200 (%s) configured sw_rings = %p, sw_rings_iova = %#" @@ -1214,6 +1240,8 @@ /* Read and save the populated config from ACC200 registers */ fetch_acc200_config(dev); + /* Check the device status */ + dev_info->device_status = acc200_device_status(dev); /* Exposed number of queues */ dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0;
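To show how the status retrieved over the doorbell surfaces to an application, a short sketch (not part of the patch) that reads it back through the public API; it assumes the device_status field and the rte_bbdev_device_status_str() helper from the companion bbdev API change in this series.

#include <stdio.h>
#include <rte_bbdev.h>

/* Sketch: print the device status that the VF PMD fetches over the
 * vf2pf doorbell in acc200_device_status() above.
 */
static void
print_device_status(uint16_t dev_id)
{
	struct rte_bbdev_info info;

	rte_bbdev_info_get(dev_id, &info);
	printf("bbdev %u (%s) status: %s\n", dev_id, info.dev_name,
			rte_bbdev_device_status_str(info.drv.device_status));
}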
From patchwork Fri Jul 8 00:01:43 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Chautru, Nicolas" X-Patchwork-Id: 113822 X-Patchwork-Delegate: gakhil@marvell.com From: Nicolas Chautru To: dev@dpdk.org, thomas@monjalon.net, gakhil@marvell.com, hemant.agrawal@nxp.com, trix@redhat.com Cc: maxime.coquelin@redhat.com, mdr@ashroe.eu, bruce.richardson@intel.com, david.marchand@redhat.com, stephen@networkplumber.org, Nicolas Chautru Subject: [PATCH v1 10/10] baseband/acc200: add PF configure companion function Date: Thu, 7 Jul 2022 17:01:43 -0700 Message-Id: <1657238503-143836-11-git-send-email-nicolas.chautru@intel.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1657238503-143836-1-git-send-email-nicolas.chautru@intel.com> References: <1657238503-143836-1-git-send-email-nicolas.chautru@intel.com> X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions Errors-To: dev-bounces@dpdk.org Add a configure companion function, notably so the device can be configured from the PF within DPDK and from bbdev-test (without any external dependency). Signed-off-by: Nicolas Chautru --- app/test-bbdev/meson.build | 3 + app/test-bbdev/test_bbdev_perf.c | 76 +++++ drivers/baseband/acc200/meson.build | 2 + drivers/baseband/acc200/rte_acc200_cfg.h | 21 ++ drivers/baseband/acc200/rte_acc200_pmd.c | 466 +++++++++++++++++++++++++++++++ drivers/baseband/acc200/version.map | 7 + 6 files changed, 575 insertions(+) diff --git a/app/test-bbdev/meson.build b/app/test-bbdev/meson.build index 76d4c26..1ffaa54 100644 --- a/app/test-bbdev/meson.build +++ b/app/test-bbdev/meson.build @@ -23,6 +23,9 @@ endif if dpdk_conf.has('RTE_BASEBAND_ACC100') deps += ['baseband_acc100'] endif +if dpdk_conf.has('RTE_BASEBAND_ACC200') + deps += ['baseband_acc200'] +endif if dpdk_conf.has('RTE_LIBRTE_PMD_BBDEV_LA12XX') deps += ['baseband_la12xx'] endif diff --git a/app/test-bbdev/test_bbdev_perf.c b/app/test-bbdev/test_bbdev_perf.c index 653b21f..69a505d 100644 --- a/app/test-bbdev/test_bbdev_perf.c +++ b/app/test-bbdev/test_bbdev_perf.c @@ -64,6 +64,18 @@ #define ACC100_QOS_GBR 0 #endif +#ifdef RTE_BASEBAND_ACC200 +#include +#define ACC200PF_DRIVER_NAME ("intel_acc200_pf") +#define ACC200VF_DRIVER_NAME ("intel_acc200_vf") +#define ACC200_QMGR_NUM_AQS 16 +#define ACC200_QMGR_NUM_QGS 2 +#define ACC200_QMGR_AQ_DEPTH 5 +#define ACC200_QMGR_INVALID_IDX -1 +#define ACC200_QMGR_RR 1 +#define ACC200_QOS_GBR 0 +#endif + #define OPS_CACHE_SIZE 256U #define OPS_POOL_SIZE_MIN 511U /* 0.5K per queue */ @@ -762,6 +774,70 @@ typedef int (test_case_function)(struct active_device *ad, info->dev_name); } #endif +#ifdef RTE_BASEBAND_ACC200 + if ((get_init_device() == true) && + (!strcmp(info->drv.driver_name, ACC200PF_DRIVER_NAME))) { + struct rte_acc200_conf conf; + unsigned int i; + + printf("Configure ACC200 FEC Driver %s with default values\n", + info->drv.driver_name); + + /* clear default configuration before initialization */ + memset(&conf, 0, sizeof(struct rte_acc200_conf)); + + /* Always set in PF mode for built-in configuration */ + conf.pf_mode_en = true; + for (i = 0; i < RTE_ACC200_NUM_VFS; ++i) { + conf.arb_dl_4g[i].gbr_threshold1 = ACC200_QOS_GBR; + conf.arb_dl_4g[i].gbr_threshold2 = ACC200_QOS_GBR; + conf.arb_dl_4g[i].round_robin_weight = ACC200_QMGR_RR; + conf.arb_ul_4g[i].gbr_threshold1 = ACC200_QOS_GBR; + conf.arb_ul_4g[i].gbr_threshold2 = ACC200_QOS_GBR; + conf.arb_ul_4g[i].round_robin_weight = ACC200_QMGR_RR; + conf.arb_dl_5g[i].gbr_threshold1 = ACC200_QOS_GBR; + conf.arb_dl_5g[i].gbr_threshold2
= ACC200_QOS_GBR; + conf.arb_dl_5g[i].round_robin_weight = ACC200_QMGR_RR; + conf.arb_ul_5g[i].gbr_threshold1 = ACC200_QOS_GBR; + conf.arb_ul_5g[i].gbr_threshold2 = ACC200_QOS_GBR; + conf.arb_ul_5g[i].round_robin_weight = ACC200_QMGR_RR; + conf.arb_fft[i].gbr_threshold1 = ACC200_QOS_GBR; + conf.arb_fft[i].gbr_threshold2 = ACC200_QOS_GBR; + conf.arb_fft[i].round_robin_weight = ACC200_QMGR_RR; + } + + conf.input_pos_llr_1_bit = true; + conf.output_pos_llr_1_bit = true; + conf.num_vf_bundles = 1; /* Number of VF bundles to set up */ + + conf.q_ul_4g.num_qgroups = ACC200_QMGR_NUM_QGS; + conf.q_ul_4g.first_qgroup_index = ACC200_QMGR_INVALID_IDX; + conf.q_ul_4g.num_aqs_per_groups = ACC200_QMGR_NUM_AQS; + conf.q_ul_4g.aq_depth_log2 = ACC200_QMGR_AQ_DEPTH; + conf.q_dl_4g.num_qgroups = ACC200_QMGR_NUM_QGS; + conf.q_dl_4g.first_qgroup_index = ACC200_QMGR_INVALID_IDX; + conf.q_dl_4g.num_aqs_per_groups = ACC200_QMGR_NUM_AQS; + conf.q_dl_4g.aq_depth_log2 = ACC200_QMGR_AQ_DEPTH; + conf.q_ul_5g.num_qgroups = ACC200_QMGR_NUM_QGS; + conf.q_ul_5g.first_qgroup_index = ACC200_QMGR_INVALID_IDX; + conf.q_ul_5g.num_aqs_per_groups = ACC200_QMGR_NUM_AQS; + conf.q_ul_5g.aq_depth_log2 = ACC200_QMGR_AQ_DEPTH; + conf.q_dl_5g.num_qgroups = ACC200_QMGR_NUM_QGS; + conf.q_dl_5g.first_qgroup_index = ACC200_QMGR_INVALID_IDX; + conf.q_dl_5g.num_aqs_per_groups = ACC200_QMGR_NUM_AQS; + conf.q_dl_5g.aq_depth_log2 = ACC200_QMGR_AQ_DEPTH; + conf.q_fft.num_qgroups = ACC200_QMGR_NUM_QGS; + conf.q_fft.first_qgroup_index = ACC200_QMGR_INVALID_IDX; + conf.q_fft.num_aqs_per_groups = ACC200_QMGR_NUM_AQS; + conf.q_fft.aq_depth_log2 = ACC200_QMGR_AQ_DEPTH; + + /* setup PF with configuration information */ + ret = rte_acc200_configure(info->dev_name, &conf); + TEST_ASSERT_SUCCESS(ret, + "Failed to configure ACC200 PF for bbdev %s", + info->dev_name); + } +#endif /* Let's refresh this now this is configured */ rte_bbdev_info_get(dev_id, info); nb_queues = RTE_MIN(rte_lcore_count(), info->drv.max_num_queues); diff --git a/drivers/baseband/acc200/meson.build b/drivers/baseband/acc200/meson.build index 7b47bc6..33b3e5e 100644 --- a/drivers/baseband/acc200/meson.build +++ b/drivers/baseband/acc200/meson.build @@ -4,3 +4,5 @@ deps += ['bbdev', 'bus_vdev', 'ring', 'pci', 'bus_pci'] sources = files('rte_acc200_pmd.c') + +headers = files('rte_acc200_cfg.h') diff --git a/drivers/baseband/acc200/rte_acc200_cfg.h b/drivers/baseband/acc200/rte_acc200_cfg.h index fcccfbf..33ea819 100644 --- a/drivers/baseband/acc200/rte_acc200_cfg.h +++ b/drivers/baseband/acc200/rte_acc200_cfg.h @@ -91,4 +91,25 @@ struct rte_acc200_conf { struct rte_acc200_arbitration arb_fft[RTE_ACC200_NUM_VFS]; }; +/** + * Configure an ACC200 device + * + * @param dev_name + * The name of the device. This is the short form of PCI BDF, e.g. 00:01.0. + * It can also be retrieved for a bbdev device from the dev_name field in the + * rte_bbdev_info structure returned by rte_bbdev_info_get(). + * @param conf + * Configuration to apply to ACC200 HW. + * + * @return + * Zero on success, negative value on failure.
+ */ +__rte_experimental +int +rte_acc200_configure(const char *dev_name, struct rte_acc200_conf *conf); + +#ifdef __cplusplus +} +#endif + #endif /* _RTE_ACC200_CFG_H_ */ diff --git a/drivers/baseband/acc200/rte_acc200_pmd.c b/drivers/baseband/acc200/rte_acc200_pmd.c index 856ea1c..c44d729 100644 --- a/drivers/baseband/acc200/rte_acc200_pmd.c +++ b/drivers/baseband/acc200/rte_acc200_pmd.c @@ -85,6 +85,27 @@ enum {UL_4G = 0, UL_5G, DL_4G, DL_5G, FFT, NUM_ACC}; +/* Return the accelerator enum for a Queue Group Index */ +static inline int +accFromQgid(int qg_idx, const struct rte_acc200_conf *acc200_conf) +{ + int accQg[ACC200_NUM_QGRPS]; + int NumQGroupsPerFn[NUM_ACC]; + int acc, qgIdx, qgIndex = 0; + for (qgIdx = 0; qgIdx < ACC200_NUM_QGRPS; qgIdx++) + accQg[qgIdx] = 0; + NumQGroupsPerFn[UL_4G] = acc200_conf->q_ul_4g.num_qgroups; + NumQGroupsPerFn[UL_5G] = acc200_conf->q_ul_5g.num_qgroups; + NumQGroupsPerFn[DL_4G] = acc200_conf->q_dl_4g.num_qgroups; + NumQGroupsPerFn[DL_5G] = acc200_conf->q_dl_5g.num_qgroups; + NumQGroupsPerFn[FFT] = acc200_conf->q_fft.num_qgroups; + for (acc = UL_4G; acc < NUM_ACC; acc++) + for (qgIdx = 0; qgIdx < NumQGroupsPerFn[acc]; qgIdx++) + accQg[qgIndex++] = acc; + acc = accQg[qg_idx]; + return acc; +} + /* Return the queue topology for a Queue Group Index */ static inline void qtopFromAcc(struct rte_acc200_queue_topology **qtop, int acc_enum, @@ -117,6 +138,30 @@ *qtop = p_qtop; } +/* Return the AQ depth log2 for a Queue Group Index */ +static inline int +aqDepth(int qg_idx, struct rte_acc200_conf *acc200_conf) +{ + struct rte_acc200_queue_topology *q_top = NULL; + int acc_enum = accFromQgid(qg_idx, acc200_conf); + qtopFromAcc(&q_top, acc_enum, acc200_conf); + if (unlikely(q_top == NULL)) + return 0; + return q_top->aq_depth_log2; +} + +/* Return the number of AQs for a Queue Group Index */ +static inline int +aqNum(int qg_idx, struct rte_acc200_conf *acc200_conf) +{ + struct rte_acc200_queue_topology *q_top = NULL; + int acc_enum = accFromQgid(qg_idx, acc200_conf); + qtopFromAcc(&q_top, acc_enum, acc200_conf); + if (unlikely(q_top == NULL)) + return 0; + return q_top->num_aqs_per_groups; +} + static void initQTop(struct rte_acc200_conf *acc200_conf) { @@ -4935,3 +4980,424 @@ static int acc200_pci_remove(struct rte_pci_device *pci_dev) RTE_PMD_REGISTER_PCI_TABLE(ACC200PF_DRIVER_NAME, pci_id_acc200_pf_map); RTE_PMD_REGISTER_PCI(ACC200VF_DRIVER_NAME, acc200_pci_vf_driver); RTE_PMD_REGISTER_PCI_TABLE(ACC200VF_DRIVER_NAME, pci_id_acc200_vf_map); + +/* Initial configuration of an ACC200 device prior to running configure() */ +int +rte_acc200_configure(const char *dev_name, struct rte_acc200_conf *conf) +{ + rte_bbdev_log(INFO, "rte_acc200_configure"); + uint32_t value, address, status; + int qg_idx, template_idx, vf_idx, acc, i; + struct rte_bbdev *bbdev = rte_bbdev_get_named_dev(dev_name); + + /* Compile time checks */ + RTE_BUILD_BUG_ON(sizeof(struct acc200_dma_req_desc) != 256); + RTE_BUILD_BUG_ON(sizeof(union acc200_dma_desc) != 256); + RTE_BUILD_BUG_ON(sizeof(struct acc200_fcw_td) != 24); + RTE_BUILD_BUG_ON(sizeof(struct acc200_fcw_te) != 32); + + if (bbdev == NULL) { + rte_bbdev_log(ERR, + "Invalid dev_name (%s), or device is not yet initialised", + dev_name); + return -ENODEV; + } + struct acc200_device *d = bbdev->data->dev_private; + + /* Store configuration */ + rte_memcpy(&d->acc200_conf, conf, sizeof(d->acc200_conf)); + + /* Check that we are already out of PG */ + status = acc200_reg_read(d, HWPfHiSectionPowerGatingAck); + if (status > 0) { + if (status !=
ACC200_PG_MASK_0) { + rte_bbdev_log(ERR, "Unexpected status %x %x", + status, ACC200_PG_MASK_0); + return -ENODEV; + } + /* Clock gate sections that will be un-PG */ + acc200_reg_write(d, HWPfHiClkGateHystReg, ACC200_CLK_DIS); + /* Un-PG required sections */ + acc200_reg_write(d, HWPfHiSectionPowerGatingReq, + ACC200_PG_MASK_1); + status = acc200_reg_read(d, HWPfHiSectionPowerGatingAck); + if (status != ACC200_PG_MASK_1) { + rte_bbdev_log(ERR, "Unexpected status %x %x", + status, ACC200_PG_MASK_1); + return -ENODEV; + } + acc200_reg_write(d, HWPfHiSectionPowerGatingReq, + ACC200_PG_MASK_2); + status = acc200_reg_read(d, HWPfHiSectionPowerGatingAck); + if (status != ACC200_PG_MASK_2) { + rte_bbdev_log(ERR, "Unexpected status %x %x", + status, ACC200_PG_MASK_2); + return -ENODEV; + } + acc200_reg_write(d, HWPfHiSectionPowerGatingReq, + ACC200_PG_MASK_3); + status = acc200_reg_read(d, HWPfHiSectionPowerGatingAck); + if (status != ACC200_PG_MASK_3) { + rte_bbdev_log(ERR, "Unexpected status %x %x", + status, ACC200_PG_MASK_3); + return -ENODEV; + } + /* Enable clocks for all sections */ + acc200_reg_write(d, HWPfHiClkGateHystReg, ACC200_CLK_EN); + } + + /* Explicitly releasing AXI as this may be stopped after PF FLR/BME */ + address = HWPfDmaAxiControl; + value = 1; + acc200_reg_write(d, address, value); + + /* Set the fabric mode */ + address = HWPfFabricM2iBufferReg; + value = ACC200_FABRIC_MODE; + acc200_reg_write(d, address, value); + + /* Set default descriptor signature */ + address = HWPfDmaDescriptorSignatuture; + value = 0; + acc200_reg_write(d, address, value); + + /* Enable the Error Detection in DMA */ + value = ACC200_CFG_DMA_ERROR; + address = HWPfDmaErrorDetectionEn; + acc200_reg_write(d, address, value); + + /* AXI Cache configuration */ + value = ACC200_CFG_AXI_CACHE; + address = HWPfDmaAxcacheReg; + acc200_reg_write(d, address, value); + + /* Default DMA Configuration (Qmgr Enabled) */ + address = HWPfDmaConfig0Reg; + value = 0; + acc200_reg_write(d, address, value); + address = HWPfDmaQmanen; + value = 0; + acc200_reg_write(d, address, value); + + /* Default RLIM/ALEN configuration */ + int rlim = 0; + int alen = 1; + int timestamp = 0; + address = HWPfDmaConfig1Reg; + value = (1 << 31) + (rlim << 8) + (timestamp << 6) + alen; + acc200_reg_write(d, address, value); + + /* Default FFT configuration */ + address = HWPfFftConfig0; + value = ACC200_FFT_CFG_0; + acc200_reg_write(d, address, value); + + /* Configure DMA Qmanager addresses */ + address = HWPfDmaQmgrAddrReg; + value = HWPfQmgrEgressQueuesTemplate; + acc200_reg_write(d, address, value); + + /* ===== Qmgr Configuration ===== */ + /* Configuration of the AQueue Depth QMGR_GRP_0_DEPTH_LOG2 for UL */ + int totalQgs = conf->q_ul_4g.num_qgroups + + conf->q_ul_5g.num_qgroups + + conf->q_dl_4g.num_qgroups + + conf->q_dl_5g.num_qgroups + + conf->q_fft.num_qgroups; + for (qg_idx = 0; qg_idx < ACC200_NUM_QGRPS; qg_idx++) { + address = HWPfQmgrDepthLog2Grp + + ACC200_BYTES_IN_WORD * qg_idx; + value = aqDepth(qg_idx, conf); + acc200_reg_write(d, address, value); + address = HWPfQmgrTholdGrp + + ACC200_BYTES_IN_WORD * qg_idx; + value = (1 << 16) + (1 << (aqDepth(qg_idx, conf) - 1)); + acc200_reg_write(d, address, value); + } + + /* Template Priority in incremental order */ + for (template_idx = 0; template_idx < ACC200_NUM_TMPL; + template_idx++) { + address = HWPfQmgrGrpTmplateReg0Indx + ACC200_BYTES_IN_WORD * template_idx; + value = ACC200_TMPL_PRI_0; + acc200_reg_write(d, address, value); + address = HWPfQmgrGrpTmplateReg1Indx + 
ACC200_BYTES_IN_WORD * template_idx; + value = ACC200_TMPL_PRI_1; + acc200_reg_write(d, address, value); + address = HWPfQmgrGrpTmplateReg2indx + ACC200_BYTES_IN_WORD * template_idx; + value = ACC200_TMPL_PRI_2; + acc200_reg_write(d, address, value); + address = HWPfQmgrGrpTmplateReg3Indx + ACC200_BYTES_IN_WORD * template_idx; + value = ACC200_TMPL_PRI_3; + acc200_reg_write(d, address, value); + } + + address = HWPfQmgrGrpPriority; + value = ACC200_CFG_QMGR_HI_P; + acc200_reg_write(d, address, value); + + /* Template Configuration */ + for (template_idx = 0; template_idx < ACC200_NUM_TMPL; + template_idx++) { + value = 0; + address = HWPfQmgrGrpTmplateReg4Indx + + ACC200_BYTES_IN_WORD * template_idx; + acc200_reg_write(d, address, value); + } + /* 4GUL */ + int numQgs = conf->q_ul_4g.num_qgroups; + int numQqsAcc = 0; + value = 0; + for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++) + value |= (1 << qg_idx); + for (template_idx = ACC200_SIG_UL_4G; + template_idx <= ACC200_SIG_UL_4G_LAST; + template_idx++) { + address = HWPfQmgrGrpTmplateReg4Indx + + ACC200_BYTES_IN_WORD * template_idx; + acc200_reg_write(d, address, value); + } + /* 5GUL */ + numQqsAcc += numQgs; + numQgs = conf->q_ul_5g.num_qgroups; + value = 0; + int numEngines = 0; + for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++) + value |= (1 << qg_idx); + for (template_idx = ACC200_SIG_UL_5G; + template_idx <= ACC200_SIG_UL_5G_LAST; + template_idx++) { + /* Check engine power-on status */ + address = HwPfFecUl5gIbDebugReg + + ACC200_ENGINE_OFFSET * template_idx; + status = (acc200_reg_read(d, address) >> 4) & 0x7; + address = HWPfQmgrGrpTmplateReg4Indx + + ACC200_BYTES_IN_WORD * template_idx; + if (status == 1) { + acc200_reg_write(d, address, value); + numEngines++; + } else + acc200_reg_write(d, address, 0); +#if RTE_ACC200_SINGLE_FEC == 1 + value = 0; +#endif + } + printf("Number of 5GUL engines %d\n", numEngines); + /* 4GDL */ + numQqsAcc += numQgs; + numQgs = conf->q_dl_4g.num_qgroups; + value = 0; + for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++) + value |= (1 << qg_idx); + for (template_idx = ACC200_SIG_DL_4G; + template_idx <= ACC200_SIG_DL_4G_LAST; + template_idx++) { + address = HWPfQmgrGrpTmplateReg4Indx + + ACC200_BYTES_IN_WORD * template_idx; + acc200_reg_write(d, address, value); +#if RTE_ACC200_SINGLE_FEC == 1 + value = 0; +#endif + } + /* 5GDL */ + numQqsAcc += numQgs; + numQgs = conf->q_dl_5g.num_qgroups; + value = 0; + for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++) + value |= (1 << qg_idx); + for (template_idx = ACC200_SIG_DL_5G; + template_idx <= ACC200_SIG_DL_5G_LAST; + template_idx++) { + address = HWPfQmgrGrpTmplateReg4Indx + + ACC200_BYTES_IN_WORD * template_idx; + acc200_reg_write(d, address, value); +#if RTE_ACC200_SINGLE_FEC == 1 + value = 0; +#endif + } + /* FFT */ + numQqsAcc += numQgs; + numQgs = conf->q_fft.num_qgroups; + value = 0; + for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++) + value |= (1 << qg_idx); + for (template_idx = ACC200_SIG_FFT; + template_idx <= ACC200_SIG_FFT_LAST; + template_idx++) { + address = HWPfQmgrGrpTmplateReg4Indx + + ACC200_BYTES_IN_WORD * template_idx; + acc200_reg_write(d, address, value); +#if RTE_ACC200_SINGLE_FEC == 1 + value = 0; +#endif + } + + /* Queue Group Function mapping */ + int qman_func_id[8] = {0, 2, 1, 3, 4, 0, 0, 0}; + value = 0; + for (qg_idx = 0; qg_idx < ACC200_NUM_QGRPS_PER_WORD; qg_idx++) { + acc = accFromQgid(qg_idx, conf); + value |= qman_func_id[acc] << (qg_idx * 
4); + } + acc200_reg_write(d, HWPfQmgrGrpFunction0, value); + value = 0; + for (qg_idx = 0; qg_idx < ACC200_NUM_QGRPS_PER_WORD; qg_idx++) { + acc = accFromQgid(qg_idx + ACC200_NUM_QGRPS_PER_WORD, conf); + value |= qman_func_id[acc] << (qg_idx * 4); + } + acc200_reg_write(d, HWPfQmgrGrpFunction1, value); + + /* Configuration of the Arbitration QGroup depth to 1 */ + for (qg_idx = 0; qg_idx < ACC200_NUM_QGRPS; qg_idx++) { + address = HWPfQmgrArbQDepthGrp + + ACC200_BYTES_IN_WORD * qg_idx; + value = 0; + acc200_reg_write(d, address, value); + } + + /* This pointer to ARAM (256kB) is shifted by 2 (4B per register) */ + uint32_t aram_address = 0; + for (qg_idx = 0; qg_idx < totalQgs; qg_idx++) { + for (vf_idx = 0; vf_idx < conf->num_vf_bundles; vf_idx++) { + address = HWPfQmgrVfBaseAddr + vf_idx + * ACC200_BYTES_IN_WORD + qg_idx + * ACC200_BYTES_IN_WORD * 64; + value = aram_address; + acc200_reg_write(d, address, value); + /* Offset ARAM Address for next memory bank + * - increment of 4B + */ + aram_address += aqNum(qg_idx, conf) * + (1 << aqDepth(qg_idx, conf)); + } + } + + if (aram_address > ACC200_WORDS_IN_ARAM_SIZE) { + rte_bbdev_log(ERR, "ARAM Configuration not fitting %d %d\n", + aram_address, ACC200_WORDS_IN_ARAM_SIZE); + return -EINVAL; + } + + /* Performance tuning */ + acc200_reg_write(d, HWPfFabricI2Mdma_weight, 0x0FFF); + acc200_reg_write(d, HWPfDma4gdlIbThld, 0x1f10); + + /* ==== HI Configuration ==== */ + + /* No Info Ring/MSI by default */ + address = HWPfHiInfoRingIntWrEnRegPf; + value = 0; + acc200_reg_write(d, address, value); + address = HWPfHiCfgMsiIntWrEnRegPf; + value = 0xFFFFFFFF; + acc200_reg_write(d, address, value); + /* Prevent Block on Transmit Error */ + address = HWPfHiBlockTransmitOnErrorEn; + value = 0; + acc200_reg_write(d, address, value); + /* Prevents to drop MSI */ + address = HWPfHiMsiDropEnableReg; + value = 0; + acc200_reg_write(d, address, value); + /* Set the PF Mode register */ + address = HWPfHiPfMode; + value = (conf->pf_mode_en) ? 
ACC200_PF_VAL : 0; + acc200_reg_write(d, address, value); + + /* QoS overflow init */ + value = 1; + address = HWPfQosmonAEvalOverflow0; + acc200_reg_write(d, address, value); + address = HWPfQosmonBEvalOverflow0; + acc200_reg_write(d, address, value); + + /* Configure the FFT RAM LUT */ + uint32_t fft_lut[ACC200_FFT_RAM_SIZE] = { + 0x1FFFF, 0x1FFFF, 0x1FFFE, 0x1FFFA, 0x1FFF6, 0x1FFF1, 0x1FFEA, 0x1FFE2, + 0x1FFD9, 0x1FFCE, 0x1FFC2, 0x1FFB5, 0x1FFA7, 0x1FF98, 0x1FF87, 0x1FF75, + 0x1FF62, 0x1FF4E, 0x1FF38, 0x1FF21, 0x1FF09, 0x1FEF0, 0x1FED6, 0x1FEBA, + 0x1FE9D, 0x1FE7F, 0x1FE5F, 0x1FE3F, 0x1FE1D, 0x1FDFA, 0x1FDD5, 0x1FDB0, + 0x1FD89, 0x1FD61, 0x1FD38, 0x1FD0D, 0x1FCE1, 0x1FCB4, 0x1FC86, 0x1FC57, + 0x1FC26, 0x1FBF4, 0x1FBC1, 0x1FB8D, 0x1FB58, 0x1FB21, 0x1FAE9, 0x1FAB0, + 0x1FA75, 0x1FA3A, 0x1F9FD, 0x1F9BF, 0x1F980, 0x1F93F, 0x1F8FD, 0x1F8BA, + 0x1F876, 0x1F831, 0x1F7EA, 0x1F7A3, 0x1F75A, 0x1F70F, 0x1F6C4, 0x1F677, + 0x1F629, 0x1F5DA, 0x1F58A, 0x1F539, 0x1F4E6, 0x1F492, 0x1F43D, 0x1F3E7, + 0x1F38F, 0x1F337, 0x1F2DD, 0x1F281, 0x1F225, 0x1F1C8, 0x1F169, 0x1F109, + 0x1F0A8, 0x1F046, 0x1EFE2, 0x1EF7D, 0x1EF18, 0x1EEB0, 0x1EE48, 0x1EDDF, + 0x1ED74, 0x1ED08, 0x1EC9B, 0x1EC2D, 0x1EBBE, 0x1EB4D, 0x1EADB, 0x1EA68, + 0x1E9F4, 0x1E97F, 0x1E908, 0x1E891, 0x1E818, 0x1E79E, 0x1E722, 0x1E6A6, + 0x1E629, 0x1E5AA, 0x1E52A, 0x1E4A9, 0x1E427, 0x1E3A3, 0x1E31F, 0x1E299, + 0x1E212, 0x1E18A, 0x1E101, 0x1E076, 0x1DFEB, 0x1DF5E, 0x1DED0, 0x1DE41, + 0x1DDB1, 0x1DD20, 0x1DC8D, 0x1DBFA, 0x1DB65, 0x1DACF, 0x1DA38, 0x1D9A0, + 0x1D907, 0x1D86C, 0x1D7D1, 0x1D734, 0x1D696, 0x1D5F7, 0x1D557, 0x1D4B6, + 0x1D413, 0x1D370, 0x1D2CB, 0x1D225, 0x1D17E, 0x1D0D6, 0x1D02D, 0x1CF83, + 0x1CED8, 0x1CE2B, 0x1CD7E, 0x1CCCF, 0x1CC1F, 0x1CB6E, 0x1CABC, 0x1CA09, + 0x1C955, 0x1C89F, 0x1C7E9, 0x1C731, 0x1C679, 0x1C5BF, 0x1C504, 0x1C448, + 0x1C38B, 0x1C2CD, 0x1C20E, 0x1C14E, 0x1C08C, 0x1BFCA, 0x1BF06, 0x1BE42, + 0x1BD7C, 0x1BCB5, 0x1BBED, 0x1BB25, 0x1BA5B, 0x1B990, 0x1B8C4, 0x1B7F6, + 0x1B728, 0x1B659, 0x1B589, 0x1B4B7, 0x1B3E5, 0x1B311, 0x1B23D, 0x1B167, + 0x1B091, 0x1AFB9, 0x1AEE0, 0x1AE07, 0x1AD2C, 0x1AC50, 0x1AB73, 0x1AA95, + 0x1A9B6, 0x1A8D6, 0x1A7F6, 0x1A714, 0x1A631, 0x1A54D, 0x1A468, 0x1A382, + 0x1A29A, 0x1A1B2, 0x1A0C9, 0x19FDF, 0x19EF4, 0x19E08, 0x19D1B, 0x19C2D, + 0x19B3E, 0x19A4E, 0x1995D, 0x1986B, 0x19778, 0x19684, 0x1958F, 0x19499, + 0x193A2, 0x192AA, 0x191B1, 0x190B8, 0x18FBD, 0x18EC1, 0x18DC4, 0x18CC7, + 0x18BC8, 0x18AC8, 0x189C8, 0x188C6, 0x187C4, 0x186C1, 0x185BC, 0x184B7, + 0x183B1, 0x182AA, 0x181A2, 0x18099, 0x17F8F, 0x17E84, 0x17D78, 0x17C6C, + 0x17B5E, 0x17A4F, 0x17940, 0x17830, 0x1771E, 0x1760C, 0x174F9, 0x173E5, + 0x172D1, 0x171BB, 0x170A4, 0x16F8D, 0x16E74, 0x16D5B, 0x16C41, 0x16B26, + 0x16A0A, 0x168ED, 0x167CF, 0x166B1, 0x16592, 0x16471, 0x16350, 0x1622E, + 0x1610B, 0x15FE8, 0x15EC3, 0x15D9E, 0x15C78, 0x15B51, 0x15A29, 0x15900, + 0x157D7, 0x156AC, 0x15581, 0x15455, 0x15328, 0x151FB, 0x150CC, 0x14F9D, + 0x14E6D, 0x14D3C, 0x14C0A, 0x14AD8, 0x149A4, 0x14870, 0x1473B, 0x14606, + 0x144CF, 0x14398, 0x14260, 0x14127, 0x13FEE, 0x13EB3, 0x13D78, 0x13C3C, + 0x13B00, 0x139C2, 0x13884, 0x13745, 0x13606, 0x134C5, 0x13384, 0x13242, + 0x130FF, 0x12FBC, 0x12E78, 0x12D33, 0x12BEE, 0x12AA7, 0x12960, 0x12819, + 0x126D0, 0x12587, 0x1243D, 0x122F3, 0x121A8, 0x1205C, 0x11F0F, 0x11DC2, + 0x11C74, 0x11B25, 0x119D6, 0x11886, 0x11735, 0x115E3, 0x11491, 0x1133F, + 0x111EB, 0x11097, 0x10F42, 0x10DED, 0x10C97, 0x10B40, 0x109E9, 0x10891, + 0x10738, 0x105DF, 0x10485, 0x1032B, 0x101D0, 0x10074, 0x0FF18, 0x0FDBB, + 0x0FC5D, 0x0FAFF, 0x0F9A0, 0x0F841, 
0x0F6E1, 0x0F580, 0x0F41F, 0x0F2BD, + 0x0F15B, 0x0EFF8, 0x0EE94, 0x0ED30, 0x0EBCC, 0x0EA67, 0x0E901, 0x0E79A, + 0x0E633, 0x0E4CC, 0x0E364, 0x0E1FB, 0x0E092, 0x0DF29, 0x0DDBE, 0x0DC54, + 0x0DAE9, 0x0D97D, 0x0D810, 0x0D6A4, 0x0D536, 0x0D3C8, 0x0D25A, 0x0D0EB, + 0x0CF7C, 0x0CE0C, 0x0CC9C, 0x0CB2B, 0x0C9B9, 0x0C847, 0x0C6D5, 0x0C562, + 0x0C3EF, 0x0C27B, 0x0C107, 0x0BF92, 0x0BE1D, 0x0BCA8, 0x0BB32, 0x0B9BB, + 0x0B844, 0x0B6CD, 0x0B555, 0x0B3DD, 0x0B264, 0x0B0EB, 0x0AF71, 0x0ADF7, + 0x0AC7D, 0x0AB02, 0x0A987, 0x0A80B, 0x0A68F, 0x0A513, 0x0A396, 0x0A219, + 0x0A09B, 0x09F1D, 0x09D9E, 0x09C20, 0x09AA1, 0x09921, 0x097A1, 0x09621, + 0x094A0, 0x0931F, 0x0919E, 0x0901C, 0x08E9A, 0x08D18, 0x08B95, 0x08A12, + 0x0888F, 0x0870B, 0x08587, 0x08402, 0x0827E, 0x080F9, 0x07F73, 0x07DEE, + 0x07C68, 0x07AE2, 0x0795B, 0x077D4, 0x0764D, 0x074C6, 0x0733E, 0x071B6, + 0x0702E, 0x06EA6, 0x06D1D, 0x06B94, 0x06A0B, 0x06881, 0x066F7, 0x0656D, + 0x063E3, 0x06258, 0x060CE, 0x05F43, 0x05DB7, 0x05C2C, 0x05AA0, 0x05914, + 0x05788, 0x055FC, 0x0546F, 0x052E3, 0x05156, 0x04FC9, 0x04E3B, 0x04CAE, + 0x04B20, 0x04992, 0x04804, 0x04676, 0x044E8, 0x04359, 0x041CB, 0x0403C, + 0x03EAD, 0x03D1D, 0x03B8E, 0x039FF, 0x0386F, 0x036DF, 0x0354F, 0x033BF, + 0x0322F, 0x0309F, 0x02F0F, 0x02D7E, 0x02BEE, 0x02A5D, 0x028CC, 0x0273B, + 0x025AA, 0x02419, 0x02288, 0x020F7, 0x01F65, 0x01DD4, 0x01C43, 0x01AB1, + 0x0191F, 0x0178E, 0x015FC, 0x0146A, 0x012D8, 0x01147, 0x00FB5, 0x00E23, + 0x00C91, 0x00AFF, 0x0096D, 0x007DB, 0x00648, 0x004B6, 0x00324, 0x00192}; + + acc200_reg_write(d, HWPfFftRamPageAccess, ACC200_FFT_RAM_EN + 64); + for (i = 0; i < ACC200_FFT_RAM_SIZE; i++) + acc200_reg_write(d, HWPfFftRamOff + i * 4, fft_lut[i]); + acc200_reg_write(d, HWPfFftRamPageAccess, ACC200_FFT_RAM_DIS); + + /* Enabling AQueues through the Queue hierarchy*/ + for (vf_idx = 0; vf_idx < ACC200_NUM_VFS; vf_idx++) { + for (qg_idx = 0; qg_idx < ACC200_NUM_QGRPS; qg_idx++) { + value = 0; + if (vf_idx < conf->num_vf_bundles && + qg_idx < totalQgs) + value = (1 << aqNum(qg_idx, conf)) - 1; + address = HWPfQmgrAqEnableVf + + vf_idx * ACC200_BYTES_IN_WORD; + value += (qg_idx << 16); + acc200_reg_write(d, address, value); + } + } + + rte_bbdev_log_debug("PF Tip configuration complete for %s", dev_name); + return 0; +} diff --git a/drivers/baseband/acc200/version.map b/drivers/baseband/acc200/version.map index c2e0723..9542f2b 100644 --- a/drivers/baseband/acc200/version.map +++ b/drivers/baseband/acc200/version.map @@ -1,3 +1,10 @@ DPDK_22 { local: *; }; + +EXPERIMENTAL { + global: + + rte_acc200_configure; + +};
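To close the series with a usage illustration of the new experimental API: a minimal PF bring-up sketch, mirroring the bbdev-test defaults above. This is an outline under stated assumptions rather than a reference configuration; the queue-group counts and depths are the test values, the helper name is hypothetical, and dev_name follows the short PCI BDF form documented for rte_acc200_configure().

#include <string.h>
#include <rte_acc200_cfg.h>

/* Sketch: one-shot PF configuration with a single VF bundle, reusing the
 * bbdev-test defaults (2 queue groups of 16 AQs, 2^5 deep, per operation).
 * Only the 5G groups are spelled out; q_ul_4g, q_dl_4g and q_fft would
 * follow the same pattern.
 */
static int
acc200_pf_default_configure(const char *dev_name)
{
	struct rte_acc200_conf conf;

	memset(&conf, 0, sizeof(conf));
	conf.pf_mode_en = true; /* run workloads directly from the PF */
	conf.num_vf_bundles = 1;
	conf.q_ul_5g.num_qgroups = 2;
	conf.q_ul_5g.first_qgroup_index = -1; /* ACC200_QMGR_INVALID_IDX in bbdev-test */
	conf.q_ul_5g.num_aqs_per_groups = 16;
	conf.q_ul_5g.aq_depth_log2 = 5;
	conf.q_dl_5g = conf.q_ul_5g;

	/* dev_name is the short PCI BDF form, e.g. "00:01.0" */
	return rte_acc200_configure(dev_name, &conf);
}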