[RFC,v1] regexdev: introduce regexdev subsystem

Message ID 20190627155036.56940-1-jerinj@marvell.com
State New
Delegated to: Thomas Monjalon
Series
  • [RFC,v1] regexdev: introduce regexdev subsystem

Checks

Context Check Description
ci/checkpatch warning coding style issues
ci/Intel-compilation success Compilation OK

Commit Message

Jerin Jacob Kollanukkaran June 27, 2019, 3:50 p.m. UTC
From: Jerin Jacob <jerinj@marvell.com>

Even though some vendors offer RegEx HW offload, the lack of a standard
API makes it difficult for DPDK consumers to use them in a portable way.

This _RFC_ attempts to standardize the RegEx/DPI offload APIs for DPDK.

The Doxygen-generated RFC API documentation is available here:
https://dreamy-noether-22777e.netlify.com/rte__regexdev_8h.html

This RFC is crafted based on SW RegEx API frameworks such as libpcre and
Hyperscan and a few of the RegEx HW IPs which I am aware of.

RegEx pattern matching applications:
• Next Generation Firewalls (NGFW)
• Deep Packet and Flow Inspection (DPI)
• Intrusion Prevention Systems (IPS)
• DDoS Mitigation
• Network Monitoring
• Data Loss Prevention (DLP)
• Smart NICs
• Grammar based content processing
• URL, spam and adware filtering
• Advanced auditing and policing of user/application security policies
• Financial data mining - parsing of streamed financial feeds 

Requesting review from HW and SW RegEx vendors and RegEx application users,
so that DPDK can have a portable API for RegEx.

The API schematics are based on the existing cryptodev, eventdev and ethdev device APIs.

Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
---

RTE RegEx Device API
--------------------

Defines RTE RegEx Device APIs for RegEx operations and their provisioning.

The RegEx Device API is composed of two parts:

- The application-oriented RegEx API that includes functions to set up
  a RegEx device (configure it, set up its queue pairs and start it),
  update the rule database and so on.

- The driver-oriented RegEx API that exports a function allowing
  a RegEx Poll Mode Driver (PMD) to register itself as
  a RegEx device driver.

RegEx device components and definitions:

    +-----------------+
    |                 |
    |                 o---------+    rte_regex_[en|de]queue_burst()
    |   PCRE based    o------+  |               |
    |  RegEx pattern  |      |  |  +--------+   |
    | matching engine o------+--+--o        |   |    +------+
    |                 |      |  |  | queue  |<==o===>|Core 0|
    |                 o----+ |  |  | pair 0 |        |      |
    |                 |    | |  |  +--------+        +------+
    +-----------------+    | |  |
           ^               | |  |  +--------+
           |               | |  |  |        |        +------+
           |               | +--+--o queue  |<======>|Core 1|
       Rule|Database       |    |  | pair 1 |        |      |
    +------+----------+    |    |  +--------+        +------+
    |     Group 0     |    |    |
    | +-------------+ |    |    |  +--------+        +------+
    | | Rules 0..n  | |    |    |  |        |        |Core 2|
    | +-------------+ |    |    +--o queue  |<======>|      |
    |     Group 1     |    |       | pair 2 |        +------+
    | +-------------+ |    |       +--------+
    | | Rules 0..n  | |    |
    | +-------------+ |    |       +--------+
    |     Group 2     |    |       |        |        +------+
    | +-------------+ |    |       | queue  |<======>|Core n|
    | | Rules 0..n  | |    +-------o pair n |        |      |
    | +-------------+ |            +--------+        +------+
    |     Group n     |
    | +-------------+ |<-------rte_regex_rule_db_update()
    | | Rules 0..n  | |<-------rte_regex_rule_db_import()
    | +-------------+ |------->rte_regex_rule_db_export()
    +-----------------+

RegEx: A regular expression is a concise and flexible means for matching
strings of text, such as particular characters, words, or patterns of
characters. A common abbreviation for this is “RegEx”.

RegEx device: A hardware or software-based implementation of the RegEx
device API for PCRE-based pattern matching syntax and semantics.

PCRE RegEx syntax and semantics specification:
http://regexkit.sourceforge.net/Documentation/pcre/pcrepattern.html

RegEx queue pair: Each RegEx device should have one or more queue pairs to
transmit a burst of pattern matching requests and receive a burst of
pattern matching responses. The pattern matching request/response is
embedded in the *rte_regex_ops* structure.

Rule: A pattern matching rule expressed in PCRE RegEx syntax along with
Match ID and Group ID to identify the rule upon the match.

Rule database: The RegEx device accepts regular expressions and converts them
into a compiled rule database that can then be used to scan data.
Compilation allows the device to analyze the given pattern(s) and
pre-determine how to scan for these patterns in an optimized fashion that
would be far too expensive to compute at run-time. A rule database contains
a set of rules that are compiled into a device-specific binary form.

Match ID or Rule ID: A unique identifier provided at the time of rule
creation for the application to identify the rule upon match.

Group ID: A set of rules can be grouped under one group ID to enable
rule isolation and effective pattern matching. The group ID is a unique
group identifier provided at the time of rule creation for the application
to identify the rule upon match.

Scan: A pattern matching request submitted through the *enqueue* API.
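
To make the Rule / Match ID / Group ID definitions concrete, here is a
minimal software sketch of a group-isolated scan. It is not the RFC API:
struct demo_rule, demo_scan() and demo_rules are hypothetical names, and
POSIX regex.h (extended regular expressions) stands in for a PCRE engine.
A real device would compile the rule database once rather than compiling
each pattern per scan as this sketch does.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical rule record; names are illustrative, not from rte_regexdev.h. */
struct demo_rule {
	uint16_t group_id;   /* rules are only scanned within their group */
	uint32_t rule_id;    /* match ID reported back to the application */
	const char *pattern; /* POSIX ERE here; a real device would take PCRE */
};

/*
 * Scan buf against the rules of one group. Returns the rule_id of the
 * first matching rule, or -1 if nothing in that group matches.
 */
static int64_t
demo_scan(const struct demo_rule *rules, size_t nb_rules,
	  uint16_t group_id, const char *buf)
{
	for (size_t i = 0; i < nb_rules; i++) {
		regex_t re;
		int hit;

		if (rules[i].group_id != group_id)
			continue; /* rule isolation: other groups are skipped */
		if (regcomp(&re, rules[i].pattern,
			    REG_EXTENDED | REG_NOSUB) != 0)
			continue; /* ignore patterns that fail to compile */
		hit = (regexec(&re, buf, 0, NULL, 0) == 0);
		regfree(&re);
		if (hit)
			return rules[i].rule_id;
	}
	return -1;
}

/* Example rule set: two rules in group 0, one rule in group 1. */
static const struct demo_rule demo_rules[] = {
	{ 0, 100, "GET /admin" },
	{ 0, 101, "User-Agent: *curl" },
	{ 1, 200, "[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{4}" },
};
```

Scanning the same buffer against group 0 and group 1 gives different
answers, which is the isolation property the Group ID provides.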

A given RegEx device may not support all the features of PCRE.
The application may probe unsupported features through
struct rte_regex_dev_info::pcre_unsup_flags.
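
A probe of the unsupported-feature mask could look like the sketch below.
Only the field name pcre_unsup_flags comes from this RFC text; the
DEMO_PCRE_* bit names, the 32-bit width and struct demo_dev_info are
assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bits; the real flag names live in rte_regexdev.h. */
#define DEMO_PCRE_LOOKAROUND (UINT32_C(1) << 0)
#define DEMO_PCRE_BACKREF    (UINT32_C(1) << 1)
#define DEMO_PCRE_UTF        (UINT32_C(1) << 2)

/* Stand-in for the device info structure, reduced to the one field used. */
struct demo_dev_info {
	uint32_t pcre_unsup_flags; /* a set bit means "feature NOT supported" */
};

/* True when none of the features the rule set needs is marked unsupported. */
static bool
demo_rules_supported(const struct demo_dev_info *info,
		     uint32_t needed_features)
{
	return (info->pcre_unsup_flags & needed_features) == 0;
}
```

An application would run such a check before loading rules, and fall back
to a software engine when it fails.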

By default, all the functions of the RegEx Device API exported by a PMD
are lock-free functions which assume they are not invoked in parallel on
different logical cores to work on the same target object. For instance,
the dequeue function of a PMD cannot be invoked in parallel on two logical
cores to operate on the same RegEx queue pair. Of course, this function
can be invoked in parallel by different logical cores on different queue pairs.
It is the responsibility of the upper-level application to enforce this rule.

In all functions of the RegEx API, the RegEx device is
designated by an integer >= 0 named the device identifier *dev_id*.

At the RegEx driver level, RegEx devices are represented by a generic
data structure of type *rte_regex_dev*.

RegEx devices are dynamically registered during the PCI/SoC device probing
phase performed at EAL initialization time.
When a RegEx device is being probed, a *rte_regex_dev* structure and
a new device identifier are allocated for that device. Then, the
regex_dev_init() function supplied by the RegEx driver matching the probed
device is invoked to properly initialize the device.

The role of the device init function consists of resetting the hardware or
software RegEx driver implementations.

If the device init operation is successful, the correspondence between
the device identifier assigned to the new device and its associated
*rte_regex_dev* structure is effectively registered.
Otherwise, both the *rte_regex_dev* structure and the device identifier are
freed.

The functions exported by the application RegEx API to set up a device
designated by its device identifier must be invoked in the following order:
    - rte_regex_dev_configure()
    - rte_regex_queue_pair_setup()
    - rte_regex_dev_start()

Then, the application can invoke, in any order, the functions
exported by the RegEx API to enqueue pattern matching jobs, dequeue pattern
matching responses, get the stats, update the rule database,
get/set device attributes and so on.

If the application wants to change the configuration (i.e. call
rte_regex_dev_configure() or rte_regex_queue_pair_setup()), it must call
rte_regex_dev_stop() first to stop the device and then do the reconfiguration
before calling rte_regex_dev_start() again. The enqueue and dequeue
functions should not be invoked when the device is stopped.

Finally, an application can close a RegEx device by invoking the
rte_regex_dev_close() function.
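
The ordering rules above (configure, queue pair setup, start; stop before
reconfiguring; no enqueue while stopped) can be modelled as a small state
machine. This is a sketch under stated assumptions, not the rte_regexdev
implementation: all demo_* names are hypothetical, and the choices that
reconfiguring discards earlier queue pair setup and that close requires a
stopped device are assumptions of the sketch.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical lifecycle model; state and function names are illustrative. */
enum demo_state {
	DEMO_UNCONFIGURED,
	DEMO_CONFIGURED, /* configure done, no queue pair set up yet */
	DEMO_READY,      /* queue pair(s) set up, device stopped */
	DEMO_STARTED,
	DEMO_CLOSED,
};

struct demo_dev {
	enum demo_state state;
};

static int demo_configure(struct demo_dev *d)
{
	if (d->state == DEMO_STARTED)
		return -EBUSY; /* must stop before reconfiguring */
	if (d->state == DEMO_CLOSED)
		return -EINVAL;
	d->state = DEMO_CONFIGURED; /* assumption: queue pairs are redone */
	return 0;
}

static int demo_queue_pair_setup(struct demo_dev *d)
{
	if (d->state == DEMO_STARTED)
		return -EBUSY;
	if (d->state != DEMO_CONFIGURED && d->state != DEMO_READY)
		return -EINVAL; /* configure must come first */
	d->state = DEMO_READY;
	return 0;
}

static int demo_start(struct demo_dev *d)
{
	if (d->state != DEMO_READY)
		return -EINVAL; /* needs configure + queue pair setup */
	d->state = DEMO_STARTED;
	return 0;
}

static int demo_stop(struct demo_dev *d)
{
	if (d->state != DEMO_STARTED)
		return -EINVAL;
	d->state = DEMO_READY; /* assumption: configuration survives stop */
	return 0;
}

/* Enqueue/dequeue are only legal while the device is started. */
static int demo_enqueue(const struct demo_dev *d)
{
	return d->state == DEMO_STARTED ? 0 : -EINVAL;
}

static int demo_close(struct demo_dev *d)
{
	if (d->state == DEMO_STARTED)
		return -EBUSY; /* assumption: close requires a stopped device */
	d->state = DEMO_CLOSED;
	return 0;
}
```

Encoding the order as explicit states makes a misuse such as calling
configure on a started device fail with an error instead of silently
corrupting device state.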

Each function of the application RegEx API invokes a specific function
of the PMD that controls the target device designated by its device
identifier.

For this purpose, all device-specific functions of a RegEx driver are
supplied through a set of pointers contained in a generic structure of type
*regex_dev_ops*.
The address of the *regex_dev_ops* structure is stored in the *rte_regex_dev*
structure by the device init function of the RegEx driver, which is
invoked during the PCI/SoC device probing phase, as explained earlier.

In other words, each function of the RegEx API simply retrieves the
*rte_regex_dev* structure associated with the device identifier and
performs an indirect invocation of the corresponding driver function
supplied in the *regex_dev_ops* structure of the *rte_regex_dev* structure.

For performance reasons, the addresses of the fast-path functions of the
RegEx driver are not contained in the *regex_dev_ops* structure.
Instead, they are directly stored at the beginning of the *rte_regex_dev*
structure to avoid an extra indirect memory access during their invocation.
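
The dispatch scheme described above can be sketched as follows. The layout
is illustrative only: struct demo_dev, struct demo_dev_ops, the demo_api_*
wrappers and the "null" driver are hypothetical and do not reproduce the
actual *rte_regex_dev* or *regex_dev_ops* definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct demo_dev;

/* Slow-path operations, reached through one extra pointer dereference. */
struct demo_dev_ops {
	int (*dev_start)(struct demo_dev *dev);
	int (*dev_stop)(struct demo_dev *dev);
};

typedef uint16_t (*demo_burst_t)(struct demo_dev *dev, uint16_t qp_id,
				 uint16_t nb_ops);

struct demo_dev {
	demo_burst_t enqueue; /* fast path: stored first in the structure */
	demo_burst_t dequeue;
	const struct demo_dev_ops *dev_ops; /* slow path: ops table */
	int started;
};

#define DEMO_MAX_DEVS 4
static struct demo_dev demo_devs[DEMO_MAX_DEVS];

/* Trivial "null" driver used to exercise the dispatch paths. */
static int demo_null_start(struct demo_dev *dev) { dev->started = 1; return 0; }
static int demo_null_stop(struct demo_dev *dev) { dev->started = 0; return 0; }

static uint16_t
demo_null_burst(struct demo_dev *dev, uint16_t qp_id, uint16_t nb_ops)
{
	(void)qp_id;
	return dev->started ? nb_ops : 0; /* accept everything once started */
}

static const struct demo_dev_ops demo_null_ops = {
	.dev_start = demo_null_start,
	.dev_stop = demo_null_stop,
};

/* Plays the role of the driver's device init function. */
static void demo_null_init(struct demo_dev *dev)
{
	dev->enqueue = demo_null_burst;
	dev->dequeue = demo_null_burst;
	dev->dev_ops = &demo_null_ops;
	dev->started = 0;
}

/* Slow path: dev_id -> device -> ops table -> driver function. */
static int demo_api_start(uint8_t dev_id)
{
	if (dev_id >= DEMO_MAX_DEVS || demo_devs[dev_id].dev_ops == NULL)
		return -EINVAL;
	return demo_devs[dev_id].dev_ops->dev_start(&demo_devs[dev_id]);
}

static int demo_api_stop(uint8_t dev_id)
{
	if (dev_id >= DEMO_MAX_DEVS || demo_devs[dev_id].dev_ops == NULL)
		return -EINVAL;
	return demo_devs[dev_id].dev_ops->dev_stop(&demo_devs[dev_id]);
}

/* Fast path: dev_id -> device -> driver function, no ops table lookup. */
static uint16_t
demo_api_enqueue(uint8_t dev_id, uint16_t qp_id, uint16_t nb_ops)
{
	struct demo_dev *dev;

	if (dev_id >= DEMO_MAX_DEVS || demo_devs[dev_id].enqueue == NULL)
		return 0;
	dev = &demo_devs[dev_id];
	return dev->enqueue(dev, qp_id, nb_ops);
}
```

The enqueue wrapper reads one pointer from the device structure and calls
it, whereas start and stop first load dev_ops and then the function
pointer; that second load is the extra indirect access the text refers to.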

RTE RegEx device drivers do not use interrupts for enqueue or dequeue
operation. Instead, RegEx drivers export Poll-Mode enqueue and dequeue
functions to applications.

The *enqueue* operation submits a burst of RegEx pattern matching requests
to the RegEx device and the *dequeue* operation gets a burst of pattern
matching responses for the ones submitted through the *enqueue* operation.
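
The burst semantics can be illustrated with a mock queue pair. This is a
sketch, not the RFC API: struct demo_op stands in for *rte_regex_ops*, the
ring depth is arbitrary, and the "engine" simply flags buffers that start
with 'x' where a real device would run the rule database.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_QP_DEPTH 8 /* ring size of the mock queue pair */

/* Hypothetical request/response object (rte_regex_ops plays this role). */
struct demo_op {
	const char *buf; /* data to scan */
	int matched;     /* filled in by the mock engine */
};

struct demo_qp {
	struct demo_op *ring[DEMO_QP_DEPTH];
	uint16_t head;  /* next slot to dequeue */
	uint16_t count; /* ops currently held by the queue pair */
};

/* Submit up to nb_ops requests; returns how many were accepted. */
static uint16_t
demo_enqueue_burst(struct demo_qp *qp, struct demo_op **ops, uint16_t nb_ops)
{
	uint16_t n = 0;

	while (n < nb_ops && qp->count < DEMO_QP_DEPTH) {
		uint16_t tail = (uint16_t)((qp->head + qp->count) %
					   DEMO_QP_DEPTH);

		/* Mock engine: "match" when the buffer starts with 'x'. */
		ops[n]->matched = (ops[n]->buf[0] == 'x');
		qp->ring[tail] = ops[n];
		qp->count++;
		n++;
	}
	return n;
}

/* Collect up to nb_ops responses in submission order; returns the count. */
static uint16_t
demo_dequeue_burst(struct demo_qp *qp, struct demo_op **ops, uint16_t nb_ops)
{
	uint16_t n = 0;

	while (n < nb_ops && qp->count > 0) {
		ops[n++] = qp->ring[qp->head];
		qp->head = (uint16_t)((qp->head + 1) % DEMO_QP_DEPTH);
		qp->count--;
	}
	return n;
}
```

Both calls return the number of operations actually processed, which may
be less than requested; a polling application retries the remainder on its
next loop iteration instead of blocking.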

Typical application usage of the RegEx device API follows this
programming flow:

- rte_regex_dev_configure()
- rte_regex_queue_pair_setup()
- rte_regex_rule_db_update(): needs to be invoked if a precompiled rule database
  is not provided in rte_regex_dev_config::rule_db for rte_regex_dev_configure()
  and/or the application needs to update the rule database.
- Create or reuse an existing mempool for *rte_regex_ops* objects.
- rte_regex_dev_start()
- rte_regex_enqueue_burst()
- rte_regex_dequeue_burst()

---

 config/common_base                 |    5 +
 doc/api/doxy-api-index.md          |    1 +
 doc/api/doxy-api.conf.in           |    1 +
 lib/Makefile                       |    2 +
 lib/librte_regexdev/Makefile       |   23 +
 lib/librte_regexdev/rte_regexdev.c |    5 +
 lib/librte_regexdev/rte_regexdev.h | 1247 ++++++++++++++++++++++++++++
 7 files changed, 1284 insertions(+)
 create mode 100644 lib/librte_regexdev/Makefile
 create mode 100644 lib/librte_regexdev/rte_regexdev.c
 create mode 100644 lib/librte_regexdev/rte_regexdev.h

Comments

Jerin Jacob Kollanukkaran July 15, 2019, 4:26 a.m. UTC | #1
Ping.

Is anyone interested in collaborating on this RFC[2]?
Marvell would like to contribute one SW (PCRE based) PMD and one HW PMD for this API.

Shahaf from Mellanox proposed a presentation on the DPDK RegEx device for the upcoming userspace summit[1], so
it would be good to iron out the differences between the various HW-based regex engines, understand the requirements
from the application perspective, and finalize the specification before the summit.

Let us know if anyone is interested in collaborating on the RegEx device API for DPDK.

[1]
https://events.linuxfoundation.org/events/dpdk-userspace-2019-bordeaux/program/schedule/

[2]
http://patches.dpdk.org/patch/55505/


> -----Original Message-----
> From: jerinj@marvell.com <jerinj@marvell.com>
> Sent: Thursday, June 27, 2019 9:21 PM
> To: dev@dpdk.org
> Cc: techboard@dpdk.org; Jerin Jacob Kollanukkaran <jerinj@marvell.com>;
> Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>
> Subject: [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev
> subsystem
> 
> diff --git a/config/common_base b/config/common_base
> index e406e7836..986093d6e 100644
> --- a/config/common_base
> +++ b/config/common_base
> @@ -746,6 +746,11 @@ CONFIG_RTE_LIBRTE_PMD_DPAA2_QDMA_RAWDEV=n
>  #
>  CONFIG_RTE_LIBRTE_PMD_IFPGA_RAWDEV=y
> 
> +#
> +# Compile regex device support
> +#
> +CONFIG_RTE_LIBRTE_REGEXDEV=y
> +
>  #
>  # Compile librte_ring
>  #
> diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
> index 715248dd1..a0bc27ae4 100644
> --- a/doc/api/doxy-api-index.md
> +++ b/doc/api/doxy-api-index.md
> @@ -26,6 +26,7 @@ The public API headers are grouped by topics:
>    [event_timer_adapter]    (@ref rte_event_timer_adapter.h),
>    [event_crypto_adapter]   (@ref rte_event_crypto_adapter.h),
>    [rawdev]             (@ref rte_rawdev.h),
> +  [regexdev]           (@ref rte_regexdev.h),
>    [metrics]            (@ref rte_metrics.h),
>    [bitrate]            (@ref rte_bitrate.h),
>    [latency]            (@ref rte_latencystats.h),
> diff --git a/doc/api/doxy-api.conf.in b/doc/api/doxy-api.conf.in
> index b9896cb63..7adb821bb 100644
> --- a/doc/api/doxy-api.conf.in
> +++ b/doc/api/doxy-api.conf.in
> @@ -53,6 +53,7 @@ INPUT                   = @TOPDIR@/doc/api/doxy-api-index.md \
>                            @TOPDIR@/lib/librte_rawdev \
>                            @TOPDIR@/lib/librte_rcu \
>                            @TOPDIR@/lib/librte_reorder \
> +                          @TOPDIR@/lib/librte_regexdev \
>                            @TOPDIR@/lib/librte_ring \
>                            @TOPDIR@/lib/librte_sched \
>                            @TOPDIR@/lib/librte_security \
> diff --git a/lib/Makefile b/lib/Makefile
> index 791e0d991..57de9691a 100644
> --- a/lib/Makefile
> +++ b/lib/Makefile
> @@ -44,6 +44,8 @@ DEPDIRS-librte_eventdev := librte_eal librte_ring librte_ethdev librte_hash \
>                             librte_mempool librte_timer librte_cryptodev
>  DIRS-$(CONFIG_RTE_LIBRTE_RAWDEV) += librte_rawdev
>  DEPDIRS-librte_rawdev := librte_eal librte_ethdev
> +DIRS-$(CONFIG_RTE_LIBRTE_REGEXDEV) += librte_regexdev
> +DEPDIRS-librte_regexdev := librte_eal
>  DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += librte_vhost
>  DEPDIRS-librte_vhost := librte_eal librte_mempool librte_mbuf librte_ethdev \
>  			librte_net
> diff --git a/lib/librte_regexdev/Makefile b/lib/librte_regexdev/Makefile
> new file mode 100644
> index 000000000..723b4b28c
> --- /dev/null
> +++ b/lib/librte_regexdev/Makefile
> @@ -0,0 +1,23 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright(C) 2019 Marvell International Ltd.
> +#
> +
> +include $(RTE_SDK)/mk/rte.vars.mk
> +
> +# library name
> +LIB = librte_regexdev.a
> +
> +# library version
> +LIBABIVER := 1
> +
> +# build flags
> +CFLAGS += -O3
> +CFLAGS += $(WERROR_FLAGS)
> +
> +# library source files
> +SRCS-y += rte_regexdev.c
> +
> +# export include files
> +SYMLINK-y-include += rte_regexdev.h
> +
> +include $(RTE_SDK)/mk/rte.lib.mk
> diff --git a/lib/librte_regexdev/rte_regexdev.c b/lib/librte_regexdev/rte_regexdev.c
> new file mode 100644
> index 000000000..e5be0f29c
> --- /dev/null
> +++ b/lib/librte_regexdev/rte_regexdev.c
> @@ -0,0 +1,5 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(C) 2019 Marvell International Ltd.
> + */
> +
> +#include <rte_regexdev.h>
> diff --git a/lib/librte_regexdev/rte_regexdev.h b/lib/librte_regexdev/rte_regexdev.h
> new file mode 100644
> index 000000000..765da4aaa
> --- /dev/null
> +++ b/lib/librte_regexdev/rte_regexdev.h
> @@ -0,0 +1,1247 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(C) 2019 Marvell International Ltd.
> + */
> +
> +#ifndef _RTE_REGEXDEV_H_
> +#define _RTE_REGEXDEV_H_
> +
> +/**
> + * @file
> + *
> + * RTE RegEx Device API
> + *
> + * Defines RTE RegEx Device APIs for RegEx operations and its provisioning.
> + *
> + * The RegEx Device API is composed of two parts:
> + *
> + * - The application-oriented RegEx API that includes functions to set up
> + *   a RegEx device (configure it, set up its queue pairs and start it),
> + *   update the rule database and so on.
> + *
> + * - The driver-oriented RegEx API that exports a function allowing
> + *   a RegEx Poll Mode Driver (PMD) to register itself as
> + *   a RegEx device driver.
> + *
> + * RegEx device components and definitions:
> + *
> + *     +-----------------+
> + *     |                 |
> + *     |                 o---------+    rte_regex_[en|de]queue_burst()
> + *     |   PCRE based    o------+  |               |
> + *     |  RegEx pattern  |      |  |  +--------+   |
> + *     | matching engine o------+--+--o        |   |    +------+
> + *     |                 |      |  |  | queue  |<==o===>|Core 0|
> + *     |                 o----+ |  |  | pair 0 |        |      |
> + *     |                 |    | |  |  +--------+        +------+
> + *     +-----------------+    | |  |
> + *            ^               | |  |  +--------+
> + *            |               | |  |  |        |        +------+
> + *            |               | +--+--o queue  |<======>|Core 1|
> + *        Rule|Database       |    |  | pair 1 |        |      |
> + *     +------+----------+    |    |  +--------+        +------+
> + *     |     Group 0     |    |    |
> + *     | +-------------+ |    |    |  +--------+        +------+
> + *     | | Rules 0..n  | |    |    |  |        |        |Core 2|
> + *     | +-------------+ |    |    +--o queue  |<======>|      |
> + *     |     Group 1     |    |       | pair 2 |        +------+
> + *     | +-------------+ |    |       +--------+
> + *     | | Rules 0..n  | |    |
> + *     | +-------------+ |    |       +--------+
> + *     |     Group 2     |    |       |        |        +------+
> + *     | +-------------+ |    |       | queue  |<======>|Core n|
> + *     | | Rules 0..n  | |    +-------o pair n |        |      |
> + *     | +-------------+ |            +--------+        +------+
> + *     |     Group n     |
> + *     | +-------------+ |<-------rte_regex_rule_db_update()
> + *     | | Rules 0..n  | |<-------rte_regex_rule_db_import()
> + *     | +-------------+ |------->rte_regex_rule_db_export()
> + *     +-----------------+
> + *
> + * RegEx: A regular expression is a concise and flexible means for matching
> + * strings of text, such as particular characters, words, or patterns of
> + * characters. A common abbreviation for this is “RegEx”.
> + *
> + * RegEx device: A hardware or software-based implementation of the RegEx
> + * device API for PCRE-based pattern matching syntax and semantics.
> + *
> + * PCRE RegEx syntax and semantics specification:
> + * http://regexkit.sourceforge.net/Documentation/pcre/pcrepattern.html
> + *
> + * RegEx queue pair: Each RegEx device should have one or more queue pairs to
> + * transmit a burst of pattern matching requests and receive a burst of
> + * pattern matching responses. The pattern matching request/response is
> + * embedded in the *rte_regex_ops* structure.
> + *
> + * Rule: A pattern matching rule expressed in PCRE RegEx syntax along with
> + * Match ID and Group ID to identify the rule upon the match.
> + *
> + * Rule database: The RegEx device accepts regular expressions and converts them
> + * into a compiled rule database that can then be used to scan data.
> + * Compilation allows the device to analyze the given pattern(s) and
> + * pre-determine how to scan for these patterns in an optimized fashion that
> + * would be far too expensive to compute at run-time. A rule database contains
> + * a set of rules that are compiled into a device-specific binary form.
> + *
> + * Match ID or Rule ID: A unique identifier provided at the time of rule
> + * creation for the application to identify the rule upon match.
> + *
> + * Group ID: A set of rules can be grouped under one group ID to enable
> + * rule isolation and effective pattern matching. The group ID is a unique
> + * group identifier provided at the time of rule creation for the application
> + * to identify the rule upon match.
> + *
> + * Scan: A pattern matching request submitted through the *enqueue* API.
> + *
> + * A given RegEx device may not support all the features of PCRE.
> + * The application may probe unsupported features through
> + * struct rte_regex_dev_info::pcre_unsup_flags.
> + *
> + * By default, all the functions of the RegEx Device API exported by a PMD
> + * are lock-free functions which assume they are not invoked in parallel on
> + * different logical cores to work on the same target object. For instance,
> + * the dequeue function of a PMD cannot be invoked in parallel on two logical
> + * cores to operate on the same RegEx queue pair. Of course, this function
> + * can be invoked in parallel by different logical cores on different queue
> + * pairs. It is the responsibility of the upper-level application to enforce
> + * this rule.
> + *
> + * In all functions of the RegEx API, the RegEx device is
> + * designated by an integer >= 0 named the device identifier *dev_id*.
> + *
> + * At the RegEx driver level, RegEx devices are represented by a generic
> + * data structure of type *rte_regex_dev*.
> + *
> + * RegEx devices are dynamically registered during the PCI/SoC device probing
> + * phase performed at EAL initialization time.
> + * When a RegEx device is being probed, a *rte_regex_dev* structure and
> + * a new device identifier are allocated for that device. Then, the
> + * regex_dev_init() function supplied by the RegEx driver matching the probed
> + * device is invoked to properly initialize the device.
> + *
> + * The role of the device init function consists of resetting the hardware or
> + * software RegEx driver implementations.
> + *
> + * If the device init operation is successful, the correspondence between
> + * the device identifier assigned to the new device and its associated
> + * *rte_regex_dev* structure is effectively registered.
> + * Otherwise, both the *rte_regex_dev* structure and the device identifier are
> + * freed.
> + *
> + * The functions exported by the application RegEx API to setup a device
> + * designated by its device identifier must be invoked in the following order:
> + *     - rte_regex_dev_configure()
> + *     - rte_regex_queue_pair_setup()
> + *     - rte_regex_dev_start()
> + *
> + * Then, the application can invoke, in any order, the functions
> + * exported by the RegEx API to enqueue pattern matching jobs, dequeue pattern
> + * matching responses, get the stats, update the rule database,
> + * get/set device attributes and so on.
> + *
> + * If the application wants to change the configuration (i.e. call
> + * rte_regex_dev_configure() or rte_regex_queue_pair_setup()), it must call
> + * rte_regex_dev_stop() first to stop the device and then do the reconfiguration
> + * before calling rte_regex_dev_start() again. The enqueue and dequeue
> + * functions should not be invoked when the device is stopped.
> + *
> + * Finally, an application can close a RegEx device by invoking the
> + * rte_regex_dev_close() function.
> + *
> + * Each function of the application RegEx API invokes a specific function
> + * of the PMD that controls the target device designated by its device
> + * identifier.
> + *
> + * For this purpose, all device-specific functions of a RegEx driver are
> + * supplied through a set of pointers contained in a generic structure of type
> + * *regex_dev_ops*.
> + * The address of the *regex_dev_ops* structure is stored in the *rte_regex_dev*
> + * structure by the device init function of the RegEx driver, which is
> + * invoked during the PCI/SoC device probing phase, as explained earlier.
> + *
> + * In other words, each function of the RegEx API simply retrieves the
> + * *rte_regex_dev* structure associated with the device identifier and
> + * performs an indirect invocation of the corresponding driver function
> + * supplied in the *regex_dev_ops* structure of the *rte_regex_dev* structure.
> + *
> + * For performance reasons, the addresses of the fast-path functions of the
> + * RegEx driver are not contained in the *regex_dev_ops* structure.
> + * Instead, they are directly stored at the beginning of the *rte_regex_dev*
> + * structure to avoid an extra indirect memory access during their invocation.
> + *
> + * RTE RegEx device drivers do not use interrupts for enqueue or dequeue
> + * operations. Instead, RegEx drivers export Poll-Mode enqueue and dequeue
> + * functions to applications.
> + *
> + * The *enqueue* operation submits a burst of RegEx pattern matching requests
> + * to the RegEx device and the *dequeue* operation gets a burst of pattern
> + * matching responses for the ones submitted through the *enqueue* operation.
> + *
> + * Typical application utilisation of the RegEx device API will follow the
> + * following programming flow.
> + *
> + * - rte_regex_dev_configure()
> + * - rte_regex_queue_pair_setup()
> + * - rte_regex_rule_db_update(): needs to be invoked if a precompiled rule
> + *   database is not provided in rte_regex_dev_config::rule_db for
> + *   rte_regex_dev_configure() and/or the application needs to update the
> + *   rule database.
> + * - Create or reuse an existing mempool for *rte_regex_ops* objects.
> + * - rte_regex_dev_start()
> + * - rte_regex_enqueue_burst()
> + * - rte_regex_dequeue_burst()
> + *
> + */
> +
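
As a reading aid for the ordering rules in the comment above, here is a minimal sketch that models the documented call order as a small state machine. It does not use the proposed API: `regex_model`, the `ST_*` states and the `model_*()` functions are hypothetical stand-ins, the -EBUSY/-EINVAL return codes are only illustrative, and the requirement that every configured queue pair be set up before start is an assumption, since the RFC text does not state it explicitly.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical states; they only model the call order described above. */
enum regex_state { ST_UNCONFIGURED, ST_CONFIGURED, ST_STARTED };

struct regex_model {
	enum regex_state state;
	unsigned int nb_qp; /* queue pairs requested at configure time */
	uint64_t qp_mask;   /* bit i is set once queue pair i is set up */
};

/* Models rte_regex_dev_configure(): rejected while the device is started. */
static int model_configure(struct regex_model *m, unsigned int nb_qp)
{
	if (m->state == ST_STARTED)
		return -EBUSY;
	if (nb_qp == 0 || nb_qp > 64)
		return -EINVAL;
	m->nb_qp = nb_qp;
	m->qp_mask = 0;
	m->state = ST_CONFIGURED;
	return 0;
}

/* Models rte_regex_queue_pair_setup(): needs a configured, stopped device. */
static int model_queue_pair_setup(struct regex_model *m, unsigned int qp_id)
{
	if (m->state != ST_CONFIGURED || qp_id >= m->nb_qp)
		return -EINVAL;
	m->qp_mask |= 1ULL << qp_id;
	return 0;
}

/* Models rte_regex_dev_start(). Assumption: all queue pairs must be set up. */
static int model_start(struct regex_model *m)
{
	uint64_t all = m->nb_qp == 64 ? ~0ULL : (1ULL << m->nb_qp) - 1;

	if (m->state != ST_CONFIGURED || m->qp_mask != all)
		return -EINVAL;
	m->state = ST_STARTED;
	return 0;
}

/* Models rte_regex_dev_stop(): the device can be reconfigured afterwards. */
static void model_stop(struct regex_model *m)
{
	if (m->state == ST_STARTED)
		m->state = ST_CONFIGURED;
}

/* Enqueue and dequeue are only legal while the device is started. */
static bool model_can_enqueue(const struct regex_model *m)
{
	return m->state == ST_STARTED;
}
```

The point of the model is that reconfiguration is only reachable through the stop path, which matches the "stop, reconfigure, start" rule in the comment.
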
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#include <rte_common.h>
> +#include <rte_config.h>
> +#include <rte_dev.h>
> +#include <rte_errno.h>
> +#include <rte_memory.h>
> +
> +/**
> + * Get the total number of RegEx devices that have been successfully
> + * initialised.
> + *
> + * @return
> + *   The total number of usable RegEx devices.
> + */
> +uint8_t
> +rte_regex_dev_count(void);
> +
> +/**
> + * Get the device identifier for the named RegEx device.
> + *
> + * @param name
> + *   RegEx device name to select the RegEx device identifier.
> + *
> + * @return
> + *   Returns RegEx device identifier on success.
> + *   - <0: Failure to find named RegEx device.
> + */
> +int
> +rte_regex_dev_get_dev_id(const char *name);
> +
> +/* Enumerates RegEx device capabilities */
> +#define RTE_REGEX_DEV_CAPA_RUNTIME_COMPILATION_F (1ULL << 0)
> +/**< RegEx device supports compiling the rules at runtime, as opposed to
> + * only loading a pre-built rule database using
> + * struct rte_regex_dev_config::rule_db in rte_regex_dev_configure().
> + * @see struct rte_regex_dev_config::rule_db, rte_regex_dev_configure()
> + * @see struct rte_regex_dev_info::regex_dev_capa
> + */
> +
> +
> +/* Enumerates unsupported PCRE features for the RegEx device */
> +#define RTE_REGEX_DEV_PCRE_UNSUP_START_ANCHOR_F (1ULL << 0)
> +/**< RegEx device doesn't support PCRE Anchor to start of match flag.
> + * Example RegEx is '/\Gfoo\d/'. Here '\G' asserts position at the end of the
> + * previous match or the start of the string for the first match.
> + * This position will change each time the RegEx is applied to the subject
> + * string. If the RegEx is applied to 'foo1foo2Zfoo3' the first two matches will
> + * be successful for 'foo1foo2' and fail for 'Zfoo3'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_ATOMIC_GROUPING_F (1ULL << 1)
> +/**< RegEx device doesn't support PCRE Atomic grouping.
> + * Atomic groups are represented by '(?>)'. An atomic group is a group that,
> + * when the RegEx engine exits from it, automatically throws away all
> + * backtracking positions remembered by any tokens inside the group.
> + * Example RegEx is 'a(?>bc|b)c'. If the given patterns are 'abc' and 'abcc',
> + * then 'a(bc|b)c' matches both whereas 'a(?>bc|b)c' matches only 'abcc'
> + * because atomic groups don't allow backtracking to 'b'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_BACKTRACKING_CTRL_F (1ULL << 2)
> +/**< RegEx device doesn't support PCRE backtracking control verbs.
> + * Some examples of backtracking verbs are (*COMMIT), (*ACCEPT), (*FAIL),
> + * (*SKIP), (*PRUNE).
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_CALLOUTS_F (1ULL << 3)
> +/**< RegEx device doesn't support PCRE callouts.
> + * PCRE supports calling an external function in between matches by using
> + * '(?C)'. Example RegEx is 'ABC(?C)D'. If the given pattern is 'ABCD', then
> + * the RegEx engine will parse ABC, perform a user-defined callout and return
> + * a successful match at D.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_BACKREFERENCE_F (1ULL << 4)
> +/**< RegEx device doesn't support PCRE backreference.
> + * Example RegEx is '(\2ABC|(GHI))+'. Here '\2' matches the same text as most
> + * recently matched by the 2nd capturing group, i.e. 'GHI'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_GREEDY_F (1ULL << 5)
> +/**< RegEx device doesn't support PCRE Greedy mode.
> + * For example, if the RegEx is 'AB\d*?' then '*?' represents zero or unlimited
> + * matches. In greedy mode the pattern 'AB12345' will be matched completely,
> + * whereas in ungreedy mode 'AB' will be returned as the match.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_LOOKAROUND_ASRT_F (1ULL << 6)
> +/**< RegEx device doesn't support PCRE Lookaround assertions
> + * (Zero-width assertions). Example RegEx is '[a-z]+\d+(?=!{3,})'. If
> + * the given pattern is 'dwad1234!', the RegEx engine doesn't report any matches
> + * because the assert '(?=!{3,})' fails. The pattern 'dwad123!!!' would return a
> + * successful match.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_MATCH_POINT_RST_F (1ULL << 7)
> +/**< RegEx device doesn't support PCRE match point reset directive.
> + * Example RegEx is '[a-z]+\K\d+'. If the pattern is 'dwad123',
> + * then even though the entire pattern matches, only '123'
> + * is reported as a match.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_NEWLINE_CONVENTIONS_F (1ULL << 8)
> +/**< RegEx device doesn't support PCRE newline convention.
> + * Newline conventions are represented as follows:
> + * (*CR)        carriage return
> + * (*LF)        linefeed
> + * (*CRLF)      carriage return, followed by linefeed
> + * (*ANYCRLF)   any of the three above
> + * (*ANY)       all Unicode newline sequences
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_NEWLINE_SEQ_F (1ULL << 9)
> +/**< RegEx device doesn't support PCRE newline sequence.
> + * The escape sequence '\R' will match any newline sequence.
> + * It is equivalent to: '(?>\r\n|\n|\x0b|\f|\r|\x85)'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_POSSESSIVE_QUALIFIERS_F (1ULL << 10)
> +/**< RegEx device doesn't support PCRE possessive quantifiers.
> + * Example RegEx possessive quantifiers are '*+', '++', '?+', '{m,n}+'.
> + * A possessive quantifier repeats the token as many times as possible and it
> + * does not give up matches as the engine backtracks. With a possessive
> + * quantifier, the deal is all or nothing.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_SUBROUTINE_REFERENCES_F (1ULL << 11)
> +/**< RegEx device doesn't support PCRE Subroutine references.
> + * PCRE Subroutine references allow for sub patterns to be assessed
> + * as part of the RegEx. Example RegEx is '(foo|fuzz)\g<1>+bar' matches the
> + * pattern 'foofoofuzzfoofuzzbar'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_UTF_8_F (1ULL << 12)
> +/**< RegEx device doesn't support UTF-8 character encoding.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_UTF_16_F (1ULL << 13)
> +/**< RegEx device doesn't support UTF-16 character encoding.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_UTF_32_F (1ULL << 14)
> +/**< RegEx device doesn't support UTF-32 character encoding.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_WORD_BOUNDARY_F (1ULL << 15)
> +/**< RegEx device doesn't support word boundaries.
> + * The meta character '\b' represents word boundary anchor.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_FORWARD_REFERENCES_F (1ULL << 16)
> +/**< RegEx device doesn't support Forward references.
> + * Forward references allow you to use a back reference to a group that appears
> + * later in the RegEx. Example RegEx '(\3ABC|(DEF|(GHI)))+' matches the
> + * following string: 'GHIGHIABCDEF'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +/* Enumerates PCRE rule flags */
> +#define RTE_REGEX_PCRE_RULE_ALLOW_EMPTY_F (1ULL << 0)
> +/**< When this flag is set, patterns that can match against an empty string,
> + * such as '.*', are allowed.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_ANCHORED_F (1ULL << 1)
> +/**< When this flag is set, the pattern is forced to be "anchored", that is, it
> + * is constrained to match only at the first matching point in the string that
> + * is being searched. Similar to '^' and represented by \A.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_CASELESS_F (1ULL << 2)
> +/**< When this flag is set, letters in the pattern match both upper and lower
> + * case letters in the subject.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_DOTALL_F (1ULL << 3)
> +/**< When this flag is set, a dot metacharacter in the pattern matches any
> + * character, including one that indicates a newline.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_DUPNAMES_F (1ULL << 4)
> +/**< When this flag is set, names used to identify capture groups need not be
> + * unique.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_EXTENDED_F (1ULL << 5)
> +/**< When this flag is set, most white space characters in the pattern are
> + * totally ignored except when escaped or inside a character class.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_MATCH_UNSET_BACKREF_F (1ULL << 6)
> +/**< When this flag is set, a backreference to an unset capture group matches
> + * an empty string.
> + * @see RTE_REGEX_DEV_PCRE_UNSUP_FORWARD_REFERENCES_F
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_MULTILINE_F (1ULL << 7)
> +/**< When this flag is set, the '^' and '$' constructs match immediately
> + * following or immediately before internal newlines in the subject string,
> + * respectively, as well as at the very start and end.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_NO_AUTO_CAPTURE_F (1ULL << 8)
> +/**< When this flag is set, it disables the use of numbered capturing
> + * parentheses in the pattern. References to capture groups (backreferences or
> + * recursion/subroutine calls) may only refer to named groups, though the
> + * reference can be by name or by number.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_UCP_F (1ULL << 9)
> +/**< By default, only ASCII characters are recognized. When this flag is set,
> + * Unicode properties are used instead to classify characters.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_UNGREEDY_F (1ULL << 10)
> +/**< When this flag is set, the "greediness" of the quantifiers is inverted
> + * so that they are not greedy by default, but become greedy if followed by
> + * '?'.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_UTF_F (1ULL << 11)
> +/**< When this flag is set, the RegEx engine has to regard both the pattern and
> + * the subject strings that are subsequently processed as strings of UTF
> + * characters instead of single-code-unit strings.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_NEVER_BACKSLASH_C_F (1ULL << 12)
> +/**< This flag locks out the use of '\C' in the pattern that is being compiled.
> + * This escape matches one data unit, even in UTF mode, which can cause
> + * unpredictable behavior in UTF-8 or UTF-16 modes because it may leave the
> + * current matching point in the middle of a multi-code-unit character.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +
> +/**
> + * RegEx device information
> + */
> +struct rte_regex_dev_info {
> +	const char *driver_name; /**< RegEx driver name */
> +	struct rte_device *dev;	/**< Device information */
> +	uint8_t max_matches;
> +	/**< Maximum matches per scan supported by this device */
> +	uint16_t max_queue_pairs;
> +	/**< Maximum queue pairs supported by this device */
> +	uint16_t max_payload_size;
> +	/**< Maximum payload size for a pattern match request or scan.
> +	 * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
> +	 */
> +	uint16_t max_rules_per_group;
> +	/**< Maximum rules supported per group by this device */
> +	uint16_t max_groups;
> +	/**< Maximum groups supported by this device */
> +	uint32_t regex_dev_capa;
> +	/**< RegEx device capabilities. @see RTE_REGEX_DEV_CAPA_* */
> +	uint64_t rule_flags;
> +	/**< Supported compiler rule flags.
> +	 * @see RTE_REGEX_PCRE_RULE_*, struct rte_regex_rule::rule_flags
> +	 */
> +	uint64_t pcre_unsup_flags;
> +	/**< Unsupported PCRE features for this RegEx device.
> +	 * @see RTE_REGEX_DEV_PCRE_UNSUP_*
> +	 */
> +};
> +
> +/**
> + * Retrieve the contextual information of a RegEx device.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + *
> + * @param[out] dev_info
> + *   A pointer to a structure of type *rte_regex_dev_info* to be filled with the
> + *   contextual information of the device.
> + *
> + * @return
> + *   - 0: Success, driver updates the contextual information of the RegEx device
> + *   - <0: Error code returned by the driver info get function.
> + *
> + */
> +int
> +rte_regex_dev_info_get(uint8_t dev_id, struct rte_regex_dev_info *dev_info);
> +
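
The two flag words in rte_regex_dev_info have opposite polarity: *pcre_unsup_flags* is a deny-list while *rule_flags* is an allow-list. A minimal sketch of how an application might test both before submitting a rule follows. The four flag values are copied from the header above; the helper names `rule_features_supported()` and `rule_flags_supported()` are hypothetical and not part of the RFC.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag values copied from the RFC header quoted above. */
#define RTE_REGEX_DEV_PCRE_UNSUP_BACKREFERENCE_F (1ULL << 4)
#define RTE_REGEX_DEV_PCRE_UNSUP_LOOKAROUND_ASRT_F (1ULL << 6)
#define RTE_REGEX_PCRE_RULE_CASELESS_F (1ULL << 2)
#define RTE_REGEX_PCRE_RULE_DOTALL_F (1ULL << 3)

/*
 * Hypothetical helper: true if none of the PCRE features a rule relies on
 * (expressed with the UNSUP_* bits) is reported in pcre_unsup_flags.
 */
static bool rule_features_supported(uint64_t pcre_unsup_flags, uint64_t needed)
{
	return (pcre_unsup_flags & needed) == 0;
}

/*
 * Hypothetical helper: true if every requested RTE_REGEX_PCRE_RULE_* bit is
 * present in the supported set reported in rte_regex_dev_info::rule_flags.
 */
static bool rule_flags_supported(uint64_t supported, uint64_t requested)
{
	return (requested & ~supported) == 0;
}
```

An application would fill both arguments from the structure returned by rte_regex_dev_info_get() and skip or rewrite any rule for which either check fails.
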
> +/* Enumerates RegEx device configuration flags */
> +#define RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F (1ULL << 0)
> +/**< Cross buffer scan refers to the ability to detect
> + * matches that occur across buffer boundaries, where the buffers are related
> + * to each other in some way. Enable this flag to scan a payload larger
> + * than struct rte_regex_dev_info::max_payload_size and/or when
> + * matches can be present across scan buffer boundaries.
> + *
> + * @see struct rte_regex_dev_info::max_payload_size
> + * @see struct rte_regex_dev_config::dev_cfg_flags, rte_regex_dev_configure()
> + * @see RTE_REGEX_OPS_RSP_PMI_SOJ_F
> + * @see RTE_REGEX_OPS_RSP_PMI_EOJ_F
> + */
> +
> +/** RegEx device configuration structure */
> +struct rte_regex_dev_config {
> +	uint8_t nb_max_matches;
> +	/**< Maximum matches per scan configured on this device.
> +	 * This value cannot exceed the *max_matches*
> +	 * previously reported by rte_regex_dev_info_get().
> +	 * The value 0 is allowed, in which case the value 1 is used.
> +	 * @see struct rte_regex_dev_info::max_matches
> +	 */
> +	uint16_t nb_queue_pairs;
> +	/**< Number of RegEx queue pairs to configure on this device.
> +	 * This value cannot exceed the *max_queue_pairs* previously
> +	 * reported by rte_regex_dev_info_get().
> +	 * @see struct rte_regex_dev_info::max_queue_pairs
> +	 */
> +	uint16_t nb_rules_per_group;
> +	/**< Number of rules per group to configure on this device.
> +	 * This value cannot exceed the *max_rules_per_group*
> +	 * previously reported by rte_regex_dev_info_get().
> +	 * The value 0 is allowed, in which case
> +	 * struct rte_regex_dev_info::max_rules_per_group is used.
> +	 * @see struct rte_regex_dev_info::max_rules_per_group
> +	 */
> +	uint16_t nb_groups;
> +	/**< Number of groups to configure on this device.
> +	 * This value cannot exceed the *max_groups*
> +	 * previously reported by rte_regex_dev_info_get().
> +	 * @see struct rte_regex_dev_info::max_groups
> +	 */
> +	const char *rule_db;
> +	/**< Import an initial prebuilt rule database on this device.
> +	 * The value NULL is allowed, in which case the device will not
> +	 * be configured with a prebuilt rule database. The application may use
> +	 * the rte_regex_rule_db_update() or rte_regex_rule_db_import() API
> +	 * to update or import the rule database after
> +	 * rte_regex_dev_configure().
> +	 * @see rte_regex_rule_db_update(), rte_regex_rule_db_import()
> +	 */
> +	uint32_t rule_db_len;
> +	/**< Length of *rule_db* buffer. */
> +	uint32_t dev_cfg_flags;
> +	/**< RegEx device configuration flags, See RTE_REGEX_DEV_CFG_* */
> +};
> +
> +/**
> + * Configure a RegEx device.
> + *
> + * This function must be invoked before any other function in the
> + * API. This function can also be re-invoked when a device is in the
> + * stopped state.
> + *
> + * The caller may use rte_regex_dev_info_get() to get the capabilities of the
> + * resources available for this regex device.
> + *
> + * @param dev_id
> + *   The identifier of the device to configure.
> + * @param cfg
> + *   The RegEx device configuration structure.
> + *
> + * @return
> + *   - 0: Success, device configured.
> + *   - <0: Error code returned by the driver configuration function.
> + */
> +int
> +rte_regex_dev_configure(uint8_t dev_id, const struct rte_regex_dev_config *cfg);
> +
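
The field comments of rte_regex_dev_config describe two defaulting rules (0 means 1 for *nb_max_matches*, 0 means the device maximum for *nb_rules_per_group*) and an upper bound for each field taken from rte_regex_dev_info. A minimal sketch of that check follows. It uses trimmed-down stand-in structures (`info_limits`, `cfg_request`) rather than the RFC structures, and the helper `cfg_resolve()` and its -EINVAL return are assumptions made for illustration.

```c
#include <errno.h>
#include <stdint.h>

/* Stand-in for the limit fields of struct rte_regex_dev_info. */
struct info_limits {
	uint8_t max_matches;
	uint16_t max_queue_pairs;
	uint16_t max_rules_per_group;
	uint16_t max_groups;
};

/* Stand-in for the sizing fields of struct rte_regex_dev_config. */
struct cfg_request {
	uint8_t nb_max_matches;
	uint16_t nb_queue_pairs;
	uint16_t nb_rules_per_group;
	uint16_t nb_groups;
};

/*
 * Hypothetical helper: apply the documented defaults, then verify that no
 * requested value exceeds the limit reported by the device.
 */
static int cfg_resolve(const struct info_limits *lim, struct cfg_request *cfg)
{
	if (cfg->nb_max_matches == 0)
		cfg->nb_max_matches = 1; /* documented default */
	if (cfg->nb_rules_per_group == 0)
		cfg->nb_rules_per_group = lim->max_rules_per_group;

	if (cfg->nb_max_matches > lim->max_matches ||
	    cfg->nb_queue_pairs > lim->max_queue_pairs ||
	    cfg->nb_rules_per_group > lim->max_rules_per_group ||
	    cfg->nb_groups > lim->max_groups)
		return -EINVAL;
	return 0;
}
```

Whether a driver would perform this validation itself or leave it to the application is not specified in the RFC.
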
> +/* Enumerates RegEx queue pair configuration flags */
> +#define RTE_REGEX_QUEUE_PAIR_CFG_OOS_F (1ULL << 0)
> +/**< Out of order scan. If not set, a scan must retire after previously issued
> + * in-order scans to this queue pair. If set, this scan can be retired as soon
> + * as the device returns completion. The application should not set the out of
> + * order scan flag if it needs to maintain the ingress order of scan requests.
> + *
> + * @see struct rte_regex_qp_conf::qp_conf_flags, rte_regex_queue_pair_setup()
> + */
> +
> +struct rte_regex_ops;
> +typedef void (*regexdev_stop_flush_t)(uint8_t dev_id, uint16_t qp_id,
> +				      struct rte_regex_ops *op);
> +/**< Callback function called during rte_regex_dev_stop(), invoked once per
> + * flushed RegEx op.
> + */
> +
> +/** RegEx queue pair configuration structure */
> +struct rte_regex_qp_conf {
> +	uint32_t qp_conf_flags;
> +	/**< Queue pair config flags, See RTE_REGEX_QUEUE_PAIR_CFG_* */
> +	uint16_t nb_desc;
> +	/**< The number of descriptors to allocate for this queue pair. */
> +	regexdev_stop_flush_t cb;
> +	/**< Callback function called during rte_regex_dev_stop(), invoked
> +	 * once per flushed regex op. Value NULL is allowed, in which case
> +	 * callback will not be invoked. This function can be used to properly
> +	 * dispose of outstanding regex ops from response queue,
> +	 * for example ops containing memory pointers.
> +	 * @see rte_regex_dev_stop()
> +	 */
> +};
> +
> +/**
> + * Allocate and set up a RegEx queue pair for a RegEx device.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param queue_pair_id
> + *   The index of the RegEx queue pair to setup. The value must be in the range
> + *   [0, nb_queue_pairs - 1] previously supplied to rte_regex_dev_configure().
> + * @param qp_conf
> + *   The pointer to the configuration data to be used for the RegEx queue pair.
> + *   NULL value is allowed, in which case the default configuration is used.
> + *
> + * @return
> + *   - 0: Success, RegEx queue pair correctly set up.
> + *   - <0: RegEx queue configuration failed
> + */
> +int
> +rte_regex_queue_pair_setup(uint8_t dev_id, uint8_t queue_pair_id,
> +			   const struct rte_regex_qp_conf *qp_conf);
> +
> +/**
> + * Start a RegEx device.
> + *
> + * The device start step is the last one and consists of setting the RegEx
> + * queues to start accepting the pattern matching scan requests.
> + *
> + * On success, all basic functions exported by the API (RegEx enqueue,
> + * RegEx dequeue and so on) can be invoked.
> + *
> + * @param dev_id
> + *   RegEx device identifier
> + * @return
> + *   - 0: Success, device started.
> + *   - <0: Device start failed.
> + */
> +int
> +rte_regex_dev_start(uint8_t dev_id);
> +
> +/**
> + * Stop a RegEx device.
> + *
> + * The device can be restarted with a call to
> + * rte_regex_dev_start().
> + *
> + * This function causes all queued response regex ops to be drained from the
> + * response queue. While draining ops out of the device,
> + * struct rte_regex_qp_conf::cb will be invoked for each op.
> + *
> + * @param dev_id
> + *   RegEx device identifier.
> + *
> + * @see struct rte_regex_qp_conf::cb, rte_regex_queue_pair_setup()
> + */
> +void
> +rte_regex_dev_stop(uint8_t dev_id);
> +
> +/**
> + * Close a RegEx device. The device cannot be restarted!
> + *
> + * @param dev_id
> + *   RegEx device identifier
> + *
> + * @return
> + *  - 0 if the device was closed successfully.
> + *  - <0 on failure to close the device.
> + */
> +int
> +rte_regex_dev_close(uint8_t dev_id);
> +
> +/* Device get/set attributes */
> +
> +/** Enumerates RegEx device attribute identifier */
> +enum rte_regex_dev_attr_id {
> +	RTE_REGEX_DEV_ATTR_SOCKET_ID,
> +	/**< The NUMA socket id to which the device is connected or
> +	 * a default of zero if the socket could not be determined.
> +	 * datatype: *int*
> +	 * operation: *get*
> +	 */
> +	RTE_REGEX_DEV_ATTR_MAX_MATCHES,
> +	/**< Maximum number of matches per scan.
> +	 * datatype: *uint8_t*
> +	 * operation: *get* and *set*
> +	 *
> +	 * @see RTE_REGEX_OPS_RSP_MAX_MATCH_F
> +	 */
> +	RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT,
> +	/**< Upper bound scan time in ns.
> +	 * datatype: *uint16_t*
> +	 * operation: *get* and *set*
> +	 *
> +	 * @see RTE_REGEX_OPS_RSP_MAX_SCAN_TIMEOUT_F
> +	 */
> +	RTE_REGEX_DEV_ATTR_MAX_PREFIX,
> +	/**< Maximum number of prefixes detected per scan.
> +	 * This would be useful for denial of service detection.
> +	 * datatype: *uint16_t*
> +	 * operation: *get* and *set*
> +	 *
> +	 * @see RTE_REGEX_OPS_RSP_MAX_PREFIX_F
> +	 */
> +};
> +
> +/**
> + * Get an attribute from a RegEx device.
> + *
> + * @param dev_id RegEx device identifier
> + * @param attr_id The attribute ID to retrieve
> + * @param[out] attr_value A pointer that will be filled in with the attribute
> + *             value if successful.
> + *
> + * @return
> + *   - 0: Successfully retrieved attribute value.
> + *   - -EINVAL: Invalid device or *attr_id* provided, or *attr_value* is NULL.
> + *   - -ENOTSUP: if the device doesn't support specific *attr_id*.
> + */
> +int
> +rte_regex_dev_attr_get(uint8_t dev_id, enum rte_regex_dev_attr_id attr_id,
> +		       void *attr_value);
> +
> +/**
> + * Set an attribute of a RegEx device.
> + *
> + * @param dev_id RegEx device identifier
> + * @param attr_id The attribute ID to set
> + * @param attr_value A pointer to the attribute value supplied
> + *                   by the application
> + *
> + * @return
> + *   - 0: Successfully applied the attribute value.
> + *   - -EINVAL: Invalid device or *attr_id* provided, or *attr_value* is NULL.
> + *   - -ENOTSUP: if the device doesn't support specific *attr_id*.
> + */
> +int
> +rte_regex_dev_attr_set(uint8_t dev_id, enum rte_regex_dev_attr_id attr_id,
> +		       const void *attr_value);
> +
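
Since both attribute calls take a void pointer, the caller has to pass storage of exactly the datatype documented for each attribute ID (int, uint8_t or uint16_t). A minimal sketch of that contract follows, using a software stand-in instead of the proposed functions. `attr_store`, the `ATTR_*` names, `attr_get()` and `attr_set()` are hypothetical, and returning -ENOTSUP when setting the get-only socket ID is an assumption.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the attribute IDs of enum rte_regex_dev_attr_id. */
enum attr_id {
	ATTR_SOCKET_ID,        /* int, get only */
	ATTR_MAX_MATCHES,      /* uint8_t, get and set */
	ATTR_MAX_SCAN_TIMEOUT, /* uint16_t, get and set */
	ATTR_MAX_PREFIX        /* uint16_t, get and set */
};

/* Hypothetical per-device attribute storage. */
struct attr_store {
	int socket_id;
	uint8_t max_matches;
	uint16_t max_scan_timeout;
	uint16_t max_prefix;
};

/* Copies the attribute out using the datatype documented for its ID. */
static int attr_get(const struct attr_store *s, enum attr_id id, void *value)
{
	if (s == NULL || value == NULL)
		return -EINVAL;
	switch (id) {
	case ATTR_SOCKET_ID: *(int *)value = s->socket_id; return 0;
	case ATTR_MAX_MATCHES: *(uint8_t *)value = s->max_matches; return 0;
	case ATTR_MAX_SCAN_TIMEOUT:
		*(uint16_t *)value = s->max_scan_timeout; return 0;
	case ATTR_MAX_PREFIX: *(uint16_t *)value = s->max_prefix; return 0;
	}
	return -EINVAL;
}

/* Reads the attribute in using the datatype documented for its ID. */
static int attr_set(struct attr_store *s, enum attr_id id, const void *value)
{
	if (s == NULL || value == NULL)
		return -EINVAL;
	switch (id) {
	case ATTR_SOCKET_ID: return -ENOTSUP; /* assumption: get only */
	case ATTR_MAX_MATCHES: s->max_matches = *(const uint8_t *)value; return 0;
	case ATTR_MAX_SCAN_TIMEOUT:
		s->max_scan_timeout = *(const uint16_t *)value; return 0;
	case ATTR_MAX_PREFIX: s->max_prefix = *(const uint16_t *)value; return 0;
	}
	return -EINVAL;
}
```

Passing, say, an int for a uint8_t attribute would compile without warning, which is the main hazard of the void pointer interface.
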
> +/* Rule related APIs */
> +/** Enumerates RegEx rule operation */
> +enum rte_regex_rule_op {
> +	RTE_REGEX_RULE_OP_ADD,
> +	/**< Add RegEx rule to rule database */
> +	RTE_REGEX_RULE_OP_REMOVE
> +	/**< Remove RegEx rule from rule database */
> +};
> +
> +/** Structure to hold a RegEx rule attributes */
> +struct rte_regex_rule {
> +	enum rte_regex_rule_op op;
> +	/**< OP type of the rule, either OP_ADD or OP_REMOVE */
> +	uint16_t group_id;
> +	/**< Group identifier to which the rule belongs. */
> +	uint32_t rule_id;
> +	/**< Rule identifier which is returned on successful match. */
> +	const char *pcre_rule;
> +	/**< Buffer to hold the PCRE rule. */
> +	uint16_t pcre_rule_len;
> +	/**< Length of the PCRE rule */
> +	uint64_t rule_flags;
> +	/**< PCRE rule flags. Supported device specific PCRE rules enumerated
> +	 * in struct rte_regex_dev_info::rule_flags. For successful rule
> +	 * database update, application needs to provide only supported
> +	 * rule flags.
> +	 * @see RTE_REGEX_PCRE_RULE_*, struct rte_regex_dev_info::rule_flags
> +	 */
> +};
> +
> +/**
> + * Update the rule database of a RegEx device.
> + *
> + * @param dev_id RegEx device identifier
> + * @param rules
> + *   Points to an array of *nb_rules* objects of type *rte_regex_rule* structure
> + *   which contain the regex rules attributes to be updated in rule database.
> + * @param nb_rules
> + *   The number of PCRE rules to update the rule database.
> + *
> + * @return
> + *   The number of regex rules actually updated on the regex device's rule
> + *   database. The return value can be less than the value of the *nb_rules*
> + *   parameter when the regex device fails to update the rule database or
> + *   if invalid parameters are specified in a *rte_regex_rule*.
> + *   If the return value is less than *nb_rules*, the remaining PCRE rules
> + *   at the end of *rules* are not consumed, the caller has to take
> + *   care of them, and rte_errno is set accordingly.
> + *   Possible errno values include:
> + *   - -EINVAL:  Invalid device ID or rules is NULL
> + *   - -ENOTSUP: The last processed rule is not supported on this device.
> + *   - -ENOSPC: No space available in rule database.
> + *
> + * @see rte_regex_rule_db_import(), rte_regex_rule_db_export()
> + */
> +uint16_t
> +rte_regex_rule_db_update(uint8_t dev_id, const struct rte_regex_rule *rules,
> +			 uint16_t nb_rules);
> +
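
The partial-update contract of rte_regex_rule_db_update() (a short count plus rte_errno) implies a retry loop on the application side. A minimal sketch of one possible loop follows. It runs against a stub rather than the proposed function: `struct rule`, `update_fn`, `stub_update()` and `update_all()` are hypothetical, the error is passed through an out-parameter instead of rte_errno, and it assumes that the first unconsumed rule is the one that was rejected and that the error value is positive.

```c
#include <errno.h>
#include <stdint.h>

/* Stand-in for struct rte_regex_rule; 'supported' drives the stub below. */
struct rule {
	uint32_t rule_id;
	int supported;
};

/* Same shape as the proposed call, with the error passed back via *err. */
typedef uint16_t (*update_fn)(const struct rule *rules, uint16_t nb, int *err);

/* Stub backend: consumes rules until it meets an unsupported one. */
static uint16_t stub_update(const struct rule *rules, uint16_t nb, int *err)
{
	uint16_t i;

	*err = 0;
	for (i = 0; i < nb; i++) {
		if (!rules[i].supported) {
			*err = ENOTSUP;
			break;
		}
	}
	return i;
}

/*
 * Hypothetical helper: keep submitting the unconsumed tail, skipping any
 * rule rejected with ENOTSUP. Any other error ends the loop, since retrying
 * cannot help. Returns the number of rules installed.
 */
static uint16_t update_all(update_fn fn, const struct rule *rules, uint16_t nb,
			   uint16_t *skipped)
{
	uint16_t done = 0, installed = 0;

	*skipped = 0;
	while (done < nb) {
		int err = 0;
		uint16_t n = fn(rules + done, (uint16_t)(nb - done), &err);

		installed = (uint16_t)(installed + n);
		done = (uint16_t)(done + n);
		if (done == nb || err != ENOTSUP)
			break;
		done++; /* skip the rejected rule */
		(*skipped)++;
	}
	return installed;
}
```

With four rules of which the second is unsupported, this loop installs three rules and reports one skipped.
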
> +/**
> + * Import a prebuilt rule database from a buffer to a RegEx device.
> + *
> + * @param dev_id RegEx device identifier
> + * @param rule_db
> + *   Points to prebuilt rule database.
> + * @param rule_db_len
> + *   Length of the rule database.
> + *
> + * @return
> + *   - 0: Successfully updated the prebuilt rule database.
> + *   - -EINVAL:  Invalid device ID or rule_db is NULL
> + *   - -ENOTSUP: Rule database import is not supported on this device.
> + *   - -ENOSPC: No space available in rule database.
> + *
> + * @see rte_regex_rule_db_update(), rte_regex_rule_db_export()
> + */
> +int
> +rte_regex_rule_db_import(uint8_t dev_id, const char *rule_db,
> +			 uint32_t rule_db_len);
> +
> +/**
> + * Export the prebuilt rule database from a RegEx device to the buffer.
> + *
> + * @param dev_id RegEx device identifier
> + * @param[out] rule_db
> + *   Block of memory to insert the rule database. Must be at least the required
> + *   size in capacity. If set to NULL, function returns required capacity.
> + *
> + * @return
> + *   - 0: Successfully exported the prebuilt rule database.
> + *   - size: If rule_db set to NULL then required capacity for *rule_db*
> + *   - -EINVAL:  Invalid device ID
> + *   - -ENOTSUP: Rule database export is not supported on this device.
> + *
> + * @see rte_regex_rule_db_update(), rte_regex_rule_db_import()
> + */
> +int
> +rte_regex_rule_db_export(uint8_t dev_id, char *rule_db);
> +
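
The export call is documented as a two-step query: a first call with NULL returns the required capacity, a second call fills the buffer. A minimal sketch of that pattern follows, written against a stub because the proposed function is not available. `stub_db`, `stub_export()` and `export_db()` are hypothetical names.

```c
#include <stdlib.h>
#include <string.h>

/* Stub database contents, standing in for a device's compiled rules. */
static const char stub_db[] = "compiled-rules";

/*
 * Stub with the same shape as the proposed rte_regex_rule_db_export(),
 * minus dev_id: NULL returns the required capacity, otherwise the database
 * is copied and 0 is returned.
 */
static int stub_export(char *rule_db)
{
	if (rule_db == NULL)
		return (int)sizeof(stub_db);
	memcpy(rule_db, stub_db, sizeof(stub_db));
	return 0;
}

/*
 * Hypothetical helper: query the size, allocate, then export. Returns a
 * heap buffer the caller must free(), or NULL on failure.
 */
static char *export_db(int (*export_fn)(char *), int *len)
{
	int size = export_fn(NULL);
	char *buf;

	if (size <= 0)
		return NULL;
	buf = malloc((size_t)size);
	if (buf == NULL)
		return NULL;
	if (export_fn(buf) != 0) {
		free(buf);
		return NULL;
	}
	*len = size;
	return buf;
}
```

Note that the proposed prototype carries no capacity argument, so the second call has to trust that the database did not grow between the two calls.
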
> +/* Extended statistics */
> +/** Maximum name length for extended statistics counters */
> +#define RTE_REGEX_DEV_XSTATS_NAME_SIZE 64
> +
> +/**
> + * A name-key lookup element for extended statistics.
> + *
> + * This structure is used to map between names and ID numbers
> + * for extended RegEx device statistics.
> + */
> +struct rte_regex_dev_xstats_map {
> +	uint16_t id;
> +	/**< xstat identifier */
> +	char name[RTE_REGEX_DEV_XSTATS_NAME_SIZE];
> +	/**< xstat name */
> +};
> +
> +/**
> + * Retrieve names of extended statistics of a regex device.
> + *
> + * @param dev_id
> + *   The identifier of the regex device.
> + * @param[out] xstats_map
> + *   Block of memory to insert id and names into. Must be at least the required
> + *   size in capacity. If set to NULL, function returns required capacity.
> + * @return
> + *   - positive value on success:
> + *        - The return value is the number of entries filled in the stats map.
> + *        - If xstats_map set to NULL then required capacity for xstats_map.
> + *   - negative value on error:
> + *      -ENODEV for invalid *dev_id*
> + *      -ENOTSUP if the device doesn't support this function.
> + */
> +int
> +rte_regex_dev_xstats_names_get(uint8_t dev_id,
> +			       struct rte_regex_dev_xstats_map *xstats_map);
> +
> +/**
> + * Retrieve extended statistics of a regex device.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param ids
> + *   The id numbers of the stats to get. The ids can be obtained from the stat
> + *   position in the stat list from rte_regex_dev_xstats_names_get(), or
> + *   by using rte_regex_dev_xstats_by_name_get().
> + * @param[out] values
> + *   The values for each stat requested by ID.
> + * @param n
> + *   The number of stats requested
> + * @return
> + *   - positive value: number of stat entries filled into the values array
> + *   - negative value on error:
> + *      -ENODEV for invalid *dev_id*
> + *      -ENOTSUP if the device doesn't support this function.
> + */
> +int
> +rte_regex_dev_xstats_get(uint8_t dev_id, const uint16_t ids[],
> +			 uint64_t values[], uint16_t n);
> +
> +/**
> + * Retrieve the value of a single stat by requesting it by name.
> + *
> + * @param dev_id
> + *   The identifier of the device
> + * @param name
> + *   The stat name to retrieve
> + * @param[out] id
> + *   If non-NULL, the numerical id of the stat will be returned, so that further
> + *   requests for the stat can be made using rte_regex_dev_xstats_get(), which
> + *   will be faster as it doesn't need to scan a list of names for the stat.
> + * @param[out] value
> + *   Must be non-NULL, retrieved xstat value will be stored in this address.
> + *
> + * @return
> + *   - 0: Successfully retrieved xstat value.
> + *   - -EINVAL: invalid parameters
> + *   - -ENOTSUP: if not supported.
> + */
> +int
> +rte_regex_dev_xstats_by_name_get(uint8_t dev_id, const char *name,
> +				 uint16_t *id, uint64_t *value);
> +
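To illustrate the intended split between the slow by-name lookup and the faster by-id read, here is a small sketch. Both `stub_xstats_by_name_get()` and `stub_xstats_get()` are hypothetical stand-ins that mirror the proposed signatures; the counter names, the values, and the -EINVAL return for an out-of-range id are assumptions, not something the RFC specifies.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical counters standing in for a real PMD's xstats. */
#define STUB_NB_XSTATS 2
static const char *const stub_names[STUB_NB_XSTATS] = { "scans", "matches" };
static uint64_t stub_values[STUB_NB_XSTATS] = { 100, 7 };

/* Stand-in for the proposed rte_regex_dev_xstats_by_name_get(). */
static int
stub_xstats_by_name_get(uint8_t dev_id, const char *name,
			uint16_t *id, uint64_t *value)
{
	uint16_t i;

	if (dev_id != 0 || name == NULL || value == NULL)
		return -EINVAL;
	/* Linear scan over the names: this is the slow path. */
	for (i = 0; i < STUB_NB_XSTATS; i++) {
		if (strcmp(stub_names[i], name) == 0) {
			if (id != NULL)
				*id = i;	/* cache this for later reads */
			*value = stub_values[i];
			return 0;
		}
	}
	return -EINVAL;
}

/* Stand-in for the proposed rte_regex_dev_xstats_get(). */
static int
stub_xstats_get(uint8_t dev_id, const uint16_t ids[],
		uint64_t values[], uint16_t n)
{
	uint16_t i;

	if (dev_id != 0)
		return -ENODEV;
	/* Direct indexing by id: no string comparison needed. */
	for (i = 0; i < n; i++) {
		if (ids[i] >= STUB_NB_XSTATS)
			return -EINVAL;	/* assumption, not in the RFC */
		values[i] = stub_values[ids[i]];
	}
	return n;
}
```

An application would typically resolve the id once at init time and then poll with the id-based call.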
> +/**
> + * Reset the values of the xstats of the selected component in the device.
> + *
> + * @param dev_id
> + *   The identifier of the device
> + * @param ids
> + *   Selects specific statistics to be reset. When NULL, all statistics will be
> + *   reset. If non-NULL, must point to array of at least *nb_ids* size.
> + * @param nb_ids
> + *   The number of ids available from the *ids* array. Ignored when ids is NULL.
> + * @return
> + *   - 0: Successfully reset the statistics to zero.
> + *   - -EINVAL: invalid parameters
> + *   - -ENOTSUP: if not supported.
> + */
> +int
> +rte_regex_dev_xstats_reset(uint8_t dev_id, const uint16_t ids[],
> +			   uint16_t nb_ids);
> +
> +/**
> + * Trigger the RegEx device self test.
> + *
> + * @param dev_id
> + *   The identifier of the device
> + * @return
> + *   - 0: Selftest successful
> + *   - -ENOTSUP if the device doesn't support selftest
> + *   - other values < 0 on failure.
> + */
> +int rte_regex_dev_selftest(uint8_t dev_id);
> +
> +/**
> + * Dump internal information about *dev_id* to the FILE* provided in *f*.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + *
> + * @param f
> + *   A pointer to a file for output
> + *
> + * @return
> + *   - 0: on success
> + *   - <0: on failure.
> + */
> +int
> +rte_regex_dev_dump(uint8_t dev_id, FILE *f);
> +
> +/* Fast path APIs */
> +
> +/**
> + * The generic *rte_regex_match* structure to hold the RegEx match attributes.
> + * @see struct rte_regex_ops::matches
> + */
> +struct rte_regex_match {
> +	RTE_STD_C11
> +	union {
> +		uint64_t u64;
> +		struct {
> +			uint32_t rule_id:20;
> +			/**< Rule identifier to which the pattern matched.
> +			 * @see struct rte_regex_rule::rule_id
> +			 */
> +			uint32_t group_id:12;
> +			/**< Group identifier of the rule which the pattern
> +			 * matched. @see struct rte_regex_rule::group_id
> +			 */
> +			uint16_t offset;
> +			/**< Starting Byte Position for matched rule. */
> +			uint16_t len;
> +			/**< Length of match in bytes */
> +		};
> +	};
> +};
> +
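A quick way to sanity-check the layout above is a local copy with a compile-time size check, as sketched here. `struct regex_match` and `make_match()` are illustrative names, RTE_STD_C11 is dropped, and the 8-byte result assumes a typical GCC/clang ABI. One thing the sketch makes visible: group_id is a 12-bit field here, while the group_id0..3 request fields in rte_regex_ops are uint16_t, so larger group IDs would not round-trip.

```c
#include <stdint.h>

/* Local copy of the proposed struct rte_regex_match (RTE_STD_C11 dropped). */
struct regex_match {
	union {
		uint64_t u64;
		struct {
			uint32_t rule_id:20;
			uint32_t group_id:12;
			uint16_t offset;
			uint16_t len;
		};
	};
};

/* The whole tuple is expected to fit in one 64-bit word. */
_Static_assert(sizeof(struct regex_match) == sizeof(uint64_t),
	       "match tuple must be 8 bytes");

/* Largest values representable in the two bit-fields. */
#define MATCH_RULE_ID_MAX  ((1u << 20) - 1)
#define MATCH_GROUP_ID_MAX ((1u << 12) - 1)

/* Illustrative helper: build a match tuple from its components. */
static struct regex_match
make_match(uint32_t rule_id, uint32_t group_id, uint16_t offset, uint16_t len)
{
	struct regex_match m;

	m.u64 = 0;
	m.rule_id = rule_id;	/* silently truncated to 20 bits */
	m.group_id = group_id;	/* silently truncated to 12 bits */
	m.offset = offset;
	m.len = len;
	return m;
}
```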
> +/* Enumerates RegEx request flags. */
> +#define RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F (1 << 0)
> +/**< Set when struct rte_regex_ops::group_id1 is valid */
> +
> +#define RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F (1 << 1)
> +/**< Set when struct rte_regex_ops::group_id2 is valid */
> +
> +#define RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F (1 << 2)
> +/**< Set when struct rte_regex_ops::group_id3 is valid */
> +
> +#define RTE_REGEX_OPS_REQ_STOP_ON_MATCH_F (1 << 4)
> +/**< The RegEx engine will stop scanning and return the first match. */
> +
> +#define RTE_REGEX_OPS_REQ_MATCH_HIGH_PRIORITY_F (1 << 5)
> +/**< In High Priority mode a maximum of one match will be returned per scan to
> + * reduce the post-processing required by the application. The match with the
> + * lowest Rule id, lowest start pointer and lowest match length will be
> + * returned.
> + *
> + * @see struct rte_regex_ops::nb_actual_matches
> + * @see struct rte_regex_ops::nb_matches
> + */
> +
> +
> +/* Enumerates RegEx response flags. */
> +#define RTE_REGEX_OPS_RSP_PMI_SOJ_F (1 << 0)
> +/**< Indicates that the RegEx device has encountered a partial match at the
> + * start of scan in the given buffer.
> + *
> + * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
> + */
> +
> +#define RTE_REGEX_OPS_RSP_PMI_EOJ_F (1 << 1)
> +/**< Indicates that the RegEx device has encountered a partial match at the
> + * end of scan in the given buffer.
> + *
> + * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
> + */
> +
> +#define RTE_REGEX_OPS_RSP_MAX_SCAN_TIMEOUT_F (1 << 2)
> +/**< Indicates that the RegEx device has exceeded the max timeout while
> + * scanning the given buffer.
> + *
> + * @see RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT
> + */
> +
> +#define RTE_REGEX_OPS_RSP_MAX_MATCH_F (1 << 3)
> +/**< Indicates that the RegEx device has exceeded the max matches while
> + * scanning the given buffer.
> + *
> + * @see RTE_REGEX_DEV_ATTR_MAX_MATCHES
> + */
> +
> +#define RTE_REGEX_OPS_RSP_MAX_PREFIX_F (1 << 4)
> +/**< Indicates that the RegEx device has reached the max allowed prefix length
> + * while scanning the given buffer.
> + *
> + * @see RTE_REGEX_DEV_ATTR_MAX_PREFIX
> + */
> +
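On the application side, the response flags above would presumably be tested after dequeue along these lines. The flag values are copied from the proposed header; the two helper functions and their grouping of the flags are hypothetical, offered only to show how rsp_flags might be consumed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Response flag values copied from the proposed header. */
#define RTE_REGEX_OPS_RSP_PMI_SOJ_F		(1 << 0)
#define RTE_REGEX_OPS_RSP_PMI_EOJ_F		(1 << 1)
#define RTE_REGEX_OPS_RSP_MAX_SCAN_TIMEOUT_F	(1 << 2)
#define RTE_REGEX_OPS_RSP_MAX_MATCH_F		(1 << 3)
#define RTE_REGEX_OPS_RSP_MAX_PREFIX_F		(1 << 4)

/* Hypothetical helper: true when the device reported hitting a limit. */
static bool
scan_hit_limit(uint16_t rsp_flags)
{
	const uint16_t mask = RTE_REGEX_OPS_RSP_MAX_SCAN_TIMEOUT_F |
			      RTE_REGEX_OPS_RSP_MAX_MATCH_F |
			      RTE_REGEX_OPS_RSP_MAX_PREFIX_F;

	return (rsp_flags & mask) != 0;
}

/* Hypothetical helper: true when a partial match was seen at either end. */
static bool
scan_has_partial_match(uint16_t rsp_flags)
{
	const uint16_t mask = RTE_REGEX_OPS_RSP_PMI_SOJ_F |
			      RTE_REGEX_OPS_RSP_PMI_EOJ_F;

	return (rsp_flags & mask) != 0;
}
```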
> +/**
> + * The generic *rte_regex_ops* structure to hold the RegEx attributes
> + * for enqueue and dequeue operation.
> + */
> +struct rte_regex_ops {
> +	/* W0 */
> +	uint16_t req_flags;
> +	/**< Request flags for the RegEx ops.
> +	 * @see RTE_REGEX_OPS_REQ_*
> +	 */
> +	uint16_t scan_size;
> +	/**< Scan size of the buffer to be scanned in bytes. */
> +	uint16_t rsp_flags;
> +	/**< Response flags for the RegEx ops.
> +	 * @see RTE_REGEX_OPS_RSP_*
> +	 */
> +	uint8_t nb_actual_matches;
> +	/**< The total number of actual matches detected by the RegEx device. */
> +	uint8_t nb_matches;
> +	/**< The total number of matches returned by the RegEx device for this
> +	 * scan. The size of the *rte_regex_ops::matches* zero length array will
> +	 * be this value.
> +	 *
> +	 * @see struct rte_regex_ops::matches, struct rte_regex_match
> +	 */
> +
> +	/* W1 */
> +	RTE_STD_C11
> +	union {
> +		uint64_t u64;
> +		/**<  Allow 8-byte reserved on 32-bit system */
> +		void *buf_addr;
> +		/**< Virtual address of the pattern to be matched. */
> +	};
> +
> +	/* W2 */
> +	rte_iova_t buf_iova;
> +	/**< IOVA address of the pattern to be matched. */
> +
> +	/* W3 */
> +	uint16_t group_id0;
> +	/**< First group_id to match the rule against. At least one group ID
> +	 * must be provided by the application.
> +	 * When RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F is set, group_id1 is valid;
> +	 * similar flags apply to group_id2 and group_id3.
> +	 * Upon a match, struct rte_regex_match::group_id shall be updated
> +	 * with the matching group ID by the device. The group ID scheme provides
> +	 * rule isolation and effective pattern matching.
> +	 */
> +	uint16_t group_id1;
> +	/**< Second group_id to match the rule against.
> +	 *
> +	 * @see RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F
> +	 */
> +	uint16_t group_id2;
> +	/**< Third group_id to match the rule against.
> +	 *
> +	 * @see RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F
> +	 */
> +	uint16_t group_id3;
> +	/**< Fourth group_id to match the rule against.
> +	 *
> +	 * @see RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F
> +	 */
> +
> +	/* W4 */
> +	RTE_STD_C11
> +	union {
> +		uint64_t user_id;
> +		/**< Application specific opaque value. An application may use
> +		 * this field to hold an application specific value to share
> +		 * between the enqueue and dequeue operations.
> +		 * The implementation should not modify this field.
> +		 */
> +		void *user_ptr;
> +		/**< Pointer representation of *user_id* */
> +	};
> +
> +	/* W5 */
> +	struct rte_regex_match matches[];
> +	/**< Zero length array to hold the match tuples.
> +	 * The struct rte_regex_ops::nb_matches value holds the number of
> +	 * elements in this array.
> +	 *
> +	 * @see struct rte_regex_ops::nb_matches
> +	 */
> +};
> +
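Because matches[] is a trailing array, each op has to be sized for the maximum number of match tuples the application wants back. A sketch of that sizing follows; `struct regex_ops` and `struct regex_match` are trimmed local stand-ins rather than the proposed definitions, and calloc() is used only to keep the example self-contained where a real application would size mempool elements the same way.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Trimmed local stand-ins; only what is needed for sizing is kept. */
struct regex_match {
	uint64_t u64;
};

struct regex_ops {
	uint16_t req_flags;
	uint16_t scan_size;
	uint16_t rsp_flags;
	uint8_t nb_actual_matches;
	uint8_t nb_matches;
	void *buf_addr;
	uint64_t user_id;
	struct regex_match matches[];	/* trailing array of match tuples */
};

/* Bytes needed for one op that can report up to max_matches tuples. */
static size_t
regex_ops_size(uint8_t max_matches)
{
	return sizeof(struct regex_ops) +
	       (size_t)max_matches * sizeof(struct regex_match);
}

/* Allocate a zeroed op; the caller releases it with free(). */
static struct regex_ops *
regex_ops_alloc(uint8_t max_matches)
{
	return calloc(1, regex_ops_size(max_matches));
}
```

Note that nb_matches is a uint8_t, so at most 255 tuples can be reported per op regardless of how much room is allocated.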
> +/**
> + * Enqueue a burst of scan requests on a RegEx device.
> + *
> + * The rte_regex_enqueue_burst() function is invoked to place
> + * regex operations on the queue *qp_id* of the device designated by
> + * its *dev_id*.
> + *
> + * The *nb_ops* parameter is the number of operations to process, which are
> + * supplied in the *ops* array of *rte_regex_ops* structures.
> + *
> + * The rte_regex_enqueue_burst() function returns the number of
> + * operations it actually enqueued for processing. A return value equal to
> + * *nb_ops* means that all operations have been enqueued.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param qp_id
> + *   The index of the queue pair on which operations are to be enqueued for
> + *   processing. The value must be in the range [0, nb_queue_pairs - 1]
> + *   previously supplied to rte_regex_dev_configure().
> + * @param ops
> + *   The address of an array of *nb_ops* pointers to *rte_regex_ops* structures
> + *   which contain the regex operations to be processed.
> + * @param nb_ops
> + *   The number of operations to process.
> + *
> + * @return
> + *   The number of operations actually enqueued on the regex device. The return
> + *   value can be less than the value of the *nb_ops* parameter when the
> + *   regex device's queue is full or if invalid parameters are specified in
> + *   a *rte_regex_ops*. If the return value is less than *nb_ops*, the remaining
> + *   ops at the end of *ops* are not consumed and the caller has to take care
> + *   of them.
> + */
> +uint16_t
> +rte_regex_enqueue_burst(uint8_t dev_id, uint16_t qp_id,
> +			struct rte_regex_ops **ops, uint16_t nb_ops);
> +
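The "caller has to take care of them" part of the contract could look like the bounded retry sketched below. `stub_enqueue_burst()` is a hypothetical stand-in with the proposed signature that accepts at most three ops per call, and `enqueue_all()` with its max_tries parameter is illustrative. The bound matters because a short return can also mean an invalid op, in which case retrying forever would spin.

```c
#include <stddef.h>
#include <stdint.h>

struct regex_ops;	/* opaque here; stands in for struct rte_regex_ops */

/* Stub with the proposed enqueue signature: accepts at most 3 ops per call. */
static uint16_t
stub_enqueue_burst(uint8_t dev_id, uint16_t qp_id,
		   struct regex_ops **ops, uint16_t nb_ops)
{
	(void)dev_id;
	(void)qp_id;
	(void)ops;
	return nb_ops < 3 ? nb_ops : 3;
}

/*
 * Enqueue all ops, resubmitting the unconsumed tail for up to max_tries
 * calls. Returns the total number of ops accepted.
 */
static uint16_t
enqueue_all(uint8_t dev_id, uint16_t qp_id, struct regex_ops **ops,
	    uint16_t nb_ops, unsigned int max_tries)
{
	uint16_t sent = 0;

	while (sent < nb_ops && max_tries-- > 0)
		sent += stub_enqueue_burst(dev_id, qp_id, &ops[sent],
					   (uint16_t)(nb_ops - sent));
	return sent;
}
```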
> +/**
> + *
> + * Dequeue a burst of scan responses from a queue on the RegEx device.
> + * The dequeued operations are stored in *rte_regex_ops* structures
> + * whose pointers are supplied in the *ops* array.
> + *
> + * The rte_regex_dequeue_burst() function returns the number of ops
> + * actually dequeued, which is the number of *rte_regex_ops* data structures
> + * effectively supplied into the *ops* array.
> + *
> + * A return value equal to *nb_ops* indicates that the queue contained
> + * at least *nb_ops* operations, and this is likely to signify that other
> + * processed operations remain in the device's output queue. Applications
> + * implementing a "retrieve as many processed operations as possible" policy
> + * can check this specific case and keep invoking the
> + * rte_regex_dequeue_burst() function until a value less than
> + * *nb_ops* is returned.
> + *
> + * The rte_regex_dequeue_burst() function does not provide any error
> + * notification to avoid the corresponding overhead.
> + *
> + * @param dev_id
> + *   The RegEx device identifier
> + * @param qp_id
> + *   The index of the queue pair from which to retrieve processed operations.
> + *   The value must be in the range [0, nb_queue_pairs - 1] previously
> + *   supplied to rte_regex_dev_configure().
> + * @param ops
> + *   The address of an array of pointers to *rte_regex_ops* structures that must
> + *   be large enough to store *nb_ops* pointers in it.
> + * @param nb_ops
> + *   The maximum number of operations to dequeue.
> + *
> + * @return
> + *   The number of operations actually dequeued, which is the number
> + *   of pointers to *rte_regex_ops* structures effectively supplied to the
> + *   *ops* array. If the return value is less than *nb_ops*, the remaining
> + *   entries at the end of *ops* are not filled in.
> + */
> +uint16_t
> +rte_regex_dequeue_burst(uint8_t dev_id, uint16_t qp_id,
> +			struct rte_regex_ops **ops, uint16_t nb_ops);
> +
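The "retrieve as many processed operations as possible" policy described in the comment can be sketched as a drain loop. `stub_dequeue_burst()` is a hypothetical stand-in with the proposed signature, backed by a simple counter; BURST and `drain_queue()` are illustrative names.

```c
#include <stddef.h>
#include <stdint.h>

struct regex_ops;	/* opaque here; stands in for struct rte_regex_ops */

#define BURST 4

static unsigned int stub_pending = 10;	/* completed ops waiting in the stub */

/* Stub with the proposed dequeue signature. */
static uint16_t
stub_dequeue_burst(uint8_t dev_id, uint16_t qp_id,
		   struct regex_ops **ops, uint16_t nb_ops)
{
	uint16_t n = stub_pending < nb_ops ? (uint16_t)stub_pending : nb_ops;
	uint16_t i;

	(void)dev_id;
	(void)qp_id;
	for (i = 0; i < n; i++)
		ops[i] = NULL;	/* a real PMD would return op pointers */
	stub_pending -= n;
	return n;
}

/* Keep dequeuing while full bursts come back; a short burst ends the loop. */
static unsigned int
drain_queue(uint8_t dev_id, uint16_t qp_id)
{
	struct regex_ops *ops[BURST];
	unsigned int total = 0;
	uint16_t n;

	do {
		n = stub_dequeue_burst(dev_id, qp_id, ops, BURST);
		/* ops[0..n-1] would be processed here */
		total += n;
	} while (n == BURST);
	return total;
}
```

When the number of pending ops is an exact multiple of BURST, the loop needs one extra call that returns zero before it stops.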
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_REGEXDEV_H_ */
> --
> 2.21.0
Thomas Monjalon Aug. 15, 2019, 9:35 a.m. UTC | #2
+Cc other interested vendors
+Cc contributors to µDPI project in fd.io

27/06/2019 17:50, jerinj@marvell.com:
> From: Jerin Jacob <jerinj@marvell.com>
> 
> Even though there are some vendors which offer Regex HW offload, due to
> lack of a standard API, it is difficult for DPDK consumers to use them
> in a portable way.
> 
> This _RFC_ attempts to standardize the RegEx/DPI offload APIs for DPDK.
> 
> The Doxygen generated RFC API documentation is available here:
> https://dreamy-noether-22777e.netlify.com/rte__regexdev_8h.html
> 
> This RFC is crafted based on SW Regex API frameworks such as libpcre and
> hyperscan and a few of the RegEx HW IPs which I am aware of.
> 
> RegEx pattern matching applications:
> • Next Generation Firewalls (NGFW)
> • Deep Packet and Flow Inspection (DPI)
> • Intrusion Prevention Systems (IPS)
> • DDoS Mitigation
> • Network Monitoring
> • Data Loss Prevention (DLP)
> • Smart NICs
> • Grammar based content processing
> • URL, spam and adware filtering
> • Advanced auditing and policing of user/application security policies
> • Financial data mining - parsing of streamed financial feeds 
> 
> Request to review from HW and SW RegEx vendors and RegEx application users
> to have portable DPDK API for RegEx.
> 
> The API schematics are based on the existing cryptodev, eventdev and ethdev device APIs.
> 
> Signed-off-by: Jerin Jacob <jerinj@marvell.com>
> Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
> ---
> 
> RTE RegEx Device API
> --------------------
> 
> Defines RTE RegEx Device APIs for RegEx operations and its provisioning.
> 
> The RegEx Device API is composed of two parts:
> 
> - The application-oriented RegEx API that includes functions to setup
>   a RegEx device (configure it, setup its queue pairs and start it),
>   update the rule database and so on.
> 
> - The driver-oriented RegEx API that exports a function allowing
>   a RegEx Poll Mode Driver (PMD) to simultaneously register itself as
>   a RegEx device driver.
> 
> RegEx device components and definitions:
> 
>     +-----------------+
>     |                 |
>     |                 o---------+    rte_regex_[en|de]queue_burst()
>     |   PCRE based    o------+  |               |
>     |  RegEx pattern  |      |  |  +--------+   |
>     | matching engine o------+--+--o        |   |    +------+
>     |                 |      |  |  | queue  |<==o===>|Core 0|
>     |                 o----+ |  |  | pair 0 |        |      |
>     |                 |    | |  |  +--------+        +------+
>     +-----------------+    | |  |
>            ^               | |  |  +--------+
>            |               | |  |  |        |        +------+
>            |               | +--+--o queue  |<======>|Core 1|
>        Rule|Database       |    |  | pair 1 |        |      |
>     +------+----------+    |    |  +--------+        +------+
>     |     Group 0     |    |    |
>     | +-------------+ |    |    |  +--------+        +------+
>     | | Rules 0..n  | |    |    |  |        |        |Core 2|
>     | +-------------+ |    |    +--o queue  |<======>|      |
>     |     Group 1     |    |       | pair 2 |        +------+
>     | +-------------+ |    |       +--------+
>     | | Rules 0..n  | |    |
>     | +-------------+ |    |       +--------+
>     |     Group 2     |    |       |        |        +------+
>     | +-------------+ |    |       | queue  |<======>|Core n|
>     | | Rules 0..n  | |    +-------o pair n |        |      |
>     | +-------------+ |            +--------+        +------+
>     |     Group n     |
>     | +-------------+ |<-------rte_regex_rule_db_update()
>     | | Rules 0..n  | |<-------rte_regex_rule_db_import()
>     | +-------------+ |------->rte_regex_rule_db_export()
>     +-----------------+
> 
> RegEx: A regular expression is a concise and flexible means for matching
> strings of text, such as particular characters, words, or patterns of
> characters. A common abbreviation for this is “RegEx”.
> 
> RegEx device: A hardware or software-based implementation of RegEx
> device API for PCRE based pattern matching syntax and semantics.
> 
> PCRE RegEx syntax and semantics specification:
> http://regexkit.sourceforge.net/Documentation/pcre/pcrepattern.html
> 
> RegEx queue pair: Each RegEx device should have one or more queue pairs to
> transmit a burst of pattern matching requests and receive a burst of
> pattern matching responses. The pattern matching request/response is
> embedded in the *rte_regex_ops* structure.
> 
> Rule: A pattern matching rule expressed in PCRE RegEx syntax along with
> Match ID and Group ID to identify the rule upon the match.
> 
> Rule database: The RegEx device accepts regular expressions and converts them
> into a compiled rule database that can then be used to scan data.
> Compilation allows the device to analyze the given pattern(s) and
> pre-determine how to scan for these patterns in an optimized fashion that
> would be far too expensive to compute at run-time. A rule database contains
> a set of rules that are compiled in a device specific binary form.
> 
> Match ID or Rule ID: A unique identifier provided at the time of rule
> creation for the application to identify the rule upon match.
> 
> Group ID: Rules can be grouped under one group ID to enable
> rule isolation and effective pattern matching. A unique group identifier is
> provided at the time of rule creation for the application to identify the
> rule upon match.
> 
> Scan: A pattern matching request through *enqueue* API.
> 
> A given RegEx device may not support all the features
> of PCRE. The application may probe unsupported features through
> struct rte_regex_dev_info::pcre_unsup_flags.
> 
> By default, all the functions of the RegEx Device API exported by a PMD
> are lock-free functions which are assumed not to be invoked in parallel on
> different logical cores to work on the same target object. For instance,
> the dequeue function of a PMD cannot be invoked in parallel on two logical
> cores to operate on the same RegEx queue pair. Of course, this function
> can be invoked in parallel by different logical cores on different queue pairs.
> It is the responsibility of the upper level application to enforce this rule.
> 
> In all functions of the RegEx API, the RegEx device is
> designated by an integer >= 0 named the device identifier *dev_id*
> 
> At the RegEx driver level, RegEx devices are represented by a generic
> data structure of type *rte_regex_dev*.
> 
> RegEx devices are dynamically registered during the PCI/SoC device probing
> phase performed at EAL initialization time.
> When a RegEx device is being probed, a *rte_regex_dev* structure and
> a new device identifier are allocated for that device. Then, the
> regex_dev_init() function supplied by the RegEx driver matching the probed
> device is invoked to properly initialize the device.
> 
> The role of the device init function consists of resetting the hardware or
> software RegEx driver implementations.
> 
> If the device init operation is successful, the correspondence between
> the device identifier assigned to the new device and its associated
> *rte_regex_dev* structure is effectively registered.
> Otherwise, both the *rte_regex_dev* structure and the device identifier are
> freed.
> 
> The functions exported by the application RegEx API to setup a device
> designated by its device identifier must be invoked in the following order:
>     - rte_regex_dev_configure()
>     - rte_regex_queue_pair_setup()
>     - rte_regex_dev_start()
> 
> Then, the application can invoke, in any order, the functions
> exported by the RegEx API to enqueue pattern matching jobs, dequeue pattern
> matching responses, get the stats, update the rule database,
> get/set device attributes and so on.
> 
> If the application wants to change the configuration (i.e. call
> rte_regex_dev_configure() or rte_regex_queue_pair_setup()), it must call
> rte_regex_dev_stop() first to stop the device and then do the reconfiguration
> before calling rte_regex_dev_start() again. The enqueue and dequeue
> functions should not be invoked when the device is stopped.
> 
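The ordering rules in the paragraphs above amount to a small state machine. The sketch below encodes them for discussion only: the enum names and `dev_call_allowed()` are hypothetical and the RFC defines no such states. It assumes that a stopped device keeps its queue pair setup, so it can be restarted directly or reconfigured first.

```c
#include <stdbool.h>

/* Hypothetical device states; the RFC does not define such an enum. */
enum dev_state { DEV_UNCONFIGURED, DEV_CONFIGURED, DEV_QP_READY, DEV_STARTED };

/* Hypothetical labels for the API calls whose ordering is constrained. */
enum dev_call { CALL_CONFIGURE, CALL_QP_SETUP, CALL_START, CALL_STOP,
		CALL_ENQ_DEQ };

/* Returns true and advances *st when the call is allowed in that state. */
static bool
dev_call_allowed(enum dev_state *st, enum dev_call call)
{
	switch (call) {
	case CALL_CONFIGURE:	/* only while the device is not running */
		if (*st == DEV_STARTED)
			return false;
		*st = DEV_CONFIGURED;
		return true;
	case CALL_QP_SETUP:	/* after configure, before start */
		if (*st != DEV_CONFIGURED && *st != DEV_QP_READY)
			return false;
		*st = DEV_QP_READY;
		return true;
	case CALL_START:	/* needs queue pairs to be set up */
		if (*st != DEV_QP_READY)
			return false;
		*st = DEV_STARTED;
		return true;
	case CALL_STOP:		/* assumed to keep the queue pair setup */
		if (*st != DEV_STARTED)
			return false;
		*st = DEV_QP_READY;
		return true;
	case CALL_ENQ_DEQ:	/* fast path only on a started device */
		return *st == DEV_STARTED;
	}
	return false;
}
```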
> Finally, an application can close a RegEx device by invoking the
> rte_regex_dev_close() function.
> 
> Each function of the application RegEx API invokes a specific function
> of the PMD that controls the target device designated by its device
> identifier.
> 
> For this purpose, all device-specific functions of a RegEx driver are
> supplied through a set of pointers contained in a generic structure of type
> *regex_dev_ops*.
> The address of the *regex_dev_ops* structure is stored in the *rte_regex_dev*
> structure by the device init function of the RegEx driver, which is
> invoked during the PCI/SoC device probing phase, as explained earlier.
> 
> In other words, each function of the RegEx API simply retrieves the
> *rte_regex_dev* structure associated with the device identifier and
> performs an indirect invocation of the corresponding driver function
> supplied in the *regex_dev_ops* structure of the *rte_regex_dev* structure.
> 
> For performance reasons, the address of the fast-path functions of the
> RegEx driver is not contained in the *regex_dev_ops* structure.
> Instead, they are directly stored at the beginning of the *rte_regex_dev*
> structure to avoid an extra indirect memory access during their invocation.
> 
> RTE RegEx device drivers do not use interrupts for enqueue or dequeue
> operation. Instead, RegEx drivers export Poll-Mode enqueue and dequeue
> functions to applications.
> 
> The *enqueue* operation submits a burst of RegEx pattern matching requests
> to the RegEx device and the *dequeue* operation gets a burst of pattern
> matching responses for the ones submitted through the *enqueue* operation.
> 
> Typical application utilisation of the RegEx device API will follow the
> following programming flow.
> 
> - rte_regex_dev_configure()
> - rte_regex_queue_pair_setup()
> - rte_regex_rule_db_update() Needs to be invoked if a precompiled rule database
>   is not provided in rte_regex_dev_config::rule_db for rte_regex_dev_configure()
>   and/or the application needs to update the rule database.
> - Create or reuse an existing mempool for *rte_regex_ops* objects.
> - rte_regex_dev_start()
> - rte_regex_enqueue_burst()
> - rte_regex_dequeue_burst()
> 
> ---
> 
>  config/common_base                 |    5 +
>  doc/api/doxy-api-index.md          |    1 +
>  doc/api/doxy-api.conf.in           |    1 +
>  lib/Makefile                       |    2 +
>  lib/librte_regexdev/Makefile       |   23 +
>  lib/librte_regexdev/rte_regexdev.c |    5 +
>  lib/librte_regexdev/rte_regexdev.h | 1247 ++++++++++++++++++++++++++++
>  7 files changed, 1284 insertions(+)
>  create mode 100644 lib/librte_regexdev/Makefile
>  create mode 100644 lib/librte_regexdev/rte_regexdev.c
>  create mode 100644 lib/librte_regexdev/rte_regexdev.h
> 
> diff --git a/config/common_base b/config/common_base
> index e406e7836..986093d6e 100644
> --- a/config/common_base
> +++ b/config/common_base
> @@ -746,6 +746,11 @@ CONFIG_RTE_LIBRTE_PMD_DPAA2_QDMA_RAWDEV=n
>  #
>  CONFIG_RTE_LIBRTE_PMD_IFPGA_RAWDEV=y
>  
> +#
> +# Compile regex device support
> +#
> +CONFIG_RTE_LIBRTE_REGEXDEV=y
> +
>  #
>  # Compile librte_ring
>  #
> diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
> index 715248dd1..a0bc27ae4 100644
> --- a/doc/api/doxy-api-index.md
> +++ b/doc/api/doxy-api-index.md
> @@ -26,6 +26,7 @@ The public API headers are grouped by topics:
>    [event_timer_adapter]    (@ref rte_event_timer_adapter.h),
>    [event_crypto_adapter]   (@ref rte_event_crypto_adapter.h),
>    [rawdev]             (@ref rte_rawdev.h),
> +  [regexdev]           (@ref rte_regexdev.h),
>    [metrics]            (@ref rte_metrics.h),
>    [bitrate]            (@ref rte_bitrate.h),
>    [latency]            (@ref rte_latencystats.h),
> diff --git a/doc/api/doxy-api.conf.in b/doc/api/doxy-api.conf.in
> index b9896cb63..7adb821bb 100644
> --- a/doc/api/doxy-api.conf.in
> +++ b/doc/api/doxy-api.conf.in
> @@ -53,6 +53,7 @@ INPUT                   = @TOPDIR@/doc/api/doxy-api-index.md \
>                            @TOPDIR@/lib/librte_rawdev \
>                            @TOPDIR@/lib/librte_rcu \
>                            @TOPDIR@/lib/librte_reorder \
> +                          @TOPDIR@/lib/librte_regexdev \
>                            @TOPDIR@/lib/librte_ring \
>                            @TOPDIR@/lib/librte_sched \
>                            @TOPDIR@/lib/librte_security \
> diff --git a/lib/Makefile b/lib/Makefile
> index 791e0d991..57de9691a 100644
> --- a/lib/Makefile
> +++ b/lib/Makefile
> @@ -44,6 +44,8 @@ DEPDIRS-librte_eventdev := librte_eal librte_ring librte_ethdev librte_hash \
>                             librte_mempool librte_timer librte_cryptodev
>  DIRS-$(CONFIG_RTE_LIBRTE_RAWDEV) += librte_rawdev
>  DEPDIRS-librte_rawdev := librte_eal librte_ethdev
> +DIRS-$(CONFIG_RTE_LIBRTE_REGEXDEV) += librte_regexdev
> +DEPDIRS-librte_regexdev := librte_eal
>  DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += librte_vhost
>  DEPDIRS-librte_vhost := librte_eal librte_mempool librte_mbuf librte_ethdev \
>  			librte_net
> diff --git a/lib/librte_regexdev/Makefile b/lib/librte_regexdev/Makefile
> new file mode 100644
> index 000000000..723b4b28c
> --- /dev/null
> +++ b/lib/librte_regexdev/Makefile
> @@ -0,0 +1,23 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright(C) 2019 Marvell International Ltd.
> +#
> +
> +include $(RTE_SDK)/mk/rte.vars.mk
> +
> +# library name
> +LIB = librte_regexdev.a
> +
> +# library version
> +LIBABIVER := 1
> +
> +# build flags
> +CFLAGS += -O3
> +CFLAGS += $(WERROR_FLAGS)
> +
> +# library source files
> +SRCS-y += rte_regexdev.c
> +
> +# export include files
> +SYMLINK-y-include += rte_regexdev.h
> +
> +include $(RTE_SDK)/mk/rte.lib.mk
> diff --git a/lib/librte_regexdev/rte_regexdev.c b/lib/librte_regexdev/rte_regexdev.c
> new file mode 100644
> index 000000000..e5be0f29c
> --- /dev/null
> +++ b/lib/librte_regexdev/rte_regexdev.c
> @@ -0,0 +1,5 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(C) 2019 Marvell International Ltd.
> + */
> +
> +#include <rte_regexdev.h>
> diff --git a/lib/librte_regexdev/rte_regexdev.h b/lib/librte_regexdev/rte_regexdev.h
> new file mode 100644
> index 000000000..765da4aaa
> --- /dev/null
> +++ b/lib/librte_regexdev/rte_regexdev.h
> @@ -0,0 +1,1247 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(C) 2019 Marvell International Ltd.
> + */
> +
> +#ifndef _RTE_REGEXDEV_H_
> +#define _RTE_REGEXDEV_H_
> +
> +/**
> + * @file
> + *
> + * RTE RegEx Device API
> + *
> + * Defines RTE RegEx Device APIs for RegEx operations and its provisioning.
> + *
> + * The RegEx Device API is composed of two parts:
> + *
> + * - The application-oriented RegEx API that includes functions to setup
> + *   a RegEx device (configure it, setup its queue pairs and start it),
> + *   update the rule database and so on.
> + *
> + * - The driver-oriented RegEx API that exports a function allowing
> + *   a RegEx Poll Mode Driver (PMD) to simultaneously register itself as
> + *   a RegEx device driver.
> + *
> + * RegEx device components and definitions:
> + *
> + *     +-----------------+
> + *     |                 |
> + *     |                 o---------+    rte_regex_[en|de]queue_burst()
> + *     |   PCRE based    o------+  |               |
> + *     |  RegEx pattern  |      |  |  +--------+   |
> + *     | matching engine o------+--+--o        |   |    +------+
> + *     |                 |      |  |  | queue  |<==o===>|Core 0|
> + *     |                 o----+ |  |  | pair 0 |        |      |
> + *     |                 |    | |  |  +--------+        +------+
> + *     +-----------------+    | |  |
> + *            ^               | |  |  +--------+
> + *            |               | |  |  |        |        +------+
> + *            |               | +--+--o queue  |<======>|Core 1|
> + *        Rule|Database       |    |  | pair 1 |        |      |
> + *     +------+----------+    |    |  +--------+        +------+
> + *     |     Group 0     |    |    |
> + *     | +-------------+ |    |    |  +--------+        +------+
> + *     | | Rules 0..n  | |    |    |  |        |        |Core 2|
> + *     | +-------------+ |    |    +--o queue  |<======>|      |
> + *     |     Group 1     |    |       | pair 2 |        +------+
> + *     | +-------------+ |    |       +--------+
> + *     | | Rules 0..n  | |    |
> + *     | +-------------+ |    |       +--------+
> + *     |     Group 2     |    |       |        |        +------+
> + *     | +-------------+ |    |       | queue  |<======>|Core n|
> + *     | | Rules 0..n  | |    +-------o pair n |        |      |
> + *     | +-------------+ |            +--------+        +------+
> + *     |     Group n     |
> + *     | +-------------+ |<-------rte_regex_rule_db_update()
> + *     | | Rules 0..n  | |<-------rte_regex_rule_db_import()
> + *     | +-------------+ |------->rte_regex_rule_db_export()
> + *     +-----------------+
> + *
> + * RegEx: A regular expression is a concise and flexible means for matching
> + * strings of text, such as particular characters, words, or patterns of
> + * characters. A common abbreviation for this is “RegEx”.
> + *
> + * RegEx device: A hardware or software-based implementation of RegEx
> + * device API for PCRE based pattern matching syntax and semantics.
> + *
> + * PCRE RegEx syntax and semantics specification:
> + * http://regexkit.sourceforge.net/Documentation/pcre/pcrepattern.html
> + *
> + * RegEx queue pair: Each RegEx device should have one or more queue pairs to
> + * transmit a burst of pattern matching requests and receive a burst of
> + * pattern matching responses. The pattern matching request/response is
> + * embedded in the *rte_regex_ops* structure.
> + *
> + * Rule: A pattern matching rule expressed in PCRE RegEx syntax along with
> + * Match ID and Group ID to identify the rule upon the match.
> + *
> + * Rule database: The RegEx device accepts regular expressions and converts them
> + * into a compiled rule database that can then be used to scan data.
> + * Compilation allows the device to analyze the given pattern(s) and
> + * pre-determine how to scan for these patterns in an optimized fashion that
> + * would be far too expensive to compute at run-time. A rule database contains
> + * a set of rules that are compiled in a device specific binary form.
> + *
> + * Match ID or Rule ID: A unique identifier provided at the time of rule
> + * creation for the application to identify the rule upon match.
> + *
> + * Group ID: Rules can be grouped under one group ID to enable
> + * rule isolation and effective pattern matching. A unique group identifier is
> + * provided at the time of rule creation for the application to identify the
> + * rule upon match.
> + *
> + * Scan: A pattern matching request through *enqueue* API.
> + *
> + * A given RegEx device may not support all the features
> + * of PCRE. The application may probe unsupported features through
> + * struct rte_regex_dev_info::pcre_unsup_flags.
> + *
> + * By default, all the functions of the RegEx Device API exported by a PMD
> + * are lock-free functions which are assumed not to be invoked in parallel on
> + * different logical cores to work on the same target object. For instance,
> + * the dequeue function of a PMD cannot be invoked in parallel on two logical
> + * cores to operate on the same RegEx queue pair. Of course, this function
> + * can be invoked in parallel by different logical cores on different queue pairs.
> + * It is the responsibility of the upper level application to enforce this rule.
> + *
> + * In all functions of the RegEx API, the RegEx device is
> + * designated by an integer >= 0 named the device identifier *dev_id*.
> + *
> + * At the RegEx driver level, RegEx devices are represented by a generic
> + * data structure of type *rte_regex_dev*.
> + *
> + * RegEx devices are dynamically registered during the PCI/SoC device probing
> + * phase performed at EAL initialization time.
> + * When a RegEx device is being probed, a *rte_regex_dev* structure and
> + * a new device identifier are allocated for that device. Then, the
> + * regex_dev_init() function supplied by the RegEx driver matching the probed
> + * device is invoked to properly initialize the device.
> + *
> + * The role of the device init function consists of resetting the hardware or
> + * software RegEx driver implementations.
> + *
> + * If the device init operation is successful, the correspondence between
> + * the device identifier assigned to the new device and its associated
> + * *rte_regex_dev* structure is effectively registered.
> + * Otherwise, both the *rte_regex_dev* structure and the device identifier are
> + * freed.
> + *
> + * The functions exported by the application RegEx API to setup a device
> + * designated by its device identifier must be invoked in the following order:
> + *     - rte_regex_dev_configure()
> + *     - rte_regex_queue_pair_setup()
> + *     - rte_regex_dev_start()
> + *
> + * Then, the application can invoke, in any order, the functions
> + * exported by the RegEx API to enqueue pattern matching jobs, dequeue pattern
> + * matching responses, get the stats, update the rule database,
> + * get/set device attributes and so on.
> + *
> + * If the application wants to change the configuration (i.e. call
> + * rte_regex_dev_configure() or rte_regex_queue_pair_setup()), it must call
> + * rte_regex_dev_stop() first to stop the device and then do the reconfiguration
> + * before calling rte_regex_dev_start() again. The enqueue and dequeue
> + * functions should not be invoked when the device is stopped.
> + *
> + * Finally, an application can close a RegEx device by invoking the
> + * rte_regex_dev_close() function.
> + *
> + * Each function of the application RegEx API invokes a specific function
> + * of the PMD that controls the target device designated by its device
> + * identifier.
> + *
> + * For this purpose, all device-specific functions of a RegEx driver are
> + * supplied through a set of pointers contained in a generic structure of type
> + * *regex_dev_ops*.
> + * The address of the *regex_dev_ops* structure is stored in the *rte_regex_dev*
> + * structure by the device init function of the RegEx driver, which is
> + * invoked during the PCI/SoC device probing phase, as explained earlier.
> + *
> + * In other words, each function of the RegEx API simply retrieves the
> + * *rte_regex_dev* structure associated with the device identifier and
> + * performs an indirect invocation of the corresponding driver function
> + * supplied in the *regex_dev_ops* structure of the *rte_regex_dev* structure.
> + *
> + * For performance reasons, the address of the fast-path functions of the
> + * RegEx driver is not contained in the *regex_dev_ops* structure.
> + * Instead, they are directly stored at the beginning of the *rte_regex_dev*
> + * structure to avoid an extra indirect memory access during their invocation.
> + *
> + * RTE RegEx device drivers do not use interrupts for enqueue or dequeue
> + * operation. Instead, RegEx drivers export Poll-Mode enqueue and dequeue
> + * functions to applications.
> + *
> + * The *enqueue* operation submits a burst of RegEx pattern matching requests
> + * to the RegEx device and the *dequeue* operation gets a burst of pattern
> + * matching responses for the ones submitted through the *enqueue* operation.
> + *
> + * Typical application utilisation of the RegEx device API will follow the
> + * following programming flow.
> + *
> + * - rte_regex_dev_configure()
> + * - rte_regex_queue_pair_setup()
> + * - rte_regex_rule_db_update() Needs to be invoked if a precompiled rule
> + *   database is not provided in rte_regex_dev_config::rule_db for
> + *   rte_regex_dev_configure() and/or the application needs to update the rule
> + *   database.
> + * - Create or reuse an existing mempool for *rte_regex_ops* objects.
> + * - rte_regex_dev_start()
> + * - rte_regex_enqueue_burst()
> + * - rte_regex_dequeue_burst()
> + *
> + */
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#include <rte_common.h>
> +#include <rte_config.h>
> +#include <rte_dev.h>
> +#include <rte_errno.h>
> +#include <rte_memory.h>
> +
> +/**
> + * Get the total number of RegEx devices that have been successfully
> + * initialised.
> + *
> + * @return
> + *   The total number of usable RegEx devices.
> + */
> +uint8_t
> +rte_regex_dev_count(void);
> +
> +/**
> + * Get the device identifier for the named RegEx device.
> + *
> + * @param name
> + *   RegEx device name to select the RegEx device identifier.
> + *
> + * @return
> + *   Returns RegEx device identifier on success.
> + *   - <0: Failure to find named RegEx device.
> + */
> +int
> +rte_regex_dev_get_dev_id(const char *name);
> +
> +/* Enumerates RegEx device capabilities */
> +#define RTE_REGEX_DEV_CAPA_RUNTIME_COMPILATION_F (1ULL << 0)
> +/**< RegEx device supports compiling the rules at runtime, as opposed to
> + * loading only the pre-built rule database using
> + * struct rte_regex_dev_config::rule_db in rte_regex_dev_configure().
> + * @see struct rte_regex_dev_config::rule_db, rte_regex_dev_configure()
> + * @see struct rte_regex_dev_info::regex_dev_capa
> + */
> +
> +
> +/* Enumerates unsupported PCRE features for the RegEx device */
> +#define RTE_REGEX_DEV_PCRE_UNSUP_START_ANCHOR_F (1ULL << 0)
> +/**< RegEx device doesn't support PCRE Anchor to start of match flag.
> + * Example RegEx is '/\Gfoo\d/'. Here '\G' asserts position at the end of the
> + * previous match or the start of the string for the first match.
> + * This position will change each time the RegEx is applied to the subject
> + * string. If the RegEx is applied to 'foo1foo2Zfoo3' the first two matches will
> + * be successful for 'foo1foo2' and fail for 'Zfoo3'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_ATOMIC_GROUPING_F (1ULL << 1)
> +/**< RegEx device doesn't support PCRE Atomic grouping.
> + * Atomic groups are represented by '(?>)'. An atomic group is a group that,
> + * when the RegEx engine exits from it, automatically throws away all
> + * backtracking positions remembered by any tokens inside the group.
> + * Example RegEx is 'a(?>bc|b)c': if the subject strings are 'abc' and 'abcc',
> + * then 'a(bc|b)c' matches both whereas 'a(?>bc|b)c' matches only 'abcc'
> + * because atomic groups don't allow backtracking to 'b'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_BACKTRACKING_CTRL_F (1ULL << 2)
> +/**< RegEx device doesn't support PCRE backtracking control verbs.
> + * Some examples of backtracking verbs are (*COMMIT), (*ACCEPT), (*FAIL),
> + * (*SKIP), (*PRUNE).
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_CALLOUTS_F (1ULL << 3)
> +/**< RegEx device doesn't support PCRE callouts.
> + * PCRE supports calling an external function in between matches by using
> + * '(?C)'. Example RegEx is 'ABC(?C)D': if the subject string is 'ABCD', the
> + * RegEx engine will match 'ABC', perform a user-defined callout and return a
> + * successful match at 'D'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_BACKREFERENCE_F (1ULL << 4)
> +/**< RegEx device doesn't support PCRE backreference.
> + * Example RegEx is '(\2ABC|(GHI))+' \2 matches the same text as most recently
> + * matched by the 2nd capturing group i.e. 'GHI'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_GREEDY_F (1ULL << 5)
> +/**< RegEx device doesn't support PCRE Greedy mode.
> + * For example, in the RegEx 'AB\d*?' the '*?' is the ungreedy form of '*',
> + * which matches zero or more times. In greedy mode ('AB\d*') the subject
> + * 'AB12345' will be matched completely whereas in ungreedy mode only 'AB'
> + * will be returned as the match.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_LOOKAROUND_ASRT_F (1ULL << 6)
> +/**< RegEx device doesn't support PCRE Lookaround assertions
> + * (Zero-width assertions). Example RegEx is '[a-z]+\d+(?=!{3,})': if
> + * the subject string is 'dwad1234!' the RegEx engine doesn't report any
> + * matches because the assert '(?=!{3,})' fails. The subject 'dwad123!!!'
> + * would return a successful match.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_MATCH_POINT_RST_F (1ULL << 7)
> +/**< RegEx device doesn't support PCRE match point reset directive.
> + * Example RegEx is '[a-z]+\K\d+': if the subject string is 'dwad123'
> + * then even though the entire string matches, only '123'
> + * is reported as a match.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_NEWLINE_CONVENTIONS_F (1ULL << 8)
> +/**< RegEx device doesn't support PCRE newline convention.
> + * Newline conventions are represented as follows:
> + * (*CR)        carriage return
> + * (*LF)        linefeed
> + * (*CRLF)      carriage return, followed by linefeed
> + * (*ANYCRLF)   any of the three above
> + * (*ANY)       all Unicode newline sequences
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_NEWLINE_SEQ_F (1ULL << 9)
> +/**< RegEx device doesn't support PCRE newline sequence.
> + * The escape sequence '\R' will match any newline sequence.
> + * It is equivalent to: '(?>\r\n|\n|\x0b|\f|\r|\x85)'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_POSSESSIVE_QUALIFIERS_F (1ULL << 10)
> +/**< RegEx device doesn't support PCRE possessive quantifiers.
> + * Example RegEx possessive quantifiers are '*+', '++', '?+', '{m,n}+'.
> + * Possessive quantifier repeats the token as many times as possible and it does
> + * not give up matches as the engine backtracks. With a possessive quantifier,
> + * the deal is all or nothing.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_SUBROUTINE_REFERENCES_F (1ULL << 11)
> +/**< RegEx device doesn't support PCRE Subroutine references.
> + * PCRE Subroutine references allow for sub patterns to be assessed
> + * as part of the RegEx. Example RegEx is '(foo|fuzz)\g<1>+bar' matches the
> + * pattern 'foofoofuzzfoofuzzbar'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_UTF_8_F (1ULL << 12)
> +/**< RegEx device doesn't support UTF-8 character encoding.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_UTF_16_F (1ULL << 13)
> +/**< RegEx device doesn't support UTF-16 character encoding.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_UTF_32_F (1ULL << 14)
> +/**< RegEx device doesn't support UTF-32 character encoding.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_WORD_BOUNDARY_F (1ULL << 15)
> +/**< RegEx device doesn't support word boundaries.
> + * The meta character '\b' represents word boundary anchor.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_FORWARD_REFERENCES_F (1ULL << 16)
> +/**< RegEx device doesn't support Forward references.
> + * Forward references allow you to use a back reference to a group that appears
> + * later in the RegEx. Example RegEx is '(\3ABC|(DEF|(GHI)))+' matches the
> + * following string 'GHIGHIABCDEF'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
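
To make the intended use of pcre_unsup_flags concrete, here is a minimal
sketch of how an application might test the mask before loading its rules.
The three flag values are copied from this patch so that the snippet compiles
on its own; app_pcre_features_usable() is a hypothetical helper for
illustration, not something the RFC proposes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Copied from the patch so the sketch builds without rte_regexdev.h. */
#define RTE_REGEX_DEV_PCRE_UNSUP_ATOMIC_GROUPING_F (1ULL << 1)
#define RTE_REGEX_DEV_PCRE_UNSUP_BACKREFERENCE_F (1ULL << 4)
#define RTE_REGEX_DEV_PCRE_UNSUP_LOOKAROUND_ASRT_F (1ULL << 6)

/*
 * Hypothetical helper: returns true when none of the PCRE features the
 * application relies on is reported as unsupported by the device.
 * `needed` uses the same bit positions as pcre_unsup_flags.
 */
static bool
app_pcre_features_usable(uint64_t pcre_unsup_flags, uint64_t needed)
{
	return (pcre_unsup_flags & needed) == 0;
}
```

In a real application the first argument would presumably come from
rte_regex_dev_info::pcre_unsup_flags after rte_regex_dev_info_get().
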
> +
> +/* Enumerates PCRE rule flags */
> +#define RTE_REGEX_PCRE_RULE_ALLOW_EMPTY_F (1ULL << 0)
> +/**< When this flag is set, patterns that can match against an empty string,
> + * such as '.*', are allowed.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_ANCHORED_F (1ULL << 1)
> +/**< When this flag is set, the pattern is forced to be "anchored", that is, it
> + * is constrained to match only at the first matching point in the string that
> + * is being searched. Similar to '^' and represented by \A.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_CASELESS_F (1ULL << 2)
> +/**< When this flag is set, letters in the pattern match both upper and lower
> + * case letters in the subject.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_DOTALL_F (1ULL << 3)
> +/**< When this flag is set, a dot metacharacter in the pattern matches any
> + * character, including one that indicates a newline.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_DUPNAMES_F (1ULL << 4)
> +/**< When this flag is set, names used to identify capture groups need not be
> + * unique.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_EXTENDED_F (1ULL << 5)
> +/**< When this flag is set, most white space characters in the pattern are
> + * totally ignored except when escaped or inside a character class.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_MATCH_UNSET_BACKREF_F (1ULL << 6)
> +/**< When this flag is set, a backreference to an unset capture group matches an
> + * empty string.
> + * @see RTE_REGEX_DEV_PCRE_UNSUP_FORWARD_REFERENCES_F
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_MULTILINE_F (1ULL << 7)
> +/**< When this flag is set, the '^' and '$' constructs match immediately
> + * following or immediately before internal newlines in the subject string,
> + * respectively, as well as at the very start and end.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_NO_AUTO_CAPTURE_F (1ULL << 8)
> +/**< When this flag is set, it disables the use of numbered capturing
> + * parentheses in the pattern. References to capture groups (backreferences or
> + * recursion/subroutine calls) may only refer to named groups, though the
> + * reference can be by name or by number.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_UCP_F (1ULL << 9)
> +/**< By default, only ASCII characters are recognized. When this flag is set,
> + * Unicode properties are used instead to classify characters.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_UNGREEDY_F (1ULL << 10)
> +/**< When this flag is set, the "greediness" of the quantifiers is inverted
> + * so that they are not greedy by default, but become greedy if followed by
> + * '?'.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_UTF_F (1ULL << 11)
> +/**< When this flag is set, RegEx engine has to regard both the pattern and the
> + * subject strings that are subsequently processed as strings of UTF characters
> + * instead of single-code-unit strings.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_NEVER_BACKSLASH_C_F (1ULL << 12)
> +/**< This flag locks out the use of '\C' in the pattern that is being compiled.
> + * This escape matches one data unit, even in UTF mode, which can cause
> + * unpredictable behavior in UTF-8 or UTF-16 modes, because it may leave the
> + * current matching point in the middle of a multi-code-unit character.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +
> +/**
> + * RegEx device information
> + */
> +struct rte_regex_dev_info {
> +	const char *driver_name; /**< RegEx driver name */
> +	struct rte_device *dev;	/**< Device information */
> +	uint8_t max_matches;
> +	/**< Maximum matches per scan supported by this device */
> +	uint16_t max_queue_pairs;
> +	/**< Maximum queue pairs supported by this device */
> +	uint16_t max_payload_size;
> +	/**< Maximum payload size for a pattern match request or scan.
> +	 * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
> +	 */
> +	uint16_t max_rules_per_group;
> +	/**< Maximum rules supported per group by this device */
> +	uint16_t max_groups;
> +	/**< Maximum number of groups supported by this device */
> +	uint32_t regex_dev_capa;
> +	/**< RegEx device capabilities. @see RTE_REGEX_DEV_CAPA_* */
> +	uint64_t rule_flags;
> +	/**< Supported compiler rule flags.
> +	 * @see RTE_REGEX_PCRE_RULE_*, struct rte_regex_rule::rule_flags
> +	 */
> +	uint64_t pcre_unsup_flags;
> +	/**< Unsupported PCRE features for this RegEx device.
> +	 * @see RTE_REGEX_DEV_PCRE_UNSUP_*
> +	 */
> +};
> +
> +/**
> + * Retrieve the contextual information of a RegEx device.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + *
> + * @param[out] dev_info
> + *   A pointer to a structure of type *rte_regex_dev_info* to be filled with the
> + *   contextual information of the device.
> + *
> + * @return
> + *   - 0: Success, driver updates the contextual information of the RegEx device
> + *   - <0: Error code returned by the driver info get function.
> + *
> + */
> +int
> +rte_regex_dev_info_get(uint8_t dev_id, struct rte_regex_dev_info *dev_info);
> +
> +/* Enumerates RegEx device configuration flags */
> +#define RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F (1ULL << 0)
> +/**< Cross buffer scan refers to the ability to detect
> + * matches that occur across buffer boundaries, where the buffers are related
> + * to each other in some way. Enable this flag to scan a payload larger than
> + * struct rte_regex_dev_info::max_payload_size and/or when
> + * matches can be present across scan buffer boundaries.
> + *
> + * @see struct rte_regex_dev_info::max_payload_size
> + * @see struct rte_regex_dev_config::dev_cfg_flags, rte_regex_dev_configure()
> + * @see RTE_REGEX_OPS_RSP_PMI_SOJ_F
> + * @see RTE_REGEX_OPS_RSP_PMI_EOJ_F
> + */
> +
> +/** RegEx device configuration structure */
> +struct rte_regex_dev_config {
> +	uint8_t nb_max_matches;
> +	/**< Maximum matches per scan configured on this device.
> +	 * This value cannot exceed the *max_matches*
> +	 * previously provided by rte_regex_dev_info_get().
> +	 * The value 0 is allowed, in which case the value 1 is used.
> +	 * @see struct rte_regex_dev_info::max_matches
> +	 */
> +	uint16_t nb_queue_pairs;
> +	/**< Number of RegEx queue pairs to configure on this device.
> +	 * This value cannot exceed the *max_queue_pairs* previously
> +	 * provided by rte_regex_dev_info_get().
> +	 * @see struct rte_regex_dev_info::max_queue_pairs
> +	 */
> +	uint16_t nb_rules_per_group;
> +	/**< Number of rules per group to configure on this device.
> +	 * This value cannot exceed the *max_rules_per_group*
> +	 * previously provided by rte_regex_dev_info_get().
> +	 * The value 0 is allowed, in which case
> +	 * struct rte_regex_dev_info::max_rules_per_group is used.
> +	 * @see struct rte_regex_dev_info::max_rules_per_group
> +	 */
> +	uint16_t nb_groups;
> +	/**< Number of groups to configure on this device.
> +	 * This value cannot exceed the *max_groups*
> +	 * previously provided by rte_regex_dev_info_get().
> +	 * @see struct rte_regex_dev_info::max_groups
> +	 */
> +	const char *rule_db;
> +	/**< Import an initial prebuilt rule database on this device.
> +	 * The value NULL is allowed, in which case the device will not
> +	 * be configured with a prebuilt rule database. The application may use
> +	 * the rte_regex_rule_db_update() or rte_regex_rule_db_import() API
> +	 * to update or import the rule database after
> +	 * rte_regex_dev_configure().
> +	 * @see rte_regex_rule_db_update(), rte_regex_rule_db_import()
> +	 */
> +	uint32_t rule_db_len;
> +	/**< Length of *rule_db* buffer. */
> +	uint32_t dev_cfg_flags;
> +	/**< RegEx device configuration flags, See RTE_REGEX_DEV_CFG_*  */
> +};
> +
> +/**
> + * Configure a RegEx device.
> + *
> + * This function must be invoked first before any other function in the
> + * API. This function can also be re-invoked when a device is in the
> + * stopped state.
> + *
> + * The caller may use rte_regex_dev_info_get() to get the capability of each
> + * resource available for this RegEx device.
> + *
> + * @param dev_id
> + *   The identifier of the device to configure.
> + * @param cfg
> + *   The RegEx device configuration structure.
> + *
> + * @return
> + *   - 0: Success, device configured.
> + *   - <0: Error code returned by the driver configuration function.
> + */
> +int
> +rte_regex_dev_configure(uint8_t dev_id, const struct rte_regex_dev_config *cfg);
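
The relationship between the limits reported by rte_regex_dev_info_get() and
the values accepted by rte_regex_dev_configure() could be checked up front
along these lines. This is only a sketch: both structures are hypothetical,
trimmed stand-ins for rte_regex_dev_info and rte_regex_dev_config so that the
snippet is self-contained, and the -EINVAL return is my assumption about what
a driver would report.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed stand-in for struct rte_regex_dev_info. */
struct app_regex_limits {
	uint8_t max_matches;
	uint16_t max_queue_pairs;
	uint16_t max_rules_per_group;
	uint16_t max_groups;
};

/* Hypothetical, trimmed stand-in for struct rte_regex_dev_config. */
struct app_regex_request {
	uint8_t nb_max_matches;
	uint16_t nb_queue_pairs;
	uint16_t nb_rules_per_group;
	uint16_t nb_groups;
};

/*
 * Returns 0 when every requested value is within the reported limit.
 * A value of 0 always passes, matching the "0 selects a default" wording
 * used for nb_max_matches and nb_rules_per_group.
 */
static int
app_regex_check_config(const struct app_regex_limits *info,
		       const struct app_regex_request *cfg)
{
	if (info == NULL || cfg == NULL)
		return -EINVAL;
	if (cfg->nb_max_matches > info->max_matches ||
	    cfg->nb_queue_pairs > info->max_queue_pairs ||
	    cfg->nb_rules_per_group > info->max_rules_per_group ||
	    cfg->nb_groups > info->max_groups)
		return -EINVAL;
	return 0;
}
```
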
> +
> +/* Enumerates RegEx queue pair configuration flags */
> +#define RTE_REGEX_QUEUE_PAIR_CFG_OOS_F (1ULL << 0)
> +/**< Out of order scan. If not set, a scan must retire after previously issued
> + * in-order scans to this queue pair. If set, this scan can be retired as soon
> + * as the device returns completion. The application should not set the out of
> + * order scan flag if it needs to maintain the ingress order of scan requests.
> + *
> + * @see struct rte_regex_qp_conf::qp_conf_flags, rte_regex_queue_pair_setup()
> + */
> +
> +struct rte_regex_ops;
> +typedef void (*regexdev_stop_flush_t)(uint8_t dev_id, uint16_t qp_id,
> +				      struct rte_regex_ops *op);
> +/**< Callback function called during rte_regex_dev_stop(), invoked once per
> + * flushed RegEx op.
> + */
> +
> +/** RegEx queue pair configuration structure */
> +struct rte_regex_qp_conf {
> +	uint32_t qp_conf_flags;
> +	/**< Queue pair config flags, See RTE_REGEX_QUEUE_PAIR_CFG_* */
> +	uint16_t nb_desc;
> +	/**< The number of descriptors to allocate for this queue pair. */
> +	regexdev_stop_flush_t cb;
> +	/**< Callback function called during rte_regex_dev_stop(), invoked
> +	 * once per flushed regex op. Value NULL is allowed, in which case
> +	 * callback will not be invoked. This function can be used to properly
> +	 * dispose of outstanding regex ops from response queue,
> +	 * for example ops containing memory pointers.
> +	 * @see rte_regex_dev_stop()
> +	 */
> +};
> +
> +/**
> + * Allocate and set up a RegEx queue pair for a RegEx device.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param queue_pair_id
> + *   The index of the RegEx queue pair to setup. The value must be in the range
> + *   [0, nb_queue_pairs - 1] previously supplied to rte_regex_dev_configure().
> + * @param qp_conf
> + *   The pointer to the configuration data to be used for the RegEx queue pair.
> + *   NULL value is allowed, in which case the default configuration is used.
> + *
> + * @return
> + *   - 0: Success, RegEx queue pair correctly set up.
> + *   - <0: RegEx queue configuration failed
> + */
> +int
> +rte_regex_queue_pair_setup(uint8_t dev_id, uint16_t queue_pair_id,
> +			   const struct rte_regex_qp_conf *qp_conf);
> +
> +/**
> + * Start a RegEx device.
> + *
> + * The device start step is the last one and consists of setting the RegEx
> + * queues to start accepting the pattern matching scan requests.
> + *
> + * On success, all basic functions exported by the API (RegEx enqueue,
> + * RegEx dequeue and so on) can be invoked.
> + *
> + * @param dev_id
> + *   RegEx device identifier
> + * @return
> + *   - 0: Success, device started.
> + *   - <0: Device start failed.
> + */
> +int
> +rte_regex_dev_start(uint8_t dev_id);
> +
> +/**
> + * Stop a RegEx device.
> + *
> + * The device can be restarted with a call to
> + * rte_regex_dev_start().
> + *
> + * This function causes all queued response regex ops to be drained from the
> + * response queue. While draining ops out of the device,
> + * struct rte_regex_qp_conf::cb will be invoked for each op.
> + *
> + * @param dev_id
> + *   RegEx device identifier.
> + *
> + * @see struct rte_regex_qp_conf::cb, rte_regex_queue_pair_setup()
> + */
> +void
> +rte_regex_dev_stop(uint8_t dev_id);
> +
> +/**
> + * Close a RegEx device. The device cannot be restarted!
> + *
> + * @param dev_id
> + *   RegEx device identifier
> + *
> + * @return
> + *  - 0 if the device was closed successfully.
> + *  - <0 on failure to close the device.
> + */
> +int
> +rte_regex_dev_close(uint8_t dev_id);
> +
> +/* Device get/set attributes */
> +
> +/** Enumerates RegEx device attribute identifier */
> +enum rte_regex_dev_attr_id {
> +	RTE_REGEX_DEV_ATTR_SOCKET_ID,
> +	/**< The NUMA socket id to which the device is connected or
> +	 * a default of zero if the socket could not be determined.
> +	 * datatype: *int*
> +	 * operation: *get*
> +	 */
> +	RTE_REGEX_DEV_ATTR_MAX_MATCHES,
> +	/**< Maximum number of matches per scan.
> +	 * datatype: *uint8_t*
> +	 * operation: *get* and *set*
> +	 *
> +	 * @see RTE_REGEX_OPS_RSP_MAX_MATCH_F
> +	 */
> +	RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT,
> +	/**< Upper bound scan time in ns.
> +	 * datatype: *uint16_t*
> +	 * operation: *get* and *set*
> +	 *
> +	 * @see RTE_REGEX_OPS_RSP_MAX_SCAN_TIMEOUT_F
> +	 */
> +	RTE_REGEX_DEV_ATTR_MAX_PREFIX,
> +	/**< Maximum number of prefixes detected per scan.
> +	 * This would be useful for denial of service detection.
> +	 * datatype: *uint16_t*
> +	 * operation: *get* and *set*
> +	 *
> +	 * @see RTE_REGEX_OPS_RSP_MAX_PREFIX_F
> +	 */
> +};
> +
> +/**
> + * Get an attribute from a RegEx device.
> + *
> + * @param dev_id RegEx device identifier
> + * @param attr_id The attribute ID to retrieve
> + * @param[out] attr_value A pointer that will be filled in with the attribute
> + *             value if successful.
> + *
> + * @return
> + *   - 0: Successfully retrieved attribute value.
> + *   - -EINVAL: Invalid device or *attr_id* provided, or *attr_value* is NULL.
> + *   - -ENOTSUP: if the device doesn't support specific *attr_id*.
> + */
> +int
> +rte_regex_dev_attr_get(uint8_t dev_id, enum rte_regex_dev_attr_id attr_id,
> +		       void *attr_value);
> +
> +/**
> + * Set an attribute to a RegEx device.
> + *
> + * @param dev_id RegEx device identifier
> + * @param attr_id The attribute ID to set
> + * @param attr_value A pointer to the attribute value supplied
> + *                   by the application
> + *
> + * @return
> + *   - 0: Successfully applied the attribute value.
> + *   - -EINVAL: Invalid device or *attr_id* provided, or *attr_value* is NULL.
> + *   - -ENOTSUP: if the device doesn't support specific *attr_id*.
> + */
> +int
> +rte_regex_dev_attr_set(uint8_t dev_id, enum rte_regex_dev_attr_id attr_id,
> +		       const void *attr_value);
> +
> +/* Rule related APIs */
> +/** Enumerates RegEx rule operation */
> +enum rte_regex_rule_op {
> +	RTE_REGEX_RULE_OP_ADD,
> +	/**< Add RegEx rule to rule database */
> +	RTE_REGEX_RULE_OP_REMOVE
> +	/**< Remove RegEx rule from rule database */
> +};
> +
> +/** Structure to hold a RegEx rule attributes */
> +struct rte_regex_rule {
> +	enum rte_regex_rule_op op;
> +	/**< OP type of the rule, either RTE_REGEX_RULE_OP_ADD or
> +	 * RTE_REGEX_RULE_OP_REMOVE
> +	 */
> +	uint16_t group_id;
> +	/**< Group identifier to which the rule belongs. */
> +	uint32_t rule_id;
> +	/**< Rule identifier which is returned on successful match. */
> +	const char *pcre_rule;
> +	/**< Buffer to hold the PCRE rule. */
> +	uint16_t pcre_rule_len;
> +	/**< Length of the PCRE rule */
> +	uint64_t rule_flags;
> +	/**< PCRE rule flags. Supported device specific PCRE rule flags are
> +	 * enumerated in struct rte_regex_dev_info::rule_flags. For a successful
> +	 * rule database update, the application needs to provide only supported
> +	 * rule flags.
> +	 * @see RTE_REGEX_PCRE_RULE_*, struct rte_regex_dev_info::rule_flags
> +	 */
> +};
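
Since rule_flags must be a subset of what the device advertises, an
application could filter rules before submitting them. A minimal sketch,
with the flag values copied from this patch and app_rule_flags_supported()
being a hypothetical helper name. Note that the sense is the opposite of
pcre_unsup_flags: here the device mask lists what IS supported.

```c
#include <stdbool.h>
#include <stdint.h>

/* Copied from the patch so the sketch builds without rte_regexdev.h. */
#define RTE_REGEX_PCRE_RULE_CASELESS_F (1ULL << 2)
#define RTE_REGEX_PCRE_RULE_DOTALL_F (1ULL << 3)
#define RTE_REGEX_PCRE_RULE_UTF_F (1ULL << 11)

/*
 * Hypothetical helper: true when every flag requested for a rule is
 * present in the device's supported mask.
 */
static bool
app_rule_flags_supported(uint64_t supported, uint64_t rule_flags)
{
	return (rule_flags & ~supported) == 0;
}
```
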
> +
> +/**
> + * Update the rule database of a RegEx device.
> + *
> + * @param dev_id RegEx device identifier
> + * @param rules
> + *   Points to an array of *nb_rules* objects of type *rte_regex_rule* structure
> + *   which contain the regex rules attributes to be updated in rule database.
> + * @param nb_rules
> + *   The number of PCRE rules to update the rule database.
> + *
> + * @return
> + *   The number of regex rules actually updated on the regex device's rule
> + *   database. The return value can be less than the value of the *nb_rules*
> + *   parameter when the regex device fails to update the rule database or
> + *   if invalid parameters are specified in a *rte_regex_rule*.
> + *   If the return value is less than *nb_rules*, the remaining PCRE rules
> + *   at the end of *rules* are not consumed, the caller has to take
> + *   care of them, and rte_errno is set accordingly.
> + *   Possible errno values include:
> + *   - -EINVAL:  Invalid device ID or rules is NULL
> + *   - -ENOTSUP: The last processed rule is not supported on this device.
> + *   - -ENOSPC: No space available in rule database.
> + *
> + * @see rte_regex_rule_db_import(), rte_regex_rule_db_export()
> + */
> +uint16_t
> +rte_regex_rule_db_update(uint8_t dev_id, const struct rte_regex_rule *rules,
> +			 uint16_t nb_rules);
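
The partial-update return convention may be easier to review with an example
of the caller side. The sketch below assumes the semantics described in the
comment above (the return value is the count of rules consumed from the front
of the array). struct app_rule, app_db_update_t and app_fake_update() are
hypothetical stand-ins so the snippet runs without a driver; in real code the
callback would wrap rte_regex_rule_db_update() and the caller would also
inspect rte_errno.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct rte_regex_rule; only the ID matters. */
struct app_rule {
	uint32_t rule_id;
};

/* Same shape as rte_regex_rule_db_update(), so a wrapper can be passed in. */
typedef uint16_t (*app_db_update_t)(uint8_t dev_id,
				    const struct app_rule *rules,
				    uint16_t nb_rules);

/*
 * Returns the index of the first rule that was not applied, or -1 when
 * the whole array was consumed.
 */
static int
app_first_unapplied_rule(app_db_update_t update, uint8_t dev_id,
			 const struct app_rule *rules, uint16_t nb_rules)
{
	uint16_t done = update(dev_id, rules, nb_rules);

	return done < nb_rules ? (int)done : -1;
}

/* Fake driver for illustration: its rule database holds at most two rules. */
static uint16_t
app_fake_update(uint8_t dev_id, const struct app_rule *rules,
		uint16_t nb_rules)
{
	(void)dev_id;
	(void)rules;
	return nb_rules < 2 ? nb_rules : 2;
}
```

The returned index tells the caller which rule to fix, drop or retry before
resubmitting the tail of the array.
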
> +
> +/**
> + * Import a prebuilt rule database from a buffer to a RegEx device.
> + *
> + * @param dev_id RegEx device identifier
> + * @param rule_db
> + *   Points to prebuilt rule database.
> + * @param rule_db_len
> + *   Length of the rule database.
> + *
> + * @return
> + *   - 0: Successfully updated the prebuilt rule database.
> + *   - -EINVAL:  Invalid device ID or rule_db is NULL
> + *   - -ENOTSUP: Rule database import is not supported on this device.
> + *   - -ENOSPC: No space available in rule database.
> + *
> + * @see rte_regex_rule_db_update(), rte_regex_rule_db_export()
> + */
> +int
> +rte_regex_rule_db_import(uint8_t dev_id, const char *rule_db,
> +			 uint32_t rule_db_len);
> +
> +/**
> + * Export the prebuilt rule database from a RegEx device to the buffer.
> + *
> + * @param dev_id RegEx device identifier
> + * @param[out] rule_db
> + *   Block of memory to insert the rule database. Must be at least the required
> + *   size in capacity. If set to NULL, function returns required capacity.
> + *
> + * @return
> + *   - 0: Successfully exported the prebuilt rule database.
> + *   - size: If rule_db is set to NULL, the required capacity for *rule_db*
> + *   - -EINVAL:  Invalid device ID
> + *   - -ENOTSUP: Rule database export is not supported on this device.
> + *
> + * @see rte_regex_rule_db_update(), rte_regex_rule_db_import()
> + */
> +int
> +rte_regex_rule_db_export(uint8_t dev_id, char *rule_db);
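
As I read it, the export is meant to be used as a size query followed by the
real call. A sketch of that two-step pattern is below; app_db_export_t,
app_export_rule_db() and app_fake_export() are hypothetical names, and the
fake driver exists only so the snippet runs standalone. Note that the
prototype carries no buffer length, so the sketch has to assume the database
does not grow between the two calls.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Same shape as rte_regex_rule_db_export(), so a wrapper can be passed in. */
typedef int (*app_db_export_t)(uint8_t dev_id, char *rule_db);

/*
 * Size query followed by the real export. On success stores a malloc()ed
 * buffer in *out and returns its length; on failure returns a negative
 * errno and leaves *out set to NULL.
 */
static int
app_export_rule_db(app_db_export_t export_db, uint8_t dev_id, char **out)
{
	int len, rc;
	char *buf;

	*out = NULL;
	len = export_db(dev_id, NULL);	/* first call: required capacity */
	if (len <= 0)
		return len;
	buf = malloc((size_t)len);
	if (buf == NULL)
		return -ENOMEM;
	rc = export_db(dev_id, buf);	/* second call: fill the buffer */
	if (rc < 0) {
		free(buf);
		return rc;
	}
	*out = buf;
	return len;
}

/* Fake driver for illustration: device 0 holds a 4-byte database. */
static int
app_fake_export(uint8_t dev_id, char *rule_db)
{
	static const char db[] = { 'R', 'X', 'D', 'B' };

	if (dev_id != 0)
		return -EINVAL;
	if (rule_db == NULL)
		return (int)sizeof(db);
	memcpy(rule_db, db, sizeof(db));
	return 0;
}
```

The caller owns the returned buffer and is expected to free() it.
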
> +
> +/* Extended statistics */
> +/** Maximum name length for extended statistics counters */
> +#define RTE_REGEX_DEV_XSTATS_NAME_SIZE 64
> +
> +/**
> + * A name-key lookup element for extended statistics.
> + *
> + * This structure is used to map between names and ID numbers
> + * for extended RegEx device statistics.
> + */
> +struct rte_regex_dev_xstats_map {
> +	uint16_t id;
> +	/**< xstat identifier */
> +	char name[RTE_REGEX_DEV_XSTATS_NAME_SIZE];
> +	/**< xstat name */
> +};
> +
> +/**
> + * Retrieve names of extended statistics of a regex device.
> + *
> + * @param dev_id
> + *   The identifier of the regex device.
> + * @param[out] xstats_map
> + *   Block of memory to insert id and names into. Must be at least size in
> + *   capacity. If set to NULL, function returns required capacity.
> + * @return
> + *   - positive value on success:
> + *        -The return value is the number of entries filled in the stats map.
> + *        -If xstats_map set to NULL then required capacity for xstats_map.
> + *   - negative value on error:
> + *      -ENODEV for invalid *dev_id*
> + *      -ENOTSUP if the device doesn't support this function.
> + */
> +int
> +rte_regex_dev_xstats_names_get(uint8_t dev_id,
> +			       struct rte_regex_dev_xstats_map *xstats_map);
> +
> +/**
> + * Retrieve extended statistics of a regex device.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param ids
> + *   The id numbers of the stats to get. The ids can be got from the stat
> + *   position in the stat list from rte_regex_dev_xstats_names_get(), or
> + *   by using rte_regex_dev_xstats_by_name_get().
> + * @param[out] values
> + *   The value of each stat requested by ID.
> + * @param n
> + *   The number of stats requested
> + * @return
> + *   - positive value: number of stat entries filled into the values array
> + *   - negative value on error:
> + *      -ENODEV for invalid *dev_id*
> + *      -ENOTSUP if the device doesn't support this function.
> + */
> +int
> +rte_regex_dev_xstats_get(uint8_t dev_id, const uint16_t ids[],
> +			 uint64_t values[], uint16_t n);
> +
> +/**
> + * Retrieve the value of a single stat by requesting it by name.
> + *
> + * @param dev_id
> + *   The identifier of the device
> + * @param name
> + *   The stat name to retrieve
> + * @param[out] id
> + *   If non-NULL, the numerical id of the stat will be returned, so that further
> + *   requests for the stat can be got using rte_regex_dev_xstats_get, which will
> + *   be faster as it doesn't need to scan a list of names for the stat.
> + * @param[out] value
> + *   Must be non-NULL, retrieved xstat value will be stored in this address.
> + *
> + * @return
> + *   - 0: Successfully retrieved xstat value.
> + *   - -EINVAL: invalid parameters
> + *   - -ENOTSUP: if not supported.
> + */
> +int
> +rte_regex_dev_xstats_by_name_get(uint8_t dev_id, const char *name,
> +				 uint16_t *id, uint64_t *value);
> +
> +/**
> + * Reset the values of the xstats of the selected component in the device.
> + *
> + * @param dev_id
> + *   The identifier of the device
> + * @param ids
> + *   Selects specific statistics to be reset. When NULL, all statistics will be
> + *   reset. If non-NULL, must point to array of at least *nb_ids* size.
> + * @param nb_ids
> + *   The number of ids available from the *ids* array. Ignored when ids is NULL.
> + * @return
> + *   - 0: Successfully reset the statistics to zero.
> + *   - -EINVAL: invalid parameters
> + *   - -ENOTSUP: if not supported.
> + */
> +int
> +rte_regex_dev_xstats_reset(uint8_t dev_id, const uint16_t ids[],
> +			   uint16_t nb_ids);
> +
> +/**
> + * Trigger the RegEx device self test.
> + *
> + * @param dev_id
> + *   The identifier of the device
> + * @return
> + *   - 0: Selftest successful
> + *   - -ENOTSUP if the device doesn't support selftest
> + *   - other values < 0 on failure.
> + */
> +int rte_regex_dev_selftest(uint8_t dev_id);
> +
> +/**
> + * Dump internal information about *dev_id* to the FILE* provided in *f*.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + *
> + * @param f
> + *   A pointer to a file for output
> + *
> + * @return
> + *   - 0: on success
> + *   - <0: on failure.
> + */
> +int
> +rte_regex_dev_dump(uint8_t dev_id, FILE *f);
> +
> +/* Fast path APIs */
> +
> +/**
> + * The generic *rte_regex_match* structure to hold the RegEx match attributes.
> + * @see struct rte_regex_ops::matches
> + */
> +struct rte_regex_match {
> +	RTE_STD_C11
> +	union {
> +		uint64_t u64;
> +		struct {
> +			uint32_t rule_id:20;
> +			/**< Rule identifier to which the pattern matched.
> +			 * @see struct rte_regex_rule::rule_id
> +			 */
> +			uint32_t group_id:12;
> +			/**< Group identifier of the rule which the pattern
> +			 * matched. @see struct rte_regex_rule::group_id
> +			 */
> +			uint16_t offset;
> +			/**< Starting Byte Position for matched rule. */
> +			uint16_t len;
> +			/**< Length of match in bytes */
> +		};
> +	};
> +};
> +
> +/* Enumerates RegEx request flags. */
> +#define RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F (1 << 0)
> +/**< Set when struct rte_regex_ops::group_id1 is valid */
> +
> +#define RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F (1 << 1)
> +/**< Set when struct rte_regex_ops::group_id2 is valid */
> +
> +#define RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F (1 << 2)
> +/**< Set when struct rte_regex_ops::group_id3 is valid */
> +
> +#define RTE_REGEX_OPS_REQ_STOP_ON_MATCH_F (1 << 4)
> +/**< The RegEx engine will stop scanning and return the first match. */
> +
> +#define RTE_REGEX_OPS_REQ_MATCH_HIGH_PRIORITY_F (1 << 5)
> +/**< In High Priority mode a maximum of one match will be returned per scan to
> + * reduce the post-processing required by the application. The match with the
> + * lowest Rule id, lowest start pointer and lowest match length will be
> + * returned.
> + *
> + * @see struct rte_regex_ops::nb_actual_matches
> + * @see struct rte_regex_ops::nb_matches
> + */
> +
> +
> +/* Enumerates RegEx response flags. */
> +#define RTE_REGEX_OPS_RSP_PMI_SOJ_F (1 << 0)
> +/**< Indicates that the RegEx device has encountered a partial match at the
> + * start of scan in the given buffer.
> + *
> + * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
> + */
> +
> +#define RTE_REGEX_OPS_RSP_PMI_EOJ_F (1 << 1)
> +/**< Indicates that the RegEx device has encountered a partial match at the
> + * end of scan in the given buffer.
> + *
> + * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
> + */
> +
> +#define RTE_REGEX_OPS_RSP_MAX_SCAN_TIMEOUT_F (1 << 2)
> +/**< Indicates that the RegEx device has exceeded the max timeout while
> + * scanning the given buffer.
> + *
> + * @see RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT
> + */
> +
> +#define RTE_REGEX_OPS_RSP_MAX_MATCH_F (1 << 3)
> +/**< Indicates that the RegEx device has exceeded the max matches while
> + * scanning the given buffer.
> + *
> + * @see RTE_REGEX_DEV_ATTR_MAX_MATCHES
> + */
> +
> +#define RTE_REGEX_OPS_RSP_MAX_PREFIX_F (1 << 4)
> +/**< Indicates that the RegEx device has reached the max allowed prefix length
> + * while scanning the given buffer.
> + *
> + * @see RTE_REGEX_DEV_ATTR_MAX_PREFIX
> + */
> +
> +/**
> + * The generic *rte_regex_ops* structure to hold the RegEx attributes
> + * for enqueue and dequeue operation.
> + */
> +struct rte_regex_ops {
> +	/* W0 */
> +	uint16_t req_flags;
> +	/**< Request flags for the RegEx ops.
> +	 * @see RTE_REGEX_OPS_REQ_*
> +	 */
> +	uint16_t scan_size;
> +	/**< Scan size of the buffer to be scanned in bytes. */
> +	uint16_t rsp_flags;
> +	/**< Response flags for the RegEx ops.
> +	 * @see RTE_REGEX_OPS_RSP_*
> +	 */
> +	uint8_t nb_actual_matches;
> +	/**< The total number of actual matches detected by the Regex device.*/
> +	uint8_t nb_matches;
> +	/**< The total number of matches returned by the RegEx device for this
> +	 * scan. The size of *rte_regex_ops::matches* zero length array will be
> +	 * this value.
> +	 *
> +	 * @see struct rte_regex_ops::matches, struct rte_regex_match
> +	 */
> +
> +	/* W1 */
> +	RTE_STD_C11
> +	union {
> +		uint64_t u64;
> +		/**<  Allow 8-byte reserved on 32-bit system */
> +		void *buf_addr;
> +		/**< Virtual address of the pattern to be matched. */
> +	};
> +
> +	/* W2 */
> +	rte_iova_t buf_iova;
> +	/**< IOVA address of the pattern to be matched. */
> +
> +	/* W3 */
> +	uint16_t group_id0;
> +	/**< First group_id to match the rule against. Minimum one group id
> +	 * must be provided by application.
> +	 * When RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F set then group_id1
> +	 * is valid, respectively similar flags for group_id2 and group_id3.
> +	 * Upon the match, struct rte_regex_match::group_id shall be updated
> +	 * with matching group ID by the device. Group ID scheme provides
> +	 * rule isolation and effective pattern matching.
> +	 */
> +	uint16_t group_id1;
> +	/**< Second group_id to match the rule against.
> +	 *
> +	 * @see RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F
> +	 */
> +	uint16_t group_id2;
> +	/**< Third group_id to match the rule against.
> +	 *
> +	 * @see RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F
> +	 */
> +	uint16_t group_id3;
> +	/**< Fourth group_id to match the rule against.
> +	 *
> +	 * @see RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F
> +	 */
> +
> +	/* W4 */
> +	RTE_STD_C11
> +	union {
> +		uint64_t user_id;
> +		/**< Application specific opaque value. An application may use
> +		 * this field to hold application specific value to share
> +		 * between dequeue and enqueue operation.
> +		 * Implementation should not modify this field.
> +		 */
> +		void *user_ptr;
> +		/**< Pointer representation of *user_id* */
> +	};
> +
> +	/* W5 */
> +	struct rte_regex_match matches[];
> +	/**< Zero length array to hold the match tuples.
> +	 * The struct rte_regex_ops::nb_matches value holds the number of
> +	 * elements in this array.
> +	 *
> +	 * @see struct rte_regex_ops::nb_matches
> +	 */
> +};
> +
> +/**
> + * Enqueue a burst of scan requests on a RegEx device.
> + *
> + * The rte_regex_enqueue_burst() function is invoked to place
> + * regex operations on the queue *qp_id* of the device designated by
> + * its *dev_id*.
> + *
> + * The *nb_ops* parameter is the number of operations to process which are
> + * supplied in the *ops* array of *rte_regex_ops* structures.
> + *
> + * The rte_regex_enqueue_burst() function returns the number of
> + * operations it actually enqueued for processing. A return value equal to
> + * *nb_ops* means that all operations have been enqueued.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param qp_id
> + *   The index of the queue pair on which operations are to be enqueued for
> + *   processing. The value must be in the range [0, nb_queue_pairs - 1]
> + *   previously supplied to rte_regex_dev_configure().
> + * @param ops
> + *   The address of an array of *nb_ops* pointers to *rte_regex_ops*
> + *   structures which contain the regex operations to be processed.
> + * @param nb_ops
> + *   The number of operations to process.
> + *
> + * @return
> + *   The number of operations actually enqueued on the regex device. The return
> + *   value can be less than the value of the *nb_ops* parameter when the
> + *   regex device's queue is full or if invalid parameters are specified in
> + *   a *rte_regex_ops*. If the return value is less than *nb_ops*, the remaining
> + *   ops at the end of *ops* are not consumed and the caller has to take care
> + *   of them.
> + */
> +uint16_t
> +rte_regex_enqueue_burst(uint8_t dev_id, uint16_t qp_id,
> +			struct rte_regex_ops **ops, uint16_t nb_ops);
> +
> +/**
> + *
> + * Dequeue a burst of scan responses from a queue on the RegEx device.
> + * The dequeued operations are stored in *rte_regex_ops* structures
> + * whose pointers are supplied in the *ops* array.
> + *
> + * The rte_regex_dequeue_burst() function returns the number of ops
> + * actually dequeued, which is the number of *rte_regex_ops* data structures
> + * effectively supplied into the *ops* array.
> + *
> + * A return value equal to *nb_ops* indicates that the queue contained
> + * at least *nb_ops* operations, and this is likely to signify that other
> + * processed operations remain in the device's output queue. Applications
> + * implementing a "retrieve as many processed operations as possible" policy
> + * can check this specific case and keep invoking the
> + * rte_regex_dequeue_burst() function until a value less than
> + * *nb_ops* is returned.
> + *
> + * The rte_regex_dequeue_burst() function does not provide any error
> + * notification to avoid the corresponding overhead.
> + *
> + * @param dev_id
> + *   The RegEx device identifier
> + * @param qp_id
> + *   The index of the queue pair from which to retrieve processed operations.
> + *   The value must be in the range [0, nb_queue_pairs - 1] previously
> + *   supplied to rte_regex_dev_configure().
> + * @param ops
> + *   The address of an array of pointers to *rte_regex_ops* structures that
> + *   must be large enough to store *nb_ops* pointers in it.
> + * @param nb_ops
> + *   The maximum number of operations to dequeue.
> + *
> + * @return
> + *   The number of operations actually dequeued, which is the number
> + *   of pointers to *rte_regex_ops* structures effectively supplied to the
> + *   *ops* array. If the return value is less than *nb_ops*, the remaining
> + *   entries at the end of *ops* are left unfilled and the caller must not
> + *   read them.
> + */
> +uint16_t
> +rte_regex_dequeue_burst(uint8_t dev_id, uint16_t qp_id,
> +			struct rte_regex_ops **ops, uint16_t nb_ops);
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_REGEXDEV_H_ */
Thomas Monjalon Aug. 15, 2019, 11:34 a.m. UTC | #3
+Cc more

------------

From: Jerin Jacob <jerinj@marvell.com>
 
Even though some vendors offer RegEx HW offload, the lack of a standard
API makes it difficult for DPDK consumers to use them in a portable way.
 
This _RFC_ attempts to standardize the RegEx/DPI offload APIs for DPDK.
 
The Doxygen generated RFC API documentation is available here:
https://dreamy-noether-22777e.netlify.com/rte__regexdev_8h.html
 
This RFC is crafted based on SW RegEx API frameworks such as libpcre and
hyperscan and a few of the RegEx HW IPs which I am aware of.
 
RegEx pattern matching applications:
• Next Generation Firewalls (NGFW)
• Deep Packet and Flow Inspection (DPI)
• Intrusion Prevention Systems (IPS)
• DDoS Mitigation
• Network Monitoring
• Data Loss Prevention (DLP)
• Smart NICs
• Grammar based content processing
• URL, spam and adware filtering
• Advanced auditing and policing of user/application security policies
• Financial data mining - parsing of streamed financial feeds 
 
Requesting review from HW and SW RegEx vendors and RegEx application users
in order to arrive at a portable DPDK API for RegEx.
 
The API semantics are based on the existing cryptodev, eventdev and ethdev
device APIs.
 
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
---
 
RTE RegEx Device API
--------------------
 
Defines RTE RegEx Device APIs for RegEx operations and its provisioning.
 
The RegEx Device API is composed of two parts:
 
- The application-oriented RegEx API that includes functions to setup
a RegEx device (configure it, setup its queue pairs and start it),
update the rule database and so on.
 
- The driver-oriented RegEx API that exports a function allowing
a RegEx poll Mode Driver (PMD) to simultaneously register itself as
a RegEx device driver.
 
RegEx device components and definitions:
 
    +-----------------+
    |                 |
    |                 o---------+    rte_regex_[en|de]queue_burst()
    |   PCRE based    o------+  |               |
    |  RegEx pattern  |      |  |  +--------+   |
    | matching engine o------+--+--o        |   |    +------+
    |                 |      |  |  | queue  |<==o===>|Core 0|
    |                 o----+ |  |  | pair 0 |        |      |
    |                 |    | |  |  +--------+        +------+
    +-----------------+    | |  |
           ^               | |  |  +--------+
           |               | |  |  |        |        +------+
           |               | +--+--o queue  |<======>|Core 1|
       Rule|Database       |    |  | pair 1 |        |      |
    +------+----------+    |    |  +--------+        +------+
    |     Group 0     |    |    |
    | +-------------+ |    |    |  +--------+        +------+
    | | Rules 0..n  | |    |    |  |        |        |Core 2|
    | +-------------+ |    |    +--o queue  |<======>|      |
    |     Group 1     |    |       | pair 2 |        +------+
    | +-------------+ |    |       +--------+
    | | Rules 0..n  | |    |
    | +-------------+ |    |       +--------+
    |     Group 2     |    |       |        |        +------+
    | +-------------+ |    |       | queue  |<======>|Core n|
    | | Rules 0..n  | |    +-------o pair n |        |      |
    | +-------------+ |            +--------+        +------+
    |     Group n     |
    | +-------------+ |<-------rte_regex_rule_db_update()
    | | Rules 0..n  | |<-------rte_regex_rule_db_import()
    | +-------------+ |------->rte_regex_rule_db_export()
    +-----------------+
 
RegEx: A regular expression is a concise and flexible means for matching
strings of text, such as particular characters, words, or patterns of
characters. A common abbreviation for this is “RegEx”.
 
RegEx device: A hardware or software-based implementation of RegEx
device API for PCRE based pattern matching syntax and semantics.
 
PCRE RegEx syntax and semantics specification:
http://regexkit.sourceforge.net/Documentation/pcre/pcrepattern.html
 
RegEx queue pair: Each RegEx device should have one or more queue pairs to
transmit a burst of pattern matching requests and receive a burst of
pattern matching responses. The pattern matching request/response is
embedded in the *rte_regex_ops* structure.
 
Rule: A pattern matching rule expressed in PCRE RegEx syntax along with
Match ID and Group ID to identify the rule upon the match.
 
Rule database: The RegEx device accepts regular expressions and converts them
into a compiled rule database that can then be used to scan data.
Compilation allows the device to analyze the given pattern(s) and
pre-determine how to scan for these patterns in an optimized fashion that
would be far too expensive to compute at run-time. A rule database contains
a set of rules that are compiled in device specific binary form.
 
Match ID or Rule ID: A unique identifier provided at the time of rule
creation for the application to identify the rule upon match.
 
Group ID: Rules can be grouped under one group ID to enable
rule isolation and effective pattern matching. A unique group identifier is
provided at the time of rule creation for the application to identify the
rule upon match.
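 
To illustrate how the Rule ID and Group ID reach the application on a match,
here is a minimal sketch that mirrors the field layout of
struct rte_regex_match from this patch (20-bit rule_id, 12-bit group_id,
16-bit offset and len overlaid on a u64). The sketch_ names and both helpers
are hypothetical and not part of the proposed API, and the 8-byte size is
what common ABIs (e.g. x86-64 with GCC or Clang) produce rather than a
guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustration only: same field layout as struct rte_regex_match in the RFC. */
struct sketch_regex_match {
	union {
		uint64_t u64;
		struct {
			uint32_t rule_id:20;  /* rule that matched */
			uint32_t group_id:12; /* group of the matching rule */
			uint16_t offset;      /* start byte of the match */
			uint16_t len;         /* match length in bytes */
		};
	};
};

/* Hypothetical helper: first byte position after the match. */
static inline uint32_t
sketch_match_end(const struct sketch_regex_match *m)
{
	assert(m != NULL);
	return (uint32_t)m->offset + m->len;
}

/* Hypothetical helper: non-zero if the match belongs to the given group. */
static inline int
sketch_match_in_group(const struct sketch_regex_match *m, uint16_t group_id)
{
	assert(m != NULL);
	return m->group_id == group_id;
}
```

Because the tuple fits in one 64-bit word, an application could copy or
compare matches through the u64 member instead of field by field.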
 
Scan: A pattern matching request through *enqueue* API.
 
A given RegEx device may not support all the features
of PCRE. The application may probe unsupported features through
struct rte_regex_dev_info::pcre_unsup_flags.
 
By default, all the functions of the RegEx Device API exported by a PMD
are lock-free functions which are assumed not to be invoked in parallel on
different logical cores to work on the same target object. For instance,
the dequeue function of a PMD cannot be invoked in parallel on two logical
cores to operate on the same RegEx queue pair. Of course, this function can
be invoked in parallel by different logical cores on different queue pairs.
It is the responsibility of the upper level application to enforce this rule.
 
In all functions of the RegEx API, the RegEx device is
designated by an integer >= 0 named the device identifier *dev_id*.
 
At the RegEx driver level, RegEx devices are represented by a generic
data structure of type *rte_regex_dev*.
 
RegEx devices are dynamically registered during the PCI/SoC device probing
phase performed at EAL initialization time.
When a RegEx device is being probed, a *rte_regex_dev* structure and
a new device identifier are allocated for that device. Then, the
regex_dev_init() function supplied by the RegEx driver matching the probed
device is invoked to properly initialize the device.
 
The role of the device init function consists of resetting the hardware or
software RegEx driver implementations.
 
If the device init operation is successful, the correspondence between
the device identifier assigned to the new device and its associated
*rte_regex_dev* structure is effectively registered.
Otherwise, both the *rte_regex_dev* structure and the device identifier are
freed.
 
The functions exported by the application RegEx API to setup a device
designated by its device identifier must be invoked in the following order:
- rte_regex_dev_configure()
- rte_regex_queue_pair_setup()
- rte_regex_dev_start()
 
Then, the application can invoke, in any order, the functions
exported by the RegEx API to enqueue pattern matching jobs, dequeue pattern
matching responses, get the stats, update the rule database,
get/set device attributes and so on.
 
If the application wants to change the configuration (i.e. call
rte_regex_dev_configure() or rte_regex_queue_pair_setup()), it must call
rte_regex_dev_stop() first to stop the device and then do the reconfiguration
before calling rte_regex_dev_start() again. The enqueue and dequeue
functions should not be invoked when the device is stopped.
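 
The ordering rules above can be sketched as a small state machine. This is
only an illustration: the sketch_ names, the specific error codes (-EBUSY,
-EINVAL) and the choice that a reconfigure requires the queue pairs to be
set up again are assumptions, not behaviour defined by the RFC.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical device states used only by this sketch. */
enum sketch_dev_state {
	SKETCH_DEV_UNCONFIGURED,
	SKETCH_DEV_CONFIGURED, /* configure done, queue pairs not set up */
	SKETCH_DEV_READY,      /* queue pairs set up, device stopped */
	SKETCH_DEV_STARTED
};

struct sketch_regex_dev {
	enum sketch_dev_state state;
};

/* Configuration is refused while the device is running. */
static int
sketch_dev_configure(struct sketch_regex_dev *dev)
{
	assert(dev != NULL);
	if (dev->state == SKETCH_DEV_STARTED)
		return -EBUSY;
	dev->state = SKETCH_DEV_CONFIGURED;
	return 0;
}

/* Queue pair setup requires a configured, stopped device. */
static int
sketch_queue_pair_setup(struct sketch_regex_dev *dev)
{
	assert(dev != NULL);
	if (dev->state == SKETCH_DEV_STARTED)
		return -EBUSY;
	if (dev->state == SKETCH_DEV_UNCONFIGURED)
		return -EINVAL;
	dev->state = SKETCH_DEV_READY;
	return 0;
}

/* Start only succeeds once configure and queue pair setup are done. */
static int
sketch_dev_start(struct sketch_regex_dev *dev)
{
	assert(dev != NULL);
	if (dev->state == SKETCH_DEV_STARTED)
		return 0;
	if (dev->state != SKETCH_DEV_READY)
		return -EINVAL;
	dev->state = SKETCH_DEV_STARTED;
	return 0;
}

/* Stop returns a running device to the stopped-but-ready state. */
static void
sketch_dev_stop(struct sketch_regex_dev *dev)
{
	assert(dev != NULL);
	if (dev->state == SKETCH_DEV_STARTED)
		dev->state = SKETCH_DEV_READY;
}

/* Enqueue and dequeue are only meaningful on a started device. */
static int
sketch_fast_path_allowed(const struct sketch_regex_dev *dev)
{
	assert(dev != NULL);
	return dev->state == SKETCH_DEV_STARTED;
}
```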
 
Finally, an application can close a RegEx device by invoking the
rte_regex_dev_close() function.
 
Each function of the application RegEx API invokes a specific function
of the PMD that controls the target device designated by its device
identifier.
 
For this purpose, all device-specific functions of a RegEx driver are
supplied through a set of pointers contained in a generic structure of type
*regex_dev_ops*.
The address of the *regex_dev_ops* structure is stored in the *rte_regex_dev*
structure by the device init function of the RegEx driver, which is
invoked during the PCI/SoC device probing phase, as explained earlier.
 
In other words, each function of the RegEx API simply retrieves the
*rte_regex_dev* structure associated with the device identifier and
performs an indirect invocation of the corresponding driver function
supplied in the *regex_dev_ops* structure of the *rte_regex_dev* structure.
 
For performance reasons, the address of the fast-path functions of the
RegEx driver is not contained in the *regex_dev_ops* structure.
Instead, they are directly stored at the beginning of the *rte_regex_dev*
structure to avoid an extra indirect memory access during their invocation.
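 
A rough sketch of the two dispatch paths described above, under stated
assumptions: every sketch_ and toy_ name, the member names of the ops table,
the device table size and the void ** op type are hypothetical, since the
RFC does not yet define *rte_regex_dev* or *regex_dev_ops*. Control-path
calls go through the ops table; the burst functions sit at the start of the
device structure and are called with one less indirection.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_regex_dev;

typedef uint16_t (*sketch_burst_t)(struct sketch_regex_dev *dev,
				   uint16_t qp_id, void **ops, uint16_t nb_ops);

/* Slow-path (control) callbacks supplied by the driver. */
struct sketch_regex_dev_ops {
	int (*dev_start)(struct sketch_regex_dev *dev);
};

struct sketch_regex_dev {
	/* Fast-path pointers first, reached without going through dev_ops. */
	sketch_burst_t enqueue;
	sketch_burst_t dequeue;
	/* Slow-path table, filled in by the driver's init function. */
	const struct sketch_regex_dev_ops *dev_ops;
};

#define SKETCH_MAX_DEVS 4
static struct sketch_regex_dev sketch_devs[SKETCH_MAX_DEVS];

/* Control path: look up the device, then call through the ops table. */
static int
sketch_dev_start(uint8_t dev_id)
{
	struct sketch_regex_dev *dev;

	if (dev_id >= SKETCH_MAX_DEVS)
		return -EINVAL;
	dev = &sketch_devs[dev_id];
	if (dev->dev_ops == NULL || dev->dev_ops->dev_start == NULL)
		return -ENOTSUP;
	return dev->dev_ops->dev_start(dev);
}

/* Fast path: no parameter checks, single indirect call. */
static inline uint16_t
sketch_enqueue_burst(uint8_t dev_id, uint16_t qp_id, void **ops,
		     uint16_t nb_ops)
{
	struct sketch_regex_dev *dev = &sketch_devs[dev_id];

	return dev->enqueue(dev, qp_id, ops, nb_ops);
}

/* Toy driver: accepts every op and starts without doing any work. */
static int
toy_start(struct sketch_regex_dev *dev)
{
	(void)dev;
	return 0;
}

static uint16_t
toy_enqueue(struct sketch_regex_dev *dev, uint16_t qp_id, void **ops,
	    uint16_t nb_ops)
{
	(void)dev; (void)qp_id; (void)ops;
	return nb_ops;
}

static const struct sketch_regex_dev_ops toy_ops = { .dev_start = toy_start };

/* Stands in for the driver's device init function run at probe time. */
static void
toy_dev_init(uint8_t dev_id)
{
	assert(dev_id < SKETCH_MAX_DEVS);
	sketch_devs[dev_id].enqueue = toy_enqueue;
	sketch_devs[dev_id].dequeue = NULL;
	sketch_devs[dev_id].dev_ops = &toy_ops;
}
```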
 
RTE RegEx device drivers do not use interrupts for enqueue or dequeue
operation. Instead, RegEx drivers export Poll-Mode enqueue and dequeue
functions to applications.
 
The *enqueue* operation submits a burst of RegEx pattern matching request
to the RegEx device and the *dequeue* operation gets a burst of pattern
matching response for the ones submitted through *enqueue* operation.
 
Typical application utilisation of the RegEx device API will follow the
following programming flow.
 
- rte_regex_dev_configure()
- rte_regex_queue_pair_setup()
- rte_regex_rule_db_update() Needs to be invoked if a precompiled rule database
is not provided in rte_regex_dev_config::rule_db for rte_regex_dev_configure()
and/or the application needs to update the rule database.
- Create or reuse an existing mempool for *rte_regex_ops* objects.
- rte_regex_dev_start()
- rte_regex_enqueue_burst()
- rte_regex_dequeue_burst()
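 
The last two steps of the flow above might look like the loop below. It is
a self-contained sketch against a mock queue pair, not the proposed API: all
sketch_ names, the queue depth and the burst size are made up, and the mock
completes every op immediately. It shows the two points the burst functions
document: an enqueue may accept fewer ops than offered, so the caller retries
the tail, and a dequeue returning exactly the requested count suggests more
responses are pending.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_QP_DEPTH 4 /* mock queue capacity */
#define SKETCH_BURST    2 /* dequeue burst size used by the application */

struct sketch_op {
	uint64_t user_id; /* opaque application value */
};

struct sketch_qp {
	struct sketch_op *slots[SKETCH_QP_DEPTH];
	uint16_t count;
};

/* Mock enqueue: accepts ops until the queue is full. */
static uint16_t
sketch_enqueue_burst(struct sketch_qp *qp, struct sketch_op **ops,
		     uint16_t nb_ops)
{
	uint16_t n = 0;

	while (n < nb_ops && qp->count < SKETCH_QP_DEPTH)
		qp->slots[qp->count++] = ops[n++];
	return n;
}

/* Mock dequeue: returns up to nb_ops completed ops in FIFO order. */
static uint16_t
sketch_dequeue_burst(struct sketch_qp *qp, struct sketch_op **ops,
		     uint16_t nb_ops)
{
	uint16_t n = 0, i;

	while (n < nb_ops && n < qp->count) {
		ops[n] = qp->slots[n];
		n++;
	}
	for (i = n; i < qp->count; i++)
		qp->slots[i - n] = qp->slots[i];
	qp->count = (uint16_t)(qp->count - n);
	return n;
}

/* Keep dequeuing while each burst comes back full. */
static uint16_t
sketch_drain(struct sketch_qp *qp, struct sketch_op **done, uint16_t max_done)
{
	uint16_t total = 0, n, want;

	do {
		want = (uint16_t)(max_done - total);
		if (want > SKETCH_BURST)
			want = SKETCH_BURST;
		if (want == 0)
			break;
		n = sketch_dequeue_burst(qp, &done[total], want);
		total = (uint16_t)(total + n);
	} while (n == want);
	return total;
}

/* Submit all ops, retrying the unconsumed tail, and collect all responses. */
static uint16_t
sketch_scan_all(struct sketch_qp *qp, struct sketch_op **ops, uint16_t nb_ops,
		struct sketch_op **done)
{
	uint16_t sent = 0, completed = 0;

	assert(qp != NULL);
	while (completed < nb_ops) {
		sent = (uint16_t)(sent + sketch_enqueue_burst(qp, &ops[sent],
					(uint16_t)(nb_ops - sent)));
		completed = (uint16_t)(completed + sketch_drain(qp,
					&done[completed],
					(uint16_t)(nb_ops - completed)));
	}
	return completed;
}
```

A real application would interleave enqueue and dequeue with other work
rather than spin until everything completes.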
 
---
 
config/common_base                 |    5 +
doc/api/doxy-api-index.md          |    1 +
doc/api/doxy-api.conf.in           |    1 +
lib/Makefile                       |    2 +
lib/librte_regexdev/Makefile       |   23 +
lib/librte_regexdev/rte_regexdev.c |    5 +
lib/librte_regexdev/rte_regexdev.h | 1247 ++++++++++++++++++++++++++++
7 files changed, 1284 insertions(+)
create mode 100644 lib/librte_regexdev/Makefile
create mode 100644 lib/librte_regexdev/rte_regexdev.c
create mode 100644 lib/librte_regexdev/rte_regexdev.h
 
diff --git a/config/common_base b/config/common_base
index e406e7836..986093d6e 100644
--- a/config/common_base
+++ b/config/common_base
@@ -746,6 +746,11 @@ CONFIG_RTE_LIBRTE_PMD_DPAA2_QDMA_RAWDEV=n
#
CONFIG_RTE_LIBRTE_PMD_IFPGA_RAWDEV=y
  
+#
+# Compile regex device support
+#
+CONFIG_RTE_LIBRTE_REGEXDEV=y
+
#
# Compile librte_ring
#
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index 715248dd1..a0bc27ae4 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -26,6 +26,7 @@ The public API headers are grouped by topics:
[event_timer_adapter]    (@ref rte_event_timer_adapter.h),
[event_crypto_adapter]   (@ref rte_event_crypto_adapter.h),
[rawdev]             (@ref rte_rawdev.h),
+  [regexdev]           (@ref rte_regexdev.h),
[metrics]            (@ref rte_metrics.h),
[bitrate]            (@ref rte_bitrate.h),
[latency]            (@ref rte_latencystats.h),
diff --git a/doc/api/doxy-api.conf.in b/doc/api/doxy-api.conf.in
index b9896cb63..7adb821bb 100644
--- a/doc/api/doxy-api.conf.in
+++ b/doc/api/doxy-api.conf.in
@@ -53,6 +53,7 @@ INPUT                   = @TOPDIR@/doc/api/doxy-api-index.md \
@TOPDIR@/lib/librte_rawdev \
@TOPDIR@/lib/librte_rcu \
@TOPDIR@/lib/librte_reorder \
+                          @TOPDIR@/lib/librte_regexdev \
@TOPDIR@/lib/librte_ring \
@TOPDIR@/lib/librte_sched \
@TOPDIR@/lib/librte_security \
diff --git a/lib/Makefile b/lib/Makefile
index 791e0d991..57de9691a 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -44,6 +44,8 @@ DEPDIRS-librte_eventdev := librte_eal librte_ring librte_ethdev librte_hash \
librte_mempool librte_timer librte_cryptodev
DIRS-$(CONFIG_RTE_LIBRTE_RAWDEV) += librte_rawdev
DEPDIRS-librte_rawdev := librte_eal librte_ethdev
+DIRS-$(CONFIG_RTE_LIBRTE_REGEXDEV) += librte_regexdev
+DEPDIRS-librte_regexdev := librte_eal
DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += librte_vhost
DEPDIRS-librte_vhost := librte_eal librte_mempool librte_mbuf librte_ethdev \
			librte_net
diff --git a/lib/librte_regexdev/Makefile b/lib/librte_regexdev/Makefile
new file mode 100644
index 000000000..723b4b28c
--- /dev/null
+++ b/lib/librte_regexdev/Makefile
@@ -0,0 +1,23 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(C) 2019 Marvell International Ltd.
+#
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# library name
+LIB = librte_regexdev.a
+
+# library version
+LIBABIVER := 1
+
+# build flags
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+
+# library source files
+SRCS-y += rte_regexdev.c
+
+# export include files
+SYMLINK-y-include += rte_regexdev.h
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_regexdev/rte_regexdev.c b/lib/librte_regexdev/rte_regexdev.c
new file mode 100644
index 000000000..e5be0f29c
--- /dev/null
+++ b/lib/librte_regexdev/rte_regexdev.c
@@ -0,0 +1,5 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(C) 2019 Marvell International Ltd.
+ */
+
+#include <rte_regexdev.h>
diff --git a/lib/librte_regexdev/rte_regexdev.h b/lib/librte_regexdev/rte_regexdev.h
new file mode 100644
index 000000000..765da4aaa
--- /dev/null
+++ b/lib/librte_regexdev/rte_regexdev.h
@@ -0,0 +1,1247 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(C) 2019 Marvell International Ltd.
+ */
+
+#ifndef _RTE_REGEXDEV_H_
+#define _RTE_REGEXDEV_H_
+
+/**
+ * @file
+ *
+ * RTE RegEx Device API
+ *
+ * Defines RTE RegEx Device APIs for RegEx operations and its provisioning.
+ *
+ * The RegEx Device API is composed of two parts:
+ *
+ * - The application-oriented RegEx API that includes functions to setup
+ *   a RegEx device (configure it, setup its queue pairs and start it),
+ *   update the rule database and so on.
+ *
+ * - The driver-oriented RegEx API that exports a function allowing
+ *   a RegEx poll Mode Driver (PMD) to simultaneously register itself as
+ *   a RegEx device driver.
+ *
+ * RegEx device components and definitions:
+ *
+ *     +-----------------+
+ *     |                 |
+ *     |                 o---------+    rte_regex_[en|de]queue_burst()
+ *     |   PCRE based    o------+  |               |
+ *     |  RegEx pattern  |      |  |  +--------+   |
+ *     | matching engine o------+--+--o        |   |    +------+
+ *     |                 |      |  |  | queue  |<==o===>|Core 0|
+ *     |                 o----+ |  |  | pair 0 |        |      |
+ *     |                 |    | |  |  +--------+        +------+
+ *     +-----------------+    | |  |
+ *            ^               | |  |  +--------+
+ *            |               | |  |  |        |        +------+
+ *            |               | +--+--o queue  |<======>|Core 1|
+ *        Rule|Database       |    |  | pair 1 |        |      |
+ *     +------+----------+    |    |  +--------+        +------+
+ *     |     Group 0     |    |    |
+ *     | +-------------+ |    |    |  +--------+        +------+
+ *     | | Rules 0..n  | |    |    |  |        |        |Core 2|
+ *     | +-------------+ |    |    +--o queue  |<======>|      |
+ *     |     Group 1     |    |       | pair 2 |        +------+
+ *     | +-------------+ |    |       +--------+
+ *     | | Rules 0..n  | |    |
+ *     | +-------------+ |    |       +--------+
+ *     |     Group 2     |    |       |        |        +------+
+ *     | +-------------+ |    |       | queue  |<======>|Core n|
+ *     | | Rules 0..n  | |    +-------o pair n |        |      |
+ *     | +-------------+ |            +--------+        +------+
+ *     |     Group n     |
+ *     | +-------------+ |<-------rte_regex_rule_db_update()
+ *     | | Rules 0..n  | |<-------rte_regex_rule_db_import()
+ *     | +-------------+ |------->rte_regex_rule_db_export()
+ *     +-----------------+
+ *
+ * RegEx: A regular expression is a concise and flexible means for matching
+ * strings of text, such as particular characters, words, or patterns of
+ * characters. A common abbreviation for this is “RegEx”.
+ *
+ * RegEx device: A hardware or software-based implementation of RegEx
+ * device API for PCRE based pattern matching syntax and semantics.
+ *
+ * PCRE RegEx syntax and semantics specification:
+ * http://regexkit.sourceforge.net/Documentation/pcre/pcrepattern.html
+ *
+ * RegEx queue pair: Each RegEx device should have one or more queue pairs to
+ * transmit a burst of pattern matching requests and receive a burst of
+ * pattern matching responses. The pattern matching request/response is
+ * embedded in the *rte_regex_ops* structure.
+ *
+ * Rule: A pattern matching rule expressed in PCRE RegEx syntax along with
+ * Match ID and Group ID to identify the rule upon the match.
+ *
+ * Rule database: The RegEx device accepts regular expressions and converts them
+ * into a compiled rule database that can then be used to scan data.
+ * Compilation allows the device to analyze the given pattern(s) and
+ * pre-determine how to scan for these patterns in an optimized fashion that
+ * would be far too expensive to compute at run-time. A rule database contains
+ * a set of rules that are compiled in device specific binary form.
+ *
+ * Match ID or Rule ID: A unique identifier provided at the time of rule
+ * creation for the application to identify the rule upon match.
+ *
+ * Group ID: Rules can be grouped under one group ID to enable
+ * rule isolation and effective pattern matching. A unique group identifier is
+ * provided at the time of rule creation for the application to identify the
+ * rule upon match.
+ *
+ * Scan: A pattern matching request through *enqueue* API.
+ *
+ * A given RegEx device may not support all the features
+ * of PCRE. The application may probe unsupported features through
+ * struct rte_regex_dev_info::pcre_unsup_flags.
+ *
+ * By default, all the functions of the RegEx Device API exported by a PMD
+ * are lock-free functions which are assumed not to be invoked in parallel on
+ * different logical cores to work on the same target object. For instance,
+ * the dequeue function of a PMD cannot be invoked in parallel on two logical
+ * cores to operate on the same RegEx queue pair. Of course, this function can
+ * be invoked in parallel by different logical cores on different queue pairs.
+ * It is the responsibility of the upper level application to enforce this rule.
+ *
+ * In all functions of the RegEx API, the RegEx device is
+ * designated by an integer >= 0 named the device identifier *dev_id*.
+ *
+ * At the RegEx driver level, RegEx devices are represented by a generic
+ * data structure of type *rte_regex_dev*.
+ *
+ * RegEx devices are dynamically registered during the PCI/SoC device probing
+ * phase performed at EAL initialization time.
+ * When a RegEx device is being probed, a *rte_regex_dev* structure and
+ * a new device identifier are allocated for that device. Then, the
+ * regex_dev_init() function supplied by the RegEx driver matching the probed
+ * device is invoked to properly initialize the device.
+ *
+ * The role of the device init function consists of resetting the hardware or
+ * software RegEx driver implementations.
+ *
+ * If the device init operation is successful, the correspondence between
+ * the device identifier assigned to the new device and its associated
+ * *rte_regex_dev* structure is effectively registered.
+ * Otherwise, both the *rte_regex_dev* structure and the device identifier are
+ * freed.
+ *
+ * The functions exported by the application RegEx API to setup a device
+ * designated by its device identifier must be invoked in the following order:
+ *     - rte_regex_dev_configure()
+ *     - rte_regex_queue_pair_setup()
+ *     - rte_regex_dev_start()
+ *
+ * Then, the application can invoke, in any order, the functions
+ * exported by the RegEx API to enqueue pattern matching jobs, dequeue pattern
+ * matching responses, get the stats, update the rule database,
+ * get/set device attributes and so on.
+ *
+ * If the application wants to change the configuration (i.e. call
+ * rte_regex_dev_configure() or rte_regex_queue_pair_setup()), it must call
+ * rte_regex_dev_stop() first to stop the device and then do the reconfiguration
+ * before calling rte_regex_dev_start() again. The enqueue and dequeue
+ * functions should not be invoked when the device is stopped.
+ *
+ * Finally, an application can close a RegEx device by invoking the
+ * rte_regex_dev_close() function.
+ *
+ * Each function of the application RegEx API invokes a specific function
+ * of the PMD that controls the target device designated by its device
+ * identifier.
+ *
+ * For this purpose, all device-specific functions of a RegEx driver are
+ * supplied through a set of pointers contained in a generic structure of type
+ * *regex_dev_ops*.
+ * The address of the *regex_dev_ops* structure is stored in the *rte_regex_dev*
+ * structure by the device init function of the RegEx driver, which is
+ * invoked during the PCI/SoC device probing phase, as explained earlier.
+ *
+ * In other words, each function of the RegEx API simply retrieves the
+ * *rte_regex_dev* structure associated with the device identifier and
+ * performs an indirect invocation of the corresponding driver function
+ * supplied in the *regex_dev_ops* structure of the *rte_regex_dev* structure.
+ *
+ * For performance reasons, the address of the fast-path functions of the
+ * RegEx driver is not contained in the *regex_dev_ops* structure.
+ * Instead, they are directly stored at the beginning of the *rte_regex_dev*
+ * structure to avoid an extra indirect memory access during their invocation.
+ *
+ * RTE RegEx device drivers do not use interrupts for enqueue or dequeue
+ * operation. Instead, RegEx drivers export Poll-Mode enqueue and dequeue
+ * functions to applications.
+ *
+ * The *enqueue* operation submits a burst of RegEx pattern matching requests
+ * to the RegEx device and the *dequeue* operation gets a burst of pattern
+ * matching responses for the ones submitted through the *enqueue* operation.
+ *
+ * Typical application utilisation of the RegEx device API follows this
+ * programming flow:
+ *
+ * - rte_regex_dev_configure()
+ * - rte_regex_queue_pair_setup()
+ * - rte_regex_rule_db_update() Needs to be invoked if a precompiled rule
+ *   database is not provided in rte_regex_dev_config::rule_db for
+ *   rte_regex_dev_configure() and/or the application needs to update the rule
+ *   database.
+ * - Create or reuse an existing mempool for *rte_regex_ops* objects.
+ * - rte_regex_dev_start()
+ * - rte_regex_enqueue_burst()
+ * - rte_regex_dequeue_burst()
+ *
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <rte_common.h>
+#include <rte_config.h>
+#include <rte_dev.h>
+#include <rte_errno.h>
+#include <rte_memory.h>
+
+/**
+ * Get the total number of RegEx devices that have been successfully
+ * initialised.
+ *
+ * @return
+ *   The total number of usable RegEx devices.
+ */
+uint8_t
+rte_regex_dev_count(void);
+
+/**
+ * Get the device identifier for the named RegEx device.
+ *
+ * @param name
+ *   RegEx device name to select the RegEx device identifier.
+ *
+ * @return
+ *   Returns RegEx device identifier on success.
+ *   - <0: Failure to find named RegEx device.
+ */
+int
+rte_regex_dev_get_dev_id(const char *name);
+
+/* Enumerates RegEx device capabilities */
+#define RTE_REGEX_DEV_CAPA_RUNTIME_COMPILATION_F (1ULL << 0)
+/**< RegEx device supports compiling the rules at runtime, as opposed to only
+ * loading a pre-built rule database using
+ * struct rte_regex_dev_config::rule_db in rte_regex_dev_configure()
+ * @see struct rte_regex_dev_config::rule_db, rte_regex_dev_configure()
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+
+/* Enumerates unsupported PCRE features for the RegEx device */
+#define RTE_REGEX_DEV_PCRE_UNSUP_START_ANCHOR_F (1ULL << 0)
+/**< RegEx device doesn't support PCRE Anchor to start of match flag.
+ * Example RegEx is '/\Gfoo\d/'. Here '\G' asserts position at the end of the
+ * previous match or the start of the string for the first match.
+ * This position will change each time the RegEx is applied to the subject
+ * string. If the RegEx is applied to 'foo1foo2Zfoo3' the first two matches will
+ * be successful for 'foo1foo2' and fail for 'Zfoo3'.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_ATOMIC_GROUPING_F (1ULL << 1)
+/**< RegEx device doesn't support PCRE Atomic grouping.
+ * Atomic groups are represented by '(?>)'. An atomic group is a group that,
+ * when the RegEx engine exits from it, automatically throws away all
+ * backtracking positions remembered by any tokens inside the group.
+ * Example RegEx is 'a(?>bc|b)c'. If the given patterns are 'abc' and 'abcc'
+ * then 'a(bc|b)c' matches both whereas 'a(?>bc|b)c' matches only 'abcc'
+ * because atomic groups don't allow backtracking to 'b'.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_BACKTRACKING_CTRL_F (1ULL << 2)
+/**< RegEx device doesn't support PCRE backtracking control verbs.
+ * Some examples of backtracking verbs are (*COMMIT), (*ACCEPT), (*FAIL),
+ * (*SKIP), (*PRUNE).
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_CALLOUTS_F (1ULL << 3)
+/**< RegEx device doesn't support PCRE callouts.
+ * PCRE supports calling an external function in between matches by using
+ * '(?C)'. Example RegEx is 'ABC(?C)D'. If a given pattern is 'ABCD' then the
+ * RegEx engine will parse 'ABC', perform a user-defined callout and return a
+ * successful match at 'D'.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_BACKREFERENCE_F (1ULL << 4)
+/**< RegEx device doesn't support PCRE backreference.
+ * Example RegEx is '(\2ABC|(GHI))+' \2 matches the same text as most recently
+ * matched by the 2nd capturing group i.e. 'GHI'.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_GREEDY_F (1ULL << 5)
+/**< RegEx device doesn't support PCRE Greedy mode.
+ * For example, in the RegEx 'AB\d*' the quantifier '*' represents zero or
+ * unlimited matches. In greedy mode the pattern 'AB12345' will be matched
+ * completely, whereas in ungreedy mode ('AB\d*?') only 'AB' will be returned
+ * as the match.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_LOOKAROUND_ASRT_F (1ULL << 6)
+/**< RegEx device doesn't support PCRE Lookaround assertions
+ * (Zero-width assertions). Example RegEx is '[a-z]+\d+(?=!{3,})' if
+ * the given pattern is 'dwad1234!' the RegEx engine doesn't report any matches
+ * because the assert '(?=!{3,})' fails. The pattern 'dwad123!!!' would return a
+ * successful match.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_MATCH_POINT_RST_F (1ULL << 7)
+/**< RegEx device doesn't support PCRE match point reset directive.
+ * Example RegEx is '[a-z]+\K\d+'. If the pattern is 'dwad123'
+ * then, even though the entire pattern matches, only '123'
+ * is reported as a match.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_NEWLINE_CONVENTIONS_F (1ULL << 8)
+/**< RegEx device doesn't support PCRE newline convention.
+ * Newline conventions are represented as follows:
+ * (*CR)        carriage return
+ * (*LF)        linefeed
+ * (*CRLF)      carriage return, followed by linefeed
+ * (*ANYCRLF)   any of the three above
+ * (*ANY)       all Unicode newline sequences
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_NEWLINE_SEQ_F (1ULL << 9)
+/**< RegEx device doesn't support PCRE newline sequence.
+ * The escape sequence '\R' will match any newline sequence.
+ * It is equivalent to: '(?>\r\n|\n|\x0b|\f|\r|\x85)'.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_POSSESSIVE_QUALIFIERS_F (1ULL << 10)
+/**< RegEx device doesn't support PCRE possessive quantifiers.
+ * Examples of possessive quantifiers are '*+', '++', '?+', '{m,n}+'.
+ * A possessive quantifier repeats the token as many times as possible and it
+ * does not give up matches as the engine backtracks. With a possessive
+ * quantifier, the deal is all or nothing.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_SUBROUTINE_REFERENCES_F (1ULL << 11)
+/**< RegEx device doesn't support PCRE Subroutine references.
+ * PCRE Subroutine references allow for sub patterns to be assessed
+ * as part of the RegEx. Example RegEx is '(foo|fuzz)\g<1>+bar' matches the
+ * pattern 'foofoofuzzfoofuzzbar'.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_UTF_8_F (1ULL << 12)
+/**< RegEx device doesn't support UTF-8 character encoding.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_UTF_16_F (1ULL << 13)
+/**< RegEx device doesn't support UTF-16 character encoding.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_UTF_32_F (1ULL << 14)
+/**< RegEx device doesn't support UTF-32 character encoding.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_WORD_BOUNDARY_F (1ULL << 15)
+/**< RegEx device doesn't support word boundaries.
+ * The meta character '\b' represents word boundary anchor.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_FORWARD_REFERENCES_F (1ULL << 16)
+/**< RegEx device doesn't support Forward references.
+ * Forward references allow you to use a back reference to a group that appears
+ * later in the RegEx. Example RegEx is '(\3ABC|(DEF|(GHI)))+' matches the
+ * following string 'GHIGHIABCDEF'.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+/* Enumerates PCRE rule flags */
+#define RTE_REGEX_PCRE_RULE_ALLOW_EMPTY_F (1ULL << 0)
+/**< When this flag is set, patterns that can match against an empty string,
+ * such as '.*', are allowed.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_ANCHORED_F (1ULL << 1)
+/**< When this flag is set, the pattern is forced to be "anchored", that is, it
+ * is constrained to match only at the first matching point in the string that
+ * is being searched. Similar to '^' and represented by \A.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_CASELESS_F (1ULL << 2)
+/**< When this flag is set, letters in the pattern match both upper and lower
+ * case letters in the subject.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_DOTALL_F (1ULL << 3)
+/**< When this flag is set, a dot metacharacter in the pattern matches any
+ * character, including one that indicates a newline.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_DUPNAMES_F (1ULL << 4)
+/**< When this flag is set, names used to identify capture groups need not be
+ * unique.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_EXTENDED_F (1ULL << 5)
+/**< When this flag is set, most white space characters in the pattern are
+ * totally ignored except when escaped or inside a character class.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_MATCH_UNSET_BACKREF_F (1ULL << 6)
+/**< When this flag is set, a backreference to an unset capture group matches an
+ * empty string.
+ * @see RTE_REGEX_DEV_PCRE_UNSUP_FORWARD_REFERENCES_F
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_MULTILINE_F (1ULL << 7)
+/**< When this flag is set, the '^' and '$' constructs match immediately
+ * following or immediately before internal newlines in the subject string,
+ * respectively, as well as at the very start and end.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_NO_AUTO_CAPTURE_F (1ULL << 8)
+/**< When this flag is set, it disables the use of numbered capturing
+ * parentheses in the pattern. References to capture groups (backreferences or
+ * recursion/subroutine calls) may only refer to named groups, though the
+ * reference can be by name or by number.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_UCP_F (1ULL << 9)
+/**< By default, only ASCII characters are recognized. When this flag is set,
+ * Unicode properties are used instead to classify characters.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_UNGREEDY_F (1ULL << 10)
+/**< When this flag is set, the "greediness" of the quantifiers is inverted
+ * so that they are not greedy by default, but become greedy if followed by
+ * '?'.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_UTF_F (1ULL << 11)
+/**< When this flag is set, RegEx engine has to regard both the pattern and the
+ * subject strings that are subsequently processed as strings of UTF characters
+ * instead of single-code-unit strings.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_NEVER_BACKSLASH_C_F (1ULL << 12)
+/**< This flag locks out the use of '\C' in the pattern that is being compiled.
+ * This escape matches one data unit, even in UTF mode, which can cause
+ * unpredictable behavior in UTF-8 or UTF-16 modes, because it may leave the
+ * current matching point in the middle of a multi-code-unit character.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+
+/**
+ * RegEx device information
+ */
+struct rte_regex_dev_info {
+	const char *driver_name; /**< RegEx driver name */
+	struct rte_device *dev;	/**< Device information */
+	uint8_t max_matches;
+	/**< Maximum matches per scan supported by this device */
+	uint16_t max_queue_pairs;
+	/**< Maximum queue pairs supported by this device */
+	uint16_t max_payload_size;
+	/**< Maximum payload size for a pattern match request or scan.
+	 * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
+	 */
+	uint16_t max_rules_per_group;
+	/**< Maximum rules supported per group by this device */
+	uint16_t max_groups;
+	/**< Maximum number of groups supported by this device */
+	uint32_t regex_dev_capa;
+	/**< RegEx device capabilities. @see RTE_REGEX_DEV_CAPA_* */
+	uint64_t rule_flags;
+	/**< Supported compiler rule flags.
+	 * @see RTE_REGEX_PCRE_RULE_*, struct rte_regex_rule::rule_flags
+	 */
+	uint64_t pcre_unsup_flags;
+	/**< Unsupported PCRE features for this RegEx device.
+	 * @see RTE_REGEX_DEV_PCRE_UNSUP_*
+	 */
+};
+
+/**
+ * Retrieve the contextual information of a RegEx device.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ *
+ * @param[out] dev_info
+ *   A pointer to a structure of type *rte_regex_dev_info* to be filled with the
+ *   contextual information of the device.
+ *
+ * @return
+ *   - 0: Success, driver updates the contextual information of the RegEx device
+ *   - <0: Error code returned by the driver info get function.
+ *
+ */
+int
+rte_regex_dev_info_get(uint8_t dev_id, struct rte_regex_dev_info *dev_info);
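+
+/*
+ * Illustrative sketch only, not part of the API: one way an application
+ * might check a PCRE feature and a rule flag before building its rules.
+ * The helper name app_can_use_caseless_backref() is hypothetical.
+ *
+ *	static int
+ *	app_can_use_caseless_backref(uint8_t dev_id)
+ *	{
+ *		struct rte_regex_dev_info info;
+ *
+ *		if (rte_regex_dev_info_get(dev_id, &info) < 0)
+ *			return 0;
+ *		// A set bit in pcre_unsup_flags means "NOT supported".
+ *		if (info.pcre_unsup_flags &
+ *		    RTE_REGEX_DEV_PCRE_UNSUP_BACKREFERENCE_F)
+ *			return 0;
+ *		// A set bit in rule_flags means "supported".
+ *		return !!(info.rule_flags & RTE_REGEX_PCRE_RULE_CASELESS_F);
+ *	}
+ */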
+
+/* Enumerates RegEx device configuration flags */
+#define RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F (1ULL << 0)
+/**< Cross buffer scan refers to the ability to detect matches that occur
+ * across buffer boundaries, where the buffers are related to each other in
+ * some way. Enable this flag when the payload size to scan is greater than
+ * struct rte_regex_dev_info::max_payload_size and/or matches can be present
+ * across scan buffer boundaries.
+ *
+ * @see struct rte_regex_dev_info::max_payload_size
+ * @see struct rte_regex_dev_config::dev_cfg_flags, rte_regex_dev_configure()
+ * @see RTE_REGEX_OPS_RSP_PMI_SOJ_F
+ * @see RTE_REGEX_OPS_RSP_PMI_EOJ_F
+ */
+
+/** RegEx device configuration structure */
+struct rte_regex_dev_config {
+	uint8_t nb_max_matches;
+	/**< Maximum matches per scan configured on this device.
+	 * This value cannot exceed the *max_matches*
+	 * previously provided by rte_regex_dev_info_get().
+	 * The value 0 is allowed, in which case the value 1 is used.
+	 * @see struct rte_regex_dev_info::max_matches
+	 */
+	uint16_t nb_queue_pairs;
+	/**< Number of RegEx queue pairs to configure on this device.
+	 * This value cannot exceed the *max_queue_pairs* previously
+	 * provided by rte_regex_dev_info_get().
+	 * @see struct rte_regex_dev_info::max_queue_pairs
+	 */
+	uint16_t nb_rules_per_group;
+	/**< Number of rules per group to configure on this device.
+	 * This value cannot exceed the *max_rules_per_group*
+	 * previously provided by rte_regex_dev_info_get().
+	 * The value 0 is allowed, in which case
+	 * struct rte_regex_dev_info::max_rules_per_group is used.
+	 * @see struct rte_regex_dev_info::max_rules_per_group
+	 */
+	uint16_t nb_groups;
+	/**< Number of groups to configure on this device.
+	 * This value cannot exceed the *max_groups*
+	 * previously provided by rte_regex_dev_info_get().
+	 * @see struct rte_regex_dev_info::max_groups
+	 */
+	const char *rule_db;
+	/**< Import an initial prebuilt rule database on this device.
+	 * The value NULL is allowed, in which case the device will not
+	 * be configured with a prebuilt rule database. The application may use
+	 * the rte_regex_rule_db_update() or rte_regex_rule_db_import() API
+	 * to update or import a rule database after
+	 * rte_regex_dev_configure().
+	 * @see rte_regex_rule_db_update(), rte_regex_rule_db_import()
+	 */
+	uint32_t rule_db_len;
+	/**< Length of *rule_db* buffer. */
+	uint32_t dev_cfg_flags;
+	/**< RegEx device configuration flags, See RTE_REGEX_DEV_CFG_*  */
+};
+
+/**
+ * Configure a RegEx device.
+ *
+ * This function must be invoked first before any other function in the
+ * API. This function can also be re-invoked when a device is in the
+ * stopped state.
+ *
+ * The caller may use rte_regex_dev_info_get() to get the capability of each
+ * resource available for this regex device.
+ *
+ * @param dev_id
+ *   The identifier of the device to configure.
+ * @param cfg
+ *   The RegEx device configuration structure.
+ *
+ * @return
+ *   - 0: Success, device configured.
+ *   - <0: Error code returned by the driver configuration function.
+ */
+int
+rte_regex_dev_configure(uint8_t dev_id, const struct rte_regex_dev_config *cfg);
+
+/* Enumerates RegEx queue pair configuration flags */
+#define RTE_REGEX_QUEUE_PAIR_CFG_OOS_F (1ULL << 0)
+/**< Out of order scan. If not set, a scan must retire after previously issued
+ * in-order scans to this queue pair. If set, this scan can be retired as soon
+ * as the device returns completion. The application should not set the out of
+ * order scan flag if it needs to maintain the ingress order of scan requests.
+ *
+ * @see struct rte_regex_qp_conf::qp_conf_flags, rte_regex_queue_pair_setup()
+ */
+
+struct rte_regex_ops;
+typedef void (*regexdev_stop_flush_t)(uint8_t dev_id, uint16_t qp_id,
+				      struct rte_regex_ops *op);
+/**< Callback function called during rte_regex_dev_stop(), invoked once per
+ * flushed RegEx op.
+ */
+
+/** RegEx queue pair configuration structure */
+struct rte_regex_qp_conf {
+	uint32_t qp_conf_flags;
+	/**< Queue pair config flags, See RTE_REGEX_QUEUE_PAIR_CFG_* */
+	uint16_t nb_desc;
+	/**< The number of descriptors to allocate for this queue pair. */
+	regexdev_stop_flush_t cb;
+	/**< Callback function called during rte_regex_dev_stop(), invoked
+	 * once per flushed regex op. Value NULL is allowed, in which case
+	 * callback will not be invoked. This function can be used to properly
+	 * dispose of outstanding regex ops from response queue,
+	 * for example ops containing memory pointers.
+	 * @see rte_regex_dev_stop()
+	 */
+};
+
+/**
+ * Allocate and set up a RegEx queue pair for a RegEx device.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param queue_pair_id
+ *   The index of the RegEx queue pair to setup. The value must be in the range
+ *   [0, nb_queue_pairs - 1] previously supplied to rte_regex_dev_configure().
+ * @param qp_conf
+ *   The pointer to the configuration data to be used for the RegEx queue pair.
+ *   NULL value is allowed, in which case the default configuration is used.
+ *
+ * @return
+ *   - 0: Success, RegEx queue pair correctly set up.
+ *   - <0: RegEx queue configuration failed
+ */
+int
+rte_regex_queue_pair_setup(uint8_t dev_id, uint8_t queue_pair_id,
+			   const struct rte_regex_qp_conf *qp_conf);
+
+/**
+ * Start a RegEx device.
+ *
+ * The device start step is the last one and consists of setting the RegEx
+ * queues to start accepting the pattern matching scan requests.
+ *
+ * On success, all basic functions exported by the API (RegEx enqueue,
+ * RegEx dequeue and so on) can be invoked.
+ *
+ * @param dev_id
+ *   RegEx device identifier
+ * @return
+ *   - 0: Success, device started.
+ *   - <0: Device start failed.
+ */
+int
+rte_regex_dev_start(uint8_t dev_id);
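+
+/*
+ * Illustrative sketch only, not part of the API: device bring-up in the
+ * configure, queue pair setup, start order. It assumes a single queue pair
+ * with the default queue pair configuration and no prebuilt rule database.
+ * app_regex_dev_bringup() is a hypothetical application helper.
+ *
+ *	static int
+ *	app_regex_dev_bringup(uint8_t dev_id)
+ *	{
+ *		struct rte_regex_dev_info info;
+ *		struct rte_regex_dev_config cfg;
+ *		int ret;
+ *
+ *		ret = rte_regex_dev_info_get(dev_id, &info);
+ *		if (ret < 0)
+ *			return ret;
+ *
+ *		memset(&cfg, 0, sizeof(cfg));
+ *		cfg.nb_max_matches = info.max_matches;
+ *		cfg.nb_queue_pairs = 1;
+ *		cfg.nb_rules_per_group = 0; // 0 selects max_rules_per_group
+ *		cfg.nb_groups = info.max_groups;
+ *		cfg.rule_db = NULL; // rules are added after configure
+ *		cfg.rule_db_len = 0;
+ *
+ *		ret = rte_regex_dev_configure(dev_id, &cfg);
+ *		if (ret < 0)
+ *			return ret;
+ *		ret = rte_regex_queue_pair_setup(dev_id, 0, NULL);
+ *		if (ret < 0)
+ *			return ret;
+ *		return rte_regex_dev_start(dev_id);
+ *	}
+ */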
+
+/**
+ * Stop a RegEx device.
+ *
+ * The device can be restarted with a call to
+ * rte_regex_dev_start().
+ *
+ * This function causes all queued response regex ops to be drained from the
+ * response queue. While draining ops out of the device,
+ * struct rte_regex_qp_conf::cb will be invoked for each op.
+ *
+ * @param dev_id
+ *   RegEx device identifier.
+ *
+ * @see struct rte_regex_qp_conf::cb, rte_regex_queue_pair_setup()
+ */
+void
+rte_regex_dev_stop(uint8_t dev_id);
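+
+/*
+ * Illustrative sketch only, not part of the API: registering a flush callback
+ * at queue pair setup so that ops still in the response queue are returned to
+ * their mempool when the device is stopped. app_ops_pool, app_flush_cb() and
+ * the descriptor count of 512 are hypothetical application choices, the ops
+ * are assumed to come from an rte_mempool, and the device is assumed to be
+ * already configured with rte_regex_dev_configure().
+ *
+ *	static struct rte_mempool *app_ops_pool;
+ *
+ *	static void
+ *	app_flush_cb(uint8_t dev_id, uint16_t qp_id, struct rte_regex_ops *op)
+ *	{
+ *		(void)dev_id;
+ *		(void)qp_id;
+ *		rte_mempool_put(app_ops_pool, op);
+ *	}
+ *
+ *	struct rte_regex_qp_conf qp_conf = {
+ *		.qp_conf_flags = 0,
+ *		.nb_desc = 512,
+ *		.cb = app_flush_cb,
+ *	};
+ *
+ *	rte_regex_queue_pair_setup(dev_id, 0, &qp_conf);
+ *	rte_regex_dev_start(dev_id);
+ *	// enqueue and dequeue bursts happen here
+ *	rte_regex_dev_stop(dev_id); // app_flush_cb() runs once per flushed op
+ */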
+
+/**
+ * Close a RegEx device. The device cannot be restarted!
+ *
+ * @param dev_id
+ *   RegEx device identifier
+ *
+ * @return
+ *  - 0: Device closed successfully.
+ *  - <0: Failed to close the device.
+ */
+int
+rte_regex_dev_close(uint8_t dev_id);
+
+/* Device get/set attributes */
+
+/** Enumerates RegEx device attribute identifier */
+enum rte_regex_dev_attr_id {
+	RTE_REGEX_DEV_ATTR_SOCKET_ID,
+	/**< The NUMA socket id to which the device is connected or
+	 * a default of zero if the socket could not be determined.
+	 * datatype: *int*
+	 * operation: *get*
+	 */
+	RTE_REGEX_DEV_ATTR_MAX_MATCHES,
+	/**< Maximum number of matches per scan.
+	 * datatype: *uint8_t*
+	 * operation: *get* and *set*
+	 *
+	 * @see RTE_REGEX_OPS_RSP_MAX_MATCH_F
+	 */
+	RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT,
+	/**< Upper bound scan time in ns.
+	 * datatype: *uint16_t*
+	 * operation: *get* and *set*
+	 *
+	 * @see RTE_REGEX_OPS_RSP_MAX_SCAN_TIMEOUT_F
+	 */
+	RTE_REGEX_DEV_ATTR_MAX_PREFIX,
+	/**< Maximum number of prefixes detected per scan.
+	 * This would be useful for denial of service detection.
+	 * datatype: *uint16_t*
+	 * operation: *get* and *set*
+	 *
+	 * @see RTE_REGEX_OPS_RSP_MAX_PREFIX_F
+	 */
+};
+
+/**
+ * Get an attribute from a RegEx device.
+ *
+ * @param dev_id RegEx device identifier
+ * @param attr_id The attribute ID to retrieve
+ * @param[out] attr_value A pointer that will be filled in with the attribute
+ *             value if successful.
+ *
+ * @return
+ *   - 0: Successfully retrieved attribute value.
+ *   - -EINVAL: Invalid device or *attr_id* provided, or *attr_value* is NULL.
+ *   - -ENOTSUP: if the device doesn't support specific *attr_id*.
+ */
+int
+rte_regex_dev_attr_get(uint8_t dev_id, enum rte_regex_dev_attr_id attr_id,
+		       void *attr_value);
+
+/**
+ * Set an attribute of a RegEx device.
+ *
+ * @param dev_id RegEx device identifier
+ * @param attr_id The attribute ID to set
+ * @param attr_value A pointer to the attribute value supplied by the
+ *                   application
+ *
+ * @return
+ *   - 0: Successfully applied the attribute value.
+ *   - -EINVAL: Invalid device or *attr_id* provided, or *attr_value* is NULL.
+ *   - -ENOTSUP: if the device doesn't support specific *attr_id*.
+ */
+int
+rte_regex_dev_attr_set(uint8_t dev_id, enum rte_regex_dev_attr_id attr_id,
+		       const void *attr_value);
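+
+/*
+ * Illustrative sketch only, not part of the API: capping the number of matches
+ * per scan and reading the value back. The value 4 is an arbitrary example.
+ * The pointed-to type must match the datatype documented for the attribute
+ * (uint8_t for RTE_REGEX_DEV_ATTR_MAX_MATCHES).
+ *
+ *	uint8_t max_matches = 4;
+ *	int ret;
+ *
+ *	ret = rte_regex_dev_attr_set(dev_id, RTE_REGEX_DEV_ATTR_MAX_MATCHES,
+ *				     &max_matches);
+ *	if (ret == 0)
+ *		ret = rte_regex_dev_attr_get(dev_id,
+ *					     RTE_REGEX_DEV_ATTR_MAX_MATCHES,
+ *					     &max_matches);
+ */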
+
+/* Rule related APIs */
+/** Enumerates RegEx rule operation */
+enum rte_regex_rule_op {
+	RTE_REGEX_RULE_OP_ADD,
+	/**< Add RegEx rule to rule database */
+	RTE_REGEX_RULE_OP_REMOVE
+	/**< Remove RegEx rule from rule database */
+};
+
+/** Structure to hold a RegEx rule attributes */
+struct rte_regex_rule {
+	enum rte_regex_rule_op op;
+	/**< OP type of the rule, either RTE_REGEX_RULE_OP_ADD or
+	 * RTE_REGEX_RULE_OP_REMOVE.
+	 */
+	uint16_t group_id;
+	/**< Group identifier to which the rule belongs. */
+	uint32_t rule_id;
+	/**< Rule identifier which is returned on successful match. */
+	const char *pcre_rule;
+	/**< Buffer to hold the PCRE rule. */
+	uint16_t pcre_rule_len;
+	/**< Length of the PCRE rule */
+	uint64_t rule_flags;
+	/**< PCRE rule flags. Supported device specific PCRE rule flags are
+	 * enumerated in struct rte_regex_dev_info::rule_flags. For a
+	 * successful rule database update, the application needs to provide
+	 * only supported rule flags.
+	 * @see RTE_REGEX_PCRE_RULE_*, struct rte_regex_dev_info::rule_flags
+	 */
+};
+
+/**
+ * Update the rule database of a RegEx device.
+ *
+ * @param dev_id RegEx device identifier
+ * @param rules
+ *   Points to an array of *nb_rules* objects of type *rte_regex_rule* structure
+ *   which contain the regex rules attributes to be updated in rule database.
+ * @param nb_rules
+ *   The number of PCRE rules to update the rule database.
+ *
+ * @return
+ *   The number of regex rules actually updated on the regex device's rule
+ *   database. The return value can be less than the value of the *nb_rules*
+ *   parameter when the regex device fails to update the rule database or
+ *   if invalid parameters are specified in a *rte_regex_rule*.
+ *   If the return value is less than *nb_rules*, the remaining PCRE rules
+ *   at the end of *rules* are not consumed and the caller has to take
+ *   care of them and rte_errno is set accordingly.
+ *   Possible errno values include:
+ *   - -EINVAL:  Invalid device ID or rules is NULL
+ *   - -ENOTSUP: The last processed rule is not supported on this device.
+ *   - -ENOSPC: No space available in rule database.
+ *
+ * @see rte_regex_rule_db_import(), rte_regex_rule_db_export()
+ */
+uint16_t
+rte_regex_rule_db_update(uint8_t dev_id, const struct rte_regex_rule *rules,
+			 uint16_t nb_rules);
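+
+/*
+ * Illustrative sketch only, not part of the API: adding a single rule at
+ * runtime. The pattern, group ID, rule ID and helper name are hypothetical.
+ * It is assumed that the device advertises
+ * RTE_REGEX_DEV_CAPA_RUNTIME_COMPILATION_F and that pcre_rule_len excludes
+ * the terminating NUL; the RFC does not state either explicitly.
+ *
+ *	static int
+ *	app_add_rule(uint8_t dev_id)
+ *	{
+ *		static const char pattern[] = "foo[0-9]+";
+ *		struct rte_regex_rule rule = {
+ *			.op = RTE_REGEX_RULE_OP_ADD,
+ *			.group_id = 0,
+ *			.rule_id = 1,
+ *			.pcre_rule = pattern,
+ *			.pcre_rule_len = sizeof(pattern) - 1,
+ *			.rule_flags = 0,
+ *		};
+ *
+ *		// Fewer rules applied than requested: rte_errno has the reason.
+ *		if (rte_regex_rule_db_update(dev_id, &rule, 1) != 1)
+ *			return -1;
+ *		return 0;
+ *	}
+ */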
+
+/**
+ * Import a prebuilt rule database from a buffer to a RegEx device.
+ *
+ * @param dev_id RegEx device identifier
+ * @param rule_db
+ *   Points to prebuilt rule database.
+ * @param rule_db_len
+ *   Length of the rule database.
+ *
+ * @return
+ *   - 0: Successfully updated the prebuilt rule database.
+ *   - -EINVAL:  Invalid device ID or rule_db is NULL
+ *   - -ENOTSUP: Rule database import is not supported on this device.
+ *   - -ENOSPC: No space available in rule database.
+ *
+ * @see rte_regex_rule_db_update(), rte_regex_rule_db_export()
+ */
+int
+rte_regex_rule_db_import(uint8_t dev_id, const char *rule_db,
+			 uint32_t rule_db_len);
+
+/**
+ * Export the prebuilt rule database from a RegEx device to the buffer.
+ *
+ * @param dev_id RegEx device identifier
+ * @param[out] rule_db
+ *   Block of memory to insert the rule database. Must be at least as large
+ *   as the required capacity. If set to NULL, the function returns the
+ *   required capacity.
+ *
+ * @return
+ *   - 0: Successfully exported the prebuilt rule database.
+ *   - size: If *rule_db* is set to NULL, the required capacity for *rule_db*
+ *   - -EINVAL:  Invalid device ID
+ *   - -ENOTSUP: Rule database export is not supported on this device.
+ *
+ * @see rte_regex_rule_db_update(), rte_regex_rule_db_import()
+ */
+int
+rte_regex_rule_db_export(uint8_t dev_id, char *rule_db);
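+
+/*
+ * Illustrative sketch only, not part of the API: the two-call export pattern
+ * described above, first querying the required capacity with a NULL buffer
+ * and then exporting into a buffer of that size. app_export_rule_db() is a
+ * hypothetical helper and plain malloc() is an arbitrary allocator choice.
+ *
+ *	static char *
+ *	app_export_rule_db(uint8_t dev_id, int *len)
+ *	{
+ *		char *buf;
+ *		int size;
+ *
+ *		size = rte_regex_rule_db_export(dev_id, NULL);
+ *		if (size <= 0)
+ *			return NULL;
+ *		buf = malloc(size);
+ *		if (buf == NULL)
+ *			return NULL;
+ *		if (rte_regex_rule_db_export(dev_id, buf) != 0) {
+ *			free(buf);
+ *			return NULL;
+ *		}
+ *		*len = size;
+ *		return buf;
+ *	}
+ */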
+
+/* Extended statistics */
+/** Maximum name length for extended statistics counters */
+#define RTE_REGEX_DEV_XSTATS_NAME_SIZE 64
+
+/**
+ * A name-key lookup element for extended statistics.
+ *
+ * This structure is used to map between names and ID numbers
+ * for extended RegEx device statistics.
+ */
+struct rte_regex_dev_xstats_map {
+	uint16_t id;
+	/**< xstat identifier */
+	char name[RTE_REGEX_DEV_XSTATS_NAME_SIZE];
+	/**< xstat name */
+};
+
+/**
+ * Retrieve names of extended statistics of a regex device.
+ *
+ * @param dev_id
+ *   The identifier of the regex device.
+ * @param[out] xstats_map
+ *   Block of memory to insert id and names into. Must be at least as large
+ *   as the required capacity. If set to NULL, the function returns the
+ *   required capacity.
+ * @return
+ *   - positive value on success:
+ *        - The return value is the number of entries filled in the stats map.
+ *        - If xstats_map is set to NULL, the required capacity for xstats_map.
+ *   - negative value on error:
+ *      -ENODEV for invalid *dev_id*
+ *      -ENOTSUP if the device doesn't support this function.
+ */
+int
+rte_regex_dev_xstats_names_get(uint8_t dev_id,
+			       struct rte_regex_dev_xstats_map *xstats_map);
+
+/**
+ * Retrieve extended statistics of a regex device.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param ids
+ *   The id numbers of the stats to get. The ids can be obtained from the stat
+ *   position in the stat list from rte_regex_dev_xstats_names_get(), or
+ *   by using rte_regex_dev_xstats_by_name_get().
+ * @param[out] values
+ *   The values for each stat requested by ID.
+ * @param n
+ *   The number of stats requested
+ * @return
+ *   - positive value: number of stat entries filled into the values array
+ *   - negative value on error:
+ *      -ENODEV for invalid *dev_id*
+ *      -ENOTSUP if the device doesn't support this function.
+ */
+int
+rte_regex_dev_xstats_get(uint8_t dev_id, const uint16_t ids[],
+			 uint64_t values[], uint16_t n);
+
+/**
+ * Retrieve the value of a single stat by requesting it by name.
+ *
+ * @param dev_id
+ *   The identifier of the device
+ * @param name
+ *   The stat name to retrieve
+ * @param[out] id
+ *   If non-NULL, the numerical id of the stat will be returned, so that further
+ *   requests for the stat can be made using rte_regex_dev_xstats_get(), which
+ *   will be faster as it doesn't need to scan a list of names for the stat.
+ * @param[out] value
+ *   Must be non-NULL, retrieved xstat value will be stored in this address.
+ *
+ * @return
+ *   - 0: Successfully retrieved xstat value.
+ *   - -EINVAL: invalid parameters
+ *   - -ENOTSUP: if not supported.
+ */
+int
+rte_regex_dev_xstats_by_name_get(uint8_t dev_id, const char *name,
+				 uint16_t *id, uint64_t *value);
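+
+/*
+ * Illustrative sketch only, not part of the API: resolving a counter by name
+ * once and polling it by id afterwards. The counter name "scan_count" is
+ * hypothetical; real names are driver specific and can be listed with
+ * rte_regex_dev_xstats_names_get().
+ *
+ *	uint16_t id;
+ *	uint64_t value;
+ *
+ *	if (rte_regex_dev_xstats_by_name_get(dev_id, "scan_count",
+ *					     &id, &value) == 0) {
+ *		// Later polls skip the name lookup.
+ *		rte_regex_dev_xstats_get(dev_id, &id, &value, 1);
+ *	}
+ */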
+
+/**
+ * Reset the values of the xstats of the selected component in the device.
+ *
+ * @param dev_id
+ *   The identifier of the device
+ * @param ids
+ *   Selects specific statistics to be reset. When NULL, all statistics will be
+ *   reset. If non-NULL, must point to array of at least *nb_ids* size.
+ * @param nb_ids
+ *   The number of ids available from the *ids* array. Ignored when ids is NULL.
+ * @return
+ *   - 0: Successfully reset the statistics to zero.
+ *   - -EINVAL: invalid parameters
+ *   - -ENOTSUP: if not supported.
+ */
+int
+rte_regex_dev_xstats_reset(uint8_t dev_id, const uint16_t ids[],
+			   uint16_t nb_ids);
+
+/**
+ * Trigger the RegEx device self test.
+ *
+ * @param dev_id
+ *   The identifier of the device
+ * @return
+ *   - 0: Selftest successful
+ *   - -ENOTSUP if the device doesn't support selftest
+ *   - other values < 0 on failure.
+ */
+int rte_regex_dev_selftest(uint8_t dev_id);
+
+/**
+ * Dump internal information about *dev_id* to the FILE* provided in *f*.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ *
+ * @param f
+ *   A pointer to a file for output
+ *
+ * @return
+ *   - 0: on success
+ *   - <0: on failure.
+ */
+int
+rte_regex_dev_dump(uint8_t dev_id, FILE *f);
+
+/* Fast path APIs */
+
+/**
+ * The generic *rte_regex_match* structure to hold the RegEx match attributes.
+ * @see struct rte_regex_ops::matches
+ */
+struct rte_regex_match {
+	RTE_STD_C11
+	union {
+		uint64_t u64;
+		struct {
+			uint32_t rule_id:20;
+			/**< Rule identifier to which the pattern matched.
+			 * @see struct rte_regex_rule::rule_id
+			 */
+			uint32_t group_id:12;
+			/**< Group identifier of the rule which the pattern
+			 * matched. @see struct rte_regex_rule::group_id
+			 */
+			uint16_t offset;
+			/**< Starting Byte Position for matched rule. */
+			uint16_t len;
+			/**< Length of match in bytes */
+		};
+	};
+};
+
+/* Enumerates RegEx request flags. */
+#define RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F (1 << 0)
+/**< Set when struct rte_regex_ops::group_id1 is valid. */
+
+#define RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F (1 << 1)
+/**< Set when struct rte_regex_ops::group_id2 is valid. */
+
+#define RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F (1 << 2)
+/**< Set when struct rte_regex_ops::group_id3 is valid. */
+
+#define RTE_REGEX_OPS_REQ_STOP_ON_MATCH_F (1 << 4)
+/**< The RegEx engine will stop scanning and return the first match. */
+
+#define RTE_REGEX_OPS_REQ_MATCH_HIGH_PRIORITY_F (1 << 5)
+/**< In High Priority mode a maximum of one match will be returned per scan to
+ * reduce the post-processing required by the application. The match with the
+ * lowest Rule id, lowest start pointer and lowest match length will be
+ * returned.
+ *
+ * @see struct rte_regex_ops::nb_actual_matches
+ * @see struct rte_regex_ops::nb_matches
+ */
+
+
+/* Enumerates RegEx response flags. */
+#define RTE_REGEX_OPS_RSP_PMI_SOJ_F (1 << 0)
+/**< Indicates that the RegEx device has encountered a partial match at the
+ * start of scan in the given buffer.
+ *
+ * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
+ */
+
+#define RTE_REGEX_OPS_RSP_PMI_EOJ_F (1 << 1)
+/**< Indicates that the RegEx device has encountered a partial match at the
+ * end of scan in the given buffer.
+ *
+ * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
+ */
+
+#define RTE_REGEX_OPS_RSP_MAX_SCAN_TIMEOUT_F (1 << 2)
+/**< Indicates that the RegEx device has exceeded the max timeout while
+ * scanning the given buffer.
+ *
+ * @see RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT
+ */
+
+#define RTE_REGEX_OPS_RSP_MAX_MATCH_F (1 << 3)
+/**< Indicates that the RegEx device has exceeded the max matches while
+ * scanning the given buffer.
+ *
+ * @see RTE_REGEX_DEV_ATTR_MAX_MATCHES
+ */
+
+#define RTE_REGEX_OPS_RSP_MAX_PREFIX_F (1 << 4)
+/**< Indicates that the RegEx device has reached the max allowed prefix length
+ * while scanning the given buffer.
+ *
+ * @see RTE_REGEX_DEV_ATTR_MAX_PREFIX
+ */
+
+/**
+ * The generic *rte_regex_ops* structure to hold the RegEx attributes
+ * for enqueue and dequeue operation.
+ */
+struct rte_regex_ops {
+	/* W0 */
+	uint16_t req_flags;
+	/**< Request flags for the RegEx ops.
+	 * @see RTE_REGEX_OPS_REQ_*
+	 */
+	uint16_t scan_size;
+	/**< Scan size of the buffer to be scanned in bytes. */
+	uint16_t rsp_flags;
+	/**< Response flags for the RegEx ops.
+	 * @see RTE_REGEX_OPS_RSP_*
+	 */
+	uint8_t nb_actual_matches;
+	/**< The total number of actual matches detected by the Regex device.*/
+	uint8_t nb_matches;
+	/**< The total number of matches returned by the RegEx device for this
+	 * scan. The size of *rte_regex_ops::matches* zero length array will be
+	 * this value.
+	 *
+	 * @see struct rte_regex_ops::matches, struct rte_regex_match
+	 */
+
+	/* W1 */
+	RTE_STD_C11
+	union {
+		uint64_t u64;
+		/**< Reserves 8 bytes on 32-bit systems. */
+		void *buf_addr;
+		/**< Virtual address of the pattern to be matched. */
+	};
+
+	/* W2 */
+	rte_iova_t buf_iova;
+	/**< IOVA address of the pattern to be matched. */
+
+	/* W3 */
+	uint16_t group_id0;
+	/**< First group_id to match the rule against. The application must
+	 * provide at least one group id.
+	 * When RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F is set, group_id1 is
+	 * valid; similar flags apply to group_id2 and group_id3.
+	 * Upon a match, the device updates struct rte_regex_match::group_id
+	 * with the matching group ID. The group ID scheme provides
+	 * rule isolation and effective pattern matching.
+	 */
+	uint16_t group_id1;
+	/**< Second group_id to match the rule against.
+	 *
+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F
+	 */
+	uint16_t group_id2;
+	/**< Third group_id to match the rule against.
+	 *
+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F
+	 */
+	uint16_t group_id3;
+	/**< Fourth group_id to match the rule against.
+	 *
+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F
+	 */
+
+	/* W4 */
+	RTE_STD_C11
+	union {
+		uint64_t user_id;
+		/**< Application specific opaque value. An application may use
+		 * this field to hold application specific value to share
+		 * between dequeue and enqueue operation.
+		 * Implementation should not modify this field.
+		 */
+		void *user_ptr;
+		/**< Pointer representation of *user_id* */
+	};
+
+	/* W5 */
+	struct rte_regex_match matches[];
+	/**< Zero length array to hold the match tuples.
+	 * The struct rte_regex_ops::nb_matches value holds the number of
+	 * elements in this array.
+	 *
+	 * @see struct rte_regex_ops::nb_matches
+	 */
+};
+
+/**
+ * Enqueue a burst of scan requests on a RegEx device.
+ *
+ * The rte_regex_enqueue_burst() function is invoked to place
+ * regex operations on the queue *qp_id* of the device designated by
+ * its *dev_id*.
+ *
+ * The *nb_ops* parameter is the number of operations to process which are
+ * supplied in the *ops* array of *rte_regex_ops* structures.
+ *
+ * The rte_regex_enqueue_burst() function returns the number of
+ * operations it actually enqueued for processing. A return value equal to
+ * *nb_ops* means that all operations have been enqueued.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param qp_id
+ *   The index of the queue pair on which operations are to be enqueued for
+ *   processing. The value must be in the range [0, nb_queue_pairs - 1]
+ *   previously supplied to rte_regex_dev_configure().
+ * @param ops
+ *   The address of an array of *nb_ops* pointers to *rte_regex_ops*
+ *   structures which contain the regex operations to be processed.
+ * @param nb_ops
+ *   The number of operations to process.
+ *
+ * @return
+ *   The number of operations actually enqueued on the regex device. The return
+ *   value can be less than the value of the *nb_ops* parameter when the
+ *   regex device's queue is full or if invalid parameters are specified in
+ *   a *rte_regex_ops*. If the return value is less than *nb_ops*, the
+ *   remaining ops at the end of *ops* are not consumed and the caller has to
+ *   take care of them.
+ */
+uint16_t
+rte_regex_enqueue_burst(uint8_t dev_id, uint16_t qp_id,
+			struct rte_regex_ops **ops, uint16_t nb_ops);
+
+/**
+ *
+ * Dequeue a burst of scan responses from a queue on the RegEx device.
+ * The dequeued operations are stored in *rte_regex_ops* structures
+ * whose pointers are supplied in the *ops* array.
+ *
+ * The rte_regex_dequeue_burst() function returns the number of ops
+ * actually dequeued, which is the number of *rte_regex_ops* data structures
+ * effectively supplied into the *ops* array.
+ *
+ * A return value equal to *nb_ops* indicates that the queue contained
+ * at least *nb_ops* operations, and this is likely to signify that other
+ * processed operations remain in the device's output queue. Applications
+ * implementing a "retrieve as many processed operations as possible" policy
+ * can check this specific case and keep invoking the
+ * rte_regex_dequeue_burst() function until a value less than
+ * *nb_ops* is returned.
+ *
+ * The rte_regex_dequeue_burst() function does not provide any error
+ * notification to avoid the corresponding overhead.
+ *
+ * @param dev_id
+ *   The RegEx device identifier
+ * @param qp_id
+ *   The index of the queue pair from which to retrieve processed operations.
+ *   The value must be in the range [0, nb_queue_pairs - 1] previously
+ *   supplied to rte_regex_dev_configure().
+ * @param ops
+ *   The address of an array of pointers to *rte_regex_ops* structures that
+ *   must be large enough to store *nb_ops* pointers in it.
+ * @param nb_ops
+ *   The maximum number of operations to dequeue.
+ *
+ * @return
+ *   The number of operations actually dequeued, which is the number
+ *   of pointers to *rte_regex_ops* structures effectively supplied to the
+ *   *ops* array. If the return value is less than *nb_ops*, only that many
+ *   leading entries of *ops* are valid; the remaining entries must not be
+ *   used by the caller.
+ */
+uint16_t
+rte_regex_dequeue_burst(uint8_t dev_id, uint16_t qp_id,
+			struct rte_regex_ops **ops, uint16_t nb_ops);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_REGEXDEV_H_ */
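
The partial-burst contract documented above for rte_regex_enqueue_burst() and rte_regex_dequeue_burst() can be sketched as follows. This is only an illustration and not part of the patch: mock_enqueue_burst() and mock_dequeue_burst() are hypothetical stand-ins for the two fast-path calls, backed by a four-entry FIFO instead of a device, and struct mock_op is a placeholder for struct rte_regex_ops. drain_queue() is a hypothetical helper showing the "retrieve as many processed operations as possible" policy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical placeholder for struct rte_regex_ops. */
struct mock_op {
	uint64_t user_id;
};

/* Hypothetical queue pair: a small FIFO of op pointers, not a real device. */
#define MOCK_QP_DEPTH 4
struct mock_qp {
	struct mock_op *ring[MOCK_QP_DEPTH];
	uint16_t count;
};

/* Stand-in for rte_regex_enqueue_burst(): accepts ops until the FIFO is
 * full and returns how many were accepted. */
static uint16_t
mock_enqueue_burst(struct mock_qp *qp, struct mock_op **ops, uint16_t nb_ops)
{
	uint16_t sent = 0;

	assert(qp != NULL && ops != NULL);
	while (sent < nb_ops && qp->count < MOCK_QP_DEPTH)
		qp->ring[qp->count++] = ops[sent++];
	return sent;
}

/* Stand-in for rte_regex_dequeue_burst(): hands back up to nb_ops ops in
 * FIFO order and returns how many were written to ops[]. */
static uint16_t
mock_dequeue_burst(struct mock_qp *qp, struct mock_op **ops, uint16_t nb_ops)
{
	uint16_t n, i;

	assert(qp != NULL && ops != NULL);
	n = qp->count < nb_ops ? qp->count : nb_ops;
	for (i = 0; i < n; i++)
		ops[i] = qp->ring[i];
	for (i = n; i < qp->count; i++)
		qp->ring[i - n] = qp->ring[i];
	qp->count = (uint16_t)(qp->count - n);
	return n;
}

/* Keep dequeuing while full bursts come back; a short burst means the
 * queue is empty for now. Returns the total number of ops stored in out[]. */
static uint16_t
drain_queue(struct mock_qp *qp, struct mock_op **out, uint16_t out_cap,
	    uint16_t burst)
{
	uint16_t total = 0;

	while (total < out_cap) {
		uint16_t room = (uint16_t)(out_cap - total);
		uint16_t want = room < burst ? room : burst;
		uint16_t n = mock_dequeue_burst(qp, &out[total], want);

		total = (uint16_t)(total + n);
		if (n < want)
			break;
	}
	return total;
}
```

When the enqueue stand-in returns sent < nb_ops, the unconsumed ops start at &ops[sent]; the caller may retry them after draining responses or release them.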
Jerin Jacob Kollanukkaran Aug. 19, 2019, 3:09 a.m. UTC | #4
Reply to Xiang's queries in main thread:

Hi all,

Some questions regarding APIs. Could you please give more insights?

1) rte_regex_ops
      a) rsp_flags
      These two flags RTE_REGEX_OPS_RSP_PMI_SOJ_F and RTE_REGEX_OPS_RSP_PMI_EOJ_F are used for cross buffer scan.
      RTE_REGEX_OPS_RSP_PMI_EOJ_F tells whether we have a partial match at the end of current buffer after scan.
      What's the purpose of having RTE_REGEX_OPS_RSP_PMI_SOJ_F?

[Jerin] We need three states to represent a partial match across buffers: RTE_REGEX_OPS_RSP_PMI_SOJ_F represents
the start of the buffer, intermediate buffers carry no flag, and RTE_REGEX_OPS_RSP_PMI_EOJ_F represents the end of the buffer.
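
As an illustration only (not part of the RFC), an application could decode the two partial-match bits of rsp_flags as sketched below. The flag values are copied from the header in this patch; enum partial_state and classify_partial() are hypothetical names, and how an application should react to each state is not specified in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Flag values as defined in the RFC header of this patch. */
#define RTE_REGEX_OPS_RSP_PMI_SOJ_F (1 << 0)
#define RTE_REGEX_OPS_RSP_PMI_EOJ_F (1 << 1)

/* Hypothetical application-side view of one buffer's partial-match state. */
enum partial_state {
	PARTIAL_NONE,     /* neither flag set */
	PARTIAL_AT_START, /* only SOJ set */
	PARTIAL_AT_END,   /* only EOJ set */
	PARTIAL_BOTH      /* both flags set */
};

/* Map the response flags of one op to a partial-match state. */
static enum partial_state
classify_partial(uint16_t rsp_flags)
{
	int soj = (rsp_flags & RTE_REGEX_OPS_RSP_PMI_SOJ_F) != 0;
	int eoj = (rsp_flags & RTE_REGEX_OPS_RSP_PMI_EOJ_F) != 0;

	if (soj && eoj)
		return PARTIAL_BOTH;
	if (soj)
		return PARTIAL_AT_START;
	if (eoj)
		return PARTIAL_AT_END;
	return PARTIAL_NONE;
}

/* Usage example: unrelated response bits (here bit 2) are ignored. */
static void
classify_partial_example(void)
{
	assert(classify_partial(0) == PARTIAL_NONE);
	assert(classify_partial(RTE_REGEX_OPS_RSP_PMI_EOJ_F | (1 << 2)) ==
	       PARTIAL_AT_END);
}
```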

      RTE_REGEX_OPS_RSP_MAX_PREFIX_F: This looks like a definition for a specific hardware implementation. I am wondering what this PREFIX refers to:)?

[Jerin] Yes, it looks like a hardware-specific implementation detail. We introduced the rte_regex_dev_attr_set/get functions to keep it portable and
to allow adding new implementation-specific fields.
For example, if a rule is
/ABCDEF.*XYZ/, ABCD is considered the prefix, and EF.*XYZ is considered the factor. The prefix is a literal
string, while the factor can contain complex regular expression constructs. As a result, rule matching occurs in
two stages: prefix matching and factor matching.
 
      b)  user_id or user_ptr
      Under what kind of circumstances should an application pass a value into these variables for enqueue and dequeue operations?

[Jerin] Just like rte_crypto_ops, struct rte_regex_ops is normally allocated from a mempool. On enqueue, the user can set user_id
to identify the op on dequeue if required. Use cases include storing a sequence number from the application's
point of view, or storing the pointer of the mbuf on which the pattern match is requested.
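
A minimal sketch of that use case, under stated assumptions: struct mock_regex_ops mirrors only the W4 union of struct rte_regex_ops from this patch, and struct app_pkt is a hypothetical application descriptor standing in for an mbuf. None of the helper names below exist in the RFC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in carrying only the W4 union of struct rte_regex_ops. */
struct mock_regex_ops {
	union {
		uint64_t user_id;
		void *user_ptr;
	};
};

/* Hypothetical application packet descriptor (an mbuf in a real app). */
struct app_pkt {
	uint32_t seq;
};

/* Before enqueue: remember which packet this op belongs to. */
static void
tag_op_with_pkt(struct mock_regex_ops *op, struct app_pkt *pkt)
{
	op->user_id = 0;    /* clear all 8 bytes; a 32-bit pointer fills only 4 */
	op->user_ptr = pkt;
}

/* After dequeue: recover the packet from the untouched opaque field. */
static struct app_pkt *
pkt_from_op(const struct mock_regex_ops *op)
{
	return op->user_ptr;
}

/* Usage example: the tag survives because the device must not modify it. */
static void
user_ptr_example(void)
{
	struct app_pkt pkt = { 42 };
	struct mock_regex_ops op;

	tag_op_with_pkt(&op, &pkt);
	assert(pkt_from_op(&op) == &pkt);
	assert(pkt_from_op(&op)->seq == 42);
}
```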
 

 2) rte_regex_match
      a) offset; /**< Starting Byte Position for matched rule. */ and  uint16_t len; /**< Length of match in bytes */
      Looks like the matching offset is defined as *starting matching offset* instead of *end matching offset*, e.g. report the offset of "a" instead of "c" for pattern "abc". 
      If so, this makes it hard to integrate software regex libraries such as Hyperscan and RE2 as they only report *end matching offset* without length of match. 
      Although Hyperscan has API for *starting matching offset*, it only delivers partial syntax support. So I think we have to define *end of matching offset* for software solutions.

[Jerin] I understand the tradeoffs of Hyperscan's HS_FLAG_SOM_LEFTMOST. I thought an application would always need the length of the match.
We will see how other HW implementations (from Mellanox etc.) handle this and try to abstract it; probably we can make it a function of what the user requests.
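
To make the two conventions concrete, here is a small sketch. struct match_span is a hypothetical mirror of the offset/len pair in struct rte_regex_match, and the end offset is taken as one past the last matched byte, which is an assumption made for this example. Going from (offset, len) to the end offset is always possible; the reverse requires the match length to be known some other way, for instance for a literal pattern.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of the offset/len pair in struct rte_regex_match. */
struct match_span {
	uint16_t offset; /* starting byte position of the match */
	uint16_t len;    /* length of the match in bytes */
};

/* Start-based report to end offset (one past the last matched byte). */
static uint32_t
span_end(const struct match_span *m)
{
	return (uint32_t)m->offset + m->len;
}

/* End-only report back to a span; works only when the match length is
 * known by other means. Returns 0 on success, -1 if the inputs cannot
 * describe a valid span. */
static int
span_from_end(uint32_t end, uint16_t known_len, struct match_span *out)
{
	if (known_len > end || end - known_len > UINT16_MAX)
		return -1;
	out->offset = (uint16_t)(end - known_len);
	out->len = known_len;
	return 0;
}

/* Usage example: pattern "abc" found in the buffer "xxabcxx". */
static void
span_example(void)
{
	struct match_span m = { 2, 3 };
	struct match_span back;

	assert(span_end(&m) == 5);
	assert(span_from_end(5, 3, &back) == 0);
	assert(back.offset == 2 && back.len == 3);
}
```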

3)  rte_regex_rule_db_update()
    Does this mean we can dynamically add or delete rules for an already generated database without recompile from scratch for hardware Regex implementation? 
    If so, this isn't possible for software solutions as they don't support dynamic database update and require recompile. 

[Jerin] rte_regex_rule_db_update() would internally call the recompile function for both HW and SW.
See rte_regex_dev_config::rule_db in rte_regex_dev_configure() for the precompiled rule database case.

4) rte_regex_rule_db_import() and rte_regex_rule_db_export()
     What's the expected behavior for import and export operations? Will we create another copy of database when calling them? 

[Jerin] Whether a copy is required is implementation defined. Marvell's HW implementation has a centralized rule database
per device.

Thanks,
Xiang

> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Thursday, August 15, 2019 5:04 PM
> To: dev@dpdk.org
> Cc: Jerin Jacob Kollanukkaran <jerinj@marvell.com>; Pavan Nikhilesh
> Bhagavatula <pbhagavatula@marvell.com>; Shahaf Shuler
> <shahafs@mellanox.com>; Hemant Agrawal <hemant.agrawal@nxp.com>;
> Opher Reviv <opher@mellanox.com>; Alex Rosenbaum
> <alexr@mellanox.com>; Dovrat Zifroni <dovrat@marvell.com>; Prasun
> Kapoor <pkapoor@marvell.com>; Nipun Gupta <nipun.gupta@nxp.com>;
> Wang, Xiang W <xiang.w.wang@intel.com>; Richardson, Bruce
> <bruce.richardson@intel.com>; yang.a.hong@intel.com;
> harry.chang@intel.com; gu.jian1@zte.com.cn; shanjiangh@chinatelecom.cn;
> zhangy.yun@chinatelecom.cn; lixingfu@huachentel.com;
> wushuai@inspur.com; yuyingxia@yxlink.com;
> fanchenggang@sunyainfo.com; davidfgao@tencent.com;
> liuzhong1@chinaunicom.cn; zhaoyong11@huawei.com; oc@yunify.com;
> jim@netgate.com; hongjun.ni@intel.com; j.bromhead@titan-ic.com;
> deri@ntop.org; fc@napatech.com; arthur.su@lionic.com
> Subject: Re: [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev
> subsystem
> 
> +Cc more
> 
> ------------
> 
> From: Jerin Jacob <jerinj@marvell.com>
> 
> Even though there are some vendors which offer Regex HW offload, due to
> lack of standard API, It is diffcult for DPDK consumer to use them
> in a portable way.
> 
> This _RFC_ attempts to standardize the RegEx/DPI offload APIs for DPDK.
> 
> The Doxygen generated RFC API documentation available here:
> https://dreamy-noether-22777e.netlify.com/rte__regexdev_8h.html
> 
> This RFC crafted based on SW Regex API frameworks such as libpcre and
> hyperscan and a few of the RegEx HW IPs which I am aware of.
> 
> RegEx pattern matching applications:
> • Next Generation Firewalls (NGFW)
> • Deep Packet and Flow Inspection (DPI)
> • Intrusion Prevention Systems (IPS)
> • DDoS Mitigation
> • Network Monitoring
> • Data Loss Prevention (DLP)
> • Smart NICs
> • Grammar based content processing
> • URL, spam and adware filtering
> • Advanced auditing and policing of user/application security policies
> • Financial data mining - parsing of streamed financial feeds
> 
> Request to review from HW and SW RegEx vendors and RegEx application
> users
> to have portable DPDK API for RegEx.
> 
> The API schematics are based cryptodev, eventdev and ethdev existing
> device API.
> 
> Signed-off-by: Jerin Jacob <jerinj@marvell.com>
> Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
> ---
> 
> RTE RegEx Device API
> --------------------
> 
> Defines RTE RegEx Device APIs for RegEx operations and its provisioning.
> 
> The RegEx Device API is composed of two parts:
> 
> - The application-oriented RegEx API that includes functions to setup
> a RegEx device (configure it, setup its queue pairs and start it),
> update the rule database and so on.
> 
> - The driver-oriented RegEx API that exports a function allowing
> a RegEx poll Mode Driver (PMD) to simultaneously register itself as
> a RegEx device driver.
> 
> RegEx device components and definitions:
> 
>     +-----------------+
>     |                 |
>     |                 o---------+    rte_regex_[en|de]queue_burst()
>     |   PCRE based    o------+  |               |
>     |  RegEx pattern  |      |  |  +--------+   |
>     | matching engine o------+--+--o        |   |    +------+
>     |                 |      |  |  | queue  |<==o===>|Core 0|
>     |                 o----+ |  |  | pair 0 |        |      |
>     |                 |    | |  |  +--------+        +------+
>     +-----------------+    | |  |
>            ^               | |  |  +--------+
>            |               | |  |  |        |        +------+
>            |               | +--+--o queue  |<======>|Core 1|
>        Rule|Database       |    |  | pair 1 |        |      |
>     +------+----------+    |    |  +--------+        +------+
>     |     Group 0     |    |    |
>     | +-------------+ |    |    |  +--------+        +------+
>     | | Rules 0..n  | |    |    |  |        |        |Core 2|
>     | +-------------+ |    |    +--o queue  |<======>|      |
>     |     Group 1     |    |       | pair 2 |        +------+
>     | +-------------+ |    |       +--------+
>     | | Rules 0..n  | |    |
>     | +-------------+ |    |       +--------+
>     |     Group 2     |    |       |        |        +------+
>     | +-------------+ |    |       | queue  |<======>|Core n|
>     | | Rules 0..n  | |    +-------o pair n |        |      |
>     | +-------------+ |            +--------+        +------+
>     |     Group n     |
>     | +-------------+ |<-------rte_regex_rule_db_update()
>     | | Rules 0..n  | |<-------rte_regex_rule_db_import()
>     | +-------------+ |------->rte_regex_rule_db_export()
>     +-----------------+
> 
> RegEx: A regular expression is a concise and flexible means for matching
> strings of text, such as particular characters, words, or patterns of
> characters. A common abbreviation for this is “RegEx”.
> 
> RegEx device: A hardware or software-based implementation of RegEx
> device API for PCRE based pattern matching syntax and semantics.
> 
> PCRE RegEx syntax and semantics specification:
> http://regexkit.sourceforge.net/Documentation/pcre/pcrepattern.html
> 
> RegEx queue pair: Each RegEx device should have one or more queue pair to
> transmit a burst of pattern matching request and receive a burst of
> receive the pattern matching response. The pattern matching
> request/response
> embedded in *rte_regex_ops* structure.
> 
> Rule: A pattern matching rule expressed in PCRE RegEx syntax along with
> Match ID and Group ID to identify the rule upon the match.
> 
> Rule database: The RegEx device accepts regular expressions and converts
> them
> into a compiled rule database that can then be used to scan data.
> Compilation allows the device to analyze the given pattern(s) and
> pre-determine how to scan for these patterns in an optimized fashion that
> would be far too expensive to compute at run-time. A rule database contains
> a set of rules that compiled in device specific binary form.
> 
> Match ID or Rule ID: A unique identifier provided at the time of rule
> creation for the application to identify the rule upon match.
> 
> Group ID: Group of rules can be grouped under one group ID to enable
> rule isolation and effective pattern matching. A unique group identifier
> provided at the time of rule creation for the application to identify the
> rule upon match.
> 
> Scan: A pattern matching request through *enqueue* API.
> 
> It may possible that a given RegEx device may not support all the features
> of PCRE. The application may probe unsupported features through
> struct rte_regex_dev_info::pcre_unsup_flags
> 
> By default, all the functions of the RegEx Device API exported by a PMD
> are lock-free functions which assume to not be invoked in parallel on
> different logical cores to work on the same target object. For instance,
> the dequeue function of a PMD cannot be invoked in parallel on two logical
> cores to operates on same RegEx queue pair. Of course, this function
> can be invoked in parallel by different logical core on different queue pair.
> It is the responsibility of the upper level application to enforce this rule.
> 
> In all functions of the RegEx API, the RegEx device is
> designated by an integer >= 0 named the device identifier *dev_id*
> 
> At the RegEx driver level, RegEx devices are represented by a generic
> data structure of type *rte_regex_dev*.
> 
> RegEx devices are dynamically registered during the PCI/SoC device probing
> phase performed at EAL initialization time.
> When a RegEx device is being probed, a *rte_regex_dev* structure and
> a new device identifier are allocated for that device. Then, the
> regex_dev_init() function supplied by the RegEx driver matching the probed
> device is invoked to properly initialize the device.
> 
> The role of the device init function consists of resetting the hardware or
> software RegEx driver implementations.
> 
> If the device init operation is successful, the correspondence between
> the device identifier assigned to the new device and its associated
> *rte_regex_dev* structure is effectively registered.
> Otherwise, both the *rte_regex_dev* structure and the device identifier are
> freed.
> 
> The functions exported by the application RegEx API to setup a device
> designated by its device identifier must be invoked in the following order:
> - rte_regex_dev_configure()
> - rte_regex_queue_pair_setup()
> - rte_regex_dev_start()
> 
> Then, the application can invoke, in any order, the functions
> exported by the RegEx API to enqueue pattern matching job, dequeue
> pattern
> matching response, get the stats, update the rule database,
> get/set device attributes and so on
> 
> If the application wants to change the configuration (i.e. call
> rte_regex_dev_configure() or rte_regex_queue_pair_setup()), it must call
> rte_regex_dev_stop() first to stop the device and then do the
> reconfiguration
> before calling rte_regex_dev_start() again. The enqueue and dequeue
> functions should not be invoked when the device is stopped.
> 
> Finally, an application can close a RegEx device by invoking the
> rte_regex_dev_close() function.
> 
> Each function of the application RegEx API invokes a specific function
> of the PMD that controls the target device designated by its device
> identifier.
> 
> For this purpose, all device-specific functions of a RegEx driver are
> supplied through a set of pointers contained in a generic structure of type
> *regex_dev_ops*.
> The address of the *regex_dev_ops* structure is stored in the
> *rte_regex_dev*
> structure by the device init function of the RegEx driver, which is
> invoked during the PCI/SoC device probing phase, as explained earlier.
> 
> In other words, each function of the RegEx API simply retrieves the
> *rte_regex_dev* structure associated with the device identifier and
> performs an indirect invocation of the corresponding driver function
> supplied in the *regex_dev_ops* structure of the *rte_regex_dev*
> structure.
> 
> For performance reasons, the address of the fast-path functions of the
> RegEx driver is not contained in the *regex_dev_ops* structure.
> Instead, they are directly stored at the beginning of the *rte_regex_dev*
> structure to avoid an extra indirect memory access during their invocation.
> 
> RTE RegEx device drivers do not use interrupts for enqueue or dequeue
> operation. Instead, RegEx drivers export Poll-Mode enqueue and dequeue
> functions to applications.
> 
> The *enqueue* operation submits a burst of RegEx pattern matching
> request
> to the RegEx device and the *dequeue* operation gets a burst of pattern
> matching response for the ones submitted through *enqueue* operation.
> 
> Typical application utilisation of the RegEx device API will follow the
> following programming flow.
> 
> - rte_regex_dev_configure()
> - rte_regex_queue_pair_setup()
> - rte_regex_rule_db_update() Needs to invoke if precompiled rule database
> not
> provided in rte_regex_dev_config::rule_db for rte_regex_dev_configure()
> and/or application needs to update rule database.
> - Create or reuse exiting mempool for *rte_regex_ops* objects.
> - rte_regex_dev_start()
> - rte_regex_enqueue_burst()
> - rte_regex_dequeue_burst()
> 
> ---
> 
> config/common_base                 |    5 +
> doc/api/doxy-api-index.md          |    1 +
> doc/api/doxy-api.conf.in           |    1 +
> lib/Makefile                       |    2 +
> lib/librte_regexdev/Makefile       |   23 +
> lib/librte_regexdev/rte_regexdev.c |    5 +
> lib/librte_regexdev/rte_regexdev.h | 1247
> ++++++++++++++++++++++++++++
> 7 files changed, 1284 insertions(+)
> create mode 100644 lib/librte_regexdev/Makefile
> create mode 100644 lib/librte_regexdev/rte_regexdev.c
> create mode 100644 lib/librte_regexdev/rte_regexdev.h
> 
> diff --git a/config/common_base b/config/common_base
> index e406e7836..986093d6e 100644
> --- a/config/common_base
> +++ b/config/common_base
> @@ -746,6 +746,11 @@
> CONFIG_RTE_LIBRTE_PMD_DPAA2_QDMA_RAWDEV=n
> #
> CONFIG_RTE_LIBRTE_PMD_IFPGA_RAWDEV=y
> 
> +#
> +# Compile regex device support
> +#
> +CONFIG_RTE_LIBRTE_REGEXDEV=y
> +
> #
> # Compile librte_ring
> #
> diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
> index 715248dd1..a0bc27ae4 100644
> --- a/doc/api/doxy-api-index.md
> +++ b/doc/api/doxy-api-index.md
> @@ -26,6 +26,7 @@ The public API headers are grouped by topics:
> [event_timer_adapter]    (@ref rte_event_timer_adapter.h),
> [event_crypto_adapter]   (@ref rte_event_crypto_adapter.h),
> [rawdev]             (@ref rte_rawdev.h),
> +  [regexdev]           (@ref rte_regexdev.h),
> [metrics]            (@ref rte_metrics.h),
> [bitrate]            (@ref rte_bitrate.h),
> [latency]            (@ref rte_latencystats.h),
> diff --git a/doc/api/doxy-api.conf.in b/doc/api/doxy-api.conf.in
> index b9896cb63..7adb821bb 100644
> --- a/doc/api/doxy-api.conf.in
> +++ b/doc/api/doxy-api.conf.in
> @@ -53,6 +53,7 @@ INPUT                   = @TOPDIR@/doc/api/doxy-api-
> index.md \
> @TOPDIR@/lib/librte_rawdev \
> @TOPDIR@/lib/librte_rcu \
> @TOPDIR@/lib/librte_reorder \
> +                          @TOPDIR@/lib/librte_regexdev \
> @TOPDIR@/lib/librte_ring \
> @TOPDIR@/lib/librte_sched \
> @TOPDIR@/lib/librte_security \
> diff --git a/lib/Makefile b/lib/Makefile
> index 791e0d991..57de9691a 100644
> --- a/lib/Makefile
> +++ b/lib/Makefile
> @@ -44,6 +44,8 @@ DEPDIRS-librte_eventdev := librte_eal librte_ring
> librte_ethdev librte_hash \
> librte_mempool librte_timer librte_cryptodev
> DIRS-$(CONFIG_RTE_LIBRTE_RAWDEV) += librte_rawdev
> DEPDIRS-librte_rawdev := librte_eal librte_ethdev
> +DIRS-$(CONFIG_RTE_LIBRTE_REGEXDEV) += librte_regexdev
> +DEPDIRS-librte_regexdev := librte_eal
> DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += librte_vhost
> DEPDIRS-librte_vhost := librte_eal librte_mempool librte_mbuf
> librte_ethdev \
> 			librte_net
> diff --git a/lib/librte_regexdev/Makefile b/lib/librte_regexdev/Makefile
> new file mode 100644
> index 000000000..723b4b28c
> --- /dev/null
> +++ b/lib/librte_regexdev/Makefile
> @@ -0,0 +1,23 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright(C) 2019 Marvell International Ltd.
> +#
> +
> +include $(RTE_SDK)/mk/rte.vars.mk
> +
> +# library name
> +LIB = librte_regexdev.a
> +
> +# library version
> +LIBABIVER := 1
> +
> +# build flags
> +CFLAGS += -O3
> +CFLAGS += $(WERROR_FLAGS)
> +
> +# library source files
> +SRCS-y += rte_regexdev.c
> +
> +# export include files
> +SYMLINK-y-include += rte_regexdev.h
> +
> +include $(RTE_SDK)/mk/rte.lib.mk
> diff --git a/lib/librte_regexdev/rte_regexdev.c
> b/lib/librte_regexdev/rte_regexdev.c
> new file mode 100644
> index 000000000..e5be0f29c
> --- /dev/null
> +++ b/lib/librte_regexdev/rte_regexdev.c
> @@ -0,0 +1,5 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(C) 2019 Marvell International Ltd.
> + */
> +
> +#include <rte_regexdev.h>
> diff --git a/lib/librte_regexdev/rte_regexdev.h
> b/lib/librte_regexdev/rte_regexdev.h
> new file mode 100644
> index 000000000..765da4aaa
> --- /dev/null
> +++ b/lib/librte_regexdev/rte_regexdev.h
> @@ -0,0 +1,1247 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(C) 2019 Marvell International Ltd.
> + */
> +
> +#ifndef _RTE_REGEXDEV_H_
> +#define _RTE_REGEXDEV_H_
> +
> +/**
> + * @file
> + *
> + * RTE RegEx Device API
> + *
> + * Defines RTE RegEx Device APIs for RegEx operations and its provisioning.
> + *
> + * The RegEx Device API is composed of two parts:
> + *
> + * - The application-oriented RegEx API that includes functions to setup
> + *   a RegEx device (configure it, setup its queue pairs and start it),
> + *   update the rule database and so on.
> + *
> + * - The driver-oriented RegEx API that exports a function allowing
> + *   a RegEx poll Mode Driver (PMD) to simultaneously register itself as
> + *   a RegEx device driver.
> + *
> + * RegEx device components and definitions:
> + *
> + *     +-----------------+
> + *     |                 |
> + *     |                 o---------+    rte_regex_[en|de]queue_burst()
> + *     |   PCRE based    o------+  |               |
> + *     |  RegEx pattern  |      |  |  +--------+   |
> + *     | matching engine o------+--+--o        |   |    +------+
> + *     |                 |      |  |  | queue  |<==o===>|Core 0|
> + *     |                 o----+ |  |  | pair 0 |        |      |
> + *     |                 |    | |  |  +--------+        +------+
> + *     +-----------------+    | |  |
> + *            ^               | |  |  +--------+
> + *            |               | |  |  |        |        +------+
> + *            |               | +--+--o queue  |<======>|Core 1|
> + *        Rule|Database       |    |  | pair 1 |        |      |
> + *     +------+----------+    |    |  +--------+        +------+
> + *     |     Group 0     |    |    |
> + *     | +-------------+ |    |    |  +--------+        +------+
> + *     | | Rules 0..n  | |    |    |  |        |        |Core 2|
> + *     | +-------------+ |    |    +--o queue  |<======>|      |
> + *     |     Group 1     |    |       | pair 2 |        +------+
> + *     | +-------------+ |    |       +--------+
> + *     | | Rules 0..n  | |    |
> + *     | +-------------+ |    |       +--------+
> + *     |     Group 2     |    |       |        |        +------+
> + *     | +-------------+ |    |       | queue  |<======>|Core n|
> + *     | | Rules 0..n  | |    +-------o pair n |        |      |
> + *     | +-------------+ |            +--------+        +------+
> + *     |     Group n     |
> + *     | +-------------+ |<-------rte_regex_rule_db_update()
> + *     | | Rules 0..n  | |<-------rte_regex_rule_db_import()
> + *     | +-------------+ |------->rte_regex_rule_db_export()
> + *     +-----------------+
> + *
> + * RegEx: A regular expression is a concise and flexible means for matching
> + * strings of text, such as particular characters, words, or patterns of
> + * characters. A common abbreviation for this is “RegEx”.
> + *
> + * RegEx device: A hardware or software-based implementation of RegEx
> + * device API for PCRE based pattern matching syntax and semantics.
> + *
> + * PCRE RegEx syntax and semantics specification:
> + * http://regexkit.sourceforge.net/Documentation/pcre/pcrepattern.html
> + *
> + * RegEx queue pair: Each RegEx device should have one or more queue pairs to
> + * transmit a burst of pattern matching requests and receive a burst of
> + * pattern matching responses. The pattern matching request/response is
> + * embedded in the *rte_regex_ops* structure.
> + *
> + * Rule: A pattern matching rule expressed in PCRE RegEx syntax along with
> + * Match ID and Group ID to identify the rule upon the match.
> + *
> + * Rule database: The RegEx device accepts regular expressions and converts
> + * them into a compiled rule database that can then be used to scan data.
> + * Compilation allows the device to analyze the given pattern(s) and
> + * pre-determine how to scan for these patterns in an optimized fashion that
> + * would be far too expensive to compute at run-time. A rule database contains
> + * a set of rules that are compiled in a device-specific binary form.
> + *
> + * Match ID or Rule ID: A unique identifier provided at the time of rule
> + * creation for the application to identify the rule upon match.
> + *
> + * Group ID: Rules can be grouped under one group ID to enable
> + * rule isolation and effective pattern matching. A unique group identifier
> + * is provided at the time of rule creation for the application to identify
> + * the rule upon match.
> + *
> + * Scan: A pattern matching request through *enqueue* API.
> + *
> + * A given RegEx device may not support all the features of PCRE.
> + * The application may probe unsupported features through
> + * struct rte_regex_dev_info::pcre_unsup_flags.
> + *
> + * By default, all the functions of the RegEx Device API exported by a PMD
> + * are lock-free functions which are assumed not to be invoked in parallel on
> + * different logical cores to work on the same target object. For instance,
> + * the dequeue function of a PMD cannot be invoked in parallel on two logical
> + * cores to operate on the same RegEx queue pair. Of course, this function
> + * can be invoked in parallel by different logical cores on different queue
> + * pairs. It is the responsibility of the upper level application to enforce
> + * this rule.
> + *
> + * In all functions of the RegEx API, the RegEx device is
> + * designated by an integer >= 0 named the device identifier *dev_id*
> + *
> + * At the RegEx driver level, RegEx devices are represented by a generic
> + * data structure of type *rte_regex_dev*.
> + *
> + * RegEx devices are dynamically registered during the PCI/SoC device probing
> + * phase performed at EAL initialization time.
> + * When a RegEx device is being probed, a *rte_regex_dev* structure and
> + * a new device identifier are allocated for that device. Then, the
> + * regex_dev_init() function supplied by the RegEx driver matching the probed
> + * device is invoked to properly initialize the device.
> + *
> + * The role of the device init function consists of resetting the hardware or
> + * software RegEx driver implementations.
> + *
> + * If the device init operation is successful, the correspondence between
> + * the device identifier assigned to the new device and its associated
> + * *rte_regex_dev* structure is effectively registered.
> + * Otherwise, both the *rte_regex_dev* structure and the device identifier
> + * are freed.
> + *
> + * The functions exported by the application RegEx API to setup a device
> + * designated by its device identifier must be invoked in the following order:
> + *     - rte_regex_dev_configure()
> + *     - rte_regex_queue_pair_setup()
> + *     - rte_regex_dev_start()
> + *
> + * Then, the application can invoke, in any order, the functions
> + * exported by the RegEx API to enqueue pattern matching jobs, dequeue
> + * pattern matching responses, get the stats, update the rule database,
> + * get/set device attributes and so on.
> + *
> + * If the application wants to change the configuration (i.e. call
> + * rte_regex_dev_configure() or rte_regex_queue_pair_setup()), it must call
> + * rte_regex_dev_stop() first to stop the device and then do the
> + * reconfiguration before calling rte_regex_dev_start() again. The enqueue and
> + * dequeue functions should not be invoked when the device is stopped.
> + *
> + * Finally, an application can close a RegEx device by invoking the
> + * rte_regex_dev_close() function.
> + *
> + * Each function of the application RegEx API invokes a specific function
> + * of the PMD that controls the target device designated by its device
> + * identifier.
> + *
> + * For this purpose, all device-specific functions of a RegEx driver are
> + * supplied through a set of pointers contained in a generic structure of type
> + * *regex_dev_ops*.
> + * The address of the *regex_dev_ops* structure is stored in the
> + * *rte_regex_dev* structure by the device init function of the RegEx driver,
> + * which is invoked during the PCI/SoC device probing phase, as explained
> + * earlier.
> + *
> + * In other words, each function of the RegEx API simply retrieves the
> + * *rte_regex_dev* structure associated with the device identifier and
> + * performs an indirect invocation of the corresponding driver function
> + * supplied in the *regex_dev_ops* structure of the *rte_regex_dev* structure.
> + *
> + * For performance reasons, the addresses of the fast-path functions of the
> + * RegEx driver are not contained in the *regex_dev_ops* structure.
> + * Instead, they are directly stored at the beginning of the *rte_regex_dev*
> + * structure to avoid an extra indirect memory access during their invocation.
> + *
> + * RTE RegEx device drivers do not use interrupts for enqueue or dequeue
> + * operations. Instead, RegEx drivers export Poll-Mode enqueue and dequeue
> + * functions to applications.
> + *
> + * The *enqueue* operation submits a burst of RegEx pattern matching requests
> + * to the RegEx device and the *dequeue* operation gets a burst of pattern
> + * matching responses for the ones submitted through the *enqueue* operation.
> + *
> + * Typical application utilisation of the RegEx device API will follow the
> + * following programming flow.
> + *
> + * - rte_regex_dev_configure()
> + * - rte_regex_queue_pair_setup()
> + * - rte_regex_rule_db_update() Needs to be invoked if a precompiled rule
> + *   database is not provided in rte_regex_dev_config::rule_db for
> + *   rte_regex_dev_configure() and/or the application needs to update the
> + *   rule database.
> + * - Create or reuse an existing mempool for *rte_regex_ops* objects.
> + * - rte_regex_dev_start()
> + * - rte_regex_enqueue_burst()
> + * - rte_regex_dequeue_burst()
> + *
> + */
> +
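
A note on the call ordering described above: it may help to spell the
ordering out as a small state machine. The sketch below is not part of the
RFC and does not call any rte_regex_* function; the mock_* names, the state
enum and the error codes are all made up for illustration. It only encodes
the rules stated in the comment: configure, then queue pair setup, then
start; stop before reconfiguring; no enqueue/dequeue while stopped.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical device states; the RFC does not define such an enum. */
enum mock_regex_state {
	MOCK_UNCONFIGURED,
	MOCK_CONFIGURED,
	MOCK_QP_READY,
	MOCK_STARTED,
	MOCK_CLOSED
};

struct mock_regex_dev {
	enum mock_regex_state state;
};

/* Configure is legal first, or again once the device has been stopped. */
static int mock_configure(struct mock_regex_dev *d)
{
	if (d->state == MOCK_STARTED)
		return -EBUSY;	/* must call stop first */
	if (d->state == MOCK_CLOSED)
		return -ENODEV;	/* a closed device cannot be reused */
	d->state = MOCK_CONFIGURED;
	return 0;
}

/* Queue pair setup is only legal after configure and before start. */
static int mock_queue_pair_setup(struct mock_regex_dev *d)
{
	if (d->state != MOCK_CONFIGURED && d->state != MOCK_QP_READY)
		return -EINVAL;
	d->state = MOCK_QP_READY;
	return 0;
}

/* Start is the last setup step. */
static int mock_start(struct mock_regex_dev *d)
{
	if (d->state != MOCK_QP_READY)
		return -EINVAL;
	d->state = MOCK_STARTED;
	return 0;
}

/* Stop returns the device to a state where it can be reconfigured. */
static void mock_stop(struct mock_regex_dev *d)
{
	if (d->state == MOCK_STARTED)
		d->state = MOCK_QP_READY;
}

/* Enqueue and dequeue are only valid while the device is started. */
static bool mock_can_enqueue(const struct mock_regex_dev *d)
{
	return d->state == MOCK_STARTED;
}
```

If the final API keeps this ordering, a debug build of the library could
perhaps carry a similar check, but that is only a suggestion.
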
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#include <rte_common.h>
> +#include <rte_config.h>
> +#include <rte_dev.h>
> +#include <rte_errno.h>
> +#include <rte_memory.h>
> +
> +/**
> + * Get the total number of RegEx devices that have been successfully
> + * initialised.
> + *
> + * @return
> + *   The total number of usable RegEx devices.
> + */
> +uint8_t
> +rte_regex_dev_count(void);
> +
> +/**
> + * Get the device identifier for the named RegEx device.
> + *
> + * @param name
> + *   RegEx device name to select the RegEx device identifier.
> + *
> + * @return
> + *   Returns RegEx device identifier on success.
> + *   - <0: Failure to find named RegEx device.
> + */
> +int
> +rte_regex_dev_get_dev_id(const char *name);
> +
> +/* Enumerates RegEx device capabilities */
> +#define RTE_REGEX_DEV_CAPA_RUNTIME_COMPILATION_F (1ULL << 0)
> +/**< RegEx device supports compiling the rules at runtime, as opposed to
> + * only loading a pre-built rule database using
> + * struct rte_regex_dev_config::rule_db in rte_regex_dev_configure()
> + * @see struct rte_regex_dev_config::rule_db, rte_regex_dev_configure()
> + * @see struct rte_regex_dev_info::regex_dev_capa
> + */
> +
> +
> +/* Enumerates unsupported PCRE features for the RegEx device */
> +#define RTE_REGEX_DEV_PCRE_UNSUP_START_ANCHOR_F (1ULL << 0)
> +/**< RegEx device doesn't support PCRE Anchor to start of match flag.
> + * Example RegEx is '/\Gfoo\d/'. Here '\G' asserts position at the end of the
> + * previous match or the start of the string for the first match.
> + * This position will change each time the RegEx is applied to the subject
> + * string. If the RegEx is applied to 'foo1foo2Zfoo3' the first two matches will
> + * be successful for 'foo1foo2' and fail for 'Zfoo3'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_ATOMIC_GROUPING_F (1ULL << 1)
> +/**< RegEx device doesn't support PCRE Atomic grouping.
> + * Atomic groups are represented by '(?>)'. An atomic group is a group that,
> + * when the RegEx engine exits from it, automatically throws away all
> + * backtracking positions remembered by any tokens inside the group.
> + * Example RegEx is 'a(?>bc|b)c'. If the given subject strings are 'abc' and
> + * 'abcc', then 'a(bc|b)c' matches both whereas 'a(?>bc|b)c' matches only
> + * 'abcc' because atomic groups don't allow backtracking to 'b'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_BACKTRACKING_CTRL_F (1ULL << 2)
> +/**< RegEx device doesn't support PCRE backtracking control verbs.
> + * Some examples of backtracking verbs are (*COMMIT), (*ACCEPT), (*FAIL),
> + * (*SKIP), (*PRUNE).
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_CALLOUTS_F (1ULL << 3)
> +/**< RegEx device doesn't support PCRE callouts.
> + * PCRE supports calling an external function in between matches by using
> + * '(?C)'. Example RegEx is 'ABC(?C)D'. If the given subject string is 'ABCD'
> + * then the RegEx engine will parse 'ABC', perform a user-defined callout and
> + * return a successful match at 'D'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_BACKREFERENCE_F (1ULL << 4)
> +/**< RegEx device doesn't support PCRE backreference.
> + * Example RegEx is '(\2ABC|(GHI))+'. Here '\2' matches the same text as most
> + * recently matched by the 2nd capturing group, i.e. 'GHI'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_GREEDY_F (1ULL << 5)
> +/**< RegEx device doesn't support PCRE Greedy mode.
> + * For example, in the RegEx 'AB\d*?' the quantifier '*?' is the ungreedy form
> + * of '*' (zero or unlimited repetitions). With the greedy 'AB\d*' the subject
> + * string 'AB12345' will be matched completely, whereas with the ungreedy
> + * 'AB\d*?' only 'AB' will be returned as the match.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_LOOKAROUND_ASRT_F (1ULL << 6)
> +/**< RegEx device doesn't support PCRE Lookaround assertions
> + * (Zero-width assertions). Example RegEx is '[a-z]+\d+(?=!{3,})'. If
> + * the given subject string is 'dwad1234!' the RegEx engine doesn't report
> + * any matches because the assert '(?=!{3,})' fails. The subject string
> + * 'dwad123!!!' would return a successful match.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_MATCH_POINT_RST_F (1ULL << 7)
> +/**< RegEx device doesn't support PCRE match point reset directive.
> + * Example RegEx is '[a-z]+\K\d+'. If the subject string is 'dwad123',
> + * then even though the entire string matches, only '123'
> + * is reported as a match.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_NEWLINE_CONVENTIONS_F (1ULL << 8)
> +/**< RegEx device doesn't support PCRE newline convention.
> + * Newline conventions are represented as follows:
> + * (*CR)        carriage return
> + * (*LF)        linefeed
> + * (*CRLF)      carriage return, followed by linefeed
> + * (*ANYCRLF)   any of the three above
> + * (*ANY)       all Unicode newline sequences
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_NEWLINE_SEQ_F (1ULL << 9)
> +/**< RegEx device doesn't support PCRE newline sequence.
> + * The escape sequence '\R' will match any newline sequence.
> + * It is equivalent to: '(?>\r\n|\n|\x0b|\f|\r|\x85)'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_POSSESSIVE_QUALIFIERS_F (1ULL << 10)
> +/**< RegEx device doesn't support PCRE possessive quantifiers.
> + * Example RegEx possessive quantifiers are '*+', '++', '?+', '{m,n}+'.
> + * A possessive quantifier repeats the token as many times as possible and it
> + * does not give up matches as the engine backtracks. With a possessive
> + * quantifier, the deal is all or nothing.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_SUBROUTINE_REFERENCES_F (1ULL << 11)
> +/**< RegEx device doesn't support PCRE Subroutine references.
> + * PCRE Subroutine references allow for sub patterns to be assessed
> + * as part of the RegEx. Example RegEx is '(foo|fuzz)\g<1>+bar' matches the
> + * pattern 'foofoofuzzfoofuzzbar'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_UTF_8_F (1ULL << 12)
> +/**< RegEx device doesn't support UTF-8 character encoding.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_UTF_16_F (1ULL << 13)
> +/**< RegEx device doesn't support UTF-16 character encoding.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_UTF_32_F (1ULL << 14)
> +/**< RegEx device doesn't support UTF-32 character encoding.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_WORD_BOUNDARY_F (1ULL << 15)
> +/**< RegEx device doesn't support word boundaries.
> + * The meta character '\b' represents word boundary anchor.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_FORWARD_REFERENCES_F (1ULL << 16)
> +/**< RegEx device doesn't support Forward references.
> + * Forward references allow you to use a back reference to a group that
> + * appears later in the RegEx. Example RegEx is '(\3ABC|(DEF|(GHI)))+', which
> + * matches the following string 'GHIGHIABCDEF'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
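
Since the UNSUP flags are a bitmask, I would expect applications to test
them before loading rules. A minimal sketch of how that might look follows.
The three flag values are copied from the defines above so that the snippet
compiles on its own; regex_features_usable() is a hypothetical helper, not
something the RFC provides, and in a real application the mask would come
from rte_regex_dev_info::pcre_unsup_flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the RFC defines so the sketch is self-contained. */
#define RTE_REGEX_DEV_PCRE_UNSUP_ATOMIC_GROUPING_F (1ULL << 1)
#define RTE_REGEX_DEV_PCRE_UNSUP_BACKREFERENCE_F (1ULL << 4)
#define RTE_REGEX_DEV_PCRE_UNSUP_LOOKAROUND_ASRT_F (1ULL << 6)

/*
 * Hypothetical helper: 'needed' is the OR of the UNSUP_* bits that
 * correspond to the PCRE features a rule set relies on. The rule set is
 * usable only if the device reports none of them as unsupported.
 */
static bool
regex_features_usable(uint64_t pcre_unsup_flags, uint64_t needed)
{
	return (pcre_unsup_flags & needed) == 0;
}
```

Usage would be along the lines of calling rte_regex_dev_info_get() once and
passing info.pcre_unsup_flags to the helper for each rule set.
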
> +/* Enumerates PCRE rule flags */
> +#define RTE_REGEX_PCRE_RULE_ALLOW_EMPTY_F (1ULL << 0)
> +/**< When this flag is set, patterns that can match against an empty string,
> + * such as '.*', are allowed.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_ANCHORED_F (1ULL << 1)
> +/**< When this flag is set, the pattern is forced to be "anchored", that is, it
> + * is constrained to match only at the first matching point in the string that
> + * is being searched. Similar to '^' and represented by \A.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_CASELESS_F (1ULL << 2)
> +/**< When this flag is set, letters in the pattern match both upper and lower
> + * case letters in the subject.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_DOTALL_F (1ULL << 3)
> +/**< When this flag is set, a dot metacharacter in the pattern matches any
> + * character, including one that indicates a newline.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_DUPNAMES_F (1ULL << 4)
> +/**< When this flag is set, names used to identify capture groups need not be
> + * unique.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_EXTENDED_F (1ULL << 5)
> +/**< When this flag is set, most white space characters in the pattern are
> + * totally ignored except when escaped or inside a character class.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_MATCH_UNSET_BACKREF_F (1ULL << 6)
> +/**< When this flag is set, a backreference to an unset capture group matches
> + * an empty string.
> + * @see RTE_REGEX_DEV_PCRE_UNSUP_FORWARD_REFERENCES_F
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_MULTILINE_F (1ULL << 7)
> +/**< When this flag is set, the '^' and '$' constructs match immediately
> + * following or immediately before internal newlines in the subject string,
> + * respectively, as well as at the very start and end.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_NO_AUTO_CAPTURE_F (1ULL << 8)
> +/**< When this flag is set, it disables the use of numbered capturing
> + * parentheses in the pattern. References to capture groups (backreferences
> + * or recursion/subroutine calls) may only refer to named groups, though the
> + * reference can be by name or by number.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_UCP_F (1ULL << 9)
> +/**< By default, only ASCII characters are recognized. When this flag is set,
> + * Unicode properties are used instead to classify characters.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_UNGREEDY_F (1ULL << 10)
> +/**< When this flag is set, the "greediness" of the quantifiers is inverted
> + * so that they are not greedy by default, but become greedy if followed by
> + * '?'.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_UTF_F (1ULL << 11)
> +/**< When this flag is set, RegEx engine has to regard both the pattern and
> + * the subject strings that are subsequently processed as strings of UTF
> + * characters instead of single-code-unit strings.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_NEVER_BACKSLASH_C_F (1ULL << 12)
> +/**< This flag locks out the use of '\C' in the pattern that is being
> + * compiled. This escape matches one data unit, even in UTF mode, which can
> + * cause unpredictable behavior in UTF-8 or UTF-16 modes, because it may leave
> + * the current matching point in the middle of a multi-code-unit character.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +
> +/**
> + * RegEx device information
> + */
> +struct rte_regex_dev_info {
> +	const char *driver_name; /**< RegEx driver name */
> +	struct rte_device *dev;	/**< Device information */
> +	uint8_t max_matches;
> +	/**< Maximum matches per scan supported by this device */
> +	uint16_t max_queue_pairs;
> +	/**< Maximum queue pairs supported by this device */
> +	uint16_t max_payload_size;
> +	/**< Maximum payload size for a pattern match request or scan.
> +	 * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
> +	 */
> +	uint16_t max_rules_per_group;
> +	/**< Maximum rules supported per group by this device */
> +	uint16_t max_groups;
> +	/**< Maximum group supported by this device */
> +	uint32_t regex_dev_capa;
> +	/**< RegEx device capabilities. @see RTE_REGEX_DEV_CAPA_* */
> +	uint64_t rule_flags;
> +	/**< Supported compiler rule flags.
> +	 * @see RTE_REGEX_PCRE_RULE_*, struct rte_regex_rule::rule_flags
> +	 */
> +	uint64_t pcre_unsup_flags;
> +	/**< Unsupported PCRE features for this RegEx device.
> +	 * @see RTE_REGEX_DEV_PCRE_UNSUP_*
> +	 */
> +};
> +
> +/**
> + * Retrieve the contextual information of a RegEx device.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + *
> + * @param[out] dev_info
> + *   A pointer to a structure of type *rte_regex_dev_info* to be filled with
> + *   the contextual information of the device.
> + *
> + * @return
> + *   - 0: Success, driver updates the contextual information of the RegEx
> + *     device
> + *   - <0: Error code returned by the driver info get function.
> + *
> + */
> +int
> +rte_regex_dev_info_get(uint8_t dev_id, struct rte_regex_dev_info *dev_info);
> +
> +/* Enumerates RegEx device configuration flags */
> +#define RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F (1ULL << 0)
> +/**< Cross buffer scan refers to the ability to be able to detect
> + * matches that occur across buffer boundaries, where the buffers are related
> + * to each other in some way. Enable this flag to scan a payload size
> + * greater than struct rte_regex_dev_info::max_payload_size and/or when
> + * matches can be present across scan buffer boundaries.
> + *
> + * @see struct rte_regex_dev_info::max_payload_size
> + * @see struct rte_regex_dev_config::dev_cfg_flags, rte_regex_dev_configure()
> + * @see RTE_REGEX_OPS_RSP_PMI_SOJ_F
> + * @see RTE_REGEX_OPS_RSP_PMI_EOJ_F
> + */
> +
> +/** RegEx device configuration structure */
> +struct rte_regex_dev_config {
> +	uint8_t nb_max_matches;
> +	/**< Maximum matches per scan configured on this device.
> +	 * This value cannot exceed the *max_matches*
> +	 * which was previously provided in rte_regex_dev_info_get().
> +	 * The value 0 is allowed, in which case the value 1 is used.
> +	 * @see struct rte_regex_dev_info::max_matches
> +	 */
> +	uint16_t nb_queue_pairs;
> +	/**< Number of RegEx queue pairs to configure on this device.
> +	 * This value cannot exceed the *max_queue_pairs* which was previously
> +	 * provided in rte_regex_dev_info_get().
> +	 * @see struct rte_regex_dev_info::max_queue_pairs
> +	 */
> +	uint16_t nb_rules_per_group;
> +	/**< Number of rules per group to configure on this device.
> +	 * This value cannot exceed the *max_rules_per_group*
> +	 * which was previously provided in rte_regex_dev_info_get().
> +	 * The value 0 is allowed, in which case
> +	 * struct rte_regex_dev_info::max_rules_per_group is used.
> +	 * @see struct rte_regex_dev_info::max_rules_per_group
> +	 */
> +	uint16_t nb_groups;
> +	/**< Number of groups to configure on this device.
> +	 * This value cannot exceed the *max_groups*
> +	 * which was previously provided in rte_regex_dev_info_get().
> +	 * @see struct rte_regex_dev_info::max_groups
> +	 */
> +	const char *rule_db;
> +	/**< Initial prebuilt rule database to import on this device.
> +	 * The value NULL is allowed, in which case the device will not
> +	 * be configured with a prebuilt rule database. Application may use
> +	 * rte_regex_rule_db_update() or rte_regex_rule_db_import() API
> +	 * to update or import rule database after the
> +	 * rte_regex_dev_configure().
> +	 * @see rte_regex_rule_db_update(), rte_regex_rule_db_import()
> +	 */
> +	uint32_t rule_db_len;
> +	/**< Length of *rule_db* buffer. */
> +	uint32_t dev_cfg_flags;
> +	/**< RegEx device configuration flags, See RTE_REGEX_DEV_CFG_* */
> +};
> +
> +/**
> + * Configure a RegEx device.
> + *
> + * This function must be invoked first before any other function in the
> + * API. This function can also be re-invoked when a device is in the
> + * stopped state.
> + *
> + * The caller may use rte_regex_dev_info_get() to get the capability of each
> + * resource available for this RegEx device.
> + *
> + * @param dev_id
> + *   The identifier of the device to configure.
> + * @param cfg
> + *   The RegEx device configuration structure.
> + *
> + * @return
> + *   - 0: Success, device configured.
> + *   - <0: Error code returned by the driver configuration function.
> + */
> +int
> +rte_regex_dev_configure(uint8_t dev_id, const struct rte_regex_dev_config *cfg);
> +
> +/* Enumerates RegEx queue pair configuration flags */
> +#define RTE_REGEX_QUEUE_PAIR_CFG_OOS_F (1ULL << 0)
> +/**< Out of order scan. If not set, a scan must retire after previously issued
> + * in-order scans to this queue pair. If set, this scan can be retired as soon
> + * as the device returns completion. Application should not set out of order
> + * scan flag if it needs to maintain the ingress order of scan requests.
> + *
> + * @see struct rte_regex_qp_conf::qp_conf_flags, rte_regex_queue_pair_setup()
> + */
> +
> +struct rte_regex_ops;
> +typedef void (*regexdev_stop_flush_t)(uint8_t dev_id, uint16_t qp_id,
> +				      struct rte_regex_ops *op);
> +/**< Callback function called during rte_regex_dev_stop(), invoked once per
> + * flushed RegEx op.
> + */
> +
> +/** RegEx queue pair configuration structure */
> +struct rte_regex_qp_conf {
> +	uint32_t qp_conf_flags;
> +	/**< Queue pair config flags, See RTE_REGEX_QUEUE_PAIR_CFG_* */
> +	uint16_t nb_desc;
> +	/**< The number of descriptors to allocate for this queue pair. */
> +	regexdev_stop_flush_t cb;
> +	/**< Callback function called during rte_regex_dev_stop(), invoked
> +	 * once per flushed regex op. Value NULL is allowed, in which case
> +	 * callback will not be invoked. This function can be used to properly
> +	 * dispose of outstanding regex ops from response queue,
> +	 * for example ops containing memory pointers.
> +	 * @see rte_regex_dev_stop()
> +	 */
> +};
> +
> +/**
> + * Allocate and set up a RegEx queue pair for a RegEx device.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param queue_pair_id
> + *   The index of the RegEx queue pair to setup. The value must be in the
> + *   range [0, nb_queue_pairs - 1] previously supplied to
> + *   rte_regex_dev_configure().
> + * @param qp_conf
> + *   The pointer to the configuration data to be used for the RegEx queue
> + *   pair. NULL value is allowed, in which case default configuration is used.
> + *
> + * @return
> + *   - 0: Success, RegEx queue pair correctly set up.
> + *   - <0: RegEx queue configuration failed
> + */
> +int
> +rte_regex_queue_pair_setup(uint8_t dev_id, uint8_t queue_pair_id,
> +			   const struct rte_regex_qp_conf *qp_conf);
> +
> +/**
> + * Start a RegEx device.
> + *
> + * The device start step is the last one and consists of setting the RegEx
> + * queues to start accepting the pattern matching scan requests.
> + *
> + * On success, all basic functions exported by the API (RegEx enqueue,
> + * RegEx dequeue and so on) can be invoked.
> + *
> + * @param dev_id
> + *   RegEx device identifier
> + * @return
> + *   - 0: Success, device started.
> + *   - <0: Device start failed.
> + */
> +int
> +rte_regex_dev_start(uint8_t dev_id);
> +
> +/**
> + * Stop a RegEx device.
> + *
> + * The device can be restarted with a call to rte_regex_dev_start().
> + *
> + * This function causes all queued response regex ops to be drained from the
> + * response queue. While draining ops out of the device,
> + * struct rte_regex_qp_conf::cb will be invoked for each op.
> + *
> + * @param dev_id
> + *   RegEx device identifier.
> + *
> + * @see struct rte_regex_qp_conf::cb, rte_regex_queue_pair_setup()
> + */
> +void
> +rte_regex_dev_stop(uint8_t dev_id);
> +
> +/**
> + * Close a RegEx device. The device cannot be restarted!
> + *
> + * @param dev_id
> + *   RegEx device identifier
> + *
> + * @return
> + *  - 0 on successfully closed the device.
> + *  - <0 on failure to close the device.
> + */
> +int
> +rte_regex_dev_close(uint8_t dev_id);
> +
> +/* Device get/set attributes */
> +
> +/** Enumerates RegEx device attribute identifier */
> +enum rte_regex_dev_attr_id {
> +	RTE_REGEX_DEV_ATTR_SOCKET_ID,
> +	/**< The NUMA socket id to which the device is connected or
> +	 * a default of zero if the socket could not be determined.
> +	 * datatype: *int*
> +	 * operation: *get*
> +	 */
> +	RTE_REGEX_DEV_ATTR_MAX_MATCHES,
> +	/**< Maximum number of matches per scan.
> +	 * datatype: *uint8_t*
> +	 * operation: *get* and *set*
> +	 *
> +	 * @see RTE_REGEX_OPS_RSP_MAX_MATCH_F
> +	 */
> +	RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT,
> +	/**< Upper bound scan time in ns.
> +	 * datatype: *uint16_t*
> +	 * operation: *get* and *set*
> +	 *
> +	 * @see RTE_REGEX_OPS_RSP_MAX_SCAN_TIMEOUT_F
> +	 */
> +	RTE_REGEX_DEV_ATTR_MAX_PREFIX,
> +	/**< Maximum number of prefix detected per scan.
> +	 * This would be useful for denial of service detection.
> +	 * datatype: *uint16_t*
> +	 * operation: *get* and *set*
> +	 *
> +	 * @see RTE_REGEX_OPS_RSP_MAX_PREFIX_F
> +	 */
> +};
> +
> +/**
> + * Get an attribute from a RegEx device.
> + *
> + * @param dev_id RegEx device identifier
> + * @param attr_id The attribute ID to retrieve
> + * @param[out] attr_value A pointer that will be filled in with the attribute
> + *             value if successful.
> + *
> + * @return
> + *   - 0: Successfully retrieved attribute value.
> + *   - -EINVAL: Invalid device or *attr_id* provided, or *attr_value* is NULL.
> + *   - -ENOTSUP: if the device doesn't support specific *attr_id*.
> + */
> +int
> +rte_regex_dev_attr_get(uint8_t dev_id, enum rte_regex_dev_attr_id attr_id,
> +		       void *attr_value);
> +
> +/**
> + * Set an attribute to a RegEx device.
> + *
> + * @param dev_id RegEx device identifier
> + * @param attr_id The attribute ID to set
> + * @param attr_value A pointer to the attribute value supplied
> + *                   by the application
> + *
> + * @return
> + *   - 0: Successfully applied the attribute value.
> + *   - -EINVAL: Invalid device or *attr_id* provided, or *attr_value* is NULL.
> + *   - -ENOTSUP: if the device doesn't support specific *attr_id*.
> + */
> +int
> +rte_regex_dev_attr_set(uint8_t dev_id, enum rte_regex_dev_attr_id attr_id,
> +		       const void *attr_value);
> +
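
Because attr_value is a void pointer, the caller has to know the width of
each attribute from the enum comments. A small lookup like the one below
might make that less error prone. The enum is copied from the RFC so the
snippet stands alone; regex_attr_size() is a hypothetical helper that does
not exist in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copied from the RFC so the sketch compiles on its own. */
enum rte_regex_dev_attr_id {
	RTE_REGEX_DEV_ATTR_SOCKET_ID,
	RTE_REGEX_DEV_ATTR_MAX_MATCHES,
	RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT,
	RTE_REGEX_DEV_ATTR_MAX_PREFIX,
};

/*
 * Hypothetical helper: size of the object attr_value must point to,
 * following the "datatype" notes in the enum comments. Returns 0 for an
 * unknown attribute identifier.
 */
static size_t
regex_attr_size(enum rte_regex_dev_attr_id attr_id)
{
	switch (attr_id) {
	case RTE_REGEX_DEV_ATTR_SOCKET_ID:
		return sizeof(int);
	case RTE_REGEX_DEV_ATTR_MAX_MATCHES:
		return sizeof(uint8_t);
	case RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT:
	case RTE_REGEX_DEV_ATTR_MAX_PREFIX:
		return sizeof(uint16_t);
	}
	return 0;
}
```

An application wrapper could compare this size against the buffer it is
about to pass before calling the get/set functions.
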
> +/* Rule related APIs */
> +/** Enumerates RegEx rule operation */
> +enum rte_regex_rule_op {
> +	RTE_REGEX_RULE_OP_ADD,
> +	/**< Add RegEx rule to rule database */
> +	RTE_REGEX_RULE_OP_REMOVE
> +	/**< Remove RegEx rule from rule database */
> +};
> +
> +/** Structure to hold a RegEx rule attributes */
> +struct rte_regex_rule {
> +	enum rte_regex_rule_op op;
> +	/**< OP type of the rule, either OP_ADD or OP_REMOVE */
> +	uint16_t group_id;
> +	/**< Group identifier to which the rule belongs to. */
> +	uint32_t rule_id;
> +	/**< Rule identifier which is returned on successful match. */
> +	const char *pcre_rule;
> +	/**< Buffer to hold the PCRE rule. */
> +	uint16_t pcre_rule_len;
> +	/**< Length of the PCRE rule*/
> +	uint64_t rule_flags;
> +	/**< PCRE rule flags. Supported device specific PCRE rules enumerated
> +	 * in struct rte_regex_dev_info::rule_flags. For successful rule
> +	 * database update, application needs to provide only supported
> +	 * rule flags.
> +	 * @see RTE_REGEX_PCRE_RULE_*, struct rte_regex_dev_info::rule_flags
> +	 */
> +};
> +
> +/**
> + * Update the rule database of a RegEx device.
> + *
> + * @param dev_id RegEx device identifier
> + * @param rules
> + *   Points to an array of *nb_rules* objects of type *rte_regex_rule*
> + *   structure which contain the regex rule attributes to be updated in the
> + *   rule database.
> + * @param nb_rules
> + *   The number of PCRE rules to update the rule database.
> + *
> + * @return
> + *   The number of regex rules actually updated on the regex device's rule
> + *   database. The return value can be less than the value of the *nb_rules*
> + *   parameter when the regex device fails to update the rule database or
> + *   if invalid parameters are specified in a *rte_regex_rule*.
> + *   If the return value is less than *nb_rules*, the remaining PCRE rules
> + *   at the end of *rules* are not consumed, the caller has to take
> + *   care of them, and rte_errno is set accordingly.
> + *   Possible errno values include:
> + *   - -EINVAL:  Invalid device ID or rules is NULL
> + *   - -ENOTSUP: The last processed rule is not supported on this device.
> + *   - -ENOSPC: No space available in rule database.
> + *
> + * @see rte_regex_rule_db_import(), rte_regex_rule_db_export()
> + */
> +uint16_t
> +rte_regex_rule_db_update(uint8_t dev_id, const struct rte_regex_rule *rules,
> +			 uint16_t nb_rules);
> +
> +/**
> + * Import a prebuilt rule database from a buffer to a RegEx device.
> + *
> + * @param dev_id RegEx device identifier
> + * @param rule_db
> + *   Points to prebuilt rule database.
> + * @param rule_db_len
> + *   Length of the rule database.
> + *
> + * @return
> + *   - 0: Successfully updated the prebuilt rule database.
> + *   - -EINVAL:  Invalid device ID or rule_db is NULL
> + *   - -ENOTSUP: Rule database import is not supported on this device.
> + *   - -ENOSPC: No space available in rule database.
> + *
> + * @see rte_regex_rule_db_update(), rte_regex_rule_db_export()
> + */
> +int
> +rte_regex_rule_db_import(uint8_t dev_id, const char *rule_db,
> +			 uint32_t rule_db_len);
> +
> +/**
> + * Export the prebuilt rule database from a RegEx device to the buffer.
> + *
> + * @param dev_id RegEx device identifier
> + * @param[out] rule_db
> + *   Block of memory to insert the rule database. Must be at least size in
> + *   capacity. If set to NULL, function returns required capacity.
> + *
> + * @return
> + *   - 0: Successfully exported the prebuilt rule database.
> + *   - size: If rule_db set to NULL then required capacity for *rule_db*
> + *   - -EINVAL:  Invalid device ID
> + *   - -ENOTSUP: Rule database export is not supported on this device.
> + *
> + * @see rte_regex_rule_db_update(), rte_regex_rule_db_import()
> + */
> +int
> +rte_regex_rule_db_export(uint8_t dev_id, char *rule_db);
> +
> +/* Extended statistics */
> +/** Maximum name length for extended statistics counters */
> +#define RTE_REGEX_DEV_XSTATS_NAME_SIZE 64
> +
> +/**
> + * A name-key lookup element for extended statistics.
> + *
> + * This structure is used to map between names and ID numbers
> + * for extended RegEx device statistics.
> + */
> +struct rte_regex_dev_xstats_map {
> +	uint16_t id;
> +	/**< xstat identifier */
> +	char name[RTE_REGEX_DEV_XSTATS_NAME_SIZE];
> +	/**< xstat name */
> +};
> +
> +/**
> + * Retrieve names of extended statistics of a regex device.
> + *
> + * @param dev_id
> + *   The identifier of the regex device.
> + * @param[out] xstats_map
> + *   Block of memory to insert id and names into. Must be at least size in
> + *   capacity. If set to NULL, function returns required capacity.
> + * @return
> + *   - positive value on success:
> + *        -The return value is the number of entries filled in the stats map.
> + *        -If xstats_map set to NULL then required capacity for xstats_map.
> + *   - negative value on error:
> + *      -ENODEV for invalid *dev_id*
> + *      -ENOTSUP if the device doesn't support this function.
> + */
> +int
> +rte_regex_dev_xstats_names_get(uint8_t dev_id,
> +			       struct rte_regex_dev_xstats_map *xstats_map);
> +
> +/**
> + * Retrieve extended statistics of a regex device.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param ids
> + *   The id numbers of the stats to get. The ids can be got from the stat
> + *   position in the stat list from rte_regex_dev_xstats_names_get(), or
> + *   by using rte_regex_dev_xstats_by_name_get().
> + * @param[out] values
> + *   The values for each stats request by ID.
> + * @param n
> + *   The number of stats requested
> + * @return
> + *   - positive value: number of stat entries filled into the values array
> + *   - negative value on error:
> + *      -ENODEV for invalid *dev_id*
> + *      -ENOTSUP if the device doesn't support this function.
> + */
> +int
> +rte_regex_dev_xstats_get(uint8_t dev_id, const uint16_t ids[],
> +			 uint64_t values[], uint16_t n);
> +
> +/**
> + * Retrieve the value of a single stat by requesting it by name.
> + *
> + * @param dev_id
> + *   The identifier of the device
> + * @param name
> + *   The stat name to retrieve
> + * @param[out] id
> + *   If non-NULL, the numerical id of the stat will be returned, so that further
> + *   requests for the stat can be got using rte_regex_dev_xstats_get, which will
> + *   be faster as it doesn't need to scan a list of names for the stat.
> + * @param[out] value
> + *   Must be non-NULL, retrieved xstat value will be stored in this address.
> + *
> + * @return
> + *   - 0: Successfully retrieved xstat value.
> + *   - -EINVAL: invalid parameters
> + *   - -ENOTSUP: if not supported.
> + */
> +int
> +rte_regex_dev_xstats_by_name_get(uint8_t dev_id, const char *name,
> +				 uint16_t *id, uint64_t *value);
> +
> +/**
> + * Reset the values of the xstats of the selected component in the device.
> + *
> + * @param dev_id
> + *   The identifier of the device
> + * @param ids
> + *   Selects specific statistics to be reset. When NULL, all statistics will be
> + *   reset. If non-NULL, must point to array of at least *nb_ids* size.
> + * @param nb_ids
> + *   The number of ids available from the *ids* array. Ignored when ids is NULL.
> + * @return
> + *   - 0: Successfully reset the statistics to zero.
> + *   - -EINVAL: invalid parameters
> + *   - -ENOTSUP: if not supported.
> + */
> +int
> +rte_regex_dev_xstats_reset(uint8_t dev_id, const uint16_t ids[],
> +			   uint16_t nb_ids);
> +
> +/**
> + * Trigger the RegEx device self test.
> + *
> + * @param dev_id
> + *   The identifier of the device
> + * @return
> + *   - 0: Selftest successful
> + *   - -ENOTSUP if the device doesn't support selftest
> + *   - other values < 0 on failure.
> + */
> +int rte_regex_dev_selftest(uint8_t dev_id);
> +
> +/**
> + * Dump internal information about *dev_id* to the FILE* provided in *f*.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + *
> + * @param f
> + *   A pointer to a file for output
> + *
> + * @return
> + *   - 0: on success
> + *   - <0: on failure.
> + */
> +int
> +rte_regex_dev_dump(uint8_t dev_id, FILE *f);
> +
> +/* Fast path APIs */
> +
> +/**
> + * The generic *rte_regex_match* structure to hold the RegEx match attributes.
> + * @see struct rte_regex_ops::matches
> + */
> +struct rte_regex_match {
> +	RTE_STD_C11
> +	union {
> +		uint64_t u64;
> +		struct {
> +			uint32_t rule_id:20;
> +			/**< Rule identifier to which the pattern matched.
> +			 * @see struct rte_regex_rule::rule_id
> +			 */
> +			uint32_t group_id:12;
> +			/**< Group identifier of the rule which the pattern
> +			 * matched. @see struct rte_regex_rule::group_id
> +			 */
> +			uint16_t offset;
> +			/**< Starting Byte Position for matched rule. */
> +			uint16_t len;
> +			/**< Length of match in bytes */
> +		};
> +	};
> +};
> +
> +/* Enumerates RegEx request flags. */
> +#define RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F (1 << 0)
> +/**< Set when struct rte_regex_ops::group_id1 is valid */
> +
> +#define RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F (1 << 1)
> +/**< Set when struct rte_regex_ops::group_id2 is valid */
> +
> +#define RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F (1 << 2)
> +/**< Set when struct rte_regex_ops::group_id3 is valid */
> +
> +#define RTE_REGEX_OPS_REQ_STOP_ON_MATCH_F (1 << 4)
> +/**< The RegEx engine will stop scanning and return the first match. */
> +
> +#define RTE_REGEX_OPS_REQ_MATCH_HIGH_PRIORITY_F (1 << 5)
> +/**< In High Priority mode a maximum of one match will be returned per scan to
> + * reduce the post-processing required by the application. The match with the
> + * lowest Rule id, lowest start pointer and lowest match length will be
> + * returned.
> + *
> + * @see struct rte_regex_ops::nb_actual_matches
> + * @see struct rte_regex_ops::nb_matches
> + */
> +
> +
> +/* Enumerates RegEx response flags. */
> +#define RTE_REGEX_OPS_RSP_PMI_SOJ_F (1 << 0)
> +/**< Indicates that the RegEx device has encountered a partial match at the
> + * start of scan in the given buffer.
> + *
> + * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
> + */
> +
> +#define RTE_REGEX_OPS_RSP_PMI_EOJ_F (1 << 1)
> +/**< Indicates that the RegEx device has encountered a partial match at the
> + * end of scan in the given buffer.
> + *
> + * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
> + */
> +
> +#define RTE_REGEX_OPS_RSP_MAX_SCAN_TIMEOUT_F (1 << 2)
> +/**< Indicates that the RegEx device has exceeded the max timeout while
> + * scanning the given buffer.
> + *
> + * @see RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT
> + */
> +
> +#define RTE_REGEX_OPS_RSP_MAX_MATCH_F (1 << 3)
> +/**< Indicates that the RegEx device has exceeded the max matches while
> + * scanning the given buffer.
> + *
> + * @see RTE_REGEX_DEV_ATTR_MAX_MATCHES
> + */
> +
> +#define RTE_REGEX_OPS_RSP_MAX_PREFIX_F (1 << 4)
> +/**< Indicates that the RegEx device has reached the max allowed prefix length
> + * while scanning the given buffer.
> + *
> + * @see RTE_REGEX_DEV_ATTR_MAX_PREFIX
> + */
> +
> +/**
> + * The generic *rte_regex_ops* structure to hold the RegEx attributes
> + * for enqueue and dequeue operation.
> + */
> +struct rte_regex_ops {
> +	/* W0 */
> +	uint16_t req_flags;
> +	/**< Request flags for the RegEx ops.
> +	 * @see RTE_REGEX_OPS_REQ_*
> +	 */
> +	uint16_t scan_size;
> +	/**< Scan size of the buffer to be scanned in bytes. */
> +	uint16_t rsp_flags;
> +	/**< Response flags for the RegEx ops.
> +	 * @see RTE_REGEX_OPS_RSP_*
> +	 */
> +	uint8_t nb_actual_matches;
> +	/**< The total number of actual matches detected by the Regex device.*/
> +	uint8_t nb_matches;
> +	/**< The total number of matches returned by the RegEx device for this
> +	 * scan. The size of *rte_regex_ops::matches* zero length array will be
> +	 * this value.
> +	 *
> +	 * @see struct rte_regex_ops::matches, struct rte_regex_match
> +	 */
> +
> +	/* W1 */
> +	RTE_STD_C11
> +	union {
> +		uint64_t u64;
> +		/**<  Allow 8-byte reserved on 32-bit system */
> +		void *buf_addr;
> +		/**< Virtual address of the pattern to be matched. */
> +	};
> +
> +	/* W2 */
> +	rte_iova_t buf_iova;
> +	/**< IOVA address of the pattern to be matched. */
> +
> +	/* W3 */
> +	uint16_t group_id0;
> +	/**< First group_id to match the rule against. Minimum one group id
> +	 * must be provided by application.
> +	 * When RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F set then group_id1
> +	 * is valid, respectively similar flags for group_id2 and group_id3.
> +	 * Upon the match, struct rte_regex_match::group_id shall be updated
> +	 * with matching group ID by the device. Group ID scheme provides
> +	 * rule isolation and effective pattern matching.
> +	 */
> +	uint16_t group_id1;
> +	/**< Second group_id to match the rule against.
> +	 *
> +	 * @see RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F
> +	 */
> +	uint16_t group_id2;
> +	/**< Third group_id to match the rule against.
> +	 *
> +	 * @see RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F
> +	 */
> +	uint16_t group_id3;
> +	/**< Fourth group_id to match the rule against.
> +	 *
> +	 * @see RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F
> +	 */
> +
> +	/* W4 */
> +	RTE_STD_C11
> +	union {
> +		uint64_t user_id;
> +		/**< Application specific opaque value. An application may use
> +		 * this field to hold application specific value to share
> +		 * between dequeue and enqueue operation.
> +		 * Implementation should not modify this field.
> +		 */
> +		void *user_ptr;
> +		/**< Pointer representation of *user_id* */
> +	};
> +
> +	/* W5 */
> +	struct rte_regex_match matches[];
> +	/**< Zero length array to hold the match tuples.
> +	 * The struct rte_regex_ops::nb_matches value holds the number of
> +	 * elements in this array.
> +	 *
> +	 * @see struct rte_regex_ops::nb_matches
> +	 */
> +};
> +
> +/**
> + * Enqueue a burst of scan requests on a RegEx device.
> + *
> + * The rte_regex_enqueue_burst() function is invoked to place
> + * regex operations on the queue *qp_id* of the device designated by
> + * its *dev_id*.
> + *
> + * The *nb_ops* parameter is the number of operations to process which are
> + * supplied in the *ops* array of *rte_regex_ops* structures.
> + *
> + * The rte_regex_enqueue_burst() function returns the number of
> + * operations it actually enqueued for processing. A return value equal to
> + * *nb_ops* means that all operations have been enqueued.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param qp_id
> + *   The index of the queue pair on which operations are to be enqueued for
> + *   processing. The value must be in the range [0, nb_queue_pairs - 1]
> + *   previously supplied to rte_regex_dev_configure().
> + * @param ops
> + *   The address of an array of *nb_ops* pointers to *rte_regex_ops* structures
> + *   which contain the regex operations to be processed.
> + * @param nb_ops
> + *   The number of operations to process.
> + *
> + * @return
> + *   The number of operations actually enqueued on the regex device. The return
> + *   value can be less than the value of the *nb_ops* parameter when the
> + *   regex device's queue is full or if invalid parameters are specified in
> + *   a *rte_regex_ops*. If the return value is less than *nb_ops*, the remaining
> + *   ops at the end of *ops* are not consumed and the caller has to take care
> + *   of them.
> + */
> +uint16_t
> +rte_regex_enqueue_burst(uint8_t dev_id, uint16_t qp_id,
> +			struct rte_regex_ops **ops, uint16_t nb_ops);
> +
> +/**
> + *
> + * Dequeue a burst of scan responses from a queue on the RegEx device.
> + * The dequeued operations are stored in *rte_regex_ops* structures
> + * whose pointers are supplied in the *ops* array.
> + *
> + * The rte_regex_dequeue_burst() function returns the number of ops
> + * actually dequeued, which is the number of *rte_regex_ops* data structures
> + * effectively supplied into the *ops* array.
> + *
> + * A return value equal to *nb_ops* indicates that the queue contained
> + * at least *nb_ops* operations, and this is likely to signify that other
> + * processed operations remain in the device's output queue. Applications
> + * implementing a "retrieve as many processed operations as possible" policy
> + * can check this specific case and keep invoking the
> + * rte_regex_dequeue_burst() function until a value less than
> + * *nb_ops* is returned.
> + *
> + * The rte_regex_dequeue_burst() function does not provide any error
> + * notification to avoid the corresponding overhead.
> + *
> + * @param dev_id
> + *   The RegEx device identifier
> + * @param qp_id
> + *   The index of the queue pair from which to retrieve processed operations.
> + *   The value must be in the range [0, nb_queue_pairs - 1] previously
> + *   supplied to rte_regex_dev_configure().
> + * @param ops
> + *   The address of an array of pointers to *rte_regex_ops* structures that must
> + *   be large enough to store *nb_ops* pointers in it.
> + * @param nb_ops
> + *   The maximum number of operations to dequeue.
> + *
> + * @return
> + *   The number of operations actually dequeued, which is the number
> + *   of pointers to *rte_regex_ops* structures effectively supplied to the
> + *   *ops* array. If the return value is less than *nb_ops*, the remaining
> + *   ops at the end of *ops* are not consumed and the caller has to take care
> + *   of them.
> + */
> +uint16_t
> +rte_regex_dequeue_burst(uint8_t dev_id, uint16_t qp_id,
> +			struct rte_regex_ops **ops, uint16_t nb_ops);
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_REGEXDEV_H_ */
>
Wang Xiang Aug. 20, 2019, 1:54 a.m. UTC | #5
Thanks Jerin. Comments inline.

-----Original Message-----
From: Jerin Jacob Kollanukkaran [mailto:jerinj@marvell.com] 
Sent: Monday, August 19, 2019 11:09 AM
To: Thomas Monjalon <thomas@monjalon.net>; dev@dpdk.org
Cc: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>; Shahaf Shuler <shahafs@mellanox.com>; Hemant Agrawal <hemant.agrawal@nxp.com>; Opher Reviv <opher@mellanox.com>; Alex Rosenbaum <alexr@mellanox.com>; Dovrat Zifroni <dovrat@marvell.com>; Prasun Kapoor <pkapoor@marvell.com>; Nipun Gupta <nipun.gupta@nxp.com>; Wang, Xiang W <xiang.w.wang@intel.com>; Richardson, Bruce <bruce.richardson@intel.com>; Hong, Yang A <yang.a.hong@intel.com>; Chang, Harry <harry.chang@intel.com>; gu.jian1@zte.com.cn; shanjiangh@chinatelecom.cn; zhangy.yun@chinatelecom.cn; lixingfu@huachentel.com; wushuai@inspur.com; yuyingxia@yxlink.com; fanchenggang@sunyainfo.com; davidfgao@tencent.com; liuzhong1@chinaunicom.cn; zhaoyong11@huawei.com; oc@yunify.com; jim@netgate.com; Ni, Hongjun <hongjun.ni@intel.com>; j.bromhead@titan-ic.com; deri@ntop.org; fc@napatech.com; arthur.su@lionic.com; Guy Kaneti <guyk@marvell.com>; Smadar Fuks <smadarf@marvell.com>; Liron Himi <lironh@marvell.com>
Subject: RE: [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev subsystem

Reply to Xiang's queries in main thread:

Hi all,

Some questions regarding APIs. Could you please give more insights?

1) rte_regex_ops
      a) rsp_flags
      These two flags RTE_REGEX_OPS_RSP_PMI_SOJ_F and RTE_REGEX_OPS_RSP_PMI_EOJ_F are used for cross buffer scan.
      RTE_REGEX_OPS_RSP_PMI_EOJ_F tells whether we have a partial match at the end of current buffer after scan.
      What's the purpose of having RTE_REGEX_OPS_RSP_PMI_SOJ_F?

[Jerin] We need three states to represent a partial match across buffers: RTE_REGEX_OPS_RSP_PMI_SOJ_F
marks the start of the buffer, intermediate buffers carry no flag, and RTE_REGEX_OPS_RSP_PMI_EOJ_F marks the end of the buffer.
[Xiang] How could a user leverage these flags for matching? Suppose a large buffer is divided into multiple chunks. Will the scan quit early if RTE_REGEX_OPS_RSP_PMI_SOJ_F is not set after scanning the first chunk? Similarly, does RTE_REGEX_OPS_RSP_PMI_EOJ_F tell the user whether to stop matching further buffers after the last chunk has been scanned?

      RTE_REGEX_OPS_RSP_MAX_PREFIX_F: This looks like a definition for a specific hardware implementation. I am wondering what this PREFIX refers to:)?

[Jerin] Yes, it looks like it is for a hardware specific implementation. The rte_regex_dev_attr_set/get functions were introduced to keep this portable and
to allow new implementation specific fields to be added.
For example, if a rule is
/ABCDEF.*XYZ/, ABCD is considered the prefix, and EF.*XYZ is considered the factor. The prefix is a literal
string, while the factor can contain complex regular expression constructs. As a result, rule matching occurs in
two stages: prefix matching and factor matching.
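
A minimal sketch of the prefix/factor split described above, for readers who want to see it concretely. It is not taken from the RFC or from any device: demo_prefix_len() is a hypothetical helper, the metacharacter list is deliberately small, and the 4-byte prefix limit used in the example is an assumption inferred from "ABCD" being the prefix of /ABCDEF.*XYZ/ (a real device would presumably report its own limit, cf. RTE_REGEX_DEV_ATTR_MAX_PREFIX).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the RFC: return the length of the
 * literal prefix of a rule, capped at max_prefix bytes. Everything after
 * the prefix (rule + returned length) is the "factor".
 */
static size_t
demo_prefix_len(const char *rule, size_t max_prefix)
{
	/* Small, illustrative subset of PCRE metacharacters. */
	static const char meta[] = ".*+?()[]{}|^$\\";
	size_t n = 0;

	assert(rule != NULL);

	/* Consume plain literal bytes up to the assumed device limit. */
	while (rule[n] != '\0' && n < max_prefix &&
	       strchr(meta, rule[n]) == NULL)
		n++;

	/*
	 * A quantifier binds to the literal just before it, so that
	 * literal belongs to the factor and not to the prefix.
	 */
	if (n > 0 && rule[n] != '\0' && strchr("*+?{", rule[n]) != NULL)
		n--;

	return n;
}
```

With the assumed limit of 4, demo_prefix_len("ABCDEF.*XYZ", 4) yields 4, so the prefix is "ABCD" and the factor starts at "EF.*XYZ", which matches the example given above.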
 
      b)  user_id or user_ptr
      Under what kind of circumstances should an application pass value into these variables for enqueue and dequeue operations?

[Jerin] Just like rte_crypto_ops, struct rte_regex_ops is normally allocated from a mempool. On enqueue, the user can set user_id
if needed, in order to identify the op on dequeue. A use case could be storing a sequence number from the application's
point of view, or storing the pointer to the mbuf for which the pattern match was requested.
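
The sequence-number use case can be sketched as follows. This is only an illustration under stated assumptions: struct demo_regex_op is a stand-in that mirrors just the user_id/user_ptr union of struct rte_regex_ops, and the context table, its size and the demo_* helpers are hypothetical application code, not part of the RFC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the user_id/user_ptr union of struct rte_regex_ops. */
struct demo_regex_op {
	union {
		uint64_t user_id;
		void *user_ptr;
	};
};

#define DEMO_CTX_SZ 8 /* hypothetical limit on in-flight requests */

/* Application-side context remembered per in-flight request. */
struct demo_ctx {
	int in_use;
	const void *pkt; /* e.g. the mbuf that is being scanned */
};

static struct demo_ctx demo_ctx_tbl[DEMO_CTX_SZ];
static uint64_t demo_next_seq;

/* Before enqueue: stamp the op with a sequence number. */
static void
demo_tag_op(struct demo_regex_op *op, const void *pkt)
{
	uint64_t seq = demo_next_seq++;
	struct demo_ctx *c = &demo_ctx_tbl[seq % DEMO_CTX_SZ];

	assert(op != NULL && !c->in_use);
	c->in_use = 1;
	c->pkt = pkt;
	op->user_id = seq; /* the device must hand this back unmodified */
}

/* After dequeue: use the sequence number to find the packet again. */
static const void *
demo_untag_op(const struct demo_regex_op *op)
{
	struct demo_ctx *c;

	assert(op != NULL);
	c = &demo_ctx_tbl[op->user_id % DEMO_CTX_SZ];
	assert(c->in_use);
	c->in_use = 0;
	return c->pkt;
}
```

Because the lookup goes through user_id, responses may be dequeued in a different order than they were enqueued. If no side table is needed, the application could instead store the packet pointer directly in user_ptr.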
 

 2) rte_regex_match
      a) offset; /**< Starting Byte Position for matched rule. */ and  uint16_t len; /**< Length of match in bytes */
      Looks like the matching offset is defined as *starting matching offset* instead of *end matching offset*, e.g. report the offset of "a" instead of "c" for pattern "abc". 
      If so, this makes it hard to integrate software regex libraries such as Hyperscan and RE2 as they only report *end matching offset* without length of match. 
      Although Hyperscan has API for *starting matching offset*, it only delivers partial syntax support. So I think we have to define *end of matching offset* for software solutions.

[Jerin] I understand the tradeoffs of Hyperscan's HS_FLAG_SOM_LEFTMOST. I thought the application would always need the length of the match.
We will see how other HW implementations (from Mellanox etc.) handle this and try to abstract it; probably we can make it a function of what the user requests.
[Xiang] Yes, it would be good to make it per user request. At least from a Hyperscan user's point of view, start of match and match length are not mandatory.
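
To make the two conventions concrete, here is a small sketch of how an end-of-match offset could be derived from the (offset, len) pair in the RFC. struct demo_match is a stand-in for the offset and len fields of struct rte_regex_match, and the "exclusive end" convention (the offset just past the last matched byte) is an assumption of this sketch; the reverse direction is not possible without start-of-match support, which is the point raised above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the offset/len fields of struct rte_regex_match. */
struct demo_match {
	uint16_t offset; /* starting byte position of the match */
	uint16_t len;    /* length of the match in bytes */
};

/*
 * Exclusive end offset of a match: the position just past the last
 * matched byte. The sum is widened to 32 bits so that it cannot wrap
 * when offset + len exceeds UINT16_MAX.
 */
static uint32_t
demo_match_end(const struct demo_match *m)
{
	assert(m != NULL);
	return (uint32_t)m->offset + m->len;
}
```

For pattern "abc" scanned against the buffer "xxabc", a start-based report would be offset 2 and len 3, and demo_match_end() gives 5.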

3)  rte_regex_rule_db_update()
    Does this mean we can dynamically add or delete rules for an already generated database without recompile from scratch for hardware Regex implementation? 
    If so, this isn't possible for software solutions as they don't support dynamic database update and require recompile. 

[Jerin] Internally, rte_regex_rule_db_update() would call the recompile function for both HW and SW.
See rte_regex_dev_config::rule_db in rte_regex_dev_configure() for the precompiled rule database case.
[Xiang] OK, sounds like we have to save the original rule-set for the device in order to recompile. I see both ADD and REMOVE operators in rte_regex_rule.
For rules with the REMOVE operator, what is the expected behavior with respect to the old rule-set? Do we need to go through the old rule-set and remove the corresponding rules before recompiling?
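
One possible reading of the question above, sketched for a software driver that keeps the uncompiled rule-set around: apply the ADD/REMOVE updates to the saved set first, then recompile the result. Everything here is hypothetical. struct demo_rule only approximates struct rte_regex_rule, and the choices that ADD replaces a rule with the same (group_id, rule_id) and that REMOVE of an unknown rule is a no-op are assumptions, not behavior defined by the RFC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum demo_rule_op { DEMO_RULE_ADD, DEMO_RULE_REMOVE };

/* Approximation of struct rte_regex_rule for this sketch. */
struct demo_rule {
	enum demo_rule_op op;
	uint16_t group_id;
	uint32_t rule_id;
	const char *pcre;
};

/*
 * Apply nb_upd updates to the saved rule-set "set" (nb_set entries,
 * room for cap). Returns the new number of rules, or -1 when an ADD
 * does not fit (cf. -ENOSPC). The caller would recompile afterwards.
 */
static int
demo_ruleset_apply(struct demo_rule *set, size_t nb_set, size_t cap,
		   const struct demo_rule *upd, size_t nb_upd)
{
	size_t i, j;

	assert(set != NULL && upd != NULL && nb_set <= cap);

	for (i = 0; i < nb_upd; i++) {
		/* Find an existing rule with the same (group_id, rule_id). */
		for (j = 0; j < nb_set; j++)
			if (set[j].group_id == upd[i].group_id &&
			    set[j].rule_id == upd[i].rule_id)
				break;

		if (upd[i].op == DEMO_RULE_REMOVE) {
			if (j < nb_set)
				set[j] = set[--nb_set]; /* order not kept */
		} else if (j < nb_set) {
			set[j] = upd[i];        /* ADD replaces a duplicate */
		} else if (nb_set < cap) {
			set[nb_set++] = upd[i]; /* ADD appends a new rule */
		} else {
			return -1;              /* no room left */
		}
	}
	return (int)nb_set;
}
```

The linear search keeps the sketch short; a driver with a large rule-set would more likely index the saved rules by (group_id, rule_id).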

4) rte_regex_rule_db_import() and rte_regex_rule_db_export()
     What's the expected behavior for import and export operations? Will we create another copy of database when calling them? 

[Jerin] Whether it requires a copy or not is implementation defined. Marvell's HW implementation has a centralized rule database
per device.

Thanks,
Xiang

> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Thursday, August 15, 2019 5:04 PM
> To: dev@dpdk.org
> Cc: Jerin Jacob Kollanukkaran <jerinj@marvell.com>; Pavan Nikhilesh
> Bhagavatula <pbhagavatula@marvell.com>; Shahaf Shuler
> <shahafs@mellanox.com>; Hemant Agrawal <hemant.agrawal@nxp.com>;
> Opher Reviv <opher@mellanox.com>; Alex Rosenbaum
> <alexr@mellanox.com>; Dovrat Zifroni <dovrat@marvell.com>; Prasun
> Kapoor <pkapoor@marvell.com>; Nipun Gupta <nipun.gupta@nxp.com>;
> Wang, Xiang W <xiang.w.wang@intel.com>; Richardson, Bruce
> <bruce.richardson@intel.com>; yang.a.hong@intel.com;
> harry.chang@intel.com; gu.jian1@zte.com.cn; shanjiangh@chinatelecom.cn;
> zhangy.yun@chinatelecom.cn; lixingfu@huachentel.com;
> wushuai@inspur.com; yuyingxia@yxlink.com;
> fanchenggang@sunyainfo.com; davidfgao@tencent.com;
> liuzhong1@chinaunicom.cn; zhaoyong11@huawei.com; oc@yunify.com;
> jim@netgate.com; hongjun.ni@intel.com; j.bromhead@titan-ic.com;
> deri@ntop.org; fc@napatech.com; arthur.su@lionic.com
> Subject: Re: [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev
> subsystem
> 
> +Cc more
> 
> ------------
> 
> From: Jerin Jacob <jerinj@marvell.com>
> 
> Even though there are some vendors which offer Regex HW offload, due to
> lack of standard API, it is difficult for DPDK consumer to use them
> in a portable way.
> 
> This _RFC_ attempts to standardize the RegEx/DPI offload APIs for DPDK.
> 
> The Doxygen generated RFC API documentation available here:
> https://dreamy-noether-22777e.netlify.com/rte__regexdev_8h.html
> 
> This RFC is crafted based on SW Regex API frameworks such as libpcre and
> hyperscan and a few of the RegEx HW IPs which I am aware of.
> 
> RegEx pattern matching applications:
> • Next Generation Firewalls (NGFW)
> • Deep Packet and Flow Inspection (DPI)
> • Intrusion Prevention Systems (IPS)
> • DDoS Mitigation
> • Network Monitoring
> • Data Loss Prevention (DLP)
> • Smart NICs
> • Grammar based content processing
> • URL, spam and adware filtering
> • Advanced auditing and policing of user/application security policies
> • Financial data mining - parsing of streamed financial feeds
> 
> Request to review from HW and SW RegEx vendors and RegEx application users
> to have portable DPDK API for RegEx.
> 
> The API schematics are based on the existing cryptodev, eventdev and ethdev
> device APIs.
> 
> Signed-off-by: Jerin Jacob <jerinj@marvell.com>
> Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
> ---
> 
> RTE RegEx Device API
> --------------------
> 
> Defines RTE RegEx Device APIs for RegEx operations and its provisioning.
> 
> The RegEx Device API is composed of two parts:
> 
> - The application-oriented RegEx API that includes functions to setup
> a RegEx device (configure it, setup its queue pairs and start it),
> update the rule database and so on.
> 
> - The driver-oriented RegEx API that exports a function allowing
> a RegEx poll Mode Driver (PMD) to simultaneously register itself as
> a RegEx device driver.
> 
> RegEx device components and definitions:
> 
>     +-----------------+
>     |                 |
>     |                 o---------+    rte_regex_[en|de]queue_burst()
>     |   PCRE based    o------+  |               |
>     |  RegEx pattern  |      |  |  +--------+   |
>     | matching engine o------+--+--o        |   |    +------+
>     |                 |      |  |  | queue  |<==o===>|Core 0|
>     |                 o----+ |  |  | pair 0 |        |      |
>     |                 |    | |  |  +--------+        +------+
>     +-----------------+    | |  |
>            ^               | |  |  +--------+
>            |               | |  |  |        |        +------+
>            |               | +--+--o queue  |<======>|Core 1|
>        Rule|Database       |    |  | pair 1 |        |      |
>     +------+----------+    |    |  +--------+        +------+
>     |     Group 0     |    |    |
>     | +-------------+ |    |    |  +--------+        +------+
>     | | Rules 0..n  | |    |    |  |        |        |Core 2|
>     | +-------------+ |    |    +--o queue  |<======>|      |
>     |     Group 1     |    |       | pair 2 |        +------+
>     | +-------------+ |    |       +--------+
>     | | Rules 0..n  | |    |
>     | +-------------+ |    |       +--------+
>     |     Group 2     |    |       |        |        +------+
>     | +-------------+ |    |       | queue  |<======>|Core n|
>     | | Rules 0..n  | |    +-------o pair n |        |      |
>     | +-------------+ |            +--------+        +------+
>     |     Group n     |
>     | +-------------+ |<-------rte_regex_rule_db_update()
>     | | Rules 0..n  | |<-------rte_regex_rule_db_import()
>     | +-------------+ |------->rte_regex_rule_db_export()
>     +-----------------+
> 
> RegEx: A regular expression is a concise and flexible means for matching
> strings of text, such as particular characters, words, or patterns of
> characters. A common abbreviation for this is “RegEx”.
> 
> RegEx device: A hardware or software-based implementation of RegEx
> device API for PCRE based pattern matching syntax and semantics.
> 
> PCRE RegEx syntax and semantics specification:
> http://regexkit.sourceforge.net/Documentation/pcre/pcrepattern.html
> 
> RegEx queue pair: Each RegEx device should have one or more queue pairs to
> transmit a burst of pattern matching requests and receive a burst of
> pattern matching responses. The pattern matching request/response is
> embedded in the *rte_regex_ops* structure.
> 
> Rule: A pattern matching rule expressed in PCRE RegEx syntax along with
> Match ID and Group ID to identify the rule upon the match.
> 
> Rule database: The RegEx device accepts regular expressions and converts them
> into a compiled rule database that can then be used to scan data.
> Compilation allows the device to analyze the given pattern(s) and
> pre-determine how to scan for these patterns in an optimized fashion that
> would be far too expensive to compute at run-time. A rule database contains
> a set of rules that are compiled in device specific binary form.
> 
> Match ID or Rule ID: A unique identifier provided at the time of rule
> creation for the application to identify the rule upon match.
> 
> Group ID: Group of rules can be grouped under one group ID to enable
> rule isolation and effective pattern matching. A unique group identifier
> provided at the time of rule creation for the application to identify the
> rule upon match.
> 
> Scan: A pattern matching request through *enqueue* API.
> 
> It may be possible that a given RegEx device does not support all the features
> of PCRE. The application may probe unsupported features through
> struct rte_regex_dev_info::pcre_unsup_flags
> 
> By default, all the functions of the RegEx Device API exported by a PMD
> are lock-free functions which are assumed not to be invoked in parallel on
> different logical cores to work on the same target object. For instance,
> the dequeue function of a PMD cannot be invoked in parallel on two logical
> cores to operate on the same RegEx queue pair. Of course, this function
> can be invoked in parallel by different logical cores on different queue pairs.
> It is the responsibility of the upper level application to enforce this rule.
> 
> In all functions of the RegEx API, the RegEx device is
> designated by an integer >= 0 named the device identifier *dev_id*
> 
> At the RegEx driver level, RegEx devices are represented by a generic
> data structure of type *rte_regex_dev*.
> 
> RegEx devices are dynamically registered during the PCI/SoC device probing
> phase performed at EAL initialization time.
> When a RegEx device is being probed, a *rte_regex_dev* structure and
> a new device identifier are allocated for that device. Then, the
> regex_dev_init() function supplied by the RegEx driver matching the probed
> device is invoked to properly initialize the device.
> 
> The role of the device init function consists of resetting the hardware or
> software RegEx driver implementations.
> 
> If the device init operation is successful, the correspondence between
> the device identifier assigned to the new device and its associated
> *rte_regex_dev* structure is effectively registered.
> Otherwise, both the *rte_regex_dev* structure and the device identifier are
> freed.
> 
> The functions exported by the application RegEx API to setup a device
> designated by its device identifier must be invoked in the following order:
> - rte_regex_dev_configure()
> - rte_regex_queue_pair_setup()
> - rte_regex_dev_start()
> 
> Then, the application can invoke, in any order, the functions
> exported by the RegEx API to enqueue pattern matching jobs, dequeue pattern
> matching responses, get the stats, update the rule database,
> get/set device attributes and so on.
> 
> If the application wants to change the configuration (i.e. call
> rte_regex_dev_configure() or rte_regex_queue_pair_setup()), it must call
> rte_regex_dev_stop() first to stop the device and then do the reconfiguration
> before calling rte_regex_dev_start() again. The enqueue and dequeue
> functions should not be invoked when the device is stopped.
> 
> Finally, an application can close a RegEx device by invoking the
> rte_regex_dev_close() function.
> 
> Each function of the application RegEx API invokes a specific function
> of the PMD that controls the target device designated by its device
> identifier.
> 
> For this purpose, all device-specific functions of a RegEx driver are
> supplied through a set of pointers contained in a generic structure of type
> *regex_dev_ops*.
> The address of the *regex_dev_ops* structure is stored in the *rte_regex_dev*
> structure by the device init function of the RegEx driver, which is
> invoked during the PCI/SoC device probing phase, as explained earlier.
> 
> In other words, each function of the RegEx API simply retrieves the
> *rte_regex_dev* structure associated with the device identifier and
> performs an indirect invocation of the corresponding driver function
> supplied in the *regex_dev_ops* structure of the *rte_regex_dev*
> structure.
> 
> For performance reasons, the address of the fast-path functions of the
> RegEx driver is not contained in the *regex_dev_ops* structure.
> Instead, they are directly stored at the beginning of the *rte_regex_dev*
> structure to avoid an extra indirect memory access during their invocation.
> 
> RTE RegEx device drivers do not use interrupts for enqueue or dequeue
> operation. Instead, RegEx drivers export Poll-Mode enqueue and dequeue
> functions to applications.
> 
> The *enqueue* operation submits a burst of RegEx pattern matching requests
> to the RegEx device and the *dequeue* operation gets a burst of pattern
> matching responses for the ones submitted through the *enqueue* operation.
> 
> Typical application utilisation of the RegEx device API will follow the
> following programming flow.
> 
> - rte_regex_dev_configure()
> - rte_regex_queue_pair_setup()
> - rte_regex_rule_db_update() needs to be invoked if a precompiled rule database
> is not provided in rte_regex_dev_config::rule_db for rte_regex_dev_configure()
> and/or the application needs to update the rule database.
> - Create or reuse an existing mempool for *rte_regex_ops* objects.
> - rte_regex_dev_start()
> - rte_regex_enqueue_burst()
> - rte_regex_dequeue_burst()
> 
> ---
> 
> config/common_base                 |    5 +
> doc/api/doxy-api-index.md          |    1 +
> doc/api/doxy-api.conf.in           |    1 +
> lib/Makefile                       |    2 +
> lib/librte_regexdev/Makefile       |   23 +
> lib/librte_regexdev/rte_regexdev.c |    5 +
> lib/librte_regexdev/rte_regexdev.h | 1247 ++++++++++++++++++++++++++++
> 7 files changed, 1284 insertions(+)
> create mode 100644 lib/librte_regexdev/Makefile
> create mode 100644 lib/librte_regexdev/rte_regexdev.c
> create mode 100644 lib/librte_regexdev/rte_regexdev.h
> 
> diff --git a/config/common_base b/config/common_base
> index e406e7836..986093d6e 100644
> --- a/config/common_base
> +++ b/config/common_base
> @@ -746,6 +746,11 @@ CONFIG_RTE_LIBRTE_PMD_DPAA2_QDMA_RAWDEV=n
> #
> CONFIG_RTE_LIBRTE_PMD_IFPGA_RAWDEV=y
> 
> +#
> +# Compile regex device support
> +#
> +CONFIG_RTE_LIBRTE_REGEXDEV=y
> +
> #
> # Compile librte_ring
> #
> diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
> index 715248dd1..a0bc27ae4 100644
> --- a/doc/api/doxy-api-index.md
> +++ b/doc/api/doxy-api-index.md
> @@ -26,6 +26,7 @@ The public API headers are grouped by topics:
> [event_timer_adapter]    (@ref rte_event_timer_adapter.h),
> [event_crypto_adapter]   (@ref rte_event_crypto_adapter.h),
> [rawdev]             (@ref rte_rawdev.h),
> +  [regexdev]           (@ref rte_regexdev.h),
> [metrics]            (@ref rte_metrics.h),
> [bitrate]            (@ref rte_bitrate.h),
> [latency]            (@ref rte_latencystats.h),
> diff --git a/doc/api/doxy-api.conf.in b/doc/api/doxy-api.conf.in
> index b9896cb63..7adb821bb 100644
> --- a/doc/api/doxy-api.conf.in
> +++ b/doc/api/doxy-api.conf.in
> @@ -53,6 +53,7 @@ INPUT                   = @TOPDIR@/doc/api/doxy-api-index.md \
> @TOPDIR@/lib/librte_rawdev \
> @TOPDIR@/lib/librte_rcu \
> @TOPDIR@/lib/librte_reorder \
> +                          @TOPDIR@/lib/librte_regexdev \
> @TOPDIR@/lib/librte_ring \
> @TOPDIR@/lib/librte_sched \
> @TOPDIR@/lib/librte_security \
> diff --git a/lib/Makefile b/lib/Makefile
> index 791e0d991..57de9691a 100644
> --- a/lib/Makefile
> +++ b/lib/Makefile
> @@ -44,6 +44,8 @@ DEPDIRS-librte_eventdev := librte_eal librte_ring librte_ethdev librte_hash \
> librte_mempool librte_timer librte_cryptodev
> DIRS-$(CONFIG_RTE_LIBRTE_RAWDEV) += librte_rawdev
> DEPDIRS-librte_rawdev := librte_eal librte_ethdev
> +DIRS-$(CONFIG_RTE_LIBRTE_REGEXDEV) += librte_regexdev
> +DEPDIRS-librte_regexdev := librte_eal
> DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += librte_vhost
> DEPDIRS-librte_vhost := librte_eal librte_mempool librte_mbuf librte_ethdev \
> 			librte_net
> diff --git a/lib/librte_regexdev/Makefile b/lib/librte_regexdev/Makefile
> new file mode 100644
> index 000000000..723b4b28c
> --- /dev/null
> +++ b/lib/librte_regexdev/Makefile
> @@ -0,0 +1,23 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright(C) 2019 Marvell International Ltd.
> +#
> +
> +include $(RTE_SDK)/mk/rte.vars.mk
> +
> +# library name
> +LIB = librte_regexdev.a
> +
> +# library version
> +LIBABIVER := 1
> +
> +# build flags
> +CFLAGS += -O3
> +CFLAGS += $(WERROR_FLAGS)
> +
> +# library source files
> +SRCS-y += rte_regexdev.c
> +
> +# export include files
> +SYMLINK-y-include += rte_regexdev.h
> +
> +include $(RTE_SDK)/mk/rte.lib.mk
> diff --git a/lib/librte_regexdev/rte_regexdev.c b/lib/librte_regexdev/rte_regexdev.c
> new file mode 100644
> index 000000000..e5be0f29c
> --- /dev/null
> +++ b/lib/librte_regexdev/rte_regexdev.c
> @@ -0,0 +1,5 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(C) 2019 Marvell International Ltd.
> + */
> +
> +#include <rte_regexdev.h>
> diff --git a/lib/librte_regexdev/rte_regexdev.h b/lib/librte_regexdev/rte_regexdev.h
> new file mode 100644
> index 000000000..765da4aaa
> --- /dev/null
> +++ b/lib/librte_regexdev/rte_regexdev.h
> @@ -0,0 +1,1247 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(C) 2019 Marvell International Ltd.
> + */
> +
> +#ifndef _RTE_REGEXDEV_H_
> +#define _RTE_REGEXDEV_H_
> +
> +/**
> + * @file
> + *
> + * RTE RegEx Device API
> + *
> + * Defines RTE RegEx Device APIs for RegEx operations and its provisioning.
> + *
> + * The RegEx Device API is composed of two parts:
> + *
> + * - The application-oriented RegEx API that includes functions to setup
> + *   a RegEx device (configure it, setup its queue pairs and start it),
> + *   update the rule database and so on.
> + *
> + * - The driver-oriented RegEx API that exports a function allowing
> + *   a RegEx Poll Mode Driver (PMD) to simultaneously register itself as
> + *   a RegEx device driver.
> + *
> + * RegEx device components and definitions:
> + *
> + *     +-----------------+
> + *     |                 |
> + *     |                 o---------+    rte_regex_[en|de]queue_burst()
> + *     |   PCRE based    o------+  |               |
> + *     |  RegEx pattern  |      |  |  +--------+   |
> + *     | matching engine o------+--+--o        |   |    +------+
> + *     |                 |      |  |  | queue  |<==o===>|Core 0|
> + *     |                 o----+ |  |  | pair 0 |        |      |
> + *     |                 |    | |  |  +--------+        +------+
> + *     +-----------------+    | |  |
> + *            ^               | |  |  +--------+
> + *            |               | |  |  |        |        +------+
> + *            |               | +--+--o queue  |<======>|Core 1|
> + *        Rule|Database       |    |  | pair 1 |        |      |
> + *     +------+----------+    |    |  +--------+        +------+
> + *     |     Group 0     |    |    |
> + *     | +-------------+ |    |    |  +--------+        +------+
> + *     | | Rules 0..n  | |    |    |  |        |        |Core 2|
> + *     | +-------------+ |    |    +--o queue  |<======>|      |
> + *     |     Group 1     |    |       | pair 2 |        +------+
> + *     | +-------------+ |    |       +--------+
> + *     | | Rules 0..n  | |    |
> + *     | +-------------+ |    |       +--------+
> + *     |     Group 2     |    |       |        |        +------+
> + *     | +-------------+ |    |       | queue  |<======>|Core n|
> + *     | | Rules 0..n  | |    +-------o pair n |        |      |
> + *     | +-------------+ |            +--------+        +------+
> + *     |     Group n     |
> + *     | +-------------+ |<-------rte_regex_rule_db_update()
> + *     | | Rules 0..n  | |<-------rte_regex_rule_db_import()
> + *     | +-------------+ |------->rte_regex_rule_db_export()
> + *     +-----------------+
> + *
> + * RegEx: A regular expression is a concise and flexible means for matching
> + * strings of text, such as particular characters, words, or patterns of
> + * characters. A common abbreviation for this is “RegEx”.
> + *
> + * RegEx device: A hardware or software-based implementation of RegEx
> + * device API for PCRE based pattern matching syntax and semantics.
> + *
> + * PCRE RegEx syntax and semantics specification:
> + * http://regexkit.sourceforge.net/Documentation/pcre/pcrepattern.html
> + *
> + * RegEx queue pair: Each RegEx device should have one or more queue pairs to
> + * transmit a burst of pattern matching requests and receive a burst of
> + * pattern matching responses. The pattern matching request/response is
> + * embedded in the *rte_regex_ops* structure.
> + *
> + * Rule: A pattern matching rule expressed in PCRE RegEx syntax along with
> + * Match ID and Group ID to identify the rule upon the match.
> + *
> + * Rule database: The RegEx device accepts regular expressions and converts them
> + * into a compiled rule database that can then be used to scan data.
> + * Compilation allows the device to analyze the given pattern(s) and
> + * pre-determine how to scan for these patterns in an optimized fashion that
> + * would be far too expensive to compute at run-time. A rule database contains
> + * a set of rules that are compiled into a device-specific binary form.
> + *
> + * Match ID or Rule ID: A unique identifier provided at the time of rule
> + * creation for the application to identify the rule upon match.
> + *
> + * Group ID: Rules can be grouped under one group ID to enable
> + * rule isolation and effective pattern matching. A unique group identifier is
> + * provided at the time of rule creation for the application to identify the
> + * rule upon match.
> + *
> + * Scan: A pattern matching request through *enqueue* API.
> + *
> + * A given RegEx device may not support all the features of PCRE.
> + * The application may probe unsupported features through
> + * struct rte_regex_dev_info::pcre_unsup_flags.
> + *
> + * By default, all the functions of the RegEx Device API exported by a PMD
> + * are lock-free functions which assume they are not invoked in parallel on
> + * different logical cores to work on the same target object. For instance,
> + * the dequeue function of a PMD cannot be invoked in parallel on two logical
> + * cores to operate on the same RegEx queue pair. Of course, this function
> + * can be invoked in parallel by different logical cores on different queue
> + * pairs.
> + * It is the responsibility of the upper level application to enforce this rule.
> + *
> + * In all functions of the RegEx API, the RegEx device is
> + * designated by an integer >= 0 named the device identifier *dev_id*.
> + *
> + * At the RegEx driver level, RegEx devices are represented by a generic
> + * data structure of type *rte_regex_dev*.
> + *
> + * RegEx devices are dynamically registered during the PCI/SoC device probing
> + * phase performed at EAL initialization time.
> + * When a RegEx device is being probed, a *rte_regex_dev* structure and
> + * a new device identifier are allocated for that device. Then, the
> + * regex_dev_init() function supplied by the RegEx driver matching the probed
> + * device is invoked to properly initialize the device.
> + *
> + * The role of the device init function consists of resetting the hardware or
> + * software RegEx driver implementations.
> + *
> + * If the device init operation is successful, the correspondence between
> + * the device identifier assigned to the new device and its associated
> + * *rte_regex_dev* structure is effectively registered.
> + * Otherwise, both the *rte_regex_dev* structure and the device identifier are
> + * freed.
> + *
> + * The functions exported by the application RegEx API to setup a device
> + * designated by its device identifier must be invoked in the following order:
> + *     - rte_regex_dev_configure()
> + *     - rte_regex_queue_pair_setup()
> + *     - rte_regex_dev_start()
> + *
> + * Then, the application can invoke, in any order, the functions
> + * exported by the RegEx API to enqueue pattern matching jobs, dequeue pattern
> + * matching responses, get the stats, update the rule database,
> + * get/set device attributes and so on.
> + *
> + * If the application wants to change the configuration (i.e. call
> + * rte_regex_dev_configure() or rte_regex_queue_pair_setup()), it must call
> + * rte_regex_dev_stop() first to stop the device and then do the reconfiguration
> + * before calling rte_regex_dev_start() again. The enqueue and dequeue
> + * functions should not be invoked when the device is stopped.
> + *
> + * Finally, an application can close a RegEx device by invoking the
> + * rte_regex_dev_close() function.
> + *
> + * Each function of the application RegEx API invokes a specific function
> + * of the PMD that controls the target device designated by its device
> + * identifier.
> + *
> + * For this purpose, all device-specific functions of a RegEx driver are
> + * supplied through a set of pointers contained in a generic structure of type
> + * *regex_dev_ops*.
> + * The address of the *regex_dev_ops* structure is stored in the *rte_regex_dev*
> + * structure by the device init function of the RegEx driver, which is
> + * invoked during the PCI/SoC device probing phase, as explained earlier.
> + *
> + * In other words, each function of the RegEx API simply retrieves the
> + * *rte_regex_dev* structure associated with the device identifier and
> + * performs an indirect invocation of the corresponding driver function
> + * supplied in the *regex_dev_ops* structure of the *rte_regex_dev* structure.
> + *
> + * For performance reasons, the addresses of the fast-path functions of the
> + * RegEx driver are not contained in the *regex_dev_ops* structure.
> + * Instead, they are stored directly at the beginning of the *rte_regex_dev*
> + * structure to avoid an extra indirect memory access during their invocation.
> + *
> + * RTE RegEx device drivers do not use interrupts for enqueue or dequeue
> + * operations. Instead, RegEx drivers export Poll-Mode enqueue and dequeue
> + * functions to applications.
> + *
> + * The *enqueue* operation submits a burst of RegEx pattern matching requests
> + * to the RegEx device, and the *dequeue* operation retrieves a burst of
> + * pattern matching responses for the requests submitted through the
> + * *enqueue* operation.
> + *
> + * A typical application uses the RegEx device API in the following order:
> + *
> + * - rte_regex_dev_configure()
> + * - rte_regex_queue_pair_setup()
> + * - rte_regex_rule_db_update(): needed if a precompiled rule database was not
> + *   provided in rte_regex_dev_config::rule_db for rte_regex_dev_configure()
> + *   and/or the application needs to update the rule database.
> + * - Create or reuse an existing mempool for *rte_regex_ops* objects.
> + * - rte_regex_dev_start()
> + * - rte_regex_enqueue_burst()
> + * - rte_regex_dequeue_burst()
> + *
> + */
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#include <rte_common.h>
> +#include <rte_config.h>
> +#include <rte_dev.h>
> +#include <rte_errno.h>
> +#include <rte_memory.h>
> +
> +/**
> + * Get the total number of RegEx devices that have been successfully
> + * initialised.
> + *
> + * @return
> + *   The total number of usable RegEx devices.
> + */
> +uint8_t
> +rte_regex_dev_count(void);
> +
> +/**
> + * Get the device identifier for the named RegEx device.
> + *
> + * @param name
> + *   RegEx device name to select the RegEx device identifier.
> + *
> + * @return
> + *   Returns RegEx device identifier on success.
> + *   - <0: Failure to find named RegEx device.
> + */
> +int
> +rte_regex_dev_get_dev_id(const char *name);
> +
> +/* Enumerates RegEx device capabilities */
> +#define RTE_REGEX_DEV_CAPA_RUNTIME_COMPILATION_F (1ULL << 0)
> +/**< RegEx device supports compiling the rules at runtime, as opposed to
> + * loading only the pre-built rule database using
> + * struct rte_regex_dev_config::rule_db in rte_regex_dev_configure().
> + * @see struct rte_regex_dev_config::rule_db, rte_regex_dev_configure()
> + * @see struct rte_regex_dev_info::regex_dev_capa
> + */
> +
> +
> +/* Enumerates unsupported PCRE features for the RegEx device */
> +#define RTE_REGEX_DEV_PCRE_UNSUP_START_ANCHOR_F (1ULL << 0)
> +/**< RegEx device doesn't support PCRE Anchor to start of match flag.
> + * Example RegEx is '/\Gfoo\d/'. Here '\G' asserts position at the end of the
> + * previous match or the start of the string for the first match.
> + * This position will change each time the RegEx is applied to the subject
> + * string. If the RegEx is applied to 'foo1foo2Zfoo3' the first two matches will
> + * be successful for 'foo1foo2' and fail for 'Zfoo3'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_ATOMIC_GROUPING_F (1ULL << 1)
> +/**< RegEx device doesn't support PCRE Atomic grouping.
> + * Atomic groups are represented by '(?>)'. An atomic group is a group that,
> + * when the RegEx engine exits from it, automatically throws away all
> + * backtracking positions remembered by any tokens inside the group.
> + * Example RegEx is 'a(?>bc|b)c'. If the given subjects are 'abc' and 'abcc',
> + * then 'a(bc|b)c' matches both, whereas 'a(?>bc|b)c' matches only 'abcc'
> + * because atomic groups don't allow backtracking to 'b'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_BACKTRACKING_CTRL_F (1ULL << 2)
> +/**< RegEx device doesn't support PCRE backtracking control verbs.
> + * Some examples of backtracking verbs are (*COMMIT), (*ACCEPT), (*FAIL),
> + * (*SKIP), (*PRUNE).
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_CALLOUTS_F (1ULL << 3)
> +/**< RegEx device doesn't support PCRE callouts.
> + * PCRE supports calling an external function in between matches by using
> + * '(?C)'. Example RegEx is 'ABC(?C)D'. If the given subject is 'ABCD', the
> + * RegEx engine will parse 'ABC', perform a user-defined callout and return a
> + * successful match at 'D'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_BACKREFERENCE_F (1ULL << 4)
> +/**< RegEx device doesn't support PCRE backreference.
> + * Example RegEx is '(\2ABC|(GHI))+'. Here '\2' matches the same text as most
> + * recently matched by the 2nd capturing group, i.e. 'GHI'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_GREEDY_F (1ULL << 5)
> +/**< RegEx device doesn't support PCRE Greedy mode.
> + * For example, if the RegEx is 'AB\d*?' then '*?' represents zero or unlimited
> + * matches. In greedy mode the subject 'AB12345' will be matched completely,
> + * whereas in ungreedy mode 'AB' will be returned as the match.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_LOOKAROUND_ASRT_F (1ULL << 6)
> +/**< RegEx device doesn't support PCRE Lookaround assertions
> + * (Zero-width assertions). Example RegEx is '[a-z]+\d+(?=!{3,})'. If
> + * the given subject is 'dwad1234!', the RegEx engine doesn't report any
> + * matches because the assertion '(?=!{3,})' fails. The subject 'dwad123!!!'
> + * would return a successful match.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_MATCH_POINT_RST_F (1ULL << 7)
> +/**< RegEx device doesn't support PCRE match point reset directive.
> + * Example RegEx is '[a-z]+\K\d+'. If the subject is 'dwad123',
> + * then even though the entire subject matches, only '123'
> + * is reported as a match.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_NEWLINE_CONVENTIONS_F (1ULL << 8)
> +/**< RegEx device doesn't support PCRE newline convention.
> + * Newline conventions are represented as follows:
> + * (*CR)        carriage return
> + * (*LF)        linefeed
> + * (*CRLF)      carriage return, followed by linefeed
> + * (*ANYCRLF)   any of the three above
> + * (*ANY)       all Unicode newline sequences
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_NEWLINE_SEQ_F (1ULL << 9)
> +/**< RegEx device doesn't support PCRE newline sequence.
> + * The escape sequence '\R' will match any newline sequence.
> + * It is equivalent to: '(?>\r\n|\n|\x0b|\f|\r|\x85)'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_POSSESSIVE_QUALIFIERS_F (1ULL << 10)
> +/**< RegEx device doesn't support PCRE possessive quantifiers.
> + * Example RegEx possessive quantifiers are '*+', '++', '?+', '{m,n}+'.
> + * A possessive quantifier repeats the token as many times as possible and
> + * does not give up matches as the engine backtracks. With a possessive
> + * quantifier, the deal is all or nothing.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_SUBROUTINE_REFERENCES_F (1ULL << 11)
> +/**< RegEx device doesn't support PCRE Subroutine references.
> + * PCRE Subroutine references allow for sub patterns to be assessed
> + * as part of the RegEx. Example RegEx is '(foo|fuzz)\g<1>+bar' matches the
> + * pattern 'foofoofuzzfoofuzzbar'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_UTF_8_F (1ULL << 12)
> +/**< RegEx device doesn't support UTF-8 character encoding.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_UTF_16_F (1ULL << 13)
> +/**< RegEx device doesn't support UTF-16 character encoding.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_UTF_32_F (1ULL << 14)
> +/**< RegEx device doesn't support UTF-32 character encoding.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_WORD_BOUNDARY_F (1ULL << 15)
> +/**< RegEx device doesn't support word boundaries.
> + * The meta character '\b' represents word boundary anchor.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_FORWARD_REFERENCES_F (1ULL << 16)
> +/**< RegEx device doesn't support Forward references.
> + * Forward references allow you to use a back reference to a group that appears
> + * later in the RegEx. Example RegEx is '(\3ABC|(DEF|(GHI)))+' matches the
> + * following string 'GHIGHIABCDEF'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
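
Since pcre_unsup_flags is a bitmask of missing features, an application
could test a rule set against it with a single AND. A minimal sketch
follows; the three flag values are copied from the hunk above, while
rules_fit_device() and the idea of tracking a per-rule-set "needed features"
mask are hypothetical, not part of the RFC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the patch above. */
#define RTE_REGEX_DEV_PCRE_UNSUP_ATOMIC_GROUPING_F (1ULL << 1)
#define RTE_REGEX_DEV_PCRE_UNSUP_BACKREFERENCE_F (1ULL << 4)
#define RTE_REGEX_DEV_PCRE_UNSUP_LOOKAROUND_ASRT_F (1ULL << 6)

/*
 * Hypothetical helper: returns true when none of the PCRE features the
 * application's rules rely on is reported as unsupported by the device.
 * Both arguments use the RTE_REGEX_DEV_PCRE_UNSUP_* bit positions.
 */
static bool
rules_fit_device(uint64_t pcre_unsup_flags, uint64_t needed_features)
{
	return (pcre_unsup_flags & needed_features) == 0;
}
```

In a real application the first argument would come from
struct rte_regex_dev_info::pcre_unsup_flags after rte_regex_dev_info_get().
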
> +
> +/* Enumerates PCRE rule flags */
> +#define RTE_REGEX_PCRE_RULE_ALLOW_EMPTY_F (1ULL << 0)
> +/**< When this flag is set, patterns that can match against an empty string,
> + * such as '.*', are allowed.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_ANCHORED_F (1ULL << 1)
> +/**< When this flag is set, the pattern is forced to be "anchored", that is, it
> + * is constrained to match only at the first matching point in the string that
> + * is being searched. Similar to '^' and represented by \A.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_CASELESS_F (1ULL << 2)
> +/**< When this flag is set, letters in the pattern match both upper and lower
> + * case letters in the subject.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_DOTALL_F (1ULL << 3)
> +/**< When this flag is set, a dot metacharacter in the pattern matches any
> + * character, including one that indicates a newline.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_DUPNAMES_F (1ULL << 4)
> +/**< When this flag is set, names used to identify capture groups need not be
> + * unique.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_EXTENDED_F (1ULL << 5)
> +/**< When this flag is set, most white space characters in the pattern are
> + * totally ignored except when escaped or inside a character class.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_MATCH_UNSET_BACKREF_F (1ULL << 6)
> +/**< When this flag is set, a backreference to an unset capture group matches an
> + * empty string.
> + * @see RTE_REGEX_DEV_PCRE_UNSUP_FORWARD_REFERENCES_F
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_MULTILINE_F (1ULL << 7)
> +/**< When this flag is set, the '^' and '$' constructs match immediately
> + * following or immediately before internal newlines in the subject string,
> + * respectively, as well as at the very start and end.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_NO_AUTO_CAPTURE_F (1ULL << 8)
> +/**< When this flag is set, it disables the use of numbered capturing
> + * parentheses in the pattern. References to capture groups (backreferences or
> + * recursion/subroutine calls) may only refer to named groups, though the
> + * reference can be by name or by number.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_UCP_F (1ULL << 9)
> +/**< By default, only ASCII characters are recognized. When this flag is set,
> + * Unicode properties are used instead to classify characters.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_UNGREEDY_F (1ULL << 10)
> +/**< When this flag is set, the "greediness" of the quantifiers is inverted
> + * so that they are not greedy by default, but become greedy if followed by
> + * '?'.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_UTF_F (1ULL << 11)
> +/**< When this flag is set, RegEx engine has to regard both the pattern and the
> + * subject strings that are subsequently processed as strings of UTF characters
> + * instead of single-code-unit strings.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_NEVER_BACKSLASH_C_F (1ULL << 12)
> +/**< This flag locks out the use of '\C' in the pattern that is being compiled.
> + * This escape matches one data unit, even in UTF mode, which can cause
> + * unpredictable behavior in UTF-8 or UTF-16 modes, because it may leave the
> + * current matching point in the middle of a multi-code-unit character.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +
> +/**
> + * RegEx device information
> + */
> +struct rte_regex_dev_info {
> +	const char *driver_name; /**< RegEx driver name */
> +	struct rte_device *dev;	/**< Device information */
> +	uint8_t max_matches;
> +	/**< Maximum matches per scan supported by this device */
> +	uint16_t max_queue_pairs;
> +	/**< Maximum queue pairs supported by this device */
> +	uint16_t max_payload_size;
> +	/**< Maximum payload size for a pattern match request or scan.
> +	 * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
> +	 */
> +	uint16_t max_rules_per_group;
> +	/**< Maximum rules supported per group by this device */
> +	uint16_t max_groups;
> +	/**< Maximum group supported by this device */
> +	uint32_t regex_dev_capa;
> +	/**< RegEx device capabilities. @see RTE_REGEX_DEV_CAPA_* */
> +	uint64_t rule_flags;
> +	/**< Supported compiler rule flags.
> +	 * @see RTE_REGEX_PCRE_RULE_*, struct rte_regex_rule::rule_flags
> +	 */
> +	uint64_t pcre_unsup_flags;
> +	/**< Unsupported PCRE features for this RegEx device.
> +	 * @see RTE_REGEX_DEV_PCRE_UNSUP_*
> +	 */
> +};
> +
> +/**
> + * Retrieve the contextual information of a RegEx device.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + *
> + * @param[out] dev_info
> + *   A pointer to a structure of type *rte_regex_dev_info* to be filled with the
> + *   contextual information of the device.
> + *
> + * @return
> + *   - 0: Success, driver updates the contextual information of the RegEx device
> + *   - <0: Error code returned by the driver info get function.
> + *
> + */
> +int
> +rte_regex_dev_info_get(uint8_t dev_id, struct rte_regex_dev_info *dev_info);
> +
> +/* Enumerates RegEx device configuration flags */
> +#define RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F (1ULL << 0)
> +/**< Cross buffer scan refers to the ability to detect
> + * matches that occur across buffer boundaries, where the buffers are related
> + * to each other in some way. Enable this flag when scanning a payload larger
> + * than struct rte_regex_dev_info::max_payload_size and/or when
> + * matches can be present across scan buffer boundaries.
> + *
> + * @see struct rte_regex_dev_info::max_payload_size
> + * @see struct rte_regex_dev_config::dev_cfg_flags, rte_regex_dev_configure()
> + * @see RTE_REGEX_OPS_RSP_PMI_SOJ_F
> + * @see RTE_REGEX_OPS_RSP_PMI_EOJ_F
> + */
> +
> +/** RegEx device configuration structure */
> +struct rte_regex_dev_config {
> +	uint8_t nb_max_matches;
> +	/**< Maximum matches per scan configured on this device.
> +	 * This value cannot exceed the *max_matches*
> +	 * previously provided in rte_regex_dev_info_get().
> +	 * The value 0 is allowed, in which case the value 1 is used.
> +	 * @see struct rte_regex_dev_info::max_matches
> +	 */
> +	uint16_t nb_queue_pairs;
> +	/**< Number of RegEx queue pairs to configure on this device.
> +	 * This value cannot exceed the *max_queue_pairs* previously
> +	 * provided in rte_regex_dev_info_get().
> +	 * @see struct rte_regex_dev_info::max_queue_pairs
> +	 */
> +	uint16_t nb_rules_per_group;
> +	/**< Number of rules per group to configure on this device.
> +	 * This value cannot exceed the *max_rules_per_group*
> +	 * previously provided in rte_regex_dev_info_get().
> +	 * The value 0 is allowed, in which case
> +	 * struct rte_regex_dev_info::max_rules_per_group is used.
> +	 * @see struct rte_regex_dev_info::max_rules_per_group
> +	 */
> +	uint16_t nb_groups;
> +	/**< Number of groups to configure on this device.
> +	 * This value cannot exceed the *max_groups*
> +	 * previously provided in rte_regex_dev_info_get().
> +	 * @see struct rte_regex_dev_info::max_groups
> +	 */
> +	const char *rule_db;
> +	/**< Import initial set of prebuilt rule database on this device.
> +	 * The value NULL is allowed, in which case the device will not
> +	 * be configured with a prebuilt rule database. The application may
> +	 * use the rte_regex_rule_db_update() or rte_regex_rule_db_import()
> +	 * API to update or import the rule database after
> +	 * rte_regex_dev_configure().
> +	 * @see rte_regex_rule_db_update(), rte_regex_rule_db_import()
> +	 */
> +	uint32_t rule_db_len;
> +	/**< Length of *rule_db* buffer. */
> +	uint32_t dev_cfg_flags;
> +	/**< RegEx device configuration flags, See RTE_REGEX_DEV_CFG_* */
> +};
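
The limits and zero-value defaults documented for this structure could be
applied along the following lines. This is a sketch only: sketch_info and
sketch_config are reduced copies of the RFC structures with just the fields
used here, and sketch_normalize_config() is a hypothetical helper, not a
function the patch provides.

```c
#include <assert.h>
#include <stdint.h>

/* Reduced copy of struct rte_regex_dev_info (limits only). */
struct sketch_info {
	uint8_t max_matches;
	uint16_t max_queue_pairs;
	uint16_t max_rules_per_group;
	uint16_t max_groups;
};

/* Reduced copy of struct rte_regex_dev_config (counts only). */
struct sketch_config {
	uint8_t nb_max_matches;
	uint16_t nb_queue_pairs;
	uint16_t nb_rules_per_group;
	uint16_t nb_groups;
};

/*
 * Hypothetical helper: reject any count above the device limit, then
 * apply the documented defaults for the two fields where 0 is allowed.
 * Returns 0 on success, -1 if a limit is exceeded.
 */
static int
sketch_normalize_config(const struct sketch_info *info,
			struct sketch_config *cfg)
{
	if (cfg->nb_max_matches > info->max_matches ||
	    cfg->nb_queue_pairs > info->max_queue_pairs ||
	    cfg->nb_rules_per_group > info->max_rules_per_group ||
	    cfg->nb_groups > info->max_groups)
		return -1;

	if (cfg->nb_max_matches == 0)
		cfg->nb_max_matches = 1; /* documented default */
	if (cfg->nb_rules_per_group == 0)
		cfg->nb_rules_per_group = info->max_rules_per_group;
	return 0;
}
```
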
> +
> +/**
> + * Configure a RegEx device.
> + *
> + * This function must be invoked first before any other function in the
> + * API. This function can also be re-invoked when a device is in the
> + * stopped state.
> + *
> + * The caller may use rte_regex_dev_info_get() to get the capabilities and
> + * resources available for this RegEx device.
> + *
> + * @param dev_id
> + *   The identifier of the device to configure.
> + * @param cfg
> + *   The RegEx device configuration structure.
> + *
> + * @return
> + *   - 0: Success, device configured.
> + *   - <0: Error code returned by the driver configuration function.
> + */
> +int
> +rte_regex_dev_configure(uint8_t dev_id, const struct rte_regex_dev_config *cfg);
> +
> +/* Enumerates RegEx queue pair configuration flags */
> +#define RTE_REGEX_QUEUE_PAIR_CFG_OOS_F (1ULL << 0)
> +/**< Out of order scan. If not set, a scan must retire after previously issued
> + * in-order scans to this queue pair. If set, this scan can be retired as soon
> + * as the device returns completion. The application should not set the out of
> + * order scan flag if it needs to maintain the ingress order of scan requests.
> + *
> + * @see struct rte_regex_qp_conf::qp_conf_flags, rte_regex_queue_pair_setup()
> + */
> +
> +struct rte_regex_ops;
> +typedef void (*regexdev_stop_flush_t)(uint8_t dev_id, uint16_t qp_id,
> +				      struct rte_regex_ops *op);
> +/**< Callback function called during rte_regex_dev_stop(), invoked once per
> + * flushed RegEx op.
> + */
> +
> +/** RegEx queue pair configuration structure */
> +struct rte_regex_qp_conf {
> +	uint32_t qp_conf_flags;
> +	/**< Queue pair config flags, See RTE_REGEX_QUEUE_PAIR_CFG_* */
> +	uint16_t nb_desc;
> +	/**< The number of descriptors to allocate for this queue pair. */
> +	regexdev_stop_flush_t cb;
> +	/**< Callback function called during rte_regex_dev_stop(), invoked
> +	 * once per flushed regex op. Value NULL is allowed, in which case
> +	 * callback will not be invoked. This function can be used to properly
> +	 * dispose of outstanding regex ops from response queue,
> +	 * for example ops containing memory pointers.
> +	 * @see rte_regex_dev_stop()
> +	 */
> +};
> +
> +/**
> + * Allocate and set up a RegEx queue pair for a RegEx device.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param queue_pair_id
> + *   The index of the RegEx queue pair to set up. The value must be in the range
> + *   [0, nb_queue_pairs - 1] previously supplied to rte_regex_dev_configure().
> + * @param qp_conf
> + *   The pointer to the configuration data to be used for the RegEx queue pair.
> + *   NULL is allowed, in which case the default configuration is used.
> + *
> + * @return
> + *   - 0: Success, RegEx queue pair correctly set up.
> + *   - <0: RegEx queue configuration failed
> + */
> +int
> +rte_regex_queue_pair_setup(uint8_t dev_id, uint8_t queue_pair_id,
> +			   const struct rte_regex_qp_conf *qp_conf);
> +
> +/**
> + * Start a RegEx device.
> + *
> + * The device start step is the last one and consists of setting the RegEx
> + * queues to start accepting the pattern matching scan requests.
> + *
> + * On success, all basic functions exported by the API (RegEx enqueue,
> + * RegEx dequeue and so on) can be invoked.
> + *
> + * @param dev_id
> + *   RegEx device identifier
> + * @return
> + *   - 0: Success, device started.
> + *   - <0: Device start failed.
> + */
> +int
> +rte_regex_dev_start(uint8_t dev_id);
> +
> +/**
> + * Stop a RegEx device.
> + *
> + * The device can be restarted with a call to rte_regex_dev_start().
> + *
> + * This function causes all queued response regex ops to be drained from the
> + * response queue. While draining ops out of the device,
> + * struct rte_regex_qp_conf::cb will be invoked for each op.
> + *
> + * @param dev_id
> + *   RegEx device identifier.
> + *
> + * @see struct rte_regex_qp_conf::cb, rte_regex_queue_pair_setup()
> + */
> +void
> +rte_regex_dev_stop(uint8_t dev_id);
> +
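
To check my reading of the flush callback, here is a minimal standalone sketch. It does not use the RFC header: the trimmed `struct rte_regex_ops`, `mock_dev_stop()` and `flushed_ops` are stand-ins I made up, and how a PMD actually walks its pending ops is an assumption.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Trimmed stand-in for the RFC op; only the field this sketch needs. */
struct rte_regex_ops {
	void *user_ptr;
};

/* Same callback signature as in the RFC. */
typedef void (*regexdev_stop_flush_t)(uint8_t dev_id, uint16_t qp_id,
				      struct rte_regex_ops *op);

/* Hypothetical application counter of ops released during stop. */
static unsigned int flushed_ops;

/* Flush callback: release the per-op context the application attached. */
static void
app_stop_flush_cb(uint8_t dev_id, uint16_t qp_id, struct rte_regex_ops *op)
{
	(void)dev_id;
	(void)qp_id;
	assert(op != NULL);
	free(op->user_ptr);
	op->user_ptr = NULL;
	flushed_ops++;
}

/* Mock of what a PMD stop handler might do (assumption, not RFC code). */
static void
mock_dev_stop(uint8_t dev_id, uint16_t qp_id, regexdev_stop_flush_t cb,
	      struct rte_regex_ops **pending, uint16_t nb_pending)
{
	uint16_t i;

	for (i = 0; i < nb_pending; i++) {
		/* A NULL callback is allowed: the op is simply dropped. */
		if (cb != NULL)
			cb(dev_id, qp_id, pending[i]);
	}
}
```

With a NULL callback the ops are dropped silently, so an application that hangs allocations off `user_ptr` would probably want to register one.
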
> +/**
> + * Close a RegEx device. The device cannot be restarted!
> + *
> + * @param dev_id
> + *   RegEx device identifier
> + *
> + * @return
> + *  - 0: Device closed successfully.
> + *  - <0: Failed to close the device.
> + */
> +int
> +rte_regex_dev_close(uint8_t dev_id);
> +
> +/* Device get/set attributes */
> +
> +/** Enumerates RegEx device attribute identifier */
> +enum rte_regex_dev_attr_id {
> +	RTE_REGEX_DEV_ATTR_SOCKET_ID,
> +	/**< The NUMA socket id to which the device is connected or
> +	 * a default of zero if the socket could not be determined.
> +	 * datatype: *int*
> +	 * operation: *get*
> +	 */
> +	RTE_REGEX_DEV_ATTR_MAX_MATCHES,
> +	/**< Maximum number of matches per scan.
> +	 * datatype: *uint8_t*
> +	 * operation: *get* and *set*
> +	 *
> +	 * @see RTE_REGEX_OPS_RSP_MAX_MATCH_F
> +	 */
> +	RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT,
> +	/**< Upper bound scan time in ns.
> +	 * datatype: *uint16_t*
> +	 * operation: *get* and *set*
> +	 *
> +	 * @see RTE_REGEX_OPS_RSP_MAX_SCAN_TIMEOUT_F
> +	 */
> +	RTE_REGEX_DEV_ATTR_MAX_PREFIX,
> +	/**< Maximum number of prefixes detected per scan.
> +	 * This would be useful for denial of service detection.
> +	 * datatype: *uint16_t*
> +	 * operation: *get* and *set*
> +	 *
> +	 * @see RTE_REGEX_OPS_RSP_MAX_PREFIX_F
> +	 */
> +};
> +
> +/**
> + * Get an attribute from a RegEx device.
> + *
> + * @param dev_id RegEx device identifier
> + * @param attr_id The attribute ID to retrieve
> + * @param[out] attr_value A pointer that will be filled in with the attribute
> + *             value if successful.
> + *
> + * @return
> + *   - 0: Successfully retrieved attribute value.
> + *   - -EINVAL: Invalid device or *attr_id* provided, or *attr_value* is NULL.
> + *   - -ENOTSUP: if the device doesn't support specific *attr_id*.
> + */
> +int
> +rte_regex_dev_attr_get(uint8_t dev_id, enum rte_regex_dev_attr_id attr_id,
> +		       void *attr_value);
> +
> +/**
> + * Set an attribute on a RegEx device.
> + *
> + * @param dev_id RegEx device identifier
> + * @param attr_id The attribute ID to set
> + * @param attr_value Pointer to the attribute value supplied by the
> + *                   application
> + *
> + * @return
> + *   - 0: Successfully applied the attribute value.
> + *   - -EINVAL: Invalid device or *attr_id* provided, or *attr_value* is NULL.
> + *   - -ENOTSUP: if the device doesn't support specific *attr_id*.
> + */
> +int
> +rte_regex_dev_attr_set(uint8_t dev_id, enum rte_regex_dev_attr_id attr_id,
> +		       const void *attr_value);
> +
> +/* Rule related APIs */
> +/** Enumerates RegEx rule operation */
> +enum rte_regex_rule_op {
> +	RTE_REGEX_RULE_OP_ADD,
> +	/**< Add RegEx rule to rule database */
> +	RTE_REGEX_RULE_OP_REMOVE
> +	/**< Remove RegEx rule from rule database */
> +};
> +
> +/** Structure to hold a RegEx rule attributes */
> +struct rte_regex_rule {
> +	enum rte_regex_rule_op op;
> +	/**< OP type of the rule, either OP_ADD or OP_REMOVE */
> +	uint16_t group_id;
> +	/**< Group identifier to which the rule belongs. */
> +	uint32_t rule_id;
> +	/**< Rule identifier which is returned on successful match. */
> +	const char *pcre_rule;
> +	/**< Buffer to hold the PCRE rule. */
> +	uint16_t pcre_rule_len;
> +	/**< Length of the PCRE rule */
> +	uint64_t rule_flags;
> +	/**< PCRE rule flags. Supported device specific PCRE rules enumerated
> +	 * in struct rte_regex_dev_info::rule_flags. For successful rule
> +	 * database update, application needs to provide only supported
> +	 * rule flags.
> +	 * @see RTE_REGEX_PCRE_RULE_*, struct rte_regex_dev_info::rule_flags
> +	 */
> +};
> +
> +/**
> + * Update the rule database of a RegEx device.
> + *
> + * @param dev_id RegEx device identifier
> + * @param rules
> + *   Points to an array of *nb_rules* objects of type *rte_regex_rule* structure
> + *   which contain the regex rules attributes to be updated in rule database.
> + * @param nb_rules
> + *   The number of PCRE rules to update the rule database.
> + *
> + * @return
> + *   The number of regex rules actually updated on the regex device's rule
> + *   database. The return value can be less than the value of the *nb_rules*
> + *   parameter when the regex device fails to update the rule database or
> + *   if invalid parameters are specified in a *rte_regex_rule*.
> + *   If the return value is less than *nb_rules*, the remaining PCRE rules
> + *   at the end of *rules* are not consumed and the caller has to take
> + *   care of them and rte_errno is set accordingly.
> + *   Possible errno values include:
> + *   - -EINVAL:  Invalid device ID or rules is NULL
> + *   - -ENOTSUP: The last processed rule is not supported on this device.
> + *   - -ENOSPC: No space available in rule database.
> + *
> + * @see rte_regex_rule_db_import(), rte_regex_rule_db_export()
> + */
> +uint16_t
> +rte_regex_rule_db_update(uint8_t dev_id, const struct rte_regex_rule *rules,
> +			 uint16_t nb_rules);
> +
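
The partial-return convention above leaves the retry policy to the caller. A minimal sketch of one possible policy follows (skip a rule the device rejects as unsupported, then retry the rest). Everything here is a stand-in: `mock_rule_db_update()`, `mock_errno` (in place of rte_errno), `MOCK_SUPPORTED_FLAGS` and the trimmed rule structure are mine, not part of the RFC.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Trimmed local copy of the RFC rule structure (assumption). */
struct rte_regex_rule {
	uint32_t rule_id;
	uint64_t rule_flags;
};

static int mock_errno;               /* stands in for rte_errno */
#define MOCK_SUPPORTED_FLAGS 0x3ULL  /* hypothetical device capability */

/* Mock using the RFC return convention: number of rules consumed. */
static uint16_t
mock_rule_db_update(uint8_t dev_id, const struct rte_regex_rule *rules,
		    uint16_t nb_rules)
{
	uint16_t i;

	(void)dev_id;
	for (i = 0; i < nb_rules; i++) {
		if (rules[i].rule_flags & ~MOCK_SUPPORTED_FLAGS) {
			mock_errno = -ENOTSUP; /* sign as listed in the RFC */
			break;
		}
	}
	return i;
}

/*
 * Apply all rules, skipping those rejected as unsupported.
 * Returns the number of rules applied; *nb_skipped counts rejected rules.
 */
static uint16_t
app_apply_rules(uint8_t dev_id, const struct rte_regex_rule *rules,
		uint16_t nb_rules, uint16_t *nb_skipped)
{
	uint16_t done = 0, applied = 0;

	*nb_skipped = 0;
	while (done < nb_rules) {
		uint16_t n = mock_rule_db_update(dev_id, &rules[done],
						 (uint16_t)(nb_rules - done));

		applied += n;
		done += n;
		if (done == nb_rules)
			break;
		if (mock_errno != -ENOTSUP)
			break; /* e.g. no space left: give up */
		done++; /* skip the offending rule, retry the remainder */
		(*nb_skipped)++;
	}
	return applied;
}
```

The sketch assumes the rule that stopped the update is the one at index "return value", which is how I read "the last processed rule" in the -ENOTSUP description.
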
> +/**
> + * Import a prebuilt rule database from a buffer to a RegEx device.
> + *
> + * @param dev_id RegEx device identifier
> + * @param rule_db
> + *   Points to prebuilt rule database.
> + * @param rule_db_len
> + *   Length of the rule database.
> + *
> + * @return
> + *   - 0: Successfully updated the prebuilt rule database.
> + *   - -EINVAL:  Invalid device ID or rule_db is NULL
> + *   - -ENOTSUP: Rule database import is not supported on this device.
> + *   - -ENOSPC: No space available in rule database.
> + *
> + * @see rte_regex_rule_db_update(), rte_regex_rule_db_export()
> + */
> +int
> +rte_regex_rule_db_import(uint8_t dev_id, const char *rule_db,
> +			 uint32_t rule_db_len);
> +
> +/**
> + * Export the prebuilt rule database from a RegEx device to the buffer.
> + *
> + * @param dev_id RegEx device identifier
> + * @param[out] rule_db
> + *   Block of memory to insert the rule database. Must be at least size in
> + *   capacity. If set to NULL, function returns required capacity.
> + *
> + * @return
> + *   - 0: Successfully exported the prebuilt rule database.
> + *   - size: If rule_db set to NULL then required capacity for *rule_db*
> + *   - -EINVAL:  Invalid device ID
> + *   - -ENOTSUP: Rule database export is not supported on this device.
> + *
> + * @see rte_regex_rule_db_update(), rte_regex_rule_db_import()
> + */
> +int
> +rte_regex_rule_db_export(uint8_t dev_id, char *rule_db);
> +
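
For the export path, the doc comment implies a two-call pattern: query the size with NULL, then export into a buffer. A minimal sketch, assuming exactly the return convention documented above; `mock_rule_db_export()`, `mock_db` and `app_export_db()` are hypothetical names, and the byte values are arbitrary.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical compiled database held by a mock device. */
static const char mock_db[] = { 0x52, 0x58, 0x01, 0x00, 0x2a };

/* Mock following the RFC convention: NULL buffer returns required size. */
static int
mock_rule_db_export(uint8_t dev_id, char *rule_db)
{
	(void)dev_id;
	if (rule_db == NULL)
		return (int)sizeof(mock_db);
	memcpy(rule_db, mock_db, sizeof(mock_db));
	return 0;
}

/* Query the size, allocate, then export. The caller frees *out. */
static int
app_export_db(uint8_t dev_id, char **out, size_t *out_len)
{
	int size = mock_rule_db_export(dev_id, NULL);
	char *buf;
	int ret;

	assert(out != NULL && out_len != NULL);
	if (size < 0)
		return size;
	buf = malloc((size_t)size);
	if (buf == NULL)
		return -ENOMEM;
	ret = mock_rule_db_export(dev_id, buf);
	if (ret < 0) {
		free(buf);
		return ret;
	}
	*out = buf;
	*out_len = (size_t)size;
	return 0;
}
```

Since the export call takes no length argument, this only works if the database cannot grow between the two calls. The size is also limited to what an int can hold, while the import side takes a uint32_t length.
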
> +/* Extended statistics */
> +/** Maximum name length for extended statistics counters */
> +#define RTE_REGEX_DEV_XSTATS_NAME_SIZE 64
> +
> +/**
> + * A name-key lookup element for extended statistics.
> + *
> + * This structure is used to map between names and ID numbers
> + * for extended RegEx device statistics.
> + */
> +struct rte_regex_dev_xstats_map {
> +	uint16_t id;
> +	/**< xstat identifier */
> +	char name[RTE_REGEX_DEV_XSTATS_NAME_SIZE];
> +	/**< xstat name */
> +};
> +
> +/**
> + * Retrieve names of extended statistics of a regex device.
> + *
> + * @param dev_id
> + *   The identifier of the regex device.
> + * @param[out] xstats_map
> + *   Block of memory to insert id and names into. Must be at least size in
> + *   capacity. If set to NULL, function returns required capacity.
> + * @return
> + *   - positive value on success:
> + *        -The return value is the number of entries filled in the stats map.
> + *        -If xstats_map set to NULL then required capacity for xstats_map.
> + *   - negative value on error:
> + *      -ENODEV for invalid *dev_id*
> + *      -ENOTSUP if the device doesn't support this function.
> + */
> +int
> +rte_regex_dev_xstats_names_get(uint8_t dev_id,
> +			       struct rte_regex_dev_xstats_map *xstats_map);
> +
> +/**
> + * Retrieve extended statistics of a regex device.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param ids
> + *   The id numbers of the stats to get. The ids can be got from the stat
> + *   position in the stat list from rte_regex_dev_xstats_names_get(), or
> + *   by using rte_regex_dev_xstats_by_name_get().
> + * @param[out] values
> + *   The values for each stats request by ID.
> + * @param n
> + *   The number of stats requested
> + * @return
> + *   - positive value: number of stat entries filled into the values array
> + *   - negative value on error:
> + *      -ENODEV for invalid *dev_id*
> + *      -ENOTSUP if the device doesn't support this function.
> + */
> +int
> +rte_regex_dev_xstats_get(uint8_t dev_id, const uint16_t ids[],
> +			 uint64_t values[], uint16_t n);
> +
> +/**
> + * Retrieve the value of a single stat by requesting it by name.
> + *
> + * @param dev_id
> + *   The identifier of the device
> + * @param name
> + *   The stat name to retrieve
> + * @param[out] id
> + *   If non-NULL, the numerical id of the stat will be returned, so that further
> + *   requests for the stat can be got using rte_regex_dev_xstats_get, which will
> + *   be faster as it doesn't need to scan a list of names for the stat.
> + * @param[out] value
> + *   Must be non-NULL, retrieved xstat value will be stored in this address.
> + *
> + * @return
> + *   - 0: Successfully retrieved xstat value.
> + *   - -EINVAL: invalid parameters
> + *   - -ENOTSUP: if not supported.
> + */
> +int
> +rte_regex_dev_xstats_by_name_get(uint8_t dev_id, const char *name,
> +				 uint16_t *id, uint64_t *value);
> +
> +/**
> + * Reset the values of the xstats of the selected component in the device.
> + *
> + * @param dev_id
> + *   The identifier of the device
> + * @param ids
> + *   Selects specific statistics to be reset. When NULL, all statistics will be
> + *   reset. If non-NULL, must point to array of at least *nb_ids* size.
> + * @param nb_ids
> + *   The number of ids available from the *ids* array. Ignored when ids is NULL.
> + * @return
> + *   - 0: Successfully reset the statistics to zero.
> + *   - -EINVAL: invalid parameters
> + *   - -ENOTSUP: if not supported.
> + */
> +int
> +rte_regex_dev_xstats_reset(uint8_t dev_id, const uint16_t ids[],
> +			   uint16_t nb_ids);
> +
> +/**
> + * Trigger the RegEx device self test.
> + *
> + * @param dev_id
> + *   The identifier of the device
> + * @return
> + *   - 0: Selftest successful
> + *   - -ENOTSUP if the device doesn't support selftest
> + *   - other values < 0 on failure.
> + */
> +int rte_regex_dev_selftest(uint8_t dev_id);
> +
> +/**
> + * Dump internal information about *dev_id* to the FILE* provided in *f*.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + *
> + * @param f
> + *   A pointer to a file for output
> + *
> + * @return
> + *   - 0: on success
> + *   - <0: on failure.
> + */
> +int
> +rte_regex_dev_dump(uint8_t dev_id, FILE *f);
> +
> +/* Fast path APIs */
> +
> +/**
> + * The generic *rte_regex_match* structure to hold the RegEx match attributes.
> + * @see struct rte_regex_ops::matches
> + */
> +struct rte_regex_match {
> +	RTE_STD_C11
> +	union {
> +		uint64_t u64;
> +		struct {
> +			uint32_t rule_id:20;
> +			/**< Rule identifier to which the pattern matched.
> +			 * @see struct rte_regex_rule::rule_id
> +			 */
> +			uint32_t group_id:12;
> +			/**< Group identifier of the rule which the pattern
> +			 * matched. @see struct rte_regex_rule::group_id
> +			 */
> +			uint16_t offset;
> +			/**< Starting Byte Position for matched rule. */
> +			uint16_t len;
> +			/**< Length of match in bytes */
> +		};
> +	};
> +};
> +
> +/* Enumerates RegEx request flags. */
> +#define RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F (1 << 0)
> +/**< Set when struct rte_regex_ops::group_id1 is valid */
> +
> +#define RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F (1 << 1)
> +/**< Set when struct rte_regex_ops::group_id2 is valid */
> +
> +#define RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F (1 << 2)
> +/**< Set when struct rte_regex_ops::group_id3 is valid */
> +
> +#define RTE_REGEX_OPS_REQ_STOP_ON_MATCH_F (1 << 4)
> +/**< The RegEx engine will stop scanning and return the first match. */
> +
> +#define RTE_REGEX_OPS_REQ_MATCH_HIGH_PRIORITY_F (1 << 5)
> +/**< In High Priority mode a maximum of one match will be returned per scan to
> + * reduce the post-processing required by the application. The match with the
> + * lowest Rule id, lowest start pointer and lowest match length will be
> + * returned.
> + *
> + * @see struct rte_regex_ops::nb_actual_matches
> + * @see struct rte_regex_ops::nb_matches
> + */
> +
> +
> +/* Enumerates RegEx response flags. */
> +#define RTE_REGEX_OPS_RSP_PMI_SOJ_F (1 << 0)
> +/**< Indicates that the RegEx device has encountered a partial match at the
> + * start of scan in the given buffer.
> + *
> + * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
> + */
> +
> +#define RTE_REGEX_OPS_RSP_PMI_EOJ_F (1 << 1)
> +/**< Indicates that the RegEx device has encountered a partial match at the
> + * end of scan in the given buffer.
> + *
> + * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
> + */
> +
> +#define RTE_REGEX_OPS_RSP_MAX_SCAN_TIMEOUT_F (1 << 2)
> +/**< Indicates that the RegEx device has exceeded the max timeout while
> + * scanning the given buffer.
> + *
> + * @see RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT
> + */
> +
> +#define RTE_REGEX_OPS_RSP_MAX_MATCH_F (1 << 3)
> +/**< Indicates that the RegEx device has exceeded the max matches while
> + * scanning the given buffer.
> + *
> + * @see RTE_REGEX_DEV_ATTR_MAX_MATCHES
> + */
> +
> +#define RTE_REGEX_OPS_RSP_MAX_PREFIX_F (1 << 4)
> +/**< Indicates that the RegEx device has reached the max allowed prefix length
> + * while scanning the given buffer.
> + *
> + * @see RTE_REGEX_DEV_ATTR_MAX_PREFIX
> + */
> +
> +/**
> + * The generic *rte_regex_ops* structure to hold the RegEx attributes
> + * for enqueue and dequeue operation.
> + */
> +struct rte_regex_ops {
> +	/* W0 */
> +	uint16_t req_flags;
> +	/**< Request flags for the RegEx ops.
> +	 * @see RTE_REGEX_OPS_REQ_*
> +	 */
> +	uint16_t scan_size;
> +	/**< Scan size of the buffer to be scanned in bytes. */
> +	uint16_t rsp_flags;
> +	/**< Response flags for the RegEx ops.
> +	 * @see RTE_REGEX_OPS_RSP_*
> +	 */
> +	uint8_t nb_actual_matches;
> +	/**< The total number of actual matches detected by the RegEx device. */
> +	uint8_t nb_matches;
> +	/**< The total number of matches returned by the RegEx device for this
> +	 * scan. The size of *rte_regex_ops::matches* zero length array will be
> +	 * this value.
> +	 *
> +	 * @see struct rte_regex_ops::matches, struct rte_regex_match
> +	 */
> +
> +	/* W1 */
> +	RTE_STD_C11
> +	union {
> +		uint64_t u64;
> +		/**<  Allow 8-byte reserved on 32-bit system */
> +		void *buf_addr;
> +		/**< Virtual address of the pattern to be matched. */
> +	};
> +
> +	/* W2 */
> +	rte_iova_t buf_iova;
> +	/**< IOVA address of the pattern to be matched. */
> +
> +	/* W3 */
> +	uint16_t group_id0;
> +	/**< First group_id to match the rule against. At least one group id
> +	 * must be provided by the application.
> +	 * When RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F is set, group_id1 is
> +	 * valid; similar flags exist for group_id2 and group_id3.
> +	 * Upon a match, struct rte_regex_match::group_id shall be updated
> +	 * with the matching group ID by the device. The group ID scheme
> +	 * provides rule isolation and effective pattern matching.
> +	 */
> +	uint16_t group_id1;
> +	/**< Second group_id to match the rule against.
> +	 *
> +	 * @see RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F
> +	 */
> +	uint16_t group_id2;
> +	/**< Third group_id to match the rule against.
> +	 *
> +	 * @see RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F
> +	 */
> +	uint16_t group_id3;
> +	/**< Fourth group_id to match the rule against.
> +	 *
> +	 * @see RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F
> +	 */
> +
> +	/* W4 */
> +	RTE_STD_C11
> +	union {
> +		uint64_t user_id;
> +		/**< Application specific opaque value. An application may use
> +		 * this field to hold an application specific value to share
> +		 * between enqueue and dequeue operations.
> +		 * Implementation should not modify this field.
> +		 */
> +		void *user_ptr;
> +		/**< Pointer representation of *user_id* */
> +	};
> +
> +	/* W5 */
> +	struct rte_regex_match matches[];
> +	/**< Zero length array to hold the match tuples.
> +	 * The struct rte_regex_ops::nb_matches value holds the number of
> +	 * elements in this array.
> +	 *
> +	 * @see struct rte_regex_ops::nb_matches
> +	 */
> +};
> +
> +/**
> + * Enqueue a burst of scan request on a RegEx device.
> + *
> + * The rte_regex_enqueue_burst() function is invoked to place
> + * regex operations on the queue *qp_id* of the device designated by
> + * its *dev_id*.
> + *
> + * The *nb_ops* parameter is the number of operations to process which are
> + * supplied in the *ops* array of *rte_regex_ops* structures.
> + *
> + * The rte_regex_enqueue_burst() function returns the number of
> + * operations it actually enqueued for processing. A return value equal to
> + * *nb_ops* means that all operations have been enqueued.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param qp_id
> + *   The index of the queue pair on which operations are to be enqueued for
> + *   processing. The value must be in the range [0, nb_queue_pairs - 1]
> + *   previously supplied to rte_regex_dev_configure().
> + * @param ops
> + *   The address of an array of *nb_ops* pointers to *rte_regex_ops* structures
> + *   which contain the regex operations to be processed.
> + * @param nb_ops
> + *   The number of operations to process.
> + *
> + * @return
> + *   The number of operations actually enqueued on the regex device. The return
> + *   value can be less than the value of the *nb_ops* parameter when the
> + *   regex device's queue is full or if invalid parameters are specified in
> + *   a *rte_regex_ops*. If the return value is less than *nb_ops*, the remaining
> + *   ops at the end of *ops* are not consumed and the caller has to take care
> + *   of them.
> + */
> +uint16_t
> +rte_regex_enqueue_burst(uint8_t dev_id, uint16_t qp_id,
> +			struct rte_regex_ops **ops, uint16_t nb_ops);
> +
> +/**
> + *
> + * Dequeue a burst of scan responses from a queue on the RegEx device.
> + * The dequeued operations are stored in *rte_regex_ops* structures
> + * whose pointers are supplied in the *ops* array.
> + *
> + * The rte_regex_dequeue_burst() function returns the number of ops
> + * actually dequeued, which is the number of *rte_regex_ops* data structures
> + * effectively supplied into the *ops* array.
> + *
> + * A return value equal to *nb_ops* indicates that the queue contained
> + * at least *nb_ops* operations, and this is likely to signify that other
> + * processed operations remain in the device's output queue. Applications
> + * implementing a "retrieve as many processed operations as possible" policy
> + * can check this specific case and keep invoking the
> + * rte_regex_dequeue_burst() function until a value less than
> + * *nb_ops* is returned.
> + *
> + * The rte_regex_dequeue_burst() function does not provide any error
> + * notification to avoid the corresponding overhead.
> + *
> + * @param dev_id
> + *   The RegEx device identifier
> + * @param qp_id
> + *   The index of the queue pair from which to retrieve processed operations.
> + *   The value must be in the range [0, nb_queue_pairs - 1] previously
> + *   supplied to rte_regex_dev_configure().
> + * @param ops
> + *   The address of an array of pointers to *rte_regex_ops* structures that must
> + *   be large enough to store *nb_ops* pointers in it.
> + * @param nb_ops
> + *   The maximum number of operations to dequeue.
> + *
> + * @return
> + *   The number of operations actually dequeued, which is the number
> + *   of pointers to *rte_regex_ops* structures effectively supplied to the
> + *   *ops* array. If the return value is less than *nb_ops*, the remaining
> + *   entries at the end of *ops* are not filled in.
> + */
> +uint16_t
> +rte_regex_dequeue_burst(uint8_t dev_id, uint16_t qp_id,
> +			struct rte_regex_ops **ops, uint16_t nb_ops);
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_REGEXDEV_H_ */
>
Shahaf Shuler Aug. 21, 2019, 5:32 a.m. UTC | #6
Hi Jerin,

Thursday, August 15, 2019 2:34 PM, Thomas Monjalon:
> Subject: Re: [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev
> subsystem
> 
> +Cc more
> 
> ------------
> 
> From: Jerin Jacob <jerinj@marvell.com>
> 
> Even though some vendors offer RegEx HW offload, the lack of a standard
> API makes it difficult for DPDK consumers to use them in a portable way.
> 
> This _RFC_ attempts to standardize the RegEx/DPI offload APIs for DPDK.
> 
> The Doxygen generated RFC API documentation is available here:
> https://dreamy-noether-22777e.netlify.com/rte__regexdev_8h.html
> 
> This RFC was crafted based on SW RegEx API frameworks such as libpcre and
> hyperscan and a few of the RegEx HW IPs which I am aware of.
> 
> RegEx pattern matching applications:
> • Next Generation Firewalls (NGFW)
> • Deep Packet and Flow Inspection (DPI)
> • Intrusion Prevention Systems (IPS)
> • DDoS Mitigation
> • Network Monitoring
> • Data Loss Prevention (DLP)
> • Smart NICs
> • Grammar based content processing
> • URL, spam and adware filtering
> • Advanced auditing and policing of user/application security policies
> • Financial data mining - parsing of streamed financial feeds

I think two more important use cases to add (at least in the doc of this subsystem) are:
* application recognition
* memory introspection


> 
> Request to review from HW and SW RegEx vendors and RegEx application users
> to have a portable DPDK API for RegEx.
> 
> The API schematics are based on the existing cryptodev, eventdev and ethdev
> device APIs.
> 
> Signed-off-by: Jerin Jacob <jerinj@marvell.com>
> Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
> ---
> 
> RTE RegEx Device API
> --------------------
> 
> Defines RTE RegEx Device APIs for RegEx operations and its provisioning.
> 
> The RegEx Device API is composed of two parts:
> 
> - The application-oriented RegEx API that includes functions to setup
> a RegEx device (configure it, setup its queue pairs and start it),
> update the rule database and so on.
> 
> - The driver-oriented RegEx API that exports a function allowing
> a RegEx Poll Mode Driver (PMD) to simultaneously register itself as
> a RegEx device driver.
> 
> RegEx device components and definitions:
> 
>     +-----------------+
>     |                 |
>     |                 o---------+    rte_regex_[en|de]queue_burst()
>     |   PCRE based    o------+  |               |
>     |  RegEx pattern  |      |  |  +--------+   |
>     | matching engine o------+--+--o        |   |    +------+
>     |                 |      |  |  | queue  |<==o===>|Core 0|
>     |                 o----+ |  |  | pair 0 |        |      |
>     |                 |    | |  |  +--------+        +------+
>     +-----------------+    | |  |
>            ^               | |  |  +--------+
>            |               | |  |  |        |        +------+
>            |               | +--+--o queue  |<======>|Core 1|
>        Rule|Database       |    |  | pair 1 |        |      |
>     +------+----------+    |    |  +--------+        +------+
>     |     Group 0     |    |    |
>     | +-------------+ |    |    |  +--------+        +------+
>     | | Rules 0..n  | |    |    |  |        |        |Core 2|
>     | +-------------+ |    |    +--o queue  |<======>|      |
>     |     Group 1     |    |       | pair 2 |        +------+
>     | +-------------+ |    |       +--------+
>     | | Rules 0..n  | |    |
>     | +-------------+ |    |       +--------+
>     |     Group 2     |    |       |        |        +------+
>     | +-------------+ |    |       | queue  |<======>|Core n|
>     | | Rules 0..n  | |    +-------o pair n |        |      |
>     | +-------------+ |            +--------+        +------+
>     |     Group n     |
>     | +-------------+ |<-------rte_regex_rule_db_update()
>     | | Rules 0..n  | |<-------rte_regex_rule_db_import()
>     | +-------------+ |------->rte_regex_rule_db_export()
>     +-----------------+
> 
> RegEx: A regular expression is a concise and flexible means for matching
> strings of text, such as particular characters, words, or patterns of
> characters. A common abbreviation for this is “RegEx”.
> 
> RegEx device: A hardware or software-based implementation of RegEx
> device API for PCRE based pattern matching syntax and semantics.
> 
> PCRE RegEx syntax and semantics specification:
> http://regexkit.sourceforge.net/Documentation/pcre/pcrepattern.html
> 
> RegEx queue pair: Each RegEx device should have one or more queue pairs to
> transmit a burst of pattern matching requests and receive a burst of
> pattern matching responses. The pattern matching request/response is
> embedded in the *rte_regex_ops* structure.
> 
> Rule: A pattern matching rule expressed in PCRE RegEx syntax along with
> Match ID and Group ID to identify the rule upon the match.
> 
> Rule database: The RegEx device accepts regular expressions and converts them
> into a compiled rule database that can then be used to scan data.
> Compilation allows the device to analyze the given pattern(s) and
> pre-determine how to scan for these patterns in an optimized fashion that
> would be far too expensive to compute at run-time. A rule database contains
> a set of rules that are compiled in device specific binary form.
> 
> Match ID or Rule ID: A unique identifier provided at the time of rule
> creation for the application to identify the rule upon match.
> 
> Group ID: Rules can be grouped under one group ID to enable
> rule isolation and effective pattern matching. A unique group identifier
> is provided at the time of rule creation for the application to identify the
> rule upon match.
> 
> Scan: A pattern matching request through *enqueue* API.
> 
> It is possible that a given RegEx device does not support all the features
> of PCRE. The application may probe unsupported features through
> struct rte_regex_dev_info::pcre_unsup_flags.
> 
> By default, all the functions of the RegEx Device API exported by a PMD
> are lock-free functions which are assumed not to be invoked in parallel on
> different logical cores to work on the same target object. For instance,
> the dequeue function of a PMD cannot be invoked in parallel on two logical
> cores to operate on the same RegEx queue pair. Of course, this function
> can be invoked in parallel by different logical cores on different queue pairs.
> It is the responsibility of the upper level application to enforce this rule.
> 
> In all functions of the RegEx API, the RegEx device is
> designated by an integer >= 0 named the device identifier *dev_id*
> 
> At the RegEx driver level, RegEx devices are represented by a generic
> data structure of type *rte_regex_dev*.
> 
> RegEx devices are dynamically registered during the PCI/SoC device probing
> phase performed at EAL initialization time.
> When a RegEx device is being probed, a *rte_regex_dev* structure and
> a new device identifier are allocated for that device. Then, the
> regex_dev_init() function supplied by the RegEx driver matching the probed
> device is invoked to properly initialize the device.
> 
> The role of the device init function consists of resetting the hardware or
> software RegEx driver implementations.
> 
> If the device init operation is successful, the correspondence between
> the device identifier assigned to the new device and its associated
> *rte_regex_dev* structure is effectively registered.
> Otherwise, both the *rte_regex_dev* structure and the device identifier are
> freed.
> 
> The functions exported by the application RegEx API to setup a device
> designated by its device identifier must be invoked in the following order:
> - rte_regex_dev_configure()
> - rte_regex_queue_pair_setup()
> - rte_regex_dev_start()
> 
> Then, the application can invoke, in any order, the functions
> exported by the RegEx API to enqueue pattern matching jobs, dequeue pattern
> matching responses, get the stats, update the rule database,
> get/set device attributes and so on.
> 
> If the application wants to change the configuration (i.e. call
> rte_regex_dev_configure() or rte_regex_queue_pair_setup()), it must call
> rte_regex_dev_stop() first to stop the device and then do the reconfiguration
> before calling rte_regex_dev_start() again. The enqueue and dequeue
> functions should not be invoked when the device is stopped.
> 
> Finally, an application can close a RegEx device by invoking the
> rte_regex_dev_close() function.
> 
> Each function of the application RegEx API invokes a specific function
> of the PMD that controls the target device designated by its device
> identifier.
> 
> For this purpose, all device-specific functions of a RegEx driver are
> supplied through a set of pointers contained in a generic structure of type
> *regex_dev_ops*.
> The address of the *regex_dev_ops* structure is stored in the *rte_regex_dev*
> structure by the device init function of the RegEx driver, which is
> invoked during the PCI/SoC device probing phase, as explained earlier.
> 
> In other words, each function of the RegEx API simply retrieves the
> *rte_regex_dev* structure associated with the device identifier and
> performs an indirect invocation of the corresponding driver function
> supplied in the *regex_dev_ops* structure of the *rte_regex_dev* structure.
> 
> For performance reasons, the addresses of the fast-path functions of the
> RegEx driver are not contained in the *regex_dev_ops* structure.
> Instead, they are stored directly at the beginning of the *rte_regex_dev*
> structure to avoid an extra indirect memory access during their invocation.
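
A rough sketch of how I read the dispatch split described above. All demo_* and null_* names are made up for illustration and are not part of the RFC; the only point is that a burst handler is one load away from the device structure, while control-path calls go through the ops table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct demo_dev;

/* Fast-path burst handler (hypothetical signature, not from the RFC). */
typedef uint16_t (*demo_burst_t)(struct demo_dev *dev, uint16_t qp_id,
				 void **ops, uint16_t nb_ops);

/* Control-path operations, reached through one extra pointer. */
struct demo_dev_ops {
	int (*dev_start)(struct demo_dev *dev);
};

struct demo_dev {
	demo_burst_t enqueue;			/* fast path, stored first */
	demo_burst_t dequeue;
	const struct demo_dev_ops *dev_ops;	/* control path */
	int started;
};

#define DEMO_MAX_DEVS 4
static struct demo_dev demo_devs[DEMO_MAX_DEVS];

/* Fast path: dev_id -> device -> function. No validation on purpose. */
static inline uint16_t
demo_enqueue_burst(uint8_t dev_id, uint16_t qp_id, void **ops, uint16_t nb_ops)
{
	struct demo_dev *dev = &demo_devs[dev_id];

	return dev->enqueue(dev, qp_id, ops, nb_ops);
}

/* Control path: dev_id -> device -> ops table -> function. */
static inline int
demo_dev_start(uint8_t dev_id)
{
	struct demo_dev *dev;

	if (dev_id >= DEMO_MAX_DEVS)
		return -1;
	dev = &demo_devs[dev_id];
	if (dev->dev_ops == NULL)
		return -1;	/* device was never probed */
	return dev->dev_ops->dev_start(dev);
}

/* Toy driver: accepts a full burst once started, nothing before that. */
static uint16_t
null_burst(struct demo_dev *dev, uint16_t qp_id, void **ops, uint16_t nb_ops)
{
	(void)qp_id;
	(void)ops;
	return dev->started ? nb_ops : 0;
}

static int
null_start(struct demo_dev *dev)
{
	dev->started = 1;
	return 0;
}

static const struct demo_dev_ops null_ops = { .dev_start = null_start };

/* Stand-in for the driver init function invoked at probe time. */
static void
demo_probe(uint8_t dev_id)
{
	demo_devs[dev_id].enqueue = null_burst;
	demo_devs[dev_id].dequeue = null_burst;
	demo_devs[dev_id].dev_ops = &null_ops;
	demo_devs[dev_id].started = 0;
}
```

The driver init function fills both the inline pointers and the ops pointer, so the application-facing wrappers never need to know which driver is behind a given dev_id.
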
> 
> RTE RegEx device drivers do not use interrupts for enqueue or dequeue
> operations. Instead, RegEx drivers export Poll-Mode enqueue and dequeue
> functions to applications.
> 
> The *enqueue* operation submits a burst of RegEx pattern matching requests
> to the RegEx device and the *dequeue* operation gets a burst of pattern
> matching responses for the ones submitted through the *enqueue* operation.
> 
> Typical application utilisation of the RegEx device API follows this
> programming flow:
> 
> - rte_regex_dev_configure()
> - rte_regex_queue_pair_setup()
> - rte_regex_rule_db_update(). Needs to be invoked if a precompiled rule
>   database is not provided in rte_regex_dev_config::rule_db for
>   rte_regex_dev_configure() and/or the application needs to update the rule
>   database.
> - Create or reuse an existing mempool for *rte_regex_ops* objects.
> - rte_regex_dev_start()
> - rte_regex_enqueue_burst()
> - rte_regex_dequeue_burst()
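
To make the ordering rules above concrete, here is a minimal state-machine sketch. Everything named demo_* is hypothetical and only models the call order described in the text. In particular, requiring a fresh queue pair setup after a reconfigure is my assumption; the RFC does not spell that out.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical lifecycle states, not part of the RFC. */
enum demo_state {
	DEMO_UNCONFIGURED,
	DEMO_CONFIGURED,
	DEMO_QP_READY,
	DEMO_STARTED
};

struct demo_regex_dev {
	enum demo_state state;
};

/* Models rte_regex_dev_configure(): allowed only while stopped. */
static int
demo_configure(struct demo_regex_dev *d)
{
	if (d->state == DEMO_STARTED)
		return -EBUSY;
	d->state = DEMO_CONFIGURED;	/* assumption: queue pairs are reset */
	return 0;
}

/* Models rte_regex_queue_pair_setup(): needs a configured, stopped device. */
static int
demo_queue_pair_setup(struct demo_regex_dev *d)
{
	if (d->state == DEMO_UNCONFIGURED)
		return -EINVAL;
	if (d->state == DEMO_STARTED)
		return -EBUSY;
	d->state = DEMO_QP_READY;
	return 0;
}

/* Models rte_regex_dev_start(): needs at least one queue pair set up. */
static int
demo_start(struct demo_regex_dev *d)
{
	if (d->state != DEMO_QP_READY)
		return -EINVAL;
	d->state = DEMO_STARTED;
	return 0;
}

/* Models rte_regex_dev_stop(): returns to the pre-start state. */
static int
demo_stop(struct demo_regex_dev *d)
{
	if (d->state != DEMO_STARTED)
		return -EINVAL;
	d->state = DEMO_QP_READY;
	return 0;
}

/* Models the enqueue/dequeue restriction: only valid while started. */
static int
demo_enqueue(const struct demo_regex_dev *d)
{
	return d->state == DEMO_STARTED ? 0 : -EINVAL;
}
```

It might be worth stating in the header which of these transitions a PMD is expected to reject, and with which error code.
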
> 
> ---
> 
> config/common_base                 |    5 +
> doc/api/doxy-api-index.md          |    1 +
> doc/api/doxy-api.conf.in           |    1 +
> lib/Makefile                       |    2 +
> lib/librte_regexdev/Makefile       |   23 +
> lib/librte_regexdev/rte_regexdev.c |    5 +
> lib/librte_regexdev/rte_regexdev.h | 1247 ++++++++++++++++++++++++++++
> 7 files changed, 1284 insertions(+)
> create mode 100644 lib/librte_regexdev/Makefile
> create mode 100644 lib/librte_regexdev/rte_regexdev.c
> create mode 100644 lib/librte_regexdev/rte_regexdev.h
> 
> diff --git a/config/common_base b/config/common_base
> index e406e7836..986093d6e 100644
> --- a/config/common_base
> +++ b/config/common_base
> @@ -746,6 +746,11 @@ CONFIG_RTE_LIBRTE_PMD_DPAA2_QDMA_RAWDEV=n
> #
> CONFIG_RTE_LIBRTE_PMD_IFPGA_RAWDEV=y
> 
> +#
> +# Compile regex device support
> +#
> +CONFIG_RTE_LIBRTE_REGEXDEV=y
> +
> #
> # Compile librte_ring
> #
> diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
> index 715248dd1..a0bc27ae4 100644
> --- a/doc/api/doxy-api-index.md
> +++ b/doc/api/doxy-api-index.md
> @@ -26,6 +26,7 @@ The public API headers are grouped by topics:
> [event_timer_adapter]    (@ref rte_event_timer_adapter.h),
> [event_crypto_adapter]   (@ref rte_event_crypto_adapter.h),
> [rawdev]             (@ref rte_rawdev.h),
> +  [regexdev]           (@ref rte_regexdev.h),
> [metrics]            (@ref rte_metrics.h),
> [bitrate]            (@ref rte_bitrate.h),
> [latency]            (@ref rte_latencystats.h),
> diff --git a/doc/api/doxy-api.conf.in b/doc/api/doxy-api.conf.in
> index b9896cb63..7adb821bb 100644
> --- a/doc/api/doxy-api.conf.in
> +++ b/doc/api/doxy-api.conf.in
> @@ -53,6 +53,7 @@ INPUT                   = @TOPDIR@/doc/api/doxy-api-index.md \
> @TOPDIR@/lib/librte_rawdev \
> @TOPDIR@/lib/librte_rcu \
> @TOPDIR@/lib/librte_reorder \
> +                          @TOPDIR@/lib/librte_regexdev \
> @TOPDIR@/lib/librte_ring \
> @TOPDIR@/lib/librte_sched \
> @TOPDIR@/lib/librte_security \
> diff --git a/lib/Makefile b/lib/Makefile
> index 791e0d991..57de9691a 100644
> --- a/lib/Makefile
> +++ b/lib/Makefile
> @@ -44,6 +44,8 @@ DEPDIRS-librte_eventdev := librte_eal librte_ring librte_ethdev librte_hash \
> librte_mempool librte_timer librte_cryptodev
> DIRS-$(CONFIG_RTE_LIBRTE_RAWDEV) += librte_rawdev
> DEPDIRS-librte_rawdev := librte_eal librte_ethdev
> +DIRS-$(CONFIG_RTE_LIBRTE_REGEXDEV) += librte_regexdev
> +DEPDIRS-librte_regexdev := librte_eal
> DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += librte_vhost
> DEPDIRS-librte_vhost := librte_eal librte_mempool librte_mbuf librte_ethdev \
> 			librte_net
> diff --git a/lib/librte_regexdev/Makefile b/lib/librte_regexdev/Makefile
> new file mode 100644
> index 000000000..723b4b28c
> --- /dev/null
> +++ b/lib/librte_regexdev/Makefile
> @@ -0,0 +1,23 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright(C) 2019 Marvell International Ltd.
> +#
> +
> +include $(RTE_SDK)/mk/rte.vars.mk
> +
> +# library name
> +LIB = librte_regexdev.a
> +
> +# library version
> +LIBABIVER := 1
> +
> +# build flags
> +CFLAGS += -O3
> +CFLAGS += $(WERROR_FLAGS)
> +
> +# library source files
> +SRCS-y += rte_regexdev.c
> +
> +# export include files
> +SYMLINK-y-include += rte_regexdev.h
> +
> +include $(RTE_SDK)/mk/rte.lib.mk
> diff --git a/lib/librte_regexdev/rte_regexdev.c b/lib/librte_regexdev/rte_regexdev.c
> new file mode 100644
> index 000000000..e5be0f29c
> --- /dev/null
> +++ b/lib/librte_regexdev/rte_regexdev.c
> @@ -0,0 +1,5 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(C) 2019 Marvell International Ltd.
> + */
> +
> +#include <rte_regexdev.h>
> diff --git a/lib/librte_regexdev/rte_regexdev.h b/lib/librte_regexdev/rte_regexdev.h
> new file mode 100644
> index 000000000..765da4aaa
> --- /dev/null
> +++ b/lib/librte_regexdev/rte_regexdev.h
> @@ -0,0 +1,1247 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(C) 2019 Marvell International Ltd.
> + */
> +
> +#ifndef _RTE_REGEXDEV_H_
> +#define _RTE_REGEXDEV_H_
> +
> +/**
> + * @file
> + *
> + * RTE RegEx Device API
> + *
> + * Defines RTE RegEx Device APIs for RegEx operations and its provisioning.
> + *
> + * The RegEx Device API is composed of two parts:
> + *
> + * - The application-oriented RegEx API that includes functions to setup
> + *   a RegEx device (configure it, setup its queue pairs and start it),
> + *   update the rule database and so on.
> + *
> + * - The driver-oriented RegEx API that exports a function allowing
> + *   a RegEx Poll Mode Driver (PMD) to simultaneously register itself as
> + *   a RegEx device driver.
> + *
> + * RegEx device components and definitions:
> + *
> + *     +-----------------+
> + *     |                 |
> + *     |                 o---------+    rte_regex_[en|de]queue_burst()
> + *     |   PCRE based    o------+  |               |
> + *     |  RegEx pattern  |      |  |  +--------+   |
> + *     | matching engine o------+--+--o        |   |    +------+
> + *     |                 |      |  |  | queue  |<==o===>|Core 0|
> + *     |                 o----+ |  |  | pair 0 |        |      |
> + *     |                 |    | |  |  +--------+        +------+
> + *     +-----------------+    | |  |
> + *            ^               | |  |  +--------+
> + *            |               | |  |  |        |        +------+
> + *            |               | +--+--o queue  |<======>|Core 1|
> + *        Rule|Database       |    |  | pair 1 |        |      |
> + *     +------+----------+    |    |  +--------+        +------+
> + *     |     Group 0     |    |    |
> + *     | +-------------+ |    |    |  +--------+        +------+
> + *     | | Rules 0..n  | |    |    |  |        |        |Core 2|
> + *     | +-------------+ |    |    +--o queue  |<======>|      |
> + *     |     Group 1     |    |       | pair 2 |        +------+
> + *     | +-------------+ |    |       +--------+
> + *     | | Rules 0..n  | |    |
> + *     | +-------------+ |    |       +--------+
> + *     |     Group 2     |    |       |        |        +------+
> + *     | +-------------+ |    |       | queue  |<======>|Core n|
> + *     | | Rules 0..n  | |    +-------o pair n |        |      |
> + *     | +-------------+ |            +--------+        +------+
> + *     |     Group n     |
> + *     | +-------------+ |<-------rte_regex_rule_db_update()
> + *     | | Rules 0..n  | |<-------rte_regex_rule_db_import()
> + *     | +-------------+ |------->rte_regex_rule_db_export()
> + *     +-----------------+
> + *
> + * RegEx: A regular expression is a concise and flexible means for matching
> + * strings of text, such as particular characters, words, or patterns of
> + * characters. A common abbreviation for this is “RegEx”.
> + *
> + * RegEx device: A hardware or software-based implementation of RegEx
> + * device API for PCRE based pattern matching syntax and semantics.
> + *
> + * PCRE RegEx syntax and semantics specification:
> + *
> + * http://regexkit.sourceforge.net/Documentation/pcre/pcrepattern.html
> + *
> + * RegEx queue pair: Each RegEx device should have one or more queue pairs to
> + * transmit a burst of pattern matching requests and receive a burst of
> + * pattern matching responses. The pattern matching request/response is
> + * embedded in the *rte_regex_ops* structure.
> + *
> + * Rule: A pattern matching rule expressed in PCRE RegEx syntax along with
> + * Match ID and Group ID to identify the rule upon the match.
> + *
> + * Rule database: The RegEx device accepts regular expressions and converts them
> + * into a compiled rule database that can then be used to scan data.
> + * Compilation allows the device to analyze the given pattern(s) and
> + * pre-determine how to scan for these patterns in an optimized fashion that
> + * would be far too expensive to compute at run-time. A rule database contains
> + * a set of rules that are compiled in device specific binary form.
> + *
> + * Match ID or Rule ID: A unique identifier provided at the time of rule
> + * creation for the application to identify the rule upon match.
> + *
> + * Group ID: Rules can be placed under one group ID to enable
> + * rule isolation and effective pattern matching. A unique group identifier
> + * is provided at the time of rule creation for the application to identify the
> + * rule upon match.
> + *
> + * Scan: A pattern matching request through *enqueue* API.
> + *
> + * A given RegEx device may not support all the features of PCRE.
> + * The application may probe unsupported features through
> + * struct rte_regex_dev_info::pcre_unsup_flags.
> + *
> + * By default, all the functions of the RegEx Device API exported by a PMD
> + * are lock-free functions which are assumed not to be invoked in parallel on
> + * different logical cores to work on the same target object. For instance,
> + * the dequeue function of a PMD cannot be invoked in parallel on two logical
> + * cores to operate on the same RegEx queue pair. Of course, this function
> + * can be invoked in parallel by different logical cores on different queue
> + * pairs.
> + * It is the responsibility of the upper level application to enforce this rule.
> + *
> + * In all functions of the RegEx API, the RegEx device is
> + * designated by an integer >= 0 named the device identifier *dev_id*.
> + *
> + * At the RegEx driver level, RegEx devices are represented by a generic
> + * data structure of type *rte_regex_dev*.
> + *
> + * RegEx devices are dynamically registered during the PCI/SoC device probing
> + * phase performed at EAL initialization time.
> + * When a RegEx device is being probed, a *rte_regex_dev* structure and
> + * a new device identifier are allocated for that device. Then, the
> + * regex_dev_init() function supplied by the RegEx driver matching the probed
> + * device is invoked to properly initialize the device.
> + *
> + * The role of the device init function consists of resetting the hardware or
> + * software RegEx driver implementations.
> + *
> + * If the device init operation is successful, the correspondence between
> + * the device identifier assigned to the new device and its associated
> + * *rte_regex_dev* structure is effectively registered.
> + * Otherwise, both the *rte_regex_dev* structure and the device identifier are
> + * freed.
> + *
> + * The functions exported by the application RegEx API to setup a device
> + * designated by its device identifier must be invoked in the following order:
> + *     - rte_regex_dev_configure()
> + *     - rte_regex_queue_pair_setup()
> + *     - rte_regex_dev_start()
> + *
> + * Then, the application can invoke, in any order, the functions
> + * exported by the RegEx API to enqueue pattern matching jobs, dequeue pattern
> + * matching responses, get the stats, update the rule database,
> + * get/set device attributes and so on.
> + *
> + * If the application wants to change the configuration (i.e. call
> + * rte_regex_dev_configure() or rte_regex_queue_pair_setup()), it must call
> + * rte_regex_dev_stop() first to stop the device and then do the reconfiguration
> + * before calling rte_regex_dev_start() again. The enqueue and dequeue
> + * functions should not be invoked when the device is stopped.
> + *
> + * Finally, an application can close a RegEx device by invoking the
> + * rte_regex_dev_close() function.
> + *
> + * Each function of the application RegEx API invokes a specific function
> + * of the PMD that controls the target device designated by its device
> + * identifier.
> + *
> + * For this purpose, all device-specific functions of a RegEx driver are
> + * supplied through a set of pointers contained in a generic structure of type
> + * *regex_dev_ops*.
> + * The address of the *regex_dev_ops* structure is stored in the *rte_regex_dev*
> + * structure by the device init function of the RegEx driver, which is
> + * invoked during the PCI/SoC device probing phase, as explained earlier.
> + *
> + * In other words, each function of the RegEx API simply retrieves the
> + * *rte_regex_dev* structure associated with the device identifier and
> + * performs an indirect invocation of the corresponding driver function
> + * supplied in the *regex_dev_ops* structure of the *rte_regex_dev* structure.
> + *
> + * For performance reasons, the addresses of the fast-path functions of the
> + * RegEx driver are not contained in the *regex_dev_ops* structure.
> + * Instead, they are stored directly at the beginning of the *rte_regex_dev*
> + * structure to avoid an extra indirect memory access during their invocation.
> + *
> + * RTE RegEx device drivers do not use interrupts for enqueue or dequeue
> + * operations. Instead, RegEx drivers export Poll-Mode enqueue and dequeue
> + * functions to applications.
> + *
> + * The *enqueue* operation submits a burst of RegEx pattern matching requests
> + * to the RegEx device and the *dequeue* operation gets a burst of pattern
> + * matching responses for the ones submitted through the *enqueue* operation.
> + *
> + * Typical application utilisation of the RegEx device API follows this
> + * programming flow:
> + *
> + * - rte_regex_dev_configure()
> + * - rte_regex_queue_pair_setup()
> + * - rte_regex_rule_db_update(). Needs to be invoked if a precompiled rule
> + *   database is not provided in rte_regex_dev_config::rule_db for
> + *   rte_regex_dev_configure() and/or the application needs to update the rule
> + *   database.
> + * - Create or reuse an existing mempool for *rte_regex_ops* objects.
> + * - rte_regex_dev_start()
> + * - rte_regex_enqueue_burst()
> + * - rte_regex_dequeue_burst()
> + *
> + */
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#include <rte_common.h>
> +#include <rte_config.h>
> +#include <rte_dev.h>
> +#include <rte_errno.h>
> +#include <rte_memory.h>
> +
> +/**
> + * Get the total number of RegEx devices that have been successfully
> + * initialised.
> + *
> + * @return
> + *   The total number of usable RegEx devices.
> + */
> +uint8_t
> +rte_regex_dev_count(void);
> +
> +/**
> + * Get the device identifier for the named RegEx device.
> + *
> + * @param name
> + *   RegEx device name to select the RegEx device identifier.
> + *
> + * @return
> + *   Returns RegEx device identifier on success.
> + *   - <0: Failure to find named RegEx device.
> + */
> +int
> +rte_regex_dev_get_dev_id(const char *name);
> +
> +/* Enumerates RegEx device capabilities */
> +#define RTE_REGEX_DEV_CAPA_RUNTIME_COMPILATION_F (1ULL << 0)
> +/**< RegEx device supports compiling the rules at runtime, as opposed to
> + * loading only the pre-built rule database using
> + * struct rte_regex_dev_config::rule_db in rte_regex_dev_configure().
> + * @see struct rte_regex_dev_config::rule_db, rte_regex_dev_configure()
> + * @see struct rte_regex_dev_info::regex_dev_capa
> + */
> +
> +
> +/* Enumerates unsupported PCRE features for the RegEx device */
> +#define RTE_REGEX_DEV_PCRE_UNSUP_START_ANCHOR_F (1ULL << 0)
> +/**< RegEx device doesn't support PCRE Anchor to start of match flag.
> + * Example RegEx is '/\Gfoo\d/'. Here '\G' asserts position at the end of the
> + * previous match or the start of the string for the first match.
> + * This position will change each time the RegEx is applied to the subject
> + * string. If the RegEx is applied to 'foo1foo2Zfoo3' the first two matches will
> + * be successful for 'foo1foo2' and fail for 'Zfoo3'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_ATOMIC_GROUPING_F (1ULL << 1)
> +/**< RegEx device doesn't support PCRE Atomic grouping.
> + * Atomic groups are represented by '(?>)'. An atomic group is a group that,
> + * when the RegEx engine exits from it, automatically throws away all
> + * backtracking positions remembered by any tokens inside the group.
> + * Example RegEx is 'a(?>bc|b)c'. If the given subjects are 'abc' and 'abcc',
> + * then 'a(bc|b)c' matches both, whereas 'a(?>bc|b)c' matches only 'abcc'
> + * because atomic groups don't allow backtracking to 'b'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_BACKTRACKING_CTRL_F (1ULL << 2)
> +/**< RegEx device doesn't support PCRE backtracking control verbs.
> + * Some examples of backtracking verbs are (*COMMIT), (*ACCEPT), (*FAIL),
> + * (*SKIP), (*PRUNE).
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_CALLOUTS_F (1ULL << 3)
> +/**< RegEx device doesn't support PCRE callouts.
> + * PCRE supports calling an external function in between matches by using
> + * '(?C)'. Example RegEx is 'ABC(?C)D'. If the given subject is 'ABCD', the
> + * RegEx engine will parse 'ABC', perform a user-defined callout and return a
> + * successful match at 'D'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_BACKREFERENCE_F (1ULL << 4)
> +/**< RegEx device doesn't support PCRE backreference.
> + * Example RegEx is '(\2ABC|(GHI))+'. Here \2 matches the same text as was most
> + * recently matched by the 2nd capturing group, i.e. 'GHI'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_GREEDY_F (1ULL << 5)
> +/**< RegEx device doesn't support PCRE Greedy mode.
> + * For example, if the RegEx is 'AB\d*?', then '*?' represents zero or unlimited
> + * matches, as few as possible. In greedy mode ('AB\d*') the subject 'AB12345'
> + * will be matched completely, whereas in ungreedy mode only 'AB' will be
> + * returned as the match.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_LOOKAROUND_ASRT_F (1ULL << 6)
> +/**< RegEx device doesn't support PCRE Lookaround assertions
> + * (Zero-width assertions). Example RegEx is '[a-z]+\d+(?=!{3,})'. If
> + * the given subject is 'dwad1234!', the RegEx engine doesn't report any matches
> + * because the assert '(?=!{3,})' fails. The subject 'dwad123!!!' would return a
> + * successful match.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_MATCH_POINT_RST_F (1ULL << 7)
> +/**< RegEx device doesn't support PCRE match point reset directive.
> + * Example RegEx is '[a-z]+\K\d+' if the pattern is 'dwad123'
> + * then even though the entire pattern matches only '123'
> + * is reported as a match.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_NEWLINE_CONVENTIONS_F (1ULL << 8)
> +/**< RegEx device doesn't support PCRE newline convention.
> + * Newline conventions are represented as follows:
> + * (*CR)        carriage return
> + * (*LF)        linefeed
> + * (*CRLF)      carriage return, followed by linefeed
> + * (*ANYCRLF)   any of the three above
> + * (*ANY)       all Unicode newline sequences
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_NEWLINE_SEQ_F (1ULL << 9)
> +/**< RegEx device doesn't support PCRE newline sequence.
> + * The escape sequence '\R' will match any newline sequence.
> + * It is equivalent to: '(?>\r\n|\n|\x0b|\f|\r|\x85)'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_POSSESSIVE_QUALIFIERS_F (1ULL << 10)
> +/**< RegEx device doesn't support PCRE possessive quantifiers.
> + * Example RegEx possessive quantifiers are '*+', '++', '?+', '{m,n}+'.
> + * A possessive quantifier repeats the token as many times as possible and
> + * does not give up matches as the engine backtracks. With a possessive
> + * quantifier, the deal is all or nothing.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_SUBROUTINE_REFERENCES_F (1ULL << 11)
> +/**< RegEx device doesn't support PCRE Subroutine references.
> + * PCRE Subroutine references allow for sub patterns to be assessed
> + * as part of the RegEx. Example RegEx is '(foo|fuzz)\g<1>+bar' matches the
> + * pattern 'foofoofuzzfoofuzzbar'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_UTF_8_F (1ULL << 12)
> +/**< RegEx device doesn't support UTF-8 character encoding.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_UTF_16_F (1ULL << 13)
> +/**< RegEx device doesn't support UTF-16 character encoding.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_UTF_32_F (1ULL << 14)
> +/**< RegEx device doesn't support UTF-32 character encoding.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_WORD_BOUNDARY_F (1ULL << 15)
> +/**< RegEx device doesn't support word boundaries.
> + * The meta character '\b' represents word boundary anchor.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_FORWARD_REFERENCES_F (1ULL << 16)
> +/**< RegEx device doesn't support Forward references.
> + * Forward references allow you to use a back reference to a group that appears
> + * later in the RegEx. Example RegEx is '(\3ABC|(DEF|(GHI)))+', which matches
> + * the string 'GHIGHIABCDEF'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
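
For what it is worth, this is how I would expect an application to consume pcre_unsup_flags before loading a rule set. The three bit values are copied from the defines above; demo_rules_supported() and the idea of precomputing a "needed features" mask per rule set are my own assumptions, not something the RFC provides.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions copied from the patch under review. */
#define RTE_REGEX_DEV_PCRE_UNSUP_ATOMIC_GROUPING_F (1ULL << 1)
#define RTE_REGEX_DEV_PCRE_UNSUP_BACKREFERENCE_F (1ULL << 4)
#define RTE_REGEX_DEV_PCRE_UNSUP_LOOKAROUND_ASRT_F (1ULL << 6)

/*
 * Hypothetical helper: 'needed' is a mask built by the application with
 * the UNSUP_* bit of every PCRE feature its rules rely on. The rule set
 * is usable only if the device reports none of those bits as unsupported.
 */
static bool
demo_rules_supported(uint64_t pcre_unsup_flags, uint64_t needed)
{
	return (pcre_unsup_flags & needed) == 0;
}
```

An application with rules that use lookaround would pass RTE_REGEX_DEV_PCRE_UNSUP_LOOKAROUND_ASRT_F in the second argument and fall back to a software engine when the helper returns false.
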
> +
> +/* Enumerates PCRE rule flags */
> +#define RTE_REGEX_PCRE_RULE_ALLOW_EMPTY_F (1ULL << 0)
> +/**< When this flag is set, patterns that can match against an empty string,
> + * such as '.*', are allowed.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_ANCHORED_F (1ULL << 1)
> +/**< When this flag is set, the pattern is forced to be "anchored", that is, it
> + * is constrained to match only at the first matching point in the string that
> + * is being searched. Similar to '^' and represented by \A.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_CASELESS_F (1ULL << 2)
> +/**< When this flag is set, letters in the pattern match both upper and lower
> + * case letters in the subject.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_DOTALL_F (1ULL << 3)
> +/**< When this flag is set, a dot metacharacter in the pattern matches any
> + * character, including one that indicates a newline.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_DUPNAMES_F (1ULL << 4)
> +/**< When this flag is set, names used to identify capture groups need not be
> + * unique.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_EXTENDED_F (1ULL << 5)
> +/**< When this flag is set, most white space characters in the pattern are
> + * totally ignored except when escaped or inside a character class.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_MATCH_UNSET_BACKREF_F (1ULL << 6)
> +/**< When this flag is set, a backreference to an unset capture group matches an
> + * empty string.
> + * @see RTE_REGEX_DEV_PCRE_UNSUP_FORWARD_REFERENCES_F
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_MULTILINE_F (1ULL << 7)
> +/**< When this flag is set, the '^' and '$' constructs match immediately
> + * following or immediately before internal newlines in the subject string,
> + * respectively, as well as at the very start and end.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_NO_AUTO_CAPTURE_F (1ULL << 8)
> +/**< When this flag is set, it disables the use of numbered capturing
> + * parentheses in the pattern. References to capture groups (backreferences or
> + * recursion/subroutine calls) may only refer to named groups, though the
> + * reference can be by name or by number.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_UCP_F (1ULL << 9)
> +/**< By default, only ASCII characters are recognized. When this flag is set,
> + * Unicode properties are used instead to classify characters.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_UNGREEDY_F (1ULL << 10)
> +/**< When this flag is set, the "greediness" of the quantifiers is inverted
> + * so that they are not greedy by default, but become greedy if followed by
> + * '?'.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_UTF_F (1ULL << 11)
> +/**< When this flag is set, the RegEx engine has to regard both the pattern and
> + * the subject strings that are subsequently processed as strings of UTF
> + * characters instead of single-code-unit strings.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_NEVER_BACKSLASH_C_F (1ULL << 12)
> +/**< This flag locks out the use of '\C' in the pattern that is being compiled.
> + * This escape matches one data unit, even in UTF mode, which can cause
> + * unpredictable behavior in UTF-8 or UTF-16 modes, because it may leave the
> + * current matching point in the middle of a multi-code-unit character.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +
> +/**
> + * RegEx device information
> + */
> +struct rte_regex_dev_info {
> +	const char *driver_name; /**< RegEx driver name */
> +	struct rte_device *dev;	/**< Device information */
> +	uint8_t max_matches;
> +	/**< Maximum matches per scan supported by this device */
> +	uint16_t max_queue_pairs;
> +	/**< Maximum queue pairs supported by this device */
> +	uint16_t max_payload_size;
> +	/**< Maximum payload size for a pattern match request or scan.
> +	 * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
> +	 */
> +	uint16_t max_rules_per_group;
> +	/**< Maximum rules supported per group by this device */
> +	uint16_t max_groups;
> +	/**< Maximum group supported by this device */
> +	uint32_t regex_dev_capa;
> +	/**< RegEx device capabilities. @see RTE_REGEX_DEV_CAPA_* */
> +	uint64_t rule_flags;
> +	/**< Supported compiler rule flags.
> +	 * @see RTE_REGEX_PCRE_RULE_*, struct rte_regex_rule::rule_flags
> +	 */
> +	uint64_t pcre_unsup_flags;
> +	/**< Unsupported PCRE features for this RegEx device.
> +	 * @see RTE_REGEX_DEV_PCRE_UNSUP_*
> +	 */
> +};
> +
> +/**
> + * Retrieve the contextual information of a RegEx device.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + *
> + * @param[out] dev_info
> + *   A pointer to a structure of type *rte_regex_dev_info* to be filled with the
> + *   contextual information of the device.
> + *
> + * @return
> + *   - 0: Success, driver updates the contextual information of the RegEx device
> + *   - <0: Error code returned by the driver info get function.
> + *
> + */
> +int
> +rte_regex_dev_info_get(uint8_t dev_id, struct rte_regex_dev_info *dev_info);
> +
> +/* Enumerates RegEx device configuration flags */
> +#define RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F (1ULL << 0)
> +/**< Cross buffer scan refers to the ability to detect
> + * matches that occur across buffer boundaries, where the buffers are related
> + * to each other in some way. Enable this flag to scan a payload larger
> + * than struct rte_regex_dev_info::max_payload_size and/or when
> + * matches can be present across scan buffer boundaries.
> + *
> + * @see struct rte_regex_dev_info::max_payload_size
> + * @see struct rte_regex_dev_config::dev_cfg_flags, rte_regex_dev_configure()
> + * @see RTE_REGEX_OPS_RSP_PMI_SOJ_F
> + * @see RTE_REGEX_OPS_RSP_PMI_EOJ_F
> + */
> +
> +/** RegEx device configuration structure */
> +struct rte_regex_dev_config {
> +	uint8_t nb_max_matches;
> +	/**< Maximum matches per scan configured on this device.
> +	 * This value cannot exceed the *max_matches*
> +	 * previously provided by rte_regex_dev_info_get().
> +	 * The value 0 is allowed, in which case the value 1 is used.
> +	 * @see struct rte_regex_dev_info::max_matches
> +	 */
> +	uint16_t nb_queue_pairs;
> +	/**< Number of RegEx queue pairs to configure on this device.
> +	 * This value cannot exceed the *max_queue_pairs* previously
> +	 * provided by rte_regex_dev_info_get().
> +	 * @see struct rte_regex_dev_info::max_queue_pairs
> +	 */
> +	uint16_t nb_rules_per_group;
> +	/**< Number of rules per group to configure on this device.
> +	 * This value cannot exceed the *max_rules_per_group*
> +	 * previously provided by rte_regex_dev_info_get().
> +	 * The value 0 is allowed, in which case
> +	 * struct rte_regex_dev_info::max_rules_per_group is used.
> +	 * @see struct rte_regex_dev_info::max_rules_per_group
> +	 */
> +	uint16_t nb_groups;
> +	/**< Number of groups to configure on this device.
> +	 * This value cannot exceed the *max_groups*
> +	 * previously provided by rte_regex_dev_info_get().
> +	 * @see struct rte_regex_dev_info::max_groups
> +	 */
> +	const char *rule_db;
> +	/**< Import an initial prebuilt rule database on this device.
> +	 * The value NULL is allowed, in which case the device will not
> +	 * be configured with a prebuilt rule database. The application may use
> +	 * the rte_regex_rule_db_update() or rte_regex_rule_db_import() API
> +	 * to update or import a rule database after
> +	 * rte_regex_dev_configure().
> +	 * @see rte_regex_rule_db_update(), rte_regex_rule_db_import()
> +	 */
> +	uint32_t rule_db_len;
> +	/**< Length of *rule_db* buffer. */
> +	uint32_t dev_cfg_flags;
> +	/**< RegEx device configuration flags, See RTE_REGEX_DEV_CFG_* */
> +};
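
The limits and zero-value defaults documented in this structure could be checked along these lines. This is only a sketch: the demo_* structs are trimmed stand-ins for rte_regex_dev_info and rte_regex_dev_config, demo_config_check() is a hypothetical helper, and unlike rte_regex_dev_configure() it takes a non-const config so that it can write the defaults back.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Trimmed stand-in for struct rte_regex_dev_info (limits only). */
struct demo_dev_info {
	uint8_t max_matches;
	uint16_t max_queue_pairs;
	uint16_t max_rules_per_group;
	uint16_t max_groups;
};

/* Trimmed stand-in for struct rte_regex_dev_config. */
struct demo_dev_config {
	uint8_t nb_max_matches;
	uint16_t nb_queue_pairs;
	uint16_t nb_rules_per_group;
	uint16_t nb_groups;
};

/*
 * Hypothetical helper: apply the documented defaults for zero values,
 * then reject any request that exceeds the device limits.
 */
static int
demo_config_check(const struct demo_dev_info *info,
		  struct demo_dev_config *cfg)
{
	if (cfg->nb_max_matches == 0)
		cfg->nb_max_matches = 1;
	if (cfg->nb_rules_per_group == 0)
		cfg->nb_rules_per_group = info->max_rules_per_group;

	if (cfg->nb_max_matches > info->max_matches ||
	    cfg->nb_queue_pairs > info->max_queue_pairs ||
	    cfg->nb_rules_per_group > info->max_rules_per_group ||
	    cfg->nb_groups > info->max_groups)
		return -EINVAL;

	return 0;
}
```

Note that the comments define a default for nb_max_matches and nb_rules_per_group but say nothing about nb_queue_pairs == 0 or nb_groups == 0; the sketch lets both through, which may or may not be the intent.
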
> +
> +/**
> + * Configure a RegEx device.
> + *
> + * This function must be invoked first before any other function in the
> + * API. This function can also be re-invoked when a device is in the
> + * stopped state.
> + *
> + * The caller may use rte_regex_dev_info_get() to get the capability of each
> + * resource available for this RegEx device.
> + *
> + * @param dev_id
> + *   The identifier of the device to configure.
> + * @param cfg
> + *   The RegEx device configuration structure.
> + *
> + * @return
> + *   - 0: Success, device configured.
> + *   - <0: Error code returned by the driver configuration function.
> + */
> +int
> +rte_regex_dev_configure(uint8_t dev_id, const struct rte_regex_dev_config *cfg);
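
To make the limits documented above concrete, here is a minimal sketch of filling the configuration from the reported capabilities. The two structs are trimmed local copies of the fields quoted in this RFC, and `fill_config()` and `min_u16()` are hypothetical helpers, not part of the proposal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Trimmed local copies of the RFC structures (only the fields used here). */
struct rte_regex_dev_info {
	uint16_t max_queue_pairs;
	uint16_t max_rules_per_group;
	uint16_t max_groups;
};

struct rte_regex_dev_config {
	uint16_t nb_queue_pairs;
	uint16_t nb_rules_per_group;
	uint16_t nb_groups;
	const char *rule_db;
	uint32_t rule_db_len;
	uint32_t dev_cfg_flags;
};

static uint16_t
min_u16(uint16_t a, uint16_t b)
{
	return a < b ? a : b;
}

/* Hypothetical helper: clamp the wanted values to the device limits. */
static void
fill_config(const struct rte_regex_dev_info *info, uint16_t want_qps,
	    uint16_t want_groups, struct rte_regex_dev_config *cfg)
{
	assert(info != NULL && cfg != NULL);
	cfg->nb_queue_pairs = min_u16(want_qps, info->max_queue_pairs);
	cfg->nb_groups = min_u16(want_groups, info->max_groups);
	cfg->nb_rules_per_group = 0; /* 0 selects max_rules_per_group */
	cfg->rule_db = NULL;         /* no prebuilt database at configure time */
	cfg->rule_db_len = 0;
	cfg->dev_cfg_flags = 0;
}
```

An application would then pass the filled structure to rte_regex_dev_configure() and load rules afterwards.
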
> +
> +/* Enumerates RegEx queue pair configuration flags */
> +#define RTE_REGEX_QUEUE_PAIR_CFG_OOS_F (1ULL << 0)
> +/**< Out of order scan. If not set, a scan must retire after previously issued
> + * in-order scans to this queue pair. If set, this scan can be retired as soon
> + * as the device returns completion. The application should not set the out of
> + * order scan flag if it needs to maintain the ingress order of scan requests.
> + *
> + * @see struct rte_regex_qp_conf::qp_conf_flags, rte_regex_queue_pair_setup()
> + */
> +
> +struct rte_regex_ops;
> +typedef void (*regexdev_stop_flush_t)(uint8_t dev_id, uint16_t qp_id,
> +				      struct rte_regex_ops *op);
> +/**< Callback function called during rte_regex_dev_stop(), invoked once per
> + * flushed RegEx op.
> + */
> +
> +/** RegEx queue pair configuration structure */
> +struct rte_regex_qp_conf {
> +	uint32_t qp_conf_flags;
> +	/**< Queue pair config flags, See RTE_REGEX_QUEUE_PAIR_CFG_* */
> +	uint16_t nb_desc;
> +	/**< The number of descriptors to allocate for this queue pair. */
> +	regexdev_stop_flush_t cb;
> +	/**< Callback function called during rte_regex_dev_stop(), invoked
> +	 * once per flushed regex op. Value NULL is allowed, in which case
> +	 * callback will not be invoked. This function can be used to properly
> +	 * dispose of outstanding regex ops from response queue,
> +	 * for example ops containing memory pointers.
> +	 * @see rte_regex_dev_stop()
> +	 */
> +};
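
As an illustration of the stop-flush callback described above, the sketch below frees the memory that outstanding ops still point to. The op is a trimmed stand-in that keeps only the opaque user pointer, and `fake_qp_flush()` is a hypothetical model of what a PMD might do for one queue pair during rte_regex_dev_stop(); it is not taken from any driver.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Trimmed stand-in for the RFC op; only the opaque user pointer is kept. */
struct rte_regex_ops {
	void *user_ptr;
};

typedef void (*regexdev_stop_flush_t)(uint8_t dev_id, uint16_t qp_id,
				      struct rte_regex_ops *op);

static unsigned int flushed_ops; /* bookkeeping for this sketch only */

/* Application callback: release the memory the op still points to. */
static void
app_stop_flush(uint8_t dev_id, uint16_t qp_id, struct rte_regex_ops *op)
{
	(void)dev_id;
	(void)qp_id;
	free(op->user_ptr);
	op->user_ptr = NULL;
	flushed_ops++;
}

/* Hypothetical model of a PMD flushing one queue pair on stop. */
static void
fake_qp_flush(uint8_t dev_id, uint16_t qp_id, struct rte_regex_ops **pending,
	      uint16_t nb_pending, regexdev_stop_flush_t cb)
{
	uint16_t i;

	assert(pending != NULL || nb_pending == 0);
	if (cb == NULL)
		return; /* NULL is allowed: no callback is invoked */
	for (i = 0; i < nb_pending; i++)
		cb(dev_id, qp_id, pending[i]);
}
```
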
> +
> +/**
> + * Allocate and set up a RegEx queue pair for a RegEx device.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param queue_pair_id
> + *   The index of the RegEx queue pair to setup. The value must be in the range
> + *   [0, nb_queue_pairs - 1] previously supplied to rte_regex_dev_configure().
> + * @param qp_conf
> + *   The pointer to the configuration data to be used for the RegEx queue pair.
> + *   NULL value is allowed, in which case the default configuration is used.
> + *
> + * @return
> + *   - 0: Success, RegEx queue pair correctly set up.
> + *   - <0: RegEx queue configuration failed
> + */
> +int
> +rte_regex_queue_pair_setup(uint8_t dev_id, uint8_t queue_pair_id,
> +			   const struct rte_regex_qp_conf *qp_conf);
> +
> +/**
> + * Start a RegEx device.
> + *
> + * The device start step is the last one and consists of setting the RegEx
> + * queues to start accepting the pattern matching scan requests.
> + *
> + * On success, all basic functions exported by the API (RegEx enqueue,
> + * RegEx dequeue and so on) can be invoked.
> + *
> + * @param dev_id
> + *   RegEx device identifier
> + * @return
> + *   - 0: Success, device started.
> + *   - <0: Device start failed.
> + */
> +int
> +rte_regex_dev_start(uint8_t dev_id);
> +
> +/**
> + * Stop a RegEx device.
> + *
> + * The device can be restarted with a call to rte_regex_dev_start().
> + *
> + * This function causes all queued response regex ops to be drained from the
> + * response queue. While draining ops out of the device,
> + * struct rte_regex_qp_conf::cb will be invoked for each op.
> + *
> + * @param dev_id
> + *   RegEx device identifier.
> + *
> + * @see struct rte_regex_qp_conf::cb, rte_regex_queue_pair_setup()
> + */
> +void
> +rte_regex_dev_stop(uint8_t dev_id);
> +
> +/**
> + * Close a RegEx device. The device cannot be restarted!
> + *
> + * @param dev_id
> + *   RegEx device identifier
> + *
> + * @return
> + *  - 0 on successfully closed the device.
> + *  - <0 on failure to close the device.
> + */
> +int
> +rte_regex_dev_close(uint8_t dev_id);
> +
> +/* Device get/set attributes */
> +
> +/** Enumerates RegEx device attribute identifier */
> +enum rte_regex_dev_attr_id {
> +	RTE_REGEX_DEV_ATTR_SOCKET_ID,
> +	/**< The NUMA socket id to which the device is connected or
> +	 * a default of zero if the socket could not be determined.
> +	 * datatype: *int*
> +	 * operation: *get*
> +	 */
> +	RTE_REGEX_DEV_ATTR_MAX_MATCHES,
> +	/**< Maximum number of matches per scan.
> +	 * datatype: *uint8_t*
> +	 * operation: *get* and *set*
> +	 *
> +	 * @see RTE_REGEX_OPS_RSP_MAX_MATCH_F
> +	 */
> +	RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT,
> +	/**< Upper bound scan time in ns.
> +	 * datatype: *uint16_t*
> +	 * operation: *get* and *set*
> +	 *
> +	 * @see RTE_REGEX_OPS_RSP_MAX_SCAN_TIMEOUT_F
> +	 */
> +	RTE_REGEX_DEV_ATTR_MAX_PREFIX,
> +	/**< Maximum number of prefixes detected per scan.
> +	 * This would be useful for denial of service detection.
> +	 * datatype: *uint16_t*
> +	 * operation: *get* and *set*
> +	 *
> +	 * @see RTE_REGEX_OPS_RSP_MAX_PREFIX_F
> +	 */
> +};
> +
> +/**
> + * Get an attribute from a RegEx device.
> + *
> + * @param dev_id RegEx device identifier
> + * @param attr_id The attribute ID to retrieve
> + * @param[out] attr_value A pointer that will be filled in with the attribute
> + *             value if successful.
> + *
> + * @return
> + *   - 0: Successfully retrieved attribute value.
> + *   - -EINVAL: Invalid device or  *attr_id* provided, or *attr_value* is NULL.
> + *   - -ENOTSUP: if the device doesn't support specific *attr_id*.
> + */
> +int
> +rte_regex_dev_attr_get(uint8_t dev_id, enum rte_regex_dev_attr_id attr_id,
> +		       void *attr_value);
> +
> +/**
> + * Set an attribute to a RegEx device.
> + *
> + * @param dev_id RegEx device identifier
> + * @param attr_id The attribute ID to set
> + * @param attr_value A pointer to the attribute value supplied
> + *                   by the application
> + *
> + * @return
> + *   - 0: Successfully applied the attribute value.
> + *   - -EINVAL: Invalid device or  *attr_id* provided, or *attr_value* is NULL.
> + *   - -ENOTSUP: if the device doesn't support specific *attr_id*.
> + */
> +int
> +rte_regex_dev_attr_set(uint8_t dev_id, enum rte_regex_dev_attr_id attr_id,
> +		       const void *attr_value);
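
Because the value travels through a `void *`, the caller has to pass a pointer of exactly the datatype documented for each attribute. The sketch below models that contract in software. `struct fake_dev`, `fake_attr_get()` and `fake_attr_set()` are hypothetical stand-ins, and rejecting a set on the get-only socket ID with -ENOTSUP is an assumption; the RFC does not say which error applies.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Attribute IDs as listed in the RFC, with their documented datatypes. */
enum rte_regex_dev_attr_id {
	RTE_REGEX_DEV_ATTR_SOCKET_ID,        /* int, get */
	RTE_REGEX_DEV_ATTR_MAX_MATCHES,      /* uint8_t, get and set */
	RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT, /* uint16_t, get and set */
	RTE_REGEX_DEV_ATTR_MAX_PREFIX,       /* uint16_t, get and set */
};

/* Hypothetical software device holding one value per attribute. */
struct fake_dev {
	int socket_id;
	uint8_t max_matches;
	uint16_t max_scan_timeout;
	uint16_t max_prefix;
};

static int
fake_attr_get(const struct fake_dev *dev, enum rte_regex_dev_attr_id id,
	      void *value)
{
	if (dev == NULL || value == NULL)
		return -EINVAL;
	switch (id) {
	case RTE_REGEX_DEV_ATTR_SOCKET_ID:
		*(int *)value = dev->socket_id;
		return 0;
	case RTE_REGEX_DEV_ATTR_MAX_MATCHES:
		*(uint8_t *)value = dev->max_matches;
		return 0;
	case RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT:
		*(uint16_t *)value = dev->max_scan_timeout;
		return 0;
	case RTE_REGEX_DEV_ATTR_MAX_PREFIX:
		*(uint16_t *)value = dev->max_prefix;
		return 0;
	}
	return -EINVAL; /* unknown attr_id */
}

static int
fake_attr_set(struct fake_dev *dev, enum rte_regex_dev_attr_id id,
	      const void *value)
{
	if (dev == NULL || value == NULL)
		return -EINVAL;
	switch (id) {
	case RTE_REGEX_DEV_ATTR_SOCKET_ID:
		return -ENOTSUP; /* assumption: get-only attributes reject set */
	case RTE_REGEX_DEV_ATTR_MAX_MATCHES:
		dev->max_matches = *(const uint8_t *)value;
		return 0;
	case RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT:
		dev->max_scan_timeout = *(const uint16_t *)value;
		return 0;
	case RTE_REGEX_DEV_ATTR_MAX_PREFIX:
		dev->max_prefix = *(const uint16_t *)value;
		return 0;
	}
	return -EINVAL; /* unknown attr_id */
}
```

Passing, say, a `uint32_t *` for MAX_MATCHES would compile silently and read or write the wrong width, which may be worth calling out in the documentation.
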
> +
> +/* Rule related APIs */
> +/** Enumerates RegEx rule operation */
> +enum rte_regex_rule_op {
> +	RTE_REGEX_RULE_OP_ADD,
> +	/**< Add RegEx rule to rule database */
> +	RTE_REGEX_RULE_OP_REMOVE
> +	/**< Remove RegEx rule from rule database */
> +};
> +
> +/** Structure to hold a RegEx rule attributes */
> +struct rte_regex_rule {
> +	enum rte_regex_rule_op op;
> +	/**< OP type of the rule, either RTE_REGEX_RULE_OP_ADD or
> +	 * RTE_REGEX_RULE_OP_REMOVE.
> +	 */
> +	uint16_t group_id;
> +	/**< Group identifier to which the rule belongs to. */
> +	uint32_t rule_id;
> +	/**< Rule identifier which is returned on successful match. */
> +	const char *pcre_rule;
> +	/**< Buffer to hold the PCRE rule. */
> +	uint16_t pcre_rule_len;
> +	/**< Length of the PCRE rule */
> +	uint64_t rule_flags;
> +	/**< PCRE rule flags. Supported device specific PCRE rules are enumerated
> +	 * in struct rte_regex_dev_info::rule_flags. For a successful rule
> +	 * database update, the application needs to provide only supported
> +	 * rule flags.
> +	 * @see RTE_REGEX_PCRE_RULE_*, struct rte_regex_dev_info::rule_flags
> +	 */
> +};
> +
> +/**
> + * Update the rule database of a RegEx device.
> + *
> + * @param dev_id RegEx device identifier
> + * @param rules
> + *   Points to an array of *nb_rules* objects of type *rte_regex_rule* structure
> + *   which contain the regex rules attributes to be updated in rule database.
> + * @param nb_rules
> + *   The number of PCRE rules to update the rule database.
> + *
> + * @return
> + *   The number of regex rules actually updated on the regex device's rule
> + *   database. The return value can be less than the value of the *nb_rules*
> + *   parameter when the regex device fails to update the rule database or
> + *   if invalid parameters are specified in a *rte_regex_rule*.
> + *   If the return value is less than *nb_rules*, the remaining PCRE rules
> + *   at the end of *rules* are not consumed and the caller has to take
> + *   care of them and rte_errno is set accordingly.
> + *   Possible errno values include:
> + *   - -EINVAL:  Invalid device ID or rules is NULL
> + *   - -ENOTSUP: The last processed rule is not supported on this device.
> + *   - -ENOSPC: No space available in rule database.
> + *
> + * @see rte_regex_rule_db_import(), rte_regex_rule_db_export()
> + */
> +uint16_t
> +rte_regex_rule_db_update(uint8_t dev_id, const struct rte_regex_rule *rules,
> +			 uint16_t nb_rules);
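
The partial-update contract documented above (a count of consumed rules plus an error code for the first unconsumed one) could be handled by a caller roughly as sketched below. `fake_rule_db_update()` is a hypothetical software model with a fixed number of free slots, `fake_rte_errno` stands in for rte_errno, and the negative sign of the stored error simply follows the list in the comment above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

enum rte_regex_rule_op {
	RTE_REGEX_RULE_OP_ADD,
	RTE_REGEX_RULE_OP_REMOVE
};

/* Local copy of the RFC rule structure. */
struct rte_regex_rule {
	enum rte_regex_rule_op op;
	uint16_t group_id;
	uint32_t rule_id;
	const char *pcre_rule;
	uint16_t pcre_rule_len;
	uint64_t rule_flags;
};

static int fake_rte_errno;        /* stand-in for rte_errno */
static uint16_t fake_db_free = 2; /* free rule slots in the fake database */

/* Hypothetical model: consume rules in order, stop at the first failure. */
static uint16_t
fake_rule_db_update(const struct rte_regex_rule *rules, uint16_t nb_rules)
{
	uint16_t i;

	fake_rte_errno = 0;
	if (rules == NULL) {
		fake_rte_errno = -EINVAL;
		return 0;
	}
	for (i = 0; i < nb_rules; i++) {
		if (rules[i].op == RTE_REGEX_RULE_OP_ADD) {
			if (fake_db_free == 0) {
				fake_rte_errno = -ENOSPC;
				break;
			}
			fake_db_free--;
		} else {
			fake_db_free++;
		}
	}
	return i;
}

/* Returns the index of the first rule that was not applied, or nb_rules. */
static uint16_t
apply_rules(const struct rte_regex_rule *rules, uint16_t nb_rules, int *err)
{
	uint16_t done = fake_rule_db_update(rules, nb_rules);

	assert(err != NULL);
	*err = (done < nb_rules) ? fake_rte_errno : 0;
	return done;
}
```

The returned index tells the caller which rule to fix or drop before retrying with the remaining tail.
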

I think the function name is not informative enough. If this function is meant to compile the rules, then the function name should say so explicitly.

> +
> +/**
> + * Import a prebuilt rule database from a buffer to a RegEx device.
> + *
> + * @param dev_id RegEx device identifier
> + * @param rule_db
> + *   Points to prebuilt rule database.
> + * @param rule_db_len
> + *   Length of the rule database.
> + *
> + * @return
> + *   - 0: Successfully updated the prebuilt rule database.
> + *   - -EINVAL:  Invalid device ID or rule_db is NULL
> + *   - -ENOTSUP: Rule database import is not supported on this device.
> + *   - -ENOSPC: No space available in rule database.
> + *
> + * @see rte_regex_rule_db_update(), rte_regex_rule_db_export()
> + */
> +int
> +rte_regex_rule_db_import(uint8_t dev_id, const char *rule_db,
> +			 uint32_t rule_db_len);
> +
> +/**
> + * Export the prebuilt rule database from a RegEx device to the buffer.
> + *
> + * @param dev_id RegEx device identifier
> + * @param[out] rule_db
> + *   Block of memory to insert the rule database. Must be at least size in
> + *   capacity. If set to NULL, function returns required capacity.
> + *
> + * @return
> + *   - 0: Successfully exported the prebuilt rule database.
> + *   - size: If rule_db set to NULL then required capacity for *rule_db*
> + *   - -EINVAL:  Invalid device ID
> + *   - -ENOTSUP: Rule database export is not supported on this device.
> + *
> + * @see rte_regex_rule_db_update(), rte_regex_rule_db_import()
> + */
> +int
> +rte_regex_rule_db_export(uint8_t dev_id, char *rule_db);
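
The export call above follows a "query the size with NULL, then call again with a buffer" pattern. A minimal sketch of the caller side is shown here; `fake_rule_db_export()` is a hypothetical stand-in that mirrors the documented contract without the device ID, and `export_db()` is an illustrative helper, not part of the RFC.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical blob standing in for a compiled rule database. */
static const char fake_db[] = { 'R', 'D', 'B', 1, 0, 42 };

/* Same contract as the export call quoted above, minus the device ID. */
static int
fake_rule_db_export(char *rule_db)
{
	if (rule_db == NULL)
		return (int)sizeof(fake_db); /* query: required capacity */
	memcpy(rule_db, fake_db, sizeof(fake_db));
	return 0;
}

/* Query the size, allocate, then export. The caller frees *out. */
static int
export_db(char **out, int *out_len)
{
	char *buf;
	int len, ret;

	assert(out != NULL && out_len != NULL);
	len = fake_rule_db_export(NULL);
	if (len <= 0)
		return len < 0 ? len : -ENOTSUP; /* assumption: empty is an error */
	buf = malloc((size_t)len);
	if (buf == NULL)
		return -ENOMEM;
	ret = fake_rule_db_export(buf);
	if (ret < 0) {
		free(buf);
		return ret;
	}
	*out = buf;
	*out_len = len;
	return 0;
}
```

Writing this out shows one possible ambiguity: with a NULL buffer, a return value of 0 could mean either success or a required capacity of zero, so a separate length output parameter might be clearer.
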
> +
> +/* Extended statistics */
> +/** Maximum name length for extended statistics counters */
> +#define RTE_REGEX_DEV_XSTATS_NAME_SIZE 64
> +
> +/**
> + * A name-key lookup element for extended statistics.
> + *
> + * This structure is used to map between names and ID numbers
> + * for extended RegEx device statistics.
> + */
> +struct rte_regex_dev_xstats_map {
> +	uint16_t id;
> +	/**< xstat identifier */
> +	char name[RTE_REGEX_DEV_XSTATS_NAME_SIZE];
> +	/**< xstat name */
> +};
> +
> +/**
> + * Retrieve names of extended statistics of a regex device.
> + *
> + * @param dev_id
> + *   The identifier of the regex device.
> + * @param[out] xstats_map
> + *   Block of memory to insert id and names into. Must be at least size in
> + *   capacity. If set to NULL, function returns required capacity.
> + * @return
> + *   - positive value on success:
> + *        -The return value is the number of entries filled in the stats map.
> + *        -If xstats_map set to NULL then required capacity for xstats_map.
> + *   - negative value on error:
> + *      -ENODEV for invalid *dev_id*
> + *      -ENOTSUP if the device doesn't support this function.
> + */
> +int
> +rte_regex_dev_xstats_names_get(uint8_t dev_id,
> +			       struct rte_regex_dev_xstats_map *xstats_map);
> +
> +/**
> + * Retrieve extended statistics of a regex device.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param ids
> + *   The id numbers of the stats to get. The ids can be got from the stat
> + *   position in the stat list from rte_regex_dev_xstats_names_get(), or
> + *   by using rte_regex_dev_xstats_by_name_get().
> + * @param[out] values
> + *   The values for each stats request by ID.
> + * @param n
> + *   The number of stats requested
> + * @return
> + *   - positive value: number of stat entries filled into the values array
> + *   - negative value on error:
> + *      -ENODEV for invalid *dev_id*
> + *      -ENOTSUP if the device doesn't support this function.
> + */
> +int
> +rte_regex_dev_xstats_get(uint8_t dev_id, const uint16_t ids[],
> +			 uint64_t values[], uint16_t n);
> +
> +/**
> + * Retrieve the value of a single stat by requesting it by name.
> + *
> + * @param dev_id
> + *   The identifier of the device
> + * @param name
> + *   The stat name to retrieve
> + * @param[out] id
> + *   If non-NULL, the numerical id of the stat will be returned, so that further
> + *   requests for the stat can be got using rte_regex_dev_xstats_get, which will
> + *   be faster as it doesn't need to scan a list of names for the stat.
> + * @param[out] value
> + *   Must be non-NULL, retrieved xstat value will be stored in this address.
> + *
> + * @return
> + *   - 0: Successfully retrieved xstat value.
> + *   - -EINVAL: invalid parameters
> + *   - -ENOTSUP: if not supported.
> + */
> +int
> +rte_regex_dev_xstats_by_name_get(uint8_t dev_id, const char *name,
> +				 uint16_t *id, uint64_t *value);
> +
> +/**
> + * Reset the values of the xstats of the selected component in the device.
> + *
> + * @param dev_id
> + *   The identifier of the device
> + * @param ids
> + *   Selects specific statistics to be reset. When NULL, all statistics will be
> + *   reset. If non-NULL, must point to array of at least *nb_ids* size.
> + * @param nb_ids
> + *   The number of ids available from the *ids* array. Ignored when ids is NULL.
> + * @return
> + *   - 0: Successfully reset the statistics to zero.
> + *   - -EINVAL: invalid parameters
> + *   - -ENOTSUP: if not supported.
> + */
> +int
> +rte_regex_dev_xstats_reset(uint8_t dev_id, const uint16_t ids[],
> +			   uint16_t nb_ids);
> +
> +/**
> + * Trigger the RegEx device self test.
> + *
> + * @param dev_id
> + *   The identifier of the device
> + * @return
> + *   - 0: Selftest successful
> + *   - -ENOTSUP if the device doesn't support selftest
> + *   - other values < 0 on failure.
> + */
> +int rte_regex_dev_selftest(uint8_t dev_id);
> +
> +/**
> + * Dump internal information about *dev_id* to the FILE* provided in *f*.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + *
> + * @param f
> + *   A pointer to a file for output
> + *
> + * @return
> + *   - 0: on success
> + *   - <0: on failure.
> + */
> +int
> +rte_regex_dev_dump(uint8_t dev_id, FILE *f);
> +
> +/* Fast path APIs */
> +
> +/**
> + * The generic *rte_regex_match* structure to hold the RegEx match attributes.
> + * @see struct rte_regex_ops::matches
> + */
> +struct rte_regex_match {
> +	RTE_STD_C11
> +	union {
> +		uint64_t u64;
> +		struct {
> +			uint32_t rule_id:20;
> +			/**< Rule identifier to which the pattern matched.
> +			 * @see struct rte_regex_rule::rule_id
> +			 */
> +			uint32_t group_id:12;
> +			/**< Group identifier of the rule which the pattern
> +			 * matched. @see struct rte_regex_rule::group_id
> +			 */
> +			uint16_t offset;
> +			/**< Starting Byte Position for matched rule. */
> +			uint16_t len;
> +			/**< Length of match in bytes */
> +		};
> +	};
> +};
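
One consequence of the packed layout above is that the match tuple carries a 20-bit rule ID and a 12-bit group ID, while struct rte_regex_rule declares them as uint32_t and uint16_t. The sketch below uses a local copy of the layout (RTE_STD_C11 dropped, so it needs a C11 compiler for the anonymous members); `ids_fit_match()` and `match_end()` are hypothetical helpers added for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local copy of the RFC layout. */
struct rte_regex_match {
	union {
		uint64_t u64;
		struct {
			uint32_t rule_id:20;
			uint32_t group_id:12;
			uint16_t offset;
			uint16_t len;
		};
	};
};

#define MATCH_RULE_ID_BITS  20
#define MATCH_GROUP_ID_BITS 12

/* Non-zero when both IDs can be reported back without truncation. */
static int
ids_fit_match(uint32_t rule_id, uint16_t group_id)
{
	return rule_id < (1u << MATCH_RULE_ID_BITS) &&
	       group_id < (1u << MATCH_GROUP_ID_BITS);
}

/* Offset of the first byte past the match in the scanned buffer. */
static uint32_t
match_end(const struct rte_regex_match *m)
{
	assert(m != NULL);
	return (uint32_t)m->offset + m->len;
}
```

An application could run a check like `ids_fit_match()` before adding a rule, since an ID that does not fit cannot be reported back unambiguously in a match.
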
> +
> +/* Enumerates RegEx request flags. */
> +#define RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F (1 << 0)
> +/**< Set when struct rte_regex_ops::group_id1 is valid */
> +
> +#define RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F (1 << 1)
> +/**< Set when struct rte_regex_ops::group_id2 is valid */
> +
> +#define RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F (1 << 2)
> +/**< Set when struct rte_regex_ops::group_id3 is valid */
> +
> +#define RTE_REGEX_OPS_REQ_STOP_ON_MATCH_F (1 << 4)
> +/**< The RegEx engine will stop scanning and return the first match. */
> +
> +#define RTE_REGEX_OPS_REQ_MATCH_HIGH_PRIORITY_F (1 << 5)
> +/**< In High Priority mode a maximum of one match will be returned per scan to
> + * reduce the post-processing required by the application. The match with the
> + * lowest Rule id, lowest start pointer and lowest match length will be
> + * returned.
> + *
> + * @see struct rte_regex_ops::nb_actual_matches
> + * @see struct rte_regex_ops::nb_matches
> + */
> +
> +
> +/* Enumerates RegEx response flags. */
> +#define RTE_REGEX_OPS_RSP_PMI_SOJ_F (1 << 0)
> +/**< Indicates that the RegEx device has encountered a partial match at the
> + * start of scan in the given buffer.
> + *
> + * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
> + */
> +
> +#define RTE_REGEX_OPS_RSP_PMI_EOJ_F (1 << 1)
> +/**< Indicates that the RegEx device has encountered a partial match at the
> + * end of scan in the given buffer.
> + *
> + * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
> + */
> +
> +#define RTE_REGEX_OPS_RSP_MAX_SCAN_TIMEOUT_F (1 << 2)
> +/**< Indicates that the RegEx device has exceeded the max timeout while
> + * scanning the given buffer.
> + *
> + * @see RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT
> + */
> +
> +#define RTE_REGEX_OPS_RSP_MAX_MATCH_F (1 << 3)
> +/**< Indicates that the RegEx device has exceeded the max matches while
> + * scanning the given buffer.
> + *
> + * @see RTE_REGEX_DEV_ATTR_MAX_MATCHES
> + */
> +
> +#define RTE_REGEX_OPS_RSP_MAX_PREFIX_F (1 << 4)
> +/**< Indicates that the RegEx device has reached the max allowed prefix length
> + * while scanning the given buffer.
> + *
> + * @see RTE_REGEX_DEV_ATTR_MAX_PREFIX
> + */
> +
> +/**
> + * The generic *rte_regex_ops* structure to hold the RegEx attributes
> + * for enqueue and dequeue operation.
> + */
> +struct rte_regex_ops {
> +	/* W0 */
> +	uint16_t req_flags;
> +	/**< Request flags for the RegEx ops.
> +	 * @see RTE_REGEX_OPS_REQ_*
> +	 */
> +	uint16_t scan_size;
> +	/**< Scan size of the buffer to be scanned in bytes. */
> +	uint16_t rsp_flags;
> +	/**< Response flags for the RegEx ops.
> +	 * @see RTE_REGEX_OPS_RSP_*
> +	 */
> +	uint8_t nb_actual_matches;
> +	/**< The total number of actual matches detected by the Regex device.*/
> +	uint8_t nb_matches;
> +	/**< The total number of matches returned by the RegEx device for this
> +	 * scan. The size of *rte_regex_ops::matches* zero length array will be
> +	 * this value.
> +	 *
> +	 * @see struct rte_regex_ops::matches, struct rte_regex_match
> +	 */
> +
> +	/* W1 */
> +	RTE_STD_C11
> +	union {
> +		uint64_t u64;
> +		/**<  Allow 8-byte reserved on 32-bit system */
> +		void *buf_addr;
> +		/**< Virtual address of the pattern to be matched. */
> +	};
> +
> +	/* W2 */
> +	rte_iova_t buf_iova;
> +	/**< IOVA address of the pattern to be matched. */
> +
> +	/* W3 */
> +	uint16_t group_id0;
> +	/**< First group_id to match the rule against. Minimum one group id
> +	 * must be provided by application.
> +	 * When RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F set then group_id1
> +	 * is valid, respectively similar flags for group_id2 and group_id3.
> +	 * Upon the match, struct rte_regex_match::group_id shall be updated
> +	 * with matching group ID by the device. Group ID scheme provides
> +	 * rule isolation and effective pattern matching.
> +	 */
> +	uint16_t group_id1;
> +	/**< Second group_id to match the rule against.
> +	 *
> +	 * @see RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F
> +	 */
> +	uint16_t group_id2;
> +	/**< Third group_id to match the rule against.
> +	 *
> +	 * @see RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F
> +	 */
> +	uint16_t group_id3;
> +	/**< Fourth group_id to match the rule against.
> +	 *
> +	 * @see RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F
> +	 */
> +
> +	/* W4 */
> +	RTE_STD_C11
> +	union {
> +		uint64_t user_id;
> +		/**< Application specific opaque value. An application may use
> +		 * this field to hold application specific value to share
> +		 * between dequeue and enqueue operation.
> +		 * Implementation should not modify this field.
> +		 */
> +		void *user_ptr;
> +		/**< Pointer representation of *user_id* */
> +	};

Since we target the regex subsystem for both regex and DPI, I think it would be good to add another uint64_t field called connection_id.
A device that supports DPI can treat it as another matchable field when looking for matches in the given buffer.

This field is different from user_id, as it is not opaque to the device.

> +
> +	/* W5 */
> +	struct rte_regex_match matches[];
> +	/**< Zero length array to hold the match tuples.
> +	 * The struct rte_regex_ops::nb_matches value holds the number of
> +	 * elements in this array.
> +	 *
> +	 * @see struct rte_regex_ops::nb_matches
> +	 */
> +};
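
Since the op ends in a flexible array of match tuples, the application has to size each allocation for the number of matches it is prepared to receive, and it has to keep the group ID flags consistent with the group_id fields. A minimal sketch follows, using a trimmed copy of the op (words W1, W2 and W4 are omitted) and two hypothetical helpers, `op_alloc()` and `op_set_groups()`. A real application would more likely take ops from a mempool than from calloc().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

struct rte_regex_match {
	uint64_t u64; /* trimmed: only the raw 64-bit view is kept */
};

#define RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F (1 << 0)
#define RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F (1 << 1)
#define RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F (1 << 2)
#define GROUP_VALID_MASK (RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F | \
			  RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F | \
			  RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F)

/* Trimmed copy of the RFC op; buffer address and user data are omitted. */
struct rte_regex_ops {
	uint16_t req_flags;
	uint16_t scan_size;
	uint16_t rsp_flags;
	uint8_t nb_actual_matches;
	uint8_t nb_matches;
	uint16_t group_id0;
	uint16_t group_id1;
	uint16_t group_id2;
	uint16_t group_id3;
	struct rte_regex_match matches[];
};

/* Hypothetical allocator: room for the op plus max_matches result slots. */
static struct rte_regex_ops *
op_alloc(uint8_t max_matches)
{
	return calloc(1, sizeof(struct rte_regex_ops) +
			 max_matches * sizeof(struct rte_regex_match));
}

/* Fill group_id0..3 from 1 to 4 group IDs and set the matching flags. */
static int
op_set_groups(struct rte_regex_ops *op, const uint16_t *groups, unsigned int n)
{
	if (op == NULL || groups == NULL || n < 1 || n > 4)
		return -EINVAL;
	op->req_flags = (uint16_t)(op->req_flags & ~GROUP_VALID_MASK);
	op->group_id0 = groups[0]; /* always valid, no flag needed */
	if (n > 1) {
		op->group_id1 = groups[1];
		op->req_flags |= RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F;
	}
	if (n > 2) {
		op->group_id2 = groups[2];
		op->req_flags |= RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F;
	}
	if (n > 3) {
		op->group_id3 = groups[3];
		op->req_flags |= RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F;
	}
	return 0;
}
```
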
> +
> +/**
> + * Enqueue a burst of scan request on a RegEx device.
> + *
> + * The rte_regex_enqueue_burst() function is invoked to place
> + * regex operations on the queue *qp_id* of the device designated by
> + * its *dev_id*.
> + *
> + * The *nb_ops* parameter is the number of operations to process which are
> + * supplied in the *ops* array of *rte_regex_ops* structures.
> + *
> + * The rte_regex_enqueue_burst() function returns the number of
> + * operations it actually enqueued for processing. A return value equal to
> + * *nb_ops* means that all operations have been enqueued.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param qp_id
> + *   The index of the queue pair on which operations are to be enqueued for
> + *   processing. The value must be in the range [0, nb_queue_pairs - 1]
> + *   previously supplied to rte_regex_dev_configure().
> + * @param ops
> + *   The address of an array of *nb_ops* pointers to *rte_regex_ops* structures
> + *   which contain the regex operations to be processed.
> + * @param nb_ops
> + *   The number of operations to process.
> + *
> + * @return
> + *   The number of operations actually enqueued on the regex device. The return
> + *   value can be less than the value of the *nb_ops* parameter when the
> + *   regex device's queue is full or if invalid parameters are specified in
> + *   a *rte_regex_ops*. If the return value is less than *nb_ops*, the remaining
> + *   ops at the end of *ops* are not consumed and the caller has to take care
> + *   of them.
> + */
> +uint16_t
> +rte_regex_enqueue_burst(uint8_t dev_id, uint16_t qp_id,
> +			struct rte_regex_ops **ops, uint16_t nb_ops);
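
A caller-side sketch of the "remaining ops are not consumed" rule is shown below. Because a short return can mean either a full queue or an invalid op, the retry loop is bounded. `fake_enqueue_burst()` is a hypothetical queue model with a fixed amount of room, and `enqueue_all()` is an illustrative helper rather than part of the RFC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct rte_regex_ops; /* opaque in this sketch */

static uint16_t fake_room = 3; /* free slots in the fake queue */

/* Hypothetical queue model: accepts ops until the room runs out. */
static uint16_t
fake_enqueue_burst(struct rte_regex_ops **ops, uint16_t nb_ops)
{
	uint16_t n = nb_ops < fake_room ? nb_ops : fake_room;

	(void)ops;
	fake_room = (uint16_t)(fake_room - n);
	return n;
}

/*
 * Enqueue as many ops as possible, resubmitting the unconsumed tail at
 * most max_tries times. Returns the number of ops actually enqueued.
 */
static uint16_t
enqueue_all(struct rte_regex_ops **ops, uint16_t nb_ops,
	    unsigned int max_tries)
{
	uint16_t sent = 0;

	assert(ops != NULL || nb_ops == 0);
	while (sent < nb_ops && max_tries-- > 0)
		sent = (uint16_t)(sent + fake_enqueue_burst(ops + sent,
					(uint16_t)(nb_ops - sent)));
	return sent;
}
```

If the count stops growing, the op at index `sent` is the one to inspect, since the return value alone does not distinguish a full queue from a rejected op.
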
> +
> +/**
> + *
> + * Dequeue a burst of scan responses from a queue on the RegEx device.
> + * The dequeued operations are stored in *rte_regex_ops* structures
> + * whose pointers are supplied in the *ops* array.
> + *
> + * The rte_regex_dequeue_burst() function returns the number of ops
> + * actually dequeued, which is the number of *rte_regex_ops* data structures
> + * effectively supplied into the *ops* array.
> + *
> + * A return value equal to *nb_ops* indicates that the queue contained
> + * at least *nb_ops* operations, and this is likely to signify that other
> + * processed operations remain in the device's output queue. Applications
> + * implementing a "retrieve as many processed operations as possible" policy
> + * can check this specific case and keep invoking the
> + * rte_regex_dequeue_burst() function until a value less than
> + * *nb_ops* is returned.
> + *
> + * The rte_regex_dequeue_burst() function does not provide any error
> + * notification to avoid the corresponding overhead.
> + *
> + * @param dev_id
> + *   The RegEx device identifier
> + * @param qp_id
> + *   The index of the queue pair from which to retrieve processed operations.
> + *   The value must be in the range [0, nb_queue_pairs - 1] previously
> + *   supplied to rte_regex_dev_configure().
> + * @param ops
> + *   The address of an array of pointers to *rte_regex_ops* structures that must
> + *   be large enough to store *nb_ops* pointers in it.
> + * @param nb_ops
> + *   The maximum number of operations to dequeue.
> + *
> + * @return
> + *   The number of operations actually dequeued, which is the number
> + *   of pointers to *rte_regex_ops* structures effectively supplied to the
> + *   *ops* array. If the return value is less than *nb_ops*, only that many
> + *   entries at the start of *ops* are valid.
> + */
> +uint16_t
> +rte_regex_dequeue_burst(uint8_t dev_id, uint16_t qp_id,
> +			struct rte_regex_ops **ops, uint16_t nb_ops);
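
The "retrieve as many processed operations as possible" policy described in the comment above amounts to polling until a short burst comes back. A minimal sketch, with `fake_dequeue_burst()` as a hypothetical queue model and `drain_queue()` and `BURST` as illustrative names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct rte_regex_ops; /* opaque in this sketch */

#define BURST 4

static uint16_t fake_done = 10; /* completed ops waiting in the fake queue */

/* Hypothetical queue model: hands back up to nb_ops completed ops. */
static uint16_t
fake_dequeue_burst(struct rte_regex_ops **ops, uint16_t nb_ops)
{
	uint16_t n = nb_ops < fake_done ? nb_ops : fake_done;
	uint16_t i;

	for (i = 0; i < n; i++)
		ops[i] = NULL; /* a real PMD would store op pointers here */
	fake_done = (uint16_t)(fake_done - n);
	return n;
}

/* Poll while full bursts come back. Returns the total number dequeued. */
static unsigned int
drain_queue(void (*handle)(struct rte_regex_ops *op))
{
	struct rte_regex_ops *ops[BURST];
	unsigned int total = 0;
	uint16_t n, i;

	do {
		n = fake_dequeue_burst(ops, BURST);
		for (i = 0; i < n; i++) {
			if (handle != NULL)
				handle(ops[i]);
		}
		total += n;
	} while (n == BURST); /* a short burst means the queue is empty */
	return total;
}
```
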
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_REGEXDEV_H_ */
>
John Bromhead Aug. 21, 2019, 3:12 p.m. UTC | #7
There are probably quite a few other use cases, but I suggest you also add:

Natural Language Processing (NLP)
Sentiment Analysis
Big Data database acceleration (Spark, Hadoop etc.)
Computational Storage

Regards JohnB

John Bromhead
VP of Business Development
Titan IC
San Diego, CA 92130, USA



On Aug 20, 2019, at 10:32 PM, Shahaf Shuler <shahafs@mellanox.com> wrote:

Hi Jerin,

Thursday, August 15, 2019 2:34 PM, Thomas Monjalon:
Subject: Re: [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev subsystem

+Cc more

------------

From: Jerin Jacob <jerinj@marvell.com>

Even though there are some vendors which offer Regex HW offload, due to
lack of standard API, it is difficult for DPDK consumers to use them
in a portable way.

This _RFC_ attempts to standardize the RegEx/DPI offload APIs for DPDK.

The Doxygen generated RFC API documentation is available here:
https://dreamy-noether-22777e.netlify.com/rte__regexdev_8h.html

This RFC is crafted based on SW Regex API frameworks such as libpcre and
hyperscan and a few of the RegEx HW IPs which I am aware of.

RegEx pattern matching applications:
• Next Generation Firewalls (NGFW)
• Deep Packet and Flow Inspection (DPI)
• Intrusion Prevention Systems (IPS)
• DDoS Mitigation
• Network Monitoring
• Data Loss Prevention (DLP)
• Smart NICs
• Grammar based content processing
• URL, spam and adware filtering
• Advanced auditing and policing of user/application security policies
• Financial data mining - parsing of streamed financial feeds

I think two more important use cases to add (at least in the doc of this subsystem) are:
* application recognition
* memory introspection



Request to review from HW and SW RegEx vendors and RegEx application users
to have portable DPDK API for RegEx.

The API semantics are based on the existing cryptodev, eventdev and ethdev
device APIs.

Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
---

RTE RegEx Device API
--------------------

Defines RTE RegEx Device APIs for RegEx operations and its provisioning.

The RegEx Device API is composed of two parts:

- The application-oriented RegEx API that includes functions to setup
a RegEx device (configure it, setup its queue pairs and start it),
update the rule database and so on.

- The driver-oriented RegEx API that exports a function allowing
a RegEx poll Mode Driver (PMD) to simultaneously register itself as
a RegEx device driver.

RegEx device components and definitions:

   +-----------------+
   |                 |
   |                 o---------+    rte_regex_[en|de]queue_burst()
   |   PCRE based    o------+  |               |
   |  RegEx pattern  |      |  |  +--------+   |
   | matching engine o------+--+--o        |   |    +------+
   |                 |      |  |  | queue  |<==o===>|Core 0|
   |                 o----+ |  |  | pair 0 |        |      |
   |                 |    | |  |  +--------+        +------+
   +-----------------+    | |  |
          ^               | |  |  +--------+
          |               | |  |  |        |        +------+
          |               | +--+--o queue  |<======>|Core 1|
      Rule|Database       |    |  | pair 1 |        |      |
   +------+----------+    |    |  +--------+        +------+
   |     Group 0     |    |    |
   | +-------------+ |    |    |  +--------+        +------+
   | | Rules 0..n  | |    |    |  |        |        |Core 2|
   | +-------------+ |    |    +--o queue  |<======>|      |
   |     Group 1     |    |       | pair 2 |        +------+
   | +-------------+ |    |       +--------+
   | | Rules 0..n  | |    |
   | +-------------+ |    |       +--------+
   |     Group 2     |    |       |        |        +------+
   | +-------------+ |    |       | queue  |<======>|Core n|
   | | Rules 0..n  | |    +-------o pair n |        |      |
   | +-------------+ |            +--------+        +------+
   |     Group n     |
   | +-------------+ |<-------rte_regex_rule_db_update()
   | | Rules 0..n  | |<-------rte_regex_rule_db_import()
   | +-------------+ |------->rte_regex_rule_db_export()
   +-----------------+

RegEx: A regular expression is a concise and flexible means for matching
strings of text, such as particular characters, words, or patterns of
characters. A common abbreviation for this is “RegEx”.

RegEx device: A hardware or software-based implementation of RegEx
device API for PCRE based pattern matching syntax and semantics.

PCRE RegEx syntax and semantics specification:
http://regexkit.sourceforge.net/Documentation/pcre/pcrepattern.html

RegEx queue pair: Each RegEx device should have one or more queue pairs to
transmit a burst of pattern matching requests and receive a burst of
pattern matching responses. The pattern matching request/response is
embedded in the *rte_regex_ops* structure.

Rule: A pattern matching rule expressed in PCRE RegEx syntax along with
Match ID and Group ID to identify the rule upon the match.

Rule database: The RegEx device accepts regular expressions and converts them
into a compiled rule database that can then be used to scan data.
Compilation allows the device to analyze the given pattern(s) and
pre-determine how to scan for these patterns in an optimized fashion that
would be far too expensive to compute at run-time. A rule database contains
a set of rules that are compiled in device specific binary form.

Match ID or Rule ID: A unique identifier provided at the time of rule
creation for the application to identify the rule upon match.

Group ID: Rules can be grouped under one group ID to enable
rule isolation and effective pattern matching. A unique group identifier
is provided at the time of rule creation for the application to identify the
rule upon match.

Scan: A pattern matching request through *enqueue* API.

It may be possible that a given RegEx device does not support all the features
of PCRE. The application may probe unsupported features through
struct rte_regex_dev_info::pcre_unsup_flags.

By default, all the functions of the RegEx Device API exported by a PMD
are lock-free functions which are assumed not to be invoked in parallel on
different logical cores to work on the same target object. For instance,
the dequeue function of a PMD cannot be invoked in parallel on two logical
cores to operate on the same RegEx queue pair. Of course, this function
can be invoked in parallel by different logical cores on different queue pairs.
It is the responsibility of the upper level application to enforce this rule.

In all functions of the RegEx API, the RegEx device is
designated by an integer >= 0 named the device identifier *dev_id*.

At the RegEx driver level, RegEx devices are represented by a generic
data structure of type *rte_regex_dev*.

RegEx devices are dynamically registered during the PCI/SoC device probing
phase performed at EAL initialization time.
When a RegEx device is being probed, a *rte_regex_dev* structure and
a new device identifier are allocated for that device. Then, the
regex_dev_init() function supplied by the RegEx driver matching the probed
device is invoked to properly initialize the device.

The role of the device init function consists of resetting the hardware or
software RegEx driver implementations.

If the device init operation is successful, the correspondence between
the device identifier assigned to the new device and its associated
*rte_regex_dev* structure is effectively registered.
Otherwise, both the *rte_regex_dev* structure and the device identifier are
freed.

The functions exported by the application RegEx API to setup a device
designated by its device identifier must be invoked in the following order:
- rte_regex_dev_configure()
- rte_regex_queue_pair_setup()
- rte_regex_dev_start()

Then, the application can invoke, in any order, the functions
exported by the RegEx API to enqueue pattern matching jobs, dequeue
pattern matching responses, get the stats, update the rule database,
get/set device attributes and so on.

If the application wants to change the configuration (i.e. call
rte_regex_dev_configure() or rte_regex_queue_pair_setup()), it must call
rte_regex_dev_stop() first to stop the device and then do the
reconfiguration before calling rte_regex_dev_start() again. The enqueue
and dequeue functions should not be invoked when the device is stopped.
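
The ordering rules above can be modelled as a small state machine. This is a sketch only: every sketch_* name is hypothetical, the error codes are assumptions (the RFC only promises a negative value on failure), and it assumes that queue pairs must be set up again after a reconfiguration, which the text does not state explicitly.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum sketch_state {
	SKETCH_UNCONFIGURED,
	SKETCH_CONFIGURED,	/* after configure */
	SKETCH_QP_READY,	/* after queue pair setup, or after stop */
	SKETCH_STARTED		/* enqueue/dequeue allowed */
};

struct sketch_dev {
	enum sketch_state state;
};

/* Models rte_regex_dev_configure(): only legal while stopped. */
static int
sketch_configure(struct sketch_dev *d)
{
	assert(d != NULL);
	if (d->state == SKETCH_STARTED)
		return -EBUSY;
	d->state = SKETCH_CONFIGURED;
	return 0;
}

/* Models rte_regex_queue_pair_setup(): needs a configured, stopped device. */
static int
sketch_queue_pair_setup(struct sketch_dev *d)
{
	assert(d != NULL);
	if (d->state == SKETCH_UNCONFIGURED)
		return -EINVAL;
	if (d->state == SKETCH_STARTED)
		return -EBUSY;
	d->state = SKETCH_QP_READY;
	return 0;
}

/* Models rte_regex_dev_start(): needs queue pairs to be set up. */
static int
sketch_start(struct sketch_dev *d)
{
	assert(d != NULL);
	if (d->state != SKETCH_QP_READY)
		return -EINVAL;
	d->state = SKETCH_STARTED;
	return 0;
}

/* Models rte_regex_dev_stop(): queue pairs stay set up (assumption). */
static int
sketch_stop(struct sketch_dev *d)
{
	assert(d != NULL);
	if (d->state != SKETCH_STARTED)
		return -EINVAL;
	d->state = SKETCH_QP_READY;
	return 0;
}

/* Enqueue and dequeue are only meaningful on a started device. */
static int
sketch_fast_path_allowed(const struct sketch_dev *d)
{
	assert(d != NULL);
	return d->state == SKETCH_STARTED;
}
```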

Finally, an application can close a RegEx device by invoking the
rte_regex_dev_close() function.

Each function of the application RegEx API invokes a specific function
of the PMD that controls the target device designated by its device
identifier.

For this purpose, all device-specific functions of a RegEx driver are
supplied through a set of pointers contained in a generic structure of type
*regex_dev_ops*.
The address of the *regex_dev_ops* structure is stored in the
*rte_regex_dev* structure by the device init function of the RegEx driver,
which is invoked during the PCI/SoC device probing phase, as explained
earlier.

In other words, each function of the RegEx API simply retrieves the
*rte_regex_dev* structure associated with the device identifier and
performs an indirect invocation of the corresponding driver function
supplied in the *regex_dev_ops* structure of the *rte_regex_dev*
structure.

For performance reasons, the addresses of the fast-path functions of the
RegEx driver are not contained in the *regex_dev_ops* structure.
Instead, they are directly stored at the beginning of the *rte_regex_dev*
structure to avoid an extra indirect memory access during their invocation.
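
The two dispatch paths can be sketched as follows. The layout and all sketch_* and dummy_* names are illustrative assumptions, not the structures this RFC will define: control calls go through an ops table (two pointer loads), while the burst functions sit directly in the device structure (one load).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_regex_dev;

typedef uint16_t (*sketch_burst_fn)(struct sketch_regex_dev *dev,
				    uint16_t qp_id, uint16_t nb_ops);
typedef int (*sketch_ctrl_fn)(struct sketch_regex_dev *dev);

/* Slow-path callbacks, reached through one extra pointer. */
struct sketch_regex_dev_ops {
	sketch_ctrl_fn dev_start;
	sketch_ctrl_fn dev_stop;
};

struct sketch_regex_dev {
	/* Fast-path pointers first, so a burst call needs a single load. */
	sketch_burst_fn enqueue;
	sketch_burst_fn dequeue;
	const struct sketch_regex_dev_ops *dev_ops;
	int started;
};

#define SKETCH_MAX_DEVS 4
static struct sketch_regex_dev sketch_devs[SKETCH_MAX_DEVS];

/* Control path: dev_id -> device -> ops table -> driver function. */
static int
sketch_dev_start(uint8_t dev_id)
{
	struct sketch_regex_dev *dev;

	if (dev_id >= SKETCH_MAX_DEVS)
		return -1;
	dev = &sketch_devs[dev_id];
	if (dev->dev_ops == NULL || dev->dev_ops->dev_start == NULL)
		return -1;
	return dev->dev_ops->dev_start(dev);
}

/* Fast path: dev_id -> device -> driver function, no checks. */
static inline uint16_t
sketch_enqueue_burst(uint8_t dev_id, uint16_t qp_id, uint16_t nb_ops)
{
	struct sketch_regex_dev *dev = &sketch_devs[dev_id];

	return dev->enqueue(dev, qp_id, nb_ops);
}

/* A dummy driver that accepts every op once the device is started. */
static int
dummy_start(struct sketch_regex_dev *dev)
{
	dev->started = 1;
	return 0;
}

static uint16_t
dummy_enqueue(struct sketch_regex_dev *dev, uint16_t qp_id, uint16_t nb_ops)
{
	(void)qp_id;
	return dev->started ? nb_ops : 0;
}

static const struct sketch_regex_dev_ops dummy_ops = {
	.dev_start = dummy_start,
	.dev_stop = NULL,
};

/* Plays the role of the driver's device init function. */
static void
dummy_init(uint8_t dev_id)
{
	assert(dev_id < SKETCH_MAX_DEVS);
	sketch_devs[dev_id].enqueue = dummy_enqueue;
	sketch_devs[dev_id].dequeue = NULL;
	sketch_devs[dev_id].dev_ops = &dummy_ops;
	sketch_devs[dev_id].started = 0;
}
```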

RTE RegEx device drivers do not use interrupts for enqueue or dequeue
operation. Instead, RegEx drivers export Poll-Mode enqueue and dequeue
functions to applications.

The *enqueue* operation submits a burst of RegEx pattern matching
requests to the RegEx device and the *dequeue* operation gets a burst of
pattern matching responses for the ones submitted through the *enqueue*
operation.

A typical application uses the RegEx device API in the following
programming flow:

- rte_regex_dev_configure()
- rte_regex_queue_pair_setup()
- rte_regex_rule_db_update(): needs to be invoked if a precompiled rule
  database is not provided in rte_regex_dev_config::rule_db for
  rte_regex_dev_configure() and/or the rule database needs to be updated.
- Create or reuse an existing mempool for *rte_regex_ops* objects.
- rte_regex_dev_start()
- rte_regex_enqueue_burst()
- rte_regex_dequeue_burst()

---

config/common_base                 |    5 +
doc/api/doxy-api-index.md          |    1 +
doc/api/doxy-api.conf.in           |    1 +
lib/Makefile                       |    2 +
lib/librte_regexdev/Makefile       |   23 +
lib/librte_regexdev/rte_regexdev.c |    5 +
lib/librte_regexdev/rte_regexdev.h | 1247 ++++++++++++++++++++++++++++
7 files changed, 1284 insertions(+)
create mode 100644 lib/librte_regexdev/Makefile
create mode 100644 lib/librte_regexdev/rte_regexdev.c
create mode 100644 lib/librte_regexdev/rte_regexdev.h

diff --git a/config/common_base b/config/common_base
index e406e7836..986093d6e 100644
--- a/config/common_base
+++ b/config/common_base
@@ -746,6 +746,11 @@ CONFIG_RTE_LIBRTE_PMD_DPAA2_QDMA_RAWDEV=n
#
CONFIG_RTE_LIBRTE_PMD_IFPGA_RAWDEV=y

+#
+# Compile regex device support
+#
+CONFIG_RTE_LIBRTE_REGEXDEV=y
+
#
# Compile librte_ring
#
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index 715248dd1..a0bc27ae4 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -26,6 +26,7 @@ The public API headers are grouped by topics:
[event_timer_adapter]    (@ref rte_event_timer_adapter.h),
[event_crypto_adapter]   (@ref rte_event_crypto_adapter.h),
[rawdev]             (@ref rte_rawdev.h),
+  [regexdev]           (@ref rte_regexdev.h),
[metrics]            (@ref rte_metrics.h),
[bitrate]            (@ref rte_bitrate.h),
[latency]            (@ref rte_latencystats.h),
diff --git a/doc/api/doxy-api.conf.in b/doc/api/doxy-api.conf.in
index b9896cb63..7adb821bb 100644
--- a/doc/api/doxy-api.conf.in
+++ b/doc/api/doxy-api.conf.in
@@ -53,6 +53,7 @@ INPUT                   = @TOPDIR@/doc/api/doxy-api-index.md \
@TOPDIR@/lib/librte_rawdev \
@TOPDIR@/lib/librte_rcu \
@TOPDIR@/lib/librte_reorder \
+                          @TOPDIR@/lib/librte_regexdev \
@TOPDIR@/lib/librte_ring \
@TOPDIR@/lib/librte_sched \
@TOPDIR@/lib/librte_security \
diff --git a/lib/Makefile b/lib/Makefile
index 791e0d991..57de9691a 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -44,6 +44,8 @@ DEPDIRS-librte_eventdev := librte_eal librte_ring librte_ethdev librte_hash \
librte_mempool librte_timer librte_cryptodev
DIRS-$(CONFIG_RTE_LIBRTE_RAWDEV) += librte_rawdev
DEPDIRS-librte_rawdev := librte_eal librte_ethdev
+DIRS-$(CONFIG_RTE_LIBRTE_REGEXDEV) += librte_regexdev
+DEPDIRS-librte_regexdev := librte_eal
DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += librte_vhost
DEPDIRS-librte_vhost := librte_eal librte_mempool librte_mbuf librte_ethdev \
           librte_net
diff --git a/lib/librte_regexdev/Makefile b/lib/librte_regexdev/Makefile
new file mode 100644
index 000000000..723b4b28c
--- /dev/null
+++ b/lib/librte_regexdev/Makefile
@@ -0,0 +1,23 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(C) 2019 Marvell International Ltd.
+#
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# library name
+LIB = librte_regexdev.a
+
+# library version
+LIBABIVER := 1
+
+# build flags
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+
+# library source files
+SRCS-y += rte_regexdev.c
+
+# export include files
+SYMLINK-y-include += rte_regexdev.h
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_regexdev/rte_regexdev.c b/lib/librte_regexdev/rte_regexdev.c
new file mode 100644
index 000000000..e5be0f29c
--- /dev/null
+++ b/lib/librte_regexdev/rte_regexdev.c
@@ -0,0 +1,5 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(C) 2019 Marvell International Ltd.
+ */
+
+#include <rte_regexdev.h>
diff --git a/lib/librte_regexdev/rte_regexdev.h b/lib/librte_regexdev/rte_regexdev.h
new file mode 100644
index 000000000..765da4aaa
--- /dev/null
+++ b/lib/librte_regexdev/rte_regexdev.h
@@ -0,0 +1,1247 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(C) 2019 Marvell International Ltd.
+ */
+
+#ifndef _RTE_REGEXDEV_H_
+#define _RTE_REGEXDEV_H_
+
+/**
+ * @file
+ *
+ * RTE RegEx Device API
+ *
+ * Defines RTE RegEx Device APIs for RegEx operations and its provisioning.
+ *
+ * The RegEx Device API is composed of two parts:
+ *
+ * - The application-oriented RegEx API that includes functions to setup
+ *   a RegEx device (configure it, setup its queue pairs and start it),
+ *   update the rule database and so on.
+ *
+ * - The driver-oriented RegEx API that exports a function allowing
+ *   a RegEx poll Mode Driver (PMD) to simultaneously register itself as
+ *   a RegEx device driver.
+ *
+ * RegEx device components and definitions:
+ *
+ *     +-----------------+
+ *     |                 |
+ *     |                 o---------+    rte_regex_[en|de]queue_burst()
+ *     |   PCRE based    o------+  |               |
+ *     |  RegEx pattern  |      |  |  +--------+   |
+ *     | matching engine o------+--+--o        |   |    +------+
+ *     |                 |      |  |  | queue  |<==o===>|Core 0|
+ *     |                 o----+ |  |  | pair 0 |        |      |
+ *     |                 |    | |  |  +--------+        +------+
+ *     +-----------------+    | |  |
+ *            ^               | |  |  +--------+
+ *            |               | |  |  |        |        +------+
+ *            |               | +--+--o queue  |<======>|Core 1|
+ *        Rule|Database       |    |  | pair 1 |        |      |
+ *     +------+----------+    |    |  +--------+        +------+
+ *     |     Group 0     |    |    |
+ *     | +-------------+ |    |    |  +--------+        +------+
+ *     | | Rules 0..n  | |    |    |  |        |        |Core 2|
+ *     | +-------------+ |    |    +--o queue  |<======>|      |
+ *     |     Group 1     |    |       | pair 2 |        +------+
+ *     | +-------------+ |    |       +--------+
+ *     | | Rules 0..n  | |    |
+ *     | +-------------+ |    |       +--------+
+ *     |     Group 2     |    |       |        |        +------+
+ *     | +-------------+ |    |       | queue  |<======>|Core n|
+ *     | | Rules 0..n  | |    +-------o pair n |        |      |
+ *     | +-------------+ |            +--------+        +------+
+ *     |     Group n     |
+ *     | +-------------+ |<-------rte_regex_rule_db_update()
+ *     | | Rules 0..n  | |<-------rte_regex_rule_db_import()
+ *     | +-------------+ |------->rte_regex_rule_db_export()
+ *     +-----------------+
+ *
+ * RegEx: A regular expression is a concise and flexible means for matching
+ * strings of text, such as particular characters, words, or patterns of
+ * characters. A common abbreviation for this is “RegEx”.
+ *
+ * RegEx device: A hardware or software-based implementation of RegEx
+ * device API for PCRE based pattern matching syntax and semantics.
+ *
+ * PCRE RegEx syntax and semantics specification:
+ *
+ * http://regexkit.sourceforge.net/Documentation/pcre/pcrepattern.html
+ *
+ * RegEx queue pair: Each RegEx device should have one or more queue pairs to
+ * transmit a burst of pattern matching requests and receive a burst of
+ * pattern matching responses. The pattern matching request/response is
+ * embedded in the *rte_regex_ops* structure.
+ *
+ * Rule: A pattern matching rule expressed in PCRE RegEx syntax along with
+ * Match ID and Group ID to identify the rule upon the match.
+ *
+ * Rule database: The RegEx device accepts regular expressions and converts them
+ * into a compiled rule database that can then be used to scan data.
+ * Compilation allows the device to analyze the given pattern(s) and
+ * pre-determine how to scan for these patterns in an optimized fashion that
+ * would be far too expensive to compute at run-time. A rule database contains
+ * a set of rules that are compiled in device-specific binary form.
+ *
+ * Match ID or Rule ID: A unique identifier provided at the time of rule
+ * creation for the application to identify the rule upon match.
+ *
+ * Group ID: Rules can be grouped under one group ID to enable rule isolation
+ * and effective pattern matching. The group ID is a unique group identifier
+ * provided at the time of rule creation for the application to identify the
+ * rule upon match.
+ *
+ * Scan: A pattern matching request through *enqueue* API.
+ *
+ * A given RegEx device may not support all the features of PCRE.
+ * The application may probe unsupported features through
+ * struct rte_regex_dev_info::pcre_unsup_flags.
+ *
+ * By default, all the functions of the RegEx Device API exported by a PMD
+ * are lock-free functions which are assumed not to be invoked in parallel on
+ * different logical cores to work on the same target object. For instance,
+ * the dequeue function of a PMD cannot be invoked in parallel on two logical
+ * cores to operate on the same RegEx queue pair. Of course, this function can
+ * be invoked in parallel by different logical cores on different queue pairs.
+ * It is the responsibility of the upper level application to enforce this rule.
+ *
+ * In all functions of the RegEx API, the RegEx device is
+ * designated by an integer >= 0 named the device identifier *dev_id*.
+ *
+ * At the RegEx driver level, RegEx devices are represented by a generic
+ * data structure of type *rte_regex_dev*.
+ *
+ * RegEx devices are dynamically registered during the PCI/SoC device probing
+ * phase performed at EAL initialization time.
+ * When a RegEx device is being probed, a *rte_regex_dev* structure and
+ * a new device identifier are allocated for that device. Then, the
+ * regex_dev_init() function supplied by the RegEx driver matching the probed
+ * device is invoked to properly initialize the device.
+ *
+ * The role of the device init function consists of resetting the hardware or
+ * software RegEx driver implementations.
+ *
+ * If the device init operation is successful, the correspondence between
+ * the device identifier assigned to the new device and its associated
+ * *rte_regex_dev* structure is effectively registered.
+ * Otherwise, both the *rte_regex_dev* structure and the device identifier are
+ * freed.
+ *
+ * The functions exported by the application RegEx API to setup a device
+ * designated by its device identifier must be invoked in the following order:
+ *     - rte_regex_dev_configure()
+ *     - rte_regex_queue_pair_setup()
+ *     - rte_regex_dev_start()
+ *
+ * Then, the application can invoke, in any order, the functions
+ * exported by the RegEx API to enqueue pattern matching jobs, dequeue pattern
+ * matching responses, get the stats, update the rule database,
+ * get/set device attributes and so on.
+ *
+ * If the application wants to change the configuration (i.e. call
+ * rte_regex_dev_configure() or rte_regex_queue_pair_setup()), it must call
+ * rte_regex_dev_stop() first to stop the device and then do the reconfiguration
+ * before calling rte_regex_dev_start() again. The enqueue and dequeue
+ * functions should not be invoked when the device is stopped.
+ *
+ * Finally, an application can close a RegEx device by invoking the
+ * rte_regex_dev_close() function.
+ *
+ * Each function of the application RegEx API invokes a specific function
+ * of the PMD that controls the target device designated by its device
+ * identifier.
+ *
+ * For this purpose, all device-specific functions of a RegEx driver are
+ * supplied through a set of pointers contained in a generic structure of type
+ * *regex_dev_ops*.
+ * The address of the *regex_dev_ops* structure is stored in the *rte_regex_dev*
+ * structure by the device init function of the RegEx driver, which is
+ * invoked during the PCI/SoC device probing phase, as explained earlier.
+ *
+ * In other words, each function of the RegEx API simply retrieves the
+ * *rte_regex_dev* structure associated with the device identifier and
+ * performs an indirect invocation of the corresponding driver function
+ * supplied in the *regex_dev_ops* structure of the *rte_regex_dev* structure.
+ *
+ * For performance reasons, the addresses of the fast-path functions of the
+ * RegEx driver are not contained in the *regex_dev_ops* structure.
+ * Instead, they are directly stored at the beginning of the *rte_regex_dev*
+ * structure to avoid an extra indirect memory access during their invocation.
+ *
+ * RTE RegEx device drivers do not use interrupts for enqueue or dequeue
+ * operation. Instead, RegEx drivers export Poll-Mode enqueue and dequeue
+ * functions to applications.
+ *
+ * The *enqueue* operation submits a burst of RegEx pattern matching requests
+ * to the RegEx device and the *dequeue* operation gets a burst of pattern
+ * matching responses for the ones submitted through the *enqueue* operation.
+ *
+ * A typical application uses the RegEx device API in the following
+ * programming flow:
+ *
+ * - rte_regex_dev_configure()
+ * - rte_regex_queue_pair_setup()
+ * - rte_regex_rule_db_update(): needs to be invoked if a precompiled rule
+ *   database is not provided in rte_regex_dev_config::rule_db for
+ *   rte_regex_dev_configure() and/or the rule database needs to be updated.
+ * - Create or reuse an existing mempool for *rte_regex_ops* objects.
+ * - rte_regex_dev_start()
+ * - rte_regex_enqueue_burst()
+ * - rte_regex_dequeue_burst()
+ *
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <rte_common.h>
+#include <rte_config.h>
+#include <rte_dev.h>
+#include <rte_errno.h>
+#include <rte_memory.h>
+
+/**
+ * Get the total number of RegEx devices that have been successfully
+ * initialised.
+ *
+ * @return
+ *   The total number of usable RegEx devices.
+ */
+uint8_t
+rte_regex_dev_count(void);
+
+/**
+ * Get the device identifier for the named RegEx device.
+ *
+ * @param name
+ *   RegEx device name to select the RegEx device identifier.
+ *
+ * @return
+ *   Returns RegEx device identifier on success.
+ *   - <0: Failure to find named RegEx device.
+ */
+int
+rte_regex_dev_get_dev_id(const char *name);
+
+/* Enumerates RegEx device capabilities */
+#define RTE_REGEX_DEV_CAPA_RUNTIME_COMPILATION_F (1ULL << 0)
+/**< RegEx device does support compiling the rules at runtime unlike
+ * loading only the pre-built rule database using
+ * struct rte_regex_dev_config::rule_db in rte_regex_dev_configure()
+ * @see struct rte_regex_dev_config::rule_db, rte_regex_dev_configure()
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+
+/* Enumerates unsupported PCRE features for the RegEx device */
+#define RTE_REGEX_DEV_PCRE_UNSUP_START_ANCHOR_F (1ULL << 0)
+/**< RegEx device doesn't support PCRE Anchor to start of match flag.
+ * Example RegEx is '/\Gfoo\d/'. Here '\G' asserts position at the end of the
+ * previous match or the start of the string for the first match.
+ * This position will change each time the RegEx is applied to the subject
+ * string. If the RegEx is applied to 'foo1foo2Zfoo3' the first two matches will
+ * be successful for 'foo1foo2' and fail for 'Zfoo3'.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_ATOMIC_GROUPING_F (1ULL << 1)
+/**< RegEx device doesn't support PCRE Atomic grouping.
+ * Atomic groups are represented by '(?>)'. An atomic group is a group that,
+ * when the RegEx engine exits from it, automatically throws away all
+ * backtracking positions remembered by any tokens inside the group.
+ * Example RegEx is 'a(?>bc|b)c' if the given patterns are 'abc' and 'abcc' then
+ * 'a(bc|b)c' matches both whereas 'a(?>bc|b)c' matches only 'abcc' because
+ * atomic groups don't allow backtracking back to 'b'.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_BACKTRACKING_CTRL_F (1ULL << 2)
+/**< RegEx device doesn't support PCRE backtracking control verbs.
+ * Some examples of backtracking verbs are (*COMMIT), (*ACCEPT), (*FAIL),
+ * (*SKIP), (*PRUNE).
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_CALLOUTS_F (1ULL << 3)
+/**< RegEx device doesn't support PCRE callouts.
+ * PCRE supports calling an external function in between matches by using
+ * '(?C)'. Example RegEx is 'ABC(?C)D': if the given pattern is 'ABCD' then the
+ * RegEx engine will parse 'ABC', perform a user-defined callout and return a
+ * successful match at 'D'.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_BACKREFERENCE_F (1ULL << 4)
+/**< RegEx device doesn't support PCRE backreference.
+ * Example RegEx is '(\2ABC|(GHI))+' \2 matches the same text as most recently
+ * matched by the 2nd capturing group i.e. 'GHI'.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_GREEDY_F (1ULL << 5)
+/**< RegEx device doesn't support PCRE Greedy mode.
+ * For example if the RegEx is 'AB\d*?' then '*?' represents zero or unlimited
+ * matches. In greedy mode the pattern 'AB12345' will be matched completely
+ * whereas in the ungreedy mode 'AB' will be returned as the match.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_LOOKAROUND_ASRT_F (1ULL << 6)
+/**< RegEx device doesn't support PCRE Lookaround assertions
+ * (Zero-width assertions). Example RegEx is '[a-z]+\d+(?=!{3,})' if
+ * the given pattern is 'dwad1234!' the RegEx engine doesn't report any matches
+ * because the assert '(?=!{3,})' fails. The pattern 'dwad123!!!' would return a
+ * successful match.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_MATCH_POINT_RST_F (1ULL << 7)
+/**< RegEx device doesn't support PCRE match point reset directive.
+ * Example RegEx is '[a-z]+\K\d+' if the pattern is 'dwad123'
+ * then even though the entire pattern matches only '123'
+ * is reported as a match.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_NEWLINE_CONVENTIONS_F (1ULL << 8)
+/**< RegEx device doesn't support PCRE newline convention.
+ * Newline conventions are represented as follows:
+ * (*CR)        carriage return
+ * (*LF)        linefeed
+ * (*CRLF)      carriage return, followed by linefeed
+ * (*ANYCRLF)   any of the three above
+ * (*ANY)       all Unicode newline sequences
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_NEWLINE_SEQ_F (1ULL << 9)
+/**< RegEx device doesn't support PCRE newline sequence.
+ * The escape sequence '\R' will match any newline sequence.
+ * It is equivalent to: '(?>\r\n|\n|\x0b|\f|\r|\x85)'.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_POSSESSIVE_QUALIFIERS_F (1ULL << 10)
+/**< RegEx device doesn't support PCRE possessive quantifiers.
+ * Example RegEx possessive quantifiers '*+', '++', '?+', '{m,n}+'.
+ * Possessive quantifier repeats the token as many times as possible and it does
+ * not give up matches as the engine backtracks. With a possessive quantifier,
+ * the deal is all or nothing.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_SUBROUTINE_REFERENCES_F (1ULL << 11)
+/**< RegEx device doesn't support PCRE Subroutine references.
+ * PCRE Subroutine references allow for sub patterns to be assessed
+ * as part of the RegEx. Example RegEx is '(foo|fuzz)\g<1>+bar' matches the
+ * pattern 'foofoofuzzfoofuzzbar'.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_UTF_8_F (1ULL << 12)
+/**< RegEx device doesn't support UTF-8 character encoding.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_UTF_16_F (1ULL << 13)
+/**< RegEx device doesn't support UTF-16 character encoding.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_UTF_32_F (1ULL << 14)
+/**< RegEx device doesn't support UTF-32 character encoding.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_WORD_BOUNDARY_F (1ULL << 15)
+/**< RegEx device doesn't support word boundaries.
+ * The meta character '\b' represents word boundary anchor.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_FORWARD_REFERENCES_F (1ULL << 16)
+/**< RegEx device doesn't support Forward references.
+ * Forward references allow you to use a back reference to a group that appears
+ * later in the RegEx. Example RegEx is '(\3ABC|(DEF|(GHI)))+' matches the
+ * following string 'GHIGHIABCDEF'.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+/* Enumerates PCRE rule flags */
+#define RTE_REGEX_PCRE_RULE_ALLOW_EMPTY_F (1ULL << 0)
+/**< When this flag is set, patterns that can match against an empty string,
+ * such as '.*', are allowed.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_ANCHORED_F (1ULL << 1)
+/**< When this flag is set, the pattern is forced to be "anchored", that is, it
+ * is constrained to match only at the first matching point in the string that
+ * is being searched. Similar to '^' and represented by \A.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_CASELESS_F (1ULL << 2)
+/**< When this flag is set, letters in the pattern match both upper and lower
+ * case letters in the subject.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_DOTALL_F (1ULL << 3)
+/**< When this flag is set, a dot metacharacter in the pattern matches any
+ * character, including one that indicates a newline.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_DUPNAMES_F (1ULL << 4)
+/**< When this flag is set, names used to identify capture groups need not be
+ * unique.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_EXTENDED_F (1ULL << 5)
+/**< When this flag is set, most white space characters in the pattern are
+ * totally ignored except when escaped or inside a character class.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_MATCH_UNSET_BACKREF_F (1ULL << 6)
+/**< When this flag is set, a backreference to an unset capture group matches an
+ * empty string.
+ * @see RTE_REGEX_DEV_PCRE_UNSUP_FORWARD_REFERENCES_F
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_MULTILINE_F (1ULL << 7)
+/**< When this flag is set, the '^' and '$' constructs match immediately
+ * following or immediately before internal newlines in the subject string,
+ * respectively, as well as at the very start and end.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_NO_AUTO_CAPTURE_F (1ULL << 8)
+/**< When this flag is set, it disables the use of numbered capturing
+ * parentheses in the pattern. References to capture groups (backreferences or
+ * recursion/subroutine calls) may only refer to named groups, though the
+ * reference can be by name or by number.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_UCP_F (1ULL << 9)
+/**< By default, only ASCII characters are recognized. When this flag is set,
+ * Unicode properties are used instead to classify characters.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_UNGREEDY_F (1ULL << 10)
+/**< When this flag is set, the "greediness" of the quantifiers is inverted
+ * so that they are not greedy by default, but become greedy if followed by
+ * '?'.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_UTF_F (1ULL << 11)
+/**< When this flag is set, RegEx engine has to regard both the pattern and the
+ * subject strings that are subsequently processed as strings of UTF characters
+ * instead of single-code-unit strings.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_NEVER_BACKSLASH_C_F (1ULL << 12)
+/**< This flag locks out the use of '\C' in the pattern that is being compiled.
+ * This escape matches one data unit, even in UTF mode which can cause
+ * unpredictable behavior in UTF-8 or UTF-16 modes, because it may leave the
+ * current matching point in the middle of a multi-code-unit character.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+
+/**
+ * RegEx device information
+ */
+struct rte_regex_dev_info {
+    const char *driver_name; /**< RegEx driver name */
+    struct rte_device *dev;    /**< Device information */
+    uint8_t max_matches;
+    /**< Maximum matches per scan supported by this device */
+    uint16_t max_queue_pairs;
+    /**< Maximum queue pairs supported by this device */
+    uint16_t max_payload_size;
+    /**< Maximum payload size for a pattern match request or scan.
+     * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
+     */
+    uint16_t max_rules_per_group;
+    /**< Maximum rules supported per group by this device */
+    uint16_t max_groups;
+    /**< Maximum group supported by this device */
+    uint32_t regex_dev_capa;
+    /**< RegEx device capabilities. @see RTE_REGEX_DEV_CAPA_* */
+    uint64_t rule_flags;
+    /**< Supported compiler rule flags.
+     * @see RTE_REGEX_PCRE_RULE_*, struct rte_regex_rule::rule_flags
+     */
+    uint64_t pcre_unsup_flags;
+    /**< Unsupported PCRE features for this RegEx device.
+     * @see RTE_REGEX_DEV_PCRE_UNSUP_*
+     */
+};
+
+/**
+ * Retrieve the contextual information of a RegEx device.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ *
+ * @param[out] dev_info
+ *   A pointer to a structure of type *rte_regex_dev_info* to be filled with the
+ *   contextual information of the device.
+ *
+ * @return
+ *   - 0: Success, driver updates the contextual information of the RegEx device
+ *   - <0: Error code returned by the driver info get function.
+ *
+ */
+int
+rte_regex_dev_info_get(uint8_t dev_id, struct rte_regex_dev_info *dev_info);
+
+/* Enumerates RegEx device configuration flags */
+#define RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F (1ULL << 0)
+/**< Cross buffer scan refers to the ability to be able to detect
+ * matches that occur across buffer boundaries, where the buffers are related
+ * to each other in some way. Enable this flag to scan a payload size
+ * greater than struct rte_regex_dev_info::max_payload_size and/or when
+ * matches can be present across scan buffer boundaries.
+ *
+ * @see struct rte_regex_dev_info::max_payload_size
+ * @see struct rte_regex_dev_config::dev_cfg_flags, rte_regex_dev_configure()
+ * @see RTE_REGEX_OPS_RSP_PMI_SOJ_F
+ * @see RTE_REGEX_OPS_RSP_PMI_EOJ_F
+ */
+
+/** RegEx device configuration structure */
+struct rte_regex_dev_config {
+    uint8_t nb_max_matches;
+    /**< Maximum matches per scan configured on this device.
+     * This value cannot exceed the *max_matches*
+     * which was previously provided by rte_regex_dev_info_get().
+     * The value 0 is allowed, in which case the value 1 is used.
+     * @see struct rte_regex_dev_info::max_matches
+     */
+    uint16_t nb_queue_pairs;
+    /**< Number of RegEx queue pairs to configure on this device.
+     * This value cannot exceed the *max_queue_pairs* which was previously
+     * provided by rte_regex_dev_info_get().
+     * @see struct rte_regex_dev_info::max_queue_pairs
+     */
+    uint16_t nb_rules_per_group;
+    /**< Number of rules per group to configure on this device.
+     * This value cannot exceed the *max_rules_per_group*
+     * which was previously provided by rte_regex_dev_info_get().
+     * The value 0 is allowed, in which case
+     * struct rte_regex_dev_info::max_rules_per_group is used.
+     * @see struct rte_regex_dev_info::max_rules_per_group
+     */
+    uint16_t nb_groups;
+    /**< Number of groups to configure on this device.
+     * This value cannot exceed the *max_groups*
+     * which was previously provided by rte_regex_dev_info_get().
+     * @see struct rte_regex_dev_info::max_groups
+     */
+    const char *rule_db;
+    /**< Import initial set of prebuilt rule database on this device.
+     * The value NULL is allowed, in which case the device will not
+     * be configured with a prebuilt rule database. The application may use
+     * the rte_regex_rule_db_update() or rte_regex_rule_db_import() API
+     * to update or import the rule database after
+     * rte_regex_dev_configure().
+     * @see rte_regex_rule_db_update(), rte_regex_rule_db_import()
+     */
+    uint32_t rule_db_len;
+    /**< Length of *rule_db* buffer. */
+    uint32_t dev_cfg_flags;
+    /**< RegEx device configuration flags. See RTE_REGEX_DEV_CFG_* */
+};
+
+/**
+ * Configure a RegEx device.
+ *
+ * This function must be invoked before any other function in the
+ * API. It can also be re-invoked when a device is in the
+ * stopped state.
+ *
+ * The caller may use rte_regex_dev_info_get() to get the capabilities and
+ * resources available for this regex device.
+ *
+ * @param dev_id
+ *   The identifier of the device to configure.
+ * @param cfg
+ *   The RegEx device configuration structure.
+ *
+ * @return
+ *   - 0: Success, device configured.
+ *   - <0: Error code returned by the driver configuration function.
+ */
+int
+rte_regex_dev_configure(uint8_t dev_id,
+            const struct rte_regex_dev_config *cfg);
+
+/* Enumerates RegEx queue pair configuration flags */
+#define RTE_REGEX_QUEUE_PAIR_CFG_OOS_F (1ULL << 0)
+/**< Out of order scan. If not set, a scan must retire after previously issued
+ * in-order scans to this queue pair. If set, this scan can be retired as soon
+ * as the device returns completion. The application should not set the out of
+ * order scan flag if it needs to maintain the ingress order of scan requests.
+ *
+ * @see struct rte_regex_qp_conf::qp_conf_flags, rte_regex_queue_pair_setup()
+ */
+
+struct rte_regex_ops;
+typedef void (*regexdev_stop_flush_t)(uint8_t dev_id, uint16_t qp_id,
+                      struct rte_regex_ops *op);
+/**< Callback function called during rte_regex_dev_stop(), invoked once per
+ * flushed RegEx op.
+ */
+
+/** RegEx queue pair configuration structure */
+struct rte_regex_qp_conf {
+    uint32_t qp_conf_flags;
+    /**< Queue pair config flags. See RTE_REGEX_QUEUE_PAIR_CFG_* */
+    uint16_t nb_desc;
+    /**< The number of descriptors to allocate for this queue pair. */
+    regexdev_stop_flush_t cb;
+    /**< Callback function called during rte_regex_dev_stop(), invoked
+     * once per flushed regex op. Value NULL is allowed, in which case
+     * callback will not be invoked. This function can be used to properly
+     * dispose of outstanding regex ops from response queue,
+     * for example ops containing memory pointers.
+     * @see rte_regex_dev_stop()
+     */
+};
+
+/**
+ * Allocate and set up a RegEx queue pair for a RegEx device.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param queue_pair_id
+ *   The index of the RegEx queue pair to set up. The value must be in the
+ *   range [0, nb_queue_pairs - 1] previously supplied to
+ *   rte_regex_dev_configure().
+ * @param qp_conf
+ *   The pointer to the configuration data to be used for the RegEx queue
+ *   pair. NULL is allowed, in which case the default configuration is used.
+ *
+ * @return
+ *   - 0: Success, RegEx queue pair correctly set up.
+ *   - <0: RegEx queue configuration failed
+ */
+int
+rte_regex_queue_pair_setup(uint8_t dev_id, uint8_t queue_pair_id,
+               const struct rte_regex_qp_conf *qp_conf);
+
+/**
+ * Start a RegEx device.
+ *
+ * The device start step is the last one and consists of setting the RegEx
+ * queues to start accepting the pattern matching scan requests.
+ *
+ * On success, all basic functions exported by the API (RegEx enqueue,
+ * RegEx dequeue and so on) can be invoked.
+ *
+ * @param dev_id
+ *   RegEx device identifier
+ * @return
+ *   - 0: Success, device started.
+ *   - <0: Device start failed.
+ */
+int
+rte_regex_dev_start(uint8_t dev_id);
+
+/**
+ * Stop a RegEx device.
+ *
+ * Stop a RegEx device. The device can be restarted with a call to
+ * rte_regex_dev_start().
+ *
+ * This function causes all queued response regex ops to be drained from the
+ * response queue. While draining ops out of the device,
+ * struct rte_regex_qp_conf::cb will be invoked for each op.
+ *
+ * @param dev_id
+ *   RegEx device identifier.
+ *
+ * @see struct rte_regex_qp_conf::cb, rte_regex_queue_pair_setup()
+ */
+void
+rte_regex_dev_stop(uint8_t dev_id);
+
+/**
+ * Close a RegEx device. The device cannot be restarted!
+ *
+ * @param dev_id
+ *   RegEx device identifier
+ *
+ * @return
+ *  - 0 on successfully closed the device.
+ *  - <0 on failure to close the device.
+ */
+int
+rte_regex_dev_close(uint8_t dev_id);
+
+/* Device get/set attributes */
+
+/** Enumerates RegEx device attribute identifier */
+enum rte_regex_dev_attr_id {
+    RTE_REGEX_DEV_ATTR_SOCKET_ID,
+    /**< The NUMA socket id to which the device is connected or
+     * a default of zero if the socket could not be determined.
+     * datatype: *int*
+     * operation: *get*
+     */
+    RTE_REGEX_DEV_ATTR_MAX_MATCHES,
+    /**< Maximum number of matches per scan.
+     * datatype: *uint8_t*
+     * operation: *get* and *set*
+     *
+     * @see RTE_REGEX_OPS_RSP_MAX_MATCH_F
+     */
+    RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT,
+    /**< Upper bound scan time in ns.
+     * datatype: *uint16_t*
+     * operation: *get* and *set*
+     *
+     * @see RTE_REGEX_OPS_RSP_MAX_SCAN_TIMEOUT_F
+     */
+    RTE_REGEX_DEV_ATTR_MAX_PREFIX,
+    /**< Maximum number of prefix detected per scan.
+     * This would be useful for denial of service detection.
+     * datatype: *uint16_t*
+     * operation: *get* and *set*
+     *
+     * @see RTE_REGEX_OPS_RSP_MAX_PREFIX_F
+     */
+};
+
+/**
+ * Get an attribute from a RegEx device.
+ *
+ * @param dev_id RegEx device identifier
+ * @param attr_id The attribute ID to retrieve
+ * @param[out] attr_value A pointer that will be filled in with the attribute
+ *             value if successful.
+ *
+ * @return
+ *   - 0: Successfully retrieved attribute value.
+ *   - -EINVAL: Invalid device or  *attr_id* provided, or *attr_value* is NULL.
+ *   - -ENOTSUP: if the device doesn't support specific *attr_id*.
+ */
+int
+rte_regex_dev_attr_get(uint8_t dev_id, enum rte_regex_dev_attr_id attr_id,
+               void *attr_value);
+
+/**
+ * Set an attribute on a RegEx device.
+ *
+ * @param dev_id RegEx device identifier
+ * @param attr_id The attribute ID to set
+ * @param attr_value A pointer to the attribute value supplied
+ *                   by the application
+ *
+ * @return
+ *   - 0: Successfully applied the attribute value.
+ *   - -EINVAL: Invalid device or  *attr_id* provided, or *attr_value* is NULL.
+ *   - -ENOTSUP: if the device doesn't support specific *attr_id*.
+ */
+int
+rte_regex_dev_attr_set(uint8_t dev_id, enum rte_regex_dev_attr_id attr_id,
+               const void *attr_value);
+
+/* Rule related APIs */
+/** Enumerates RegEx rule operation */
+enum rte_regex_rule_op {
+    RTE_REGEX_RULE_OP_ADD,
+    /**< Add RegEx rule to rule database */
+    RTE_REGEX_RULE_OP_REMOVE
+    /**< Remove RegEx rule from rule database */
+};
+
+/** Structure to hold a RegEx rule attributes */
+struct rte_regex_rule {
+    enum rte_regex_rule_op op;
+    /**< Operation type of the rule, either RTE_REGEX_RULE_OP_ADD or
+     * RTE_REGEX_RULE_OP_REMOVE.
+     */
+    uint16_t group_id;
+    /**< Group identifier to which the rule belongs to. */
+    uint32_t rule_id;
+    /**< Rule identifier which is returned on successful match. */
+    const char *pcre_rule;
+    /**< Buffer to hold the PCRE rule. */
+    uint16_t pcre_rule_len;
+    /**< Length of the PCRE rule*/
+    uint64_t rule_flags;
+    /**< PCRE rule flags. The PCRE rule flags supported by the device are
+     * enumerated in struct rte_regex_dev_info::rule_flags. For a successful
+     * rule database update, the application must provide only supported
+     * rule flags.
+     * @see RTE_REGEX_PCRE_RULE_*, struct rte_regex_dev_info::rule_flags
+     */
+};
+
+/**
+ * Update the rule database of a RegEx device.
+ *
+ * @param dev_id RegEx device identifier
+ * @param rules
+ *   Points to an array of *nb_rules* objects of type *rte_regex_rule*
+ *   structure which contain the regex rule attributes to be updated in the
+ *   rule database.
+ * @param nb_rules
+ *   The number of PCRE rules to update the rule database.
+ *
+ * @return
+ *   The number of regex rules actually updated on the regex device's rule
+ *   database. The return value can be less than the value of the *nb_rules*
+ *   parameter when the regex device fails to update the rule database or
+ *   if invalid parameters are specified in a *rte_regex_rule*.
+ *   If the return value is less than *nb_rules*, the remaining PCRE rules
+ *   at the end of *rules* are not consumed and the caller has to take
+ *   care of them and rte_errno is set accordingly.
+ *   Possible errno values include:
+ *   - -EINVAL:  Invalid device ID or rules is NULL
+ *   - -ENOTSUP: The last processed rule is not supported on this device.
+ *   - -ENOSPC: No space available in rule database.
+ *
+ * @see rte_regex_rule_db_import(), rte_regex_rule_db_export()
+ */
+uint16_t
+rte_regex_rule_db_update(uint8_t dev_id, const struct rte_regex_rule *rules,
+             uint16_t nb_rules);

I think the function name is not very informative. If this function is meant to compile the rules, that should be explicit in the function name.

+
+/**
+ * Import a prebuilt rule database from a buffer to a RegEx device.
+ *
+ * @param dev_id RegEx device identifier
+ * @param rule_db
+ *   Points to prebuilt rule database.
+ * @param rule_db_len
+ *   Length of the rule database.
+ *
+ * @return
+ *   - 0: Successfully updated the prebuilt rule database.
+ *   - -EINVAL:  Invalid device ID or rule_db is NULL
+ *   - -ENOTSUP: Rule database import is not supported on this device.
+ *   - -ENOSPC: No space available in rule database.
+ *
+ * @see rte_regex_rule_db_update(), rte_regex_rule_db_export()
+ */
+int
+rte_regex_rule_db_import(uint8_t dev_id, const char *rule_db,
+             uint32_t rule_db_len);
+
+/**
+ * Export the prebuilt rule database from a RegEx device to the buffer.
+ *
+ * @param dev_id RegEx device identifier
+ * @param[out] rule_db
+ *   Block of memory to insert the rule database. Must be at least size in
+ *   capacity. If set to NULL, function returns required capacity.
+ *
+ * @return
+ *   - 0: Successfully exported the prebuilt rule database.
+ *   - size: If rule_db set to NULL then required capacity for *rule_db*
+ *   - -EINVAL:  Invalid device ID
+ *   - -ENOTSUP: Rule database export is not supported on this device.
+ *
+ * @see rte_regex_rule_db_update(), rte_regex_rule_db_import()
+ */
+int
+rte_regex_rule_db_export(uint8_t dev_id, char *rule_db);
+
+/* Extended statistics */
+/** Maximum name length for extended statistics counters */
+#define RTE_REGEX_DEV_XSTATS_NAME_SIZE 64
+
+/**
+ * A name-key lookup element for extended statistics.
+ *
+ * This structure is used to map between names and ID numbers
+ * for extended RegEx device statistics.
+ */
+struct rte_regex_dev_xstats_map {
+    uint16_t id;
+    /**< xstat identifier */
+    char name[RTE_REGEX_DEV_XSTATS_NAME_SIZE];
+    /**< xstat name */
+};
+
+/**
+ * Retrieve names of extended statistics of a regex device.
+ *
+ * @param dev_id
+ *   The identifier of the regex device.
+ * @param[out] xstats_map
+ *   Block of memory to insert id and names into. Must be at least size in
+ *   capacity. If set to NULL, function returns required capacity.
+ * @return
+ *   - positive value on success:
+ *        -The return value is the number of entries filled in the stats map.
+ *        -If xstats_map set to NULL then required capacity for xstats_map.
+ *   - negative value on error:
+ *      -ENODEV for invalid *dev_id*
+ *      -ENOTSUP if the device doesn't support this function.
+ */
+int
+rte_regex_dev_xstats_names_get(uint8_t dev_id,
+                   struct rte_regex_dev_xstats_map *xstats_map);
+
+/**
+ * Retrieve extended statistics of a regex device.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param ids
+ *   The id numbers of the stats to get. The ids can be got from the stat
+ *   position in the stat list from rte_regex_dev_xstats_names_get(), or
+ *   by using rte_regex_dev_xstats_by_name_get().
+ * @param[out] values
+ *   The values for each stats request by ID.
+ * @param n
+ *   The number of stats requested
+ * @return
+ *   - positive value: number of stat entries filled into the values array
+ *   - negative value on error:
+ *      -ENODEV for invalid *dev_id*
+ *      -ENOTSUP if the device doesn't support this function.
+ */
+int
+rte_regex_dev_xstats_get(uint8_t dev_id, const uint16_t ids[],
+             uint64_t values[], uint16_t n);
+
+/**
+ * Retrieve the value of a single stat by requesting it by name.
+ *
+ * @param dev_id
+ *   The identifier of the device
+ * @param name
+ *   The stat name to retrieve
+ * @param[out] id
+ *   If non-NULL, the numerical id of the stat will be returned, so that further
+ *   requests for the stat can be made using rte_regex_dev_xstats_get(), which
+ *   will be faster as it doesn't need to scan a list of names for the stat.
+ * @param[out] value
+ *   Must be non-NULL, retrieved xstat value will be stored in this address.
+ *
+ * @return
+ *   - 0: Successfully retrieved xstat value.
+ *   - -EINVAL: invalid parameters
+ *   - -ENOTSUP: if not supported.
+ */
+int
+rte_regex_dev_xstats_by_name_get(uint8_t dev_id, const char *name,
+                 uint16_t *id, uint64_t *value);
+
+/**
+ * Reset the values of the xstats of the selected component in the device.
+ *
+ * @param dev_id
+ *   The identifier of the device
+ * @param ids
+ *   Selects specific statistics to be reset. When NULL, all statistics will be
+ *   reset. If non-NULL, must point to array of at least *nb_ids* size.
+ * @param nb_ids
+ *   The number of ids available from the *ids* array. Ignored when ids is
+ *   NULL.
+ * @return
+ *   - 0: Successfully reset the statistics to zero.
+ *   - -EINVAL: invalid parameters
+ *   - -ENOTSUP: if not supported.
+ */
+int
+rte_regex_dev_xstats_reset(uint8_t dev_id, const uint16_t ids[],
+               uint16_t nb_ids);
+
+/**
+ * Trigger the RegEx device self test.
+ *
+ * @param dev_id
+ *   The identifier of the device
+ * @return
+ *   - 0: Selftest successful
+ *   - -ENOTSUP if the device doesn't support selftest
+ *   - other values < 0 on failure.
+ */
+int rte_regex_dev_selftest(uint8_t dev_id);
+
+/**
+ * Dump internal information about *dev_id* to the FILE* provided in *f*.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ *
+ * @param f
+ *   A pointer to a file for output
+ *
+ * @return
+ *   - 0: on success
+ *   - <0: on failure.
+ */
+int
+rte_regex_dev_dump(uint8_t dev_id, FILE *f);
+
+/* Fast path APIs */
+
+/**
+ * The generic *rte_regex_match* structure to hold the RegEx match
+ * attributes.
+ * @see struct rte_regex_ops::matches
+ */
+struct rte_regex_match {
+    RTE_STD_C11
+    union {
+        uint64_t u64;
+        struct {
+            uint32_t rule_id:20;
+            /**< Rule identifier to which the pattern matched.
+             * @see struct rte_regex_rule::rule_id
+             */
+            uint32_t group_id:12;
+            /**< Group identifier of the rule which the pattern
+             * matched. @see struct rte_regex_rule::group_id
+             */
+            uint16_t offset;
+            /**< Starting Byte Position for matched rule. */
+            uint16_t len;
+            /**< Length of match in bytes */
+        };
+    };
+};
+
+/* Enumerates RegEx request flags. */
+#define RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F (1 << 0)
+/**< Set when struct rte_regex_ops::group_id1 is valid */
+
+#define RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F (1 << 1)
+/**< Set when struct rte_regex_ops::group_id2 is valid */
+
+#define RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F (1 << 2)
+/**< Set when struct rte_regex_ops::group_id3 is valid */
+
+#define RTE_REGEX_OPS_REQ_STOP_ON_MATCH_F (1 << 4)
+/**< The RegEx engine will stop scanning and return the first match. */
+
+#define RTE_REGEX_OPS_REQ_MATCH_HIGH_PRIORITY_F (1 << 5)
+/**< In High Priority mode a maximum of one match will be returned per scan
+ * to reduce the post-processing required by the application. The match with
+ * the lowest rule ID, lowest start pointer and lowest match length will be
+ * returned.
+ *
+ * @see struct rte_regex_ops::nb_actual_matches
+ * @see struct rte_regex_ops::nb_matches
+ */
+
+
+/* Enumerates RegEx response flags. */
+#define RTE_REGEX_OPS_RSP_PMI_SOJ_F (1 << 0)
+/**< Indicates that the RegEx device has encountered a partial match at the
+ * start of scan in the given buffer.
+ *
+ * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
+ */
+
+#define RTE_REGEX_OPS_RSP_PMI_EOJ_F (1 << 1)
+/**< Indicates that the RegEx device has encountered a partial match at the
+ * end of scan in the given buffer.
+ *
+ * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
+ */
+
+#define RTE_REGEX_OPS_RSP_MAX_SCAN_TIMEOUT_F (1 << 2)
+/**< Indicates that the RegEx device has exceeded the max timeout while
+ * scanning the given buffer.
+ *
+ * @see RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT
+ */
+
+#define RTE_REGEX_OPS_RSP_MAX_MATCH_F (1 << 3)
+/**< Indicates that the RegEx device has exceeded the max matches while
+ * scanning the given buffer.
+ *
+ * @see RTE_REGEX_DEV_ATTR_MAX_MATCHES
+ */
+
+#define RTE_REGEX_OPS_RSP_MAX_PREFIX_F (1 << 4)
+/**< Indicates that the RegEx device has reached the max allowed prefix
+ * length while scanning the given buffer.
+ *
+ * @see RTE_REGEX_DEV_ATTR_MAX_PREFIX
+ */
+
+/**
+ * The generic *rte_regex_ops* structure to hold the RegEx attributes
+ * for enqueue and dequeue operation.
+ */
+struct rte_regex_ops {
+    /* W0 */
+    uint16_t req_flags;
+    /**< Request flags for the RegEx ops.
+     * @see RTE_REGEX_OPS_REQ_*
+     */
+    uint16_t scan_size;
+    /**< Scan size of the buffer to be scanned in bytes. */
+    uint16_t rsp_flags;
+    /**< Response flags for the RegEx ops.
+     * @see RTE_REGEX_OPS_RSP_*
+     */
+    uint8_t nb_actual_matches;
+    /**< The total number of actual matches detected by the RegEx device. */
+    uint8_t nb_matches;
+    /**< The total number of matches returned by the RegEx device for this
+     * scan. The size of the *rte_regex_ops::matches* zero length array will
+     * be this value.
+     *
+     * @see struct rte_regex_ops::matches, struct rte_regex_match
+     */
+
+    /* W1 */
+    RTE_STD_C11
+    union {
+        uint64_t u64;
+        /**<  Allow 8-byte reserved on 32-bit system */
+        void *buf_addr;
+        /**< Virtual address of the pattern to be matched. */
+    };
+
+    /* W2 */
+    rte_iova_t buf_iova;
+    /**< IOVA address of the pattern to be matched. */
+
+    /* W3 */
+    uint16_t group_id0;
+    /**< First group_id to match the rule against. At least one group ID
+     * must be provided by the application.
+     * When RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F is set, group_id1
+     * is valid; similar flags exist for group_id2 and group_id3.
+     * Upon a match, struct rte_regex_match::group_id shall be updated
+     * with the matching group ID by the device. The group ID scheme provides
+     * rule isolation and effective pattern matching.
+     */
+    uint16_t group_id1;
+    /**< Second group_id to match the rule against.
+     *
+     * @see RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F
+     */
+    uint16_t group_id2;
+    /**< Third group_id to match the rule against.
+     *
+     * @see RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F
+     */
+    uint16_t group_id3;
+    /**< Fourth group_id to match the rule against.
+     *
+     * @see RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F
+     */
+
+    /* W4 */
+    RTE_STD_C11
+    union {
+        uint64_t user_id;
+        /**< Application specific opaque value. An application may use
+         * this field to hold an application specific value to share
+         * between dequeue and enqueue operations.
+         * The implementation should not modify this field.
+         */
+        void *user_ptr;
+        /**< Pointer representation of *user_id* */
+    };

Since we target the regex subsystem at both regex and DPI, I think it would be good to add another uint64_t field called connection_id.
Devices that support DPI can treat it as another matchable field when looking for matches in the given buffer.

This field is different from user_id, as it is not opaque to the device.
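
To make the proposal concrete, here is a rough sketch of where such a field could sit in the op. Everything in it is a stand-in rather than the RFC's actual definition: the type names (regex_ops_sketch, regex_match_sketch, iova_sketch_t) are hypothetical, and placing connection_id in its own 64-bit word right after user_id is only one possible choice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t iova_sketch_t;            /* stand-in for rte_iova_t */

struct regex_match_sketch {                /* stand-in for rte_regex_match */
	uint64_t u64;
};

/* Hypothetical copy of the RFC op layout with the proposed field added. */
struct regex_ops_sketch {
	/* W0 */
	uint16_t req_flags;
	uint16_t scan_size;
	uint16_t rsp_flags;
	uint8_t nb_actual_matches;
	uint8_t nb_matches;
	/* W1 */
	union {
		uint64_t u64;
		void *buf_addr;
	};
	/* W2 */
	iova_sketch_t buf_iova;
	/* W3 */
	uint16_t group_id0;
	uint16_t group_id1;
	uint16_t group_id2;
	uint16_t group_id3;
	/* W4: opaque to the device */
	union {
		uint64_t user_id;
		void *user_ptr;
	};
	/* W5: proposed field, visible to the device (assumption) */
	uint64_t connection_id;
	/* W6 */
	struct regex_match_sketch matches[];
};

/* The new word pushes the match array from byte 40 to byte 48. */
static_assert(offsetof(struct regex_ops_sketch, connection_id) == 40,
	      "connection_id should occupy W5");
static_assert(offsetof(struct regex_ops_sketch, matches) == 48,
	      "matches should start at W6");
```

The cost of the extra field in this layout is one more 64-bit word per op ahead of the match array.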

+
+    /* W5 */
+    struct rte_regex_match matches[];
+    /**< Zero length array to hold the match tuples.
+     * The struct rte_regex_ops::nb_matches value holds the number of
+     * elements in this array.
+     *
+     * @see struct rte_regex_ops::nb_matches
+     */
+};
+
+/**
+ * Enqueue a burst of scan request on a RegEx device.
+ *
+ * The rte_regex_enqueue_burst() function is invoked to place
+ * regex operations on the queue *qp_id* of the device designated by
+ * its *dev_id*.
+ *
+ * The *nb_ops* parameter is the number of operations to process, which are
+ * supplied in the *ops* array of *rte_regex_ops* structures.
+ *
+ * The rte_regex_enqueue_burst() function returns the number of
+ * operations it actually enqueued for processing. A return value equal to
+ * *nb_ops* means that all operations have been enqueued.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param qp_id
+ *   The index of the queue pair on which operations are to be enqueued for
+ *   processing. The value must be in the range [0, nb_queue_pairs - 1]
+ *   previously supplied to rte_regex_dev_configure().
+ * @param ops
+ *   The address of an array of *nb_ops* pointers to *rte_regex_ops*
+ *   structures which contain the regex operations to be processed.
+ * @param nb_ops
+ *   The number of operations to process.
+ *
+ * @return
+ *   The number of operations actually enqueued on the regex device. The
+ *   return value can be less than the value of the *nb_ops* parameter when
+ *   the regex device's queue is full or if invalid parameters are specified
+ *   in a *rte_regex_ops*. If the return value is less than *nb_ops*, the
+ *   remaining ops at the end of *ops* are not consumed and the caller has
+ *   to take care of them.
+ */
+uint16_t
+rte_regex_enqueue_burst(uint8_t dev_id, uint16_t qp_id,
+            struct rte_regex_ops **ops, uint16_t nb_ops);
+
+/**
+ *
+ * Dequeue a burst of scan response from a queue on the RegEx device.
+ * The dequeued operations are stored in *rte_regex_ops* structures
+ * whose pointers are supplied in the *ops* array.
+ *
+ * The rte_regex_dequeue_burst() function returns the number of ops
+ * actually dequeued, which is the number of *rte_regex_ops* data structures
+ * effectively supplied into the *ops* array.
+ *
+ * A return value equal to *nb_ops* indicates that the queue contained
+ * at least *nb_ops* operations, and this is likely to signify that other
+ * processed operations remain in the device's output queue. Applications
+ * implementing a "retrieve as many processed operations as possible" policy
+ * can check this specific case and keep invoking the
+ * rte_regex_dequeue_burst() function until a value less than
+ * *nb_ops* is returned.
+ *
+ * The rte_regex_dequeue_burst() function does not provide any error
+ * notification to avoid the corresponding overhead.
+ *
+ * @param dev_id
+ *   The RegEx device identifier
+ * @param qp_id
+ *   The index of the queue pair from which to retrieve processed operations.
+ *   The value must be in the range [0, nb_queue_pairs - 1] previously
+ *   supplied to rte_regex_dev_configure().
+ * @param ops
+ *   The address of an array of pointers to *rte_regex_ops* structures that
+ *   must be large enough to store *nb_ops* pointers in it.
+ * @param nb_ops
+ *   The maximum number of operations to dequeue.
+ *
+ * @return
+ *   The number of operations actually dequeued, which is the number
+ *   of pointers to *rte_regex_ops* structures effectively supplied to the
+ *   *ops* array. If the return value is less than *nb_ops*, the remaining
+ *   entries at the end of *ops* are left untouched.
+ */
+uint16_t
+rte_regex_dequeue_burst(uint8_t dev_id, uint16_t qp_id,
+            struct rte_regex_ops **ops, uint16_t nb_ops);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_REGEXDEV_H_ */
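
The "retrieve as many processed operations as possible" policy described for rte_regex_dequeue_burst() could look roughly like the sketch below. It does not call the RFC API: dequeue_burst_fn only mirrors the proposed prototype, and regex_op_sketch, drain_queue_pair(), fake_dequeue() and fake_handle() are hypothetical names introduced so the loop can be exercised without a device.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct regex_op_sketch;        /* opaque stand-in for struct rte_regex_ops */

/* Same shape as the proposed rte_regex_dequeue_burst() prototype. */
typedef uint16_t (*dequeue_burst_fn)(uint8_t dev_id, uint16_t qp_id,
				     struct regex_op_sketch **ops,
				     uint16_t nb_ops);

#define DRAIN_BURST 32

/* Keep dequeuing until a burst comes back short, i.e. the queue is empty. */
static uint32_t
drain_queue_pair(dequeue_burst_fn dequeue, uint8_t dev_id, uint16_t qp_id,
		 void (*handle)(struct regex_op_sketch *op))
{
	struct regex_op_sketch *ops[DRAIN_BURST];
	uint32_t total = 0;
	uint16_t n, i;

	assert(dequeue != NULL && handle != NULL);
	do {
		n = dequeue(dev_id, qp_id, ops, DRAIN_BURST);
		for (i = 0; i < n; i++)
			handle(ops[i]);
		total += n;
	} while (n == DRAIN_BURST); /* a full burst hints that more remain */
	return total;
}

/* Fake device for illustration: pretends 70 responses are pending. */
static uint16_t fake_pending = 70;
static uint32_t fake_handled;

static uint16_t
fake_dequeue(uint8_t dev_id, uint16_t qp_id,
	     struct regex_op_sketch **ops, uint16_t nb_ops)
{
	uint16_t n = nb_ops < fake_pending ? nb_ops : fake_pending;
	uint16_t i;

	(void)dev_id;
	(void)qp_id;
	for (i = 0; i < n; i++)
		ops[i] = NULL;  /* a real driver would return op pointers */
	fake_pending -= n;
	return n;
}

static void
fake_handle(struct regex_op_sketch *op)
{
	(void)op;
	fake_handled++;
}
```

Calling drain_queue_pair(fake_dequeue, 0, 0, fake_handle) pulls bursts of 32, 32 and 6 ops and then stops, since the last burst is shorter than DRAIN_BURST.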
Jerin Jacob Kollanukkaran Sept. 10, 2019, 8:05 a.m. UTC | #8
Hi Xiang,

Sorry for the delay in responding (I was busy with the 19.11 proposal deadline). Please see inline.
 
> 
> Reply to Xiang's queries in main thread:
> 
> Hi all,
> 
> Some questions regarding APIs. Could you please give more insights?
> 
> 1) rte_regex_ops
>       a) rsp_flags
>       These two flags RTE_REGEX_OPS_RSP_PMI_SOJ_F and
> RTE_REGEX_OPS_RSP_PMI_EOJ_F are used for cross buffer scan.
>       RTE_REGEX_OPS_RSP_PMI_EOJ_F tells whether we have a partial match
> at the end of current buffer after scan.
>       What's the purpose of having RTE_REGEX_OPS_RSP_PMI_SOJ_F?
> 
> [Jerin] Since we need three states to represent partial match buffer,
> RTE_REGEX_OPS_RSP_PMI_SOJ_F to
> represent start of the buffer, intermediate buffers with no flag, and end of
> the buffer with RTE_REGEX_OPS_RSP_PMI_EOJ

> [Xiang] How could a user leverage these flags for matching? Suppose a large
> buffer is divided into multiple chunks. Will RTE_REGEX_OPS_RSP_PMI_SOJ_F
> cause an early quit once it isn't set after scan the first chunk. Similarly,
> RTE_REGEX_OPS_RSP_PMI_EOJ tells a user whether to stop matching future
> buffers after finish the last chunk?

Let me describe with an example,

Assume,
1) struct rte_regex_dev_info::max_payload_size is set to 1024
2) rte_regex_dev_config::dev_cfg_flags is configured with RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
3) The device is programmed with the pattern "hello\s+world"
4) The user enqueues an op with struct rte_regex_ops::buf_addr pointing to the following "data" and struct rte_regex_ops::scan_size = 1024

data[0..1021] = data that does not contain the hello world pattern
data[1022] = 'h'
data[1023] = 'e'

5) The user enqueues an op with struct rte_regex_ops::buf_addr pointing to the following "data" and struct rte_regex_ops::scan_size = 9

data[0] = 'l'
data[1] = 'l'
data[2] = 'o'
data[3] = ' '
data[4] = 'w'
data[5] = 'o'
data[6] = 'r'
data[7] = 'l'
data[8] = 'd'

If so,

The response to 4) will have RTE_REGEX_OPS_RSP_PMI_SOJ_F set in rte_regex_ops::rsp_flags on dequeue,
with rte_regex_match::offset = 1022 and len = 2.

The response to 5) will have RTE_REGEX_OPS_RSP_PMI_EOJ_F set in rte_regex_ops::rsp_flags on dequeue,
with rte_regex_match::offset = 0 and len = 9.
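
On the application side, the two responses could be stitched together roughly as follows. This is only a sketch under assumptions: the flag values are copied from the RFC header, but struct xbuf_state and xbuf_feed() are hypothetical helpers, and it is assumed that an intermediate buffer (no flag set) reports the number of bytes it contributes in len.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Flag values as defined in the RFC header. */
#define RSP_PMI_SOJ_F (1 << 0)
#define RSP_PMI_EOJ_F (1 << 1)

/* Hypothetical per-flow state kept by the application between buffers. */
struct xbuf_state {
	bool open;      /* a partial match is in progress */
	uint64_t start; /* start of the match, as an offset in the stream */
	uint32_t len;   /* bytes matched so far */
};

/*
 * Feed the partial-match information of one dequeued response.
 * buf_base is the stream offset of the first byte of the scanned buffer.
 * Returns true once the cross-buffer match is complete, at which point
 * st->start and st->len describe the whole match.
 */
static bool
xbuf_feed(struct xbuf_state *st, uint64_t buf_base, uint16_t rsp_flags,
	  uint16_t offset, uint16_t len)
{
	assert(st != NULL);
	if (rsp_flags & RSP_PMI_SOJ_F) {
		/* Match begins in this buffer and runs off its end. */
		st->open = true;
		st->start = buf_base + offset;
		st->len = len;
		return false;
	}
	if (!st->open)
		return false;   /* nothing pending for this flow */
	st->len += len;
	if (rsp_flags & RSP_PMI_EOJ_F) {
		st->open = false;
		return true;    /* match ended in this buffer */
	}
	return false;           /* intermediate buffer, keep going */
}
```

With the example above, feeding (buf_base 0, SOJ, offset 1022, len 2) and then (buf_base 1024, EOJ, offset 0, len 9) yields a complete match starting at stream offset 1022 with a total length of 11 bytes.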


> 
>       RTE_REGEX_OPS_RSP_MAX_PREFIX_F: This looks like a definition for a
> specific hardware implementation. I am wondering what this PREFIX refers
> to:)?
> 
> [Jerin] Yes. Looks like it is for hardware specific implementation. Introduced
> rte_regex_dev_attr_set/get functions to make it portable and
> To add new implementation specific fields.
> For example, if a rule is
> /ABCDEF.*XYZ/, ABCD is considered the prefix, and EF.*XYZ is considered the
> factor. The prefix is a literal
> string, while the factor can contain complex regular expression constructs. As
> a result, rule matching occurs in
> two stages: prefix matching and factor matching.
> 
>       b)  user_id or user_ptr
>       Under what kind of circumstances should an application pass value into
> these variables for enqueue and dequeuer operations?
> 
> [Jerin] Just like rte_crypto_ops, struct rte_regex_ops also allocated using
> mempool normally, on enqueue, user can specify user_id
> If needed to in order identify the op on dequeue if required. The use case
> could be to store the sequence number from application
> POV or storing the mbuf ptr in which pattern is requested etc.
> 
> 
>  2) rte_regex_match
>       a) offset; /**< Starting Byte Position for matched rule. */ and  uint16_t
> len; /**< Length of match in bytes */
>       Looks like the matching offset is defined as *starting matching offset*
> instead of *end matching offset*, e.g. report the offset of "a" instead of "c"
> for pattern "abc".
>       If so, this makes it hard to integrate software regex libraries such as
> Hyperscan and RE2 as they only report *end matching offset* without length
> of match.
>       Although Hyperscan has API for *starting matching offset*, it only delivers
> partial syntax support. So I think we have to define *end of matching offset*
> for software solutions.
> 
> [Jerin] I understand the hyperscan's HS_FLAG_SOM_LEFTMOST tradeoffs. I
> thought application would need always the length of the match.
> Probably we will see how other HW implementation (from Mellanox) etc. We
> will try to abstract it, probably we can make it as function of "user
> requested".
> [Xiang] Yes, it will be good to make it per user request. At least from
> Hyperscan user's point of view, start of match and match length are not
> mandatory.

OK. I think we can introduce RTE_REGEX_DEV_CFG_MATCH_AS_START
in device configure.

Since offset + len == end, we can introduce the following generic inline function.

static inline uint32_t
rte_regex_match_end(const struct rte_regex_match *match)
{
	/* Widen before adding so the two 16-bit fields cannot wrap. */
	return (uint32_t)match->offset + match->len;
}

Example: the pattern to match is "hello\s+world" and the data is as follows:
data[4] = 'h'
data[5] = 'e'
data[6] = 'l'
data[7] = 'l'
data[8] = 'o'
data[9] = ' '
data[10] = 'w'
data[11] = 'o'
data[12] = 'r'
data[13] = 'l'
data[14] = 'd'

If the device is configured with RTE_REGEX_DEV_CFG_MATCH_AS_START:
match->offset returns 4
match->len returns 11

If the device is NOT configured with RTE_REGEX_DEV_CFG_MATCH_AS_START,
the driver MAY return the following (in the hyperscan case):
match->offset returns 0
match->len returns 11 + 4

In both cases (irrespective of the flag, to make the application's life easy) rte_regex_match_end() would return 15.
If the application asks for MATCH_AS_START, the driver can return match->offset 4 and match->len 11
(i.e. set HS_FLAG_SOM_LEFTMOST in the hyperscan driver), but the application should still use rte_regex_match_end()
to find the end of the match, so that it works in all cases.
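
A small self-contained check of the arithmetic above, as a sketch only: regex_match_sketch and regex_match_end_sketch() are stand-ins for the RFC's struct rte_regex_match and the proposed helper, and the 32-bit return type is an assumption chosen so that two 16-bit fields cannot overflow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct rte_regex_match; only the fields the helper reads. */
struct regex_match_sketch {
	uint16_t offset; /* starting byte position reported by the driver */
	uint16_t len;    /* match length reported by the driver */
};

/* End of match = offset + len, widened to 32 bits before adding. */
static inline uint32_t
regex_match_end_sketch(const struct regex_match_sketch *match)
{
	assert(match != NULL);
	return (uint32_t)match->offset + match->len;
}

/* Driver configured with MATCH_AS_START: true start and length. */
static const struct regex_match_sketch with_som = { .offset = 4, .len = 11 };

/* Driver without start-of-match support: offset 0, len up to the end. */
static const struct regex_match_sketch without_som = { .offset = 0, .len = 15 };
```

Both with_som and without_som give an end of 15 through the helper, which is why the application can rely on the end position regardless of how the device was configured.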

Is it OK? 

> 
> 3)  rte_regex_rule_db_update()
>     Does this mean we can dynamically add or delete rules for an already
> generated database without recompile from scratch for hardware Regex
> implementation?
>     If so, this isn't possible for software solutions as they don't support
> dynamic database update and require recompile.
> 
> [Jerin] rte_regex_rule_db_update() internally it would call recompile
> function for both HW and SW.
> See rte_regex_dev_config::rule_db in rte_regex_dev_configure() for
> precompiled rule database case.
> [Xiang] OK, sounds like we have to save the original rule-set for the device in
> order to do recompile. I see both ADD and REMOVE operators from
> rte_regex_rule.
> For rules with REMOVE operator, what's the expected behavior to handle
> them for the old rule-set? Do we need to go through the old rule-set and
> remove corresponding rules before doing recompile?

Yes.
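
For a software driver, that step could look roughly like the sketch below: the saved rule-set is edited according to each ADD or REMOVE entry, and the recompile (not shown) then runs over the result. All names here (rule_sketch, rule_set, rule_set_apply()) are hypothetical. Identifying a rule by the (group_id, rule_id) pair, replacing on a duplicate ADD, and stopping at the first entry that cannot be applied are assumptions, the last one loosely following the return contract documented for rte_regex_rule_db_update().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum rule_op_sketch { RULE_OP_ADD, RULE_OP_REMOVE };

struct rule_sketch {            /* stand-in for struct rte_regex_rule */
	enum rule_op_sketch op;
	uint16_t group_id;
	uint32_t rule_id;
	const char *pcre_rule;
};

#define RULE_SET_MAX 64

struct rule_set {               /* rule-set saved by the driver */
	struct rule_sketch rules[RULE_SET_MAX];
	size_t nb_rules;
};

/*
 * Apply updates to the saved rule-set in order.
 * Returns the number of updates consumed; a short count means the entry
 * at that index could not be applied (rule missing or set full).
 */
static size_t
rule_set_apply(struct rule_set *set, const struct rule_sketch *updates,
	       size_t nb_updates)
{
	size_t i, j;

	assert(set != NULL && updates != NULL);
	for (i = 0; i < nb_updates; i++) {
		const struct rule_sketch *u = &updates[i];

		/* Look for an existing rule with the same identity. */
		for (j = 0; j < set->nb_rules; j++)
			if (set->rules[j].group_id == u->group_id &&
			    set->rules[j].rule_id == u->rule_id)
				break;

		if (u->op == RULE_OP_REMOVE) {
			if (j == set->nb_rules)
				break;          /* no such rule */
			/* Swap the last rule into the freed slot. */
			set->rules[j] = set->rules[--set->nb_rules];
		} else if (j < set->nb_rules) {
			set->rules[j] = *u;     /* replace existing rule */
		} else if (set->nb_rules < RULE_SET_MAX) {
			set->rules[set->nb_rules++] = *u;
		} else {
			break;                  /* no space left */
		}
	}
	return i;
}
```

After rule_set_apply() returns, the driver would recompile the database from set->rules[0 .. nb_rules - 1].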
Jerin Jacob Kollanukkaran Sept. 10, 2019, 10:31 a.m. UTC | #9
> Hi Jerin,

> 
> Thursday, August 15, 2019 2:34 PM, Thomas Monjalon:
> > Subject: Re: [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev
> > subsystem
> >
> > +Cc more
> >
> > ------------
> >
> > From: Jerin Jacob <jerinj@marvell.com>
> >
> > Even though there are some vendors which offer Regex HW offload, due to
> > lack of standard API, It is diffcult for DPDK consumer to use them
> > in a portable way.
> >
> > This _RFC_ attempts to standardize the RegEx/DPI offload APIs for DPDK.
> >
> > The Doxygen generated RFC API documentation available here:
> > https://dreamy-noether-22777e.netlify.com/rte__regexdev_8h.html
> >
> > This RFC is crafted based on SW RegEx API frameworks such as libpcre and
> > hyperscan and a few of the RegEx HW IPs which I am aware of.
> >
> > RegEx pattern matching applications:
> > • Next Generation Firewalls (NGFW)
> > • Deep Packet and Flow Inspection (DPI)
> > • Intrusion Prevention Systems (IPS)
> > • DDoS Mitigation
> > • Network Monitoring
> > • Data Loss Prevention (DLP)
> > • Smart NICs
> > • Grammar based content processing
> > • URL, spam and adware filtering
> > • Advanced auditing and policing of user/application security policies
> > • Financial data mining - parsing of streamed financial feeds
> 
> I think two more important use cases to add (at least in the doc of this
> subsystem) are:
> * application recognition
> * memory introspection
> 
> 
> >
> > Request to review from HW and SW RegEx vendors and RegEx application
> > users
> > to have portable DPDK API for RegEx.
> >
> > The API semantics are based on the existing cryptodev, eventdev and ethdev
> > device APIs.
> >
> > Signed-off-by: Jerin Jacob <jerinj@marvell.com>
> > Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
> > ---
> >
> > RTE RegEx Device API
> > --------------------
> >
> > Defines RTE RegEx Device APIs for RegEx operations and its provisioning.
> >
> > The RegEx Device API is composed of two parts:
> >
> > - The application-oriented RegEx API that includes functions to setup
> > a RegEx device (configure it, setup its queue pairs and start it),
> > update the rule database and so on.
> >
> > - The driver-oriented RegEx API that exports a function allowing
> > a RegEx Poll Mode Driver (PMD) to simultaneously register itself as
> > a RegEx device driver.
> >
> > RegEx device components and definitions:
> >
> >     +-----------------+
> >     |                 |
> >     |                 o---------+    rte_regex_[en|de]queue_burst()
> >     |   PCRE based    o------+  |               |
> >     |  RegEx pattern  |      |  |  +--------+   |
> >     | matching engine o------+--+--o        |   |    +------+
> >     |                 |      |  |  | queue  |<==o===>|Core 0|
> >     |                 o----+ |  |  | pair 0 |        |      |
> >     |                 |    | |  |  +--------+        +------+
> >     +-----------------+    | |  |
> >            ^               | |  |  +--------+
> >            |               | |  |  |        |        +------+
> >            |               | +--+--o queue  |<======>|Core 1|
> >        Rule|Database       |    |  | pair 1 |        |      |
> >     +------+----------+    |    |  +--------+        +------+
> >     |     Group 0     |    |    |
> >     | +-------------+ |    |    |  +--------+        +------+
> >     | | Rules 0..n  | |    |    |  |        |        |Core 2|
> >     | +-------------+ |    |    +--o queue  |<======>|      |
> >     |     Group 1     |    |       | pair 2 |        +------+
> >     | +-------------+ |    |       +--------+
> >     | | Rules 0..n  | |    |
> >     | +-------------+ |    |       +--------+
> >     |     Group 2     |    |       |        |        +------+
> >     | +-------------+ |    |       | queue  |<======>|Core n|
> >     | | Rules 0..n  | |    +-------o pair n |        |      |
> >     | +-------------+ |            +--------+        +------+
> >     |     Group n     |
> >     | +-------------+ |<-------rte_regex_rule_db_update()
> >     | | Rules 0..n  | |<-------rte_regex_rule_db_import()
> >     | +-------------+ |------->rte_regex_rule_db_export()
> >     +-----------------+
> >
> > RegEx: A regular expression is a concise and flexible means for matching
> > strings of text, such as particular characters, words, or patterns of
> > characters. A common abbreviation for this is “RegEx”.
> >
> > RegEx device: A hardware or software-based implementation of RegEx
> > device API for PCRE based pattern matching syntax and semantics.
> >
> > PCRE RegEx syntax and semantics specification:
> > http://regexkit.sourceforge.net/Documentation/pcre/pcrepattern.html
> >
> > RegEx queue pair: Each RegEx device should have one or more queue pairs to
> > transmit a burst of pattern matching requests and receive a burst of
> > pattern matching responses. The pattern matching request/response is
> > embedded in the *rte_regex_ops* structure.
> >
> > Rule: A pattern matching rule expressed in PCRE RegEx syntax along with
> > Match ID and Group ID to identify the rule upon the match.
> >
> > Rule database: The RegEx device accepts regular expressions and converts
> > them into a compiled rule database that can then be used to scan data.
> > Compilation allows the device to analyze the given pattern(s) and
> > pre-determine how to scan for these patterns in an optimized fashion that
> > would be far too expensive to compute at run-time. A rule database contains
> > a set of rules that are compiled in a device-specific binary form.
> >
> > Match ID or Rule ID: A unique identifier provided at the time of rule
> > creation for the application to identify the rule upon match.
> >
> > Group ID: Rules can be grouped under one group ID to enable
> > rule isolation and effective pattern matching. A unique group identifier is
> > provided at the time of rule creation for the application to identify the
> > rule upon match.
> >
> > Scan: A pattern matching request through *enqueue* API.
> >
> > A given RegEx device may not support all the features
> > of PCRE. The application may probe unsupported features through
> > struct rte_regex_dev_info::pcre_unsup_flags
> >
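
As a rough illustration of how an application might use this field before
loading a rule, the sketch below checks a set of required PCRE features
against the unsupported mask. The three bit values are the ones defined in
this RFC's header; the struct and helper names are hypothetical, and how the
application works out which features a pattern needs is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit positions as defined in this RFC's rte_regexdev.h. */
#define RTE_REGEX_DEV_PCRE_UNSUP_ATOMIC_GROUPING_F (1ULL << 1)
#define RTE_REGEX_DEV_PCRE_UNSUP_BACKREFERENCE_F   (1ULL << 4)
#define RTE_REGEX_DEV_PCRE_UNSUP_LOOKAROUND_ASRT_F (1ULL << 6)

/* Stand-in for the single field of struct rte_regex_dev_info used here. */
struct dev_info_sketch {
	uint64_t pcre_unsup_flags;
};

/*
 * Hypothetical helper: true when none of the PCRE features in 'needed'
 * are reported as unsupported by the device.
 */
static bool
rule_features_supported(const struct dev_info_sketch *info, uint64_t needed)
{
	assert(info != NULL);
	return (info->pcre_unsup_flags & needed) == 0;
}
```

An application could fall back to a software RegEx device for any rule where
this returns false.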
> > By default, all the functions of the RegEx Device API exported by a PMD
> > are lock-free functions which are assumed not to be invoked in parallel on
> > different logical cores to work on the same target object. For instance,
> > the dequeue function of a PMD cannot be invoked in parallel on two logical
> > cores to operate on the same RegEx queue pair. Of course, this function
> > can be invoked in parallel by different logical cores on different queue pairs.
> > It is the responsibility of the upper level application to enforce this rule.
> >
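
One way an application could enforce this rule is to record, at init time,
which lcore owns which queue pair and refuse a second owner. The sketch below
is hypothetical application-side bookkeeping, not part of the proposed API;
the table itself is not thread-safe and is meant to be filled in before the
worker cores are launched.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_QP 8
#define SKETCH_QP_FREE (-1)

/* Hypothetical table: which lcore owns which queue pair. */
struct qp_owner_table {
	int owner[SKETCH_MAX_QP];
	uint16_t nb_qp;
};

static void
qp_owner_init(struct qp_owner_table *t, uint16_t nb_qp)
{
	assert(t != NULL && nb_qp <= SKETCH_MAX_QP);
	t->nb_qp = nb_qp;
	for (uint16_t i = 0; i < nb_qp; i++)
		t->owner[i] = SKETCH_QP_FREE;
}

/*
 * Returns 0 if lcore_id now owns (or already owned) qp_id,
 * -EINVAL if qp_id is out of range, -EBUSY if another lcore owns it.
 */
static int
qp_owner_claim(struct qp_owner_table *t, uint16_t qp_id, int lcore_id)
{
	assert(t != NULL && lcore_id >= 0);
	if (qp_id >= t->nb_qp)
		return -EINVAL;
	if (t->owner[qp_id] != SKETCH_QP_FREE && t->owner[qp_id] != lcore_id)
		return -EBUSY;
	t->owner[qp_id] = lcore_id;
	return 0;
}
```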
> > In all functions of the RegEx API, the RegEx device is
> > designated by an integer >= 0 named the device identifier *dev_id*
> >
> > At the RegEx driver level, RegEx devices are represented by a generic
> > data structure of type *rte_regex_dev*.
> >
> > RegEx devices are dynamically registered during the PCI/SoC device probing
> > phase performed at EAL initialization time.
> > When a RegEx device is being probed, a *rte_regex_dev* structure and
> > a new device identifier are allocated for that device. Then, the
> > regex_dev_init() function supplied by the RegEx driver matching the probed
> > device is invoked to properly initialize the device.
> >
> > The role of the device init function consists of resetting the hardware or
> > software RegEx driver implementations.
> >
> > If the device init operation is successful, the correspondence between
> > the device identifier assigned to the new device and its associated
> > *rte_regex_dev* structure is effectively registered.
> > Otherwise, both the *rte_regex_dev* structure and the device identifier are
> > freed.
> >
> > The functions exported by the application RegEx API to setup a device
> > designated by its device identifier must be invoked in the following order:
> > - rte_regex_dev_configure()
> > - rte_regex_queue_pair_setup()
> > - rte_regex_dev_start()
> >
> > Then, the application can invoke, in any order, the functions
> > exported by the RegEx API to enqueue pattern matching jobs, dequeue pattern
> > matching responses, get the stats, update the rule database,
> > get/set device attributes and so on.
> >
> > If the application wants to change the configuration (i.e. call
> > rte_regex_dev_configure() or rte_regex_queue_pair_setup()), it must call
> > rte_regex_dev_stop() first to stop the device and then do the
> > reconfiguration
> > before calling rte_regex_dev_start() again. The enqueue and dequeue
> > functions should not be invoked when the device is stopped.
> >
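
The ordering constraints above can be read as a small state machine. The
sketch below models them with hypothetical names only; it does not call any
rte_regex_* function. Two points are assumptions rather than statements from
the RFC: that queue pairs stay set up across a stop, and that a fresh
configure requires the queue pairs to be set up again.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

enum lc_state { LC_UNCONFIGURED, LC_CONFIGURED, LC_QP_READY, LC_STARTED };

struct lc_dev {
	enum lc_state state;
};

static int
lc_configure(struct lc_dev *d)
{
	assert(d != NULL);
	if (d->state == LC_STARTED)
		return -EBUSY;		/* must be stopped first */
	d->state = LC_CONFIGURED;	/* assumed: queue pairs need redoing */
	return 0;
}

static int
lc_queue_pair_setup(struct lc_dev *d)
{
	assert(d != NULL);
	if (d->state == LC_STARTED)
		return -EBUSY;		/* must be stopped first */
	if (d->state == LC_UNCONFIGURED)
		return -EINVAL;		/* configure comes first */
	d->state = LC_QP_READY;
	return 0;
}

static int
lc_start(struct lc_dev *d)
{
	assert(d != NULL);
	if (d->state != LC_QP_READY)
		return -EINVAL;
	d->state = LC_STARTED;
	return 0;
}

static int
lc_stop(struct lc_dev *d)
{
	assert(d != NULL);
	if (d->state != LC_STARTED)
		return -EINVAL;
	d->state = LC_QP_READY;		/* assumed: queue pairs survive */
	return 0;
}

/* Enqueue and dequeue are only legal while the device is started. */
static bool
lc_fast_path_allowed(const struct lc_dev *d)
{
	assert(d != NULL);
	return d->state == LC_STARTED;
}
```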
> > Finally, an application can close a RegEx device by invoking the
> > rte_regex_dev_close() function.
> >
> > Each function of the application RegEx API invokes a specific function
> > of the PMD that controls the target device designated by its device
> > identifier.
> >
> > For this purpose, all device-specific functions of a RegEx driver are
> > supplied through a set of pointers contained in a generic structure of type
> > *regex_dev_ops*.
> > The address of the *regex_dev_ops* structure is stored in the
> > *rte_regex_dev*
> > structure by the device init function of the RegEx driver, which is
> > invoked during the PCI/SoC device probing phase, as explained earlier.
> >
> > In other words, each function of the RegEx API simply retrieves the
> > *rte_regex_dev* structure associated with the device identifier and
> > performs an indirect invocation of the corresponding driver function
> > supplied in the *regex_dev_ops* structure of the *rte_regex_dev*
> > structure.
> >
> > For performance reasons, the addresses of the fast-path functions of the
> > RegEx driver are not contained in the *regex_dev_ops* structure.
> > Instead, they are directly stored at the beginning of the *rte_regex_dev*
> > structure to avoid an extra indirect memory access during their invocation.
> >
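
For readers less familiar with this layout from ethdev and cryptodev, a
minimal sketch of the idea follows. All type and function names are
hypothetical and the actual *rte_regex_dev* layout is not defined in this
RFC; the point is only that the burst function pointers sit at the start of
the device structure, so the fast path needs one pointer load instead of
going through the ops table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct regex_op_sketch;		/* stands in for rte_regex_ops */
struct regex_dev_ops_sketch;	/* slow-path callbacks, opaque here */

typedef uint16_t (*burst_fn_t)(void *dev_private, uint16_t qp_id,
			       struct regex_op_sketch **ops, uint16_t nb_ops);

struct regex_dev_sketch {
	burst_fn_t enqueue;	/* fast path: first in the structure */
	burst_fn_t dequeue;
	const struct regex_dev_ops_sketch *dev_ops; /* slow path, extra hop */
	void *dev_private;
};

#define SKETCH_MAX_DEVS 4
static struct regex_dev_sketch regex_devs[SKETCH_MAX_DEVS];

/* API-level wrapper: look up the device by id and call the driver. */
static inline uint16_t
regex_enqueue_burst_sketch(uint8_t dev_id, uint16_t qp_id,
			   struct regex_op_sketch **ops, uint16_t nb_ops)
{
	struct regex_dev_sketch *dev;

	assert(dev_id < SKETCH_MAX_DEVS);
	dev = &regex_devs[dev_id];
	return dev->enqueue(dev->dev_private, qp_id, ops, nb_ops);
}

/* Stand-in PMD callback: accepts every op and counts them. */
static uint16_t
null_pmd_enqueue(void *dev_private, uint16_t qp_id,
		 struct regex_op_sketch **ops, uint16_t nb_ops)
{
	uint64_t *seen = dev_private;

	(void)qp_id;
	(void)ops;
	*seen += nb_ops;
	return nb_ops;
}
```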
> > RTE RegEx device drivers do not use interrupts for enqueue or dequeue
> > operation. Instead, RegEx drivers export Poll-Mode enqueue and dequeue
> > functions to applications.
> >
> > The *enqueue* operation submits a burst of RegEx pattern matching requests
> > to the RegEx device and the *dequeue* operation gets a burst of pattern
> > matching responses for the ones submitted through the *enqueue* operation.
> >
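
A sketch of what the poll-mode loop could look like from the application
side, using a toy in-memory queue pair rather than the proposed API. The
names are hypothetical, and the toy device simply echoes each job back as
its response. It shows the usual burst pattern: enqueue may accept fewer
jobs than offered, so the caller keeps polling both directions until
everything has come back.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_QP_DEPTH 4

/* Toy queue pair: a small FIFO of job identifiers. */
struct qp_sketch {
	int jobs[SKETCH_QP_DEPTH];
	uint16_t count;
};

/* Accept as many jobs as fit; return how many were taken. */
static uint16_t
qp_enqueue_burst(struct qp_sketch *qp, const int *jobs, uint16_t nb)
{
	uint16_t n = 0;

	while (n < nb && qp->count < SKETCH_QP_DEPTH)
		qp->jobs[qp->count++] = jobs[n++];
	return n;
}

/* Hand back up to nb completed jobs in FIFO order. */
static uint16_t
qp_dequeue_burst(struct qp_sketch *qp, int *out, uint16_t nb)
{
	uint16_t n = 0;

	while (n < nb && n < qp->count) {
		out[n] = qp->jobs[n];
		n++;
	}
	for (uint16_t i = n; i < qp->count; i++)
		qp->jobs[i - n] = qp->jobs[i];
	qp->count = (uint16_t)(qp->count - n);
	return n;
}

/* Submit nb jobs and poll until nb responses have been collected. */
static uint16_t
scan_all(struct qp_sketch *qp, const int *jobs, int *resp, uint16_t nb)
{
	uint16_t sent = 0, done = 0;

	assert(qp != NULL);
	while (done < nb) {
		sent = (uint16_t)(sent + qp_enqueue_burst(qp, jobs + sent,
						(uint16_t)(nb - sent)));
		done = (uint16_t)(done + qp_dequeue_burst(qp, resp + done,
						(uint16_t)(nb - done)));
	}
	return done;
}
```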
> > Typical application utilisation of the RegEx device API will follow the
> > following programming flow.
> >
> > - rte_regex_dev_configure()
> > - rte_regex_queue_pair_setup()
> > - rte_regex_rule_db_update(). Needs to be invoked if a precompiled rule
> >   database is not provided in rte_regex_dev_config::rule_db for
> >   rte_regex_dev_configure() and/or the application needs to update the rule
> >   database.
> > - Create or reuse an existing mempool for *rte_regex_ops* objects.
> > - rte_regex_dev_start()
> > - rte_regex_enqueue_burst()
> > - rte_regex_dequeue_burst()
> >
> > ---
> >
> > config/common_base                 |    5 +
> > doc/api/doxy-api-index.md          |    1 +
> > doc/api/doxy-api.conf.in           |    1 +
> > lib/Makefile                       |    2 +
> > lib/librte_regexdev/Makefile       |   23 +
> > lib/librte_regexdev/rte_regexdev.c |    5 +
> > lib/librte_regexdev/rte_regexdev.h | 1247 ++++++++++++++++++++++++++++
> > 7 files changed, 1284 insertions(+)
> > create mode 100644 lib/librte_regexdev/Makefile
> > create mode 100644 lib/librte_regexdev/rte_regexdev.c
> > create mode 100644 lib/librte_regexdev/rte_regexdev.h
> >
> > diff --git a/config/common_base b/config/common_base
> > index e406e7836..986093d6e 100644
> > --- a/config/common_base
> > +++ b/config/common_base
> > @@ -746,6 +746,11 @@
> > CONFIG_RTE_LIBRTE_PMD_DPAA2_QDMA_RAWDEV=n
> > #
> > CONFIG_RTE_LIBRTE_PMD_IFPGA_RAWDEV=y
> >
> > +#
> > +# Compile regex device support
> > +#
> > +CONFIG_RTE_LIBRTE_REGEXDEV=y
> > +
> > #
> > # Compile librte_ring
> > #
> > diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
> > index 715248dd1..a0bc27ae4 100644
> > --- a/doc/api/doxy-api-index.md
> > +++ b/doc/api/doxy-api-index.md
> > @@ -26,6 +26,7 @@ The public API headers are grouped by topics:
> > [event_timer_adapter]    (@ref rte_event_timer_adapter.h),
> > [event_crypto_adapter]   (@ref rte_event_crypto_adapter.h),
> > [rawdev]             (@ref rte_rawdev.h),
> > +  [regexdev]           (@ref rte_regexdev.h),
> > [metrics]            (@ref rte_metrics.h),
> > [bitrate]            (@ref rte_bitrate.h),
> > [latency]            (@ref rte_latencystats.h),
> > diff --git a/doc/api/doxy-api.conf.in b/doc/api/doxy-api.conf.in
> > index b9896cb63..7adb821bb 100644
> > --- a/doc/api/doxy-api.conf.in
> > +++ b/doc/api/doxy-api.conf.in
> > @@ -53,6 +53,7 @@ INPUT                   = @TOPDIR@/doc/api/doxy-api-
> > index.md \
> > @TOPDIR@/lib/librte_rawdev \
> > @TOPDIR@/lib/librte_rcu \
> > @TOPDIR@/lib/librte_reorder \
> > +                          @TOPDIR@/lib/librte_regexdev \
> > @TOPDIR@/lib/librte_ring \
> > @TOPDIR@/lib/librte_sched \
> > @TOPDIR@/lib/librte_security \
> > diff --git a/lib/Makefile b/lib/Makefile
> > index 791e0d991..57de9691a 100644
> > --- a/lib/Makefile
> > +++ b/lib/Makefile
> > @@ -44,6 +44,8 @@ DEPDIRS-librte_eventdev := librte_eal librte_ring
> > librte_ethdev librte_hash \
> > librte_mempool librte_timer librte_cryptodev
> > DIRS-$(CONFIG_RTE_LIBRTE_RAWDEV) += librte_rawdev
> > DEPDIRS-librte_rawdev := librte_eal librte_ethdev
> > +DIRS-$(CONFIG_RTE_LIBRTE_REGEXDEV) += librte_regexdev
> > +DEPDIRS-librte_regexdev := librte_eal
> > DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += librte_vhost
> > DEPDIRS-librte_vhost := librte_eal librte_mempool librte_mbuf
> > librte_ethdev \
> > 			librte_net
> > diff --git a/lib/librte_regexdev/Makefile b/lib/librte_regexdev/Makefile
> > new file mode 100644
> > index 000000000..723b4b28c
> > --- /dev/null
> > +++ b/lib/librte_regexdev/Makefile
> > @@ -0,0 +1,23 @@
> > +# SPDX-License-Identifier: BSD-3-Clause
> > +# Copyright(C) 2019 Marvell International Ltd.
> > +#
> > +
> > +include $(RTE_SDK)/mk/rte.vars.mk
> > +
> > +# library name
> > +LIB = librte_regexdev.a
> > +
> > +# library version
> > +LIBABIVER := 1
> > +
> > +# build flags
> > +CFLAGS += -O3
> > +CFLAGS += $(WERROR_FLAGS)
> > +
> > +# library source files
> > +SRCS-y += rte_regexdev.c
> > +
> > +# export include files
> > +SYMLINK-y-include += rte_regexdev.h
> > +
> > +include $(RTE_SDK)/mk/rte.lib.mk
> > diff --git a/lib/librte_regexdev/rte_regexdev.c
> > b/lib/librte_regexdev/rte_regexdev.c
> > new file mode 100644
> > index 000000000..e5be0f29c
> > --- /dev/null
> > +++ b/lib/librte_regexdev/rte_regexdev.c
> > @@ -0,0 +1,5 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright(C) 2019 Marvell International Ltd.
> > + */
> > +
> > +#include <rte_regexdev.h>
> > diff --git a/lib/librte_regexdev/rte_regexdev.h
> > b/lib/librte_regexdev/rte_regexdev.h
> > new file mode 100644
> > index 000000000..765da4aaa
> > --- /dev/null
> > +++ b/lib/librte_regexdev/rte_regexdev.h
> > @@ -0,0 +1,1247 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright(C) 2019 Marvell International Ltd.
> > + */
> > +
> > +#ifndef _RTE_REGEXDEV_H_
> > +#define _RTE_REGEXDEV_H_
> > +
> > +/**
> > + * @file
> > + *
> > + * RTE RegEx Device API
> > + *
> > + * Defines RTE RegEx Device APIs for RegEx operations and its provisioning.
> > + *
> > + * The RegEx Device API is composed of two parts:
> > + *
> > + * - The application-oriented RegEx API that includes functions to setup
> > + *   a RegEx device (configure it, setup its queue pairs and start it),
> > + *   update the rule database and so on.
> > + *
> > + * - The driver-oriented RegEx API that exports a function allowing
> > + *   a RegEx Poll Mode Driver (PMD) to simultaneously register itself as
> > + *   a RegEx device driver.
> > + *
> > + * RegEx device components and definitions:
> > + *
> > + *     +-----------------+
> > + *     |                 |
> > + *     |                 o---------+    rte_regex_[en|de]queue_burst()
> > + *     |   PCRE based    o------+  |               |
> > + *     |  RegEx pattern  |      |  |  +--------+   |
> > + *     | matching engine o------+--+--o        |   |    +------+
> > + *     |                 |      |  |  | queue  |<==o===>|Core 0|
> > + *     |                 o----+ |  |  | pair 0 |        |      |
> > + *     |                 |    | |  |  +--------+        +------+
> > + *     +-----------------+    | |  |
> > + *            ^               | |  |  +--------+
> > + *            |               | |  |  |        |        +------+
> > + *            |               | +--+--o queue  |<======>|Core 1|
> > + *        Rule|Database       |    |  | pair 1 |        |      |
> > + *     +------+----------+    |    |  +--------+        +------+
> > + *     |     Group 0     |    |    |
> > + *     | +-------------+ |    |    |  +--------+        +------+
> > + *     | | Rules 0..n  | |    |    |  |        |        |Core 2|
> > + *     | +-------------+ |    |    +--o queue  |<======>|      |
> > + *     |     Group 1     |    |       | pair 2 |        +------+
> > + *     | +-------------+ |    |       +--------+
> > + *     | | Rules 0..n  | |    |
> > + *     | +-------------+ |    |       +--------+
> > + *     |     Group 2     |    |       |        |        +------+
> > + *     | +-------------+ |    |       | queue  |<======>|Core n|
> > + *     | | Rules 0..n  | |    +-------o pair n |        |      |
> > + *     | +-------------+ |            +--------+        +------+
> > + *     |     Group n     |
> > + *     | +-------------+ |<-------rte_regex_rule_db_update()
> > + *     | | Rules 0..n  | |<-------rte_regex_rule_db_import()
> > + *     | +-------------+ |------->rte_regex_rule_db_export()
> > + *     +-----------------+
> > + *
> > + * RegEx: A regular expression is a concise and flexible means for matching
> > + * strings of text, such as particular characters, words, or patterns of
> > + * characters. A common abbreviation for this is “RegEx”.
> > + *
> > + * RegEx device: A hardware or software-based implementation of RegEx
> > + * device API for PCRE based pattern matching syntax and semantics.
> > + *
> > + * PCRE RegEx syntax and semantics specification:
> > + * http://regexkit.sourceforge.net/Documentation/pcre/pcrepattern.html
> > + *
> > + * RegEx queue pair: Each RegEx device should have one or more queue pairs to
> > + * transmit a burst of pattern matching requests and receive a burst of
> > + * pattern matching responses. The pattern matching request/response is
> > + * embedded in the *rte_regex_ops* structure.
> > + *
> > + * Rule: A pattern matching rule expressed in PCRE RegEx syntax along with
> > + * Match ID and Group ID to identify the rule upon the match.
> > + *
> > + * Rule database: The RegEx device accepts regular expressions and converts
> > + * them into a compiled rule database that can then be used to scan data.
> > + * Compilation allows the device to analyze the given pattern(s) and
> > + * pre-determine how to scan for these patterns in an optimized fashion that
> > + * would be far too expensive to compute at run-time. A rule database contains
> > + * a set of rules that are compiled in a device-specific binary form.
> > + *
> > + * Match ID or Rule ID: A unique identifier provided at the time of rule
> > + * creation for the application to identify the rule upon match.
> > + *
> > + * Group ID: Rules can be grouped under one group ID to enable
> > + * rule isolation and effective pattern matching. A unique group identifier is
> > + * provided at the time of rule creation for the application to identify the
> > + * rule upon match.
> > + *
> > + * Scan: A pattern matching request through *enqueue* API.
> > + *
> > + * A given RegEx device may not support all the features
> > + * of PCRE. The application may probe unsupported features through
> > + * struct rte_regex_dev_info::pcre_unsup_flags
> > + *
> > + * By default, all the functions of the RegEx Device API exported by a PMD
> > + * are lock-free functions which are assumed not to be invoked in parallel on
> > + * different logical cores to work on the same target object. For instance,
> > + * the dequeue function of a PMD cannot be invoked in parallel on two logical
> > + * cores to operate on the same RegEx queue pair. Of course, this function
> > + * can be invoked in parallel by different logical cores on different queue
> > + * pairs.
> > + * It is the responsibility of the upper level application to enforce this rule.
> > + *
> > + * In all functions of the RegEx API, the RegEx device is
> > + * designated by an integer >= 0 named the device identifier *dev_id*
> > + *
> > + * At the RegEx driver level, RegEx devices are represented by a generic
> > + * data structure of type *rte_regex_dev*.
> > + *
> > + * RegEx devices are dynamically registered during the PCI/SoC device
> > probing
> > + * phase performed at EAL initialization time.
> > + * When a RegEx device is being probed, a *rte_regex_dev* structure and
> > + * a new device identifier are allocated for that device. Then, the
> > + * regex_dev_init() function supplied by the RegEx driver matching the
> > probed
> > + * device is invoked to properly initialize the device.
> > + *
> > + * The role of the device init function consists of resetting the hardware or
> > + * software RegEx driver implementations.
> > + *
> > + * If the device init operation is successful, the correspondence between
> > + * the device identifier assigned to the new device and its associated
> > + * *rte_regex_dev* structure is effectively registered.
> > + * Otherwise, both the *rte_regex_dev* structure and the device identifier
> > are
> > + * freed.
> > + *
> > + * The functions exported by the application RegEx API to setup a device
> > + * designated by its device identifier must be invoked in the following order:
> > + *     - rte_regex_dev_configure()
> > + *     - rte_regex_queue_pair_setup()
> > + *     - rte_regex_dev_start()
> > + *
> > + * Then, the application can invoke, in any order, the functions
> > + * exported by the RegEx API to enqueue pattern matching job, dequeue
> > pattern
> > + * matching response, get the stats, update the rule database,
> > + * get/set device attributes and so on
> > + *
> > + * If the application wants to change the configuration (i.e. call
> > + * rte_regex_dev_configure() or rte_regex_queue_pair_setup()), it must
> > call
> > + * rte_regex_dev_stop() first to stop the device and then do the
> > reconfiguration
> > + * before calling rte_regex_dev_start() again. The enqueue and dequeue
> > + * functions should not be invoked when the device is stopped.
> > + *
> > + * Finally, an application can close a RegEx device by invoking the
> > + * rte_regex_dev_close() function.
> > + *
> > + * Each function of the application RegEx API invokes a specific function
> > + * of the PMD that controls the target device designated by its device
> > + * identifier.
> > + *
> > + * For this purpose, all device-specific functions of a RegEx driver are
> > + * supplied through a set of pointers contained in a generic structure of type
> > + * *regex_dev_ops*.
> > + * The address of the *regex_dev_ops* structure is stored in the
> > *rte_regex_dev*
> > + * structure by the device init function of the RegEx driver, which is
> > + * invoked during the PCI/SoC device probing phase, as explained earlier.
> > + *
> > + * In other words, each function of the RegEx API simply retrieves the
> > + * *rte_regex_dev* structure associated with the device identifier and
> > + * performs an indirect invocation of the corresponding driver function
> > + * supplied in the *regex_dev_ops* structure of the *rte_regex_dev*
> > structure.
> > + *
> > + * For performance reasons, the addresses of the fast-path functions of the
> > + * RegEx driver are not contained in the *regex_dev_ops* structure.
> > + * Instead, they are directly stored at the beginning of the *rte_regex_dev*
> > + * structure to avoid an extra indirect memory access during their
> > invocation.
> > + *
> > + * RTE RegEx device drivers do not use interrupts for enqueue or dequeue
> > + * operation. Instead, RegEx drivers export Poll-Mode enqueue and
> > dequeue
> > + * functions to applications.
> > + *
> > + * The *enqueue* operation submits a burst of RegEx pattern matching requests
> > + * to the RegEx device and the *dequeue* operation gets a burst of pattern
> > + * matching responses for the ones submitted through the *enqueue* operation.
> > + *
> > + * Typical application utilisation of the RegEx device API will follow the
> > + * following programming flow.
> > + *
> > + * - rte_regex_dev_configure()
> > + * - rte_regex_queue_pair_setup()
> > + * - rte_regex_rule_db_update(). Needs to be invoked if a precompiled rule
> > + *   database is not provided in rte_regex_dev_config::rule_db for
> > + *   rte_regex_dev_configure() and/or the application needs to update the rule
> > + *   database.
> > + * - Create or reuse an existing mempool for *rte_regex_ops* objects.
> > + * - rte_regex_dev_start()
> > + * - rte_regex_enqueue_burst()
> > + * - rte_regex_dequeue_burst()
> > + *
> > + */
> > +
> > +#ifdef __cplusplus
> > +extern "C" {
> > +#endif
> > +
> > +#include <rte_common.h>
> > +#include <rte_config.h>
> > +#include <rte_dev.h>
> > +#include <rte_errno.h>
> > +#include <rte_memory.h>
> > +
> > +/**
> > + * Get the total number of RegEx devices that have been successfully
> > + * initialised.
> > + *
> > + * @return
> > + *   The total number of usable RegEx devices.
> > + */
> > +uint8_t
> > +rte_regex_dev_count(void);
> > +
> > +/**
> > + * Get the device identifier for the named RegEx device.
> > + *
> > + * @param name
> > + *   RegEx device name to select the RegEx device identifier.
> > + *
> > + * @return
> > + *   Returns RegEx device identifier on success.
> > + *   - <0: Failure to find named RegEx device.
> > + */
> > +int
> > +rte_regex_dev_get_dev_id(const char *name);
> > +
> > +/* Enumerates RegEx device capabilities */
> > +#define RTE_REGEX_DEV_CAPA_RUNTIME_COMPILATION_F (1ULL << 0)
> > +/**< RegEx device supports compiling the rules at runtime, rather than
> > + * only loading a pre-built rule database using
> > + * struct rte_regex_dev_config::rule_db in rte_regex_dev_configure()
> > + * @see struct rte_regex_dev_config::rule_db, rte_regex_dev_configure()
> > + * @see struct rte_regex_dev_info::regex_dev_capa
> > + */
> > +
> > +
> > +/* Enumerates unsupported PCRE features for the RegEx device */
> > +#define RTE_REGEX_DEV_PCRE_UNSUP_START_ANCHOR_F (1ULL << 0)
> > +/**< RegEx device doesn't support PCRE Anchor to start of match flag.
> > + * Example RegEx is '/\Gfoo\d/'. Here '\G' asserts position at the end of the
> > + * previous match or the start of the string for the first match.
> > + * This position will change each time the RegEx is applied to the subject
> > + * string. If the RegEx is applied to 'foo1foo2Zfoo3' the first two matches will
> > + * be successful for 'foo1foo2' and fail for 'Zfoo3'.
> > + * @see struct rte_regex_dev_info::pcre_unsup_flags
> > + */
> > +
> > +#define RTE_REGEX_DEV_PCRE_UNSUP_ATOMIC_GROUPING_F (1ULL <<
> > 1)
> > +/**< RegEx device doesn't support PCRE Atomic grouping.
> > + * Atomic groups are represented by '(?>)'. An atomic group is a group that,
> > + * when the RegEx engine exits from it, automatically throws away all
> > + * backtracking positions remembered by any tokens inside the group.
> > + * Example RegEx is 'a(?>bc|b)c'. If the given patterns are 'abc' and 'abcc',
> > + * then 'a(bc|b)c' matches both whereas 'a(?>bc|b)c' matches only 'abcc'
> > + * because atomic groups don't allow backtracking back to 'b'.
> > + * @see struct rte_regex_dev_info::pcre_unsup_flags
> > + */
> > +
> > +#define RTE_REGEX_DEV_PCRE_UNSUP_BACKTRACKING_CTRL_F (1ULL <<
> > 2)
> > +/**< RegEx device doesn't support PCRE backtracking control verbs.
> > + * Some examples of backtracking verbs are (*COMMIT), (*ACCEPT), (*FAIL),
> > + * (*SKIP), (*PRUNE).
> > + * @see struct rte_regex_dev_info::pcre_unsup_flags
> > + */
> > +
> > +#define RTE_REGEX_DEV_PCRE_UNSUP_CALLOUTS_F (1ULL << 3)
> > +/**< RegEx device doesn't support PCRE callouts.
> > + * PCRE supports calling an external function in between matches by using
> > + * '(?C)'. Example RegEx is 'ABC(?C)D'. If the given pattern is 'ABCD', then
> > + * the RegEx engine will parse 'ABC', perform a user-defined callout and
> > + * return a successful match at 'D'.
> > + * @see struct rte_regex_dev_info::pcre_unsup_flags
> > + */
> > +
> > +#define RTE_REGEX_DEV_PCRE_UNSUP_BACKREFERENCE_F (1ULL << 4)
> > +/**< RegEx device doesn't support PCRE backreference.
> > + * Example RegEx is '(\2ABC|(GHI))+'. Here '\2' matches the same text as was
> > + * most recently matched by the 2nd capturing group, i.e. 'GHI'.
> > + * @see struct rte_regex_dev_info::pcre_unsup_flags
> > + */
> > +
> > +#define RTE_REGEX_DEV_PCRE_UNSUP_GREEDY_F (1ULL << 5)
> > +/**< RegEx device doesn't support PCRE Greedy mode.
> > + * For example, if the RegEx is 'AB\d*?' then '*?' represents zero or
> > + * unlimited matches. In greedy mode the pattern 'AB12345' will be matched
> > + * completely, whereas in ungreedy mode 'AB' will be returned as the match.
> > + * @see struct rte_regex_dev_info::pcre_unsup_flags
> > + */
> > +
> > +#define RTE_REGEX_DEV_PCRE_UNSUP_LOOKAROUND_ASRT_F (1ULL <<
> > 6)
> > +/**< RegEx device doesn't support PCRE Lookaround assertions
> > + * (Zero-width assertions). Example RegEx is '[a-z]+\d+(?=!{3,})' if
> > + * the given pattern is 'dwad1234!' the RegEx engine doesn't report any
> > matches
> > + * because the assert '(?=!{3,})' fails. The pattern 'dwad123!!!' would return
> > a
> > + * successful match.
> > + * @see struct rte_regex_dev_info::pcre_unsup_flags
> > + */
> > +
> > +#define RTE_REGEX_DEV_PCRE_UNSUP_MATCH_POINT_RST_F (1ULL <<
> > 7)
> > +/**< RegEx device doesn't support PCRE match point reset directive.
> > + * Example RegEx is '[a-z]+\K\d+' if the pattern is 'dwad123'
> > + * then even though the entire pattern matches only '123'
> > + * is reported as a match.
> > + * @see struct rte_regex_dev_info::pcre_unsup_flags
> > + */
> > +
> > +#define RTE_REGEX_DEV_PCRE_UNSUP_NEWLINE_CONVENTIONS_F
> > (1ULL << 8)
> > +/**< RegEx device doesn't support PCRE newline convention.
> > + * Newline conventions are represented as follows:
> > + * (*CR)        carriage return
> > + * (*LF)        linefeed
> > + * (*CRLF)      carriage return, followed by linefeed
> > + * (*ANYCRLF)   any of the three above
> > + * (*ANY)       all Unicode newline sequences
> > + * @see struct rte_regex_dev_info::pcre_unsup_flags
> > + */
> > +
> > +#define RTE_REGEX_DEV_PCRE_UNSUP_NEWLINE_SEQ_F (1ULL << 9)
> > +/**< RegEx device doesn't support PCRE newline sequence.
> > + * The escape sequence '\R' will match any newline sequence.
> > + * It is equivalent to: '(?>\r\n|\n|\x0b|\f|\r|\x85)'.
> > + * @see struct rte_regex_dev_info::pcre_unsup_flags
> > + */
> > +
> > +#define RTE_REGEX_DEV_PCRE_UNSUP_POSSESSIVE_QUALIFIERS_F (1ULL
> > << 10)
> > +/**< RegEx device doesn't support PCRE possessive quantifiers.
> > + * Example RegEx possessive quantifiers are '*+', '++', '?+', '{m,n}+'.
> > + * A possessive quantifier repeats the token as many times as possible and
> > + * does not give up matches as the engine backtracks. With a possessive
> > + * quantifier, the deal is all or nothing.
> > + * @see struct rte_regex_dev_info::pcre_unsup_flags
> > + */
> > +
> > +#define RTE_REGEX_DEV_PCRE_UNSUP_SUBROUTINE_REFERENCES_F
> > (1ULL << 11)
> > +/**< RegEx device doesn't support PCRE Subroutine references.
> > + * PCRE Subroutine references allow for sub patterns to be assessed
> > + * as part of the RegEx. Example RegEx is '(foo|fuzz)\g<1>+bar' matches the
> > + * pattern 'foofoofuzzfoofuzzbar'.
> > + * @see struct rte_regex_dev_info::pcre_unsup_flags
> > + */
> > +
> > +#define RTE_REGEX_DEV_PCRE_UNSUP_UTF_8_F (1ULL << 12)
> > +/**< RegEx device doesn't support UTF-8 character encoding.
> > + * @see struct rte_regex_dev_info::pcre_unsup_flags
> > + */
> > +
> > +#define RTE_REGEX_DEV_PCRE_UNSUP_UTF_16_F (1ULL << 13)
> > +/**< RegEx device doesn't support UTF-16 character encoding.
> > + * @see struct rte_regex_dev_info::pcre_unsup_flags
> > + */
> > +
> > +#define RTE_REGEX_DEV_PCRE_UNSUP_UTF_32_F (1ULL << 14)
> > +/**< RegEx device doesn't support UTF-32 character encoding.
> > + * @see struct rte_regex_dev_info::pcre_unsup_flags
> > + */
> > +
> > +#define RTE_REGEX_DEV_PCRE_UNSUP_WORD_BOUNDARY_F (1ULL << 15)
> > +/**< RegEx device doesn't support word boundaries.
> > + * The meta character '\b' represents word boundary anchor.
> > + * @see struct rte_regex_dev_info::pcre_unsup_flags
> > + */
> > +
> > +#define RTE_REGEX_DEV_PCRE_UNSUP_FORWARD_REFERENCES_F (1ULL << 16)
> > +/**< RegEx device doesn't support forward references.
> > + * Forward references allow you to use a back reference to a group that
> > + * appears later in the RegEx. Example: the RegEx '(\3ABC|(DEF|(GHI)))+'
> > + * matches the string 'GHIGHIABCDEF'.
> > + * @see struct rte_regex_dev_info::pcre_unsup_flags
> > + */
> > +
> > +/* Enumerates PCRE rule flags */
> > +#define RTE_REGEX_PCRE_RULE_ALLOW_EMPTY_F (1ULL << 0)
> > +/**< When this flag is set, patterns that can match an empty string,
> > + * such as '.*', are allowed.
> > + * @see struct rte_regex_dev_info::rule_flags,
> > + * struct rte_regex_rule::rule_flags
> > + */
> > +
> > +#define RTE_REGEX_PCRE_RULE_ANCHORED_F (1ULL << 1)
> > +/**< When this flag is set, the pattern is forced to be "anchored", that is, it
> > + * is constrained to match only at the first matching point in the string that
> > + * is being searched. Similar to '^' and represented by \A.
> > + * @see struct rte_regex_dev_info::rule_flags, struct
> > rte_regex_rule::rule_flags
> > + */
> > +
> > +#define RTE_REGEX_PCRE_RULE_CASELESS_F (1ULL << 2)
> > +/**< When this flag is set, letters in the pattern match both upper and
> > + * lower case letters in the subject.
> > + * @see struct rte_regex_dev_info::rule_flags,
> > + * struct rte_regex_rule::rule_flags
> > + */
> > +
> > +#define RTE_REGEX_PCRE_RULE_DOTALL_F (1ULL << 3)
> > +/**< When this flag is set, a dot metacharacter in the pattern matches any
> > + * character, including one that indicates a newline.
> > + * @see struct rte_regex_dev_info::rule_flags, struct
> > rte_regex_rule::rule_flags
> > + */
> > +
> > +#define RTE_REGEX_PCRE_RULE_DUPNAMES_F (1ULL << 4)
> > +/**< When this flag is set, names used to identify capture groups need not
> > be
> > + * unique.
> > + * @see struct rte_regex_dev_info::rule_flags, struct
> > rte_regex_rule::rule_flags
> > + */
> > +
> > +#define RTE_REGEX_PCRE_RULE_EXTENDED_F (1ULL << 5)
> > +/**< When this flag is set, most white space characters in the pattern are
> > + * totally ignored except when escaped or inside a character class.
> > + * @see struct rte_regex_dev_info::rule_flags, struct
> > rte_regex_rule::rule_flags
> > + */
> > +
> > +#define RTE_REGEX_PCRE_RULE_MATCH_UNSET_BACKREF_F (1ULL << 6)
> > +/**< When this flag is set, a backreference to an unset capture group
> > matches an
> > + * empty string.
> > + * @see RTE_REGEX_DEV_PCRE_UNSUP_FORWARD_REFERENCES_F
> > + * @see struct rte_regex_dev_info::rule_flags, struct
> > rte_regex_rule::rule_flags
> > + */
> > +
> > +#define RTE_REGEX_PCRE_RULE_MULTILINE_F (1ULL << 7)
> > +/**< When this flag is set, the '^' and '$' constructs match immediately
> > + * following or immediately before internal newlines in the subject string,
> > + * respectively, as well as at the very start and end.
> > + * @see struct rte_regex_dev_info::rule_flags, struct
> > rte_regex_rule::rule_flags
> > + */
> > +
> > +#define RTE_REGEX_PCRE_RULE_NO_AUTO_CAPTURE_F (1ULL << 8)
> > +/**< When this flag is set, it disables the use of numbered capturing
> > + * parentheses in the pattern. References to capture groups
> > + * (backreferences or recursion/subroutine calls) may only refer to named
> > + * groups, though the reference can be by name or by number.
> > + * @see struct rte_regex_dev_info::rule_flags,
> > + * struct rte_regex_rule::rule_flags
> > + */
> > +
> > +#define RTE_REGEX_PCRE_RULE_UCP_F (1ULL << 9)
> > +/**< By default, only ASCII characters are recognized. When this flag is set,
> > + * Unicode properties are used instead to classify characters.
> > + * @see struct rte_regex_dev_info::rule_flags, struct
> > rte_regex_rule::rule_flags
> > + */
> > +
> > +#define RTE_REGEX_PCRE_RULE_UNGREEDY_F (1ULL << 10)
> > +/**< When this flag is set, the "greediness" of the quantifiers is inverted
> > + * so that they are not greedy by default, but become greedy if followed by
> > + * '?'.
> > + * @see struct rte_regex_dev_info::rule_flags, struct
> > rte_regex_rule::rule_flags
> > + */
> > +
> > +#define RTE_REGEX_PCRE_RULE_UTF_F (1ULL << 11)
> > +/**< When this flag is set, the RegEx engine has to regard both the pattern
> > + * and the subject strings that are subsequently processed as strings of UTF
> > + * characters instead of single-code-unit strings.
> > + * @see struct rte_regex_dev_info::rule_flags,
> > + * struct rte_regex_rule::rule_flags
> > + */
> > +
> > +#define RTE_REGEX_PCRE_RULE_NEVER_BACKSLASH_C_F (1ULL << 12)
> > +/**< This flag locks out the use of '\C' in the pattern that is being compiled.
> > + * This escape matches one data unit, even in UTF mode, which can cause
> > + * unpredictable behavior in UTF-8 or UTF-16 modes, because it may leave the
> > + * current matching point in the middle of a multi-code-unit character.
> > + * @see struct rte_regex_dev_info::rule_flags,
> > + * struct rte_regex_rule::rule_flags
> > + */
> > +
> > +
> > +/**
> > + * RegEx device information
> > + */
> > +struct rte_regex_dev_info {
> > +	const char *driver_name; /**< RegEx driver name */
> > +	struct rte_device *dev;	/**< Device information */
> > +	uint8_t max_matches;
> > +	/**< Maximum matches per scan supported by this device */
> > +	uint16_t max_queue_pairs;
> > +	/**< Maximum queue pairs supported by this device */
> > +	uint16_t max_payload_size;
> > +	/**< Maximum payload size for a pattern match request or scan.
> > +	 * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
> > +	 */
> > +	uint16_t max_rules_per_group;
> > +	/**< Maximum rules supported per group by this device */
> > +	uint16_t max_groups;
> > +	/**< Maximum number of groups supported by this device */
> > +	uint32_t regex_dev_capa;
> > +	/**< RegEx device capabilities. @see RTE_REGEX_DEV_CAPA_* */
> > +	uint64_t rule_flags;
> > +	/**< Supported compiler rule flags.
> > +	 * @see RTE_REGEX_PCRE_RULE_*, struct rte_regex_rule::rule_flags
> > +	 */
> > +	uint64_t pcre_unsup_flags;
> > +	/**< Unsupported PCRE features for this RegEx device.
> > +	 * @see RTE_REGEX_DEV_PCRE_UNSUP_*
> > +	 */
> > +};
> > +
> > +/**
> > + * Retrieve the contextual information of a RegEx device.
> > + *
> > + * @param dev_id
> > + *   The identifier of the device.
> > + *
> > + * @param[out] dev_info
> > + *   A pointer to a structure of type *rte_regex_dev_info* to be filled with
> > + *   the contextual information of the device.
> > + *
> > + * @return
> > + *   - 0: Success, driver updates the contextual information of the RegEx
> > + *     device
> > + *   - <0: Error code returned by the driver info get function.
> > + *
> > + */
> > +int
> > +rte_regex_dev_info_get(uint8_t dev_id, struct rte_regex_dev_info
> > *dev_info);
> > +
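
A minimal sketch of how an application might consume the two masks returned
by rte_regex_dev_info_get(). It assumes (the RFC does not spell this out) that
a rule is acceptable only when every requested rule flag is present in
rule_flags and none of the PCRE features it relies on appear in
pcre_unsup_flags. The DEMO_* macros, struct demo_dev_info and
demo_rule_is_supported() are hypothetical stand-ins so that the snippet
compiles without the proposed header:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins that mirror bit positions quoted in the RFC. */
#define DEMO_PCRE_UNSUP_UTF_8_F         (1ULL << 12)
#define DEMO_PCRE_UNSUP_WORD_BOUNDARY_F (1ULL << 15)
#define DEMO_RULE_CASELESS_F            (1ULL << 2)
#define DEMO_RULE_DOTALL_F              (1ULL << 3)

/* Only the two mask fields of the proposed struct rte_regex_dev_info. */
struct demo_dev_info {
	uint64_t rule_flags;       /* rule flags the device supports */
	uint64_t pcre_unsup_flags; /* PCRE features the device lacks */
};

/*
 * Return true when all wanted rule flags are supported and none of the
 * PCRE features used by the rule is marked unsupported.
 */
static bool
demo_rule_is_supported(const struct demo_dev_info *info,
		       uint64_t wanted_rule_flags, uint64_t used_features)
{
	assert(info != NULL);

	if (wanted_rule_flags & ~info->rule_flags)
		return false; /* at least one flag is not in the supported set */
	if (used_features & info->pcre_unsup_flags)
		return false; /* rule depends on an unsupported PCRE feature */
	return true;
}
```

Note that the two masks have opposite polarity (one lists what is supported,
the other what is not), so the two tests above are deliberately not symmetric.
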
> > +/* Enumerates RegEx device configuration flags */
> > +#define RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F (1ULL << 0)
> > +/**< Cross buffer scan refers to the ability to detect
> > + * matches that occur across buffer boundaries, where the buffers are
> > + * related to each other in some way. Enable this flag when the payload to
> > + * scan is larger than struct rte_regex_dev_info::max_payload_size and/or
> > + * matches can be present across scan buffer boundaries.
> > + *
> > + * @see struct rte_regex_dev_info::max_payload_size
> > + * @see struct rte_regex_dev_config::dev_cfg_flags,
> > rte_regex_dev_configure()
> > + * @see RTE_REGEX_OPS_RSP_PMI_SOJ_F
> > + * @see RTE_REGEX_OPS_RSP_PMI_EOJ_F
> > + */
> > +
> > +/** RegEx device configuration structure */
> > +struct rte_regex_dev_config {
> > +	uint8_t nb_max_matches;
> > +	/**< Maximum matches per scan configured on this device.
> > +	 * This value cannot exceed the *max_matches*
> > +	 * previously provided by rte_regex_dev_info_get().
> > +	 * The value 0 is allowed, in which case the value 1 is used.
> > +	 * @see struct rte_regex_dev_info::max_matches
> > +	 */
> > +	uint16_t nb_queue_pairs;
> > +	/**< Number of RegEx queue pairs to configure on this device.
> > +	 * This value cannot exceed the *max_queue_pairs* previously
> > +	 * provided by rte_regex_dev_info_get().
> > +	 * @see struct rte_regex_dev_info::max_queue_pairs
> > +	 */
> > +	uint16_t nb_rules_per_group;
> > +	/**< Number of rules per group to configure on this device.
> > +	 * This value cannot exceed the *max_rules_per_group*
> > +	 * previously provided by rte_regex_dev_info_get().
> > +	 * The value 0 is allowed, in which case
> > +	 * struct rte_regex_dev_info::max_rules_per_group is used.
> > +	 * @see struct rte_regex_dev_info::max_rules_per_group
> > +	 */
> > +	uint16_t nb_groups;
> > +	/**< Number of groups to configure on this device.
> > +	 * This value cannot exceed the *max_groups*
> > +	 * previously provided by rte_regex_dev_info_get().
> > +	 * @see struct rte_regex_dev_info::max_groups
> > +	 */
> > +	const char *rule_db;
> > +	/**< Import an initial prebuilt rule database on this device.
> > +	 * The value NULL is allowed, in which case the device will not
> > +	 * be configured with a prebuilt rule database. The application may use
> > +	 * the rte_regex_rule_db_update() or rte_regex_rule_db_import() API
> > +	 * to update or import a rule database after
> > +	 * rte_regex_dev_configure().
> > +	 * @see rte_regex_rule_db_update(), rte_regex_rule_db_import()
> > +	 */
> > +	uint32_t rule_db_len;
> > +	/**< Length of *rule_db* buffer. */
> > +	uint32_t dev_cfg_flags;
> > +	/**< RegEx device configuration flags, See RTE_REGEX_DEV_CFG_* */
> > +};
> > +
> > +/**
> > + * Configure a RegEx device.
> > + *
> > + * This function must be invoked first before any other function in the
> > + * API. This function can also be re-invoked when a device is in the
> > + * stopped state.
> > + *
> > + * The caller may use rte_regex_dev_info_get() to get the capabilities of
> > + * each resource available for this regex device.
> > + *
> > + * @param dev_id
> > + *   The identifier of the device to configure.
> > + * @param cfg
> > + *   The RegEx device configuration structure.
> > + *
> > + * @return
> > + *   - 0: Success, device configured.
> > + *   - <0: Error code returned by the driver configuration function.
> > + */
> > +int
> > +rte_regex_dev_configure(uint8_t dev_id, const struct
> > rte_regex_dev_config *cfg);
> > +
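
The limits and the "0 means default" rules in the struct rte_regex_dev_config
comments can be sketched as below. This is only an illustration under stated
assumptions: the demo_* types carry just the fields involved, the helper
demo_config_normalize() is hypothetical, and returning -EINVAL for an
out-of-range value is my assumption since the RFC only promises "<0" on
failure.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subsets of the proposed info and config structures. */
struct demo_dev_info {
	uint8_t max_matches;
	uint16_t max_queue_pairs;
	uint16_t max_rules_per_group;
	uint16_t max_groups;
};

struct demo_dev_config {
	uint8_t nb_max_matches;
	uint16_t nb_queue_pairs;
	uint16_t nb_rules_per_group;
	uint16_t nb_groups;
};

/*
 * Validate cfg against the device limits and fill in the documented
 * defaults: nb_max_matches 0 becomes 1, nb_rules_per_group 0 becomes
 * max_rules_per_group. Returns 0 or -EINVAL (assumed error code).
 */
static int
demo_config_normalize(const struct demo_dev_info *info,
		      struct demo_dev_config *cfg)
{
	if (info == NULL || cfg == NULL)
		return -EINVAL;
	if (cfg->nb_max_matches > info->max_matches ||
	    cfg->nb_queue_pairs > info->max_queue_pairs ||
	    cfg->nb_rules_per_group > info->max_rules_per_group ||
	    cfg->nb_groups > info->max_groups)
		return -EINVAL;

	if (cfg->nb_max_matches == 0)
		cfg->nb_max_matches = 1;
	if (cfg->nb_rules_per_group == 0)
		cfg->nb_rules_per_group = info->max_rules_per_group;

	assert(cfg->nb_max_matches >= 1);
	return 0;
}
```
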
> > +/* Enumerates RegEx queue pair configuration flags */
> > +#define RTE_REGEX_QUEUE_PAIR_CFG_OOS_F (1ULL << 0)
> > +/**< Out of order scan. If not set, a scan must retire after previously issued
> > + * in-order scans to this queue pair. If set, this scan can be retired as soon
> > + * as the device returns completion. The application should not set the out of
> > + * order scan flag if it needs to maintain the ingress order of scan requests.
> > + *
> > + * @see struct rte_regex_qp_conf::qp_conf_flags,
> > rte_regex_queue_pair_setup()
> > + */
> > +
> > +struct rte_regex_ops;
> > +typedef void (*regexdev_stop_flush_t)(uint8_t dev_id, uint16_t qp_id,
> > +				      struct rte_regex_ops *op);
> > +/**< Callback function called during rte_regex_dev_stop(), invoked once
> > + * per flushed RegEx op.
> > + */
> > +
> > +/** RegEx queue pair configuration structure */
> > +struct rte_regex_qp_conf {
> > +	uint32_t qp_conf_flags;
> > +	/**< Queue pair config flags, See RTE_REGEX_QUEUE_PAIR_CFG_*
> > */
> > +	uint16_t nb_desc;
> > +	/**< The number of descriptors to allocate for this queue pair. */
> > +	regexdev_stop_flush_t cb;
> > +	/**< Callback function called during rte_regex_dev_stop(), invoked
> > +	 * once per flushed regex op. Value NULL is allowed, in which case
> > +	 * callback will not be invoked. This function can be used to properly
> > +	 * dispose of outstanding regex ops from response queue,
> > +	 * for example ops containing memory pointers.
> > +	 * @see rte_regex_dev_stop()
> > +	 */
> > +};
> > +
> > +/**
> > + * Allocate and set up a RegEx queue pair for a RegEx device.
> > + *
> > + * @param dev_id
> > + *   The identifier of the device.
> > + * @param queue_pair_id
> > + *   The index of the RegEx queue pair to set up. The value must be in the
> > + *   range [0, nb_queue_pairs - 1] previously supplied to
> > + *   rte_regex_dev_configure().
> > + * @param qp_conf
> > + *   The pointer to the configuration data to be used for the RegEx queue
> > + *   pair. NULL value is allowed, in which case the default configuration
> > + *   is used.
> > + *
> > + * @return
> > + *   - 0: Success, RegEx queue pair correctly set up.
> > + *   - <0: RegEx queue configuration failed
> > + */
> > +int
> > +rte_regex_queue_pair_setup(uint8_t dev_id, uint8_t queue_pair_id,
> > +			   const struct rte_regex_qp_conf *qp_conf);
> > +
> > +/**
> > + * Start a RegEx device.
> > + *
> > + * The device start step is the last one and consists of setting the RegEx
> > + * queues to start accepting the pattern matching scan requests.
> > + *
> > + * On success, all basic functions exported by the API (RegEx enqueue,
> > + * RegEx dequeue and so on) can be invoked.
> > + *
> > + * @param dev_id
> > + *   RegEx device identifier
> > + * @return
> > + *   - 0: Success, device started.
> > + *   - <0: Device start failed.
> > + */
> > +int
> > +rte_regex_dev_start(uint8_t dev_id);
> > +
> > +/**
> > + * Stop a RegEx device.
> > + *
> > + * The device can be restarted with a call to
> > + * rte_regex_dev_start().
> > + *
> > + * This function causes all queued response regex ops to be drained from the
> > + * response queue. While draining ops out of the device,
> > + * struct rte_regex_qp_conf::cb will be invoked for each op.
> > + *
> > + * @param dev_id
> > + *   RegEx device identifier.
> > + *
> > + * @see struct rte_regex_qp_conf::cb, rte_regex_queue_pair_setup()
> > + */
> > +void
> > +rte_regex_dev_stop(uint8_t dev_id);
> > +
> > +/**
> > + * Close a RegEx device. The device cannot be restarted!
> > + *
> > + * @param dev_id
> > + *   RegEx device identifier
> > + *
> > + * @return
> > + *  - 0 on successfully closed the device.
> > + *  - <0 on failure to close the device.
> > + */
> > +int
> > +rte_regex_dev_close(uint8_t dev_id);
> > +
> > +/* Device get/set attributes */
> > +
> > +/** Enumerates RegEx device attribute identifier */
> > +enum rte_regex_dev_attr_id {
> > +	RTE_REGEX_DEV_ATTR_SOCKET_ID,
> > +	/**< The NUMA socket id to which the device is connected or
> > +	 * a default of zero if the socket could not be determined.
> > +	 * datatype: *int*
> > +	 * operation: *get*
> > +	 */
> > +	RTE_REGEX_DEV_ATTR_MAX_MATCHES,
> > +	/**< Maximum number of matches per scan.
> > +	 * datatype: *uint8_t*
> > +	 * operation: *get* and *set*
> > +	 *
> > +	 * @see RTE_REGEX_OPS_RSP_MAX_MATCH_F
> > +	 */
> > +	RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT,
> > +	/**< Upper bound scan time in ns.
> > +	 * datatype: *uint16_t*
> > +	 * operation: *get* and *set*
> > +	 *
> > +	 * @see RTE_REGEX_OPS_RSP_MAX_SCAN_TIMEOUT_F
> > +	 */
> > +	RTE_REGEX_DEV_ATTR_MAX_PREFIX,
> > +	/**< Maximum number of prefixes detected per scan.
> > +	 * This would be useful for denial of service detection.
> > +	 * datatype: *uint16_t*
> > +	 * operation: *get* and *set*
> > +	 *
> > +	 * @see RTE_REGEX_OPS_RSP_MAX_PREFIX_F
> > +	 */
> > +};
> > +
> > +/**
> > + * Get an attribute from a RegEx device.
> > + *
> > + * @param dev_id RegEx device identifier
> > + * @param attr_id The attribute ID to retrieve
> > + * @param[out] attr_value A pointer that will be filled in with the attribute
> > + *             value if successful.
> > + *
> > + * @return
> > + *   - 0: Successfully retrieved attribute value.
> > + *   - -EINVAL: Invalid device or *attr_id* provided, or *attr_value* is NULL.
> > + *   - -ENOTSUP: if the device doesn't support specific *attr_id*.
> > + */
> > +int
> > +rte_regex_dev_attr_get(uint8_t dev_id, enum rte_regex_dev_attr_id
> > attr_id,
> > +		       void *attr_value);
> > +
> > +/**
> > + * Set an attribute to a RegEx device.
> > + *
> > + * @param dev_id RegEx device identifier
> > + * @param attr_id The attribute ID to set
> > + * @param attr_value A pointer to the attribute value supplied
> > + *                   by the application
> > + *
> > + * @return
> > + *   - 0: Successfully applied the attribute value.
> > + *   - -EINVAL: Invalid device or *attr_id* provided, or *attr_value* is NULL.
> > + *   - -ENOTSUP: if the device doesn't support specific *attr_id*.
> > + */
> > +int
> > +rte_regex_dev_attr_set(uint8_t dev_id, enum rte_regex_dev_attr_id
> > attr_id,
> > +		       const void *attr_value);
> > +
> > +/* Rule related APIs */
> > +/** Enumerates RegEx rule operation */
> > +enum rte_regex_rule_op {
> > +	RTE_REGEX_RULE_OP_ADD,
> > +	/**< Add RegEx rule to rule database */
> > +	RTE_REGEX_RULE_OP_REMOVE
> > +	/**< Remove RegEx rule from rule database */
> > +};
> > +
> > +/** Structure to hold a RegEx rule attributes */
> > +struct rte_regex_rule {
> > +	enum rte_regex_rule_op op;
> > +	/**< OP type of the rule, either RTE_REGEX_RULE_OP_ADD or
> > +	 * RTE_REGEX_RULE_OP_REMOVE
> > +	 */
> > +	uint16_t group_id;
> > +	/**< Group identifier to which the rule belongs to. */
> > +	uint32_t rule_id;
> > +	/**< Rule identifier which is returned on successful match. */
> > +	const char *pcre_rule;
> > +	/**< Buffer to hold the PCRE rule. */
> > +	uint16_t pcre_rule_len;
> > +	/**< Length of the PCRE rule */
> > +	uint64_t rule_flags;
> > +	/**< PCRE rule flags. The rule flags supported by a device are
> > +	 * enumerated in struct rte_regex_dev_info::rule_flags. For a
> > +	 * successful rule database update, the application must provide
> > +	 * only supported rule flags.
> > +	 * @see RTE_REGEX_PCRE_RULE_*, struct rte_regex_dev_info::rule_flags
> > +	 */
> > +};
> > +
> > +/**
> > + * Update the rule database of a RegEx device.
> > + *
> > + * @param dev_id RegEx device identifier
> > + * @param rules
> > + *   Points to an array of *nb_rules* objects of type *rte_regex_rule*
> > + *   structure which contain the regex rule attributes to be updated in the
> > + *   rule database.
> > + * @param nb_rules
> > + *   The number of PCRE rules to update the rule database.
> > + *
> > + * @return
> > + *   The number of regex rules actually updated on the regex device's rule
> > + *   database. The return value can be less than the value of the *nb_rules*
> > + *   parameter when the regex device fails to update the rule database or
> > + *   if invalid parameters are specified in a *rte_regex_rule*.
> > + *   If the return value is less than *nb_rules*, the remaining PCRE rules
> > + *   at the end of *rules* are not consumed and the caller has to take
> > + *   care of them and rte_errno is set accordingly.
> > + *   Possible errno values include:
> > + *   - -EINVAL:  Invalid device ID or rules is NULL
> > + *   - -ENOTSUP: The last processed rule is not supported on this device.
> > + *   - -ENOSPC: No space available in rule database.
> > + *
> > + * @see rte_regex_rule_db_import(), rte_regex_rule_db_export()
> > + */
> > +uint16_t
> > +rte_regex_rule_db_update(uint8_t dev_id, const struct rte_regex_rule
> > *rules,
> > +			 uint16_t nb_rules);
> 
> I think the function name is not informative enough. If this function is
> meant to compile the rules, that should be explicit in the function name.
> 
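
Independent of the naming question, the return contract documented above
(rules consumed in order, a short count on the first failure, error code left
in rte_errno) could be handled by a caller roughly as follows. Everything
named demo_* is a hypothetical mock so the snippet runs standalone:
demo_rule_db_update() only imitates the documented contract with a two-entry
database, and demo_errno stands in for rte_errno. Storing a positive errno
value and negating it in the caller is my assumption, since the RFC lists the
values with a minus sign.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

enum demo_rule_op { DEMO_RULE_OP_ADD, DEMO_RULE_OP_REMOVE };

/* Hypothetical mirror of the proposed struct rte_regex_rule. */
struct demo_rule {
	enum demo_rule_op op;
	uint16_t group_id;
	uint32_t rule_id;
	const char *pcre_rule;
	uint16_t pcre_rule_len;
	uint64_t rule_flags;
};

#define DEMO_DB_CAPACITY 2
static int demo_errno;        /* stand-in for rte_errno */
static uint16_t demo_db_used; /* rules currently held by the mock device */

/* Mock device: consume rules in order and stop at the first failure. */
static uint16_t
demo_rule_db_update(const struct demo_rule *rules, uint16_t nb_rules)
{
	uint16_t i;

	demo_errno = 0;
	if (rules == NULL) {
		demo_errno = EINVAL;
		return 0;
	}
	for (i = 0; i < nb_rules; i++) {
		if (rules[i].op == DEMO_RULE_OP_ADD) {
			if (demo_db_used == DEMO_DB_CAPACITY) {
				demo_errno = ENOSPC;
				break;
			}
			demo_db_used++;
		} else if (demo_db_used > 0) {
			demo_db_used--;
		}
	}
	return i;
}

/*
 * Caller side: returns 0 when every rule was accepted, otherwise a
 * negative errno and the index of the first rule that was not consumed.
 */
static int
demo_update_all(const struct demo_rule *rules, uint16_t nb_rules,
		uint16_t *first_failed)
{
	uint16_t done = demo_rule_db_update(rules, nb_rules);

	assert(done <= nb_rules);
	if (done == nb_rules)
		return 0;
	*first_failed = done;
	return -demo_errno;
}
```
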
> > +
> > +/**
> > + * Import a prebuilt rule database from a buffer to a RegEx device.
> > + *
> > + * @param dev_id RegEx device identifier
> > + * @param rule_db
> > + *   Points to prebuilt rule database.
> > + * @param rule_db_len
> > + *   Length of the rule database.
> > + *
> > + * @return
> > + *   - 0: Successfully updated the prebuilt rule database.
> > + *   - -EINVAL:  Invalid device ID or rule_db is NULL
> > + *   - -ENOTSUP: Rule database import is not supported on this device.
> > + *   - -ENOSPC: No space available in rule database.
> > + *
> > + * @see rte_regex_rule_db_update(), rte_regex_rule_db_export()
> > + */
> > +int
> > +rte_regex_rule_db_import(uint8_t dev_id, const char *rule_db,
> > +			 uint32_t rule_db_len);
> > +
> > +/**
> > + * Export the prebuilt rule database from a RegEx device to the buffer.
> > + *
> > + * @param dev_id RegEx device identifier
> > + * @param[out] rule_db
> > + *   Block of memory to insert the rule database. Must be at least size in
> > + *   capacity. If set to NULL, function returns required capacity.
> > + *
> > + * @return
> > + *   - 0: Successfully exported the prebuilt rule database.
> > + *   - size: If rule_db set to NULL then required capacity for *rule_db*
> > + *   - -EINVAL:  Invalid device ID
> > + *   - -ENOTSUP: Rule database export is not supported on this device.
> > + *
> > + * @see rte_regex_rule_db_update(), rte_regex_rule_db_import()
> > + */
> > +int
> > +rte_regex_rule_db_export(uint8_t dev_id, char *rule_db);
> > +
> > +/* Extended statistics */
> > +/** Maximum name length for extended statistics counters */
> > +#define RTE_REGEX_DEV_XSTATS_NAME_SIZE 64
> > +
> > +/**
> > + * A name-key lookup element for extended statistics.
> > + *
> > + * This structure is used to map between names and ID numbers
> > + * for extended RegEx device statistics.
> > + */
> > +struct rte_regex_dev_xstats_map {
> > +	uint16_t id;
> > +	/**< xstat identifier */
> > +	char name[RTE_REGEX_DEV_XSTATS_NAME_SIZE];
> > +	/**< xstat name */
> > +};
> > +
> > +/**
> > + * Retrieve names of extended statistics of a regex device.
> > + *
> > + * @param dev_id
> > + *   The identifier of the regex device.
> > + * @param[out] xstats_map
> > + *   Block of memory to insert id and names into. Must be at least size in
> > + *   capacity. If set to NULL, function returns required capacity.
> > + * @return
> > + *   - positive value on success:
> > + *        - The return value is the number of entries filled in the stats map.
> > + *        - If xstats_map is set to NULL, the required capacity for xstats_map.
> > + *   - negative value on error:
> > + *      -ENODEV for invalid *dev_id*
> > + *      -ENOTSUP if the device doesn't support this function.
> > + */
> > +int
> > +rte_regex_dev_xstats_names_get(uint8_t dev_id,
> > +			       struct rte_regex_dev_xstats_map *xstats_map);
> > +
> > +/**
> > + * Retrieve extended statistics of a regex device.
> > + *
> > + * @param dev_id
> > + *   The identifier of the device.
> > + * @param ids
> > + *   The id numbers of the stats to get. The ids can be got from the stat
> > + *   position in the stat list from rte_regex_dev_xstats_names_get(), or
> > + *   by using rte_regex_dev_xstats_by_name_get().
> > + * @param[out] values
> > + *   The values for each stats request by ID.
> > + * @param n
> > + *   The number of stats requested
> > + * @return
> > + *   - positive value: number of stat entries filled into the values array
> > + *   - negative value on error:
> > + *      -ENODEV for invalid *dev_id*
> > + *      -ENOTSUP if the device doesn't support this function.
> > + */
> > +int
> > +rte_regex_dev_xstats_get(uint8_t dev_id, const uint16_t ids[],
> > +			 uint64_t values[], uint16_t n);
> > +
> > +/**
> > + * Retrieve the value of a single stat by requesting it by name.
> > + *
> > + * @param dev_id
> > + *   The identifier of the device
> > + * @param name
> > + *   The stat name to retrieve
> > + * @param[out] id
> > + *   If non-NULL, the numerical id of the stat will be returned, so that further
> > + *   requests for the stat can be made using rte_regex_dev_xstats_get(), which
> > + *   will be faster as it doesn't need to scan a list of names for the stat.
> > + * @param[out] value
> > + *   Must be non-NULL, retrieved xstat value will be stored in this address.
> > + *
> > + * @return
> > + *   - 0: Successfully retrieved xstat value.
> > + *   - -EINVAL: invalid parameters
> > + *   - -ENOTSUP: if not supported.
> > + */
> > +int
> > +rte_regex_dev_xstats_by_name_get(uint8_t dev_id, const char *name,
> > +				 uint16_t *id, uint64_t *value);
> > +
> > +/**
> > + * Reset the values of the xstats of the selected component in the device.
> > + *
> > + * @param dev_id
> > + *   The identifier of the device
> > + * @param ids
> > + *   Selects specific statistics to be reset. When NULL, all statistics will be
> > + *   reset. If non-NULL, must point to array of at least *nb_ids* size.
> > + * @param nb_ids
> > + *   The number of ids available from the *ids* array. Ignored when ids is
> > + *   NULL.
> > + * @return
> > + *   - 0: Successfully reset the statistics to zero.
> > + *   - -EINVAL: invalid parameters
> > + *   - -ENOTSUP: if not supported.
> > + */
> > +int
> > +rte_regex_dev_xstats_reset(uint8_t dev_id, const uint16_t ids[],
> > +			   uint16_t nb_ids);
> > +
> > +/**
> > + * Trigger the RegEx device self test.
> > + *
> > + * @param dev_id
> > + *   The identifier of the device
> > + * @return
> > + *   - 0: Selftest successful
> > + *   - -ENOTSUP if the device doesn't support selftest
> > + *   - other values < 0 on failure.
> > + */
> > +int rte_regex_dev_selftest(uint8_t dev_id);
> > +
> > +/**
> > + * Dump internal information about *dev_id* to the FILE* provided in *f*.
> > + *
> > + * @param dev_id
> > + *   The identifier of the device.
> > + *
> > + * @param f
> > + *   A pointer to a file for output
> > + *
> > + * @return
> > + *   - 0: on success
> > + *   - <0: on failure.
> > + */
> > +int
> > +rte_regex_dev_dump(uint8_t dev_id, FILE *f);
> > +
> > +/* Fast path APIs */
> > +
> > +/**
> > + * The generic *rte_regex_match* structure to hold the RegEx match
> > attributes.
> > + * @see struct rte_regex_ops::matches
> > + */
> > +struct rte_regex_match {
> > +	RTE_STD_C11
> > +	union {
> > +		uint64_t u64;
> > +		struct {
> > +			uint32_t rule_id:20;
> > +			/**< Rule identifier to which the pattern matched.
> > +			 * @see struct rte_regex_rule::rule_id
> > +			 */
> > +			uint32_t group_id:12;
> > +			/**< Group identifier of the rule which the pattern
> > +			 * matched. @see struct rte_regex_rule::group_id
> > +			 */
> > +			uint16_t offset;
> > +			/**< Starting Byte Position for matched rule. */
> > +			uint16_t len;
> > +			/**< Length of match in bytes */
> > +		};
> > +	};
> > +};
> > +
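
One consequence of this layout, sketched below: the match tuple carries a
20-bit rule_id and a 12-bit group_id, while struct rte_regex_rule declares
them as uint32_t and uint16_t, so larger identifiers would not survive the
round trip. The demo_* names are hypothetical stand-ins. The snippet also
assumes a compiler that accepts uint32_t bit-fields and C11 anonymous members
(GCC and Clang do); bit-field placement inside u64 is implementation-defined,
so nothing here relies on it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the proposed struct rte_regex_match. */
struct demo_regex_match {
	union {
		uint64_t u64;
		struct {
			uint32_t rule_id:20;
			uint32_t group_id:12;
			uint16_t offset;
			uint16_t len;
		};
	};
};

/* The tuple is meant to fit in one 64-bit word. */
_Static_assert(sizeof(struct demo_regex_match) == sizeof(uint64_t),
	       "match tuple should pack into 8 bytes");

#define DEMO_MATCH_RULE_ID_MAX  ((1u << 20) - 1)
#define DEMO_MATCH_GROUP_ID_MAX ((1u << 12) - 1)

/* True when the rule and group ids can be reported without truncation. */
static bool
demo_ids_fit(uint32_t rule_id, uint16_t group_id)
{
	return rule_id <= DEMO_MATCH_RULE_ID_MAX &&
	       group_id <= DEMO_MATCH_GROUP_ID_MAX;
}

/* One-past-the-end byte position; widened because offset + len can
 * exceed UINT16_MAX.
 */
static uint32_t
demo_match_end(const struct demo_regex_match *m)
{
	assert(m != NULL);
	return (uint32_t)m->offset + m->len;
}
```
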
> > +/* Enumerates RegEx request flags. */
> > +#define RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F (1 << 0)
> > +/**< Set when struct rte_regex_ops::group_id1 valid */
> > +
> > +#define RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F (1 << 1)
> > +/**< Set when struct rte_regex_ops::group_id2 valid */
> > +
> > +#define RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F (1 << 2)
> > +/**< Set when struct rte_regex_ops::group_id3 valid */
> > +
> > +#define RTE_REGEX_OPS_REQ_STOP_ON_MATCH_F (1 << 4)
> > +/**< The RegEx engine will stop scanning and return the first match. */
> > +
> > +#define RTE_REGEX_OPS_REQ_MATCH_HIGH_PRIORITY_F (1 << 5)
> > +/**< In High Priority mode a maximum of one match will be returned per
> > + * scan to reduce the post-processing required by the application. The match
> > + * with the lowest rule id, lowest start pointer and lowest match length will
> > + * be returned.
> > + *
> > + * @see struct rte_regex_ops::nb_actual_matches
> > + * @see struct rte_regex_ops::nb_matches
> > + */
> > +
> > +
> > +/* Enumerates RegEx response flags. */
> > +#define RTE_REGEX_OPS_RSP_PMI_SOJ_F (1 << 0)
> > +/**< Indicates that the RegEx device has encountered a partial match at the
> > + * start of scan in the given buffer.
> > + *
> > + * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
> > + */
> > +
> > +#define RTE_REGEX_OPS_RSP_PMI_EOJ_F (1 << 1)
> > +/**< Indicates that the RegEx device has encountered a partial match at the
> > + * end of scan in the given buffer.
> > + *
> > + * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
> > + */
> > +
> > +#define RTE_REGEX_OPS_RSP_MAX_SCAN_TIMEOUT_F (1 << 2)
> > +/**< Indicates that the RegEx device has exceeded the max timeout while
> > + * scanning the given buffer.
> > + *
> > + * @see RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT
> > + */
> > +
> > +#define RTE_REGEX_OPS_RSP_MAX_MATCH_F (1 << 3)
> > +/**< Indicates that the RegEx device has exceeded the max matches while
> > + * scanning the given buffer.
> > + *
> > + * @see RTE_REGEX_DEV_ATTR_MAX_MATCHES
> > + */
> > +
> > +#define RTE_REGEX_OPS_RSP_MAX_PREFIX_F (1 << 4)
> > +/**< Indicates that the RegEx device has reached the max allowed prefix
> > + * length while scanning the given buffer.
> > + *
> > + * @see RTE_REGEX_DEV_ATTR_MAX_PREFIX
> > + */
> > +
> > +/**
> > + * The generic *rte_regex_ops* structure to hold the RegEx attributes
> > + * for enqueue and dequeue operation.
> > + */
> > +struct rte_regex_ops {
> > +	/* W0 */
> > +	uint16_t req_flags;
> > +	/**< Request flags for the RegEx ops.
> > +	 * @see RTE_REGEX_OPS_REQ_*
> > +	 */
> > +	uint16_t scan_size;
> > +	/**< Scan size of the buffer to be scanned in bytes. */
> > +	uint16_t rsp_flags;
> > +	/**< Response flags for the RegEx ops.
> > +	 * @see RTE_REGEX_OPS_RSP_*
> > +	 */
> > +	uint8_t nb_actual_matches;
> > +	/**< The total number of actual matches detected by the RegEx device. */
> > +	uint8_t nb_matches;
> > +	/**< The total number of matches returned by the RegEx device for this
> > +	 * scan. The size of the *rte_regex_ops::matches* zero-length array will
> > +	 * be this value.
> > +	 *
> > +	 * @see struct rte_regex_ops::matches, struct rte_regex_match
> > +	 */
> > +
> > +	/* W1 */
> > +	RTE_STD_C11
> > +	union {
> > +		uint64_t u64;
> > +		/**< Allows 8 bytes to be reserved on 32-bit systems */
> > +		void *buf_addr;
> > +		/**< Virtual address of the pattern to be matched. */
> > +	};
> > +
> > +	/* W2 */
> > +	rte_iova_t buf_iova;
> > +	/**< IOVA address of the pattern to be matched. */
> > +
> > +	/* W3 */
> > +	uint16_t group_id0;
> > +	/**< First group_id to match the rule against. At least one group id
> > +	 * must be provided by the application.
> > +	 * When RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F is set, group_id1
> > +	 * is valid; similar flags apply to group_id2 and group_id3.
> > +	 * Upon a match, struct rte_regex_match::group_id shall be updated
> > +	 * with the matching group ID by the device. The group ID scheme
> > +	 * provides rule isolation and effective pattern matching.
> > +	 */
> > +	uint16_t group_id1;
> > +	/**< Second group_id to match the rule against.
> > +	 *
> > +	 * @see RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F
> > +	 */
> > +	uint16_t group_id2;
> > +	/**< Third group_id to match the rule against.
> > +	 *
> > +	 * @see RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F
> > +	 */
> > +	uint16_t group_id3;
> > +	/**< Fourth group_id to match the rule against.
> > +	 *
> > +	 * @see RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F
> > +	 */
> > +
> > +	/* W4 */
> > +	RTE_STD_C11
> > +	union {
> > +		uint64_t user_id;
> > +		/**< Application specific opaque value. An application may use
> > +		 * this field to hold an application specific value to share
> > +		 * between enqueue and dequeue operations.
> > +		 * The implementation should not modify this field.
> > +		 */
> > +		void *user_ptr;
> > +		/**< Pointer representation of *user_id* */
> > +	};
> 
> Since we target the regex subsystem at both regex and DPI, I think it would be
> good to add another uint64_t field called connection_id.
> Devices that support DPI can treat it as another matchable field when looking
> up matches on the given buffer.
> 
> This field is different from user_id, as it is not opaque to the device.
> 
> > +
> > +	/* W5 */
> > +	struct rte_regex_match matches[];
> > +	/**< Zero length array to hold the match tuples.
> > +	 * The struct rte_regex_ops::nb_matches value holds the number of
> > +	 * elements in this array.
> > +	 *
> > +	 * @see struct rte_regex_ops::nb_matches
> > +	 */
> > +};
> > +
> > +/**
> > + * Enqueue a burst of scan request on a RegEx device.
> > + *
> > + * The rte_regex_enqueue_burst() function is invoked to place
> > + * regex operations on the queue *qp_id* of the device designated by
> > + * its *dev_id*.
> > + *
> > + * The *nb_ops* parameter is the number of operations to process which
> > are
> > + * supplied in the *ops* array of *rte_regex_op* structures.
> > + *
> > + * The rte_regex_enqueue_burst() function returns the number of
> > + * operations it actually enqueued for processing. A return value equal to
> > + * *nb_ops* means that all packets have been enqueued.
> > + *
> > + * @param dev_id
> > + *   The identifier of the device.
> > + * @param qp_id
> > + *   The index of the queue pair which packets are to be enqueued for
> > + *   processing. The value must be in the range [0, nb_queue_pairs - 1]
> > + *   previously supplied to rte_regex_dev_configure().
> > + * @param ops
> > + *   The address of an array of *nb_ops* pointers to *rte_regex_op*
> > structures
> > + *   which contain the regex operations to be processed.
> > + * @param nb_ops
> > + *   The number of operations to process.
> > + *
> > + * @return
> > + *   The number of operations actually enqueued on the regex device. The
> > return
> > + *   value can be less than the value of the *nb_ops* parameter when the
> > + *   regex devices queue is full or if invalid parameters are specified in
> > + *   a *rte_regex_op*. If the return value is less than *nb_ops*, the
> > remaining
> > + *   ops at the end of *ops* are not consumed and the caller has to take
> > care
> > + *   of them.
> > + */
> > +uint16_t
> > +rte_regex_enqueue_burst(uint8_t dev_id, uint16_t qp_id,
> > +			struct rte_regex_ops **ops, uint16_t nb_ops);
> > +
> > +/**
> > + *
> > + * Dequeue a burst of scan response from a queue on the RegEx device.
> > + * The dequeued operation are stored in *rte_regex_op* structures
> > + * whose pointers are supplied in the *ops* array.
> > + *
> > + * The rte_regex_dequeue_burst() function returns the number of ops
> > + * actually dequeued, which is the number of *rte_regex_op* data
> > structures
> > + * effectively supplied into the *ops* array.
> > + *
> > + * A return value equal to *nb_ops* indicates that the queue contained
> > + * at least *nb_ops* operations, and this is likely to signify that other
> > + * processed operations remain in the devices output queue. Applications
> > + * implementing a "retrieve as many processed operations as possible"
> > policy
> > + * can check this specific case and keep invoking the
> > + * rte_regex_dequeue_burst() function until a value less than
> > + * *nb_ops* is returned.
> > + *
> > + * The rte_regex_dequeue_burst() function does not provide any error
> > + * notification to avoid the corresponding overhead.
> > + *
> > + * @param dev_id
> > + *   The RegEx device identifier
> > + * @param qp_id
> > + *   The index of the queue pair from which to retrieve processed packets.
> > + *   The value must be in the range [0, nb_queue_pairs - 1] previously
> > + *   supplied to rte_regex_dev_configure().
> > + * @param ops
> > + *   The address of an array of pointers to *rte_regex_op* structures that
> > must
> > + *   be large enough to store *nb_ops* pointers in it.
> > + * @param nb_ops
> > + *   The maximum number of operations to dequeue.
> > + *
> > + * @return
> > + *   The number of operations actually dequeued, which is the number
> > + *   of pointers to *rte_regex_op* structures effectively supplied to the
> > + *   *ops* array. If the return value is less than *nb_ops*, the remaining
> > + *   ops at the end of *ops* are not consumed and the caller has to take
> > care
> > + *   of them.
> > + */
> > +uint16_t
> > +rte_regex_dequeue_burst(uint8_t dev_id, uint16_t qp_id,
> > +			struct rte_regex_ops **ops, uint16_t nb_ops);
> > +
> > +#ifdef __cplusplus
> > +}
> > +#endif
> > +
> > +#endif /* _RTE_REGEXDEV_H_ */
> >
Jerin Jacob Kollanukkaran Sept. 10, 2019, 11:02 a.m. UTC | #10
> Hi Jerin,

Hi Shahaf,

Sorry for the delay in responding (I was busy with the 19.11 proposal deadline). Please see inline.

> >
> > RegEx pattern matching applications:
> > • Next Generation Firewalls (NGFW)
> > • Deep Packet and Flow Inspection (DPI)
> > • Intrusion Prevention Systems (IPS)
> > • DDoS Mitigation
> > • Network Monitoring
> > • Data Loss Prevention (DLP)
> > • Smart NICs
> > • Grammar based content processing
> > • URL, spam and adware filtering
> > • Advanced auditing and policing of user/application security policies
> > • Financial data mining - parsing of streamed financial feeds
> 
> I think two more important use case to add (at least on the doc of this
> subsystem) are:
> * application recognition
> * memory introspection

Sure. Will add the following from John as well.

# Natural Language Processing (NLP)
# Sentiment Analysis
# Big Data database acceleration (Spark, Hadoop etc.)
# Computational Storage

> 
> 
> > +/**
> > + * Update the rule database of a RegEx device.
> > + *
> > + * @param dev_id RegEx device identifier
> > + * @param rules
> > + *   Points to an array of *nb_rules* objects of type *rte_regex_rule*
> > structure
> > + *   which contain the regex rules attributes to be updated in rule database.
> > + * @param nb_rules
> > + *   The number of PCRE rules to update the rule database.
> > + *
> > + * @return
> > + *   The number of regex rules actually updated on the regex device's rule
> > + *   database. The return value can be less than the value of the *nb_rules*
> > + *   parameter when the regex devices fails to update the rule database or
> > + *   if invalid parameters are specified in a *rte_regex_rule*.
> > + *   If the return value is less than *nb_rules*, the remaining PCRE rules
> > + *   at the end of *rules* are not consumed and the caller has to take
> > + *   care of them and rte_errno is set accordingly.
> > + *   Possible errno values include:
> > + *   - -EINVAL:  Invalid device ID or rules is NULL
> > + *   - -ENOTSUP: The last processed rule is not supported on this device.
> > + *   - -ENOSPC: No space available in rule database.
> > + *
> > + * @see rte_regex_rule_db_import(), rte_regex_rule_db_export()
> > + */
> > +uint16_t
> > +rte_regex_rule_db_update(uint8_t dev_id, const struct rte_regex_rule
> > *rules,
> > +			 uint16_t nb_rules);
> 
> I think the function name is not too informative. If this function meant to
> compile the rule then it should be explicit on the function name.
 
It is meant to compile the rules and then update the rule database.

I think we can have either 1 or 2. Let me know your preference, or
if you have another name suggestion, I will change it accordingly.

1. rte_regex_rule_db_compile()
2. rte_regex_rule_db_compile_update()


> > +
> > + */
> > +struct rte_regex_ops {
> > +
> > +	/* W4 */
> > +	RTE_STD_C11
> > +	union {
> > +		uint64_t user_id;
> > +		/**< Application specific opaque value. An application may
> > use
> > +		 * this field to hold application specific value to share
> > +		 * between dequeue and enqueue operation.
> > +		 * Implementation should not modify this field.
> > +		 */
> > +		void *user_ptr;
> > +		/**< Pointer representation of *user_id* */
> > +	};
> 
> Since we target the regex subsystem for both regex and DPI I think it will be
> good to add another uint64_t field called connection_id.
> Device that support DPI can refer to it as another match able field when looking
> up for matches on the given buffer.
> 
> This field is different from the user_id, as it is not opaque for the device.

Is this a driver-specific storage area that the application should not touch?

If not, could you share the data flow of this field? I.e. who writes this
field and who reads it?

This is just for documentation; in any event we can add new fields.

If it is only for driver usage, then I think some drivers may need more than
8B of storage. In that case, each driver can add its own fields
after W4 (i.e. the existing user_id), and we introduce a new field called
match_offset in struct rte_regex_ops,

i.e. struct rte_regex_match *matches = (void *)((uint8_t *)ops + ops->match_offset);
so that each driver can add enough driver-specific metadata.
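
To make the match_offset idea a bit more concrete, here is a minimal sketch.
All names in it (regex_ops_sketch, regex_match_sketch, regex_ops_alloc,
regex_ops_matches) are hypothetical stand-ins and not part of the RFC, and
the layouts are simplified. It only shows how a driver-private area could sit
between the fixed part of the op and the match tuples:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified match tuple (not the RFC layout). */
struct regex_match_sketch {
	uint16_t rule_id;
	uint16_t group_id;
	uint16_t offset;
	uint16_t len;
};

/* Hypothetical, simplified op: fixed part only. */
struct regex_ops_sketch {
	uint64_t user_id;      /* opaque to the driver */
	uint16_t match_offset; /* bytes from start of op to first match tuple */
	uint16_t nb_matches;
	/* driver-private metadata, then the match tuples, follow in memory */
};

/* Allocate an op with priv_size bytes of driver metadata before the matches. */
static struct regex_ops_sketch *
regex_ops_alloc(size_t priv_size, uint16_t max_matches)
{
	size_t align = _Alignof(struct regex_match_sketch);
	size_t off = sizeof(struct regex_ops_sketch) + priv_size;
	struct regex_ops_sketch *op;

	/* Round up so the match array stays naturally aligned. */
	off = (off + align - 1) / align * align;
	if (off > UINT16_MAX)
		return NULL;

	op = calloc(1, off + (size_t)max_matches * sizeof(struct regex_match_sketch));
	if (op == NULL)
		return NULL;
	op->match_offset = (uint16_t)off;
	return op;
}

/* Locate the match tuples without knowing the driver's metadata size. */
static inline struct regex_match_sketch *
regex_ops_matches(struct regex_ops_sketch *op)
{
	assert(op != NULL);
	assert(op->match_offset >= sizeof(*op));
	return (struct regex_match_sketch *)((uint8_t *)op + op->match_offset);
}
```

With this kind of layout the application would only ever go through the
accessor, so the size of the driver metadata could differ per driver.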
Wang Xiang Sept. 19, 2019, 1:58 p.m. UTC | #11
Hi Jerin,

Thanks for your response. More comments below and inline.

1) I think the size of some variables (e.g. nb_matches, scan_size,
matching offset, etc) should be increased based on what Hyperscan supports.

    a) struct rte_regex_ops:

        uint16_t scan_size => uint32_t scan_size
        uint8_t nb_actual_matches => uint64_t nb_actual_matches
        uint8_t nb_matches => uint64_t nb_matches

    b) struct rte_regex_match:
        uint16_t offset => uint32_t offset
        uint16_t len => uint32_t len

    c) uint16_t
        rte_regex_rule_db_update(uint8_t dev_id, const struct rte_regex_rule *rules,
                                 uint16_t nb_rules);
    =>
       uint32_t
        rte_regex_rule_db_update(uint8_t dev_id, const struct rte_regex_rule *rules,
                                 uint32_t nb_rules);

    d) int
    rte_regex_queue_pair_setup(uint8_t dev_id, uint8_t queue_pair_id,
                    const struct rte_regex_qp_conf *qp_conf);
    =>
       int
    rte_regex_queue_pair_setup(uint8_t dev_id, uint16_t queue_pair_id,
                    const struct rte_regex_qp_conf *qp_conf);

    e) struct rte_regex_dev_config:
        uint8_t nb_max_matches => uint64_t nb_max_matches

    f) struct rte_regex_dev_info:
        uint8_t max_matches => uint64_t max_matches

2) There are rte_regex_dev_attr_get() and rte_regex_dev_attr_set() defined.
Can all of the attributes below be set by users? Are any of them read-only?

/** Enumerates RegEx device attribute identifier */
enum rte_regex_dev_attr_id {
    RTE_REGEX_DEV_ATTR_SOCKET_ID,
    /**< The NUMA socket id to which the device is connected or
     * a default of zero if the socket could not be determined.
     * datatype: *int*
     * operation: *get*
     */
    RTE_REGEX_DEV_ATTR_MAX_MATCHES,
    /**< Maximum number of matches per scan.
     * datatype: *uint8_t*
     * operation: *get* and *set*
     *
     * @see RTE_REGEX_OPS_RSP_MAX_MATCH_F
     */
    RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT,
    /**< Upper bound scan time in ns.
     * datatype: *uint16_t*
     * operation: *get* and *set*
     *
     * @see RTE_REGEX_OPS_RSP_MAX_SCAN_TIMEOUT_F
     */
    RTE_REGEX_DEV_ATTR_MAX_PREFIX,
    /**< Maximum number of prefix detected per scan.
     * This would be useful for denial of service detection.
     * datatype: *uint16_t*
     * operation: *get* and *set*
     *
     * @see RTE_REGEX_OPS_RSP_MAX_PREFIX_F
     */
};

3) Both RTE_REGEX_PCRE_RULE_* and
RTE_REGEX_DEV_PCRE_UNSUP_* can be viewed as device capabilities. Can we
merge them with RTE_REGEX_DEV_CAPA_RUNTIME_COMPILATION_F and have a
unified regex_dev_capa in struct rte_regex_dev_info?


4) It would be good if we could also define a synchronous matching API for users who
want to run a one-off scan and wait for the results.

On Tue, Sep 10, 2019 at 08:05:39AM +0000, Jerin Jacob Kollanukkaran wrote:
> Hi Xiang,
> 
> Sorry for delay in response(Was busy with 19.11 proposal deadline). Please see inline.
>  
> > 
> > Reply to Xiang's queries in main thread:
> > 
> > Hi all,
> > 
> > Some questions regarding APIs. Could you please give more insights?
> > 
> > 1) rte_regex_ops
> >       a) rsp_flags
> >       These two flags RTE_REGEX_OPS_RSP_PMI_SOJ_F and
> > RTE_REGEX_OPS_RSP_PMI_EOJ_F are used for cross buffer scan.
> >       RTE_REGEX_OPS_RSP_PMI_EOJ_F tells whether we have a partial match
> > at the end of current buffer after scan.
> >       What's the purpose of having RTE_REGEX_OPS_RSP_PMI_SOJ_F?
> > 
> > [Jerin] Since we need three states to represent partial match buffer,
> > RTE_REGEX_OPS_RSP_PMI_SOJ_F to
> > represent start of the buffer, intermediate buffers with no flag, and end of
> > the buffer with RTE_REGEX_OPS_RSP_PMI_EOJ
> 
> > [Xiang] How could a user leverage these flags for matching? Suppose a large
> > buffer is divided into multiple chunks. Will RTE_REGEX_OPS_RSP_PMI_SOJ_F
> > cause an early quit once it isn't set after scan the first chunk. Similarly,
> > RTE_REGEX_OPS_RSP_PMI_EOJ tells a user whether to stop matching future
> > buffers after finish the last chunk?
> 
> Let me describe with an example,
> 
> Assume,
> 1) struct rte_regex_dev_info:: max_payload_size set to 1024
> 2) rte_regex_dev_config:: dev_cfg_flags configured with RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
> 3) Device programmed with matching "hello\s+world" pattern
> 4) user enqueue struct rte_regex_ops:: buf_addr point following "data" and struct rte_regex_op:: scan_size = 1024
> 
> data[0..1021] = data that doesn't contain the hello world pattern
> data[1022] = 'h'
> data[1023] = 'e'
> 
> 5) user enqueue struct rte_regex_ops:: buf_addr point following "data" and struct rte_regex_op:: scan_size = 9
> 
> data[0] = 'l'
> data[1] = 'l'
> data[2] = 'o'
> data[3] = ' '
> data[4] = 'w'
> data[5] = 'o'
> data[6] = 'r'
> data[7] = 'l'
> data[8] = 'd'
> 
> If so,
> 
> Response to 4) will be RTE_REGEX_OPS_RSP_PMI_SOJ_F in rte_regex_ops:: rsp_flags on dequeue
> Where rte_regex_match:: offset is 1022 and len 2
> 
> Response to 5) will be RTE_REGEX_OPS_RSP_PMI_EOJ_F in rte_regex_ops:: rsp_flags on dequeue
> Where rte_regex_match:: offset is 0 and len 9
>
If the defined pattern is "hello.*world" instead of "hello\s+world", and
we enqueue the following struct rte_regex_ops:

1) rte_regex_op:: scan_size = 1024

   data[0..1021] = data that doesn't contain the hello world pattern
   data[1022] = 'h'
   data[1023] = 'e'

2) rte_regex_op:: scan_size = 9
   data[0] = 'l'
   data[1] = 'l'
   data[2] = 'o'
   data[3] = ' '
   data[4] = 'w'
   data[5] = 'o'
   data[6] = 'r'
   data[7] = 'l'
   data[8] = 'd'

3) rte_regex_op:: scan_size = 5
   data[0] = 'w'
   data[1] = 'o'
   data[2] = 'r'
   data[3] = 'l'
   data[4] = 'd'

Will the response to 3) have RTE_REGEX_OPS_RSP_PMI_EOJ_F in
rte_regex_ops::rsp_flags on dequeue,
where rte_regex_match::offset is 0 and len 4?

I am wondering what your expected behavior is for .* or similar syntax, and
whether there are syntax compatibility issues. We report all matches in Hyperscan,
e.g. end match offsets 11 and 16 for pattern "hello.*world" and
corpus "hello worldworld".

BTW, I am not sure how other hardware devices handle cross buffer scan. Hyperscan
doesn't report matches for start and intermediate buffers; it only reports
the end offset once a full match is found.
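
To illustrate the "report every end offset" behavior, here is a small
self-contained sketch. It does not use Hyperscan; all_end_offsets() is a
hypothetical helper that only handles a pattern of the form
"<prefix>.*<suffix>" with two literals and treats '.' as "any byte". It is
only meant to show where the offsets 11 and 16 come from:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Collect every end offset at which "<prefix>.*<suffix>" matches in buf.
 * Simplified stand-in for an all-matches engine: once the earliest prefix
 * has been seen, each later occurrence of the suffix ends a match.
 * Returns the number of end offsets written to ends[].
 */
static size_t
all_end_offsets(const char *buf, size_t len, const char *prefix,
		const char *suffix, size_t *ends, size_t max_ends)
{
	size_t plen = strlen(prefix);
	size_t slen = strlen(suffix);
	size_t from = len; /* first byte after the earliest prefix */
	size_t n = 0;
	size_t i;

	assert(buf != NULL && ends != NULL);
	assert(plen > 0 && slen > 0);

	/* Find the earliest occurrence of the prefix literal. */
	for (i = 0; i + plen <= len; i++) {
		if (memcmp(buf + i, prefix, plen) == 0) {
			from = i + plen;
			break;
		}
	}

	/* Every suffix occurrence after that prefix ends a match. */
	for (i = from; i + slen <= len && n < max_ends; i++) {
		if (memcmp(buf + i, suffix, slen) == 0)
			ends[n++] = i + slen;
	}
	return n;
}
```

For the corpus "hello worldworld" this yields two end offsets, one per
occurrence of "world", whereas a leftmost-longest style engine would report a
single match.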

> 
> > 
> >       RTE_REGEX_OPS_RSP_MAX_PREFIX_F: This looks like a definition for a
> > specific hardware implementation. I am wondering what this PREFIX refers
> > to:)?
> > 
> > [Jerin] Yes. Looks like it is for hardware specific implementation. Introduced
> > rte_regex_dev_attr_set/get functions to make it portable and
> > To add new implementation specific fields.
> > For example, if a rule is
> > /ABCDEF.*XYZ/, ABCD is considered the prefix, and EF.*XYZ is considered the
> > factor. The prefix is a literal
> > string, while the factor can contain complex regular expression constructs. As
> > a result, rule matching occurs in
> > two stages: prefix matching and factor matching.
> > 
> >       b)  user_id or user_ptr
> >       Under what kind of circumstances should an application pass value into
> > these variables for enqueue and dequeuer operations?
> > 
> > [Jerin] Just like rte_crypto_ops, struct rte_regex_ops also allocated using
> > mempool normally, on enqueue, user can specify user_id
> > If needed to in order identify the op on dequeue if required. The use case
> > could be to store the sequence number from application
> > POV or storing the mbuf ptr in which pattern is requested etc.
> > 
> > 
> >  2) rte_regex_match
> >       a) offset; /**< Starting Byte Position for matched rule. */ and  uint16_t
> > len; /**< Length of match in bytes */
> >       Looks like the matching offset is defined as *starting matching offset*
> > instead of *end matching offset*, e.g. report the offset of "a" instead of "c"
> > for pattern "abc".
> >       If so, this makes it hard to integrate software regex libraries such as
> > Hyperscan and RE2 as they only report *end matching offset* without length
> > of match.
> >       Although Hyperscan has API for *starting matching offset*, it only delivers
> > partial syntax support. So I think we have to define *end of matching offset*
> > for software solutions.
> > 
> > [Jerin] I understand the hyperscan's HS_FLAG_SOM_LEFTMOST tradeoffs. I
> > thought application would need always the length of the match.
> > Probably we will see how other HW implementation (from Mellanox) etc. We
> > will try to abstract it, probably we can make it as function of "user
> > requested".
> > [Xiang] Yes, it will be good to make it per user request. At least from
> > Hyperscan user's point of view, start of match and match length are not
> > mandatory.
> 
> OK. I think, we can introduce RTE_REGEX_DEV_CFG_MATCH_AS_START
> In device configure.
> 
> Since offset+len == end, we can introduce following generic inline function.
> 
> static inline uint32_t
> rte_regex_match_end(const struct rte_regex_match *match)
> {
> 	/* End offset of the match: start offset plus match length. */
> 	return (uint32_t)match->offset + match->len;
> }
> 
> Example:  pattern to match is  "hello\s+world"  and data is following
> data[4] = 'h'
> data[5] = 'e'
> data[6] = 'l'
> data[7] = 'l'
> data[8] = 'o'
> data[9] = ' '
> data[10] = 'w'
> data[11] = 'o'
> data[12] = 'r'
> data[13] = 'l'
> data[14] = 'd'
> 
> if device is configured with RTE_REGEX_DEV_CFG_MATCH_AS_START
> match->offset returns 4
> match->len returns 11
> 
> if device is NOT configured with RTE_REGEX_DEV_CFG_MATCH_AS_START
> driver MAY return the following(in hyperscan case)
> match->offset returns 0
> match->len returns 11 + 4
> 
> In both case(irrespective of flags, to make application life easy) rte_regex_match_end() would return 15.
> If application demands for MATCH_AS_START then driver can return match->offset returns 4 and match->len returns 11
> Aka set HS_FLAG_SOM_LEFTMOST in hyperscan driver, But application should use rte_regex_match_end()
> for finding the end of the match. To make, work in all cases.
> 
> Is it OK? 
> 
Can we replace len with an end offset? We could rename "offset" to
"start_offset" and len to "end_offset" in struct rte_regex_match. Users
interested in len could compute "end_offset - start_offset".
We may also rename RTE_REGEX_DEV_CFG_MATCH_AS_START to RTE_REGEX_DEV_CFG_MATCH_START.

In your example,
if device is configured with RTE_REGEX_DEV_CFG_MATCH_START
match->start_offset returns 4
match->end_offset returns 15

if device is NOT configured with RTE_REGEX_DEV_CFG_MATCH_START
match->start_offset returns 0
match->end_offset returns 15

> > 
> > 3)  rte_regex_rule_db_update()
> >     Does this mean we can dynamically add or delete rules for an already
> > generated database without recompile from scratch for hardware Regex
> > implementation?
> >     If so, this isn't possible for software solutions as they don't support
> > dynamic database update and require recompile.
> > 
> > [Jerin] rte_regex_rule_db_update() internally it would call recompile
> > function for both HW and SW.
> > See rte_regex_dev_config::rule_db in rte_regex_dev_configure() for
> > precompiled rule database case.
> > [Xiang] OK, sounds like we have to save the original rule-set for the device in
> > order to do recompile. I see both ADD and REMOVE operators from
> > rte_regex_rule.
> > For rules with REMOVE operator, what's the expected behavior to handle
> > them for the old rule-set? Do we need to go through the old rule-set and
> > remove corresponding rules before doing recompile?
> 
> Yes.
>
I think it would be better to change rte_regex_rule_db_update() to
rte_regex_rule_compile() and have users provide the full rule-set.
Then we don't have to maintain the old rule-set and decide which rules to keep
and which to remove. We can simply recompile the new rule-set and get rid of
rte_regex_rule_op in this case.
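
For reference, this is roughly the bookkeeping a driver would need if it had
to keep the old rule-set around and apply ADD/REMOVE deltas before each
recompile. It is only a sketch; the types and names (rule_op, rule, rule_set,
rule_set_apply, MAX_RULES) are hypothetical and not taken from the RFC:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum rule_op { RULE_OP_ADD, RULE_OP_REMOVE };

struct rule {
	enum rule_op op;
	uint32_t rule_id;
	const char *pcre;
};

#define MAX_RULES 64

/* Saved copy of the rule-set that the last compile was based on. */
struct rule_set {
	struct rule rules[MAX_RULES];
	size_t nb;
};

/*
 * Apply ADD/REMOVE deltas to the saved set. A recompile of the whole set
 * would follow. Returns the number of deltas consumed.
 */
static size_t
rule_set_apply(struct rule_set *set, const struct rule *deltas, size_t nb)
{
	size_t i, j;

	assert(set != NULL && deltas != NULL);

	for (i = 0; i < nb; i++) {
		/* Linear search for an existing rule with the same id. */
		for (j = 0; j < set->nb; j++)
			if (set->rules[j].rule_id == deltas[i].rule_id)
				break;

		if (deltas[i].op == RULE_OP_REMOVE) {
			if (j == set->nb)
				continue; /* unknown id: nothing to remove */
			/* Fill the hole with the last rule. */
			set->rules[j] = set->rules[--set->nb];
		} else {
			if (j == set->nb) {
				if (set->nb == MAX_RULES)
					break; /* no space left */
				set->nb++;
			}
			set->rules[j] = deltas[i];
		}
	}
	return i;
}
```

With a full rule-set compile API none of this state would be needed in the
driver; the application would own the rule list.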
Jerin Jacob Kollanukkaran Sept. 27, 2019, 2:35 p.m. UTC | #12
> -----Original Message-----
> From: Wang Xiang <xiang.w.wang@intel.com>
> 
> Hi Jerin,
> 
> Thanks for your response. More comments below and inline.
> 
> 1) I think the size of some varaibles (e.g. nb_matches, scan_size, matching
> offset, etc) should be increased based on what Hyperscan supports.
> 
>     a) struct rte_regex_ops:
> 
>         uint16_t scan_size => uint32_t scan_size

I think packet buffers will not be larger than 64K, and getting more than
64K of contiguous DMA-able memory will be difficult in DPDK.
Other than that, rte_regex_match is 64 bits now; increasing the width of
len would increase the size of rte_regex_match, i.e. more
bandwidth would be needed for the response.
Could other HW vendors share the maximum length supported
by their implementations? Based on that we can decide.
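
To put rough numbers on the response size concern, here is a small sketch.
The two structs are hypothetical simplifications (the field set is assumed,
not copied from the RFC); only the widths of offset and len differ:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 64-bit match tuple with 16-bit offset and len. */
struct match16 {
	uint16_t rule_id;
	uint16_t group_id;
	uint16_t offset;
	uint16_t len;
};

/* Same tuple with offset and len widened to 32 bits. */
struct match32 {
	uint16_t rule_id;
	uint16_t group_id;
	uint32_t offset;
	uint32_t len;
};

/* Bytes of response data needed for nb_matches tuples of a given size. */
static size_t
rsp_bytes(size_t match_size, uint16_t nb_matches)
{
	assert(match_size > 0);
	return match_size * nb_matches;
}
```

On common ABIs the first tuple is 8 bytes and the second is 12, so widening
both fields grows each response entry by half.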

>         uint8_t nb_actual_matches => uint64 nb_actual_matches
>         uint8_t nb_matches => uint64 nb__matches

2^64 matches will never be possible in a practical system. How about 2^16?

> 
>     b) struct rte_regex_match:
>         uint16_t offset => uint32_t offset
>         uint16_t len => uint32_t len

See above.

> 
>     c) uint16_t
>         rte_regex_rule_db_update(uint8_t dev_id, const struct rte_regex_rule
> *rules,
>                                  uint16_t nb_rules);
>     =>
>        uint32_t
>         rte_regex_rule_db_update(uint8_t dev_id, const struct rte_regex_rule
> *rules,
>                                  uint32_t nb_rules);

OK. I will change it in the next version.

> 
>     d) int
>     rte_regex_queue_pair_setup(uint8_t dev_id, uint8_t queue_pair_id,
>                     const struct rte_regex_qp_conf *qp_conf);
>     =>
>        int
>     rte_regex_queue_pair_setup(uint8_t dev_id, uint16_t queue_pair_id,
>                     const struct rte_regex_qp_conf *qp_conf);

OK. I will change it in the next version.

> 
>     e) struct rte_regex_dev_config:
>         uint8_t nb_max_matches => uint64_t nb_max_matches

2^64 matches will never be possible in a practical system. How about 2^16?

> 
>     f) struct rte_regex_dev_info:
>         uint8_t max_matches => uint64_t max_matches

2^64 matches will never be possible in a practical system. How about 2^16?

> 
> 2) There are rte_regex_dev_attr_get() and rte_regex_dev_attr_set() defined.
> Are all the attributes below could be set by users? Is any of them read-only?

See below,

> /** Enumerates RegEx device attribute identifier */ enum
> rte_regex_dev_attr_id {
>     RTE_REGEX_DEV_ATTR_SOCKET_ID,
>     /**< The NUMA socket id to which the device is connected or
>      * a default of zero if the socket could not be determined.
>      * datatype: *int*
>      * operation: *get*

*get* means read-only. *get* and *set* means the attribute supports both operations.

>      */
>     RTE_REGEX_DEV_ATTR_MAX_MATCHES,
>     /**< Maximum number of matches per scan.
>      * datatype: *uint8_t*
>      * operation: *get* and *set*
>      *
>      * @see RTE_REGEX_OPS_RSP_MAX_MATCH_F
>      */
>     RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT,
>     /**< Upper bound scan time in ns.
>      * datatype: *uint16_t*
>      * operation: *get* and *set*
>      *
>      * @see RTE_REGEX_OPS_RSP_MAX_SCAN_TIMEOUT_F
>      */
>     RTE_REGEX_DEV_ATTR_MAX_PREFIX,
>     /**< Maximum number of prefix detected per scan.
>      * This would be useful for denial of service detection.
>      * datatype: *uint16_t*
>      * operation: *get* and *set*
>      *
>      * @see RTE_REGEX_OPS_RSP_MAX_PREFIX_F
>      */
> };
> 
> 3) Both RTE_REGEX_PCRE_RULE_* and
> RTE_REGEX_DEV_PCRE_UNSUP_* can be viewed as device capabilities. Can we
> merge them with RTE_REGEX_DEV_CAPA_RUNTIME_COMPILATION_F and have
> a unified regex_dev_capa in struct rte_regex_dev_info.

Sure. I will fix it in the next version.

> 
> 
> 4) It'll be good if we can also define synchronous matching API for users who
> want to have a one-off scan and wait for the results.

Makes sense. I will add a synchronous matching API in the next version (I understand it will be useful for SW
implementations). We can probably expose an INFO flag so that a device can advertise it as its preference.

> 
> On Tue, Sep 10, 2019 at 08:05:39AM +0000, Jerin Jacob Kollanukkaran wrote:
> > Hi Xiang,
> >
> > Sorry for delay in response(Was busy with 19.11 proposal deadline). Please
> see inline.
> >
> > >
> > > Reply to Xiang's queries in main thread:
> > >
> > > Hi all,
> > >
> > > Some questions regarding APIs. Could you please give more insights?
> > >
> > > 1) rte_regex_ops
> > >       a) rsp_flags
> > >       These two flags RTE_REGEX_OPS_RSP_PMI_SOJ_F and
> > > RTE_REGEX_OPS_RSP_PMI_EOJ_F are used for cross buffer scan.
> > >       RTE_REGEX_OPS_RSP_PMI_EOJ_F tells whether we have a partial
> > > match at the end of current buffer after scan.
> > >       What's the purpose of having RTE_REGEX_OPS_RSP_PMI_SOJ_F?
> > >
> > > [Jerin] Since we need three states to represent partial match
> > > buffer, RTE_REGEX_OPS_RSP_PMI_SOJ_F to represent start of the
> > > buffer, intermediate buffers with no flag, and end of the buffer
> > > with RTE_REGEX_OPS_RSP_PMI_EOJ
> >
> > > [Xiang] How could a user leverage these flags for matching? Suppose
> > > a large buffer is divided into multiple chunks. Will
> > > RTE_REGEX_OPS_RSP_PMI_SOJ_F cause an early quit once it isn't set
> > > after scan the first chunk. Similarly, RTE_REGEX_OPS_RSP_PMI_EOJ
> > > tells a user whether to stop matching future buffers after finish the last
> chunk?
> >
> > Let me describe with an example,
> >
> > Assume,
> > 1) struct rte_regex_dev_info:: max_payload_size set to 1024
> > 2) rte_regex_dev_config:: dev_cfg_flags configured with
> > RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
> > 3) Device programmed with matching "hello\s+world" pattern
> > 4) user enqueue struct rte_regex_ops:: buf_addr point following "data"
> > and struct rte_regex_op:: scan_size = 1024
> >
> > data[0..1021] = data that doesn't contain the hello world pattern
> > data[1022] = 'h'
> > data[1023] = 'e'
> >
> > 5) user enqueue struct rte_regex_ops:: buf_addr point following "data"
> > and struct rte_regex_op:: scan_size = 9
> >
> > data[0] = 'l'
> > data[1] = 'l'
> > data[2] = 'o'
> > data[3] = ' '
> > data[4] = 'w'
> > data[5] = 'o'
> > data[6] = 'r'
> > data[7] = 'l'
> > data[8] = 'd'
> >
> > If so,
> >
> > Response to 4) will be RTE_REGEX_OPS_RSP_PMI_SOJ_F in rte_regex_ops::
> > rsp_flags on dequeue Where rte_regex_match:: offset is 1022 and len 2
> >
> > Response to 5) will be RTE_REGEX_OPS_RSP_PMI_EOJ_F in rte_regex_ops::
> > rsp_flags on dequeue Where rte_regex_match:: offset is 0 and len 9
> >
> If the defined pattern is "hello.*world" instead of "hello\s+world", and we
> enqueue following struct rte_regex_ops:
> 
> 1) rte_regex_op:: scan_size = 1024
> 
>    data[0..1021] = data that doesn't contain the hello world pattern
>    data[1022] = 'h'
>    data[1023] = 'e'
> 
> 2) rte_regex_op:: scan_size = 9
>    data[0] = 'l'
>    data[1] = 'l'
>    data[2] = 'o'
>    data[3] = ' '
>    data[4] = 'w'
>    data[5] = 'o'
>    data[6] = 'r'
>    data[7] = 'l'
>    data[8] = 'd'
> 
> 3) rte_regex_op:: scan_size = 5
>    data[0] = 'w'
>    data[1] = 'o'
>    data[2] = 'r'
>    data[3] = 'l'
>    data[4] = 'd'
> 
> Will response to 3) have RTE_REGEX_OPS_RSP_PMI_EOJ_F in rte_regex_ops::
> rsp_flags on dequeue
> Where rte_regex_match:: offset is 0 and len 4?

Yes.

> 
> I am wondering what your expected behavior is for .* or similar syntax, and
> whether there are syntax compatibility issues. We report all matches in Hyperscan, e.g.
> end match offsets 11 and 16 for pattern "hello.*world" and corpus
> "hello worldworld".
> 
> BTW, I am not sure how other hardware devices handle cross buffer scan. Hyperscan
> doesn't report matches for start and intermediate buffers; it only reports the end
> offset once a full match is found.
> 
> >
> > >
> > >       RTE_REGEX_OPS_RSP_MAX_PREFIX_F: This looks like a definition
> > > for a specific hardware implementation. I am wondering what this
> > > PREFIX refers to:)?
> > >
> > > [Jerin] Yes. Looks like it is for hardware specific implementation.
> > > Introduced rte_regex_dev_attr_set/get functions to make it portable
> > > and To add new implementation specific fields.
> > > For example, if a rule is
> > > /ABCDEF.*XYZ/, ABCD is considered the prefix, and EF.*XYZ is
> > > considered the factor. The prefix is a literal string, while the
> > > factor can contain complex regular expression constructs. As a
> > > result, rule matching occurs in two stages: prefix matching and
> > > factor matching.
> > >
> > >       b)  user_id or user_ptr
> > >       Under what kind of circumstances should an application pass
> > > value into these variables for enqueue and dequeuer operations?
> > >
> > > [Jerin] Just like rte_crypto_ops, struct rte_regex_ops also
> > > allocated using mempool normally, on enqueue, user can specify
> > > user_id If needed to in order identify the op on dequeue if
> > > required. The use case could be to store the sequence number from
> > > application POV or storing the mbuf ptr in which pattern is requested etc.
> > >
> > >
> > >  2) rte_regex_match
> > >       a) offset; /**< Starting Byte Position for matched rule. */
> > > and  uint16_t len; /**< Length of match in bytes */
> > >       Looks like the matching offset is defined as *starting
> > > matching offset* instead of *end matching offset*, e.g. report the offset of
> "a" instead of "c"
> > > for pattern "abc".
> > >       If so, this makes it hard to integrate software regex
> > > libraries such as Hyperscan and RE2 as they only report *end
> > > matching offset* without length of match.
> > >       Although Hyperscan has API for *starting matching offset*, it
> > > only delivers partial syntax support. So I think we have to define
> > > *end of matching offset* for software solutions.
> > >
> > > [Jerin] I understand the hyperscan's HS_FLAG_SOM_LEFTMOST tradeoffs.
> > > I thought application would need always the length of the match.
> > > Probably we will see how other HW implementation (from Mellanox)
> > > etc. We will try to abstract it, probably we can make it as function
> > > of "user requested".
> > > [Xiang] Yes, it will be good to make it per user request. At least
> > > from Hyperscan user's point of view, start of match and match length
> > > are not mandatory.
> >
> > OK. I think, we can introduce RTE_REGEX_DEV_CFG_MATCH_AS_START In
> > device configure.
> >
> > Since offset+len == end, we can introduce following generic inline function.
> >
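> > static inline uint32_t
> > rte_regex_match_end(const struct rte_regex_match *match)
> > {
> > 	/* End offset of the match: start offset plus match length. */
> > 	return (uint32_t)match->offset + match->len;
> > }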
> >
> > Example:  pattern to match is  "hello\s+world"  and data is following
> > data[4] = 'h'
> > data[5] = 'e'
> > data[6] = 'l'
> > data[7] = 'l'
> > data[8] = 'o'
> > data[9] = ' '
> > data[10] = 'w'
> > data[11] = 'o'
> > data[12] = 'r'
> > data[13] = 'l'
> > data[14] = 'd'
> >
> > if device is configured with RTE_REGEX_DEV_CFG_MATCH_AS_START
> > match->offset returns 4
> > match->len returns 11
> >
> > if device is NOT configured with RTE_REGEX_DEV_CFG_MATCH_AS_START
> > driver MAY return the following(in hyperscan case)
> > match->offset returns 0
> > match->len returns 11 + 4
> >
> > In both case(irrespective of flags, to make application life easy)
> rte_regex_match_end() would return 15.
> > If application demands for MATCH_AS_START then driver can return
> > match->offset returns 4 and match->len returns 11 Aka set
> > HS_FLAG_SOM_LEFTMOST in hyperscan driver, But application should use
> rte_regex_match_end() for finding the end of the match. To make, work in all
> cases.
> >
> > Is it OK?
> >
> Can we replace len with end offset? So we can change "offset" to "start_offset"
> and "len" to "end_offset" in struct rte_regex_match. Users interested in len
> could take "end_offset - start_offset".
> We may also change RTE_REGEX_DEV_CFG_MATCH_AS_START to
> RTE_REGEX_DEV_CFG_MATCH_START
> 
> In your example,
> if device is configured with RTE_REGEX_DEV_CFG_MATCH_START
> match->start_offset returns 4
> match->end_offset returns 15
> 
> if device is NOT configured with RTE_REGEX_DEV_CFG_MATCH_START
> match->start_offset returns 0
> match->end_offset returns 15


This part is a little tricky, as HW descriptions need to be rewritten on response.
This is one issue I foresaw earlier: coming up with an rte_regex_match
that works for all implementations without a performance issue.

We have two HW implementations; both return start_off and len.
Let's get input from other HW implementations on the semantics of
rte_regex_match. Based on that, we can decide how to go about it.
Thoughts from Mellanox or other vendors?
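
To make the two candidate layouts concrete, here is a minimal sketch that encodes the "hello\s+world" example above in both conventions. The struct and function names (sketch_match_len, sketch_match_span, sketch_match_end, sketch_to_span) are hypothetical and only illustrate the discussion; they are not the RFC's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convention 1 (as discussed for the RFC): start offset plus length. */
struct sketch_match_len {
	uint16_t offset;
	uint16_t len;
};

/* Convention 2 (proposed in the reply): explicit start and end offsets. */
struct sketch_match_span {
	uint16_t start_offset;
	uint16_t end_offset;
};

/* End of match is offset + len; widened so the sum cannot wrap at 16 bits. */
static inline uint32_t
sketch_match_end(const struct sketch_match_len *m)
{
	assert(m != NULL);
	return (uint32_t)m->offset + m->len;
}

/* Convert convention 1 to convention 2 (assumes the end fits in 16 bits). */
static inline struct sketch_match_span
sketch_to_span(const struct sketch_match_len *m)
{
	struct sketch_match_span s = {
		.start_offset = m->offset,
		.end_offset = (uint16_t)sketch_match_end(m),
	};
	return s;
}
```

With start-of-match reporting the example yields offset 4 and len 11; without it a driver may report offset 0 and len 15. Both give the same end offset, which is the only value the two conventions are guaranteed to agree on.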



> 
> > >
> > > 3)  rte_regex_rule_db_update()
> > >     Does this mean we can dynamically add or delete rules for an
> > > already generated database without recompile from scratch for
> > > hardware Regex implementation?
> > >     If so, this isn't possible for software solutions as they don't
> > > support dynamic database update and require recompile.
> > >
> > > [Jerin] rte_regex_rule_db_update() internally it would call
> > > recompile function for both HW and SW.
> > > See rte_regex_dev_config::rule_db in rte_regex_dev_configure() for
> > > precompiled rule database case.
> > > [Xiang] OK, sounds like we have to save the original rule-set for
> > > the device in order to do recompile. I see both ADD and REMOVE
> > > operators from rte_regex_rule.
> > > For rules with REMOVE operator, what's the expected behavior to
> > > handle them for the old rule-set? Do we need to go through the old
> > > rule-set and remove corresponding rules before doing recompile?
> >
> > Yes.
> >
> I think it'll be better to change rte_regex_rule_db_update() to
> rte_regex_rule_compile() and have users to provide a full rule-set.
> So we don't have to maintain old rule-set and decide which one to keep and
> remove. We can simply recompile new rule-set and get rid of
> rte_regex_rule_op in this case.


On virtualized HW implementations, the rule database is maintained by a single
body. So the above scheme works with both SW and HW implementations,
and it makes the user's life easy, as they don't need to maintain the rules.

I don't have a preference on the rte_regex_rule_db_update() name; I can change it to
rte_regex_rule_compile() if required, keeping the above functionality. Let me know.
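
As a rough illustration of what "maintain the old rule-set and apply ADD/REMOVE before recompiling" could look like inside a driver or library, here is a minimal sketch. All names, the rule_id field, and the fixed capacity are assumptions made for this example; the return value mirrors the "number of rules actually consumed" semantics described in the RFC doc comment.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_RULES 8 /* hypothetical capacity of the rule database */

enum sketch_rule_op { SKETCH_RULE_ADD, SKETCH_RULE_REMOVE };

struct sketch_rule {
	enum sketch_rule_op op;
	uint16_t rule_id;  /* hypothetical key used to find a rule on REMOVE */
	const char *pcre;  /* pattern text; unused for REMOVE */
};

struct sketch_rule_set {
	struct sketch_rule rules[SKETCH_MAX_RULES];
	uint16_t nb_rules;
};

/*
 * Apply ADD/REMOVE operations in order to the saved rule-set.
 * Returns the number of operations consumed; processing stops at the
 * first one that cannot be applied (database full or unknown rule_id).
 */
static uint16_t
sketch_rule_set_apply(struct sketch_rule_set *set,
		      const struct sketch_rule *ops, uint16_t nb_ops)
{
	uint16_t i, j;

	assert(set != NULL && ops != NULL);
	for (i = 0; i < nb_ops; i++) {
		if (ops[i].op == SKETCH_RULE_ADD) {
			if (set->nb_rules == SKETCH_MAX_RULES)
				break; /* no space left in the rule database */
			set->rules[set->nb_rules++] = ops[i];
			continue;
		}
		for (j = 0; j < set->nb_rules; j++)
			if (set->rules[j].rule_id == ops[i].rule_id)
				break;
		if (j == set->nb_rules)
			break; /* rule to remove was never added */
		/* Close the gap left by the removed rule. */
		memmove(&set->rules[j], &set->rules[j + 1],
			(size_t)(set->nb_rules - j - 1) * sizeof(set->rules[0]));
		set->nb_rules--;
	}
	/* A real implementation would recompile the surviving rules here. */
	return i;
}
```

The alternative raised in the thread (the user always passes the full rule-set) would drop the saved state and the op field entirely, at the cost of pushing rule bookkeeping to the application.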
Jerin Jacob Kollanukkaran Sept. 27, 2019, 2:45 p.m. UTC | #13
> -----Original Message-----
> From: Jerin Jacob Kollanukkaran
> Sent: Tuesday, September 10, 2019 4:33 PM
> To: Shahaf Shuler <shahafs@mellanox.com>; Thomas Monjalon
> <thomas@monjalon.net>; dev@dpdk.org
> Cc: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>; Hemant
> Agrawal <hemant.agrawal@nxp.com>; Opher Reviv <opher@mellanox.com>;
> Alex Rosenbaum <alexr@mellanox.com>; Dovrat Zifroni
> <dovrat@marvell.com>; Prasun Kapoor <pkapoor@marvell.com>; Nipun Gupta
> <nipun.gupta@nxp.com>; Wang, Xiang W <xiang.w.wang@intel.com>;
> Richardson, Bruce <bruce.richardson@intel.com>; yang.a.hong@intel.com;
> harry.chang@intel.com; gu.jian1@zte.com.cn; shanjiangh@chinatelecom.cn;
> zhangy.yun@chinatelecom.cn; lixingfu@huachentel.com;
> wushuai@inspur.com; yuyingxia@yxlink.com; fanchenggang@sunyainfo.com;
> davidfgao@tencent.com; liuzhong1@chinaunicom.cn;
> zhaoyong11@huawei.com; oc@yunify.com; jim@netgate.com;
> hongjun.ni@intel.com; j.bromhead@titan-ic.com; deri@ntop.org;
> fc@napatech.com; arthur.su@lionic.com
> Subject: RE: [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev
> subsystem
> 
> > Hi Jerin,
> 
> Hi Shahaf,
> 
> Sorry for delay in response(Was busy with 19.11 proposal deadline). Please see
> inline.
> 
> > >
> > > RegEx pattern matching applications:
> > > • Next Generation Firewalls (NGFW)
> > > • Deep Packet and Flow Inspection (DPI)
> > > • Intrusion Prevention Systems (IPS)
> > > • DDoS Mitigation
> > > • Network Monitoring
> > > • Data Loss Prevention (DLP)
> > > • Smart NICs
> > > • Grammar based content processing
> > > • URL, spam and adware filtering
> > > • Advanced auditing and policing of user/application security policies
> > > • Financial data mining - parsing of streamed financial feeds
> >
> > I think two more important use cases to add (at least in the doc of
> > this subsystem) are:
> > * application recognition
> > * memory introspection
> 
> Sure. Will add the following from John as well.
> 
> # Natural Language Processing (NLP)
> # Sentiment Analysis
> # Big Data database acceleration (Spark, Hadoop etc.)
> # Computational Storage
> 
> >
> >
> > > +/**
> > > + * Update the rule database of a RegEx device.
> > > + *
> > > + * @param dev_id RegEx device identifier
> > > + * @param rules
> > > + *   Points to an array of *nb_rules* objects of type *rte_regex_rule*
> > > structure
> > > + *   which contain the regex rules attributes to be updated in rule
> database.
> > > + * @param nb_rules
> > > + *   The number of PCRE rules to update the rule database.
> > > + *
> > > + * @return
> > > + *   The number of regex rules actually updated on the regex device's rule
> > > + *   database. The return value can be less than the value of the *nb_rules*
> > > + *   parameter when the regex devices fails to update the rule database or
> > > + *   if invalid parameters are specified in a *rte_regex_rule*.
> > > + *   If the return value is less than *nb_rules*, the remaining PCRE rules
> > > + *   at the end of *rules* are not consumed and the caller has to take
> > > + *   care of them and rte_errno is set accordingly.
> > > + *   Possible errno values include:
> > > + *   - -EINVAL:  Invalid device ID or rules is NULL
> > > + *   - -ENOTSUP: The last processed rule is not supported on this device.
> > > + *   - -ENOSPC: No space available in rule database.
> > > + *
> > > + * @see rte_regex_rule_db_import(), rte_regex_rule_db_export()  */
> > > +uint16_t rte_regex_rule_db_update(uint8_t dev_id, const struct
> > > +rte_regex_rule
> > > *rules,
> > > +			 uint16_t nb_rules);
> >
> > I think the function name is not too informative. If this function
> > meant to compile the rule then it should be explicit on the function name.
> 
> It is meant to compile the rules and then update the rule database.
> 
> I think we can have either 1 or 2. Let me know your preference, or if you have
> any name suggestion, I will change it accordingly.
> 
> 1. rte_regex_rule_db_compile()
> 2. rte_regex_rule_db_compile_update()


@Shahaf Shuler, Thoughts?


> 
> 
> > > +
> > > + */
> > > +struct rte_regex_ops {
> > > +
> > > +	/* W4 */
> > > +	RTE_STD_C11
> > > +	union {
> > > +		uint64_t user_id;
> > > +		/**< Application specific opaque value. An application may
> > > use
> > > +		 * this field to hold application specific value to share
> > > +		 * between dequeue and enqueue operation.
> > > +		 * Implementation should not modify this field.
> > > +		 */
> > > +		void *user_ptr;
> > > +		/**< Pointer representation of *user_id* */
> > > +	};
> >
> > Since we target the regex subsystem for both regex and DPI I think it
> > will be good to add another uint64_t field called connection_id.
> > Device that support DPI can refer to it as another match able field
> > when looking up for matches on the given buffer.
> >
> > This field is different from the user_id, as it is not opaque for the device.
> 
> Is this a driver-specific storage place that the application should not touch?
> 
> If not, could you share the data flow of this field? I.e., who "writes" this field and
> who "reads" this field.

@Shahaf Shuler Thoughts?

Based on your input, I will update the next version.

> 
> This is just for documentation. In any event, we can add new fields.
> 
> If it is only for driver usage, then I think some drivers may need more than 8B of
> storage. In that case, I think each driver can add its own fields after W4 (i.e. the
> existing user_id), and we can introduce a new field called match_offset in struct
> rte_regex_ops,
> 
> i.e. struct rte_regex_match *matches == ops + ops->match_offset; so that each
> driver can add enough driver-specific metadata.
> 
> 
>
Shahaf Shuler Oct. 2, 2019, 5:53 a.m. UTC | #14
Friday, September 27, 2019 5:46 PM, Jerin Jacob Kollanukkaran:
> subsystem
> 
> > -----Original Message-----
> > From: Jerin Jacob Kollanukkaran
> > Sent: Tuesday, September 10, 2019 4:33 PM
> > To: Shahaf Shuler <shahafs@mellanox.com>; Thomas Monjalon
> > <thomas@monjalon.net>; dev@dpdk.org
> > Cc: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>; Hemant
> > Agrawal <hemant.agrawal@nxp.com>; Opher Reviv
> <opher@mellanox.com>;
> > Alex Rosenbaum <alexr@mellanox.com>; Dovrat Zifroni
> > <dovrat@marvell.com>; Prasun Kapoor <pkapoor@marvell.com>; Nipun
> Gupta
> > <nipun.gupta@nxp.com>; Wang, Xiang W <xiang.w.wang@intel.com>;
> > Richardson, Bruce <bruce.richardson@intel.com>; yang.a.hong@intel.com;
> > harry.chang@intel.com; gu.jian1@zte.com.cn;
> > shanjiangh@chinatelecom.cn; zhangy.yun@chinatelecom.cn;
> > lixingfu@huachentel.com; wushuai@inspur.com; yuyingxia@yxlink.com;
> > fanchenggang@sunyainfo.com; davidfgao@tencent.com;
> > liuzhong1@chinaunicom.cn; zhaoyong11@huawei.com; oc@yunify.com;
> > jim@netgate.com; hongjun.ni@intel.com; j.bromhead@titan-ic.com;
> > deri@ntop.org; fc@napatech.com; arthur.su@lionic.com
> > Subject: RE: [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev
> > subsystem
> >
> > > Hi Jerin,
> >
> > Hi Shahaf,
> >
> > > > + *
> > > > + * @see rte_regex_rule_db_import(), rte_regex_rule_db_export()
> > > > +*/ uint16_t rte_regex_rule_db_update(uint8_t dev_id, const struct
> > > > +rte_regex_rule
> > > > *rules,
> > > > +			 uint16_t nb_rules);
> > >
> > > I think the function name is not too informative. If this function
> > > meant to compile the rule then it should be explicit on the function
> name.
> >
> > It is meant to be compile the rules and then  update the rule database.
> >
> > I think, we can have either 1 or 2. Let me know your preference or If
> > you have any name suggestion. I will change it accordingly.
> >
> > 1. rte_regex_rule_db_compile()
> > 2. rte_regex_rule_db_compile_update()
> 
> 
> @Shahaf Shuler, Thoughts?

IMO we should have two separate functions: one that only compiles and one that only updates.

So I would prefer #1, with the addition (if not already present) of an API to update rules.

> 
> 
> >
> >
> > > > +
> > > > + */
> > > > +struct rte_regex_ops {
> > > > +
> > > > +	/* W4 */
> > > > +	RTE_STD_C11
> > > > +	union {
> > > > +		uint64_t user_id;
> > > > +		/**< Application specific opaque value. An application may
> > > > use
> > > > +		 * this field to hold application specific value to share
> > > > +		 * between dequeue and enqueue operation.
> > > > +		 * Implementation should not modify this field.
> > > > +		 */
> > > > +		void *user_ptr;
> > > > +		/**< Pointer representation of *user_id* */
> > > > +	};
> > >
> > > Since we target the regex subsystem for both regex and DPI I think
> > > it will be good to add another uint64_t field called connection_id.
> > > Device that support DPI can refer to it as another match able field
> > > when looking up for matches on the given buffer.
> > >
> > > This field is different from the user_id, as it is not opaque for the device.
> >
> > Is this driver specific storage place where application should not touch it?
> >
> > If not, Could you share the data flow of this field? Ie. Who "write"
> > this Field and who "read" this field.

The application writes to the field. The device reads from this field.
Unlike the user_ptr, which is completely opaque to the device, the connection_id field will have some meaning (e.g. DPI rules can apply to it).
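
A minimal sketch of that data flow, under stated assumptions: the struct below is a cut-down, hypothetical stand-in for rte_regex_ops (only the W4 union plus the proposed field), and sketch_dev_rule_applies() is an invented placeholder for whatever a DPI-capable device would do with the value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced view of the op: opaque user word plus connection_id. */
struct sketch_regex_ops {
	union {
		uint64_t user_id; /* opaque to the device, echoed back on dequeue */
		void *user_ptr;   /* pointer representation of user_id */
	};
	uint64_t connection_id;   /* written by the application, read by the device */
};

/* Application side: fill both fields before enqueue. */
static void
sketch_ops_prepare(struct sketch_regex_ops *op, void *app_ctx, uint64_t conn)
{
	assert(op != NULL);
	op->user_id = 0;        /* clear the full 64-bit word on any pointer size */
	op->user_ptr = app_ctx;
	op->connection_id = conn;
}

/* Device side (placeholder): a rule scoped to one connection checks the field. */
static int
sketch_dev_rule_applies(const struct sketch_regex_ops *op, uint64_t rule_conn)
{
	assert(op != NULL);
	return op->connection_id == rule_conn;
}
```

The point of the sketch is only the asymmetry: the device never interprets user_ptr, while connection_id takes part in matching.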

> 
> @Shahaf Shuler Thoughts?
> 
> Based on your input, I will update the next version.
> 
> >
> > This is just for documentation, In any event we can add new fields.
> >
> > If it is only for driver usage then I think, some driver may need more
> > 8B Storage. In that case I think, each driver can add its on field
> > After W4(i.e existing user_id) and introduce new field called
> > match_offset in struct rte_regex_ops
> >
> > ie. struct rte_regex_match *matches == ops + ops-> match_offset; so
> > that, Each driver can add enough driver specific metadata.
> >
> >
> >
Jerin Jacob Kollanukkaran Oct. 2, 2019, 8:31 a.m. UTC | #15
> -----Original Message-----
> From: Shahaf Shuler <shahafs@mellanox.com>
> Sent: Wednesday, October 2, 2019 11:23 AM
> To: Jerin Jacob Kollanukkaran <jerinj@marvell.com>; Thomas Monjalon
> <thomas@monjalon.net>; 'dev@dpdk.org' <dev@dpdk.org>
> Cc: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>; 'Hemant
> Agrawal' <hemant.agrawal@nxp.com>; Opher Reviv <opher@mellanox.com>;
> Alex Rosenbaum <alexr@mellanox.com>; Dovrat Zifroni
> <dovrat@marvell.com>; Prasun Kapoor <pkapoor@marvell.com>; 'Nipun
> Gupta' <nipun.gupta@nxp.com>; 'Wang, Xiang W' <xiang.w.wang@intel.com>;
> 'Richardson, Bruce' <bruce.richardson@intel.com>; 'yang.a.hong@intel.com'
> <yang.a.hong@intel.com>; 'harry.chang@intel.com' <harry.chang@intel.com>;
> 'gu.jian1@zte.com.cn' <gu.jian1@zte.com.cn>; 'shanjiangh@chinatelecom.cn'
> <shanjiangh@chinatelecom.cn>; 'zhangy.yun@chinatelecom.cn'
> <zhangy.yun@chinatelecom.cn>; 'lixingfu@huachentel.com'
> <lixingfu@huachentel.com>; 'wushuai@inspur.com' <wushuai@inspur.com>;
> 'yuyingxia@yxlink.com' <yuyingxia@yxlink.com>;
> 'fanchenggang@sunyainfo.com' <fanchenggang@sunyainfo.com>;
> 'davidfgao@tencent.com' <davidfgao@tencent.com>;
> 'liuzhong1@chinaunicom.cn' <liuzhong1@chinaunicom.cn>;
> 'zhaoyong11@huawei.com' <zhaoyong11@huawei.com>; 'oc@yunify.com'
> <oc@yunify.com>; 'jim@netgate.com' <jim@netgate.com>;
> 'hongjun.ni@intel.com' <hongjun.ni@intel.com>; 'j.bromhead@titan-ic.com'
> <j.bromhead@titan-ic.com>; 'deri@ntop.org' <deri@ntop.org>;
> 'fc@napatech.com' <fc@napatech.com>; 'arthur.su@lionic.com'
> <arthur.su@lionic.com>
> Subject: RE: [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev
> subsystem
> 
> > > > I think the function name is not too informative. If this function
> > > > meant to compile the rule then it should be explicit on the
> > > > function
> > name.
> > >
> > > It is meant to be compile the rules and then  update the rule database.
> > >
> > > I think, we can have either 1 or 2. Let me know your preference or
> > > If you have any name suggestion. I will change it accordingly.
> > >
> > > 1. rte_regex_rule_db_compile()
> > > 2. rte_regex_rule_db_compile_update()
> >
> >
> > @Shahaf Shuler, Thoughts?
> 
> IMO we should have two separate functions - one to only compile. One to only
> update.
> 
> So I would prefer #1, with addition (if not already present) of API to update
> rules.


OK. Will change it in next version.


> 
> >
> >
> > >
> > >
> > > > > +
> > > > > + */
> > > > > +struct rte_regex_ops {
> > > > > +
> > > > > +	/* W4 */
> > > > > +	RTE_STD_C11
> > > > > +	union {
> > > > > +		uint64_t user_id;
> > > > > +		/**< Application specific opaque value. An application
> may
> > > > > use
> > > > > +		 * this field to hold application specific value to share
> > > > > +		 * between dequeue and enqueue operation.
> > > > > +		 * Implementation should not modify this field.
> > > > > +		 */
> > > > > +		void *user_ptr;
> > > > > +		/**< Pointer representation of *user_id* */
> > > > > +	};
> > > >
> > > > Since we target the regex subsystem for both regex and DPI I think
> > > > it will be good to add another uint64_t field called connection_id.
> > > > Device that support DPI can refer to it as another match able
> > > > field when looking up for matches on the given buffer.
> > > >
> > > > This field is different from the user_id, as it is not opaque for the device.
> > >
> > > Is this driver specific storage place where application should not touch it?
> > >
> > > If not, Could you share the data flow of this field? Ie. Who "write"
> > > this Field and who "read" this field.
> 
> Application writes to the field. Device reads from this fields.
> Unlike the user_ptr which is complete opaque to the device, connection_id field
> will have some meaning (e.g. DPI rules can apply on it).

Will you be connecting the value to rte_flow etc. to get the complete data flow?
I understand the application writes to this field, but I am not sure what values
need to be written and how it will be connected in the overall scheme of things.
I am not even sure what to write in the Doxygen comment for this field.

Can we add this field once we have the complete data flow? Since it is
experimental, we can always add a new field.
Shahaf Shuler Oct. 2, 2019, 8:52 a.m. UTC | #16
Wednesday, October 2, 2019 11:32 AM, Jerin Jacob Kollanukkaran:
> Subject: Re: [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev
> subsystem
> 
> > -----Original Message-----
> > From: Shahaf Shuler <shahafs@mellanox.com>
> > Sent: Wednesday, October 2, 2019 11:23 AM
> > To: Jerin Jacob Kollanukkaran <jerinj@marvell.com>; Thomas Monjalon
> > <thomas@monjalon.net>; 'dev@dpdk.org' <dev@dpdk.org>
> > Cc: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>; 'Hemant
> > Agrawal' <hemant.agrawal@nxp.com>; Opher Reviv
> <opher@mellanox.com>;
> > Alex Rosenbaum <alexr@mellanox.com>; Dovrat Zifroni
> > <dovrat@marvell.com>; Prasun Kapoor <pkapoor@marvell.com>; 'Nipun
> > Gupta' <nipun.gupta@nxp.com>; 'Wang, Xiang W'
> > <xiang.w.wang@intel.com>; 'Richardson, Bruce'
> <bruce.richardson@intel.com>; 'yang.a.hong@intel.com'
> > <yang.a.hong@intel.com>; 'harry.chang@intel.com'
> > <harry.chang@intel.com>; 'gu.jian1@zte.com.cn' <gu.jian1@zte.com.cn>;
> 'shanjiangh@chinatelecom.cn'
> > <shanjiangh@chinatelecom.cn>; 'zhangy.yun@chinatelecom.cn'
> > <zhangy.yun@chinatelecom.cn>; 'lixingfu@huachentel.com'
> > <lixingfu@huachentel.com>; 'wushuai@inspur.com'
> <wushuai@inspur.com>;
> > 'yuyingxia@yxlink.com' <yuyingxia@yxlink.com>;
> > 'fanchenggang@sunyainfo.com' <fanchenggang@sunyainfo.com>;
> > 'davidfgao@tencent.com' <davidfgao@tencent.com>;
> > 'liuzhong1@chinaunicom.cn' <liuzhong1@chinaunicom.cn>;
> > 'zhaoyong11@huawei.com' <zhaoyong11@huawei.com>; 'oc@yunify.com'
> > <oc@yunify.com>; 'jim@netgate.com' <jim@netgate.com>;
> > 'hongjun.ni@intel.com' <hongjun.ni@intel.com>; 'j.bromhead@titan-
> ic.com'
> > <j.bromhead@titan-ic.com>; 'deri@ntop.org' <deri@ntop.org>;
> > 'fc@napatech.com' <fc@napatech.com>; 'arthur.su@lionic.com'
> > <arthur.su@lionic.com>
> > Subject: RE: [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev
> > subsystem
> >
> > > > > I think the function name is not too informative. If this
> > > > > function meant to compile the rule then it should be explicit on
> > > > > the function
> > > name.
> > > >
> > > > It is meant to be compile the rules and then  update the rule database.
> > > >
> > > > I think, we can have either 1 or 2. Let me know your preference or
> > > > If you have any name suggestion. I will change it accordingly.
> > > >
> > > > 1. rte_regex_rule_db_compile()
> > > > 2. rte_regex_rule_db_compile_update()
> > >
> > >
> > > @Shahaf Shuler, Thoughts?
> >
> > IMO we should have two separate functions - one to only compile. One
> > to only update.
> >
> > So I would prefer #1, with addition (if not already present) of API to
> > update rules.
> 
> 
> OK. Will change it in next version.
> 
> 
> >
> > >
> > >
> > > >
> > > >
> > > > > > +
> > > > > > + */
> > > > > > +struct rte_regex_ops {
> > > > > > +
> > > > > > +	/* W4 */
> > > > > > +	RTE_STD_C11
> > > > > > +	union {
> > > > > > +		uint64_t user_id;
> > > > > > +		/**< Application specific opaque value. An
> application
> > may
> > > > > > use
> > > > > > +		 * this field to hold application specific value to share
> > > > > > +		 * between dequeue and enqueue operation.
> > > > > > +		 * Implementation should not modify this field.
> > > > > > +		 */
> > > > > > +		void *user_ptr;
> > > > > > +		/**< Pointer representation of *user_id* */
> > > > > > +	};
> > > > >
> > > > > Since we target the regex subsystem for both regex and DPI I
> > > > > think it will be good to add another uint64_t field called
> connection_id.
> > > > > Device that support DPI can refer to it as another match able
> > > > > field when looking up for matches on the given buffer.
> > > > >
> > > > > This field is different from the user_id, as it is not opaque for the
> device.
> > > >
> > > > Is this driver specific storage place where application should not touch
> it?
> > > >
> > > > If not, Could you share the data flow of this field? Ie. Who "write"
> > > > this Field and who "read" this field.
> >
> > Application writes to the field. Device reads from this fields.
> > Unlike the user_ptr which is complete opaque to the device,
> > connection_id field will have some meaning (e.g. DPI rules can apply on it).
> 
> Will you be connecting the value to rte_flow etc to get the complete data
> flow.
> I understand applications writes to this field, But I am not sure what values
> Needs to be written and how it will be connected in overall scheme of things.
> I am not sure even what to write doxgygen comment for this field.
> 
> Can we add this field once we have the complete data flow?. Since it is
> Experimental we can always add new field.

Yes. We can revisit it later, so long as we agree that such a field can be added.

>
Jerin Jacob Kollanukkaran Oct. 2, 2019, 9:34 a.m. UTC | #17
> -----Original Message-----
> From: Shahaf Shuler <shahafs@mellanox.com>
> Sent: Wednesday, October 2, 2019 2:23 PM
> To: Jerin Jacob Kollanukkaran <jerinj@marvell.com>; Thomas Monjalon
> <thomas@monjalon.net>; 'dev@dpdk.org' <dev@dpdk.org>
> Cc: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>; 'Hemant
> Agrawal' <hemant.agrawal@nxp.com>; Opher Reviv <opher@mellanox.com>;
> Alex Rosenbaum <alexr@mellanox.com>; Dovrat Zifroni
> <dovrat@marvell.com>; Prasun Kapoor <pkapoor@marvell.com>; 'Nipun
> Gupta' <nipun.gupta@nxp.com>; 'Wang, Xiang W' <xiang.w.wang@intel.com>;
> 'Richardson, Bruce' <bruce.richardson@intel.com>; 'yang.a.hong@intel.com'
> <yang.a.hong@intel.com>; 'harry.chang@intel.com' <harry.chang@intel.com>;
> 'gu.jian1@zte.com.cn' <gu.jian1@zte.com.cn>; 'shanjiangh@chinatelecom.cn'
> <shanjiangh@chinatelecom.cn>; 'zhangy.yun@chinatelecom.cn'
> <zhangy.yun@chinatelecom.cn>; 'lixingfu@huachentel.com'
> <lixingfu@huachentel.com>; 'wushuai@inspur.com' <wushuai@inspur.com>;
> 'yuyingxia@yxlink.com' <yuyingxia@yxlink.com>;
> 'fanchenggang@sunyainfo.com' <fanchenggang@sunyainfo.com>;
> 'davidfgao@tencent.com' <davidfgao@tencent.com>;
> 'liuzhong1@chinaunicom.cn' <liuzhong1@chinaunicom.cn>;
> 'zhaoyong11@huawei.com' <zhaoyong11@huawei.com>; 'oc@yunify.com'
> <oc@yunify.com>; 'jim@netgate.com' <jim@netgate.com>;
> 'hongjun.ni@intel.com' <hongjun.ni@intel.com>; 'j.bromhead@titan-ic.com'
> <j.bromhead@titan-ic.com>; 'deri@ntop.org' <deri@ntop.org>;
> 'fc@napatech.com' <fc@napatech.com>; 'arthur.su@lionic.com'
> <arthur.su@lionic.com>
> Subject: RE: [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev
> subsystem
> 
> > > > > >
> > > > > > Since we target the regex subsystem for both regex and DPI I
> > > > > > think it will be good to add another uint64_t field called
> > connection_id.
> > > > > > Device that support DPI can refer to it as another match able
> > > > > > field when looking up for matches on the given buffer.
> > > > > >
> > > > > > This field is different from the user_id, as it is not opaque
> > > > > > for the
> > device.
> > > > >
> > > > > Is this driver specific storage place where application should
> > > > > not touch
> > it?
> > > > >
> > > > > If not, Could you share the data flow of this field? Ie. Who "write"
> > > > > this Field and who "read" this field.
> > >
> > > Application writes to the field. Device reads from this fields.
> > > Unlike the user_ptr which is complete opaque to the device,
> > > connection_id field will have some meaning (e.g. DPI rules can apply on it).
> >
> > Will you be connecting the value to rte_flow etc to get the complete
> > data flow.
> > I understand applications writes to this field, But I am not sure what
> > values Needs to be written and how it will be connected in overall scheme of
> things.
> > I am not sure even what to write doxgygen comment for this field.
> >
> > Can we add this field once we have the complete data flow?. Since it
> > is Experimental we can always add new field.
> 
> Yes. We can revisit it later, so long we agree that such field can be added.

Yes. DPI inline support is a valid use case. We can add that support
when data flow is clear and HW support is available.




> 
> >
Wang Xiang Oct. 14, 2019, 1:59 p.m. UTC | #18
On Fri, Sep 27, 2019 at 02:35:00PM +0000, Jerin Jacob Kollanukkaran wrote:
> > -----Original Message-----
> > From: Wang Xiang <xiang.w.wang@intel.com>
> > 
> > Hi Jerin,
> > 
> > Thanks for your response. More comments below and inline.
> > 
> > 1) I think the size of some variables (e.g. nb_matches, scan_size, matching
> > offset, etc) should be increased based on what Hyperscan supports.
> > 
> >     a) struct rte_regex_ops:
> > 
> >         uint16_t scan_size => uint32_t scan_size
> 
> I think packet buffers will not be > 64K, and getting more than 64K of contiguous
> DMA-able memory will be difficult in DPDK.
> Other than that, rte_regex_match is 64 bits now; increasing the width of
> len could increase the size of "rte_regex_match", i.e. need more
> bandwidth for the response.
> Could other HW implementations share their views on the max length
> supported by their implementation? Based on that we can decide.
>
OK, let's gather ideas from HW implementation.
> 
> >         uint8_t nb_actual_matches => uint64_t nb_actual_matches
> >         uint8_t nb_matches => uint64_t nb_matches
> 
> 2^64 matches will never be possible in a practical system. How about 2^16?
>
I think the number of matches depends on the total number of rules and the
scan size. Based on the definitions (16-bit nb_rules_per_group,
16-bit nb_groups and 16-bit scan size), the maximum possible number of matches
could exceed 2^16. Users may get only partial matches in this case, whereas
Hyperscan doesn't make compromises. It would also be good to check other HW
implementations.
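
The arithmetic behind that claim can be sketched as follows, assuming (as a simplification) that each rule can report at most one match per end offset in the scanned buffer. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Upper bound on matches for one scan, assuming every rule may match
 * once at every byte position: groups * rules_per_group * scan_size.
 * Three 16-bit factors fit comfortably in 64 bits.
 */
static uint64_t
sketch_max_matches(uint16_t nb_groups, uint16_t nb_rules_per_group,
		   uint16_t scan_size)
{
	uint64_t nb_rules = (uint64_t)nb_groups * nb_rules_per_group;

	return nb_rules * scan_size;
}

/* Does a match count fit in an unsigned counter of counter_bits bits? */
static int
sketch_fits_counter(uint64_t matches, unsigned int counter_bits)
{
	assert(counter_bits >= 1 && counter_bits <= 63);
	return matches < (UINT64_C(1) << counter_bits);
}
```

With all three fields at their 16-bit maximum the bound is (2^16 - 1)^3 = 281462092005375, far above what an 8-bit or 16-bit match counter can hold, though still well below 2^64.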
>
> > 
> >     b) struct rte_regex_match:
> >         uint16_t offset => uint32_t offset
> >         uint16_t len => uint32_t len
> 
> See above.
> 
> > 
> >     c) uint16_t
> >         rte_regex_rule_db_update(uint8_t dev_id, const struct rte_regex_rule
> > *rules,
> >                                  uint16_t nb_rules);
> >     =>
> >        uint32_t
> >         rte_regex_rule_db_update(uint8_t dev_id, const struct rte_regex_rule
> > *rules,
> >                                  uint32_t nb_rules);
> 
> OK. I will change it next version.
> 
> > 
> >     d) int
> >     rte_regex_queue_pair_setup(uint8_t dev_id, uint8_t queue_pair_id,
> >                     const struct rte_regex_qp_conf *qp_conf);
> >     =>
> >        int
> >     rte_regex_queue_pair_setup(uint8_t dev_id, uint16_t queue_pair_id,
> >                     const struct rte_regex_qp_conf *qp_conf);
> 
> OK. I will change it next version.
> 
> > 
> >     e) struct rte_regex_dev_config:
> >         uint8_t nb_max_matches => uint64_t nb_max_matches
> 
> 2^64 matches will never be possible in a practical system. How about 2^16?
>
See above.
>
> > 
> >     f) struct rte_regex_dev_info:
> >         uint8_t max_matches => uint64_t max_matches
> 
> 2^64 matches will never be possible in a practical system. How about 2^16?
>
See above.
>
> > 
> > 2) There are rte_regex_dev_attr_get() and rte_regex_dev_attr_set() defined.
> > Can all the attributes below be set by users? Are any of them read-only?
> 
> See below,
> 
> > /** Enumerates RegEx device attribute identifier */
> > enum rte_regex_dev_attr_id {
> >     RTE_REGEX_DEV_ATTR_SOCKET_ID,
> >     /**< The NUMA socket id to which the device is connected or
> >      * a default of zero if the socket could not be determined.
> >      * datatype: *int*
> >      * operation: *get*
> 
> *get* means read-only. *get* and *set* means it supports both operations.
> 
> >      */
> >     RTE_REGEX_DEV_ATTR_MAX_MATCHES,
> >     /**< Maximum number of matches per scan.
> >      * datatype: *uint8_t*
> >      * operation: *get* and *set*
> >      *
> >      * @see RTE_REGEX_OPS_RSP_MAX_MATCH_F
> >      */
> >     RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT,
> >     /**< Upper bound scan time in ns.
> >      * datatype: *uint16_t*
> >      * operation: *get* and *set*
> >      *
> >      * @see RTE_REGEX_OPS_RSP_MAX_SCAN_TIMEOUT_F
> >      */
> >     RTE_REGEX_DEV_ATTR_MAX_PREFIX,
> >     /**< Maximum number of prefix detected per scan.
> >      * This would be useful for denial of service detection.
> >      * datatype: *uint16_t*
> >      * operation: *get* and *set*
> >      *
> >      * @see RTE_REGEX_OPS_RSP_MAX_PREFIX_F
> >      */
> > };
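
A minimal sketch of the *get* versus *get* and *set* distinction described for these attributes. The enum, the per-attribute "settable" table, and the choice of -ENOTSUP for a write to a get-only attribute are all assumptions made for illustration; the RFC text quoted here does not specify the error code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

enum sketch_attr_id {
	SKETCH_ATTR_SOCKET_ID,   /* operation: get */
	SKETCH_ATTR_MAX_MATCHES, /* operation: get and set */
	SKETCH_ATTR_NB
};

struct sketch_dev {
	uint32_t attr[SKETCH_ATTR_NB];
};

/* Which attributes accept a set operation. */
static const uint8_t sketch_attr_settable[SKETCH_ATTR_NB] = {
	[SKETCH_ATTR_SOCKET_ID] = 0,
	[SKETCH_ATTR_MAX_MATCHES] = 1,
};

static int
sketch_attr_get(const struct sketch_dev *dev, enum sketch_attr_id id,
		uint32_t *value)
{
	assert(dev != NULL && value != NULL);
	if ((unsigned int)id >= SKETCH_ATTR_NB)
		return -EINVAL;
	*value = dev->attr[id];
	return 0;
}

static int
sketch_attr_set(struct sketch_dev *dev, enum sketch_attr_id id, uint32_t value)
{
	assert(dev != NULL);
	if ((unsigned int)id >= SKETCH_ATTR_NB)
		return -EINVAL;
	if (!sketch_attr_settable[id])
		return -ENOTSUP; /* get-only attribute */
	dev->attr[id] = value;
	return 0;
}
```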
> > 
> > 3) Both RTE_REGEX_PCRE_RULE_* and
> > RTE_REGEX_DEV_PCRE_UNSUP_* can be viewed as device capabilities. Can we
> > merge them with RTE_REGEX_DEV_CAPA_RUNTIME_COMPILATION_F and have
> > a unified regex_dev_capa in struct rte_regex_dev_info.
> 
> Sure. I will fix it next version.
> 
> > 
> > 
> > 4) It'll be good if we can also define synchronous matching API for users who
> > want to have a one-off scan and wait for the results.
> 
> Makes sense. I will add a synchronous matching API in the next version (I understand it will be useful for SW
> implementations). Probably expose an INFO flag to indicate it as a preference.
> 
> > 
> > On Tue, Sep 10, 2019 at 08:05:39AM +0000, Jerin Jacob Kollanukkaran wrote:
> > > Hi Xiang,
> > >
> > > Sorry for delay in response(Was busy with 19.11 proposal deadline). Please
> > see inline.
> > >
> > > >
> > > > Reply to Xiang's queries in main thread:
> > > >
> > > > Hi all,
> > > >
> > > > Some questions regarding APIs. Could you please give more insights?
> > > >
> > > > 1) rte_regex_ops
> > > >       a) rsp_flags
> > > >       These two flags RTE_REGEX_OPS_RSP_PMI_SOJ_F and
> > > > RTE_REGEX_OPS_RSP_PMI_EOJ_F are used for cross buffer scan.
> > > >       RTE_REGEX_OPS_RSP_PMI_EOJ_F tells whether we have a partial
> > > > match at the end of current buffer after scan.
> > > >       What's the purpose of having RTE_REGEX_OPS_RSP_PMI_SOJ_F?
> > > >
> > > > [Jerin] Since we need three states to represent partial match
> > > > buffer, RTE_REGEX_OPS_RSP_PMI_SOJ_F to represent start of the
> > > > buffer, intermediate buffers with no flag, and end of the buffer
> > > > with RTE_REGEX_OPS_RSP_PMI_EOJ
> > >
> > > > [Xiang] How could a user leverage these flags for matching? Suppose
> > > > a large buffer is divided into multiple chunks. Will
> > > > RTE_REGEX_OPS_RSP_PMI_SOJ_F cause an early quit if it isn't set
> > > > after scanning the first chunk? Similarly, does RTE_REGEX_OPS_RSP_PMI_EOJ
> > > > tell a user whether to stop matching future buffers after finishing the
> > > > last chunk?
> > >
> > > Let me describe with an example,
> > >
> > > Assume,
> > > 1) struct rte_regex_dev_info:: max_payload_size set to 1024
> > > 2) rte_regex_dev_config:: dev_cfg_flags configured with
> > > RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
> > > 3) Device programmed with matching "hello\s+world" pattern
> > > 4) user enqueue struct rte_regex_ops:: buf_addr point following "data"
> > > and struct rte_regex_op:: scan_size = 1024
> > >
> > > data[0..1021] = data that doesn't have the hello world pattern
> > > data[1022] = 'h'
> > > data[1023] = 'e'
> > >
> > > 5) user enqueue struct rte_regex_ops:: buf_addr point following "data"
> > > and struct rte_regex_op:: scan_size = 9
> > >
> > > data[0] = 'l'
> > > data[1] = 'l'
> > > data[2] = 'o'
> > > data[3] = ' '
> > > data[4] = 'w'
> > > data[5] = 'o'
> > > data[6] = 'r'
> > > data[7] = 'l'
> > > data[8] = 'd'
> > >
> > > If so,
> > >
> > > Response to 4) will have RTE_REGEX_OPS_RSP_PMI_SOJ_F set in
> > > rte_regex_ops::rsp_flags on dequeue, where rte_regex_match::offset is
> > > 1022 and len is 2.
> > >
> > > Response to 5) will have RTE_REGEX_OPS_RSP_PMI_EOJ_F set in
> > > rte_regex_ops::rsp_flags on dequeue, where rte_regex_match::offset is
> > > 0 and len is 9.
> > >
> > If the defined pattern is "hello.*world" instead of "hello\s+world", and we
> > enqueue following struct rte_regex_ops:
> > 
> > 1) rte_regex_op:: scan_size = 1024
> > 
> >    data[0..1021] = data that does not contain the hello world pattern
> >    data[1022] = 'h'
> >    data[1023] = 'e'
> > 
> > 2) rte_regex_op:: scan_size = 9
> >    data[0] = 'l'
> >    data[1] = 'l'
> >    data[2] = 'o'
> >    data[3] = ' '
> >    data[4] = 'w'
> >    data[5] = 'o'
> >    data[6] = 'r'
> >    data[7] = 'l'
> >    data[8] = 'd'
> > 
> > 3) rte_regex_op:: scan_size = 5
> >    data[0] = 'w'
> >    data[1] = 'o'
> >    data[2] = 'r'
> >    data[3] = 'l'
> >    data[4] = 'd'
> > 
> > Will response to 3) have RTE_REGEX_OPS_RSP_PMI_EOJ_F in rte_regex_ops::
> > rsp_flags on dequeue
> > Where rte_regex_match:: offset is 0 and len 4?
> 
> Yes.
> 
> > 
> > I am wondering what's your expected behavior for .* or similar syntax and if
> > there are syntax compatibility issues. We report all matches in Hyperscan, e.g.
> > report end match offsets 11 and 16 for pattern "hello.*world" and corpus
> > "hello worldworld".
> > 
> > BTW, not sure how other hardware devices handle cross buffer scan. Hyperscan
> > doesn't report matches for start and intermediate buffers but only reports end
> > offset if a full match is found.
> > 
> > >
> > > >
> > > >       RTE_REGEX_OPS_RSP_MAX_PREFIX_F: This looks like a definition
> > > > for a specific hardware implementation. I am wondering what this
> > > > PREFIX refers to:)?
> > > >
> > > > [Jerin] Yes. Looks like it is for hardware specific implementation.
> > > > Introduced rte_regex_dev_attr_set/get functions to make it portable
> > > > and to add new implementation-specific fields.
> > > > For example, if a rule is
> > > > /ABCDEF.*XYZ/, ABCD is considered the prefix, and EF.*XYZ is
> > > > considered the factor. The prefix is a literal string, while the
> > > > factor can contain complex regular expression constructs. As a
> > > > result, rule matching occurs in two stages: prefix matching and
> > > > factor matching.
> > > >
> > > >       b)  user_id or user_ptr
> > > >       Under what kind of circumstances should an application pass
> > > > value into these variables for enqueue and dequeue operations?
> > > >
> > > > [Jerin] Just like rte_crypto_ops, struct rte_regex_ops is also
> > > > normally allocated from a mempool. On enqueue, the user can specify
> > > > user_id, if needed, in order to identify the op on dequeue.
> > > > The use case could be to store the sequence number from the
> > > > application POV or to store the mbuf ptr for which the pattern match
> > > > is requested, etc.
> > > >
> > > >
> > > >  2) rte_regex_match
> > > >       a) offset; /**< Starting Byte Position for matched rule. */
> > > > and  uint16_t len; /**< Length of match in bytes */
> > > >       Looks like the matching offset is defined as *starting
> > > > matching offset* instead of *end matching offset*, e.g. report the
> > > > offset of "a" instead of "c" for pattern "abc".
> > > >       If so, this makes it hard to integrate software regex
> > > > libraries such as Hyperscan and RE2 as they only report *end
> > > > matching offset* without length of match.
> > > >       Although Hyperscan has API for *starting matching offset*, it
> > > > only delivers partial syntax support. So I think we have to define
> > > > *end of matching offset* for software solutions.
> > > >
> > > > [Jerin] I understand the hyperscan's HS_FLAG_SOM_LEFTMOST tradeoffs.
> > > > I thought the application would always need the length of the match.
> > > > We will probably see how other HW implementations (from Mellanox,
> > > > etc.) handle it. We will try to abstract it; probably we can make it
> > > > a function of "user requested".
> > > > [Xiang] Yes, it will be good to make it per user request. At least
> > > > from Hyperscan user's point of view, start of match and match length
> > > > are not mandatory.
> > >
> > > OK. I think we can introduce RTE_REGEX_DEV_CFG_MATCH_AS_START in
> > > device configure.
> > >
> > > Since offset+len == end, we can introduce following generic inline function.
> > >
> > > static inline uint32_t
> > > rte_regex_match_end(const struct rte_regex_match *match) {
> > > 	return (uint32_t)match->offset + match->len;
> > > }
> > >
> > > Example:  pattern to match is  "hello\s+world"  and data is following
> > > data[4] = 'h'
> > > data[5] = 'e'
> > > data[6] = 'l'
> > > data[7] = 'l'
> > > data[8] = 'o'
> > > data[9] = ' '
> > > data[10] = 'w'
> > > data[11] = 'o'
> > > data[12] = 'r'
> > > data[13] = 'l'
> > > data[14] = 'd'
> > >
> > > if device is configured with RTE_REGEX_DEV_CFG_MATCH_AS_START
> > > match->offset returns 4
> > > match->len returns 11
> > >
> > > if device is NOT configured with RTE_REGEX_DEV_CFG_MATCH_AS_START
> > > driver MAY return the following(in hyperscan case)
> > > match->offset returns 0
> > > match->len returns 11 + 4
> > >
> > > In both cases (irrespective of flags, to make the application's life
> > > easy) rte_regex_match_end() would return 15.
> > > If the application demands MATCH_AS_START, then the driver can return
> > > match->offset 4 and match->len 11, i.e. set HS_FLAG_SOM_LEFTMOST in
> > > the hyperscan driver. But the application should use
> > > rte_regex_match_end() for finding the end of the match, so that it
> > > works in all cases.
> > >
> > > Is it OK?
> > >
> > Can we replace len with end offset? So we can change "offset" to "start_offset"
> > and len to "end_offset" in struct rte_regex_match. Users interested in len
> > could take "end_offset - start_offset".
> > We may also change RTE_REGEX_DEV_CFG_MATCH_AS_START to
> > RTE_REGEX_DEV_CFG_MATCH_START
> > 
> > In your example,
> > if device is configured with RTE_REGEX_DEV_CFG_MATCH_START
> > match->start_offset returns 4
> > match->end_offset returns 15
> > 
> > if device is NOT configured with RTE_REGEX_DEV_CFG_MATCH_START
> > match->start_offset returns 0
> > match->end_offset returns 15
> 
> 
> This part is a little tricky, as HW descriptions need to be rewritten on response.
> This is one issue I foresaw earlier: coming up with an rte_regex_match
> that works for all implementations without a performance issue.
> 
> We have two HW implementations; both return start_off and len.
> Let's get input from other HW implementations on the semantics of
> rte_regex_match. Based on that, we can decide how to go about it.
> Thoughts from Mellanox or other vendors?
>
Sure. Let's get more inputs on this.
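
To make the two conventions discussed above concrete, here is a minimal sketch
in C. It assumes a stand-in struct (demo_match) carrying only the offset and
len fields; the demo_* names are hypothetical and are not part of the RFC.
It shows that offset + len gives the same end of match (15) whether or not
the start of match is reported, and how an offset/len pair maps onto the
proposed start_offset/end_offset pair.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the offset/len fields of struct rte_regex_match. */
struct demo_match {
	uint16_t offset; /* start of the reported span, in bytes */
	uint16_t len;    /* length of the reported span, in bytes */
};

/* The alternative layout proposed in the thread (names are assumptions). */
struct demo_match_range {
	uint32_t start_offset;
	uint32_t end_offset;
};

/*
 * End of match: one past the last matched byte. Widened to 32 bits so that
 * offset + len cannot wrap around a 16-bit value.
 */
static inline uint32_t
demo_match_end(const struct demo_match *m)
{
	assert(m != NULL);
	return (uint32_t)m->offset + m->len;
}

/* Convert an offset/len pair into a start/end pair. */
static inline struct demo_match_range
demo_match_to_range(const struct demo_match *m)
{
	struct demo_match_range r = { m->offset, demo_match_end(m) };
	return r;
}
```

With the "hello\s+world" example above, { .offset = 4, .len = 11 } (start of
match reported) and { .offset = 0, .len = 15 } (start of match not reported)
both give an end of 15, so an application that only uses the end of match
does not need to know which mode the driver runs in.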
> 
> 
> > 
> > > >
> > > > 3)  rte_regex_rule_db_update()
> > > >     Does this mean we can dynamically add or delete rules for an
> > > > already generated database without recompile from scratch for
> > > > hardware Regex implementation?
> > > >     If so, this isn't possible for software solutions as they don't
> > > > support dynamic database update and require recompile.
> > > >
> > > > [Jerin] rte_regex_rule_db_update() would internally call the
> > > > recompile function for both HW and SW.
> > > > See rte_regex_dev_config::rule_db in rte_regex_dev_configure() for
> > > > precompiled rule database case.
> > > > [Xiang] OK, sounds like we have to save the original rule-set for
> > > > the device in order to do recompile. I see both ADD and REMOVE
> > > > operators from rte_regex_rule.
> > > > For rules with REMOVE operator, what's the expected behavior to
> > > > handle them for the old rule-set? Do we need to go through the old
> > > > rule-set and remove corresponding rules before doing recompile?
> > >
> > > Yes.
> > >
> > I think it'll be better to change rte_regex_rule_db_update() to
> > rte_regex_rule_compile() and have users provide a full rule-set.
> > So we don't have to maintain old rule-set and decide which one to keep and
> > remove. We can simply recompile new rule-set and get rid of
> > rte_regex_rule_op in this case.
> 
> 
> On virtualized HW implementations, the rule database is maintained by a single
> body. So the above scheme works with both SW and HW implementations.
> And it makes the user's life easy, as they don't need to maintain the rules.
> 
> I don't have preference on the rte_regex_rule_db_update() name, I can change to
> rte_regex_rule_compile() if required keeping above functionality. Let me know.
> 
>
OK, I'm good if you are willing to maintain it for users. Then both
rte_regex_rule_db_update() and rte_regex_rule_compile() work for me.
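
For reference, a rough sketch in C of what "maintaining the rule-set for
users" could look like on the software side: ADD and REMOVE operations are
applied to a retained rule table, after which the whole table would be
recompiled. All demo_* names, the fixed table size and the replace-on-duplicate
behaviour are assumptions for illustration only; none of them come from the
RFC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical rule operation and rule layout. */
enum demo_rule_op { DEMO_RULE_ADD, DEMO_RULE_REMOVE };

struct demo_rule {
	enum demo_rule_op op;
	uint32_t rule_id;
	const char *pcre; /* pattern text; unused for REMOVE */
};

#define DEMO_MAX_RULES 64

/* The rule-set retained between updates so that it can be recompiled. */
struct demo_rule_set {
	struct demo_rule rules[DEMO_MAX_RULES];
	size_t nb_rules;
};

/* Apply a single ADD or REMOVE; returns 0 on success, -1 on failure. */
static int
demo_rule_set_apply(struct demo_rule_set *set, const struct demo_rule *r)
{
	size_t i;

	assert(set != NULL && r != NULL);

	/* Look for an existing rule with the same ID. */
	for (i = 0; i < set->nb_rules; i++)
		if (set->rules[i].rule_id == r->rule_id)
			break;

	if (r->op == DEMO_RULE_REMOVE) {
		if (i == set->nb_rules)
			return -1; /* no such rule */
		/* Fill the hole with the last rule; order is not preserved. */
		set->rules[i] = set->rules[--set->nb_rules];
		return 0;
	}

	/* ADD: replace a rule with the same ID, or append a new one. */
	if (i == set->nb_rules) {
		if (set->nb_rules == DEMO_MAX_RULES)
			return -1; /* table full */
		set->nb_rules++;
	}
	set->rules[i] = *r;
	return 0;
}

/* Apply a batch of updates; a real driver would recompile afterwards. */
static int
demo_rule_db_update(struct demo_rule_set *set,
		    const struct demo_rule *rules, size_t nb_rules)
{
	size_t i;

	for (i = 0; i < nb_rules; i++)
		if (demo_rule_set_apply(set, &rules[i]) < 0)
			return -1;
	/* Placeholder: recompile set->rules[0..nb_rules-1] from scratch here. */
	return 0;
}
```

The point of the sketch is only that the retained table is what gets
recompiled, so the application never has to resubmit the full rule-set.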
> 
> 
> 
> 
> 
>

Patch

diff --git a/config/common_base b/config/common_base
index e406e7836..986093d6e 100644
--- a/config/common_base
+++ b/config/common_base
@@ -746,6 +746,11 @@  CONFIG_RTE_LIBRTE_PMD_DPAA2_QDMA_RAWDEV=n
 #
 CONFIG_RTE_LIBRTE_PMD_IFPGA_RAWDEV=y
 
+#
+# Compile regex device support
+#
+CONFIG_RTE_LIBRTE_REGEXDEV=y
+
 #
 # Compile librte_ring
 #
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index 715248dd1..a0bc27ae4 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -26,6 +26,7 @@  The public API headers are grouped by topics:
   [event_timer_adapter]    (@ref rte_event_timer_adapter.h),
   [event_crypto_adapter]   (@ref rte_event_crypto_adapter.h),
   [rawdev]             (@ref rte_rawdev.h),
+  [regexdev]           (@ref rte_regexdev.h),
   [metrics]            (@ref rte_metrics.h),
   [bitrate]            (@ref rte_bitrate.h),
   [latency]            (@ref rte_latencystats.h),
diff --git a/doc/api/doxy-api.conf.in b/doc/api/doxy-api.conf.in
index b9896cb63..7adb821bb 100644
--- a/doc/api/doxy-api.conf.in
+++ b/doc/api/doxy-api.conf.in
@@ -53,6 +53,7 @@  INPUT                   = @TOPDIR@/doc/api/doxy-api-index.md \
                           @TOPDIR@/lib/librte_rawdev \
                           @TOPDIR@/lib/librte_rcu \
                           @TOPDIR@/lib/librte_reorder \
+                          @TOPDIR@/lib/librte_regexdev \
                           @TOPDIR@/lib/librte_ring \
                           @TOPDIR@/lib/librte_sched \
                           @TOPDIR@/lib/librte_security \
diff --git a/lib/Makefile b/lib/Makefile
index 791e0d991..57de9691a 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -44,6 +44,8 @@  DEPDIRS-librte_eventdev := librte_eal librte_ring librte_ethdev librte_hash \
                            librte_mempool librte_timer librte_cryptodev
 DIRS-$(CONFIG_RTE_LIBRTE_RAWDEV) += librte_rawdev
 DEPDIRS-librte_rawdev := librte_eal librte_ethdev
+DIRS-$(CONFIG_RTE_LIBRTE_REGEXDEV) += librte_regexdev
+DEPDIRS-librte_regexdev := librte_eal
 DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += librte_vhost
 DEPDIRS-librte_vhost := librte_eal librte_mempool librte_mbuf librte_ethdev \
 			librte_net
diff --git a/lib/librte_regexdev/Makefile b/lib/librte_regexdev/Makefile
new file mode 100644
index 000000000..723b4b28c
--- /dev/null
+++ b/lib/librte_regexdev/Makefile
@@ -0,0 +1,23 @@ 
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(C) 2019 Marvell International Ltd.
+#
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# library name
+LIB = librte_regexdev.a
+
+# library version
+LIBABIVER := 1
+
+# build flags
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+
+# library source files
+SRCS-y += rte_regexdev.c
+
+# export include files
+SYMLINK-y-include += rte_regexdev.h
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_regexdev/rte_regexdev.c b/lib/librte_regexdev/rte_regexdev.c
new file mode 100644
index 000000000..e5be0f29c
--- /dev/null
+++ b/lib/librte_regexdev/rte_regexdev.c
@@ -0,0 +1,5 @@ 
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(C) 2019 Marvell International Ltd.
+ */
+
+#include <rte_regexdev.h>
diff --git a/lib/librte_regexdev/rte_regexdev.h b/lib/librte_regexdev/rte_regexdev.h
new file mode 100644
index 000000000..765da4aaa
--- /dev/null
+++ b/lib/librte_regexdev/rte_regexdev.h
@@ -0,0 +1,1247 @@ 
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(C) 2019 Marvell International Ltd.
+ */
+
+#ifndef _RTE_REGEXDEV_H_
+#define _RTE_REGEXDEV_H_
+
+/**
+ * @file
+ *
+ * RTE RegEx Device API
+ *
+ * Defines RTE RegEx Device APIs for RegEx operations and its provisioning.
+ *
+ * The RegEx Device API is composed of two parts:
+ *
+ * - The application-oriented RegEx API that includes functions to setup
+ *   a RegEx device (configure it, setup its queue pairs and start it),
+ *   update the rule database and so on.
+ *
+ * - The driver-oriented RegEx API that exports a function allowing
+ *   a RegEx poll Mode Driver (PMD) to simultaneously register itself as
+ *   a RegEx device driver.
+ *
+ * RegEx device components and definitions:
+ *
+ *     +-----------------+
+ *     |                 |
+ *     |                 o---------+    rte_regex_[en|de]queue_burst()
+ *     |   PCRE based    o------+  |               |
+ *     |  RegEx pattern  |      |  |  +--------+   |
+ *     | matching engine o------+--+--o        |   |    +------+
+ *     |                 |      |  |  | queue  |<==o===>|Core 0|
+ *     |                 o----+ |  |  | pair 0 |        |      |
+ *     |                 |    | |  |  +--------+        +------+
+ *     +-----------------+    | |  |
+ *            ^               | |  |  +--------+
+ *            |               | |  |  |        |        +------+
+ *            |               | +--+--o queue  |<======>|Core 1|
+ *        Rule|Database       |    |  | pair 1 |        |      |
+ *     +------+----------+    |    |  +--------+        +------+
+ *     |     Group 0     |    |    |
+ *     | +-------------+ |    |    |  +--------+        +------+
+ *     | | Rules 0..n  | |    |    |  |        |        |Core 2|
+ *     | +-------------+ |    |    +--o queue  |<======>|      |
+ *     |     Group 1     |    |       | pair 2 |        +------+
+ *     | +-------------+ |    |       +--------+
+ *     | | Rules 0..n  | |    |
+ *     | +-------------+ |    |       +--------+
+ *     |     Group 2     |    |       |        |        +------+
+ *     | +-------------+ |    |       | queue  |<======>|Core n|
+ *     | | Rules 0..n  | |    +-------o pair n |        |      |
+ *     | +-------------+ |            +--------+        +------+
+ *     |     Group n     |
+ *     | +-------------+ |<-------rte_regex_rule_db_update()
+ *     | | Rules 0..n  | |<-------rte_regex_rule_db_import()
+ *     | +-------------+ |------->rte_regex_rule_db_export()
+ *     +-----------------+
+ *
+ * RegEx: A regular expression is a concise and flexible means for matching
+ * strings of text, such as particular characters, words, or patterns of
+ * characters. A common abbreviation for this is “RegEx”.
+ *
+ * RegEx device: A hardware or software-based implementation of RegEx
+ * device API for PCRE based pattern matching syntax and semantics.
+ *
+ * PCRE RegEx syntax and semantics specification:
+ * http://regexkit.sourceforge.net/Documentation/pcre/pcrepattern.html
+ *
+ * RegEx queue pair: Each RegEx device should have one or more queue pairs to
+ * transmit a burst of pattern matching requests and receive a burst of
+ * pattern matching responses. The pattern matching request/response is
+ * embedded in the *rte_regex_ops* structure.
+ *
+ * Rule: A pattern matching rule expressed in PCRE RegEx syntax along with
+ * Match ID and Group ID to identify the rule upon the match.
+ *
+ * Rule database: The RegEx device accepts regular expressions and converts them
+ * into a compiled rule database that can then be used to scan data.
+ * Compilation allows the device to analyze the given pattern(s) and
+ * pre-determine how to scan for these patterns in an optimized fashion that
+ * would be far too expensive to compute at run-time. A rule database contains
+ * a set of rules that are compiled in device specific binary form.
+ *
+ * Match ID or Rule ID: A unique identifier provided at the time of rule
+ * creation for the application to identify the rule upon match.
+ *
+ * Group ID: Rules can be grouped under one group ID to enable
+ * rule isolation and effective pattern matching. It is a unique group
+ * identifier provided at the time of rule creation for the application to
+ * identify the rule upon match.
+ *
+ * Scan: A pattern matching request through *enqueue* API.
+ *
+ * A given RegEx device may not support all the features
+ * of PCRE. The application may probe unsupported features through
+ * struct rte_regex_dev_info::pcre_unsup_flags.
+ *
+ * By default, all the functions of the RegEx Device API exported by a PMD
+ * are lock-free functions which are assumed not to be invoked in parallel on
+ * different logical cores to work on the same target object. For instance,
+ * the dequeue function of a PMD cannot be invoked in parallel on two logical
+ * cores to operate on the same RegEx queue pair. This function can, however,
+ * be invoked in parallel by different logical cores on different queue pairs.
+ * It is the responsibility of the upper level application to enforce this rule.
+ *
+ * In all functions of the RegEx API, the RegEx device is
+ * designated by an integer >= 0 named the device identifier *dev_id*.
+ *
+ * At the RegEx driver level, RegEx devices are represented by a generic
+ * data structure of type *rte_regex_dev*.
+ *
+ * RegEx devices are dynamically registered during the PCI/SoC device probing
+ * phase performed at EAL initialization time.
+ * When a RegEx device is being probed, a *rte_regex_dev* structure and
+ * a new device identifier are allocated for that device. Then, the
+ * regex_dev_init() function supplied by the RegEx driver matching the probed
+ * device is invoked to properly initialize the device.
+ *
+ * The role of the device init function consists of resetting the hardware or
+ * software RegEx driver implementations.
+ *
+ * If the device init operation is successful, the correspondence between
+ * the device identifier assigned to the new device and its associated
+ * *rte_regex_dev* structure is effectively registered.
+ * Otherwise, both the *rte_regex_dev* structure and the device identifier are
+ * freed.
+ *
+ * The functions exported by the application RegEx API to setup a device
+ * designated by its device identifier must be invoked in the following order:
+ *     - rte_regex_dev_configure()
+ *     - rte_regex_queue_pair_setup()
+ *     - rte_regex_dev_start()
+ *
+ * Then, the application can invoke, in any order, the functions
+ * exported by the RegEx API to enqueue pattern matching jobs, dequeue pattern
+ * matching responses, get the stats, update the rule database,
+ * get/set device attributes and so on.
+ *
+ * If the application wants to change the configuration (i.e. call
+ * rte_regex_dev_configure() or rte_regex_queue_pair_setup()), it must call
+ * rte_regex_dev_stop() first to stop the device and then do the reconfiguration
+ * before calling rte_regex_dev_start() again. The enqueue and dequeue
+ * functions should not be invoked when the device is stopped.
+ *
+ * Finally, an application can close a RegEx device by invoking the
+ * rte_regex_dev_close() function.
+ *
+ * Each function of the application RegEx API invokes a specific function
+ * of the PMD that controls the target device designated by its device
+ * identifier.
+ *
+ * For this purpose, all device-specific functions of a RegEx driver are
+ * supplied through a set of pointers contained in a generic structure of type
+ * *regex_dev_ops*.
+ * The address of the *regex_dev_ops* structure is stored in the *rte_regex_dev*
+ * structure by the device init function of the RegEx driver, which is
+ * invoked during the PCI/SoC device probing phase, as explained earlier.
+ *
+ * In other words, each function of the RegEx API simply retrieves the
+ * *rte_regex_dev* structure associated with the device identifier and
+ * performs an indirect invocation of the corresponding driver function
+ * supplied in the *regex_dev_ops* structure of the *rte_regex_dev* structure.
+ *
+ * For performance reasons, the addresses of the fast-path functions of the
+ * RegEx driver are not contained in the *regex_dev_ops* structure.
+ * Instead, they are directly stored at the beginning of the *rte_regex_dev*
+ * structure to avoid an extra indirect memory access during their invocation.
+ *
+ * RTE RegEx device drivers do not use interrupts for enqueue or dequeue
+ * operation. Instead, RegEx drivers export Poll-Mode enqueue and dequeue
+ * functions to applications.
+ *
+ * The *enqueue* operation submits a burst of RegEx pattern matching requests
+ * to the RegEx device and the *dequeue* operation gets a burst of pattern
+ * matching responses for the ones submitted through the *enqueue* operation.
+ *
+ * Typical application utilisation of the RegEx device API will follow the
+ * following programming flow.
+ *
+ * - rte_regex_dev_configure()
+ * - rte_regex_queue_pair_setup()
+ * - rte_regex_rule_db_update() Needs to be invoked if a precompiled rule
+ *   database is not provided in rte_regex_dev_config::rule_db for
+ *   rte_regex_dev_configure() and/or the application needs to update it.
+ * - Create or reuse an existing mempool for *rte_regex_ops* objects.
+ * - rte_regex_dev_start()
+ * - rte_regex_enqueue_burst()
+ * - rte_regex_dequeue_burst()
+ *
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <rte_common.h>
+#include <rte_config.h>
+#include <rte_dev.h>
+#include <rte_errno.h>
+#include <rte_memory.h>
+
+/**
+ * Get the total number of RegEx devices that have been successfully
+ * initialised.
+ *
+ * @return
+ *   The total number of usable RegEx devices.
+ */
+uint8_t
+rte_regex_dev_count(void);
+
+/**
+ * Get the device identifier for the named RegEx device.
+ *
+ * @param name
+ *   RegEx device name to select the RegEx device identifier.
+ *
+ * @return
+ *   Returns RegEx device identifier on success.
+ *   - <0: Failure to find named RegEx device.
+ */
+int
+rte_regex_dev_get_dev_id(const char *name);
+
+/* Enumerates RegEx device capabilities */
+#define RTE_REGEX_DEV_CAPA_RUNTIME_COMPILATION_F (1ULL << 0)
+/**< RegEx device supports compiling the rules at runtime, as opposed to
+ * only loading the pre-built rule database using
+ * struct rte_regex_dev_config::rule_db in rte_regex_dev_configure()
+ * @see struct rte_regex_dev_config::rule_db, rte_regex_dev_configure()
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+
+/* Enumerates unsupported PCRE features for the RegEx device */
+#define RTE_REGEX_DEV_PCRE_UNSUP_START_ANCHOR_F (1ULL << 0)
+/**< RegEx device doesn't support PCRE Anchor to start of match flag.
+ * Example RegEx is '/\Gfoo\d/'. Here '\G' asserts position at the end of the
+ * previous match or the start of the string for the first match.
+ * This position will change each time the RegEx is applied to the subject
+ * string. If the RegEx is applied to 'foo1foo2Zfoo3' the first two matches will
+ * be successful for 'foo1foo2' and fail for 'Zfoo3'.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_ATOMIC_GROUPING_F (1ULL << 1)
+/**< RegEx device doesn't support PCRE Atomic grouping.
+ * Atomic groups are represented by '(?>)'. An atomic group is a group that,
+ * when the RegEx engine exits from it, automatically throws away all
+ * backtracking positions remembered by any tokens inside the group.
+ * Example RegEx is 'a(?>bc|b)c'. If the given strings are 'abc' and 'abcc',
+ * 'a(bc|b)c' matches both whereas 'a(?>bc|b)c' matches only 'abcc' because
+ * atomic groups don't allow backtracking to 'b'.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_BACKTRACKING_CTRL_F (1ULL << 2)
+/**< RegEx device doesn't support PCRE backtracking control verbs.
+ * Some examples of backtracking verbs are (*COMMIT), (*ACCEPT), (*FAIL),
+ * (*SKIP), (*PRUNE).
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_CALLOUTS_F (1ULL << 3)
+/**< RegEx device doesn't support PCRE callouts.
+ * PCRE supports calling an external function in between matches by using
+ * '(?C)'. Example RegEx is 'ABC(?C)D'. If the input is 'ABCD', the RegEx
+ * engine will parse 'ABC', perform a user-defined callout and return a
+ * successful match at 'D'.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_BACKREFERENCE_F (1ULL << 4)
+/**< RegEx device doesn't support PCRE backreference.
+ * Example RegEx is '(\2ABC|(GHI))+' \2 matches the same text as most recently
+ * matched by the 2nd capturing group i.e. 'GHI'.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_GREEDY_F (1ULL << 5)
+/**< RegEx device doesn't support PCRE Greedy mode.
+ * For example if the RegEx is 'AB\d*?' then '*?' represents zero or unlimited
+ * matches. In greedy mode the string 'AB12345' will be matched completely,
+ * whereas in ungreedy mode 'AB' will be returned as the match.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_LOOKAROUND_ASRT_F (1ULL << 6)
+/**< RegEx device doesn't support PCRE Lookaround assertions
+ * (Zero-width assertions). Example RegEx is '[a-z]+\d+(?=!{3,})' if
+ * the given pattern is 'dwad1234!' the RegEx engine doesn't report any matches
+ * because the assert '(?=!{3,})' fails. The pattern 'dwad123!!!' would return a
+ * successful match.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_MATCH_POINT_RST_F (1ULL << 7)
+/**< RegEx device doesn't support PCRE match point reset directive.
+ * Example RegEx is '[a-z]+\K\d+' if the pattern is 'dwad123'
+ * then even though the entire pattern matches only '123'
+ * is reported as a match.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_NEWLINE_CONVENTIONS_F (1ULL << 8)
+/**< RegEx device doesn't support PCRE newline convention.
+ * Newline conventions are represented as follows:
+ * (*CR)        carriage return
+ * (*LF)        linefeed
+ * (*CRLF)      carriage return, followed by linefeed
+ * (*ANYCRLF)   any of the three above
+ * (*ANY)       all Unicode newline sequences
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_NEWLINE_SEQ_F (1ULL << 9)
+/**< RegEx device doesn't support PCRE newline sequence.
+ * The escape sequence '\R' will match any newline sequence.
+ * It is equivalent to: '(?>\r\n|\n|\x0b|\f|\r|\x85)'.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_POSSESSIVE_QUALIFIERS_F (1ULL << 10)
+/**< RegEx device doesn't support PCRE possessive quantifiers.
+ * Examples of possessive quantifiers are '*+', '++', '?+', '{m,n}+'.
+ * A possessive quantifier repeats the token as many times as possible and does
+ * not give up matches as the engine backtracks. With a possessive quantifier,
+ * the deal is all or nothing.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_SUBROUTINE_REFERENCES_F (1ULL << 11)
+/**< RegEx device doesn't support PCRE Subroutine references.
+ * PCRE Subroutine references allow for sub patterns to be assessed
+ * as part of the RegEx. Example RegEx is '(foo|fuzz)\g<1>+bar', which matches
+ * the string 'foofoofuzzfoofuzzbar'.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_UTF_8_F (1ULL << 12)
+/**< RegEx device doesn't support UTF-8 character encoding.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_UTF_16_F (1ULL << 13)
+/**< RegEx device doesn't support UTF-16 character encoding.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_UTF_32_F (1ULL << 14)
+/**< RegEx device doesn't support UTF-32 character encoding.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_WORD_BOUNDARY_F (1ULL << 15)
+/**< RegEx device doesn't support word boundaries.
+ * The meta character '\b' represents word boundary anchor.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_FORWARD_REFERENCES_F (1ULL << 16)
+/**< RegEx device doesn't support Forward references.
+ * Forward references allow you to use a back reference to a group that appears
+ * later in the RegEx. Example RegEx is '(\3ABC|(DEF|(GHI)))+', which matches
+ * the string 'GHIGHIABCDEF'.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+/* Enumerates PCRE rule flags */
+#define RTE_REGEX_PCRE_RULE_ALLOW_EMPTY_F (1ULL << 0)
+/**< When this flag is set, patterns that can match against an empty string,
+ * such as '.*', are allowed.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_ANCHORED_F (1ULL << 1)
+/**< When this flag is set, the pattern is forced to be "anchored", that is, it
+ * is constrained to match only at the first matching point in the string that
+ * is being searched. Similar to '^' and represented by \A.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_CASELESS_F (1ULL << 2)
+/**< When this flag is set, letters in the pattern match both upper and lower
+ * case letters in the subject.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_DOTALL_F (1ULL << 3)
+/**< When this flag is set, a dot metacharacter in the pattern matches any
+ * character, including one that indicates a newline.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_DUPNAMES_F (1ULL << 4)
+/**< When this flag is set, names used to identify capture groups need not be
+ * unique.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_EXTENDED_F (1ULL << 5)
+/**< When this flag is set, most white space characters in the pattern are
+ * totally ignored except when escaped or inside a character class.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_MATCH_UNSET_BACKREF_F (1ULL << 6)
+/**< When this flag is set, a backreference to an unset capture group matches an
+ * empty string.
+ * @see RTE_REGEX_DEV_PCRE_UNSUP_FORWARD_REFERENCES_F
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_MULTILINE_F (1ULL << 7)
+/**< When this flag is set, the '^' and '$' constructs match immediately
+ * following or immediately before internal newlines in the subject string,
+ * respectively, as well as at the very start and end.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_NO_AUTO_CAPTURE_F (1ULL << 8)
+/**< When this flag is set, it disables the use of numbered capturing
+ * parentheses in the pattern. References to capture groups (backreferences or
+ * recursion/subroutine calls) may only refer to named groups, though the
+ * reference can be by name or by number.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_UCP_F (1ULL << 9)
+/**< By default, only ASCII characters are recognized. When this flag is set,
+ * Unicode properties are used instead to classify characters.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_UNGREEDY_F (1ULL << 10)
+/**< When this flag is set, the "greediness" of the quantifiers is inverted
+ * so that they are not greedy by default, but become greedy if followed by
+ * '?'.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_UTF_F (1ULL << 11)
+/**< When this flag is set, RegEx engine has to regard both the pattern and the
+ * subject strings that are subsequently processed as strings of UTF characters
+ * instead of single-code-unit strings.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_NEVER_BACKSLASH_C_F (1ULL << 12)
+/**< This Flag locks out the use of '\C' in