From patchwork Mon Nov 8 17:37:25 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Vladimir Medvedkin X-Patchwork-Id: 104017 X-Patchwork-Delegate: david.marchand@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 92D70A0C4E; Mon, 8 Nov 2021 18:50:51 +0100 (CET) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id EE14241121; Mon, 8 Nov 2021 18:50:44 +0100 (CET) Received: from mga07.intel.com (mga07.intel.com [134.134.136.100]) by mails.dpdk.org (Postfix) with ESMTP id 41426410E4 for ; Mon, 8 Nov 2021 18:50:42 +0100 (CET) X-IronPort-AV: E=McAfee;i="6200,9189,10162"; a="295726467" X-IronPort-AV: E=Sophos;i="5.87,218,1631602800"; d="scan'208";a="295726467" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Nov 2021 09:37:57 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.87,218,1631602800"; d="scan'208";a="491306984" Received: from silpixa00400072.ir.intel.com (HELO silpixa00400072.ger.corp.intel.com) ([10.237.222.91]) by orsmga007.jf.intel.com with ESMTP; 08 Nov 2021 09:37:56 -0800 From: Vladimir Medvedkin To: dev@dpdk.org Date: Mon, 8 Nov 2021 17:37:25 +0000 Message-Id: <20211108173727.133124-1-vladimir.medvedkin@intel.com> X-Mailer: git-send-email 2.25.1 MIME-Version: 1.0 Subject: [dpdk-dev] [PATCH 1/2] doc: add programmer's guide for the RIB library X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Currently, programmer's guide for the RIB library is missing. This commit adds it. Signed-off-by: Vladimir Medvedkin Reviewed-by: Conor Walsh --- doc/guides/prog_guide/img/rib_internals.svg | 148 +++++++++++++++++ doc/guides/prog_guide/img/rib_pic.svg | 152 +++++++++++++++++ doc/guides/prog_guide/index.rst | 1 + doc/guides/prog_guide/rib_lib.rst | 172 ++++++++++++++++++++ 4 files changed, 473 insertions(+) create mode 100644 doc/guides/prog_guide/img/rib_internals.svg create mode 100644 doc/guides/prog_guide/img/rib_pic.svg create mode 100644 doc/guides/prog_guide/rib_lib.rst diff --git a/doc/guides/prog_guide/img/rib_internals.svg b/doc/guides/prog_guide/img/rib_internals.svg new file mode 100644 index 0000000000..7e40046cba --- /dev/null +++ b/doc/guides/prog_guide/img/rib_internals.svg @@ -0,0 +1,148 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Intermediate node + + + + Node containing a route + + + + 10.15.0.0/23 + + + + 10.0.0.0/29 + + + + 10.0.0.128/25 + + + + 10.0.0.160/27 + + + + + + + Routes falling under common 10.0.0.0/24 prefix + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/doc/guides/prog_guide/img/rib_pic.svg b/doc/guides/prog_guide/img/rib_pic.svg new file mode 100644 index 0000000000..8a8b176502 --- /dev/null +++ b/doc/guides/prog_guide/img/rib_pic.svg @@ -0,0 +1,152 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Next hop A + + + + Next hop B + + + diff --git a/doc/guides/prog_guide/index.rst b/doc/guides/prog_guide/index.rst index 89af28dacb..0a53ec71c8 100644 --- a/doc/guides/prog_guide/index.rst +++ b/doc/guides/prog_guide/index.rst @@ -38,6 +38,7 @@ Programmer's Guide member_lib lpm_lib lpm6_lib + rib_lib flow_classify_lib packet_distrib_lib reorder_lib diff --git a/doc/guides/prog_guide/rib_lib.rst b/doc/guides/prog_guide/rib_lib.rst new file mode 100644 index 0000000000..e8bb35a2b6 --- /dev/null +++ b/doc/guides/prog_guide/rib_lib.rst @@ -0,0 +1,172 @@ +.. SPDX-License-Identifier: BSD-3-Clause + Copyright(c) 2021 Intel Corporation. + +RIB Library +=========== + +The Routing Information Base (RIB) library provides a data store for routing information. +This library is intended for use in control or management plane applications. +There are more suitable libraries for use in data plane applications such as +:ref:`LPM ` or FIB. + + +Implementation details +---------------------- + +RIB implements a key-value store for routing information. + +Routing information is represented by a prefix (key) and a next hop ID (value). +The prefix type depends on the address family. IPv4 addresses are represented by +``uint32_t`` values. IPv6 addresses are represented as ``uint8_t[16]`` values. +Next hop IDs are represented by ``uint64_t`` values. + +.. note:: + + The API and implementation are very similar for IPv4's ``rib`` library and IPv6's ``rib6`` + library, therefore only the ``rib`` library will be discussed here. + Everything within this document except for the size of the prefixes is applicable to the + ``rib6`` library. + +Internally RIB is represented as a binary tree as shown in :numref:`figure_rib_internals`: + +.. _figure_rib_internals: + +.. figure:: img/rib_internals.* + + RIB internals overview + +The binary tree consists of two types of nodes: + +* Actual Routes. + +* Intermediate Nodes which are used internally to preserve the binary tree structure. + + +RIB API Overview +---------------- + +RIB has two configuration parameters: + +* The maximum number of nodes. + +* The size of the extension block within each node. This space is used to store + additional user defined data. + +The main methods within the ``rib`` library are: + +* ``rte_rib_insert()``: Add new routes. + +* ``rte_rib_remove()``: Delete an existing route. + +* ``rte_rib_lookup()``: Lookup an IP in the structure using longest match. + +* ``rte_rib_lookup_exact()``: Lookup an IP in the structure using exact match. + +* ``rte_rib_lookup_parent()``: Find a parent prefix within the structure. + +* ``rte_rib_get_nxt()``: Traverse a subtree within the structure. + +Given a RIB structure with the routes depicted in :numref:`figure_rib_internals`, +here are several usage examples: + +* The best route for ``10.0.0.1`` can be found by calling: + +.. code-block:: c + + struct rte_rib_node *route = rte_rib_lookup(rib, RTE_IPV4(10,0,0,1)); + +This returns an ``rte_rib_node`` pointing to the ``10.0.0.0/29`` prefix. + +* To find an exact match route: + +.. code-block:: c + + struct rte_rib_node *route = rte_rib_lookup_exact(rib, RTE_IPV4(10,0,0,128), 25); + +This returns an ``rte_rib_node`` pointing to the ``10.0.0.128/25`` prefix. + +and + +.. code-block:: c + + struct rte_rib_node *route = rte_rib_lookup_exact(rib, RTE_IPV4(10,0,0,0), 24); + +This returns ``NULL`` as no exact match can be found. + +* To retrieve a group of routes under the common prefix ``10.0.0.0/24`` + (yellow triangle in :numref:`figure_rib_internals`): + +.. code-block:: c + + struct rte_rib_node *route = NULL; + do { + route = rte_rib_get_nxt(rib, RTE_IPV4(10,0,0,0), 24, route, RTE_RIB_GET_NXT_ALL); + } while (route != NULL) + +This returns 3 ``rte_rib_node`` nodes pointing to ``10.0.0.0/29``, ``10.0.0.160/27`` +and ``10.0.0.128/25``. + +Extensions usage example +------------------------ + +Extensions can be used for a wide range of tasks. +By default, an ``rte_rib_node`` node contains only crucial information such as the prefix and +next hop ID, but it doesn't contain protocol specific information such as +metrics, administrative distance and other routing protocol information. +These examples are application specific data and the user can decide what to keep +and how it is stored within the extension memory region in each ``rte_rib_node``. + +It is possible to implement a prefix independent convergence using the RIB extension feature. +If the routing daemon can provide a feasible next hop ID along with a best (active) next hop ID, +it is possible to react to a neighbour failing relatively fast. +Consider a RIB with a number of routes with different next hops (A and B) as +shown in :numref:`figure_rib_pic`. Every route can have a feasible next hop +provided by the routing daemon. + +.. _figure_rib_pic: + +.. figure:: img/rib_pic.* + + RIB prefix independent convergence + +In case of a next hop failure, we need to replace an active failed next hop with a +feasible next hop for every corresponding route without waiting for the routing daemon +recalculation process to complete. +To achieve this we can link all existing routes with the same active next hop in a linked list, +saving the feasible next hop ID and a pointer inside the extension space of each ``rte_rib_node``. + +.. code-block:: c + + struct my_route_ext { + struct rte_rib_node *next; + uint64_t feasible_nh; + }; + + struct rte_rib_conf conf; + conf.ext_sz = sizeof(struct my_route_ext); + rib = rte_rib_create("test", 0, &conf); + ... + /* routing daemon task */ + struct rte_rib_node *route = rte_rib_insert(rib, RTE_IPV4(192,0,2,0), 24); + rte_rib_set_nh(route, active_nh_from_rd); + struct my_route_ext *ext = rte_rib_get_ext(route); + ext->feasible_nh = feasible_nh_from_rd; + list_insert(nh_table[active_nh_from_rd].list_head, route); + ... + /* dataplane monitoring thread */ + /* nexthop id fail_nh fails */ + route = NULL; + do { + route = get_next(nh_table[fail_nh].list_head, route); + uint32_t ip; + uint8_t depth; + rte_rib_get_ip(route, &ip); + rte_rib_get_depth(route, &depth); + ext = rte_rib_get_ext(route); + uint64_t new_nh = ext->feasible_nh; + /* do update to the dataplane, for example to the fib */ + rte_fib_add(fib, ip, depth, new_nh); + /* update nexthop if necessary */ + rte_rib_set_nh(route, new_nh); + } while (route != NULL); + From patchwork Mon Nov 8 17:37:26 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Vladimir Medvedkin X-Patchwork-Id: 104016 X-Patchwork-Delegate: david.marchand@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 862C2A0C4E; Mon, 8 Nov 2021 18:50:43 +0100 (CET) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 6CE66410FB; Mon, 8 Nov 2021 18:50:43 +0100 (CET) Received: from mga07.intel.com (mga07.intel.com [134.134.136.100]) by mails.dpdk.org (Postfix) with ESMTP id E5213410FB for ; Mon, 8 Nov 2021 18:50:40 +0100 (CET) X-IronPort-AV: E=McAfee;i="6200,9189,10162"; a="295726472" X-IronPort-AV: E=Sophos;i="5.87,218,1631602800"; d="scan'208";a="295726472" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Nov 2021 09:37:58 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.87,218,1631602800"; d="scan'208";a="491306995" Received: from silpixa00400072.ir.intel.com (HELO silpixa00400072.ger.corp.intel.com) ([10.237.222.91]) by orsmga007.jf.intel.com with ESMTP; 08 Nov 2021 09:37:57 -0800 From: Vladimir Medvedkin To: dev@dpdk.org Date: Mon, 8 Nov 2021 17:37:26 +0000 Message-Id: <20211108173727.133124-2-vladimir.medvedkin@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20211108173727.133124-1-vladimir.medvedkin@intel.com> References: <20211108173727.133124-1-vladimir.medvedkin@intel.com> MIME-Version: 1.0 Subject: [dpdk-dev] [PATCH 2/2] doc: add programmer's guide for the FIB library X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Currently, programmer's guide for the FIB library is missing. This commit adds it. Signed-off-by: Vladimir Medvedkin Reviewed-by: Conor Walsh --- doc/guides/prog_guide/fib_lib.rst | 139 +++++++++++++++++++++ doc/guides/prog_guide/img/dir_24_8_alg.svg | 136 ++++++++++++++++++++ doc/guides/prog_guide/index.rst | 1 + doc/guides/prog_guide/rib_lib.rst | 2 +- 4 files changed, 277 insertions(+), 1 deletion(-) create mode 100644 doc/guides/prog_guide/fib_lib.rst create mode 100644 doc/guides/prog_guide/img/dir_24_8_alg.svg diff --git a/doc/guides/prog_guide/fib_lib.rst b/doc/guides/prog_guide/fib_lib.rst new file mode 100644 index 0000000000..4aa02e1ed9 --- /dev/null +++ b/doc/guides/prog_guide/fib_lib.rst @@ -0,0 +1,139 @@ +.. SPDX-License-Identifier: BSD-3-Clause + Copyright(c) 2021 Intel Corporation. + +.. _FIB_Library: + +FIB Library +=========== + +The FIB library provides a fast Longest Prefix Match (LPM) search for 32-bit +keys or 128-bit for IPv6. It can be used in a variety of applications, +the most typical of which is IPv4/IPv6 forwarding. + +.. note:: + + The API and implementation are very similar for IPv4's ``fib`` library and + IPv6's ``fib6`` library, therefore only the ``fib`` library will be discussed here. + Everything within this document except for the size of the prefixes applies + to the ``rib6`` library. + +FIB API Overview +---------------- + +The main configuration struct contains: + +* Type of :ref:`dataplane algorithm `. + +* Default next hop ID. + +* The maximum number of routes. + +* Settings for the data algorithm (:ref:`will be discussed later `). + +A FIB rule consists of a prefix and an associated next hop ID. The prefix consists +of an IPv4 network address (``uint32_t``) and the corresponding prefix length. +The prefix serves as the key and the next hop ID as the value while doing an LPM +search within FIB. The size of the next hop ID is variable and must be configured +during initialization. + +The main methods within the ``fib`` library are: + +* ``rte_fib_add()``: Add a new route with a corresponding next hop ID to the + table or update the next hop ID if the prefix already exists in a table. + +* ``rte_fib_delete()``: Delete an existing route from the table. + +* ``rte_fib_lookup_bulk()``: Provides a bulk Longest Prefix Match (LPM) lookup function + for a set of IP addresses, it will return a set of corresponding next hop IDs. + + +Implementation details +---------------------- + +Internally FIB contains the ``rte_rib`` data struct to help maintain the dataplane struct. +The dataplane struct is opaque, so that users can add their own algorithm implementations. + + +.. _fib_dataplane_algorithms: + +Dataplane Algorithms +-------------------- + +Dummy +~~~~~ + +This algorithm uses ``rte_rib`` as a dataplane struct. Lookups are relatively slow, +but extra memory isn't used for the dataplane struct. This algorithm should only +be used for testing and debugging purposes. + +This algorithm will be used if the ``RTE_FIB_DUMMY`` type is configured as the +dataplane algorithm on FIB creation. + + +DIR-24-8 +~~~~~~~~ + +The current implementation of this algorithm uses a variation of the DIR-24-8 +algorithm that trades memory usage for improved LPM lookup speed. +This algorithm allows the lookup operation to be performed using only a single +memory read access in most cases. In the statistically rare case where the best +match rule has a depth larger than 24, the lookup operation will require two +memory read accesses. + +This algorithm will be used if the ``RTE_FIB_DIR24_8`` type is configured as the +dataplane algorithm on FIB creation. + +The main FIB configuration struct stores the dataplane parameters inside ``dir24_8`` +within the ``rte_fib_conf`` and it consists of: + +* ``nh_sz``: The size of the entry containing the next hop ID. + This could be 1, 2, 4 or 8 bytes long. + Note that all available bits except one are used to store the actual next hop ID. + +* ``num_tbl8``: The number of tbl8 groups, each group consists of 256 entries + corresponding to the ``nh_sz`` size. + +The main elements of the dataplane struct for the DIR-24-8 algorithm are: + +* TBL24: An array with 2\ :sup:`24` entries, corresponding to the ``nh_sz`` size. + +* TBL8: An array of ``num_tbl8`` tbl8 groups. + +The lookup algorithms logic can be seen in :numref:`figure_dir_24_8_alg`: + +.. _figure_dir_24_8_alg: + +.. figure:: img/dir_24_8_alg.* + + DIR-24-8 Lookup algorithm + +The first table ``tbl24``, is indexed using the first 24 bits of the IP address to be looked up, +while the second table(s) ``tbl8``, is indexed using the last 8 bits of the IP address. +This means that depending on the outcome of trying to match the IP address of an incoming packet +to a rule stored in the tbl24 we might need to continue the lookup process in the second level. + +Since every entry of the tbl24 can potentially point to a tbl8, +ideally we would have 2\ :sup:`24` tbl8s, which would be the same as having a +single table with 2\ :sup:`32` entries. This is not feasible due to resource restrictions. +Instead, this approach takes advantage of the fact that rules longer than 24 bits are very rare. +By splitting the process into two different tables/levels and limiting the number of tbl8s, +we can greatly reduce memory consumption while maintaining a very good lookup speed. +This method generally results in one memory access per lookup. + +An entry in a tbl8 contains the following fields: + +* The next hop ID. + +* 1 bit indicating if the lookup should proceed inside the tbl8. + +Use cases +--------- + +The FIB library is useful for any use cases that rely on the Longest Prefix Match (LPM) +algorithm such as IP forwarding or packet classification. + +More complex use cases are also possible, as it is possible to have next hop IDs +which are 63 bits long (using ``RTE_FIB_DIR24_8_8B`` as a next hop size). +These use cases could include storing two next hop IDs inside the 63 bits of the next hop. +This may be useful to provide a fallback next hop ID, ASN or forwarding class +corresponding to a given prefix without having to perform an additional lookup. diff --git a/doc/guides/prog_guide/img/dir_24_8_alg.svg b/doc/guides/prog_guide/img/dir_24_8_alg.svg new file mode 100644 index 0000000000..17725b7b97 --- /dev/null +++ b/doc/guides/prog_guide/img/dir_24_8_alg.svg @@ -0,0 +1,136 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + uint[8,16,32,64]_t + + + + + + + nh_id + + + + + + + + + + + + + + + + + + + uint[8,16,32,64]_t + + + + nh_id + + + + + + + 24 + + + + 8 + + + + + + + Extended entry? + + + + Return nh_id + + + + + no + + + + + + + nh_id * 256 + ip & 0xff + + + + + + + + + + + yes + + + + + + + IPv4 Address + + + + tbl24 + + + + tbl8 + + + diff --git a/doc/guides/prog_guide/index.rst b/doc/guides/prog_guide/index.rst index 0a53ec71c8..42c4a690c6 100644 --- a/doc/guides/prog_guide/index.rst +++ b/doc/guides/prog_guide/index.rst @@ -38,6 +38,7 @@ Programmer's Guide member_lib lpm_lib lpm6_lib + fib_lib rib_lib flow_classify_lib packet_distrib_lib diff --git a/doc/guides/prog_guide/rib_lib.rst b/doc/guides/prog_guide/rib_lib.rst index e8bb35a2b6..0a6c3d7063 100644 --- a/doc/guides/prog_guide/rib_lib.rst +++ b/doc/guides/prog_guide/rib_lib.rst @@ -7,7 +7,7 @@ RIB Library The Routing Information Base (RIB) library provides a data store for routing information. This library is intended for use in control or management plane applications. There are more suitable libraries for use in data plane applications such as -:ref:`LPM ` or FIB. +:ref:`LPM ` or :ref:`FIB `. Implementation details