From patchwork Tue Jul 11 19:33:05 2017
X-Patchwork-Submitter: Vladimir Medvedkin
X-Patchwork-Id: 26815
X-Patchwork-Delegate: thomas@monjalon.net
From: Medvedkin Vladimir
To: dev@dpdk.org
Cc: Medvedkin Vladimir
Date: Tue, 11 Jul 2017 19:33:05 +0000
Message-Id: <1499801585-10031-2-git-send-email-medvedkinv@gmail.com>
In-Reply-To: <1499801585-10031-1-git-send-email-medvedkinv@gmail.com>
References: <1499801585-10031-1-git-send-email-medvedkinv@gmail.com>
Subject: [dpdk-dev] [PATCH] lib/rib: Add Routing Information Base library

RIB is an alternative to the current LPM library. It solves the following problems:
 - It speeds up control plane operations, such as adding and deleting routes, compared to LPM.
 - It abstracts the control plane from the dataplane algorithm, so other IP route lookup algorithms (DXR, poptrie, LPC-trie, etc.) can be added in addition to the current dir24_8.
 - User-defined, application-specific data can be kept in struct rte_rib_v4_node, which represents a route entry: a next hop or a set of next hops (i.e. active and feasible), pointers that link rte_rib_v4_node entries by some criterion (i.e. next_hop), or any other control plane information.
 - The dir24_8 implementation can drop the rte_lpm_tbl_entry.depth field, which saves 6 bits.
 - The new dir24_8 implementation supports different next hop sizes (1/2/4/8 bytes per next hop).
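A minimal usage sketch (illustrative only -- the names come from the headers
in this patch; the prefix, next hop value and sizing are made up, and error
handling is elided):

	struct rte_rib_conf conf = {
		.type = RTE_RIB_DIR24_8_4B,	/* 4-byte FIB next hops */
		.max_nodes = 1 << 16,
		.node_sz = sizeof(struct rte_rib_v4_node) +
			sizeof(struct rte_rib_simple_ext),
		.number_tbl8s = 256,		/* tbl8 groups for routes > /24 */
		.alloc_type = RTE_RIB_MEMPOOL,
	};
	uint64_t next_hop;
	struct rte_rib *rib;

	rib = rte_rib_v4_create("example", SOCKET_ID_ANY, &conf);
	if (rib != NULL) {
		/* 10.0.0.0/24 -> next hop 5, via the modify callback */
		rib->modify_cb(rib, 0x0a000000, 24, 5, RTE_RIB_OP_ADD);
		/* dataplane lookup for 10.0.0.1 */
		rib->lookup_cb(rib->fib, 0x0a000001, &next_hop);
	}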
Signed-off-by: Medvedkin Vladimir
---
 config/common_base           |   6 +
 doc/api/doxy-api.conf        |   1 +
 lib/Makefile                 |   2 +
 lib/librte_rib/Makefile      |  43 ++++
 lib/librte_rib/rte_dir24_8.c | 411 +++++++++++++++++++++++++++++++++++++++
 lib/librte_rib/rte_dir24_8.h | 144 ++++++++++++++
 lib/librte_rib/rte_rib.c     | 454 ++++++++++++++++++++++++++++++++++++++++
 lib/librte_rib/rte_rib.h     | 260 +++++++++++++++++++++++++
 8 files changed, 1321 insertions(+)
 create mode 100644 lib/librte_rib/Makefile
 create mode 100644 lib/librte_rib/rte_dir24_8.c
 create mode 100644 lib/librte_rib/rte_dir24_8.h
 create mode 100644 lib/librte_rib/rte_rib.c
 create mode 100644 lib/librte_rib/rte_rib.h

diff --git a/config/common_base b/config/common_base
index 8ae6e92..1b0921f 100644
--- a/config/common_base
+++ b/config/common_base
@@ -615,6 +615,12 @@ CONFIG_RTE_LIBRTE_LPM=y
 CONFIG_RTE_LIBRTE_LPM_DEBUG=n
 
 #
+# Compile librte_rib
+#
+CONFIG_RTE_LIBRTE_RIB=y
+CONFIG_RTE_LIBRTE_RIB_DEBUG=n
+
+#
 # Compile librte_acl
 #
 CONFIG_RTE_LIBRTE_ACL=y
diff --git a/doc/api/doxy-api.conf b/doc/api/doxy-api.conf
index 823554f..f80c8d5 100644
--- a/doc/api/doxy-api.conf
+++ b/doc/api/doxy-api.conf
@@ -54,6 +54,7 @@ INPUT = doc/api/doxy-api-index.md \
                           lib/librte_kvargs \
                           lib/librte_latencystats \
                           lib/librte_lpm \
+                          lib/librte_rib \
                           lib/librte_mbuf \
                           lib/librte_mempool \
                           lib/librte_meter \
diff --git a/lib/Makefile b/lib/Makefile
index 86caba1..943c383 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -61,6 +61,8 @@ DIRS-$(CONFIG_RTE_LIBRTE_EFD) += librte_efd
 DEPDIRS-librte_efd := librte_eal librte_ring librte_hash
 DIRS-$(CONFIG_RTE_LIBRTE_LPM) += librte_lpm
 DEPDIRS-librte_lpm := librte_eal
+DIRS-$(CONFIG_RTE_LIBRTE_RIB) += librte_rib
+DEPDIRS-librte_rib := librte_eal
 DIRS-$(CONFIG_RTE_LIBRTE_ACL) += librte_acl
 DEPDIRS-librte_acl := librte_eal
 DIRS-$(CONFIG_RTE_LIBRTE_NET) += librte_net
diff --git a/lib/librte_rib/Makefile b/lib/librte_rib/Makefile
new file mode 100644
index 0000000..8d434a3
--- /dev/null
+++ b/lib/librte_rib/Makefile
@@ -0,0 +1,43 @@
+# BSD LICENSE
+#
+# Copyright(c) 2017 Vladimir Medvedkin
+# All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions
+# are met:
+#
+#   * Redistributions of source code must retain the above copyright
+#     notice, this list of conditions and the following disclaimer.
+#   * Redistributions in binary form must reproduce the above copyright
+#     notice, this list of conditions and the following disclaimer in
+#     the documentation and/or other materials provided with the
+#     distribution.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# library name
+LIB = librte_rib.a
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR)
+
+# all sources are stored in SRCS-y
+SRCS-$(CONFIG_RTE_LIBRTE_RIB) := rte_rib.c rte_dir24_8.c
+
+# install these header files
+SYMLINK-$(CONFIG_RTE_LIBRTE_RIB)-include := rte_rib.h rte_dir24_8.h
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_rib/rte_dir24_8.c b/lib/librte_rib/rte_dir24_8.c
new file mode 100644
index 0000000..915df3d
--- /dev/null
+++ b/lib/librte_rib/rte_dir24_8.c
@@ -0,0 +1,411 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ *   * Redistributions of source code must retain the above copyright
+ *     notice, this list of conditions and the following disclaimer.
+ *   * Redistributions in binary form must reproduce the above copyright
+ *     notice, this list of conditions and the following disclaimer in
+ *     the documentation and/or other materials provided with the
+ *     distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdint.h>
+#include <stdlib.h>
+#include <string.h>
+#include <errno.h>
+
+#include <rte_debug.h>
+#include <rte_malloc.h>
+
+#include <rte_rib.h>
+#include <rte_dir24_8.h>
+
+#define ROUNDUP(x, y) ((((x - 1) >> (32 - y)) + 1) << (32 - y))
+#define RTE_DIR24_8_GET_TBL24_P(fib, ip) \
+	((void *)&((uint8_t *)fib->tbl24)[(ip & \
+	RTE_DIR24_8_TBL24_MASK) >> (8 - fib->nh_sz)])
+
+int
+rte_dir24_8_lookup_1b(void *fib_p, uint32_t ip, uint64_t *next_hop)
+{
+	struct rte_dir24_8_tbl *fib = (struct rte_dir24_8_tbl *)fib_p;
+	uint32_t tbl24_index = (ip >> 8);
+
+	RTE_RIB_RETURN_IF_TRUE(((fib_p == NULL) ||
+		(next_hop == NULL)), -EINVAL);
+
+	RTE_DIR24_8_LOOKUP(uint8_t);
+
+	return (tbl_entry & RTE_DIR24_8_LOOKUP_SUCCESS) ? 0 : -ENOENT;
+}
+
+int
+rte_dir24_8_lookup_2b(void *fib_p, uint32_t ip, uint64_t *next_hop)
+{
+	struct rte_dir24_8_tbl *fib = (struct rte_dir24_8_tbl *)fib_p;
+	uint32_t tbl24_index = (ip >> 8);
+
+	RTE_RIB_RETURN_IF_TRUE(((fib_p == NULL) ||
+		(next_hop == NULL)), -EINVAL);
+
+	RTE_DIR24_8_LOOKUP(uint16_t);
+
+	return (tbl_entry & RTE_DIR24_8_LOOKUP_SUCCESS) ? 0 : -ENOENT;
+}
+
+int
+rte_dir24_8_lookup_4b(void *fib_p, uint32_t ip, uint64_t *next_hop)
+{
+	struct rte_dir24_8_tbl *fib = (struct rte_dir24_8_tbl *)fib_p;
+	uint32_t tbl24_index = (ip >> 8);
+
+	RTE_RIB_RETURN_IF_TRUE(((fib_p == NULL) ||
+		(next_hop == NULL)), -EINVAL);
+
+	RTE_DIR24_8_LOOKUP(uint32_t);
+
+	return (tbl_entry & RTE_DIR24_8_LOOKUP_SUCCESS) ? 0 : -ENOENT;
+}
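+
+/*
+ * Illustrative sketch of the entry encoding used by RTE_DIR24_8_LOOKUP
+ * (the values below are hypothetical): every tbl24/tbl8 entry keeps the
+ * payload in the upper bits and two flag bits at the bottom, so decoding
+ * an entry of any width reduces to:
+ *
+ *	uint64_t entry = ...;	       read from tbl24 or tbl8
+ *	int ext = (entry & RTE_DIR24_8_VALID_EXT_ENT) ==
+ *		RTE_DIR24_8_VALID_EXT_ENT;   entry points to a tbl8 group
+ *	int hit = entry & RTE_DIR24_8_LOOKUP_SUCCESS;
+ *	uint64_t nh = entry >> 2;      next hop, or tbl8 group index
+ *					when ext is set
+ */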
+
+int
+rte_dir24_8_lookup_8b(void *fib_p, uint32_t ip, uint64_t *next_hop)
+{
+	struct rte_dir24_8_tbl *fib = (struct rte_dir24_8_tbl *)fib_p;
+	uint32_t tbl24_index = (ip >> 8);
+
+	RTE_RIB_RETURN_IF_TRUE(((fib_p == NULL) ||
+		(next_hop == NULL)), -EINVAL);
+
+	RTE_DIR24_8_LOOKUP(uint64_t);
+
+	return (tbl_entry & RTE_DIR24_8_LOOKUP_SUCCESS) ? 0 : -ENOENT;
+}
+
+static int
+tbl8_alloc(struct rte_dir24_8_tbl *fib)
+{
+	uint32_t i;
+	uint8_t *ptr8 = (uint8_t *)fib->tbl8;
+	uint16_t *ptr16 = (uint16_t *)fib->tbl8;
+	uint32_t *ptr32 = (uint32_t *)fib->tbl8;
+	uint64_t *ptr64 = (uint64_t *)fib->tbl8;
+
+	if (fib->cur_tbl8s >= fib->number_tbl8s)
+		return -ENOSPC;
+
+	switch (fib->nh_sz) {
+	case RTE_DIR24_8_1B:
+		for (i = 0; i < fib->number_tbl8s; i++) {
+			if ((ptr8[i * RTE_DIR24_8_TBL8_GRP_NUM_ENT] &
+					RTE_DIR24_8_VALID_EXT_ENT) !=
+					RTE_DIR24_8_VALID_EXT_ENT) {
+				fib->cur_tbl8s++;
+				return i;
+			}
+		}
+		break;
+	case RTE_DIR24_8_2B:
+		for (i = 0; i < fib->number_tbl8s; i++) {
+			if ((ptr16[i * RTE_DIR24_8_TBL8_GRP_NUM_ENT] &
+					RTE_DIR24_8_VALID_EXT_ENT) !=
+					RTE_DIR24_8_VALID_EXT_ENT) {
+				fib->cur_tbl8s++;
+				return i;
+			}
+		}
+		break;
+	case RTE_DIR24_8_4B:
+		for (i = 0; i < fib->number_tbl8s; i++) {
+			if ((ptr32[i * RTE_DIR24_8_TBL8_GRP_NUM_ENT] &
+					RTE_DIR24_8_VALID_EXT_ENT) !=
+					RTE_DIR24_8_VALID_EXT_ENT) {
+				fib->cur_tbl8s++;
+				return i;
+			}
+		}
+		break;
+	case RTE_DIR24_8_8B:
+		for (i = 0; i < fib->number_tbl8s; i++) {
+			if ((ptr64[i * RTE_DIR24_8_TBL8_GRP_NUM_ENT] &
+					RTE_DIR24_8_VALID_EXT_ENT) !=
+					RTE_DIR24_8_VALID_EXT_ENT) {
+				fib->cur_tbl8s++;
+				return i;
+			}
+		}
+		break;
+	}
+	return -ENOSPC;
+}
+
+static void
+write_to_fib(void *ptr, uint64_t val, enum rte_dir24_8_nh_sz size, int n)
+{
+	int i;
+	uint8_t *ptr8 = (uint8_t *)ptr;
+	uint16_t *ptr16 = (uint16_t *)ptr;
+	uint32_t *ptr32 = (uint32_t *)ptr;
+	uint64_t *ptr64 = (uint64_t *)ptr;
+
+	switch (size) {
+	case RTE_DIR24_8_1B:
+		for (i = 0; i < n; i++)
+			ptr8[i] = (uint8_t)val;
+		break;
+	case RTE_DIR24_8_2B:
+		for (i = 0; i < n; i++)
+			ptr16[i] = (uint16_t)val;
+		break;
+	case RTE_DIR24_8_4B:
+		for (i = 0; i < n; i++)
+			ptr32[i] = (uint32_t)val;
+		break;
+	case RTE_DIR24_8_8B:
+		for (i = 0; i < n; i++)
+			ptr64[i] = (uint64_t)val;
+		break;
+	}
+}
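+
+/*
+ * Worked example (a sketch for illustration only): installing
+ * 10.10.0.0/22 covers the tbl24 index range [0x0a0a00, 0x0a0a04), so
+ * write_to_fib() replicates the encoded next hop over
+ * (redge - ledge) >> 8 == 4 consecutive tbl24 entries. A /26 route
+ * covers less than one tbl24 entry, so it instead expands into 64
+ * consecutive entries of a tbl8 group referenced from its tbl24 entry.
+ */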
+
+static int
+install_to_fib(struct rte_dir24_8_tbl *fib, uint32_t ledge, uint32_t redge,
+	uint64_t next_hop, int valid)
+{
+	int tbl8_idx;
+	uint64_t tbl24_tmp;
+	uint8_t *tbl8_ptr;
+
+	if (ROUNDUP(ledge, 24) <= redge) {
+		if (ledge < ROUNDUP(ledge, 24)) {
+			tbl24_tmp = RTE_DIR24_8_GET_TBL24(fib, ledge);
+			if ((tbl24_tmp & RTE_DIR24_8_VALID_EXT_ENT) !=
+					RTE_DIR24_8_VALID_EXT_ENT) {
+				tbl8_idx = tbl8_alloc(fib);
+				if (tbl8_idx < 0)
+					return tbl8_idx;
+				tbl8_ptr = (uint8_t *)fib->tbl8 +
+					((tbl8_idx *
+					RTE_DIR24_8_TBL8_GRP_NUM_ENT) <<
+					fib->nh_sz);
+				/* Init tbl8 entries with nexthop from tbl24 */
+				write_to_fib((void *)tbl8_ptr, tbl24_tmp|
+					RTE_DIR24_8_VALID_EXT_ENT, fib->nh_sz,
+					RTE_DIR24_8_TBL8_GRP_NUM_ENT);
+				/* update dir24 entry with tbl8 index */
+				write_to_fib(RTE_DIR24_8_GET_TBL24_P(fib, ledge),
+					(tbl8_idx << 2)|
+					RTE_DIR24_8_VALID_EXT_ENT|
+					valid, fib->nh_sz, 1);
+			} else
+				tbl8_idx = tbl24_tmp >> 2;
+			tbl8_ptr = (uint8_t *)fib->tbl8 +
+				(((tbl8_idx * RTE_DIR24_8_TBL8_GRP_NUM_ENT) +
+				(ledge & ~RTE_DIR24_8_TBL24_MASK)) <<
+				fib->nh_sz);
+			/* update tbl8 with the new next hop */
+			write_to_fib((void *)tbl8_ptr, (next_hop << 2)|
+				RTE_DIR24_8_VALID_EXT_ENT|valid,
+				fib->nh_sz, ROUNDUP(ledge, 24) - ledge);
+		}
+		if (ROUNDUP(ledge, 24) < (redge & RTE_DIR24_8_TBL24_MASK)) {
+			write_to_fib(RTE_DIR24_8_GET_TBL24_P(fib,
+				ROUNDUP(ledge, 24)), (next_hop << 2)|valid,
+				fib->nh_sz, ((redge & RTE_DIR24_8_TBL24_MASK) -
+				ROUNDUP(ledge, 24)) >> 8);
+		}
+		if (redge & ~RTE_DIR24_8_TBL24_MASK) {
+			tbl24_tmp = RTE_DIR24_8_GET_TBL24(fib, redge);
+			if ((tbl24_tmp & RTE_DIR24_8_VALID_EXT_ENT) !=
+					RTE_DIR24_8_VALID_EXT_ENT) {
+				tbl8_idx = tbl8_alloc(fib);
+				if (tbl8_idx < 0)
+					return tbl8_idx;
+				tbl8_ptr = (uint8_t *)fib->tbl8 +
+					((tbl8_idx *
+					RTE_DIR24_8_TBL8_GRP_NUM_ENT) <<
+					fib->nh_sz);
+				/* Init tbl8 entries with nexthop from tbl24 */
+				write_to_fib((void *)tbl8_ptr, tbl24_tmp|
+					RTE_DIR24_8_VALID_EXT_ENT, fib->nh_sz,
+					RTE_DIR24_8_TBL8_GRP_NUM_ENT);
+				/* update dir24 entry with tbl8 index */
+				write_to_fib(RTE_DIR24_8_GET_TBL24_P(fib, redge),
+					(tbl8_idx << 2)|
+					RTE_DIR24_8_VALID_EXT_ENT|
+					valid, fib->nh_sz, 1);
+			} else
+				tbl8_idx = tbl24_tmp >> 2;
+			tbl8_ptr = (uint8_t *)fib->tbl8 +
+				((tbl8_idx * RTE_DIR24_8_TBL8_GRP_NUM_ENT) <<
+				fib->nh_sz);
+			/* update tbl8 with the new next hop */
+			write_to_fib((void *)tbl8_ptr, (next_hop << 2)|
+				RTE_DIR24_8_VALID_EXT_ENT|valid,
+				fib->nh_sz, redge & ~RTE_DIR24_8_TBL24_MASK);
+		}
+	} else {
+		tbl24_tmp = RTE_DIR24_8_GET_TBL24(fib, ledge);
+		if ((tbl24_tmp & RTE_DIR24_8_VALID_EXT_ENT) !=
+				RTE_DIR24_8_VALID_EXT_ENT) {
+			tbl8_idx = tbl8_alloc(fib);
+			if (tbl8_idx < 0)
+				return tbl8_idx;
+			tbl8_ptr = (uint8_t *)fib->tbl8 +
+				((tbl8_idx * RTE_DIR24_8_TBL8_GRP_NUM_ENT) <<
+				fib->nh_sz);
+
+			/* Init tbl8 entries with nexthop from tbl24 */
+			write_to_fib((void *)tbl8_ptr, tbl24_tmp|
+				RTE_DIR24_8_VALID_EXT_ENT, fib->nh_sz,
+				RTE_DIR24_8_TBL8_GRP_NUM_ENT);
+			/* update dir24 entry with tbl8 index */
+			write_to_fib(RTE_DIR24_8_GET_TBL24_P(fib, ledge),
+				(tbl8_idx << 2)|
+				RTE_DIR24_8_VALID_EXT_ENT|
+				valid, fib->nh_sz, 1);
+		} else
+			tbl8_idx = tbl24_tmp >> 2;
+		tbl8_ptr = (uint8_t *)fib->tbl8 +
+			(((tbl8_idx * RTE_DIR24_8_TBL8_GRP_NUM_ENT) +
+			(ledge & ~RTE_DIR24_8_TBL24_MASK)) <<
+			fib->nh_sz);
+		/* update tbl8 with the new next hop */
+		write_to_fib((void *)tbl8_ptr, (next_hop << 2)|
+			RTE_DIR24_8_VALID_EXT_ENT|valid,
+			fib->nh_sz, redge - ledge);
+	}
+	return 0;
+}
+
+int
+rte_dir24_8_modify(struct rte_rib *rib, uint32_t ip, uint8_t mask_len,
+	uint64_t next_hop, enum rte_rib_op rib_op)
+{
+	int ret, valid = RTE_DIR24_8_LOOKUP_SUCCESS;
+	struct rte_rib_v4_node *ent, *tmp;
+	struct rte_rib_simple_ext *ext, *tmp_ext;
+	struct rte_dir24_8_tbl *fib;
+	uint32_t ledge, redge;
+	int tbl8_idx;
+	uint8_t *tbl8_ptr;
+
+	if ((rib == NULL) || (mask_len > RTE_RIB_V4_MAXDEPTH))
+		return -EINVAL;
+
+	fib = rib->fib;
+	ip &= (uint32_t)(UINT64_MAX << (32 - mask_len));
+
+	ent = rte_rib_v4_lookup_exact(rib, ip, mask_len);
+	if (rib_op == RTE_RIB_OP_ADD) {
+		if (ent)
+			return -EEXIST;
+		ent = rte_rib_v4_insert(rib, ip, mask_len);
+	}
+	if (ent == NULL)
+		return -ENOENT;
+	ext = (struct rte_rib_simple_ext *)ent->ext;
+
+	switch (rib_op) {
+	case RTE_RIB_OP_DEL:
+		tmp = rte_rib_v4_lookup_parent(ent);
+		if (tmp == NULL) {
+			valid = 0;
+			break;
+		}
+		tmp_ext = (struct rte_rib_simple_ext *)tmp->ext;
+		if (ext->next_hop == tmp_ext->next_hop)
+			goto delete_finish;
+		next_hop = tmp_ext->next_hop;
+		break;
+	case RTE_RIB_OP_MODIFY:
+		if (ext->next_hop == next_hop)
+			return 0;
+		/* keep the RIB entry in sync with the FIB update below */
+		ext->next_hop = next_hop;
+		break;
+	case RTE_RIB_OP_ADD:
+		ext->next_hop = next_hop;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	tmp = NULL;
+	ledge = ip;
+	do {
+		tmp = rte_rib_v4_get_next_child(rib, ip, mask_len, tmp,
+			GET_NXT_COVER);
+		if (tmp != NULL) {
+			if (tmp->mask_len == mask_len)
+				continue;
+			redge = tmp->key;
+			if
(ledge == redge) { + ledge = redge + + (uint32_t)(1ULL << (32 - tmp->mask_len)); + continue; + } + ret = install_to_fib(rib->fib, ledge, redge, + next_hop, valid); + if (ret != 0) { + if ((ret == -ENOSPC) && (mask_len > 24)) + rte_rib_v4_remove(rib, ip, mask_len); + return ret; + } + ledge = redge + + (uint32_t)(1ULL << (32 - tmp->mask_len)); + } else { + redge = ip + (uint32_t)(1ULL << (32 - mask_len)); + ret = install_to_fib(rib->fib, ledge, redge, + next_hop, valid); + if (ret != 0) { + if ((ret == -ENOSPC) && (mask_len > 24)) + rte_rib_v4_remove(rib, ip, mask_len); + return ret; + } + } + } while (tmp); + + if (rib_op == RTE_RIB_OP_DEL) { + /*tbl8 recycle check*/ +delete_finish: + rte_rib_v4_remove(rib, ip, mask_len); + if (mask_len > 24) { + tmp = rte_rib_v4_get_next_child(rib, + (ip & RTE_DIR24_8_TBL24_MASK), 24, NULL, + GET_NXT_ALL); + if (tmp == NULL) { + ent = rte_rib_v4_lookup(rib, ip); + if (ent) { + ext = (struct rte_rib_simple_ext *)ent->ext; + next_hop = ext->next_hop << 2| + RTE_DIR24_8_LOOKUP_SUCCESS; + } else { + next_hop = 0; + } + tbl8_idx = RTE_DIR24_8_GET_TBL24(fib, ip) >> 2; + tbl8_ptr = (uint8_t *)fib->tbl8 + + ((tbl8_idx * RTE_DIR24_8_TBL8_GRP_NUM_ENT) << + fib->nh_sz); + /*update tbl24*/ + write_to_fib(RTE_DIR24_8_GET_TBL24_P(fib, ip), + next_hop, fib->nh_sz, 1); + /*Cleanup tbl8*/ + write_to_fib((void *)tbl8_ptr, 0, fib->nh_sz, + RTE_DIR24_8_TBL8_GRP_NUM_ENT); + } + } + } + return 0; +} diff --git a/lib/librte_rib/rte_dir24_8.h b/lib/librte_rib/rte_dir24_8.h new file mode 100644 index 0000000..080cd6e --- /dev/null +++ b/lib/librte_rib/rte_dir24_8.h @@ -0,0 +1,144 @@ +/*- + * BSD LICENSE + * + * Copyright(c) 2010-2014 Intel Corporation. All rights reserved. + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#ifndef _RTE_DIR24_8_H_ +#define _RTE_DIR24_8_H_ + +/** + * @file + * RTE Longest Prefix Match (LPM) + */ + +#ifdef __cplusplus +extern "C" { +#endif + +/** @internal Total number of tbl24 entries. */ +#define RTE_DIR24_8_TBL24_NUM_ENT (1 << 24) + +/** Maximum depth value possible for IPv4 LPM. */ +#define RTE_DIR24_8_MAX_DEPTH 32 + +/** @internal Number of entries in a tbl8 group. 
*/
+#define RTE_DIR24_8_TBL8_GRP_NUM_ENT 256
+
+/** @internal bitmask with valid and valid_group fields set */
+#define RTE_DIR24_8_VALID_EXT_ENT 0x03
+
+/** Bitmask used to indicate successful lookup */
+#define RTE_DIR24_8_LOOKUP_SUCCESS 0x01
+
+#define RTE_DIR24_8_TBL24_MASK 0xffffff00
+
+/** Size of nexthop is (1 << nh_sz) bytes */
+enum rte_dir24_8_nh_sz {
+	RTE_DIR24_8_1B,
+	RTE_DIR24_8_2B,
+	RTE_DIR24_8_4B,
+	RTE_DIR24_8_8B
+};
+
+#define DIR24_8_TBL_IDX(a) ((a) >> (3 - fib->nh_sz))
+#define DIR24_8_PSD_IDX(a) ((a) & ((1 << (3 - fib->nh_sz)) - 1))
+#define DIR24_8_BITS_IN_NH (8 * (1 << fib->nh_sz))
+
+#define DIR24_8_TBL24_VAL(ip) (ip >> 8)
+#define DIR24_8_TBL8_VAL(res, ip) \
+	((res >> 2) * RTE_DIR24_8_TBL8_GRP_NUM_ENT + (uint8_t)ip)
+
+#define DIR24_8_LOOKUP_MSK \
+	((((uint64_t)1 << ((1 << (fib->nh_sz + 3)) - 1)) << 1) - 1)
+
+#define RTE_DIR24_8_GET_TBL24(fib, ip) \
+	((fib->tbl24[DIR24_8_TBL_IDX(DIR24_8_TBL24_VAL(ip))] >> \
+	(DIR24_8_PSD_IDX(DIR24_8_TBL24_VAL(ip)) * \
+	DIR24_8_BITS_IN_NH)) & DIR24_8_LOOKUP_MSK)
+
+#define RTE_DIR24_8_GET_TBL8(fib, res, ip) \
+	((fib->tbl8[DIR24_8_TBL_IDX(DIR24_8_TBL8_VAL(res, ip))] >> \
+	(DIR24_8_PSD_IDX(DIR24_8_TBL8_VAL(res, ip)) * \
+	DIR24_8_BITS_IN_NH)) & DIR24_8_LOOKUP_MSK)
+
+#define RTE_DIR24_8_LOOKUP(type) \
+	const type *ptbl = (const type *) \
+		&((type *)fib->tbl24)[tbl24_index]; \
+	type tbl_entry = *ptbl; \
+	if (unlikely((tbl_entry & RTE_DIR24_8_VALID_EXT_ENT) == \
+		RTE_DIR24_8_VALID_EXT_ENT)) { \
+		uint32_t tbl8_index = (uint8_t)ip + ((tbl_entry >> 2) * \
+			RTE_DIR24_8_TBL8_GRP_NUM_ENT); \
+		\
+		ptbl = (const type *)&((type *)fib->tbl8)[tbl8_index]; \
+		tbl_entry = *ptbl; \
+	} \
+	\
+	*next_hop = (uint64_t)tbl_entry >> 2
+
+struct rte_dir24_8_tbl {
+	uint32_t number_tbl8s; /**< Total number of tbl8s. */
+	uint32_t cur_tbl8s; /**< Current number of tbl8s. */
+	enum rte_dir24_8_nh_sz nh_sz; /**< Size of nexthop entry */
+	uint64_t *tbl8; /**< LPM tbl8 table. */
+	uint64_t tbl24[0] __rte_cache_aligned; /**< LPM tbl24 table. */
+};
+
+int rte_dir24_8_modify(struct rte_rib *rib, uint32_t key, uint8_t mask_len,
+	uint64_t next_hop, enum rte_rib_op rib_op);
+int rte_dir24_8_lookup_1b(void *fib_p, uint32_t ip, uint64_t *next_hop);
+int rte_dir24_8_lookup_2b(void *fib_p, uint32_t ip, uint64_t *next_hop);
+int rte_dir24_8_lookup_4b(void *fib_p, uint32_t ip, uint64_t *next_hop);
+int rte_dir24_8_lookup_8b(void *fib_p, uint32_t ip, uint64_t *next_hop);
+
+static inline int
+rte_dir24_8_lookup(void *fib_p, uint32_t ip, uint64_t *next_hop)
+{
+	uint64_t res;
+	struct rte_dir24_8_tbl *fib = (struct rte_dir24_8_tbl *)fib_p;
+
+	/* DEBUG: Check user input arguments. */
+	RTE_RIB_RETURN_IF_TRUE(((fib == NULL) ||
+		(next_hop == NULL)), -EINVAL);
+
+	res = RTE_DIR24_8_GET_TBL24(fib, ip);
+	if (unlikely((res & RTE_DIR24_8_VALID_EXT_ENT) ==
+		RTE_DIR24_8_VALID_EXT_ENT)) {
+		res = RTE_DIR24_8_GET_TBL8(fib, res, ip);
+	}
+	*next_hop = res >> 2;
+	return (res & RTE_DIR24_8_LOOKUP_SUCCESS) ? 0 : -ENOENT;
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_DIR24_8_H_ */
+
diff --git a/lib/librte_rib/rte_rib.c b/lib/librte_rib/rte_rib.c
new file mode 100644
index 0000000..0fbe9a5
--- /dev/null
+++ b/lib/librte_rib/rte_rib.c
@@ -0,0 +1,454 @@
+/*
+ * BSD LICENSE
+ *
+ * Copyright(c) 2017 Vladimir Medvedkin
+ * All rights reserved.
+ * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include + +TAILQ_HEAD(rte_rib_list, rte_tailq_entry); +static struct rte_tailq_elem rte_rib_tailq = { + .name = "RTE_RIB", +}; +EAL_REGISTER_TAILQ(rte_rib_tailq) + +#define MIN(X, Y) (((X) < (Y)) ? (X) : (Y)) + +static struct rte_rib_v4_node * +new_node_malloc(struct rte_rib *rib) +{ + return malloc(rib->node_sz); +} + +static void +free_node_malloc(__rte_unused struct rte_rib *rib, struct rte_rib_v4_node *ent) +{ + free(ent); +} + +static struct rte_rib_v4_node * +new_node_mempool(struct rte_rib *rib) +{ + struct rte_rib_v4_node *ent; + int ret; + + ret = rte_mempool_get(rib->node_pool, (void *)&ent); + if (ret != 0) + return NULL; + return ent; +} + +static void +free_node_mempool(struct rte_rib *rib, struct rte_rib_v4_node *ent) +{ + rte_mempool_put(rib->node_pool, ent); +} + +struct rte_rib_v4_node * +rte_rib_v4_lookup(struct rte_rib *rib, uint32_t key) +{ + struct rte_rib_v4_node *cur = rib->trie; + struct rte_rib_v4_node *prev = NULL; + + while ((cur != NULL) && (((cur->key ^ key) & + (uint32_t)(UINT64_MAX << (32 - cur->mask_len))) != 0)) { + if (cur->flag & VALID_NODE) + prev = cur; + cur = GET_NXT_NODE(cur, key); + } + return prev; +} + +struct rte_rib_v4_node * +rte_rib_v4_lookup_parent(struct rte_rib_v4_node *ent) +{ + struct rte_rib_v4_node *tmp; + + if (ent == NULL) + return NULL; + tmp = ent->parent; + while ((tmp != NULL) && (tmp->flag & VALID_NODE) != VALID_NODE) + tmp = tmp->parent; + return tmp; +} + +struct rte_rib_v4_node * +rte_rib_v4_lookup_exact(struct rte_rib *rib, uint32_t key, uint8_t mask_len) +{ + struct rte_rib_v4_node *cur = rib->trie; + + key &= (uint32_t)(UINT64_MAX << (32 - mask_len)); + while (cur != NULL) { + if ((cur->key == key) && (cur->mask_len == mask_len) && + (cur->flag & VALID_NODE)) + return cur; + if ((cur->mask_len > mask_len) || + (((uint64_t)key >> (32 - cur->mask_len)) != + ((uint64_t)cur->key >> (32 - cur->mask_len)))) + break; + cur = GET_NXT_NODE(cur, key); + } + return NULL; +} + +struct rte_rib_v4_node * +rte_rib_v4_get_next_child(struct rte_rib *rib, 
uint32_t key,
+	uint8_t mask_len, struct rte_rib_v4_node *cur, int flag)
+{
+	struct rte_rib_v4_node *tmp, *prev = NULL;
+
+	if (cur == NULL) {
+		tmp = rib->trie;
+		while ((tmp) && (tmp->mask_len < mask_len))
+			tmp = GET_NXT_NODE(tmp, key);
+	} else {
+		tmp = cur;
+		while ((tmp->parent != NULL) && (IS_RIGHT_NODE(tmp) ||
+				(tmp->parent->right == NULL))) {
+			tmp = tmp->parent;
+			if ((tmp->flag & VALID_NODE) &&
+				(IS_COVERED(tmp->key, tmp->mask_len,
+					key, mask_len)))
+				return tmp;
+		}
+		tmp = (tmp->parent) ? tmp->parent->right : NULL;
+	}
+	while (tmp) {
+		if ((tmp->flag & VALID_NODE) &&
+			(IS_COVERED(tmp->key, tmp->mask_len,
+				key, mask_len))) {
+			prev = tmp;
+			if (flag == GET_NXT_COVER)
+				return prev;
+		}
+		tmp = (tmp->left) ? tmp->left : tmp->right;
+	}
+	return prev;
+}
+
+void
+rte_rib_v4_remove(struct rte_rib *rib, uint32_t key, uint8_t mask_len)
+{
+	struct rte_rib_v4_node *cur, *prev, *child;
+
+	cur = rte_rib_v4_lookup_exact(rib, key, mask_len);
+	if (cur == NULL)
+		return;
+
+	cur->flag &= ~VALID_NODE;
+	while ((cur->flag & VALID_NODE) != VALID_NODE) {
+		/* a node that still forks two subtrees must be kept */
+		if ((cur->left != NULL) && (cur->right != NULL))
+			return;
+		child = (cur->left == NULL) ? cur->right : cur->left;
+		if (child != NULL)
+			child->parent = cur->parent;
+		if (cur->parent == NULL) {
+			rib->trie = child;
+			rib->free_node(rib, cur);
+			return;
+		}
+		if (cur->parent->left == cur)
+			cur->parent->left = child;
+		else
+			cur->parent->right = child;
+		prev = cur;
+		cur = cur->parent;
+		rib->free_node(rib, prev);
+	}
+}
+
+struct rte_rib_v4_node *
+rte_rib_v4_insert(struct rte_rib *rib, uint32_t key, uint8_t mask_len)
+{
+	struct rte_rib_v4_node **tmp = &rib->trie;
+	struct rte_rib_v4_node *prev = NULL;
+	struct rte_rib_v4_node *new_node = NULL;
+	struct rte_rib_v4_node *common_node = NULL;
+	int i = 0;
+	uint32_t common_prefix;
+	uint8_t common_mask_len;
+
+	if (mask_len > 32)
+		return NULL;
+
+	key &= (uint32_t)(UINT64_MAX << (32 - mask_len));
+	new_node = rte_rib_v4_lookup_exact(rib, key, mask_len);
+	if (new_node)
+		return NULL;
+
+	new_node = rib->alloc_node(rib);
+	if (new_node == NULL) {
+		RTE_LOG(ERR, LPM, "RIB node allocation failed\n");
+		return NULL;
+	}
+	new_node->left = NULL;
+	new_node->right = NULL;
+	new_node->parent = NULL;
+	new_node->key = key;
+	new_node->mask_len = mask_len;
+	new_node->flag = VALID_NODE;
+
+	while (1) {
+		if (*tmp == NULL) {
+			*tmp = new_node;
+			new_node->parent = prev;
+		}
+		if ((key == (*tmp)->key) && (mask_len == (*tmp)->mask_len)) {
+			if (new_node != *tmp) {
+				rib->free_node(rib, new_node);
+				(*tmp)->flag |= VALID_NODE;
+			}
+			return *tmp;
+		}
+		i = (*tmp)->mask_len;
+		if ((i >= mask_len) || (((uint64_t)key >> (32 - i)) !=
+				((uint64_t)(*tmp)->key >> (32 - i))))
+			break;
+		prev = *tmp;
+		tmp = (key & (1 << (31 - i))) ?
&(*tmp)->right : &(*tmp)->left; + } + common_mask_len = MIN(mask_len, (*tmp)->mask_len); + common_prefix = key ^ (*tmp)->key; +#ifdef __LZCNT__ + i = _lzcnt_u32(common_prefix); +#else + for (i = 0; i <= common_mask_len; i++) { + if ((common_prefix & (1 << (31 - i))) != 0) + break; + } +#endif + common_mask_len = MIN(i, common_mask_len); + common_prefix = key & (uint32_t)(UINT64_MAX << (32 - common_mask_len)); + if ((common_prefix == key) && (common_mask_len == mask_len)) { + if ((*tmp)->key & (1 << (31 - mask_len))) + new_node->right = *tmp; + else + new_node->left = *tmp; + new_node->parent = (*tmp)->parent; + (*tmp)->parent = new_node; + *tmp = new_node; + } else { + common_node = rib->alloc_node(rib); + if (common_node == NULL) { + RTE_LOG(ERR, LPM, "RIB node allocation failed\n"); + rib->free_node(rib, new_node); + return NULL; + } + common_node->key = common_prefix; + common_node->mask_len = common_mask_len; + common_node->flag = 0; + common_node->parent = (*tmp)->parent; + new_node->parent = common_node; + (*tmp)->parent = common_node; + if ((new_node->key & (1 << (31 - common_mask_len))) == 0) { + common_node->left = new_node; + common_node->right = *tmp; + } else { + common_node->left = *tmp; + common_node->right = new_node; + } + *tmp = common_node; + } + return new_node; +} + +struct rte_rib * +rte_rib_v4_create(const char *name, int socket_id, struct rte_rib_conf *conf) +{ + char mem_name[RTE_RIB_NAMESIZE]; + struct rte_rib *rib = NULL; + struct rte_tailq_entry *te; + struct rte_rib_list *rib_list; + int dir24_8_nh_size; + rte_rib_lookup_fn_t lookup_fn; + + /* Check user arguments. */ + if ((name == NULL) || (conf == NULL) || (socket_id < -1) || + (conf->type >= RTE_RIB_TYPE_MAX) || + (conf->alloc_type >= RTE_RIB_ALLOC_MAX) || + (conf->max_nodes == 0) || + (conf->node_sz < sizeof(struct rte_rib_v4_node))) { + rte_errno = EINVAL; + return NULL; + } + + snprintf(mem_name, sizeof(mem_name), "RIB_%s", name); + + rib_list = RTE_TAILQ_CAST(rte_rib_tailq.head, rte_rib_list); + + rte_rwlock_write_lock(RTE_EAL_TAILQ_RWLOCK); + /* guarantee there's no existing */ + TAILQ_FOREACH(te, rib_list, next) { + rib = (struct rte_rib *)te->data; + if (strncmp(name, rib->name, RTE_RIB_NAMESIZE) == 0) + break; + } + rib = NULL; + if (te != NULL) { + rte_errno = EEXIST; + goto exit; + } + + /* allocate tailq entry */ + te = rte_zmalloc("RIB_TAILQ_ENTRY", sizeof(*te), 0); + if (te == NULL) { + RTE_LOG(ERR, LPM, "Failed to allocate tailq entry\n"); + goto exit; + } + + /* Allocate memory to store the LPM data structures. 
*/ + rib = (struct rte_rib *)rte_zmalloc_socket(mem_name, + sizeof(struct rte_rib), RTE_CACHE_LINE_SIZE, socket_id); + if (rib == NULL) { + RTE_LOG(ERR, LPM, "RIB memory allocation failed\n"); + goto free_te; + } + snprintf(rib->name, sizeof(rib->name), "%s", name); + rib->trie = NULL; + rib->node_sz = conf->node_sz; + + snprintf(mem_name, sizeof(mem_name), "FIB_%s", name); + + if (conf->type <= RTE_RIB_DIR24_8_8B) { + switch (conf->type) { + case RTE_RIB_DIR24_8_1B: + dir24_8_nh_size = RTE_DIR24_8_1B; + lookup_fn = rte_dir24_8_lookup_1b; + break; + case RTE_RIB_DIR24_8_2B: + dir24_8_nh_size = RTE_DIR24_8_2B; + lookup_fn = rte_dir24_8_lookup_2b; + break; + case RTE_RIB_DIR24_8_4B: + dir24_8_nh_size = RTE_DIR24_8_4B; + lookup_fn = rte_dir24_8_lookup_4b; + break; + case RTE_RIB_DIR24_8_8B: + dir24_8_nh_size = RTE_DIR24_8_8B; + lookup_fn = rte_dir24_8_lookup_8b; + break; + case RTE_RIB_TYPE_MAX: + default: + RTE_LOG(ERR, LPM, "Bad RIB type\n"); + goto free_rib; + } + rib->fib = (void *)rte_zmalloc_socket(mem_name, + sizeof(struct rte_dir24_8_tbl) + + RTE_DIR24_8_TBL24_NUM_ENT * (1 << dir24_8_nh_size), + RTE_CACHE_LINE_SIZE, socket_id); + if (rib->fib == NULL) { + RTE_LOG(ERR, LPM, "Failed to allocate FIB\n"); + goto free_rib; + } + snprintf(mem_name, sizeof(mem_name), "TBL8_%s", name); + ((struct rte_dir24_8_tbl *)rib->fib)->tbl8 = + (void *)rte_zmalloc_socket(mem_name, + RTE_DIR24_8_TBL8_GRP_NUM_ENT * (1 << dir24_8_nh_size) * + conf->number_tbl8s, RTE_CACHE_LINE_SIZE, socket_id); + if (((struct rte_dir24_8_tbl *)rib->fib)->tbl8 == NULL) { + RTE_LOG(ERR, LPM, "Failed to allocate TBL8\n"); + rte_free(rib->fib); + goto free_rib; + } + rib->modify_cb = rte_dir24_8_modify; + rib->lookup_cb = lookup_fn; + ((struct rte_dir24_8_tbl *)rib->fib)->nh_sz = + (enum rte_dir24_8_nh_sz)dir24_8_nh_size; + ((struct rte_dir24_8_tbl *)rib->fib)->number_tbl8s = + conf->number_tbl8s; + } + + rte_rwlock_write_unlock(RTE_EAL_TAILQ_RWLOCK); + + switch (conf->alloc_type) { + case RTE_RIB_MALLOC: + rib->alloc_node = new_node_malloc; + rib->free_node = free_node_malloc; + break; + case RTE_RIB_MEMPOOL: + snprintf(mem_name, sizeof(mem_name), "MP_%s", name); + rib->node_pool = rte_mempool_create(mem_name, conf->max_nodes, + conf->node_sz, 0, 0, NULL, NULL, NULL, NULL, + socket_id, 0); + + if (rib->node_pool == NULL) { + RTE_LOG(ERR, LPM, "Failed to allocate mempool\n"); + rte_rwlock_write_lock(RTE_EAL_TAILQ_RWLOCK); + goto free_fib; + } + rib->alloc_node = new_node_mempool; + rib->free_node = free_node_mempool; + break; + case RTE_RIB_ALLOC_MAX: + default: + RTE_LOG(ERR, LPM, "Bad RIB alloc type\n"); + rte_rwlock_write_lock(RTE_EAL_TAILQ_RWLOCK); + goto free_fib; + } + + rte_rwlock_write_lock(RTE_EAL_TAILQ_RWLOCK); + + te->data = (void *)rib; + TAILQ_INSERT_TAIL(rib_list, te, next); + + rte_rwlock_write_unlock(RTE_EAL_TAILQ_RWLOCK); + + return rib; + +free_fib: + rte_free(((struct rte_dir24_8_tbl *)rib->fib)->tbl8); + rte_free(rib->fib); +free_rib: + rte_free(rib); + rib = NULL; +free_te: + rte_free(te); +exit: + rte_rwlock_write_unlock(RTE_EAL_TAILQ_RWLOCK); + + return NULL; +} diff --git a/lib/librte_rib/rte_rib.h b/lib/librte_rib/rte_rib.h new file mode 100644 index 0000000..279c85c --- /dev/null +++ b/lib/librte_rib/rte_rib.h @@ -0,0 +1,260 @@ +/*- + * BSD LICENSE + * + * Copyright(c) 2017 Vladimir Medvedkin + * All rights reserved. 
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ *   * Redistributions of source code must retain the above copyright
+ *     notice, this list of conditions and the following disclaimer.
+ *   * Redistributions in binary form must reproduce the above copyright
+ *     notice, this list of conditions and the following disclaimer in
+ *     the documentation and/or other materials provided with the
+ *     distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_RIB_H_
+#define _RTE_RIB_H_
+
+/**
+ * @file
+ * Compressed trie implementation for Longest Prefix Match
+ */
+
+/** @internal Macro to enable/disable run-time checks. */
+#if defined(RTE_LIBRTE_RIB_DEBUG)
+#define RTE_RIB_RETURN_IF_TRUE(cond, retval) do { \
+	if (cond) \
+		return retval; \
+} while (0)
+#else
+#define RTE_RIB_RETURN_IF_TRUE(cond, retval)
+#endif
+
+#define VALID_NODE	1
+#define GET_NXT_ALL	0
+#define GET_NXT_COVER	1
+
+/** Max number of characters in RIB name. */
+#define RTE_RIB_NAMESIZE	64
+
+/** Maximum depth value possible for IPv4 RIB. */
+#define RTE_RIB_V4_MAXDEPTH	32
+
+/**
+ * Macro to check if prefix1 {key1/mask_len1}
+ * is covered by prefix2 {key2/mask_len2}
+ */
+#define IS_COVERED(key1, mask_len1, key2, mask_len2) \
+	((((key1 ^ key2) & (uint32_t)(UINT64_MAX << (32 - mask_len2))) == 0) \
+	&& (mask_len1 > mask_len2))
+
+/** @internal Macro to get the next node in the tree */
+#define GET_NXT_NODE(node, key) \
+	((key & (1 << (31 - node->mask_len))) ? node->right : node->left)
+/** @internal Macro to check if a node is the right child */
+#define IS_RIGHT_NODE(node) (node->parent->right == node)
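+
+/*
+ * Example (illustrative values only): 10.0.1.0/24 is covered by
+ * 10.0.0.0/16, so
+ *	IS_COVERED(0x0a000100, 24, 0x0a000000, 16) != 0
+ * while the reverse test yields 0, because a prefix never covers its
+ * own supernet (mask_len1 must be strictly greater than mask_len2).
+ */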
+
+struct rte_rib_v4_node {
+	struct rte_rib_v4_node *left;
+	struct rte_rib_v4_node *right;
+	struct rte_rib_v4_node *parent;
+	uint32_t key;
+	uint8_t mask_len;
+	uint8_t flag;
+	uint64_t ext[0];
+};
+
+/** simple extension to keep only a single next_hop */
+struct rte_rib_simple_ext {
+	uint64_t next_hop;
+};
+
+struct rte_rib;
+
+enum rte_rib_op {
+	RTE_RIB_OP_ADD = 0,
+	RTE_RIB_OP_DEL,
+	RTE_RIB_OP_MODIFY
+};
+
+/** Type of FIB struct */
+enum rte_rib_type {
+	RTE_RIB_DIR24_8_1B,
+	RTE_RIB_DIR24_8_2B,
+	RTE_RIB_DIR24_8_4B,
+	RTE_RIB_DIR24_8_8B,
+	RTE_RIB_TYPE_MAX
+};
+
+/** RIB nodes allocation type */
+enum rte_rib_alloc_type {
+	RTE_RIB_MALLOC,
+	RTE_RIB_MEMPOOL,
+	RTE_RIB_ALLOC_MAX
+};
+
+typedef int (*rte_rib_modify_fn_t)(struct rte_rib *rib, uint32_t key,
+	uint8_t mask_len, uint64_t next_hop, enum rte_rib_op rib_op);
+typedef int (*rte_rib_lookup_fn_t)(void *fib, uint32_t key,
+	uint64_t *next_hop);
+typedef struct rte_rib_v4_node *(*rte_rib_alloc_node_fn_t)(struct rte_rib *rib);
+typedef void (*rte_rib_free_node_fn_t)(struct rte_rib *rib,
+	struct rte_rib_v4_node *node);
+
+struct rte_rib {
+	char name[RTE_RIB_NAMESIZE];
+	/* pointer to the RIB trie */
+	struct rte_rib_v4_node *trie;
+	/* pointer to the dataplane struct */
+	void *fib;
+	/* prefix modification callback */
+	rte_rib_modify_fn_t modify_cb;
+	/* lookup callback */
+	rte_rib_lookup_fn_t lookup_cb;
+	/* allocate a trie element */
+	rte_rib_alloc_node_fn_t alloc_node;
+	/* free a trie element */
+	rte_rib_free_node_fn_t free_node;
+	void *node_pool;
+	int node_sz;
+};
+
+/** RIB configuration structure */
+struct rte_rib_conf {
+	enum rte_rib_type type;
+	int max_nodes;
+	size_t node_sz;
+	int number_tbl8s;
+	enum rte_rib_alloc_type alloc_type;
+};
+
+/**
+ * Lookup an IP in the RIB structure
+ *
+ * @param rib
+ *  RIB object handle
+ * @param key
+ *  IP to be looked up in the RIB
+ * @return
+ *  pointer to struct rte_rib_v4_node on success,
+ *  NULL otherwise
+ */
+struct rte_rib_v4_node *rte_rib_v4_lookup(struct rte_rib *rib, uint32_t key);
+
+/**
+ * Lookup the less specific route in the RIB structure
+ *
+ * @param ent
+ *  Pointer to struct rte_rib_v4_node that represents the target route
+ * @return
+ *  pointer to struct rte_rib_v4_node that represents
+ *  the less specific route on success,
+ *  NULL otherwise
+ */
+struct rte_rib_v4_node *rte_rib_v4_lookup_parent(struct rte_rib_v4_node *ent);
+
+/**
+ * Lookup an exact prefix in the RIB structure
+ *
+ * @param rib
+ *  RIB object handle
+ * @param key
+ *  net to be looked up in the RIB
+ * @param mask_len
+ *  prefix length
+ * @return
+ *  pointer to struct rte_rib_v4_node on success,
+ *  NULL otherwise
+ */
+struct rte_rib_v4_node *rte_rib_v4_lookup_exact(struct rte_rib *rib,
+	uint32_t key, uint8_t mask_len);
+
+/**
+ * Retrieve the next more specific prefix from the RIB
+ * that is covered by the key/mask_len supernet
+ *
+ * @param rib
+ *  RIB object handle
+ * @param key
+ *  net address of the supernet prefix that covers the returned more
+ *  specific prefixes
+ * @param mask_len
+ *  supernet prefix length
+ * @param cur
+ *  pointer to the last returned prefix to get the next prefix
+ *  or
+ *  NULL to get the first more specific prefix
+ * @param flag
+ *  -GET_NXT_ALL
+ *   get all prefixes from the subtrie
+ *  -GET_NXT_COVER
+ *   get only the first more specific prefix even if it has more specifics
+ * @return
+ *  pointer to the next more specific prefix
+ *  or
+ *  NULL if there are no more prefixes left
+ */
+struct rte_rib_v4_node *rte_rib_v4_get_next_child(struct rte_rib *rib,
+	uint32_t key, uint8_t mask_len, struct rte_rib_v4_node *cur, int flag);
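+
+/*
+ * Usage sketch (an assumed pattern mirroring rte_dir24_8_modify(), for
+ * illustration; process() is hypothetical): walk every prefix present
+ * in the RIB under 10.0.0.0/16.
+ *
+ *	struct rte_rib_v4_node *node = NULL;
+ *
+ *	do {
+ *		node = rte_rib_v4_get_next_child(rib, 0x0a000000, 16,
+ *			node, GET_NXT_ALL);
+ *		if (node != NULL)
+ *			process(node);
+ *	} while (node != NULL);
+ */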
left + */ +struct rte_rib_v4_node *rte_rib_v4_get_next_child(struct rte_rib *rib, + uint32_t key, uint8_t mask_len, struct rte_rib_v4_node *cur, int flag); + +/** + * Remove prefix from the RIB + * + * @param rib + * RIB object handle + * @param key + * net to be removed from the RIB + * @param mask_len + * prefix length + */ +void rte_rib_v4_remove(struct rte_rib *rib, uint32_t key, uint8_t mask_len); + +/** + * Insert prefix into the RIB + * + * @param rib + * RIB object handle + * @param key + * net to be inserted to the RIB + * @param mask_len + * prefix length + * @return + * pointer to new rte_rib_v4_node on success + * NULL otherwise + */ +struct rte_rib_v4_node *rte_rib_v4_insert(struct rte_rib *rib, uint32_t key, + uint8_t mask_len); + +/** + * Create RIB + * + * @param name + * RIB name + * @param socket_id + * NUMA socket ID for RIB table memory allocation + * @param conf + * Structure containing the configuration + * @return + * Handle to LPM object on success + * NULL otherwise with rte_errno set to an appropriate values. + */ +struct rte_rib *rte_rib_v4_create(const char *name, int socket_id, + struct rte_rib_conf *conf); + +#endif /* _RTE_RIB_H_ */ +