From patchwork Thu Aug 22 06:34:55 2019
X-Patchwork-Submitter: Ruifeng Wang
X-Patchwork-Id: 57814
From: Ruifeng Wang
To: bruce.richardson@intel.com, vladimir.medvedkin@intel.com,
 olivier.matz@6wind.com
Cc: dev@dpdk.org, honnappa.nagarahalli@arm.com, dharmik.thakkar@arm.com,
 nd@arm.com
Date: Thu, 22 Aug 2019 14:34:55 +0800
Message-Id: <20190822063457.41596-2-ruifeng.wang@arm.com>
In-Reply-To: <20190822063457.41596-1-ruifeng.wang@arm.com>
References: <20190822063457.41596-1-ruifeng.wang@arm.com>
Subject: [dpdk-dev] [RFC PATCH 1/3] doc/rcu: add RCU integration design details

From: Honnappa Nagarahalli

Add a section describing a design for integrating the QSBR RCU library
with other libraries in DPDK.

Signed-off-by: Honnappa Nagarahalli
Reviewed-by: Gavin Hu
Reviewed-by: Ruifeng Wang
---
 doc/guides/prog_guide/rcu_lib.rst | 51 +++++++++++++++++++++++++++++++
 1 file changed, 51 insertions(+)

diff --git a/doc/guides/prog_guide/rcu_lib.rst b/doc/guides/prog_guide/rcu_lib.rst
index 8fe5b1f73..2869441ca 100644
--- a/doc/guides/prog_guide/rcu_lib.rst
+++ b/doc/guides/prog_guide/rcu_lib.rst
@@ -186,3 +186,54 @@ However, when ``CONFIG_RTE_LIBRTE_RCU_DEBUG`` is enabled, these APIs aid
 in debugging issues. One can mark the access to shared data structures on the
 reader side using these APIs. The ``rte_rcu_qsbr_quiescent()`` will check if
 all the locks are unlocked.
+
+Integrating QSBR RCU with other libraries
+-----------------------------------------
+
+Lock-free algorithms place an additional burden on the application to
+reclaim memory. Integrating memory reclamation mechanisms in the libraries
+helps remove some of that burden. Though the QSBR method offers the
+flexibility to achieve performance, it presents challenges when integrating
+with libraries.
+
+The memory reclamation process using QSBR can be split into 4 parts:
+
+#. Initialization
+#. Quiescent State Reporting
+#. Reclaiming Resources
+#. Shutdown
+
+The design proposed here requires the application to handle 'Initialization'
+and 'Quiescent State Reporting'. So,
+
+* the application has to create the RCU variable and register the reader
+  threads to report their quiescent state.
+* the application has to register the same RCU variable with the library.
+* reader threads in the application have to report the quiescent state.
+  This allows the application to control the length of the critical section
+  and how frequently it wants to report the quiescent state.
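+
+For illustration, a minimal application-side sketch of 'Initialization' and
+'Quiescent State Reporting' might look as follows. The reader thread count,
+the ``app_*`` names and the ``quit`` flag are placeholders, not part of any
+DPDK API; only the ``rte_rcu_qsbr_*`` calls are real:
+
+.. code-block:: c
+
+   #include <stdbool.h>
+   #include <rte_malloc.h>
+   #include <rte_rcu_qsbr.h>
+
+   #define APP_MAX_READERS 4 /* illustrative reader thread count */
+
+   static struct rte_rcu_qsbr *qsv;
+   static volatile bool quit;
+
+   static void
+   app_init_rcu(void)
+   {
+       size_t sz = rte_rcu_qsbr_get_memsize(APP_MAX_READERS);
+
+       qsv = rte_zmalloc(NULL, sz, RTE_CACHE_LINE_SIZE);
+       rte_rcu_qsbr_init(qsv, APP_MAX_READERS);
+       /* ... then register 'qsv' with the library ... */
+   }
+
+   /* Per-reader thread; 'arg' points to a unique thread id. */
+   static int
+   app_reader(void *arg)
+   {
+       unsigned int thread_id = *(unsigned int *)arg;
+
+       rte_rcu_qsbr_thread_register(qsv, thread_id);
+       rte_rcu_qsbr_thread_online(qsv, thread_id);
+       while (!quit) {
+           /* ... critical section: read the shared data structure ... */
+
+           /* Report quiescent state; frequency is up to the application. */
+           rte_rcu_qsbr_quiescent(qsv, thread_id);
+       }
+       rte_rcu_qsbr_thread_offline(qsv, thread_id);
+       rte_rcu_qsbr_thread_unregister(qsv, thread_id);
+       return 0;
+   }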
+
+The library will handle the 'Reclaiming Resources' part of the process. The
+libraries will make use of the writer thread context to execute the memory
+reclamation algorithm. So,
+
+* the library should provide an API to register the RCU variable that it
+  will use.
+* the library should trigger the readers to report their quiescent state,
+  upon deleting resources, by calling ``rte_rcu_qsbr_start``.
+* the library should store the token and the deleted resources for later
+  use, to free them after the readers have reported their quiescent state.
+  Since the readers will report the quiescent state in the order of
+  deletion, the library must store the tokens/resources in the order in
+  which the resources were deleted. A FIFO data structure would achieve the
+  desired result. The length of the FIFO would depend on the rate of
+  deletion and the rate at which the readers report their quiescent state.
+  In the worst case, the length of the FIFO would be equal to the maximum
+  number of resources the data structure supports. However, in most cases,
+  the length will be much smaller. The library should not take the length
+  of the FIFO as an input from the application; instead, it should
+  implement a data structure that can grow and shrink dynamically. The
+  overhead such a data structure introduces on delete operations should be
+  considered as well.
+* the library should query the quiescent state and free the resources. It
+  should make use of the non-blocking ``rte_rcu_qsbr_check`` API to query
+  the quiescent state. This allows the application to do useful work while
+  the readers report their quiescent state. If there are tokens/resources
+  already present in the FIFO, the delete API should peek the head of the
+  FIFO and check the quiescent state status. If the status is success, the
+  token/resource should be dequeued and the resource freed. This process
+  can be repeated until the quiescent state query for a token returns
+  failure, which indicates that subsequent tokens will fail the query as
+  well. The same process can be incorporated while adding new entries to
+  the data structure if the library runs out of resources. A sketch of this
+  writer-side flow is shown at the end of this section.
+
+The 'Shutdown' process needs to be shared between the application and the
+library.
+
+* the library should check the quiescent state of all the tokens that may
+  be present in the FIFO and free the resources. It should make use of the
+  non-blocking ``rte_rcu_qsbr_check`` API to query the quiescent state. If
+  any of the tokens do not pass the quiescent state check, the library
+  should log an error and stop the memory reclamation process.
+* the application should make sure that the reader threads are no longer
+  using the shared data structure, and unregister them from the QSBR
+  variable, before calling the library's shutdown function.
+
+Integrating the resource reclamation into the libraries removes this burden
+from the application and makes it easier to use lock-free algorithms.
+
+This design has several advantages over currently known methods:
+
+#. The application does not need a dedicated thread to reclaim resources.
+   Memory reclamation happens as part of the writer thread, without
+   sacrificing much performance.
+#. The library has better control over the resources. For example, the
+   library can attempt to reclaim when it has run out of resources.
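+
+The writer-side flow described above can be sketched as follows. The
+``lib_fifo_*`` functions and ``resource_free()`` are placeholders for the
+library's dynamically sized FIFO and its actual free routine; only the
+``rte_rcu_qsbr_*`` calls are real API:
+
+.. code-block:: c
+
+   struct qs_entry {
+       uint64_t token;  /* returned by rte_rcu_qsbr_start() */
+       void *resource;  /* resource deleted from the data structure */
+   };
+
+   /* Writer: delete a resource and record it for deferred reclamation. */
+   static void
+   lib_delete(struct rte_rcu_qsbr *v, void *resource)
+   {
+       struct qs_entry e;
+
+       /* ... unlink 'resource' so new readers cannot reach it ... */
+
+       e.token = rte_rcu_qsbr_start(v); /* triggers QS reporting */
+       e.resource = resource;
+       lib_fifo_enqueue(&e);
+   }
+
+   /* Writer: free the entries whose grace period has completed. */
+   static void
+   lib_reclaim(struct rte_rcu_qsbr *v)
+   {
+       struct qs_entry e;
+
+       /* Peek the oldest entry and stop at the first token that has not
+        * completed: tokens are issued in deletion order, so younger
+        * tokens cannot have completed either.
+        */
+       while (lib_fifo_peek(&e) == 0 &&
+               rte_rcu_qsbr_check(v, e.token, false) == 1) {
+           lib_fifo_dequeue(NULL);
+           resource_free(e.resource);
+       }
+   }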
From patchwork Thu Aug 22 06:34:56 2019
X-Patchwork-Submitter: Ruifeng Wang
X-Patchwork-Id: 57815
From: Ruifeng Wang
To: bruce.richardson@intel.com, vladimir.medvedkin@intel.com,
 olivier.matz@6wind.com
Cc: dev@dpdk.org, honnappa.nagarahalli@arm.com, dharmik.thakkar@arm.com,
 nd@arm.com, Ruifeng Wang
Date: Thu, 22 Aug 2019 14:34:56 +0800
Message-Id: <20190822063457.41596-3-ruifeng.wang@arm.com>
In-Reply-To: <20190822063457.41596-1-ruifeng.wang@arm.com>
References: <20190822063457.41596-1-ruifeng.wang@arm.com>
Subject: [dpdk-dev] [RFC PATCH 2/3] lib/ring: add peek API

The peek API allows fetching the next available object in the ring
without dequeuing it. This helps in scenarios where dequeuing of
objects depends on their value.

Signed-off-by: Dharmik Thakkar
Signed-off-by: Ruifeng Wang
Reviewed-by: Honnappa Nagarahalli
Reviewed-by: Gavin Hu
---
 lib/librte_ring/rte_ring.h | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
index 2a9f768a1..d3d0d5e18 100644
--- a/lib/librte_ring/rte_ring.h
+++ b/lib/librte_ring/rte_ring.h
@@ -953,6 +953,36 @@ rte_ring_dequeue_burst(struct rte_ring *r, void **obj_table,
 			r->cons.single, available);
 }
 
+/**
+ * Peek one object from a ring.
+ *
+ * The peek API allows fetching the next available object in the ring
+ * without dequeuing it. This API is not multi-thread safe with respect
+ * to other consumer threads.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param obj_p
+ *   A pointer to a void * pointer (object) that will be filled.
+ * @return
+ *   - 0: Success, object available.
+ *   - -ENOENT: Not enough entries in the ring.
+ */
+__rte_experimental
+static __rte_always_inline int
+rte_ring_peek(struct rte_ring *r, void **obj_p)
+{
+	uint32_t prod_tail = r->prod.tail;
+	uint32_t cons_head = r->cons.head;
+	uint32_t count = (prod_tail - cons_head) & r->mask;
+	unsigned int n = 1;
+
+	if (count) {
+		DEQUEUE_PTRS(r, &r[1], cons_head, obj_p, n, void *);
+		return 0;
+	}
+
+	return -ENOENT;
+}
+
 #ifdef __cplusplus
 }
 #endif
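A sketch of the intended single-consumer usage, dequeuing conditionally on
the peeked value (the 'work_item' type and its 'ready' field are
hypothetical; rte_ring_sc_dequeue is the existing single-consumer dequeue):

#include <stdbool.h>
#include <rte_ring.h>

struct work_item {
	bool ready;
	/* ... payload ... */
};

static void
drain_ready_items(struct rte_ring *r)
{
	void *obj;
	struct work_item *w;

	while (rte_ring_peek(r, &obj) == 0) {
		w = obj;
		if (!w->ready)
			break;	/* head is not ready; leave it on the ring */
		rte_ring_sc_dequeue(r, &obj);
		/* ... process 'w' ... */
	}
}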
From patchwork Thu Aug 22 06:34:57 2019
X-Patchwork-Submitter: Ruifeng Wang
X-Patchwork-Id: 57816
From: Ruifeng Wang
To: bruce.richardson@intel.com, vladimir.medvedkin@intel.com,
 olivier.matz@6wind.com
Cc: dev@dpdk.org, honnappa.nagarahalli@arm.com, dharmik.thakkar@arm.com,
 nd@arm.com, Ruifeng Wang
Date: Thu, 22 Aug 2019 14:34:57 +0800
Message-Id: <20190822063457.41596-4-ruifeng.wang@arm.com>
In-Reply-To: <20190822063457.41596-1-ruifeng.wang@arm.com>
References: <20190822063457.41596-1-ruifeng.wang@arm.com>
Subject: [dpdk-dev] [RFC PATCH 3/3] lib/lpm: integrate RCU QSBR

Currently, the tbl8 group is freed even though the readers might still be
using the tbl8 group entries. The freed tbl8 group can be reallocated
quickly. This results in incorrect lookup results.

The RCU QSBR process is integrated for safe tbl8 group reclamation.
Refer to the RCU documentation to understand the various aspects of
integrating the RCU library into other libraries.

Signed-off-by: Ruifeng Wang
Reviewed-by: Honnappa Nagarahalli
Reviewed-by: Gavin Hu
---
 lib/librte_lpm/Makefile            |   3 +-
 lib/librte_lpm/meson.build         |   2 +
 lib/librte_lpm/rte_lpm.c           | 218 +++++++++++++++++++++++++++--
 lib/librte_lpm/rte_lpm.h           |  22 +++
 lib/librte_lpm/rte_lpm_version.map |   6 +
 lib/meson.build                    |   3 +-
 6 files changed, 239 insertions(+), 15 deletions(-)

diff --git a/lib/librte_lpm/Makefile b/lib/librte_lpm/Makefile
index a7946a1c5..ca9e16312 100644
--- a/lib/librte_lpm/Makefile
+++ b/lib/librte_lpm/Makefile
@@ -6,9 +6,10 @@ include $(RTE_SDK)/mk/rte.vars.mk
 # library name
 LIB = librte_lpm.a
 
+CFLAGS += -DALLOW_EXPERIMENTAL_API
 CFLAGS += -O3
 CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR)
-LDLIBS += -lrte_eal -lrte_hash
+LDLIBS += -lrte_eal -lrte_hash -lrte_rcu
 
 EXPORT_MAP := rte_lpm_version.map

diff --git a/lib/librte_lpm/meson.build b/lib/librte_lpm/meson.build
index a5176d8ae..19a35107f 100644
--- a/lib/librte_lpm/meson.build
+++ b/lib/librte_lpm/meson.build
@@ -2,9 +2,11 @@
 # Copyright(c) 2017 Intel Corporation
 
 version = 2
+allow_experimental_apis = true
 sources = files('rte_lpm.c', 'rte_lpm6.c')
 headers = files('rte_lpm.h', 'rte_lpm6.h')
 # since header files have different names, we can install all vector headers
 # without worrying about which architecture we actually need
 headers += files('rte_lpm_altivec.h', 'rte_lpm_neon.h', 'rte_lpm_sse.h')
 deps += ['hash']
+deps += ['rcu']

diff --git a/lib/librte_lpm/rte_lpm.c b/lib/librte_lpm/rte_lpm.c
index 3a929a1b1..1efdef22d 100644
--- a/lib/librte_lpm/rte_lpm.c
+++ b/lib/librte_lpm/rte_lpm.c
@@ -1,5 +1,6 @@
 /* SPDX-License-Identifier: BSD-3-Clause
  * Copyright(c) 2010-2014 Intel Corporation
+ * Copyright(c) 2019 Arm Limited
  */
 
 #include
@@ -22,6 +23,7 @@
 #include
 #include
 #include
+#include <rte_rcu_qsbr.h>
 
 #include "rte_lpm.h"
@@ -39,6 +41,11 @@ enum valid_flag {
 	VALID
 };
 
+struct __rte_lpm_qs_item {
+	uint64_t token;		/**< QSBR token. */
+	uint32_t index;		/**< tbl8 group index. */
+};
+
 /* Macro to enable/disable run-time checks. */
 #if defined(RTE_LIBRTE_LPM_DEBUG)
 #include
@@ -381,6 +388,8 @@ rte_lpm_free_v1604(struct rte_lpm *lpm)
 
 	rte_mcfg_tailq_write_unlock();
 
+	if (lpm->qsv)
+		rte_ring_free(lpm->qs_fifo);
 	rte_free(lpm->tbl8);
 	rte_free(lpm->rules_tbl);
 	rte_free(lpm);
@@ -390,6 +399,145 @@ BIND_DEFAULT_SYMBOL(rte_lpm_free, _v1604, 16.04);
 MAP_STATIC_SYMBOL(void rte_lpm_free(struct rte_lpm *lpm),
 		rte_lpm_free_v1604);
 
+/* Add an item into the FIFO.
+ * return: 0 - success
+ */
+static int
+__rte_lpm_rcu_qsbr_fifo_push(struct rte_ring *fifo,
+	struct __rte_lpm_qs_item *item)
+{
+	if (rte_ring_sp_enqueue(fifo, (void *)(uintptr_t)item->token) != 0) {
+		rte_errno = ENOSPC;
+		return 1;
+	}
+	if (rte_ring_sp_enqueue(fifo, (void *)(uintptr_t)item->index) != 0) {
+		void *obj;
+		/* The token needs to be dequeued when the index enqueue fails. */
+		rte_ring_sc_dequeue(fifo, &obj);
+		rte_errno = ENOSPC;
+		return 1;
+	}
+
+	return 0;
+}
+
+/* Remove an item from the FIFO.
+ * Used after the data has been observed by rte_ring_peek.
+ */
+static void
+__rte_lpm_rcu_qsbr_fifo_pop(struct rte_ring *fifo,
+	struct __rte_lpm_qs_item *item)
+{
+	void *obj_token = NULL;
+	void *obj_index = NULL;
+
+	(void)rte_ring_sc_dequeue(fifo, &obj_token);
+	(void)rte_ring_sc_dequeue(fifo, &obj_index);
+
+	if (item) {
+		item->token = (uint64_t)((uintptr_t)obj_token);
+		item->index = (uint32_t)((uintptr_t)obj_index);
+	}
+}
+
+/* Max number of tbl8 groups to reclaim at one time. */
+#define RCU_QSBR_RECLAIM_SIZE	8
+
+/* When RCU QSBR FIFO usage is above 1/(2^RCU_QSBR_RECLAIM_LEVEL),
+ * reclaim will be triggered by tbl8_free.
+ */
+#define RCU_QSBR_RECLAIM_LEVEL	3
+
+/* Reclaim some tbl8 groups based on the quiescent state check.
+ * At most RCU_QSBR_RECLAIM_SIZE groups will be reclaimed.
+ * return: 0 - success, 1 - no group reclaimed.
+ */
+static uint32_t
+__rte_lpm_rcu_qsbr_reclaim_chunk(struct rte_lpm *lpm, uint32_t *index)
+{
+	struct __rte_lpm_qs_item qs_item;
+	struct rte_lpm_tbl_entry *tbl8_entry = NULL;
+	void *obj_token;
+	uint32_t cnt = 0;
+
+	/* Check the reader threads' quiescent state and
+	 * reclaim as many tbl8 groups as possible.
+	 */
+	while ((cnt < RCU_QSBR_RECLAIM_SIZE) &&
+		(rte_ring_peek(lpm->qs_fifo, &obj_token) == 0) &&
+		(rte_rcu_qsbr_check(lpm->qsv, (uint64_t)((uintptr_t)obj_token),
+					false) == 1)) {
+		__rte_lpm_rcu_qsbr_fifo_pop(lpm->qs_fifo, &qs_item);
+
+		tbl8_entry = &lpm->tbl8[qs_item.index *
+					RTE_LPM_TBL8_GROUP_NUM_ENTRIES];
+		memset(&tbl8_entry[0], 0,
+				RTE_LPM_TBL8_GROUP_NUM_ENTRIES *
+				sizeof(tbl8_entry[0]));
+		cnt++;
+	}
+
+	if (cnt) {
+		if (index)
+			*index = qs_item.index;
+		return 0;
+	}
+	return 1;
+}
+
+/* Trigger tbl8 group reclaim when necessary.
+ * Reclaim happens when the RCU QSBR queue usage is over 12.5%.
+ */
+static void
+__rte_lpm_rcu_qsbr_try_reclaim(struct rte_lpm *lpm)
+{
+	if (lpm->qsv == NULL)
+		return;
+
+	if (rte_ring_count(lpm->qs_fifo) <
+		(rte_ring_get_capacity(lpm->qs_fifo) >> RCU_QSBR_RECLAIM_LEVEL))
+		return;
+
+	(void)__rte_lpm_rcu_qsbr_reclaim_chunk(lpm, NULL);
+}
+
+/* Associate a QSBR variable with an LPM object.
+ */
+int
+rte_lpm_rcu_qsbr_add(struct rte_lpm *lpm, struct rte_rcu_qsbr *v)
+{
+	uint32_t qs_fifo_size;
+	char rcu_ring_name[RTE_RING_NAMESIZE];
+
+	if ((lpm == NULL) || (v == NULL)) {
+		rte_errno = EINVAL;
+		return 1;
+	}
+
+	if (lpm->qsv) {
+		rte_errno = EEXIST;
+		return 1;
+	}
+
+	/* Round up qs_fifo_size to the next power of two that is not less
+	 * than number_tbl8s. The FIFO will store both 'token' and 'index'.
+	 */
+	qs_fifo_size = 2 * rte_align32pow2(lpm->number_tbl8s);
+
+	/* Init the QSBR reclaiming FIFO. */
+	snprintf(rcu_ring_name, sizeof(rcu_ring_name), "LPM_RCU_%s", lpm->name);
+	lpm->qs_fifo = rte_ring_create(rcu_ring_name, qs_fifo_size,
+			SOCKET_ID_ANY, 0);
+	if (lpm->qs_fifo == NULL) {
+		RTE_LOG(ERR, LPM, "LPM QS FIFO memory allocation failed\n");
+		rte_errno = ENOMEM;
+		return 1;
+	}
+	lpm->qsv = v;
+
+	return 0;
+}
+
 /*
  * Adds a rule to the rule table.
  *
@@ -640,6 +788,35 @@ rule_find_v1604(struct rte_lpm *lpm, uint32_t ip_masked, uint8_t depth)
 	return -EINVAL;
 }
 
+static int32_t
+tbl8_alloc_reclaimed(struct rte_lpm *lpm)
+{
+	struct rte_lpm_tbl_entry *tbl8_entry = NULL;
+	uint32_t index;
+
+	if (lpm->qsv != NULL) {
+		if (__rte_lpm_rcu_qsbr_reclaim_chunk(lpm, &index) == 0) {
+			/* Set the last reclaimed tbl8 group as VALID. */
+			struct rte_lpm_tbl_entry new_tbl8_entry = {
+				.next_hop = 0,
+				.valid = INVALID,
+				.depth = 0,
+				.valid_group = VALID,
+			};
+
+			tbl8_entry = &lpm->tbl8[index *
+					RTE_LPM_TBL8_GROUP_NUM_ENTRIES];
+			__atomic_store(tbl8_entry, &new_tbl8_entry,
+					__ATOMIC_RELAXED);
+
+			/* Return the group index of the reclaimed tbl8 group. */
+			return index;
+		}
+	}
+
+	return -ENOSPC;
+}
+
 /*
  * Find, clean and allocate a tbl8.
  */
@@ -679,14 +856,15 @@ tbl8_alloc_v20(struct rte_lpm_tbl_entry_v20 *tbl8)
 }
 
 static int32_t
-tbl8_alloc_v1604(struct rte_lpm_tbl_entry *tbl8, uint32_t number_tbl8s)
+tbl8_alloc_v1604(struct rte_lpm *lpm)
 {
 	uint32_t group_idx; /* tbl8 group index. */
 	struct rte_lpm_tbl_entry *tbl8_entry;
 
 	/* Scan through tbl8 to find a free (i.e. INVALID) tbl8 group.
 	 */
-	for (group_idx = 0; group_idx < number_tbl8s; group_idx++) {
-		tbl8_entry = &tbl8[group_idx * RTE_LPM_TBL8_GROUP_NUM_ENTRIES];
+	for (group_idx = 0; group_idx < lpm->number_tbl8s; group_idx++) {
+		tbl8_entry = &lpm->tbl8[group_idx *
+					RTE_LPM_TBL8_GROUP_NUM_ENTRIES];
 		/* If a free tbl8 group is found clean it and set as VALID. */
 		if (!tbl8_entry->valid_group) {
 			struct rte_lpm_tbl_entry new_tbl8_entry = {
@@ -708,8 +886,8 @@ tbl8_alloc_v1604(struct rte_lpm_tbl_entry *tbl8, uint32_t number_tbl8s)
 		}
 	}
 
-	/* If there are no tbl8 groups free then return error. */
-	return -ENOSPC;
+	/* If there are no free tbl8 groups then check the reclaim queue. */
+	return tbl8_alloc_reclaimed(lpm);
 }
 
 static void
@@ -728,13 +906,27 @@ tbl8_free_v20(struct rte_lpm_tbl_entry_v20 *tbl8, uint32_t tbl8_group_start)
 }
 
 static void
-tbl8_free_v1604(struct rte_lpm_tbl_entry *tbl8, uint32_t tbl8_group_start)
+tbl8_free_v1604(struct rte_lpm *lpm, uint32_t tbl8_group_start)
 {
-	/* Set tbl8 group invalid. */
+	struct __rte_lpm_qs_item qs_item;
 	struct rte_lpm_tbl_entry zero_tbl8_entry = {0};
 
-	__atomic_store(&tbl8[tbl8_group_start], &zero_tbl8_entry,
-			__ATOMIC_RELAXED);
+	if (lpm->qsv != NULL) {
+		/* Push into the QSBR FIFO. */
+		qs_item.token = rte_rcu_qsbr_start(lpm->qsv);
+		qs_item.index = tbl8_group_start;
+		if (__rte_lpm_rcu_qsbr_fifo_push(lpm->qs_fifo, &qs_item) != 0)
+			RTE_LOG(ERR, LPM, "Failed to push QSBR FIFO\n");
+
+		/* Speculatively reclaim tbl8 groups.
+		 * This helps spread the reclamation workload across
+		 * multiple calls.
+		 */
+		__rte_lpm_rcu_qsbr_try_reclaim(lpm);
+	} else {
+		/* Set tbl8 group invalid. */
+		__atomic_store(&lpm->tbl8[tbl8_group_start], &zero_tbl8_entry,
+				__ATOMIC_RELAXED);
+	}
 }
 
 static __rte_noinline int32_t
@@ -1037,7 +1229,7 @@ add_depth_big_v1604(struct rte_lpm *lpm, uint32_t ip_masked, uint8_t depth,
 
 	if (!lpm->tbl24[tbl24_index].valid) {
 		/* Search for a free tbl8 group. */
-		tbl8_group_index = tbl8_alloc_v1604(lpm->tbl8, lpm->number_tbl8s);
+		tbl8_group_index = tbl8_alloc_v1604(lpm);
 
 		/* Check tbl8 allocation was successful. */
 		if (tbl8_group_index < 0) {
@@ -1083,7 +1275,7 @@ add_depth_big_v1604(struct rte_lpm *lpm, uint32_t ip_masked, uint8_t depth,
 	}
 	/* If valid entry but not extended calculate the index into Table8. */
 	else if (lpm->tbl24[tbl24_index].valid_group == 0) {
 		/* Search for free tbl8 group. */
-		tbl8_group_index = tbl8_alloc_v1604(lpm->tbl8, lpm->number_tbl8s);
+		tbl8_group_index = tbl8_alloc_v1604(lpm);
 
 		if (tbl8_group_index < 0) {
 			return tbl8_group_index;
@@ -1818,7 +2010,7 @@ delete_depth_big_v1604(struct rte_lpm *lpm, uint32_t ip_masked,
 		 */
 		lpm->tbl24[tbl24_index].valid = 0;
 		__atomic_thread_fence(__ATOMIC_RELEASE);
-		tbl8_free_v1604(lpm->tbl8, tbl8_group_start);
+		tbl8_free_v1604(lpm, tbl8_group_start);
 	} else if (tbl8_recycle_index > -1) {
 		/* Update tbl24 entry.
 		 */
 		struct rte_lpm_tbl_entry new_tbl24_entry = {
@@ -1834,7 +2026,7 @@ delete_depth_big_v1604(struct rte_lpm *lpm, uint32_t ip_masked,
 
 		__atomic_store(&lpm->tbl24[tbl24_index], &new_tbl24_entry,
 				__ATOMIC_RELAXED);
 		__atomic_thread_fence(__ATOMIC_RELEASE);
-		tbl8_free_v1604(lpm->tbl8, tbl8_group_start);
+		tbl8_free_v1604(lpm, tbl8_group_start);
 	}
 #undef group_idx
 	return 0;

diff --git a/lib/librte_lpm/rte_lpm.h b/lib/librte_lpm/rte_lpm.h
index 906ec4483..5079fb262 100644
--- a/lib/librte_lpm/rte_lpm.h
+++ b/lib/librte_lpm/rte_lpm.h
@@ -1,5 +1,6 @@
 /* SPDX-License-Identifier: BSD-3-Clause
  * Copyright(c) 2010-2014 Intel Corporation
+ * Copyright(c) 2019 Arm Limited
  */
 
 #ifndef _RTE_LPM_H_
@@ -21,6 +22,7 @@
 #include
 #include
 #include
+#include <rte_rcu_qsbr.h>
 
 #ifdef __cplusplus
 extern "C" {
@@ -186,6 +188,8 @@ struct rte_lpm {
 			__rte_cache_aligned; /**< LPM tbl24 table. */
 	struct rte_lpm_tbl_entry *tbl8; /**< LPM tbl8 table. */
 	struct rte_lpm_rule *rules_tbl; /**< LPM rules. */
+	struct rte_rcu_qsbr *qsv; /**< RCU QSBR variable for tbl8 groups. */
+	struct rte_ring *qs_fifo; /**< RCU QSBR reclamation queue. */
 };
 
 /**
@@ -248,6 +252,24 @@ rte_lpm_free_v20(struct rte_lpm_v20 *lpm);
 void
 rte_lpm_free_v1604(struct rte_lpm *lpm);
 
+/**
+ * Associate an RCU QSBR variable with an LPM object.
+ *
+ * @param lpm
+ *   The LPM object to attach the RCU QSBR variable to.
+ * @param v
+ *   RCU QSBR variable.
+ * @return
+ *   On success - 0.
+ *   On error - 1 with the error code set in rte_errno.
+ *   Possible rte_errno codes are:
+ *   - EINVAL - invalid pointer
+ *   - EEXIST - QSBR variable already added
+ *   - ENOMEM - memory allocation failure
+ */
+__rte_experimental
+int rte_lpm_rcu_qsbr_add(struct rte_lpm *lpm, struct rte_rcu_qsbr *v);
+
 /**
  * Add a rule to the LPM table.
  *

diff --git a/lib/librte_lpm/rte_lpm_version.map b/lib/librte_lpm/rte_lpm_version.map
index 90beac853..b353aabd2 100644
--- a/lib/librte_lpm/rte_lpm_version.map
+++ b/lib/librte_lpm/rte_lpm_version.map
@@ -44,3 +44,9 @@ DPDK_17.05 {
 
 	rte_lpm6_lookup_bulk_func;
 } DPDK_16.04;
+
+EXPERIMENTAL {
+	global:
+
+	rte_lpm_rcu_qsbr_add;
+};

diff --git a/lib/meson.build b/lib/meson.build
index e5ff83893..3a96f005d 100644
--- a/lib/meson.build
+++ b/lib/meson.build
@@ -11,6 +11,7 @@ libraries = [
 	'kvargs', # eal depends on kvargs
 	'eal', # everything depends on eal
+	'rcu', # hash and lpm depend on this
 	'ring',
 	'mempool', 'mbuf', 'net', 'meter', 'ethdev', 'pci', # core
 	'cmdline', 'metrics', # bitrate/latency stats depends on this
@@ -22,7 +23,7 @@ libraries = [
 	'gro', 'gso', 'ip_frag', 'jobstats',
 	'kni', 'latencystats', 'lpm', 'member',
 	'power', 'pdump', 'rawdev',
-	'rcu', 'reorder', 'sched', 'security', 'stack', 'vhost',
+	'reorder', 'sched', 'security', 'stack', 'vhost',
 	# ipsec lib depends on net, crypto and security
 	'ipsec',
 	# add pkt framework libs which use other libs from above
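For reference, a minimal sketch of how an application might wire an RCU
QSBR variable to an LPM table using the new API (the table sizes and the
use of RTE_MAX_LCORE as the reader count are illustrative assumptions):

#include <rte_lpm.h>
#include <rte_malloc.h>
#include <rte_rcu_qsbr.h>

static struct rte_lpm *lpm;
static struct rte_rcu_qsbr *qsv;

static int
app_setup_lpm_rcu(void)
{
	struct rte_lpm_config config = {
		.max_rules = 1024,
		.number_tbl8s = 256,
		.flags = 0,
	};
	size_t sz;

	lpm = rte_lpm_create("example_lpm", SOCKET_ID_ANY, &config);
	if (lpm == NULL)
		return -1;

	sz = rte_rcu_qsbr_get_memsize(RTE_MAX_LCORE);
	qsv = rte_zmalloc(NULL, sz, RTE_CACHE_LINE_SIZE);
	if (qsv == NULL)
		return -1;
	rte_rcu_qsbr_init(qsv, RTE_MAX_LCORE);

	/* Attach the QSBR variable: tbl8 groups freed by route deletes
	 * will then be reclaimed only after the readers report quiescence.
	 */
	if (rte_lpm_rcu_qsbr_add(lpm, qsv) != 0)
		return -1;	/* rte_errno: EINVAL, EEXIST or ENOMEM */

	/* Reader lcores must register with 'qsv' and periodically call
	 * rte_rcu_qsbr_quiescent(), as in the RCU documentation patch.
	 */
	return 0;
}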