From patchwork Thu Nov 17 05:09:14 2022
X-Patchwork-Submitter: "Yan, Zhirun"
X-Patchwork-Id: 119910
X-Patchwork-Delegate: thomas@monjalon.net
From: Zhirun Yan
To: dev@dpdk.org, jerinj@marvell.com, kirankumark@marvell.com, ndabilpuram@marvell.com
Cc: cunming.liang@intel.com, haiyue.wang@intel.com, Zhirun Yan
Subject: [PATCH v1 01/13] graph: split graph worker into common and default model
Date: Thu, 17 Nov 2022 13:09:14 +0800
Message-Id: <20221117050926.136974-2-zhirun.yan@intel.com>
In-Reply-To: <20221117050926.136974-1-zhirun.yan@intel.com>
References: <20221117050926.136974-1-zhirun.yan@intel.com>

To support multiple graph worker models, split the graph worker into a common
part and a default model. Name the current walk function rte_graph_model_rtc,
since the default model is RTC (run-to-completion).
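[Editor's note: for context, a minimal sketch (not part of this patch) of the
application-side loop that drives rte_graph_walk(), which this patch re-routes
through the RTC model. The graph name "app_graph_0" and the loop control are
illustrative assumptions.]

	#include <stdbool.h>
	#include <rte_graph_worker.h>

	static volatile bool done;

	/* Worker loop, e.g. launched via rte_eal_remote_launch(). */
	static int
	worker_main(void *arg)
	{
		/* Hypothetical graph name; the graph would be created
		 * elsewhere with rte_graph_create(). */
		struct rte_graph *graph = rte_graph_lookup("app_graph_0");

		(void)arg;
		if (graph == NULL)
			return -1;

		/* One call walks the source nodes and all pending streams. */
		while (!done)
			rte_graph_walk(graph);

		return 0;
	}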
Signed-off-by: Haiyue Wang Signed-off-by: Cunming Liang Signed-off-by: Zhirun Yan --- lib/graph/rte_graph_model_rtc.h | 57 ++++ lib/graph/rte_graph_worker.h | 498 +--------------------------- lib/graph/rte_graph_worker_common.h | 456 +++++++++++++++++++++++++ 3 files changed, 515 insertions(+), 496 deletions(-) create mode 100644 lib/graph/rte_graph_model_rtc.h create mode 100644 lib/graph/rte_graph_worker_common.h diff --git a/lib/graph/rte_graph_model_rtc.h b/lib/graph/rte_graph_model_rtc.h new file mode 100644 index 0000000000..fb58730bde --- /dev/null +++ b/lib/graph/rte_graph_model_rtc.h @@ -0,0 +1,57 @@ +#include "rte_graph_worker_common.h" + +/** + * Perform graph walk on the circular buffer and invoke the process function + * of the nodes and collect the stats. + * + * @param graph + * Graph pointer returned from rte_graph_lookup function. + * + * @see rte_graph_lookup() + */ +static inline void +rte_graph_walk_rtc(struct rte_graph *graph) +{ + const rte_graph_off_t *cir_start = graph->cir_start; + const rte_node_t mask = graph->cir_mask; + uint32_t head = graph->head; + struct rte_node *node; + uint64_t start; + uint16_t rc; + void **objs; + + /* + * Walk on the source node(s) ((cir_start - head) -> cir_start) and then + * on the pending streams (cir_start -> (cir_start + mask) -> cir_start) + * in a circular buffer fashion. + * + * +-----+ <= cir_start - head [number of source nodes] + * | | + * | ... | <= source nodes + * | | + * +-----+ <= cir_start [head = 0] [tail = 0] + * | | + * | ... | <= pending streams + * | | + * +-----+ <= cir_start + mask + */ + while (likely(head != graph->tail)) { + node = (struct rte_node *)RTE_PTR_ADD(graph, cir_start[(int32_t)head++]); + RTE_ASSERT(node->fence == RTE_GRAPH_FENCE); + objs = node->objs; + rte_prefetch0(objs); + + if (rte_graph_has_stats_feature()) { + start = rte_rdtsc(); + rc = node->process(graph, node, objs, node->idx); + node->total_cycles += rte_rdtsc() - start; + node->total_calls++; + node->total_objs += rc; + } else { + node->process(graph, node, objs, node->idx); + } + node->idx = 0; + head = likely((int32_t)head > 0) ? head & mask : head; + } + graph->tail = 0; +} diff --git a/lib/graph/rte_graph_worker.h b/lib/graph/rte_graph_worker.h index 6dc7461659..54d1390786 100644 --- a/lib/graph/rte_graph_worker.h +++ b/lib/graph/rte_graph_worker.h @@ -1,122 +1,4 @@ -/* SPDX-License-Identifier: BSD-3-Clause - * Copyright(C) 2020 Marvell International Ltd. - */ - -#ifndef _RTE_GRAPH_WORKER_H_ -#define _RTE_GRAPH_WORKER_H_ - -/** - * @file rte_graph_worker.h - * - * @warning - * @b EXPERIMENTAL: - * All functions in this file may be changed or removed without prior notice. - * - * This API allows a worker thread to walk over a graph and nodes to create, - * process, enqueue and move streams of objects to the next nodes. - */ - -#include -#include -#include -#include -#include - -#include "rte_graph.h" - -#ifdef __cplusplus -extern "C" { -#endif - -/** - * @internal - * - * Data structure to hold graph data. - */ -struct rte_graph { - uint32_t tail; /**< Tail of circular buffer. */ - uint32_t head; /**< Head of circular buffer. */ - uint32_t cir_mask; /**< Circular buffer wrap around mask. */ - rte_node_t nb_nodes; /**< Number of nodes in the graph. */ - rte_graph_off_t *cir_start; /**< Pointer to circular buffer. */ - rte_graph_off_t nodes_start; /**< Offset at which node memory starts. */ - rte_graph_t id; /**< Graph identifier. */ - int socket; /**< Socket ID where memory is allocated. 
*/ - char name[RTE_GRAPH_NAMESIZE]; /**< Name of the graph. */ - uint64_t fence; /**< Fence. */ -} __rte_cache_aligned; - -/** - * @internal - * - * Data structure to hold node data. - */ -struct rte_node { - /* Slow path area */ - uint64_t fence; /**< Fence. */ - rte_graph_off_t next; /**< Index to next node. */ - rte_node_t id; /**< Node identifier. */ - rte_node_t parent_id; /**< Parent Node identifier. */ - rte_edge_t nb_edges; /**< Number of edges from this node. */ - uint32_t realloc_count; /**< Number of times realloced. */ - - char parent[RTE_NODE_NAMESIZE]; /**< Parent node name. */ - char name[RTE_NODE_NAMESIZE]; /**< Name of the node. */ - - /* Fast path area */ -#define RTE_NODE_CTX_SZ 16 - uint8_t ctx[RTE_NODE_CTX_SZ] __rte_cache_aligned; /**< Node Context. */ - uint16_t size; /**< Total number of objects available. */ - uint16_t idx; /**< Number of objects used. */ - rte_graph_off_t off; /**< Offset of node in the graph reel. */ - uint64_t total_cycles; /**< Cycles spent in this node. */ - uint64_t total_calls; /**< Calls done to this node. */ - uint64_t total_objs; /**< Objects processed by this node. */ - RTE_STD_C11 - union { - void **objs; /**< Array of object pointers. */ - uint64_t objs_u64; - }; - RTE_STD_C11 - union { - rte_node_process_t process; /**< Process function. */ - uint64_t process_u64; - }; - struct rte_node *nodes[] __rte_cache_min_aligned; /**< Next nodes. */ -} __rte_cache_aligned; - -/** - * @internal - * - * Allocate a stream of objects. - * - * If stream already exists then re-allocate it to a larger size. - * - * @param graph - * Pointer to the graph object. - * @param node - * Pointer to the node object. - */ -__rte_experimental -void __rte_node_stream_alloc(struct rte_graph *graph, struct rte_node *node); - -/** - * @internal - * - * Allocate a stream with requested number of objects. - * - * If stream already exists then re-allocate it to a larger size. - * - * @param graph - * Pointer to the graph object. - * @param node - * Pointer to the node object. - * @param req_size - * Number of objects to be allocated. - */ -__rte_experimental -void __rte_node_stream_alloc_size(struct rte_graph *graph, - struct rte_node *node, uint16_t req_size); +#include "rte_graph_model_rtc.h" /** * Perform graph walk on the circular buffer and invoke the process function @@ -131,381 +13,5 @@ __rte_experimental static inline void rte_graph_walk(struct rte_graph *graph) { - const rte_graph_off_t *cir_start = graph->cir_start; - const rte_node_t mask = graph->cir_mask; - uint32_t head = graph->head; - struct rte_node *node; - uint64_t start; - uint16_t rc; - void **objs; - - /* - * Walk on the source node(s) ((cir_start - head) -> cir_start) and then - * on the pending streams (cir_start -> (cir_start + mask) -> cir_start) - * in a circular buffer fashion. - * - * +-----+ <= cir_start - head [number of source nodes] - * | | - * | ... | <= source nodes - * | | - * +-----+ <= cir_start [head = 0] [tail = 0] - * | | - * | ... 
| <= pending streams - * | | - * +-----+ <= cir_start + mask - */ - while (likely(head != graph->tail)) { - node = (struct rte_node *)RTE_PTR_ADD(graph, cir_start[(int32_t)head++]); - RTE_ASSERT(node->fence == RTE_GRAPH_FENCE); - objs = node->objs; - rte_prefetch0(objs); - - if (rte_graph_has_stats_feature()) { - start = rte_rdtsc(); - rc = node->process(graph, node, objs, node->idx); - node->total_cycles += rte_rdtsc() - start; - node->total_calls++; - node->total_objs += rc; - } else { - node->process(graph, node, objs, node->idx); - } - node->idx = 0; - head = likely((int32_t)head > 0) ? head & mask : head; - } - graph->tail = 0; -} - -/* Fast path helper functions */ - -/** - * @internal - * - * Enqueue a given node to the tail of the graph reel. - * - * @param graph - * Pointer Graph object. - * @param node - * Pointer to node object to be enqueued. - */ -static __rte_always_inline void -__rte_node_enqueue_tail_update(struct rte_graph *graph, struct rte_node *node) -{ - uint32_t tail; - - tail = graph->tail; - graph->cir_start[tail++] = node->off; - graph->tail = tail & graph->cir_mask; -} - -/** - * @internal - * - * Enqueue sequence prologue function. - * - * Updates the node to tail of graph reel and resizes the number of objects - * available in the stream as needed. - * - * @param graph - * Pointer to the graph object. - * @param node - * Pointer to the node object. - * @param idx - * Index at which the object enqueue starts from. - * @param space - * Space required for the object enqueue. - */ -static __rte_always_inline void -__rte_node_enqueue_prologue(struct rte_graph *graph, struct rte_node *node, - const uint16_t idx, const uint16_t space) -{ - - /* Add to the pending stream list if the node is new */ - if (idx == 0) - __rte_node_enqueue_tail_update(graph, node); - - if (unlikely(node->size < (idx + space))) - __rte_node_stream_alloc_size(graph, node, node->size + space); -} - -/** - * @internal - * - * Get the node pointer from current node edge id. - * - * @param node - * Current node pointer. - * @param next - * Edge id of the required node. - * - * @return - * Pointer to the node denoted by the edge id. - */ -static __rte_always_inline struct rte_node * -__rte_node_next_node_get(struct rte_node *node, rte_edge_t next) -{ - RTE_ASSERT(next < node->nb_edges); - RTE_ASSERT(node->fence == RTE_GRAPH_FENCE); - node = node->nodes[next]; - RTE_ASSERT(node->fence == RTE_GRAPH_FENCE); - - return node; -} - -/** - * Enqueue the objs to next node for further processing and set - * the next node to pending state in the circular buffer. - * - * @param graph - * Graph pointer returned from rte_graph_lookup(). - * @param node - * Current node pointer. - * @param next - * Relative next node index to enqueue objs. - * @param objs - * Objs to enqueue. - * @param nb_objs - * Number of objs to enqueue. - */ -__rte_experimental -static inline void -rte_node_enqueue(struct rte_graph *graph, struct rte_node *node, - rte_edge_t next, void **objs, uint16_t nb_objs) -{ - node = __rte_node_next_node_get(node, next); - const uint16_t idx = node->idx; - - __rte_node_enqueue_prologue(graph, node, idx, nb_objs); - - rte_memcpy(&node->objs[idx], objs, nb_objs * sizeof(void *)); - node->idx = idx + nb_objs; + rte_graph_walk_rtc(graph); } - -/** - * Enqueue only one obj to next node for further processing and - * set the next node to pending state in the circular buffer. - * - * @param graph - * Graph pointer returned from rte_graph_lookup(). - * @param node - * Current node pointer. 
- * @param next - * Relative next node index to enqueue objs. - * @param obj - * Obj to enqueue. - */ -__rte_experimental -static inline void -rte_node_enqueue_x1(struct rte_graph *graph, struct rte_node *node, - rte_edge_t next, void *obj) -{ - node = __rte_node_next_node_get(node, next); - uint16_t idx = node->idx; - - __rte_node_enqueue_prologue(graph, node, idx, 1); - - node->objs[idx++] = obj; - node->idx = idx; -} - -/** - * Enqueue only two objs to next node for further processing and - * set the next node to pending state in the circular buffer. - * Same as rte_node_enqueue_x1 but enqueue two objs. - * - * @param graph - * Graph pointer returned from rte_graph_lookup(). - * @param node - * Current node pointer. - * @param next - * Relative next node index to enqueue objs. - * @param obj0 - * Obj to enqueue. - * @param obj1 - * Obj to enqueue. - */ -__rte_experimental -static inline void -rte_node_enqueue_x2(struct rte_graph *graph, struct rte_node *node, - rte_edge_t next, void *obj0, void *obj1) -{ - node = __rte_node_next_node_get(node, next); - uint16_t idx = node->idx; - - __rte_node_enqueue_prologue(graph, node, idx, 2); - - node->objs[idx++] = obj0; - node->objs[idx++] = obj1; - node->idx = idx; -} - -/** - * Enqueue only four objs to next node for further processing and - * set the next node to pending state in the circular buffer. - * Same as rte_node_enqueue_x1 but enqueue four objs. - * - * @param graph - * Graph pointer returned from rte_graph_lookup(). - * @param node - * Current node pointer. - * @param next - * Relative next node index to enqueue objs. - * @param obj0 - * 1st obj to enqueue. - * @param obj1 - * 2nd obj to enqueue. - * @param obj2 - * 3rd obj to enqueue. - * @param obj3 - * 4th obj to enqueue. - */ -__rte_experimental -static inline void -rte_node_enqueue_x4(struct rte_graph *graph, struct rte_node *node, - rte_edge_t next, void *obj0, void *obj1, void *obj2, - void *obj3) -{ - node = __rte_node_next_node_get(node, next); - uint16_t idx = node->idx; - - __rte_node_enqueue_prologue(graph, node, idx, 4); - - node->objs[idx++] = obj0; - node->objs[idx++] = obj1; - node->objs[idx++] = obj2; - node->objs[idx++] = obj3; - node->idx = idx; -} - -/** - * Enqueue objs to multiple next nodes for further processing and - * set the next nodes to pending state in the circular buffer. - * objs[i] will be enqueued to nexts[i]. - * - * @param graph - * Graph pointer returned from rte_graph_lookup(). - * @param node - * Current node pointer. - * @param nexts - * List of relative next node indices to enqueue objs. - * @param objs - * List of objs to enqueue. - * @param nb_objs - * Number of objs to enqueue. - */ -__rte_experimental -static inline void -rte_node_enqueue_next(struct rte_graph *graph, struct rte_node *node, - rte_edge_t *nexts, void **objs, uint16_t nb_objs) -{ - uint16_t i; - - for (i = 0; i < nb_objs; i++) - rte_node_enqueue_x1(graph, node, nexts[i], objs[i]); -} - -/** - * Get the stream of next node to enqueue the objs. - * Once done with the updating the objs, needs to call - * rte_node_next_stream_put to put the next node to pending state. - * - * @param graph - * Graph pointer returned from rte_graph_lookup(). - * @param node - * Current node pointer. - * @param next - * Relative next node index to get stream. - * @param nb_objs - * Requested free size of the next stream. - * - * @return - * Valid next stream on success. - * - * @see rte_node_next_stream_put(). 
- */ -__rte_experimental -static inline void ** -rte_node_next_stream_get(struct rte_graph *graph, struct rte_node *node, - rte_edge_t next, uint16_t nb_objs) -{ - node = __rte_node_next_node_get(node, next); - const uint16_t idx = node->idx; - uint16_t free_space = node->size - idx; - - if (unlikely(free_space < nb_objs)) - __rte_node_stream_alloc_size(graph, node, node->size + nb_objs); - - return &node->objs[idx]; -} - -/** - * Put the next stream to pending state in the circular buffer - * for further processing. Should be invoked after rte_node_next_stream_get(). - * - * @param graph - * Graph pointer returned from rte_graph_lookup(). - * @param node - * Current node pointer. - * @param next - * Relative next node index.. - * @param idx - * Number of objs updated in the stream after getting the stream using - * rte_node_next_stream_get. - * - * @see rte_node_next_stream_get(). - */ -__rte_experimental -static inline void -rte_node_next_stream_put(struct rte_graph *graph, struct rte_node *node, - rte_edge_t next, uint16_t idx) -{ - if (unlikely(!idx)) - return; - - node = __rte_node_next_node_get(node, next); - if (node->idx == 0) - __rte_node_enqueue_tail_update(graph, node); - - node->idx += idx; -} - -/** - * Home run scenario, Enqueue all the objs of current node to next - * node in optimized way by swapping the streams of both nodes. - * Performs good when next node is already not in pending state. - * If next node is already in pending state then normal enqueue - * will be used. - * - * @param graph - * Graph pointer returned from rte_graph_lookup(). - * @param src - * Current node pointer. - * @param next - * Relative next node index. - */ -__rte_experimental -static inline void -rte_node_next_stream_move(struct rte_graph *graph, struct rte_node *src, - rte_edge_t next) -{ - struct rte_node *dst = __rte_node_next_node_get(src, next); - - /* Let swap the pointers if dst don't have valid objs */ - if (likely(dst->idx == 0)) { - void **dobjs = dst->objs; - uint16_t dsz = dst->size; - dst->objs = src->objs; - dst->size = src->size; - src->objs = dobjs; - src->size = dsz; - dst->idx = src->idx; - __rte_node_enqueue_tail_update(graph, dst); - } else { /* Move the objects from src node to dst node */ - rte_node_enqueue(graph, src, next, src->objs, src->idx); - } -} - -#ifdef __cplusplus -} -#endif - -#endif /* _RTE_GRAPH_WORKER_H_ */ diff --git a/lib/graph/rte_graph_worker_common.h b/lib/graph/rte_graph_worker_common.h new file mode 100644 index 0000000000..91a5de7fa4 --- /dev/null +++ b/lib/graph/rte_graph_worker_common.h @@ -0,0 +1,456 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(C) 2020 Marvell International Ltd. + */ + +#ifndef _RTE_GRAPH_WORKER_COMMON_H_ +#define _RTE_GRAPH_WORKER_COMMON_H_ + +/** + * @file rte_graph_worker.h + * + * @warning + * @b EXPERIMENTAL: + * All functions in this file may be changed or removed without prior notice. + * + * This API allows a worker thread to walk over a graph and nodes to create, + * process, enqueue and move streams of objects to the next nodes. + */ + +#include +#include +#include +#include +#include + +#include "rte_graph.h" + +#ifdef __cplusplus +extern "C" { +#endif + +/** + * @internal + * + * Data structure to hold graph data. + */ +struct rte_graph { + uint32_t tail; /**< Tail of circular buffer. */ + uint32_t head; /**< Head of circular buffer. */ + uint32_t cir_mask; /**< Circular buffer wrap around mask. */ + rte_node_t nb_nodes; /**< Number of nodes in the graph. 
*/ + rte_graph_off_t *cir_start; /**< Pointer to circular buffer. */ + rte_graph_off_t nodes_start; /**< Offset at which node memory starts. */ + rte_graph_t id; /**< Graph identifier. */ + int socket; /**< Socket ID where memory is allocated. */ + char name[RTE_GRAPH_NAMESIZE]; /**< Name of the graph. */ + uint64_t fence; /**< Fence. */ +} __rte_cache_aligned; + +/** + * @internal + * + * Data structure to hold node data. + */ +struct rte_node { + /* Slow path area */ + uint64_t fence; /**< Fence. */ + rte_graph_off_t next; /**< Index to next node. */ + rte_node_t id; /**< Node identifier. */ + rte_node_t parent_id; /**< Parent Node identifier. */ + rte_edge_t nb_edges; /**< Number of edges from this node. */ + uint32_t realloc_count; /**< Number of times realloced. */ + + char parent[RTE_NODE_NAMESIZE]; /**< Parent node name. */ + char name[RTE_NODE_NAMESIZE]; /**< Name of the node. */ + + /* Fast path area */ +#define RTE_NODE_CTX_SZ 16 + uint8_t ctx[RTE_NODE_CTX_SZ] __rte_cache_aligned; /**< Node Context. */ + uint16_t size; /**< Total number of objects available. */ + uint16_t idx; /**< Number of objects used. */ + rte_graph_off_t off; /**< Offset of node in the graph reel. */ + uint64_t total_cycles; /**< Cycles spent in this node. */ + uint64_t total_calls; /**< Calls done to this node. */ + uint64_t total_objs; /**< Objects processed by this node. */ + + RTE_STD_C11 + union { + void **objs; /**< Array of object pointers. */ + uint64_t objs_u64; + }; + RTE_STD_C11 + union { + rte_node_process_t process; /**< Process function. */ + uint64_t process_u64; + }; + struct rte_node *nodes[] __rte_cache_min_aligned; /**< Next nodes. */ +} __rte_cache_aligned; + +/** + * @internal + * + * Allocate a stream of objects. + * + * If stream already exists then re-allocate it to a larger size. + * + * @param graph + * Pointer to the graph object. + * @param node + * Pointer to the node object. + */ +__rte_experimental +void __rte_node_stream_alloc(struct rte_graph *graph, struct rte_node *node); + +/** + * @internal + * + * Allocate a stream with requested number of objects. + * + * If stream already exists then re-allocate it to a larger size. + * + * @param graph + * Pointer to the graph object. + * @param node + * Pointer to the node object. + * @param req_size + * Number of objects to be allocated. + */ +__rte_experimental +void __rte_node_stream_alloc_size(struct rte_graph *graph, + struct rte_node *node, uint16_t req_size); + +/* Fast path helper functions */ + +/** + * @internal + * + * Enqueue a given node to the tail of the graph reel. + * + * @param graph + * Pointer Graph object. + * @param node + * Pointer to node object to be enqueued. + */ +static __rte_always_inline void +__rte_node_enqueue_tail_update(struct rte_graph *graph, struct rte_node *node) +{ + uint32_t tail; + + tail = graph->tail; + graph->cir_start[tail++] = node->off; + graph->tail = tail & graph->cir_mask; +} + +/** + * @internal + * + * Enqueue sequence prologue function. + * + * Updates the node to tail of graph reel and resizes the number of objects + * available in the stream as needed. + * + * @param graph + * Pointer to the graph object. + * @param node + * Pointer to the node object. + * @param idx + * Index at which the object enqueue starts from. + * @param space + * Space required for the object enqueue. 
+ */ +static __rte_always_inline void +__rte_node_enqueue_prologue(struct rte_graph *graph, struct rte_node *node, + const uint16_t idx, const uint16_t space) +{ + + /* Add to the pending stream list if the node is new */ + if (idx == 0) + __rte_node_enqueue_tail_update(graph, node); + + if (unlikely(node->size < (idx + space))) + __rte_node_stream_alloc_size(graph, node, node->size + space); +} + +/** + * @internal + * + * Get the node pointer from current node edge id. + * + * @param node + * Current node pointer. + * @param next + * Edge id of the required node. + * + * @return + * Pointer to the node denoted by the edge id. + */ +static __rte_always_inline struct rte_node * +__rte_node_next_node_get(struct rte_node *node, rte_edge_t next) +{ + RTE_ASSERT(next < node->nb_edges); + RTE_ASSERT(node->fence == RTE_GRAPH_FENCE); + node = node->nodes[next]; + RTE_ASSERT(node->fence == RTE_GRAPH_FENCE); + + return node; +} + +/** + * Enqueue the objs to next node for further processing and set + * the next node to pending state in the circular buffer. + * + * @param graph + * Graph pointer returned from rte_graph_lookup(). + * @param node + * Current node pointer. + * @param next + * Relative next node index to enqueue objs. + * @param objs + * Objs to enqueue. + * @param nb_objs + * Number of objs to enqueue. + */ +__rte_experimental +static inline void +rte_node_enqueue(struct rte_graph *graph, struct rte_node *node, + rte_edge_t next, void **objs, uint16_t nb_objs) +{ + node = __rte_node_next_node_get(node, next); + const uint16_t idx = node->idx; + + __rte_node_enqueue_prologue(graph, node, idx, nb_objs); + + rte_memcpy(&node->objs[idx], objs, nb_objs * sizeof(void *)); + node->idx = idx + nb_objs; +} + +/** + * Enqueue only one obj to next node for further processing and + * set the next node to pending state in the circular buffer. + * + * @param graph + * Graph pointer returned from rte_graph_lookup(). + * @param node + * Current node pointer. + * @param next + * Relative next node index to enqueue objs. + * @param obj + * Obj to enqueue. + */ +__rte_experimental +static inline void +rte_node_enqueue_x1(struct rte_graph *graph, struct rte_node *node, + rte_edge_t next, void *obj) +{ + node = __rte_node_next_node_get(node, next); + uint16_t idx = node->idx; + + __rte_node_enqueue_prologue(graph, node, idx, 1); + + node->objs[idx++] = obj; + node->idx = idx; +} + +/** + * Enqueue only two objs to next node for further processing and + * set the next node to pending state in the circular buffer. + * Same as rte_node_enqueue_x1 but enqueue two objs. + * + * @param graph + * Graph pointer returned from rte_graph_lookup(). + * @param node + * Current node pointer. + * @param next + * Relative next node index to enqueue objs. + * @param obj0 + * Obj to enqueue. + * @param obj1 + * Obj to enqueue. + */ +__rte_experimental +static inline void +rte_node_enqueue_x2(struct rte_graph *graph, struct rte_node *node, + rte_edge_t next, void *obj0, void *obj1) +{ + node = __rte_node_next_node_get(node, next); + uint16_t idx = node->idx; + + __rte_node_enqueue_prologue(graph, node, idx, 2); + + node->objs[idx++] = obj0; + node->objs[idx++] = obj1; + node->idx = idx; +} + +/** + * Enqueue only four objs to next node for further processing and + * set the next node to pending state in the circular buffer. + * Same as rte_node_enqueue_x1 but enqueue four objs. + * + * @param graph + * Graph pointer returned from rte_graph_lookup(). + * @param node + * Current node pointer. 
+ * @param next + * Relative next node index to enqueue objs. + * @param obj0 + * 1st obj to enqueue. + * @param obj1 + * 2nd obj to enqueue. + * @param obj2 + * 3rd obj to enqueue. + * @param obj3 + * 4th obj to enqueue. + */ +__rte_experimental +static inline void +rte_node_enqueue_x4(struct rte_graph *graph, struct rte_node *node, + rte_edge_t next, void *obj0, void *obj1, void *obj2, + void *obj3) +{ + node = __rte_node_next_node_get(node, next); + uint16_t idx = node->idx; + + __rte_node_enqueue_prologue(graph, node, idx, 4); + + node->objs[idx++] = obj0; + node->objs[idx++] = obj1; + node->objs[idx++] = obj2; + node->objs[idx++] = obj3; + node->idx = idx; +} + +/** + * Enqueue objs to multiple next nodes for further processing and + * set the next nodes to pending state in the circular buffer. + * objs[i] will be enqueued to nexts[i]. + * + * @param graph + * Graph pointer returned from rte_graph_lookup(). + * @param node + * Current node pointer. + * @param nexts + * List of relative next node indices to enqueue objs. + * @param objs + * List of objs to enqueue. + * @param nb_objs + * Number of objs to enqueue. + */ +__rte_experimental +static inline void +rte_node_enqueue_next(struct rte_graph *graph, struct rte_node *node, + rte_edge_t *nexts, void **objs, uint16_t nb_objs) +{ + uint16_t i; + + for (i = 0; i < nb_objs; i++) + rte_node_enqueue_x1(graph, node, nexts[i], objs[i]); +} + +/** + * Get the stream of next node to enqueue the objs. + * Once done with the updating the objs, needs to call + * rte_node_next_stream_put to put the next node to pending state. + * + * @param graph + * Graph pointer returned from rte_graph_lookup(). + * @param node + * Current node pointer. + * @param next + * Relative next node index to get stream. + * @param nb_objs + * Requested free size of the next stream. + * + * @return + * Valid next stream on success. + * + * @see rte_node_next_stream_put(). + */ +__rte_experimental +static inline void ** +rte_node_next_stream_get(struct rte_graph *graph, struct rte_node *node, + rte_edge_t next, uint16_t nb_objs) +{ + node = __rte_node_next_node_get(node, next); + const uint16_t idx = node->idx; + uint16_t free_space = node->size - idx; + + if (unlikely(free_space < nb_objs)) + __rte_node_stream_alloc_size(graph, node, node->size + nb_objs); + + return &node->objs[idx]; +} + +/** + * Put the next stream to pending state in the circular buffer + * for further processing. Should be invoked after rte_node_next_stream_get(). + * + * @param graph + * Graph pointer returned from rte_graph_lookup(). + * @param node + * Current node pointer. + * @param next + * Relative next node index.. + * @param idx + * Number of objs updated in the stream after getting the stream using + * rte_node_next_stream_get. + * + * @see rte_node_next_stream_get(). + */ +__rte_experimental +static inline void +rte_node_next_stream_put(struct rte_graph *graph, struct rte_node *node, + rte_edge_t next, uint16_t idx) +{ + if (unlikely(!idx)) + return; + + node = __rte_node_next_node_get(node, next); + if (node->idx == 0) + __rte_node_enqueue_tail_update(graph, node); + + node->idx += idx; +} + +/** + * Home run scenario, Enqueue all the objs of current node to next + * node in optimized way by swapping the streams of both nodes. + * Performs good when next node is already not in pending state. + * If next node is already in pending state then normal enqueue + * will be used. + * + * @param graph + * Graph pointer returned from rte_graph_lookup(). + * @param src + * Current node pointer. 
+ * @param next
+ *   Relative next node index.
+ */
+__rte_experimental
+static inline void
+rte_node_next_stream_move(struct rte_graph *graph, struct rte_node *src,
+			  rte_edge_t next)
+{
+	struct rte_node *dst = __rte_node_next_node_get(src, next);
+
+	/* Let swap the pointers if dst don't have valid objs */
+	if (likely(dst->idx == 0)) {
+		void **dobjs = dst->objs;
+		uint16_t dsz = dst->size;
+
+		dst->objs = src->objs;
+		dst->size = src->size;
+		src->objs = dobjs;
+		src->size = dsz;
+		dst->idx = src->idx;
+		__rte_node_enqueue_tail_update(graph, dst);
+	} else { /* Move the objects from src node to dst node */
+		rte_node_enqueue(graph, src, next, src->objs, src->idx);
+	}
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_GRAPH_WORKER_COMMON_H_ */

From patchwork Thu Nov 17 05:09:15 2022
X-Patchwork-Submitter: "Yan, Zhirun"
X-Patchwork-Id: 119911
X-Patchwork-Delegate: thomas@monjalon.net
From: Zhirun Yan
To: dev@dpdk.org, jerinj@marvell.com, kirankumark@marvell.com, ndabilpuram@marvell.com
Cc: cunming.liang@intel.com, haiyue.wang@intel.com, Zhirun Yan
Subject: [PATCH v1 02/13] graph: move node process into inline function
Date: Thu, 17 Nov 2022 13:09:15 +0800
Message-Id: <20221117050926.136974-3-zhirun.yan@intel.com>
In-Reply-To: <20221117050926.136974-1-zhirun.yan@intel.com>
References: <20221117050926.136974-1-zhirun.yan@intel.com>

Node processing is a single, reusable block of code; move it into an inline
function so other worker models can share it.
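[Editor's note: the inline helper factored out here invokes node->process,
whose rte_node_process_t signature can be read off the call site
(graph, node, objs, node->idx) and return-count usage. A minimal sketch (not
part of this patch) of such a callback; the pass-through-to-edge-0 behavior is
an illustrative assumption.]

	static uint16_t
	demo_node_process(struct rte_graph *graph, struct rte_node *node,
			  void **objs, uint16_t nb_objs)
	{
		/* Forward all received objects to next-node edge 0 and
		 * report how many were processed (hypothetical behavior). */
		rte_node_enqueue(graph, node, 0, objs, nb_objs);
		return nb_objs;
	}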
Signed-off-by: Haiyue Wang Signed-off-by: Cunming Liang Signed-off-by: Zhirun Yan Acked-by: Jerin Jacob --- lib/graph/rte_graph_model_rtc.h | 18 +--------------- lib/graph/rte_graph_worker_common.h | 33 +++++++++++++++++++++++++++++ 2 files changed, 34 insertions(+), 17 deletions(-) diff --git a/lib/graph/rte_graph_model_rtc.h b/lib/graph/rte_graph_model_rtc.h index fb58730bde..c80b0ce962 100644 --- a/lib/graph/rte_graph_model_rtc.h +++ b/lib/graph/rte_graph_model_rtc.h @@ -16,9 +16,6 @@ rte_graph_walk_rtc(struct rte_graph *graph) const rte_node_t mask = graph->cir_mask; uint32_t head = graph->head; struct rte_node *node; - uint64_t start; - uint16_t rc; - void **objs; /* * Walk on the source node(s) ((cir_start - head) -> cir_start) and then @@ -37,20 +34,7 @@ rte_graph_walk_rtc(struct rte_graph *graph) */ while (likely(head != graph->tail)) { node = (struct rte_node *)RTE_PTR_ADD(graph, cir_start[(int32_t)head++]); - RTE_ASSERT(node->fence == RTE_GRAPH_FENCE); - objs = node->objs; - rte_prefetch0(objs); - - if (rte_graph_has_stats_feature()) { - start = rte_rdtsc(); - rc = node->process(graph, node, objs, node->idx); - node->total_cycles += rte_rdtsc() - start; - node->total_calls++; - node->total_objs += rc; - } else { - node->process(graph, node, objs, node->idx); - } - node->idx = 0; + __rte_node_process(graph, node); head = likely((int32_t)head > 0) ? head & mask : head; } graph->tail = 0; diff --git a/lib/graph/rte_graph_worker_common.h b/lib/graph/rte_graph_worker_common.h index 91a5de7fa4..b7b2bb958c 100644 --- a/lib/graph/rte_graph_worker_common.h +++ b/lib/graph/rte_graph_worker_common.h @@ -121,6 +121,39 @@ void __rte_node_stream_alloc_size(struct rte_graph *graph, /* Fast path helper functions */ +/** + * @internal + * + * Enqueue a given node to the tail of the graph reel. + * + * @param graph + * Pointer Graph object. + * @param node + * Pointer to node object to be enqueued. 
+ */
+static __rte_always_inline void
+__rte_node_process(struct rte_graph *graph, struct rte_node *node)
+{
+	uint64_t start;
+	uint16_t rc;
+	void **objs;
+
+	RTE_ASSERT(node->fence == RTE_GRAPH_FENCE);
+	objs = node->objs;
+	rte_prefetch0(objs);
+
+	if (rte_graph_has_stats_feature()) {
+		start = rte_rdtsc();
+		rc = node->process(graph, node, objs, node->idx);
+		node->total_cycles += rte_rdtsc() - start;
+		node->total_calls++;
+		node->total_objs += rc;
+	} else {
+		node->process(graph, node, objs, node->idx);
+	}
+	node->idx = 0;
+}
+
 /**
  * @internal
  *

From patchwork Thu Nov 17 05:09:16 2022
X-Patchwork-Submitter: "Yan, Zhirun"
X-Patchwork-Id: 119912
X-Patchwork-Delegate: thomas@monjalon.net
From: Zhirun Yan
To: dev@dpdk.org, jerinj@marvell.com, kirankumark@marvell.com, ndabilpuram@marvell.com
Cc: cunming.liang@intel.com, haiyue.wang@intel.com, Zhirun Yan
Subject: [PATCH v1 03/13] graph: add macro to walk on graph circular buffer
Date: Thu, 17 Nov 2022 13:09:16 +0800
Message-Id: <20221117050926.136974-4-zhirun.yan@intel.com>
In-Reply-To: <20221117050926.136974-1-zhirun.yan@intel.com>
References: <20221117050926.136974-1-zhirun.yan@intel.com>

Walking the graph circular buffer is common code; turn it into a macro so that
other worker models can reuse it.
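[Editor's note: a sketch (not part of this patch) of how another worker model
could reuse the rte_graph_walk_node() macro introduced below together with
__rte_node_process() from the previous patch; the dispatch comment marks the
assumed extension point.]

	static inline void
	demo_graph_walk_custom(struct rte_graph *graph)
	{
		uint32_t head = graph->head;
		struct rte_node *node;

		rte_graph_walk_node(graph, head, node) {
			/* A non-RTC model could dispatch the node to another
			 * core here instead of processing it in place. */
			__rte_node_process(graph, node);
		}
		graph->tail = 0;
	}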
Signed-off-by: Haiyue Wang Signed-off-by: Cunming Liang Signed-off-by: Zhirun Yan --- lib/graph/rte_graph_model_rtc.h | 23 ++--------------------- lib/graph/rte_graph_worker_common.h | 23 +++++++++++++++++++++++ 2 files changed, 25 insertions(+), 21 deletions(-) diff --git a/lib/graph/rte_graph_model_rtc.h b/lib/graph/rte_graph_model_rtc.h index c80b0ce962..5474b06063 100644 --- a/lib/graph/rte_graph_model_rtc.h +++ b/lib/graph/rte_graph_model_rtc.h @@ -12,30 +12,11 @@ static inline void rte_graph_walk_rtc(struct rte_graph *graph) { - const rte_graph_off_t *cir_start = graph->cir_start; - const rte_node_t mask = graph->cir_mask; uint32_t head = graph->head; struct rte_node *node; - /* - * Walk on the source node(s) ((cir_start - head) -> cir_start) and then - * on the pending streams (cir_start -> (cir_start + mask) -> cir_start) - * in a circular buffer fashion. - * - * +-----+ <= cir_start - head [number of source nodes] - * | | - * | ... | <= source nodes - * | | - * +-----+ <= cir_start [head = 0] [tail = 0] - * | | - * | ... | <= pending streams - * | | - * +-----+ <= cir_start + mask - */ - while (likely(head != graph->tail)) { - node = (struct rte_node *)RTE_PTR_ADD(graph, cir_start[(int32_t)head++]); + rte_graph_walk_node(graph, head, node) __rte_node_process(graph, node); - head = likely((int32_t)head > 0) ? head & mask : head; - } + graph->tail = 0; } diff --git a/lib/graph/rte_graph_worker_common.h b/lib/graph/rte_graph_worker_common.h index b7b2bb958c..df33204336 100644 --- a/lib/graph/rte_graph_worker_common.h +++ b/lib/graph/rte_graph_worker_common.h @@ -121,6 +121,29 @@ void __rte_node_stream_alloc_size(struct rte_graph *graph, /* Fast path helper functions */ +/** + * Macro to walk on the source node(s) ((cir_start - head) -> cir_start) + * and then on the pending streams + * (cir_start -> (cir_start + mask) -> cir_start) + * in a circular buffer fashion. + * + * +-----+ <= cir_start - head [number of source nodes] + * | | + * | ... | <= source nodes + * | | + * +-----+ <= cir_start [head = 0] [tail = 0] + * | | + * | ... | <= pending streams + * | | + * +-----+ <= cir_start + mask + */ +#define rte_graph_walk_node(graph, head, node) \ + for ((node) = RTE_PTR_ADD((graph), (graph)->cir_start[(int32_t)(head)]); \ + likely((head) != (graph)->tail); \ + (head)++, \ + (node) = RTE_PTR_ADD((graph), (graph)->cir_start[(int32_t)(head)]), \ + (head) = likely((int32_t)(head) > 0) ? 
(head) & (graph)->cir_mask : (head))
+
 /**
  * @internal
  *

From patchwork Thu Nov 17 05:09:17 2022
X-Patchwork-Submitter: "Yan, Zhirun"
X-Patchwork-Id: 119913
X-Patchwork-Delegate: thomas@monjalon.net
From: Zhirun Yan
To: dev@dpdk.org, jerinj@marvell.com, kirankumark@marvell.com, ndabilpuram@marvell.com
Cc: cunming.liang@intel.com, haiyue.wang@intel.com, Zhirun Yan
Subject: [PATCH v1 04/13] graph: add get/set graph worker model APIs
Date: Thu, 17 Nov 2022 13:09:17 +0800
Message-Id: <20221117050926.136974-5-zhirun.yan@intel.com>
In-Reply-To: <20221117050926.136974-1-zhirun.yan@intel.com>
References: <20221117050926.136974-1-zhirun.yan@intel.com>

Add new get/set APIs to configure the graph worker model, which determines
the walk model to be used at runtime.
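[Editor's note: a sketch (not part of this patch) of the expected call
sequence for the APIs added below, assuming it runs on the main lcore during
setup, before workers start walking; "graph" stands for a pointer obtained
from rte_graph_lookup().]

	/* Pick the run-to-completion model; the setter falls back to the
	 * default model and returns -1 if the value is out of range. */
	if (rte_graph_worker_model_set(RTE_GRAPH_MODEL_RTC) != 0)
		printf("invalid worker model\n");

	/* Workers can query the model to choose a walk implementation. */
	if (rte_graph_worker_model_get() == RTE_GRAPH_MODEL_RTC)
		rte_graph_walk_rtc(graph);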
Signed-off-by: Haiyue Wang Signed-off-by: Cunming Liang Signed-off-by: Zhirun Yan --- lib/graph/rte_graph_worker.h | 51 +++++++++++++++++++++++++++++ lib/graph/rte_graph_worker_common.h | 13 ++++++++ lib/graph/version.map | 3 ++ 3 files changed, 67 insertions(+) diff --git a/lib/graph/rte_graph_worker.h b/lib/graph/rte_graph_worker.h index 54d1390786..a0ea0df153 100644 --- a/lib/graph/rte_graph_worker.h +++ b/lib/graph/rte_graph_worker.h @@ -1,5 +1,56 @@ #include "rte_graph_model_rtc.h" +static enum rte_graph_worker_model worker_model = RTE_GRAPH_MODEL_DEFAULT; + +/** + * @warning + * @b EXPERIMENTAL: this API may change, or be removed, without prior notice + * Set the graph worker model + * + * @note This function does not perform any locking, and is only safe to call + * before graph running. + * + * @param name + * Name of the graph worker model. + * + * @return + * 0 on success, -1 otherwise. + */ +__rte_experimental +static inline int +rte_graph_worker_model_set(enum rte_graph_worker_model model) +{ + if (model >= RTE_GRAPH_MODEL_MAX) + goto fail; + + worker_model = model; + return 0; + +fail: + worker_model = RTE_GRAPH_MODEL_DEFAULT; + return -1; +} + +/** + * @warning + * @b EXPERIMENTAL: this API may change, or be removed, without prior notice + * + * Get the graph worker model + * + * @param name + * Name of the graph worker model. + * + * @return + * Graph worker model on success. + */ +__rte_experimental +static inline +enum rte_graph_worker_model +rte_graph_worker_model_get(void) +{ + return worker_model; +} + /** * Perform graph walk on the circular buffer and invoke the process function * of the nodes and collect the stats. diff --git a/lib/graph/rte_graph_worker_common.h b/lib/graph/rte_graph_worker_common.h index df33204336..507a344afd 100644 --- a/lib/graph/rte_graph_worker_common.h +++ b/lib/graph/rte_graph_worker_common.h @@ -86,6 +86,19 @@ struct rte_node { struct rte_node *nodes[] __rte_cache_min_aligned; /**< Next nodes. 
*/
 } __rte_cache_aligned;
 
+
+
+/** Graph worker models */
+enum rte_graph_worker_model {
+#define WORKER_MODEL_DEFAULT	"default"
+	RTE_GRAPH_MODEL_DEFAULT = 0,
+#define WORKER_MODEL_RTC	"rtc"
+	RTE_GRAPH_MODEL_RTC,
+#define WORKER_MODEL_GENERIC	"generic"
+	RTE_GRAPH_MODEL_GENERIC,
+	RTE_GRAPH_MODEL_MAX,
+};
+
 /**
  * @internal
  *
diff --git a/lib/graph/version.map b/lib/graph/version.map
index 13b838752d..eea73ec9ca 100644
--- a/lib/graph/version.map
+++ b/lib/graph/version.map
@@ -43,5 +43,8 @@ EXPERIMENTAL {
 	rte_node_next_stream_put;
 	rte_node_next_stream_move;
 
+	rte_graph_worker_model_set;
+	rte_graph_worker_model_get;
+
 	local: *;
 };

From patchwork Thu Nov 17 05:09:18 2022
X-Patchwork-Submitter: "Yan, Zhirun"
X-Patchwork-Id: 119914
X-Patchwork-Delegate: thomas@monjalon.net
From: Zhirun Yan
To: dev@dpdk.org, jerinj@marvell.com, kirankumark@marvell.com, ndabilpuram@marvell.com
Cc: cunming.liang@intel.com, haiyue.wang@intel.com, Zhirun Yan
Subject: [PATCH v1 05/13] graph: introduce core affinity API
Date: Thu, 17 Nov 2022 13:09:18 +0800
Message-Id: <20221117050926.136974-6-zhirun.yan@intel.com>
In-Reply-To: <20221117050926.136974-1-zhirun.yan@intel.com>
References: <20221117050926.136974-1-zhirun.yan@intel.com>

1. Add lcore_id to struct node to hold the affinity core ID.
2. Implement rte_node_model_generic_set_lcore_affinity() to bind a node to
   one lcore.
3. Update the version map for the graph public API.
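[Editor's note: a sketch (not part of this patch) of pinning a node with the
new call; the node name "pkt_cls" and lcore 2 are illustrative assumptions.]

	/* Bind node "pkt_cls" to lcore 2. Returns 0 on success and
	 * -EINVAL if the node is unknown or the lcore id is invalid. */
	if (rte_node_model_generic_set_lcore_affinity("pkt_cls", 2) != 0)
		printf("failed to set node affinity\n");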
Signed-off-by: Haiyue Wang Signed-off-by: Cunming Liang Signed-off-by: Zhirun Yan --- lib/graph/graph_private.h | 1 + lib/graph/meson.build | 1 + lib/graph/node.c | 1 + lib/graph/rte_graph_model_generic.c | 31 +++++++++++++++++++++ lib/graph/rte_graph_model_generic.h | 43 +++++++++++++++++++++++++++++ lib/graph/version.map | 2 ++ 6 files changed, 79 insertions(+) create mode 100644 lib/graph/rte_graph_model_generic.c create mode 100644 lib/graph/rte_graph_model_generic.h diff --git a/lib/graph/graph_private.h b/lib/graph/graph_private.h index f9a85c8926..627090f802 100644 --- a/lib/graph/graph_private.h +++ b/lib/graph/graph_private.h @@ -49,6 +49,7 @@ struct node { STAILQ_ENTRY(node) next; /**< Next node in the list. */ char name[RTE_NODE_NAMESIZE]; /**< Name of the node. */ uint64_t flags; /**< Node configuration flag. */ + unsigned int lcore_id; /**< Node runs on the Lcore ID */ rte_node_process_t process; /**< Node process function. */ rte_node_init_t init; /**< Node init function. */ rte_node_fini_t fini; /**< Node fini function. */ diff --git a/lib/graph/meson.build b/lib/graph/meson.build index c7327549e8..8c8b11ed27 100644 --- a/lib/graph/meson.build +++ b/lib/graph/meson.build @@ -14,6 +14,7 @@ sources = files( 'graph_debug.c', 'graph_stats.c', 'graph_populate.c', + 'rte_graph_model_generic.c', ) headers = files('rte_graph.h', 'rte_graph_worker.h') diff --git a/lib/graph/node.c b/lib/graph/node.c index fc6345de07..8ad4b3cbeb 100644 --- a/lib/graph/node.c +++ b/lib/graph/node.c @@ -100,6 +100,7 @@ __rte_node_register(const struct rte_node_register *reg) goto free; } + node->lcore_id = RTE_MAX_LCORE; node->id = node_id++; /* Add the node at tail */ diff --git a/lib/graph/rte_graph_model_generic.c b/lib/graph/rte_graph_model_generic.c new file mode 100644 index 0000000000..54ff659c7b --- /dev/null +++ b/lib/graph/rte_graph_model_generic.c @@ -0,0 +1,31 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(C) 2022 Intel Corporation + */ + +#include "graph_private.h" +#include "rte_graph_model_generic.h" + +int +rte_node_model_generic_set_lcore_affinity(const char *name, unsigned int lcore_id) +{ + struct node *node; + int ret = -EINVAL; + + if (lcore_id >= RTE_MAX_LCORE) + return ret; + + graph_spinlock_lock(); + + STAILQ_FOREACH(node, node_list_head_get(), next) { + if (strncmp(node->name, name, RTE_NODE_NAMESIZE) == 0) { + node->lcore_id = lcore_id; + ret = 0; + break; + } + } + + graph_spinlock_unlock(); + + return ret; +} + diff --git a/lib/graph/rte_graph_model_generic.h b/lib/graph/rte_graph_model_generic.h new file mode 100644 index 0000000000..20ca48a9e3 --- /dev/null +++ b/lib/graph/rte_graph_model_generic.h @@ -0,0 +1,43 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(C) 2022 Intel Corporation + */ + +#ifndef _RTE_GRAPH_MODEL_GENERIC_H_ +#define _RTE_GRAPH_MODEL_GENERIC_H_ + +/** + * @file rte_graph_model_generic.h + * + * @warning + * @b EXPERIMENTAL: + * All functions in this file may be changed or removed without prior notice. + * + * This API allows a worker thread to walk over a graph and nodes to create, + * process, enqueue and move streams of objects to the next nodes. + */ +#include "rte_graph_worker_common.h" + +#ifdef __cplusplus +extern "C" { +#endif + +/** + * Set lcore affinity to the node. + * + * @param name + * Valid node name. In the case of the cloned node, the name will be + * "parent node name" + "-" + name. + * @param lcore_id + * The lcore ID value. + * + * @return + * 0 on success, error otherwise. 
+ */
+__rte_experimental
+int rte_node_model_generic_set_lcore_affinity(const char *name, unsigned int lcore_id);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_GRAPH_MODEL_GENERIC_H_ */
diff --git a/lib/graph/version.map b/lib/graph/version.map
index eea73ec9ca..33ff055be6 100644
--- a/lib/graph/version.map
+++ b/lib/graph/version.map
@@ -46,5 +46,7 @@ EXPERIMENTAL {
 	rte_graph_worker_model_set;
 	rte_graph_worker_model_get;
 
+	rte_node_model_generic_set_lcore_affinity;
+
 	local: *;
 };

From patchwork Thu Nov 17 05:09:19 2022
X-Patchwork-Submitter: "Yan, Zhirun"
X-Patchwork-Id: 119915
X-Patchwork-Delegate: thomas@monjalon.net
From: Zhirun Yan
To: dev@dpdk.org, jerinj@marvell.com, kirankumark@marvell.com, ndabilpuram@marvell.com
Cc: cunming.liang@intel.com, haiyue.wang@intel.com, Zhirun Yan
Subject: [PATCH v1 06/13] graph: introduce graph affinity API
Date: Thu, 17 Nov 2022 13:09:19 +0800
Message-Id: <20221117050926.136974-7-zhirun.yan@intel.com>
In-Reply-To: <20221117050926.136974-1-zhirun.yan@intel.com>
References: <20221117050926.136974-1-zhirun.yan@intel.com>

Add lcore_id to struct graph to hold the affinity core ID the graph runs on,
and add bind/unbind APIs to set/unset the graph affinity attribute. lcore_id
defaults to RTE_MAX_LCORE, which means the attribute is disabled.
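[Editor's note: a sketch (not part of this patch) of the bind/unbind flow
added below; the graph name "app_graph_0" and lcore 3 are illustrative
assumptions, and rte_graph_from_name() is the existing name-to-id lookup.]

	rte_graph_t gid = rte_graph_from_name("app_graph_0");

	/* Bind the graph to lcore 3; per the diff this also updates the
	 * graph's socket to that lcore's socket. */
	if (rte_graph_bind_core(gid, 3) != 0)
		printf("bind failed: lcore not enabled\n");

	/* Drop the affinity again (lcore_id back to RTE_MAX_LCORE). */
	rte_graph_unbind_core(gid);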
Signed-off-by: Zhirun Yan --- lib/graph/graph.c | 59 +++++++++++++++++++++++++++++++++++++++ lib/graph/graph_private.h | 2 ++ lib/graph/rte_graph.h | 22 +++++++++++++++ lib/graph/version.map | 2 ++ 4 files changed, 85 insertions(+) diff --git a/lib/graph/graph.c b/lib/graph/graph.c index 3a617cc369..a8d8eb633e 100644 --- a/lib/graph/graph.c +++ b/lib/graph/graph.c @@ -245,6 +245,64 @@ graph_mem_fixup_secondary(struct rte_graph *graph) return graph_mem_fixup_node_ctx(graph); } +static __rte_always_inline bool +graph_src_node_avail(struct graph *graph) +{ + struct graph_node *graph_node; + + STAILQ_FOREACH(graph_node, &graph->node_list, next) + if ((graph_node->node->flags & RTE_NODE_SOURCE_F) && + (graph_node->node->lcore_id == RTE_MAX_LCORE || + graph->lcore_id == graph_node->node->lcore_id)) + return true; + + return false; +} + +int +rte_graph_bind_core(rte_graph_t id, int lcore) +{ + struct graph *graph; + + GRAPH_ID_CHECK(id); + if (!rte_lcore_is_enabled(lcore)) + SET_ERR_JMP(ENOLINK, fail, + "lcore %d not enabled\n", + lcore); + + STAILQ_FOREACH(graph, &graph_list, next) + if (graph->id == id) + break; + + graph->lcore_id = lcore; + graph->socket = rte_lcore_to_socket_id(lcore); + + /* check the availability of source node */ + if (!graph_src_node_avail(graph)) + graph->graph->head = 0; + + return 0; + +fail: + return -rte_errno; +} + +void +rte_graph_unbind_core(rte_graph_t id) +{ + struct graph *graph; + + GRAPH_ID_CHECK(id); + STAILQ_FOREACH(graph, &graph_list, next) + if (graph->id == id) + break; + + graph->lcore_id = RTE_MAX_LCORE; + +fail: + return; +} + struct rte_graph * rte_graph_lookup(const char *name) { @@ -328,6 +386,7 @@ rte_graph_create(const char *name, struct rte_graph_param *prm) graph->src_node_count = src_node_count; graph->node_count = graph_nodes_count(graph); graph->id = graph_id; + graph->lcore_id = RTE_MAX_LCORE; /* Allocate the Graph fast path memory and populate the data */ if (graph_fp_mem_create(graph)) diff --git a/lib/graph/graph_private.h b/lib/graph/graph_private.h index 627090f802..7326975a86 100644 --- a/lib/graph/graph_private.h +++ b/lib/graph/graph_private.h @@ -97,6 +97,8 @@ struct graph { /**< Circular buffer mask for wrap around. */ rte_graph_t id; /**< Graph identifier. */ + unsigned int lcore_id; + /**< Lcore identifier where the graph prefer to run on. */ size_t mem_sz; /**< Memory size of the graph. */ int socket; diff --git a/lib/graph/rte_graph.h b/lib/graph/rte_graph.h index b32c4bc217..1d938f6979 100644 --- a/lib/graph/rte_graph.h +++ b/lib/graph/rte_graph.h @@ -280,6 +280,28 @@ char *rte_graph_id_to_name(rte_graph_t id); __rte_experimental int rte_graph_export(const char *name, FILE *f); +/** + * Set graph lcore affinity attribute + * + * @param id + * Graph id to get the pointer of graph object + * @param lcore + * The lcore where the graph will run on + * @return + * 0 on success, error otherwise. + */ +__rte_experimental +int rte_graph_bind_core(rte_graph_t id, int lcore); + +/** + * Unset the graph lcore affinity attribute + * + * @param id + * Graph id to get the pointer of graph object + */ +__rte_experimental +void rte_graph_unbind_core(rte_graph_t id); + /** * Get graph object from its name. 
* diff --git a/lib/graph/version.map b/lib/graph/version.map index 33ff055be6..1c599b5b47 100644 --- a/lib/graph/version.map +++ b/lib/graph/version.map @@ -18,6 +18,8 @@ EXPERIMENTAL { rte_graph_node_get_by_name; rte_graph_obj_dump; rte_graph_walk; + rte_graph_bind_core; + rte_graph_unbind_core; rte_graph_cluster_stats_create; rte_graph_cluster_stats_destroy;
From patchwork Thu Nov 17 05:09:20 2022 From: Zhirun Yan To: dev@dpdk.org, jerinj@marvell.com, kirankumark@marvell.com, ndabilpuram@marvell.com Cc: cunming.liang@intel.com, haiyue.wang@intel.com, Zhirun Yan Subject: [PATCH v1 07/13] graph: introduce graph clone API for other worker core Date: Thu, 17 Nov 2022 13:09:20 +0800 Message-Id: <20221117050926.136974-8-zhirun.yan@intel.com>
This patch adds a graph API to support cloning the graph object for a specified worker core. The new graph also clones all nodes of the parent.
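Before moving on, a hedged sketch of the bind/unbind API completed in the previous patch; the graph name "worker_8" and the lcore number are illustrative only:

#include <stdio.h>
#include <rte_graph.h>

/* Bind an existing graph to an enabled worker lcore, then release the
 * affinity; unbinding resets the graph's lcore_id back to RTE_MAX_LCORE. */
static void
bind_graph_example(void)
{
	rte_graph_t gid = rte_graph_from_name("worker_8");

	if (gid == RTE_GRAPH_ID_INVALID)
		return;

	if (rte_graph_bind_core(gid, 10) == 0)
		printf("graph bound to lcore 10\n");

	/* ... walk the graph on lcore 10 ... */

	rte_graph_unbind_core(gid);
}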
Signed-off-by: Haiyue Wang Signed-off-by: Cunming Liang Signed-off-by: Zhirun Yan --- lib/graph/graph.c | 110 ++++++++++++++++++++++++++++++++++++++ lib/graph/graph_private.h | 2 + lib/graph/rte_graph.h | 20 +++++++ lib/graph/version.map | 1 + 4 files changed, 133 insertions(+) diff --git a/lib/graph/graph.c b/lib/graph/graph.c index a8d8eb633e..17a9c87032 100644 --- a/lib/graph/graph.c +++ b/lib/graph/graph.c @@ -386,6 +386,7 @@ rte_graph_create(const char *name, struct rte_graph_param *prm) graph->src_node_count = src_node_count; graph->node_count = graph_nodes_count(graph); graph->id = graph_id; + graph->parent_id = RTE_GRAPH_ID_INVALID; graph->lcore_id = RTE_MAX_LCORE; /* Allocate the Graph fast path memory and populate the data */ @@ -447,6 +448,115 @@ rte_graph_destroy(rte_graph_t id) return rc; } +static int +clone_name(struct graph *graph, struct graph *parent_graph, const char *name) +{ + ssize_t sz, rc; + +#define SZ RTE_GRAPH_NAMESIZE + rc = rte_strscpy(graph->name, parent_graph->name, SZ); + if (rc < 0) + goto fail; + sz = rc; + rc = rte_strscpy(graph->name + sz, "-", RTE_MAX((int16_t)(SZ - sz), 0)); + if (rc < 0) + goto fail; + sz += rc; + sz = rte_strscpy(graph->name + sz, name, RTE_MAX((int16_t)(SZ - sz), 0)); + if (sz < 0) + goto fail; + + return 0; +fail: + rte_errno = E2BIG; + return -rte_errno; +} + +static rte_graph_t +graph_clone(struct graph *parent_graph, const char *name) +{ + struct graph_node *graph_node; + struct graph *graph; + + graph_spinlock_lock(); + + /* Don't allow to clone a node from a cloned graph */ + if (parent_graph->parent_id != RTE_GRAPH_ID_INVALID) + SET_ERR_JMP(EEXIST, fail, "A cloned graph is not allowed to be cloned"); + + /* Create graph object */ + graph = calloc(1, sizeof(*graph)); + if (graph == NULL) + SET_ERR_JMP(ENOMEM, fail, "Failed to calloc cloned graph object"); + + /* Naming ceremony of the new graph. 
name is node->name + "-" + name */ + if (clone_name(graph, parent_graph, name)) + goto free; + + /* Check for existence of duplicate graph */ + if (rte_graph_from_name(graph->name) != RTE_GRAPH_ID_INVALID) + SET_ERR_JMP(EEXIST, free, "Found duplicate graph %s", + graph->name); + + /* Clone nodes from parent graph firstly */ + STAILQ_INIT(&graph->node_list); + STAILQ_FOREACH(graph_node, &parent_graph->node_list, next) { + if (graph_node_add(graph, graph_node->node)) + goto graph_cleanup; + } + + /* Just update adjacency list of all nodes in the graph */ + if (graph_adjacency_list_update(graph)) + goto graph_cleanup; + + /* Initialize the graph object */ + graph->src_node_count = parent_graph->src_node_count; + graph->node_count = parent_graph->node_count; + graph->parent_id = parent_graph->id; + graph->lcore_id = parent_graph->lcore_id; + graph->socket = parent_graph->socket; + graph->id = graph_id; + + /* Allocate the Graph fast path memory and populate the data */ + if (graph_fp_mem_create(graph)) + goto graph_cleanup; + + /* Call init() of the all the nodes in the graph */ + if (graph_node_init(graph)) + goto graph_mem_destroy; + + /* All good, Lets add the graph to the list */ + graph_id++; + STAILQ_INSERT_TAIL(&graph_list, graph, next); + + graph_spinlock_unlock(); + return graph->id; + +graph_mem_destroy: + graph_fp_mem_destroy(graph); +graph_cleanup: + graph_cleanup(graph); +free: + free(graph); +fail: + graph_spinlock_unlock(); + return RTE_GRAPH_ID_INVALID; +} + +rte_graph_t +rte_graph_clone(rte_graph_t id, const char *name) +{ + struct graph *graph; + + GRAPH_ID_CHECK(id); + STAILQ_FOREACH(graph, &graph_list, next) + if (graph->id == id) + return graph_clone(graph, name); + +fail: + return RTE_GRAPH_ID_INVALID; +} + rte_graph_t rte_graph_from_name(const char *name) { diff --git a/lib/graph/graph_private.h b/lib/graph/graph_private.h index 7326975a86..c1f2aadd42 100644 --- a/lib/graph/graph_private.h +++ b/lib/graph/graph_private.h @@ -97,6 +97,8 @@ struct graph { /**< Circular buffer mask for wrap around. */ rte_graph_t id; /**< Graph identifier. */ + rte_graph_t parent_id; + /**< Parent graph identifier. */ unsigned int lcore_id; /**< Lcore identifier where the graph prefer to run on. */ size_t mem_sz; diff --git a/lib/graph/rte_graph.h b/lib/graph/rte_graph.h index 1d938f6979..210e125661 100644 --- a/lib/graph/rte_graph.h +++ b/lib/graph/rte_graph.h @@ -242,6 +242,26 @@ rte_graph_t rte_graph_create(const char *name, struct rte_graph_param *prm); __rte_experimental int rte_graph_destroy(rte_graph_t id); +/** + * Clone Graph. + * + * Clone a graph from static graph (graph created from rte_graph_create). And + * all cloned graphs attached to the parent graph MUST be destroyed together + * for fast schedule design limitation (stop ALL graph walk firstly). + * + * @param id + * Static graph id to clone from. + * @param name + * Name of the new graph. The library prepends the parent graph name to the + * user-specified name. The final graph name will be, + * "parent graph name" + "-" + name. + * + * @return + * Valid graph id on success, RTE_GRAPH_ID_INVALID otherwise. + */ +__rte_experimental +rte_graph_t rte_graph_clone(rte_graph_t id, const char *name); + /** * Get graph id from graph name. 
* diff --git a/lib/graph/version.map b/lib/graph/version.map index 1c599b5b47..c4d8c2c271 100644 --- a/lib/graph/version.map +++ b/lib/graph/version.map @@ -7,6 +7,7 @@ EXPERIMENTAL { rte_graph_create; rte_graph_destroy; + rte_graph_clone; rte_graph_dump; rte_graph_export; rte_graph_from_name;
From patchwork Thu Nov 17 05:09:21 2022 From: Zhirun Yan To: dev@dpdk.org, jerinj@marvell.com, kirankumark@marvell.com, ndabilpuram@marvell.com Cc: cunming.liang@intel.com, haiyue.wang@intel.com, Zhirun Yan Subject: [PATCH v1 08/13] graph: introduce stream moving cross cores Date: Thu, 17 Nov 2022 13:09:21 +0800 Message-Id: <20221117050926.136974-9-zhirun.yan@intel.com>
This patch introduces the key functions that allow a worker thread to enqueue and move streams of objects to the next nodes across different cores. 1. Add graph_sched_wq_node to hold a graph scheduling workqueue node stream. 2.
Add workqueue helper functions to create/destroy/enqueue/dequeue. Signed-off-by: Haiyue Wang Signed-off-by: Cunming Liang Signed-off-by: Zhirun Yan --- lib/graph/graph.c | 1 + lib/graph/graph_populate.c | 1 + lib/graph/graph_private.h | 39 ++++++++ lib/graph/meson.build | 2 +- lib/graph/rte_graph_model_generic.c | 145 ++++++++++++++++++++++++++++ lib/graph/rte_graph_model_generic.h | 35 +++++++ lib/graph/rte_graph_worker_common.h | 18 ++++ 7 files changed, 240 insertions(+), 1 deletion(-) diff --git a/lib/graph/graph.c b/lib/graph/graph.c index 17a9c87032..8ea0daaa35 100644 --- a/lib/graph/graph.c +++ b/lib/graph/graph.c @@ -275,6 +275,7 @@ rte_graph_bind_core(rte_graph_t id, int lcore) break; graph->lcore_id = lcore; + graph->graph->lcore_id = graph->lcore_id; graph->socket = rte_lcore_to_socket_id(lcore); /* check the availability of source node */ diff --git a/lib/graph/graph_populate.c b/lib/graph/graph_populate.c index 102fd6c29b..26f9670406 100644 --- a/lib/graph/graph_populate.c +++ b/lib/graph/graph_populate.c @@ -84,6 +84,7 @@ graph_nodes_populate(struct graph *_graph) } node->id = graph_node->node->id; node->parent_id = pid; + node->lcore_id = graph_node->node->lcore_id; nb_edges = graph_node->node->nb_edges; node->nb_edges = nb_edges; off += sizeof(struct rte_node); diff --git a/lib/graph/graph_private.h b/lib/graph/graph_private.h index c1f2aadd42..f58d0d1d63 100644 --- a/lib/graph/graph_private.h +++ b/lib/graph/graph_private.h @@ -59,6 +59,18 @@ struct node { char next_nodes[][RTE_NODE_NAMESIZE]; /**< Names of next nodes. */ }; +/** + * @internal + * + * Structure that holds the graph scheduling workqueue node stream. + * Used for generic worker model. + */ +struct graph_sched_wq_node { + rte_graph_off_t node_off; + uint16_t nb_objs; + void *objs[RTE_GRAPH_BURST_SIZE]; +} __rte_cache_aligned; + /** * @internal * @@ -349,4 +361,31 @@ void graph_dump(FILE *f, struct graph *g); */ void node_dump(FILE *f, struct node *n); +/** + * @internal + * + * Create the graph schedule work queue. And all cloned graphs attached to the + * parent graph MUST be destroyed together for fast schedule design limitation. + * + * @param _graph + * The graph object + * @param _parent_graph + * The parent graph object which holds the run-queue head. + * + * @return + * - 0: Success. + * - <0: Graph schedule work queue related error. + */ +int graph_sched_wq_create(struct graph *_graph, struct graph *_parent_graph); + +/** + * @internal + * + * Destroy the graph schedule work queue.
+ * + * @param _graph + * The graph object + */ +void graph_sched_wq_destroy(struct graph *_graph); + #endif /* _RTE_GRAPH_PRIVATE_H_ */ diff --git a/lib/graph/meson.build b/lib/graph/meson.build index 8c8b11ed27..f93ab6fdcb 100644 --- a/lib/graph/meson.build +++ b/lib/graph/meson.build @@ -18,4 +18,4 @@ sources = files( ) headers = files('rte_graph.h', 'rte_graph_worker.h') -deps += ['eal'] +deps += ['eal', 'mempool', 'ring'] diff --git a/lib/graph/rte_graph_model_generic.c b/lib/graph/rte_graph_model_generic.c index 54ff659c7b..c862237432 100644 --- a/lib/graph/rte_graph_model_generic.c +++ b/lib/graph/rte_graph_model_generic.c @@ -5,6 +5,151 @@ #include "graph_private.h" #include "rte_graph_model_generic.h" +int +graph_sched_wq_create(struct graph *_graph, struct graph *_parent_graph) +{ + struct rte_graph *parent_graph = _parent_graph->graph; + struct rte_graph *graph = _graph->graph; + unsigned int wq_size; + + wq_size = GRAPH_SCHED_WQ_SIZE(graph->nb_nodes); + wq_size = rte_align32pow2(wq_size + 1); + + graph->wq = rte_ring_create(graph->name, wq_size, graph->socket, + RING_F_SC_DEQ); + if (graph->wq == NULL) + SET_ERR_JMP(EIO, fail, "Failed to allocate graph WQ"); + + graph->mp = rte_mempool_create(graph->name, wq_size, + sizeof(struct graph_sched_wq_node), + 0, 0, NULL, NULL, NULL, NULL, + graph->socket, MEMPOOL_F_SP_PUT); + if (graph->mp == NULL) + SET_ERR_JMP(EIO, fail_mp, + "Failed to allocate graph WQ schedule entry"); + + graph->lcore_id = _graph->lcore_id; + + if (parent_graph->rq == NULL) { + parent_graph->rq = &parent_graph->rq_head; + SLIST_INIT(parent_graph->rq); + } + + graph->rq = parent_graph->rq; + SLIST_INSERT_HEAD(graph->rq, graph, rq_next); + + return 0; + +fail_mp: + rte_ring_free(graph->wq); + graph->wq = NULL; +fail: + return -rte_errno; +} + +void +graph_sched_wq_destroy(struct graph *_graph) +{ + struct rte_graph *graph = _graph->graph; + + if (graph == NULL) + return; + + rte_ring_free(graph->wq); + graph->wq = NULL; + + rte_mempool_free(graph->mp); + graph->mp = NULL; +} + +static __rte_always_inline bool +__graph_sched_node_enqueue(struct rte_node *node, struct rte_graph *graph) +{ + struct graph_sched_wq_node *wq_node; + uint16_t off = 0; + uint16_t size; + +submit_again: + if (rte_mempool_get(graph->mp, (void **)&wq_node) < 0) + goto fallback; + + size = RTE_MIN(node->idx, RTE_DIM(wq_node->objs)); + wq_node->node_off = node->off; + wq_node->nb_objs = size; + rte_memcpy(wq_node->objs, &node->objs[off], size * sizeof(void *)); + + while (rte_ring_mp_enqueue_bulk_elem(graph->wq, (void *)&wq_node, + sizeof(wq_node), 1, NULL) == 0) + rte_pause(); + + off += size; + node->idx -= size; + if (node->idx > 0) + goto submit_again; + + return true; + +fallback: + if (off != 0) + memmove(&node->objs[0], &node->objs[off], + node->idx * sizeof(void *)); + + return false; +} + +bool __rte_noinline +__rte_graph_sched_node_enqueue(struct rte_node *node, + struct rte_graph_rq_head *rq) +{ + const unsigned int lcore_id = node->lcore_id; + struct rte_graph *graph; + + SLIST_FOREACH(graph, rq, rq_next) + if (graph->lcore_id == lcore_id) + break; + + return graph != NULL ? 
__graph_sched_node_enqueue(node, graph) : false; +} + +void __rte_noinline +__rte_graph_sched_wq_process(struct rte_graph *graph) +{ + struct graph_sched_wq_node *wq_node; + struct rte_mempool *mp = graph->mp; + struct rte_ring *wq = graph->wq; + uint16_t idx, free_space; + struct rte_node *node; + unsigned int i, n; + struct graph_sched_wq_node *wq_nodes[32]; + + n = rte_ring_sc_dequeue_burst_elem(wq, wq_nodes, sizeof(wq_nodes[0]), + RTE_DIM(wq_nodes), NULL); + if (n == 0) + return; + + for (i = 0; i < n; i++) { + wq_node = wq_nodes[i]; + node = RTE_PTR_ADD(graph, wq_node->node_off); + RTE_ASSERT(node->fence == RTE_GRAPH_FENCE); + idx = node->idx; + free_space = node->size - idx; + + if (unlikely(free_space < wq_node->nb_objs)) + __rte_node_stream_alloc_size(graph, node, node->size + wq_node->nb_objs); + + memmove(&node->objs[idx], wq_node->objs, wq_node->nb_objs * sizeof(void *)); + memset(wq_node->objs, 0, wq_node->nb_objs * sizeof(void *)); + node->idx = idx + wq_node->nb_objs; + + __rte_node_process(graph, node); + + wq_node->nb_objs = 0; + node->idx = 0; + } + + rte_mempool_put_bulk(mp, (void **)wq_nodes, n); +} + int rte_node_model_generic_set_lcore_affinity(const char *name, unsigned int lcore_id) { diff --git a/lib/graph/rte_graph_model_generic.h b/lib/graph/rte_graph_model_generic.h index 20ca48a9e3..5715fc8ffb 100644 --- a/lib/graph/rte_graph_model_generic.h +++ b/lib/graph/rte_graph_model_generic.h @@ -15,12 +15,47 @@ * This API allows a worker thread to walk over a graph and nodes to create, * process, enqueue and move streams of objects to the next nodes. */ +#include +#include +#include +#include + #include "rte_graph_worker_common.h" #ifdef __cplusplus extern "C" { #endif +#define GRAPH_SCHED_WQ_SIZE_MULTIPLIER 8 +#define GRAPH_SCHED_WQ_SIZE(nb_nodes) \ + ((typeof(nb_nodes))((nb_nodes) * GRAPH_SCHED_WQ_SIZE_MULTIPLIER)) + +/** + * @internal + * + * Schedule the node to the right graph's work queue. + * + * @param node + * Pointer to the scheduled node object. + * @param rq + * Pointer to the scheduled run-queue for all graphs. + * + * @return + * True on success, false otherwise. + */ +bool __rte_graph_sched_node_enqueue(struct rte_node *node, + struct rte_graph_rq_head *rq); + +/** + * @internal + * + * Process all nodes (streams) in the graph's work queue. + * + * @param graph + * Pointer to the graph object. + */ +void __rte_noinline __rte_graph_sched_wq_process(struct rte_graph *graph); + /** * Set lcore affinity to the node. * diff --git a/lib/graph/rte_graph_worker_common.h b/lib/graph/rte_graph_worker_common.h index 507a344afd..cf38a03f44 100644 --- a/lib/graph/rte_graph_worker_common.h +++ b/lib/graph/rte_graph_worker_common.h @@ -28,6 +28,13 @@ extern "C" { #endif +/** + * @internal + * + * Singly-linked list head for graph schedule run-queue. + */ +SLIST_HEAD(rte_graph_rq_head, rte_graph); + /** * @internal * @@ -39,6 +46,15 @@ struct rte_graph { uint32_t cir_mask; /**< Circular buffer wrap around mask. */ rte_node_t nb_nodes; /**< Number of nodes in the graph. */ rte_graph_off_t *cir_start; /**< Pointer to circular buffer. */ + /* Graph schedule */ + struct rte_graph_rq_head *rq __rte_cache_aligned; /* The run-queue */ + struct rte_graph_rq_head rq_head; /* The head for run-queue list */ + + SLIST_ENTRY(rte_graph) rq_next; /* The next for run-queue list */ + unsigned int lcore_id; /**< The graph running Lcore. */ + struct rte_ring *wq; /**< The work-queue for pending streams. */ + struct rte_mempool *mp; /**< The mempool for scheduling streams. 
*/ + /* Graph schedule area */ rte_graph_off_t nodes_start; /**< Offset at which node memory starts. */ rte_graph_t id; /**< Graph identifier. */ int socket; /**< Socket ID where memory is allocated. */ @@ -63,6 +79,8 @@ struct rte_node { char parent[RTE_NODE_NAMESIZE]; /**< Parent node name. */ char name[RTE_NODE_NAMESIZE]; /**< Name of the node. */ + /* Fast schedule area */ + unsigned int lcore_id __rte_cache_aligned; /**< Node running Lcore. */ /* Fast path area */ #define RTE_NODE_CTX_SZ 16 uint8_t ctx[RTE_NODE_CTX_SZ] __rte_cache_aligned; /**< Node Context. */
From patchwork Thu Nov 17 05:09:22 2022 From: Zhirun Yan To: dev@dpdk.org, jerinj@marvell.com, kirankumark@marvell.com, ndabilpuram@marvell.com Cc: cunming.liang@intel.com, haiyue.wang@intel.com, Zhirun Yan Subject: [PATCH v1 09/13] graph: enable create and destroy graph scheduling workqueue Date: Thu, 17 Nov 2022 13:09:22 +0800 Message-Id: <20221117050926.136974-10-zhirun.yan@intel.com>
This patch wires creating and destroying the scheduling workqueue into the common graph operations.
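The pieces so far compose as follows. A hedged sketch (not part of the patch) of cloning one graph per worker lcore and binding each clone; with the generic model selected, the clone step also sets up the per-graph work queue, as the hunk below shows. Names and the lcore loop are illustrative:

#include <stdio.h>
#include <stdlib.h>
#include <rte_graph.h>
#include <rte_lcore.h>
#include <rte_debug.h>

/* Clone the main graph once per worker lcore and bind each clone.
 * Per the clone API note earlier, all clones attached to a parent
 * must later be destroyed together. */
static void
setup_worker_graphs(rte_graph_t main_id)
{
	char name[RTE_GRAPH_NAMESIZE];
	unsigned int lcore;

	RTE_LCORE_FOREACH_WORKER(lcore) {
		rte_graph_t gid;

		snprintf(name, sizeof(name), "wkr-%u", lcore);
		gid = rte_graph_clone(main_id, name);
		if (gid == RTE_GRAPH_ID_INVALID)
			rte_exit(EXIT_FAILURE, "graph clone failed\n");
		if (rte_graph_bind_core(gid, lcore) < 0)
			rte_exit(EXIT_FAILURE, "graph bind failed\n");
	}
}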
Signed-off-by: Haiyue Wang Signed-off-by: Cunming Liang Signed-off-by: Zhirun Yan --- lib/graph/graph.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/lib/graph/graph.c b/lib/graph/graph.c index 8ea0daaa35..63d9bcffd2 100644 --- a/lib/graph/graph.c +++ b/lib/graph/graph.c @@ -428,6 +428,10 @@ rte_graph_destroy(rte_graph_t id) while (graph != NULL) { tmp = STAILQ_NEXT(graph, next); if (graph->id == id) { + /* Destroy the schedule work queue if has */ + if (rte_graph_worker_model_get() == RTE_GRAPH_MODEL_GENERIC) + graph_sched_wq_destroy(graph); + /* Call fini() of the all the nodes in the graph */ graph_node_fini(graph); /* Destroy graph fast path memory */ @@ -522,6 +526,11 @@ graph_clone(struct graph *parent_graph, const char *name) if (graph_fp_mem_create(graph)) goto graph_cleanup; + /* Create the graph schedule work queue */ + if (rte_graph_worker_model_get() == RTE_GRAPH_MODEL_GENERIC && + graph_sched_wq_create(graph, parent_graph)) + goto graph_mem_destroy; + /* Call init() of the all the nodes in the graph */ if (graph_node_init(graph)) goto graph_mem_destroy;
From patchwork Thu Nov 17 05:09:23 2022 From: Zhirun Yan To: dev@dpdk.org, jerinj@marvell.com, kirankumark@marvell.com, ndabilpuram@marvell.com Cc: cunming.liang@intel.com, haiyue.wang@intel.com, Zhirun Yan Subject: [PATCH v1 10/13] graph: introduce graph walk by cross-core dispatch Date: Thu, 17 Nov 2022 13:09:23 +0800 Message-Id: <20221117050926.136974-11-zhirun.yan@intel.com>
This patch introduces the task scheduler mechanism to enable dispatching tasks to other worker cores. Currently there is only a local work queue for one graph to walk. We introduce a scheduler work queue in each worker core for dispatching tasks: the walk processes the scheduler work queue first, then handles the local work queue. Signed-off-by: Haiyue Wang Signed-off-by: Cunming Liang Signed-off-by: Zhirun Yan --- lib/graph/rte_graph_model_generic.h | 36 +++++++++++++++++++++++++++++ 1 file changed, 36 insertions(+) diff --git a/lib/graph/rte_graph_model_generic.h b/lib/graph/rte_graph_model_generic.h index 5715fc8ffb..c29fc31309 100644 --- a/lib/graph/rte_graph_model_generic.h +++ b/lib/graph/rte_graph_model_generic.h @@ -71,6 +71,42 @@ void __rte_noinline __rte_graph_sched_wq_process(struct rte_graph *graph); __rte_experimental int rte_node_model_generic_set_lcore_affinity(const char *name, unsigned int lcore_id); +/** + * Perform graph walk on the circular buffer and invoke the process function + * of the nodes and collect the stats. + * + * @param graph + * Graph pointer returned from rte_graph_lookup function. + * + * @see rte_graph_lookup() + */ +__rte_experimental +static inline void +rte_graph_walk_generic(struct rte_graph *graph) +{ + uint32_t head = graph->head; + struct rte_node *node; + + if (graph->wq != NULL) + __rte_graph_sched_wq_process(graph); + + rte_graph_walk_node(graph, head, node) { + /* skip the src nodes which not bind with current worker */ + if ((int32_t)head < 0 && node->lcore_id != graph->lcore_id) + continue; + + /* Schedule the node until all task/objs are done */ + if (node->lcore_id != RTE_MAX_LCORE && + graph->lcore_id != node->lcore_id && graph->rq != NULL && + __rte_graph_sched_node_enqueue(node, graph->rq)) + continue; + + __rte_node_process(graph, node); + } + + graph->tail = 0; +}
 #ifdef __cplusplus } #endif
From patchwork Thu Nov 17 05:09:24 2022
From: Zhirun Yan To: dev@dpdk.org, jerinj@marvell.com, kirankumark@marvell.com, ndabilpuram@marvell.com Cc: cunming.liang@intel.com, haiyue.wang@intel.com, Zhirun Yan Subject: [PATCH v1 11/13] graph: enable graph generic scheduler model Date: Thu, 17 Nov 2022 13:09:24 +0800 Message-Id: <20221117050926.136974-12-zhirun.yan@intel.com>
This patch enables choosing the new scheduler model. Signed-off-by: Haiyue Wang Signed-off-by: Cunming Liang Signed-off-by: Zhirun Yan --- lib/graph/rte_graph_worker.h | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/lib/graph/rte_graph_worker.h b/lib/graph/rte_graph_worker.h index a0ea0df153..dea207ca46 100644 --- a/lib/graph/rte_graph_worker.h +++ b/lib/graph/rte_graph_worker.h @@ -1,4 +1,5 @@ #include "rte_graph_model_rtc.h" +#include "rte_graph_model_generic.h" static enum rte_graph_worker_model worker_model = RTE_GRAPH_MODEL_DEFAULT; @@ -64,5 +65,11 @@ __rte_experimental static inline void rte_graph_walk(struct rte_graph *graph) { - rte_graph_walk_rtc(graph); + int model = rte_graph_worker_model_get(); + + if (model == RTE_GRAPH_MODEL_DEFAULT || + model == RTE_GRAPH_MODEL_RTC) + rte_graph_walk_rtc(graph); + else if (model == RTE_GRAPH_MODEL_GENERIC) + rte_graph_walk_generic(graph); }
From patchwork Thu Nov 17 05:09:25 2022
From: Zhirun Yan To: dev@dpdk.org, jerinj@marvell.com, kirankumark@marvell.com, ndabilpuram@marvell.com Cc: cunming.liang@intel.com, haiyue.wang@intel.com, Zhirun Yan Subject: [PATCH v1 12/13] graph: add stats for cross-core dispatching Date: Thu, 17 Nov 2022 13:09:25 +0800 Message-Id: <20221117050926.136974-13-zhirun.yan@intel.com>
Add stats for the cross-core dispatching scheduler if stats collection is enabled. Signed-off-by: Haiyue Wang Signed-off-by: Cunming Liang Signed-off-by: Zhirun Yan --- lib/graph/graph_debug.c | 6 +++ lib/graph/graph_stats.c | 74 +++++++++++++++++++++++++---- lib/graph/rte_graph.h | 2 + lib/graph/rte_graph_model_generic.c | 3 ++ lib/graph/rte_graph_worker_common.h | 2 + 5 files changed, 79 insertions(+), 8 deletions(-) diff --git a/lib/graph/graph_debug.c b/lib/graph/graph_debug.c index b84412f5dd..080ba16ad9 100644 --- a/lib/graph/graph_debug.c +++ b/lib/graph/graph_debug.c @@ -74,6 +74,12 @@ rte_graph_obj_dump(FILE *f, struct rte_graph *g, bool all) fprintf(f, " size=%d\n", n->size); fprintf(f, " idx=%d\n", n->idx); fprintf(f, " total_objs=%" PRId64 "\n", n->total_objs); + if (rte_graph_worker_model_get() == RTE_GRAPH_MODEL_GENERIC) { + fprintf(f, " total_sched_objs=%" PRId64 "\n", + n->total_sched_objs); + fprintf(f, " total_sched_fail=%" PRId64 "\n", + n->total_sched_fail); + } fprintf(f, " total_calls=%" PRId64 "\n", n->total_calls); for (i = 0; i < n->nb_edges; i++) fprintf(f, " edge[%d] <%s>\n", i, diff --git a/lib/graph/graph_stats.c b/lib/graph/graph_stats.c index c0140ba922..801fcb832d 100644 --- a/lib/graph/graph_stats.c +++ b/lib/graph/graph_stats.c @@ -40,13 +40,19 @@ struct rte_graph_cluster_stats { struct cluster_node clusters[]; } __rte_cache_aligned; +#define boarder_model_generic() \ + fprintf(f, "+-------------------------------+---------------+--------" \ + "-------+---------------+---------------+---------------+" \ + "---------------+---------------+-" \ + "----------+\n") + #define boarder() \ fprintf(f, "+-------------------------------+---------------+--------" \ "-------+---------------+---------------+---------------+-" \ "----------+\n") static inline void -print_banner(FILE *f) +print_banner_default(FILE *f) { boarder(); fprintf(f, "%-32s%-16s%-16s%-16s%-16s%-16s%-16s\n", "|Node", "|calls", @@ -55,6 +61,27 @@ print_banner(FILE *f) boarder(); } +static inline void +print_banner_generic(FILE *f) +{ + boarder_model_generic(); + fprintf(f,
"%-32s%-16s%-16s%-16s%-16s%-16s%-16s%-16s%-16s\n", + "|Node", "|calls", + "|objs", "|sched objs", "|sched fail", + "|realloc_count", "|objs/call", "|objs/sec(10E6)", + "|cycles/call|"); + boarder_model_generic(); +} + +static inline void +print_banner(FILE *f) +{ + if (rte_graph_worker_model_get() == RTE_GRAPH_MODEL_GENERIC) + print_banner_generic(f); + else + print_banner_default(f); +} + static inline void print_node(FILE *f, const struct rte_graph_cluster_node_stats *stat) { @@ -76,11 +103,21 @@ print_node(FILE *f, const struct rte_graph_cluster_node_stats *stat) objs_per_sec = ts_per_hz ? (objs - prev_objs) / ts_per_hz : 0; objs_per_sec /= 1000000; - fprintf(f, - "|%-31s|%-15" PRIu64 "|%-15" PRIu64 "|%-15" PRIu64 - "|%-15.3f|%-15.6f|%-11.4f|\n", - stat->name, calls, objs, stat->realloc_count, objs_per_call, - objs_per_sec, cycles_per_call); + if (rte_graph_worker_model_get() == RTE_GRAPH_MODEL_GENERIC) { + fprintf(f, + "|%-31s|%-15" PRIu64 "|%-15" PRIu64 "|%-15" PRIu64 + "|%-15" PRIu64 "|%-15" PRIu64 + "|%-15.3f|%-15.6f|%-11.4f|\n", + stat->name, calls, objs, stat->sched_objs, + stat->sched_fail, stat->realloc_count, objs_per_call, + objs_per_sec, cycles_per_call); + } else { + fprintf(f, + "|%-31s|%-15" PRIu64 "|%-15" PRIu64 "|%-15" PRIu64 + "|%-15.3f|%-15.6f|%-11.4f|\n", + stat->name, calls, objs, stat->realloc_count, objs_per_call, + objs_per_sec, cycles_per_call); + } } static int @@ -88,13 +125,20 @@ graph_cluster_stats_cb(bool is_first, bool is_last, void *cookie, const struct rte_graph_cluster_node_stats *stat) { FILE *f = cookie; + int model; + + model = rte_graph_worker_model_get(); if (unlikely(is_first)) print_banner(f); if (stat->objs) print_node(f, stat); - if (unlikely(is_last)) - boarder(); + if (unlikely(is_last)) { + if (model == RTE_GRAPH_MODEL_GENERIC) + boarder_model_generic(); + else + boarder(); + } return 0; }; @@ -332,13 +376,21 @@ static inline void cluster_node_arregate_stats(struct cluster_node *cluster) { uint64_t calls = 0, cycles = 0, objs = 0, realloc_count = 0; + uint64_t sched_objs = 0, sched_fail = 0; struct rte_graph_cluster_node_stats *stat = &cluster->stat; struct rte_node *node; rte_node_t count; + int model; + model = rte_graph_worker_model_get(); for (count = 0; count < cluster->nb_nodes; count++) { node = cluster->nodes[count]; + if (model == RTE_GRAPH_MODEL_GENERIC) { + sched_objs += node->total_sched_objs; + sched_fail += node->total_sched_fail; + } + calls += node->total_calls; objs += node->total_objs; cycles += node->total_cycles; @@ -348,6 +400,12 @@ cluster_node_arregate_stats(struct cluster_node *cluster) stat->calls = calls; stat->objs = objs; stat->cycles = cycles; + + if (model == RTE_GRAPH_MODEL_GENERIC) { + stat->sched_objs = sched_objs; + stat->sched_fail = sched_fail; + } + stat->ts = rte_get_timer_cycles(); stat->realloc_count = realloc_count; } diff --git a/lib/graph/rte_graph.h b/lib/graph/rte_graph.h index 210e125661..2d22ee0255 100644 --- a/lib/graph/rte_graph.h +++ b/lib/graph/rte_graph.h @@ -203,6 +203,8 @@ struct rte_graph_cluster_node_stats { uint64_t prev_calls; /**< Previous number of calls. */ uint64_t prev_objs; /**< Previous number of processed objs. */ uint64_t prev_cycles; /**< Previous number of cycles. */ + uint64_t sched_objs; /**< Previous number of scheduled objs. */ + uint64_t sched_fail; /**< Previous number of failed schedule objs. */ uint64_t realloc_count; /**< Realloc count. 
*/ diff --git a/lib/graph/rte_graph_model_generic.c b/lib/graph/rte_graph_model_generic.c index c862237432..5504a65a39 100644 --- a/lib/graph/rte_graph_model_generic.c +++ b/lib/graph/rte_graph_model_generic.c @@ -83,6 +83,7 @@ __graph_sched_node_enqueue(struct rte_node *node, struct rte_graph *graph) rte_pause(); off += size; + node->total_sched_objs += size; node->idx -= size; if (node->idx > 0) goto submit_again; @@ -94,6 +95,8 @@ __graph_sched_node_enqueue(struct rte_node *node, struct rte_graph *graph) memmove(&node->objs[0], &node->objs[off], node->idx * sizeof(void *)); + node->total_sched_fail += node->idx; + return false; } diff --git a/lib/graph/rte_graph_worker_common.h b/lib/graph/rte_graph_worker_common.h index cf38a03f44..346f8337d4 100644 --- a/lib/graph/rte_graph_worker_common.h +++ b/lib/graph/rte_graph_worker_common.h @@ -81,6 +81,8 @@ struct rte_node { /* Fast schedule area */ unsigned int lcore_id __rte_cache_aligned; /**< Node running Lcore. */ + uint64_t total_sched_objs; /**< Number of objects scheduled. */ + uint64_t total_sched_fail; /**< Number of scheduled failure. */ /* Fast path area */ #define RTE_NODE_CTX_SZ 16 uint8_t ctx[RTE_NODE_CTX_SZ] __rte_cache_aligned; /**< Node Context. */
From patchwork Thu Nov 17 05:09:26 2022 From: Zhirun Yan To: dev@dpdk.org, jerinj@marvell.com, kirankumark@marvell.com, ndabilpuram@marvell.com Cc: cunming.liang@intel.com, haiyue.wang@intel.com, Zhirun Yan Subject: [PATCH v1 13/13] examples/l3fwd-graph: introduce generic worker model Date: Thu, 17 Nov 2022 13:09:26 +0800 Message-Id:
<20221117050926.136974-14-zhirun.yan@intel.com>
Add a new parameter "model" to choose the generic or rtc worker model. In the generic model, nodes are affinitized to the worker cores successively. Note: the current implementation supports only one RX node for the generic (remote) model. ./dpdk-l3fwd-graph -l 8,9,10,11 -n 4 -- -p 0x1 --config="(0,0,9)" -P --model="generic" Signed-off-by: Haiyue Wang Signed-off-by: Cunming Liang Signed-off-by: Zhirun Yan --- examples/l3fwd-graph/main.c | 218 +++++++++++++++++++++++++------- 1 file changed, 179 insertions(+), 39 deletions(-) diff --git a/examples/l3fwd-graph/main.c b/examples/l3fwd-graph/main.c index 6dcb6ee92b..c145a3e3e8 100644 --- a/examples/l3fwd-graph/main.c +++ b/examples/l3fwd-graph/main.c @@ -147,6 +147,19 @@ static struct ipv4_l3fwd_lpm_route ipv4_l3fwd_lpm_route_array[] = { {RTE_IPV4(198, 18, 6, 0), 24, 6}, {RTE_IPV4(198, 18, 7, 0), 24, 7}, }; +static int +check_worker_model_params(void) +{ + if (rte_graph_worker_model_get() == RTE_GRAPH_MODEL_GENERIC && + nb_lcore_params > 1) { + printf("Exceeded max number of lcore params for remote model: %hu\n", + nb_lcore_params); + return -1; + } + + return 0; +} + static int check_lcore_params(void) { @@ -291,6 +304,20 @@ parse_max_pkt_len(const char *pktlen) return len; } +static int +parse_worker_model(const char *model) +{ + if (strcmp(model, WORKER_MODEL_DEFAULT) == 0) + return RTE_GRAPH_MODEL_DEFAULT; + else if (strcmp(model, WORKER_MODEL_GENERIC) == 0) { + rte_graph_worker_model_set(RTE_GRAPH_MODEL_GENERIC); + return RTE_GRAPH_MODEL_GENERIC; + } + rte_exit(EXIT_FAILURE, "Invalid worker model: %s", model); + + return RTE_GRAPH_MODEL_MAX; +} + static int parse_portmask(const char *portmask) { @@ -404,6 +431,7 @@ static const char short_options[] = "p:" /* portmask */ #define CMD_LINE_OPT_NO_NUMA "no-numa" #define CMD_LINE_OPT_MAX_PKT_LEN "max-pkt-len" #define CMD_LINE_OPT_PER_PORT_POOL "per-port-pool" +#define CMD_LINE_OPT_WORKER_MODEL "model" enum { /* Long options mapped to a short option */ @@ -416,6 +444,7 @@ enum { CMD_LINE_OPT_NO_NUMA_NUM, CMD_LINE_OPT_MAX_PKT_LEN_NUM, CMD_LINE_OPT_PARSE_PER_PORT_POOL, + CMD_LINE_OPT_WORKER_MODEL_TYPE, }; static const struct option lgopts[] = { @@ -424,6 +453,7 @@ static const struct option lgopts[] = { {CMD_LINE_OPT_NO_NUMA, 0, 0, CMD_LINE_OPT_NO_NUMA_NUM}, {CMD_LINE_OPT_MAX_PKT_LEN, 1, 0, CMD_LINE_OPT_MAX_PKT_LEN_NUM}, {CMD_LINE_OPT_PER_PORT_POOL, 0, 0, CMD_LINE_OPT_PARSE_PER_PORT_POOL}, + {CMD_LINE_OPT_WORKER_MODEL, 1, 0, CMD_LINE_OPT_WORKER_MODEL_TYPE}, {NULL, 0, 0, 0}, }; @@ -498,6 +528,11 @@ parse_args(int argc, char **argv) per_port_pool = 1; break; + case CMD_LINE_OPT_WORKER_MODEL_TYPE: + printf("Use new worker model: %s\n", optarg); + parse_worker_model(optarg); + break; + default: print_usage(prgname); return -1; @@ -735,6 +770,140 @@ config_port_max_pkt_len(struct rte_eth_conf *conf, return 0; } +static void +graph_config_generic(struct rte_graph_param graph_conf) +{ + uint16_t nb_patterns = graph_conf.nb_node_patterns; + int worker_count = rte_lcore_count() - 1; + int main_lcore_id = rte_get_main_lcore(); + int worker_lcore = main_lcore_id; + rte_graph_t main_graph_id = 0; + struct
rte_node *node_tmp; + struct lcore_conf *qconf; + struct rte_graph *graph; + rte_graph_t graph_id; + rte_graph_off_t off; + int n_rx_node = 0; + rte_node_t count; + rte_edge_t i; + int ret; + + for (int j = 0; j < nb_lcore_params; j++) { + qconf = &lcore_conf[lcore_params[j].lcore_id]; + /* Add rx node patterns of all lcore */ + for (i = 0; i < qconf->n_rx_queue; i++) { + char *node_name = qconf->rx_queue_list[i].node_name; + + graph_conf.node_patterns[nb_patterns + n_rx_node + i] = node_name; + n_rx_node++; + ret = rte_node_model_generic_set_lcore_affinity(node_name, + lcore_params[j].lcore_id); + if (ret == 0) + printf("Set node %s affinity to lcore %u\n", node_name, + lcore_params[j].lcore_id); + } + } + + graph_conf.nb_node_patterns = nb_patterns + n_rx_node; + graph_conf.socket_id = rte_lcore_to_socket_id(main_lcore_id); + + snprintf(qconf->name, sizeof(qconf->name), "worker_%u", + main_lcore_id); + + /* create main graph */ + main_graph_id = rte_graph_create(qconf->name, &graph_conf); + if (main_graph_id == RTE_GRAPH_ID_INVALID) + rte_exit(EXIT_FAILURE, + "rte_graph_create(): main_graph_id invalid for lcore %u\n", + main_lcore_id); + + qconf->graph_id = main_graph_id; + qconf->graph = rte_graph_lookup(qconf->name); + /* >8 End of graph initialization. */ + if (!qconf->graph) + rte_exit(EXIT_FAILURE, + "rte_graph_lookup(): graph %s not found\n", + qconf->name); + + graph = qconf->graph; + rte_graph_foreach_node(count, off, graph, node_tmp) { + worker_lcore = rte_get_next_lcore(worker_lcore, true, 1); + + /* Need to set the node Lcore affinity before clone graph for each lcore */ + if (node_tmp->lcore_id == RTE_MAX_LCORE) { + ret = rte_node_model_generic_set_lcore_affinity(node_tmp->name, + worker_lcore); + if (ret == 0) + printf("Set node %s affinity to lcore %u\n", + node_tmp->name, worker_lcore); + } + } + + worker_lcore = main_lcore_id; + for (int i = 0; i < worker_count; i++) { + worker_lcore = rte_get_next_lcore(worker_lcore, true, 1); + + qconf = &lcore_conf[worker_lcore]; + snprintf(qconf->name, sizeof(qconf->name), "cloned-%u", worker_lcore); + graph_id = rte_graph_clone(main_graph_id, qconf->name); + ret = rte_graph_bind_core(graph_id, worker_lcore); + if (ret == 0) + printf("bind graph %d to lcore %u\n", graph_id, worker_lcore); + + /* full cloned graph name */ + snprintf(qconf->name, sizeof(qconf->name), "%s", + rte_graph_id_to_name(graph_id)); + qconf->graph_id = graph_id; + qconf->graph = rte_graph_lookup(qconf->name); + if (!qconf->graph) + rte_exit(EXIT_FAILURE, + "Failed to lookup graph %s\n", + qconf->name); + continue; + } +} + +static void +graph_config_rtc(struct rte_graph_param graph_conf) +{ + uint16_t nb_patterns = graph_conf.nb_node_patterns; + struct lcore_conf *qconf; + rte_graph_t graph_id; + uint32_t lcore_id; + rte_edge_t i; + + for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) { + if (rte_lcore_is_enabled(lcore_id) == 0) + continue; + + qconf = &lcore_conf[lcore_id]; + /* Skip graph creation if no source exists */ + if (!qconf->n_rx_queue) + continue; + /* Add rx node patterns of this lcore */ + for (i = 0; i < qconf->n_rx_queue; i++) { + graph_conf.node_patterns[nb_patterns + i] = + qconf->rx_queue_list[i].node_name; + } + graph_conf.nb_node_patterns = nb_patterns + i; + graph_conf.socket_id = rte_lcore_to_socket_id(lcore_id); + snprintf(qconf->name, sizeof(qconf->name), "worker_%u", + lcore_id); + graph_id = rte_graph_create(qconf->name, &graph_conf); + if (graph_id == RTE_GRAPH_ID_INVALID) + rte_exit(EXIT_FAILURE, + "rte_graph_create(): graph_id 
invalid for lcore %u\n", + lcore_id); + qconf->graph_id = graph_id; + qconf->graph = rte_graph_lookup(qconf->name); + /* >8 End of graph initialization. */ + if (!qconf->graph) + rte_exit(EXIT_FAILURE, + "rte_graph_lookup(): graph %s not found\n", + qconf->name); + } +} + int main(int argc, char **argv) { @@ -759,6 +928,7 @@ main(int argc, char **argv) uint16_t nb_patterns; uint8_t rewrite_len; uint32_t lcore_id; + uint16_t model; int ret; /* Init EAL */ @@ -787,6 +957,9 @@ main(int argc, char **argv) if (check_lcore_params() < 0) rte_exit(EXIT_FAILURE, "check_lcore_params() failed\n"); + if (check_worker_model_params() < 0) + rte_exit(EXIT_FAILURE, "check_worker_model_params() failed\n"); + ret = init_lcore_rx_queues(); if (ret < 0) rte_exit(EXIT_FAILURE, "init_lcore_rx_queues() failed\n"); @@ -1026,46 +1199,13 @@ main(int argc, char **argv) memset(&graph_conf, 0, sizeof(graph_conf)); graph_conf.node_patterns = node_patterns; + graph_conf.nb_node_patterns = nb_patterns; - for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) { - rte_graph_t graph_id; - rte_edge_t i; - - if (rte_lcore_is_enabled(lcore_id) == 0) - continue; - - qconf = &lcore_conf[lcore_id]; - - /* Skip graph creation if no source exists */ - if (!qconf->n_rx_queue) - continue; - - /* Add rx node patterns of this lcore */ - for (i = 0; i < qconf->n_rx_queue; i++) { - graph_conf.node_patterns[nb_patterns + i] = - qconf->rx_queue_list[i].node_name; - } - - graph_conf.nb_node_patterns = nb_patterns + i; - graph_conf.socket_id = rte_lcore_to_socket_id(lcore_id); - - snprintf(qconf->name, sizeof(qconf->name), "worker_%u", - lcore_id); - - graph_id = rte_graph_create(qconf->name, &graph_conf); - if (graph_id == RTE_GRAPH_ID_INVALID) - rte_exit(EXIT_FAILURE, - "rte_graph_create(): graph_id invalid" - " for lcore %u\n", lcore_id); - - qconf->graph_id = graph_id; - qconf->graph = rte_graph_lookup(qconf->name); - /* >8 End of graph initialization. */ - if (!qconf->graph) - rte_exit(EXIT_FAILURE, - "rte_graph_lookup(): graph %s not found\n", - qconf->name); - } + model = rte_graph_worker_model_get(); + if (model == RTE_GRAPH_MODEL_DEFAULT) + graph_config_rtc(graph_conf); + else if (model == RTE_GRAPH_MODEL_GENERIC) + graph_config_generic(graph_conf); memset(&rewrite_data, 0, sizeof(rewrite_data)); rewrite_len = sizeof(rewrite_data);