[v12,13/16] graph: enable graph multicore dispatch scheduler model

Message ID: 20230609191245.252521-14-zhirun.yan@intel.com (mailing list archive)
State: Superseded, archived
Delegated to: Thomas Monjalon
Series: graph enhancement for multi-core dispatch

Checks

Context Check Description
ci/checkpatch success coding style OK

Commit Message

Yan, Zhirun June 9, 2023, 7:12 p.m. UTC
  This patch enables selection of the new scheduler model.
RTE_GRAPH_MODEL_SELECT must be defined before including
rte_graph_worker.h to select a specific model at compile time.

Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
Signed-off-by: Zhirun Yan <zhirun.yan@intel.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
---
 doc/guides/prog_guide/graph_lib.rst | 71 ++++++++++++++++++++++++++---
 lib/graph/rte_graph_worker.h        | 13 ++++++
 2 files changed, 77 insertions(+), 7 deletions(-)
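
The compile-time selection described in the commit message can be sketched in isolation. The block below is a minimal stand-alone illustration, not the library code: every `demo_`/`DEMO_` name is a hypothetical stand-in (for `RTE_GRAPH_MODEL_SELECT`, the two model constants, and `rte_graph_walk()`), and the walk functions only report which model would have run.

```c
/* Hypothetical stand-ins for RTE_GRAPH_MODEL_RTC and
 * RTE_GRAPH_MODEL_MCORE_DISPATCH; values are assumptions. */
#define DEMO_MODEL_RTC            0
#define DEMO_MODEL_MCORE_DISPATCH 1

/* Pin the model before the dispatch function is defined, mirroring
 * "define RTE_GRAPH_MODEL_SELECT before including rte_graph_worker.h".
 * Remove this line to fall back to the runtime switch. */
#define DEMO_MODEL_SELECT DEMO_MODEL_RTC

struct demo_graph {
	int model; /* runtime model, consulted only when nothing is pinned */
};

/* Stub walkers: each returns the id of the model it represents. */
static inline int demo_walk_rtc(void) { return DEMO_MODEL_RTC; }
static inline int demo_walk_mcore_dispatch(void) { return DEMO_MODEL_MCORE_DISPATCH; }

static inline int
demo_graph_walk(const struct demo_graph *graph)
{
#if defined(DEMO_MODEL_SELECT) && (DEMO_MODEL_SELECT == DEMO_MODEL_RTC)
	(void)graph; /* no runtime check is compiled in */
	return demo_walk_rtc();
#elif defined(DEMO_MODEL_SELECT) && (DEMO_MODEL_SELECT == DEMO_MODEL_MCORE_DISPATCH)
	(void)graph;
	return demo_walk_mcore_dispatch();
#else
	switch (graph->model) {
	case DEMO_MODEL_MCORE_DISPATCH:
		return demo_walk_mcore_dispatch();
	default:
		return demo_walk_rtc();
	}
#endif
}
```

With `DEMO_MODEL_SELECT` pinned to `DEMO_MODEL_RTC`, `demo_graph_walk()` takes the RTC path even for a graph whose `model` field is `DEMO_MODEL_MCORE_DISPATCH`, because the runtime switch is never compiled. This is the fast path check that the pinned define avoids.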
  

Patch

diff --git a/doc/guides/prog_guide/graph_lib.rst b/doc/guides/prog_guide/graph_lib.rst
index 1cfdc86433..017cc25fd3 100644
--- a/doc/guides/prog_guide/graph_lib.rst
+++ b/doc/guides/prog_guide/graph_lib.rst
@@ -189,13 +189,70 @@  In the above example, A graph object will be created with ethdev Rx
 node of port 0 and queue 0, all ipv4* nodes in the system,
 and ethdev tx node of all ports.
 
-Multicore graph processing
-~~~~~~~~~~~~~~~~~~~~~~~~~~
-In the current graph library implementation, specifically,
-``rte_graph_walk()`` and ``rte_node_enqueue*`` fast path API functions
-are designed to work on single-core to have better performance.
-The fast path API works on graph object, So the multi-core graph
-processing strategy would be to create graph object PER WORKER.
+Graph models
+~~~~~~~~~~~~
+There are two different kinds of graph walking models. The user can select the model
+using the ``rte_graph_worker_model_set()`` API. If the application uses only one model,
+the fast path check can be avoided by defining the model with ``RTE_GRAPH_MODEL_SELECT``.
+For example:
+
+.. code-block:: c
+
+   #define RTE_GRAPH_MODEL_SELECT RTE_GRAPH_MODEL_RTC
+   #include "rte_graph_worker.h"
+
+RTC (Run-To-Completion)
+^^^^^^^^^^^^^^^^^^^^^^^
+This is the default graph walking model. Specifically, the ``rte_graph_walk_rtc()`` and
+``rte_node_enqueue*`` fast path API functions are designed to work on a single core for
+better performance. The fast path API works on a graph object, so the multi-core
+graph processing strategy is to create one graph object per worker.
+
+Example:
+
+Graph: node-0 -> node-1 -> node-2 @Core0.
+
+.. code-block:: diff
+
+    + - - - - - - - - - - - - - - - - - - - - - +
+    '                  Core #0                  '
+    '                                           '
+    ' +--------+     +---------+     +--------+ '
+    ' | Node-0 | --> | Node-1  | --> | Node-2 | '
+    ' +--------+     +---------+     +--------+ '
+    '                                           '
+    + - - - - - - - - - - - - - - - - - - - - - +
+
+Dispatch model
+^^^^^^^^^^^^^^
+The dispatch model enables a cross-core dispatching mechanism which employs
+a scheduling work-queue to dispatch streams to other worker cores which
+are associated with the destination node.
+
+Use ``rte_graph_model_mcore_dispatch_lcore_affinity_set()`` to set lcore affinity
+with the node.
+Each worker core will have its own copy of the graph. Use ``rte_graph_clone()`` to clone
+the graph for each worker and use ``rte_graph_model_mcore_dispatch_core_bind()`` to
+bind the graph to the worker core.
+
+Example:
+
+Graph topo: node-0 -> node-1; node-1 -> node-2; node-2 -> node-3.
+Config graph: node-0 @Core0; node-1/3 @Core1; node-2 @Core2.
+
+.. code-block:: diff
+
+    + - - - - - -+     +- - - - - - - - - - - - - +     + - - - - - -+
+    '  Core #0   '     '          Core #1         '     '  Core #2   '
+    '            '     '                          '     '            '
+    ' +--------+ '     ' +--------+    +--------+ '     ' +--------+ '
+    ' | Node-0 | - - - ->| Node-1 |    | Node-3 |<- - - - | Node-2 | '
+    ' +--------+ '     ' +--------+    +--------+ '     ' +--------+ '
+    '            '     '     |                    '     '      ^     '
+    + - - - - - -+     +- - -|- - - - - - - - - - +     + - - -|- - -+
+                             |                                 |
+                             + - - - - - - - - - - - - - - - - +
+
 
 In fast path
 ~~~~~~~~~~~~
diff --git a/lib/graph/rte_graph_worker.h b/lib/graph/rte_graph_worker.h
index 5b58f7bda9..6685600813 100644
--- a/lib/graph/rte_graph_worker.h
+++ b/lib/graph/rte_graph_worker.h
@@ -11,6 +11,7 @@  extern "C" {
 #endif
 
 #include "rte_graph_model_rtc.h"
+#include "rte_graph_model_mcore_dispatch.h"
 
 /**
  * Perform graph walk on the circular buffer and invoke the process function
@@ -25,7 +26,19 @@  __rte_experimental
 static inline void
 rte_graph_walk(struct rte_graph *graph)
 {
+#if defined(RTE_GRAPH_MODEL_SELECT) && (RTE_GRAPH_MODEL_SELECT == RTE_GRAPH_MODEL_RTC)
 	rte_graph_walk_rtc(graph);
+#elif defined(RTE_GRAPH_MODEL_SELECT) && (RTE_GRAPH_MODEL_SELECT == RTE_GRAPH_MODEL_MCORE_DISPATCH)
+	rte_graph_walk_mcore_dispatch(graph);
+#else
+	switch (rte_graph_worker_model_no_check_get(graph)) {
+	case RTE_GRAPH_MODEL_MCORE_DISPATCH:
+		rte_graph_walk_mcore_dispatch(graph);
+		break;
+	default:
+		rte_graph_walk_rtc(graph);
+	}
+#endif
 }
 
 #ifdef __cplusplus