[v10,13/16] graph: enable graph multicore dispatch scheduler model

Message ID 20230608095759.1800617-14-zhirun.yan@intel.com (mailing list archive)
State Changes Requested, archived
Delegated to: Thomas Monjalon
Series graph enhancement for multi-core dispatch

Checks

Context        Check    Description
ci/checkpatch  success  coding style OK

Commit Message

Yan, Zhirun June 8, 2023, 9:57 a.m. UTC
  This patch makes the new scheduler model selectable. To fix a specific
model at build time, define RTE_GRAPH_MODEL_SELECT before including
rte_graph_worker.h.

Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
Signed-off-by: Zhirun Yan <zhirun.yan@intel.com>
---
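
The selection logic added to rte_graph_walk() can be illustrated with a minimal, DPDK-free sketch. Everything named below (MODEL_RTC, MODEL_MCORE_DISPATCH, MODEL_SELECT, struct graph, walk_rtc, walk_mcore_dispatch, graph_walk) is a hypothetical stand-in rather than the library API, and the numeric values are assumptions. The sketch only shows the pattern: when the selector macro is defined, the preprocessor keeps a single call; otherwise a runtime switch picks the walker and falls back to RTC.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the RTE_GRAPH_MODEL_* constants; values are assumed.
 * They have to be macros (not enum constants) so that #if can compare them. */
#define MODEL_RTC            0
#define MODEL_MCORE_DISPATCH 1

/* To pin one model at compile time, define the selector before this point, e.g.
 * #define MODEL_SELECT MODEL_RTC
 */

struct graph {
	int model; /* runtime model, consulted only when MODEL_SELECT is undefined */
};

/* Each walker returns its model id so the chosen path is observable. */
static int walk_rtc(struct graph *g) { (void)g; return MODEL_RTC; }
static int walk_mcore_dispatch(struct graph *g) { (void)g; return MODEL_MCORE_DISPATCH; }

static inline int
graph_walk(struct graph *g)
{
	assert(g != NULL);
#if defined(MODEL_SELECT) && (MODEL_SELECT == MODEL_RTC)
	return walk_rtc(g);                /* no runtime check at all */
#elif defined(MODEL_SELECT) && (MODEL_SELECT == MODEL_MCORE_DISPATCH)
	return walk_mcore_dispatch(g);     /* no runtime check at all */
#else
	switch (g->model) {                /* selector undefined: decide per graph */
	case MODEL_MCORE_DISPATCH:
		return walk_mcore_dispatch(g);
	default:
		return walk_rtc(g);
	}
#endif
}
```

With the selector left undefined, a graph whose model field is set to the dispatch value takes the dispatch walker and any other value takes the RTC walker. Defining the selector removes the switch from the fast path, which is the saving the commit message refers to.
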
 doc/guides/prog_guide/graph_lib.rst | 71 ++++++++++++++++++++++++++---
 lib/graph/rte_graph_worker.h        | 13 ++++++
 2 files changed, 77 insertions(+), 7 deletions(-)
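
The dispatch example from the documentation hunk (node-0 @Core0; node-1/3 @Core1; node-2 @Core2) can be traced with a small single-threaded simulation. This is only a sketch of the idea of handing a stream to the core that owns the destination node. The queue type, the polling loop and all names here are hypothetical, and a real implementation would use per-core lockless rings and run the cores concurrently.

```c
#include <assert.h>
#include <stddef.h>

#define NB_NODES 4
#define NB_CORES 3
#define NO_NEXT  (-1)

/* Affinity from the example: node-0 @Core0; node-1/3 @Core1; node-2 @Core2. */
static const int node_core[NB_NODES] = { 0, 1, 2, 1 };
/* Topology: node-0 -> node-1 -> node-2 -> node-3. */
static const int node_next[NB_NODES] = { 1, 2, 3, NO_NEXT };

/* One tiny FIFO work queue per core (hypothetical, never wraps in this sketch). */
struct work_queue {
	int nodes[NB_NODES];
	size_t head, tail;
};

/* Records that a node ran on a given core. */
struct hop {
	int node;
	int core;
};

static void
wq_push(struct work_queue *q, int node)
{
	assert(q->tail < NB_NODES);
	q->nodes[q->tail++] = node;
}

/* Drain one core's queue: run each pending node, then hand its successor
 * to the queue of the core that owns that successor. Returns nodes run. */
static size_t
core_walk(struct work_queue wq[NB_CORES], int core, struct hop trace[], size_t *len)
{
	struct work_queue *q = &wq[core];
	size_t ran = 0;

	while (q->head < q->tail) {
		int node = q->nodes[q->head++];
		int next = node_next[node];

		trace[*len].node = node;
		trace[*len].core = core;
		(*len)++;
		if (next != NO_NEXT)
			wq_push(&wq[node_core[next]], next);
		ran++;
	}
	return ran;
}

/* Poll the cores in turn until none has work left; returns the trace length. */
static size_t
run_stream(struct hop trace[NB_NODES])
{
	struct work_queue wq[NB_CORES] = { 0 };
	size_t len = 0, ran;

	wq_push(&wq[node_core[0]], 0); /* the source node starts on its own core */
	do {
		ran = 0;
		for (int core = 0; core < NB_CORES; core++)
			ran += core_walk(wq, core, trace, &len);
	} while (ran != 0);
	return len;
}
```

Calling run_stream() fills the trace with four hops in the order node-0 on core 0, node-1 on core 1, node-2 on core 2, node-3 on core 1, matching the diagram in the documentation hunk.
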
  

Comments

Jerin Jacob June 8, 2023, 10:42 a.m. UTC | #1
On Thu, Jun 8, 2023 at 3:35 PM Zhirun Yan <zhirun.yan@intel.com> wrote:
>
> This patch makes the new scheduler model selectable. To fix a specific
> model at build time, define RTE_GRAPH_MODEL_SELECT before including
> rte_graph_worker.h.
>
> Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
> Signed-off-by: Cunming Liang <cunming.liang@intel.com>
> Signed-off-by: Zhirun Yan <zhirun.yan@intel.com>

Acked-by: Jerin Jacob <jerinj@marvell.com>


  
Pavan Nikhilesh Bhagavatula June 8, 2023, 2:29 p.m. UTC | #2
> This patch makes the new scheduler model selectable. To fix a specific
> model at build time, define RTE_GRAPH_MODEL_SELECT before including
> rte_graph_worker.h.
> 
> Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
> Signed-off-by: Cunming Liang <cunming.liang@intel.com>
> Signed-off-by: Zhirun Yan <zhirun.yan@intel.com>

Acked-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
  

Patch

diff --git a/doc/guides/prog_guide/graph_lib.rst b/doc/guides/prog_guide/graph_lib.rst
index 1cfdc86433..017cc25fd3 100644
--- a/doc/guides/prog_guide/graph_lib.rst
+++ b/doc/guides/prog_guide/graph_lib.rst
@@ -189,13 +189,70 @@  In the above example, A graph object will be created with ethdev Rx
 node of port 0 and queue 0, all ipv4* nodes in the system,
 and ethdev tx node of all ports.
 
-Multicore graph processing
-~~~~~~~~~~~~~~~~~~~~~~~~~~
-In the current graph library implementation, specifically,
-``rte_graph_walk()`` and ``rte_node_enqueue*`` fast path API functions
-are designed to work on single-core to have better performance.
-The fast path API works on graph object, So the multi-core graph
-processing strategy would be to create graph object PER WORKER.
+Graph models
+~~~~~~~~~~~~
+There are two graph walking models. The application can select the model at runtime with
+the ``rte_graph_worker_model_set()`` API. If the application uses only one model, the fast
+path check can be avoided by defining ``RTE_GRAPH_MODEL_SELECT`` before the worker include.
+For example:
+
+.. code-block:: c
+
+    #define RTE_GRAPH_MODEL_SELECT RTE_GRAPH_MODEL_RTC
+    #include "rte_graph_worker.h"
+
+RTC (Run-To-Completion)
+^^^^^^^^^^^^^^^^^^^^^^^
+This is the default graph walking model. Specifically, ``rte_graph_walk_rtc()`` and
+``rte_node_enqueue*`` fast path API functions are designed to work on a single core
+for better performance. The fast path API works on a graph object, so the multi-core
+graph processing strategy is to create one graph object per worker.
+
+Example:
+
+Graph: node-0 -> node-1 -> node-2 @Core0.
+
+.. code-block:: diff
+
+    + - - - - - - - - - - - - - - - - - - - - - +
+    '                  Core #0                  '
+    '                                           '
+    ' +--------+     +---------+     +--------+ '
+    ' | Node-0 | --> | Node-1  | --> | Node-2 | '
+    ' +--------+     +---------+     +--------+ '
+    '                                           '
+    + - - - - - - - - - - - - - - - - - - - - - +
+
+Dispatch model
+^^^^^^^^^^^^^^
+The dispatch model enables a cross-core dispatching mechanism which employs
+a scheduling work-queue to dispatch streams to the other worker cores that
+are associated with the destination node.
+
+Use ``rte_graph_model_mcore_dispatch_lcore_affinity_set()`` to set the lcore affinity
+of a node.
+Each worker core has its own copy of the graph. Use ``rte_graph_clone()`` to clone
+the graph for each worker and use ``rte_graph_model_mcore_dispatch_core_bind()`` to
+bind the graph to the worker core.
+
+Example:
+
+Graph topo: node-0 -> node-1; node-1 -> node-2; node-2 -> node-3.
+Config graph: node-0 @Core0; node-1/3 @Core1; node-2 @Core2.
+
+.. code-block:: diff
+
+    + - - - - - -+     +- - - - - - - - - - - - - +     + - - - - - -+
+    '  Core #0   '     '          Core #1         '     '  Core #2   '
+    '            '     '                          '     '            '
+    ' +--------+ '     ' +--------+    +--------+ '     ' +--------+ '
+    ' | Node-0 | - - - ->| Node-1 |    | Node-3 |<- - - - | Node-2 | '
+    ' +--------+ '     ' +--------+    +--------+ '     ' +--------+ '
+    '            '     '     |                    '     '      ^     '
+    + - - - - - -+     +- - -|- - - - - - - - - - +     + - - -|- - -+
+                             |                                 |
+                             + - - - - - - - - - - - - - - - - +
+
 
 In fast path
 ~~~~~~~~~~~~
diff --git a/lib/graph/rte_graph_worker.h b/lib/graph/rte_graph_worker.h
index 5b58f7bda9..6685600813 100644
--- a/lib/graph/rte_graph_worker.h
+++ b/lib/graph/rte_graph_worker.h
@@ -11,6 +11,7 @@  extern "C" {
 #endif
 
 #include "rte_graph_model_rtc.h"
+#include "rte_graph_model_mcore_dispatch.h"
 
 /**
  * Perform graph walk on the circular buffer and invoke the process function
@@ -25,7 +26,19 @@  __rte_experimental
 static inline void
 rte_graph_walk(struct rte_graph *graph)
 {
+#if defined(RTE_GRAPH_MODEL_SELECT) && (RTE_GRAPH_MODEL_SELECT == RTE_GRAPH_MODEL_RTC)
 	rte_graph_walk_rtc(graph);
+#elif defined(RTE_GRAPH_MODEL_SELECT) && (RTE_GRAPH_MODEL_SELECT == RTE_GRAPH_MODEL_MCORE_DISPATCH)
+	rte_graph_walk_mcore_dispatch(graph);
+#else
+	switch (rte_graph_worker_model_no_check_get(graph)) {
+	case RTE_GRAPH_MODEL_MCORE_DISPATCH:
+		rte_graph_walk_mcore_dispatch(graph);
+		break;
+	default:
+		rte_graph_walk_rtc(graph);
+	}
+#endif
 }
 
 #ifdef __cplusplus