[v2,2/2] doc/contributing: guidelines for logging, tracing and telemetry

Message ID 20230620170728.74117-3-bruce.richardson@intel.com (mailing list archive)
State Superseded, archived
Delegated to: Thomas Monjalon
Series Improve docs on getting info on running process

Checks

Context                          Check    Description
ci/checkpatch                    warning  coding style issues
ci/Intel-compilation             success  Compilation OK
ci/intel-Testing                 success  Testing PASS
ci/github-robot: build           success  github build: passed
ci/iol-mellanox-Performance      success  Performance Testing PASS
ci/iol-aarch-unit-testing        success  Testing PASS
ci/iol-abi-testing               success  Testing PASS
ci/iol-unit-testing              success  Testing PASS
ci/iol-x86_64-compile-testing    success  Testing PASS
ci/intel-Functional              success  Functional PASS
ci/loongarch-unit-testing        success  Unit Testing PASS
ci/loongarch-compilation         success  Compilation OK
ci/iol-intel-Performance         success  Performance Testing PASS
ci/iol-broadcom-Performance      success  Performance Testing PASS
ci/iol-intel-Functional          success  Functional Testing PASS
ci/iol-broadcom-Functional       success  Functional Testing PASS
ci/iol-aarch64-compile-testing   success  Testing PASS
ci/iol-testing                   success  Testing PASS
ci/iol-x86_64-unit-testing       success  Testing PASS

Commit Message

Bruce Richardson June 20, 2023, 5:07 p.m. UTC
As discussed by DPDK technical board [1], our contributor guide should
include some details as to when to use logging vs tracing vs telemetry
to provide the end user with information about the running process and
the DPDK libraries it uses.

[1] https://mails.dpdk.org/archives/dev/2023-March/265204.html

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>

---
V2:
* Added note about not logging from unused drivers
* Reworked bullets/sub-bullets on tracing vs debug logs for debugging.
  - Consensus in replies was that people liked having debug logs for
    single-use, e.g. init cases.
  - Kept recommendation for tracing for data-path only
* Added short discussion of *_dump() functions at end of section
* Added sentence to indicate that telemetry should be read-only
* Added mention of common trace format and that other tools are
  available for it.
---
 doc/guides/contributing/coding_style.rst |  2 +
 doc/guides/contributing/design.rst       | 47 ++++++++++++++++++++++++
 doc/guides/prog_guide/telemetry_lib.rst  |  2 +
 doc/guides/prog_guide/trace_lib.rst      |  2 +
 4 files changed, 53 insertions(+)
  

Comments

David Marchand July 4, 2023, 7:54 a.m. UTC | #1
On Tue, Jun 20, 2023 at 7:08 PM Bruce Richardson
<bruce.richardson@intel.com> wrote:
>
> As discussed by DPDK technical board [1], our contributor guide should
> include some details as to when to use logging vs tracing vs telemetry
> to provide the end user with information about the running process and
> the DPDK libraries it uses.
>
> [1] https://mails.dpdk.org/archives/dev/2023-March/265204.html
>
> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> Acked-by: Morten Brørup <mb@smartsharesystems.com>
>
> ---
> V2:
> * Added note about not logging from unused drivers
> * Reworked bullets/sub-bullets on tracing vs debug logs for debugging.
>   - Consensus in replies was that people liked having debug logs for
>     single-use, e.g. init cases.
>   - Kept recommendation for tracing for data-path only
> * Added short discussion of *_dump() functions at end of section
> * Added sentence to indicate that telemetry should be read-only
> * Added mention of common trace format and that other tools are
>   available for it.
> ---
>  doc/guides/contributing/coding_style.rst |  2 +
>  doc/guides/contributing/design.rst       | 47 ++++++++++++++++++++++++
>  doc/guides/prog_guide/telemetry_lib.rst  |  2 +
>  doc/guides/prog_guide/trace_lib.rst      |  2 +
>  4 files changed, 53 insertions(+)
>
> diff --git a/doc/guides/contributing/coding_style.rst b/doc/guides/contributing/coding_style.rst
> index 307c7deb9a..0861305dc6 100644
> --- a/doc/guides/contributing/coding_style.rst
> +++ b/doc/guides/contributing/coding_style.rst
> @@ -794,6 +794,8 @@ Control Statements
>                   /* NOTREACHED */
>           }
>
> +.. _dynamic_logging:
> +
>  Dynamic Logging
>  ---------------
>
> diff --git a/doc/guides/contributing/design.rst b/doc/guides/contributing/design.rst
> index d24a7ff6a0..30104e0bfb 100644
> --- a/doc/guides/contributing/design.rst
> +++ b/doc/guides/contributing/design.rst
> @@ -4,6 +4,53 @@
>  Design
>  ======
>
> +Runtime Information - Logging, Tracing and Telemetry
> +-------------------------------------------------------

I would put this section right before the statistics section (since
telemetry and stats are somehow related).


> +
> +It is often desirable to provide information to the end-user as to what is happening to the application at runtime.
> +DPDK provides a number of built-in mechanisms to provide this introspection:
> +
> +* :ref:`Logging<dynamic_logging>`
> +* :ref:`Tracing<trace_library>`
> +* :ref:`Telemetry<telemetry_library>`
> +
> +Each of these has it's own strengths and suitabilities for use within DPDK components.

Nit: its* ?


> +
> +Below are some guidelines for when each should be used:
> +
> +* For reporting error conditions, or other abnormal runtime issues, *logging* should be used.
> +  Depending on the severity of the issue, the appropriate log level, for example,
> +  ``ERROR``, ``WARNING`` or ``NOTICE``, should be used.
> +
> +.. note::
> +
> +    Drivers off all classes, including both bus and device drivers,

Nit: of*

> +    should not output any log information if the hardware they support is not present.
> +    This is to avoid any changes in output for existing users when a new driver is added to DPDK.
> +
> +* For component initialization, or other cases where a path through the code is only likely to be taken once,
> +  either *logging* at ``DEBUG`` level or *tracing* may be used, or potentially both.
> +  In the latter case, tracing can provide basic information as to the code path taken,
> +  with debug-level logging providing additional details on internal state,
> +  not possible to emit via tracing.
> +
> +* For a component's data-path, where a path is to be taken multiple times within a short timeframe,
> +  *tracing* should be used.
> +  Since DPDK tracing uses `Common Trace Format <https://diamon.org/ctf/>`_ for its tracing logs,
> +  post-analysis can be done using a range of external tools.
> +
> +* For numerical or statistical data generated by a component, for example, per-packet statistics,
> +  *telemetry* should be used.
> +
> +* For any data where the data may need to be gathered at any point in the execution to help assess the state of the application component,
> +  for example, core configuration, device information, *telemetry* should be used.
> +  Telemetry callbacks should not modify any program state, but be "read-only".
> +
> +Many libraries also include a ``rte_<libname>_dump()`` function as part of their API,
> +writing verbose internal details to a given file-handle.
> +New libraries are encouraged to provide such functions where it makes sense to do so,
> +as they provide an additional application-controlled mechanism to get details of the internals of a DPDK component.
> +
>  Environment or Architecture-specific Sources
>  --------------------------------------------
>
> diff --git a/doc/guides/prog_guide/telemetry_lib.rst b/doc/guides/prog_guide/telemetry_lib.rst
> index 32f525a67f..71f8bd735e 100644
> --- a/doc/guides/prog_guide/telemetry_lib.rst
> +++ b/doc/guides/prog_guide/telemetry_lib.rst
> @@ -1,6 +1,8 @@
>  ..  SPDX-License-Identifier: BSD-3-Clause
>      Copyright(c) 2020 Intel Corporation.
>
> +.. _telemetry_library:
> +
>  Telemetry Library
>  =================
>
> diff --git a/doc/guides/prog_guide/trace_lib.rst b/doc/guides/prog_guide/trace_lib.rst
> index e5718feddc..a3b8a7c2eb 100644
> --- a/doc/guides/prog_guide/trace_lib.rst
> +++ b/doc/guides/prog_guide/trace_lib.rst
> @@ -1,6 +1,8 @@
>  ..  SPDX-License-Identifier: BSD-3-Clause
>      Copyright(C) 2020 Marvell International Ltd.
>
> +.. _trace_library:
> +
>  Trace Library
>  =============
>
> --
> 2.39.2
>

Reviewed-by: David Marchand <david.marchand@redhat.com>
  

Patch

diff --git a/doc/guides/contributing/coding_style.rst b/doc/guides/contributing/coding_style.rst
index 307c7deb9a..0861305dc6 100644
--- a/doc/guides/contributing/coding_style.rst
+++ b/doc/guides/contributing/coding_style.rst
@@ -794,6 +794,8 @@  Control Statements
                  /* NOTREACHED */
          }
 
+.. _dynamic_logging:
+
 Dynamic Logging
 ---------------
 
diff --git a/doc/guides/contributing/design.rst b/doc/guides/contributing/design.rst
index d24a7ff6a0..30104e0bfb 100644
--- a/doc/guides/contributing/design.rst
+++ b/doc/guides/contributing/design.rst
@@ -4,6 +4,53 @@ 
 Design
 ======
 
+Runtime Information - Logging, Tracing and Telemetry
+-------------------------------------------------------
+
+It is often desirable to provide information to the end-user as to what is happening to the application at runtime.
+DPDK provides a number of built-in mechanisms to provide this introspection:
+
+* :ref:`Logging<dynamic_logging>`
+* :ref:`Tracing<trace_library>`
+* :ref:`Telemetry<telemetry_library>`
+
+Each of these has its own strengths and suitabilities for use within DPDK components.
+
+Below are some guidelines for when each should be used:
+
+* For reporting error conditions, or other abnormal runtime issues, *logging* should be used.
+  Depending on the severity of the issue, the appropriate log level, for example,
+  ``ERROR``, ``WARNING`` or ``NOTICE``, should be used.
+
+.. note::
+
+    Drivers of all classes, including both bus and device drivers,
+    should not output any log information if the hardware they support is not present.
+    This is to avoid any changes in output for existing users when a new driver is added to DPDK.
+
+* For component initialization, or other cases where a path through the code is only likely to be taken once,
+  either *logging* at ``DEBUG`` level or *tracing* may be used, or potentially both.
+  In the latter case, tracing can provide basic information as to the code path taken,
+  with debug-level logging providing additional details on internal state,
+  not possible to emit via tracing.
+
+* For a component's data-path, where a path is to be taken multiple times within a short timeframe,
+  *tracing* should be used.
+  Since DPDK tracing uses `Common Trace Format <https://diamon.org/ctf/>`_ for its tracing logs,
+  post-analysis can be done using a range of external tools.
+
+* For numerical or statistical data generated by a component, for example, per-packet statistics,
+  *telemetry* should be used.
+
+* For any data where the data may need to be gathered at any point in the execution to help assess the state of the application component,
+  for example, core configuration, device information, *telemetry* should be used.
+  Telemetry callbacks should not modify any program state, but be "read-only".
+
+Many libraries also include a ``rte_<libname>_dump()`` function as part of their API,
+writing verbose internal details to a given file-handle.
+New libraries are encouraged to provide such functions where it makes sense to do so,
+as they provide an additional application-controlled mechanism to get details of the internals of a DPDK component.
+
 Environment or Architecture-specific Sources
 --------------------------------------------
 
diff --git a/doc/guides/prog_guide/telemetry_lib.rst b/doc/guides/prog_guide/telemetry_lib.rst
index 32f525a67f..71f8bd735e 100644
--- a/doc/guides/prog_guide/telemetry_lib.rst
+++ b/doc/guides/prog_guide/telemetry_lib.rst
@@ -1,6 +1,8 @@ 
 ..  SPDX-License-Identifier: BSD-3-Clause
     Copyright(c) 2020 Intel Corporation.
 
+.. _telemetry_library:
+
 Telemetry Library
 =================
 
diff --git a/doc/guides/prog_guide/trace_lib.rst b/doc/guides/prog_guide/trace_lib.rst
index e5718feddc..a3b8a7c2eb 100644
--- a/doc/guides/prog_guide/trace_lib.rst
+++ b/doc/guides/prog_guide/trace_lib.rst
@@ -1,6 +1,8 @@ 
 ..  SPDX-License-Identifier: BSD-3-Clause
     Copyright(C) 2020 Marvell International Ltd.
 
+.. _trace_library:
+
 Trace Library
 =============
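
Illustrative note (not part of the patch): the design.rst hunk above recommends
``rte_<libname>_dump()``-style functions that write internal details to a given
file handle, and telemetry callbacks that stay "read-only". A minimal sketch of
those two shapes in plain C follows. It assumes nothing from DPDK: the struct
``demo_queue`` and both function names are hypothetical, and the real telemetry
library uses its own callback signature and data types rather than a character
buffer.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical component state; not a real DPDK structure. */
struct demo_queue {
	const char *name;
	uint32_t capacity;
	uint32_t used;
	uint64_t enqueued; /* statistics counter */
	uint64_t dropped;  /* statistics counter */
};

/*
 * Dump-style helper in the spirit of rte_<libname>_dump(): writes verbose
 * internal details to a caller-supplied file handle. The const pointer
 * documents that the component state is only read, never changed.
 */
void
demo_queue_dump(FILE *f, const struct demo_queue *q)
{
	assert(f != NULL && q != NULL);
	fprintf(f, "queue <%s>\n", q->name);
	fprintf(f, "  capacity=%" PRIu32 "\n", q->capacity);
	fprintf(f, "  used=%" PRIu32 "\n", q->used);
	fprintf(f, "  enqueued=%" PRIu64 "\n", q->enqueued);
	fprintf(f, "  dropped=%" PRIu64 "\n", q->dropped);
}

/*
 * Telemetry-style "read-only" callback (hypothetical signature): formats the
 * counters into buf without touching the queue. Returns the number of
 * characters written, or -1 if the buffer is too small.
 */
int
demo_queue_stats_cb(const struct demo_queue *q, char *buf, size_t len)
{
	int n;

	assert(q != NULL && buf != NULL);
	n = snprintf(buf, len,
		"{\"enqueued\": %" PRIu64 ", \"dropped\": %" PRIu64 "}",
		q->enqueued, q->dropped);
	return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```

In this sketch the application chooses where dump output goes (``stdout``, a
log file, a socket wrapped in a ``FILE *``), which is what makes the mechanism
application-controlled. The statistics callback can be called at any point in
execution because it has no side effects on the queue.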