[v2] eal: add build-time option to omit trace

Message ID 20240919080658.252532-1-mb@smartsharesystems.com (mailing list archive)
State Superseded
Delegated to: Thomas Monjalon
Headers
Series [v2] eal: add build-time option to omit trace |

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/loongarch-compilation success Compilation OK
ci/loongarch-unit-testing success Unit Testing PASS
ci/Intel-compilation success Compilation OK
ci/intel-Testing success Testing PASS
ci/intel-Functional success Functional PASS
ci/github-robot: build success github build: passed
ci/iol-marvell-Functional success Functional Testing PASS
ci/iol-broadcom-Performance success Performance Testing PASS
ci/iol-sample-apps-testing success Testing PASS
ci/iol-unit-amd64-testing success Testing PASS
ci/iol-compile-amd64-testing warning Testing issues
ci/iol-unit-arm64-testing success Testing PASS
ci/iol-compile-arm64-testing success Testing PASS
ci/iol-intel-Performance success Performance Testing PASS
ci/iol-intel-Functional success Functional Testing PASS

Commit Message

Morten Brørup Sept. 19, 2024, 8:06 a.m. UTC
Some applications want to omit the trace feature.
Either to reduce the memory footprint, to reduce the exposed attack
surface, or for other reasons.

This patch adds an option in rte_config.h to include or omit trace in the
build. Trace is included by default.

Omitting trace works by omitting all trace points.
For API and ABI compatibility, the trace feature itself remains.

Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
---
v2:
* Added/modified macros required for building applications with trace
  omitted.
---
 app/test/test_trace.c                      | 11 ++++++++++-
 config/rte_config.h                        |  1 +
 lib/eal/include/rte_trace_point.h          | 22 +++++++++++++++++++++-
 lib/eal/include/rte_trace_point_register.h | 17 +++++++++++++++++
 4 files changed, 49 insertions(+), 2 deletions(-)
  

Comments

Jerin Jacob Sept. 19, 2024, 8:51 a.m. UTC | #1
On Thu, Sep 19, 2024 at 1:37 PM Morten Brørup <mb@smartsharesystems.com> wrote:
>
> Some applications want to omit the trace feature.
> Either to reduce the memory footprint, to reduce the exposed attack
> surface, or for other reasons.
>
> This patch adds an option in rte_config.h to include or omit trace in the
> build. Trace is included by default.
>
> Omitting trace works by omitting all trace points.
> For API and ABI compatibility, the trace feature itself remains.
>
> Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
> ---
> v2:
> * Added/modified macros required for building applications with trace
>   omitted.
> ---
>  app/test/test_trace.c                      | 11 ++++++++++-
>  config/rte_config.h                        |  1 +
>  lib/eal/include/rte_trace_point.h          | 22 +++++++++++++++++++++-
>  lib/eal/include/rte_trace_point_register.h | 17 +++++++++++++++++
>  4 files changed, 49 insertions(+), 2 deletions(-)
>
> diff --git a/app/test/test_trace.c b/app/test/test_trace.c
> index 00809f433b..7918cc865d 100644
> --- a/app/test/test_trace.c
> +++ b/app/test/test_trace.c
> @@ -12,7 +12,16 @@
>
>  int app_dpdk_test_tp_count;
>
> -#ifdef RTE_EXEC_ENV_WINDOWS
> +#if !defined(RTE_TRACE)
> +
> +static int
> +test_trace(void)
> +{
> +       printf("trace omitted at build-time, skipping test\n");
> +       return TEST_SKIPPED;
> +}
> +
> +#elif defined(RTE_EXEC_ENV_WINDOWS)
>
>  static int
>  test_trace(void)
> diff --git a/config/rte_config.h b/config/rte_config.h
> index dd7bb0d35b..fd6f8a2f1a 100644
> --- a/config/rte_config.h
> +++ b/config/rte_config.h
> @@ -49,6 +49,7 @@
>  #define RTE_MAX_TAILQ 32
>  #define RTE_LOG_DP_LEVEL RTE_LOG_INFO
>  #define RTE_MAX_VFIO_CONTAINERS 64
> +#define RTE_TRACE 1


Too many ifdefs down using this, IMO, This approach will make
difficult to manage the code and make sure
 RTE_TRACE 0 works always.

I understand your concern about to reduce the expose attach surface.
BTW, it will be there with telemetry too.
Disabling full feature  for the code via config/rte_config.h is new to
DPDK. I think, we need wider opinion on this(Adding techboard).
Also, DPDK never concerned with binary size. Other than binary size,
following patch and exposed the config through meson
will fix the other problems. IMO, That's the correct tradeoff.


#define __rte_trace_point_emit_header_generic(t) \
void *mem; \
do { \
  +      if (RTE_TRACE == 0) \
  +             return \
        const uint64_t val = rte_atomic_load_explicit(t,
rte_memory_order_acquire); \
        if (likely(!(val & __RTE_TRACE_FIELD_ENABLE_MASK))) \
                return; \
        mem = __rte_trace_mem_get(val); \
        if (unlikely(mem == NULL)) \
                return; \
        mem = __rte_trace_point_emit_ev_header(mem, val); \
} while (0)

>
>  /* bsd module defines */
>  #define RTE_CONTIGMEM_MAX_NUM_BUFS 64
> diff --git a/lib/eal/include/rte_trace_point.h b/lib/eal/include/rte_trace_point.h
> index 41e2a7f99e..fd864125a7 100644
> --- a/lib/eal/include/rte_trace_point.h
> +++ b/lib/eal/include/rte_trace_point.h
> @@ -42,6 +42,8 @@ typedef RTE_ATOMIC(uint64_t) rte_trace_point_t;
>   */
>  #define RTE_TRACE_POINT_ARGS
>
> +#ifdef RTE_TRACE
> +
>  /** @internal Helper macro to support RTE_TRACE_POINT and RTE_TRACE_POINT_FP */
>  #define __RTE_TRACE_POINT(_mode, _tp, _args, ...) \
>  extern rte_trace_point_t __##_tp; \
> @@ -100,6 +102,20 @@ _tp _args \
>  #define RTE_TRACE_POINT_FP(tp, args, ...) \
>         __RTE_TRACE_POINT(fp, tp, args, __VA_ARGS__)
>
> +#else
> +
> +/** @internal Helper macro to support RTE_TRACE_POINT and RTE_TRACE_POINT_FP */
> +#define __RTE_TRACE_POINT_VOID(_tp, ...) \
> +static __rte_always_inline void \
> +_tp(...) \
> +{ \
> +}
> +
> +#define RTE_TRACE_POINT(tp, args, ...) __RTE_TRACE_POINT_VOID(tp, args)
> +#define RTE_TRACE_POINT_FP(tp, args, ...) __RTE_TRACE_POINT_VOID(tp, args)
> +
> +#endif /* RTE_TRACE */
> +
>  #ifdef __DOXYGEN__
>
>  /**
> @@ -212,6 +228,7 @@ bool rte_trace_point_is_enabled(rte_trace_point_t *tp);
>  __rte_experimental
>  rte_trace_point_t *rte_trace_point_lookup(const char *name);
>
> +#ifdef RTE_TRACE
>  /**
>   * @internal
>   *
> @@ -230,6 +247,7 @@ __rte_trace_point_fp_is_enabled(void)
>         return false;
>  #endif
>  }
> +#endif /* RTE_TRACE */
>
>  /**
>   * @internal
> @@ -356,6 +374,8 @@ __rte_trace_point_emit_ev_header(void *mem, uint64_t in)
>         return RTE_PTR_ADD(mem, __RTE_TRACE_EVENT_HEADER_SZ);
>  }
>
> +#ifdef RTE_TRACE
> +
>  #define __rte_trace_point_emit_header_generic(t) \
>  void *mem; \
>  do { \
> @@ -411,7 +431,7 @@ do { \
>         RTE_SET_USED(len); \
>  } while (0)
>
> -
> +#endif /* RTE_TRACE */
>  #endif /* ALLOW_EXPERIMENTAL_API */
>  #endif /* _RTE_TRACE_POINT_REGISTER_H_ */
>
> diff --git a/lib/eal/include/rte_trace_point_register.h b/lib/eal/include/rte_trace_point_register.h
> index 41260e5964..567fce534e 100644
> --- a/lib/eal/include/rte_trace_point_register.h
> +++ b/lib/eal/include/rte_trace_point_register.h
> @@ -18,6 +18,8 @@ extern "C" {
>
>  RTE_DECLARE_PER_LCORE(volatile int, trace_point_sz);
>
> +#ifdef RTE_TRACE
> +
>  #define RTE_TRACE_POINT_REGISTER(trace, name) \
>  rte_trace_point_t __rte_section("__rte_trace_point") __##trace; \
>  static const char __##trace##_name[] = RTE_STR(name); \
> @@ -56,6 +58,21 @@ do { \
>                 RTE_STR(uint8_t)); \
>  } while (0)
>
> +#else
> +
> +#define RTE_TRACE_POINT_REGISTER(...)
> +#define __rte_trace_point_emit_header_generic(t) RTE_SET_USED(t)
> +#define __rte_trace_point_emit_header_fp(t) RTE_SET_USED(t)
> +#define __rte_trace_point_emit(in, type) RTE_SET_USED(in)
> +#define rte_trace_point_emit_string(in) RTE_SET_USED(in)
> +#define rte_trace_point_emit_blob(in, len) \
> +do { \
> +       RTE_SET_USED(in); \
> +       RTE_SET_USED(len); \
> +} while (0)
> +
> +#endif /* RTE_TRACE */
> +
>  #ifdef __cplusplus
>  }
>  #endif
> --
> 2.43.0
>
  
Morten Brørup Sept. 19, 2024, 1:15 p.m. UTC | #2
> From: Jerin Jacob [mailto:jerinjacobk@gmail.com]
> Sent: Thursday, 19 September 2024 10.52
> 
> On Thu, Sep 19, 2024 at 1:37 PM Morten Brørup <mb@smartsharesystems.com>
> wrote:
> >
> > Some applications want to omit the trace feature.
> > Either to reduce the memory footprint, to reduce the exposed attack
> > surface, or for other reasons.
> >
> > This patch adds an option in rte_config.h to include or omit trace in the
> > build. Trace is included by default.
> >
> > Omitting trace works by omitting all trace points.
> > For API and ABI compatibility, the trace feature itself remains.
> >
> > Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
> > ---
> > v2:
> > * Added/modified macros required for building applications with trace
> >   omitted.
> > ---
> >  app/test/test_trace.c                      | 11 ++++++++++-
> >  config/rte_config.h                        |  1 +
> >  lib/eal/include/rte_trace_point.h          | 22 +++++++++++++++++++++-
> >  lib/eal/include/rte_trace_point_register.h | 17 +++++++++++++++++
> >  4 files changed, 49 insertions(+), 2 deletions(-)
> >
> > diff --git a/app/test/test_trace.c b/app/test/test_trace.c
> > index 00809f433b..7918cc865d 100644
> > --- a/app/test/test_trace.c
> > +++ b/app/test/test_trace.c
> > @@ -12,7 +12,16 @@
> >
> >  int app_dpdk_test_tp_count;
> >
> > -#ifdef RTE_EXEC_ENV_WINDOWS
> > +#if !defined(RTE_TRACE)
> > +
> > +static int
> > +test_trace(void)
> > +{
> > +       printf("trace omitted at build-time, skipping test\n");
> > +       return TEST_SKIPPED;
> > +}
> > +
> > +#elif defined(RTE_EXEC_ENV_WINDOWS)
> >
> >  static int
> >  test_trace(void)
> > diff --git a/config/rte_config.h b/config/rte_config.h
> > index dd7bb0d35b..fd6f8a2f1a 100644
> > --- a/config/rte_config.h
> > +++ b/config/rte_config.h
> > @@ -49,6 +49,7 @@
> >  #define RTE_MAX_TAILQ 32
> >  #define RTE_LOG_DP_LEVEL RTE_LOG_INFO
> >  #define RTE_MAX_VFIO_CONTAINERS 64
> > +#define RTE_TRACE 1
> 
> 
> Too many ifdefs down using this, IMO, This approach will make
> difficult to manage the code and make sure
>  RTE_TRACE 0 works always.

I agree it's not pretty, but it's the best solution I can come up with, to meet the requirements driving this patch: reducing attack surface and memory footprint.

> 
> I understand your concern about to reduce the expose attach surface.
> BTW, it will be there with telemetry too.

Yes, I have telemetry in mind too. :-)
Got to work on them one by one, though.

> Disabling full feature  for the code via config/rte_config.h is new to
> DPDK. I think, we need wider opinion on this(Adding techboard).

Yes, it’s a good topic for discussion in the techboard.

Currently, there are three levels of build-time configuration:
1. Meson, for prominent options.
2. config/rte_config.h for less prominent options, e.g. enabling RTE_ASSERT(), mempool statistics, etc.
3. Header files in various modules/drivers, for tweaking the individual module.

There didn't seem to be any interest in enabling/disabling trace via Meson, so I chose the next level of configuration.

> Also, DPDK never concerned with binary size.

That is a big mistake. DPDK is not only for cloud.
DPDK should also be for CPEs, SOHO firewalls, and similar network appliances. For some of these high-volume network appliances, memory footprint matters.

Also, when running many instances of a service in the cloud, power consumption matters. Bloat increases power consumption.

> Other than binary size,
> following patch and exposed the config through meson
> will fix the other problems. IMO, That's the correct tradeoff.

Yes; the optimal solution would be making EAL less bloated, i.e. purely an abstraction layer to the underlying hardware and O/S.
This would make trace an independent module, which could be enabled/disabled through Meson.

Bruce has contributed a lot in this area, and already made many modules optional.

> 
> 
> #define __rte_trace_point_emit_header_generic(t) \
> void *mem; \
> do { \
>   +      if (RTE_TRACE == 0) \
>   +             return \
>         const uint64_t val = rte_atomic_load_explicit(t,
> rte_memory_order_acquire); \
>         if (likely(!(val & __RTE_TRACE_FIELD_ENABLE_MASK))) \
>                 return; \
>         mem = __rte_trace_mem_get(val); \
>         if (unlikely(mem == NULL)) \
>                 return; \
>         mem = __rte_trace_point_emit_ev_header(mem, val); \
> } while (0)
> 
> >
> >  /* bsd module defines */
> >  #define RTE_CONTIGMEM_MAX_NUM_BUFS 64
> > diff --git a/lib/eal/include/rte_trace_point.h
> b/lib/eal/include/rte_trace_point.h
> > index 41e2a7f99e..fd864125a7 100644
> > --- a/lib/eal/include/rte_trace_point.h
> > +++ b/lib/eal/include/rte_trace_point.h
> > @@ -42,6 +42,8 @@ typedef RTE_ATOMIC(uint64_t) rte_trace_point_t;
> >   */
> >  #define RTE_TRACE_POINT_ARGS
> >
> > +#ifdef RTE_TRACE
> > +
> >  /** @internal Helper macro to support RTE_TRACE_POINT and
> RTE_TRACE_POINT_FP */
> >  #define __RTE_TRACE_POINT(_mode, _tp, _args, ...) \
> >  extern rte_trace_point_t __##_tp; \
> > @@ -100,6 +102,20 @@ _tp _args \
> >  #define RTE_TRACE_POINT_FP(tp, args, ...) \
> >         __RTE_TRACE_POINT(fp, tp, args, __VA_ARGS__)
> >
> > +#else
> > +
> > +/** @internal Helper macro to support RTE_TRACE_POINT and
> RTE_TRACE_POINT_FP */
> > +#define __RTE_TRACE_POINT_VOID(_tp, ...) \
> > +static __rte_always_inline void \
> > +_tp(...) \
> > +{ \
> > +}
> > +
> > +#define RTE_TRACE_POINT(tp, args, ...) __RTE_TRACE_POINT_VOID(tp, args)
> > +#define RTE_TRACE_POINT_FP(tp, args, ...) __RTE_TRACE_POINT_VOID(tp, args)
> > +
> > +#endif /* RTE_TRACE */
> > +
> >  #ifdef __DOXYGEN__
> >
> >  /**
> > @@ -212,6 +228,7 @@ bool rte_trace_point_is_enabled(rte_trace_point_t *tp);
> >  __rte_experimental
> >  rte_trace_point_t *rte_trace_point_lookup(const char *name);
> >
> > +#ifdef RTE_TRACE
> >  /**
> >   * @internal
> >   *
> > @@ -230,6 +247,7 @@ __rte_trace_point_fp_is_enabled(void)
> >         return false;
> >  #endif
> >  }
> > +#endif /* RTE_TRACE */
> >
> >  /**
> >   * @internal
> > @@ -356,6 +374,8 @@ __rte_trace_point_emit_ev_header(void *mem, uint64_t in)
> >         return RTE_PTR_ADD(mem, __RTE_TRACE_EVENT_HEADER_SZ);
> >  }
> >
> > +#ifdef RTE_TRACE
> > +
> >  #define __rte_trace_point_emit_header_generic(t) \
> >  void *mem; \
> >  do { \
> > @@ -411,7 +431,7 @@ do { \
> >         RTE_SET_USED(len); \
> >  } while (0)
> >
> > -
> > +#endif /* RTE_TRACE */
> >  #endif /* ALLOW_EXPERIMENTAL_API */
> >  #endif /* _RTE_TRACE_POINT_REGISTER_H_ */
> >
> > diff --git a/lib/eal/include/rte_trace_point_register.h
> b/lib/eal/include/rte_trace_point_register.h
> > index 41260e5964..567fce534e 100644
> > --- a/lib/eal/include/rte_trace_point_register.h
> > +++ b/lib/eal/include/rte_trace_point_register.h
> > @@ -18,6 +18,8 @@ extern "C" {
> >
> >  RTE_DECLARE_PER_LCORE(volatile int, trace_point_sz);
> >
> > +#ifdef RTE_TRACE
> > +
> >  #define RTE_TRACE_POINT_REGISTER(trace, name) \
> >  rte_trace_point_t __rte_section("__rte_trace_point") __##trace; \
> >  static const char __##trace##_name[] = RTE_STR(name); \
> > @@ -56,6 +58,21 @@ do { \
> >                 RTE_STR(uint8_t)); \
> >  } while (0)
> >
> > +#else
> > +
> > +#define RTE_TRACE_POINT_REGISTER(...)
> > +#define __rte_trace_point_emit_header_generic(t) RTE_SET_USED(t)
> > +#define __rte_trace_point_emit_header_fp(t) RTE_SET_USED(t)
> > +#define __rte_trace_point_emit(in, type) RTE_SET_USED(in)
> > +#define rte_trace_point_emit_string(in) RTE_SET_USED(in)
> > +#define rte_trace_point_emit_blob(in, len) \
> > +do { \
> > +       RTE_SET_USED(in); \
> > +       RTE_SET_USED(len); \
> > +} while (0)
> > +
> > +#endif /* RTE_TRACE */
> > +
> >  #ifdef __cplusplus
> >  }
> >  #endif
> > --
> > 2.43.0
> >
  
Jerin Jacob Sept. 19, 2024, 1:54 p.m. UTC | #3
On Thu, Sep 19, 2024 at 6:45 PM Morten Brørup <mb@smartsharesystems.com> wrote:
>
> > From: Jerin Jacob [mailto:jerinjacobk@gmail.com]
> > Sent: Thursday, 19 September 2024 10.52
> >
> > On Thu, Sep 19, 2024 at 1:37 PM Morten Brørup <mb@smartsharesystems.com>
> > wrote:
> > >
> > > Some applications want to omit the trace feature.
> > > Either to reduce the memory footprint, to reduce the exposed attack
> > > surface, or for other reasons.
> > >
> > > This patch adds an option in rte_config.h to include or omit trace in the
> > > build. Trace is included by default.
> > >
> > > Omitting trace works by omitting all trace points.
> > > For API and ABI compatibility, the trace feature itself remains.
> > >
> > > Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
> > > ---
> > > v2:
> > > * Added/modified macros required for building applications with trace
> > >   omitted.
> > > ---
> > >  app/test/test_trace.c                      | 11 ++++++++++-
> > >  config/rte_config.h                        |  1 +
> > >  lib/eal/include/rte_trace_point.h          | 22 +++++++++++++++++++++-
> > >  lib/eal/include/rte_trace_point_register.h | 17 +++++++++++++++++
> > >  4 files changed, 49 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/app/test/test_trace.c b/app/test/test_trace.c
> > > index 00809f433b..7918cc865d 100644
> > > --- a/app/test/test_trace.c
> > > +++ b/app/test/test_trace.c
> > > @@ -12,7 +12,16 @@
> > >
> > >  int app_dpdk_test_tp_count;
> > >
> > > -#ifdef RTE_EXEC_ENV_WINDOWS
> > > +#if !defined(RTE_TRACE)
> > > +
> > > +static int
> > > +test_trace(void)
> > > +{
> > > +       printf("trace omitted at build-time, skipping test\n");
> > > +       return TEST_SKIPPED;
> > > +}
> > > +
> > > +#elif defined(RTE_EXEC_ENV_WINDOWS)
> > >
> > >  static int
> > >  test_trace(void)
> > > diff --git a/config/rte_config.h b/config/rte_config.h
> > > index dd7bb0d35b..fd6f8a2f1a 100644
> > > --- a/config/rte_config.h
> > > +++ b/config/rte_config.h
> > > @@ -49,6 +49,7 @@
> > >  #define RTE_MAX_TAILQ 32
> > >  #define RTE_LOG_DP_LEVEL RTE_LOG_INFO
> > >  #define RTE_MAX_VFIO_CONTAINERS 64
> > > +#define RTE_TRACE 1
> >
> >
> > Too many ifdefs down using this, IMO, This approach will make
> > difficult to manage the code and make sure
> >  RTE_TRACE 0 works always.
>
> I agree it's not pretty, but it's the best solution I can come up with, to meet the requirements driving this patch: reducing attack surface and memory footprint.
>
> >
> > I understand your concern about to reduce the expose attach surface.
> > BTW, it will be there with telemetry too.
>
> Yes, I have telemetry in mind too. :-)
> Got to work on them one by one, though.
>
> > Disabling full feature  for the code via config/rte_config.h is new to
> > DPDK. I think, we need wider opinion on this(Adding techboard).
>
> Yes, it’s a good topic for discussion in the techboard.
>
> Currently, there are three levels of build-time configuration:
> 1. Meson, for prominent options.
> 2. config/rte_config.h for less prominent options, e.g. enabling RTE_ASSERT(), mempool statistics, etc.
> 3. Header files in various modules/drivers, for tweaking the individual module.
>
> There didn't seem to be any interest in enabling/disabling trace via Meson, so I chose the next level of configuration.
>
> > Also, DPDK never concerned with binary size.
>
> That is a big mistake. DPDK is not only for cloud.
> DPDK should also be for CPEs, SOHO firewalls, and similar network appliances. For some of these high-volume network appliances, memory footprint matters.

Could you share data with a real-world application of the elf size?
1)Without any change
2)Only disabling via __rte_trace_point_emit_header_generic() .. aka below patch.
3)Full disable.

I think, size command will spit out with different section's size.
This data can be used to take decision and to know how much % it
adding up?


>
> Also, when running many instances of a service in the cloud, power consumption matters. Bloat increases power consumption.
>
> > Other than binary size,
> > following patch and exposed the config through meson
> > will fix the other problems. IMO, That's the correct tradeoff.
>
> Yes; the optimal solution would be making EAL less bloated, i.e. purely an abstraction layer to the underlying hardware and O/S.
> This would make trace an independent module, which could be enabled/disabled through Meson.
>
> Bruce has contributed a lot in this area, and already made many modules optional.
>
> >
> >
> > #define __rte_trace_point_emit_header_generic(t) \
> > void *mem; \
> > do { \
> >   +      if (RTE_TRACE == 0) \
> >   +             return \
> >         const uint64_t val = rte_atomic_load_explicit(t,
> > rte_memory_order_acquire); \
> >         if (likely(!(val & __RTE_TRACE_FIELD_ENABLE_MASK))) \
> >                 return; \
> >         mem = __rte_trace_mem_get(val); \
> >         if (unlikely(mem == NULL)) \
> >                 return; \
> >         mem = __rte_trace_point_emit_ev_header(mem, val); \
> > } while (0)
> >
> > >
> > >  /* bsd module defines */
> > >  #define RTE_CONTIGMEM_MAX_NUM_BUFS 64
> > > diff --git a/lib/eal/include/rte_trace_point.h
> > b/lib/eal/include/rte_trace_point.h
> > > index 41e2a7f99e..fd864125a7 100644
> > > --- a/lib/eal/include/rte_trace_point.h
> > > +++ b/lib/eal/include/rte_trace_point.h
> > > @@ -42,6 +42,8 @@ typedef RTE_ATOMIC(uint64_t) rte_trace_point_t;
> > >   */
> > >  #define RTE_TRACE_POINT_ARGS
> > >
> > > +#ifdef RTE_TRACE
> > > +
> > >  /** @internal Helper macro to support RTE_TRACE_POINT and
> > RTE_TRACE_POINT_FP */
> > >  #define __RTE_TRACE_POINT(_mode, _tp, _args, ...) \
> > >  extern rte_trace_point_t __##_tp; \
> > > @@ -100,6 +102,20 @@ _tp _args \
> > >  #define RTE_TRACE_POINT_FP(tp, args, ...) \
> > >         __RTE_TRACE_POINT(fp, tp, args, __VA_ARGS__)
> > >
> > > +#else
> > > +
> > > +/** @internal Helper macro to support RTE_TRACE_POINT and
> > RTE_TRACE_POINT_FP */
> > > +#define __RTE_TRACE_POINT_VOID(_tp, ...) \
> > > +static __rte_always_inline void \
> > > +_tp(...) \
> > > +{ \
> > > +}
> > > +
> > > +#define RTE_TRACE_POINT(tp, args, ...) __RTE_TRACE_POINT_VOID(tp, args)
> > > +#define RTE_TRACE_POINT_FP(tp, args, ...) __RTE_TRACE_POINT_VOID(tp, args)
> > > +
> > > +#endif /* RTE_TRACE */
> > > +
> > >  #ifdef __DOXYGEN__
> > >
> > >  /**
> > > @@ -212,6 +228,7 @@ bool rte_trace_point_is_enabled(rte_trace_point_t *tp);
> > >  __rte_experimental
> > >  rte_trace_point_t *rte_trace_point_lookup(const char *name);
> > >
> > > +#ifdef RTE_TRACE
> > >  /**
> > >   * @internal
> > >   *
> > > @@ -230,6 +247,7 @@ __rte_trace_point_fp_is_enabled(void)
> > >         return false;
> > >  #endif
> > >  }
> > > +#endif /* RTE_TRACE */
> > >
> > >  /**
> > >   * @internal
> > > @@ -356,6 +374,8 @@ __rte_trace_point_emit_ev_header(void *mem, uint64_t in)
> > >         return RTE_PTR_ADD(mem, __RTE_TRACE_EVENT_HEADER_SZ);
> > >  }
> > >
> > > +#ifdef RTE_TRACE
> > > +
> > >  #define __rte_trace_point_emit_header_generic(t) \
> > >  void *mem; \
> > >  do { \
> > > @@ -411,7 +431,7 @@ do { \
> > >         RTE_SET_USED(len); \
> > >  } while (0)
> > >
> > > -
> > > +#endif /* RTE_TRACE */
> > >  #endif /* ALLOW_EXPERIMENTAL_API */
> > >  #endif /* _RTE_TRACE_POINT_REGISTER_H_ */
> > >
> > > diff --git a/lib/eal/include/rte_trace_point_register.h
> > b/lib/eal/include/rte_trace_point_register.h
> > > index 41260e5964..567fce534e 100644
> > > --- a/lib/eal/include/rte_trace_point_register.h
> > > +++ b/lib/eal/include/rte_trace_point_register.h
> > > @@ -18,6 +18,8 @@ extern "C" {
> > >
> > >  RTE_DECLARE_PER_LCORE(volatile int, trace_point_sz);
> > >
> > > +#ifdef RTE_TRACE
> > > +
> > >  #define RTE_TRACE_POINT_REGISTER(trace, name) \
> > >  rte_trace_point_t __rte_section("__rte_trace_point") __##trace; \
> > >  static const char __##trace##_name[] = RTE_STR(name); \
> > > @@ -56,6 +58,21 @@ do { \
> > >                 RTE_STR(uint8_t)); \
> > >  } while (0)
> > >
> > > +#else
> > > +
> > > +#define RTE_TRACE_POINT_REGISTER(...)
> > > +#define __rte_trace_point_emit_header_generic(t) RTE_SET_USED(t)
> > > +#define __rte_trace_point_emit_header_fp(t) RTE_SET_USED(t)
> > > +#define __rte_trace_point_emit(in, type) RTE_SET_USED(in)
> > > +#define rte_trace_point_emit_string(in) RTE_SET_USED(in)
> > > +#define rte_trace_point_emit_blob(in, len) \
> > > +do { \
> > > +       RTE_SET_USED(in); \
> > > +       RTE_SET_USED(len); \
> > > +} while (0)
> > > +
> > > +#endif /* RTE_TRACE */
> > > +
> > >  #ifdef __cplusplus
> > >  }
> > >  #endif
> > > --
> > > 2.43.0
> > >
  
Morten Brørup Sept. 19, 2024, 3:35 p.m. UTC | #4
> Could you share data with a real-world application of the elf size?
> 1)Without any change

Size of the statically linked executable on the target file system:
3,800,528 byte

> 2)Only disabling via __rte_trace_point_emit_header_generic() .. aka below
> patch.

3,572,032 byte

> 3)Full disable.

3,572,032 byte

> 
> I think, size command will spit out with different section's size.
> This data can be used to take decision and to know how much % it
> adding up?

I.e. trace adds 228,496 byte = 6.4 % to the size of this specific executable.

~200 KB might not seem as much, but this is just omitting one module.
If other unused modules also add ~200 KB, it adds up.

And I haven't done any significant additional memory footprint tuning.


> > > #define __rte_trace_point_emit_header_generic(t) \
> > > void *mem; \
> > > do { \
> > >   +      if (RTE_TRACE == 0) \
> > >   +             return \

Tested without the RTE_TRACE==0 check, simply:
  void *mem; \
  do { \
+         return; \
          const uint64_t val = rte_atomic_load_explicit(t, rte_memory_order_acquire); \

> > >         const uint64_t val = rte_atomic_load_explicit(t,
> > > rte_memory_order_acquire); \
> > >         if (likely(!(val & __RTE_TRACE_FIELD_ENABLE_MASK))) \
> > >                 return; \
> > >         mem = __rte_trace_mem_get(val); \
> > >         if (unlikely(mem == NULL)) \
> > >                 return; \
> > >         mem = __rte_trace_point_emit_ev_header(mem, val); \
> > > } while (0)

I don't understand why your method uses as little memory as mine...
My method should also omit the trace points themselves, with their names and numbers, and their initialization.

I haven't looked deeper into it.

If your method of omitting trace is as efficient as all my ifdefs, I also prefer your method.
Simpler is better.
  
Jerin Jacob Sept. 19, 2024, 3:49 p.m. UTC | #5
On Thu, Sep 19, 2024 at 9:05 PM Morten Brørup <mb@smartsharesystems.com> wrote:
>
> > Could you share data with a real-world application of the elf size?
> > 1)Without any change
>
> Size of the statically linked executable on the target file system:
> 3,800,528 byte
>
> > 2)Only disabling via __rte_trace_point_emit_header_generic() .. aka below
> > patch.
>
> 3,572,032 byte
>
> > 3)Full disable.
>
> 3,572,032 byte
>
> >
> > I think, size command will spit out with different section's size.
> > This data can be used to take decision and to know how much % it
> > adding up?
>
> I.e. trace adds 228,496 byte = 6.4 % to the size of this specific executable.
>
> ~200 KB might not seem as much, but this is just omitting one module.
> If other unused modules also add ~200 KB, it adds up.
>
> And I haven't done any significant additional memory footprint tuning.
>
>
> > > > #define __rte_trace_point_emit_header_generic(t) \
> > > > void *mem; \
> > > > do { \
> > > >   +      if (RTE_TRACE == 0) \
> > > >   +             return \
>
> Tested without the RTE_TRACE==0 check, simply:
>   void *mem; \
>   do { \
> +         return; \
>           const uint64_t val = rte_atomic_load_explicit(t, rte_memory_order_acquire); \
>
> > > >         const uint64_t val = rte_atomic_load_explicit(t,
> > > > rte_memory_order_acquire); \
> > > >         if (likely(!(val & __RTE_TRACE_FIELD_ENABLE_MASK))) \
> > > >                 return; \
> > > >         mem = __rte_trace_mem_get(val); \
> > > >         if (unlikely(mem == NULL)) \
> > > >                 return; \
> > > >         mem = __rte_trace_point_emit_ev_header(mem, val); \
> > > > } while (0)
>
> I don't understand why your method uses as little memory as mine...

Compiler is start enough to understand those functions are NOP.


> My method should also omit the trace points themselves, with their names and numbers, and their initialization.
>
> I haven't looked deeper into it.
>
> If your method of omitting trace is as efficient as all my ifdefs, I also prefer your method.
> Simpler is better.
>
  
Morten Brørup Sept. 19, 2024, 4:08 p.m. UTC | #6
> From: Jerin Jacob [mailto:jerinjacobk@gmail.com]
> Sent: Thursday, 19 September 2024 17.49
> 
> On Thu, Sep 19, 2024 at 9:05 PM Morten Brørup <mb@smartsharesystems.com>
> wrote:
> >
> > > Could you share data with a real-world application of the elf size?
> > > 1)Without any change
> >
> > Size of the statically linked executable on the target file system:
> > 3,800,528 byte
> >
> > > 2)Only disabling via __rte_trace_point_emit_header_generic() .. aka below
> > > patch.
> >
> > 3,572,032 byte
> >
> > > 3)Full disable.
> >
> > 3,572,032 byte
> >
> > >
> > > I think, size command will spit out with different section's size.
> > > This data can be used to take decision and to know how much % it
> > > adding up?
> >
> > I.e. trace adds 228,496 byte = 6.4 % to the size of this specific
> executable.
> >
> > ~200 KB might not seem as much, but this is just omitting one module.
> > If other unused modules also add ~200 KB, it adds up.
> >
> > And I haven't done any significant additional memory footprint tuning.
> >
> >
> > > > > #define __rte_trace_point_emit_header_generic(t) \
> > > > > void *mem; \
> > > > > do { \
> > > > >   +      if (RTE_TRACE == 0) \
> > > > >   +             return \
> >
> > Tested without the RTE_TRACE==0 check, simply:
> >   void *mem; \
> >   do { \
> > +         return; \
> >           const uint64_t val = rte_atomic_load_explicit(t,
> rte_memory_order_acquire); \
> >
> > > > >         const uint64_t val = rte_atomic_load_explicit(t,
> > > > > rte_memory_order_acquire); \
> > > > >         if (likely(!(val & __RTE_TRACE_FIELD_ENABLE_MASK))) \
> > > > >                 return; \
> > > > >         mem = __rte_trace_mem_get(val); \
> > > > >         if (unlikely(mem == NULL)) \
> > > > >                 return; \
> > > > >         mem = __rte_trace_point_emit_ev_header(mem, val); \
> > > > > } while (0)
> >
> > I don't understand why your method uses as little memory as mine...
> 
> Compiler is start enough to understand those functions are NOP.

Yes, I understand why the compiler understands that this code is NOP.

But I don't understand why the tracepoints' values and names are not registered by the unmodified RTE_TRACE_POINT_REGISTER().

> 
> 
> > My method should also omit the trace points themselves, with their names and
> numbers, and their initialization.
> >
> > I haven't looked deeper into it.
> >
> > If your method of omitting trace is as efficient as all my ifdefs, I also
> prefer your method.
> > Simpler is better.
> >
  
Jerin Jacob Sept. 19, 2024, 4:34 p.m. UTC | #7
On Thu, Sep 19, 2024 at 9:38 PM Morten Brørup <mb@smartsharesystems.com> wrote:
>
> > From: Jerin Jacob [mailto:jerinjacobk@gmail.com]
> > Sent: Thursday, 19 September 2024 17.49
> >
> > On Thu, Sep 19, 2024 at 9:05 PM Morten Brørup <mb@smartsharesystems.com>
> > wrote:
> > >
> > > > Could you share data with a real-world application of the elf size?
> > > > 1)Without any change
> > >
> > > Size of the statically linked executable on the target file system:
> > > 3,800,528 byte
> > >
> > > > 2)Only disabling via __rte_trace_point_emit_header_generic() .. aka below
> > > > patch.
> > >
> > > 3,572,032 byte
> > >
> > > > 3)Full disable.
> > >
> > > 3,572,032 byte
> > >
> > > >
> > > > I think, size command will spit out with different section's size.
> > > > This data can be used to take decision and to know how much % it
> > > > adding up?
> > >
> > > I.e. trace adds 228,496 byte = 6.4 % to the size of this specific
> > executable.
> > >
> > > ~200 KB might not seem as much, but this is just omitting one module.
> > > If other unused modules also add ~200 KB, it adds up.
> > >
> > > And I haven't done any significant additional memory footprint tuning.
> > >
> > >
> > > > > > #define __rte_trace_point_emit_header_generic(t) \
> > > > > > void *mem; \
> > > > > > do { \
> > > > > >   +      if (RTE_TRACE == 0) \
> > > > > >   +             return \
> > >
> > > Tested without the RTE_TRACE==0 check, simply:
> > >   void *mem; \
> > >   do { \
> > > +         return; \
> > >           const uint64_t val = rte_atomic_load_explicit(t,
> > rte_memory_order_acquire); \
> > >
> > > > > >         const uint64_t val = rte_atomic_load_explicit(t,
> > > > > > rte_memory_order_acquire); \
> > > > > >         if (likely(!(val & __RTE_TRACE_FIELD_ENABLE_MASK))) \
> > > > > >                 return; \
> > > > > >         mem = __rte_trace_mem_get(val); \
> > > > > >         if (unlikely(mem == NULL)) \
> > > > > >                 return; \
> > > > > >         mem = __rte_trace_point_emit_ev_header(mem, val); \
> > > > > > } while (0)
> > >
> > > I don't understand why your method uses as little memory as mine...
> >
> > Compiler is start enough to understand those functions are NOP.
>
> Yes, I understand why the compiler understands that this code is NOP.
>
> But I don't understand why the tracepoints' values and names are not registered by the unmodified RTE_TRACE_POINT_REGISTER().

My best _guess_ is __rte_trace_point_register()'s register_fn()
execution comes as NOP.




>
> >
> >
> > > My method should also omit the trace points themselves, with their names and
> > numbers, and their initialization.
> > >
> > > I haven't looked deeper into it.
> > >
> > > If your method of omitting trace is as efficient as all my ifdefs, I also
> > prefer your method.
> > > Simpler is better.
> > >
  

Patch

diff --git a/app/test/test_trace.c b/app/test/test_trace.c
index 00809f433b..7918cc865d 100644
--- a/app/test/test_trace.c
+++ b/app/test/test_trace.c
@@ -12,7 +12,16 @@ 
 
 int app_dpdk_test_tp_count;
 
-#ifdef RTE_EXEC_ENV_WINDOWS
+#if !defined(RTE_TRACE)
+
+static int
+test_trace(void)
+{
+	printf("trace omitted at build-time, skipping test\n");
+	return TEST_SKIPPED;
+}
+
+#elif defined(RTE_EXEC_ENV_WINDOWS)
 
 static int
 test_trace(void)
diff --git a/config/rte_config.h b/config/rte_config.h
index dd7bb0d35b..fd6f8a2f1a 100644
--- a/config/rte_config.h
+++ b/config/rte_config.h
@@ -49,6 +49,7 @@ 
 #define RTE_MAX_TAILQ 32
 #define RTE_LOG_DP_LEVEL RTE_LOG_INFO
 #define RTE_MAX_VFIO_CONTAINERS 64
+#define RTE_TRACE 1
 
 /* bsd module defines */
 #define RTE_CONTIGMEM_MAX_NUM_BUFS 64
diff --git a/lib/eal/include/rte_trace_point.h b/lib/eal/include/rte_trace_point.h
index 41e2a7f99e..fd864125a7 100644
--- a/lib/eal/include/rte_trace_point.h
+++ b/lib/eal/include/rte_trace_point.h
@@ -42,6 +42,8 @@  typedef RTE_ATOMIC(uint64_t) rte_trace_point_t;
  */
 #define RTE_TRACE_POINT_ARGS
 
+#ifdef RTE_TRACE
+
 /** @internal Helper macro to support RTE_TRACE_POINT and RTE_TRACE_POINT_FP */
 #define __RTE_TRACE_POINT(_mode, _tp, _args, ...) \
 extern rte_trace_point_t __##_tp; \
@@ -100,6 +102,20 @@  _tp _args \
 #define RTE_TRACE_POINT_FP(tp, args, ...) \
 	__RTE_TRACE_POINT(fp, tp, args, __VA_ARGS__)
 
+#else
+
+/** @internal Helper macro to support RTE_TRACE_POINT and RTE_TRACE_POINT_FP */
+#define __RTE_TRACE_POINT_VOID(_tp, ...) \
+static __rte_always_inline void \
+_tp(...) \
+{ \
+}
+
+#define RTE_TRACE_POINT(tp, args, ...) __RTE_TRACE_POINT_VOID(tp, args)
+#define RTE_TRACE_POINT_FP(tp, args, ...) __RTE_TRACE_POINT_VOID(tp, args)
+
+#endif /* RTE_TRACE */
+
 #ifdef __DOXYGEN__
 
 /**
@@ -212,6 +228,7 @@  bool rte_trace_point_is_enabled(rte_trace_point_t *tp);
 __rte_experimental
 rte_trace_point_t *rte_trace_point_lookup(const char *name);
 
+#ifdef RTE_TRACE
 /**
  * @internal
  *
@@ -230,6 +247,7 @@  __rte_trace_point_fp_is_enabled(void)
 	return false;
 #endif
 }
+#endif /* RTE_TRACE */
 
 /**
  * @internal
@@ -356,6 +374,8 @@  __rte_trace_point_emit_ev_header(void *mem, uint64_t in)
 	return RTE_PTR_ADD(mem, __RTE_TRACE_EVENT_HEADER_SZ);
 }
 
+#ifdef RTE_TRACE
+
 #define __rte_trace_point_emit_header_generic(t) \
 void *mem; \
 do { \
@@ -411,7 +431,7 @@  do { \
 	RTE_SET_USED(len); \
 } while (0)
 
-
+#endif /* RTE_TRACE */
 #endif /* ALLOW_EXPERIMENTAL_API */
 #endif /* _RTE_TRACE_POINT_REGISTER_H_ */
 
diff --git a/lib/eal/include/rte_trace_point_register.h b/lib/eal/include/rte_trace_point_register.h
index 41260e5964..567fce534e 100644
--- a/lib/eal/include/rte_trace_point_register.h
+++ b/lib/eal/include/rte_trace_point_register.h
@@ -18,6 +18,8 @@  extern "C" {
 
 RTE_DECLARE_PER_LCORE(volatile int, trace_point_sz);
 
+#ifdef RTE_TRACE
+
 #define RTE_TRACE_POINT_REGISTER(trace, name) \
 rte_trace_point_t __rte_section("__rte_trace_point") __##trace; \
 static const char __##trace##_name[] = RTE_STR(name); \
@@ -56,6 +58,21 @@  do { \
 		RTE_STR(uint8_t)); \
 } while (0)
 
+#else
+
+#define RTE_TRACE_POINT_REGISTER(...)
+#define __rte_trace_point_emit_header_generic(t) RTE_SET_USED(t)
+#define __rte_trace_point_emit_header_fp(t) RTE_SET_USED(t)
+#define __rte_trace_point_emit(in, type) RTE_SET_USED(in)
+#define rte_trace_point_emit_string(in) RTE_SET_USED(in)
+#define rte_trace_point_emit_blob(in, len) \
+do { \
+	RTE_SET_USED(in); \
+	RTE_SET_USED(len); \
+} while (0)
+
+#endif /* RTE_TRACE */
+
 #ifdef __cplusplus
 }
 #endif