[01/10] build: add an option to enable LTO build
Commit Message
This patch adds an option to enable link time optimization. In addition
to the LTO option itself (-flto), -ffat-lto-objects is used. This is
because during the build pmdinfogen scans the generated ELF objects to
find the this_pmd_name* symbols in the symbol table. Without
fat-lto-objects, gcc produces ELF objects containing only extra symbols
for internal use during linking, and clang does not produce ELF at all
(only LLVM IR bitcode).
Signed-off-by: Andrzej Ostruszka <aostruszka@marvell.com>
---
.travis.yml | 7 +++++
config/common_base | 5 +++
config/meson.build | 9 ++++++
doc/guides/prog_guide/lto.rst | 36 ++++++++++++++++++++++
doc/guides/rel_notes/release_19_11.rst | 8 +++++
meson_options.txt | 2 ++
mk/toolchain/clang/rte.toolchain-compat.mk | 4 +++
mk/toolchain/clang/rte.vars.mk | 8 +++++
mk/toolchain/gcc/rte.toolchain-compat.mk | 4 +++
mk/toolchain/gcc/rte.vars.mk | 12 ++++++++
mk/toolchain/icc/rte.vars.mk | 8 +++++
11 files changed, 103 insertions(+)
create mode 100644 doc/guides/prog_guide/lto.rst
Comments
On Thu, Sep 05, 2019 at 11:32:30AM +0200, Andrzej Ostruszka wrote:
> This patch adds an option to enable link time optimization. In addition
> to LTO option itself (-flto) fat-lto-objects are being used. This is
> because during the build pmdinfogen scans the generated ELF objects to
> find this_pmd_name* symbol in symbol table. Without fat-lto-objects gcc
> produces ELF only with extra symbols for internal use during linking and
> clang does not produce ELF at all (only LLVM IR bitcode).
>
> Signed-off-by: Andrzej Ostruszka <aostruszka@marvell.com>
> ---
<snip>
> --- a/meson_options.txt
> +++ b/meson_options.txt
> @@ -6,6 +6,8 @@ option('drivers_install_subdir', type: 'string', value: 'dpdk/pmds-<VERSION>',
> description: 'Subdirectory of libdir where to install PMDs. Defaults to using a versioned subdirectory.')
> option('enable_docs', type: 'boolean', value: false,
> description: 'build documentation')
> +option('enable_lto', type: 'boolean', value: false,
> + description: 'Enable link time optimization')
> option('enable_kmods', type: 'boolean', value: true,
> description: 'build kernel modules')
> option('examples', type: 'string', value: '',
Should not need a new option here. There is already a built-in option
"b_lto" which we can reuse.
/Bruce
On 9/5/19 11:36 AM, Bruce Richardson wrote:
> Should not need a new option here. There is already a built-in option
> "b_lto" which we can reuse.
Thank you Bruce, I don't know much about meson so I missed that.
Will try to figure out how to use that.
Regards
Andrzej
PS. Apologies to all maintainers for possibly getting duplicates - I
made a mistake in the DPDK mailing list e-mail address and have resent
the patch series to dev@dpdk.org separately. Please reply to these so
that everything is visible to the community.
On Thu, Sep 05, 2019 at 11:43:30AM +0200, Andrzej Ostruszka wrote:
> On 9/5/19 11:36 AM, Bruce Richardson wrote:
> > Should not need a new option here. There is already a built-in option
> > "b_lto" which we can reuse.
>
> Thank you Bruce, I don't know much about meson so I missed that.
> Will try to figure out how to use that.
>
No problem.
The options are documented here: https://mesonbuild.com/Builtin-options.html
and you can query them just the same as with the options added for the
project. Therefore, the only thing that really should change is that in the
meson.build file you check the built-in option and add the fat-lto-objects
flag.
Incidentally, I think if the fat-lto-objects flag is not supported you may
want to error out if lto is enabled, as meson will add the lto flag itself
if the option is set.
Regards,
/Bruce
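A minimal sketch of what Bruce describes above, for illustration only. It assumes the check lives in config/meson.build, where the `cc` compiler object used by the patch is available; the error message wording is made up here, and the snippet is not taken from any later revision of the series.

```meson
# Sketch, not the actual follow-up patch: reuse meson's built-in 'b_lto'
# option instead of defining a project-specific 'enable_lto' option.
if get_option('b_lto')
    # meson adds -flto itself when b_lto is set, so only the
    # fat-lto-objects flag (needed by pmdinfogen) is added here.
    if cc.has_argument('-ffat-lto-objects')
        add_project_arguments('-ffat-lto-objects', language: 'c')
    else
        # Hypothetical message text; failing hard avoids an LTO build
        # whose objects pmdinfogen cannot scan.
        error('compiler does not support fat LTO objects')
    endif
endif
```

With this approach LTO would be requested through the built-in option, e.g. `meson build -Db_lto=true`, and no new entry in meson_options.txt would be needed.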
--- a/.travis.yml
+++ b/.travis.yml
@@ -31,6 +31,7 @@ env:
- DEF_LIB="static" OPTS="-Denable_kmods=false"
- DEF_LIB="shared" OPTS="-Denable_kmods=false"
- DEF_LIB="shared" RUN_TESTS=1 BUILD_DOCS=1
+ - DEF_LIB="static" OPTS="-Denable_lto=true"
matrix:
include:
@@ -100,6 +101,12 @@ matrix:
apt:
packages:
- *extra_packages
+ - env: DEF_LIB="static" OPTS="-Denable_lto=true" EXTRA_PACKAGES=1
+ compiler: gcc
+ addons:
+ apt:
+ packages:
+ - *extra_packages
script: ./.ci/${TRAVIS_OS_NAME}-build.sh
--- a/config/common_base
+++ b/config/common_base
@@ -49,6 +49,11 @@ CONFIG_RTE_FORCE_INTRINSICS=n
#
CONFIG_RTE_ARCH_STRICT_ALIGN=n
+#
+# Enable link time optimization
+#
+CONFIG_RTE_ENABLE_LTO=n
+
#
# Compile to share library
#
--- a/config/meson.build
+++ b/config/meson.build
@@ -196,3 +196,12 @@ add_project_arguments('-D_GNU_SOURCE', language: 'c')
if is_freebsd
add_project_arguments('-D__BSD_VISIBLE', language: 'c')
endif
+
+if get_option('enable_lto')
+ if cc.has_multi_arguments('-flto', '-ffat-lto-objects')
+ add_project_arguments('-flto', '-ffat-lto-objects', language: 'c')
+ add_project_link_arguments('-flto', language: 'c')
+ else
+ message('compiler does not support LTO')
+ endif
+endif
new file mode 100644
--- /dev/null
+++ b/doc/guides/prog_guide/lto.rst
@@ -0,0 +1,36 @@
+Link Time Optimization
+======================
+
+The DPDK framework supports compilation with link time optimization
+turned on. This depends on the capability of the compiler to do
+"whole program" optimization at link time and is available only for
+compilers that support that feature (gcc, clang and icc). To be more
+specific, the compiler has to support creation of ELF objects
+containing both normal code and its internal representation
+(fat-lto-objects). This is required since during the build some code
+is generated by parsing the produced ELF objects (pmdinfogen).
+
+The amount of performance gain that one can get from LTO depends on the
+compiler and the code that is being compiled. However, LTO is also
+useful for the additional code analysis done by the compiler. In
+particular, due to interprocedural analysis the compiler can produce
+additional warnings about variables that might be used uninitialized.
+Some of these warnings might be "false positives", though, and you might
+need to explicitly initialize a variable in order to silence the compiler.
+
+Link time optimization can be enabled for the whole DPDK framework by
+setting:
+
+.. code-block:: console
+ CONFIG_RTE_ENABLE_LTO=y
+
+in the config file for a make-based build, and by:
+
+.. code-block:: console
+ meson build -Denable_lto=true
+ ninja -C build
+
+for a meson-based build.
+
+Please note that turning LTO on considerably increases
+compilation time.
--- a/doc/guides/rel_notes/release_19_11.rst
+++ b/doc/guides/rel_notes/release_19_11.rst
@@ -56,6 +56,14 @@ New Features
Also, make sure to start the actual text at the margin.
=========================================================
+**Added build support for Link Time Optimization.**
+
+ LTO is an optimization technique used by the compiler to perform whole
+ program analysis and optimization at link time. In order to do that,
+ compilers store their internal representation of the source code, which
+ the linker uses at the final stage of the compilation process.
+
+ See :doc:`../prog_guide/lto` for more information.
Removed Items
-------------
--- a/meson_options.txt
+++ b/meson_options.txt
@@ -6,6 +6,8 @@ option('drivers_install_subdir', type: 'string', value: 'dpdk/pmds-<VERSION>',
description: 'Subdirectory of libdir where to install PMDs. Defaults to using a versioned subdirectory.')
option('enable_docs', type: 'boolean', value: false,
description: 'build documentation')
+option('enable_lto', type: 'boolean', value: false,
+ description: 'Enable link time optimization')
option('enable_kmods', type: 'boolean', value: true,
description: 'build kernel modules')
option('examples', type: 'string', value: '',
--- a/mk/toolchain/clang/rte.toolchain-compat.mk
+++ b/mk/toolchain/clang/rte.toolchain-compat.mk
@@ -20,3 +20,7 @@ CLANG_MINOR_VERSION := $(shell echo $(CLANG_VERSION) | cut -f2 -d.)
ifeq ($(shell test $(CLANG_MAJOR_VERSION)$(CLANG_MINOR_VERSION) -lt 35 && echo 1), 1)
CC_SUPPORTS_Z := false
endif
+
+ifeq ($(shell test $(CLANG_MAJOR_VERSION)$(CLANG_MINOR_VERSION) -lt 60 && echo 1), 1)
+ CONFIG_RTE_ENABLE_LTO=n
+endif
--- a/mk/toolchain/clang/rte.vars.mk
+++ b/mk/toolchain/clang/rte.vars.mk
@@ -48,6 +48,14 @@ endif
# process cpu flags
include $(RTE_SDK)/mk/toolchain/$(RTE_TOOLCHAIN)/rte.toolchain-compat.mk
+ifeq ($(CONFIG_RTE_ENABLE_LTO),y)
+# 'fat-lto' is used since pmdinfogen needs to have 'this_pmd_nameX'
+# exported in symbol table and without this option only internal
+# representation is present.
+TOOLCHAIN_CFLAGS += -flto -ffat-lto-objects
+TOOLCHAIN_LDFLAGS += -flto
+endif
+
# workaround clang bug with warning "missing field initializer" for "= {0}"
WERROR_FLAGS += -Wno-missing-field-initializers
--- a/mk/toolchain/gcc/rte.toolchain-compat.mk
+++ b/mk/toolchain/gcc/rte.toolchain-compat.mk
@@ -88,6 +88,10 @@ else
MACHINE_CFLAGS := $(filter-out -march% -mtune% -msse%,$(MACHINE_CFLAGS))
endif
+ ifeq ($(shell test $(GCC_VERSION) -lt 45 && echo 1), 1)
+ CONFIG_RTE_ENABLE_LTO=n
+ endif
+
# Disable thunderx PMD for gcc < 4.7
ifeq ($(shell test $(GCC_VERSION) -lt 47 && echo 1), 1)
CONFIG_RTE_LIBRTE_THUNDERX_NICVF_PMD=d
--- a/mk/toolchain/gcc/rte.vars.mk
+++ b/mk/toolchain/gcc/rte.vars.mk
@@ -62,6 +62,18 @@ endif
# process cpu flags
include $(RTE_SDK)/mk/toolchain/$(RTE_TOOLCHAIN)/rte.toolchain-compat.mk
+ifeq ($(CONFIG_RTE_ENABLE_LTO),y)
+# 'fat-lto' is used since pmdinfogen needs to have 'this_pmd_nameX'
+# exported in symbol table and without this option only internal
+# representation is present.
+TOOLCHAIN_CFLAGS += -flto -ffat-lto-objects
+TOOLCHAIN_LDFLAGS += -flto
+# workaround for GCC bug 81440
+ifeq ($(shell test $(GCC_VERSION) -lt 80 && echo 1), 1)
+WERROR_FLAGS += -Wno-lto-type-mismatch
+endif
+endif
+
# workaround GCC bug with warning "missing initializer" for "= {0}"
ifeq ($(shell test $(GCC_VERSION) -lt 47 && echo 1), 1)
WERROR_FLAGS += -Wno-missing-field-initializers
--- a/mk/toolchain/icc/rte.vars.mk
+++ b/mk/toolchain/icc/rte.vars.mk
@@ -54,5 +54,13 @@ endif
# process cpu flags
include $(RTE_SDK)/mk/toolchain/$(RTE_TOOLCHAIN)/rte.toolchain-compat.mk
+ifeq ($(CONFIG_RTE_ENABLE_LTO),y)
+# 'fat-lto' is used since pmdinfogen needs to have 'this_pmd_nameX'
+# exported in symbol table and without this option only internal
+# representation is present.
+TOOLCHAIN_CFLAGS += -flto -ffat-lto-objects
+TOOLCHAIN_LDFLAGS += -flto
+endif
+
export CC AS AR LD OBJCOPY OBJDUMP STRIP READELF
export TOOLCHAIN_CFLAGS TOOLCHAIN_LDFLAGS TOOLCHAIN_ASFLAGS