[dpdk-dev] mk: fix the combined library problems by replacing it with a linker script

Message ID 20151130084102.3f386b50@xeon-e3 (mailing list archive)
State Not Applicable, archived
Headers

Commit Message

Stephen Hemminger Nov. 30, 2015, 4:41 p.m. UTC
  On Mon, 30 Nov 2015 10:03:43 -0500
Neil Horman <nhorman@tuxdriver.com> wrote:

> On Wed, Nov 25, 2015 at 08:08:37AM -0800, Stephen Hemminger wrote:
> > On Wed, 25 Nov 2015 10:38:48 +0200
> > Panu Matilainen <pmatilai@redhat.com> wrote:
> > 
> > > On 11/25/2015 12:46 AM, Stephen Hemminger wrote:
> > > > On Tue, 24 Nov 2015 16:31:17 +0200
> > > > Panu Matilainen <pmatilai@redhat.com> wrote:
> > > >
> > > >> The physically linked-together combined library has been an increasing
> > > >> source of problems, as was predicted when library and symbol versioning
> > > >> was introduced. Replace the complex and fragile construction with a
> > > >> simple linker script which achieves the same without all the problems,
> > > >> remove the related kludges from eg mlx drivers.
> > > >>
> > > >> Since creating the linker script is practically zero cost, remove the
> > > >> config option and just create it always.
> > > >>
> > > >> Based on a patch by Sergio Gonzales Monroy, linker script approach
> > > >> initially suggested by Neil Horman.
> > > >>
> > > >> Suggested-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
> > > >> Suggested-by: Neil Horman <nhorman@tuxdriver.com>
> > > >> Signed-off-by: Panu Matilainen <pmatilai@redhat.com>
> > > >
> > > > But it now means distros have to ship 20 libraries which seems like
> > > > a step back.
> > > 
> > > That's how Fedora and RHEL are shipping it already and nobody has so 
> > > much as noticed anything strange, much less complained about it. 20 
> > > libraries is but a drop in the ocean on a average distro. But more to 
> > > the point, distros will prefer 50 working libraries over one that doesn't.
> > > 
> > > The combined library as it is simply is no longer a viable option. 
> > > Besides just being broken (witness the strange hacks people are coming 
> > > up with to work around issues in it) its ugly because it basically gives 
> > > the middle finger to all the effort going into version compatibility, 
> > > and its also big. Few projects will use every library in DPDK, but with 
> > > the combined library they're forced to lug the 800 pound gorilla along 
> > > needlessly.
> > > 
> > > 	- Panu -
> > > 
> > 
> > Fixing the combined library took less than an hour for us.
> How did you fix the versioning issue?
> 
> Neil

This is what I did. 
Also decided to keep shared library version == major DPDK version
to avoid confusion.


mk: fix when building combined shared library

The DPDK mk file does not set shared object name or version
information as required by Debian.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
  

Comments

Panu Matilainen Dec. 1, 2015, 12:21 p.m. UTC | #1
On 11/30/2015 06:41 PM, Stephen Hemminger wrote:
> On Mon, 30 Nov 2015 10:03:43 -0500
> Neil Horman <nhorman@tuxdriver.com> wrote:
>
>> On Wed, Nov 25, 2015 at 08:08:37AM -0800, Stephen Hemminger wrote:
>>> On Wed, 25 Nov 2015 10:38:48 +0200
>>> Panu Matilainen <pmatilai@redhat.com> wrote:
>>>
>>>> On 11/25/2015 12:46 AM, Stephen Hemminger wrote:
>>>>> On Tue, 24 Nov 2015 16:31:17 +0200
>>>>> Panu Matilainen <pmatilai@redhat.com> wrote:
>>>>>
>>>>>> The physically linked-together combined library has been an increasing
>>>>>> source of problems, as was predicted when library and symbol versioning
>>>>>> was introduced. Replace the complex and fragile construction with a
>>>>>> simple linker script which achieves the same without all the problems,
>>>>>> remove the related kludges from eg mlx drivers.
>>>>>>
>>>>>> Since creating the linker script is practically zero cost, remove the
>>>>>> config option and just create it always.
>>>>>>
>>>>>> Based on a patch by Sergio Gonzales Monroy, linker script approach
>>>>>> initially suggested by Neil Horman.
>>>>>>
>>>>>> Suggested-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
>>>>>> Suggested-by: Neil Horman <nhorman@tuxdriver.com>
>>>>>> Signed-off-by: Panu Matilainen <pmatilai@redhat.com>
>>>>>
>>>>> But it now means distros have to ship 20 libraries which seems like
>>>>> a step back.
>>>>
>>>> That's how Fedora and RHEL are shipping it already and nobody has so
>>>> much as noticed anything strange, much less complained about it. 20
>>>> libraries is but a drop in the ocean on a average distro. But more to
>>>> the point, distros will prefer 50 working libraries over one that doesn't.
>>>>
>>>> The combined library as it is simply is no longer a viable option.
>>>> Besides just being broken (witness the strange hacks people are coming
>>>> up with to work around issues in it) its ugly because it basically gives
>>>> the middle finger to all the effort going into version compatibility,
>>>> and its also big. Few projects will use every library in DPDK, but with
>>>> the combined library they're forced to lug the 800 pound gorilla along
>>>> needlessly.
>>>>
>>>> 	- Panu -
>>>>
>>>
>>> Fixing the combined library took less than an hour for us.
>> How did you fix the versioning issue?
>>
>> Neil
>
> This is what I did.
> Also decided to keep shared library version == major DPDK version
> to avoid confusion.
>
>
> mk: fix when building combined shared library
>
> The DPDK mk file does not set shared object name or version
> information as required by Debian.
>
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
>
> --- a/mk/rte.sharelib.mk
> +++ b/mk/rte.sharelib.mk
> @@ -51,10 +51,10 @@ ifeq ($(LINK_USING_CC),1)
>   # Override the definition of LD here, since we're linking with CC
>   LD := $(CC) $(CPU_CFLAGS)
>   O_TO_S = $(LD) $(call linkerprefix,$(CPU_LDFLAGS)) \
> -	-shared $(OBJS) -o $(RTE_OUTPUT)/lib/$(LIB_ONE)
> +	-shared $(OBJS) -Wl,-soname,$(LIB_ONE).$(RTE_LIBVERS) -o $(RTE_OUTPUT)/lib/$(LIB_ONE)
>   else
>   O_TO_S = $(LD) $(CPU_LDFLAGS) \
> -	-shared $(OBJS) -o $(RTE_OUTPUT)/lib/$(LIB_ONE)
> +	-shared $(OBJS) -soname $(LIB_ONE).$(RTE_LIBVERS) -o $(RTE_OUTPUT)/lib/$(LIB_ONE)
>   endif
>
>   O_TO_S_STR = $(subst ','\'',$(O_TO_S)) #'# fix syntax highlight
> --- a/mk/rte.vars.mk
> +++ b/mk/rte.vars.mk
> @@ -74,8 +74,10 @@ ifneq ($(BUILDING_RTE_SDK),)
>   endif
>
>   RTE_LIBNAME := $(CONFIG_RTE_LIBNAME:"%"=%)
> +RTE_LIBVERS := $(CONFIG_RTE_LIBVERS:"%"=%)
>   ifeq ($(RTE_LIBNAME),)
>   RTE_LIBNAME := intel_dpdk
> +RTE_LIBVERS := 2
>   endif
>
>   # RTE_TARGET is deducted from config when we are building the SDK.
>

Adding a soname and a semi-arbitrary version does not fix the 
fundamental problems:

Since the library lumps together everything in DPDK, you'd have to bump 
its version whenever any of the individual libraries bumps its version 
to have the version mean anything. DPDK 2.0 and 2.1 are supposedly 
binary compatible but 2.2 certainly is not, and beyond that who knows.

That in turn forces all apps to be rebuild whenever one of the libraries 
changes version, whether those apps use that particular library or not.

The combined library doesn't have symbol versioning, so besides the 
better version compatibility tracking it loses other benefits like 
limited symbol visibility.

Not to mention the extra complexity in makefiles to support it, the 
increasing amount of duct-tape required to hold it together. And still 
eg the MLX pmds declare the configuration not supported at all.

	- Panu -
  
Robie Basak Dec. 1, 2015, 12:36 p.m. UTC | #2
On Tue, Dec 01, 2015 at 02:21:02PM +0200, Panu Matilainen wrote:
> Adding a soname and a semi-arbitrary version does not fix the fundamental
> problems:
> 
> Since the library lumps together everything in DPDK, you'd have to bump its
> version whenever any of the individual libraries bumps its version to have
> the version mean anything. DPDK 2.0 and 2.1 are supposedly binary compatible
> but 2.2 certainly is not, and beyond that who knows.
> 
> That in turn forces all apps to be rebuild whenever one of the libraries
> changes version, whether those apps use that particular library or not.

If we bundle all the libraries together into one package, then in
distributions we have to rebuild anyway when any of the libraries
changes version since a dependent package can't just depend on any later
version, because we don't know in advance what ABI breaks might occur.
It's also trivial to do rebuilds in a distribution. I'd prefer to see
ABI versioning done right to avoid the pain that might occur there.
Rebuilding dependent packages is on the other hand straightforward.

> The combined library doesn't have symbol versioning, so besides the better
> version compatibility tracking it loses other benefits like limited symbol
> visibility.

The combined library *should* have symbol versioning, which I've brought
up before. This isn't a reason to not have a combined library; it is a
reason to fix the combined library.

Why is limited symbol visibility a benefit in this case?

> Not to mention the extra complexity in makefiles to support it, the
> increasing amount of duct-tape required to hold it together. And still eg
> the MLX pmds declare the configuration not supported at all.

I'd argue that this is because the build system is unnecessarily complex
currently. A library consumer should just be able to #include
<dpdk/foo.h> and link with -ldpdk. It should not have a build system or
custom flags imposed on it by one of the libraries it uses.

Robie
  
Neil Horman Dec. 1, 2015, 1:20 p.m. UTC | #3
On Mon, Nov 30, 2015 at 08:41:02AM -0800, Stephen Hemminger wrote:
> On Mon, 30 Nov 2015 10:03:43 -0500
> Neil Horman <nhorman@tuxdriver.com> wrote:
> 
> > On Wed, Nov 25, 2015 at 08:08:37AM -0800, Stephen Hemminger wrote:
> > > On Wed, 25 Nov 2015 10:38:48 +0200
> > > Panu Matilainen <pmatilai@redhat.com> wrote:
> > > 
> > > > On 11/25/2015 12:46 AM, Stephen Hemminger wrote:
> > > > > On Tue, 24 Nov 2015 16:31:17 +0200
> > > > > Panu Matilainen <pmatilai@redhat.com> wrote:
> > > > >
> > > > >> The physically linked-together combined library has been an increasing
> > > > >> source of problems, as was predicted when library and symbol versioning
> > > > >> was introduced. Replace the complex and fragile construction with a
> > > > >> simple linker script which achieves the same without all the problems,
> > > > >> remove the related kludges from eg mlx drivers.
> > > > >>
> > > > >> Since creating the linker script is practically zero cost, remove the
> > > > >> config option and just create it always.
> > > > >>
> > > > >> Based on a patch by Sergio Gonzales Monroy, linker script approach
> > > > >> initially suggested by Neil Horman.
> > > > >>
> > > > >> Suggested-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
> > > > >> Suggested-by: Neil Horman <nhorman@tuxdriver.com>
> > > > >> Signed-off-by: Panu Matilainen <pmatilai@redhat.com>
> > > > >
> > > > > But it now means distros have to ship 20 libraries which seems like
> > > > > a step back.
> > > > 
> > > > That's how Fedora and RHEL are shipping it already and nobody has so 
> > > > much as noticed anything strange, much less complained about it. 20 
> > > > libraries is but a drop in the ocean on a average distro. But more to 
> > > > the point, distros will prefer 50 working libraries over one that doesn't.
> > > > 
> > > > The combined library as it is simply is no longer a viable option. 
> > > > Besides just being broken (witness the strange hacks people are coming 
> > > > up with to work around issues in it) its ugly because it basically gives 
> > > > the middle finger to all the effort going into version compatibility, 
> > > > and its also big. Few projects will use every library in DPDK, but with 
> > > > the combined library they're forced to lug the 800 pound gorilla along 
> > > > needlessly.
> > > > 
> > > > 	- Panu -
> > > > 
> > > 
> > > Fixing the combined library took less than an hour for us.
> > How did you fix the versioning issue?
> > 
> > Neil
> 
> This is what I did. 
> Also decided to keep shared library version == major DPDK version
> to avoid confusion.
> 
Ok, thats the library versioning, which is great, but I was more concerned about
the symbol versioning - i.e. the collection of version linker scripts that get
applied when you build individual libraries, but completely ignored when you
build the combined library

Neil

> 
> mk: fix when building combined shared library
> 
> The DPDK mk file does not set shared object name or version
> information as required by Debian.
> 
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> 
> --- a/mk/rte.sharelib.mk
> +++ b/mk/rte.sharelib.mk
> @@ -51,10 +51,10 @@ ifeq ($(LINK_USING_CC),1)
>  # Override the definition of LD here, since we're linking with CC
>  LD := $(CC) $(CPU_CFLAGS)
>  O_TO_S = $(LD) $(call linkerprefix,$(CPU_LDFLAGS)) \
> -	-shared $(OBJS) -o $(RTE_OUTPUT)/lib/$(LIB_ONE)
> +	-shared $(OBJS) -Wl,-soname,$(LIB_ONE).$(RTE_LIBVERS) -o $(RTE_OUTPUT)/lib/$(LIB_ONE)
>  else
>  O_TO_S = $(LD) $(CPU_LDFLAGS) \
> -	-shared $(OBJS) -o $(RTE_OUTPUT)/lib/$(LIB_ONE)
> +	-shared $(OBJS) -soname $(LIB_ONE).$(RTE_LIBVERS) -o $(RTE_OUTPUT)/lib/$(LIB_ONE)
>  endif
>  
>  O_TO_S_STR = $(subst ','\'',$(O_TO_S)) #'# fix syntax highlight
> --- a/mk/rte.vars.mk
> +++ b/mk/rte.vars.mk
> @@ -74,8 +74,10 @@ ifneq ($(BUILDING_RTE_SDK),)
>  endif
>  
>  RTE_LIBNAME := $(CONFIG_RTE_LIBNAME:"%"=%)
> +RTE_LIBVERS := $(CONFIG_RTE_LIBVERS:"%"=%)
>  ifeq ($(RTE_LIBNAME),)
>  RTE_LIBNAME := intel_dpdk
> +RTE_LIBVERS := 2
>  endif
>  
>  # RTE_TARGET is deducted from config when we are building the SDK.
>
  
Neil Horman Dec. 1, 2015, 1:30 p.m. UTC | #4
On Tue, Dec 01, 2015 at 12:36:15PM +0000, Robie Basak wrote:
> On Tue, Dec 01, 2015 at 02:21:02PM +0200, Panu Matilainen wrote:
> > Adding a soname and a semi-arbitrary version does not fix the fundamental
> > problems:
> > 
> > Since the library lumps together everything in DPDK, you'd have to bump its
> > version whenever any of the individual libraries bumps its version to have
> > the version mean anything. DPDK 2.0 and 2.1 are supposedly binary compatible
> > but 2.2 certainly is not, and beyond that who knows.
> > 
> > That in turn forces all apps to be rebuild whenever one of the libraries
> > changes version, whether those apps use that particular library or not.
> 
> If we bundle all the libraries together into one package, then in
> distributions we have to rebuild anyway when any of the libraries
> changes version since a dependent package can't just depend on any later
> version, because we don't know in advance what ABI breaks might occur.
> It's also trivial to do rebuilds in a distribution. I'd prefer to see
> ABI versioning done right to avoid the pain that might occur there.
> Rebuilding dependent packages is on the other hand straightforward.
> 
> > The combined library doesn't have symbol versioning, so besides the better
> > version compatibility tracking it loses other benefits like limited symbol
> > visibility.
> 
> The combined library *should* have symbol versioning, which I've brought
> up before. This isn't a reason to not have a combined library; it is a
> reason to fix the combined library.
> 
Agreed, but using a linker script as the combined library gets you the proper
symbol versioning.

> Why is limited symbol visibility a benefit in this case?
> 
Because it prevents an application from inadvertently using symbols that would
otherwise appear in another library (i.e. if not using the combined library, you
know you've used a symbol in another library because you are then forced to add
that library to the build.

> > Not to mention the extra complexity in makefiles to support it, the
> > increasing amount of duct-tape required to hold it together. And still eg
> > the MLX pmds declare the configuration not supported at all.
> 
> I'd argue that this is because the build system is unnecessarily complex
> currently. A library consumer should just be able to #include
> <dpdk/foo.h> and link with -ldpdk. It should not have a build system or
> custom flags imposed on it by one of the libraries it uses.
> 
I don't disagree here, but again, modeling libdpdk.so as  a linker script gets
you to that point (or 99% of the way there at least).

Neil

> Robie
>
  
Neil Horman Dec. 9, 2015, 2:16 p.m. UTC | #5
On Tue, Dec 08, 2015 at 05:03:26PM +0000, Robie Basak wrote:
> On Wed, Dec 02, 2015 at 06:44:19AM -0500, Neil Horman wrote:
> > Theres nothing "complex" about the simple fact that a project builds lots of
> > libraries.  Its extreemely common. Any graphic window manager has exactly the
> > same situation, as do any number of tools that have multiple hardware backends
> > impelmented in user space (v4l, sane, iptables, to name just a few).
> > 
> > > Before I go into details, it would be nice if someone could please
> > > explain why DPDK has to be "special" in needing to do this? I don't
> > Its not special, see above.  Not saying the build environment cant be improved,
> > but the fact that there are multiple libraries is pretty straightforward.
> 
> It's fine in principle for an upstream to ship multiple shared
> libraries, but it is extra and unnecessary work unless there's a
> *reason* to have multiple shared libraries. What are the reasons for
> DPDK?
> 
Separation of functionality.  There is no need to build a library that supports
10 different NICS when a given application is only using one.  Likewise with
other discrete functionality, e.g. you don't link against the ACL library if you
dont' want to use ACL's.

> > > In Debian and Ubuntu, we manage a library transition (an ABI bump in a
> > > library together with all dependencies moving to use the new ABI) by
> > > concurrently packaging both the old and new libraries at once. This
> > > works well with the norm for libraries. We ship one binary package per
> > > soname, with the major version as part of the package name. This allows
> > > a system to have two (or more) ABIs installed simultaneously. For a
> > > library transition, we just package the new version and then that can
> > > land and work concurrently as we then individually update every
> > > dependent (library-consuming) package.
> 
> > So thats, a distribution choice, not an upstream problem.
> 
> No, that's how shared libraries work. By design, multiple ABI versions
> can be co-installed. That's why sonames have the ABI major version
> inside them and the filenames reflect the sonames.
> 
We're talking about different things.  The DPDK support ABI versioning in their
sonames, if you would look at the makefiles and output, you would clearly see
that.  Theres no reason that multiple versions of DPDK can't be co-installed,
thats easy, the question here is weather it should support multiple discrete
libraries or only a single large library.  It seems from your comments that a
single monolithic library should be the only option, and thats clearly the less
flexible one.

> It is a distribution choice to exploit this capability. But it is an
> upstream problem if this capability is broken.
> 
Its not broken.  In what way do you think soname versioning is broken in DPDK?
And in what way is it broken that the only solution is to merge all the
libraries into a single monolithic one?

> By shipping multiple shared libraries, DPDK isn't breaking this
> capability per se. But if the upstream expectation is that it's no
> additional work for distributions because the multiple libraries can
> just be bundled together into a single distribution package, then _this_
> is what breaks the capability.
> 
> Instead DPDK needs to acknowledge that splitting libraries _does_ cause
> additional packaging work for any distribution that wants to use the
> multiple co-installed ABI feature of shared libraries as they are
> designed.
> 

Very well, allowing multiple separate libraries causes some extra work for
packaging.  Specifically it requires that distributions carry a patch that
instantiate a specific library ABI major version number that is incremented ni
lock step for every library in a given build (which is admittedly not what
upstream currently does).  I don't see that as overly hard to do, but to each
their own.

However, the solution there is to propose a patch that defines a single ABI
variable in the makefile structure that is applied to every library on a symbol
version bump.  The answer is _not_ to somehow entangle that with the idea that
we have to have a single monolithic library.  Their separate issues, and you
can solve the problem that you are complaining about without throwing the
proverbial baby out with the bathwater.

> Then, it becomes for upstream a question of the trade-off: does the
> benefit of split libraries outweigh the extra work this creates on
> packagers? To understand this, first I need to understand the rationale
> for shipping multiple shared libaries specifically in DPDK, and I feel
> that you (well, Red Hat) have yet to present a case.
>
Some of the reasons I've mentioned above.  Regardless however, the solution
your advocating to the problem you describe is orthogonal to the complaints
you're making there.  If your goal is to allow multiple ABI versions in a given
package, then propose that ABI versions be incremented monolithically in the
upstream build. Even if a given library hasn't changed, it doesn't hurt to bump
the version number - that is to say, as a distribution, if an application links
against library A version 2, it will also link against library B version 2
(assuming it uses library B), even if library B has no ABI changes in it (thats
simply an artifact of how packaging works, you dont' typically mix and match
libraries from separate builds). So...just bump the soname version number, and
while you're at it, make it a global build setting, not a per library build
setting.  Then you can use it to append a version number to the combined library
(script) as well

What you shouldn't do is assume that each library _has_ to have an independent
ABI version number, and use that to bootstrap an argument that the only solution
is a single combined library.

> >                                                            And it seems like a
> > problem you should have already solved (note the examples above).  If you feel
> > like you need to package multiple ABI versions in the same library, you can,
> > just update the LIBABIVER of all the libraries, instead of the ones that truly
> > change, so that each library is guaranteed a newer so version, to make the
> > library file name unique.  Yes you have to make a small change from upstream,
> > but thats part of the work that distribution maintainers do.
> 
Yes, this makes good sense.

> If it makes sense for upstream, it would be better for all if the code
> was maintained in once place rather than fragmented across distribution
> patches. My argument here is that _does_ make sense for upstream, which
> is why I took the question to this list before we uploaded our first
> patched version to Ubuntu.
> 
yes, thats fine.  So, it seems like perhaps we're talking about the same
solution here. If we have a single LIBABIVER variable that is applied to each
DSO, why do we then further need to only allow a single combined library?

Neil
  

Patch

--- a/mk/rte.sharelib.mk
+++ b/mk/rte.sharelib.mk
@@ -51,10 +51,10 @@  ifeq ($(LINK_USING_CC),1)
 # Override the definition of LD here, since we're linking with CC
 LD := $(CC) $(CPU_CFLAGS)
 O_TO_S = $(LD) $(call linkerprefix,$(CPU_LDFLAGS)) \
-	-shared $(OBJS) -o $(RTE_OUTPUT)/lib/$(LIB_ONE)
+	-shared $(OBJS) -Wl,-soname,$(LIB_ONE).$(RTE_LIBVERS) -o $(RTE_OUTPUT)/lib/$(LIB_ONE)
 else
 O_TO_S = $(LD) $(CPU_LDFLAGS) \
-	-shared $(OBJS) -o $(RTE_OUTPUT)/lib/$(LIB_ONE)
+	-shared $(OBJS) -soname $(LIB_ONE).$(RTE_LIBVERS) -o $(RTE_OUTPUT)/lib/$(LIB_ONE)
 endif
 
 O_TO_S_STR = $(subst ','\'',$(O_TO_S)) #'# fix syntax highlight
--- a/mk/rte.vars.mk
+++ b/mk/rte.vars.mk
@@ -74,8 +74,10 @@  ifneq ($(BUILDING_RTE_SDK),)
 endif
 
 RTE_LIBNAME := $(CONFIG_RTE_LIBNAME:"%"=%)
+RTE_LIBVERS := $(CONFIG_RTE_LIBVERS:"%"=%)
 ifeq ($(RTE_LIBNAME),)
 RTE_LIBNAME := intel_dpdk
+RTE_LIBVERS := 2
 endif
 
 # RTE_TARGET is deducted from config when we are building the SDK.