maria-developers team mailing list archive

Issues with static linking

 

The issue of static vs. dynamic linking keeps coming up in relation to
building of MariaDB (and MySQL). I thought I would share some thoughts on
this.

MySQL has a tradition of linking statically, for performance and other
reasons. However, these days there are a number of problems with static
linking, to the extent that I think static linking of system libraries is
really impractical.

Now, in a *fully* static binary that does not load any shared libraries at
run-time, static linking works ok. However, in most real-life situations
there is some amount of dynamic linking:

 - Nowadays glibc uses dynamic loading at runtime to handle NSS. And while
   this can be disabled, it is strongly discouraged:
   http://gnu.gds.tuwien.ac.at/software/libc/FAQ.html#s-2.22

 - Support for dynamic loading of pluggable storage engines requires dynamic
   loading at runtime.

 - Even just installing a single UDF (user-defined function) into the server
   requires dynamic loading of a shared library.

 - Client libraries (libmysql, libmysqld) may be used in dynamic shared
   objects built by users (like the Amarok plugin using libmysqld).

So a binary built for general distribution really has no choice but to support
some dynamic linking (users building on their own can do differently, of
course).
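
To make the run-time loading in the list above concrete, here is a minimal
sketch in Python, where ctypes.CDLL wraps dlopen() on POSIX systems. The
helper name load_symbol is hypothetical, and using the C library's abs() as
the function to look up is only an assumption for illustration; a real UDF or
storage engine would live in its own shared object.

```python
import ctypes
import ctypes.util


def load_symbol(library_name, symbol_name):
    """Load a shared library at run time and look up one symbol in it.

    Returns None if the library or the symbol cannot be found, so the
    sketch degrades gracefully on systems with a different layout.
    """
    path = ctypes.util.find_library(library_name)
    if path is None:
        return None
    try:
        # On POSIX this ends up calling dlopen() on the given library.
        library = ctypes.CDLL(path)
    except OSError:
        return None
    # Attribute access performs the dlsym()-style lookup; a missing
    # symbol raises AttributeError, which getattr turns into None.
    return getattr(library, symbol_name, None)


# Assumption for illustration: look up abs() in the system C library.
c_abs = load_symbol("c", "abs")
if c_abs is not None:
    print(c_abs(-5))
```

Whatever version of the library is installed on the machine running this is
the one that gets loaded, not the one present when the caller was built.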

The problem now is the following:

Consider a binary B that loads a dynamic shared object S. Assume that both B
and S use the same library L.

Now if B links L statically, part of L, but not necessarily all of it, will be
included inside B. These parts of L will be of some version v1 of L determined
when B was built, and they will be visible to S. But S may use other parts of
L not included in B, and these may be of a different version v2 of L, as
determined by how S was linked and loaded.

The result is that S will be using a mix of two different versions of L. This
very rarely works.
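
The scenario can be illustrated with a toy model in Python. This is not how a
real dynamic linker is implemented; it only mimics "first scope that defines
the symbol wins". The function names l_open, l_read and l_close are
hypothetical, and the model assumes that the parts of L inside B are exported
and searched before the shared copy of L.

```python
# Toy model of the mixed-version problem; not a real dynamic linker.

def resolve(symbol, search_order):
    """Return (provider, version) from the first scope defining symbol."""
    for provider, symbols in search_order:
        if symbol in symbols:
            return provider, symbols[symbol]
    raise LookupError(symbol)


# Hypothetical library L with three functions, in two versions.
L_V1 = {"l_open": "v1", "l_read": "v1", "l_close": "v1"}
L_V2 = {"l_open": "v2", "l_read": "v2", "l_close": "v2"}

# B was linked statically against v1: only the parts of L that B itself
# uses were copied into the binary.
B_STATIC_PARTS = {name: L_V1[name] for name in ("l_open", "l_close")}


def versions_seen_by_s(needed):
    """Resolve the symbols S needs, given how B was linked."""
    # Assumption: B's exported symbols are searched before the shared L (v2)
    # that is installed on the machine where B runs.
    search_order = [("B (static copy)", B_STATIC_PARTS), ("shared L", L_V2)]
    return {name: resolve(name, search_order) for name in needed}


for name, (provider, version) in versions_seen_by_s(
        ["l_open", "l_read", "l_close"]).items():
    print(name, "->", version, "from", provider)
```

In this model S gets l_open and l_close from the v1 code inside B, but l_read
from the v2 shared library, which is exactly the mix described above.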

Typically, L will be e.g. libc, and B will contain parts of libc from the build
machine, while S will use the libc installed on the machine on which B is run.

This is the basic problem. Because of the way linking works in Linux and
similar systems, libraries used by dynamically loaded objects must themselves
be linked dynamically to avoid this problem with mixed versions.

This applies to things like libc and libz. On the other hand objects that are
internally part of MariaDB (libmystrings, libmyisam, etc.) are no problem to
link statically.

There are also a number of bug reports against MySQL relating to this. So it
is not just an academic problem.

The conclusion as far as I can see is that for MariaDB binaries, we should
link system libraries dynamically, or things will just not work correctly for
our users. Unless someone knows a way to avoid these problems.

-----------------------------------------------------------------------

Just to recap, the main benefits of static linking are, to my knowledge:

1. Performance. On i386, position-independent code (as needed in shared
objects) needs code like this in the prologue of every function that accesses
global names (like calling other functions):

          call label
  label:
          pop %ebx

This consumes cycles, and reduces available registers on an already
register-starved architecture.

2. System independence. Upgrading system libraries on the host does not affect
the application, so there is less chance of breakage due to incompatibility
(though this can also be a disadvantage, e.g. when security fixes are needed).

As far as I know, nowadays, these advantages are much smaller:

1. On x86_64, the CPU architecture was designed to support
position-independent code with little or no extra overhead (through
pc-relative addressing modes).

2. Glibc has become much better at preserving binary compatibility across
versions.

-----------------------------------------------------------------------

Hope this helps,

 - Kristian.