padawanTony - 1 month ago
C Question

Linking files/headers in C

Let's say I have the following program (hello.c):


    #include <stdio.h>
    #include <math.h>

    #define NAME "ashoka"

    int main(int argc, char *argv[])
    {
        printf("Hello, world! My name is %s\n", NAME);
        return 0;
    }

So, as I understand it the process of compiling this program is:

  1. Preprocessing: will copy-paste the function declarations from stdio.h and math.h and replace NAME with "ashoka"

    clang -E hello.c

  2. Compiling: will turn code into assembly code

    clang -S hello.c

    file produced: hello.s

  3. Assembling: transform assembly code to object code

    clang -c hello.s

    file produced: hello.o

  4. Linking: combine object files into one file that we will execute.

    clang hello.o -lm

    OR (let's say I also want to link hello2.o)

    clang hello.o hello2.o

So, here come the questions:

  1. Is the process described the correct one?

  2. In the linking stage, we link together .o (object code) files. I know that math.h resides in the system include directory. Where is the object code that goes with it? How does the linker find it?

  3. What are .a files (static libraries) and .so files (dynamic libraries) in Linux? And how are they related to .o files and the linking stage?

  4. Let's say I want to share a library I made with the world. I have a file in which I have declared and implemented my functions. How would I go about sharing this so that people could include it in their projects by doing either #include <mylib.h> or #include "mylib.h"?

Answer

  1. Yes, though going through assembly is an extra step (you can just compile the C source to an object). Internally, the compiler will have many more stages: parsing code into an AST, generating intermediate code (e.g. LLVM bitcode for clang), optimizing, etc.
  2. math.h just declares prototypes for the standard math library, libm (which you link with -lm). The functions themselves live in object files archived inside libm.a, or in the shared libm.so; the linker looks for these in its standard library directories (see below).
  3. Static libraries are just archives of object files. The linker will check what symbols are used and will extract and link the object files that export those symbols. Those libraries can be manipulated with ar (for example ar -t lists the object files in a library). Dynamic (or shared) libraries are not included in the output binary. Instead, the symbols your code needs are loaded at runtime.
  4. You would simply create a header file with your extern prototypes:

    #ifndef MYLIB_H
    #define MYLIB_H
    extern int mylib_something(char *foo, int baz);
    #endif /* MYLIB_H */

    and ship it with your library. Of course, the developer must also link (dynamically) against your library.

The advantage of static libraries is reliability: there will be no surprises, because you already linked your code against the exact version you're sure it works with. Static linking can also be useful when you're using uncommon or bleeding-edge libraries that you don't want to install as shared libraries. This comes at the cost of increased binary size.

Shared libraries produce smaller binaries (because the library is not in the binary) with a smaller RAM footprint (because the OS can load the library once and share it among many processes), but they require a bit more care to make sure you're loading exactly what you want (e.g. see DLL Hell on Windows).

As @iharob notes, their advantages don't stop at binary size. For example, if a bug is fixed in a shared library, all programs that use it benefit from the fix (as long as it doesn't break compatibility). Shared libraries also provide abstraction between the external interface and the implementation. For example, say an OS provides a library that applications use to interface with it. With updates, the OS interface changes, and the library implementation tracks those changes. If it were a static library, all programs would have to be rebuilt against the new version. If it were a shared library, they wouldn't even notice (as long as the external interface stays the same). Another example is Linux libraries that wrap system- or distro-specific details behind a common interface.