YoYoYonnY YoYoYonnY - 1 month ago 7
C Question

How does an extern variable work in a shared library

Say I wrote a simple dynamic library like this:

lib.h



#pragma once

extern int x;
extern int p(void);


lib.c



#include <lib.h>
#include <stdio.h>

int x = 0; /* definition of the variable that lib.h declares extern */

int p(void) {
    printf("lib: %d\n", x++);
    return 0;
}


a.c



#include <lib.h>
#include <stdio.h>

int main(void) {
    for (; !p(); x--) printf("a.c: %d\n", x);
    return 0;
}


b.c



#include <lib.h>
#include <stdio.h>

int main(void) {
    for (; !p(); x = 0) printf("b.c: %d\n", x);
    return 0;
}


What would a and b print?
I can think of a couple of things that may happen:


  • Linker error: x declared extern but never defined.

  • Each process gets its own x, including lib. (b.c is always 0, a.c counts down, lib counts up)

  • Each process gets its own x to share with lib. (a.c and b.c are always 1, lib is always 0)

  • All processes share the same x, including lib. (a.c, b.c and lib return random values)

  • All processes share the same x, including lib, until someone other than lib writes to it, then that process gets its own version of x, not shared with lib (Read this online somewhere). (lib always increments, b.c always prints 0, a.c counts down)



What typically happens? Are there any inconsistencies between compilers/platforms we should know about? Can we force one behaviour (I am thinking __declspec(dllexport), compiler flags, etc.)?

Answer

There are several parts to this question:

What would a and b print? I can think of a couple of things that may happen:

Linker error: x declared extern but never defined.

Nothing would be printed, since a and b probably haven't been built into executables yet. You need to link against lib.so, lib.a, or an import library lib.lib to give the executable a linkable definition of x; otherwise nothing else works (mostly; it can be more complicated than that if you try hard).

Each process gets its own x, including lib. (b.c is always 0, a.c counts down, lib counts up)

lib isn't a process in your scenario; it's a shared library. The shared library is loaded and linked separately in the address space of each process that references it in a way the dynamic loader understands (ld-linux.so on Linux, ntdll.dll on Windows). Each process observes its own copy of the loaded library, and the library code running in that process sees the same copy. Running a should therefore print 0 followed by 1 forever: p() is run and tested (it prints 0 and leaves x at 1), x is printed (1), and x is decremented back to 0. b will also print 0 followed by 1 forever: p() is run and tested, x is printed, and x is set to 0. Note that p() prints x++, so the increment takes place after the value is captured as the argument to printf.

The x variables that a and b refer to are specific to each run of a or b. This is frequently accomplished at the OS level by mapping pages of the loadable library from disk into memory and marking them "copy-on-write": when the host process first attempts to change a page, the OS allocates a new page and copies the old contents into it. The result is that unmodified parts of the loaded library take up less actual memory.

Each process gets its own x to share with lib. (a.c and b.c are always 1, lib is always 0)

lib isn't a separate process. When p() executes inside a, it sees the same x that a itself is linked against.

All processes share the same x, including lib. (a.c, b.c and lib return random values)

Normally not the case (also see below).

All processes share the same x, including lib, until someone other than lib writes to it, then that process gets its own version of x, not shared with lib (Read this online somewhere). (lib always increments, b.c always prints 0, a.c counts down)

Some old runtime systems that don't support separate address spaces do work this way, notably AmigaDOS. It's quite unlikely you'll run into one.

What typically happens? Are there any inconsistencies between compilers/platforms we should know about? Can we force one behaviour (I am thinking __declspec(dllexport), compiler flags, etc.)?

In the vast majority of cases, each process shares extern variables with the one instance of the given library loaded in that process. Unless you take specific action, that's expected.

In the comments, there were a few other questions:

Can Windows DLLs (or others) export non-function data?

Yes. On Windows, use the DATA qualifier in the .def file when building the import lib. On other platforms it's no different from exporting functions. Note, however, that you'll receive a pointer to the target variable rather than being bound directly to the storage it occupies.

Asterisk, see below?

On Windows, sections have a SHARED attribute that causes the loader to map the same page into every process that uses the DLL. It's not the default, and you have to jump through hoops and use platform-specific pragmas to get it. There are a lot of reasons not to use this.

Most of the time, when a DLL wants to share state among copies of itself loaded in many processes, it uses the shared memory API of the host system (usually CreateFileMapping or mmap). This allows flexibility (for example, all a processes could share one version of x, separate from all b processes, which share another copy of x). Note that using SHARED could easily mean that running a could crash b, and that a long-running third user c that keeps the DLL loaded could prevent either a or b from starting up again until a reboot.
