John John - 1 year ago
C++ Question

Separate compilation units vs Single Compilation unit for faster compilation, linking, and optimised code?

There are several questions which talk about why we should have separate compilation units to improve compile times (for example, not putting any code in the hpp files, but only in the cpp files).

But then I found this question:

#include all .cpp files into a single compilation unit?

If we ignore the question of maintainability and look only at compile/link times, as well as at how well the code can be optimised, what would be the benefits and pitfalls of having just one hpp and one cpp file?

Note that the post I linked to talks about a single cpp file (while there are many header files). I'm asking what happens if we have just one hpp file and one cpp file.

EDIT: if we can ignore the fact that changing a single line will cause the entire code to be recompiled, will it still be faster than if thousands of separate files are recompiled from scratch?

EDIT: I am not interested in a discussion about maintainability. I'm trying to understand what makes a compiler compile faster. This question has nothing to do with what is practical, but more to do with just understanding a simple matter:

Will one large hpp & cpp file compile faster than the same code split across many hpp and cpp files, using a single core?

EDIT: I think people are getting sidetracked and talking about what is practical and what one SHOULD do. This question is not about what one SHOULD do - it is simply to help me understand what the compiler is doing under the hood - until now no one has answered that question, and instead is talking about whether it is practical or not.

EDIT: Apart from the one person who actually tried to answer it, I feel this question hasn't got the justice it deserves and is being unnecessarily downvoted. SO is about sharing information, not punishing questions because the people asking don't already know the answer.

Answer Source

It is compiler specific, and depends upon the optimizations you are asking from your compiler.

Most recent free-software C++11 (or C++14) compilers are able to do link-time optimization: both recent GCC and Clang/LLVM accept the -flto flag (for link-time optimization). To use it you should compile and link your code with it, together with the same additional optimization flags. A typical use through the make builder could be:

make 'CXX=g++ -flto -O2' 

or, in separate commands:

g++ -flto -O2 -Wall -I/usr/local/include -c src1.cc
g++ -flto -O2 -Wall -I/usr/local/include -c src2.cc
g++ -flto -O2 -Wall src1.o src2.o -L/usr/local/lib -lsome -o binprog

Don't forget -flto -O2 at link time !

Then the code is compiled nearly the same as if you had put src1.cc and src2.cc in the same compilation unit. In particular, the compiler is able to (and sometimes will) inline a call from a function in src1.cc into a function in src2.cc.

What happens under the hood with -flto (with GCC, but in principle it is similar in Clang) is that the compiler puts some intermediate representation (in some GIMPLE/SSA form) of your source code into each object file. At "link time" (actually done also by the compiler, not only the linker) this intermediate representation is reloaded, processed, and recompiled for the entire program. So the compilation time nearly doubles.

So -flto slows down the compilation (approximately by a factor of 2) and might sometimes give a few percent of performance improvement (in the execution time of the produced binary).

I'm trying to understand what makes a compiler compile faster.

This is compiler specific, and depends a lot on the optimizations you are asking from it. Using a recent GCC 5 or GCC 6 with g++ -O2 (and IIRC also with clang++ -O2), by practical and empirical measure the compilation time is proportional not only to the total size of the compilation unit (i.e. the number of tokens produced after preprocessing, include and macro expansion, and even template expansion) but also to the square of the size of the biggest function. A possible explanation is the time complexity of register allocation and instruction scheduling. Notice that the standard headers of the C++11 or C++14 containers expand to something quite big (e.g. #include <vector> gives about ten thousand lines).

Precompiled headers might help reduce compilation time (but not always!). You would have a single common header, and you are better off not having too-small compilation units: total compilation time is slightly lower with 20 *.cc files of about two thousand lines each than with 200 *.cc files of two hundred lines each (notably because header files expand to many tokens). I generally recommend having at least a thousand lines per *.cc file if possible, so having just one small file of a hundred lines per class implementation is often a bad idea (in terms of overall compilation time). For a tiny project of e.g. 4KLOC, having a single source file is quite sensible.

Of course, several compilation units can be compiled in parallel, e.g. with make -j.

PS. See the slides and documentation about GCC internals, and follow the many links there, for more.
