Today I decided to disassemble a simple "Hello world" program written in Visual C++, using IDA Pro.
From previous experience I expected not to find an immediate call to printf at the executable's entry point, and I was right.
I found a lot of code that I did not write and that the compiler added during compilation.
I would like a better understanding of what code is added during the compilation process.
What does it do?
Are there any "tricks" to quickly find "main" and skip all the unnecessary compiler-generated code in the disassembly?
The best I could find was this post,
which says that the execution order of an executable compiled with Visual C++ is as follows:
There are various things required by the C++ standard that you will likely encounter.
Most important is that there needs to be code that handles the construction of any statics in the main translation unit before main is called, and code that handles their destruction after main returns. Additionally, the standard requires a function, atexit, that allows you to register additional functions to be called after main returns.
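To make the above concrete, here is a minimal sketch using only standard C++. The names event_log, Tracer, g_tracer, on_exit_handler and tracer_built_before_caller are made up for this illustration; they are not part of any runtime.

```cpp
#include <cstdio>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical event log for this illustration. It is a function-local static
// so that it is guaranteed to exist when g_tracer's constructor uses it.
std::vector<std::string>& event_log() {
    static std::vector<std::string> log;
    return log;
}

struct Tracer {
    Tracer()  { event_log().push_back("Tracer constructed"); }
    ~Tracer() { std::puts("Tracer destroyed (after main returned)"); }
};

// Object with static storage duration: the startup code runs its constructor,
// and the exit code runs its destructor.
Tracer g_tracer;

// Candidate for std::atexit. If it is registered after g_tracer has been
// constructed, it runs before ~Tracer(), because exit works in reverse order.
void on_exit_handler() { std::puts("atexit handler called"); }

// Call this first thing in main to check that g_tracer already exists.
bool tracer_built_before_caller() {
    return !event_log().empty() && event_log().front() == "Tracer constructed";
}
```

If you call tracer_built_before_caller() at the top of main it should return true, and if you then call std::atexit(on_exit_handler), the handler is run once main has returned. Neither the Tracer constructor nor the handler is called from anything you wrote in main, which is exactly the kind of extra code you see in the disassembly.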
So at a minimum, the startup code needs to be able to build this data structure of functions that will be called on return from main. It is a dynamic data structure because the program needs to add to it at run time, and the order of the calls is the reverse of the order of registration (so typically you want a data structure that makes it easy to add at the end you walk from).
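A rough sketch of such a data structure, assuming a plain stack of function pointers. ExitRegistry and the helper functions are hypothetical names; a real runtime keeps something equivalent internally, but not necessarily in this form.

```cpp
#include <vector>

// Hypothetical type names, for illustration only.
using ExitHandler = void (*)();

class ExitRegistry {
public:
    // Registration appends at the end...
    void add(ExitHandler handler) { handlers_.push_back(handler); }

    // ...and running walks from that same end, so the last handler
    // registered is the first one called.
    void run_all() {
        while (!handlers_.empty()) {
            ExitHandler handler = handlers_.back();
            handlers_.pop_back();  // pop first, so a handler may safely register more
            handler();
        }
    }

private:
    std::vector<ExitHandler> handlers_;
};

// Small demo: record the order in which two handlers are called.
std::vector<int> g_order;
void first_registered()  { g_order.push_back(1); }
void second_registered() { g_order.push_back(2); }

std::vector<int> demo_exit_order() {
    g_order.clear();
    ExitRegistry registry;
    registry.add(first_registered);
    registry.add(second_registered);
    registry.run_all();
    return g_order;  // second_registered ran first, so this holds 2 then 1
}
```

A stack is used here because adding and walking happen at the same end, which is the property described above.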
But additionally, the standard requires that statics in other translation units are created before any function in that translation unit is executed. Often, compilers simply arrange everything in the linker so that it all gets called before main, but that is not required. Compilers that do things differently need to provide thunks to the initialisation routines of the other translation units, which will be called on the first function call.
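The thunks themselves are a compiler detail, but the same "initialise on first call" idea is available in the language itself as a function-local static. The sketch below shows that analogue, not what any particular compiler emits; Settings, settings, g_init_count and init_count are hypothetical names.

```cpp
#include <string>

// Hypothetical counter, used only to observe when initialisation happens.
int g_init_count = 0;

struct Settings {
    std::string name;
    Settings() : name("default") { ++g_init_count; }
};

// The object is constructed the first time control reaches the declaration,
// not before main. Later calls reuse the same object.
Settings& settings() {
    static Settings instance;
    return instance;
}

int init_count() { return g_init_count; }
```

Before the first call to settings(), init_count() is 0; after any number of calls it is 1. In the disassembly this shows up as a guard check at the top of the function rather than as code before main.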
This alone is quite a bit of work if you use any of the standard library. Remember, std::cout is a static object (static storage duration, not static linkage - confusingly overloaded word alert). That means setting up communication with your console output, which involves calling whatever APIs your platform needs. There are many such objects in the standard library.
And then, there may be stuff specific to your platform and/or compiler that prepares the process in some useful way, or parses environment variables, or loads "standard" dynamic/shared libraries, or similar stuff.
Typically, exit just walks that list and passes the return value of main to the environment, since most modern OSes clean up after a process themselves, but there may be system-specific steps in addition to that.
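Putting the pieces together, the overall shape of the startup code can be sketched roughly as below. This is a simplified model under my own assumptions, not the actual Visual C++ runtime: fake_startup, Hook and the three small functions are hypothetical, and a real entry point does considerably more platform-specific work.

```cpp
#include <string>
#include <vector>

using Hook = void (*)();

// Hypothetical trace, used only to make the call order visible.
std::vector<std::string> g_trace;

void init_console()  { g_trace.push_back("init"); }
void flush_console() { g_trace.push_back("cleanup"); }
int  user_main()     { g_trace.push_back("main"); return 7; }

// Simplified model of an entry point: run initialisers, call main,
// run the exit handlers in reverse order, hand main's result back.
int fake_startup(const std::vector<Hook>& initializers,
                 int (*entry)(),
                 const std::vector<Hook>& exit_handlers) {
    for (Hook init : initializers) {
        init();
    }
    int status = entry();
    for (auto it = exit_handlers.rbegin(); it != exit_handlers.rend(); ++it) {
        (*it)();
    }
    return status;  // a real runtime would pass this to the OS
}

int demo_startup() {
    g_trace.clear();
    return fake_startup({init_console}, user_main, {flush_console});
}
```

When reading a disassembly, the practical consequence is that main tends to be the one call in the entry routine whose return value is passed on to the exit path, with the initialisation calls before it.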