I was reading Game Coding Complete, 4th edition. There is a topic on memory alignment. For the code below, the author says that the first struct is really slow because it is neither bit-aligned nor byte-aligned, the second one is bit-aligned but not byte-aligned, and the last one is fast because it is both. He says that without the pragma the compiler will align the memory itself, which wastes memory. I couldn't really follow the calculations.
This is a portion of the text:
If the compiler were left to optimize SlowStruct by adding unused bytes,
each structure would be 24 bytes instead of just 14. Seven extra bytes are padded after
the first char variable, and the remaining bytes are added at the end. This ensures
that the entire structure always starts on an 8-byte boundary. That’s about 40 percent
of wasted space, all due to a careless ordering of member variables.
This is the concluding line, in bold:
Don’t let the compiler waste precious memory space. Put some of your brain cells to
work and align your own member variables.
Please show me the calculations and explain the padding concept in the text more clearly.
#pragma pack(push, 1)
struct ReallySlowStruct
{
    char c : 6;
    __int64 d : 64;
    int b : 32;
    char a : 8;
};
The examples given in the book depend heavily on the compiler and the computer architecture used. If you test them in your own program, you may get totally different results from the author's. I will assume a 64-bit architecture, because the author does too, from what I've read in the description. Let's look at the examples one by one:
ReallySlowStruct: If the compiler supports struct members that are not byte-aligned, "d" will start at the seventh bit of the first byte of the struct. That sounds very good for saving memory. The problem is that C has no bit addressing. So to save newValue to the "d" member, the compiler must do a whole lot of bit shifting and masking: two bits of newValue are shifted into the two bits left free in byte 0, and the remaining 62 bits are shifted and stored starting at byte 1 (which way the shifts go depends on how the compiler orders bits within a byte). Byte 1 is also a non-aligned memory location for a 64-bit value, so a single aligned 64-bit store cannot be used. Depending on the architecture, the compiler has to write the value one byte at a time, or the hardware performs a slower unaligned access.
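To make the bit-fiddling concrete, here is a rough sketch of the kind of work a store to "d" turns into. It is not what any particular compiler emits. The layout (the 6 bits of "c" in the low bits of byte 0, then the 64 bits of "d", least significant bit first) is an assumption, and `store_d`, `load_d` and `check_round_trip` are hypothetical helper names.

```cpp
#include <cassert>   // assert, used by the round-trip check
#include <cstdint>   // std::uint64_t

// Assumed layout (hypothetical, compiler-specific): bits 0-5 of bytes[0] hold "c",
// the next 64 bits hold "d", least significant bit first.
// The buffer must be at least 9 bytes long.
void store_d(unsigned char* bytes, std::uint64_t newValue) {
    // The low 2 bits of newValue go into the top 2 bits of byte 0; c's 6 bits are kept.
    bytes[0] = static_cast<unsigned char>((bytes[0] & 0x3F) | ((newValue & 0x3) << 6));
    // The next 56 bits fill bytes 1..7, one byte at a time.
    std::uint64_t rest = newValue >> 2;
    for (int i = 1; i <= 7; ++i) {
        bytes[i] = static_cast<unsigned char>(rest & 0xFF);
        rest >>= 8;
    }
    // The last 6 bits share byte 8 with whatever field follows "d".
    bytes[8] = static_cast<unsigned char>((bytes[8] & 0xC0) | (rest & 0x3F));
}

// Reassembles "d" from the same assumed layout.
std::uint64_t load_d(const unsigned char* bytes) {
    std::uint64_t value = bytes[8] & 0x3F;                 // top 6 bits of d
    for (int i = 7; i >= 1; --i) value = (value << 8) | bytes[i];
    return (value << 2) | (bytes[0] >> 6);                 // low 2 bits live in byte 0
}

void check_round_trip() {
    unsigned char bytes[9] = {0x2A};                       // c = 0x2A, everything else zero
    store_d(bytes, 0x0123456789ABCDEFULL);
    assert(load_d(bytes) == 0x0123456789ABCDEFULL);        // d survives the round trip
    assert((bytes[0] & 0x3F) == 0x2A);                     // c is untouched
}
```

Compare that with an aligned 64-bit member, where the same assignment is a single store instruction.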
SlowStruct: It gets better. The compiler can get rid of all the bit-fiddling. But writing "d" is still slow, because with packing "d" starts at offset 1, and a 64-bit value wants to start at an address divisible by 8 (its own size). On hardware that requires aligned access the value has to be written one byte at a time; on x86 the unaligned access works but can be slower. Packed, the struct takes 1 + 8 + 4 + 1 = 14 bytes. And worse, if I switch off packing, I waste a lot of memory: each member is padded so that it starts at an address divisible by its own alignment. In this case "c" is followed by 7 padding bytes so that "d" starts at offset 8, "b" lands at offset 16 and "a" at offset 20, and 3 more bytes are added at the end so that the total size is a multiple of 8. That gives 1 + 7 + 8 + 4 + 1 + 3 = 24 bytes, 10 of them padding.
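The SlowStruct numbers can be checked with `sizeof` and `offsetof`. This is a minimal sketch, assuming a typical 64-bit compiler (GCC, Clang or MSVC on x86-64). The names `SlowStructPacked`, `SlowStructPadded` and `print_layouts` are mine, and `std::int64_t` stands in for `__int64` so the code is not tied to one compiler.

```cpp
#include <cstddef>   // offsetof
#include <cstdint>   // std::int64_t
#include <cstdio>    // std::printf

// Packed to 1 byte: no padding anywhere.
#pragma pack(push, 1)
struct SlowStructPacked {
    char         c;  // offset 0
    std::int64_t d;  // offset 1 (misaligned)
    int          b;  // offset 9
    char         a;  // offset 13, so 14 bytes in total
};
#pragma pack(pop)

// Same members in the same order, with the compiler's default alignment.
struct SlowStructPadded {
    char         c;  // offset 0, followed by 7 padding bytes
    std::int64_t d;  // offset 8
    int          b;  // offset 16
    char         a;  // offset 20, followed by 3 padding bytes, so 24 bytes in total
};

// Prints the size and the offset of "d" for both layouts.
void print_layouts() {
    std::printf("packed: size %zu, d at %zu\n",
                sizeof(SlowStructPacked), offsetof(SlowStructPacked, d));
    std::printf("padded: size %zu, d at %zu\n",
                sizeof(SlowStructPadded), offsetof(SlowStructPadded, d));
}
```

On other compilers or on 32-bit targets the padded numbers may differ, which is exactly why the results in the book are not universal.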
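FastStruct: Here every member is aligned to its own size. "d" takes up 8 bytes, as it should, and starts at offset 0, and the 4-byte "b" follows it at offset 8. Because the chars are all bundled in one place, the compiler does not pad them and does not waste space: chars are only 1 byte each and can start at any address. However, with padding turned on, a 64-bit compiler still rounds the struct size up to a multiple of 8. Assuming FastStruct holds the same four members (8 + 4 + 1 + 1 = 14 bytes), that means 2 more bytes at the end, for a total of 16.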
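The same check for the reordered layout, again as a sketch for a typical 64-bit compiler. `FastStructSketch` is a hypothetical stand-in that simply orders the four members from largest to smallest; it is not copied from the book. `round_up` is a helper of mine that shows the rounding rule for the total size.

```cpp
#include <cstddef>   // std::size_t, offsetof
#include <cstdint>   // std::int64_t

// Hypothetical reordering: largest member first, chars bundled at the end.
struct FastStructSketch {
    std::int64_t d;  // offset 0
    int          b;  // offset 8
    char         a;  // offset 12
    char         c;  // offset 13, followed by 2 bytes of tail padding
};

// Rounds n up to the next multiple of align (align must be non-zero).
constexpr std::size_t round_up(std::size_t n, std::size_t align) {
    return (n + align - 1) / align * align;
}

// 14 bytes of members, rounded up to the struct's 8-byte alignment.
constexpr std::size_t expected_size = round_up(8 + 4 + 1 + 1, alignof(FastStructSketch));
```

With default alignment this comes to 16 bytes instead of the 24 that the careless ordering needs, and no member is misaligned.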