Alek Depler - 21 days ago
C++ Question

Memory alignment when casting different data types

I'm curious why my code behaves differently on x86 and armeabi platforms. Here is the idea of the code (it is not the real code, but it is enough to show the problem):

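#include <cstring> // std::memcpy

struct data
{
    int x;
};

void method(unsigned char* buff)
{
    data D;

    // Good approach: copies sizeof(int) bytes, no alignment requirement on buff.
    std::memcpy(&D.x, buff, sizeof(int));

    // Bad approach: reads an int through a possibly misaligned pointer.
    D.x = *(int*)buff;
}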


So when this code is compiled for the ARM architecture by GCC, it crashes with a signal (unaligned memory access) at the line with the cast, although the MSVC-compiled code runs fine. As far as I know, the only correct solution in this case is to use memcpy. Can someone explain what is really happening at runtime?

Answer

This is a fundamental hardware limitation.

Some CPUs can only execute instructions that access 2-, 4-, or 8-byte values at suitably aligned addresses. That is, they cannot read a 4-byte value (for example) from an odd memory address. Attempting to do so generates a hardware exception, which the operating system translates to a signal. On such a CPU, every access to a 4-byte value must use an address that is evenly divisible by 4 (for example).
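
The divisibility rule can be checked directly in code. The following is a minimal sketch, not something from the question: `is_aligned` is a hypothetical helper name, and it assumes a platform where converting a pointer to `std::uintptr_t` yields the plain numeric address (the usual case on x86 and ARM, but implementation-defined in general).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: returns true if the address held in `p` is a
// multiple of `alignment` bytes. `alignment` must be a power of two.
inline bool is_aligned(const void* p, std::size_t alignment)
{
    // Precondition: alignment is a nonzero power of two.
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

    // Assumption: the integer value of the pointer is its numeric address.
    const std::uintptr_t address = reinterpret_cast<std::uintptr_t>(p);

    // For a power of two, "divisible by alignment" means the low bits are zero.
    return (address & (alignment - 1)) == 0;
}
```

With a buffer declared as `alignas(4) unsigned char buf[8]`, `is_aligned(buf, 4)` is true and `is_aligned(buf + 1, 4)` is false, which is exactly the case where a 4-byte load can fault on a strict-alignment CPU.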

The exact alignment restrictions vary by CPU. That's just how the CPU is designed. Here's a theoretical example. Let's say that on a given hardware platform all RAM is accessed as 32-bit words. From the programming viewpoint memory still consists of 8-bit bytes, but each RAM word holds four bytes. The CPU executes an instruction that affects a single byte by fetching the entire memory word containing that byte, performing the operation, then storing the word back. But an operation that affects a 4-byte integer, for example, is expected to reference the logical address of the first byte in a memory word. The CPU fetches the entire 4-byte word, performs the operation, then stores it back.

So the end result is that such a CPU is not capable of addressing 4-byte values that do not start on a 4-byte boundary. Theoretically, this would be possible to implement by fetching two adjacent memory words, executing the operation on the 4-byte value that overlaps both words, then storing both of them back in RAM. Of course, this adds significant complications, and some CPUs are simply not designed to do that. x86 CPUs do handle unaligned accesses in hardware for ordinary loads and stores, which is why the same code runs there.
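
To illustrate the two-word approach, here is a sketch that models such a machine in software. Everything in it is an assumption made for the example: RAM is an array of 32-bit words, byte 0 of each word is its least significant byte (little-endian), and `load32` is a hypothetical name. It shows the extra work that hardware would have to do for an unaligned load.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical model of a word-addressed memory: reads the 4-byte value
// that starts at byte address `byte_addr`, using only whole-word fetches.
// Assumption: little-endian byte order within each 32-bit word.
inline std::uint32_t load32(const std::uint32_t* ram, std::size_t word_count,
                            std::size_t byte_addr)
{
    // Precondition: all four bytes lie inside the modelled RAM.
    assert(byte_addr + 4 <= word_count * 4);

    const std::size_t word = byte_addr / 4;        // word holding the first byte
    const unsigned shift = (byte_addr % 4) * 8;    // bit offset inside that word

    if (shift == 0) {
        return ram[word];                          // aligned: a single fetch
    }

    // Unaligned: the value straddles two adjacent words.
    const std::uint32_t lo = ram[word];            // holds the low-order bytes
    const std::uint32_t hi = ram[word + 1];        // holds the high-order bytes

    // shift is 8, 16 or 24 here, so both shift counts are in range.
    return static_cast<std::uint32_t>((lo >> shift) | (hi << (32 - shift)));
}
```

For example, with `ram = {0x44332211, 0x88776655}`, `load32(ram, 2, 0)` needs one fetch and yields `0x44332211`, while `load32(ram, 2, 1)` needs two fetches and yields `0x55443322`.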

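In your example, the pointer dereference translates to a direct CPU load instruction, which will fail if the actual memory address is not suitably aligned (odd, for example). In C++ terms, dereferencing a misaligned int* is undefined behavior, so the compiler is free to emit such an instruction. memcpy() makes no assumption about the alignment of its arguments (conceptually it is a byte-by-byte copy), so it works.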
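
One way to package the memcpy approach is a small helper that returns the value. This is only a sketch: `read_int` is a hypothetical name, and it assumes the bytes in the buffer are in the host's native byte order, as in the question's code.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: reads an int from `buff` regardless of alignment.
// Assumption: the buffer holds at least sizeof(int) bytes in native byte order.
inline int read_int(const unsigned char* buff)
{
    assert(buff != nullptr);

    int value;
    // memcpy has no alignment requirement on either pointer.
    std::memcpy(&value, buff, sizeof value);
    return value;
}
```

Calling `read_int(buff + 1)` is then well-defined even when `buff + 1` is an odd address. For small fixed sizes like this, compilers commonly replace the call with inline load instructions that are valid for the target, so the helper need not be slower than the cast.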
