Nosturion - 1 month ago
C++ Question

Output from arbitrary dereferenced pointer

I fill the memory as follows:

char buf[8] = { 0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88};


Then I point an unsigned long pointer at each of the first 5 bytes in turn and print the result:

char *c_ptr;
unsigned long *u_ptr;

c_ptr = buf;
for (int i = 0; i < 5; i++)
{
    u_ptr = (unsigned long *)c_ptr;   // misaligned whenever the address is not a multiple of 4
    printf("%lX\n", *u_ptr);          // %lX matches unsigned long
    c_ptr++;
}


When I execute this code on my x64 platform I get what I expected:

44332211
55443322
66554433
77665544
88776655


But when I execute the same code on an ARM platform I get the following:

44332211
11443322
22114433
33221144
88776655


I.e. the access is bound to a 4-byte boundary, and only the 4 bytes within that boundary are read.

So I want to ask: is this behavior (when pointer_value % 4 != 0) erroneous or implementation-specific?
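For reference, the pointer_value % 4 != 0 condition can be tested at run time. The following is only a minimal sketch: is_aligned is a hypothetical helper name, and it assumes that converting a pointer to std::uintptr_t yields the numeric address, which holds on common platforms but is implementation-defined.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not from the original post): returns true when the
// address held by p is a multiple of `alignment`.
// Assumption: the pointer-to-integer conversion yields the numeric address.
inline bool is_aligned(const void* p, std::size_t alignment) {
    // The alignment must be a non-zero power of two.
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}
```

With a 4-byte-aligned buffer, is_aligned(buf + i, 4) would be true only for i = 0 and i = 4, which are exactly the two iterations where both platforms print the same value.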

UPD:
I know about endianness. What I want to know is whether it is correct that I am getting

11443322


instead of

55443322


I.e. when I have a pointer with the value 0x10000001, for example, it builds the unsigned long from the bytes at addresses 0x10000001, 0x10000002, 0x10000003 and then 0x10000000, instead of 0x10000004.

Answer

After suspecting memory alignment I did a quick google =)

http://awayitworks.blogspot.co.nz/2010/02/arm-memory-alignment.html

Stated in that article:

Till ARMv4 architecture, it’s assumed that address given for fetching contents is memory aligned...a 32-bit data fetch should have address aligned to 32-bit and so on. As guessed correctly the problem is only for 32-bit and 16-bit data fetching. ARM ignores lower 2-bits of address if the data fetch is 32-bit, and ignores lower 1-bit if data fetch is 16-bit. So, in all if the address is not properly aligned then data fetch will be erroneous.

Note the last sentence =)
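To make the quoted behaviour concrete, here is a small portable model of it. This is a sketch, not a description of any particular core: simulate_old_arm_load32 is a hypothetical name, and the rotate step is inferred from the output shown in the question (the aligned word appears rotated right by 8 bits per byte of misalignment). The memory is assumed to be little-endian and to start at a 4-byte-aligned address.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical model of a 32-bit load that ignores the low 2 address bits.
// `mem` stands for little-endian memory starting at an aligned address;
// `offset` is the (possibly misaligned) address being loaded from.
inline std::uint32_t simulate_old_arm_load32(const unsigned char* mem,
                                             std::size_t offset) {
    assert(mem != nullptr);
    // Drop the low 2 bits: the fetch always comes from the aligned word.
    const std::size_t aligned = offset & ~static_cast<std::size_t>(3);
    const std::uint32_t word =
          static_cast<std::uint32_t>(mem[aligned])
        | (static_cast<std::uint32_t>(mem[aligned + 1]) << 8)
        | (static_cast<std::uint32_t>(mem[aligned + 2]) << 16)
        | (static_cast<std::uint32_t>(mem[aligned + 3]) << 24);
    // Rotate right by 8 bits per byte of misalignment (inferred from the
    // question's output); a shift of 0 is handled separately to avoid
    // shifting a 32-bit value by 32.
    const unsigned shift = static_cast<unsigned>(offset & 3) * 8;
    return shift == 0 ? word : (word >> shift) | (word << (32 - shift));
}
```

Fed the 8 bytes from the question, offsets 0 through 4 give 44332211, 11443322, 22114433, 33221144 and 88776655, matching the ARM output above.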

If you require the behaviour that you expected on x86, you'll have to explicitly build the integers from chars, i.e. (assuming little-endian):

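// Endian-specific: assumes the data is stored little-endian
inline unsigned long ulong_at( const char *p ) {
    // Convert through unsigned char so bytes >= 0x80 are not sign-extended
    return ((unsigned long)(unsigned char)p[0])
         | (((unsigned long)(unsigned char)p[1]) << 8)
         | (((unsigned long)(unsigned char)p[2]) << 16)
         | (((unsigned long)(unsigned char)p[3]) << 24);
}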

Or perhaps:

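// Architecture-specific: copies the bytes in the host's own byte order
inline unsigned long ulong_at( const char *p ) {
    unsigned long val = 0;   // zero first: unsigned long may be wider than 4 bytes
    char *v = (char*)&val;
    v[0] = p[0];
    v[1] = p[1];
    v[2] = p[2];
    v[3] = p[3];
    return val;
}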
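A third option, not part of the original answer, is to let std::memcpy do the byte copy into a fixed-width integer. This is a minimal sketch under the assumption that a 4-byte read in host byte order is what is wanted; load_u32_unaligned is a hypothetical name. Like the architecture-specific version above, the result depends on the host's endianness.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: reads 4 bytes from any address, aligned or not,
// and returns them as a 32-bit value in the host's byte order.
inline std::uint32_t load_u32_unaligned(const void* p) {
    assert(p != nullptr);
    std::uint32_t val;
    // memcpy copies byte by byte as far as the language is concerned, so no
    // misaligned 32-bit access is ever expressed in the source code.
    std::memcpy(&val, p, sizeof val);
    return val;
}
```

Using std::uint32_t rather than unsigned long keeps the width at exactly 4 bytes on platforms where unsigned long is 8 bytes.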