user2377766 - 2 months ago
C Question

Loading xmm registers from address location

I'm trying to load/store memory from/to a char array (accessed through a char pointer) using the 128-bit xmm0 register on a 32-bit operating system.

What I tried is very simple:

int main()
{
    char* data = new char[33];
    for (int i = 0; i < 32; i++) data[i] = 'a';
    data[32] = 0;
    __asm
    {
        movdqu xmm0,[data]
    }
    delete[] data;
}

The problem is that this doesn't seem to work. The first time I debugged the win32 application I got:

xmm0 = 0024F8380000000000F818E30055F158

The second time I debugged it I got:

xmm0 = 0043FD6800000000002C18E3008CF158

So there must be something with the line:

movdqu xmm0,[data]

I tried using this instead:

movdqu xmm0,data

but I got the same result.

My guess was that I am copying the address instead of the data at that address. However, the value shown in the xmm0 register is too large for a 32-bit address, so it must be copying memory from some other location.

I also tried some other instructions I found on the internet, but with the same result.

Is it the way I'm passing the pointer or am I misunderstanding something about xmm basics?

A valid solution with explanation will be appreciated.

Even though I found the solution (finally, after 3h), I would still like an explanation:

push eax
mov eax,data
movdqu xmm0,[eax]
pop eax

Why should I pass the pointer through a 32-bit register? (Please edit my question by answering with the explanation.)

#include <cstdio>

int main()
{
    char *dataptr = new char[33];
    char datalocal[33];
    dataptr[0] = 'a';   dataptr[1] = 0;
    datalocal[0] = 'a'; datalocal[1] = 0;
    // 1st column: address of the characters; 2nd column: address of the symbol itself
    printf("%p %p %c\n", (void *)dataptr, (void *)&dataptr, dataptr[0]);
    printf("%p %p %c\n", (void *)datalocal, (void *)&datalocal, datalocal[0]);
    delete[] dataptr;
}


0xd38050 0x7635bd709448 a
0x7635bd709450 0x7635bd709450 a

As we can see, dataptr is really a pointer variable (32 or 64 bits wide, stored here at 0x7635bd709448) that contains a pointer to the heap, 0xd38050.

The local variable is the 33-char buffer itself, allocated at address 0x7635bd709450.

But datalocal also works as a char * value.
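
The difference between the two symbols can also be checked in code instead of by reading printed addresses. This is only a minimal sketch: the helper names symbol_storage_size, symbol_is_the_buffer and compare_symbols are made up for illustration, while dataptr and datalocal mirror the program above.

```cpp
#include <cassert>
#include <cstddef>

// Bytes of storage that the symbol itself names.
template <std::size_t N>
std::size_t symbol_storage_size(char (&)[N]) { return N; }        // array: the whole buffer
std::size_t symbol_storage_size(char*&) { return sizeof(char*); } // pointer: only the address slot

// True when the symbol's own address is also the address of the first char.
template <std::size_t N>
bool symbol_is_the_buffer(char (&arr)[N]) {
    return static_cast<void*>(&arr) == static_cast<void*>(&arr[0]);
}
bool symbol_is_the_buffer(char*& ptr) {
    return static_cast<void*>(&ptr) == static_cast<void*>(ptr);
}

// Hypothetical demo mirroring dataptr / datalocal from the program above.
void compare_symbols() {
    char* dataptr = new char[33];
    char datalocal[33];
    assert(symbol_storage_size(datalocal) == 33);
    assert(symbol_storage_size(dataptr) == sizeof(char*));
    assert(symbol_is_the_buffer(datalocal));   // memory at the symbol is the text
    assert(!symbol_is_the_buffer(dataptr));    // memory at the symbol is only a pointer
    delete[] dataptr;
}
```

So an instruction that reads memory "at the symbol" gets the characters for datalocal, but only the stored address for dataptr.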

I'm a bit unsure what the formal C++ explanation of this is. While writing C++ code this feels quite natural: dataptr[0] is the first element in the heap memory (i.e. two memory reads, first the pointer variable, then the memory it points to). But in assembler you see the true nature of dataptr: the symbol is the address of the pointer variable. So you first have to load the heap pointer with mov eax,[data], which loads eax with 0xd38050 (in MSVC inline assembly mov eax,data reads the variable as well, the brackets are optional), and then you can load the contents of 0xd38050 into xmm0 by using [eax].

With a local array there is no separate variable holding its address; the symbol datalocal is already the address of the first element, so movdqu xmm0,[datalocal] will work.

In the "wrong" case you can still execute movdqu xmm0,[data]; the CPU has no problem loading 128 bits from a 32-bit variable. It simply continues reading past the 32 bits of the pointer and picks up another 96 bits belonging to other variables or code. If the variable happens to sit right at the end of the last mapped memory page of the application, the load will crash on an invalid access.
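
To make the two loads concrete without inline assembly, here is a rough model in plain C++. It is a sketch under assumptions, not what the compiler generates: Reg128 stands in for xmm0, load128 imitates a 16-byte load with memcpy, and PointerSlot (a made-up struct) places 16 bytes of padding after the pointer so that the "wrong" read stays inside one object.

```cpp
#include <cassert>
#include <cstring>

// Stand-in for xmm0: 16 raw bytes.
struct Reg128 { unsigned char bytes[16]; };

// Models movdqu reg,[address]: copy 16 bytes starting at `address`.
Reg128 load128(const void* address) {
    Reg128 r;
    std::memcpy(r.bytes, address, sizeof r.bytes);
    return r;
}

// The pointer variable followed by whatever sits next to it in memory.
// The padding is an assumption that keeps the 16-byte read inside this object.
struct PointerSlot {
    char* data;
    unsigned char neighbours[16];
};

// Like movdqu xmm0,[data]: reads the pointer variable and its neighbours.
Reg128 load_through_symbol(const PointerSlot& slot) { return load128(&slot); }

// Like mov eax,data / movdqu xmm0,[eax]: reads what the pointer points at.
Reg128 load_through_pointer(const PointerSlot& slot) { return load128(slot.data); }

// Hypothetical demo of both loads.
void check_loads() {
    char* text = new char[33];
    std::memset(text, 'a', 32);
    text[32] = 0;
    PointerSlot slot = { text, {0} };

    Reg128 wrong = load_through_symbol(slot);
    char* recovered;
    std::memcpy(&recovered, wrong.bytes, sizeof recovered);
    assert(recovered == text);               // the first bytes are the heap address itself

    Reg128 right = load_through_pointer(slot);
    for (unsigned char b : right.bytes) assert(b == 'a');
    delete[] text;
}
```

In the wrong load the first bytes of the result are the heap address, followed by unrelated neighbouring bytes, which is the same kind of value the question shows in xmm0.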

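edit: alignment was mentioned a few times in the comments. That's a valid point: movdqu itself accepts any address, but the aligned form movdqa requires a 16-byte-aligned address and faults otherwise, so it is a good idea to align the data. Check your C++ compiler's intrinsics and extensions; for Visual C++ this should work:

__declspec(align(16)) char datalocal[33];
char *dataptr = (char *)_aligned_malloc(33, 16);  // from <malloc.h>; release with _aligned_free(dataptr)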
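
As an alternative to inline assembly, the same loads can be written with the SSE2 intrinsics from <emmintrin.h>, which take the pointer value directly, so the address-versus-contents confusion does not arise. A minimal sketch, assuming an x86 target with SSE2 enabled; the wrapper names (is_aligned16, load16_unaligned, load16_aligned, store16) are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <emmintrin.h>  // SSE2: __m128i, _mm_loadu_si128, _mm_load_si128, _mm_storeu_si128

bool is_aligned16(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}

// Unaligned load (typically emitted as movdqu): any address is acceptable.
__m128i load16_unaligned(const char* data) {
    return _mm_loadu_si128(reinterpret_cast<const __m128i*>(data));
}

// Aligned load (typically emitted as movdqa): the address must be a multiple of 16.
__m128i load16_aligned(const char* data) {
    assert(is_aligned16(data));
    return _mm_load_si128(reinterpret_cast<const __m128i*>(data));
}

// Write the register back to memory so the bytes can be inspected.
void store16(char* out, __m128i value) {
    _mm_storeu_si128(reinterpret_cast<__m128i*>(out), value);
}
```

With a buffer declared as alignas(16) char buf[33]; (the standard C++11 spelling of the alignment request), load16_aligned(buf) is valid, while buf + 1 has to go through load16_unaligned.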