shelladept - 1 year ago
Linux Question

Why do the 32-bit and 64-bit Compiled Versions of this Program Populate Memory in this Way?

I am trying to better understand how the stack and heap work, and I have run into a snag when comparing the 32-bit and 64-bit compiled versions of the same program. In both cases I used a Fedora 15 guest VM (one 32-bit, one 64-bit), gcc for compiling, gdb for debugging, and the same host hardware. The program in question is very simple and appears immediately below:

C program

void function(int a, int b, int c, int d){
    int value;
    char buffer[10];

    value = 1234;
    buffer[0] = 'A';
}

int main(){
    function(1, 2, 3, 4);
    return 0;
}

In the interest of space, I omitted the assembly dump of the program; however if anyone thinks it might help them answer my questions, I'd be happy to include it.

32-bit Compiled Program:

The parameters 4 (at 0xbffff3e4), 3 (0xbffff3e0), 2 (0xbffff3dc), and 1 (0xbffff3d8) are pushed onto the stack first. Next, the return address (0x080483d1), which is the location of the instruction following the call to function(), is placed on the stack. Finally, the base pointer of the previous stack frame (0xbffff3e8) is pushed onto the stack.

(gdb) x/16xw $esp
0xbffff3c0: 0x00000000 0x410759c3 0x4105d237 0x00000000
0xbffff3d0: 0xbffff3e8 0x080483d1 0x00000001 0x00000002//pointers
0xbffff3e0: 0x00000003 0x00000004 0x00000000 0x4105d413//followed by params
0xbffff3f0: 0x00000001 0xbffff484 0xbffff48c 0x41040fc4

64-bit Compiled Program:

Here, however, the values 4, 3, 2, and 1 are nowhere to be seen. No matter how far down the stack I look, all I can see is the return address (0x4004ae) and the previous stack frame's base pointer (0x7fffffffe210).

(gdb) x/16xg $rsp
0x7fffffffe200: 0x00007fffffffe210 0x00000000004004ae //pointers
0x7fffffffe210: 0x0000000000000000 0x00000036d042139d
0x7fffffffe220: 0x0000000000000000 0x00007fffffffe2f8
0x7fffffffe230: 0x0000000100000000 0x0000000000400491
0x7fffffffe240: 0x0000000000000000 0x7ade47f577d82f75
0x7fffffffe250: 0x0000000000400390 0x00007fffffffe2f0
0x7fffffffe260: 0x0000000000000000 0x0000000000000000
0x7fffffffe270: 0x8521b80ab3982f75 0x7ab3e77151682f75

64-bit Compiled Program with print statement:

Now, after adding a simple print statement:

printf("%d, %c\n", value, buffer[0]);

in function(), I can see the wayward parameters (see below, 0x7fffffffe1e0-0x7fffffffe1ec). I can also see the base pointer from the previous stack frame, 0x7fffffffe210 (at 0x7fffffffe200), and the return address, 0x400520 (at 0x7fffffffe208). I believe the return address changed because of the new print statement. Why are 4, 3, 2, and 1 not visible without a print statement in this case? Is the 64-bit implementation of the gcc compiler smart enough not to 'waste' memory on parameters and local variables that are never used?

(gdb) x/16xg $rsp
0x7fffffffe1e0: 0x0000000300000004 0x0000000100000002 //parameters
0x7fffffffe1f0: 0x0000000000000000 0x00000000004003e0
0x7fffffffe200: 0x00007fffffffe210 0x0000000000400520 //pointers
0x7fffffffe210: 0x0000000000000000 0x00000036d042139d
0x7fffffffe220: 0x0000000000000000 0x00007fffffffe2f8
0x7fffffffe230: 0x0000000100000000 0x0000000000400503
0x7fffffffe240: 0x0000000000000000 0xd3c0c92559feaed9
0x7fffffffe250: 0x00000000004003e0 0x00007fffffffe2f0

Finally, why does the 32-bit OS place the parameters 4, 3, 2, and 1 higher in the stack than the previously mentioned pointers, while the 64-bit OS places the parameters lower in the stack than those pointers? I was under the impression that passed parameters were always placed on the stack first (and hence would be at larger memory addresses, since the stack grows toward smaller addresses), followed by the return address and the saved base pointer (so the calling function could be returned to and the base pointer could be reset to its previous value). This is the behavior I am observing in the 32-bit compiled code, but not in the 64-bit version. What am I misunderstanding? I appreciate any insight into this matter and apologize if my questions are unclear. Please let me know how I can be more concise (or if I am factually incorrect at any point).

Thank you in advance.

Answer

The 64-bit ABI used by Linux differs considerably from the 32-bit ABI: in the 64-bit world, arguments are often passed in registers, rather than on the stack.

Before adding the printf(), you're not finding the arguments on the stack because the first (up to) 6 integer or pointer arguments get passed in registers (in the order %rdi, %rsi, %rdx, %rcx, %r8, %r9).

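After adding the printf(), they get saved on the stack where you can see them. Most likely they were being saved in the earlier build as well: unoptimized gcc code typically copies each register argument into a slot below %rbp, and because function() called nothing, it could leave %rsp equal to %rbp and use the 128-byte red zone that the ABI reserves below the stack pointer, so a dump starting at $rsp never reached them. The printf() call makes function() a non-leaf function, so the prologue now moves %rsp below those slots (to 0x7fffffffe1e0 in your dump). Take a look at the assembly; it's probably obvious once you know what the ABI looks like.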
