Lazer Lazer - 4 months ago 14
C Question

How are numbers greater than 2^32 handled by a 32 bit machine?

I am trying to understand how calculations involving numbers greater than 2^32 happen on a 32 bit machine.

C code

$ cat size.c
#include<stdio.h>
#include<math.h>

int main() {

printf ("max unsigned long long = %llu\n",
(unsigned long long)(pow(2, 64) - 1));
}
$


gcc output

$ gcc size.c -o size
$ ./size
max unsigned long long = 18446744073709551615
$


Corresponding assembly code

$ gcc -S size.c -O3
$ cat size.s
.file "size.c"
.section .rodata.str1.4,"aMS",@progbits,1
.align 4
.LC0:
.string "max unsigned long long = %llu\n"
.text
.p2align 4,,15
.globl main
.type main, @function
main:
pushl %ebp
movl %esp, %ebp
andl $-16, %esp
subl $16, %esp
movl $-1, 8(%esp) #1
movl $-1, 12(%esp) #2
movl $.LC0, 4(%esp) #3
movl $1, (%esp) #4
call __printf_chk
leave
ret
.size main, .-main
.ident "GCC: (Ubuntu 4.4.3-4ubuntu5) 4.4.3"
.section .note.GNU-stack,"",@progbits
$


What exactly happens on the lines marked #1 to #4?

Is this some kind of string concatenation at the assembly level?

Answer

__printf_chk is a wrapper around printf which checks for stack overflow, and takes an additional first parameter, a flag.

pow(2, 64) - 1 has been optimised to 0xffffffffffffffff as the arguments are constants.
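As an aside, going through floating point to get this value is fragile: 2^64 - 1 is not exactly representable in a 64-bit IEEE double, so whether pow(2, 64) - 1 converts cleanly to unsigned long long can depend on how the compiler evaluates the expression. A minimal sketch of getting the same constant with integers only follows; the function names max_from_limits and max_from_wraparound are made up for illustration and are not part of the original program.

```c
#include <limits.h>

/* Largest unsigned long long, taken straight from <limits.h>. */
unsigned long long max_from_limits(void) {
    return ULLONG_MAX;
}

/* Same value via integer arithmetic only: converting -1 to an
   unsigned type wraps to that type's maximum, which is well defined. */
unsigned long long max_from_wraparound(void) {
    return (unsigned long long)-1;
}
```

Passing either result to printf with %llu should print the same number as in the question on a platform where unsigned long long is 64 bits wide.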

As per the usual calling conventions, the first argument to __printf_chk() (int flag) is a 32-bit value on the stack (at %esp at the time of the call instruction). The next argument, const char * format, is a 32-bit pointer (the next 32-bit word on the stack, i.e. at %esp+4). And the 64-bit quantity that is being printed occupies the next two 32-bit words (at %esp+8 and %esp+12):

pushl   %ebp                 ; prologue
movl    %esp, %ebp           ; prologue
andl    $-16, %esp           ; align stack pointer
subl    $16, %esp            ; reserve 16 bytes for the outgoing arguments
movl    $-1, 8(%esp)   #1    ; store low half of 64-bit argument (a constant) to stack
movl    $-1, 12(%esp)  #2    ; store high half of 64-bit argument (a constant) to stack
movl    $.LC0, 4(%esp) #3    ; store address of format string to stack
movl    $1, (%esp)     #4    ; store "flag" argument to __printf_chk to stack
call    __printf_chk         ; call routine
leave                        ; epilogue
ret                          ; epilogue
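To make the "two 32-bit words" idea concrete, here is a small sketch in C of splitting a 64-bit value into the low and high halves that the two movl instructions store, and joining them back. The helper names low_half, high_half and join_halves are assumptions for illustration; the compiler does this splitting itself and does not call anything like them.

```c
#include <stdint.h>

/* Low 32 bits of a 64-bit value (the word stored at 8(%esp) above). */
uint32_t low_half(uint64_t v) {
    return (uint32_t)(v & 0xffffffffu);
}

/* High 32 bits of a 64-bit value (the word stored at 12(%esp) above). */
uint32_t high_half(uint64_t v) {
    return (uint32_t)(v >> 32);
}

/* Rebuild the 64-bit value from its two 32-bit halves. */
uint64_t join_halves(uint32_t lo, uint32_t hi) {
    return ((uint64_t)hi << 32) | lo;
}
```

For 0xffffffffffffffff both halves are 0xffffffff, which is why both stores in the listing use the immediate $-1.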

The compiler has effectively rewritten this:

printf("max unsigned long long = %llu\n", (unsigned long long)(pow(2, 64) - 1));

...into this:

__printf_chk(1, "max unsigned long long = %llu\n", 0xffffffffffffffffULL);

...and, at runtime, the stack layout for the call looks like this (showing the stack as 32-bit words, with addresses increasing from the bottom of the diagram upwards):

        :                 :
        :     Stack       :
        :                 :
        +-----------------+
%esp+12 |      0xffffffff | \ 
        +-----------------+  } <-------------------------------------.
%esp+8  |      0xffffffff | /                                        |
        +-----------------+                                          |
%esp+4  |address of string| <---------------.                        |
        +-----------------+                 |                        |
%esp    |               1 | <--.            |                        |
        +-----------------+    |            |                        |
                  __printf_chk(1, "max unsigned long long = %llu\n", |
                                                    0xffffffffffffffffULL);
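The example above only passes a 64-bit constant, so no arithmetic shows up in the assembly. For actual calculations, a 32-bit x86 compiler typically works on the two halves separately and propagates the carry between them (an add followed by an add-with-carry). The sketch below imitates that in plain C; the u64_pair type and the add_pairs and pair_to_u64 functions are hypothetical names used only for this illustration.

```c
#include <stdint.h>

/* A 64-bit number held as two 32-bit words (illustrative type). */
typedef struct {
    uint32_t lo;  /* bits 0..31  */
    uint32_t hi;  /* bits 32..63 */
} u64_pair;

/* Add two 64-bit numbers using only 32-bit additions. */
u64_pair add_pairs(u64_pair a, u64_pair b) {
    u64_pair r;
    r.lo = a.lo + b.lo;               /* wraps modulo 2^32 */
    uint32_t carry = (r.lo < a.lo);   /* 1 if the low add wrapped */
    r.hi = a.hi + b.hi + carry;       /* carry feeds the high word */
    return r;
}

/* Convert the pair to a native 64-bit value for checking. */
uint64_t pair_to_u64(u64_pair p) {
    return ((uint64_t)p.hi << 32) | p.lo;
}
```

Adding {lo = 0xffffffff, hi = 0} and {lo = 1, hi = 0} this way gives {lo = 0, hi = 1}, i.e. 2^32: the low word wraps to zero and the carry lands in the high word.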