crnlx crnlx - 1 month ago 11
C Question

Unexpected output from Bubblesort program with MSVC vs TCC

One of my friends sent this code to me, saying it doesn't work as expected:

#include<stdio.h>

void main()
{
int a [10] ={23, 100, 20, 30, 25, 45, 40, 55, 43, 42};
int sizeOfInput = sizeof(a)/sizeof(int);
int b, outer, inner, c;
printf("Size is : %d \n", sizeOfInput);

printf("Values before bubble sort are : \n");
for ( b = 0; b &lt; sizeOfInput; b++)
printf("%d\n", a[b]);

printf("End of values before bubble sort... \n");

for ( outer = sizeOfInput; outer > 0; outer-- )
{
for ( inner = 0 ; inner < outer ; inner++)
{
printf ( "Comparing positions: %d and %d\n",inner,inner+1);
if ( a[inner] > a[inner + 1] )
{
int tmp = a[inner];
a[inner] = a [inner+1];
a[inner+1] = tmp;
}
}
printf ( "Bubble sort total array size after inner loop is %d :\n",sizeOfInput);
printf ( "Bubble sort sizeOfInput after inner loop is %d :\n",sizeOfInput);
}
printf ( "Bubble sort total array size at the end is %d :\n",sizeOfInput);
for ( c = 0 ; c < sizeOfInput; c++)
printf("Element: %d\n", a[c]);
}





I am using Micosoft Visual Studio Command Line Tool for compiling this on a Windows XP machine.
cl /EHsc bubblesort01.c



My friend gets the correct output on a dinosaur machine (code is compiled using TCC there).

My output is unexpected. The array mysteriously grows in size, in between.

If you change the code so that the variable
sizeOfInput
is changed to
sizeOfInputt
, it gives the expected results!

A search done at Microsoft Visual C++ Developer Center doesn't give any results for "sizeOfInput".

I am not a C/C++ expert, and am curious to find out why this happens - any C/C++ experts who can "shed some light" on this?

Unrelated note: I seriously thought of rewriting the whole code to use quicksort or merge sort before posting it here. But, after all, it is not Stooge sort...

Edit: I know the code is not correct (it reads beyond the last element), but I am curious why the variable name makes a difference.

Answer Source

Like interjay's answer mentioned, once you get into undefined behavior all bets are off. However, when you said that merely renaming the variable changed the behavior of the program, I got curious about what was going on (undefined or not).

First, I didn't believe that renaming the variable would change the compiler's output (all other things being equal), but sure enough - when I tried it, I was surprised to see exactly what you described.

So I had the compiler dump the assembly for the code it was generating for each version of the source file, and ran a comparison. Here's what I found in the compilers description of how it was laying out the local variables:

    ***** test.sizeOfInput.cod
    _c$ = -56                    ; size = 4
    _b$ = -52                    ; size = 4
    _inner$ = -48                ; size = 4
    _a$ = -44                    ; size = 40
>>> _sizeOfInput$ = -4           ; size = 4
    _main   PROC
    ***** test.sizeOfInputt.cod
    _c$ = -56                    ; size = 4
>>> _sizeOfInputt$ = -52         ; size = 4
    _b$ = -48                    ; size = 4
    _inner$ = -44                ; size = 4
    _a$ = -40                    ; size = 40
    _main   PROC
    *****

What you'll notice is that when the variable is named sizeOfInput, he compiler places it at a higher address than array a (just past the end of the array), and when the variable is named sizeOfInputt it places it at a lower address than array a instead of just past the end of the array. That means that in the build that has the variable named sizeOfInput, the undefined behavior that occurs when you modify a[10] is changing the value of sizeOfInput. In the build that uses the name sizeOfInputt, since that variable isn't at the end of the array, the write to a[10] trashes something else.

As to why the compiler would lay out the variables differently when the name of one changes in an apparently insignificant way - I have no idea.

But this is a good example of why you shouldn't count on the layout of local variables (or pretty much any variables, though you can count on the layout order of struct elements), and why when it comes to undefined behavior, "it works on my machine" doesn't cut it as proof that something works.