Jake Jake - 1 month ago 10
C Question

code order with variable length array

In C99, is there a big different between these two?:

int main() {
int n , m;
scanf("%d %d", &n, &m);
int X[n][m];

X[n-1][m-1] = 5;
printf("%d", X[n-1][m-1]);
}


and:

int main(int argc, char *argv[]) {
int n , m;
int X[n][m];
scanf("%d %d", &n, &m);

X[n-1][m-1] = 5;
printf("%d", X[n-1][m-1]);
}


The first one seems to always work, whereas the second one appears to work for most inputs, but gives a segfault for the inputs
5 5
and
6 6
and returns a different value than 5 for the input
9 9
. So do you need to make sure to get the values before declaring them with variable length arrays or is there something else going on here?

Answer

When the second one works, it's pure chance. The fact that it ever works proves that, thankfully, compilers can't yet make demons fly out of your nose.

Declaring a variable doesn't necessarily initialize it. int n, m; leaves both n and m with undefined values in this case, and attempting to access those values is undefined behavior. If the raw binary data in the memory those point to happen to be interpreted into a value larger than the values entered for n and m -- which is very, very far from guaranteed -- then your code will work; if not, it won't. Your compiler could also have made this segfault, or made it melt your CPU; it's undefined behavior, so anything can happen.

For example, let's say that the area of memory that the compiler dedicates to n happened to contain the number 10589231, and m got 14. If you then entered an n of 12 and an m of 6, you're golden -- the array happens to be big enough. On the other hand, if n got 4 and m got 2, then your code will look past the end of the array, and you'll get undefined behavior -- which might not even break, since it's entirely possible that the bits stored in four-byte segments after the end of the array are both accessible to your program and valid integers according to your compiler/the C standard. In addition, it's possible for n and m to end up with negative values, which leads to... weird stuff. Probably.

Of course, this is all fluff and speculation depending on the compiler, OS, time of day, and phase of the moon,1 and you can't rely on any numbers happening to be initialized to the right ones.

With the first one, on the other hand, you're assigning the values through scanf, so (assuming it doesn't error) (and the entered numbers aren't negative) (or zero) you're going to have valid indices, because the array is guaranteed to be big enough because the variables are initialized properly.


Just to be clear, even though variables are required to be zero-initialized under some circumstances doesn't mean you should rely on that behavior. You should always explicitly give variables a default value, or initialize them as soon as possible after their declaration (in the case of using something like scanf). This makes your code clearer, and prevents people from wondering if you're relying on this type of UB.


1: Source: Ryan Bemrose, in chat

Comments