Mehrdad - 1 year ago
C Question

(Why) is using an uninitialized variable undefined behavior in C?

If I have:

unsigned int x;
x -= x;

it's clear that x should be zero after this expression, but everywhere I look, they say the behavior of this code is undefined, not merely the value of x (until before the subtraction).

Two questions:

  • Is the behavior of this code indeed undefined?

    (E.g. Might the code crash [or worse] on a compliant system?)

  • If so, why does C say that the behavior is undefined, when it is perfectly clear that x should be zero here?

    i.e. What is the advantage given by not defining the behavior here?

    Clearly, the compiler could simply use whatever garbage value it deemed "handy" inside the variable, and it would work as intended... what's wrong with that approach?

Answer Source

Yes, this behavior is undefined, but for different reasons than most people are aware of.

First, using an uninitialized value is not by itself undefined behavior; the value is simply indeterminate. Accessing it is then UB if the value happens to be a trap representation for the type. Unsigned types rarely have trap representations, so you would be relatively safe on that side.
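A rough way to see why trap representations are rare for unsigned types: an unsigned integer type can only have one if it has padding bits, that is, bits in its object representation that do not contribute to the value. The sketch below (the helper names are made up for illustration) counts the value bits of a type from its maximum value and compares that with the size of the object. It only detects padding bits; it says nothing about which bit patterns, if any, would actually trap.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper: number of value bits in an unsigned type
   whose maximum value is max. */
unsigned count_value_bits(uintmax_t max)
{
    unsigned bits = 0;
    for (; max != 0; max >>= 1)
        ++bits;
    return bits;
}

/* Hypothetical helper: returns 1 if unsigned int has no padding bits
   (and therefore no trap representations), 0 otherwise. */
int uint_has_no_padding_bits(void)
{
    /* unsigned char is guaranteed to consist of value bits only. */
    assert(count_value_bits(UCHAR_MAX) == CHAR_BIT);

    return count_value_bits(UINT_MAX) == sizeof(unsigned int) * CHAR_BIT;
}
```

On mainstream platforms uint_has_no_padding_bits() can be expected to return 1, but the standard does not promise that.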

What makes the behavior undefined is an additional property of your variable, namely that it "could have been declared with register", that is, its address is never taken. Such variables are treated specially because there are architectures whose real CPU registers have a sort of extra state that means "uninitialized" and that doesn't correspond to a value in the type domain.

Edit: The relevant phrase of the standard (C11, 6.3.2.1p2) is:

If the lvalue designates an object of automatic storage duration that could have been declared with the register storage class (never had its address taken), and that object is uninitialized (not declared with an initializer and no assignment to it has been performed prior to use), the behavior is undefined.
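To make the scope of that sentence concrete, here is a small sketch (the function names are hypothetical). The first function initializes x before reading it, so the quoted rule does not apply and the result is 0 whatever the initial value is. The second is the code from the question; it is kept under #if 0 because, per the quoted sentence, reading x there is undefined.

```c
#include <assert.h>

/* Well-defined: x is assigned before it is read. */
unsigned int subtract_initialized(void)
{
    unsigned int x = 42;  /* arbitrary value; any initializer would do */
    x -= x;
    assert(x == 0);       /* holds for every initial value */
    return x;
}

#if 0
/* Undefined per the quoted sentence: x has automatic storage duration,
   its address is never taken, and it is read before any assignment.
   Disabled so that this file never executes it. */
unsigned int subtract_uninitialized(void)
{
    unsigned int x;
    x -= x;
    return x;
}
#endif
```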

And to make it clearer, the following code is legal under all circumstances:

unsigned char a, b;
memcpy(&a, &b, 1);
a -= a;

  • Here the addresses of a and b are taken, so their value is just indeterminate.
  • Since unsigned char never has trap representations, that indeterminate value is just unspecified: any value of unsigned char could happen.
  • At the end a must hold the value 0.

Edit2: a and b have unspecified values:

3.19.3 unspecified value
valid value of the relevant type where this International Standard imposes no requirements on which value is chosen in any instance