FrozenHeart - 2 months ago
C Question

Is passing an uninitialized variable to another function UB?

Is it true that just passing an uninitialized variable to a function results in undefined behavior?

It seems really weird to me.

Suppose that we have the following code:

void open_db(db* conn)
{
  // Open database connection and store it in the conn
}

int main()
{
  db* conn;
  open_db(conn);
}


It seems perfectly legal to me. It neither dereferences the uninitialized variable nor relies on its state. It just passes an uninitialized pointer to another function that stores some data in it via operator new or something like this.

If it's UB, could you quote the exact place where the Standard says so?

And is it also true for other types like int?

void foo(int bar)
{
  // ...
}

int main()
{
  int bar;
  foo(bar); // UB?
}

Answer

It is UB, and the type of the argument does not matter. The relevant bits of C99 are: when you declare a variable with "automatic storage duration" but don't initialize it, its value is indeterminate (6.2.4p5, 6.7.8p10); any use of an indeterminate value provokes undefined behavior (J.2 refers to 6.2.4, 6.7.8, and 6.8)1.

And even if it weren't UB (for instance, if conn had been initialized), this code would not have the effect you seem to expect. C passes arguments by value, so open_db receives a copy of the pointer; as written, it cannot modify the variable conn in its caller.

A slight variation on your code is valid whether or not conn is initialized, and does do what you expect it to do, though:

void open_db(db **conn)
{
  *conn = internal_open_db();
}

int main()
{
  db *conn;
  open_db(&conn);
}

The address-of operator, unary &, is one of the very few things in the language that does not provoke undefined behavior when applied to an uninitialized variable, because it does not read the value of the variable; it only determines the variable's memory location. That address is a determinate value that can safely be passed to open_db (but note that its type signature has changed: it now receives a pointer to a pointer to a db). open_db can then use the pointer-dereference operator, unary *, to write a result into the variable.

In C++ only, this very common pattern receives a bit of syntactic sugar:

void open_db(db *&conn)
{
  conn = internal_open_db();
}

int main()
{
  db *conn;
  open_db(conn);
}

Changing the second star to an ampersand makes the conn argument to open_db now a "reference" to a pointer. It's still a pointer to a pointer "under the hood", but the compiler fills in the & and * operators for you as necessary.


1 For my fellow language lawyers: Annex J is non-normative, and I can't find any normative statement backing up its assertion that using an indeterminate value is always UB. (It might help if I could find a definition of what it means to "use a value" in the first place. I believe the intent was anything that triggers 6.3.2.1p2 "lvalue conversion", but I don't think that's ever actually stated.)

The definition of an "indeterminate value" is "an unspecified value or a trap representation"; using an unspecified value does not provoke UB. Using a trap representation does provoke UB, but not all types have trap reps. C11, but not C99, has a sentence in 6.3.2.1p2 that states quite baldly "if [the code reads a value from] an object of automatic storage duration that could have been declared with the register storage class (never had its address taken), and that object is uninitialized, the behavior is undefined" -- but note that it doesn't use the term-of-art "indeterminate value" here, and it restricts the rule to variables whose address is not taken.

However, C compilers absolutely do treat reading any uninitialized variable as UB regardless of whether its type has trap reps or whether its address has been taken, and J.2 certainly reflects the intent of the committee, as do a number of examples in clause 7 where the word "indeterminate" appears solely to point out that reading some variable is UB.