Aaron McDaid - 2 months ago
C Question

Violation of strict aliasing in C, even without any casting?

How can *i and u.i print different numbers in this code, even though i is defined as int *i = &u.i;? I can only assume that I'm triggering UB here, but I can't see how exactly.

(The ideone demo reproduces the problem if I select 'C' as the language. But as @2501 pointed out, not if 'C99 strict' is the language. Then again, I do get the problem with gcc-5.3.0 -std=c99!)

// gcc -fstrict-aliasing -std=c99 -O2
#include <stdio.h>

union
{
    int i;
    short s;
} u;

int * i = &u.i;
short * s = &u.s;

int main(void)
{
    *i = 2;
    *s = 100;

    printf(" *i = %d\n", *i); // prints 2
    printf("u.i = %d\n", u.i); // prints 100

    return 0;
}


(gcc 5.3.0, with -fstrict-aliasing -std=c99 -O2)

My theory is that 100 is the 'correct' answer, because the write to the union member through the short lvalue *s is defined as such (for this platform/endianness/whatever). But I think that the optimizer doesn't realize that the write to *s can alias u.i, and therefore it thinks that *i=2; is the only line that can affect *i. Is this a reasonable theory?

If *s can alias u.i, and u.i can alias *i, then surely the compiler should think that *s can alias *i? Shouldn't aliasing be 'transitive'?

Finally, I always had this assumption that strict-aliasing problems were caused by bad casting. But there is no casting in this!

(My background is C++, so I hope I'm asking a reasonable question about C here. My (limited) understanding is that, in C99, it is acceptable to write through one union member and then read through another member of a different type.)

Answer

The discrepancy is caused by the -fstrict-aliasing optimization option. Its behavior and possible pitfalls are well described in the GCC documentation:

Pay special attention to code like this:

      union a_union {
        int i;
        double d;
      };

      int f() {
        union a_union t;
        t.d = 3.0;
        return t.i;
      }

The practice of reading from a different union member than the one most recently written to (called “type-punning”) is common. Even with -fstrict-aliasing, type-punning is allowed, provided the memory is accessed through the union type. So, the code above works as expected. See Structures unions enumerations and bit-fields implementation. However, this code might not:

      int f() {
        union a_union t;
        int* ip;
        t.d = 3.0;
        ip = &t.i;
        return *ip;
      }