PSkocik - 4 years ago 53
C Question

Strict aliasing and overlay inheritance

Consider this code example:

#include <stdio.h>

typedef struct A A;

struct A {
int x;
int y;
};

typedef struct B B;

struct B {
int x;
int y;
int z;
};

int main()
{
B b = {1,2,3};
A *ap = (A*)&b;

*ap = (A){100,200}; //a clear http://port70.net/~nsz/c/c11/n1570.html#6.5p7 violation

ap->x = 10; ap->y = 20; //lvalues of types int and int at the right addresses, ergo correct?

printf("%d %d %d\n", b.x, b.y, b.z);
}


I used to think that casting a B* to an A* and using the A* to manipulate the B object was a strict aliasing violation.
But then I realized the standard really only requires that:


An object shall have its stored value accessed only by an lvalue
expression that has one of the following types: 1) a type compatible
with the effective type of the object, (...)


and expressions such as ap->x do have the correct type and address, and the type of ap shouldn't really matter there (or does it?). This would, in my mind, imply that this type of overlay inheritance is correct as long as the substructure isn't manipulated as a whole.

Is this interpretation flawed or ostensibly at odds with what the authors of the standard intended?

Answer

The line with *ap = is a strict aliasing violation: an object of type B is written using an lvalue expression of type A.

Suppose that line were not present and we moved on to ap->x = 10; ap->y = 20;. In this case an lvalue of type int is used to write objects of type int.

There is disagreement about whether this is a strict aliasing violation or not. I think that the letter of the Standard says that it is not, but others (including gcc and clang developers) consider ap->x as implying that *ap was accessed. Most agree that the standard's definition of strict aliasing is too vague and needs improvement.

Sample code using your struct definitions:

void f(A* ap, B* bp)
{
  ap->x = 213;
  ++bp->x;
  ap->x = 213;
  ++bp->x;
}

int main()
{
   B b = { 0 };
   f( (A *)&b, &b );
   printf("%d\n", b.x);
}

For me this outputs 214 at -O2, and 2 at -O3, with gcc. The generated assembly on godbolt for gcc 6.3 was:

f:
    movl    (%rsi), %eax
    movl    $213, (%rdi)
    addl    $2, %eax
    movl    %eax, (%rsi)
    ret

which shows that the compiler has rearranged the function to:

int temp = bp->x + 2;
ap->x = 213;
bp->x = temp;

and therefore the compiler must be assuming that ap->x does not alias bp->x.
