zac zac - 1 year ago 157
C Question

Packing and pointer aliasing, C and C++

union vec
{
#pragma pack(push,1)
struct
{
float x, y, z;
};
#pragma pack(pop)
float vals[3];
};


Consider the above definition (setting aside that anonymous structs are not part of C99).

I suppose the answer to this question may differ depending on the compiler, the language, and the standard.


  1. I believe I am guaranteed (via #pragma compiler documentation, not language guarantee) that
    sizeof(vec) == 3*sizeof(float)

  2. As such, I believe I am guaranteed that
    &vec.x == &vec.vals[0]
    , etc.

  3. However, I am unsure whether it is legal (that is, not forbidden by strict aliasing) to write to
    v.x
    and then read from
    v.vals[0]



Packing aside, I believe the relevant verbiage (from the C99 standard, at least) is:



  • a type compatible with the effective type of the object,

  • an aggregate or union type that includes one of the aforementioned
    types among its members (including, recursively, a member of a
    subaggregate or contained union), or



Answer Source
  1. I believe I am guaranteed (via #pragma compiler documentation, not language guarantee) that sizeof(vec) == 3*sizeof(float)

Yes that's correct, assuming the #pragma disabled padding entirely.


  2. As such, I believe I am guaranteed that &vec.x == &vec.vals[0], etc.

This is guaranteed regardless of padding, because there can never be padding at the beginning of the struct/union. See for example C11 6.7.2.1 §15:

There may be unnamed padding within a structure object, but not at its beginning.

This holds true for all versions of the C standard, and as far as I know, also for all versions of the C++ standard.
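The no-leading-padding rule can be checked directly. The following is a minimal sketch, not code from the question: it names the struct member `s` (a hypothetical name) so that it also compiles as C99, and it leaves out `#pragma pack` on purpose, since the guarantee does not depend on packing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical variant of the question's union: the inner struct is a
   named member `s` instead of an anonymous struct. */
struct xyz { float x, y, z; };

union vec {
    struct xyz s;
    float vals[3];
};

/* Returns 1 if the first struct member and the first array element start
   at the same address, 0 otherwise. */
int first_members_coincide(void)
{
    union vec v;

    return offsetof(struct xyz, x) == 0      /* no padding before x      */
        && offsetof(union vec, s) == 0       /* union members start at 0 */
        && offsetof(union vec, vals) == 0
        && (void *)&v.s.x == (void *)&v.vals[0];
}
```

Only the first element is covered by the language rule; whether `y` and `z` line up with `vals[1]` and `vals[2]` still depends on the absence of internal padding.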


  3. However, I am unsure whether it is legal (that is, not forbidden by strict aliasing) to write to v.x and then read from v.vals[0]

This is fine in C but undefined behavior in C++.

In C, the . and -> operators guarantee this (C11 6.5.2.3):

A postfix expression followed by the . operator and an identifier designates a member of a structure or union object. The value is that of the named member, (95) and is an lvalue if the first expression is an lvalue.

Where footnote 95 (informative, not normative) says:

95) If the member used to read the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called ‘‘type punning’’). This might be a trap representation.

C++ has no such guarantee, so "type punning" through unions is undefined behavior in C++. This is a major difference between the two languages.
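As a rough illustration of the C behavior described above, here is a small sketch that writes through the struct member and reads through the array. It is meant to be compiled as C only; the member name `s` and the function name are assumptions made for the example, not part of the question.

```c
#include <assert.h>

/* Hypothetical variant of the question's union with a named struct member. */
union vec {
    struct { float x, y, z; } s;
    float vals[3];
};

/* Stores a value through s.x, then reads it back through vals[0].
   In C this reinterprets the stored bytes as the type of the member that
   is read (C11 6.5.2.3, footnote 95). Both members are float here, so the
   value comes back unchanged. */
float write_x_read_vals0(float value)
{
    union vec v;

    v.s.x = value;
    return v.vals[0];
}
```

Calling `write_x_read_vals0(1.5f)` returns `1.5f` when built as C. The same source compiled as C++ would read a member other than the one last written, which is the case the answer describes as undefined.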

Furthermore, C has the concept of common initial sequence for unions, also specified in C11 6.5.2.3:

One special guarantee is made in order to simplify the use of unions: if a union contains several structures that share a common initial sequence (see below), and if the union object currently contains one of these structures, it is permitted to inspect the common initial part of any of them anywhere that a declaration of the completed type of the union is visible. Two structures share a common initial sequence if corresponding members have compatible types (and, for bit-fields, the same widths) for a sequence of one or more initial members.
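A short sketch of the common initial sequence rule follows. The `circle`, `rect` and `shape` types are invented for illustration. Note that the union in the question holds one struct and one array rather than several structs, so this particular rule does not seem to cover it directly.

```c
#include <assert.h>

/* Two hypothetical structs sharing the common initial sequence `int kind`. */
struct circle { int kind; float radius; };
struct rect   { int kind; float w, h; };

union shape {
    struct circle c;
    struct rect   r;
};

/* Inspects the shared leading member through `c`, whichever struct the
   union currently holds. The union type is visible here, as the rule
   requires. */
int shape_kind(const union shape *s)
{
    return s->c.kind;
}

/* Stores a rect, then reads its kind through the circle member. */
int kind_of_rect_via_circle(void)
{
    union shape s;

    s.r.kind = 2;
    s.r.w = 3.0f;
    s.r.h = 4.0f;
    return shape_kind(&s);
}
```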


It is true that the array and the struct in your example may alias, because of the part you cited: "an aggregate or union type that includes one of the aforementioned types among its members". So writing to the struct and then reading that data through the array does not violate strict aliasing in either C or C++.

However, C++ has the concept of an "active member" when dealing with unions, so in C++ this would give poorly specified behavior for reasons other than aliasing, namely that C++ only guarantees that the last-written member of the union can be safely read.
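If the code has to build as both C and C++, one commonly used way around the active-member problem is to copy the bytes with `memcpy` rather than read through a different union member. This is a different technique from the union in the question, sketched here with a hypothetical helper; it assumes the struct has no padding between its members, which the code checks with an assertion.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct xyz { float x, y, z; };

/* Builds a struct from three components, copies its bytes into a float
   array, and returns element i (i must be 0, 1 or 2). No union member is
   read, so the active-member rule is not involved. */
float component_via_copy(float x, float y, float z, size_t i)
{
    struct xyz p;
    float a[3];

    p.x = x;
    p.y = y;
    p.z = z;

    /* Assumption: no padding inside struct xyz on this platform. */
    assert(sizeof p == sizeof a);
    assert(i < 3);

    memcpy(a, &p, sizeof a);
    return a[i];
}
```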
