hmijail - 24 days ago
C Question

Underallocating memory for a union

Given this declaration:

struct s1 {
    int type;
    union u1 {
        char c;
        int i[10000];
    } u;
} s;


I'm wondering whether we can allocate less memory for the struct than sizeof(struct s1) would suggest:

struct s1 * s_char = malloc(sizeof(int)+sizeof(char));


On one hand, this seems intuitive: if one knows that the code will never reach past the char s_char->u.c, then allocating the whole sizeof(struct s1) looks like a big waste.
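To make the size gap concrete, here is a minimal sketch that only compares the numbers involved. The helper names bytes_up_to_char and print_sizes are hypothetical, and the printed values are implementation-defined. It says nothing about whether the smaller allocation is actually valid; it only shows how much memory is at stake, and that offsetof is a more robust way to compute the small size than sizeof(int)+sizeof(char), since padding before u is permitted.

```c
#include <stddef.h>
#include <stdio.h>

struct s1 {
    int type;
    union u1 {
        char c;
        int i[10000];
    } u;
};

/* Hypothetical helper: bytes from the start of the struct to the end of u.c.
 * Every union member starts at offset 0 of the union, so this is the offset
 * of u plus one char. */
size_t bytes_up_to_char(void)
{
    return offsetof(struct s1, u) + sizeof(char);
}

/* Hypothetical helper: prints the sizes being compared in the question. */
void print_sizes(void)
{
    printf("sizeof(struct s1)          = %zu\n", sizeof(struct s1));
    printf("offsetof(u) + sizeof(char) = %zu\n", bytes_up_to_char());
    printf("sizeof(int) + sizeof(char) = %zu\n", sizeof(int) + sizeof(char));
}
```

Calling print_sizes() from main shows the three values side by side for a given compiler and target.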

On the other hand, my reading of the C11 standard is that it is against this, BUT that is never spelled out. The two passages I have found that can be read as forbidding it are these:


  • If the implementation somehow assumes that the struct's full size has been allocated, this opens the door to undefined behavior: a new object can be allocated just after the block that s_char points to, but still inside the "real" sizeof(struct s1) bytes that the struct is assumed to occupy. That would then trigger item 54 of Annex J.2 of the C11 standard, which lists as UB:

    An object is assigned to an inexactly overlapping object or to an
    exactly overlapping object with incompatible type (6.5.16.1).



  • 6.2.6.1 paragraph 7:

    When a value is stored in a member of an object of union type, the
    bytes of the object representation that do not correspond to that
    member but do correspond to other members take unspecified values.


But this can be read in two ways: either the standard simply declines to say what happens to those bytes, or it is saying that those bytes can actually be expected to change arbitrarily, which would mean writes past the end of an underallocated block.

In summary, there is an intuition of "but we're only using 5 bytes!" against language-lawyerly caution, and neither amounts to proof. My question is: is there any further evidence for either side? More concretely: is it ever OK to underallocate memory for a union or any other data structure?

Again: intuition is what raised the problem, so I don't want more of it. I am looking for an answer reasoned from reliable sources, like the C11 standard and/or compiler documentation. Also, I already know that the standard way to do this is to replace the struct-with-union with a union-of-structs sharing a Common Initial Sequence, though that is not without risks either. But that is tangential here.
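For reference, one possible shape of that alternative is sketched below. All names (s_small, s_big, s_any, TYPE_CHAR, TYPE_INTS, make_small, get_type) are hypothetical. Note that this sketch deliberately sidesteps the risky part: it allocates the small struct at its own size and never accesses it through the union type, and it reads the tag by relying on the rule that a pointer to a struct, suitably converted, points to its first member, rather than on the Common Initial Sequence rule itself.

```c
#include <stdlib.h>

enum { TYPE_CHAR = 1, TYPE_INTS = 2 };   /* hypothetical tag values */

/* Both structs start with the same member, forming a common initial sequence. */
struct s_small { int type; char c; };
struct s_big   { int type; int i[10000]; };

/* The union only needs to be used when the variant is not known in advance. */
union s_any {
    struct s_small small;
    struct s_big   big;
};

/* Allocates exactly sizeof(struct s_small); the object is a complete
 * struct s_small, so nothing is underallocated. Returns NULL on failure. */
struct s_small *make_small(char c)
{
    struct s_small *p = malloc(sizeof *p);
    if (p != NULL) {
        p->type = TYPE_CHAR;
        p->c = c;
    }
    return p;
}

/* Reads the tag of either struct: a pointer to a struct, converted to a
 * pointer to its first member's type, points to that first member. */
int get_type(const void *obj)
{
    return *(const int *)obj;
}
```

A caller would then branch on get_type(p) and convert p to the matching struct type, never to union s_any *, for objects that were allocated at the smaller size.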

Answer

It looks like the GCC maintainers consider underallocating memory for a union to be UB, as seen (somewhat tangentially) in this bug report. No standard-based explanation is given there, but it still implies that the compiler can't be expected to support it, so there is little point in looking further.