Innocent Bystander - 2 months ago
C++ Question

Is it legal to cast to enum values not representable by enum?

Given

enum class val { foo = 1, bar = 2, baz = 4 };


It is possible to define:

val operator|(val x, val y)
{
    return static_cast<val>(static_cast<int>(x) | static_cast<int>(y));
}


However, is it semantically correct to do so?

I am leaning towards no, as demonstrated in the following, seemingly well-behaving example:

int convert(val x)
{
    switch(x)
    {
    case val::foo: return 42;
    case val::bar: return 53;
    case val::baz: return 64;
    }
}


Calling convert(val::foo | val::bar) returns 0 when compiled with g++ and causes a segmentation fault when compiled with clang++.

Here is the g++ version, and here is the clang++ version.

My question is two-fold:


  1. Is it semantically correct to store values in an enum that are not represented by an enumerator? Excerpts from the standard are most welcome.

     1.a Which compiler is correct in the above linked examples, g++ or clang++?

  2. Is there a standard (or proposed) way to represent flags in C++?



I can think of several possible implementations:

enum class val { foo, bar, baz, size };
using val_flags = std::set<val>; // (1)
using val_flags = std::vector<bool>; // (2)
using val_flags = std::bitset<static_cast<std::size_t>(val::size)>; // (3)
using val_flags = std::underlying_type<val>::type; // (4)
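As an illustration of option (3), here is a minimal sketch of a bitset-backed flag set. The helpers index_of, flag and has are hypothetical names introduced for this example, and the enumerators are treated as consecutive indices rather than bit masks. A scoped enumerator does not convert implicitly to an integer, so the bitset size needs an explicit cast.

```cpp
#include <bitset>
#include <cstddef>

// Enumerators are consecutive indices here (0, 1, 2), not bit masks;
// `size` is a sentinel that records how many flags exist.
enum class val { foo, bar, baz, size };

// Hypothetical helper: convert an enumerator to its bit position.
constexpr std::size_t index_of(val v)
{
    return static_cast<std::size_t>(v);
}

using val_flags = std::bitset<index_of(val::size)>;

// Hypothetical helper: build a flag set with exactly one flag turned on.
constexpr val_flags flag(val v)
{
    return val_flags(1ULL << index_of(v));
}

// Hypothetical helper: query one flag. The const overload of
// std::bitset::operator[] is constexpr since C++11.
constexpr bool has(const val_flags& flags, val v)
{
    return flags[index_of(v)];
}
```

Flag sets can then be combined with the operators std::bitset already provides, for example flag(val::foo) | flag(val::baz), without ever storing a non-enumerator value in val itself.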


UPDATE:

Thank you all for your answers. I ended up resurrecting my old enum operator template. In case anybody is interested, it can be found here: github.com

Answer

"... following, seemingly well-behaving example:"

It's not, but make one minor change:

int convert(val x)
{
    switch(x)
    {
    case val::foo: return 42;
    case val::bar: return 53;
    case val::baz: return 64;
    }

    return 9; // ADDED THIS LINE
}

and all will be well. An alternate fix would be to use a default: case and return there.
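The default: alternative could look roughly like the sketch below. The sentinel value -1 is an arbitrary choice made for this example, and the functions are marked constexpr (which needs C++14 for the switch) only so that the result can be checked at compile time.

```cpp
enum class val { foo = 1, bar = 2, baz = 4 };

constexpr val operator|(val x, val y)
{
    return static_cast<val>(static_cast<int>(x) | static_cast<int>(y));
}

constexpr int convert(val x)
{
    switch (x)
    {
    case val::foo: return 42;
    case val::bar: return 53;
    case val::baz: return 64;
    default:       return -1; // arbitrary sentinel for any value without an enumerator
    }
}
```

With this version every path returns a value, so convert(val::foo | val::bar) yields the sentinel instead of flowing off the end of the function.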

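Your existing code triggers undefined behavior [1] by reaching the closing brace of a function with a non-void return type. Because the behavior is undefined, both compilers are correct: the standard places no requirements on what the program does, so returning 0 and crashing are equally permissible.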

The semantics of holding values in an enum type that are bitwise-OR combinations of enumerator values are well-defined and guaranteed. The standard requires that an object of the enum type can store any integer value that uses no more bits than the largest enumerator value, which includes all bitwise-OR combinations. The formal language used to say this is a bit messy, but here it is (note that your case is an enum class; these always have a fixed underlying type, so the first sentence applies):

For an enumeration whose underlying type is fixed, the values of the enumeration are the values of the underlying type. Otherwise, for an enumeration where e_min is the smallest enumerator and e_max is the largest, the values of the enumeration are the values in the range b_min to b_max, defined as follows: Let K be 1 for a two’s complement representation and 0 for a ones’ complement or sign-magnitude representation. b_max is the smallest value greater than or equal to max(|e_min| − K, |e_max|) and equal to 2^M − 1, where M is a non-negative integer. b_min is zero if e_min is non-negative and −(b_max + K) otherwise. The size of the smallest bit-field large enough to hold all the values of the enumeration type is max(M, 1) if b_min is zero and M + 1 otherwise. It is possible to define an enumeration that has values not defined by any of its enumerators. If the enumerator-list is empty, the values of the enumeration are as if the enumeration had a single enumerator with value 0.

(from n4582, section 7.2 [dcl.enum])
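To make the quoted rule concrete, here is a small sketch covering both cases. The first part shows the fixed-underlying-type case from the question: val is an enum class, so its underlying type is int and 3 is a value of val even though no enumerator names it. The second part works through the other branch for non-negative enumerators only (so the lower bound is zero and K plays no role); b_max_for is a hypothetical helper written for this illustration, not something the standard or any library provides.

```cpp
enum class val { foo = 1, bar = 2, baz = 4 };

constexpr val operator|(val x, val y)
{
    return static_cast<val>(static_cast<int>(x) | static_cast<int>(y));
}

// Fixed underlying type: the combined value survives the round trip
// through val and comes back as the integer 3.
constexpr int round_trip = static_cast<int>(val::foo | val::bar);

// Non-fixed underlying type, non-negative enumerators only:
// find the smallest value of the form 2^M - 1 that is >= e_max.
// `candidate` walks through 0, 1, 3, 7, 15, ...
constexpr unsigned long long b_max_for(unsigned long long e_max,
                                       unsigned long long candidate = 0)
{
    return candidate >= e_max ? candidate
                              : b_max_for(e_max, candidate * 2 + 1);
}
```

For enumerators 1, 2 and 4 the largest enumerator is 4, so b_max_for(4) gives 7 (M = 3) and the range of values would be 0 to 7. Every bitwise-OR combination of the three enumerators falls inside that range.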


[1] From 6.6.3 [stmt.return]:

Flowing off the end of a constructor, a destructor, or a function with a cv void return type is equivalent to a return with no operand. Otherwise, flowing off the end of a function other than main (3.6.1) results in undefined behavior.