Alex Lop. - 5 months ago 21

C++ Question

I faced an interesting scenario in which I got different results depending on the right operand type, and I can't really understand the reason for it.

Here is the minimal code:

    #include <iostream>
    #include <cstdint>

    int main()
    {
        uint16_t check = 0x8123U;

        uint64_t new_check = (check & 0xFFFF) << 16;
        std::cout << std::hex << new_check << std::endl;

        new_check = (check & 0xFFFFU) << 16;
        std::cout << std::hex << new_check << std::endl;

        return 0;
    }

I compiled this code with g++ (gcc version 4.5.2) on 64-bit Linux. The output was:

    ffffffff81230000
    81230000

I can't really understand the reason for the output in the first case.

Why would any of the temporary calculation results be promoted to an `int64_t` at some point?

I would accept a result of '0' in both cases if a 16-bit value is shifted 16 bits left in the first place and then promoted to a 64-bit value. I would also accept the second output if the compiler first promotes `check` to `uint64_t`.

But how come `&` with an `int32_t` operand and `&` with a `uint32_t` operand give two different outputs?

Answer

That's indeed an interesting corner case. It only occurs here because you use `uint16_t` for the unsigned type when your architecture uses 32 bits for `int`.

Here is an extract from *Clause 5 Expressions* of draft n4296 for C++14 (emphasis mine):

10 Many binary operators that expect operands of arithmetic or enumeration type cause conversions ... This pattern is called the usual arithmetic conversions, which are defined as follows:

...

(10.5.3) Otherwise, if the operand that has unsigned integer type has rank greater than or equal to the rank of the type of the other operand, the operand with signed integer type shall be converted to the type of the operand with unsigned integer type.

(10.5.4) Otherwise, if the type of the operand with signed integer type can represent all of the values of the type of the operand with unsigned integer type, the operand with unsigned integer type shall be converted to the type of the operand with signed integer type.

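The reasoning of 10.5.4 applies to your first expression (strictly speaking, the conversion already happens in the integral promotions that are performed on both operands before these rules are consulted): `uint16_t` is only 16 bits while `int` is 32, so `int` can represent all the values of `uint16_t`.

So the `uint16_t check = 0x8123U` operand is converted to the signed `int` value `0x8123`, and the result of the bitwise `&` with `0xFFFF` is still the `int` 0x8123.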
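The two result types can be checked at compile time. This is only a sketch, assuming a platform with a 32-bit `int` like the one in the question; `masked` is a hypothetical helper introduced for illustration, not something from the original code.

```cpp
#include <cstdint>
#include <type_traits>

// Assumption: int is 32 bits wide, as on the asker's platform.
static_assert(sizeof(int) == 4, "this sketch assumes a 32-bit int");

// The uint16_t operand is promoted to int and 0xFFFF is already an int,
// so the & is evaluated in signed int and yields a signed int.
static_assert(std::is_same<decltype(std::uint16_t{} & 0xFFFF), int>::value,
              "uint16_t & int literal yields int");

// 0xFFFFU is an unsigned int, so the promoted int operand is converted
// to unsigned int and the whole expression stays unsigned.
static_assert(std::is_same<decltype(std::uint16_t{} & 0xFFFFU), unsigned int>::value,
              "uint16_t & unsigned literal yields unsigned int");

// Hypothetical helper: the promotion changes the type, not the value.
constexpr int masked(std::uint16_t check) { return check & 0xFFFF; }

static_assert(masked(0x8123U) == 0x8123, "masking keeps the value 0x8123");
```

If the file compiles, each of the claims above holds for that compiler and platform.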

But the shift (bitwise, so it happens at the representation level) causes the result to be the intermediate unsigned 0x81230000, which converted to an `int` gives a negative value (technically this is implementation defined, but wrapping into the sign bit is what common implementations do):

5.8 Shift operators [expr.shift]

...

Otherwise, if E1 has a signed type and non-negative value, and E1 × 2^{E2} is representable in the corresponding unsigned type of the result type, then that value, converted to the result type, is the resulting value; ...

and

4.7 Integral conversions [conv.integral]

...

3 If the destination type is signed, the value is unchanged if it can be represented in the destination type; otherwise, the value is implementation-defined.

(beware this was true undefined behaviour in C++11...)

So you end up with a conversion of the negative signed `int` whose bit pattern is 0x81230000 to a `uint64_t`, which as expected gives 0xFFFFFFFF81230000, because

4.7 Integral conversions [conv.integral]

...

2 If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source integer (modulo 2^n where n is the number of bits used to represent the unsigned type).
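The modulo rule can be worked through on the value from the question. In this sketch, `to_u64` is a hypothetical helper, and a 32-bit two's complement `int` is assumed, so the `int` with bit pattern 0x81230000 has the value 0x81230000 - 2^32 = -2128412672.

```cpp
#include <cstdint>

// Hypothetical helper: convert a signed 32-bit value to uint64_t.
// The result is the source value modulo 2^64, so a negative input
// ends up with 32 leading one bits in the 64-bit result.
constexpr std::uint64_t to_u64(std::int32_t v) {
    return static_cast<std::uint64_t>(v);
}

// -2128412672 + 2^64 == 0xFFFFFFFF81230000
static_assert(to_u64(-2128412672) == 0xFFFFFFFF81230000ULL,
              "a negative int wraps modulo 2^64");

// A non-negative value is unchanged by the same conversion.
static_assert(to_u64(0x8123) == 0x8123ULL,
              "non-negative values are preserved");
```

This is why the first output shows `ffffffff` in front: it comes from the signed-to-unsigned conversion, not from the shift itself.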

TL;DR: Under C++14 there is no undefined behaviour here; what causes the result is the conversion of a signed 32-bit `int` to an unsigned 64-bit integer. The only questionable part is the shift that overflows into the sign bit: it was *undefined behaviour* in C++11, but all common implementations behave the same way and it is *implementation defined* in the C++14 standard.

Of course, if you force the second operand to be unsigned, everything is unsigned and you evidently get the correct `0x81230000` result.
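One way to avoid depending on the literal's suffix is to widen `check` before masking and shifting. This is a sketch rather than the only possible fix, and `shift_into_high_half` is a hypothetical name used only for illustration.

```cpp
#include <cstdint>

// Hypothetical helper: place a 16-bit value in bits 16..31 of a 64-bit word.
// Converting to uint64_t first keeps every intermediate result unsigned,
// so no signed int (and no sign bit) takes part in the shift.
constexpr std::uint64_t shift_into_high_half(std::uint16_t check) {
    return (static_cast<std::uint64_t>(check) & 0xFFFFU) << 16;
}

static_assert(shift_into_high_half(0x8123U) == 0x81230000ULL,
              "no leading ff bytes when everything is unsigned");
```

With the cast in place, the mask may be written as `0xFFFF` or `0xFFFFU` without changing the result.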

[EDIT] As explained by MSalters, the result of the shift is only *implementation defined* since C++14, but was indeed *undefined behaviour* in C++11. The shift operator paragraph said:

...

Otherwise, if E1 has a signed type and non-negative value, and E1 × 2^{E2} is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.