jpo38 jpo38 - 1 year ago 55
C++ Question

How to safely offset bits without undefined behaviour?

I'm writing a function that converts a bitset to an int/uint value, taking into account that the bitset may have fewer bits than the target type.

Here is the function I wrote:

template <typename T,size_t count> static T convertBitSetToNumber( const std::bitset<count>& bitset )
{
    T result;
    #define targetSize (sizeof( T )*CHAR_BIT)
    if ( targetSize > count )
    {
        // if bitset is 0xF00, converting it as 0x0F00 will lose sign information (0xF00 is negative, while 0x0F00 is positive)
        // This is because sign bit is on the left.
        // then, we need to add a zero (4bits) on the right and then convert 0xF000, later, we will divide by 16 (2^4) to preserve sign and value

        size_t missingbits = targetSize - count;

        std::bitset<targetSize> extended;
        extended.reset(); // set all to 0
        for ( size_t i = 0; i != count; ++i )
        {
            if ( i < count )
                extended[i+missingbits] = bitset[i];
        }

        result = static_cast<T>( extended.to_ullong() );

        result = result >> missingbits;

        return result;
    }
    return static_cast<T>( bitset.to_ullong() );
}

And the "test program":

uint16_t val1 = Base::BitsetUtl::convertBitSetToNumber<uint16_t,12>( std::bitset<12>( "100010011010" ) );
// val1 is 0x089A
int16_t val2 = Base::BitsetUtl::convertBitSetToNumber<int16_t,12>( std::bitset<12>( "100010011010" ) );
// val2 is 0xF89A

The program works when using uint16_t as the target type, since:

uint16_t val = 0x89A0; // 1000100110100000
val = val >> 4; // 0000100010011010

However, it fails when using int16_t, because 0x89A0 >> 4 is 0xF89A instead of the expected 0x089A:

int16_t val = 0x89A0; // 1000100110100000
val = val >> 4; // 1111100010011010

I don't understand why the >> operator sometimes inserts 0 and sometimes 1, and I can't work out how to safely do the final operation of my function (result = result >> missingbits; must be wrong at some point...).

Answer Source

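It's because the operands of a shift first undergo integral promotion to int, and promoting a negative value sign-extends it. Right-shifting a negative signed integer then copies the sign bit in from the left (an arithmetic shift; this is implementation-defined before C++20, but it is what virtually every compiler does).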

I.e. promoting the signed 16-bit integer (int16_t) 0x89a0 to a 32-bit signed integer (int) causes the value to become 0xffff89a0, which is the value that is shifted.

See e.g. this arithmetic operation conversion reference for more information.
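To make the promotion visible, here is a minimal sketch. It assumes that right shift of a negative value is arithmetic (guaranteed from C++20, implementation-defined but near-universal before). The helper names promoted and shiftSigned are hypothetical and only serve the illustration; -30304 is the value an int16_t holding the bit pattern 0x89A0 represents.

```cpp
#include <cstdint>

// Hypothetical helper: returns the int that an int16_t operand becomes
// before >> is applied. Unary plus triggers the same integral promotion.
inline int promoted(std::int16_t v)
{
    return +v;
}

// Hypothetical helper: the shift as written in the question. The shift is
// carried out on the promoted int, then converted back to int16_t.
// Assumption: >> on a negative value is an arithmetic shift.
inline std::int16_t shiftSigned(std::int16_t v, unsigned n)
{
    return static_cast<std::int16_t>(v >> n);
}
```

Under those assumptions, promoted(std::int16_t(-30304)) viewed as a 32-bit pattern is 0xffff89a0, and shiftSigned(std::int16_t(-30304), 4) has the bit pattern 0xF89A, which matches what the question observes.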

You should cast the variable (or value) to an unsigned integer (i.e. uint16_t in your case):

val = static_cast<uint16_t>(val) >> 4;

If the type is not known in advance, for example because it's a template argument, then you can use std::make_unsigned:

val = static_cast<typename std::make_unsigned<T>::type>(val) >> 4;
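As a sketch of how that cast could be wrapped for reuse in a template such as convertBitSetToNumber, consider the helper below. The name shiftRightZeroFill is hypothetical and not part of the original answer; it assumes T is an integer type other than bool, and that shift is smaller than the bit width of the promoted operand.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical helper: right shift that always fills with zeros from the
// left, regardless of whether T is signed or unsigned.
template <typename T>
T shiftRightZeroFill(T value, unsigned shift)
{
    static_assert(std::is_integral<T>::value, "T must be an integer type");

    // Converting to the unsigned counterpart is well defined (modular) and
    // keeps the bit pattern, so promotion can no longer sign-extend it.
    using U = typename std::make_unsigned<T>::type;
    return static_cast<T>(static_cast<U>(value) >> shift);
}
```

With this, the last step of the function in the question could be written as result = shiftRightZeroFill( result, missingbits ); and an int16_t holding the bit pattern 0x89A0 shifted by 4 gives 0x089A.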