thndrwrks - 1 year ago 69

C Question

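I am trying to convert a `uint16_t` input to a `uint32_t` output in which every input bit appears twice. Shown here for a 4-bit input and an 8-bit output:

Input -> Output

ABCDb -> AABB CCDDb

where A, B, C, D are individual bits.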

Example outputs:

0000b -> 0000 0000b

0001b -> 0000 0011b

0010b -> 0000 1100b

0011b -> 0000 1111b

....

1100b -> 1111 0000b

1101b -> 1111 0011b

1110b -> 1111 1100b

1111b -> 1111 1111b

Is there a bithack-y way to achieve this behavior?

Answer

The technique "Interleaving bits by Binary Magic Numbers" (from the Bit Twiddling Hacks collection) contained the clue:

```
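#include <stdint.h>

/* Duplicate every bit of a 16-bit input: bit i -> bits 2i and 2i+1. */
uint32_t expand_bits(uint16_t bits)
{
    uint32_t x = bits;
    x = (x | (x << 8)) & 0x00FF00FF;  /* separate the two bytes */
    x = (x | (x << 4)) & 0x0F0F0F0F;  /* separate the four nibbles */
    x = (x | (x << 2)) & 0x33333333;  /* separate the bit pairs */
    x = (x | (x << 1)) & 0x55555555;  /* one source bit per even position */
    return x | (x << 1);              /* copy each even bit into the odd bit above */
}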
```

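The first four steps successively interleave the source bits with zero bits, in groups of 8, 4, 2, and 1 bits. Writing the 16-bit input as four hex digits `ABCD` (here each letter is a 4-bit nibble, not a single bit as in the question), the value is `00AB00CD` after the first step, `0A0B0C0D` after the second step, and so on. After the fourth step every source bit sits in an even bit position with a zero bit above it. The last step then duplicates each even bit into the neighboring odd bit, thereby achieving the desired bit arrangement.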
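To make the intermediate values concrete, the sketch below stops the function above after a chosen number of steps and checks the result for the input `0xABCD`. The helper names `expand_bits_partial` and `check_trace` are hypothetical, introduced only for this illustration; they are not part of the original answer.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: run only the first `steps` (0..5) steps of expand_bits. */
uint32_t expand_bits_partial(uint16_t bits, int steps)
{
    uint32_t x = bits;
    if (steps >= 1) x = (x | (x << 8)) & 0x00FF00FF;
    if (steps >= 2) x = (x | (x << 4)) & 0x0F0F0F0F;
    if (steps >= 3) x = (x | (x << 2)) & 0x33333333;
    if (steps >= 4) x = (x | (x << 1)) & 0x55555555;
    if (steps >= 5) x = x | (x << 1);
    return x;
}

/* Hypothetical self-check: intermediate values for the input 0xABCD. */
void check_trace(void)
{
    assert(expand_bits_partial(0xABCD, 0) == 0x0000ABCDu); /* input             */
    assert(expand_bits_partial(0xABCD, 1) == 0x00AB00CDu); /* bytes separated   */
    assert(expand_bits_partial(0xABCD, 2) == 0x0A0B0C0Du); /* nibbles separated */
    assert(expand_bits_partial(0xABCD, 3) == 0x22233031u); /* pairs separated   */
    assert(expand_bits_partial(0xABCD, 4) == 0x44455051u); /* bits on even pos. */
    assert(expand_bits_partial(0xABCD, 5) == 0xCCCFF0F3u); /* every bit doubled */
}
```

Reading the last value nibble by nibble: `A` = 1010b becomes 1100 1100b = `CC`, `B` = 1011b becomes `CF`, `C` = 1100b becomes `F0`, and `D` = 1101b becomes `F3`, which matches the mapping requested in the question.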

A number of variants are possible. The last step can also be coded as `x + (x << 1)` or `3 * x`: at that point no two set bits are adjacent, so the addition never carries. The `|` operators in the first four steps can be replaced by `^` operators, because every bit position where both operands can be set is cleared by the mask afterwards. The masks can also be modified, as some bits are naturally zero and don't need to be cleared. On some processors short masks may be incorporated into machine instructions as immediates, reducing the effort for constructing and/or loading the mask constants. It may also be advantageous to increase instruction-level parallelism for out-of-order processors and to optimize for processors with shift-add or integer-multiply-add instructions. One code variant incorporating several of these ideas is:

```
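#include <stdint.h>

uint32_t expand_bits(uint16_t bits)
{
    uint32_t x = bits;
    x = (x ^ (x << 8)) & ~0x0000FF00;  /* clear only bits 8..15, where the XOR overlaps */
    x = (x ^ (x << 4)) & ~0x00F000F0;  /* clear only the two overlapping nibbles */
    x = x ^ (x << 2);                  /* no mask: stray bits are dropped by the masks below */
    x = ((x & 0x22222222) << 1) + (x & 0x11111111);  /* keep bits 0 and 1 of each nibble, moving bit 1 to bit 2 */
    x = (x << 1) + x;                  /* 3 * x: duplicate each even bit */
    return x;
}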
```
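Since there are only 65,536 possible inputs, the two versions can be compared exhaustively against a plain loop. The following is a minimal sketch of such a check; the names `expand_bits_ref`, `expand_bits_v1`, `expand_bits_v2`, and `check_all_inputs` are hypothetical and chosen only so that all three functions can live in one file.

```c
#include <assert.h>
#include <stdint.h>

/* Reference: copy bit i of the input to bits 2i and 2i+1 of the output. */
uint32_t expand_bits_ref(uint16_t bits)
{
    uint32_t out = 0;
    for (int i = 0; i < 16; i++) {
        if (bits & (1u << i))
            out |= UINT32_C(3) << (2 * i);
    }
    return out;
}

/* First version from the answer, renamed for this sketch. */
uint32_t expand_bits_v1(uint16_t bits)
{
    uint32_t x = bits;
    x = (x | (x << 8)) & 0x00FF00FF;
    x = (x | (x << 4)) & 0x0F0F0F0F;
    x = (x | (x << 2)) & 0x33333333;
    x = (x | (x << 1)) & 0x55555555;
    return x | (x << 1);
}

/* Second version from the answer, renamed for this sketch. */
uint32_t expand_bits_v2(uint16_t bits)
{
    uint32_t x = bits;
    x = (x ^ (x << 8)) & ~0x0000FF00;
    x = (x ^ (x << 4)) & ~0x00F000F0;
    x = x ^ (x << 2);
    x = ((x & 0x22222222) << 1) + (x & 0x11111111);
    x = (x << 1) + x;
    return x;
}

/* Assert that both versions match the reference for every 16-bit input. */
void check_all_inputs(void)
{
    for (uint32_t i = 0; i <= UINT16_MAX; i++) {
        uint16_t bits = (uint16_t)i;
        uint32_t expected = expand_bits_ref(bits);
        assert(expand_bits_v1(bits) == expected);
        assert(expand_bits_v2(bits) == expected);
    }
}
```

Calling `check_all_inputs()` from a test `main` (compiled without `NDEBUG`) should return silently if the three functions agree on every input. Which version is actually faster depends on the target processor and compiler, so that part is best settled by measuring.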