Willi Mentzel Willi Mentzel - 14 days ago 6
C Question

In special cases: Is & faster than %?

I saw the chosen answer to this post.

I was suprised that

(x & 255) == (x % 256)
if x is an unsigned integer, I wondered if it makes sense to always replace
%
with
&
in
x % n
for
n = 2^a (a = [1, ...])
and x being a positive integer.

Since this is a special case in which I as a human can decide because I know with which values the program will deal with and the compiler does not. Can I gain a significant performance boost if my program uses a lot of modulo operations?

Sure, I could just compile and look at the dissassembly. But this would only answer my question for one compiler/architecture. I would like to know if this is in principle faster.

Answer

If your integral type is unsigned, the compiler will optimize it, and the result will be the same. If it's signed, something is different...

This program:

int mod_signed(int i) {
  return i % 256;
}
int and_signed(int i) {
  return i & 255;
}
unsigned mod_unsigned(unsigned int i) {
  return i % 256;
}
unsigned and_unsigned(unsigned int i) {
  return i & 255;
}

will be compiled (by GCC 6.2 with -O3; Clang 3.9 produces very similar code) into:

mod_signed(int):
        mov     edx, edi
        sar     edx, 31
        shr     edx, 24
        lea     eax, [rdi+rdx]
        movzx   eax, al
        sub     eax, edx
        ret
and_signed(int):
        movzx   eax, dil
        ret
mod_unsigned(unsigned int):
        movzx   eax, dil
        ret
and_unsigned(unsigned int):
        movzx   eax, dil
        ret

The result assembly of mod_signed is different because

If both operands to a multiplication, division, or modulus expression have the same sign, the result is positive. Otherwise, the result is negative. The result of a modulus operation's sign is implementation-defined.

and AFAICT, most of implementation decided that the result of a modulus expression is always the same as the sign of the first operand. See this documentation.

Hence, mod_signed is optimized to (from nwellnhof's comment):

int d = i < 0 ? 255 : 0;
return ((i + d) & 255) - d;

Logically, we can prove that i % 256 == i & 255 for all unsigned integers, hence, we can trust the compiler to do its job.