Andrew Sun - 28 days ago
C Question

Is integer overflow undefined in inline x86 assembly?

Say I have the following C code:

int32_t foo(int32_t x) {
    return x + 1;
}


This is undefined behavior when x == INT_MAX. Now say I performed the addition with inline assembly instead:

int32_t foo(int32_t x) {
    asm("incl %0" : "+g"(x));
    return x;
}


Question: Does the inline assembly version still invoke undefined behavior when x == INT_MAX? Or does undefined behavior only apply to the C code?

Answer

No, there's no UB with this. C rules don't apply to the asm instructions.

Inline-asm behaviour is implementation defined, and GNU C inline asm is defined as a black box for the compiler. Inputs go in, outputs come out, and the compiler doesn't know how. All it knows is what you tell it using the out/in/clobber constraints.


Your foo that uses inline-asm behaves identically to

int32_t foo(int32_t x) {
    uint32_t u = x;
    return ++u;
}

on x86, because x86 is a 2's complement machine, so integer wraparound is well-defined at the instruction level. The one difference is performance: the asm version defeats constant propagation and gives the compiler no ability to optimize x - inc(x) to -1, etc. See https://gcc.gnu.org/wiki/DontUseInlineAsm: don't use inline asm unless there's no way to coax the compiler into generating optimal asm by tweaking the C.
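As a rough illustration of that equivalence, here is a minimal sketch assuming GCC or Clang. The names inc_wrap, inc_asm and inc_checked are hypothetical helpers invented for this example, and __builtin_add_overflow is a compiler builtin rather than ISO C.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Wrapping increment in plain C: unsigned arithmetic wraps modulo 2^32.
 * Converting the result back to int32_t is implementation-defined rather
 * than undefined; GCC and Clang define it as modular reduction. */
int32_t inc_wrap(int32_t x) {
    return (int32_t)((uint32_t)x + 1u);
}

/* The same operation through inline asm where that is available.
 * On other targets this hypothetical helper falls back to inc_wrap(). */
int32_t inc_asm(int32_t x) {
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__("incl %0" : "+r"(x));
    return x;
#else
    return inc_wrap(x);
#endif
}

/* Checked increment: returns true if x + 1 fits in int32_t.
 * On overflow it returns false and *out holds the wrapped value. */
bool inc_checked(int32_t x, int32_t *out) {
    assert(out != NULL);
    return !__builtin_add_overflow(x, 1, out);
}
```

With these definitions, inc_wrap(INT32_MAX) and inc_asm(INT32_MAX) should both give INT32_MIN, while inc_checked(INT32_MAX, &r) reports the overflow. Only the plain C versions stay visible to the optimizer.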

The INC instruction doesn't raise an exception on overflow; it only sets the OF flag. That has no impact on anything, because GNU C inline asm for x86 (i386 and amd64) has an implicit "cc" clobber, so the compiler will assume that the condition codes in EFLAGS hold garbage after every inline-asm statement. gcc6 introduced a new syntax (flag output constraints) that lets an asm statement produce flag results, which can save a SETCC in your asm and a TEST generated by the compiler for asm blocks that want to return a flag condition.
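The flag-output syntax might be used along these lines. This is a sketch under stated assumptions: inc_report_overflow is a hypothetical name, the "=@cco" constraint (read OF) is only used when the compiler defines __GCC_ASM_FLAG_OUTPUTS__ on an x86 target, and the fallback relies on the GCC/Clang builtin __builtin_add_overflow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Increment *x with wraparound and report whether signed overflow occurred. */
bool inc_report_overflow(int32_t *x) {
    assert(x != NULL);
#if defined(__GCC_ASM_FLAG_OUTPUTS__) && (defined(__i386__) || defined(__x86_64__))
    int32_t v = *x;
    int overflowed;
    /* "=@cco" asks the compiler to take the output from OF after the asm,
     * so no SETO is needed inside the asm string. */
    __asm__("incl %0" : "+r"(v), "=@cco"(overflowed));
    *x = v;
    return overflowed != 0;
#else
    /* Portable fallback: stores the wrapped result, returns true on overflow. */
    return __builtin_add_overflow(*x, 1, x);
#endif
}
```

Calling it on a variable holding INT32_MAX should leave INT32_MIN in the variable and return true; for any other input it returns false.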

Some architectures do raise exceptions (traps) on integer overflow, but x86 is not one of them (except when a division quotient doesn't fit in the destination register). On MIPS, you'd use ADDIU instead of ADDI on signed integers if you wanted them to be able to wrap without trapping. (Because it's also a 2's complement ISA, so signed wraparound is the same in binary as unsigned wraparound.)
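The division exception is the one case where C code on x86 can actually trap on integer overflow: INT32_MIN / -1 has a quotient that doesn't fit in 32 bits, and division by zero faults as well. Both are undefined behaviour in C. A minimal sketch of a guard, using a hypothetical helper named div_checked:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Divide num by den only when the quotient is representable.
 * Returns false, leaving *quot untouched, for the two cases that
 * would fault in IDIV on x86: den == 0 and INT32_MIN / -1. */
bool div_checked(int32_t num, int32_t den, int32_t *quot) {
    assert(quot != NULL);
    if (den == 0) {
        return false;
    }
    if (num == INT32_MIN && den == -1) {
        return false;
    }
    *quot = num / den;  /* truncates toward zero, as C99 requires */
    return true;
}
```

For example, div_checked(-7, 2, &q) stores -3 in q, while div_checked(INT32_MIN, -1, &q) returns false without executing the division.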


Undefined Behaviour in x86 asm:

BSF and BSR (find first set bit forward or reverse) leave their destination register with undefined contents if the input was zero. (TZCNT and LZCNT don't have that problem.) In practice, Intel's recent x86 CPUs leave the destination unmodified in that case, but the x86 manuals don't guarantee it. See the section on TZCNT in this answer for more discussion on the implications, e.g. that TZCNT/LZCNT/POPCNT have a false dependency on the output in Intel CPUs.
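The same caveat shows up in C: GCC documents the result of __builtin_ctz as undefined for a zero argument, so portable code handles zero itself. A minimal sketch assuming GCC or Clang; bit_scan_forward and tzcnt32 are hypothetical names for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* BSF-style scan with the zero case made explicit: returns false and
 * leaves *index unmodified when x == 0, instead of relying on whatever
 * a particular CPU happens to do. */
bool bit_scan_forward(uint32_t x, unsigned *index) {
    assert(index != NULL);
    if (x == 0) {
        return false;
    }
    /* unsigned long is at least 32 bits wide, and widening a nonzero
     * value doesn't change its count of trailing zeros. */
    *index = (unsigned)__builtin_ctzl((unsigned long)x);
    return true;
}

/* TZCNT-style count: defined for every input, with 32 for zero. */
unsigned tzcnt32(uint32_t x) {
    unsigned index = 32;
    (void)bit_scan_forward(x, &index);
    return index;
}
```

Here tzcnt32(0) is 32 and tzcnt32(8) is 3, regardless of which instruction the compiler picks.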

Several other instructions leave some flags (especially AF and PF) undefined in some or all cases. IMUL, for example, leaves ZF, PF, and AF undefined.

Presumably any given CPU has consistent behaviour, but the point is that other CPUs might behave differently even though they're still x86. If you're Microsoft, Intel will design their future CPUs to not break your existing code. If your code is that widely-relied-on, you'd better stick to only relying on behaviour documented in the manuals, not just what your CPU happens to do. See Andy Glew's answer and comments here. Andy was one of the architects of Intel's P6 microarchitecture.