Fast x86 Integer to Boolean

Consider the following C code:

   int int_to_bool( int i )
   {
      return i == 0 ? 0 : 1;
   }

If you run this through your favorite x86 compiler, there’s a good chance you’ll see either a setcc instruction or a cmovcc instruction. The latter was added in the Pentium Pro, and thus cannot always be generated. The former has the unfortunate requirement that it write a byte register destination, which invokes the dreaded “insert semantics” of x86.
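To make the “insert semantics” concrete, here is a rough C model of what a byte-register write does to the full 32-bit register. The function names are hypothetical and exist only for illustration; this is a sketch of the architectural behavior, not of how any particular processor implements it.

```c
#include <stdint.h>

/* Hypothetical model of a write to al: only the low byte changes,
   the upper 24 bits of the old eax value are carried through. */
uint32_t write_al(uint32_t old_eax, uint8_t value)
{
    return (old_eax & 0xFFFFFF00u) | value;
}

/* Hypothetical model of a write to ax: the low 16 bits change,
   the upper 16 bits of the old eax value are carried through. */
uint32_t write_ax(uint32_t old_eax, uint16_t value)
{
    return (old_eax & 0xFFFF0000u) | value;
}

/* Hypothetical model of "setne al" after comparing input with zero:
   the new eax still depends on whatever eax held before. */
uint32_t setne_into_al(uint32_t old_eax, uint32_t input)
{
    return write_al(old_eax, input != 0 ? 1 : 0);
}
```

Note that every function takes the old register value as an input. That extra dependency is the merge the processor has to deal with, and it is why a setcc result usually needs the register zeroed first (or a zero-extend afterwards) before it is usable as a full-width integer.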

Here’s a sequence I dreamed up that avoids these problems (input in ecx):

   33 DB     xor         ebx,ebx
   3B D9     cmp         ebx,ecx
   13 DB     adc         ebx,ebx

In this case I’m using cmp in such a way that operand order is important. I need to subtract my input from zero. (The compare instruction is just a subtract which only sets flags.) As it turns out, this sets the carry flag to exactly the answer I want to return. All that’s left is to extract the carry flag, and the quickest way to do that is to perform add-with-carry into a zero.
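If the flag logic is hard to picture, the three instructions can be modeled step by step in C. This is only a sketch: the helper name is made up, and the carry flag is modeled as the unsigned borrow out of the subtraction, which is how cmp defines it.

```c
#include <stdint.h>

/* Hypothetical helper: the carry flag after "cmp a, b" is the
   unsigned borrow out of a - b, i.e. 1 exactly when a < b. */
uint32_t carry_of_cmp(uint32_t a, uint32_t b)
{
    return a < b ? 1u : 0u;
}

/* Models the xor / cmp / adc sequence on a scratch register r. */
uint32_t int_to_bool_model(uint32_t input)
{
    uint32_t r = 0;                        /* xor r,r     : r = 0           */
    uint32_t cf = carry_of_cmp(r, input);  /* cmp r,input : CF = (0 < input) */
    r = r + r + cf;                        /* adc r,r     : r = 0 + 0 + CF   */
    return r;
}
```

Since the comparison is unsigned, 0 < input holds for every nonzero bit pattern, including values that would be negative as a signed int. Swapping the cmp operands would compute input - 0, which never borrows, so the carry flag would always be clear.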

In summary: 6 bytes of code, 3 simple ubiquitous ALU instructions and — best of all — no merge or partial register stall issues.

Microsoft’s C compiler also does this kind of superoptimizer-inspired bit-twiddly magic for integer absolute value.
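I haven’t reproduced the compiler’s output here, but the well-known branchless idiom for absolute value (sign-extend into a mask, xor, subtract) looks like this when written in C. Treat it as an assumption about the kind of trick meant, not as a listing of what the compiler emits; the function name is hypothetical.

```c
#include <stdint.h>

/* Branchless absolute value, done in unsigned arithmetic so the
   intermediate steps are well defined in C. */
int32_t branchless_abs(int32_t x)
{
    uint32_t ux = (uint32_t)x;
    uint32_t mask = 0u - (ux >> 31);      /* all ones if x < 0, else zero  */
    return (int32_t)((ux ^ mask) - mask); /* flip the bits, then add one   */
}
```

For a negative input the xor produces the one’s complement and subtracting the all-ones mask adds one, which together negate the value. For a non-negative input the mask is zero and both steps are no-ops. As with any two’s complement absolute value, the most negative integer has no positive counterpart and comes back unchanged on typical compilers.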

The x86 general-purpose registers are divided into sub-registers of varying widths. The ax register, for example, refers to the low 16 bits of the eax register. Similarly, al refers to the low byte. When writing to these sub-registers, the upper bits of the register are preserved. This can hurt performance on a modern dynamically-scheduled processor. For more detail:


3 Responses to “Fast x86 Integer to Boolean”

  1. 1 Azeem Jiva March 24, 2008 at 3:28 pm

    What kinds of improvements are we talking about?

  2. 2 Mark March 24, 2008 at 9:34 pm

    I’ll let you know when I implement it 🙂

  1. 1 More Stupid x86 Assembly Tricks at mark++ Trackback on April 1, 2008 at 3:45 pm