More Stupid x86 Assembly Tricks

A couple of weeks ago I went hunting for a better way to compute x!=0 on x86. Eventually, I came up with a cute carry-flag trick and blogged about it.

(Note: I’m not branching on this comparison — that would be easy. Instead I want the value of the comparison in a general-purpose register. I should have made this explicitly clear in my original post. Alas, I did not. Doh.)

My goal was to avoid using setcc, because partial-register writes are the devil.

Try as I might, I couldn’t imagine a way to generalize my solution so that it would also work for x==0. Someone suggested that I try the GNU superoptimizer (PDF, code), so I did.

At first I was a bit disappointed that the superoptimizer didn’t discover my sequence for x!=0. I think, maybe, the cost heuristics are outdated. (It should model xor reg,reg as being really cheap†.)

Turns out that the superoptimizer is still really clever anyway. It was a source of some great ideas. I’m delighted with what “we” came up with for x==0:

Old method; naive and literal:

85 c9           test     ecx, ecx
0f 94 c0        sete     al
0f b6 c0        movzx    eax, al

New method:

31 c0           xor      eax, eax
83 f9 01        cmp      ecx, 1
11 c0           adc      eax, eax

Once again the new method avoids the setcc and thus avoids insert semantics. As a nice bonus, we save a byte of code.
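For anyone who wants to see why the carry flag does the work here, a small C model of both sequences follows. It is only a sketch: the function names are made up for illustration, and the carry flag is modeled as an ordinary variable. The key fact is that cmp ecx, 1 computes ecx - 1 and sets CF on an unsigned borrow, which happens only when ecx is 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of: xor eax,eax ; cmp ecx,1 ; adc eax,eax */
uint32_t is_zero_adc(uint32_t ecx) {
    uint32_t eax = 0;            /* xor eax, eax */
    uint32_t cf = (ecx < 1u);    /* cmp ecx, 1: CF = borrow, set only if ecx == 0 */
    eax = eax + eax + cf;        /* adc eax, eax: 0 + 0 + CF */
    return eax;
}

/* Hypothetical model of: test ecx,ecx ; sete al ; movzx eax,al */
uint32_t is_zero_setcc(uint32_t ecx) {
    uint8_t al = ((ecx & ecx) == 0);  /* test sets ZF; sete copies ZF into al */
    return (uint32_t)al;              /* movzx zero-extends al into eax */
}

/* Checks that the two models agree on a few boundary values. */
void check_equivalence(void) {
    const uint32_t samples[] = {0u, 1u, 2u, 0x7fffffffu, 0x80000000u, 0xffffffffu};
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++) {
        assert(is_zero_adc(samples[i]) == is_zero_setcc(samples[i]));
    }
}
```

Calling check_equivalence() from a test program exercises both models on the edge cases. It says nothing about performance, which is the whole point of the real instruction sequence.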

†The result of xor reg,reg doesn’t actually depend on the input register at all. It’s essentially a “load-zero” instruction. Modern processors understand this and schedule accordingly.


5 Responses to “More Stupid x86 Assembly Tricks”

  1. Zonn October 20, 2008 at 1:22 am

    Hey Mark,

    I just randomly ran across your site.

    Off the top of my head, if you didn’t mind the results being
    0 or FFFFFFFF, you could use:

    cmp ecx,1
    sbb eax,eax ; FFFFFFFF if ecx=0, 0 if ecx != 0

    it’s been a long time since I’ve done any x86 assembly, but I think this would work…

  2. Mark October 20, 2008 at 5:18 am

    Yea that should totally compute the expression (x == 0 ? ~0 : 0). Looks good. Might be a good reason to use ~0 and 0 as true and false, if one had a choice.

  3. Zonn October 20, 2008 at 6:30 pm

    I remember back in the early days (the late ’70s), Microsoft’s BASIC interpreter used all 1’s (FFFF) as TRUE, and anything else as false. This was common practice on small micro-controllers at the time.

    Its advantage was (to use ‘C’ nomenclature):

    A & B is the same as A && B
    A | B is the same as A || B

    which allows you to write a single routine that is shared by both math and logic functions. And when you’re trying to fit a BASIC interpreter into 4K of code, that can be very useful.

  4. your mother in law October 27, 2008 at 5:57 pm

    mark, mark, mark – what the beep are these guys talking about…you are such a geek. and you are geek enough to take that as a compliment.

    but thanks for maintaining this site so i can always find your address!

  5. Mark October 27, 2008 at 9:00 pm

    Give me cancer now, God.
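
The two ideas from the comments above can be sketched in C as well. This is a rough model under a few assumptions: the helper names are invented for illustration, the subtract-with-borrow step (sbb is the x86 mnemonic) is modeled with plain arithmetic, and truth values are restricted to exactly 0 and all ones, which is the case where the bitwise and logical operators line up.

```c
#include <stdint.h>

/* Hypothetical model of: cmp ecx,1 ; sbb eax,eax */
uint32_t is_zero_mask(uint32_t ecx) {
    uint32_t cf = (ecx < 1u);   /* cmp ecx, 1: CF set only if ecx == 0 */
    /* sbb eax, eax computes eax - eax - CF, i.e. -CF, whatever eax held before */
    return 0u - cf;             /* 0xFFFFFFFF if ecx == 0, else 0 */
}

/* Converts a C boolean into the 0 / all-ones convention. */
uint32_t to_mask(int b) {
    return b ? UINT32_MAX : 0u;
}

/* With operands limited to 0 and all ones, & matches && and | matches ||. */
int bitwise_matches_logical(uint32_t a, uint32_t b) {
    return (a & b) == to_mask(a && b)
        && (a | b) == to_mask(a || b);
}
```

Note that the equivalence breaks for other nonzero values: 1 & 2 is 0, while 1 && 2 is true. That is why the convention needs a single canonical TRUE.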
