380 Picoseconds

Please excuse me while I toot my own horn. Take a look at this:


   C:\latency >run
   latency
   imul    : 57 - 53 = 4
   lea shl : 56 - 53 = 3
   just lea: 55 - 53 = 2
   just shl: 54 - 53 = 1

That’s right, bitches, I am dynamically measuring the latency of a single x86 instruction — accurate down to one cycle! That’s ~380 picoseconds on my hardware.
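The arithmetic behind those numbers can be sketched in a few lines of C. Each row of the output is a measured cycle count minus the cost of the empty timing harness (53 here), and one cycle at about 380 picoseconds corresponds to a clock of roughly 2.6 GHz. The helper names `net_cycles` and `cycle_ps` are made up for illustration, and the 2.6 GHz figure is inferred from the 380 ps number, not stated in the post.

```c
#include <assert.h>

/* Net latency of the instruction under test: the measured cycle count
 * minus the baseline cost of the empty timing harness.
 * (Hypothetical helper, not the author's code.) */
static unsigned net_cycles(unsigned measured, unsigned baseline) {
    assert(measured >= baseline);
    return measured - baseline;
}

/* Length of one clock cycle in picoseconds for a clock given in MHz:
 * 1e12 ps per second / (mhz * 1e6 cycles per second) = 1e6 / mhz. */
static double cycle_ps(double mhz) {
    assert(mhz > 0.0);
    return 1.0e6 / mhz;
}
```

With the numbers above, `net_cycles(57, 53)` gives the 4 cycles reported for `imul`, and `cycle_ps(2600.0)` comes out at about 385 ps, in line with the "~380 picoseconds" figure.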

This is really hard (impossible?) to do without a serializing read time-stamp counter instruction.
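For readers who want to try something similar, here is a rough sketch of the general approach, assuming GCC or Clang on x86 and the `__rdtscp` intrinsic from `<x86intrin.h>`. It is not the harness that produced the output above: `min_cycles`, `empty_body`, and `imul_body` are hypothetical names, and taking the minimum over many trials is just one common way to filter out interrupts and cache misses. Also keep in mind that on newer CPUs the time-stamp counter ticks at a fixed reference rate, so the deltas are not necessarily core clock cycles.

```c
#include <assert.h>
#include <stdint.h>
#include <x86intrin.h>   /* __rdtscp; GCC/Clang on x86 only */

typedef void (*body_fn)(void);

/* Empty harness: gives the baseline cost of the timing code itself. */
static void empty_body(void) {
    __asm__ volatile("" ::: "memory");
}

/* One IMUL; volatile keeps the compiler from deleting it. */
static void imul_body(void) {
    unsigned long x = 3;
    __asm__ volatile("imul %0, %0" : "+r"(x) : : "cc");
}

/* Time `body` between two RDTSCP reads and keep the smallest delta
 * seen over `trials` runs. */
static uint64_t min_cycles(body_fn body, int trials) {
    assert(trials > 0);
    uint64_t best = UINT64_MAX;
    unsigned aux;                      /* receives TSC_AUX, unused here */
    for (int i = 0; i < trials; i++) {
        uint64_t start = __rdtscp(&aux);
        body();
        uint64_t stop = __rdtscp(&aux);
        if (stop - start < best) {
            best = stop - start;
        }
    }
    return best;
}
```

Comparing `min_cycles(imul_body, 100000)` against `min_cycles(empty_body, 100000)` gives an estimate in the same "measured minus baseline" form as the output above. Expect noisier results than single-cycle precision unless the code is pinned to one core and the harness is tuned by hand.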


3 Responses to “380 Picoseconds”


  1. AMDFan June 15, 2007 at 9:58 pm

    That’s cool – which CPU has this instruction? Is this on real hardware that you are running at Microsoft?

  2. Mark June 16, 2007 at 12:22 am

    Everything since K8 rev F has had it. Check out RDTSCP in the AMD64 Architecture Programmer's Manual, Volume 3:
    http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/24594.pdf





