Comatose PC

A couple of my PCs occasionally slip into a coma from which I cannot boot them. I know these machines are just “stuck”, and not dead, because they eventually recover and work just fine for months.

It happens to my Linux server at home. On this machine the initial hang seems to be related to a problem with the PCI Wifi card.

It also happens to a Windows XP box at work. On this machine the initial hang seems to be precipitated by Windows automatic updates. (After install, it tries to reboot but only gets half-way.)

Once a system falls into a coma, it’s always the same infuriating crap:

Pressing the power button gives me lights on the case, and I hear disks and fans spinning, but I never get video or hear the “happy beep.” An instantaneous press of the power button at this point does nothing — I need to do the 5-second hold to power down. Reset is similarly useless.

If I hit the power switch on the power supply itself, or physically unplug the power, and wait for what seems like an utterly random time period (occasionally days, I kid you not) this sometimes helps. Eventually I plug it back in and it immediately boots as if nothing ever happened.

Dear internet, please help. I cannot construct a useful set of Google search terms for this problem. Ideally, I would like a solution, but I will also be satisfied with:

  • A reliable way to rouse the machines from their coma
  • An explanation of the cause for this, so I can properly direct my ire

Update 6/18/09: It appears that this problem, in at least one case, was likely caused by the power supply. Still working on the other one. Thanks for the suggestions.


The End of Architecture

The End of Architecture
Burton Smith, Tera Computer Company
17th Annual Symposium on Computer Architecture
Seattle, Washington
May 29, 1990

(Thanks, Wendy!)

How to Undervolt a MacBook

Briefly, undervolting is the process of manipulating a processor’s P-state tables to cause it to run at a lower voltage, while keeping frequency unchanged. This has no effect whatsoever on performance, but can extend battery life and reduce heat dissipation.

Undervolting cannot damage your CPU, but it can cause your machine to crash. You should be prepared to boot your Mac into safe mode in the event that something goes awry.

I’m going to assume a basic knowledge of undervolting, and merely describe the process and results for my MacBook. (For more detailed info, there’s a great article at Nordic Hardware.)

OK, let’s get started. Here’s what we need:
Continue reading ‘How to Undervolt a MacBook’

Microsoft Manycore Computing Workshop 2007

Burton Smith, David Patterson and a host of other parallel computing and computer architecture demigods were on hand to answer the question, “What the hell are we going to do with all these cores?”

If, like me, you couldn’t make it, at least you can view the slides, available here.

Andy Glew’s CMOVcc Mea Culpa

Compiler writers and assembly coders have long bemoaned the fact that x86 has no CMOVcc store. Additionally, many are shocked to learn that a CMOVcc load always reads memory. Consider the following situation:

   int x = (p == NULL) ? 0 : p->v;

You’d like to generate this code, but it will crash when p is null:

   cmp rax, 0
   cmovne rcx, [rax+foo_offset]

I just discovered this comp.arch posting, where Andy Glew explains how this all came to be.
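
One common way to sidestep the crash is to select the address rather than the data, and then do a single unconditional load. Here is a minimal sketch in C, assuming a hypothetical struct foo and a dummy zero_v location (neither is from the original example). Whether a compiler actually emits a register-to-register CMOVcc for the address selection is up to the compiler, so treat this as an illustration and not a guarantee.

```c
#include <stddef.h>

/* Hypothetical struct standing in for whatever p points to. */
struct foo {
    int pad;
    int v;
};

/* Assumption: a readable zero that the load can fall back to when p is NULL. */
static const int zero_v = 0;

/* Pick the address first, then perform one unconditional load.
   The address selection only involves pointer values, so it can be
   done without touching memory; the load itself always hits a valid
   location, either p->v or zero_v. */
int load_v_or_zero(const struct foo *p)
{
    const int *addr = (p != NULL) ? &p->v : &zero_v;
    return *addr;
}
```

Calling load_v_or_zero(NULL) yields 0, and calling it with a valid pointer yields that object’s v field, which matches the ternary expression above.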

60% of All Transistors are Made Up on the Spot

CNet News wonders why, “Despite its aging design, the x86 is still in charge.”

The article includes this quote from Simon Crosby, CTO of XenSource:

There’s no reason whatsoever why the Intel architecture remains so complex. There’s no reason why they couldn’t ditch 60 percent of the transistors on the chip, most of which are for legacy modes.

Wow. 60%? What a huge waste! Could that really be true? Let’s take a look at a random K8 die shot from the inter-web:

[Image: K8 die shot]

Hmm. See that highly regular pattern that makes up the right half of the chip? That would be cache.

Let’s assume that Crosby meant 60% of the other part. You know, the not-cache stuff. Even if we are as charitable as possible, he is still wrong. We could perhaps simplify the front end of the chip (marked above as “Fetch Scan Align Micro-code”). Still, much of that section is for the branch predictor and TLB, which might be good to keep around.

If I were going to invent a number, I’d have picked something much closer to 1%.

Chris Hecker is Wrong About OoO Execution

Here is a quote from Chris Hecker at GDC 2005:

Modern CPUs use out-of-order execution, which is there to make crappy code run fast. This was really good for the industry when it happened, although it annoyed many assembly language wizards in Sweden.

I first heard this when Chris was quoted by Pete Isensee (from the Xbox 360 team) in his NWCPP talk a year ago. Maybe Chris was kidding. I don’t know. What I do know is:

  1. He is wrong
  2. Smart people are believing him
  3. It’s time to set the record straight

Processors implement dynamic scheduling because sometimes the ideal order for a given sequence of instructions can only be known at runtime. In fact, the ideal order can change each time the instructions are executed.

Imagine your binary contains the following very simple code:

     mov rax, [foo]
     mov rbx, [bar]

Two loads, that’s all. Let’s assume that each of the loads misses cache 10% of the time. Often, one will miss but the other will hit. If you have an in-order machine and the first load misses, you are forced to wait: you cannot proceed to the second load, and you cannot hide any of the miss latency.

No matter how much of an uber assembly coder you are, you are going to be forced to choose an order for these two loads. More likely, your compiler will make this choice for you. Either way, that choice will be wrong at least some of the time.

An OoO processor can do the right thing every time.
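
To put rough numbers on the two-load example, here is a toy model in C. Everything in it is an assumption made for illustration: a 1-cycle hit, a 100-cycle miss, independent 10% miss rates, an in-order core that finishes one load before starting the next, and an OoO core that overlaps the two loads completely. Real machines are far more subtle than this, so it is a sketch of the argument and not a performance model.

```c
/* Assumed latencies and miss rate; none of these come from a real machine. */
enum { HIT_CYCLES = 1, MISS_CYCLES = 100, MISS_PCT = 10 };

/* Latency of one load, given whether it misses (1) or hits (0). */
static int latency(int misses)
{
    return misses ? MISS_CYCLES : HIT_CYCLES;
}

/* Expected cycles to complete both loads, scaled by 10000 so the
   arithmetic stays in integers (percent times percent).
   overlap == 0: blocking in-order model, latencies add up.
   overlap == 1: idealized OoO model, latencies overlap (take the max). */
long expected_cycles_x10000(int overlap)
{
    long total = 0;
    for (int a = 0; a <= 1; a++) {          /* does the first load miss? */
        for (int b = 0; b <= 1; b++) {      /* does the second load miss? */
            long pa = a ? MISS_PCT : 100 - MISS_PCT;
            long pb = b ? MISS_PCT : 100 - MISS_PCT;
            int la = latency(a);
            int lb = latency(b);
            int cycles = overlap ? (la > lb ? la : lb) : la + lb;
            total += pa * pb * cycles;
        }
    }
    return total;
}
```

Under these assumptions the blocking in-order model averages 21.8 cycles for the pair and the overlapping model averages 19.81. The gap comes from the cases where a miss could have been overlapped with the other load, and it grows with the miss rate and with the amount of independent work available behind the miss.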