Lies, damn lies, and statistics

It's an interesting phenomenon to see how the human mind perceives reliability. A survey done in the mid 1970s sheds some light. In this survey, the average person on the street was asked, “Would you be satisfied with your telephone service working 90% of the time?” You'd be surprised how many people looked at the number 90, and thought to themselves, “Wow, that's a pretty high number! So, yes, I'd be happy with that!” But when the question was reversed, “Would you be satisfied if your telephone service didn't work 10% of the time?” they were inclined to change their answer even though it's the exact same question!

To understand three, four, or more nines, you need to put the availability percentage into concrete terms, for example, a “downtime per year” scale. With a three nines system, your unavailability is 1 - 0.999 = 0.001, or 0.1%. A year has 24 × 365 = 8760 hours. A three nines system would therefore be unavailable for 0.001 × 8760 = 8.76 hours per year.

Some ISPs boast that their end-user availability is an “astounding” 99%—but that's 87.6 hours per year of downtime, over 14 minutes of downtime per day!

This confirms the point that you need to pay careful attention to the numbers; your gut reaction to 99% availability might be that it's pretty good (similar to the telephone example above), but when you do the math, 14 minutes of downtime per day may be unacceptable.

The following table summarizes the downtime for various availability percentages:

Availability %    Downtime per Year
99                3.65 days
99.9              8.76 hours
99.99             52.56 minutes
99.999            5.256 minutes
99.9999           31.5 seconds
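
The figures in the table can be reproduced with a few lines of arithmetic. The following is a minimal sketch, assuming the same 365-day (8760-hour) year used in the text; the names `downtime_hours_per_year` and `format_downtime` are illustrative and not taken from any library.

```python
HOURS_PER_YEAR = 24 * 365  # 8760 hours; leap years are ignored, as in the text


def downtime_hours_per_year(availability_percent):
    """Return the expected downtime, in hours per year, for an availability percentage."""
    unavailability = 1 - availability_percent / 100
    return unavailability * HOURS_PER_YEAR


def format_downtime(hours):
    """Express a downtime given in hours using the largest unit that fits."""
    if hours >= 24:
        return f"{hours / 24:.2f} days"
    if hours >= 1:
        return f"{hours:.2f} hours"
    minutes = hours * 60
    if minutes >= 1:
        return f"{minutes:.2f} minutes"
    return f"{minutes * 60:.1f} seconds"


# Print one row per availability level listed in the table.
for percent in (99, 99.9, 99.99, 99.999, 99.9999):
    print(percent, format_downtime(downtime_hours_per_year(percent)))
```

Dividing the yearly figure by 365 gives the per-day view used earlier: 87.6 hours per year at 99% is about 14.4 minutes per day.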

This leads to the question of how many nines are required. Looking at the other end of the reliability spectrum, a typical telephone central office is expected to have six nines availability, which works out to roughly 20 minutes of downtime every 40 years. Of course, each nine that you add cuts the permitted downtime by a factor of ten.
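
As a rough check of the central office figure, the sketch below counts nines directly rather than using percentages. It assumes a 365-day year, and the helper name `downtime_seconds` is hypothetical.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000 seconds; leap years ignored


def downtime_seconds(nines, years=1):
    """Return the permitted downtime, in seconds, for a system with `nines` nines.

    Two nines means 99% availability, three nines means 99.9%, and so on,
    so the unavailable fraction is 10 to the power of minus `nines`.
    """
    unavailability = 10 ** -nines
    return unavailability * SECONDS_PER_YEAR * years


# Six nines over 40 years, expressed in minutes (a little over 21).
six_nines_40_years_minutes = downtime_seconds(6, years=40) / 60
print(round(six_nines_40_years_minutes, 1))

# Going from three nines to four nines divides the permitted downtime by ten.
print(round(downtime_seconds(3) / downtime_seconds(4), 6))
```

The ratio holds for any pair of adjacent nines, which is why the cost of each additional nine is weighed against a tenfold reduction in downtime.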