Life in the 99th Percentile
In everyday life, 99% of something is often just as good as the whole thing. 99% of a hot dog will fill you up. Not so in systems. An application with 99% availability will be unavailable, on average, 1% of the time, or about 87.6 hours a year. Downtime is typically not evenly distributed. In fact, it is more likely to occur when your systems are stressed, during peak hours. For this reason, availability is generally stated in fractions of the last percent, as a number of 9’s, typically from 99.99% to 99.999%. A business should consider how many 9’s it needs and how many it can afford. How many 9’s are needed is usually dictated by the cost and the effect of downtime on the business. Will downtime cause damage deemed a) catastrophic and irreparable (e.g., loss of life for a medical application), b) significant (e.g., loss of income for an e-commerce site) or c) merely inconvenient (e.g., loss of productivity and frustrated customers for a call center)? While no business wants downtime, 99.999% availability is far more expensive than 99.99% availability, because each additional 9 cuts the permitted downtime by a factor of ten. Additional uptime requires expensive investments in software, hardware and networking infrastructure, including load balancers and session replicators. Lastly, any site or application that claims 100% uptime should be treated with extreme caution. No system is perfect. No operator or system administrator is perfect. Even with hot deployments enabled, you sometimes need software updates and planned maintenance windows. A business is often better off expecting “only” 99.999% availability and having contingency measures in place than expecting, paying for and never reaching the elusive 100% mark.
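
To make the arithmetic behind the 9’s concrete, here is a minimal Python sketch that converts an availability percentage into expected downtime per year. The function name `downtime_hours_per_year` and the constant `HOURS_PER_YEAR` are illustrative choices, not part of any standard library, and the calculation assumes a 365-day year and treats downtime as a simple average, which (as noted above) is not how outages are actually distributed.

```python
# Assumption: a non-leap year of 365 days.
HOURS_PER_YEAR = 365 * 24


def downtime_hours_per_year(availability_percent: float) -> float:
    """Return the average downtime, in hours per year, implied by an availability percentage."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    unavailable_fraction = (100 - availability_percent) / 100
    return unavailable_fraction * HOURS_PER_YEAR


if __name__ == "__main__":
    # Compare a few common availability targets.
    for pct in (99.0, 99.9, 99.99, 99.999):
        hours = downtime_hours_per_year(pct)
        print(f"{pct}% -> {hours:.2f} hours/year ({hours * 60:.1f} minutes/year)")
```

Under these assumptions, 99% works out to roughly 87.6 hours of downtime a year, while 99.999% leaves only a little over five minutes, which gives a sense of why each extra 9 is so much harder to deliver.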