By Bob Gourley
May 4, 2011 03:30 PM EDT
The recent Amazon cloud outage at its Northern Virginia data center will raise concerns among CIOs looking to the “public cloud” to improve IT service delivery and reduce day-to-day operating costs. Industry claims of superior “up-time” performance, reliability and massive redundancy must now be revisited and re-evaluated. In these early days immediately following the outage, the prevailing sentiment will likely be one of shaken confidence in “public cloud” services, accompanied by much finger-pointing amid claims of breached service level agreements (SLAs) and misrepresentation of the cloud’s resiliency. With more time and a calmer approach, however, most enterprises will realize that no cloud implementation is fail-proof and that shifting significant portions of their enterprises to the cloud should not be undertaken without adequate contingency plans and risk mitigation.
The outage appeared to be limited to a single “availability zone” in only one region. AWS customers that heeded Amazon’s advice to spread their services across multiple availability zones to ensure resiliency felt little impact. Those that chose not to pay the extra cost associated with multiple zones may be regretting that decision now. Amazon claims its web services are now operating normally for most customers and the company said it will post a detailed report on what went wrong last week.
The after-action report is going to be interesting, but we’re already seeing lots of lessons learned spring up. The key points seem to be:
- Cloud outages may be rare but they can happen. Enterprises need to architect their cloud services for failure by spreading mission-critical, customer-facing services across multiple zones, physically separate data centers and/or multiple cloud providers. Putting 100% of your IT service eggs in one “cloud” basket is risky.
- Service level agreements are important and will become even more so. Gartner’s Lydia Leong provides a great recap of what went wrong and insight into Amazon’s standard EC2 SLAs.
Amazon’s SLA for EC2 is 99.95% for multi-availability zone deployments. That means you should expect that you can have about 4.5 hours of total region downtime each year without Amazon violating their SLA. Note, by the way that this outage does not actually violate their SLA. Their SLA defines unavailability as a lack of external connectivity to EC2 instances, coupled with the inability to provision working instances. In this case, EC2 was just fine by that definition. It was Elastic Block Store (EBS) and Relational Database Service (RDS) which weren’t, and neither of those services have SLAs.
It seems likely that some of Amazon’s customers, particularly those most affected by the outage, will seek to renegotiate their SLAs to ensure they have more “teeth” and offer compensation for lost revenue.
- Plan for disaster and how to recover from it, even in the cloud. The outage highlighted the fact that some of Amazon’s customers did not have a disaster recovery strategy. A number of solution providers now cover cloud disaster recovery and most can recover physical or virtual machines in a cloud within minutes. But if you don’t have a plan for it, it won’t happen.
- Cloud deployments are still cheaper than traditional data centers. For most organizations, deploying in the cloud remains about 10 times cheaper than building your own data center or even a private cloud.
- Hybrid IT Service Delivery Can Help Reduce Cloud Exposure. Most mature enterprises move to the cloud in stages, resulting in a hybrid portfolio of IT services provisioned from a mix of public cloud, private cloud and traditional IT infrastructure. While these hybrid service models can be a challenge to manage effectively, they do offer some built-in resiliency in case any one segment fails.
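For readers who want to check the SLA arithmetic quoted above, the downtime budget implied by an availability percentage is easy to work out. The following is a minimal sketch, not anything Amazon publishes: the helper name `allowed_downtime_hours` is hypothetical, and it assumes a 365-day year and that the percentage applies to the whole year.

```python
# Assumption: a non-leap year of 365 days; real SLAs may measure over
# other periods (for example, a trailing 365 days or a billing month).
HOURS_PER_YEAR = 365 * 24  # 8760 hours


def allowed_downtime_hours(availability_pct: float,
                           period_hours: float = HOURS_PER_YEAR) -> float:
    """Return the maximum downtime, in hours, that still meets the target.

    Hypothetical helper for illustration only.
    """
    unavailable_fraction = 1 - availability_pct / 100
    return unavailable_fraction * period_hours


if __name__ == "__main__":
    # Compare a few common availability targets, including the 99.95%
    # figure cited for multi-availability-zone EC2 deployments.
    for pct in (99.9, 99.95, 99.99):
        hours = allowed_downtime_hours(pct)
        print(f"{pct}% availability -> {hours:.2f} hours of downtime per year")
```

Under these assumptions, 99.95% works out to roughly 4.38 hours per year, in line with the "about 4.5 hours" cited in the recap. The same calculation makes it easy to compare the budget a provider promises against the downtime a given service can actually tolerate.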
This post sponsored by the Enterprise CIO Forum and HP.
Copyright © 2011 SYS-CON Media, Inc. — All Rights Reserved.
Syndicated stories and blog feeds, all rights reserved by the author.
More Stories By Bob Gourley
Bob Gourley, former CTO of the Defense Intelligence Agency (DIA), is Founder and CTO of Crucial Point LLC, a technology research and advisory firm providing fact based technology reviews in support of venture capital, private equity and emerging technology firms. He has extensive industry experience in intelligence and security and was awarded an intelligence community meritorious achievement award by AFCEA in 2008, and has also been recognized as an Infoworld Top 25 CTO and as one of the most fascinating communicators in Government IT by GovFresh.