Apr 26, 2011
PC World wrote about what your business can learn from the Amazon cloud outage, noting that you should examine the SLAs you get from your cloud provider as an indicator of how reliable their cloud product is, look at diversification, and simply decide what is mission critical in your company.
"And while you're negotiating those deals with one or more cloud providers, take a minute to examine your service level agreements (SLAs) with any provider. SLAs should set out how your providers are rewarded when things go right, and how you're compensated when things go wrong."
"Especially if you're working with a local service provider which is working with an Amazon, a Google, or another major public cloud infrastructure vendor, make sure those SLAs spell out who is responsible for what should things go awry. It's worth the extra time and effort early in the relationship to make sure those SLAs are clear, comprehensive and iron-clad."
"If something goes wrong, you don't want your business to languish offline while your vendors pass the buck for responsibility for the outage. This is the very definition of when you want one throat to choke, and you want to make sure it's clear to whom that throat belongs."
I'd like to offer a few summary thoughts that expand on the article based on our experience here at ENKI:
- For truly mission critical applications, going onsite is an expensive and unnecessary step... a return to a past most of us would rather put behind us. Instead, diversify your cloud deployments across different geographies.
- True fault-tolerant diversification requires, at a minimum, that your application be set up to keep its data current across multiple deployments. You will want to look at how your databases replicate, how files are stored, and how you will compensate for delays in replication, whether you choose an active/active or active/standby DR solution.
- As the article points out, SLAs matter, but are you getting the right SLAs? The companies who suffered from this outage didn't get any disaster recovery/business continuance SLAs from Amazon and that should have been a red flag for them. To solve this problem, you will need to develop in-house IT expertise on DR/BC and dedicate resources to it, or choose a different cloud vendor or operations services provider that can do this for you.
- DR/BC is expensive because it requires some thought (i.e., labor) and duplicated hardware. This means it isn't a no-brainer and you'll want to assess exactly how much protection each of your applications needs and what your budget is to provide it.
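To make the last two points a little more concrete, here is a minimal sketch of the kind of per-application check we have in mind: compare how far each standby deployment lags behind its primary against how much data loss that application can tolerate. The application names, the tolerance figures, and the lag numbers are all made up for illustration; in practice the lag would come from whatever your database or storage layer reports.

```python
from dataclasses import dataclass


@dataclass
class App:
    name: str
    # Hypothetical tolerance: seconds of recent data the business can afford to lose.
    max_data_loss_s: float
    # Measured delay between the primary and the standby deployment, in seconds.
    replication_lag_s: float


def standby_is_current_enough(app: App) -> bool:
    """True if failing over right now would lose no more data than the app tolerates."""
    return app.replication_lag_s <= app.max_data_loss_s


def review(apps: list[App]) -> list[str]:
    """Return the names of apps whose standby lags beyond their tolerance."""
    return [a.name for a in apps if not standby_is_current_enough(a)]


# Illustrative numbers only (assumptions, not measurements).
apps = [
    App("billing", max_data_loss_s=5, replication_lag_s=2),
    App("reporting", max_data_loss_s=3600, replication_lag_s=900),
    App("orders", max_data_loss_s=1, replication_lag_s=30),
]

print(review(apps))  # ['orders']
```

An application that shows up in this list either needs a tighter replication setup, which costs more, or a business decision that the larger window of data loss is acceptable. That is the budget trade-off described above.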
Cloud is, unfortunately, not yet a panacea that offers enterprise-class reliability for free. Instead, it is a way to reduce the cost and headaches of managing your own hardware. You still need experienced IT staff to manage your deployment - whether they are in-house or outsourced. ENKI's outsourced operations services are designed to help companies match their cloud deployment to their business needs and then manage that deployment to deliver the SLAs that are required.









