Much has been written about AWS’ December 2021 outage. What is your best cloud strategy? What DR options are available? We take a closer look.
AWS December 2021 Outage
On December 7, 2021, an hours-long AWS outage took down popular websites, disrupted smart devices and caused delivery delays at Amazon warehouses. Service disruptions included:
- Sites and apps including Google, Disney Plus, Venmo, DoorDash, Spotify, Alexa, Ring and the trading app Robinhood
- Some of Amazon’s delivery operations ground to a halt
- Third-party Amazon sellers couldn’t ship products
- Colleges that rely on software to host content postponed exams during finals week
An automated program, designed to make the AWS network more reliable, triggered a cascade effect that caused a large number of AWS systems to behave unexpectedly. A surge of activity on Amazon’s networks followed, ultimately preventing users from accessing some of its cloud services.
In Q2 2021, AWS controlled 33% of the global cloud infrastructure market, according to Synergy Research Group, followed by Microsoft at 20% and Google at 10%. Revenue at AWS increased 39% in Q3 2021 from a year earlier to $16.1 billion, outpacing growth of 15% across all of Amazon.
With that much of the market depending on it, AWS needs its cloud service to be as reliable as possible. “We expect to release a new version of our Service Health Dashboard early next year that will make it easier to understand service impact and a new support system architecture that actively runs across multiple AWS regions to ensure we do not have delays in communicating with customers,” AWS said.
AWS DR Options in the Cloud
Clients who use AWS are responsible for initiating and maintaining their own disaster recovery efforts. The customer is responsible for resiliency IN the cloud while AWS is responsible for resiliency OF the cloud.
In a detailed white paper, AWS outlines DR options in the cloud for their customers:
Back Up and Restore
This approach can also be used to mitigate against a regional disaster by replicating data to other AWS regions, or to mitigate lack of redundancy for workloads deployed to a single availability zone.
Pilot Light
With the pilot light approach, you replicate data from one region to another and provision a copy of your core workload infrastructure.
CloudEndure Disaster Recovery
Available from the AWS Marketplace, this option continuously replicates server-hosted applications and server-hosted databases from any source into AWS using block-level replication of the underlying server.
Warm Standby
The warm standby approach involves ensuring that there is a scaled-down, but fully functional, copy of a production environment in another region.
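To make the failover idea behind a standby region concrete, here is a minimal sketch in Python. It is not taken from the AWS white paper: the function name `pick_active_region`, the region names, and the health-check callable are all hypothetical, and a real deployment would typically rely on DNS or load balancer health checks rather than application code like this.

```python
from typing import Callable, Optional, Sequence


def pick_active_region(
    regions: Sequence[str],
    is_healthy: Callable[[str], bool],
) -> Optional[str]:
    """Return the first healthy region in priority order, or None.

    `regions` lists the primary first and standby regions after it.
    `is_healthy` is a hypothetical health probe supplied by the caller.
    """
    for region in regions:
        if is_healthy(region):
            return region
    return None


# Illustrative usage with a simulated outage in the primary region.
# The region names and status values are assumptions for this example.
status = {"us-east-1": False, "us-west-2": True}
active = pick_active_region(
    ["us-east-1", "us-west-2"],
    lambda region: status.get(region, False),
)
print(active)  # prints "us-west-2"
```

Passing the health check in as a parameter keeps the selection logic independent of how health is actually measured, which makes the failover decision easy to test without touching live infrastructure.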
Multi-Cloud Debate
Bloomberg News noted these disruptions will likely reinvigorate industry debate around multi-cloud strategies: the idea that a company should duplicate its services across multiple cloud computing providers so that no single crash impacts operations. Some, like Forrester analyst Brent Ellis, believe that will help companies sidestep big web outages. “It’s a decision large enterprises have to make or they’ll inevitably be in a situation where they’re down for several hours,” he said.
Each of the three major hyperscalers offers distinct benefits. With AWS, clients receive a wide variety of services including analytics, developer and management tools, IoT and security. Azure easily integrates with other Microsoft tools such as Office 365. Google has expertise in open source technologies and containers. With a multi-cloud strategy, organizations can run applications on whichever infrastructure makes the most sense.
What’s Your Cloud Resilience Strategy?
A multi-cloud strategy is a great idea, but there can be implementation challenges. Recovery Point is here to help: we’re the failover cloud that gets the job done. Failover is an extremely important function for critical systems that require always-on accessibility.
With many IT departments stretched thin, it’s more important than ever to have a trusted cloud partner. Recovery Point has the depth of resources and credentialed skills to scale and adapt with your organization, no matter how your IT strategy evolves.
We deliver our services from independently certified, geographically diverse facilities, with support from credentialed engineers and cloud professionals who enforce the most rigorous security standards to protect your data, every step of the way.
To learn more about our cloud services, call 877-445-4333 or contact Recovery Point here.
12/23/2021 Update from The Washington Post: AWS suffered its third outage in a month, briefly shutting down a vast number of online services critical to everyday life and highlighting again the vulnerabilities of an increasingly interconnected Web.
“A single glitch in a high-profile provider will have huge implications on countless organizations of all sizes, in often very unexpected ways,” said Ed Skoudis, president of the SANS Technology Institute. “Service interruptions are vast and impact thousands of companies and millions of users. We are putting more eggs into fewer and fewer baskets. More eggs get broken that way.”
Source: The Washington Post, “Amazon Web Services’ third outage in a month exposes a weak point in the Internet’s backbone” (December 22, 2021)