
On one hand, no. AWS handled the issue very well. Problems were reported at around 10:45am (ET), and within an hour AWS had acknowledged the issue and found the cause. It executed mitigations, and within a few hours the problem was mostly resolved.
When things go wrong, we’re often quick to point out the frailty of the cloud as ‘someone else’s computer’, but in practice the cloud providers usually respond to and resolve issues far faster than in-house IT teams could.
On the other hand, outages like this highlight just how dependent we are, on AWS in particular but also on a small number of very powerful cloud companies.
The hosting landscape has changed dramatically in just a few years. A decade ago, the market was much more fragmented, with hundreds (or thousands) of data centre and hosting companies each holding a relatively small portion of market share. Now, we have an oligopoly of the major cloud providers.
This issue for AWS affected Amazon’s own services like Ring and IMDb. It also affected Tinder, Roku, Coinbase, iRobot and Duolingo.
An enormous number of SaaS business services are hosted on AWS, and when these tools are hit, the knock-on effect impacts thousands of downstream businesses. The technology supply chain is a big, interconnected web, but often it leads back to AWS, Microsoft Azure or Google.
The other positive we can take from this incident is that it was localised to specific AWS Regions. In many ways, we shouldn’t think of AWS as just one cloud, but rather a collection of connected data centres spread across the world. It is designed so that an issue in US-East-1 doesn’t affect Seoul or London.
Architecting for continuity is always a balance of risk vs cost. Infrastructure should always be spread across multiple Availability Zones (the actual data centres) but can be spread over multiple Regions or even cloud providers to protect against wider outages. Depending on the criticality of the application, you can make it as resilient as it demands.
Peter Groucutt, Managing Director at Databarracks