AWS data centers damaged by strikes
- Ars Technica reports that drone strikes damaged Amazon data centers in the Middle East, forcing AWS to stop billing affected cloud customers while repairs proceed.
- Amazon expects repairs to take months, a slow recovery that highlights the physical and geopolitical risks facing cloud regions worldwide.
- The incident strengthens the case for regional failover, customer prioritisation, and degraded modes when weighing design trade-offs.

(arstechnica.com)
Cloud data centers are supposed to be boring. That is the whole point. You rent compute in a region, spread workloads across availability zones, and trust the physical layer to stay in the background. But AWS just got a reminder that the cloud still sits on real buildings, real power systems, and real geography, and those things can get hit. Amazon now says damage from drone strikes on AWS facilities in the UAE and Bahrain will take several more months to repair, with billing suspended for affected workloads in the UAE region while recovery drags on.

### What actually got hit?

The damaged sites are AWS data centers in the United Arab Emirates and Bahrain. In early March, Amazon said drones directly struck two AWS facilities in the UAE, while a nearby strike damaged a Bahrain facility. The company said the attacks caused structural damage, power disruption, and even water damage from fire suppression systems, which is the kind of ugly, physical failure cloud customers usually never have to think about.

### Why is this still a problem two months later?

Because repairing a cloud region is not like rebooting a server farm. If a building takes structural damage, power gear is compromised, and storage systems get exposed to fire or water, recovery turns into a supply-chain and construction problem. The AWS Health Dashboard now says the UAE region, ME-CENTRAL-1, is still unable to reliably support customer applications, and that normal restoration is expected to take several months. Relevant billing there is suspended while repairs continue.

### Didn’t AWS design for failures like this?

Yes, but mostly for ordinary failures. AWS regions are split into availability zones so one data center problem does not take down the whole region. The catch is that this event appears to have damaged multiple sites at once. TeleGeography notes that two of the three availability zones in the UAE region were taken out simultaneously, while one zone in Bahrain was also impaired.
That breaks the normal “just fail over inside the region” playbook. Basically, redundancy worked the way it was designed to work, and then the event was bigger than the design assumption.

### Which services were affected?

A lot of the foundational ones. AWS has pointed to disruption across S3, DynamoDB, EC2 launches, and then the downstream services that depend on them, like Lambda, Kinesis, CloudWatch, and RDS. That matters because once storage and database layers wobble, everything stacked on top starts wobbling too. A cloud outage like this is less like one website going down and more like part of the local operating system for companies disappearing.

### Why not just move everything elsewhere?

Some customers can. AWS is explicitly telling customers to migrate workloads to other regions and restore inaccessible resources from remote backups. But that advice hides the hard part: you only get a clean escape if you already built for it. If your backups lived in the same region, or your app assumed low-latency local access, or your contracts require data to stay in-country, migration gets messy fast. In some cases, it is not just expensive. It may be legally constrained.

### What does the billing pause tell you?

It tells you this is not a normal service blip. Suspending billing is Amazon admitting that the region cannot currently deliver its usual contract value. That is a practical concession, but it is also a signal: the company expects the disruption to last long enough that charging as usual would be hard to defend. Ars Technica says the full disruption could stretch to nearly half a year when you count the time since the March strikes.

### Is this just an AWS story?

Not really. It is a cloud architecture story. CNBC quoted AWS CEO Matt Garman saying teams were working “24/7” to keep infrastructure operating, but the bigger lesson is industry-wide.
Google, Microsoft, Oracle, and everyone else are building more capacity in geopolitically tense places because that is where demand, power, land, and capital are lining up. The more cloud becomes critical infrastructure, the less anyone gets to pretend that regional conflict is somebody else’s problem.

### So what changes now?

Customers will take “multi-region” more seriously than “multi-AZ.” Architects will care more about remote backups, degraded modes, and whether an app can survive with higher latency for a while. And companies with strict data residency rules are going to look hard at the tradeoff they made: local compliance versus survivability. The cloud still abstracts servers. It does not abstract war.
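
To make that tradeoff concrete, here is a minimal Python sketch of the decision an application might make when its home region goes dark. It is an illustration under stated assumptions, not AWS guidance or any real AWS API: `Region`, `Placement`, and `choose_placement` are hypothetical names, the `healthy` flag stands in for whatever health probe you actually run, and the fallback regions and country codes are examples only.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set


@dataclass(frozen=True)
class Region:
    """Hypothetical view of a cloud region for failover planning."""
    name: str      # region code, e.g. "me-central-1"
    country: str   # jurisdiction where the data would physically live
    healthy: bool  # outcome of your own health probe (assumed, not an AWS call)


@dataclass(frozen=True)
class Placement:
    region: Optional[str]  # None means no compliant region is available
    mode: str              # "normal", "failover", or "degraded"


def choose_placement(
    primary: Region,
    fallbacks: Iterable[Region],
    allowed_countries: Set[str],
) -> Placement:
    """Pick where to run, honoring data residency before availability."""
    if primary.healthy:
        return Placement(primary.name, "normal")

    # Walk fallbacks in priority order; skip anything unhealthy or
    # outside the jurisdictions the data is allowed to live in.
    for region in fallbacks:
        if region.healthy and region.country in allowed_countries:
            return Placement(region.name, "failover")

    # Nothing compliant is up: run in a degraded mode (for example,
    # read-only from cache) instead of failing outright.
    return Placement(None, "degraded")


# Illustrative inputs only; the health values are made up.
primary = Region("me-central-1", "AE", healthy=False)
fallbacks = [
    Region("me-south-1", "BH", healthy=False),
    Region("eu-central-1", "DE", healthy=True),
]

flexible = choose_placement(primary, fallbacks, {"AE", "BH", "DE"})
strict = choose_placement(primary, fallbacks, {"AE"})
```

With flexible residency rules the sketch fails over to the first healthy fallback. With an in-country-only rule it has nowhere to go and drops to degraded mode, which is the compliance-versus-survivability tradeoff in miniature.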