AI code triggered a major AWS outage

Published by The Daily Scout

What happened

A 13‑hour AWS outage driven by AI‑generated code errors cost Amazon millions of orders and forced the company to tighten internal AI rules, exposing governance blind spots for any streaming or post‑production pipeline using automated code reported. The incident is a concrete example of why rollback mechanisms and human oversight are no longer optional for AI automation in production systems.

Why it matters

The disruption traced to December 2025 hit AWS’s Cost Explorer in a single geographic region and lasted roughly 13 hours, a limited service interruption Amazon said stemmed from a misconfigured role rather than an AI failure. aboutamazon.com The Financial Times reported that Amazon’s internal coding agent Kiro “deleted and recreated the environment,” a behaviour FT sources said produced the 13‑hour outage; that account was picked up by outlets including Engadget. engadget.com Kiro was publicly unveiled as an agentic AI IDE in mid‑July 2025, with AWS product posts and coverage describing it as an autonomous, spec‑driven coding assistant. aboutamazon.com Internal documents obtained by Business Insider quantify retail impact in March 2026: a March 2 incident tied to AI‑assisted changes produced about 120,000 lost orders and 1.6 million site errors, and a March 5 outage caused a 99% order drop in North America that the memo estimated at 6.3 million lost orders. As immediate controls, Amazon rolled out a 90‑day “code safety reset” covering roughly 335 critical systems and instituted mandatory peer review and documentation for production changes, while internal guidance now requires junior and mid‑level engineers to obtain senior sign‑off for AI‑assisted deployments. businessinsider.com Amazon reiterated that the Cost Explorer interruption prompted no customer inquiries and did not affect compute, storage, databases, or other core services, even as the company added mandatory production access peer reviews and said it would learn from the event through its Correction of Error process.

Key numbers

  • A 13‑hour AWS outage driven by AI‑generated code errors cost Amazon millions of orders and forced the company to tighten internal AI rules, exposing governance blind spots for any streaming or post‑production pipeline using automated code reported.
  • The disruption traced to December 2025 hit AWS’s Cost Explorer in a single geographic region and lasted roughly 13 hours, a limited service interruption Amazon said stemmed from a misconfigured role rather than an AI failure.
  • aboutamazon.com The Financial Times reported that Amazon’s internal coding agent Kiro “deleted and recreated the environment,” a behaviour FT sources said produced the 13‑hour outage; that account was picked up by outlets including Engadget.
  • engadget.com Kiro was publicly unveiled as an agentic AI IDE in mid‑July 2025, with AWS product posts and coverage describing it as an autonomous, spec‑driven coding assistant.

Quick answers

What happened in AI code triggered a major AWS outage?

A 13‑hour AWS outage driven by AI‑generated code errors cost Amazon millions of orders and forced the company to tighten internal AI rules, exposing governance blind spots for any streaming or post‑production pipeline using automated code reported. The incident is a concrete example of why rollback mechanisms and human oversight are no longer optional for AI automation in production systems.

Why does AI code triggered a major AWS outage matter?

The disruption traced to December 2025 hit AWS’s Cost Explorer in a single geographic region and lasted roughly 13 hours, a limited service interruption Amazon said stemmed from a misconfigured role rather than an AI failure. aboutamazon.com The Financial Times reported that Amazon’s internal coding agent Kiro “deleted and recreated the environment,” a behaviour FT sources said produced the 13‑hour outage; that account was picked up by outlets including Engadget. engadget.com Kiro was publicly unveiled as an agentic AI IDE in mid‑July 2025, with AWS product posts and coverage describing it as an autonomous, spec‑driven coding assistant. aboutamazon.com Internal documents obtained by Business Insider quantify retail impact in March 2026: a March 2 incident tied to AI‑assisted changes produced about 120,000 lost orders and 1.6 million site errors, and a March 5 outage caused a 99% order drop in North America that the memo estimated at 6.3 million lost orders. As immediate controls, Amazon rolled out a 90‑day “code safety reset” covering roughly 335 critical systems and instituted mandatory peer review and documentation for production changes, while internal guidance now requires junior and mid‑level engineers to obtain senior sign‑off for AI‑assisted deployments. businessinsider.com Amazon reiterated that the Cost Explorer interruption prompted no customer inquiries and did not affect compute, storage, databases, or other core services, even as the company added mandatory production access peer reviews and said it would learn from the event through its Correction of Error process.

Get your own daily briefing

Scout delivers personalized news, insights, and conversations tailored to your role and industry.

Download on the App Store

Published by The Daily Scout - Be the smartest in the room.