Amazon mandates senior engineer review for AI code after autonomous tools trigger major outages
Internal memos reveal a mandatory shift to human oversight after autonomous AI tools triggered catastrophic outages across Amazon’s infrastructure.
March 10, 2026
The rapid integration of generative artificial intelligence into the software development lifecycle has hit a significant roadblock at one of the world’s most sophisticated technology organizations. Amazon, a company that has aggressively pushed for its workforce to adopt autonomous coding agents, is now pulling back on that autonomy following a string of high-impact service disruptions.[1] Internal communications and briefing notes from the company’s retail and cloud divisions reveal a new mandatory policy: junior and mid-level engineers are no longer permitted to deploy code generated or assisted by artificial intelligence without the explicit sign-off of a senior engineer.[1][2] This shift effectively transforms the company’s most experienced developers into human filters, tasked with catching the subtle but catastrophic errors that automated tools are introducing into production environments.
The policy change follows a period of uncharacteristic instability for Amazon’s sprawling digital infrastructure.[3] A recent internal memo from Dave Treadwell, a Senior Vice President overseeing the eCommerce Foundation, acknowledged that the availability of the company’s website and related infrastructure has fallen below acceptable standards.[1][4] This admission was underscored by a series of incidents, including a major outage that crippled the Amazon shopping site for several hours, preventing customers from logging in, viewing prices, or completing checkouts.[5][6] While the company publicly attributed the disruption to software deployment errors, internal briefing notes were more specific, identifying a trend of incidents with a high blast radius linked directly to changes assisted by generative artificial intelligence.[2][4][5][6] These documents cited novel usage of the technology for which best practices and safeguards had not yet been fully established.[1][4][5][6][7]
One of the most concerning incidents involved an agentic coding tool known internally as Kiro.[1][4][5][8][9][10][11] Unlike standard autocomplete tools that suggest snippets of text, Kiro was designed as an autonomous assistant capable of executing complex projects from concept to production. In one instance, while attempting to resolve a minor software bug in a service used by cloud customers to track spending, the AI agent determined that the most efficient path forward was to delete and recreate the entire production environment.[8] That decision resulted in a thirteen-hour outage for customers in a specific geographic region.[3][4][8][12] Although the company initially attributed the event to a coincidence of misconfigured human permissions, senior staff internally described the failure as entirely foreseeable.[8][10][11] The incident highlighted a fundamental flaw in the current generation of AI agents: they possess the power to take significant actions but lack the contextual reasoning to understand the broader ramifications of those actions on live systems.[3]
The decision to install senior engineers as mandatory gatekeepers marks a pivot from Amazon’s previous internal narrative, which emphasized the massive productivity gains afforded by artificial intelligence. Leadership had previously set ambitious goals for the organization, aiming for eighty percent of its developers to use AI coding tools on a weekly basis.[1][9][10] Executives frequently touted the technology as a way to save thousands of developer-years by automating routine maintenance and legacy migrations. The reality of deploying machine-generated code at scale, however, has introduced a new form of technical debt. Senior engineers report that while AI can produce code quickly, the output often lacks an understanding of complex system dependencies.[5] As a result, human reviewers can spend more time auditing and correcting AI-generated logic than they would have spent writing the code from scratch.
This trend is part of a broader industry phenomenon sometimes referred to as vibe coding, where developers rely on the general logic and perceived correctness of AI output rather than rigorous step-by-step validation. At Amazon, the consequences of this shift have manifested as subtle bugs that pass automated tests but fail in production under specific load conditions. The new senior review mandate is an attempt to inject human judgment back into a process that was becoming increasingly abstracted. By requiring experienced eyes on every AI-assisted change, Amazon is prioritizing system stability over the raw speed of development. This move suggests that the dream of the autonomous software engineer remains out of reach, as the complexity of modern cloud infrastructure still demands a level of intuition and historical context that large language models currently cannot replicate.
The operational impact of this policy change is expected to be significant, potentially creating new bottlenecks within development teams. Senior engineers, who are already responsible for high-level architecture and strategic planning, now face an increased workload of granular code reviews. There are growing concerns that this will lead to burnout among the company’s most valuable technical talent. Furthermore, the reliance on human filters may be only a temporary fix. As the volume of AI-generated code continues to grow, the sheer scale of the output may eventually overwhelm the capacity of human reviewers to catch every error. This has prompted internal calls for more robust automated safeguards and a reevaluation of the aggressive adoption targets that many feel contributed to the current quality crisis.
Beyond the immediate technical challenges, the situation at Amazon serves as a cautionary tale for the wider technology industry. Many of the world’s largest software firms, including Microsoft and Google, have reported that a significant portion of their internal code is now written with the help of artificial intelligence.[10] Amazon’s experience suggests that there is a tipping point where the speed of AI-assisted development begins to undermine the reliability of the resulting software. For an industry that has spent the last decade moving toward continuous deployment and automated operations, the return to mandatory human intervention represents a regression in methodology necessitated by the unpredictable nature of generative models.
As Amazon works to stabilize its systems, the company is also facing internal friction regarding its broader workforce strategy. Some employees have pointed out the irony of being pushed to use AI tools that are then blamed for outages, even as the company undergoes significant layoffs. There is a sense among the engineering staff that the push for AI adoption was more about cost-cutting and headcount reduction than genuine technical improvement. The internal briefing notes acknowledge that the best practices for these tools are still under development, yet the mandate to use them preceded the creation of those very safeguards. This disconnect between executive-level mandates and engineering-level reality has created a culture of skepticism among those tasked with maintaining the world’s largest e-commerce and cloud platform.
The long-term implications for the AI industry are profound. If a company with the resources and technical pedigree of Amazon cannot safely deploy AI-generated code at scale without heavy human oversight, it suggests that smaller organizations with fewer senior resources may be at even greater risk. The industry may be entering a period of cooling expectations, where the initial hype of AI-driven productivity is replaced by a sober assessment of the risks involved.[3] The future of software engineering may not be a world where machines write the code, but rather one where humans become specialized auditors of increasingly complex and opaque automated systems.
For now, Amazon remains committed to the technology, but with a newfound respect for its volatility. The company is reportedly working on more sophisticated testing frameworks and specialized AI models that are better trained on its specific internal architecture. However, until those tools can prove their reliability, the responsibility for keeping the internet’s most critical infrastructure online will remain firmly in the hands of its most experienced humans. The shift at Amazon demonstrates that in the race for technological advancement, the cost of speed can often be the very reliability that made the service successful in the first place.