Anthropic leaks proprietary Claude Code source code and secret model blueprints in security lapse

Internal security failures reveal proprietary source code and hidden features, providing a blueprint of Anthropic’s flagship agentic AI architecture.

March 31, 2026

Anthropic, the San Francisco-based AI laboratory that has long positioned itself as the industry’s leading advocate for safety and constitutional design, is navigating a significant internal security crisis. In the latest of a string of recent data exposures, the company inadvertently published the complete source code for its flagship AI coding assistant, Claude Code, on a public software repository.[1][2][3][4][5][6][7] This major lapse occurred just days after a separate configuration error on the company’s internal blog exposed details about an unreleased, high-capability model tier known as Claude Mythos.[8] Together, these incidents represent one of the most substantial involuntary disclosures of intellectual property in the history of the generative AI sector, providing competitors and researchers with a blueprint of the orchestration logic that powers one of the market’s most successful agentic tools.
The breach originated from a routine update to the public npm registry, the central repository where developers share and download JavaScript software packages.[1] On a Tuesday morning that will likely be scrutinized by DevOps engineers for years, Anthropic pushed version 2.1.88 of its official @anthropic-ai/claude-code package. Unbeknownst to the deployment team, the update included a 59.8-megabyte JavaScript source map file named cli.js.map. In modern software development, source maps are debugging tools designed to bridge the gap between compressed, minified production code and the original, human-readable TypeScript source. By including this file in a production release, Anthropic effectively attached the full, unminified source code to a package distributed to millions of users. The exposure was first identified by security researcher Chaofan Shou, who noted that the file contained references to nearly 1,900 proprietary TypeScript files.[3][2] Before Anthropic could pull the package from the registry, the code had been mirrored across multiple public GitHub repositories and forked thousands of times, ensuring its permanent availability on the internet.
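Anthropic’s actual build configuration is not public, but the mechanics described above are generic to the JavaScript ecosystem: a source map is a JSON file whose `sources` array lists the original file paths and whose optional `sourcesContent` array can embed their full text. A minimal sketch (with a hypothetical stand-in for `cli.js.map`) of how a researcher could enumerate the original TypeScript files from such a map:

```typescript
// Minimal sketch of enumerating original files from a JavaScript source map.
// The `map` object below is a tiny hypothetical stand-in for cli.js.map;
// the real file reportedly referenced nearly 1,900 TypeScript sources.

interface SourceMap {
  version: number;
  sources: string[];          // paths of the original source files
  sourcesContent?: string[];  // full original text, when embedded
  mappings: string;           // VLQ-encoded position data (ignored here)
}

function listOriginalFiles(map: SourceMap): string[] {
  // Only .ts/.tsx entries matter when hunting for leaked first-party source.
  return map.sources.filter((s) => s.endsWith(".ts") || s.endsWith(".tsx"));
}

const map: SourceMap = {
  version: 3,
  sources: ["src/cli.ts", "src/utils/permissions.ts", "vendor/lib.js"],
  sourcesContent: ["/* original TypeScript */", "/* ... */", "/* ... */"],
  mappings: "AAAA",
};

console.log(listOriginalFiles(map)); // the two .ts paths, not vendor/lib.js
```

When `sourcesContent` is populated, as it typically is for debugging builds, shipping the `.map` file is equivalent to shipping the unminified source itself.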
The contents of the leaked codebase, totaling more than 512,000 lines of code, offer an unprecedented look into the internal engineering culture and technical strategies of a top-tier AI lab. Perhaps the most controversial discovery within the files is a subsystem labeled Undercover Mode.[6] Found within the utility directory of the codebase, this feature was specifically engineered to allow Anthropic employees to use the AI tool when contributing to public open-source projects without the AI revealing its identity or mentioning internal Anthropic projects. The activation logic for this mode ensures that the AI remains "undercover" unless it can verify it is working within a verified internal repository.[6] Additionally, the leak confirmed a series of internal animal-themed codenames for Anthropic’s model roadmap. While the public is familiar with Opus, Sonnet, and Haiku, the source code frequently references Capybara, Fennec, Numbat, and Tengu. Capybara corresponds to the recently leaked Mythos model, while Tengu appears to be the internal designation for the Claude Code project itself.
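The leaked activation logic itself is not reproduced here, but as a hypothetical illustration of the described behavior, an "undercover" gate of this kind could check the repository’s git remote against an internal allowlist (all names and hosts below are assumptions, not Anthropic’s code):

```typescript
// Hypothetical sketch of an "undercover"-style gate, NOT the leaked code:
// the assistant discloses its identity only when the working repository's
// remote resolves to a verified internal host.

const INTERNAL_HOSTS = ["git.internal.example.com"]; // assumed allowlist

function isVerifiedInternalRepo(remoteUrl: string): boolean {
  try {
    // Normalize ssh-style remotes like git@host:org/repo.git into URLs.
    const normalized = remoteUrl.replace(/^git@([^:]+):/, "https://$1/");
    return INTERNAL_HOSTS.includes(new URL(normalized).hostname);
  } catch {
    return false; // unparseable remote: fail closed, stay undercover
  }
}

function shouldStayUndercover(remoteUrl: string): boolean {
  return !isVerifiedInternalRepo(remoteUrl);
}
```

The notable property of any such check is that it fails closed: anywhere the tool cannot positively verify an internal repository, it behaves as if it were in public.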
Beyond the curiosity of internal codenames, the leak provides a deep technical dive into the architecture of high-agency AI tools.[9] Developers dissecting the files found that the terminal interface for Claude Code is built with React rendered through Ink, an unusual choice for a command-line tool that gives it a sophisticated, stateful terminal UI. The code also revealed the presence of a "dream" system, an experimental memory architecture designed to let the agent persist state and "think" about codebase structures across user sessions without relying on external databases. This logic is part of what has allowed Claude Code to capture a significant portion of the developer market, with recent industry data suggesting the tool is now responsible for roughly 4% of all public GitHub commits globally. For a product that contributes an estimated $2.5 billion in annualized recurring revenue to Anthropic’s roughly $19 billion total run rate, the exposure of these agentic orchestration secrets is a direct blow to the company’s competitive moat.
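The internals of the "dream" system are not public, but the general pattern it is described as using, persisting agent memory to local files rather than an external database, can be sketched in a few lines (all function and key names below are hypothetical):

```typescript
// Sketch of file-backed, cross-session agent memory (hypothetical names;
// the leaked "dream" system's actual design is not reproduced here).
// State is keyed by a hash of the project path and stored as local JSON,
// so no external database is required.
import { createHash } from "node:crypto";
import { existsSync, mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

const stateDir = mkdtempSync(join(tmpdir(), "agent-memory-"));

function memoryPath(projectPath: string): string {
  const key = createHash("sha256").update(projectPath).digest("hex").slice(0, 16);
  return join(stateDir, `${key}.json`);
}

function saveMemory(projectPath: string, memory: Record<string, unknown>): void {
  writeFileSync(memoryPath(projectPath), JSON.stringify(memory));
}

function loadMemory(projectPath: string): Record<string, unknown> {
  const file = memoryPath(projectPath);
  return existsSync(file) ? JSON.parse(readFileSync(file, "utf8")) : {};
}

// One session records an observation; a later session can read it back.
saveMemory("/work/my-app", { entryPoint: "src/index.ts" });
```

The trade-off of this design is the same one the leak exposed elsewhere: anything persisted on the user’s machine is fully inspectable by the user.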
The security and strategic implications of the leak extend far beyond the loss of trade secrets.[10] One of the most surprising architectural revelations was that Anthropic assembles its complex system prompts—the highly guarded instructions that tell the AI how to behave—client-side within the CLI tool rather than on a protected server. This means that the core "personality" and safety guardrails for the agent have effectively been sitting in plain sight on users' hard drives, merely obfuscated by standard minification. Security researchers have already pointed out that this transparency makes it significantly easier for malicious actors to design prompt injection attacks or bypass the tool's built-in permission models. Furthermore, the leak highlighted a concurrent vulnerability involving the axios library, a key dependency used by Claude Code for HTTP requests. Because the specific version of this dependency was hard-coded into the leaked source, researchers were able to confirm that users who updated during a specific three-hour window on the day of the leak may have been exposed to a supply-chain attack that could lead to credential exfiltration.
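Why client-side assembly matters is easy to see in miniature. In the sketch below (the segments are invented for illustration, not Anthropic’s actual prompt), every string that the function concatenates necessarily ships inside the distributed bundle, where minification merely obscures it:

```typescript
// Illustrative sketch of client-side system-prompt assembly. The segments
// are hypothetical placeholders, not Anthropic's prompt. Because assembly
// happens in the CLI, all of these strings live in the shipped bundle.

const PROMPT_SEGMENTS = {
  identity: "You are a coding assistant operating in a terminal.",
  safety: "Never run destructive commands without explicit confirmation.",
};

function assembleSystemPrompt(enabledTools: string[]): string {
  return [
    PROMPT_SEGMENTS.identity,
    PROMPT_SEGMENTS.safety,
    `Enabled tools: ${enabledTools.join(", ")}`,
  ].join("\n");
}
```

An attacker who can read the exact safety instructions can craft inputs that contradict or override them, which is why researchers flag client-side assembly as a gift to prompt-injection work; server-side assembly keeps the guardrail text out of the attacker’s hands entirely.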
From a market perspective, the exposure of the Claude Code source code has effectively leveled the playing field for dozens of smaller AI startups and competitors. By studying the leaked files, rival companies now have access to a mature, battle-tested framework for handling multi-threaded autonomous tasks, background processes, and complex tool-call loops.[10] The leak even included internal testing logs and performance benchmarks for the upcoming Capybara model, which surprisingly showed a false claims rate of nearly 30% in its eighth internal iteration—a candid data point that Anthropic likely never intended to share with the public. This level of transparency into the "messy" middle of AI development provides a rare reality check against the polished marketing narratives typical of Silicon Valley’s elite labs.
In the wake of the incident, Anthropic has characterized the event as a packaging issue caused by human error rather than a compromise of its underlying model weights or customer data.[3] The company has since rolled out new automated checks for its build pipelines to prevent the inclusion of source maps in production releases and has moved to revoke several internal API hooks that were visible in the code. However, the reputational damage remains significant. For a company that markets itself on the premise of being the most responsible and safety-conscious player in the field, failing to secure its own source code through basic DevOps hygiene is a pointed irony.
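The guard Anthropic describes is straightforward to imagine, though its actual implementation is not public. A hypothetical CI check of this kind would inspect the package’s to-be-published file list and fail the release if any source map is present:

```typescript
// Hypothetical sketch of a pre-publish guard like the one described:
// abort the release if any source map would ship in the package.

function findSourceMaps(packedFiles: string[]): string[] {
  return packedFiles.filter((f) => f.endsWith(".map"));
}

function assertNoSourceMaps(packedFiles: string[]): void {
  const maps = findSourceMaps(packedFiles);
  if (maps.length > 0) {
    throw new Error(`Refusing to publish: source maps found: ${maps.join(", ")}`);
  }
}
```

In practice such a check would run in CI against the file list that `npm pack --dry-run --json` reports, so the guard sees exactly what the registry would receive.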
The long-term impact of this disclosure will likely be felt in the rapid acceleration of open-source agentic tools that mimic Anthropic’s successful orchestration patterns. As the industry moves toward "agentic" AI—models that don't just chat but actively perform work on local systems—the blueprint provided by the Claude Code leak will serve as an unofficial reference manual for the next several years of development. While Anthropic continues to lead in model intelligence, the transparency forced by this error has turned a proprietary advantage into a public commodity, shifting the focus of the AI arms race from how these tools are built to how they are governed. The incident serves as a stark reminder that in the high-stakes world of artificial intelligence, even the most advanced systems are only as secure as the human processes that deploy them.
