Autonomous AI agents now identify and exploit the majority of known smart contract vulnerabilities

OpenAI and Paradigm’s new benchmark reveals AI agents can autonomously exploit smart contracts, sparking a high-stakes security race.

February 19, 2026

Artificial intelligence has transitioned from a passive coding assistant to an active participant in the high-stakes world of blockchain security.[1][2] A groundbreaking benchmark developed through a collaboration between OpenAI and the crypto-focused investment firm Paradigm has demonstrated that autonomous AI agents can now identify and exploit the majority of known smart contract vulnerabilities without human intervention.[3] This framework, known as EVMbench, signals a major advancement in the evaluation of agentic AI systems, moving beyond simple code generation to test end-to-end execution in economically meaningful environments where over 100 billion dollars in assets are at risk.[4][5]
The EVMbench framework is designed to move AI evaluation away from academic exercises and toward real-world performance.[2] It consists of 120 curated high-severity vulnerabilities drawn from 40 different smart contract audits and competitive platforms such as Code4rena.[6][4] Unlike previous benchmarks that primarily tested a model's ability to describe a bug, EVMbench operates in three distinct modes: detection, patching, and exploitation.[4][2] In the exploit mode, the AI agent is placed within a sandboxed Ethereum Virtual Machine environment and tasked with executing a successful fund-draining attack.[7][4] To receive a passing grade, the agent must do more than just write code; it must interact with a local blockchain, adjust its strategy based on transaction failures, and produce a verifiable change in the on-chain state, such as an increased wallet balance or a triggered protocol failure.
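To make that grading criterion concrete, the sketch below shows the kind of check an exploit-mode grader could run: execute the agent's script against a local sandboxed node and then confirm a verifiable on-chain change, here a simple balance increase. This is an illustrative approximation rather than EVMbench's actual harness; the RPC endpoint, the attacker address, and the exploit entry point are placeholder assumptions.

```python
# Illustrative only: a minimal "did the exploit work?" check against a sandboxed
# local chain. The RPC URL, attacker address, and exploit command are hypothetical
# placeholders, not details taken from EVMbench.
import subprocess
from web3 import Web3

RPC_URL = "http://127.0.0.1:8545"  # local sandboxed node, e.g. a forked test chain
ATTACKER = "0x0000000000000000000000000000000000001337"  # placeholder attacker account

def run_exploit_and_verify(exploit_cmd: list[str]) -> bool:
    """Run the agent's exploit script, then confirm a verifiable on-chain change:
    here, simply that the attacker's balance increased."""
    w3 = Web3(Web3.HTTPProvider(RPC_URL))
    attacker = Web3.to_checksum_address(ATTACKER)

    balance_before = w3.eth.get_balance(attacker)
    result = subprocess.run(exploit_cmd, capture_output=True, text=True)
    balance_after = w3.eth.get_balance(attacker)

    # A real harness would also check protocol-specific invariants (drained pools,
    # broken accounting); a balance delta is just the simplest observable signal.
    print(result.stdout[-500:])  # surface the script's own logs for later iteration
    return balance_after > balance_before

if __name__ == "__main__":
    # Hypothetical invocation of an agent-written exploit script.
    success = run_exploit_and_verify(["python", "exploit.py"])
    print("exploit succeeded" if success else "exploit failed")
```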
The results across frontier models highlight a significant gap between finding a bug and weaponizing it. While models such as GPT-4o and Claude 3.5 Sonnet are proficient at identifying surface-level flaws, the most advanced agentic systems are now beginning to complete end-to-end exploitation reliably. Recent evaluations show that when these models are given agentic tools, letting them run code, read logs, and iterate on their scripts, they can execute complex attacks previously thought to require expert-level human ingenuity.[4] One of the most striking findings is how quickly exploitation performance has improved: whereas early coding models could exploit only a small fraction of critical bugs, the latest generation of reasoning models reaches success rates above 70 percent in certain exploit categories.[8][3]
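The "run code, read logs, iterate" pattern those tools enable can be pictured roughly as the loop below. This is a generic sketch of an agentic retry loop, not the harness used in the OpenAI and Paradigm evaluations; the ask_model call and the EXPLOIT_OK success signal are placeholders.

```python
# Minimal sketch of a run-observe-revise agent loop. ask_model and the success
# sentinel are placeholders; real agent harnesses differ in the details.
import subprocess

def ask_model(prompt: str) -> str:
    """Placeholder for a call to an LLM that returns a revised exploit script."""
    raise NotImplementedError

def agent_loop(task_description: str, max_attempts: int = 5) -> bool:
    feedback = ""
    for _ in range(max_attempts):
        # The model drafts (or revises) the exploit script from the task plus prior logs.
        script = ask_model(task_description + "\n\nPrevious output:\n" + feedback)
        with open("exploit_attempt.py", "w") as f:
            f.write(script)

        # Run the attempt against the sandboxed chain and capture its logs.
        result = subprocess.run(["python", "exploit_attempt.py"],
                                capture_output=True, text=True)
        if "EXPLOIT_OK" in result.stdout:  # placeholder success signal
            return True
        # Feed the failure back so the next attempt can adjust its strategy.
        feedback = result.stdout + result.stderr
    return False
```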
Despite their offensive prowess, AI agents still struggle with the nuances of defensive security, specifically detection and remediation.[9][5][10][8][3] The exploit tasks have an explicit goal, iterating until funds are drained, whereas the detect and patch tasks demand greater thoroughness and caution. In detect mode, agents often fail to audit exhaustively, stopping after finding a single obvious vulnerability rather than scanning the entire codebase for subtle logic flaws.[10] In patch mode, agents often struggle to close a security hole without breaking the contract's intended functionality.[5][7] Preserving a protocol's logic while closing reentrancy or integer-overflow gaps remains a bottleneck. However, the benchmark found that providing even a minimal hint about a bug's location can lift patch success rates from under 40 percent to over 90 percent, suggesting that the current limitation is search rather than the technical skill needed to write the fix.[3]
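The tension between closing a hole and preserving behavior is easiest to see in how a patch might be scored: the known exploit must stop working while the contract's existing tests still pass. The sketch below illustrates that rule with placeholder commands; it is an assumption about how such grading could work, not EVMbench's grading code.

```python
# Illustrative patch-mode scoring rule: a patch only counts if the known exploit
# no longer succeeds AND the contract's existing behavior is preserved. Both
# commands are placeholders for whatever test runner a given repository uses.
import subprocess

def run(cmd: list[str]) -> bool:
    """Return True if the command exits successfully."""
    return subprocess.run(cmd, capture_output=True).returncode == 0

def grade_patch(exploit_cmd: list[str], functional_tests_cmd: list[str]) -> str:
    exploit_still_works = run(exploit_cmd)
    functionality_intact = run(functional_tests_cmd)

    if exploit_still_works:
        return "fail: vulnerability not closed"
    if not functionality_intact:
        return "fail: patch broke intended contract behavior"
    return "pass: vulnerability closed, functionality preserved"

if __name__ == "__main__":
    # Hypothetical commands: an exploit reproduction script and the project's test suite.
    print(grade_patch(["python", "exploit.py"], ["forge", "test"]))
```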
The implications for the blockchain and AI industries cut both ways, offering a democratizing force for security and a lower barrier to entry for malicious actors. On the defensive side, automating smart contract audits could significantly cut costs for decentralized finance protocols, which currently pay tens of thousands of dollars for manual reviews. OpenAI has already signaled a commitment to this defensive shift, pledging millions in API credits to cybersecurity researchers and expanding the beta of its dedicated security agent, Aardvark.[9][4][3] On the offensive side, the ability of an autonomous system to scan the public ledger and execute fund-draining attacks independently raises urgent concerns. Because blockchain transactions are immutable and irreversible, the emergence of AI agents capable of "self-financing" through large-scale hacks presents a unique geopolitical and economic risk.
As AI continues to integrate with financial infrastructure, the development of benchmarks like EVMbench is essential for tracking the evolution of "economically meaningful" agency.[11][2] The shift from chatbots to active participants on-chain means that developers can no longer rely on obscurity or complexity as a defense. With AI agents demonstrating the ability to reason through multi-step flash loan attacks and complex state-management bugs, the industry must transition toward a future where AI-driven security is integrated into every stage of the development lifecycle. The results of the OpenAI and Paradigm study confirm that the race between AI-powered offense and defense is no longer a theoretical scenario, but a live reality that will define the security of the next generation of financial rails.
