Cybercriminals Jailbreak Grok, Mixtral: Turning Top AI Into Potent Weapons

Criminals jailbreak powerful commercial AI models like Grok, rebranding them as uncensored WormGPT to scale sophisticated attacks.

June 22, 2025

Cybercriminals are escalating their attacks by harnessing increasingly sophisticated and powerful artificial intelligence models, marking a significant evolution in AI-driven threats. The generative AI tool known as WormGPT, initially notorious for crafting convincing phishing emails and malicious code, has become a brand name for a new class of uncensored AI.[1][2] Recent investigations reveal that new variants of these malicious tools are not built from scratch but are adaptations of legitimate, high-end commercial large language models, including Grok from xAI and Mixtral from Mistral AI.[1][3][4] Rather than developing bespoke AI, threat actors are jailbreaking existing, powerful systems, bypassing their ethical and safety guardrails to weaponize them for criminal purposes.[3][5] This signals a dangerous new phase in the cybersecurity arms race, in which the very tools designed to advance technology are turned against their creators and the public.
WormGPT's emergence on underground forums in mid-2023 was a pivotal moment, giving cybercriminals an AI tool built specifically for illicit activity.[1][6] Based on the open-source GPT-J 6B model, it was sold on a subscription basis and trained on malware-related data to help create malicious software and persuasive fraudulent content.[1][7] Following media exposure and the unmasking of its creator, the original WormGPT was shut down, but this only paved the way for its name to become a genericized trademark for a host of more potent successors.[2][3][6] New iterations such as FraudGPT and EvilGPT quickly appeared, advertised with an even broader array of malicious capabilities.[1][8] Their proliferation demonstrated a persistent and growing demand within the cybercrime ecosystem for AI tools free of the ethical constraints programmed into mainstream models like ChatGPT.[9][10] The core issue was that while public AI models had safeguards, criminals sought uncensored alternatives to automate and scale their attacks.[6]
A significant leap in this malicious evolution was recently uncovered by security researchers, who found new WormGPT variants for sale on dark web forums.[11] These were not based on older, open-source models; they were wrappers around cutting-edge commercial AI.[1] One variant, dubbed "keanu-WormGPT," was discovered to be powered by xAI's Grok.[3][11] Its operators did not need to build a new AI: they used the Grok API and engineered a custom system prompt instructing the model to bypass its safety features.[5][12] This "jailbreak" makes the AI generate malicious output, such as phishing emails and credential-stealing malware, on command.[12] Similarly, another variant, known as "xzin0vich-WormGPT," was found to be running on Mistral AI's Mixtral model, again using manipulated prompts to unlock malicious behavior.[1][11] These tools are sold by subscription through platforms like Telegram, putting them within reach of criminals regardless of technical expertise.[3][13]
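Technically, there is little more to such a "wrapper" than a stored system prompt and a pass-through API call. The sketch below illustrates that architecture, assuming a generic OpenAI-compatible chat-completions endpoint of the kind xAI and Mistral AI both offer; the base URL, model name, environment variable, and placeholder prompt are illustrative stand-ins, and no actual jailbreak text is reproduced. The point is how small the engineering effort is: the operators' only real asset is the hidden prompt itself.

```python
import os
import requests

# Illustrative assumptions: an OpenAI-compatible chat-completions endpoint
# (both xAI and Mistral expose this request format). All names below are
# placeholders, not real services or models.
API_BASE = "https://api.example-provider.com/v1"
MODEL = "example-model"

# Placeholder only. Operators ship a carefully engineered prompt that tells
# the model to ignore its safety policies; that text is omitted here.
HIDDEN_SYSTEM_PROMPT = "[REDACTED jailbreak instructions]"


def wrapped_completion(user_message: str) -> str:
    """Forward a subscriber's message with the hidden system prompt attached."""
    response = requests.post(
        f"{API_BASE}/chat/completions",
        headers={"Authorization": f"Bearer {os.environ['PROVIDER_API_KEY']}"},
        json={
            "model": MODEL,
            "messages": [
                # The hidden system prompt is the entire "product": it rides
                # along with every subscriber request to the legitimate API.
                {"role": "system", "content": HIDDEN_SYSTEM_PROMPT},
                {"role": "user", "content": user_message},
            ],
        },
        timeout=30,
    )
    response.raise_for_status()
    return response.json()["choices"][0]["message"]["content"]
```

This thinness is also why researchers could pinpoint the underlying models: once the hidden system prompt is extracted, the wrapper has nothing left to conceal, which is consistent with how Grok and Mixtral were identified behind these variants.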
The implications of this trend are profound and far-reaching for both the AI industry and global cybersecurity. That criminals can co-opt powerful, legitimate AI models highlights the inherent vulnerability of public-facing APIs and the difficulty of enforcing safety guardrails.[14] Security experts note that these guardrails are often more like "speed bumps" than impenetrable barriers, able to slow but not stop a determined adversary.[2] The result is a "jailbreak-as-a-service" market that drastically lowers the barrier to entry for cybercrime, allowing even novice attackers to launch sophisticated campaigns.[15][2] The proliferation of uncensored models that can be downloaded and run locally, with no oversight, creates a widening asymmetry in which malicious actors multiply their effectiveness at minimal cost.[9] This forces the cybersecurity community into a reactive posture, constantly playing catch-up with attackers who keep finding new ways to exploit the latest AI advances.
In conclusion, the weaponization of high-end commercial AI models represents a formidable and evolving threat. The transformation of WormGPT from a singular tool into a brand for jailbroken, uncensored AI services demonstrates a resilient and adaptive cybercriminal market.[1][2] By piggybacking on the power of models like Grok and Mixtral, attackers are now capable of generating more sophisticated and convincing malicious content at an unprecedented scale.[4][16] This development presents a stark challenge for AI developers and security professionals alike, who must now contend with the dual nature of these powerful technologies. As generative AI becomes more integrated into our digital lives, the battle to prevent its misuse will intensify, requiring more robust security protocols, continuous monitoring, and a deeper understanding of the methods criminals use to turn these innovative tools toward destructive ends. The genie is out of the bottle, and securing it will require a concerted and ongoing effort from all corners of the tech industry.[9]
