Sakana AI's Collective AI: Smaller Models Outperform Industry Giants

Sakana AI's nature-inspired methods combine smaller models, offering a more efficient, powerful, and democratized path for AI development.

July 7, 2025

A new frontier in artificial intelligence is being pioneered by the Tokyo-based research and development company Sakana AI, which has developed novel methods for multiple large language models (LLMs) to work together to solve complex problems. This approach, which contrasts with the prevailing industry focus on building ever-larger monolithic models, has demonstrated significant performance improvements in early testing. The company, founded in 2023 by former Google AI researchers David Ha and Llion Jones, along with former Stability AI COO Ren Ito, draws inspiration from natural phenomena such as evolution and the collective intelligence of a school of fish, from which it takes its name ("sakana" is Japanese for fish).[1][2][3] Sakana AI's work suggests a paradigm shift toward a more collaborative and efficient AI ecosystem, in which the specialized strengths of different models are combined to achieve superior results.
At the core of Sakana AI's recent breakthroughs are two distinct but related methodologies: "Evolutionary Model Fusion" and a collaborative inference-time technique.[4][5] The first, Evolutionary Model Fusion, automates the process of combining existing open-source AI models to create new, more capable ones.[4] This method employs an evolutionary algorithm that mimics natural selection to explore the vast space of potential combinations of different models.[6][7] It operates in two main ways: in the data flow space, by recombining the layers of different models, and in the parameter space, by mixing the models' weights.[4] The algorithm then selects the "fittest" resulting models based on their performance on specific tasks, and these successful models "repopulate" the next generation of merges.[6] This process, which can continue for hundreds of generations, allows specialized models to be created automatically without the immense computational cost of training a new model from scratch.[6][8] For example, Sakana AI created a Japanese-language LLM with strong mathematical abilities by evolving a merge of a Japanese model and an English-language model that excelled at math.[7]
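To make the idea concrete, the Python sketch below evolves a population of per-layer weight-mixing "recipes" in the parameter space. It is illustrative only: the toy models, the fitness function, and every hyperparameter here are stand-ins, whereas Sakana AI's actual system merges real LLM checkpoints and scores fitness on task benchmarks.

import random

LAYERS = 4
model_a = [random.gauss(0, 1) for _ in range(LAYERS)]   # stand-in weights
model_b = [random.gauss(0, 1) for _ in range(LAYERS)]
target  = [0.2, -0.5, 1.0, 0.3]   # pretend "task optimum" the merge should hit

def merge(alphas):
    # Per-layer interpolation between the two parent models' weights.
    return [a * wa + (1 - a) * wb
            for a, wa, wb in zip(alphas, model_a, model_b)]

def fitness(alphas):
    # Toy stand-in for evaluating the merged model on a benchmark.
    return -sum((w - t) ** 2 for w, t in zip(merge(alphas), target))

# Evolve a population of per-layer mixing "recipes".
population = [[random.random() for _ in range(LAYERS)] for _ in range(20)]
for generation in range(100):
    population.sort(key=fitness, reverse=True)
    parents = population[:5]                    # the "fittest" survive
    children = []
    while len(children) < 15:
        p1, p2 = random.sample(parents, 2)
        child = [random.choice(pair) for pair in zip(p1, p2)]   # crossover
        i = random.randrange(LAYERS)
        child[i] = min(1.0, max(0.0, child[i] + random.gauss(0, 0.1)))  # mutate
        children.append(child)
    population = parents + children             # "repopulate" next generation

best = max(population, key=fitness)
print("best mixing recipe:", [round(a, 2) for a in best])

The same select-and-recombine loop applies in the data flow space, where a candidate instead describes which layers from which parent appear, and in what order, in the merged model.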
The second innovative approach focuses on enabling multiple distinct LLMs to cooperate in real time to solve a single problem, a concept referred to as "inference-time scaling." Sakana AI introduced a new algorithm called Adaptive Branching Monte Carlo Tree Search (AB-MCTS) to facilitate this collaboration.[5] This technique acknowledges that different frontier models, such as those from Google, OpenAI, and others, possess unique strengths and weaknesses rooted in their training data and architecture.[5][9] The AB-MCTS algorithm acts as a strategic coordinator, dynamically deciding whether to "search deeper" by having a model refine a promising solution, or "search wider" by prompting a different model to generate a new approach.[9] The system uses probability models to select the most suitable LLM for each step of the problem-solving process, effectively assembling a "dream team" of AI experts that can outperform any individual model.[5][9] In tests on the difficult ARC (Abstraction and Reasoning Corpus) benchmark, this multi-model approach demonstrated a performance boost of up to 30% over the individual models working alone.[9]
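The deepen-versus-widen decision can be illustrated with a simple loop. The sketch below is a loose approximation under stated assumptions: mock_llm is a hypothetical placeholder for a real model API call, and the crude Thompson-sampling selector stands in for the much richer probability models AB-MCTS actually uses.

import random

def mock_llm(name, prompt):
    # Placeholder for a call to an external frontier model.
    # Returns a candidate answer and a quality score in [0, 1].
    return f"{name}: answer to '{prompt[:40]}'", random.random()

MODELS = ["model-A", "model-B", "model-C"]
# Beta-distribution parameters tracking how often each model helped.
stats = {m: {"wins": 1, "losses": 1} for m in MODELS}

def pick_model():
    # Thompson sampling: favor models whose past candidates scored well.
    return max(MODELS, key=lambda m: random.betavariate(
        stats[m]["wins"], stats[m]["losses"]))

candidates = []  # (score, text) pairs explored so far
for step in range(30):
    model = pick_model()
    if candidates and random.random() < max(candidates)[0]:
        # "Search deeper": ask the chosen model to refine the best answer.
        _, best_text = max(candidates)
        text, score = mock_llm(model, f"refine this solution: {best_text}")
    else:
        # "Search wider": ask for a fresh approach to the original task.
        text, score = mock_llm(model, "solve the task from scratch")
    candidates.append((score, text))
    outcome = "wins" if score > 0.5 else "losses"
    stats[model][outcome] += 1

print("best candidate:", max(candidates))

In the real system this plays out over a search tree rather than a flat list, so "deeper" refinements of refinements and "wider" sibling attempts coexist and are expanded adaptively.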
The implications of Sakana AI's work are significant for the broader AI industry. The dominant trend has been a race to build larger and larger foundation models, a resource-intensive endeavor accessible to only a few tech giants.[1] Sakana AI's methods offer a more democratized and efficient path to developing powerful, specialized AI.[2][7] By leveraging the "collective intelligence" of the vast number of existing open-source models, the evolutionary merging technique reduces the need for costly training runs and allows for the rapid creation of custom models tailored to specific needs.[6][10] Similarly, the collaborative inference-time algorithm provides a way to boost the performance of existing models without altering them, simply by making them work together more intelligently. This could allow enterprises to get more value from their current AI investments and tackle more complex, multi-faceted problems that often stump single-model systems.[9] This approach is distinct from the "Mixture-of-Experts" (MoE) architecture, which routes tasks to specialized sub-networks within a single large model, as Sakana's method can dynamically combine entirely separate, pre-existing models.[11]
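The distinction from Mixture-of-Experts can be shown schematically; the snippet below is not either system's real code. An MoE gate selects a sub-network living inside a single model's weights, while a collective approach coordinates entirely separate, pre-existing models from the outside:

from typing import Callable, List

def moe_layer(x: float, experts: List[Callable[[float], float]],
              gate: Callable[[float], int]) -> float:
    # MoE: the gate and the experts all live inside one model's weights.
    return experts[gate(x)](x)

def collective_solve(task: str, models: List[Callable[[str], str]],
                     choose: Callable[[str], int]) -> str:
    # Collective AI: an external coordinator picks among standalone models.
    return models[choose(task)](task)

experts = [lambda x: 2 * x, lambda x: x + 10]
print(moe_layer(3.0, experts, gate=lambda x: 0 if x < 5 else 1))   # 6.0

models = [lambda t: "model-A: " + t, lambda t: "model-B: " + t]
print(collective_solve("classify this input", models, choose=lambda t: 1))

The practical consequence is that the "experts" in the collective setting need not share an architecture, a vendor, or even open weights; they only need a callable interface.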
Founded by a team with deep roots in AI innovation—Llion Jones was a co-author of the seminal "Attention Is All You Need" paper that introduced the Transformer architecture—Sakana AI has quickly established itself as a major player.[1][3] The Tokyo-based startup has attracted significant investment, including a Series A funding round of approximately $200 million from prominent investors like Khosla Ventures, Lux Capital, and NVIDIA, along with major Japanese corporations.[12][13][14] This backing will allow the company to further invest in talent and the computational infrastructure needed to advance its nature-inspired research.[13] While not aiming to compete directly with giants like OpenAI, Sakana AI is carving out a crucial niche by focusing on creating collective, multi-modal models.[1] The company's work on projects like the "AI Scientist," which aims to automate the scientific research lifecycle, further underscores its ambition to push the boundaries of what AI can achieve through collaboration and evolutionary principles.[15][16] By demonstrating that a swarm of smaller, specialized models can collectively achieve more than a single behemoth, Sakana AI is charting a new and potentially more sustainable course for the future of artificial intelligence.[17]
