AI2 Debuts OLMo 3, First Open AI Model to Reveal Its Reasoning
Beyond open weights: OLMo 3 reveals AI's entire pipeline and step-by-step reasoning for unprecedented transparency.
November 20, 2025

The Allen Institute for AI (AI2) has introduced OLMo 3, a family of fully open artificial intelligence models that marks a significant step toward transparency in a field often dominated by proprietary systems.[1][2] The release features what AI2 calls the first fully open 32-billion-parameter "thinking" model, designed to expose its step-by-step reasoning process to users.[3][2] This challenges the "open-weight" status quo, in which developers release model weights but keep training data and processes private; OLMo 3 instead provides the entire "model flow," from the data used for training to the final deployment.[2][4] The OLMo 3 family, available in 7B and 32B parameter sizes, aims to deliver high performance and efficiency, with its 7B base model reportedly trained 2.5 times more compute-efficiently than Meta's comparable Llama 3.1 8B.[3][1]
A central innovation of this release is the OLMo 3-Think model, which is engineered to generate explicit chains of reasoning.[3][1] Until now, this level of visible logic has primarily been a feature of closed systems, leaving the inner workings of many AI models a "black box" to researchers and the public.[3][5] With OLMo 3-Think, users can inspect the intermediate steps the model takes to arrive at a conclusion, a capability that holds profound implications for research, development, and trust in AI systems.[6][2] This transparency allows for unprecedented customization and reproducible research at scale.[2] Researchers can now trace a model's behavior, including its reasoning steps, directly back to the training data and decisions that influenced it, a feature enabled by an updated tool called OlmoTrace.[6][1] This traceability is crucial for understanding model biases, improving performance, and fostering a more accountable AI ecosystem.
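For readers who want to see what a visible reasoning trace looks like in practice, the sketch below samples from a Think checkpoint through the Hugging Face transformers library. This is a minimal illustration, not AI2's documented workflow: the repo ID allenai/Olmo-3-7B-Think is an assumption, and the exact identifier should be taken from AI2's model cards.

```python
# Minimal sketch: sampling from an OLMo 3-Think checkpoint and inspecting its
# visible reasoning. The repo ID below is an assumption for illustration;
# consult AI2's model cards for the actual name.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/Olmo-3-7B-Think"  # assumed repo ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "What is 17 * 24? Show your work."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=512)
# The decoded text includes the model's intermediate reasoning steps before
# the final answer, rather than only the answer itself.
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```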
The OLMo 3 family consists of several versions tailored for different applications.[1] The foundational OLMo 3-Base models, at 7B and 32B parameters, are designed for strong performance in programming, math, and reading comprehension.[7][8] Building on this base is OLMo 3-Instruct, a 7B variant optimized for following user directions, engaging in multi-turn dialogues, and using tools.[6][7] The flagship OLMo 3-Think, available in both 7B and 32B sizes, is built specifically for advanced research and experiments that require a deep look into the model's reasoning process.[2][7] For those exploring reinforcement learning, AI2 has also released OLMo 3-RL Zero, an experimental model designed to bootstrap complex reasoning behaviors.[6][1] All models support a 65,000-token context window, a 16-fold increase over the previous generation, allowing for the analysis of much longer documents.[3][8]
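As a rough illustration of what the expanded window means in practice, the sketch below counts a document's tokens before submitting it. The 65,536 ceiling and the repo ID allenai/Olmo-3-7B-Instruct are assumptions inferred from the figures above (a 16-fold jump from a 4,096-token window is exactly 65,536); the model card is authoritative.

```python
# Minimal sketch: checking whether a long document fits in the expanded
# context window before sending it to the model. The token ceiling and the
# repo ID below are assumptions, not confirmed values.
from transformers import AutoTokenizer

CONTEXT_WINDOW = 65_536  # 4,096 * 16, consistent with the "16-fold" figure

tokenizer = AutoTokenizer.from_pretrained("allenai/Olmo-3-7B-Instruct")

with open("long_report.txt") as f:  # any long local text file
    document = f.read()

n_tokens = len(tokenizer.encode(document))
print(f"{n_tokens} tokens; fits in window: {n_tokens <= CONTEXT_WINDOW}")
```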
Performance and efficiency are key pillars of the OLMo 3 release. AI2 reports that its models achieve results that are competitive with or even surpass those of much larger systems.[3][8] The OLMo 3-Base 32B model is touted as outperforming other fully open base models, while OLMo 3-Think 32B is presented as the strongest fully open thinking model available.[6] On various benchmarks, the OLMo 3 models have demonstrated strong capabilities, narrowing the performance gap with some of the best open-weight models of similar scale.[3][8] For instance, the 7B Instruct model matches or outperforms competitors like Qwen 2.5 and Llama 3.1.[6] This performance is achieved with notable efficiency: AI2 claims the OLMo 3-Base 7B model was trained with 2.5 times the compute efficiency of Meta's Llama 3.1 8B, measured in GPU hours per token.[3] This balance of power and efficiency lowers the barrier to entry for researchers and developers, reducing both computational costs and environmental impact.[7][8]
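The "GPU hours per token" metric referenced above is straightforward to compute. The sketch below shows the arithmetic with placeholder numbers; none of the figures are AI2's published training statistics.

```python
# Minimal sketch of the efficiency metric described above: total GPU hours
# consumed during training, divided by the number of tokens trained on.
# All numbers below are placeholders, not AI2's published figures.
def gpu_hours_per_token(gpu_count: int, wall_clock_hours: float, tokens: float) -> float:
    """Total GPU hours divided by training tokens."""
    return (gpu_count * wall_clock_hours) / tokens

# Hypothetical runs: model B spends 2.5x the GPU hours per token of model A,
# i.e. model A is 2.5x more compute-efficient on this metric.
a = gpu_hours_per_token(gpu_count=1024, wall_clock_hours=240, tokens=6e12)
b = gpu_hours_per_token(gpu_count=1024, wall_clock_hours=600, tokens=6e12)
print(f"relative efficiency: {b / a:.1f}x")  # -> 2.5x
```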
The debut of OLMo 3 signifies a broader shift in the artificial intelligence landscape, where the demand for transparency and accessibility is growing.[1][9] By releasing the entire pipeline—including the 6-trillion-token Dolma 3 dataset, training code, and intermediate checkpoints—under a permissive Apache 2.0 license, AI2 is empowering the open-source community to build upon, customize, and scrutinize its work.[6][2][10] This level of openness is transformative, potentially accelerating innovation and allowing smaller entities to compete with large tech corporations.[2][9] The move is seen as crucial for the viability of fully open-source models, ensuring they remain competitive with their closed or partially open counterparts.[5] As AI becomes more integrated into society, the ability to understand how these complex systems think and reason will be paramount for ensuring they are developed and deployed responsibly and ethically.
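Because a 6-trillion-token corpus is far too large to download casually, one practical way to scrutinize the released data is to stream a few records for inspection. In the sketch below, the dataset ID allenai/dolma3 and the "text" field name are assumptions for illustration; AI2's release notes give the actual location and schema.

```python
# Minimal sketch: streaming a small slice of the pretraining corpus instead
# of downloading all ~6T tokens. Dataset ID and field name are assumed.
from datasets import load_dataset

corpus = load_dataset("allenai/dolma3", split="train", streaming=True)

for i, record in enumerate(corpus):
    print(record["text"][:200])  # field name assumed; inspect record.keys()
    if i >= 2:
        break
```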