Andrej Karpathy's nanochat enables full ChatGPT-style AI training for $100.
Karpathy's nanochat democratizes AI, offering a full-stack, open-source pipeline to build custom ChatGPT-style models affordably.
October 13, 2025

Prominent AI researcher and OpenAI co-founder Andrej Karpathy has released nanochat, an open-source project that provides a complete, end-to-end pipeline for training a ChatGPT-style language model from scratch. The new repository is a significant step beyond his widely acclaimed nanoGPT project, which focused solely on the pretraining phase of model development. With nanochat, Karpathy delivers a cohesive, minimal codebase that covers the entire workflow, from raw data to a functional interactive chat application, aiming to demystify the process of creating large language models and make it accessible to a broader audience of developers, researchers, and students. The project is designed first and foremost as an educational tool, letting users build and interact with their own custom language model on relatively modest budgets and timescales.
The core contribution of nanochat is its comprehensive, "full-stack" approach to LLM creation within a single, hackable repository.[1] The project encompasses every critical stage of the model development lifecycle, starting with training a tokenizer from scratch using a new implementation in the Rust programming language.[2][3] Following tokenization, the pipeline guides the user through pretraining a Transformer-based LLM on a large dataset such as FineWeb.[1] Unlike its predecessor, nanochat extends far beyond this initial step, incorporating the crucial post-training phases: supervised fine-tuning (SFT) on conversational data, handling of multiple-choice questions, and an optional reinforcement learning (RL) stage using GRPO to further refine the model's capabilities.[1][2] The pipeline culminates in an efficient inference engine that supports KV caching for faster generation, and the entire codebase amounts to roughly 8,000 lines of clean, dependency-minimal code.[1][2] The final output is a trained model that users can interact with through either a command-line interface or a simple, ChatGPT-like web UI.[1]
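To make the KV-caching idea concrete, the short PyTorch sketch below shows how an attention layer can store the keys and values of earlier tokens so that each newly generated token requires only one incremental attention step instead of re-processing the whole prefix. The class name, dimensions, and overall structure are illustrative assumptions and are not taken from nanochat's code.

```python
# Minimal sketch of KV caching in an autoregressive decoder
# (illustrative only; names and shapes are assumptions, not nanochat's code).
import torch
import torch.nn.functional as F


class CachedSelfAttention(torch.nn.Module):
    def __init__(self, d_model: int = 64, n_heads: int = 4):
        super().__init__()
        self.n_heads, self.d_head = n_heads, d_model // n_heads
        self.qkv = torch.nn.Linear(d_model, 3 * d_model)
        self.proj = torch.nn.Linear(d_model, d_model)

    def forward(self, x, cache=None):
        # x: (batch, new_tokens, d_model); cache: (past_keys, past_values) or None
        B, T, _ = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)

        def heads(t):  # (B, T, d_model) -> (B, n_heads, T, d_head)
            return t.view(B, T, self.n_heads, self.d_head).transpose(1, 2)

        q, k, v = heads(q), heads(k), heads(v)
        if cache is not None:  # reuse keys/values computed for earlier tokens
            k = torch.cat([cache[0], k], dim=2)
            v = torch.cat([cache[1], v], dim=2)
        # A causal mask is only needed when scoring more than one query at once.
        out = F.scaled_dot_product_attention(q, k, v, is_causal=(T > 1))
        out = out.transpose(1, 2).reshape(B, T, -1)
        return self.proj(out), (k, v)


attn = CachedSelfAttention()
prompt = torch.randn(1, 8, 64)        # stand-in for 8 prompt-token embeddings
_, kv = attn(prompt)                  # "prefill": process the prompt once
for _ in range(4):                    # "decode": one new token per step
    new_tok = torch.randn(1, 1, 64)   # stand-in for the next token's embedding
    _, kv = attn(new_tok, cache=kv)   # attends over the growing cache
```

Scaled up across every layer of the Transformer, this same pattern is what allows an inference engine to avoid re-encoding the full conversation for every generated token.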
A central theme of the nanochat release is the democratization of AI development by drastically lowering the barriers to entry in terms of cost and complexity. Karpathy has provided tangible benchmarks, stating that a user can launch a cloud GPU instance and, with a single script, train a basic, interactive ChatGPT clone in as little as four hours for approximately $100.[1][2] This makes hands-on LLM development accessible not just to large corporations but also to startups, individual hobbyists, and academic researchers. The project is designed to scale with available resources: a somewhat longer run of about 12 hours can produce a model that surpasses GPT-2 on the CORE benchmark.[1][2] For those with a larger budget, a more capable model that demonstrates coherence and can solve simple math and coding problems can be trained for around $1,000 over approximately 42 hours.[1] By providing these clear cost-performance tiers, nanochat empowers a new wave of innovators to experiment with and build custom chat models tailored to niche applications without requiring massive capital investment.[2]
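For readers who want to sanity-check those price points, the arithmetic is straightforward. The snippet below assumes a multi-GPU cloud node (e.g. 8xH100) billed at roughly $24 per hour; that rate is an assumption for illustration, varies by provider, and is not stated above.

```python
# Rough sanity check of the quoted price tiers, assuming a single cloud node
# billed at about $24/hour -- an assumed rate, not a figure from the article.
hourly_rate = 24.0                      # USD per hour for the whole node
for hours in (4, 12, 42):
    print(f"~{hours} h on one node -> ~${hours * hourly_rate:,.0f}")
# Prints ~$96, ~$288, and ~$1,008 -- consistent with the ~$100 and ~$1,000 tiers.
```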
Beyond its practical applications, nanochat is fundamentally an educational endeavor designed to foster a deeper, foundational understanding of how modern AI systems are built. Karpathy has explicitly stated his goal for the project is to create a "strong baseline" stack in one readable and forkable repository.[1] He intends for nanochat to serve as the capstone project for LLM101n, a new undergraduate-level course he is developing.[1] This aligns with Karpathy's long-standing commitment to AI education, which includes his popular YouTube tutorials that break down complex concepts like backpropagation and transformer architecture.[4][5] By open-sourcing the entire pipeline, he allows students and developers to move beyond treating powerful models as impenetrable black boxes. Instead, they can engage directly with the mechanics of tokenization, pretraining, fine-tuning, and reinforcement learning, gaining an intuition for a process that has been largely confined to elite AI labs. This approach is expected to spur community-driven innovation, much like nanoGPT did with its "speedrun" challenges that saw developers compete to train models with maximum efficiency.[6]
In conclusion, the release of nanochat represents a significant contribution to the open-source AI ecosystem. It is not designed to compete with the performance of frontier models like GPT-4, but rather to disseminate the knowledge and practical tools required to build similar, albeit smaller, systems. By providing a clear, affordable, and comprehensive path from raw text to a functioning chat model, Andrej Karpathy has furthered the cause of transparency and accessibility in artificial intelligence. The project stands to empower a new generation of AI builders, enabling rapid prototyping for businesses, providing an invaluable hands-on learning resource for students, and serving as a robust foundation for future research and experimentation in the open-source community. It reinforces the idea that understanding how to build these complex systems is as important as simply using them.