Andrej Karpathy has open-sourced nanochat, a compact, dependency-light codebase that implements a full ChatGPT-style stack, from tokenizer training to web-UI inference, geared toward reproducible, hackable LLM training on a single multi-GPU node.
The repo provides a single-script "speedrun" that executes the entire loop: tokenization, base pretraining, mid-training on chat/multiple-choice/tool-use data, supervised finetuning (SFT), optional RL on GSM8K, evaluation, and serving (CLI + ChatGPT-like web UI). The recommended setup is an 8×H100 node; at ~$24/hour, the 4-hour speedrun lands near $100. A post-run report.md summarizes metrics (CORE, ARC-E/C, MMLU, GSM8K, HumanEval, ChatCORE).
Tokenizer and data path
- Tokenizer: a custom Rust BPE (built via Maturin) with a 65,536-token vocab; training uses FineWeb-EDU shards (repackaged/shuffled for easy access). The walkthrough reports ~4.8 characters/token compression and compares against the GPT-2 and GPT-4 tokenizers (a rough measurement sketch follows this list).
- Eval bundle: a curated set for CORE (22 autocompletion datasets such as HellaSwag, ARC, BoolQ, etc.), downloaded into ~/.cache/nanochat/eval_bundle.
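As a rough illustration of how such a compression figure is measured, here is a minimal sketch; the `encode` callable is a stand-in, since nanochat's actual Rust BPE bindings and their API are not reproduced here:

```python
# Minimal sketch: measuring characters-per-token compression for a tokenizer.
# `encode` is a stand-in callable; nanochat's real Rust BPE is exposed through
# its own Python bindings, whose API is not quoted here.

def chars_per_token(encode, texts):
    """Average number of raw characters covered by one token."""
    total_chars = sum(len(t) for t in texts)
    total_tokens = sum(len(encode(t)) for t in texts)
    return total_chars / total_tokens

# Trivial whitespace "tokenizer" as the stand-in, just to make this runnable:
print(f"{chars_per_token(str.split, ['The quick brown fox jumps.']):.2f} chars/token")
```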
Model, scaling, and the "speedrun" target
The speedrun config trains a depth-20 Transformer (≈560M params, with 1280 hidden channels and 10 attention heads of dim 128) for ~11.2B tokens, following Chinchilla-style scaling (params × ~20 tokens). The author estimates this run as a ~4e19 FLOPs capability model. Training uses Muon for matmul parameters and AdamW for embeddings/unembeddings; loss is reported in bits-per-byte (bpb) to be tokenizer-invariant.
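These headline numbers follow from simple arithmetic. The sketch below reproduces them under the standard ≈6·N·D training-compute approximation; the loss value in the bpb conversion is hypothetical, and 1 byte per character is a simplification that only holds for ASCII-heavy text:

```python
import math

N = 560e6            # parameters (depth-20 speedrun model)
D = 20 * N           # Chinchilla-style token budget: ~20 tokens/param -> ~11.2B
flops = 6 * N * D    # standard ~6*N*D training-compute approximation
print(f"tokens ~ {D:.3g}, FLOPs ~ {flops:.2g}")   # ~1.12e+10, ~3.8e+19

# Converting a nats-per-token loss to bits-per-byte (bpb). The loss value is
# hypothetical, and 1 byte/char is a simplification for ASCII-heavy text.
loss_nats = 2.8                      # hypothetical mean loss per token, in nats
bytes_per_token = 4.8                # ~4.8 chars/token at ~1 byte per char
print(f"bpb ~ {loss_nats / math.log(2) / bytes_per_token:.3f}")
```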
Mid-training, SFT, and tool use
After pretraining, mid-training adapts the base model to conversations (SmolTalk) and explicitly teaches multiple-choice behavior (100K MMLU auxiliary-train questions) and tool use by inserting <|python_start|>…<|python_end|> blocks; a small GSM8K slice is included to seed calculator-style usage. The default mixture: SmolTalk (460K rows), MMLU aux-train (100K), GSM8K main (8K), totaling 568K rows.
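To make the special-token mechanics concrete, here is a hedged sketch of flattening a tool-use turn into a single training string; only the <|python_start|>/<|python_end|> tags come from the article, and every other delimiter is an invented placeholder, not nanochat's real chat format:

```python
# Sketch: flattening a tool-use turn into one training string. The
# <|python_start|>/<|python_end|> tags match the article; all other
# delimiters are illustrative placeholders.

def render(role: str, content: str) -> str:
    return f"<|{role}|>{content}<|end|>"

conversation = [
    ("user", "What is 19 * 43?"),
    ("assistant", "<|python_start|>print(19 * 43)<|python_end|>The answer is 817."),
]

print("".join(render(role, content) for role, content in conversation))
```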
SFT then fine-tunes on higher-quality conversations while matching test-time formatting (padded, non-concatenated rows) to reduce train/inference mismatch. The repo's example post-SFT metrics (speedrun tier) report ARC-Easy 0.3876, ARC-Challenge 0.2807, MMLU 0.3151, GSM8K 0.0455, HumanEval 0.0854, ChatCORE 0.0884.
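A minimal sketch of that padded, non-concatenated batching, under assumed conventions (pad id 0, labels of -100 masked from the loss); nanochat's actual collation may differ:

```python
import torch

def pad_batch(rows, pad_id=0):
    """Pad variable-length rows into a rectangle rather than packing them into
    one concatenated stream, mirroring test-time formatting. Labels at padding
    positions are -100 so cross-entropy ignores them."""
    width = max(len(r) for r in rows)
    input_ids = torch.full((len(rows), width), pad_id, dtype=torch.long)
    labels = torch.full((len(rows), width), -100, dtype=torch.long)
    for i, row in enumerate(rows):
        input_ids[i, : len(row)] = torch.tensor(row)
        labels[i, : len(row)] = torch.tensor(row)
    return input_ids, labels

ids, labels = pad_batch([[5, 6, 7], [9, 10]])
print(ids.tolist())     # [[5, 6, 7], [9, 10, 0]]
print(labels.tolist())  # [[5, 6, 7], [9, 10, -100]]
```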
Tool use is wired end-to-end: the custom Engine implements KV caching, prefill/decode inference, and a simple Python-interpreter sandbox for tool-augmented runs, used in both training and evaluation flows.
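The prefill/decode split is the standard pattern: run the whole prompt through the model once to populate the KV cache, then generate autoregressively, feeding back only the newest token each step. A minimal sketch against an assumed HuggingFace-style interface (nanochat's Engine defines its own API; none of these call names are quoted from it):

```python
import torch

@torch.no_grad()
def generate(model, prompt_ids, max_new_tokens=32):
    """Prefill the prompt once to fill the KV cache, then decode token by
    token, reusing the cache. Assumes a HuggingFace-style causal LM interface;
    nanochat's Engine exposes its own API."""
    out = model(input_ids=prompt_ids, use_cache=True)        # prefill: whole prompt
    past = out.past_key_values
    next_id = out.logits[:, -1].argmax(-1, keepdim=True)     # greedy, for brevity
    generated = [prompt_ids]
    for _ in range(max_new_tokens):
        generated.append(next_id)
        out = model(input_ids=next_id, past_key_values=past, use_cache=True)
        past = out.past_key_values                           # cache grows one step
        next_id = out.logits[:, -1].argmax(-1, keepdim=True)
    return torch.cat(generated, dim=1)
```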
Optional RL on GSM8K via a simplified GRPO loop
The final (optional) stage applies reinforcement learning on GSM8K with a simplified GRPO routine. The walkthrough clarifies what is omitted relative to canonical PPO-style RLHF: no trust region via a reference model, no KL penalties, on-policy updates (discarding PPO ratios/clipping), token-level GAPO-style normalization, and a mean-shifted advantage. In practice it behaves close to REINFORCE while keeping the group-relative advantage calculation. The scripts scripts.chat_rl and scripts.chat_eval -i rl -a GSM8K demonstrate the loop.
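Stripped of the reference model, KL penalty, and PPO machinery, the update reduces to REINFORCE with a group-relative baseline. A minimal sketch of that core, with reward function, sampling, and batch plumbing left out (the per-completion log-prob sum below is a simplification of the token-level normalization the walkthrough mentions):

```python
import torch

def grpo_loss(logprobs, rewards):
    """Simplified GRPO core for one prompt with a group of G completions.
    logprobs: (G,) summed token log-probs per completion (on-policy, so no
              PPO ratio or clipping; summing simplifies token-level norm).
    rewards:  (G,) scalar reward per completion (e.g. GSM8K answer correct).
    Advantage is mean-shifted within the group; no reference model, no KL."""
    advantages = rewards - rewards.mean()   # group-relative baseline
    return -(advantages * logprobs).mean()

logprobs = torch.tensor([-12.3, -9.8, -11.1], requires_grad=True)
rewards = torch.tensor([1.0, 0.0, 1.0])
grpo_loss(logprobs, rewards).backward()
print(logprobs.grad)  # per-completion REINFORCE-style gradients
```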
Cost/quality scaling and larger models
The README sketches two larger targets beyond the ~$100 speedrun (a quick cost check follows the list):
- ~$300 tier: d=26 (~12 hours), slightly outperforms GPT-2 on CORE; requires additional pretraining shards and batch-size adjustments.
- ~$1,000 tier: ~41.6 hours, with materially improved coherence and basic reasoning/coding ability.
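As a sanity check on those price tags, the quoted ~$24/hour node rate reproduces them almost exactly:

```python
RATE = 24.0  # ~$ per hour for an 8xH100 node, as quoted above
for tier, hours in [("~$100 speedrun", 4.0), ("~$300 (d=26)", 12.0), ("~$1,000", 41.6)]:
    print(f"{tier}: {hours}h x ${RATE}/h = ${RATE * hours:,.0f}")
# ~$100 speedrun: 4.0h -> $96;  ~$300 (d=26): 12.0h -> $288;  ~$1,000: 41.6h -> $998
```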
The repo also notes prior experimental runs where a d=30 model trained for ~24 hours reached the 40s on MMLU, 70s on ARC-Easy, and 20s on GSM8K.
Evaluation snapshot (speedrun tier)
An example report.md table for the ~$100 (≈4-hour) run shows CORE 0.2219 for the base model and a wall-clock time of 3h51m, with the following mid-training → SFT progression:

| Metric | Mid-training | SFT |
|---|---|---|
| ARC-Easy | 0.3561 | 0.3876 |
| ARC-Challenge | ~0.2875 | 0.2807 |
| MMLU | 0.3111 | 0.3151 |
| GSM8K | 0.0250 | 0.0455 |
| HumanEval | 0.0671 | 0.0854 |
| ChatCORE | 0.0730 | 0.0884 |
Key Takeaways
- nanochat is a minimal, end-to-end ChatGPT-style stack (~8K LOC) that runs via a single speedrun.sh on one 8×H100 node (~4h ≈ $100).
- The pipeline covers the tokenizer (Rust BPE), base pretraining, mid-training, SFT, optional RL on GSM8K (simplified GRPO), evaluation, and serving (CLI + web UI).
- Speedrun metrics (example report.md): CORE 0.2219 base; after SFT: ARC-Easy 0.3876, ARC-Challenge 0.2807, MMLU 0.3151, GSM8K 0.0455, HumanEval 0.0854.
- Scaling tiers are outlined: ~$300 (d=26, ~12h) slightly outperforms GPT-2 on CORE; ~$1,000 (~41.6h) for materially better coherence/reasoning.
Karpathy's nanochat lands in a useful middle ground: a single, clean, dependency-light repository that stitches tokenizer training (Rust BPE), pretraining on FineWeb-EDU, mid-training (SmolTalk/MMLU aux/GSM8K with tool-use tags), SFT, optional simplified GRPO on GSM8K, and a thin Engine (KV cache, prefill/decode, Python interpreter) into a reproducible speedrun on an 8×H100 node, producing a traceable report.md with CORE/ARC/MMLU/GSM8K/HumanEval scores and a minimal web UI.
