Update README to mention nanochat and deprecation

Added a note about the deprecation of nanoGPT and the introduction of nanochat.
Author: Andrej
Date: 2025-11-12 11:52:34 -08:00 (committed via GitHub)
Parent: 93a43d9a5c
Commit: 3adf61e154


@@ -3,6 +3,13 @@
![nanoGPT](assets/nanogpt.jpg)
---
**Update Nov 2025**: nanoGPT has a new and improved cousin called [nanochat](https://github.com/karpathy/nanochat). It is very likely you meant to use/find nanochat instead. nanoGPT (this repo) is now very old and deprecated, but I will leave it up for posterity.
---
The simplest, fastest repository for training/finetuning medium-sized GPTs. It is a rewrite of [minGPT](https://github.com/karpathy/minGPT) that prioritizes teeth over education. Still under active development, but currently the file `train.py` reproduces GPT-2 (124M) on OpenWebText, running on a single 8XA100 40GB node in about 4 days of training. The code itself is plain and readable: `train.py` is a ~300-line boilerplate training loop and `model.py` a ~300-line GPT model definition, which can optionally load the GPT-2 weights from OpenAI. That's it.
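The GPT-2 (124M) reproduction described above boils down to two steps: tokenize OpenWebText, then launch `train.py` across the node's 8 GPUs. A sketch of that sequence, based on the repo's documented quick start (exact paths and flags may differ from your checkout):

```shell
# Download and tokenize the OpenWebText dataset into train.bin / val.bin
python data/openwebtext/prepare.py

# Launch distributed training on a single node with 8 GPUs;
# defaults in train.py target the GPT-2 124M configuration
torchrun --standalone --nproc_per_node=8 train.py
```

On a smaller machine, `train.py` can also be run directly with `python train.py` and reduced batch/model settings, at the cost of a much longer (or partial) reproduction.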
![repro124m](assets/gpt2_124M_loss.png)