Stanford CS336 · from scratch

Lecture Atlas

Suppose you had to build a modern language model from nothing — no framework, no pretrained weights, no recipe. What would you actually need to know? Stanford CS336 spends eighteen lectures answering that question, and this atlas is the map: every lecture a node on the lifecycle from raw data to a served, aligned model. Click one for its concepts, takeaways, and the lines worth remembering.

Distilled from Stanford CS336: Language Modeling from Scratch (Spring 2025). Cards are study notes — watch the lecture for the real thing.

The shape of the answer

Building a language model from nothing turns out to mean mastering eight distinct crafts: sourcing and filtering data, choosing an architecture, respecting the memory wall of the hardware, distributing training across machines, predicting returns before you spend, measuring what you built, aligning it with post-training, and serving it cheaply enough to matter. No single lecture is the hard part — the hard part is that every stage constrains every other one. That is what the map above is for.

If you want the longer story of how the field got here, the companion course How LLMs Came to Be walks the history behind this stack.
