Post
What OpenAI's goblin episode reveals about reward models
A small creature-metaphor habit in GPT-5.5 becomes a clean case study in reward models, proxy objectives, behavior transfer, and synthetic-data feedback loops.
A source-backed guide to machine learning theory, statistics, optimization, and deep learning, organized around the prerequisites that actually connect.
Start
TheoremPath is not a flat syllabus. Pick the layer that matches the missing prerequisite, then move through the graph in order.
Take the diagnostic ->
Atlas
The important objects are dependencies: probability before concentration, concentration before uniform convergence, optimization before training dynamics.
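That ordering is, concretely, a topological order on a prerequisite graph. A minimal sketch using Python's standard library (the node names are illustrative, not the site's actual graph):

```python
from graphlib import TopologicalSorter

# Each topic maps to the prerequisites it depends on
# (illustrative edges, not TheoremPath's real dependency graph).
prereqs = {
    "concentration": {"probability"},
    "uniform_convergence": {"concentration"},
    "training_dynamics": {"optimization"},
}

# static_order() yields nodes so that every prerequisite
# appears before the topics that need it.
order = list(TopologicalSorter(prereqs).static_order())

assert order.index("probability") < order.index("concentration")
assert order.index("concentration") < order.index("uniform_convergence")
assert order.index("optimization") < order.index("training_dynamics")
print(order)
```

Any path the site recommends through the graph must be consistent with such an order; the choice among independent topics is where the diagnostic comes in.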
Method
Topic pages stay public. Sign-in is for saved notes, diagnostics, and review state; the theory itself remains readable without an account.
The site separates a theorem statement, its assumptions, and the page-level explanation so evidence attaches to the claim it actually supports.
Missed items map to prerequisite concepts, not broad topic pages. The next step is a graph repair, not another generic lesson.
Formal wrappers appear only when the Lean theorem matches the governed claim scope and the manifest records the exact proof object.
Labs
Interactive work is for mechanics: gradients moving, random vectors concentrating, matrix maps changing geometry.
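One of those mechanics in miniature: the Euclidean norm of a d-dimensional standard Gaussian vector concentrates tightly around sqrt(d) as d grows. A self-contained sketch of the effect (sample counts and dimensions are illustrative, not the demo's code):

```python
import math
import random

random.seed(0)

def gaussian_norm(d: int) -> float:
    """Euclidean norm of a standard Gaussian vector in R^d."""
    return math.sqrt(sum(random.gauss(0.0, 1.0) ** 2 for _ in range(d)))

# Normalized norms ||x|| / sqrt(d) cluster ever more tightly around 1
# as the dimension grows: concentration of measure.
spreads = {}
for d in (10, 1000, 100000):
    samples = [gaussian_norm(d) / math.sqrt(d) for _ in range(20)]
    spreads[d] = max(samples) - min(samples)
    print(d, round(spreads[d], 4))
```

The spread of the normalized norms shrinks by orders of magnitude between d = 10 and d = 100000, which is exactly the behavior the interactive version lets you watch.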
Browse all demos ->
Recent work
Post
What OpenAI's goblin episode reveals about reward models
Lab
Step through the forward and reverse processes of a toy 2D diffusion model. Watch noise schedules, score estimates, and DDIM vs DDPM samplers side by side.
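The forward half of that process has a closed form, which is what makes stepping it cheap. A sketch under an assumed linear beta schedule (schedule values are illustrative, not the demo's actual configuration):

```python
import math
import random

random.seed(0)

# Linear beta schedule from 1e-4 to 0.02 over T steps (illustrative values).
T = 1000
betas = [1e-4 + (0.02 - 1e-4) * t / (T - 1) for t in range(T)]

# alpha_bar_t is the running product of (1 - beta), the surviving signal fraction.
alpha_bars = []
prod = 1.0
for b in betas:
    prod *= 1.0 - b
    alpha_bars.append(prod)

def forward_sample(x0: float, t: int) -> float:
    """Closed-form DDPM forward process: x_t ~ N(sqrt(abar_t) * x0, 1 - abar_t)."""
    abar = alpha_bars[t]
    return math.sqrt(abar) * x0 + math.sqrt(1.0 - abar) * random.gauss(0.0, 1.0)

# The signal coefficient sqrt(abar_t) decays toward 0: late x_t is almost pure noise.
for t in (0, 250, 999):
    print(t, round(math.sqrt(alpha_bars[t]), 4))
```

Because x_t depends on x_0 in closed form, the lab can jump to any timestep directly rather than simulating every intermediate noising step.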
Lab
Train a 2-layer attention-only transformer in your browser. Watch the induction circuit form during the loss-cliff phase transition; ablate any head to see in-context learning collapse.
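The circuit that forms at the loss cliff implements a simple algorithm: look for an earlier occurrence of the current token, and copy whatever followed it. A plain-Python sketch of that pattern (the algorithmic skeleton, not the transformer itself):

```python
def induction_predict(tokens):
    """At each position t, find earlier positions j where tokens[j-1] matches
    tokens[t], and predict tokens[j] -- the token that followed the match.
    This is the pattern an induction head implements with attention."""
    preds = []
    for t in range(len(tokens)):
        pred = None
        for j in range(1, t + 1):
            if tokens[j - 1] == tokens[t]:
                pred = tokens[j]  # copy what followed the earlier match
        preds.append(pred)
    return preds

print(induction_predict(list("abcab")))  # -> [None, None, None, 'b', 'c']
```

On "abcab" the head has nothing to match until the second "a", where it correctly predicts "b"; ablating the head in the lab removes exactly this copy-forward behavior.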
The fastest route through hard theory is not more pages. It is a visible dependency path and one honest next step.