Prerequisite chain

Prerequisites for Optimizer Theory: SGD, Adam, and Muon

Topics you need before working through Optimizer Theory: SGD, Adam, and Muon. Direct prerequisites are listed first; transitive prerequisites (the chain reachable through them) follow.

Direct prerequisites (8)

  1. Convex Optimization Basics (layer 1, tier 1)
  2. Adam Optimizer (layer 2, tier 1)
  3. Automatic Differentiation (layer 1, tier 1)
  4. Gradient Descent Variants (layer 1, tier 1)
  5. Information Geometry (layer 3, tier 3)
  6. Preconditioned Optimizers: Shampoo, K-FAC, and Natural Gradient (layer 3, tier 2)
  7. Riemannian Optimization and Manifold Constraints (layer 3, tier 2)
  8. Training Dynamics and Loss Landscapes (layer 4, tier 2)

Reachable through the chain (185)

These topics are not directly cited as prerequisites but are reached transitively by following the chain upward. Working through the direct prerequisites pulls these in.