Prerequisite chain

Prerequisites for Transformer Architecture

Topics you need before working through Transformer Architecture. Direct prerequisites are listed first; transitive prerequisites (the chain reachable through them) follow.

Direct prerequisites (15)

  1. Attention Mechanism Theory (layer 4, tier 2)
  2. Feedforward Networks and Backpropagation (layer 2, tier 1)
  3. Softmax and Numerical Stability (layer 1, tier 1)
  4. Adam Optimizer (layer 2, tier 1)
  5. Attention Mechanisms History (layer 3, tier 2)
  6. Byte-Level Language Models (layer 4, tier 3)
  7. Convolutional Neural Networks (layer 3, tier 2)
  8. Deep Learning (Goodfellow, Bengio, Courville) (layer 0B, tier 1)
  9. Distributional Semantics (layer 2, tier 2)
  10. Linear Layer: Shapes, Bias, and Memory (layer 2, tier 1)
  11. Macroeconomic Time-Series Forecasting (layer 4, tier 3)
  12. Recurrent Neural Networks (layer 3, tier 2)
  13. RNNs for Signal Sequences (layer 4, tier 3)
  14. Token Prediction and Language Modeling (layer 3, tier 2)
  15. Word Embeddings (layer 2, tier 2)

Reachable through the chain (154)

These topics are not cited directly as prerequisites but are reached transitively by following the chain upward; working through the direct prerequisites pulls them in.