Quiz Hub
1774 questions across 362 topics. Self-study and multiple choice. Three difficulty tiers.
core · advanced · research
Find your level
10-question diagnostic across all topics. Get a recommended learning path.
~5 min
ai safety
Adversarial Machine Learning T2
5 · 1q
Calibration and Uncertainty Quantification T2
3-7 · 3q
Catastrophic Forgetting T2
4-5 · 3q
Constitutional AI T2
5-6 · 3q
Data Contamination and Evaluation T2
4-5 · 3q
Differential Privacy T2
4-6 · 7q
Ethics and Fairness in ML T2
4-5 · 3q
Mechanistic Interpretability: Features, Circuits, and Causal Faithfulness T1
5-9 · 2q
Reward Hacking T2
6 · 1q
algorithms foundations
applied math
applied statistics
beyond llms
causal semiparametric
concentration probability
Bernstein Inequality T1
4 · 4q
Chernoff Bounds T1
4-7 · 3q
Concentration Inequalities T1
2-10 · 50q
Contraction Inequality T2
7 · 1q
Empirical Processes and Chaining T2
7-9 · 6q
Epsilon-Nets and Covering Numbers T1
6-9 · 4q
Extreme Value Theory T2
6-7 · 2q
Fat Tails and Heavy-Tailed Distributions T1
4-6 · 4q
Hanson-Wright Inequality T2
6-7 · 2q
High-Dimensional Probability (Vershynin) T1
10 · 1q
Matrix Concentration T1
7-8 · 7q
McDiarmid's Inequality T1
5-10 · 13q
Measure Concentration and Geometric Functional Analysis T1
5-9 · 5q
Restricted Isometry Property T2
6-7 · 2q
Stochastic Processes for ML T2
7-8 · 5q
Sub-Exponential Random Variables T1
5-6 · 2q
Sub-Gaussian Random Variables T1
4-9 · 15q
Symmetrization Inequality T1
7-8 · 3q
decision theory
formal verification
foundations
Basic Logic and Proof Techniques T2
2-8 · 18q
Benford's Law T2
2-4 · 3q
Birthday Paradox T2
3-4 · 3q
Cantor's Theorem and Uncountability T2
3-5 · 4q
Cardinality and Countability T2
2-7 · 16q
Common Inequalities T1
2-6 · 10q
Common Probability Distributions T1
1-7 · 42q
Compactness and Heine-Borel T1
2-9 · 15q
Continuity in Rⁿ T1
2-7 · 18q
Counting and Combinatorics T2
2-4 · 3q
Differentiation in Rⁿ T1
1-5 · 21q
Eigenvalues and Eigenvectors T1
2-6 · 20q
Expectation, Variance, Covariance, and Moments T1
1-7 · 41q
Exponential Function Properties T1
1-4 · 13q
Gram Matrices and Kernel Matrices T1
3-5 · 4q
Inner Product Spaces and Orthogonality T1
2-6 · 19q
Integration and Change of Variables T2
4-5 · 6q
Inverse and Implicit Function Theorem T2
3-5 · 4q
Joint, Marginal, and Conditional Distributions T1
1-7 · 30q
KL Divergence T1
2-8 · 16q
Kolmogorov Probability Axioms T1
1-8 · 29q
Linear Independence T1
1-2 · 4q
Markov Chains and Steady State T2
2-9 · 6q
Matrix Norms T1
2-5 · 5q
Matrix Operations and Properties T1
1-7 · 27q
Metric Spaces, Convergence, and Completeness T1
2-5 · 16q
Moment Generating Functions T2
3-7 · 6q
Monty Hall Problem T2
2-5 · 4q
Numerical Stability and Conditioning T1
3 · 4q
Peano Axioms T2
3-8 · 5q
Positive Semidefinite Matrices T1
2-7 · 26q
Random Variables T1
1-7 · 25q
Sequences and Series of Functions T2
4 · 1q
Sets, Functions, and Relations T1
1-8 · 44q
Signals and Systems for ML T2
3-7 · 5q
Singular Value Decomposition T1
2-5 · 12q
Skewness, Kurtosis, and Higher Moments T1
3-5 · 4q
Taylor Expansion T1
2-7 · 6q
Tensors and Tensor Operations T1
1-3 · 4q
Total Variation Distance T1
4-7 · 7q
Triangular Distribution T2
2-5 · 4q
Vectors, Matrices, and Linear Maps T1
1-5 · 36q
Zermelo-Fraenkel Set Theory T2
4-7 · 5q
learning theory
learning theory core
Algorithmic Stability T1
2-9 · 8q
Bias-Complexity Tradeoff T2
3-5 · 5q
Empirical Risk Minimization T1
1-9 · 44q
Glivenko-Cantelli Theorem T2
4-6 · 2q
Hypothesis Classes and Function Spaces T1
1-9 · 10q
Kolmogorov Complexity and MDL T2
3-7 · 7q
Loss Functions T2
3-6 · 10q
No-Free-Lunch Theorem T2
3-6 · 6q
PAC Learning Framework T1
2-10 · 51q
Rademacher Complexity T1
1-10 · 35q
Realizability Assumption T1
4-7 · 12q
Sample Complexity Bounds T1
3-8 · 11q
Understanding Machine Learning (Shalev-Shwartz, Ben-David) T1
4 · 1q
Uniform Convergence T1
3-9 · 14q
VC Dimension T1
2-10 · 58q
llm construction
Attention as Kernel Regression T3
6 · 1q
Attention Is All You Need (Paper) T1
5-6 · 5q
Attention Mechanism Theory T2
4-9 · 11q
Attention Mechanisms History T2
2-4 · 3q
Attention Sinks and Retrieval Decay T2
5-6 · 3q
Attention Variants and Efficiency T2
5-6 · 3q
BERT and the Pretrain-Finetune Paradigm T2
2-4 · 3q
Bits, Nats, Perplexity, and BPB T2
1-4 · 6q
Chain-of-Thought and Reasoning T1
4-6 · 3q
Context Engineering T2
4-5 · 3q
Decoding Strategies T2
3-5 · 3q
Distributed Training Theory T3
4-5 · 4q
DPO vs GRPO vs RL for Reasoning T2
5-7 · 3q
Efficient Transformers Survey T2
5-7 · 3q
Fine-Tuning and Adaptation T1
5-6 · 3q
Flash Attention T2
5-9 · 4q
GPU Compute Model T2
4-6 · 3q
Hallucination Theory T1
4-5 · 3q
Induction Heads T2
5-9 · 2q
Knowledge Distillation T2
4-6 · 4q
Linear Layer: Shapes, Bias, and Memory T1
2-4 · 5q
Mixture of Experts T2
4-7 · 4q
Model Compression and Pruning T2
6-7 · 2q
Optimizer Theory: SGD, Adam, and Muon T1
3-7 · 4q
Perplexity and Language Model Evaluation T2
3 · 1q
Reinforcement Learning from Human Feedback T1
5-7 · 3q
Residual Stream and Transformer Internals T2
3 · 1q
RLHF and Alignment T2
4-6 · 3q
Scaling Compute-Optimal Training T2
5 · 2q
Scaling Laws T1
5-6 · 3q
Sparse Autoencoders for Interpretability: TopK, JumpReLU, Matryoshka, and Scaling T1
5-6 · 4q
Token Prediction and Language Modeling T2
1-2 · 2q
Tokenization and Information Theory T3
3-7 · 6q
Training Dynamics and Loss Landscapes T2
2-6 · 2q
Transformer Architecture T2
1-7 · 11q
mathematical infrastructure
Automatic Differentiation T1
4-6 · 4q
Borel-Cantelli Lemmas T1
4-7 · 6q
Brownian Motion T2
8 · 1q
Characteristic Functions T1
5-7 · 5q
Complex Numbers for Fourier T2
3-7 · 5q
Convex Duality T1
3-8 · 10q
Distance Metrics Compared T2
4-7 · 8q
Fokker–Planck Equation T2
6 · 1q
Functional Analysis Core T2
6-9 · 3q
Information Theory Foundations T2
1-7 · 19q
Ito's Lemma T2
5-6 · 2q
Martingale Theory T2
2-10 · 26q
Matrix Calculus T1
1-9 · 9q
Measure-Theoretic Probability T1
2-10 · 27q
Modes of Convergence of Random Variables T1
3-7 · 13q
Radon-Nikodym and Conditional Expectation T1
3-9 · 8q
Spectral Theory of Operators T3
6-8 · 5q
Stochastic Calculus for ML T3
5-8 · 5q
Stochastic Differential Equations T2
5-6 · 2q
The Hessian Matrix T1
2-7 · 16q
The Jacobian Matrix T1
2-9 · 10q
Vector Calculus Chain Rule T1
4 · 1q
methodology
Ablation Study Design T2
3-5 · 3q
Base Rate Fallacy T2
3-5 · 5q
Causal Inference and the Ladder of Causation T1
6-10 · 9q
Causal Inference Basics T3
3 · 1q
Class Imbalance and Resampling T2
2-4 · 3q
Commons Governance and Institutional Analysis T2
5 · 1q
Confusion Matrices and Classification Metrics T1
1-7 · 13q
Confusion Matrix: MCC, Kappa, and Cost-Sensitive Evaluation T1
4-6 · 5q
Evaluation Metrics and Properties T2
2 · 1q
Exploratory Data Analysis T2
2-3 · 3q
Feature Importance and Interpretability T2
4-5 · 4q
Federated Learning T2
3-6 · 7q
Hypothesis Testing for ML T2
2-8 · 17q
Model Evaluation Best Practices T1
1-5 · 14q
P-Hacking and Multiple Testing T2
3-8 · 5q
Proper Scoring Rules T2
6 · 1q
Reproducibility and Experimental Rigor T2
3-7 · 5q
ROC Curve and AUC T2
4-7 · 3q
Simpson's Paradox T2
3-7 · 6q
Statistical Significance and Multiple Comparisons T2
1-5 · 5q
The Bitter Lesson T1
4-7 · 3q
The Era of Experience T1
5-6 · 3q
Train-Test Split and Data Leakage T1
2-4 · 6q
Types of Bias in Statistics T1
3-4 · 3q
ml methods
Activation Functions T1
1-7 · 36q
AdaBoost T2
3-5 · 3q
AIC and BIC T1
6-7 · 3q
Anomaly Detection T2
3-4 · 3q
Autoencoders T2
2-5 · 4q
Bagging T1
5-6 · 3q
Bayesian Neural Networks T3
5-8 · 2q
Contrastive Learning T2
4-8 · 5q
Convolutional Neural Networks T2
3 · 1q
Cross-Entropy Loss: MLE, KL Divergence, and Classification T1
2-8 · 14q
Data Preprocessing and Feature Engineering T1
2-3 · 4q
Decision Trees and Ensembles T2
3-4 · 3q
Deep Learning for Time Series T2
5-7 · 4q
Dimensionality Reduction Theory T2
2-5 · 5q
Elastic Net T2
3-5 · 5q
EM Algorithm Variants T2
7 · 1q
Ensemble Methods Theory T2
4 · 1q
Feedforward Networks and Backpropagation T1
1-9 · 17q
Gauss-Markov Theorem T1
4-6 · 4q
Gaussian Mixture Models and EM T2
3-6 · 4q
Gaussian Process Regression T2
5-7 · 2q
Generalized Additive Models T2
3-5 · 3q
Generative Adversarial Networks T2
6 · 1q
Gradient Boosting T1
5 · 1q
Graph Neural Networks T2
3-6 · 4q
K-Means Clustering T1
2 · 2q
K-Nearest Neighbors T2
2-3 · 2q
Lasso Regression T1
2-5 · 6q
Linear Regression T1
2-6 · 15q
Logistic Regression T1
2-5 · 6q
Loss Functions Catalog T1
1-3 · 5q
Multi-Class and Multi-Label Classification T2
3-7 · 6q
Naive Bayes T2
3-6 · 2q
Natural Language Processing Foundations T2
3-7 · 5q
Overfitting and Underfitting T1
1-5 · 11q
Perceptron T2
3-7 · 5q
Principal Component Analysis T1
2-6 · 16q
Random Forests T1
4-5 · 2q
Recommender Systems T2
5 · 1q
Recurrent Neural Networks T2
2-7 · 3q
Ridge Regression T1
2-7 · 8q
Score Matching T1
6-8 · 6q
Skip Connections and ResNets T1
5 · 1q
Spectral Clustering T2
8 · 1q
Support Vector Machines T1
2-8 · 8q
t-SNE and UMAP T2
5 · 1q
Time Series Forecasting Basics T2
4-7 · 5q
Transfer Learning T2
3-5 · 3q
Universal Approximation Theorem T1
2-7 · 5q
Variational Autoencoders T1
4-6 · 4q
Word Embeddings T2
3-7 · 6q
XGBoost T2
4-7 · 7q
model timeline
modern generalization
Benign Overfitting T2
5-7 · 3q
Double Descent T2
6-9 · 3q
Gaussian Processes for Machine Learning T3
6-8 · 9q
Grokking T2
4-6 · 3q
Implicit Bias and Modern Generalization T1
7-10 · 4q
Information Bottleneck T3
7 · 1q
Lazy vs Feature Learning T2
7-9 · 2q
Neural Network Optimization Landscape T2
5-7 · 4q
Neural Tangent Kernel: Lazy Training, Kernel Equivalence, μP, and the Limits of Width T1
7-8 · 4q
Optimal Transport and Earth Mover's Distance T2
9 · 1q
PAC-Bayes Bounds T1
6-10 · 4q
Representation Learning Theory T2
7 · 1q
Wasserstein Distances T3
4-7 · 6q
numerical optimization
Bayesian Optimization for Hyperparameters T2
1-5 · 4q
Conditioning and Condition Number T1
4-5 · 5q
Conjugate Gradient Methods T2
6 · 2q
Coordinate Descent T2
3-5 · 3q
Floating-Point Arithmetic T1
2-3 · 3q
Line Search Methods T2
3-7 · 6q
Log-Probability Computation T1
2-5 · 5q
Mirror Descent and Frank-Wolfe T2
5 · 2q
Newton's Method T1
4-5 · 7q
Numerical Linear Algebra T2
4 · 1q
Online Convex Optimization T2
5-6 · 2q
Projected Gradient Descent T2
3-7 · 6q
Proximal Gradient Methods T1
5-7 · 3q
Quasi-Newton Methods T1
5-6 · 2q
Second-Order Optimization Methods T2
6 · 1q
Softmax and Numerical Stability T1
2-7 · 11q
Submodular Optimization T3
5-8 · 5q
Trust Region Methods T2
5 · 1q
optimization function classes
Bias-Variance Tradeoff T2
1-9 · 34q
Convex Optimization Basics T1
1-8 · 32q
Cross-Validation Theory T2
2-6 · 4q
Gradient Descent Variants T1
1-6 · 16q
Gradient Flow and Vanishing Gradients T1
2-7 · 7q
Kernels and Reproducing Kernel Hilbert Spaces T2
3-8 · 5q
Preconditioned Optimizers: Shampoo, K-FAC, and Natural Gradient T2
3 · 2q
Regularization Theory T2
2-4 · 4q
Stochastic Approximation Theory T2
4-7 · 3q
Stochastic Gradient Descent Convergence T1
1-7 · 16q
Subgradients and Subdifferentials T1
4-7 · 5q
predictive uncertainty
rl theory
Actor-Critic Methods T2
5-6 · 2q
Agentic RL and Tool Use T2
5-6 · 3q
Bellman Equations T1
2-7 · 12q
Exploration vs Exploitation T2
2-7 · 11q
Markov Decision Processes T1
4-7 · 3q
Minimax and Saddle Points T2
4 · 1q
Model-Based Reinforcement Learning T2
8 · 1q
Multi-Armed Bandits Theory T2
5-7 · 5q
No-Regret Learning T2
3-7 · 5q
Offline Reinforcement Learning T2
7-8 · 2q
Online Learning and Bandits T2
3-7 · 5q
Policy Gradient Theorem T1
6-8 · 8q
Q-Learning T1
5-7 · 5q
Reward Design and Reward Misspecification T1
6 · 1q
Temporal Difference Learning T2
4 · 1q
Value Iteration and Policy Iteration T1
4-6 · 6q
sampling mcmc
Adaptive Rejection Sampling T3
7 · 1q
Burn-in and Convergence Diagnostics T2
5 · 2q
Coupling Arguments and Mixing Time T3
8-9 · 3q
Gibbs Sampling T1
5-8 · 8q
Hamiltonian Monte Carlo T2
7-9 · 5q
Importance Sampling T1
3-8 · 7q
Langevin Dynamics T2
8-9 · 2q
Markov Chain Monte Carlo T1
5-8 · 7q
Metropolis-Hastings Algorithm T1
5-9 · 10q
Monte Carlo Methods T1
5 · 1q
No-U-Turn Sampler and Neal's Funnel T2
7 · 1q
Rao-Blackwellization T2
3-7 · 6q
Rejection Sampling T2
3-7 · 7q
Variance Reduction Techniques T2
4-7 · 7q
sequential inference
statistical estimation
Asymptotic Statistics: M-Estimators, Delta Method, LAN T1
3-8 · 15q
Basu's Theorem T3
6 · 1q
Bayesian Estimation T2
2-9 · 12q
Bootstrap Methods T1
4-6 · 4q
Central Limit Theorem T1
2-8 · 19q
Cramér-Rao Bound: Information Inequality, Achievability, and Sharper Variants T1
4-7 · 8q
Empirical Bayes vs Hierarchical Bayes T2
6 · 1q
Fisher Information: Curvature, KL Geometry, and the Natural Gradient T1
4-9 · 19q
Goodness-of-Fit Tests T2
3-7 · 5q
Law of Large Numbers T1
1-6 · 16q
Maximum Likelihood Estimation: Theory, Information Identity, and Asymptotic Efficiency T1
1-9 · 52q
Method of Moments T2
5 · 1q
Shrinkage Estimation and the James-Stein Estimator: Inadmissibility, SURE, and Brown's Characterization T1
5-7 · 5q
Stein's Paradox T2
3-7 · 7q
Sufficient Statistics and Exponential Families T2
4-7 · 6q
The EM Algorithm T1
6 · 1q
statistical foundations
Design-Based vs. Model-Based Inference T2
3-7 · 5q
Detection Theory T2
5 · 1q
Fano Inequality T2
7-9 · 2q
Kernel Two-Sample Tests T2
6-7 · 3q
Minimax Lower Bounds: Le Cam, Fano, Assouad, and the Reduction to Testing T1
7-9 · 2q
Neyman-Pearson and Hypothesis Testing Theory T2
2-8 · 6q
Nonresponse and Missing Data T2
3-8 · 5q
Order Statistics T2
3-7 · 5q
Random Matrix Theory Overview T2
7-8 · 2q
Robust Statistics and M-Estimators T2
6 · 1q
Sample Size Determination T2
3-7 · 5q
Survey Sampling Methods T2
3-7 · 5q