Quiz Hub

1774 questions across 362 topics. Self-study and multiple choice. Three difficulty tiers.

core | advanced | research

Find your level: 10-question diagnostic across all topics. Get a recommended learning path.

ai safety

Adversarial Machine Learning [T2]

Calibration and Uncertainty Quantification [T2]

Catastrophic Forgetting [T2]

Constitutional AI [T2]

Data Contamination and Evaluation [T2]

Differential Privacy [T2]

Ethics and Fairness in ML [T2]

Mechanistic Interpretability: Features, Circuits, and Causal Faithfulness [T1]

Reward Hacking [T2]

algorithms foundations

Dynamic Programming [T1]

Fast Fourier Transform [T2]

Greedy Algorithms [T2]

Information Retrieval Foundations [T1]

Matrix Multiplication Algorithms [T2]

applied math

Bayesian State Estimation [T2]

Cryptographic Hash Functions [T2]

Kalman Filter [T1]

Signal Detection Theory [T2]

State Space Models [T2]

Time Series Foundations [T2]

applied statistics

Non-Probability Sampling [T1]

beyond llms

CLIP, OpenCLIP, and SigLIP: Contrastive Language-Image Pretraining [T1]

Diffusion Models [T1]

Vision Transformer Lineage: ViT, DeiT, Swin, MAE, DINOv2, SAM [T1]

causal semiparametric

Double/Debiased Machine Learning [T1]

concentration probability

Bernstein Inequality [T1]

Chernoff Bounds [T1]

Concentration Inequalities [T1]

Contraction Inequality [T2]

Empirical Processes and Chaining [T2]

Epsilon-Nets and Covering Numbers [T1]

Extreme Value Theory [T2]

Fat Tails and Heavy-Tailed Distributions [T1]

Hanson-Wright Inequality [T2]

High-Dimensional Probability (Vershynin) [T1]

Matrix Concentration [T1]

McDiarmid's Inequality [T1]

Measure Concentration and Geometric Functional Analysis [T1]

Restricted Isometry Property [T2]

Stochastic Processes for ML [T2]

Sub-Exponential Random Variables [T1]

Sub-Gaussian Random Variables [T1]

Symmetrization Inequality [T1]

decision theory

Arrow's Impossibility Theorem [T2]

Auction Theory [T2]

Bounded Rationality [T1]

Decision Theory Foundations [T2]

Expected Utility Theory [T2]

Game Theory Foundations [T1]

Kelly Criterion [T2]

Mechanism Design [T2]

Nash Equilibrium [T2]

Prospect Theory [T2]

Von Neumann Minimax Theorem [T2]

formal verification

AlphaProof and AI-Assisted Theorem Proving [T1]

foundations

Basic Logic and Proof Techniques [T2]

Benford's Law [T2]

Birthday Paradox [T2]

Cantor's Theorem and Uncountability [T2]

Cardinality and Countability [T2]

Common Inequalities [T1]

Common Probability Distributions [T1]

Compactness and Heine-Borel [T1]

Continuity in Rⁿ [T1]

Counting and Combinatorics [T2]

Differentiation in Rⁿ [T1]

Eigenvalues and Eigenvectors [T1]

Expectation, Variance, Covariance, and Moments [T1]

Exponential Function Properties [T1]

Gram Matrices and Kernel Matrices [T1]

Inner Product Spaces and Orthogonality [T1]

Integration and Change of Variables [T2]

Inverse and Implicit Function Theorem [T2]

Joint, Marginal, and Conditional Distributions [T1]

KL Divergence [T1]

Kolmogorov Probability Axioms [T1]

Linear Independence [T1]

Markov Chains and Steady State [T2]

Matrix Operations and Properties [T1]

Metric Spaces, Convergence, and Completeness [T1]

Moment Generating Functions [T2]

Monty Hall Problem [T2]

Numerical Stability and Conditioning [T1]

Positive Semidefinite Matrices [T1]

Random Variables [T1]

Sequences and Series of Functions [T2]

Sets, Functions, and Relations [T1]

Signals and Systems for ML [T2]

Singular Value Decomposition [T1]

Skewness, Kurtosis, and Higher Moments [T1]

Taylor Expansion [T1]

Tensors and Tensor Operations [T1]

Total Variation Distance [T1]

Triangular Distribution [T2]

Vectors, Matrices, and Linear Maps [T1]

Zermelo-Fraenkel Set Theory [T2]

learning theory

Adaptive Learning Is Not IID [T2]

learning theory core

Algorithmic Stability [T1]

Bias-Complexity Tradeoff [T2]

Empirical Risk Minimization [T1]

Glivenko-Cantelli Theorem [T2]

Hypothesis Classes and Function Spaces [T1]

Kolmogorov Complexity and MDL [T2]

Loss Functions [T2]

No-Free-Lunch Theorem [T2]

PAC Learning Framework [T1]

Rademacher Complexity [T1]

Realizability Assumption [T1]

Sample Complexity Bounds [T1]

Understanding Machine Learning (Shalev-Shwartz, Ben-David) [T1]

Uniform Convergence [T1]

llm construction

Attention as Kernel Regression [T3]

Attention Is All You Need (Paper) [T1]

Attention Mechanism Theory [T2]

Attention Mechanisms History [T2]

Attention Sinks and Retrieval Decay [T2]

Attention Variants and Efficiency [T2]

BERT and the Pretrain-Finetune Paradigm [T2]

Bits, Nats, Perplexity, and BPB [T2]

Chain-of-Thought and Reasoning [T1]

Context Engineering [T2]

Decoding Strategies [T2]

Distributed Training Theory [T3]

DPO vs GRPO vs RL for Reasoning [T2]

Efficient Transformers Survey [T2]

Fine-Tuning and Adaptation [T1]

Flash Attention [T2]

GPU Compute Model [T2]

Hallucination Theory [T1]

Induction Heads [T2]

Knowledge Distillation [T2]

Linear Layer: Shapes, Bias, and Memory [T1]

Mixture of Experts [T2]

Model Compression and Pruning [T2]

Optimizer Theory: SGD, Adam, and Muon [T1]

Perplexity and Language Model Evaluation [T2]

Reinforcement Learning from Human Feedback [T1]

Residual Stream and Transformer Internals [T2]

RLHF and Alignment [T2]

Scaling Compute-Optimal Training [T2]

Sparse Autoencoders for Interpretability: TopK, JumpReLU, Matryoshka, and Scaling [T1]

Token Prediction and Language Modeling [T2]

Tokenization and Information Theory [T3]

Training Dynamics and Loss Landscapes [T2]

Transformer Architecture [T2]

methodology

Ablation Study Design [T2]

Base Rate Fallacy [T2]

Causal Inference and the Ladder of Causation [T1]

Causal Inference Basics [T3]

Class Imbalance and Resampling [T2]

Commons Governance and Institutional Analysis [T2]

Confusion Matrices and Classification Metrics [T1]

Confusion Matrix: MCC, Kappa, and Cost-Sensitive Evaluation [T1]

Evaluation Metrics and Properties [T2]

Exploratory Data Analysis [T2]

Feature Importance and Interpretability [T2]

Federated Learning [T2]

Hypothesis Testing for ML [T2]

Model Evaluation Best Practices [T1]

P-Hacking and Multiple Testing [T2]

Proper Scoring Rules [T2]

Reproducibility and Experimental Rigor [T2]

ROC Curve and AUC [T2]

Simpson's Paradox [T2]

Statistical Significance and Multiple Comparisons [T2]

The Bitter Lesson [T1]

The Era of Experience [T1]

Train-Test Split and Data Leakage [T1]

Types of Bias in Statistics [T1]

ml methods

Activation Functions [T1]

Anomaly Detection [T2]

Bayesian Neural Networks [T3]

Contrastive Learning [T2]

Convolutional Neural Networks [T2]

Cross-Entropy Loss: MLE, KL Divergence, and Classification [T1]

Data Preprocessing and Feature Engineering [T1]

Decision Trees and Ensembles [T2]

Deep Learning for Time Series [T2]

Dimensionality Reduction Theory [T2]

EM Algorithm Variants [T2]

Ensemble Methods Theory [T2]

Feedforward Networks and Backpropagation [T1]

Gauss-Markov Theorem [T1]

Gaussian Mixture Models and EM [T2]

Gaussian Process Regression [T2]

Generalized Additive Models [T2]

Generative Adversarial Networks [T2]

Gradient Boosting [T1]

Graph Neural Networks [T2]

K-Means Clustering [T1]

K-Nearest Neighbors [T2]

Lasso Regression [T1]

Linear Regression [T1]

Logistic Regression [T1]

Loss Functions Catalog [T1]

Multi-Class and Multi-Label Classification [T2]

Natural Language Processing Foundations [T2]

Overfitting and Underfitting [T1]

Principal Component Analysis [T1]

Random Forests [T1]

Recommender Systems [T2]

Recurrent Neural Networks [T2]

Ridge Regression [T1]

Score Matching [T1]

Skip Connections and ResNets [T1]

Spectral Clustering [T2]

Support Vector Machines [T1]

t-SNE and UMAP [T2]

Time Series Forecasting Basics [T2]

Transfer Learning [T2]

Universal Approximation Theorem [T1]

Variational Autoencoders [T1]

Word Embeddings [T2]

model timeline

DeepSeek Models [T1]

modern generalization

Benign Overfitting [T2]

Double Descent [T2]

Gaussian Processes for Machine Learning [T3]

Implicit Bias and Modern Generalization [T1]

Information Bottleneck [T3]

Lazy vs Feature Learning [T2]

Neural Network Optimization Landscape [T2]

Neural Tangent Kernel: Lazy Training, Kernel Equivalence, μP, and the Limits of Width [T1]

Optimal Transport and Earth Mover's Distance [T2]

PAC-Bayes Bounds [T1]

Representation Learning Theory [T2]

Wasserstein Distances [T3]

numerical optimization

Bayesian Optimization for Hyperparameters [T2]

Conditioning and Condition Number [T1]

Conjugate Gradient Methods [T2]

Coordinate Descent [T2]

Floating-Point Arithmetic [T1]

Line Search Methods [T2]

Log-Probability Computation [T1]

Mirror Descent and Frank-Wolfe [T2]

Newton's Method [T1]

Numerical Linear Algebra [T2]

Online Convex Optimization [T2]

Projected Gradient Descent [T2]

Proximal Gradient Methods [T1]

Quasi-Newton Methods [T1]

Second-Order Optimization Methods [T2]

Softmax and Numerical Stability [T1]

Submodular Optimization [T3]

Trust Region Methods [T2]

optimization function classes

Bias-Variance Tradeoff [T2]

Convex Optimization Basics [T1]

Cross-Validation Theory [T2]

Gradient Descent Variants [T1]

Gradient Flow and Vanishing Gradients [T1]

Kernels and Reproducing Kernel Hilbert Spaces [T2]

Preconditioned Optimizers: Shampoo, K-FAC, and Natural Gradient [T2]

Regularization Theory [T2]

Stochastic Approximation Theory [T2]

Stochastic Gradient Descent Convergence [T1]

Subgradients and Subdifferentials [T1]

predictive uncertainty

Split Conformal Prediction [T1]

Weighted Conformal Prediction Under Covariate Shift [T1]

rl theory

Actor-Critic Methods [T2]

Agentic RL and Tool Use [T2]

Bellman Equations [T1]

Exploration vs Exploitation [T2]

Markov Decision Processes [T1]

Minimax and Saddle Points [T2]

Model-Based Reinforcement Learning [T2]

Multi-Armed Bandits Theory [T2]

No-Regret Learning [T2]

Offline Reinforcement Learning [T2]

Online Learning and Bandits [T2]

Policy Gradient Theorem [T1]

Reward Design and Reward Misspecification [T1]

Temporal Difference Learning [T2]

Value Iteration and Policy Iteration [T1]

sampling mcmc

Adaptive Rejection Sampling [T3]

Burn-in and Convergence Diagnostics [T2]

Coupling Arguments and Mixing Time [T3]

Gibbs Sampling [T1]

Hamiltonian Monte Carlo [T2]

Importance Sampling [T1]

Langevin Dynamics [T2]

Markov Chain Monte Carlo [T1]

Metropolis-Hastings Algorithm [T1]

Monte Carlo Methods [T1]

No-U-Turn Sampler and Neal's Funnel [T2]

Rao-Blackwellization [T2]

Rejection Sampling [T2]

Variance Reduction Techniques [T2]

scientific ml

Classical ODEs: Existence, Stability, and Numerical Methods [T1]

sequential inference

E-Values and Anytime-Valid Inference [T1]

statistical estimation

Asymptotic Statistics: M-Estimators, Delta Method, LAN [T1]

Basu's Theorem [T3]

Bayesian Estimation [T2]

Bootstrap Methods [T1]

Central Limit Theorem [T1]

Cramér-Rao Bound: Information Inequality, Achievability, and Sharper Variants [T1]

Empirical Bayes vs Hierarchical Bayes [T2]

Fisher Information: Curvature, KL Geometry, and the Natural Gradient [T1]

Goodness-of-Fit Tests [T2]

Law of Large Numbers [T1]

Maximum Likelihood Estimation: Theory, Information Identity, and Asymptotic Efficiency [T1]

Method of Moments [T2]

Shrinkage Estimation and the James-Stein Estimator: Inadmissibility, SURE, and Brown's Characterization [T1]

Stein's Paradox [T2]

Sufficient Statistics and Exponential Families [T2]

The EM Algorithm [T1]

statistical foundations

Design-Based vs. Model-Based Inference [T2]

Detection Theory [T2]

Fano Inequality [T2]

Kernel Two-Sample Tests [T2]

Minimax Lower Bounds: Le Cam, Fano, Assouad, and the Reduction to Testing [T1]

Neyman-Pearson and Hypothesis Testing Theory [T2]

Nonresponse and Missing Data [T2]

Order Statistics [T2]

Random Matrix Theory Overview [T2]

Robust Statistics and M-Estimators [T2]

Sample Size Determination [T2]

Survey Sampling Methods [T2]

training techniques

Adam Optimizer [T1]

Batch Normalization [T1]

Batch Size and Learning Dynamics [T2]

Data Augmentation Theory [T2]

Label Smoothing and Regularization [T2]

Learning Rate Scheduling [T1]

Regularization in Practice [T1]

Weight Initialization [T1]