
Unlock: Data Contamination and Evaluation

When training data overlaps with test benchmarks, scores no longer measure generalization and can become meaningless. This topic covers the types of contamination, detection methods, dynamic benchmarks, and how to read evaluation claims skeptically.

