
Unlock: Reward Hacking

Goodhart's law for AI: models exploit weaknesses in the reward model instead of being genuinely helpful. Covers verbosity hacking, sycophancy, and structured mitigation strategies.

420 Prerequisites | 0 Mastered | 0 Working | 278 Gaps
Prerequisite mastery: 34%
Recommended probe

Bernstein Inequality is your weakest prerequisite with available questions. You haven't been assessed on this topic yet.

