Unlock: Offline Reinforcement Learning
Learning policies from a fixed dataset without environment interaction: distributional shift as the core challenge, conservative Q-learning (CQL) as a standard fix, and Decision Transformer as an alternative, sequence-modeling approach.
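To make the CQL idea concrete, here is a minimal sketch of the conservative penalty for a discrete action space. It assumes the CQL(H) form of the regularizer, where the log-sum-exp of Q over all actions is pushed down and the Q-value of the logged dataset action is pushed up. The function name `cql_penalty` and its arguments are hypothetical, chosen for illustration only; in a full training loss this term would typically be added to the usual TD error with a weighting coefficient.

```python
import numpy as np

def cql_penalty(q_values, dataset_actions):
    """Illustrative CQL(H)-style penalty (hypothetical helper, not a library API).

    q_values:        array of shape (batch, num_actions), Q(s, a) for every action
    dataset_actions: integer array of shape (batch,), the actions logged in the dataset
    Returns the batch mean of logsumexp_a Q(s, a) - Q(s, a_data).
    """
    q_values = np.asarray(q_values, dtype=float)
    dataset_actions = np.asarray(dataset_actions)
    # Numerically stable log-sum-exp over the action axis.
    q_max = q_values.max(axis=1, keepdims=True)
    lse = q_max[:, 0] + np.log(np.exp(q_values - q_max).sum(axis=1))
    # Q-value of the action that was actually taken in the dataset.
    q_data = q_values[np.arange(len(dataset_actions)), dataset_actions]
    return float((lse - q_data).mean())

# High Q on the logged action gives a small penalty;
# high Q on an action never seen in the data gives a large one.
in_data = cql_penalty([[10.0, 0.0]], [0])
out_of_data = cql_penalty([[10.0, 0.0]], [1])
```

The penalty is small when the highest Q-values sit on actions the dataset supports and large when they sit on unseen actions, which is how it counteracts overestimation caused by distributional shift.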
Recommended probe
Natural Language Processing Foundations is your weakest prerequisite with available questions. You haven't been assessed on this topic yet.
Not assessed, 5 questions
Q-Learning (Core)
Not assessed, 5 questions