Offline Reinforcement Learning Reading List
Curated by Mouhssine Rifaki | Stanford Electrical Engineering | Last updated April 2026
Learning policies from fixed datasets without environment interaction: the data-driven paradigm for RL.
- Off-Policy Deep Reinforcement Learning without Exploration
Fujimoto, Meger, Precup. ICML 2019.
- Conservative Q-Learning for Offline Reinforcement Learning
Kumar et al. NeurIPS 2020.
- Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems
Levine et al. arXiv 2020.
- Decision Transformer: Reinforcement Learning via Sequence Modeling
Chen et al. NeurIPS 2021.
- A Minimalist Approach to Offline Reinforcement Learning
Fujimoto and Gu. NeurIPS 2021.
- Offline Reinforcement Learning with Implicit Q-Learning
Kostrikov, Nair, Levine. ICLR 2022.
- COMBO: Conservative Offline Model-Based Policy Optimization
Yu et al. NeurIPS 2021.
- D4RL: Datasets for Deep Data-Driven Reinforcement Learning
Fu et al. arXiv 2020.
- Behavior Regularized Offline Reinforcement Learning
Wu et al. arXiv 2019.
- Why Should I Trust You, Bellman? The Bellman Error is a Poor Replacement for Value Error
Fujimoto et al. ICML 2022.