D4RL: Datasets for Deep Data-Driven Reinforcement Learning, J Fu, et al., arxiv
Batch Policy Learning under Constraints, HM Le, et al., arxiv
Off-Policy Deep Reinforcement Learning without Exploration, S Fujimoto, et al., ICML 19
Behavior Regularized Offline Reinforcement Learning, Y Wu, et al., arxiv
Conservative Q-Learning for Offline Reinforcement Learning, A Kumar, et al., NeurIPS 20
Critic Regularized Regression, Z Wang, et al., NeurIPS 20
MOPO: Model-based Offline Policy Optimization, T Yu, et al., NeurIPS 20
MOReL: Model-Based Offline Reinforcement Learning, R Kidambi, et al., arxiv