This repository contains the code and datasets used to produce the results in our paper published in TMLR.
Installation instructions for the DeepMind Control suite can be found here.
Installation instructions for the Maze environment can be found here.
Datasets can be downloaded from here.
Appendix C.2 of the paper states that the random-medium-expert datasets for the DMC suite environments/tasks contain 1M transitions, the same as the medium and expert datasets. This is incorrect: the paper should have stated that the random-medium-expert datasets contain 200k transitions. These datasets were intentionally made smaller than the medium and expert datasets, as we wanted to create more of a challenge based on sub-optimality, diversity and a (relatively) small number of transitions.
We provide individual examples of running each algorithm for one set of DMC and Maze datasets. To train on a different dataset, simply update the associated parameters.
For the DMC suite we have created a separate package for loading environments and datasets. It can be installed by cloning this repository, navigating to the root and running:
pip install -r requirements.txt
Further instructions, including folder structures for the data, can be accessed here.
When training on a different dataset, be sure to also update the expert and random scores as well as the number of sub-action dimensions.
A full list of expert and random scores is available in the file "Expert_Random_Scores.csv".
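As a rough illustration of why these two scores matter, the sketch below rescales a raw episode return so that the random score maps to 0 and the expert score maps to 100. This is the normalization commonly used in offline RL benchmarks and is an assumption here, not necessarily the exact formula used in this repository. The function name `normalized_score` and the example values are hypothetical; take the real scores from "Expert_Random_Scores.csv".

```python
def normalized_score(raw_return: float, random_score: float, expert_score: float) -> float:
    """Rescale a raw return so that random -> 0 and expert -> 100.

    Assumed normalization (common in offline RL benchmarks); the
    repository's own evaluation code may differ.
    """
    if expert_score == random_score:
        raise ValueError("expert_score and random_score must differ")
    return 100.0 * (raw_return - random_score) / (expert_score - random_score)


# Hypothetical values for illustration only; use the scores listed in
# Expert_Random_Scores.csv for the dataset you are training on.
example = normalized_score(raw_return=500.0, random_score=0.0, expert_score=1000.0)
print(example)
```

If the expert and random scores are left at the values for a different environment, a normalized score computed this way is no longer comparable across tasks, which is why they need updating together with the dataset.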
If you experience any problems or have any queries, please open an issue or a pull request.