We integrate DreamerV2 with behavior cloning: a set of off-policy demonstration trajectories is used to make world model learning more efficient.
Compared with DreamerV2, this project adds the following:
- Demonstration-guided world model learning and policy learning: additional off-policy demonstration data is sampled from the replay buffer for both world model learning and behavior learning.
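The idea of drawing demonstration data alongside the agent's own experience can be sketched as follows. This is a minimal illustration, not the repository's implementation: the class name `MixedReplayBuffer`, the `demo_ratio` parameter, and the fixed-ratio mixing rule are all assumptions made for the example.

```python
import random
from collections import deque


class MixedReplayBuffer:
    """Hypothetical buffer mixing demonstration and online trajectories."""

    def __init__(self, capacity=1000, demo_ratio=0.25, seed=0):
        if not 0.0 <= demo_ratio <= 1.0:
            raise ValueError("demo_ratio must be in [0, 1]")
        self.demo = []                         # demonstrations are never evicted
        self.online = deque(maxlen=capacity)   # agent experience, oldest dropped first
        self.demo_ratio = demo_ratio           # assumed share of demos per batch
        self.rng = random.Random(seed)

    def add_demo(self, trajectory):
        self.demo.append(trajectory)

    def add_online(self, trajectory):
        self.online.append(trajectory)

    def sample(self, batch_size):
        """Return a shuffled list of (trajectory, is_demo) pairs."""
        if not self.demo and not self.online:
            raise ValueError("buffer is empty")
        if not self.online:
            n_demo = batch_size                # only demonstrations available
        elif not self.demo:
            n_demo = 0                         # only online experience available
        else:
            n_demo = round(batch_size * self.demo_ratio)
        online = list(self.online)
        batch = [(self.rng.choice(self.demo), True) for _ in range(n_demo)]
        batch += [(self.rng.choice(online), False)
                  for _ in range(batch_size - n_demo)]
        self.rng.shuffle(batch)
        return batch


if __name__ == "__main__":
    buf = MixedReplayBuffer(capacity=10, demo_ratio=0.25)
    buf.add_demo({"source": "demo"})
    for step in range(5):
        buf.add_online({"source": "agent", "step": step})
    batch = buf.sample(8)
    print(sum(is_demo for _, is_demo in batch), "of", len(batch), "are demos")
```

In this sketch, each batch would feed both the world model update and the behavior update, with the `is_demo` flag available if the two sources need different losses (for example, a behavior cloning term applied only to demonstrations).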
For more information:
- Benchmark on AccelNet Surgical Challenge
- Support multiple needle variations
- Integration with gym, dVRK
- Install Anaconda
- Install Dependencies with GPU support

      conda create -n efficient_dreamer python=3.7
      conda activate efficient_dreamer
      conda install cudatoolkit=11.3 -c pytorch
      pip install tensorflow==2.9.0 tensorflow_probability==0.17.0
      conda install cudnn=8.2 -c anaconda
      pip install protobuf==3.20.1

- Install Efficient-Dreamer

      pip install -e .
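After installation, a quick sanity check of the environment might look like the sketch below. The helper `installed_version` is a hypothetical name introduced for this example and is not part of the project; the check only reports which of the pinned packages import successfully and whether TensorFlow can see a GPU.

```python
import importlib


def installed_version(module_name):
    """Return the module's __version__, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    # Some modules do not expose __version__; report that explicitly.
    return getattr(module, "__version__", "unknown")


if __name__ == "__main__":
    for name in ("tensorflow", "tensorflow_probability", "google.protobuf"):
        version = installed_version(name)
        print(f"{name}: {version if version is not None else 'not installed'}")

    if installed_version("tensorflow") is not None:
        import tensorflow as tf
        # An empty list means TensorFlow did not detect a usable GPU.
        print("GPUs:", tf.config.list_physical_devices("GPU"))
```

Run it inside the activated `efficient_dreamer` environment. If the GPU list is empty, the CUDA toolkit and cuDNN versions are the first things to compare against the ones pinned above.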
All commands below assume the current directory is <path to gym-suture>. If it is not, change to it first:

    cd <path to gym-suture>