Asynchronous Advantage Actor-Critic (A3C) with unsupervised auxiliary tasks
Updated the small_maze maze to the version at DeepMind Lab HEAD.
The maze is L-shaped at both ends. Visual results are in the navigation_visual_results folder.
Training results are in the tensorboard_results folder.
Auxiliary tasks
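rp - Reward prediction (skewed sampling)
vp - Value prediction (unskewed sampling)
rp_vp - Reward prediction and value prediction combined
pc - Pixel control (unskewed sampling)
rp_vp_pc - Reward prediction, value prediction and pixel control combined
fp - Frame prediction (unskewed sampling)
rp_vp_fp - Reward prediction, value prediction and frame prediction combined
ap - Action prediction (unskewed sampling)
ftp - Frame threshold prediction (the target image is thresholded)
flp - Flow prediction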
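The difference between skewed and unskewed sampling can be sketched as below. This is a minimal illustration, not the code in this repository. It assumes the replay buffer exposes a plain list of per-step rewards and that "skewed" means rewarding and non-rewarding steps are drawn with equal probability, as in the UNREAL paper. The function name sample_reward_index and its parameters are hypothetical.

```python
import random


def sample_reward_index(rewards, skewed=True, rng=random):
    """Pick the index of a replay-buffer step to train on.

    rewards: list of per-step rewards (hypothetical buffer layout).
    skewed:  if True, pick a rewarding step with probability 0.5
             whenever both rewarding and non-rewarding steps exist.
    rng:     the random module or a random.Random instance.
    """
    if not rewards:
        raise ValueError("replay buffer is empty")

    # Unskewed sampling: every stored step is equally likely.
    if not skewed:
        return rng.randrange(len(rewards))

    nonzero = [i for i, r in enumerate(rewards) if r != 0]
    zero = [i for i, r in enumerate(rewards) if r == 0]

    # Fall back to uniform sampling if only one class is present.
    if not nonzero or not zero:
        return rng.randrange(len(rewards))

    # Skewed sampling: choose the class first, then a step within it.
    pool = nonzero if rng.random() < 0.5 else zero
    return rng.choice(pool)
```

With sparse rewards, uniform sampling would almost always return a zero-reward step, so a reward predictor could score well by always predicting zero. Balancing the two classes removes that shortcut.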
A larger exploration coefficient beta works best when episodes are short.
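The README does not say what beta multiplies. In the A3C paper, beta weights the policy entropy bonus. Assuming the same holds here, the sketch below shows how beta enters the policy loss for a single step. The function names and the default beta=0.01 are illustrative assumptions, not values taken from this repository.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def policy_loss(logits, action, advantage, beta=0.01):
    """Policy-gradient loss with an entropy bonus for one step.

    loss = -log pi(action) * advantage - beta * H(pi)
    A larger beta rewards high-entropy (more exploratory) policies.
    """
    log_probs = log_softmax(logits)
    # Entropy of the action distribution: H = -sum(p * log p).
    entropy = -sum(math.exp(lp) * lp for lp in log_probs)
    return -log_probs[action] * advantage - beta * entropy
```

Under this assumption, raising beta lowers the loss of near-uniform policies relative to peaked ones, which keeps the agent trying different actions for longer.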
Requires DeepMind Lab (tested at commit 832c50ee2a80b8b1e4a15fd60d1f8c1b7774c8ea).
$ git clone https://github.com/deepmind/lab.git
Place the A3C folder in the DeepMind 'lab' folder, then add this target to the BUILD file there:
py_binary(
    name = "a3c_train",
    srcs = ["A3C/main.py"],
    data = [":deepmind_lab.so"],
    main = "A3C/main.py",
)
In config.py, select the auxiliary task:
''' Choose task '''
CONFIG = FP
From the lab directory:
bazel run :a3c_train
From the A3C directory:
tensorboard --logdir=worker_0:'./train_0',worker_1:'./train_1',worker_2:'./train_2',worker_3:'./train_3'
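If the number of workers changes, the logdir argument above can be generated rather than typed by hand. The helper below is a small sketch. It assumes the train_<i> directory naming shown in the command, and build_logdir_spec is a hypothetical name. Newer TensorBoard releases may expect this comma-separated name:path form under --logdir_spec instead of --logdir, so check the flag against the installed version.

```python
def build_logdir_spec(num_workers, prefix="./train_"):
    """Build a comma-separated name:path list, one entry per worker."""
    return ",".join(
        f"worker_{i}:'{prefix}{i}'" for i in range(num_workers)
    )


# Print the full command for four workers, matching the one above.
print("tensorboard --logdir=" + build_logdir_spec(4))
```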
tensorboard_results folder
navigation_visual_results folder
- Add action to LSTM
- FD - predict frame pixel difference instead of actual frame
[WORK IN PROGRESS]