Atari2600

[Image: breakout]

Reproduce the following reinforcement learning methods:

The performance claimed in the papers can be reproduced on the several games I have tested.

[Figure: DQN training curve]

DQN typically took 2 days of training to reach a score of 400 on the breakout game. My Batch-A3C implementation took less than 2 hours. Both were trained on one GPU, with an extra GPU for simulation.

The x-axis is the number of iterations, not wall time. Iteration speed on a Tesla M40 is about 9.7 it/s for Batch-A3C (B-A3C). Double-DQN (D-DQN) is faster at the beginning but converges to 12 it/s due to exploration annealing.
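To make "exploration annealing" concrete, here is a minimal sketch of a linear epsilon-greedy schedule. The function names and the schedule values (1.0 down to 0.1 over one million steps) are illustrative assumptions, not settings taken from DQN.py. The second helper encodes one plausible reading of the slowdown, also an assumption: a random action needs no network forward pass, so as epsilon shrinks, more actions have to be computed by the network.

```python
def linear_epsilon(step, start=1.0, end=0.1, anneal_steps=1_000_000):
    """Linearly anneal epsilon from `start` to `end`, then hold it at `end`.

    All default values are illustrative assumptions.
    """
    if step >= anneal_steps:
        return end
    frac = step / anneal_steps
    return start + frac * (end - start)


def greedy_fraction(step, **schedule):
    """Expected fraction of actions chosen by the network (not at random)."""
    return 1.0 - linear_epsilon(step, **schedule)
```

Under these assumed values, no action uses the network at step 0, while about 90% do once annealing has finished.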

A demo trained with Double-DQN on breakout is available on YouTube.

How to use

Download Atari ROMs to $TENSORPACK_DATASET/atari_rom (defaults to tensorpack/dataflow/dataset/atari_rom).
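The lookup rule above can be sketched as follows. `atari_rom_dir` and `rom_path` are hypothetical helpers written for illustration, not functions from tensorpack. The sketch assumes the default path is relative to the directory containing the tensorpack checkout.

```python
import os


def atari_rom_dir(tensorpack_root="tensorpack"):
    """Return the directory where ROM files are expected (hypothetical helper).

    Uses $TENSORPACK_DATASET if it is set, otherwise falls back to the
    in-tree default tensorpack/dataflow/dataset.
    """
    base = os.environ.get("TENSORPACK_DATASET")
    if base is None:
        base = os.path.join(tensorpack_root, "dataflow", "dataset")
    return os.path.join(base, "atari_rom")


def rom_path(rom_name):
    """Full path of a ROM file such as 'breakout.bin'."""
    return os.path.join(atari_rom_dir(), rom_name)
```

Checking `os.path.isfile(rom_path("breakout.bin"))` before training is a quick way to confirm the ROM landed in the right place.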

To train:

./DQN.py --rom breakout.bin --gpu 0

To visualize the agent:

./DQN.py --rom breakout.bin --task play --load pretrained.model
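For readers who want to see how the flags in the two commands fit together, here is a rough argparse sketch. It only mirrors the flags shown above (`--rom`, `--gpu`, `--task`, `--load`). The `train` default, the allowed choices, and the help strings are assumptions, and the actual DQN.py may define its options differently.

```python
import argparse


def build_parser():
    """Illustrative parser mirroring the flags used in the commands above."""
    parser = argparse.ArgumentParser(description="Train or play a DQN agent (sketch).")
    parser.add_argument("--rom", required=True, help="ROM file name, e.g. breakout.bin")
    parser.add_argument("--gpu", help="GPU id to use, e.g. 0")
    parser.add_argument("--load", help="path of a saved model to load")
    # 'train' as the default task is an assumption; 'play' appears in the usage above.
    parser.add_argument("--task", choices=["train", "play"], default="train")
    return parser
```

With this sketch, `build_parser().parse_args(["--rom", "breakout.bin", "--gpu", "0"])` yields `task == "train"`, matching the training command, which passes no `--task`.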

A3C code will be released very soon.