Reproduce the following reinforcement learning methods:

- Nature-DQN in: Human-level Control Through Deep Reinforcement Learning
- Double-DQN in: Deep Reinforcement Learning with Double Q-learning
- A3C in: Asynchronous Methods for Deep Reinforcement Learning. (I used a modified version where each batch contains transitions from different simulators, which I call "Batch-A3C"; a sketch of the idea follows this list.)
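Below is a minimal sketch of the Batch-A3C idea: instead of running one training thread per simulator as in vanilla A3C, one transition is taken from each of many simulators and stacked into a single batch, so the network runs a single forward pass for all of them. All names here (`Simulator`, `collect_batch`, the random policy) are hypothetical placeholders, not tensorpack's actual API.

```python
import numpy as np

class Simulator:
    """A stand-in for one Atari emulator instance (dummy states/rewards)."""
    def __init__(self, seed):
        self.rng = np.random.RandomState(seed)
        self.state = self.rng.rand(84, 84, 4).astype("float32")

    def step(self, action):
        reward = float(self.rng.rand() > 0.9)  # dummy sparse reward
        self.state = self.rng.rand(84, 84, 4).astype("float32")
        return self.state, reward

def collect_batch(simulators, policy):
    """Step every simulator once and return one batched transition."""
    states = np.stack([sim.state for sim in simulators])  # (B, 84, 84, 4)
    actions = policy(states)  # one forward pass covers the whole batch
    rewards = []
    for sim, a in zip(simulators, actions):
        _, r = sim.step(a)
        rewards.append(r)
    return states, np.asarray(actions), np.asarray(rewards)

# Usage: 16 simulators stepped together under a random 4-action policy.
sims = [Simulator(seed=i) for i in range(16)]
random_policy = lambda s: np.random.randint(0, 4, size=len(s))
states, actions, rewards = collect_batch(sims, random_policy)
```

Batching across simulators keeps the GPU busy with large forward passes, which is presumably where the speedup over per-thread A3C comes from.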
The performance claimed in the papers can be reproduced on the several games I've tested.
DQN typically took about 2 days of training to reach a score of 400 on Breakout. My Batch-A3C implementation took under 2 hours. Both were trained on one GPU, with an extra GPU for simulation.
The x-axis is the number of iterations, not wall time. Iteration speed on a Tesla M40 is about 9.7 it/s for B-A3C. D-DQN is faster at the beginning but slows to about 12 it/s as exploration is annealed.
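The speed change is likely due to epsilon-greedy exploration: while epsilon is high, most actions are chosen at random and can skip the network's forward pass; as epsilon anneals, more actions require a forward pass. Here is a minimal sketch of a typical linear annealing schedule (the start/end values and schedule length are my assumptions, not necessarily what this code uses):

```python
import numpy as np

def epsilon(step, eps_start=1.0, eps_end=0.1, anneal_steps=1_000_000):
    """Linearly anneal epsilon from eps_start to eps_end over anneal_steps."""
    frac = min(step / anneal_steps, 1.0)
    return eps_start + frac * (eps_end - eps_start)

def act(q_values, step, rng=np.random):
    """Epsilon-greedy action selection over a vector of Q-values."""
    if rng.rand() < epsilon(step):
        return rng.randint(len(q_values))  # explore: random action, no argmax needed
    return int(np.argmax(q_values))        # exploit: greedy action

# Usage: epsilon decays from 1.0 toward 0.1 over the first million steps.
print(epsilon(0), epsilon(500_000), epsilon(2_000_000))  # 1.0, 0.55, 0.1
```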
A demo of an agent trained with Double-DQN on Breakout is available on YouTube.
Download Atari ROMs to `$TENSORPACK_DATASET/atari_rom` (defaults to `tensorpack/dataflow/dataset/atari_rom`).
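For reference, the lookup above could be resolved like this; this is assumed logic and `get_rom_dir` is a hypothetical helper, so check the tensorpack source for the authoritative behavior:

```python
import os

def get_rom_dir():
    """Return $TENSORPACK_DATASET/atari_rom, or the package-relative default."""
    base = os.environ.get("TENSORPACK_DATASET")
    if base:
        return os.path.join(base, "atari_rom")
    # fall back to the default directory mentioned above
    return os.path.join("tensorpack", "dataflow", "dataset", "atari_rom")

print(get_rom_dir())
```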
To train:

    ./DQN.py --rom breakout.bin --gpu 0
To visualize the trained agent:

    ./DQN.py --rom breakout.bin --task play --load pretrained.model
A3C code will be released very soon.