Reinforcement learning algorithms implemented for TensorFlow 2.0+ [DQN, DDPG, AE-DDPG, SAC, PPO, Primal-Dual DDPG]

BigBearBlacken/TF2-RL

 
 

Reinforcement Learning Agents

Implemented for TensorFlow 2.0+

New Updates!

  • DDPG with prioritized replay
  • Primal-Dual DDPG for CMDP
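
For readers unfamiliar with prioritized replay, the idea can be sketched in a few lines of plain Python. This is a minimal illustration of the proportional variant, not the code used in this repo: the class name `PrioritizedReplayBuffer`, the `alpha`/`beta`/`eps` defaults, and the O(n) sampling (instead of a sum-tree) are all assumptions made for brevity.

```python
import random


class PrioritizedReplayBuffer:
    """Proportional prioritized replay (illustrative sketch, O(n) sampling)."""

    def __init__(self, capacity, alpha=0.6, eps=1e-6):
        self.capacity = capacity
        self.alpha = alpha      # 0 = uniform sampling, 1 = fully proportional
        self.eps = eps          # keeps every priority strictly positive
        self.data = []
        self.priorities = []
        self.pos = 0            # next slot to overwrite once the buffer is full

    def add(self, transition):
        # New transitions get the current max priority so they are likely to be replayed.
        p = max(self.priorities, default=1.0)
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(p)
        else:
            self.data[self.pos] = transition
            self.priorities[self.pos] = p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4):
        # Sampling probability is proportional to priority ** alpha.
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        n = len(self.data)
        idxs = random.choices(range(n), weights=probs, k=batch_size)
        # Importance-sampling weights, normalised by the largest weight in the batch.
        weights = [(n * probs[i]) ** (-beta) for i in idxs]
        max_w = max(weights)
        weights = [w / max_w for w in weights]
        return idxs, [self.data[i] for i in idxs], weights

    def update_priorities(self, idxs, td_errors):
        # After a training step, the priority becomes the absolute TD error.
        for i, err in zip(idxs, td_errors):
            self.priorities[i] = abs(err) + self.eps
```

In a DDPG-style training loop, the returned weights would typically scale the per-sample critic loss, and `update_priorities` would be called with the critic's TD errors for the sampled indices.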

Future Plans

  • SAC Discrete

Usage

  • Install the dependencies that each script imports (my tf2 conda env can be used as a reference)
  • Each file contains example code that runs training on the CartPole env
  • Training: python3 TF2_DDPG_LSTM.py
  • TensorBoard: tensorboard --logdir=DDPG/logs

Hyperparameter tuning

Agents

All agents were tested on the CartPole env.

| Name | On/off policy | Model | Action space support |
|---|---|---|---|
| DQN | off-policy | Dense, LSTM | discrete |
| DDPG | off-policy | Dense, LSTM | discrete, continuous |
| AE-DDPG | off-policy | Dense | discrete, continuous |
| SAC :bug: | off-policy | Dense | continuous |
| PPO | on-policy | Dense | discrete, continuous |

Constrained MDP

| Name | On/off policy | Model | Action space support |
|---|---|---|---|
| Primal-Dual DDPG | off-policy | Dense | discrete, continuous |
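
As background on the primal-dual idea for constrained MDPs, the sketch below shows the generic Lagrangian scheme: the agent maximizes reward minus a multiplier times cost, and the multiplier rises while the average cost exceeds its limit. It is an illustration under stated assumptions, not this repo's implementation; the function names, the `cost_limit` parameter, and the learning rate are hypothetical.

```python
def lagrangian_reward(reward, cost, lam):
    """Primal side: the signal the actor/critic would be trained on."""
    return reward - lam * cost


def update_multiplier(lam, avg_cost, cost_limit, lr=0.01):
    """Dual side: projected gradient ascent on the Lagrange multiplier.

    lam grows while the constraint avg_cost <= cost_limit is violated and
    shrinks (never below zero) once it is satisfied.
    """
    return max(0.0, lam + lr * (avg_cost - cost_limit))


# Toy usage: the multiplier increases while the average cost stays above the limit.
lam = 0.0
for avg_cost in [3.0, 2.5, 2.0]:
    lam = update_multiplier(lam, avg_cost, cost_limit=1.0, lr=0.1)
```

Alternating the two steps (a few policy updates on `lagrangian_reward`, then one call to `update_multiplier`) is one common way to arrange the training loop; other schedules are possible.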

Models

The models used to generate the demos are included in the repo, along with Q-value, reward, and/or loss graphs.

Demos

  • DQN Basic, time step = 4, 500 reward
  • DQN LSTM, time step = 4, 500 reward
  • DDPG Basic, 500 reward
  • DDPG LSTM, time step = 5, 500 reward
  • AE-DDPG Basic, 500 reward
  • PPO Basic, 500 reward
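
The "time step" setting in the captions above is not defined in this README. Assuming it means the number of consecutive observations fed to the model as one input sequence, a rolling window like the following could build that input. The class name `ObservationWindow` and the choice to pad the window with copies of the first observation are assumptions for illustration only.

```python
from collections import deque


class ObservationWindow:
    """Keeps the last `time_steps` observations as a fixed-length sequence."""

    def __init__(self, time_steps):
        self.frames = deque(maxlen=time_steps)

    def reset(self, first_obs):
        # Assumption: at episode start, fill the window with the first observation.
        self.frames.clear()
        for _ in range(self.frames.maxlen):
            self.frames.append(list(first_obs))
        return self.sequence()

    def push(self, obs):
        # Appending to a full deque drops the oldest observation automatically.
        self.frames.append(list(obs))
        return self.sequence()

    def sequence(self):
        # Shape: (time_steps, obs_dim), oldest observation first.
        return [list(f) for f in self.frames]
```

With `time_steps=4` and CartPole's 4-dimensional observation, each call returns a 4 x 4 nested list that could be converted to a tensor before being passed to a Dense or LSTM model.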
