fiberleif/Pytorch-RL

Pytorch-RL

PyTorch implementations of popular deep reinforcement learning algorithms, aiming for state-of-the-art (SOTA) performance.

Implemented algorithms:

  • Proximal Policy Optimization (PPO)
  • Deep Deterministic Policy Gradient (DDPG)

Algorithms to be implemented:

  • Trust Region Policy Optimization (TRPO)
  • Generative Adversarial Imitation Learning (GAIL)
  • (Double/Dueling) Deep Q-Learning (DQN)

Dependencies

  • Python 3.6
  • NumPy 1.15
  • SciPy 1.1.0
  • Mujoco-py 0.5.7
  • Gym 0.9.0
  • sklearn 0.0
  • PyTorch v0.4.0
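
Because the versions listed above are fairly old, it can help to confirm what is actually installed before training. The following is a minimal sketch, not part of the repository: the helper `check_versions` is hypothetical, the import names (`mujoco_py` for Mujoco-py, `torch` for PyTorch) are assumptions, and the comparison is a simple prefix match on each module's `__version__` attribute.

```python
import importlib

# Versions copied from the list above. The keys are import names, which are
# an assumption here (mujoco_py for Mujoco-py, torch for PyTorch).
EXPECTED = {
    "numpy": "1.15",
    "scipy": "1.1.0",
    "mujoco_py": "0.5.7",
    "gym": "0.9.0",
    "torch": "0.4.0",
}


def check_versions(expected):
    """Return {name: (installed_version_or_None, expected_prefix, ok)}."""
    report = {}
    for name, wanted in expected.items():
        try:
            module = importlib.import_module(name)
        except Exception:
            # Missing package, or an import that fails for another reason
            # (for example, native libraries that cannot be loaded).
            report[name] = (None, wanted, False)
            continue
        found = getattr(module, "__version__", None)
        # Prefix match, so "1.15" accepts "1.15.4".
        ok = found is not None and str(found).startswith(wanted)
        report[name] = (found, wanted, ok)
    return report
```

Calling `check_versions(EXPECTED)` returns one entry per package; any entry whose last field is `False` is either missing, failed to import, or reports a different version.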

Code Usage

Run the PPO algorithm on the MuJoCo suite

cd ppo
python ppo_train.py --e Reacher-v1 -n 60000 -b 50
python ppo_train.py --e InvertedPendulum-v1
python ppo_train.py --e InvertedDoublePendulum-v1 -n 12000
python ppo_train.py --e Swimmer-v1 -n 2500 -b 5
python ppo_train.py --e Hopper-v1 -n 30000
python ppo_train.py --e HalfCheetah-v1 -n 3000 -b 5
python ppo_train.py --e Walker2d-v1 -n 25000
python ppo_train.py --e Ant-v1 -n 100000
python ppo_train.py --e Humanoid-v1 -n 200000
python ppo_train.py --e HumanoidStandup-v1 -n 200000 -b 5

Run the DDPG algorithm on the MuJoCo suite

cd ddpg
python ddpg_train.py --e Reacher-v1 --start_timesteps 1000
python ddpg_train.py --e InvertedPendulum-v1 --start_timesteps 1000
python ddpg_train.py --e InvertedDoublePendulum-v1 --start_timesteps 1000
python ddpg_train.py --e Swimmer-v1 --start_timesteps 1000
python ddpg_train.py --e Hopper-v1 --start_timesteps 1000
python ddpg_train.py --e HalfCheetah-v1 --start_timesteps 10000
python ddpg_train.py --e Walker2d-v1 --start_timesteps 1000
python ddpg_train.py --e Ant-v1 --start_timesteps 10000
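
To launch the DDPG runs above one after another, a small driver script can be used. This is a rough sketch under stated assumptions, not something shipped with the repository: `build_commands` and `run_all` are hypothetical helpers, the script is assumed to be started from inside the `ddpg` directory, and only the `--e` and `--start_timesteps` flags shown in the commands above are used.

```python
import subprocess
import sys

# Environment -> --start_timesteps value, copied from the commands above.
DDPG_RUNS = {
    "Reacher-v1": 1000,
    "InvertedPendulum-v1": 1000,
    "InvertedDoublePendulum-v1": 1000,
    "Swimmer-v1": 1000,
    "Hopper-v1": 1000,
    "HalfCheetah-v1": 10000,
    "Walker2d-v1": 1000,
    "Ant-v1": 10000,
}


def build_commands(runs, script="ddpg_train.py"):
    """Build one argument list per environment, mirroring the commands above."""
    return [
        [sys.executable, script, "--e", env, "--start_timesteps", str(steps)]
        for env, steps in runs.items()
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            # check=True stops the sequence if a training run exits with an error.
            subprocess.run(argv, check=True)
```

With the default `dry_run=True`, `run_all(build_commands(DDPG_RUNS))` only prints the commands, which makes it easy to review them before starting the actual training runs.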
