PPO

About

This repo contains an optimised implementation of Proximal Policy Optimization (PPO). It uses techniques such as Generalised Advantage Estimation (GAE) and entropy regularisation in an attempt to match the performance of Stable-Baselines3's PPO.
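
For readers unfamiliar with the two techniques named above, here is a minimal, framework-free sketch of how they are commonly computed. It is not taken from PPO.py: the function names (compute_gae, ppo_loss), the argument layout, and the default hyperparameters (gamma, lam, clip_eps, ent_coef) are all assumptions chosen for illustration, and the repo's own code may differ.

```python
import math


def compute_gae(rewards, values, dones, last_value, gamma=0.99, lam=0.95):
    """Generalised Advantage Estimation over one rollout (hypothetical helper).

    rewards, values, dones are equal-length sequences; last_value is the
    critic's estimate for the state following the final step.
    """
    advantages = [0.0] * len(rewards)
    gae = 0.0
    next_value = last_value
    # Walk backwards so each step can reuse the advantage of the next step.
    for t in reversed(range(len(rewards))):
        not_done = 1.0 - float(dones[t])  # stop bootstrapping at episode ends
        delta = rewards[t] + gamma * next_value * not_done - values[t]
        gae = delta + gamma * lam * not_done * gae
        advantages[t] = gae
        next_value = values[t]
    # Value-function targets are the advantages plus the value baseline.
    returns = [a + v for a, v in zip(advantages, values)]
    return advantages, returns


def ppo_loss(new_logp, old_logp, advantages, entropies,
             clip_eps=0.2, ent_coef=0.01):
    """Clipped surrogate policy loss with an entropy bonus (hypothetical helper)."""
    n = len(advantages)
    surrogate = 0.0
    for nl, ol, adv in zip(new_logp, old_logp, advantages):
        ratio = math.exp(nl - ol)  # pi_new(a|s) / pi_old(a|s)
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the more pessimistic of the clipped and unclipped objectives.
        surrogate += min(ratio * adv, clipped * adv)
    policy_loss = -surrogate / n
    mean_entropy = sum(entropies) / n
    # Subtracting the entropy term rewards more exploratory policies.
    return policy_loss - ent_coef * mean_entropy


if __name__ == "__main__":
    adv, ret = compute_gae([1.0, 1.0, 1.0], [0.0, 0.0, 0.0],
                           [False, False, True], last_value=0.0,
                           gamma=1.0, lam=1.0)
    print(adv, ret)
    print(ppo_loss([0.0, 0.0], [0.0, 0.0], adv[:2], [1.0, 1.0]))
```

With lam set to 0 the advantage reduces to the one-step TD error, and with lam set to 1 it becomes the discounted return minus the value baseline, so lam trades bias against variance. In a PyTorch training loop the same arithmetic would typically be applied to tensors so that gradients flow through the new log-probabilities.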

Usage

To train the agent, run train.py.
To visualise the training data in your browser, run tensorboard --logdir runs.
To test the trained policy, run test.py.

Results

Figure: PPO Continuous LunarLander-v2

Figure: PPO Continuous BipedalWalker-v3


Manaro-Alpha/PPO_PyTorch
