Skip to content

Comparing Trust Region Policy Optimization and Natural Policy Gradient in Reinforcement Learning

Notifications You must be signed in to change notification settings

TomFrederik/rl_reproducibility

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

56 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Comparing Trust Region Policy Optimization and Natural Policy Gradient in Reinforcement Learning

The core implementation is contained in the following files:

actor.py - implements the actor class

algorithms.py - implements TRPO and NPG, as well as target algorithms (MC returns and GAE)

experiment_class - implements an experiment class and wrapper class to run experiments with different parameters more smoothly

utils.py - implements useful functions, such as sampling memory or updating pytorch model parameters.

Minimal working examples can be found in

min_example_actor.py

demo.py

experiment_demo.py

For the results, we refer to the actual report

Report.pdf

About

Comparing Trust Region Policy Optimization and Natural Policy Gradient in Reinforcement Learning

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 4

  •  
  •  
  •  
  •  

Languages