Naive implementation of Monte-Carlo Policy-Gradient Control

shuvoxcd01/REINFORCE

REINFORCE

Naive implementation of Monte-Carlo Policy-Gradient Control. CartPole-v0 has been used here as the environment.

The algorithm is given below.
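The figure that originally followed is not reproduced in this text. As a stand-in, here is the textbook form of REINFORCE from Sutton and Barto's "Reinforcement Learning: An Introduction": generate one episode with the current policy, then apply the update below for each step t of that episode. The symbols α (step size), γ (discount factor), and θ (policy parameters) follow the textbook and are assumptions about what the figure showed; the code in this repository may differ in details.

```latex
G_t = \sum_{k=t+1}^{T} \gamma^{\,k-t-1} R_k, \qquad
\theta \leftarrow \theta + \alpha \, \gamma^{t} \, G_t \, \nabla_\theta \ln \pi(A_t \mid S_t, \theta)
```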

There is one trick, though: the return, G, is normalized, which helps keep the algorithm numerically stable.
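The exact normalization is not spelled out here. A common choice, assumed in the sketch below, is to shift the per-step returns of an episode to zero mean and scale them to unit standard deviation. The function names `discounted_returns` and `normalize`, and the `gamma` and `eps` parameters, are hypothetical and not taken from this repository.

```python
import math


def discounted_returns(rewards, gamma=0.99):
    """Compute G_t for every step t, where rewards[t] is the reward
    received after the action at step t and G_t = rewards[t] + gamma * G_{t+1}."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk the episode backwards so each return reuses the next one.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def normalize(returns, eps=1e-8):
    """Assumed normalization: zero mean, unit standard deviation.
    eps guards against division by zero when all returns are equal."""
    mean = sum(returns) / len(returns)
    variance = sum((g - mean) ** 2 for g in returns) / len(returns)
    std = math.sqrt(variance)
    return [(g - mean) / (std + eps) for g in returns]


# CartPole-v0 gives a reward of +1 for every step the pole stays up.
rewards = [1.0, 1.0, 1.0, 1.0]
returns = discounted_returns(rewards, gamma=1.0)
normalized = normalize(returns)
```

Without normalization, the raw returns in CartPole grow with episode length, so the size of the gradient step changes as the policy improves. Rescaling keeps the update magnitude roughly constant from episode to episode, and centring means early steps of an episode are reinforced while late steps are discouraged.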
