PyTorch implementation of Constrained Policy Optimization (CPO)

This repository provides an implementation of CPO in PyTorch that is simple to understand and use. A dummy constraint function is included and can be adapted to your needs.
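As a rough illustration of what adapting the constraint function might look like, here is a minimal sketch of a per-step cost for CartPole. The names `constraint_cost`, `discounted_cost`, and `x_limit` are hypothetical and are not taken from this repository; the only assumption about the environment is that the first entry of a CartPole observation is the cart position.

```python
from typing import Sequence


def constraint_cost(state: Sequence[float], action: int, x_limit: float = 1.0) -> float:
    """Hypothetical per-step cost: 1.0 when the cart leaves [-x_limit, x_limit].

    Assumes state[0] is the cart position (true for Gym's CartPole observations).
    The action is unused here but kept in the signature so that the cost could
    depend on it in other tasks.
    """
    cart_position = state[0]
    return 1.0 if abs(cart_position) > x_limit else 0.0


def discounted_cost(costs: Sequence[float], gamma: float = 0.99) -> float:
    """Discounted sum of per-step costs along one trajectory."""
    total = 0.0
    # Accumulate backwards so each step is discounted by gamma once per time step.
    for cost in reversed(costs):
        total = cost + gamma * total
    return total
```

CPO constrains the expected discounted cost of the policy, so a function of this shape would be evaluated on every transition and summed per trajectory. How the repository wires its own dummy constraint into training may differ.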

Pre-requisites

PyTorch (tested on PyTorch 1.2.0)
OpenAI Gym
MuJoCo (mujoco-py)
If working with a GPU, set OMP_NUM_THREADS to 1 using:

export OMP_NUM_THREADS=1
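If exporting the variable in the shell is inconvenient (for example, when launching from a notebook), the same setting can be applied from Python. This is a sketch rather than something the repository does itself; the variable generally needs to be set before the numerical libraries are imported for it to take effect.

```python
import os

# Set the OpenMP thread count before importing torch or numpy,
# since the OpenMP runtime typically reads it at initialization.
os.environ["OMP_NUM_THREADS"] = "1"

# Import torch (and other numerical libraries) only after this point.
```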

Features

TensorBoard integration to track learning.
The best model is tracked and saved using the value and standard deviation of the average reward.
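The exact rule used to pick the best model is not spelled out here. One plausible way to combine the two statistics, shown purely as an assumption, is to score each iteration by mean reward minus standard deviation, so that high but erratic returns are not preferred over slightly lower, stable ones. `BestModelTracker` is a hypothetical helper, not a class from this repository.

```python
import statistics
from typing import Optional, Sequence


class BestModelTracker:
    """Hypothetical tracker: remembers the iteration with the best (mean - std) score."""

    def __init__(self) -> None:
        self.best_score: Optional[float] = None
        self.best_iter: Optional[int] = None

    def update(self, iteration: int, episode_rewards: Sequence[float]) -> bool:
        """Return True when this iteration becomes the new best (time to save the model)."""
        mean = statistics.fmean(episode_rewards)
        std = statistics.pstdev(episode_rewards)
        score = mean - std  # assumed criterion, not confirmed by the repository
        if self.best_score is None or score > self.best_score:
            self.best_score = score
            self.best_iter = iteration
            return True
        return False
```

In a training loop, a `True` return value would be the point at which the policy weights are written to disk.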

Usage

python algos/main.py --env-name CartPole-v1 --algo-name=CPO --exp-num=1 --exp-name=CPO/CartPole --save-intermediate-model=10 --gpu-index=0 --max-iter=500
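To make the flags in the command above easier to read, here is a sketch of an `argparse` parser that accepts them. It is not the parser from `algos/main.py`: the flag names come from the command, but the types, defaults, and help texts are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Illustrative parser; types, defaults, and help strings are assumed, not the repository's."""
    parser = argparse.ArgumentParser(description="CPO training (illustrative sketch)")
    parser.add_argument("--env-name", default="CartPole-v1", help="Gym environment id")
    parser.add_argument("--algo-name", default="CPO", help="algorithm to run")
    parser.add_argument("--exp-num", type=int, default=1, help="experiment number")
    parser.add_argument("--exp-name", default="CPO/CartPole", help="experiment name or output subdirectory")
    parser.add_argument("--save-intermediate-model", type=int, default=10,
                        help="assumed meaning: save a checkpoint every N iterations")
    parser.add_argument("--gpu-index", type=int, default=0, help="index of the GPU to use")
    parser.add_argument("--max-iter", type=int, default=500, help="number of training iterations")
    return parser


# Parse the same arguments as the usage command (both "--flag value" and "--flag=value" work).
args = build_parser().parse_args([
    "--env-name", "CartPole-v1", "--algo-name=CPO", "--exp-num=1",
    "--exp-name=CPO/CartPole", "--save-intermediate-model=10",
    "--gpu-index=0", "--max-iter=500",
])
```

`argparse` turns dashes into underscores, so the values are available as `args.env_name`, `args.max_iter`, and so on.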

Code Reference

Khrylx/PyTorch-RL

Repository: SapanaChaudhary/PyTorch-CPO
