using PPO implementation in custom environement #2

chaubeyniha · 2023-01-31T14:23:32Z

Hi, thank you for writing this code I found it extremely helpful as a beginner.
I have been using this implementation in a custom environment and I had a general question.

One of the hyperparameters is n_steps, number of steps to run for each environment per update. I was wondering if there is an inherent issue if my custom environment has maximum 250 steps and loses reward for the time that passes.

Can this create a conflict and will it not learn as well? I hope my question makes sense. Please do let me know.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

using PPO implementation in custom environement #2

using PPO implementation in custom environement #2

chaubeyniha commented Jan 31, 2023

using PPO implementation in custom environement #2

using PPO implementation in custom environement #2

Comments

chaubeyniha commented Jan 31, 2023