Inconsistency in solved return for Humanoid environment #456

LabChameleon · 2024-01-05T13:43:25Z


Hi!

Is there an agreed threshold at which the Humanoid environment is considered solved? The Brax paper treats the environment as solved at an average return of about 12000, as can be seen in Figure 9. The Brax training tutorial, however, states a threshold of 13000: https://github.com/google/brax/blob/a89322496dcb07ac5a7e002c2e1d287c8c64b7dd/notebooks/training.ipynb#L261

Thanks!

btaba · 2024-02-11T22:40:23Z

Maintainer

Hi @LabChameleon, looking at Figure 4, the environments are trained for 1e6 steps to a reward of 6-8k, comparing Brax v1 and MuJoCo. Brax has changed considerably since then, and I expect a similar analysis and comparison would be needed to find an average return that "solves" the environment in the current Brax version (and for each physics backend). We don't currently have a particular number for this. @cdfreeman-google may have better insights here.


LabChameleon · 2024-02-12T08:00:43Z

Author

Hi @btaba,

Thanks for your reply and for moving this thread to Discussions. For Humanoid I get rewards of around 12100-12400, and these policies solve the environment qualitatively (running with a basic arm-swing motion). That is why the 13k figure in the tutorial confused me. For a paper, I was hoping to state specific returns at which the environments are considered solved. Instead, I will limit myself to writing that the policies solve the environments qualitatively. I would still be very interested if anyone has more insights here.
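For anyone who wants to report numbers anyway, here is a minimal sketch of how evaluation returns could be summarized against the two candidate thresholds mentioned in this thread. It is not taken from Brax; `summarize_returns` is a hypothetical helper, the list of returns is made-up illustrative data inside the range quoted above, and neither threshold is an official value.

```python
from statistics import mean, stdev


def summarize_returns(episode_returns, thresholds):
    """Return the mean, the sample std, and which thresholds the mean return meets."""
    avg = mean(episode_returns)
    # A sample standard deviation needs at least two episodes.
    spread = stdev(episode_returns) if len(episode_returns) > 1 else 0.0
    met = {name: avg >= value for name, value in thresholds.items()}
    return avg, spread, met


# Hypothetical evaluation returns (illustration only, not measured results).
returns = [12100.0, 12250.0, 12400.0, 12300.0, 12200.0]

# Candidate thresholds discussed in this thread; neither is an agreed standard.
thresholds = {"paper_approx": 12000.0, "tutorial": 13000.0}

avg, spread, met = summarize_returns(returns, thresholds)
print(f"mean return {avg:.1f} +/- {spread:.1f}")
for name, reached in met.items():
    print(f"{name}: {'met' if reached else 'not met'}")
```

Reporting the mean and spread together with the threshold used would at least make the claim reproducible, even without an agreed "solved" value.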

Thanks again!
