Minor Documentation improvements in HumanoidStandup (#1284)
Kallinteris-Andreas authored Jan 1, 2025
1 parent e732459 commit c11ac05
Showing 1 changed file with 3 additions and 3 deletions.
6 changes: 3 additions & 3 deletions gymnasium/envs/mujoco/humanoidstandup_v5.py
@@ -195,11 +195,11 @@ class HumanoidStandupEnv(MujocoEnv, utils.EzPickle):
A reward for moving up (trying to stand up).
This is not a relative reward, measuring how far up the robot has moved since the last timestep,
but an absolute reward measuring how far up the Humanoid has moved up in total.
- It is measured as $w{uph} \times (z_{after action} - 0)/dt$,
- where $z_{after action}$ is the z coordinate of the torso after taking an action,
+ It is measured as $w_{uph} \times \frac{z_{after\_action} - 0}{dt}$,
+ where $z_{after\_action}$ is the z coordinate of the torso after taking an action,
and $dt$ is the time between actions, which depends on the `frame_skip` parameter (default is $5$),
and `frametime`, which is $0.01$ - so the default is $dt = 5 \times 0.01 = 0.05$,
- and $w_{uph}$ is `uph_cost_weight`.
+ and $w_{uph}$ is `uph_cost_weight` (default is $1$).
- *quad_ctrl_cost*:
A negative reward to penalize the Humanoid for taking actions that are too large.
$w_{quad\_control} \times \|action\|_2^2$,
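
For readers who want to check the arithmetic in the docstring above, here is a minimal sketch of the two reward terms it describes. It is not the environment's actual implementation: the function names and the `weight` argument are hypothetical, and only the defaults stated in the docstring (`frame_skip` of 5, `frametime` of 0.01, `uph_cost_weight` of 1) are taken from the source.

```python
# Illustrative sketch only; function names are hypothetical, not Gymnasium's API.

def uph_cost(z_after_action, frame_skip=5, frametime=0.01, uph_cost_weight=1.0):
    """Absolute 'moving up' reward: w_uph * (z_after_action - 0) / dt."""
    # dt is the time between actions: frame_skip simulation frames of frametime each.
    dt = frame_skip * frametime
    return uph_cost_weight * (z_after_action - 0) / dt


def quad_ctrl_cost(action, weight):
    """Control penalty magnitude: weight * squared L2 norm of the action.

    The weight has no default here because the hunk above does not state one.
    """
    return weight * sum(a * a for a in action)


if __name__ == "__main__":
    # With the documented defaults, dt = 5 * 0.01 = 0.05.
    print(uph_cost(1.0))
    print(quad_ctrl_cost([1.0, 2.0], weight=0.5))
```

With the documented defaults, a torso height of 1.0 gives a reward of about 1.0 / 0.05 = 20, which shows how strongly the absolute height term is scaled by the small `dt`.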