docs: fix typo (#1219)
Cyber3x authored Oct 18, 2024
1 parent 9cf678e commit 8ab56a4
Showing 2 changed files with 2 additions and 2 deletions.
2 changes: 1 addition & 1 deletion docs/introduction/create_custom_env.md
@@ -144,7 +144,7 @@ The :meth:`step` method usually contains most of the logic for your environment,
 For our environment, several things need to happen during the step function:
-- We use the self._action_to_direction to convert the discrete action (e.g., 2) to a grid direction with our agent location. To prevent the agent from going out of bounds of the grd, we clip the agen't location to stay within bounds.
+- We use the self._action_to_direction to convert the discrete action (e.g., 2) to a grid direction with our agent location. To prevent the agent from going out of bounds of the grid, we clip the agent's location to stay within bounds.
 - We compute the agent's reward by checking if the agent's current position is equal to the target's location.
 - Since the environment doesn't truncate internally (we can apply a time limit wrapper to the environment during :meth:`make`), we permanently set truncated to False.
 - We once again use _get_obs and _get_info to obtain the agent's observation and auxiliary information.
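For context, the step logic these bullet points describe corresponds roughly to the sketch below, modeled on the GridWorld example from Gymnasium's custom-environment tutorial; attribute names such as `self._action_to_direction`, `self._agent_location`, `self._target_location`, and `self.size` are assumed from that tutorial's conventions rather than shown in this diff.

```python
import numpy as np

# A minimal sketch of the step method described above, assuming the
# GridWorld-style attributes from the tutorial this diff edits.
def step(self, action):
    # Map the discrete action (e.g., 2) to a grid direction.
    direction = self._action_to_direction[action]
    # Clip the agent's location so it stays within the grid bounds.
    self._agent_location = np.clip(
        self._agent_location + direction, 0, self.size - 1
    )
    # The episode terminates when the agent reaches the target.
    terminated = np.array_equal(self._agent_location, self._target_location)
    reward = 1 if terminated else 0
    # The environment never truncates internally; a time limit wrapper can be
    # applied via gymnasium.make, so truncated is permanently False here.
    truncated = False
    observation = self._get_obs()
    info = self._get_info()
    return observation, reward, terminated, truncated, info
```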
2 changes: 1 addition & 1 deletion docs/introduction/record_agent.md
@@ -55,7 +55,7 @@ In the script above, for the :class:`RecordVideo` wrapper, we specify three diff
 For the :class:`RecordEpisodicStatistics`, we only need to specify the buffer lengths, this is the max length of the internal ``time_queue``, ``return_queue`` and ``length_queue``. Rather than collect the data for each episode individually, we can use the data queues to print the information at the end of the evaluation.
-For speed ups in evaluating environments, it is possible to implement this with vector environments to in order to evaluate ``N`` episodes at the same time in parallel rather than series.
+For speed ups in evaluating environments, it is possible to implement this with vector environments in order to evaluate ``N`` episodes at the same time in parallel rather than series.
 ```

 ## Recording the Agent during Training
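For context on this hunk, the pattern the surrounding documentation describes looks roughly like the sketch below: wrap the environment in `gymnasium.wrappers.RecordEpisodeStatistics` (the wrapper the docs refer to) and read the collected queues once evaluation finishes. The `CartPole-v1` environment, the random policy, and the `buffer_length` value are illustrative assumptions, not taken from the diff.

```python
import gymnasium as gym
from gymnasium.wrappers import RecordEpisodeStatistics

num_eval_episodes = 4  # illustrative value, not from the diff

env = gym.make("CartPole-v1")
# buffer_length caps the internal time_queue, return_queue and length_queue.
env = RecordEpisodeStatistics(env, buffer_length=num_eval_episodes)

for _ in range(num_eval_episodes):
    obs, info = env.reset()
    episode_over = False
    while not episode_over:
        action = env.action_space.sample()  # random policy stands in for an agent
        obs, reward, terminated, truncated, info = env.step(action)
        episode_over = terminated or truncated

# Rather than logging each episode individually, read the queues at the end.
print(f"Episode times: {list(env.time_queue)}")
print(f"Episode returns: {list(env.return_queue)}")
print(f"Episode lengths: {list(env.length_queue)}")
env.close()
```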
