diff --git a/main/.buildinfo b/main/.buildinfo
index 0c7fb6353..d7134a43b 100644
--- a/main/.buildinfo
+++ b/main/.buildinfo
@@ -1,4 +1,4 @@
 # Sphinx build info version 1
 # This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
-config: 4fe7d5568ffb88fafbd431a32ff4ff59
+config: 8f1f4633f4779a42ab438b902a042b69
 tags: d77d1c0d9ca2f4c8421862c7c5a0d620
diff --git a/main/_downloads/315c4c52fb68082a731b192d944e2ede/tutorials_python.zip b/main/_downloads/315c4c52fb68082a731b192d944e2ede/tutorials_python.zip
index 5abbd284f..acc63877d 100644
Binary files a/main/_downloads/315c4c52fb68082a731b192d944e2ede/tutorials_python.zip and b/main/_downloads/315c4c52fb68082a731b192d944e2ede/tutorials_python.zip differ
diff --git a/main/_downloads/a5659940aa3f8f568547d47752a43172/tutorials_jupyter.zip b/main/_downloads/a5659940aa3f8f568547d47752a43172/tutorials_jupyter.zip
index dff07d80e..a51091186 100644
Binary files a/main/_downloads/a5659940aa3f8f568547d47752a43172/tutorials_jupyter.zip and b/main/_downloads/a5659940aa3f8f568547d47752a43172/tutorials_jupyter.zip differ
diff --git a/main/introduction/create_custom_env/index.html b/main/introduction/create_custom_env/index.html
index 51dffbf03..5d79262f2 100644
--- a/main/introduction/create_custom_env/index.html
+++ b/main/introduction/create_custom_env/index.html
@@ -476,7 +476,7 @@
We use the self._action_to_direction to convert the discrete action (e.g., 2) to a grid direction with our agent location. To prevent the agent from going out of bounds of the grid, we clip the agent’s location to stay within bounds.
We compute the agent’s reward by checking if the agent’s current position is equal to the target’s location.
-Since the environment doesn’t truncate internally (we can apply a time limit wrapper to the environment during :meth:make), we permanently set truncated to False.
+Since the environment doesn’t truncate internally (we can apply a time limit wrapper to the environment during make()), we permanently set truncated to False.
We once again use _get_obs and _get_info to obtain the agent’s observation and auxiliary information.
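For readers of this hunk, the step logic described in the changed documentation can be sketched roughly as follows. This is a dependency-free illustration, not the tutorial's actual code: the class name GridWorldSketch, the constructor arguments, the action-to-direction mapping, the 1/0 reward values, and the Manhattan-distance info entry are all assumptions made for the example. Only the names _action_to_direction, _get_obs, _get_info, and truncated come from the text above.

```python
# Hypothetical, simplified stand-in for the environment described in the docs.
class GridWorldSketch:
    def __init__(self, size=5, agent=(0, 0), target=(2, 0)):
        self.size = size
        self._agent_location = agent
        self._target_location = target
        # Assumed mapping from discrete actions to (dx, dy) grid directions.
        self._action_to_direction = {
            0: (1, 0),   # right
            1: (0, 1),   # up
            2: (-1, 0),  # left
            3: (0, -1),  # down
        }

    def _get_obs(self):
        return {"agent": self._agent_location, "target": self._target_location}

    def _get_info(self):
        # Assumed auxiliary info: Manhattan distance from agent to target.
        ax, ay = self._agent_location
        tx, ty = self._target_location
        return {"distance": abs(ax - tx) + abs(ay - ty)}

    def step(self, action):
        # Convert the discrete action (e.g., 2) into a grid direction.
        dx, dy = self._action_to_direction[action]
        x, y = self._agent_location
        # Clip each coordinate so the agent stays inside the grid.
        self._agent_location = (
            min(max(x + dx, 0), self.size - 1),
            min(max(y + dy, 0), self.size - 1),
        )
        # The episode ends when the agent reaches the target.
        terminated = self._agent_location == self._target_location
        reward = 1 if terminated else 0  # assumed reward values
        # The environment never truncates internally; a time limit would be
        # applied from outside by a wrapper.
        truncated = False
        return self._get_obs(), reward, terminated, truncated, self._get_info()
```

In this sketch, moving left from the leftmost column leaves the agent where it is because of the clipping, and truncated is False on every step regardless of how long the episode runs.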