updating

Farama-Foundation · Nov 4, 2024 · 366bc34
1 parent 9c8a992
commit 366bc34
Showing 16 changed files with 37 additions and 28 deletions.
diff --git a/docs/_static/img/metaworld-text.svg b/docs/_static/img/metaworld-text.svg
diff --git a/docs/_static/img/ml1-1.gif b/docs/_static/img/ml1-1.gif
diff --git a/docs/_static/img/ml1.gif b/docs/_static/img/ml1.gif
diff --git a/docs/_static/img/ml10-1.gif b/docs/_static/img/ml10-1.gif
diff --git a/docs/_static/img/ml10.gif b/docs/_static/img/ml10.gif
diff --git a/docs/_static/img/ml45-1.gif b/docs/_static/img/ml45-1.gif
diff --git a/docs/_static/img/ml45.gif b/docs/_static/img/ml45.gif
diff --git a/docs/_static/img/mt1-1.gif b/docs/_static/img/mt1-1.gif
diff --git a/docs/_static/img/mt1.gif b/docs/_static/img/mt1.gif
diff --git a/docs/_static/img/mt10-1.gif b/docs/_static/img/mt10-1.gif
diff --git a/docs/_static/img/mt10.gif b/docs/_static/img/mt10.gif
diff --git a/docs/benchmark/benchmark_descriptions.md b/docs/benchmark/benchmark_descriptions.md
@@ -20,8 +20,8 @@ Below, different levels of difficulty are described.
 In the easiest setting, **MT1**, a single task needs to be learned where the agent must *reach*, *push*, or *pick and place* a goal object.
 There is no testing of generalization involved in this setting.
 
-```{figure} _static/mt1.gif
-   :alt: Multi-Task 1 
+```{figure} ../_static/mt1.gif
+   :alt: Multi-Task 1
    :width: 500
 ```
 
@@ -32,8 +32,8 @@ There is no testing of generalization involved in this setting.
 
 
 
-```{figure} _static/mt10.gif
-   :alt: Multi-Task 10 
+```{figure} ../_static/mt10.gif
+   :alt: Multi-Task 10
    :width: 500
 ```
 
@@ -56,8 +56,8 @@ For the test evaluation, unseen goal locations are used to measure generalizatio
 
 
 
-```{figure} _static/ml1.gif
-   :alt: Meta-RL 1 
+```{figure} ../_static/ml1.gif
+   :alt: Meta-RL 1
    :width: 500
 ```
 
@@ -66,8 +66,8 @@ For the test evaluation, unseen goal locations are used to measure generalizatio
 
 The meta-learning setting with 10 tasks, **ML10**, involves training on 10 manipulation tasks and evaluating on 5 unseen tasks during the test phase.
 
-```{figure} _static/ml10.gif
-   :alt: Meta-RL 10 
+```{figure} ../_static/ml10.gif
+   :alt: Meta-RL 10
    :width: 500
 ```
 
@@ -76,7 +76,7 @@ The meta-learning setting with 10 tasks, **ML10**, involves training on 10 manip
 The most difficult environment setting of metaworld, **ML45**, challenges the agent to be trained on 45 distinct manipulation tasks and evaluated on 5 test tasks.
 
 
-```{figure} _static/ml45.gif
-   :alt: Meta-RL 10 
+```{figure} ../_static/ml45.gif
+   :alt: Meta-RL 10
    :width: 500
 ```
diff --git a/docs/usage/basic_usage.md → docs/benchmark/expert_trajectories.md b/docs/usage/basic_usage.md → docs/benchmark/expert_trajectories.md
@@ -1,10 +1,10 @@
 ---
 layout: "contents"
-title: Generate data with expert policies
+title: Expert Trajectories
 firstpage:
 ---
 
-# Generate data with expert policies
+# Expert Trajectories
 
 ## Expert Policies
 For each individual environment in Meta-World (i.e. reach, basketball, sweep) there are expert policies that solve the task. These policies can be used to generate expert data for imitation learning tasks.
@@ -14,13 +14,12 @@ The below example provides sample code for the reach environment. This code can
 
 
 ```python
-from metaworld import MT1
+import gymnasium as gym
+import metaworld
+from metaworld.policies.sawyer_reach_v3_policy import SawyerReachV3Policy as p
 
-from metaworld.policies.sawyer_reach_v2_policy import SawyerReachV2Policy as p
+env = gym.make('MetaWorld/reach-v3')
 
-mt1 = MT1('reach-v2', seed=42)
-env = mt1.train_classes['reach-v2']()
-env.set_task(mt1.train_tasks[0])
 obs, info = env.reset()
 
 policy = p()

diff --git a/docs/benchmark/resetting.md b/docs/benchmark/resetting.md
@@ -0,0 +1,7 @@
+---
+layout: "contents"
+title: Resetting to a Specific State
+firstpage:
+---
+
+# Resetting to a Specific State
diff --git a/docs/index.md b/docs/index.md
@@ -20,20 +20,15 @@ Meta-World is an open-source simulated benchmark for meta-reinforcement learning
 **Basic example:**
 
 ```{code-block} python
+import gymnasium as gym
 import metaworld
-import random
 
-print(metaworld.ML1.ENV_NAMES)  # Check out the available environments
+env = gym.make('MetaWorld/reach-v3')
 
-ml1 = metaworld.ML1('pick-place-v2') # Construct the benchmark, sampling tasks
+obs = env.reset()
+a = env.action_space.sample()
+next_obs, reward, terminate, truncate, info = env.step(a)
 
-env = ml1.train_classes['pick-place-v2']()  # Create an environment with task `pick_place`
-task = random.choice(ml1.train_tasks)
-env.set_task(task)  # Set task
-
-obs = env.reset()  # Reset environment
-a = env.action_space.sample()  # Sample an action
-obs, reward, terminate, truncate, info = env.step(a)
 ```
 
 ```{toctree}
@@ -44,17 +39,19 @@ introduction/basic_usage
 evaluation/evaluation
 installation/installation
 rendering/rendering
-usage/basic_usage
 ```
 
 ```{toctree}
 :hidden:
 :caption: Benchmark Information
+benchmark/environment_creation
 benchmark/state_space
 benchmark/action_space
 benchmark/benchmark_descriptions
 benchmark/env_tasks_vs_task_init
 benchmark/reward_functions
+benchmark/expert_trajectories
+benchmark/resetting
 ```
 
 

diff --git a/docs/introduction/basic_usage.md b/docs/introduction/basic_usage.md
@@ -135,3 +135,5 @@ import metaworld
 
 envs = gym.make('Meta-World/mt-custom-sync', envs_list=['env_name_1-V3', 'env_name_2-V3', 'env_name_3-V3'], seed=seed)
 ```
+
+## Arguments