diff --git a/.gitignore b/.gitignore index 03304668..65249c39 100644 --- a/.gitignore +++ b/.gitignore @@ -1,8 +1,8 @@ # pip distribution folder dist/ -# datasets folder -datasets/ +# datasets folder at top-level (leading slash) +/datasets/ # local test dataset that is lazily downloaded by example scripts tests/assets/test.hdf5 diff --git a/README.md b/README.md index 1593cbe4..28a5dd27 100644 --- a/README.md +++ b/README.md @@ -11,45 +11,62 @@
-[**[Homepage]**](https://arise-initiative.github.io/robomimic-web/) [**[Documentation]**](https://arise-initiative.github.io/robomimic-web/docs/introduction/overview.html) [**[Study Paper]**](https://arxiv.org/abs/2108.03298) [**[Study Website]**](https://arise-initiative.github.io/robomimic-web/study/) [**[ARISE Initiative]**](https://github.com/ARISE-Initiative) +[**[Homepage]**](https://robomimic.github.io/) [**[Documentation]**](https://robomimic.github.io/docs/introduction/overview.html) [**[Study Paper]**](https://arxiv.org/abs/2108.03298) [**[Study Website]**](https://robomimic.github.io/study/) [**[ARISE Initiative]**](https://github.com/ARISE-Initiative) ------- ## Latest Updates +- [05/23/2022] **v0.2.1**: Updated website and documentation to feature more tutorials :notebook_with_decorative_cover: - [12/16/2021] **v0.2.0**: Modular observation modalities and encoders :wrench:, support for [MOMART](https://sites.google.com/view/il-for-mm/home) datasets :open_file_folder: - [08/09/2021] **v0.1.0**: Initial code and paper release ------- -**robomimic** is a framework for robot learning from demonstration. It offers a broad set of demonstration datasets collected on robot manipulation domains, and learning algorithms to learn from these datasets. This project is part of the broader [Advancing Robot Intelligence through Simulated Environments (ARISE) Initiative](https://github.com/ARISE-Initiative), with the aim of lowering the barriers of entry for cutting-edge research at the intersection of AI and Robotics. +**robomimic** is a framework for robot learning from demonstration. +It offers a broad set of demonstration datasets collected on robot manipulation domains and offline learning algorithms to learn from these datasets. +**robomimic** aims to make robot learning broadly *accessible* and *reproducible*, allowing researchers and practitioners to benchmark tasks and algorithms fairly and to develop the next generation of robot learning algorithms. -Imitating human demonstrations is a promising approach to endow robots with various manipulation capabilities. While recent advances have been made in imitation learning and batch (offline) reinforcement learning, a lack of open-source human datasets and reproducible learning methods make assessing the state of the field difficult. The overarching goal of **robomimic** is to provide researchers and practitioners with: +## Core Features -- a **standardized set of large demonstration datasets** across several benchmarking tasks to facilitate fair comparisons, with a focus on learning from human-provided demonstrations -- a **standardized set of large demonstration datasets** across several benchmarking tasks to facilitate fair comparisons, with a focus on learning from human-provided demonstrations (see [this link](https://arise-initiative.github.io/robomimic-web/docs/introduction/quickstart.html#supported-datasets) for a list of supported datasets) -- **high-quality implementations of several learning algorithms** for training closed-loop policies from offline datasets to make reproducing results easy and lower the barrier to entry -- a **modular design** that offers great flexibility in extending algorithms and designing new algorithms ++ +
-This release of **robomimic** contains seven offline learning [algorithms](https://arise-initiative.github.io/robomimic-web/docs/modules/algorithms.html) and standardized [datasets](https://arise-initiative.github.io/robomimic-web/docs/introduction/results.html) collected across five simulated and three real-world multi-stage manipulation tasks of varying complexity. We highlight some features below (for a more thorough list of features, see [this link](https://arise-initiative.github.io/robomimic-web/docs/introduction/quickstart.html#features-overview)): + -## Troubleshooting -Please see the [troubleshooting](https://arise-initiative.github.io/robomimic-web/docs/miscellaneous/troubleshooting.html) section for common fixes, or [submit an issue](https://github.com/ARISE-Initiative/robomimic/issues) on our github page. +## Reproducing benchmarks -## Reproducing study results +The robomimic framework also makes reproducing the results from different benchmarks and datasets easy. See the [datasets page](https://robomimic.github.io/docs/datasets/overview.html) for more information on downloading datasets and reproducing experiments. -The **robomimic** framework also makes reproducing the results from this [study](https://arise-initiative.github.io/robomimic-web/study) easy. See the [results documentation](https://arise-initiative.github.io/robomimic-web/docs/introduction/results.html) for more information. +## Troubleshooting + +Please see the [troubleshooting](https://robomimic.github.io/docs/miscellaneous/troubleshooting.html) section for common fixes, or [submit an issue](https://github.com/ARISE-Initiative/robomimic/issues) on our github page. + +## Contributing to robomimic +This project is part of the broader [Advancing Robot Intelligence through Simulated Environments (ARISE) Initiative](https://github.com/ARISE-Initiative), with the aim of lowering the barriers of entry for cutting-edge research at the intersection of AI and Robotics. +The project originally began development in late 2018 by researchers in the [Stanford Vision and Learning Lab](http://svl.stanford.edu/) (SVL). +Now it is actively maintained and used for robotics research projects across multiple labs. +We welcome community contributions to this project. +For details please check our [contributing guidelines](https://robomimic.github.io/docs/miscellaneous/contributing.html). -## Citations +## Citation Please cite [this paper](https://arxiv.org/abs/2108.03298) if you use this framework in your work: @@ -57,7 +74,7 @@ Please cite [this paper](https://arxiv.org/abs/2108.03298) if you use this frame @inproceedings{robomimic2021, title={What Matters in Learning from Offline Human Demonstrations for Robot Manipulation}, author={Ajay Mandlekar and Danfei Xu and Josiah Wong and Soroush Nasiriany and Chen Wang and Rohun Kulkarni and Li Fei-Fei and Silvio Savarese and Yuke Zhu and Roberto Mart\'{i}n-Mart\'{i}n}, - booktitle={arXiv preprint arXiv:2108.03298}, + booktitle={Conference on Robot Learning (CoRL)}, year={2021} } ``` diff --git a/docs/conf.py b/docs/conf.py index e5ffa057..7402a994 100644 --- a/docs/conf.py +++ b/docs/conf.py @@ -14,7 +14,7 @@ import sys sys.path.insert(0, os.path.abspath('.')) -import sphinx_rtd_theme +import sphinx_book_theme import robomimic @@ -29,7 +29,6 @@ # ones. extensions = [ 'sphinx.ext.napoleon', - 'sphinx_rtd_theme', 'sphinx_markdown_tables', 'sphinx.ext.mathjax', 'sphinx.ext.githubpages', @@ -60,7 +59,7 @@ # General information about the project. 
project = 'robomimic' -copyright = '2021, Ajay Mandlekar, Danfei Xu, Josiah Wong, Soroush Nasiriany, Chen Wang' +copyright = '2022, Ajay Mandlekar, Danfei Xu, Josiah Wong, Soroush Nasiriany, Chen Wang' author = 'Ajay Mandlekar, Danfei Xu, Josiah Wong, Soroush Nasiriany, Chen Wang' # The version info for the project you're documenting, acts as replacement for @@ -98,7 +97,7 @@ # The theme to use for HTML and HTML Help pages. See the documentation for # a list of builtin themes. # -html_theme = 'sphinx_rtd_theme' +html_theme = 'sphinx_book_theme' # Theme options are theme-specific and customize the look and feel of a theme # further. For a list of options available for each theme, see the @@ -111,11 +110,11 @@ # so a file named "default.css" will overwrite the builtin "default.css". html_static_path = ['_static'] -html_context = { - 'css_files': [ - '_static/theme_overrides.css', # override wide tables in RTD theme - ], -} +# html_context = { +# 'css_files': [ +# '_static/theme_overrides.css', # override wide tables in RTD theme +# ], +# } # -- Options for HTMLHelp output ------------------------------------------ diff --git a/docs/datasets/d4rl.md b/docs/datasets/d4rl.md new file mode 100644 index 00000000..ba19e82a --- /dev/null +++ b/docs/datasets/d4rl.md @@ -0,0 +1,55 @@ +# D4RL + +## Overview +The [D4RL](https://arxiv.org/abs/2004.07219) benchmark provides a set of locomotion tasks and demonstration datasets. + +## Downloading + +Use `convert_d4rl.py` in the `scripts/conversion` folder to automatically download and postprocess the D4RL dataset in a single step. For example: + +```sh +# by default, download to robomimic/datasets +$ python convert_d4rl.py --env walker2d-medium-expert-v0 +# download to specific folder +$ python convert_d4rl.py --env walker2d-medium-expert-v0 --folder /path/to/output/folder/ +``` + +- `--env` specifies the dataset to download +- `--folder` specifies where you want to download the dataset. If no folder is provided, the `datasets` folder at the top-level of the repository will be used. + +The script will download the raw hdf5 dataset to `--folder`, and the converted one that is compatible with this repository into the `converted` subfolder. + +## Postprocessing + +No postprocessing is required, assuming the above script is run! + +## D4RL Results + +Below, we provide a table of results on common D4RL datasets using the algorithms included in the released codebase. We follow the convention in the TD3-BC paper, where we average results over the final 10 rollout evaluations, but we use 50 rollouts instead of 10 for each evaluation. Apart from a small handful of the halfcheetah results, the results align with those presented in the [TD3_BC paper](https://arxiv.org/abs/2106.06860). We suspect the halfcheetah results are different because we used `mujoco-py` version `2.0.2.13` in our evaluations, as opposed to `1.5` in order to be consistent with the version we were using for robosuite datasets. The results below were generated with `gym` version `0.17.3` and this `d4rl` [commit](https://github.com/rail-berkeley/d4rl/tree/9b68f31bab6a8546edfb28ff0bd9d5916c62fd1f). 
+ +| | **BCQ** | **CQL** | **TD3-BC** | +| ----------------------------- | ------------- | ------------- | ------------- | +| **HalfCheetah-Medium** | 40.8% (4791) | 38.5% (4497) | 41.7% (4902) | +| **Hopper-Medium** | 36.9% (1181) | 30.7% (980) | 97.9% (3167) | +| **Walker2d-Medium** | 66.4% (3050) | 65.2% (2996) | 77.0% (3537) | +| **HalfCheetah-Medium-Expert** | 74.9% (9016) | 21.5% (2389) | 79.4% (9578) | +| **Hopper-Medium-Expert** | 83.8% (2708) | 111.7% (3614) | 112.2% (3631) | +| **Walker2d-Medium-Expert** | 70.2% (3224) | 77.4% (3554) | 102.0% (4683) | +| **HalfCheetah-Expert** | 94.3% (11427) | 29.2% (3342) | 95.4% (11569) | +| **Hopper-Expert** | 104.7% (3389) | 111.8% (3619) | 112.2% (3633) | +| **Walker2d-Expert** | 80.5% (3699) | 108.0% (4958) | 105.3% (4837) | + + +### Reproducing D4RL Results + +In order to reproduce the results above, first make sure that the `generate_paper_configs.py` script has been run, where the `--dataset_dir` argument is consistent with the folder where the D4RL datasets were downloaded using the `convert_d4rl.py` script. This is also the first step for reproducing results on the released robot manipulation datasets. The `--config_dir` directory used in the script (`robomimic/exps/paper` by default) will contain a `d4rl.sh` script, and a `d4rl` subdirectory that contains all the json configs. The table results above can be generated simply by running the training commands in the shell script. + +## Citation +```sh +@article{fu2020d4rl, + title={D4rl: Datasets for deep data-driven reinforcement learning}, + author={Fu, Justin and Kumar, Aviral and Nachum, Ofir and Tucker, George and Levine, Sergey}, + journal={arXiv preprint arXiv:2004.07219}, + year={2020} +} +``` \ No newline at end of file diff --git a/docs/datasets/momart.md b/docs/datasets/momart.md new file mode 100644 index 00000000..b76195a1 --- /dev/null +++ b/docs/datasets/momart.md @@ -0,0 +1,57 @@ +# MOMART Datasets and Experiments + +## Overview +[Mobile Manipulation RoboTurk (MoMaRT)](https://sites.google.com/view/il-for-mm/home) datasets are a collection of demonstrations collected on 5 long-horizon robot mobile manipulation tasks in a realistic simulated kitchen. + ++ + + + + + + + + + +
+ +## Downloading + + +Warning!
+ +When working with these datasets, please make sure that you have installed [iGibson](http://svl.stanford.edu/igibson/) from source and are on the `momart` branch. Exact steps for installing can be found [HERE](https://sites.google.com/view/il-for-mm/datasets#h.qw0vufk0hknk). + +Create Your Own Environment Wrapper!
+ +If you want to generate your own dataset in a custom environment platform that is not listed above, please see [THIS PAGE](../modules/environments.md#implement-an-environment-wrapper). + +
+
+- **`data`** (group)
+
+ - **`total`** (attribute) - number of state-action samples in the dataset
+
+ - **`env_args`** (attribute) - a json string that contains metadata on the environment and relevant arguments used for collecting data. Three keys: `env_name`, the name of the environment or task to create, `env_type`, one of robomimic's supported [environment types](https://github.com/ARISE-Initiative/robomimic/blob/master/robomimic/envs/env_base.py#L9), and `env_kwargs`, a dictionary of keyword-arguments to be passed into the environment of type `env_name`.
+
+ - **`demo_0`** (group) - group for the first trajectory (every trajectory has a group)
+
+ - **`num_samples`** (attribute) - the number of state-action samples in this trajectory
+
+ - **`model_file`** (attribute) - the xml string corresponding to the MJCF MuJoCo model. Only present for robosuite datasets.
+
+ - **`states`** (dataset) - flattened raw MuJoCo states, ordered by time. Shape (N, D) where N is the length of the trajectory, and D is the dimension of the state vector. Should be empty or have dummy values for non-robosuite datasets.
+
+ - **`actions`** (dataset) - environment actions, ordered by time. Shape (N, A) where N is the length of the trajectory, and A is the action space dimension
+
+ - **`rewards`** (dataset) - environment rewards, ordered by time. Shape (N,) where N is the length of the trajectory.
+
+ - **`dones`** (dataset) - done signal, equal to 1 if playing the corresponding action in the state should terminate the episode. Shape (N,) where N is the length of the trajectory.
+
+ - **`obs`** (group) - group for the observation keys. Each key is stored as a dataset.
+
+ - **`
+
Warning!
+ +Dataset images should be of type `np.uint8` and be stored in channel-last `(H, W, C)` format. This is because: + +- **(1)** this is a common format that many `gym` environments and all `robosuite` environments return image observations in +- **(2)** using `np.uint8` (vs floats) saves space in dataset storage + +Note that the robosuite observation extraction script (`dataset_states_to_obs.py`) already stores images in the correct format. + ++
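A quick way to verify this convention on an existing dataset is to open it with `h5py` and inspect one image observation. This is only a sketch: the dataset path and the `agentview_image` key below are placeholders, so substitute names that actually exist in your file.

```python
import h5py
import numpy as np

# hypothetical dataset path and image observation key -- substitute your own
dataset_path = "/path/to/dataset.hdf5"
image_key = "agentview_image"

with h5py.File(dataset_path, "r") as f:
    images = f["data/demo_0/obs/{}".format(image_key)]
    print("dtype: {}, shape: {}".format(images.dtype, images.shape))
    # expect uint8 values and channel-last (N, H, W, C) layout
    assert images.dtype == np.uint8
    assert images.shape[-1] in (1, 3)
```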
Warning!
+ +Actions should be **normalized between -1 and 1**. This is because this range enables easier policy learning via the use of `tanh` layers. + +The `get_dataset_info.py` script can be used to sanity check stored actions, and will throw an `Exception` if there is a violation. + +Note!
+ +You can easily list the filter keys present in a dataset with the `get_dataset_info.py` script (see [this link](../tutorials/dataset_contents.html#view-dataset-structure-and-videos)), and you can even pass a `--verbose` flag to list the exact demonstrations that each filter key corresponds to. + +Warning!
-# generate json configs for running all experiments at robomimic/exps/paper -$ python generate_paper_configs.py --output_dir /tmp/experiment_results +When working with these datasets, please make sure that you have installed [robosuite](https://robosuite.ai/) from source and are on the `offline_study` branch. -# the training command can be found in robomimic/exps/paper/core.sh -# Training results can be viewed at /tmp/experiment_results (--output_dir when generating paper configs). -$ python train.py --config ../exps/paper/core/lift/ph/low_dim/bc.json -``` +@@ -68,12 +70,17 @@ These datasets were collected by 1 operator using the [RoboTurk](https://robotur | ![lift_real](../images/lift_real.jpg) | ![can_real](../images/can_real.jpg) | ![tool_hang_real](../images/tool_hang_real.jpg) | | [image](http://downloads.cs.stanford.edu/downloads/rt_benchmark/lift_real/ph/demo.hdf5) (1.9 GB) | [image](http://downloads.cs.stanford.edu/downloads/rt_benchmark/can_real/ph/demo.hdf5) (5.3 GB) | [image](http://downloads.cs.stanford.edu/downloads/rt_benchmark/tool_hang_real/ph/demo.hdf5) (58 GB) | +
+
| **Lift<br>(MH)** | **Can<br>(MH)** | **Square<br>(MH)** | **Transport<br>(MH)** |
@@ -83,11 +90,17 @@ These datasets were collected by 6 operators using the [RoboTurk](https://robotu
| [low_dim](http://downloads.cs.stanford.edu/downloads/rt_benchmark/lift/mh/low_dim.hdf5)<br>(46 MB) | [low_dim](http://downloads.cs.stanford.edu/downloads/rt_benchmark/can/mh/low_dim.hdf5)<br>(108 MB) | [low_dim](http://downloads.cs.stanford.edu/downloads/rt_benchmark/square/mh/low_dim.hdf5)<br>(119 MB) | [low_dim](http://downloads.cs.stanford.edu/downloads/rt_benchmark/transport/mh/low_dim.hdf5)<br>(609 MB) |
| [image](http://downloads.cs.stanford.edu/downloads/rt_benchmark/lift/mh/image.hdf5)<br>(2.6 GB) | [image](http://downloads.cs.stanford.edu/downloads/rt_benchmark/can/mh/image.hdf5)<br>(5.1 GB) | [image](http://downloads.cs.stanford.edu/downloads/rt_benchmark/square/mh/image.hdf5)<br>(6.5 GB) | [image](http://downloads.cs.stanford.edu/downloads/rt_benchmark/transport/mh/image.hdf5)<br>(32 GB) |
+
@@ -100,11 +113,17 @@ These datasets were generated by [training](https://github.com/ARISE-Initiative/
| [image (sparse)](http://downloads.cs.stanford.edu/downloads/rt_benchmark/lift/mg/image_sparse.hdf5)<br>(19 GB) | [image (sparse)](http://downloads.cs.stanford.edu/downloads/rt_benchmark/can/mg/image_sparse.hdf5)<br>(48 GB) |
| [image (dense)](http://downloads.cs.stanford.edu/downloads/rt_benchmark/lift/mg/image_dense.hdf5)<br>(19 GB) | [image (dense)](http://downloads.cs.stanford.edu/downloads/rt_benchmark/can/mg/image_dense.hdf5)<br>(48 GB) |
+
| **Can Paired** |
| :----------------------------------------------------------: |
@@ -113,15 +132,39 @@ This is a diagnostic dataset to test the ability of algorithms to learn from mix
| [low_dim (sparse)](http://downloads.cs.stanford.edu/downloads/rt_benchmark/can/paired/low_dim.hdf5)<br>(39 MB) |
| [image (sparse)](http://downloads.cs.stanford.edu/downloads/rt_benchmark/can/paired/image.hdf5)<br>(1.7 GB) |
+
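Any of the hdf5 links above can also be fetched programmatically with the same download utility used in the dataset notebook later in this documentation. A minimal sketch (the chosen URL and download folder are just examples):

```python
import os
import robomimic.utils.file_utils as FileUtils

# example: download the Can (Paired) low_dim dataset listed above
url = "http://downloads.cs.stanford.edu/downloads/rt_benchmark/can/paired/low_dim.hdf5"
download_folder = "/tmp/robomimic_datasets"
os.makedirs(download_folder, exist_ok=True)

# saves low_dim.hdf5 into download_folder
FileUtils.download_url(url=url, download_dir=download_folder)
assert os.path.exists(os.path.join(download_folder, "low_dim.hdf5"))
```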
Want to Run Experiments on Custom Observations?
+ +We provide the raw (observation-free) `demo.hdf5` datasets so that you can generate your own custom set of observations, such as additional camera viewpoints. For more information, see [Extracting Observations from Datasets](robosuite.md#extracting-observations-from-mujoco-states). +**NOTE**: To compare against how our paper's released datasets were generated, please see the `extract_obs_from_raw_datasets.sh` script. + +Post-Processed Dataset Structure
+ +This post-processed `demo.hdf5` file in its current state is _missing_ observations (e.g.: proprioception, images, ...), rewards, and dones, which are necessary for training policies. + +However, keeping these observation-free datasets is useful because it **allows flexibility in [extracting](robosuite.md#extracting-observations-from-mujoco-states) different kinds of observations and rewards**. + ++ +- `data` (group) + + - `total` (attribute) - number of state-action samples in the dataset + + - `env_args` (attribute) - a json string that contains metadata on the environment and relevant arguments used for collecting data + + - `demo_0` (group) - group for the first demonstration (every demonstration has a group) + + - `num_samples` (attribute) - the number of state-action samples in this trajectory + - `model_file` (attribute) - the xml string corresponding to the MJCF MuJoCo model + - `states` (dataset) - flattened raw MuJoCo states, ordered by time + - `actions` (dataset) - environment actions, ordered by time + + - `demo_1` (group) - group for the second demonstration + + ... +
+Warning! Train-Validation Data Splits
+ +For robosuite datasets, if using your own [train-val splits](overview.md#filter-keys), generate these splits _before_ extracting observations. This ensures that all postprocessed hdf5s generated from the `demo.hdf5` inherit the same filter keys. + +
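To double-check which splits a `demo.hdf5` already contains before extracting observations, the filter keys can be listed with `h5py`. This sketch assumes the usual robomimic convention of storing filter keys under a top-level `mask` group; the file path is a placeholder.

```python
import h5py

# hypothetical path -- substitute your own demo.hdf5
with h5py.File("/path/to/demo.hdf5", "r") as f:
    filter_keys = list(f["mask"].keys()) if "mask" in f else []
    print("filter keys: {}".format(filter_keys))
    for fk in filter_keys:
        # each filter key stores the names of the demos that belong to it
        demos = [elem.decode("utf-8") for elem in f["mask/{}".format(fk)][:]]
        print("  {}: {} demos".format(fk, len(demos)))
```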
+ +## Downloading + +Warning!
+ +When working with these datasets, please make sure that you have installed [robosuite](https://robosuite.ai/) from source and are on the `roboturk_v1` branch. + +- - - - - - - - - - -
- -This repository is fully compatible with [MOMART](https://sites.google.com/view/il-for-mm/home) datasets, a large collection of long-horizon, multi-stage simulated kitchen tasks executed by a mobile manipulator robot. See [this link](https://sites.google.com/view/il-for-mm/datasets) for a breakdown of the MOMART dataset structure, guide on downloading MOMART datasets, and running experiments using the datasets. - - - -## D4RL Datasets - -This repository is fully compatible with most [D4RL](https://github.com/rail-berkeley/d4rl) datasets. See [this link](./results.html#d4rl) for a guide on downloading D4RL datasets and running D4RL experiments. - - - -## RoboTurk Pilot Datasets - -The first [RoboTurk paper](https://arxiv.org/abs/1811.02790) released [large-scale pilot datasets](https://roboturk.stanford.edu/dataset_sim.html) collected with robosuite `v0.3`. These datasets consist of over 1000 task demonstrations each on several Sawyer `PickPlace` and `NutAssembly` task variants, collected by several human operators. This repository is fully compatible with these datasets. - -![roboturk_pilot](../images/roboturk_pilot.png) - -To get started, first download the dataset [here](http://cvgl.stanford.edu/projects/roboturk/RoboTurkPilot.zip) (~9 GB download), and unzip the file, resulting in a `RoboTurkPilot` folder. This folder has subdirectories corresponding to each task, each with a raw hdf5 file. You can convert the demonstrations using a command like the one below. - -```sh -# convert the Can demonstrations, and also create a "fastest_225" filter_key (prior work such as IRIS has trained on this subset) -$ python conversion/convert_roboturk_pilot.py --folder /path/to/RoboTurkPilot/bins-Can --n 225 -``` - -Next, make sure that you're on the [roboturk_v1](https://github.com/ARISE-Initiative/robosuite/tree/roboturk_v1) branch of robosuite, which is a modified version of v0.3. **You should always be on the roboturk_v1 branch when using these datasets.** Finally, follow the instructions in the above "Extracting Observations from MuJoCo states" section to extract observations from the raw converted `demo.hdf5` file, in order to produce an hdf5 ready for training. \ No newline at end of file diff --git a/docs/introduction/examples.md b/docs/introduction/examples.md deleted file mode 100644 index 779c62d0..00000000 --- a/docs/introduction/examples.md +++ /dev/null @@ -1,162 +0,0 @@ -# Working with robomimic Modules - -This section discusses some simple examples packaged with the repository (in the top-level `examples` folder) that provide a more thorough understanding of components used in the repository. These examples are meant to assist users who may want to build on these components, or use these components in other applications, in contrast to the [Getting Started](./quickstart.html) section, which provides examples relevant to using the repository as-is. - -## Train Loop Example - -We include a simple example script in `examples/simple_train_loop.py` to show how easy it is to use our `SequenceDataset` class and standardized hdf5 datasets in a general torch training loop. Run the example using the command below. - -```sh -$ python examples/simple_train_loop.py -``` - -Modifying this example for use in other code repositories is simple. First, create the dataset loader as in the script. - -```python -from robomimic.utils.dataset import SequenceDataset - -def get_data_loader(dataset_path): - """ - Get a data loader to sample batches of data. 
- """ - dataset = SequenceDataset( - hdf5_path=dataset_path, - obs_keys=( # observations we want to appear in batches - "robot0_eef_pos", - "robot0_eef_quat", - "robot0_gripper_qpos", - "object", - ), - dataset_keys=( # can optionally specify more keys here if they should appear in batches - "actions", - "rewards", - "dones", - ), - load_next_obs=True, - frame_stack=1, - seq_length=10, # length-10 temporal sequences - pad_frame_stack=True, - pad_seq_length=True, # pad last obs per trajectory to ensure all sequences are sampled - get_pad_mask=False, - goal_mode=None, - hdf5_cache_mode="all", # cache dataset in memory to avoid repeated file i/o - hdf5_use_swmr=True, - hdf5_normalize_obs=False, - filter_by_attribute=None, # can optionally provide a filter key here - ) - print("\n============= Created Dataset =============") - print(dataset) - print("") - - data_loader = DataLoader( - dataset=dataset, - sampler=None, # no custom sampling logic (uniform sampling) - batch_size=100, # batches of size 100 - shuffle=True, - num_workers=0, - drop_last=True # don't provide last batch in dataset pass if it's less than 100 in size - ) - return data_loader - -data_loader = get_data_loader(dataset_path="/path/to/your/dataset.hdf5") -``` - -Then, construct your model, and use the same pattern as in the `run_train_loop` function in the script, to iterate over batches to train the model. - -```python -for epoch in range(1, num_epochs + 1): - - # iterator for data_loader - it yields batches - data_loader_iter = iter(data_loader) - - for train_step in range(gradient_steps_per_epoch): - # load next batch from data loader - try: - batch = next(data_loader_iter) - except StopIteration: - # data loader ran out of batches - reset and yield first batch - data_loader_iter = iter(data_loader) - batch = next(data_loader_iter) - - # @batch is a dictionary with keys loaded from the dataset. - # Train your model on the batch below. - -``` - - - -## Config Example - -The simple config example script at `examples/simple_config.py` shows how the `Config` object can easily be instantiated and modified safely with different levels of locking. We reproduce certain portions of the script. First, we can create a `Config` object and call `lock` when we think we won't need to change it anymore. - -```python -from robomimic.config.base_config import Config - -# create config -config = Config() - -# add nested attributes to the config -config.train.batch_size = 100 -config.train.learning_rate = 1e-3 -config.algo.actor_network_size = [1000, 1000] -config.lock() # prevent accidental changes -``` - -Now, when we try to add a new key (or modify the value of an existing key), the config will throw an error. - -```python -# the config is locked --- cannot add new keys or modify existing keys -try: - config.train.optimizer = "Adam" -except RuntimeError as e: - print(e) -``` - -However, the config can be safely modified using appropriate contexts. - -```python -# values_unlocked scope allows modifying values of existing keys, but not adding keys -with config.values_unlocked(): - config.train.batch_size = 200 -print("batch_size={}".format(config.train.batch_size)) - -# unlock config within the scope, allowing new keys to be inserted -with config.unlocked(): - config.test.num_eval = 10 - -# verify that the config remains locked outside of the scope -assert config.is_locked -assert config.test.is_locked -``` - -Finally, the config can also be updated by using external dictionaries - this is helpful for loading config jsons. 
- -```python -# update this config with external config from a dict -ext_config = { - "train": { - "learning_rate": 1e-3 - }, - "algo": { - "actor_network_size": [1000, 1000] - } -} -with config.values_unlocked(): - config.update(ext_config) - -print(config) -``` - -Please see the [Config documentation](../modules/configs.html) for more information on Config objects. - - - -## Observation Networks Example - -The example script in `examples/simple_obs_net.py` discusses how to construct networks for taking observation dictionaries as input, and that produce dictionaries as outputs. See [this section](../modules/models.html#observation-encoder-and-decoder) in the documentation for more details. - - - -## Custom Observation Modalities Example - -The example script in `examples/add_new_modality.py` discusses how to (a) modify pre-existing observation modalities, and (b) add your own custom observation modalities with custom encoding. See [this section](../modules/models.html#observation-encoder-and-decoder) in the documentation for more details about the encoding and decoding process. \ No newline at end of file diff --git a/docs/introduction/features.md b/docs/introduction/features.md deleted file mode 100644 index 3aa8248f..00000000 --- a/docs/introduction/features.md +++ /dev/null @@ -1,41 +0,0 @@ -# Features Overview - -## Summary - -In this section, we briefly summarize some key features and where you should look to learn more about them. - -1. **Datasets supported by robomimic** - - See a list of supported datasets [here](./features.html#supported-datasets).Running quick sanity check experiments
+ +Make sure to add the `--debug` flag to your experiments as a sanity check that your implementation works. + +Warning!
+ +This example [requires robosuite](./installation.html#robosuite) to be installed (under the `offline_study` branch), but it can be run without robosuite by disabling rollouts in `robomimic/exps/templates/bc.json`: simply change the `experiment.rollout.enabled` flag to `false`. + +1. Create and activate conda environment
```sh -# create a python 3.7 virtual environment $ conda create -n robomimic_venv python=3.7.9 -# activate virtual env $ conda activate robomimic_venv ``` -Next, install [PyTorch](https://pytorch.org/) (in our example below, we chose to use version `1.6.0` with CUDA `10.2`). You can omit the `cudatoolkit=10.2` if you're on a machine without a CUDA-capable GPU (such as a Macbook). +2. Install PyTorch
+ +[PyTorch](https://pytorch.org/) reference + ++ +```sh +# Can change pytorch, torchvision versions +# We don't install cudatoolkit since Mac does not have NVIDIA GPU +$ conda install pytorch==1.6.0 torchvision==0.7.0 -c pytorch +``` + +
+```sh -# install pytorch with specific version of cuda +# Can change pytorch, torchvision versions $ conda install pytorch==1.6.0 torchvision==0.7.0 cudatoolkit=10.2 -c pytorch ``` -Next, we'll install the repository and its requirements. We provide two options - installing from source, and installing from pip. **We strongly recommend installing from source**, as it allows greater flexibility and easier access to scripts and examples. +
+3. Install robomimic
-First, clone the repository from github. +
```sh
-# clone the repository
+$ cd
```sh -# install such that changes to source code will be reflected directly in the installation -$ pip install -e . +$ pip install robomimic ``` -To run a quick test, without any dependence on simulators, run the following example +
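Whichever install route you pick, a quick import is an easy sanity check before moving on (the version attributes printed below are assumed to exist and will depend on what you installed):

```python
# quick check that the installation is importable
import torch
import robomimic
print("robomimic version: {}".format(robomimic.__version__))
print("pytorch version: {}".format(torch.__version__))
```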
+Warning! Additional dependencies might be required
-For maximum functionality though, we also recommend installing [robosuite](https://robosuite.ai/) -- see the section on simulators below. +This is all you need for using the suite of algorithms and utilities packaged with robomimic. However, to use our demonstration datasets, you may need additional dependencies. Please see the [datasets page](../datasets/overview.html) for more information on downloading datasets and reproducing experiments, and see [the simulators section below](installation.html#install-simulators). +
+ Required for running most robomimic examples and released datasets. Compatible with robosuite v1.2+. Install via:
```sh
-$ pip install robomimic
+# From source (recommended)
+$ cd mujoco-py dependency!
+ +Useful for running some of our algorithms on the [D4RL](https://arxiv.org/abs/2004.07219) datasets. + +Install via the instructions [here](https://github.com/rail-berkeley/d4rl). + +
++ +
-## Contributing to robomimic + + + + +## Reproducing benchmarks + +The robomimic framework also makes reproducing the results from different benchmarks and datasets easy. See the [datasets page](../datasets/overview.html) for more information on downloading datasets and reproducing experiments. ## Troubleshooting Please see the [troubleshooting](../miscellaneous/troubleshooting.html) section for common fixes, or [submit an issue](https://github.com/ARISE-Initiative/robomimic/issues) on our github page. -## Reproducing study results - -The **robomimic** framework also makes reproducing the results from this [study](https://arise-initiative.github.io/robomimic-web/study.) easy. See the [reproducing results documentation](./results.html) for more information. +## Contributing to robomimic +This project is part of the broader [Advancing Robot Intelligence through Simulated Environments (ARISE) Initiative](https://github.com/ARISE-Initiative), with the aim of lowering the barriers of entry for cutting-edge research at the intersection of AI and Robotics. +The project originally began development in late 2018 by researchers in the [Stanford Vision and Learning Lab](http://svl.stanford.edu/) (SVL). +Now it is actively maintained and used for robotics research projects across multiple labs. +We welcome community contributions to this project. +For details please check our [contributing guidelines](../miscellaneous/contributing.html). -## Citations +## Citation Please cite [this paper](https://arxiv.org/abs/2108.03298) if you use this framework in your work: -``` +```bibtex @inproceedings{robomimic2021, title={What Matters in Learning from Offline Human Demonstrations for Robot Manipulation}, author={Ajay Mandlekar and Danfei Xu and Josiah Wong and Soroush Nasiriany and Chen Wang and Rohun Kulkarni and Li Fei-Fei and Silvio Savarese and Yuke Zhu and Roberto Mart\'{i}n-Mart\'{i}n}, - booktitle={arXiv preprint arXiv:2108.03298}, + booktitle={Conference on Robot Learning (CoRL)}, year={2021} } ``` \ No newline at end of file diff --git a/docs/introduction/quickstart.md b/docs/introduction/quickstart.md deleted file mode 100644 index 3ac9efd3..00000000 --- a/docs/introduction/quickstart.md +++ /dev/null @@ -1,110 +0,0 @@ -# Getting Started - -This section discusses how to get started with the robomimic repository, by providing examples of how to train and evaluate models. - -## Training Models - -This section discusses how models can be trained. - -**Note:** These examples [require robosuite](./installation.html#robosuite) to be installed, but they can run without robosuite by disabling rollouts in `robomimic/configs/base_config.py`, `robomimic/exps/templates/bc.json`, and `examples/train_bc_rnn.py`. - -### Run a quick example - -To see a quick example of a training run, along with the outputs, run the `train_bc_rnn.py` script in the `examples` folder (the `--debug` flag is used to ensure the training run only takes a few seconds). - -```sh -$ python train_bc_rnn.py --debug -``` - -The default dataset used is the one in `tests/assets/test.hdf5` and the default directory where results are saved for the example training run is in `tests/tmp_model_dir`. Both can be overridden by passing arguments to the above script. - -**Warning:** If you are using the default dataset (and rollouts are enabled), please make sure that robosuite is on the `offline_study` branch of robosuite. 
- -After the script finishes, you can check the training outputs in the output directory (`tests/tmp_model_dir/bc_rnn_example` by default). See the "Viewing Training Results" section below for more information on interpreting the output. - -### Ways to launch training runs - -In this section, we describe the different ways to launch training runs. - -#### Using a config json (preferred) - -One way is to use the `train.py` script, and pass a config json via the `--config` argument. The dataset can be specified by setting the `data` attribute of the `train` section of the config json, or specified via the `--dataset` argument. The example below runs a default template json for the BC algorithm. **This is the preferred way to launch training runs.** - -```sh -$ python train.py --config ../exps/templates/bc.json --dataset ../../tests/assets/test.hdf5 -``` - -Please see the [hyperparameter helper docs](./advanced.html#using-the-hyperparameter-helper-to-launch-runs) to see how to easily generate json configs for launching training runs. - -#### Constructing a config object in code - -Another way to launch a training run is to make a default config (with a line like `config = config_factory(algo_name="bc")`), modify the config in python code, and then call the train function, like in the `examples/train_bc_rnn.py` script. - -```python -import robomimic -import robomimic.utils.torch_utils as TorchUtils -from robomimic.config import config_factory -from robomimic.scripts.train import train - -# make default BC config -config = config_factory(algo_name="bc") - -# set config attributes here that you would like to update -config.experiment.name = "bc_rnn_example" -config.train.data = "/path/to/dataset.hdf5" -config.train.output_dir = "/path/to/desired/output_dir" -config.train.batch_size = 256 -config.train.num_epochs = 500 -config.algo.gmm.enabled = False - -# get torch device -device = TorchUtils.get_torch_device(try_to_use_cuda=True) - -# launch training run -train(config, device=device) -``` - -#### Directly modifying the config class source code (avoid this) - -Technically, a third way to launch a training run is to directly modify the relevant `Config` classes (such as `config/bc_config.py` and `config/base_config.py`) and then run `train.py` but **this is not recommended**, especially if using the codebase with version control (e.g. git). Modifying these files modifies the default settings, and it's easy to forget that these changes were made, or unintentionally commit these changes so that they become the new defaults. For this reason, **we recommend never modifying the config classes directly, unless you are modifying an algorithm and adding new config keys**. - -To learn more about the `Config` class, read the [Configs documentation](../modules/configs.html), or look at the source code. - - -## Viewing Training Results - -This section discusses how to view and interpret the results of training runs. - -### Logs, Models, and Rollout Videos - -Training runs will output results to the directory specified by `config.train.output_dir`, under a folder with the experiment name (specified by `config.experiment.name`). This folder contains a directory named by a timestamp (e.g. `20210708174935`) for every training run with this same name, and within that directory, there should be three folders - `logs`, `models`, and `videos`. 
- -The `logs` directory will contain everything printed to stdout in `log.txt` (only if `config.experiment.logging.terminal_output_to_txt` is set to `True`), and a `tb` folder containing tensorboard logs (only if `config.experiment.logging.log_tb` is set to True). You can visualize the tensorboard results by using a command like the below, and then opening the link printed on the terminal in a web browser. The tensorboard logs have convenient sections for rollout evaluations, quantities logged during training, quantities logged during validation, and timing statistics for different parts of the training process (in minutes). - -```sh -$ tensorboard --logdir /path/to/output/dir --bind_all -``` - -The `models` directory contains saved model checkpoints. These can be used by the `run_trained_agent.py` script (more on this below). The `config.experiment.save` portion of the config controls if and when models are saved during training. - -The `videos` directory contains evaluation rollout videos collected during training, when evaluating trained models in the environment (only if `config.experiment.render_video` is set to `True`). The `config.experiment.rollout` portion of the config controls how often rollouts happen, and how many happen. - -### Evaluating Trained Policies - -Saved policy checkpoints in the `models` directory can be evaluated using the `run_trained_agent.py` script. The below example can be used to evaluate a policy with 50 rollouts of maximum horizon 400 and save the rollouts to a video. The agentview and wrist camera images are used to render video frames. - -```sh -$ python run_trained_agent.py --agent /path/to/model.pth --n_rollouts 50 --horizon 400 --seed 0 --video_path /path/to/output.mp4 --camera_names agentview robot0_eye_in_hand -``` - -The 50 agent rollouts can also be written to a new dataset hdf5. - -```sh -python run_trained_agent.py --agent /path/to/model.pth --n_rollouts 50 --horizon 400 --seed 0 --dataset_path /path/to/output.hdf5 --dataset_obs -``` - -Instead of storing the observations, which can consist of high-dimensional images, they can be excluded by omitting the `--dataset_obs` flag. The observations can be extracted using the `dataset_states_to_obs.hdf5` script (see the Datasets documentation for more information on this). - -```sh -python run_trained_agent.py --agent /path/to/model.pth --n_rollouts 50 --horizon 400 --seed 0 --dataset_path /path/to/output.hdf5 -``` diff --git a/docs/miscellaneous/contributing.md b/docs/miscellaneous/contributing.md index aef40460..b412b3b2 100644 --- a/docs/miscellaneous/contributing.md +++ b/docs/miscellaneous/contributing.md @@ -1,7 +1,5 @@ # Contributing Guidelines -We are so happy to see you reading this page! - Our team wholeheartedly welcomes the community to contribute to robomimic. Contributions from members of the community will help ensure the long-term success of this project. Before you plan to make contributions, here are important resources to get started with: - Read the robomimic [documentation](https://arise-initiative.github.io/robomimic-web/docs/overview.html) and [paper](https://arxiv.org/abs/2108.03298) @@ -41,7 +39,7 @@ We value readability and adhere to the following coding conventions: We also list additional suggested contributing guidelines that we adhered to during development. -- When creating new networks (e.g. 
subclasses of `Module` in `models/base_nets.py`), always sub-modules into a property called `self.nets`, and if there is more than one sub-module, make it a module collection (such as a `torch.nn.ModuleDict`). This is to ensure that the pattern `model.to(device)` works as expected with multiple levels of nested torch modules. As an example of nesting, see the `_create_networks` function in the `VAE` class (`models/vae_nets.py`) and the `MIMO_MLP` class (`models/obs_nets.py`). +- When creating new networks (e.g. subclasses of `Module` in `models/base_nets.py`), always put sub-modules into a property called `self.nets`, and if there is more than one sub-module, make it a module collection (such as a `torch.nn.ModuleDict`). This is to ensure that the pattern `model.to(device)` works as expected with multiple levels of nested torch modules. As an example of nesting, see the `_create_networks` function in the `VAE` class (`models/vae_nets.py`) and the `MIMO_MLP` class (`models/obs_nets.py`). - Do not use default mutable arguments -- they can lead to terrible bugs and unexpected behavior (see [this link](https://florimond.dev/blog/articles/2018/08/python-mutable-defaults-are-the-source-of-all-evil/) for more information). For this reason, in functions that expect optional dictionaries and lists (for example, the `core_kwargs` argument in the `obs_encoder_factory` function, or the `layer_dims` argument in the `MLP` class constructor), we use a default argument of `core_kwargs=None` or an empty tuple (since tuples are immutable) `layer_dims=()`. diff --git a/docs/miscellaneous/references.md b/docs/miscellaneous/references.md index 325c1da9..4b654153 100644 --- a/docs/miscellaneous/references.md +++ b/docs/miscellaneous/references.md @@ -2,15 +2,20 @@ A list of projects and papers that use **robomimic**. If you would like to add your work to this list, please send the paper or project information to Ajay Mandlekar (Note: see tutorial on using these models
+ +See the ["Using Pretrained Models"](../tutorials/using_pretrained_models.html) tutorial for instructions on using these models. + +Warning: use correct robosuite branch!
+ +When using these trained models, please make sure that robosuite is on the `offline_study` branch of robosuite. + ++ +
**robomimic** implements a suite of reusable network modules at different abstraction levels that make creating new policy models easy. diff --git a/docs/modules/overview.md b/docs/modules/overview.md index d49da6aa..2342feff 100644 --- a/docs/modules/overview.md +++ b/docs/modules/overview.md @@ -1,19 +1,24 @@ # Overview -![overview](../images/module_overview.png) ++ +
-The **robomimic** framework consists of several modular pieces that interact to train and evaluate a policy. A [Config](./configs.html) object is used to define all settings for a particular training run, including the hdf5 dataset that will be used to train the agent, and algorithm hyperparameters. The demonstrations in the hdf5 dataset are loaded into a [SequenceDataset](./dataset.html) object, which is used to provide minibatches for the train loop. Training consists of an [Algorithm](./algorithms.html) object that trains a set of [Models](./models.html) (including the Policy) by repeatedly sampling minibatches from the fixed, offline dataset. Every so often, the policy is evaluated in the [Environment](./environments.html) by conducting a set of rollouts. Statistics and other important information during the training process are logged to disk (e.g. tensorboard outputs, model checkpoints, and evaluation rollout videos). We also provide additional utilities in [TensorUtils](./tensor_utils.html) to work with complex observations in the form of nested tensor dictionaries. +The **robomimic** framework consists of several modular components that interact to train and evaluate a policy: +- **Experiment config**: a config object defines all settings for a training run +- **Data**: an hdf5 dataset is loaded into a dataloader, which provides minibatches to the algorithm +- **Training**: an algorithm object trains a set of models (including the policy) +- **Evaluation**: the policy is evaluated in the environment by conducting a set of rollouts +- **Logging**: experiment statistics, model checkpoints, and videos are saved to disk -The directory structure of the repository is as follows. +These modules are encapsulated by the robomimic directory structure: -- `robomimic/algo`: policy learning algorithm implementations (see [Algorithm documentation](./algorithms.html) for more information) -- `robomimic/config`: config classes (see [Config documentation](./configs.html) for more information) -- `robomimic/envs`: wrappers for environments, used during evaluation rollouts (see [Environment documentation](./environments.html) for more information) -- `robomimic/exps/templates`: config templates for each policy learning algorithm (these are auto-generated with the `robomimic/scripts/generate_config_templates.py` script) -- `robomimic/models`: network implementations (see [Models documentation](./models.html) for more information) +- `examples`: examples to better understand modular components in the codebase +- [`robomimic/algo`](./algorithms.html): policy learning algorithm implementations +- [`robomimic/config`](./configs.html): default algorithm configs +- [`robomimic/envs`](./environments.html): wrappers for environments, used during evaluation rollouts +- `robomimic/exps/templates`: config templates for experiments +- [`robomimic/models`](./models.html): network implementations - `robomimic/scripts`: main repository scripts -- `robomimic/utils`: a collection of utilities, including the [SequenceDataset](./dataset.html) class to load hdf5 datasets into a torch training pipeline, and [TensorUtils](./tensor_utils.html) to work with nested tensor dictionaries -- `tests`: test scripts for validating repository functionality -- `examples`: some simple examples to better understand modular components in the codebase (see the [Examples documentation](../introduction/examples.html) for more information) -- `docs`: files to generate sphinx documentation +- `robomimic/utils`: a collection of utilities, 
including the [SequenceDataset](./dataset.html) class to load datasets, and [TensorUtils](../tutorials/tensor_collections.html#tensorutils) to work with nested tensor dictionaries diff --git a/docs/tutorials/configs.md b/docs/tutorials/configs.md new file mode 100644 index 00000000..fa67a28a --- /dev/null +++ b/docs/tutorials/configs.md @@ -0,0 +1,53 @@ +# Configuring and Launching Training Runs + +Robomimic uses a centralized [configuration system](../modules/configs.html) to specify (hyper)parameters at all levels. Below we walk through two ways to configure and launch training runs. + + +#### Best practices
+ +Do not directly modify the default configs such as `config/bc_config.py`, especially if using the codebase with version control (e.g. git). Modifying these files modifies the default settings, and it’s easy to forget that these changes were made, or unintentionally commit these changes so that they become the new defaults. + +Note: HDF5 Dataset Structure.
+ +[This link](../datasets/overview.html#dataset-structure) shows the expected structure of each hdf5 dataset. + +Jupyter Notebook: A Deep Dive into Dataset Structure
+ +Any user wishing to write custom code that works with robomimic datasets should also look at the [jupyter notebook](https://github.com/ARISE-Initiative/robomimic/blob/master/examples/notebooks/datasets.ipynb) at `examples/notebooks/datasets.ipynb`, which showcases several useful python code snippets for working with robomimic hdf5 datasets. + +Note: These examples are compatible with any robomimic dataset.
+ +The examples in this section use the small hdf5 dataset packaged with the repository in `tests/assets/test.hdf5`, but you can run these examples with any robomimic hdf5 dataset. If you are using the default dataset, please make sure that robosuite is on the `offline_study` branch of robosuite -- this is necessary for the playback scripts to function properly. + +Relevant settings in base json file
+ +Sections of the config that are not involved in the scan and that do not differ from the default values in the template can also be omitted, if desired. + ++ ```json { "algo_name": "bc", @@ -94,7 +108,16 @@ $ cat /tmp/gen_configs/base.json } ``` -The next step is to define a function that returns a `ConfigGenerator`. In our example, we would like to run the BC-RNN algorithm with an RNN horizon of 10. This requires setting `config.train.seq_length = 10` and `config.algo.rnn.enabled = True` -- we could have modified our base json file directly (as mentioned above) but we opted to set it in the generator function below. The first three calls to `add_param` do exactly this. Leaving `name=""` ensures that the experiment name is not determined by these parameter values. +
+Empty hyperparameter names
+ +Leaving `name=""` ensures that the experiment name is not determined by these parameter values. +Only do this if you are sweeping over a single value! -The `group` argument specifies which arguments should be modified together. The hyperparameter script will generate a training run for each hyperparameter setting in the cartesian product between all groups. Thus, putting the RNN dimension and MLP layer dims in the same group ensures that the parameters change together (RNN dimension 400 always occurs with MLP layer dims (1024, 1024), and RNN dimension 1000 always occurs with an empty MLP). Finally, notice the use of the `value_names` argument -- by default, the generated config will have an experiment name consisting of the base name under `config.experiment.name` already present in the base json, and then the `name` specified for each parameter, along with the string representation of the selected value in `values`, but `value_names` allows you to override this with a custom string for the corresponding value. +Sweeping hyperparameters together
+ +We set the RNN dimension and MLP layer dims in the same group to ensure that the parameters change together (RNN dimension 400 always occurs with MLP layer dims (1024, 1024), and RNN dimension 1000 always occurs with an empty MLP). + +Note: Understand how to launch training runs and view results first!
+ +Before trying to reproduce published results, it might be useful to read the following tutorials: +- [how to launch training runs](./configs.html) +- [how to view training results](./viewing_results.html) +- [how to launch multiple training runs efficiently](./hyperparam_scan.html) + +Jupyter Notebook: Working with Pretrained Policies
+ +The rest of this tutorial shows how to use utility scripts to load and rollout a trained policy. If you wish to do so via an interactive notebook, please refer to the [jupyter notebook](https://github.com/ARISE-Initiative/robomimic/blob/master/examples/notebooks/run_policy.ipynb) at `examples/notebooks/run_policy.ipynb`. The notebook tutorial shows how to download a checkpoint from the model zoo, load the checkpoint in pytorch, and rollout the policy. + +Loading Trained Checkpoints
+ +Please see the [Using Pretrained Models](./using_pretrained_models.html) tutorial to see how to load the trained model checkpoints in the `models` directory. + ++ +
+ + +Experiment results (y-axis) are logged across epochs (x-axis). +You may find the following logging metrics useful: +- `Rollout/`: evaluation rollout metrics, eg. success rate, rewards, etc. + - `Rollout/Success_Rate/{envname}-max`: maximum success rate over time (this is the metric the [study paper](https://arxiv.org/abs/2108.03298) uses to evaluate baselines) +- `Timing_Stats/`: time spent by the algorithm loading data, training, performing rollouts, etc. +- `Timing_Stats/`: time spent by the algorithm loading data, training, performing rollouts, etc. +- `Train/`: training stats +- `Validation/`: validation stats +- `System/RAM Usage (MB)`: system RAM used by algorithm \ No newline at end of file diff --git a/examples/notebooks/datasets.ipynb b/examples/notebooks/datasets.ipynb new file mode 100644 index 00000000..f930e99d --- /dev/null +++ b/examples/notebooks/datasets.ipynb @@ -0,0 +1,1419 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "d7e6ab09", + "metadata": {}, + "source": [ + "# A deep dive into robomimic datasets\n", + "\n", + "This notebook will provide examples on how to work with robomimic datasets through various python code examples. This notebook assumes that you have installed `robomimic` and `robosuite` (which should be on the `offline_study` branch)." + ] + }, + { + "cell_type": "markdown", + "id": "2a05e543", + "metadata": {}, + "source": [ + "## Download dataset\n", + "\n", + "First, let's try downloading a simple dataset - we'll use the Lift (PH) dataset. Note that there are utility scripts such as `scripts/download_datasets.py` to do this for us, but for the purposes of this example, we'll use the python API." + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "4e2b90e6", + "metadata": {}, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "low_dim.hdf5: 18.6MB [00:00, 27.5MB/s] \n" + ] + } + ], + "source": [ + "import os\n", + "import json\n", + "import h5py\n", + "import numpy as np\n", + "\n", + "import robomimic\n", + "import robomimic.utils.file_utils as FileUtils\n", + "\n", + "# the dataset registry can be found at robomimic/__init__.py\n", + "from robomimic import DATASET_REGISTRY\n", + "\n", + "# set download folder and make it\n", + "download_folder = \"/tmp/robomimic_ds_example\"\n", + "os.makedirs(download_folder, exist_ok=True)\n", + "\n", + "# download the dataset\n", + "task = \"lift\"\n", + "dataset_type = \"ph\"\n", + "hdf5_type = \"low_dim\"\n", + "FileUtils.download_url(\n", + " url=DATASET_REGISTRY[task][dataset_type][hdf5_type][\"url\"], \n", + " download_dir=download_folder,\n", + ")\n", + "\n", + "# enforce that the dataset exists\n", + "dataset_path = os.path.join(download_folder, \"low_dim.hdf5\")\n", + "assert os.path.exists(dataset_path)" + ] + }, + { + "cell_type": "markdown", + "id": "54bdec82", + "metadata": {}, + "source": [ + "## Read quantities from dataset\n", + "\n", + "Next, let's demonstrate how to read different quantities from the dataset. There are scripts such as `scripts/get_dataset_info.py` that can help you easily understand the contents of a dataset, but in this example, we'll break down how to do this directly.\n", + "\n", + "First, let's take a look at the number of demonstrations in the file." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "a35cd8e9", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "hdf5 file /tmp/robomimic_ds_example/low_dim.hdf5 has 200 demonstrations\n" + ] + } + ], + "source": [ + "# open file\n", + "f = h5py.File(dataset_path, \"r\")\n", + "\n", + "# each demonstration is a group under \"data\"\n", + "demos = list(f[\"data\"].keys())\n", + "num_demos = len(demos)\n", + "\n", + "print(\"hdf5 file {} has {} demonstrations\".format(dataset_path, num_demos))" + ] + }, + { + "cell_type": "markdown", + "id": "bdb073a0", + "metadata": {}, + "source": [ + "Next, let's list all of the demonstrations, along with the number of state-action pairs in each demonstration." + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "id": "9bda3e70", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "demo_0 has 59 samples\n", + "demo_1 has 58 samples\n", + "demo_2 has 57 samples\n", + "demo_3 has 55 samples\n", + "demo_4 has 51 samples\n", + "demo_5 has 58 samples\n", + "demo_6 has 49 samples\n", + "demo_7 has 49 samples\n", + "demo_8 has 44 samples\n", + "demo_9 has 51 samples\n", + "demo_10 has 54 samples\n", + "demo_11 has 49 samples\n", + "demo_12 has 53 samples\n", + "demo_13 has 59 samples\n", + "demo_14 has 51 samples\n", + "demo_15 has 50 samples\n", + "demo_16 has 50 samples\n", + "demo_17 has 45 samples\n", + "demo_18 has 49 samples\n", + "demo_19 has 51 samples\n", + "demo_20 has 48 samples\n", + "demo_21 has 57 samples\n", + "demo_22 has 47 samples\n", + "demo_23 has 47 samples\n", + "demo_24 has 52 samples\n", + "demo_25 has 48 samples\n", + "demo_26 has 43 samples\n", + "demo_27 has 46 samples\n", + "demo_28 has 49 samples\n", + "demo_29 has 41 samples\n", + "demo_30 has 46 samples\n", + "demo_31 has 42 samples\n", + "demo_32 has 59 samples\n", + "demo_33 has 54 samples\n", + "demo_34 has 48 samples\n", + "demo_35 has 58 samples\n", + "demo_36 has 45 samples\n", + "demo_37 has 50 samples\n", + "demo_38 has 49 samples\n", + "demo_39 has 41 samples\n", + "demo_40 has 40 samples\n", + "demo_41 has 49 samples\n", + "demo_42 has 53 samples\n", + "demo_43 has 39 samples\n", + "demo_44 has 46 samples\n", + "demo_45 has 49 samples\n", + "demo_46 has 47 samples\n", + "demo_47 has 40 samples\n", + "demo_48 has 53 samples\n", + "demo_49 has 48 samples\n", + "demo_50 has 45 samples\n", + "demo_51 has 47 samples\n", + "demo_52 has 46 samples\n", + "demo_53 has 55 samples\n", + "demo_54 has 43 samples\n", + "demo_55 has 56 samples\n", + "demo_56 has 40 samples\n", + "demo_57 has 38 samples\n", + "demo_58 has 38 samples\n", + "demo_59 has 44 samples\n", + "demo_60 has 42 samples\n", + "demo_61 has 54 samples\n", + "demo_62 has 41 samples\n", + "demo_63 has 42 samples\n", + "demo_64 has 53 samples\n", + "demo_65 has 38 samples\n", + "demo_66 has 41 samples\n", + "demo_67 has 42 samples\n", + "demo_68 has 39 samples\n", + "demo_69 has 42 samples\n", + "demo_70 has 48 samples\n", + "demo_71 has 45 samples\n", + "demo_72 has 38 samples\n", + "demo_73 has 36 samples\n", + "demo_74 has 48 samples\n", + "demo_75 has 36 samples\n", + "demo_76 has 48 samples\n", + "demo_77 has 39 samples\n", + "demo_78 has 44 samples\n", + "demo_79 has 44 samples\n", + "demo_80 has 40 samples\n", + "demo_81 has 38 samples\n", + "demo_82 has 47 samples\n", + "demo_83 has 52 samples\n", + "demo_84 has 53 samples\n", + "demo_85 has 46 samples\n", + "demo_86 has 38 samples\n", + 
"demo_87 has 39 samples\n", + "demo_88 has 39 samples\n", + "demo_89 has 41 samples\n", + "demo_90 has 42 samples\n", + "demo_91 has 37 samples\n", + "demo_92 has 51 samples\n", + "demo_93 has 50 samples\n", + "demo_94 has 51 samples\n", + "demo_95 has 46 samples\n", + "demo_96 has 56 samples\n", + "demo_97 has 53 samples\n", + "demo_98 has 46 samples\n", + "demo_99 has 46 samples\n", + "demo_100 has 47 samples\n", + "demo_101 has 43 samples\n", + "demo_102 has 58 samples\n", + "demo_103 has 52 samples\n", + "demo_104 has 48 samples\n", + "demo_105 has 55 samples\n", + "demo_106 has 49 samples\n", + "demo_107 has 62 samples\n", + "demo_108 has 43 samples\n", + "demo_109 has 50 samples\n", + "demo_110 has 45 samples\n", + "demo_111 has 46 samples\n", + "demo_112 has 44 samples\n", + "demo_113 has 43 samples\n", + "demo_114 has 47 samples\n", + "demo_115 has 49 samples\n", + "demo_116 has 59 samples\n", + "demo_117 has 52 samples\n", + "demo_118 has 54 samples\n", + "demo_119 has 53 samples\n", + "demo_120 has 63 samples\n", + "demo_121 has 53 samples\n", + "demo_122 has 60 samples\n", + "demo_123 has 51 samples\n", + "demo_124 has 47 samples\n", + "demo_125 has 55 samples\n", + "demo_126 has 56 samples\n", + "demo_127 has 58 samples\n", + "demo_128 has 55 samples\n", + "demo_129 has 53 samples\n", + "demo_130 has 50 samples\n", + "demo_131 has 47 samples\n", + "demo_132 has 46 samples\n", + "demo_133 has 43 samples\n", + "demo_134 has 45 samples\n", + "demo_135 has 54 samples\n", + "demo_136 has 53 samples\n", + "demo_137 has 57 samples\n", + "demo_138 has 50 samples\n", + "demo_139 has 48 samples\n", + "demo_140 has 49 samples\n", + "demo_141 has 54 samples\n", + "demo_142 has 55 samples\n", + "demo_143 has 49 samples\n", + "demo_144 has 51 samples\n", + "demo_145 has 45 samples\n", + "demo_146 has 50 samples\n", + "demo_147 has 51 samples\n", + "demo_148 has 50 samples\n", + "demo_149 has 58 samples\n", + "demo_150 has 46 samples\n", + "demo_151 has 46 samples\n", + "demo_152 has 45 samples\n", + "demo_153 has 42 samples\n", + "demo_154 has 49 samples\n", + "demo_155 has 45 samples\n", + "demo_156 has 63 samples\n", + "demo_157 has 41 samples\n", + "demo_158 has 42 samples\n", + "demo_159 has 45 samples\n", + "demo_160 has 43 samples\n", + "demo_161 has 46 samples\n", + "demo_162 has 52 samples\n", + "demo_163 has 55 samples\n", + "demo_164 has 44 samples\n", + "demo_165 has 42 samples\n", + "demo_166 has 51 samples\n", + "demo_167 has 64 samples\n", + "demo_168 has 57 samples\n", + "demo_169 has 52 samples\n", + "demo_170 has 48 samples\n", + "demo_171 has 45 samples\n", + "demo_172 has 53 samples\n", + "demo_173 has 39 samples\n", + "demo_174 has 47 samples\n", + "demo_175 has 52 samples\n", + "demo_176 has 63 samples\n", + "demo_177 has 50 samples\n", + "demo_178 has 47 samples\n", + "demo_179 has 48 samples\n", + "demo_180 has 55 samples\n", + "demo_181 has 52 samples\n", + "demo_182 has 55 samples\n", + "demo_183 has 53 samples\n", + "demo_184 has 44 samples\n", + "demo_185 has 59 samples\n", + "demo_186 has 45 samples\n", + "demo_187 has 43 samples\n", + "demo_188 has 44 samples\n", + "demo_189 has 52 samples\n", + "demo_190 has 51 samples\n", + "demo_191 has 40 samples\n", + "demo_192 has 49 samples\n", + "demo_193 has 42 samples\n", + "demo_194 has 36 samples\n", + "demo_195 has 54 samples\n", + "demo_196 has 42 samples\n", + "demo_197 has 40 samples\n", + "demo_198 has 45 samples\n", + "demo_199 has 49 samples\n" + ] + } + ], + "source": [ + "# each demonstration is named 
\"demo_#\" where # is a number.\n", + "# Let's put the demonstration list in increasing episode order\n", + "inds = np.argsort([int(elem[5:]) for elem in demos])\n", + "demos = [demos[i] for i in inds]\n", + "\n", + "for ep in demos:\n", + " num_actions = f[\"data/{}/actions\".format(ep)].shape[0]\n", + " print(\"{} has {} samples\".format(ep, num_actions))" + ] + }, + { + "cell_type": "markdown", + "id": "ff998d62", + "metadata": {}, + "source": [ + "Now, let's dig into a single trajectory to take a look at some of the quantities in each demonstration." + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "2f7b497a", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "timestep 0\n", + "obs\n", + "{\n", + " \"object\": [\n", + " 0.026449414293141932,\n", + " 0.026981257415918496,\n", + " 0.8314240728038541,\n", + " 0.0,\n", + " 0.0,\n", + " 0.9691094123222487,\n", + " 0.24663119621902302,\n", + " -0.11694359335499373,\n", + " -0.042210722419907185,\n", + " 0.1804238946722606\n", + " ],\n", + " \"robot0_eef_pos\": [\n", + " -0.0904941790618518,\n", + " -0.015229465003988687,\n", + " 1.0118479674761147\n", + " ],\n", + " \"robot0_eef_quat\": [\n", + " 0.9972275858407224,\n", + " -0.007232815062599316,\n", + " 0.07403510413011814,\n", + " 0.0019057232216049126\n", + " ],\n", + " \"robot0_eef_vel_ang\": [\n", + " 0.0,\n", + " 0.0,\n", + " 0.0\n", + " ],\n", + " \"robot0_eef_vel_lin\": [\n", + " 0.0,\n", + " 0.0,\n", + " 0.0\n", + " ],\n", + " \"robot0_gripper_qpos\": [\n", + " 0.020833,\n", + " -0.020833\n", + " ],\n", + " \"robot0_gripper_qvel\": [\n", + " 0.0,\n", + " 0.0\n", + " ],\n", + " \"robot0_joint_pos\": [\n", + " -0.041410389327799474,\n", + " 0.21736868939218443,\n", + " 0.007539738887367773,\n", + " -2.589845402931484,\n", + " -0.007843816214400163,\n", + " 2.9554575771696747,\n", + " 0.7738283119303786\n", + " ],\n", + " \"robot0_joint_pos_cos\": [\n", + " 0.9991427123462239,\n", + " 0.9764683001348743,\n", + " 0.9999715763034073,\n", + " -0.851609955590076,\n", + " 0.9999692374313213,\n", + " -0.9827268240955787,\n", + " 0.7152403924484989\n", + " ],\n", + " \"robot0_joint_pos_sin\": [\n", + " -0.04139855511284203,\n", + " 0.21566098124535443,\n", + " 0.0075396674514822334,\n", + " -0.5241760043533744,\n", + " -0.007843735782256875,\n", + " 0.1850621225508276,\n", + " 0.6988785166322665\n", + " ],\n", + " \"robot0_joint_vel\": [\n", + " 0.0,\n", + " 0.0,\n", + " 0.0,\n", + " 0.0,\n", + " 0.0,\n", + " 0.0,\n", + " 0.0\n", + " ]\n", + "}\n", + "action\n", + "[-0. 0. 0. 0.00381497 0.14820713 0.01447902\n", + " -1. 
]\n", + "timestep 1\n", + "obs\n", + "{\n", + " \"object\": [\n", + " 0.02644963304851104,\n", + " 0.026981489259044183,\n", + " 0.8193230127298963,\n", + " 3.687451776864639e-06,\n", + " 6.636591241497241e-06,\n", + " 0.9691094122922576,\n", + " 0.24663119622001103,\n", + " -0.12042898519933254,\n", + " -0.04002245385574614,\n", + " 0.19099218419447617\n", + " ],\n", + " \"robot0_eef_pos\": [\n", + " -0.0939793521508215,\n", + " -0.013040964596701957,\n", + " 1.0103151969243724\n", + " ],\n", + " \"robot0_eef_quat\": [\n", + " 0.9976541041157041,\n", + " -0.005165799685637573,\n", + " 0.06825034642115067,\n", + " 0.0012219934912254607\n", + " ],\n", + " \"robot0_eef_vel_ang\": [\n", + " 0.030008726302490445,\n", + " 0.32730904658446547,\n", + " 0.12017070883228292\n", + " ],\n", + " \"robot0_eef_vel_lin\": [\n", + " -0.06337027577857833,\n", + " 0.0584591961202545,\n", + " -0.03119876681899503\n", + " ],\n", + " \"robot0_gripper_qpos\": [\n", + " 0.021144677345077283,\n", + " -0.021098803032220514\n", + " ],\n", + " \"robot0_gripper_qvel\": [\n", + " 0.020536516913238687,\n", + " -0.01967148001615872\n", + " ],\n", + " \"robot0_joint_pos\": [\n", + " -0.040347907449666154,\n", + " 0.21541772265344838,\n", + " 0.010760092121867423,\n", + " -2.5941357356309553,\n", + " -0.006995190747474993,\n", + " 2.9461625155338433,\n", + " 0.7730220470953911\n", + " ],\n", + " \"robot0_joint_pos_cos\": [\n", + " 0.9991861336026011,\n", + " 0.9768871889182116,\n", + " 0.9999421107673003,\n", + " -0.8538510003817338,\n", + " 0.9999755337529701,\n", + " -0.9809642324353866,\n", + " 0.7158036410836986\n", + " ],\n", + " \"robot0_joint_pos_sin\": [\n", + " -0.04033696092028956,\n", + " 0.21375551484692587,\n", + " 0.0107598844899072,\n", + " -0.520517501288009,\n", + " -0.006995133698693658,\n", + " 0.19418850296156312,\n", + " 0.6983016163602369\n", + " ],\n", + " \"robot0_joint_vel\": [\n", + " 0.03543580106414391,\n", + " -0.07043374630232116,\n", + " 0.08873795004885236,\n", + " -0.11157492930937571,\n", + " 0.01835795870713538,\n", + " -0.2870936164472272,\n", + " -0.01528323082917978\n", + " ]\n", + "}\n", + "action\n", + "[ 0.204 0.087 -0.072 0.0040731 0.13793246 0.00472844\n", + " -1. 
]\n", + "timestep 2\n", + "obs\n", + "{\n", + " \"object\": [\n", + " 0.02644979048142018,\n", + " 0.026981656585798056,\n", + " 0.8188363799238965,\n", + " 6.124550057574148e-06,\n", + " 1.1039634957752413e-05,\n", + " 0.9691094122391417,\n", + " 0.24663119622246055,\n", + " -0.12215724667074111,\n", + " -0.03828681206822328,\n", + " 0.1896874619017075\n", + " ],\n", + " \"robot0_eef_pos\": [\n", + " -0.09570745618932093,\n", + " -0.011305155482425224,\n", + " 1.008523841825604\n", + " ],\n", + " \"robot0_eef_quat\": [\n", + " 0.9980669555971526,\n", + " -0.0037167922644707474,\n", + " 0.06203175911084908,\n", + " 0.0007736031979145991\n", + " ],\n", + " \"robot0_eef_vel_ang\": [\n", + " 0.008174675780542328,\n", + " 0.2117180550504403,\n", + " 0.025337152199680114\n", + " ],\n", + " \"robot0_eef_vel_lin\": [\n", + " 0.021946528579046963,\n", + " 0.019244360508797305,\n", + " -0.03325545509870823\n", + " ],\n", + " \"robot0_gripper_qpos\": [\n", + " 0.023065368793203082,\n", + " -0.023089813685773657\n", + " ],\n", + " \"robot0_gripper_qvel\": [\n", + " 0.05386538871513393,\n", + " -0.05464789210707713\n", + " ],\n", + " \"robot0_joint_pos\": [\n", + " -0.03862546371287872,\n", + " 0.2176935517862639,\n", + " 0.012540741592900084,\n", + " -2.59203892182414,\n", + " -0.006915289916081674,\n", + " 2.933857605408448,\n", + " 0.7734679128154149\n", + " ],\n", + " \"robot0_joint_pos_cos\": [\n", + " 0.9992541295153923,\n", + " 0.9763981884673584,\n", + " 0.9999213659307244,\n", + " -0.852757695866131,\n", + " 0.9999760894779743,\n", + " -0.978500557298211,\n", + " 0.7154922211915071\n", + " ],\n", + " \"robot0_joint_pos_sin\": [\n", + " -0.03861586003749854,\n", + " 0.21597818768954657,\n", + " 0.012540412881329139,\n", + " -0.5223067222821157,\n", + " -0.0069152347999298655,\n", + " 0.20624417414096938,\n", + " 0.6986206992456232\n", + " ],\n", + " \"robot0_joint_vel\": [\n", + " 0.03587066701618021,\n", + " 0.11427360447919478,\n", + " 0.007098630969981366,\n", + " 0.13903479858802026,\n", + " -0.0045408292982859885,\n", + " -0.2366374332873465,\n", + " 0.021655490569626238\n", + " ]\n", + "}\n", + "action\n", + "[ 0.323 0.131 -0.073 0.00670891 0.12851983 -0.00825769\n", + " -1. 
]\n", + "timestep 3\n", + "obs\n", + "{\n", + " \"object\": [\n", + " 0.026449610437205617,\n", + " 0.026981465550533244,\n", + " 0.8200930180165275,\n", + " 2.989477547714015e-06,\n", + " 5.389691506845857e-06,\n", + " 0.9691094122977165,\n", + " 0.24663119623840957,\n", + " -0.12084811993760973,\n", + " -0.03702038558498077,\n", + " 0.1864341070243961\n", + " ],\n", + " \"robot0_eef_pos\": [\n", + " -0.09439850950040411,\n", + " -0.010038920034447524,\n", + " 1.0065271250409236\n", + " ],\n", + " \"robot0_eef_quat\": [\n", + " 0.9983755425976737,\n", + " -0.003414240303874383,\n", + " 0.056872138948892356,\n", + " 0.00042274972047360443\n", + " ],\n", + " \"robot0_eef_vel_ang\": [\n", + " 0.015588272761144207,\n", + " 0.20338910134567467,\n", + " 0.001415554639112783\n", + " ],\n", + " \"robot0_eef_vel_lin\": [\n", + " 0.06160381269345277,\n", + " 0.025010135696567033,\n", + " -0.03656737866839496\n", + " ],\n", + " \"robot0_gripper_qpos\": [\n", + " 0.026358274928870957,\n", + " -0.02636141449910154\n", + " ],\n", + " \"robot0_gripper_qvel\": [\n", + " 0.0745201310630871,\n", + " -0.07411367614184626\n", + " ],\n", + " \"robot0_joint_pos\": [\n", + " -0.03650447151890695,\n", + " 0.22634892583028604,\n", + " 0.013167323320805235,\n", + " -2.581147494051686,\n", + " -0.006772139059673779,\n", + " 2.9212721348692297,\n", + " 0.7754134156517873\n", + " ],\n", + " \"robot0_joint_pos_cos\": [\n", + " 0.999333785766275,\n", + " 0.9744922663574513,\n", + " 0.9999133120507784,\n", + " -0.847018564469704,\n", + " 0.999977069153916,\n", + " -0.9758274525241541,\n", + " 0.7141316994350739\n", + " ],\n", + " \"robot0_joint_pos_sin\": [\n", + " -0.03649636455929419,\n", + " 0.22442108370097089,\n", + " 0.0131669428358545,\n", + " -0.5315633089705137,\n", + " -0.006772087295968501,\n", + " 0.21854240526776433,\n", + " 0.7000113683805237\n", + " ],\n", + " \"robot0_joint_vel\": [\n", + " 0.04691353138789363,\n", + " 0.1994635772259614,\n", + " 0.01143329494353153,\n", + " 0.2674340987745522,\n", + " 0.004438708485487719,\n", + " -0.27163705349779815,\n", + " 0.052193205025039595\n", + " ]\n", + "}\n", + "action\n", + "[ 0.491 0.21 -0.19 0.00824866 0.12259889 -0.02527941\n", + " -1. 
]\n", + "timestep 4\n", + "obs\n", + "{\n", + " \"object\": [\n", + " 0.026449513581120028,\n", + " 0.026981362757104405,\n", + " 0.8206866047770742,\n", + " 1.3457488015772851e-06,\n", + " 2.4265647886561854e-06,\n", + " 0.969109412312331,\n", + " 0.24663119624238392,\n", + " -0.1173250875824681,\n", + " -0.03522245982820696,\n", + " 0.1830334739182844\n", + " ],\n", + " \"robot0_eef_pos\": [\n", + " -0.09087557400134808,\n", + " -0.008241097071102555,\n", + " 1.0037200786953586\n", + " ],\n", + " \"robot0_eef_quat\": [\n", + " 0.9986502568703245,\n", + " -0.0038131638489504183,\n", + " 0.05179871492175989,\n", + " -0.00013178296654296415\n", + " ],\n", + " \"robot0_eef_vel_ang\": [\n", + " 0.02512282243933508,\n", + " 0.20309719193885506,\n", + " -0.02906478800666458\n", + " ],\n", + " \"robot0_eef_vel_lin\": [\n", + " 0.10622683467941454,\n", + " 0.03716610735305083,\n", + " -0.06009354586057167\n", + " ],\n", + " \"robot0_gripper_qpos\": [\n", + " 0.030413930567981536,\n", + " -0.030375387077326593\n", + " ],\n", + " \"robot0_gripper_qvel\": [\n", + " 0.08575894495332877,\n", + " -0.0853433936999638\n", + " ],\n", + " \"robot0_joint_pos\": [\n", + " -0.033742898584622906,\n", + " 0.24158202550110155,\n", + " 0.014307614740759025,\n", + " -2.562765138816014,\n", + " -0.006403913367086805,\n", + " 2.907945086256439,\n", + " 0.7796524071820293\n", + " ],\n", + " \"robot0_joint_pos_cos\": [\n", + " 0.999430762410992,\n", + " 0.9709607078578428,\n", + " 0.9998976478262572,\n", + " -0.8371046247858968,\n", + " 0.9999794950168696,\n", + " -0.9728283562961821,\n", + " 0.7111579499359364\n", + " ],\n", + " \"robot0_joint_pos_sin\": [\n", + " -0.03373649576620538,\n", + " 0.23923900977097504,\n", + " 0.014307126598938211,\n", + " -0.5470428202271399,\n", + " -0.006403869596315117,\n", + " 0.2315275128058616,\n", + " 0.703032268279996\n", + " ],\n", + " \"robot0_joint_vel\": [\n", + " 0.06079543928279841,\n", + " 0.3522398354114109,\n", + " 0.02364828609743366,\n", + " 0.4241619255231188,\n", + " 0.0024945495675376796,\n", + " -0.2753739143424271,\n", + " 0.10996504673995737\n", + " ]\n", + "}\n", + "action\n", + "[ 0.465 0.309 -0.304 0.00568831 0.10961337 -0.04577875\n", + " -1. ]\n" + ] + } + ], + "source": [ + "# look at first demonstration\n", + "demo_key = demos[0]\n", + "demo_grp = f[\"data/{}\".format(demo_key)]\n", + "\n", + "# Each observation is a dictionary that maps modalities to numpy arrays, and\n", + "# each action is a numpy array. 
Let's print the observations and actions for the \n", + "# first 5 timesteps of this trajectory.\n", + "for t in range(5):\n", + " print(\"timestep {}\".format(t))\n", + " obs_t = dict()\n", + " # each observation modality is stored as a subgroup\n", + " for k in demo_grp[\"obs\"]:\n", + " obs_t[k] = demo_grp[\"obs/{}\".format(k)][t] # numpy array\n", + " act_t = demo_grp[\"actions\"][t]\n", + " \n", + " # pretty-print observation and action using json\n", + " obs_t_pp = { k : obs_t[k].tolist() for k in obs_t }\n", + " print(\"obs\")\n", + " print(json.dumps(obs_t_pp, indent=4))\n", + " print(\"action\")\n", + " print(act_t)" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "id": "552be387", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "shape of first ten actions (10, 7)\n", + "shape of all actions (59, 7)\n" + ] + } + ], + "source": [ + "# we can also grab multiple timesteps at once directly, or even the full trajectory at once\n", + "first_ten_actions = demo_grp[\"actions\"][:10]\n", + "print(\"shape of first ten actions {}\".format(first_ten_actions.shape))\n", + "all_actions = demo_grp[\"actions\"][:]\n", + "print(\"shape of all actions {}\".format(all_actions.shape))" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "id": "57976238", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "success\n" + ] + } + ], + "source": [ + "# the trajectory also contains the next observations under \"next_obs\", \n", + "# for convenient use in a batch (offline) RL pipeline. Let's verify\n", + "# that \"next_obs\" and \"obs\" are offset by 1.\n", + "for k in demo_grp[\"obs\"]:\n", + " # obs_{t+1} == next_obs_{t}\n", + " assert(np.allclose(demo_grp[\"obs\"][k][1:], demo_grp[\"next_obs\"][k][:-1]))\n", + "print(\"success\")" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "id": "51ab4a38", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "dones\n", + "[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n", + " 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1]\n", + "\n", + "rewards\n", + "[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.\n", + " 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.\n", + " 0. 0. 0. 0. 0. 0. 1. 1. 1. 1. 1.]\n" + ] + } + ], + "source": [ + "# we also have \"done\" and \"reward\" information stored in each trajectory.\n", + "# In this case, we have sparse rewards that indicate task completion at\n", + "# that timestep.\n", + "dones = demo_grp[\"dones\"][:]\n", + "rewards = demo_grp[\"rewards\"][:]\n", + "print(\"dones\")\n", + "print(dones)\n", + "print(\"\")\n", + "print(\"rewards\")\n", + "print(rewards)" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "id": "360df27c", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "