Open-source implementations on real robots. This repository is currently under active development.
This repository also contains the real-world implementations of the following papers:
SPA: 3D Spatial-Awareness Enables Effective Embodied Representation
Haoyi Zhu, Honghui Yang, Yating Wang, Jiange Yang, Limin Wang, Tong He
arXiv preprint, 2024
[ Project Page ] | [ arXiv ] | [ Github ]
Point Cloud Matters: Rethinking the Impact of Different Observation Spaces on Robot Learning
Haoyi Zhu, Yating Wang, Di Huang, Weicai Ye, Wanli Ouyang, Tong He
Neural Information Processing Systems (NeurIPS) Dataset & Benchmark, 2024
[ Project Page ] | [ arXiv ] | [ GitHub ] | [ Videos ]
The videos are all done automatically by learned policies.
Pick Cube | Stack Cube | Fold Cloth |
conda create -n realrobot python=3.11 -y
pip install -r requirements.txt
Point cloud related
# please install with your PyTorch and CUDA version
# e.g. torch 2.4.0 + cuda 118:
pip install torch-geometric torch-scatter torch-sparse torch-cluster -f
spconv must match your CUDA version; see the official GitHub for more information.
# e.g. for CUDA 11.8:
pip3 install spconv-cu118
# build FPS sampling operations (CUDA required)
cd libs/pointops
# docker & multi GPU arch
# e.g. 7.5: RTX 3000; 8.0: A100. More available in:
TORCH_CUDA_ARCH_LIST="7.5 8.0" python install
cd ../..
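If you are unsure which values to put in TORCH_CUDA_ARCH_LIST, the sketch below may help. It formats a list of (major, minor) compute capabilities into the space-separated string used above. The helper names `format_arch_list` and `detect_capabilities` are hypothetical and not part of this repository; `detect_capabilities` assumes PyTorch with CUDA support is already installed.

```python
def format_arch_list(capabilities):
    """Turn (major, minor) compute capabilities into a TORCH_CUDA_ARCH_LIST string."""
    unique = sorted(set(capabilities))  # drop duplicates from identical GPUs
    return " ".join(f"{major}.{minor}" for major, minor in unique)


def detect_capabilities():
    """Query the compute capability of every visible GPU (requires PyTorch with CUDA)."""
    import torch  # imported lazily so format_arch_list works without PyTorch

    return [
        torch.cuda.get_device_capability(index)
        for index in range(torch.cuda.device_count())
    ]
```

On a machine with GPUs, `print(format_arch_list(detect_capabilities()))` prints a value you can paste into the build command.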
Install librealsense
Repeat here:
sudo apt-get update && sudo apt-get upgrade && sudo apt-get dist-upgrade
sudo apt-get install libssl-dev libusb-1.0-0-dev libudev-dev pkg-config libgtk-3-dev
sudo apt-get install git wget cmake build-essential
sudo apt-get install libglfw3-dev libgl1-mesa-dev libglu1-mesa-dev at
cd /some/path
git clone
cd IntelRealSense
./scripts/
./scripts/
mkdir build && cd build
cmake ../ -DCMAKE_BUILD_TYPE=Release
sudo make uninstall && make clean && make && sudo make install
After installing, you can connect cameras with USB and you can run
to get the serial numbers as well as to visualize results.
Install pyrealsense2
pip install pyrealsense2
and get the serial numbers of your cameras. Modify the variables here. We use two cameras by default, but it is easy to adapt to a different number of cameras.
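As an alternative way to read the serial numbers, here is a minimal sketch using the pyrealsense2 Python bindings. The function names `collect_serials` and `list_connected_cameras` are hypothetical and not part of this repository; the second one needs pyrealsense2 installed and cameras plugged in.

```python
def collect_serials(devices, name_key, serial_key):
    """Return (name, serial) pairs for objects exposing get_info(key)."""
    return [(dev.get_info(name_key), dev.get_info(serial_key)) for dev in devices]


def list_connected_cameras():
    """Query pyrealsense2 for all connected devices (needs the library and hardware)."""
    import pyrealsense2 as rs  # imported lazily; not needed for collect_serials

    return collect_serials(
        rs.context().query_devices(),
        rs.camera_info.name,
        rs.camera_info.serial_number,
    )
```

Calling `list_connected_cameras()` and printing each pair gives the values to copy into the camera configuration.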
Please follow the original repo.
Camera Calibration
To calibrate your camera, first print the ChArUco board at asserts/calib.io_charuco_297x210_5x7_35_26_DICT_4X4.pdf and place it on your table. Then run:
python scripts/
The calibration file will be saved under data/calib.
- For single arm setting, you can run with:
python scripts/ task_name=${task_name} user_name=${use_name} episode_id=${episode_id} teleop=low_cost_robot_single_arm teleop.leader.device_name=${device_name} teleop.follower.device_name=${device_name}
- Similarly, for dual arm setting, you can run with:
python scripts/ task_name=${task_name} user_name=${use_name} episode_id=${episode_id} teleop=low_cost_robot teleop.op_left.leader.device_name=${device_name} teleop.op_left.follower.device_name=${device_name} teleop.op_right.leader.device_name=${device_name} teleop.op_right.follower.device_name=${device_name}
To end an episode, press q.
The trajectories will then be saved under data/teleop/${task_name}/${use_name}/ep_${episode_id}
in the following format:
├── teleop
│   ├── ...
│   ├── task_name
│   │   ├── use_name
│   │   │   ├── ep_id
│   │   │   │   ├── meta.json          # meta data
│   │   │   │   ├── ${timestamp1}.npy  # recorded data
│   │   │   │   ├── ${timestamp2}.npy
│   │   │   │   ├── ...
│   └── ...
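To inspect a recorded episode, something like the following sketch could be used. It reads meta.json and lists the .npy files in timestamp order. The function name `list_episode` is hypothetical, and the assumption that file stems are numeric timestamps comes only from the layout above; the contents of each file are not specified here.

```python
import json
from pathlib import Path


def _timestamp_key(path):
    """Sort numerically when the file stem is a number, else by name."""
    try:
        return (0, float(path.stem), path.name)
    except ValueError:
        return (1, 0.0, path.name)


def list_episode(episode_dir):
    """Return (meta, frame_paths) for one ep_${episode_id} directory."""
    episode_dir = Path(episode_dir)
    meta = json.loads((episode_dir / "meta.json").read_text())
    frames = sorted(episode_dir.glob("*.npy"), key=_timestamp_key)
    return meta, frames
```

Each returned path can then be opened with `numpy.load`; whether `allow_pickle=True` is needed depends on how the collector saved the data.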
Training Examples
ACT training with RGB images. By default we use 4 GPUs with DDP:
python src/ task_name=reach_cube trainer.devices=4 exp_low_cost_robot=base_rgb
RGB-D reach:
python src/ task_name=reach_cube exp_low_cost_robot=base_rgbd
Point cloud reach:
python src/ task_name=reach_cube exp_low_cost_robot=base_pcd data.train.calib_file=data/calib/reach.npy
Pick, train for more epochs:
# we use loop to control epoch length, since validating too often can be slow
# you can also use trainer.max_epochs to control the number of epochs
python src/ exp_low_cost_robot=base_rgb data.train.loop=500 exp_name=base_rgb_loop500 task_name=pick_cube
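How data.train.loop is implemented in this repository is not shown here. One common way to get this behavior is to repeat the dataset a fixed number of times per epoch, so that validation runs once per `loop` passes over the data. The class below is a minimal sketch of that idea; the name `LoopedDataset` is hypothetical.

```python
class LoopedDataset:
    """Repeat an indexable dataset `loop` times within one epoch."""

    def __init__(self, samples, loop=1):
        if loop < 1:
            raise ValueError("loop must be >= 1")
        self.samples = samples
        self.loop = loop

    def __len__(self):
        # One "epoch" now covers the underlying data `loop` times.
        return len(self.samples) * self.loop

    def __getitem__(self, index):
        if not 0 <= index < len(self):
            raise IndexError(index)
        # Wrap around so every pass revisits the same underlying samples.
        return self.samples[index % len(self.samples)]
```

With 3 samples and `loop=500`, one epoch yields 1500 items, so per-epoch validation happens 500 times less often than it would without looping.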
Evaluation Examples
Evaluate reach cube with RGB:
python src/ exp_low_cost_robot=base_rgb task_name=reach_cube max_timesteps=150 ckpt_path=${ckpt_path} num_rollouts=${num_rollouts}
Evaluate reach cube with point cloud:
python src/ task_name=reach_cube max_timesteps=150 exp_low_cost_robot=base_pcd data.train.calib_file=data/calib/reach.npy ckpt_path=${ckpt_path} num_rollouts=${num_rollouts}
The script evaluates the given checkpoint over ${num_rollouts} rollouts. You can stop an episode early by pressing q.
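The interaction between num_rollouts, max_timesteps and the early-stop key can be pictured with the sketch below. It is not the repository's evaluation script: `run_rollouts`, `step_fn` and `stop_requested` are hypothetical names, and the real script may structure its loop differently.

```python
def run_rollouts(step_fn, num_rollouts, max_timesteps, stop_requested=lambda: False):
    """Run `num_rollouts` episodes of at most `max_timesteps` steps each.

    step_fn(rollout, t) executes one control step; stop_requested() returns
    True when the operator asks to end the current episode (e.g. a key press).
    Returns the number of steps executed in each episode.
    """
    lengths = []
    for rollout in range(num_rollouts):
        steps = 0
        for t in range(max_timesteps):
            if stop_requested():
                break  # end this episode early, then continue with the next one
            step_fn(rollout, t)
            steps += 1
        lengths.append(steps)
    return lengths
```

Stopping early ends only the current episode; the remaining rollouts still run.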
- Add RealSense RGB-D cameras
- Add data collector
- Add low-cost-robot
  - Teleoperation
  - Inference
- Add More Robots (Coming soon, stay tuned!)
- Add policies
  - Diffusion Policy
Override any config parameter from command line
This codebase is based on Hydra, which allows for convenient configuration overriding:
python src/ trainer.max_epochs=20 seed=300
Note: You can also add new parameters with
python src/ +some_new_param=some_new_value
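The override rules can be illustrated with a small pure-Python sketch. This is not Hydra's implementation and it ignores most of Hydra's grammar; it only mirrors the two cases above: `key=value` changes an existing entry, and `+key=value` adds a new one. The names `parse_value` and `apply_overrides` are hypothetical.

```python
import copy


def parse_value(text):
    """Best-effort scalar parsing: int, float, bool, then plain string."""
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    if text.lower() in ("true", "false"):
        return text.lower() == "true"
    return text


def apply_overrides(config, overrides):
    """Return a copy of `config` with `key.path=value` overrides applied."""
    result = copy.deepcopy(config)
    for item in overrides:
        key, _, raw = item.partition("=")
        adding = key.startswith("+")
        *parents, leaf = key.lstrip("+").split(".")
        node = result
        for part in parents:
            if part not in node:
                if not adding:
                    raise KeyError(f"{key}: unknown key")
                node[part] = {}
            node = node[part]
        if adding and leaf in node:
            raise KeyError(f"{key}: key already exists, drop the '+'")
        if not adding and leaf not in node:
            raise KeyError(f"{key}: new key, prefix it with '+'")
        node[leaf] = parse_value(raw)
    return result
```

For example, applying `["trainer.max_epochs=20", "seed=300"]` to a config that already defines both keys updates them in place, while `some_new_param=1` without the `+` raises an error.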
Train on CPU, GPU, multi-GPU and TPU
# train on CPU
python src/ trainer=cpu
# train on 1 GPU
python src/ trainer=gpu
# train on TPU
python src/ +trainer.tpu_cores=8
# train with DDP (Distributed Data Parallel) (4 GPUs)
python src/ trainer=ddp trainer.devices=4
# train with DDP (Distributed Data Parallel) (8 GPUs, 2 nodes)
python src/ trainer=ddp trainer.devices=4 trainer.num_nodes=2
# simulate DDP on CPU processes
python src/ trainer=ddp_sim trainer.devices=2
# accelerate training on mac
python src/ trainer=mps
Train with mixed precision
# train with pytorch native automatic mixed precision (AMP)
python src/ trainer=gpu +trainer.precision=16
Use different tricks available in Pytorch Lightning
# gradient clipping may be enabled to avoid exploding gradients
python src/ trainer.gradient_clip_val=0.5
# run validation loop 4 times during a training epoch
python src/ +trainer.val_check_interval=0.25
# accumulate gradients
python src/ trainer.accumulate_grad_batches=10
# terminate training after 12 hours
python src/ +trainer.max_time="00:12:00:00"
Note: PyTorch Lightning provides 40+ useful trainer flags.
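Two of the flags above involve a little arithmetic that is easy to get wrong. Under DDP each process loads its own batch, so the number of samples per optimizer step grows with the device count, the node count and accumulate_grad_batches. The max_time string is read as DD:HH:MM:SS. The helpers below are a sketch with hypothetical names, and the per-device batch size of 16 in the usage note is only an example value.

```python
from datetime import timedelta


def effective_batch_size(per_device_batch, devices, num_nodes=1, accumulate_grad_batches=1):
    """Samples contributing to one optimizer step under DDP."""
    return per_device_batch * devices * num_nodes * accumulate_grad_batches


def parse_max_time(text):
    """Convert a DD:HH:MM:SS string into a timedelta."""
    days, hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)
```

With a per-device batch of 16, `trainer.devices=4`, `trainer.num_nodes=2` and `trainer.accumulate_grad_batches=10`, each optimizer step sees 16 × 4 × 2 × 10 = 1280 samples, and `parse_max_time("00:12:00:00")` is 12 hours.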
Easily debug
# runs 1 epoch in default debugging mode
# changes logging directory to `logs/debugs/...`
# sets level of all command line loggers to 'DEBUG'
# enforces debug-friendly configuration
python src/ debug=default
# run 1 train, val and test loop, using only 1 batch
python src/ debug=fdr
# print execution time profiling
python src/ debug=profiler
# try overfitting to 1 batch
python src/ debug=overfit
# raise exception if there are any numerical anomalies in tensors, like NaN or +/-inf
python src/ +trainer.detect_anomaly=true
# use only 20% of the data
python src/ +trainer.limit_train_batches=0.2 \
+trainer.limit_val_batches=0.2 +trainer.limit_test_batches=0.2
Note: Visit configs/debug/ for different debugging configs.
Resume training from checkpoint
python src/ ckpt_path="/path/to/ckpt/name.ckpt"
Note: Checkpoint can be either path or URL.
Note: Currently, loading a checkpoint doesn't resume the logger experiment, but this will be supported in a future Lightning release.
Create a sweep over hyperparameters
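# this will run 9 experiments one after the other,
# each with a different combination of seed and learning rate
# (the key model.optimizer.lr and the value 0.0001 are assumed here; check your config)
python src/ -m seed=100,200,300 model.optimizer.lr=0.0001,0.00005,0.00001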
Note: Hydra composes configs lazily at job launch time. If you change code or configs after launching a job/sweep, the final composed configs might be impacted.
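A multirun with comma-separated values launches one job per combination, that is, the Cartesian product of the value lists. The sketch below enumerates those combinations so you can check the job count before launching. `expand_sweep` is a hypothetical helper, and the learning-rate values in the usage note are illustrative.

```python
from itertools import product


def expand_sweep(**choices):
    """Enumerate every combination of the given per-key value lists."""
    keys = list(choices)
    return [
        dict(zip(keys, values))
        for values in product(*(choices[key] for key in keys))
    ]
```

For instance, `expand_sweep(seed=[100, 200, 300], lr=[1e-4, 5e-5, 1e-5])` returns 9 dictionaries, one per job; adding a third swept key with 2 values would double that to 18.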
Execute all experiments from folder
python src/ -m 'exp_maniskill2_act_policy/maniskill2_task@maniskill2_task=glob(*)'
Note: Hydra provides special syntax for controlling behavior of multiruns. Learn more here. The command above executes all task experiments from configs/exp_maniskill2_act_policy/maniskill2_task.
Execute run for multiple different seeds
python src/ -m seed=100,200,300 trainer.deterministic=True
trainer.deterministic=True makes PyTorch more deterministic but reduces performance.
For more instructions, refer to the official documentation for PyTorch Lightning, Hydra, and Lightning Hydra Template.
This repository is released under the MIT license.
Our code is primarily built upon PyTorch Lightning, Hydra, Lightning Hydra Template, Point Cloud Matters, EasyRobot, RH20T, AirExo, ACT, Shaka-Lab's implementation, and Diffusion Policy. We extend our gratitude to all these authors for their generously open-sourced code and their significant contributions to the community.
title={RealRobot: A Project for Open-Sourced Robot Learning Research},
author={RealRobot Contributors},
howpublished = {\url{}},
title={Point Cloud Matters: Rethinking the Impact of Different Observation Spaces on Robot Learning},
author={Zhu, Haoyi and Wang, Yating and Huang, Di and Ye, Weicai and Ouyang, Wanli and He, Tong},
journal={arXiv preprint arXiv:2402.02500},