This is the code for the pose estimation module of CRAVES. If you want to test on the OWI-535 hardware, please refer to the control module here.
The craves.ai project controls a toy robotic arm (OWI-535) with a single RGB camera. Please see the system pipeline and how it works in docs/README.md before trying the code. The following animation shows the arm, controlled via a mounted camera, reaching a goal without relying on any other sensors.
Here are some visualization results from the YouTube dataset:
`./data_generation/load_data_and_vis.py` contains examples of how to visualize the images and their annotations.
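For a rough idea of what such a visualization involves, here is a minimal sketch; the annotation layout (one JSON file per image with a `keypoints` list of `[x, y]` pairs) and the file paths are assumptions, so refer to `load_data_and_vis.py` for the actual format:

```python
# Minimal visualization sketch. The file paths and the "keypoints" field name are
# assumptions for illustration; see ./data_generation/load_data_and_vis.py for the
# real annotation format.
import json
import imageio
import matplotlib.pyplot as plt

img = imageio.imread("imgs/00000.png")         # hypothetical image path
with open("annotations/00000.json") as f:      # hypothetical annotation path
    ann = json.load(f)

xs, ys = zip(*ann["keypoints"])                # assumed list of [x, y] pairs
plt.imshow(img)
plt.scatter(xs, ys, c="r", s=20)
plt.show()
```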
We created three datasets for this project, namely `synthetic`, `lab`, and `youtube`.
Download the datasets from here.
For the usage of these datasets, please refer to here.
- Download the checkpoint of the pretrained model here and put it into a folder, e.g. `./checkpoint/checkpoint.pth.tar` (a quick way to sanity-check it is sketched after this list).
- Create a folder for saving results, e.g. `./saved_results`.
- Open `./scripts/val_arm_reall.sh`. Make sure `--data-dir`, `--resume`, and `--save-result-dir` point to the folders holding the dataset, the pretrained model, and the saved results, respectively. For example: `--data-dir ../data/test_20181024 --resume ../checkpoint/checkpoint.pth.tar --save-result-dir ../saved_results`
- `cd ./scripts`, then run `sh val_arm_reall.sh` to see the accuracy on the real lab dataset.
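Before running the script, you can optionally sanity-check the downloaded checkpoint from Python. A minimal sketch, assuming the usual pytorch-pose checkpoint layout with `epoch` and `state_dict` entries (which matches the log below):

```python
# Quick sanity check of the downloaded checkpoint. The 'epoch'/'state_dict' keys
# follow the usual pytorch-pose convention and are an assumption here.
import torch

ckpt = torch.load("./checkpoint/checkpoint.pth.tar", map_location="cpu")
print(ckpt.keys())                  # e.g. dict_keys(['epoch', 'state_dict', ...])
print("epoch:", ckpt.get("epoch"))  # the released model reports epoch 30
```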
The output you should expect to see:
```
sh val_arm_reall.sh
=> creating model 'hg', stacks=2, blocks=1
=> loading checkpoint '../checkpoint/checkpoint.pth.tar'
=> loaded checkpoint '../checkpoint/checkpoint.pth.tar' (epoch 30)
Total params: 6.73M
No. images of dataset 1 : 428
merging 1 datasets, total No. images: 428
No. minibatches in validation set:72
Evaluation only
Processing |################################| (72/72) Data: 0.000000s | Batch: 0.958s | Total: 0:01:08 | ETA: 0:00:01 | Loss: 0.0009 | Acc: 0.9946
```
As you can see, the overall accuracy on the lab dataset is 99.46% under the PCK@0.2 metric.
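For context, PCK@0.2 counts a predicted keypoint as correct when its distance to the ground-truth keypoint is within 0.2 of a normalization length. Below is a minimal sketch of that computation, using the ground-truth bounding-box diagonal as the normalization length (an assumption; the evaluation code in this repo defines its own normalization):

```python
# PCK@0.2 sketch: a keypoint is "correct" if ||pred - gt|| <= 0.2 * norm_length.
# Using the ground-truth bounding-box diagonal as norm_length is an assumption.
import numpy as np

def pck(pred, gt, threshold=0.2):
    """pred, gt: (num_keypoints, 2) arrays of 2D coordinates."""
    norm_length = np.linalg.norm(gt.max(axis=0) - gt.min(axis=0))  # bbox diagonal
    dists = np.linalg.norm(pred - gt, axis=1)
    return (dists <= threshold * norm_length).mean()
```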
Other shell scripts you may want to try:
- `train_arm.sh` and `train_arm_concat.sh`: train a model from scratch with the synthetic dataset only and with multiple datasets, respectively.
- `val_arm_syn.sh`: evaluate the model on the synthetic dataset.
- `val_arm_reall_with_3D`: evaluate the model on the real lab dataset, giving both 2D and 3D output.
- `val_arm_youtube.sh` and `val_arm_youtube_vis_only.sh`: evaluate the model on the YouTube dataset, with all keypoints and with only visible keypoints, respectively.
Dependencies: PyTorch 0.4.1 or higher, OpenCV.
The 2D pose estimation module is built on top of pytorch-pose.
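The log above (`'hg', stacks=2, blocks=1`) corresponds to a two-stack hourglass network. Below is a rough inference sketch on top of pytorch-pose; the exact factory arguments, package layout, and keypoint count are assumptions, so treat it as an illustration rather than the repo's own evaluation path:

```python
# Rough inference sketch. The hg(...) keyword arguments, the package layout, and
# the number of keypoints (17) are assumptions; check the training scripts in this
# repo for the exact setup.
import torch
from pose import models  # assumed pytorch-pose package layout

model = models.hg(num_stacks=2, num_blocks=1, num_classes=17)

ckpt = torch.load("./checkpoint/checkpoint.pth.tar", map_location="cpu")
# checkpoints saved from a DataParallel wrapper prefix keys with "module."
state = {k[len("module."):] if k.startswith("module.") else k: v
         for k, v in ckpt["state_dict"].items()}
model.load_state_dict(state)
model.eval()

img = torch.rand(1, 3, 256, 256)      # stands in for a normalized input crop
with torch.no_grad():
    heatmaps = model(img)[-1]         # last stack's heatmaps, one per keypoint
# decode each heatmap's argmax into (x, y) coordinates on the heatmap grid
coords = [(int(h.argmax()) % h.shape[-1], int(h.argmax()) // h.shape[-1])
          for h in heatmaps[0]]
```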
Download the binary for Windows or Linux (tested on Ubuntu 16.04). Unzip it and run `./LinuxNoEditor/ArmUE4.sh`.
Run the following commands to generate images and ground truth:

```
pip install unrealcv imageio
cd ./data_generation
python demo_capture.py
```
Generated data are saved in `./data/new_data` by default. You can visualize the ground truth with the script `./data_generation/load_data_and_vis.py`.
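Under the hood, `demo_capture.py` talks to the simulator through the UnrealCV client. A stripped-down capture sketch (the file name and camera id are illustrative only):

```python
# Stripped-down UnrealCV capture sketch; demo_capture.py in this repo does more,
# e.g. it also records the arm pose and keypoint ground truth.
import imageio
from unrealcv import client

client.connect()                # the ArmUE4 binary must already be running
assert client.isconnected()

# capture one lit (RGB) frame from camera 0; UnrealCV returns the saved file path
png_path = client.request('vget /camera/0/lit demo_frame.png')
img = imageio.imread(png_path)
print('captured frame of shape', img.shape)

client.disconnect()
```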
The control module of CRAVES is hosted in another repo, https://github.com/zfw1226/craves_control.
Please see that repo for the hardware drivers, the pose estimator, a PID-like controller, and an RL-based controller.
If you found CRAVES useful, please consider citing:
```
@article{zuo2019craves,
  title={CRAVES: Controlling Robotic Arm with a Vision-based, Economic System},
  author={Zuo, Yiming and Qiu, Weichao and Xie, Lingxi and Zhong, Fangwei and Wang, Yizhou and Yuille, Alan L},
  journal={CVPR},
  year={2019}
}
```
If you have any questions or suggestions, please open an issue in this repo. Thanks.
Disclaimer: the authors are a group of scientists working on computer vision research. They are not associated with the company that manufactures this arm. If you have better hardware to recommend, or want to apply this technique to your own arm, please contact us.