[Official] Joint-Human-Pose-Estimation-and-Stereo-Localization

This is Wenlong's master's thesis at VITA Lab, EPFL. In this work, we are interested in perceiving humans, a fundamental and critical category for any autonomous vehicle operating alongside pedestrians (from social robots to self-driving cars). Note that our definition of human covers pedestrians and any other category involving humans in the publicly available KITTI dataset, such as person sitting.

Install

Python 3 is required. Clone this repository, then run:

pip install numpy cython imgaug pycocotools
pip install torch==1.4.0 torchvision==0.5.0
python setup.py install 
cd openpsf/correlation_package
python setup.py install

Please note that the correlation module requires a GPU, so the model cannot run successfully on a Mac.
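Because the correlation module needs a GPU, it can help to check the environment before building it. The following is a minimal sketch, not part of this repository; the helper name `cuda_status` is hypothetical. It relies only on `torch.cuda.is_available()` and falls back gracefully if PyTorch is not installed yet.

```python
import importlib.util


def cuda_status():
    """Report whether PyTorch is installed and can see a CUDA GPU."""
    # find_spec avoids an ImportError when torch has not been installed yet.
    if importlib.util.find_spec("torch") is None:
        return "torch-missing"
    import torch
    return "cuda-available" if torch.cuda.is_available() else "cpu-only"


if __name__ == "__main__":
    print(cuda_status())
```

If this prints anything other than `cuda-available`, the correlation package is unlikely to build or run on that machine.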

Stereo Training

We load pretrained pifpaf weights for 2D pose detection. Please download the pretrained weights from pifpaf.
Please keep the name of the openpsf folder unchanged, because the pretrained pifpaf model assigns weights according to the folder name.

   python3 -m openpsf.train  --momentum=0.95   --epochs=20   --lr-decay 10 20   --batch-size=3   --basenet=resnet152block5   --quad=1   --headnets pif paf psf  --square-edge=401   --regression-loss=laplace   --lambdas 30 2 2 50 3 3 50 3 3   --crop-fraction=0.5 --pretrained (the model from pifpaf)
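The `--regression-loss=laplace` option in the command above selects a Laplace regression loss. As a rough illustration, the negative log-likelihood of a one-dimensional Laplace distribution with a predicted log-scale can be sketched as below. This is the textbook form, written here as an assumption; the function name `laplace_nll` is hypothetical, and the loss actually implemented in openpsf may differ in detail (for example in how multi-dimensional offsets are combined).

```python
import math


def laplace_nll(pred, target, log_b):
    """Negative log-likelihood of `target` under Laplace(pred, b), b = exp(log_b).

    -log p = log(2) + log(b) + |pred - target| / b
    Predicting log_b instead of b keeps the scale positive without constraints.
    """
    return math.log(2.0) + log_b + abs(pred - target) * math.exp(-log_b)


if __name__ == "__main__":
    # A larger predicted spread lowers the penalty for a big error
    # but raises the cost paid when the prediction is accurate.
    print(laplace_nll(pred=3.0, target=0.0, log_b=0.0))
    print(laplace_nll(pred=3.0, target=0.0, log_b=1.0))
```

The appeal of this kind of loss is that the network learns a per-prediction uncertainty alongside the regressed value.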

Pretrained Psf model

The pretrained model weights for person localization are available on Google Drive. The model with the correlation module is psf_corr. To use the model without the correlation module, replace every heads_corr in nets.py with head_psf; the corresponding pretrained model is psf_no_corr.

Jupyter Example

Our example notebook illustrates how to use the model with the KITTI dataset. There is no need to retrain if you want to use your own dataset; you can fine-tune the hyperparameters in association_pair.py (i.e. the camera parameter ratio k and the confidence threshold score) to better match your dataset.
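To give an intuition for why a camera-dependent constant and a confidence threshold would need adjusting for a new dataset, here is a minimal sketch of standard stereo triangulation. It assumes that k plays the role of focal length (in pixels) times baseline (in metres); whether association_pair.py defines k exactly this way is an assumption, and all function names and sample values below are hypothetical.

```python
def depth_from_disparity(x_left, x_right, k):
    """Depth in metres from the horizontal pixel positions of one keypoint.

    Assumes rectified images, so depth = k / disparity with
    k = focal_length_px * baseline_m (an assumption about what k means).
    """
    disparity = x_left - x_right
    if disparity <= 0:
        return None  # a non-positive disparity has no valid depth
    return k / disparity


def localize(pairs, k, score_threshold):
    """Keep confident left/right matches and convert them to depths.

    `pairs` is a list of (x_left, x_right, score) tuples.
    """
    depths = []
    for x_left, x_right, score in pairs:
        if score < score_threshold:
            continue  # drop low-confidence associations
        depth = depth_from_disparity(x_left, x_right, k)
        if depth is not None:
            depths.append(depth)
    return depths


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the repository.
    pairs = [(620.0, 600.0, 0.9), (410.0, 400.0, 0.2), (300.0, 305.0, 0.8)]
    print(localize(pairs, k=400.0, score_threshold=0.5))
```

A camera rig with a different focal length or baseline changes k, which is why the same network can be reused on another dataset once these values are adapted.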

Stereo Inference

python3 -m openpsf.predict --help

Stereo Result

| ALP | Type | error < 0.5 | error < 1 | error < 2 |
|---|---|---|---|---|
| PSF stereo | Stereo | 47.6% | 56.9% | 63.2% |
| MonoLoco | Mono | 27.6% | 47.8% | 66.2% |
| 3DOP | Mono | 41.5% | 54.5% | 63.0% |
| MonoDepth | Mono | 19.1% | 33.0% | 47.5% |

| ALE | Type | Easy | Moderate | Hard |
|---|---|---|---|---|
| PSF stereo | Stereo | 0.50(0.59) | 0.59(0.72) | 0.73(0.65) |
| 3DOP | Mono | 0.54(0.72) | 0.85(1.13) | 1.56(1.65) |
| MonoLoco | Mono | 0.85(0.88) | 0.97(1.23) | 1.14(1.49) |
| MonoDepth | Mono | 1.40(1.69) | 2.19(2.98) | 2.31(3.77) |
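For readers unfamiliar with the two metrics, the sketch below shows how numbers of this kind are typically computed from per-person distance errors. It assumes that ALP is the fraction of instances whose localization error falls below a threshold and that ALE is the mean absolute localization error; the exact evaluation protocol (for instance how missed detections are counted) is not described here, so treat this as an approximation. The function names and the sample errors are hypothetical.

```python
def average_localization_precision(errors, threshold):
    """Fraction of instances whose absolute error is below `threshold`."""
    if not errors:
        raise ValueError("errors must not be empty")
    return sum(1 for e in errors if abs(e) < threshold) / len(errors)


def average_localization_error(errors):
    """Mean absolute localization error over all instances."""
    if not errors:
        raise ValueError("errors must not be empty")
    return sum(abs(e) for e in errors) / len(errors)


if __name__ == "__main__":
    # Made-up errors for four people, for illustration only.
    errors = [0.25, 0.75, 1.5, 2.5]
    for threshold in (0.5, 1.0, 2.0):
        alp = average_localization_precision(errors, threshold)
        print(f"ALP (error < {threshold}): {alp:.1%}")
    print(f"ALE: {average_localization_error(errors):.2f}")
```

Under these assumptions, a higher ALP and a lower ALE both indicate better localization.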

Citation

The paper appeared at ICRA 2020. If you use, compare with, or refer to this work, please cite:

@INPROCEEDINGS{9197069,
  author={W. {Deng} and L. {Bertoni} and S. {Kreiss} and A. {Alahi}},
  booktitle={2020 IEEE International Conference on Robotics and Automation (ICRA)}, 
  title={Joint Human Pose Estimation and Stereo 3D Localization}, 
  year={2020},
  volume={},
  number={},
  pages={2324-2330},
  doi={10.1109/ICRA40945.2020.9197069}}
