We propose a vision-based method that identifies human gestures for autonomous vehicles. The method uses a human skeleton estimation model to locate a person's skeleton key points, constructs a feature vector representing the gesture from those estimated key points, and then applies a gesture recognition model to identify certain gestures of interest.
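As a rough sketch of the feature-vector idea, assuming 18 COCO-style key points; the exact feature definition used in this repo lives in the code, and the normalization below (neck-centered, torso-scaled) is only illustrative:

```python
import numpy as np

# Sketch of the feature-vector construction, assuming 18 COCO-style
# key points. The neck/hip indices and the normalization are illustrative
# assumptions, not the repo's exact feature definition.

NECK, R_HIP = 1, 8  # assumed COCO keypoint indices

def skeleton_to_feature(keypoints):
    """keypoints: (18, 2) array of (x, y) pixel coordinates."""
    pts = np.asarray(keypoints, dtype=float)
    origin = pts[NECK]                               # translate to the neck
    scale = np.linalg.norm(pts[R_HIP] - pts[NECK])   # torso length
    return ((pts - origin) / (scale or 1.0)).ravel() # 36-dim vector

feat = skeleton_to_feature(np.arange(36).reshape(18, 2))
print(feat.shape)  # (36,)
```

Normalizing this way makes the feature invariant to the person's position in the image and, roughly, to body size.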
The human skeleton estimation follows this CVPR paper.
The original repo can be found here.
- Human Skeleton Estimation Model Implementation and Training
- Gesture Recognition Model Implementation and Training
A demo script (main.ipynb) walks through the whole process: a test image is imported, the human skeleton estimation model is run on it, the feature vector is constructed, and finally the gesture recognition model interprets the gesture.
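The demo flow can be sketched end to end; the functions below are illustrative stand-ins, not the actual API of this repo (the real models are Keras networks loaded from model_weights.h5 and the trained gesture model):

```python
import numpy as np

# End-to-end sketch of the demo flow in main.ipynb, with stand-ins for
# the two models. All names here are illustrative assumptions.

def estimate_skeleton(image):
    # stand-in: the real model predicts 18 (x, y) keypoint locations
    h, w, _ = image.shape
    return np.random.rand(18, 2) * [w, h]

def build_feature(keypoints):
    # stand-in: the real code builds a normalized feature vector
    return np.asarray(keypoints).ravel()

def recognize_gesture(feature):
    # stand-in: the real classifier maps the feature to a gesture label
    return "stop_vehicle"

test_image = np.zeros((368, 368, 3))       # 1) import a test image
keypoints = estimate_skeleton(test_image)  # 2) run skeleton estimation
feature = build_feature(keypoints)         # 3) construct the feature vector
print(recognize_gesture(feature))          # 4) interpret the gesture
```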
To run main.ipynb, you need to:
- Install all the packages you need following package_spec.txt (note that you need Keras 2.2.0; other versions may fail to extract the weights)
- Install Anaconda and Jupyter Notebook
- Download the model weights from this LINK and put the model_weights.h5 file in the data folder
- Run main.ipynb cell by cell
- Option 1: Download a small sample training set saved by us from this LINK
- Option 2: Download the COCO data set (65GB) and API following this REPO
- Download the training data (6GB) following the LINK
- Put the training data in the data folder
- (Optional) Download the pre-trained weights from this LINK and put the model_weights.h5 file in the data folder
- cd skeleton_estimation_train
- python3 train_model_main.py
- Without pre-trained weights, set RETRAIN = 0
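The RETRAIN switch can be understood as follows; this is a hypothetical sketch of the logic in train_model_main.py, not the actual code:

```python
import os

# Hypothetical sketch of the RETRAIN switch: with RETRAIN = 1 training
# resumes from the downloaded model_weights.h5; with RETRAIN = 0 it starts
# from scratch. Function and variable names are illustrative assumptions.

RETRAIN = 1  # set to 0 if you did not download the pre-trained weights

def maybe_load_weights(model, path="data/model_weights.h5", retrain=RETRAIN):
    if retrain and os.path.exists(path):
        model.load_weights(path)  # resume from the checkpoint
        return "resumed"
    return "from_scratch"         # random initialization

class DummyModel:  # stand-in for the Keras model
    def load_weights(self, path):
        self.loaded = path

print(maybe_load_weights(DummyModel(), path="missing.h5", retrain=0))  # from_scratch
```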
- train_model_main.py: main code to train the model
- packages_lib.py: imports all the packages we need
- model_lib.py: all the sub-CNN models we need (VGG, ...) to build the skeleton detection model
- model_builder.py: uses the sub-CNNs from model_lib.py to build the final model
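To illustrate how model_builder.py might wire the sub-CNNs from model_lib.py, here is a shape-level sketch of a two-branch, multi-stage pose network in the style of the CVPR paper (Part Affinity Fields). Plain NumPy functions stand in for the Keras blocks; the stage count and channel numbers are assumptions:

```python
import numpy as np

# Shape-level sketch: a VGG feature extractor feeds repeated two-branch
# stages that predict keypoint heatmaps and part affinity fields (PAFs).
# NumPy stand-ins replace the real Keras sub-CNNs; stage count and channel
# numbers are assumptions.

def vgg_features(img):
    # stand-in for the VGG front end: downsamples by 8, outputs 128 channels
    h, w, _ = img.shape
    return np.zeros((h // 8, w // 8, 128))

def stage_block(x, n_out):
    # stand-in for one refinement stage producing n_out output maps
    h, w, _ = x.shape
    return np.zeros((h, w, n_out))

def build_model(img, n_stages=6):
    f = vgg_features(img)
    heatmaps = pafs = None
    for _ in range(n_stages):
        # later stages also see the previous stage's predictions
        x = f if heatmaps is None else np.concatenate([f, heatmaps, pafs], axis=-1)
        heatmaps = stage_block(x, 19)  # 18 keypoints + background
        pafs = stage_block(x, 38)      # 2 channels per limb
    return heatmaps, pafs

hm, paf = build_model(np.zeros((368, 368, 3)))
print(hm.shape, paf.shape)  # (46, 46, 19) (46, 46, 38)
```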
We want to point out that our work focuses on the human skeleton estimation model implementation; gesture recognition is an application built on the detected human skeleton and is not the focus of this project.
We could not find a free, public data set that fits our requirements, so to test the gesture recognition concept we manually labeled 1000 images covering the three most common human intentions: a pedestrian stopping the vehicle, a pedestrian requesting a ride, and a biker indicating a lane change.
- Install Jupyter Notebook
- cd gesture_train
- run gesture_recognition_train.ipynb
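As a toy illustration of the gesture-recognition step on the three labeled intentions (the real classifier is trained in gesture_recognition_train.ipynb; a nearest-centroid rule stands in for it here, and the 36-dim feature size is an assumption):

```python
import numpy as np

# Toy stand-in for the trained gesture classifier: assign a feature vector
# to the nearest class centroid. Labels, centroids, and the 36-dim feature
# size are illustrative assumptions.

GESTURES = ["pedestrian_stop", "pedestrian_ride_request", "biker_lane_change"]

def classify(feature, centroids):
    """centroids: (3, 36) array, one mean feature vector per gesture."""
    dists = np.linalg.norm(centroids - feature, axis=1)
    return GESTURES[int(np.argmin(dists))]

centroids = np.eye(3, 36)  # toy class centroids
print(classify(centroids[1], centroids))  # pedestrian_ride_request
```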