Hierarchical Windowed Graph Attention Network (HWGAT) is a deep learning model specifically designed for sign language recognition. This model leverages hierarchical and windowed attention mechanisms to effectively capture the temporal and spatial dependencies in sign language skeleton data. This repository includes a comprehensive implementation of HWGAT, covering data preprocessing and the full training pipeline.
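For intuition, windowed attention restricts self-attention to short temporal windows instead of the whole frame sequence, which keeps attention local and cheap. The PyTorch snippet below is a minimal sketch of that idea only; it is not the repository's HWGAT implementation, and the embedding size, head count, and window size are illustrative.

```python
# Minimal sketch of windowed temporal self-attention over per-frame skeleton
# embeddings. Illustrative only -- NOT the repository's HWGAT code.
import torch
import torch.nn as nn

class WindowedTemporalAttention(nn.Module):
    def __init__(self, embed_dim=64, num_heads=4, window_size=8):
        super().__init__()
        self.window_size = window_size
        self.attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)

    def forward(self, x):
        # x: (batch, frames, embed_dim); frames assumed divisible by window_size
        b, t, c = x.shape
        w = self.window_size
        # Split the sequence into non-overlapping temporal windows and
        # attend only within each window.
        x = x.reshape(b * t // w, w, c)
        out, _ = self.attn(x, x, x)
        return out.reshape(b, t, c)

x = torch.randn(2, 32, 64)                   # 2 clips, 32 frames, 64-d features
print(WindowedTemporalAttention()(x).shape)  # torch.Size([2, 32, 64])
```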
To get started with HWGAT for sign language recognition, follow these steps:
- Clone the repository:

  ```bash
  git clone https://github.com/suvajit-patra/sl-hwgat.git
  cd sl-hwgat/hwgat
  ```

- Create a Docker instance with the `Dockerfile` and run the container (example commands follow this list).
- Install the required dependencies with:

  ```bash
  pip install -r requirements.txt
  ```
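For the Docker step, the commands below are a minimal sketch of a typical build-and-run sequence; the image tag `hwgat` and the mounted dataset path are placeholders, not names fixed by the repository.

```bash
# Illustrative only: the "hwgat" tag and the mount path are placeholders.
docker build -t hwgat .
docker run --gpus all -it -v /data/datasets:/data/datasets hwgat
```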
The data preprocessing pipeline prepares the raw sign language data for training.
- Generate metadata: Ensure your dataset is structured properly, then run the metadata generator script for the corresponding dataset. A metadata file must be generated before the deep learning pipeline can run, e.g.:

  ```bash
  python meta_generators/FDMSE-ISL_meta_gen.py
  ```

  **Note:** Remember to update the paths inside every meta generator script.

  This should generate a file at `/data/datasets/FDMSE-ISL/FDMSE-ISL_meta/metadata.csv`. If you are using a different dataset, write your own meta generator, as sketched below.
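  The metadata schema is defined by the scripts in `meta_generators/`, so mirror an existing one such as `FDMSE-ISL_meta_gen.py`. The sketch below is only a rough illustration of a custom generator; the column names (`video_path`, `label`, `split`) and the directory layout are assumptions, not the pipeline's required schema.

  ```python
  # Hypothetical custom meta generator. The columns and layout here are
  # ASSUMPTIONS -- copy the exact schema from a script in meta_generators/.
  import csv
  import os

  root = '/data/datasets/MyDataset'            # placeholder dataset root
  out_dir = os.path.join(root, 'MyDataset_meta')
  os.makedirs(out_dir, exist_ok=True)

  with open(os.path.join(out_dir, 'metadata.csv'), 'w', newline='') as f:
      writer = csv.writer(f)
      writer.writerow(['video_path', 'label', 'split'])    # assumed columns
      for split in ('train', 'val', 'test'):
          split_dir = os.path.join(root, split)
          for label in sorted(os.listdir(split_dir)):
              for video in sorted(os.listdir(os.path.join(split_dir, label))):
                  writer.writerow([os.path.join(split, label, video), label, split])
  ```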
- Generate keypoints: Extract keypoints and save them using the `pose_feature_extract.py` file by running the following command, where `--root` is the root directory of the dataset, `--meta` is the dataset's `metadata.csv`, and `--out_path` is the saving path of the outputs (keypoints); the folder will be created under the root directory.

  ```bash
  python pose_feature_extract.py --root '/data/datasets/FDMSE-ISL' --meta '/data/datasets/FDMSE-ISL/FDMSE-ISL_meta/metadata.csv' -m mediapipe --out_path 'mediapipe_out/'
  ```
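  For context, the snippet below is a simplified illustration of per-frame landmark extraction with the public MediaPipe Holistic API; the actual `pose_feature_extract.py` handles the full dataset, feature selection, and saving. The video path is a placeholder.

  ```python
  # Simplified, illustrative keypoint extraction with MediaPipe Holistic;
  # not the repository's pose_feature_extract.py.
  import cv2
  import mediapipe as mp

  holistic = mp.solutions.holistic.Holistic(static_image_mode=False)
  cap = cv2.VideoCapture('/data/datasets/FDMSE-ISL/sample.mp4')  # placeholder
  frames_kpts = []
  while cap.isOpened():
      ok, frame = cap.read()
      if not ok:
          break
      results = holistic.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
      if results.pose_landmarks:
          # 33 body landmarks per frame; hand and face landmarks also exist
          frames_kpts.append([(lm.x, lm.y, lm.z)
                              for lm in results.pose_landmarks.landmark])
  cap.release()
  holistic.close()
  ```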
- Process keypoints data: Next, preprocess the generated keypoints so that they can be used to train the transformer-based model, using the following command, where `-ds` is the dataset name, `--root` is the root directory of the dataset, `--meta` is the dataset's `metadata.csv`, `-dr` is the keypoints output path relative to the root, `-kpm` is the keypoint extraction model, and `-ft` is the feature type that is extracted.

  ```bash
  python data_preprocess.py --root /data/datasets/FDMSE-ISL/ --ds FDMSE-ISL --meta /data/datasets/FDMSE-ISL/FDMSE-ISL_meta/metadata.csv -dr mediapipe_out/ -kpm mediapipe -ft keypoints
  ```
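  The exact transforms live in `data_preprocess.py`; as a rough, assumption-laden sketch, skeleton pipelines of this kind typically center the keypoints on a reference joint, scale them, and resample clips to a fixed number of frames, along these lines:

  ```python
  # Illustrative-only preprocessing sketch (centering, scaling, fixed-length
  # temporal resampling); the repository's data_preprocess.py is authoritative.
  import numpy as np

  def preprocess(kpts, target_len=64, center_joint=0):
      # kpts: (frames, joints, 3) raw keypoints
      kpts = kpts - kpts[:, center_joint:center_joint + 1, :]  # center on a joint
      scale = np.abs(kpts).max() or 1.0
      kpts = kpts / scale                                      # scale into [-1, 1]
      idx = np.linspace(0, len(kpts) - 1, target_len).round().astype(int)
      return kpts[idx]                                         # uniform resampling

  clip = np.random.rand(120, 33, 3)     # e.g. 120 frames of 33 MediaPipe joints
  print(preprocess(clip).shape)         # (64, 33, 3)
  ```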
Once the data is preprocessed, you can train the HWGAT model using the training pipeline provided.
- Configure the training parameters: Edit the `configs.py` file to set your training parameters, such as learning rate, batch size, number of epochs, etc. (see the illustrative snippet after this list).
- Training the model: Start the training process of the model by running

  ```bash
  python main.py -m train -d FDMSE-ISL --model HWGAT
  ```
- Testing the model: Test the model using

  ```bash
  python main.py -m test -d FDMSE-ISL --model HWGAT -t 240227_1807 -px best_loss
  ```
- Load and train the model: Load and train the model, or finetune it on a different dataset, using
  - Load and train on the same dataset:

    ```bash
    python main.py -m load -d FDMSE-ISL --model HWGAT -t 240227_1807 -px best_loss
    ```

  - Finetune on another dataset:

    ```bash
    python main.py -m load -d INCLUDE --model HWGAT -mw output/FDMSE-ISL/HWGAT_240227_1807/model_best_loss.pt
    ```
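The parameter names in `configs.py` are defined by the repository; the snippet below only illustrates the kind of settings you would edit there, with hypothetical names and values.

```python
# Hypothetical excerpt -- variable names and values are illustrative,
# not the repository's actual configs.py contents.
batch_size = 32
learning_rate = 1e-4
num_epochs = 200
num_workers = 4
```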
Go to this repository to get the demo application of the HWGAT model for sign language recognition tasks.
This project is licensed under the MIT License - see the LICENSE file for details.
If you find this project useful in your research, please consider citing:
```bibtex
@misc{patra2024hierarchicalwindowedgraphattention,
  title={Hierarchical Windowed Graph Attention Network and a Large Scale Dataset for Isolated Indian Sign Language Recognition},
  author={Suvajit Patra and Arkadip Maitra and Megha Tiwari and K. Kumaran and Swathy Prabhu and Swami Punyeshwarananda and Soumitra Samanta},
  year={2024},
  eprint={2407.14224},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2407.14224},
}
```
Thank you for using this repository. For any questions or support, please open an issue in this repository.