A deep neural network that detects emotions through facial expressions from talking faces. The following datasets are used:
- GRID corpus dataset
- RAVDESS
- AffectNet
- Oulu-CASIA
- NBE Datasets (not publicly available yet)
The DNN consists of two sub-models: a LipNet-based lip-reading model and a still-image baseline AFER (facial expression recognition) model.
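For orientation, here is a minimal two-stream sketch in Keras. The layer sizes and fusion strategy are illustrative assumptions, not this repository's actual architecture; only the input shapes and the seven emotion classes come from the data notes below.

```python
# Illustrative two-stream sketch (NOT the repo's exact architecture):
# a still-image face branch plus a LipNet-style lip-sequence branch,
# fused into the seven emotion classes used in the dataset layout.
import tensorflow as tf
from tensorflow.keras import layers, Model

# Face stream: (224, 224, 3) face crops.
face_in = layers.Input(shape=(224, 224, 3), name="face")
x = layers.Conv2D(32, 3, activation="relu")(face_in)
x = layers.GlobalAveragePooling2D()(x)

# Lip stream: a variable-length sequence of (50, 100, 3) lip crops.
lip_in = layers.Input(shape=(None, 50, 100, 3), name="lips")
y = layers.TimeDistributed(layers.Conv2D(32, 3, activation="relu"))(lip_in)
y = layers.TimeDistributed(layers.GlobalAveragePooling2D())(y)
y = layers.GRU(64)(y)

# Late fusion and classification over the seven emotions.
z = layers.Concatenate()([x, y])
out = layers.Dense(7, activation="softmax")(z)
model = Model(inputs=[face_in, lip_in], outputs=out)
```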
All scripts for data preparation, model training, and model benchmarking are provided under the `script` directory.
Store the datasets under the `data` directory. Each dataset folder should have the following structure:
```
dataset
├── npy
└── videos
```
The GRID dataset requires additional alignment files (check the LipNet repo for more details); its directory tree should look like this:
```
GRID
├── align
├── npy
└── videos
```
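For reference, a GRID alignment file lists one token per line as `start end word` (the format used by the LipNet repo); a minimal parser under that assumption could look like the sketch below.

```python
# Minimal sketch of a GRID .align parser; assumes the "start end word"
# line format used by the LipNet repo and skips silence/pause tokens.
def read_align(path):
    words = []
    with open(path) as f:
        for line in f:
            start, end, word = line.split()
            if word not in ("sil", "sp"):
                words.append((int(start), int(end), word))
    return words
```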
Still-image datasets such as AffectNet should follow this layout:

```
dataset
├── train
│   └── faces
│       ├── angry
│       │   └── xxx.jpg
│       ├── disgust
│       ├── fear
│       ├── happy
│       ├── neutral
│       ├── sad
│       └── surprise
└── test
    └── faces
        └── ...
```
- All images should be stored in `.jpg` format.
- All videos should be stored in `.npy` format.
- Each face image should be cropped to shape (224, 224, 3), and each lip image should have shape (50, 100, 3). The shape order is Width x Height x Channel.
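As a rough illustration of producing those shapes, the sketch below uses OpenCV with fixed placeholder crop boxes; a real pipeline would take the face and mouth regions from a detector, and the paths and function name here are hypothetical.

```python
# Preprocessing sketch: save one (224, 224, 3) face crop as .jpg and the
# lip sequence as a .npy array. Crop boxes are placeholders, not detector output.
import cv2
import numpy as np

def preprocess(video_path, face_jpg, lips_npy):
    cap = cv2.VideoCapture(video_path)
    face, lip_frames = None, []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if face is None:
            face = cv2.resize(frame, (224, 224))               # -> (224, 224, 3)
        lip = cv2.resize(frame[150:250, 100:200], (100, 50))   # -> (50, 100, 3)
        lip_frames.append(lip)
    cap.release()
    cv2.imwrite(face_jpg, face)              # images are stored as .jpg
    np.save(lips_npy, np.stack(lip_frames))  # videos are stored as .npy
```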
Use the `script/lipnet.py` script to start training the LipNet model after preprocessing the GRID dataset.
Use the `script/baseline.py` script to start training the baseline AFER model (still-image based) after preprocessing the AffectNet dataset.
Use the `script/dnn.py` script to start training the DNN model after preprocessing the RAVDESS dataset.
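Assuming training is started from the repository root, the three steps might look like this (any required arguments are omitted; check each script for its actual interface):

```bash
python script/lipnet.py     # LipNet sub-model, after preprocessing GRID
python script/baseline.py   # baseline AFER sub-model, after preprocessing AffectNet
python script/dnn.py        # combined DNN, after preprocessing RAVDESS
```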
Use the `predict.py` script to analyze a video or a directory of videos with a trained model:
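For example (the exact arguments are not documented here, so treat these invocations as hypothetical):

```bash
python predict.py path/to/video.mp4   # analyze a single video
python predict.py path/to/videos/    # analyze a directory of videos
```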
Scripts under the `script/evaluating` directory are provided for quick real-time evaluation; results are not guaranteed.
- TensorFlow Dataset pipeline (see the sketch after this list)
- Generate a dummy cropped image representing the neutral emotion
- Documentation: Proper usage and code documentation
- Testing: Develop unit tests
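A minimal sketch of what the planned tf.data pipeline could look like, assuming one `.npy` array per video under `data/<dataset>/npy`; the glob pattern, normalization, and omission of labels are all assumptions.

```python
# Sketch of a tf.data pipeline over stored .npy videos; labels are omitted
# and batch size is 1 because clip lengths vary between videos.
import numpy as np
import tensorflow as tf

def load_npy(path):
    video = np.load(path.numpy().decode())   # (frames, height, width, 3)
    return video.astype(np.float32) / 255.0

files = tf.data.Dataset.list_files("data/RAVDESS/npy/*.npy")
ds = (
    files.map(
        lambda p: tf.py_function(load_npy, [p], tf.float32),
        num_parallel_calls=tf.data.AUTOTUNE,
    )
    .batch(1)
    .prefetch(tf.data.AUTOTUNE)
)
```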
- Kai Yao - fecodoo @ Aalto University - Department of Computer Science
This project is licensed under the Apache 2.0 License - see the LICENSE file for details