This repository holds an extension of the codebase behind the EMNLP 2021 paper Towards Label-Agnostic Emotion Embeddings to facial emotion recognition. It thus generalizes our emotion embedding approach from language only to language and vision.
By Sven Buechel and Udo Hahn, Jena University Language and Information Engineering (JULIE) Lab: https://julielab.de. Special thanks to Luise Modersohn for contributing to an earlier version of this codebase.
This codebase was developed on and tested for Debian 9.
`cd` into the project root folder. Set up your conda environment:
conda create --name "emocoder" python=3.7 pip
conda activate emocoder
pip install -r requirements.txt
Add the project root to your PYTHONPATH: `export PYTHONPATH=$PYTHONPATH:$(pwd)`.
Copy and rename the file `emocoder/experiments/config_template.json` to `config.json`.
You are now set up to use our codebase. However, to re-run our experiments, you will need to download the respective datasets (see below).
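If you want to verify the setup before continuing, a minimal check like the one below can help. It is not part of the repository and merely assumes that you run it from the project root, that the project root is on your `PYTHONPATH`, and that `config.json` has been created as described above:

```python
# Minimal setup check (not part of the repository); run it from the project root.
from pathlib import Path

import emocoder  # fails if the project root is not on PYTHONPATH

config = Path("emocoder/experiments/config.json")
assert config.exists(), "Copy config_template.json to config.json first."
print("emocoder is importable and config.json is in place.")
```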
There are four types of datasets necessary to replicate all experiments: text datasets, word datasets, image datasets, and word embeddings.
Download the following files (1–2 GB each), unzip them, and place them under `emocoder/data/vectors`.
The word datasets are the following:

- en1. Either request the 1999 version of the Affective Norms for English Words (ANEW) from the Center for the Study of Emotion and Attention at the University of Florida, or copy-paste/parse the data from the technical report: Bradley, M. M., & Lang, P. J. (1999). Affective Norms for English Words (ANEW): Stimuli, Instruction Manual and Affective Ratings (C-1). The Center for Research in Psychophysiology, University of Florida. Format the data as a tsv file with column headers `word`, `valence`, `arousal`, `dominance` and save it under `emocoder/data/datasets/ANEW1999.csv` (a conversion sketch follows after this list).
- en2. Get the file `Stevenson(2007)-ANEW_emotional_categories.xls` from Stevenson et al. (2007) and save it as `emocoder/data/datasets/stevenson2007.xls`.
- es1. Get the file `13428_2015_700_MOESM1_ESM.csv` from Stadthagen-Gonzalez et al. (2017) and save it as `emocoder/data/datasets/Stadthagen_VA.csv`.
- es2. Get the file `13428_2017_962_MOESM1_ESM.csv` from Stadthagen-Gonzalez et al. (2018) and save it as `emocoder/data/datasets/Stadthagen_BE.csv`.
- de1. Get the file `BAWL-R.xls` from Vo et al. (2009), which is currently available here. You will need to request a password from the authors. Save the file without password protection as `emocoder/data/datasets/Vo.csv`. We had to run an automatic file repair when opening it with Excel for the first time.
- de2. Get the file `13428_2011_59_MOESM1_ESM.xls` from Briesemeister et al. (2011) and save it as `emocoder/data/datasets/Briesemeister2011.xls`.
- pl1. Get the file `13428_2014_552_MOESM1_ESM.xlsx` from Riegel et al. (2015) and save it as `emocoder/data/datasets/Riegel2015.xlsx`.
- pl2. Get the file `S1 Dataset` from Wierzba et al. (2015) and save it as `emocoder/data/datasets/Wierzba2015.xlsx`.
- tr1. Get the file `TurkishEmotionalWordNorms.csv` from Kapucu et al. (2018), which is available here. Place it under `emocoder/data/datasets/Kapucu.csv`.
- tr2. This dataset is included in tr1.
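Since en1 has to be assembled by hand, here is a minimal sketch of the conversion referenced in the en1 item above. The input file name and its column names are hypothetical and depend on how you obtained the ANEW data; only the output layout (tab-separated, with headers `word`, `valence`, `arousal`, `dominance`) is what our code expects:

```python
# Hypothetical conversion of a raw ANEW-1999 export into the tsv layout expected
# under emocoder/data/datasets/ANEW1999.csv. Adjust the input file name and the
# column names ("Description", "Valence Mean", ...) to your copy of the data.
import pandas as pd

raw = pd.read_csv("anew1999_raw.csv")  # however you parsed the tech report

formatted = pd.DataFrame({
    "word": raw["Description"],
    "valence": raw["Valence Mean"],
    "arousal": raw["Arousal Mean"],
    "dominance": raw["Dominance Mean"],
})

# Despite the .csv extension, the file is written tab-separated, as described above.
formatted.to_csv("emocoder/data/datasets/ANEW1999.csv", sep="\t", index=False)
```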
The text datasets are the following:

- Affective Text. Get the archive `AffectiveText.Semeval.2007.tar.gz` from Strapparava and Mihalcea (2007) and save it as `emocoder/data/datasets/AffectiveText.tar.gz`.
- EmoBank. This dataset will download automatically from GitHub when needed.
- CVAT. Get the archive `ChineseEmoBank.zip` from Lung-Hao Lee and save it as `emocoder/data/datasets/ChineseEmoBank.zip`. We requested the dataset directly from the author via personal communication.
The image datasets are the following:

- Facial Emotion Recognition. Get the file `fer2013.csv` from this Kaggle competition, as described in this paper. Also get the file `imdb_DimEmotion.mat` from this repository, as described in this paper. Combine both files into a new file called `fer2013+vad.csv`, as illustrated in the notebook `emocoder/scripts/read-and-combine-FER2013-vad-data.ipynb`. Place this file under `emocoder/data/datasets/fer2013`.
- AffectNet. Get the AffectNet database from this paper and place the folders `Labels` and `ManuallyAnnotated` under `emocoder/data/datasets/AffectNet`.
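Once everything is downloaded, a quick check like the following can confirm that nothing is missing before you start the experiments. It is not part of the repository, just a convenience sketch built from the paths listed above:

```python
# Convenience check (not part of the repository): verify that the manually
# downloaded dataset files sit at the paths the experiments expect.
from pathlib import Path

expected = [
    "emocoder/data/datasets/ANEW1999.csv",
    "emocoder/data/datasets/stevenson2007.xls",
    "emocoder/data/datasets/Stadthagen_VA.csv",
    "emocoder/data/datasets/Stadthagen_BE.csv",
    "emocoder/data/datasets/Vo.csv",
    "emocoder/data/datasets/Briesemeister2011.xls",
    "emocoder/data/datasets/Riegel2015.xlsx",
    "emocoder/data/datasets/Wierzba2015.xlsx",
    "emocoder/data/datasets/Kapucu.csv",
    "emocoder/data/datasets/AffectiveText.tar.gz",
    "emocoder/data/datasets/ChineseEmoBank.zip",
    "emocoder/data/datasets/fer2013/fer2013+vad.csv",
    "emocoder/data/datasets/AffectNet/Labels",
    "emocoder/data/datasets/AffectNet/ManuallyAnnotated",
]

missing = [path for path in expected if not Path(path).exists()]
print("All dataset files in place." if not missing else f"Missing: {missing}")
```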
- From the project root, run `python emocoder/scripts/setup-target-folder-structure.py`. This will create a new `target` folder and all necessary subfolders. If the `target` folder already exists, rename it first to keep old and new results separate.
- `python emocoder/scripts/run_all_mapping_experiments.py` (this should only take a couple of minutes). By default, the experiments will run on GPU 0. Use the `--gpu` parameter to choose a different GPU.
- Identify the path to the model checkpoint in the "no_nrc" condition. This should be something like `emocoder/target/mapping/multitask/dev/no_nrc-<someTimestamp>/checkpoints/model_<epochNumber>.pt`. Insert this path in `emocoder/experiments/config.json` in the "EMOTION_CODEC_PATH" field (see the sketch after this list).
- `python emocoder/scripts/run_all_baseline_experiments.py` (this may take a couple of hours).
- `python emocoder/scripts/run_all_encoder_experiments.py` (this may take a couple of hours).
- `python emocoder/scripts/run_all_checkpoint_test.py` (this may take a couple of hours).
- `python emocoder/scripts/aggregate_all.py --split test`
You have now re-run all of our experiments. You can find your replicated results within the `emocoder/target/` directory. Note that small deviations are to be expected due to inherent randomness. The best way to inspect the results is via the notebooks under `emocoder/analysis`.
To recreate our visualizations of the emotion space, `cd emocoder/analysis` and run the four `get_*.py` scripts in there:
python get_prediction_head_embeddings.py
python get_word_emotion_embeddings.py
python get_text_emotion_embeddings.py
python get_image_emotion_embeddings.py
The figures from the paper can then be accessed by running `Interlingua-Visualization.ipynb`.
Due to the randomness inherent in the training process, the plots will look slightly different from the published versions. The original plots can be recreated by running the above steps with the original experimental results (published separately).
Should you have any further questions, please reach out to me via [email protected].