Tomas Jakab*, Ankush Gupta*, Hakan Bilen, Andrea Vedaldi (* equal contribution). Advances in Neural Information Processing Systems (NeurIPS) 2018.
Software that learns to discover object landmarks without any manual annotations. It automatically learns from images or videos and works across different datasets of faces, humans, and 3D objects.
- Linux
- Python 2.7
- TensorFlow 1.10.0. Other versions (1.*.*) are also likely to work
- Torch 0.4.1
- CUDA and cuDNN. CPU mode may work but is untested
- Python dependencies listed in requirements.txt
Clone this repository
git clone https://github.com/tomasjakab/imm && cd imm
Install Python dependencies by running
pip install --upgrade -r requirements.txt
Add the path to this codebase to PYTHONPATH
export PYTHONPATH=$PYTHONPATH:$(pwd)
Download trained models [0.9G] and set the path to them in configs/paths/default.yaml, option logdir.
Use the Jupyter notebook examples/visualize.ipynb to run a model trained on the AFLW dataset of faces that predicts 10 unsupervised landmarks.
We provide pre-trained models to reproduce the experimental results on facial landmark detection datasets (CelebA, MAFL, and AFLW). Please download them first as described in Getting Started/Visualize Unsupervised Landmarks.
Download the CelebA [7.8G] dataset and set the path to it in configs/paths/default.yaml, option celeba_data_dir.
The MAFL dataset is already included in the CelebA download.
To test on MAFL dataset run
bash examples/test_mafl.sh <N>
This loads a model that was trained on the CelebA dataset to predict N unsupervised landmarks (N can be set to 10, 30, or 50). It then trains a linear regressor from the unsupervised landmarks to 5 labeled landmarks using the MAFL training set and evaluates it on the MAFL test set.
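The linear-regression step described above can be sketched as follows. This is a minimal illustration using NumPy least squares, not the code the test scripts run: the function names, the bias column, the absence of regularization, and the plain mean point distance used as the error are all assumptions of this sketch.

```python
import numpy as np


def _design_matrix(unsup):
    """Flatten (num_images, N, 2) landmarks and append a bias column."""
    x = unsup.reshape(len(unsup), -1)
    # The bias column is an assumption of this sketch.
    return np.hstack([x, np.ones((len(x), 1))])


def fit_linear_regressor(unsup, labeled):
    """Least-squares map from unsupervised landmarks to labeled landmarks.

    unsup:   (num_images, N, 2) unsupervised landmark coordinates
    labeled: (num_images, K, 2) annotated landmark coordinates (K = 5 here)
    Returns weights of shape (2N + 1, 2K); the last row is the bias.
    """
    y = labeled.reshape(len(labeled), -1)
    weights, _, _, _ = np.linalg.lstsq(_design_matrix(unsup), y, rcond=None)
    return weights


def predict(weights, unsup):
    """Apply the fitted regressor; returns (num_images, K, 2) coordinates."""
    return _design_matrix(unsup).dot(weights).reshape(len(unsup), -1, 2)


def mean_point_error(pred, target):
    """Mean Euclidean distance between predicted and annotated landmarks."""
    return float(np.mean(np.linalg.norm(pred - target, axis=-1)))
```

In this sketch the regressor would be fitted on landmarks extracted from the training set and `mean_point_error` computed on the test set. The error is left unnormalized here; any normalization used by the actual evaluation is not covered.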
Download the AFLW [1.1G] dataset and set the path to it in configs/paths/default.yaml, option aflw_data_dir.
To test on AFLW dataset run
bash examples/test_aflw.sh <N>
This loads a model that was trained on the CelebA dataset and fine-tuned on the AFLW dataset to predict N unsupervised landmarks (N can be set to 10, 30, or 50). It then trains a linear regressor from the unsupervised landmarks to 5 labeled landmarks using the AFLW training set and evaluates it on the AFLW test set.
If you wish to train your own model, please download the VGG16 model [0.6G] that was pre-trained on a colorization task and is needed for the perceptual loss. This model comes from the paper Colorization as a Proxy Task for Visual Understanding, Larsson, Maire, Shakhnarovich, CVPR 2017. Set the path to this model in configs/paths/default.yaml, option vgg16_path. Also download the datasets and update the paths to them as described above.
Set the option logdir in configs/paths/default.yaml to the location where you wish to store training logs and checkpoints.
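Before starting a long training run, it may help to check that the path options mentioned in this README point at locations that exist. The sketch below uses only the standard library and assumes that configs/paths/default.yaml stores these options as flat, top-level `key: value` lines; if the file is nested or uses other YAML features, a real YAML parser would be the safer choice. The helper names are hypothetical and not part of this codebase.

```python
import os
import re

# Option names are taken from this README.
PATH_OPTIONS = ("logdir", "celeba_data_dir", "aflw_data_dir", "vgg16_path")


def read_flat_options(text):
    """Parse top-level `key: value` lines (assumed layout).

    Blank lines, comment lines, indented (nested) entries, and keys
    without a value are ignored. Trailing `# ...` comments are dropped.
    """
    options = {}
    for line in text.splitlines():
        match = re.match(r"^([A-Za-z_][A-Za-z0-9_]*):\s*(.*?)\s*(?:#.*)?$", line)
        if match and match.group(2):
            options[match.group(1)] = match.group(2).strip("'\"")
    return options


def missing_paths(options, names=PATH_OPTIONS):
    """Return the option names that are unset or point to a missing path."""
    return [
        name for name in names
        if name not in options
        or not os.path.exists(os.path.expanduser(options[name]))
    ]


def check_config(path="configs/paths/default.yaml"):
    """Read the config file and report which path options need attention."""
    with open(path) as config_file:
        return missing_paths(read_flat_options(config_file.read()))
```

Calling `check_config()` from the repository root returns the names of options that still need to be set. A logdir that has not been created yet would also be reported, which may be acceptable when training from scratch.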
To train a model for N unsupervised landmarks on the CelebA dataset (e.g., N can be 10, 30, or anything else), run
bash examples/train_celeba.sh <N>
We first train on CelebA as described above, and then fine-tune on AFLW due to its small size.
To fine-tune a model for N unsupervised landmarks on the AFLW dataset, run
bash examples/train_aflw.sh <N> <celeba_checkpoint>
where celeba_checkpoint is the path to the model checkpoint trained on CelebA. This could be, for example, data/logs/celeba-10pts/model.ckpt.
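The two-stage procedure (train on CelebA, then fine-tune on AFLW from the resulting checkpoint) can be scripted. The sketch below only assembles the two commands shown above. The helper names are hypothetical, and the checkpoint location is an assumption that generalizes the example path data/logs/celeba-10pts/model.ckpt; the directory naming actually produced by the training script may differ.

```python
import posixpath
import subprocess


def training_commands(num_landmarks, logdir="data/logs"):
    """Return the CelebA training command, then the AFLW fine-tuning command."""
    # Assumed checkpoint layout, modeled on the README's example path.
    checkpoint = posixpath.join(
        logdir, "celeba-%dpts" % num_landmarks, "model.ckpt")
    return [
        ["bash", "examples/train_celeba.sh", str(num_landmarks)],
        ["bash", "examples/train_aflw.sh", str(num_landmarks), checkpoint],
    ]


def run_all(commands):
    """Run the commands in order, stopping at the first failure."""
    for command in commands:
        subprocess.check_call(command)
```

For example, `run_all(training_commands(10))` would start CelebA training for 10 landmarks and, once it finishes successfully, fine-tune on AFLW. It should be run from the repository root with PYTHONPATH set as described above.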
Test errors reported in the paper were obtained with a data pipeline that used MATLAB for image pre-processing. This codebase uses a Python re-implementation. Due to numerical differences, the test errors may differ slightly. If you wish to reproduce the exact numbers from the paper, contact us at [email protected] to get this data pipeline (requires MATLAB).