DeepHumor: Image-based Meme Generation using Deep Learning

ilya16/deephumor

Final project for the "Deep Learning" course at Skoltech, 2020.
Authors: Ilya Borovik, Bulat Khabibullin, Vladislav Kniazev, Oluwafemi Olaleke and Zakhar Pichugin

Description

The repository presents multiple meme generation models (see illustrations below):

  • Captioning LSTM with Image-only Encoder
  • Captioning LSTM with Image-label Encoder
  • Base Captioning Transformer with Global image embedding
  • Captioning Transformer with Spatial image features

See the models in action in the demo notebook, which is available both in Colab and on GitHub.

All pretrained models are downloaded and built automatically in the Colab runtime.

In addition to the models, we collect and release a large-scale dataset of 900,000 meme captions for 300 meme templates crawled from the MemeGenerator website. The dataset is hosted on Google Drive and is described in the Dataset section.

Note: The state of the repository at the end of the "Deep Learning" course project is preserved in the skoltech-dl-project branch.

Training code

Example code for training the models is provided in a Colab notebook. The notebook contains the training progress and TensorBoard logs for all experiments described in the project report.

Dataset

We crawl and preprocess a large-scale meme dataset of 900,000 meme captions for 300 meme template images collected from the MemeGenerator website. During data collection we remove evident duplicates, overly long caption outliers, captions with non-ASCII symbols, and non-English templates.
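The cleaning steps above can be illustrated with a minimal sketch. It is not the code from crawl_data.py. The thresholds reuse the values of the --min-len, --max-len and --max-tokens flags from the crawl command in this README, under the assumption that the first two count characters and the third counts whitespace-separated tokens. The names clean_captions and normalize are hypothetical, and English detection is left out because it needs an external language-identification library.

```python
import string

# Assumed thresholds, borrowed from the crawl command's flag values.
# How crawl_data.py actually interprets them is an assumption here.
MIN_LEN = 10      # assumed: minimum caption length in characters
MAX_LEN = 96      # assumed: maximum caption length in characters
MAX_TOKENS = 31   # assumed: maximum number of whitespace-separated tokens


def normalize(caption: str) -> str:
    """Lowercase, drop ASCII punctuation, and collapse whitespace."""
    table = str.maketrans("", "", string.punctuation)
    return " ".join(caption.lower().translate(table).split())


def clean_captions(captions):
    """Return the captions that pass all filters, in their original order."""
    seen = set()
    kept = []
    for caption in captions:
        text = caption.strip()
        if not text.isascii():
            continue  # contains non-ASCII symbols
        if not (MIN_LEN <= len(text) <= MAX_LEN):
            continue  # too short, or a long outlier
        if len(text.split()) > MAX_TOKENS:
            continue  # too many tokens
        key = normalize(text)
        if key in seen:
            continue  # evident duplicate of an earlier caption
        seen.add(key)
        kept.append(text)
    return kept
```

Comparing normalized text catches duplicates that differ only in case, punctuation, or spacing. Near-duplicates with different wording would need a fuzzier comparison.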

Download dataset

The crawled dataset of 300 meme templates with 3000 captions per template can be downloaded using the load_data.sh script or directly from Google Drive. The data is split into train/val/test with 2500/250/250 captions per template. We provide these splits so that new models can be compared directly with our work.

The dataset archive has the following structure:

├── memes900k
|   ├── images -- template images
|       ├── cool-dog.jpg
|       ├── dogeee.jpg
|       ├── ...
|   ├── templates.txt -- template labels and image URLs
|   ├── captions.txt -- all captions
|   ├── captions_train.txt -- training split
|   ├── captions_val.txt -- validation split
|   ├── captions_test.txt -- test split
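
After unpacking the archive, a quick layout check can catch an incomplete download. The sketch below is only an illustration. The function missing_entries is hypothetical, and the default file names are our reading of the listing above, so adjust them if your copy of the archive differs.

```python
from pathlib import Path

# Default names are assumptions based on the archive listing.
EXPECTED_FILES = (
    "templates.txt",
    "captions.txt",
    "captions_train.txt",
    "captions_val.txt",
    "captions_test.txt",
)


def missing_entries(root, expected_files=EXPECTED_FILES, images_dir="images"):
    """Return the names of expected entries that are absent under root."""
    root = Path(root)
    missing = []
    if not (root / images_dir).is_dir():
        missing.append(images_dir + "/")
    for name in expected_files:
        if not (root / name).is_file():
            missing.append(name)
    return missing


if __name__ == "__main__":
    # "memes900k" is the top-level directory name from the listing.
    print(missing_entries("memes900k"))
```

An empty list means every expected entry was found. The check does not validate the contents of the files.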

Crawl dataset

To crawl your own dataset, run the following script:

python crawl_data.py --source memegenerator.net --save-dir ../memes \
    --poolsize 25 --num-templates 300 --num-captions 3000 \
    --detect-english --detect-duplicates \
    --min-len 10 --max-len 96 --max-tokens 31

Then, split the data into train/val/test using:

python split_data.py --data-dir ../memes --splits 2500 250 250
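
For readers who want to reproduce a similar split on their own data, here is a minimal sketch of a per-template split. It is not the implementation in split_data.py. The function split_per_template is hypothetical, the input is assumed to be a list of (template label, caption) pairs, and the seeded shuffle before slicing is also an assumption, since the README does not say how captions are assigned to splits.

```python
import random
from collections import defaultdict


def split_per_template(pairs, sizes=(2500, 250, 250), seed=0):
    """Split (template_label, caption) pairs into train/val/test per template.

    Each template contributes sizes[0] captions to train, sizes[1] to val
    and sizes[2] to test. Captions beyond sum(sizes) are left out.
    """
    by_template = defaultdict(list)
    for label, caption in pairs:
        by_template[label].append(caption)

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    splits = {"train": [], "val": [], "test": []}
    for label in sorted(by_template):
        captions = by_template[label]
        if len(captions) < sum(sizes):
            raise ValueError(
                f"template {label!r} has {len(captions)} captions, "
                f"need at least {sum(sizes)}"
            )
        rng.shuffle(captions)
        start = 0
        # dicts preserve insertion order, so this pairs train/val/test
        # with the three sizes in that order
        for name, size in zip(splits, sizes):
            splits[name].extend((label, c) for c in captions[start:start + size])
            start += size
    return splits
```

Splitting within each template, rather than across the whole caption pool, keeps every template represented in all three splits with the same proportions.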

Models

Captioning LSTM

[Illustration: Captioning LSTM]

Captioning LSTM with labels

[Illustration: Captioning LSTM with labels]

Captioning Base Transformer

[Illustration: Captioning Base Transformer]

Captioning Transformer

[Illustration: Captioning Transformer]
