Soft Computing project, Software Engineering and Information Technologies, Undergraduate Academic Studies, Faculty of Technical Sciences, University of Novi Sad, 2019/2020
Technologies used: Keras 2.3.1, Python 3.6.1, Tensorflow 2.0.0
The goal is to predict what the user has drawn on the canvas. A subset of the "Quick, Draw!" dataset was used, which includes the following six classes: Airplane, Alarm clock, Ant, Axe, Bicycle, The Mona Lisa.
- clone the project via `git clone https://github.com/UrosOgrizovic/SimpleGoogleQuickdraw.git`
- download the data (see Fetching the data)
- in a terminal, enter `set WRAPT_INSTALL_EXTENSIONS=false` (this is required due to a `pip install tensorflow` problem)
- in a terminal, enter `pip3 install -r requirements.txt` to install the dependencies
- download the VGG weights and place them in `models/transfer_learning`
- run `web.py`
Create a folder called `data` in the project root, then download the following files and place them into that folder:
Airplane: https://storage.cloud.google.com/quickdraw_dataset/full/numpy_bitmap/airplane.npy
Alarm clock: https://storage.cloud.google.com/quickdraw_dataset/full/numpy_bitmap/alarm%20clock.npy
Ant: https://storage.cloud.google.com/quickdraw_dataset/full/numpy_bitmap/ant.npy
Axe: https://storage.cloud.google.com/quickdraw_dataset/full/numpy_bitmap/axe.npy
Bicycle: https://storage.cloud.google.com/quickdraw_dataset/full/numpy_bitmap/bicycle.npy
The Mona Lisa: https://storage.cloud.google.com/quickdraw_dataset/full/numpy_bitmap/The%20Mona%20Lisa.npy
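Each `numpy_bitmap` file stores one class's drawings as flattened 28x28 grayscale images, one per row. A minimal loading sketch (the normalization step is an assumption, not necessarily what the project does):

```python
import numpy as np

# Each Quick, Draw! numpy_bitmap file holds N flattened 28x28 grayscale
# images, one drawing per row.
def load_label(path):
    flat = np.load(path)                        # shape: (N, 784)
    return flat.reshape(-1, 28, 28, 1) / 255.0  # scale pixels to [0, 1]

# e.g. airplanes = load_label("data/airplane.npy")
```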
13 layers, excluding the input layer (view the architecture visualization). Dropout was used to avoid overfitting. The kernels are 3x3, a commonly used size.
This model was trained both on 10,000 and on 100,000 images per label. The larger dataset brought no noticeable improvement.
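The exact 13-layer architecture is in the visualization; the following is only a minimal sketch of a CNN in the same spirit (3x3 kernels plus dropout). Layer counts, filter counts, and dropout rates here are assumptions, not the project's exact model:

```python
from tensorflow.keras import layers, models

def build_cnn(num_classes=6):
    # Illustrative stack: 3x3 convolutions with dropout after each pooling
    # stage and before the output layer to reduce overfitting.
    model = models.Sequential([
        layers.Input((28, 28, 1)),
        layers.Conv2D(32, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Dropout(0.25),
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Dropout(0.25),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dropout(0.5),
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```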
Callbacks used:
- `ImageDataGenerator` was used to augment the images, which helps avoid overfitting.
- `EarlyStopping` was especially useful for the 100k-images-per-label model, as it greatly reduced the number of epochs the model would run before stopping. It was set up so that training would terminate if the validation loss had stopped decreasing for five epochs.
- `ModelCheckpoint` was used with the `save_best_only` flag set to `True`, so as to save only the latest best model (i.e. the best model across all epochs) according to the validation loss.
- `ReduceLROnPlateau` was used because models often benefit from reducing the learning rate by a factor of 2-10 once learning stagnates [2]. Again, the monitored value was the validation loss.
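Wiring these callbacks together might look like the following sketch. The `patience=5` on `EarlyStopping` comes from the text above; the augmentation parameters, checkpoint path, and `ReduceLROnPlateau` settings are assumptions:

```python
from tensorflow.keras.callbacks import (EarlyStopping, ModelCheckpoint,
                                        ReduceLROnPlateau)
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Augmentation settings are illustrative, not the project's exact values.
datagen = ImageDataGenerator(rotation_range=10, zoom_range=0.1,
                             width_shift_range=0.1, height_shift_range=0.1)

callbacks = [
    # Stop if validation loss has not decreased for five epochs.
    EarlyStopping(monitor="val_loss", patience=5),
    # Keep only the best model across all epochs, judged by validation loss.
    ModelCheckpoint("models/cnn_best.h5", monitor="val_loss",
                    save_best_only=True),
    # Halve the learning rate when validation loss plateaus.
    ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=3),
]

# model.fit(datagen.flow(x_train, y_train), validation_data=(x_val, y_val),
#           epochs=50, callbacks=callbacks)
```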
Constraints used:
- `MaxNorm` is a type of weight constraint. From *Dropout: A Simple Way to Prevent Neural Networks from Overfitting*: "One particular form of regularization was found to be especially useful for dropout—constraining the norm of the incoming weight vector at each hidden unit to be upper bounded by a fixed constant c."
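In Keras this constraint is attached per layer. A minimal sketch; the value c=3 is a common choice from the dropout paper, used here as an assumption rather than the project's actual setting:

```python
from tensorflow.keras import layers
from tensorflow.keras.constraints import max_norm

# Upper-bound the norm of each unit's incoming weight vector by c = 3.
dense = layers.Dense(128, activation="relu",
                     kernel_constraint=max_norm(3.0))
```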
Plots:
The default value of 1 was used for C, the penalty error term. The kernel was the default 'rbf', and gamma, the RBF kernel coefficient, was left at its default value of 'scale'.²
Training was very slow; from docs: "The fit time scales at least quadratically with the number of samples and may be impractical beyond tens of thousands of samples." This model doesn't work well on this problem.
A "grid search" on C and gamma was performed using cross-validation [1].
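Such a grid search over C and gamma can be sketched with scikit-learn's `GridSearchCV`. The grid values and the tiny synthetic data below are illustrative assumptions; the project's actual grid is not listed here:

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Illustrative grid over the penalty term C and the RBF coefficient gamma.
param_grid = {"C": [0.1, 1, 10], "gamma": ["scale", 0.01, 0.001]}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=3)

# Tiny synthetic stand-in for the flattened 28x28 drawings.
rng = np.random.default_rng(0)
X = rng.random((60, 784))
y = rng.integers(0, 2, 60)
search.fit(X, y)
print(search.best_params_)
```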
Perhaps the performance of this model could be improved by using the histogram of oriented gradients (HOG).
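One way to try this would be scikit-image's `hog`, feeding the descriptors to the SVM instead of raw pixels. This is only a sketch; the cell and block sizes are assumptions chosen to divide 28x28 evenly:

```python
import numpy as np
from skimage.feature import hog

def hog_features(images_28x28):
    # Replace raw pixels with HOG descriptors before training the SVM.
    return np.array([
        hog(img, orientations=9, pixels_per_cell=(7, 7),
            cells_per_block=(2, 2))
        for img in images_28x28
    ])
```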
² C tells the SVM optimization how much to avoid misclassifying each training example (a large C yields a smaller-margin hyperplane, and vice versa), and gamma defines how far the influence of a single training example (i.e. point) reaches (with a large gamma the decision boundary depends only on the points close to it - that is, each point's influence radius is small - and vice versa).
Consists of 24 layers, excluding the input layer (view architecture visualization). However, instead of using VGG19's fully connected layers, I used my own, because my problem doesn't have 1000 classes. Additionally, I had to pad Google's 28x28 images to 32x32 images, because this model doesn't accept images smaller than 32x32.
This model uses 3x3 convolution filters. Its predecessor, VGG16, achieved state-of-the-art results in the ImageNet Challenge 2014 by adding more weight layers compared to previous models that had done well in that competition.
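The padding and head replacement described above might be sketched as follows. The head's layer sizes and the grayscale-to-RGB channel repeat are assumptions; `weights=None` is used here only to avoid a download, whereas the project loads the downloaded VGG weights:

```python
import numpy as np
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG19

def preprocess(images):  # images: (N, 28, 28, 1) grayscale
    # Pad 28x28 to 32x32, the smallest input VGG19 accepts, and repeat the
    # gray channel to the 3 channels VGG19 expects (channel handling is an
    # assumption, not necessarily the project's approach).
    padded = np.pad(images, ((0, 0), (2, 2), (2, 2), (0, 0)))
    return np.repeat(padded, 3, axis=-1)

# Keep VGG19's convolutional base, replace the 1000-class head with a
# small custom one for the six Quick, Draw! classes.
base = VGG19(weights=None, include_top=False, input_shape=(32, 32, 3))
model = models.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(256, activation="relu"),
    layers.Dense(6, activation="softmax"),
])
```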
set | CNN 10k | CNN 100k | SVM 2k | SVM 10k | VGG 10k | VGG 100k |
---|---|---|---|---|---|---|
train | ~99% | ~97% | ~89%* | ~84%* | ~94% | ~94% |
validation | ~99% | ~97% | — | — | ~94% | ~94% |
test | ~96% | ~98% | ~89%* | ~84%* | ~94% | ~94% |
* 10-fold cross validation was done for the SVM models, so there are only train and test accuracies