
# Variational autoencoder (VAE) as a Generator

## Dataset

The MNIST dataset consists of 28x28 grayscale images of handwritten digits 0-9. We do not center the pixel values at 0, because we will be using a binary cross-entropy loss that treats pixel values as probabilities in [0, 1]. We create both a training set and a test set.
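A minimal loading sketch (the batch size and `root` path are assumptions, not taken from the original notebook). `ToTensor` alone keeps pixel values in [0, 1], which is exactly what BCE expects:

```python
import torch
from torchvision import datasets, transforms

# ToTensor scales pixels to [0, 1]; we deliberately skip the usual
# Normalize step that would center the values at 0.
transform = transforms.ToTensor()

train_set = datasets.MNIST(root="./data", train=True, download=True, transform=transform)
test_set = datasets.MNIST(root="./data", train=False, download=True, transform=transform)

train_loader = torch.utils.data.DataLoader(train_set, batch_size=128, shuffle=True)
test_loader = torch.utils.data.DataLoader(test_set, batch_size=128, shuffle=False)
```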

## Encoder

We use a convolutional encoder and decoder, which generally give better performance than fully connected versions with the same number of parameters. In the convolution layers, we increase the number of channels as we approach the bottleneck, but note that the total number of features still decreases: the channels double in each convolution while the spatial size shrinks by a factor of 4. A kernel size of 4 is used to avoid the checkerboard artifacts described here: https://distill.pub/2016/deconv-checkerboard/
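As a sketch of this architecture (the channel widths and `latent_dim` are assumptions; the original may differ), each conv doubles the channels while halving each spatial dimension, so the feature count shrinks by a factor of 2 per layer:

```python
import torch
import torch.nn as nn

latent_dim = 16  # assumed latent size

# Encoder: x2 channels but x1/4 spatial area at each layer.
encoder = nn.Sequential(
    nn.Conv2d(1, 32, kernel_size=4, stride=2, padding=1),   # 1x28x28 -> 32x14x14
    nn.ReLU(),
    nn.Conv2d(32, 64, kernel_size=4, stride=2, padding=1),  # 32x14x14 -> 64x7x7
    nn.ReLU(),
    nn.Flatten(),
)
fc_mu = nn.Linear(64 * 7 * 7, latent_dim)      # mean of q(z|x)
fc_logvar = nn.Linear(64 * 7 * 7, latent_dim)  # log-variance of q(z|x)

# Decoder mirrors the encoder; kernel size 4 is divisible by the stride of 2,
# which avoids the checkerboard artifacts discussed in the linked article.
decoder = nn.Sequential(
    nn.Linear(latent_dim, 64 * 7 * 7),
    nn.ReLU(),
    nn.Unflatten(1, (64, 7, 7)),
    nn.ConvTranspose2d(64, 32, kernel_size=4, stride=2, padding=1),  # -> 32x14x14
    nn.ReLU(),
    nn.ConvTranspose2d(32, 1, kernel_size=4, stride=2, padding=1),   # -> 1x28x28
    nn.Sigmoid(),  # outputs in [0, 1] for BCE
)
```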

## VAE Loss

The VAE loss is a linear combination of the reconstruction loss (computed as the BCE between the original and the reconstructed image) and the KL divergence between the prior distribution over latent vectors and the distribution estimated by the encoder for the given image. Here, we give you the KL term and the combination, but have you implement the reconstruction loss.
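A sketch of the full loss, assuming a standard normal prior and a `beta` weight on the KL term (both names are illustrative; in the original, only the reconstruction term is left as the exercise):

```python
import torch
import torch.nn.functional as F

def vae_loss(x_recon, x, mu, logvar, beta=1.0):
    """VAE loss sketch: BCE reconstruction term plus KL divergence."""
    # Reconstruction loss: BCE treating pixel values as probabilities
    # in [0, 1], summed over pixels and averaged over the batch.
    recon = F.binary_cross_entropy(x_recon, x, reduction="sum") / x.size(0)

    # KL divergence between N(mu, sigma^2) and the prior N(0, I):
    # KL = -0.5 * sum(1 + log(sigma^2) - mu^2 - sigma^2)
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp()) / x.size(0)

    return recon + beta * kl
```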

## Original images

*(figure: sample of original MNIST images)*

## Training Curve

*(figure: training curve)*

## VAE reconstruction

*(figure: VAE reconstructions)*

The VAE can generate new digits by drawing latent vectors from the prior distribution. Although the generated digits are not perfect, they are usually better than those produced by a non-variational autoencoder.
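A minimal sampling sketch, reusing the assumed `decoder` and `latent_dim` from the architecture sketch above:

```python
import torch

# Draw latent vectors from the standard normal prior N(0, I)
# and decode them into new digit images.
with torch.no_grad():
    z = torch.randn(64, latent_dim)   # 64 prior samples
    samples = decoder(z)              # shape (64, 1, 28, 28), values in [0, 1]
```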

*(figure: digits generated by sampling from the prior)*