Jigsaw CNN

A Chainer implementation of self-supervised jigsaw CNNs. The authors have published their Caffe implementation.

Patches

The jigsaw CNN learns a representation by reassembling an image from its patches.

Random crop

This is achieved by:

  1. randomly cropping a square from the image.
  2. segmenting the crop into 9 patches (with more random crops).
  3. permuting the patches.
  4. predicting which permutation was applied to the patches.

Random patches

The aim is to learn about structure, colour and texture without labels.
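
The four steps above can be sketched in plain Python as follows. This is only an illustration, not the code in this repository: the function name `make_jigsaw_sample`, the list-of-rows image format, and the default sizes (a 225 pixel crop, 75 pixel grid cells, 64 pixel patches) are assumptions made for the example.

```python
import random


def make_jigsaw_sample(image, permutations, crop=225, cell=75, patch=64, rng=random):
    """Build one jigsaw training example (hypothetical helper).

    image: 2D list of pixel rows, at least crop x crop in size.
    permutations: sequence of 9-tuples, each an ordering of range(9).
    Assumes crop == 3 * cell and patch <= cell.
    """
    height, width = len(image), len(image[0])

    # 1. Randomly crop a square from the image.
    top = rng.randrange(height - crop + 1)
    left = rng.randrange(width - crop + 1)

    # 2. Split the crop into a 3x3 grid and take one randomly placed
    #    patch from inside each grid cell.
    patches = []
    for row in range(3):
        for col in range(3):
            y = top + row * cell + rng.randrange(cell - patch + 1)
            x = left + col * cell + rng.randrange(cell - patch + 1)
            patches.append([r[x:x + patch] for r in image[y:y + patch]])

    # 3. Permute the patches with a randomly chosen permutation.
    label = rng.randrange(len(permutations))
    shuffled = [patches[i] for i in permutations[label]]

    # 4. The index of the permutation is what the network must predict.
    return shuffled, label
```

Taking each patch slightly smaller than its grid cell leaves a random gap between neighbours, so the network cannot solve the puzzle just by matching pixels along patch edges.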

Usage

python -m jigsaw.train --gpu 3 "/path/to/train/*.jpg" "/path/to/test/*.jpg"

Note that the path globs must be quoted or the shell will expand them. Images are automatically rescaled, cropped and turned into patches at runtime. Check --help for more details. Training on the CPU is not supported; you must specify a GPU ID.

Filters

This is what the first layer filters look like after 350k batches. They look good but need some more fine tuning.

Filters

Notes

To identify an n-permutation we only need n-1 elements, so I've made the task harder by randomly zeroing one of the patches (i.e. dropout for patches). Permutations are generated in a different manner than specified in the paper, but the average Hamming distance is almost the same at 0.873 (see scripts/perm-gen.py).
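
The two ideas in this note can be illustrated with a short sketch. It is not taken from scripts/perm-gen.py; all function names are hypothetical, patches are assumed to be 2D lists, and the Hamming distance is assumed to be normalised by the permutation length (which is what a value such as 0.873 suggests).

```python
import itertools
import random


def hamming(p, q):
    """Fraction of positions at which two equal-length permutations differ."""
    return sum(a != b for a, b in zip(p, q)) / len(p)


def mean_pairwise_hamming(perms):
    """Average normalised Hamming distance over all pairs in a permutation set."""
    pairs = list(itertools.combinations(perms, 2))
    return sum(hamming(p, q) for p, q in pairs) / len(pairs)


def missing_element(known, n):
    """Recover the one hidden entry of an n-permutation from the other n-1."""
    (missing,) = set(range(n)) - set(known)
    return missing


def drop_one_patch(patches, rng=random):
    """Return a copy of the patch list with one random patch set to zeros."""
    dropped = rng.randrange(len(patches))
    out = list(patches)
    out[dropped] = [[0] * len(row) for row in patches[dropped]]
    return out, dropped
```

`missing_element` shows why n-1 patches are enough to pin down the permutation: the last position is fully determined by the others, so hiding one patch removes information the network would otherwise get for free.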

The architecture we use to generate patch representations is closer to ZFNet than AlexNet.

Training could be made faster by precalculating batches.