This project implements representative convolutional neural networks from recent years in TensorFlow 2. The networks are trained on the CIFAR-10 dataset for image classification. Each architecture follows its original arXiv paper as closely as possible, with some modifications to suit CIFAR-10. The best accuracy achieved is 97.05%.


  • Python 3.7
  • TensorFlow-gpu 2.1
  • Jupyter Notebook


The CIFAR-10 dataset consists of 60000 32x32 colour images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images.



Dataset: CIFAR-10
No pre-training

| Network | Params | Batch Size | Epochs | Time Per Epoch | Total Time | Accuracy | Remarks |
|---|---|---|---|---|---|---|---|
| AlexNet | 9.63M | 128 | 100 | 36s | 1h | 78.44% | |
| NIN | 0.97M | 128 | 100 | 36s | 1h | 90.38% | |
| VGG16 | 33.69M | 128 | 100 | 41s | 1h 8min | 92.34% | |
| InceptionV1 | 0.37M | 128 | 100 | 42s | 1h 10min | 93.02% | simplified |
| InceptionV2 | 0.65M | 128 | 100 | 51s | 1h 25min | 93.40% | simplified |
| InceptionV3 | 1.17M | 128 | 100 | 55s | 1h 30min | 94.20% | simplified |
| InceptionV4 | 2.57M | 128 | 100 | 104s | 2h 53min | 94.55% | simplified |
| ResNet18 | 11.18M | 128 | 150 | 39s | 1h 38min | 95.11% | pre-act |
| ResNet50 | 23.59M | 128 | 100 | 88s | 2h 27min | 94.55% | pre-act |
| DilatedConv | 2.02M | 128 | 100 | 92s | 2h 33min | 93.22% | |
| SqueezeNet | 0.73M | 32 | 100 | 35s | 58min | 88.41% | light-weight |
| StochasticDepth | 23.59M | 128 | 100 | 92s | 2h 33min | 95.07% | ResNet50 |
| FractalNet | 33.76M | 128 | 100 | 48s | 1h 20min | 94.32% | |
| Xception | 1.36M | 128 | 100 | 54s | 1h 30min | 94.56% | simplified |
| PyramidNet110 | 9.90M | 128 | 100 | 185s | 5h 8min | 95.65% | |
| ResNeXt50 | 23.11M | 128 | 100 | 210s | 5h 50min | 95.43% | 32×4d |
| WideResNet | 36.51M | 128 | 150 | 138s | 5h 45min | 95.94% | 28-10 |
| DenseNet100 | 3.31M | 128 | 150 | 159s | 6h 38min | 95.57% | 100-24 |
| DenseNet121 | 7.94M | 128 | 100 | 110s | 3h 3min | 94.91% | 121-32 |
| DualPathNet50 | 21.05M | 128 | 100 | 220s | 6h 7min | 95.44% | |
| DualPathNet92 | 34.38M | 128 | 100 | 370s | 10h 17min | 95.78% | |
| ShuffleNetV2 | 1.28M | 128 | 100 | 39s | 1h 5min | 92.41% | light-weight |
| MobileNetV3 | 4.21M | 128 | 100 | 66s | 1h 50min | 94.85% | light-weight |
| SE-ResNet50 | 26.10M | 128 | 100 | 110s | 3h 3min | 95.37% | |
| SE-ResNeXt50 | 25.59M | 128 | 120 | 270s | 9h | 96.12% | 32×4d |
| SE-WideResNet | 36.86M | 128 | 150 | 175s | 7h 18min | 96.60% | 28-10 |
| SE-WideResNet_2 | 36.86M | 128 | 220 | 143s | 8h 45min | 97.05% | more tricks |
| SENet154 | 567.9M | 128 | 100 | - | - | - | |
| CBAM-ResNet50 | 26.12M | 128 | 100 | 154s | 4h 17min | 95.01% | |
| SKNet | 6.73M | 256 | 100 | 205s | - | - | |
| EfficientNetB0 | 3.45M | 64 | 100 | 390s | - | - | |
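The Total Time column appears to be approximately Epochs × Time Per Epoch, rounded to the minute (a couple of rows differ by a minute or two, so treat the column as approximate). A minimal sketch of that arithmetic, where `total_time` is a hypothetical helper and not part of this project:

```python
def total_time(seconds_per_epoch: int, epochs: int) -> str:
    """Format epochs * seconds_per_epoch as 'Xh Ymin', rounded to the nearest minute."""
    total_seconds = seconds_per_epoch * epochs
    total_minutes = (total_seconds + 30) // 60  # round half up using integer math
    hours, minutes = divmod(total_minutes, 60)
    parts = []
    if hours:
        parts.append(f"{hours}h")
    if minutes or not hours:
        parts.append(f"{minutes}min")
    return " ".join(parts)


# Values taken from the WideResNet, AlexNet and SqueezeNet rows of the table.
wide_resnet = total_time(138, 150)
alexnet = total_time(36, 100)
squeezenet = total_time(35, 100)
```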

Best result : SE-WideResNet_2 (more tricks), accuracy 97.05%

Remarks :

  • simplified : the stem structure is replaced with a single convolutional layer, and the channel counts are divided by 4
  • pre-act : ResNet V2 (full pre-activation)
  • light-weight : a smaller, efficient CNN architecture suitable for mobile and embedded vision applications

Implementation Details

Details of the best-performing network :

  • Architecture :
    • WideResNet (depth=28, k=10) (improved)
    • Squeeze-and-Excitation Block
  • Precision policy : mixed_precision (FP16)
  • Pre-processing : Z-score normalization
  • Data augmentation : Rotation, Shift, Shear, Zoom, HorizontalFlip, Mixup(alpha=0.2)
  • Learning rate :
    • Initial learning rate : 0.1
    • Learning rate decay : Hyperbolic-Tangent Decay (-6,3)
    • Warm-up
  • Weight decay : 0.0001
  • Weight initialization : he_normal
  • Activation : swish (replaces relu)
  • Dropout : 0.1
  • Optimizer : SGD with Nesterov momentum
  • Label smoothing : 0.1
  • Gradient clipping
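
The learning-rate settings listed above can be sketched as a per-epoch schedule. This is a rough illustration, not the project's code. It assumes the usual hyperbolic-tangent decay form lr = lr0 / 2 × (1 − tanh(L + (U − L) × t / T)) with (L, U) = (-6, 3), a linear warm-up, and a hypothetical warm-up length of 5 epochs (the list does not state one). The function name `htd_learning_rate` and its parameters are likewise made up for this example.

```python
import math


def htd_learning_rate(epoch, total_epochs, base_lr=0.1,
                      lower=-6.0, upper=3.0, warmup_epochs=5):
    """Learning rate for a given epoch: linear warm-up, then tanh decay.

    warmup_epochs=5 is an assumed value; base_lr, lower and upper follow
    the settings listed above (0.1 and (-6, 3)).
    """
    if epoch < warmup_epochs:
        # Linear warm-up from base_lr / warmup_epochs up to base_lr.
        return base_lr * (epoch + 1) / warmup_epochs
    # Fraction of the post-warm-up training that has elapsed, in [0, 1).
    progress = (epoch - warmup_epochs) / max(1, total_epochs - warmup_epochs)
    # Hyperbolic-tangent decay: stays near base_lr early, drops smoothly late.
    return base_lr / 2.0 * (1.0 - math.tanh(lower + (upper - lower) * progress))


# Schedule for a 220-epoch run, matching the epoch count of SE-WideResNet_2.
schedule = [htd_learning_rate(e, 220) for e in range(220)]
```

In Keras, a function of this shape could be passed to `tf.keras.callbacks.LearningRateScheduler`, which calls it with the epoch index at the start of each epoch.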

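Two of the data-side settings listed above, Mixup(alpha=0.2) and label smoothing of 0.1, can be illustrated with a small NumPy sketch. It shows the general techniques under stated assumptions and is not taken from this project: the helper names `smooth_labels` and `mixup_batch` are hypothetical, mixing is done once per batch with a single Beta-sampled coefficient, and the order in which the two steps are applied is an assumption.

```python
import numpy as np


def smooth_labels(one_hot, smoothing=0.1):
    """Move `smoothing` of the probability mass to a uniform distribution."""
    num_classes = one_hot.shape[-1]
    return one_hot * (1.0 - smoothing) + smoothing / num_classes


def mixup_batch(images, labels, alpha=0.2, rng=None):
    """Blend each sample with a randomly chosen partner from the same batch."""
    rng = np.random.default_rng() if rng is None else rng
    lam = float(rng.beta(alpha, alpha))      # mixing coefficient in (0, 1)
    perm = rng.permutation(len(images))      # partner index for every sample
    mixed_images = lam * images + (1.0 - lam) * images[perm]
    mixed_labels = lam * labels + (1.0 - lam) * labels[perm]
    return mixed_images, mixed_labels


# Toy batch with CIFAR-10 shapes: 8 images of 32x32x3 and 10 classes.
rng = np.random.default_rng(0)
images = rng.random((8, 32, 32, 3)).astype(np.float32)
labels = np.eye(10, dtype=np.float32)[rng.integers(0, 10, size=8)]

mixed_x, mixed_y = mixup_batch(images, smooth_labels(labels), rng=rng)
```

Both operations keep each label row a valid probability distribution, so the mixed targets can be used directly with a categorical cross-entropy loss.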

MIT License

Copyright (c) 2020 ZZH