Repository for "Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation".
- python2/3
- tensorflow
- tensorflow-slim (optional)
The model was trained with the SGD+Momentum optimizer.
Optimizer | Iterations | Top-1 | Top-5 | Pre-trained Model
---|---|---|---|---
SGD | 1 million | 69.7% | 89.2% | Google Drive
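For reference, here is a minimal TF1-style sketch of the SGD+Momentum setup described above. The momentum value and the learning-rate schedule are assumptions for illustration, not values confirmed by this repository:

```python
import tensorflow as tf

global_step = tf.train.get_or_create_global_step()
# Assumed schedule: exponentially decayed learning rate.
learning_rate = tf.train.exponential_decay(
    learning_rate=0.045,   # assumed initial learning rate
    global_step=global_step,
    decay_steps=10000,     # assumed decay interval
    decay_rate=0.98)       # assumed decay rate
# SGD with momentum; the 0.9 momentum value is an assumption.
optimizer = tf.train.MomentumOptimizer(learning_rate, momentum=0.9)
# train_op = optimizer.minimize(loss, global_step=global_step)
```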
MobileNetV2 is implemented in two versions:
- With tensorflow-slim. This version can be copied into tensorflow-slim and trained and tested with the API provided by slim.
- With plain tensorflow. This version targets higher speed using the NCHW data format and half-precision, as sketched below.
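To illustrate the NCHW/half-precision point, here is a minimal sketch of a convolution in NCHW layout with float16 tensors. The shapes and variable name are hypothetical, chosen only for the example:

```python
import tensorflow as tf

# NCHW layout: [batch, channels, height, width]; shapes are illustrative.
inputs = tf.placeholder(tf.float16, [None, 3, 224, 224])
weights = tf.get_variable('conv1/weights', [3, 3, 3, 32], dtype=tf.float16)
# data_format='NCHW' avoids layout transposes on GPU and pairs well
# with float16 for higher throughput.
net = tf.nn.conv2d(inputs, weights,
                   strides=[1, 1, 2, 2],  # NCHW strides: stride 2 on H and W
                   padding='SAME',
                   data_format='NCHW')
```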
Please note there is a mistake in the paper (the output size after the fourth bottleneck sequence should be 14² × 64 instead of 28² × 64).
Each line describes a sequence of 1 or more identical (modulo stride) layers, repeated n times. All layers in the same sequence have the same number c of output channels. The first layer of each sequence has a stride s and all others use stride 1. All spatial convolutions use 3 × 3 kernels. The expansion factor t is always applied to the input size.
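To make the row semantics concrete, here is a minimal sketch of how one table row (t, c, n, s) expands into its sequence of blocks; `block_fn` stands in for an inverted-residual block builder (a hypothetical name, not this repository's actual function):

```python
def expand_row(net, t, c, n, s, block_fn):
    """Expand one table row into n identical (modulo stride) blocks.

    t: expansion factor, applied to each block's input channel count
    c: output channels shared by every block in the sequence
    n: number of repeated blocks
    s: stride of the first block; all remaining blocks use stride 1
    block_fn: hypothetical inverted-residual block builder
    """
    for i in range(n):
        stride = s if i == 0 else 1
        net = block_fn(net, expansion=t, out_channels=c, stride=stride)
    return net
```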
The bottleneck blocks use shortcuts directly between the bottlenecks to improve the ability of gradients to propagate across multiple layers (see the block sketch after the list below).
- Because ReLU can preserve complete information only when the input manifold lies in a low-dimensional subspace of the input space, the inputs and outputs of the linear bottlenecks are kept in a low-dimensional space.
- The stride (s) in the depthwise convolutions performs the downsampling.
- The expansion ratio (t) is greater than 1 to project the low-dimensional data into a high-dimensional space.
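Below is a minimal slim-based sketch of one inverted residual block reflecting the points above: 1×1 expansion with ReLU6, 3×3 depthwise convolution carrying the stride, a linear 1×1 projection, and a shortcut when shapes match. Batch normalization is omitted for brevity, and the function name and arguments are illustrative, not this repository's API:

```python
import tensorflow as tf
slim = tf.contrib.slim

def inverted_residual(net, expansion, out_channels, stride):
    """Sketch of one inverted residual block (assumed structure)."""
    in_channels = net.get_shape().as_list()[-1]
    shortcut = net
    # Expansion: project the low-dimensional input to a high-dimensional
    # space (expansion ratio t > 1).
    net = slim.conv2d(net, expansion * in_channels, [1, 1],
                      activation_fn=tf.nn.relu6)
    # Depthwise convolution; stride > 1 performs the downsampling.
    net = slim.separable_conv2d(net, None, [3, 3],
                                depth_multiplier=1, stride=stride,
                                activation_fn=tf.nn.relu6)
    # Linear bottleneck: no activation, so the low-dimensional
    # representation is not destroyed by ReLU.
    net = slim.conv2d(net, out_channels, [1, 1], activation_fn=None)
    # Shortcut only when spatial size and channel count are unchanged.
    if stride == 1 and in_channels == out_channels:
        net = net + shortcut
    return net
```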
- Pre-trained model
- Timing test on a 1080 Ti