This repository is an official implementation of the paper *Inception Convolution with Efficient Dilation Search*.
*Human pose estimation results on COCO.*
Download the ImageNet pre-trained checkpoints from Google Drive or Baidu Drive (password: `gq87`).
Extract the file to get the following directory tree:

```
|-- README.md
|-- ckpt
|   |-- detection
|   |-- human_pose
|   |-- segmentation
|-- config
|-- model
|-- pattern_zoo
```
Users can get started with IC-Conv quickly as follows:
```python
import torch

from model.ic_resnet import ic_resnet50

# Searched dilation pattern and ImageNet pre-trained weights
pattern_path = 'pattern_zoo/detection/ic_resnet50_k9.json'
load_path = 'ckpt/detection/r50_imagenet_retrain/ckpt.pth.tar'

net = ic_resnet50(pattern_path=pattern_path)
state = torch.load(load_path, map_location='cpu')
net.load_state_dict(state, strict=False)

# Report any model parameters that the checkpoint did not cover
missing_keys = set(net.state_dict().keys()) - set(state.keys())
print(missing_keys)

inputs = torch.rand(1, 3, 224, 224)
outputs = net(inputs)  # prefer net(inputs) over net.forward(inputs)
print(outputs.shape)
```
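Each pattern file is a small JSON description of the searched dilation pattern. A minimal sketch for inspecting one (the exact schema is whatever the files in `pattern_zoo/` define):

```python
import json

# Print the searched dilation pattern; the JSON schema is defined by
# the files shipped in pattern_zoo/.
with open('pattern_zoo/detection/ic_resnet50_k9.json') as f:
    print(json.load(f))
```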
MMPose users can use IC-Conv as follows:
- Copy the config files to the MMPose config path, e.g.:

```shell
cp human_pose/config/ic_res50_k13_coco_640x640.py your_mmpose_path/mmpose/configs/bottom_up/resnet/coco/ic_res50_k13_coco_640x640.py
```
- Copy the inception convolution files to the MMPose model path:

```shell
cp human_pose/model/ic_conv2d.py your_mmpose_path/mmpose/mmpose/models/backbones/ic_conv2d.py
cp human_pose/model/ic_resnet.py your_mmpose_path/mmpose/mmpose/models/backbones/ic_resnet.py
```
- Run training with the new config directly, as sketched below.
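A minimal launch sketch, assuming MMPose's standard `tools/dist_train.sh` entry point and an 8-GPU machine (adjust the config path and GPU count to your setup):

```shell
cd your_mmpose_path/mmpose
./tools/dist_train.sh configs/bottom_up/resnet/coco/ic_res50_k13_coco_640x640.py 8
```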
We provide ImageNet pre-trained weights for IC-ResNet-50, IC-ResNet-101, and IC-ResNeXt-101 (32x4d), as well as weights trained on the specific tasks.
Users with limited computing power can directly reuse the provided IC-Conv patterns and ImageNet pre-trained weights for detection, segmentation, and 2D human pose estimation on other datasets, as sketched below.
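As a sketch of that reuse, a fine-tuning run can start from the ImageNet weights loaded exactly as in the quick-start example above (the optimizer and its hyperparameters below are placeholders, not settings from the paper):

```python
import torch

from model.ic_resnet import ic_resnet50

# Searched pattern plus ImageNet pre-trained weights as the starting point
net = ic_resnet50(pattern_path='pattern_zoo/detection/ic_resnet50_k9.json')
state = torch.load('ckpt/detection/r50_imagenet_retrain/ckpt.pth.tar', map_location='cpu')
net.load_state_dict(state, strict=False)

# Fine-tune on your own task; the hyperparameters are placeholders
optimizer = torch.optim.SGD(net.parameters(), lr=0.01, momentum=0.9, weight_decay=1e-4)
```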
Note: the links in the tables below are relative paths, so you should clone the repository and download the checkpoints before following them.
Object detection results on COCO:

| Detector | Backbone | Lr schedule | AP | dilation pattern | checkpoint |
|---|---|---|---|---|---|
| Faster-RCNN-FPN | IC-R50 | 1x | 38.9 | pattern | ckpt/imagenet_retrain_ckpt |
| Faster-RCNN-FPN | IC-R101 | 1x | 41.9 | pattern | ckpt/imagenet_retrain_ckpt |
| Cascade-RCNN-FPN | IC-R50 | 1x | 42.4 | pattern | ckpt/imagenet_retrain_ckpt |
| Cascade-RCNN-FPN | IC-R101 | 1x | 45.0 | pattern | ckpt/imagenet_retrain_ckpt |
Instance segmentation results on COCO:

| Detector | Backbone | Lr schedule | box AP | mask AP | dilation pattern | checkpoint |
|---|---|---|---|---|---|---|
| Mask-RCNN-FPN | IC-R50 | 1x | 40.0 | 35.9 | pattern | ckpt/imagenet_retrain_ckpt |
| Mask-RCNN-FPN | IC-R101 | 1x | 42.6 | 37.9 | pattern | ckpt/imagenet_retrain_ckpt |
| Cascade-RCNN-FPN | IC-R50 | 1x | 43.4 | 36.8 | pattern | ckpt/imagenet_retrain_ckpt |
| Cascade-RCNN-FPN | IC-R101 | 1x | 45.7 | 38.7 | pattern | ckpt/imagenet_retrain_ckpt |
We adjusted the learning rate of the ResNet backbone in MMPose and obtained stronger baselines. See the config files in `config/human_pose/` for the exact settings.
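The adjustment lives in the optimizer section of each config; a minimal sketch of what such a setting looks like in an MMPose config (the value below is a placeholder, not the one we use):

```python
# Sketch of an MMPose config excerpt; the lr value is a placeholder
optimizer = dict(type='Adam', lr=1.5e-3)
```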
Human pose estimation results on COCO val2017 ("~" marks baselines without a searched pattern or released checkpoint):

| Backbone | Input Size | AP | dilation pattern | checkpoint |
|---|---|---|---|---|
| R50 (MMPose) | 640x640 | 47.9 | ~ | ~ |
| R50 | 640x640 | 51.0 | ~ | ~ |
| IC-R50 | 640x640 | 62.2 | pattern | ckpt/imagenet_retrain_ckpt |
| R101 | 640x640 | 55.5 | ~ | ~ |
| IC-R101 | 640x640 | 63.3 | pattern | ckpt/imagenet_retrain_ckpt |
Results on COCO val2017 with multi-scale test:

| Backbone | Input Size | AP |
|---|---|---|
| R50 (MMPose) | 640x640 | 52.5 |
| R50 | 640x640 | 55.8 |
| IC-R50 | 640x640 | 65.8 |
| R101 | 640x640 | 60.2 |
| IC-R101 | 640x640 | 68.5 |
The human pose estimation experiments are built upon MMPose.
If our paper helps your research, please cite it in your publications:
```bibtex
@article{liu2020inception,
  title={Inception Convolution with Efficient Dilation Search},
  author={Liu, Jie and Li, Chuming and Liang, Feng and Lin, Chen and Sun, Ming and Yan, Junjie and Ouyang, Wanli and Xu, Dong},
  journal={arXiv preprint arXiv:2012.13587},
  year={2020}
}
```