| Model | Backbone | Detector | Input Size | AP | Speed | Download | Config | Training Log |
|---|---|---|---|---|---|---|---|---|
| Simple Baseline | ResNet50 | YOLOv3 | 256x192 | 70.6 | 2.94 iter/s | model | cfg | log |
| Fast Pose | ResNet50 | YOLOv3 | 256x192 | 72.0 | 3.54 iter/s | model | cfg | log |
| Fast Pose (DUC) | ResNet50 - unshuffle | YOLOv3 | 256x192 | 72.4 | 2.91 iter/s | model | cfg | log |
| HRNet | HRNet-W32 | YOLOv3 | 256x192 | 72.5 | 2.13 iter/s | model | cfg | log |
| Fast Pose (DCN) | ResNet50 - dcn | YOLOv3 | 256x192 | 72.8 | 2.94 iter/s | model | cfg | log |
| Fast Pose (DUC) | ResNet152 | YOLOv3 | 256x192 | 73.3 | 1.62 iter/s | model | cfg | log |

- All models are trained on the COCO keypoint train 2017 images that contain at least one human with keypoint annotations (64115 images).
- The evaluation is done on COCO keypoint val 2017 (5000 images).
- Flip test is used by default.
- One TITAN XP is used for the speed test, with `batch_size=64` in each iteration.
- Offline human detection results are used in the speed test.
- `FastPose` is our own network design. Paper coming soon!
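
The flip test mentioned in the notes is commonly done by averaging the heatmaps predicted for an image and for its horizontal mirror. The following is a minimal NumPy sketch of that idea, not this repository's implementation: `predict` is a hypothetical callable standing in for any pose network, `flip_heatmaps` and `flip_test` are illustrative names, and the joint pairs assume the standard 17-keypoint COCO ordering.

```python
import numpy as np

# Left/right joint index pairs, assuming the 17-keypoint COCO ordering
# (eyes, ears, shoulders, elbows, wrists, hips, knees, ankles).
COCO_FLIP_PAIRS = [(1, 2), (3, 4), (5, 6), (7, 8),
                   (9, 10), (11, 12), (13, 14), (15, 16)]


def flip_heatmaps(heatmaps, pairs=COCO_FLIP_PAIRS):
    """Mirror heatmaps of shape (N, K, H, W) along the width axis
    and swap the channels of each left/right joint pair."""
    flipped = heatmaps[..., ::-1].copy()
    for left, right in pairs:
        flipped[:, [left, right]] = flipped[:, [right, left]]
    return flipped


def flip_test(predict, images):
    """Average the heatmaps predicted for a batch and for its mirror image.

    `predict` is a hypothetical callable (an assumption for illustration)
    that maps images of shape (N, C, H, W) to heatmaps of shape (N, K, H', W').
    """
    direct = predict(images)
    mirrored = predict(images[..., ::-1].copy())
    # Map the mirrored prediction back into the original image frame.
    return 0.5 * (direct + flip_heatmaps(mirrored))
```

Flip testing roughly doubles the forward-pass cost per image, which is worth keeping in mind when comparing speed figures measured with and without it.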