[Fix] Fix TIN normalize config (#2579)
cir7 authored Sep 6, 2023
1 parent 9349a72 commit d36db71
Showing 5 changed files with 14 additions and 8 deletions.
4 changes: 3 additions & 1 deletion configs/_base_/models/tin_r50.py
@@ -1,7 +1,9 @@
# model settings

preprocess_cfg = dict(
-mean=[127.5, 127.5, 127.5], std=[127.5, 127.5, 127.5], format_shape='NCHW')
+mean=[123.675, 116.28, 103.53],
+std=[58.395, 57.12, 57.375],
+format_shape='NCHW')

model = dict(
type='Recognizer2D',
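
The hunk above swaps a symmetric 127.5 mean/std for the per-channel statistics commonly used with ImageNet-pretrained backbones. A minimal sketch of the numerical effect, assuming the data preprocessor computes `(x - mean) / std` per channel; the `normalize` helper and the HWC frame layout are hypothetical and chosen only for illustration, not taken from the repository:

```python
import numpy as np


def normalize(frame, mean, std):
    """Apply per-channel (x - mean) / std to an H x W x 3 frame (hypothetical helper)."""
    mean = np.asarray(mean, dtype=np.float32)
    std = np.asarray(std, dtype=np.float32)
    # Broadcasting applies each channel's mean/std along the last axis.
    return (frame.astype(np.float32) - mean) / std


# Values copied from the diff: old config first, new config second.
OLD_CFG = dict(mean=[127.5, 127.5, 127.5], std=[127.5, 127.5, 127.5])
NEW_CFG = dict(mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375])

# A tiny all-white frame, assumed to be in RGB channel order.
frame = np.full((2, 2, 3), 255, dtype=np.uint8)

old_out = normalize(frame, **OLD_CFG)  # every value is 1.0, i.e. inputs land in [-1, 1]
new_out = normalize(frame, **NEW_CFG)  # first channel is about 2.25, a wider range
```

If the pretrained weights were trained with one set of statistics, feeding inputs normalized with the other shifts and rescales every activation, which is one plausible reason the reported accuracy moves slightly after this fix.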
2 changes: 1 addition & 1 deletion configs/recognition/tin/README.md
@@ -34,7 +34,7 @@ For a long time, the vision community tries to learn the spatio-temporal represe

| frame sampling strategy | resolution | gpus | backbone | pretrain | top1 acc | top5 acc | testing protocol | inference time(video/s) | gpu_mem(M) | config | ckpt | log |
| :---------------------: | :------------: | :--: | :------: | :-------------: | :------: | :------: | :--------------: | :---------------------: | :--------: | :-----------------------: | :---------------------: | :---------------------: |
-| 1x1x8 | short-side 256 | 8x4 | ResNet50 | TSM-Kinetics400 | 71.77 | 90.36 | 8 clips x 1 crop | x | 6185 | [config](/configs/recognition/tin/tin_imagenet-pretrained-r50_8xb6-1x1x8-40e_sthv2-rgb.py) | [ckpt](https://download.openmmlab.com/mmaction/v1.0/recognition/tin/tin_kinetics400-pretrained-tsm-r50_1x1x8-50e_kinetics400-rgb/tin_kinetics400-pretrained-tsm-r50_1x1x8-50e_kinetics400-rgb_20220913-7f10d0c0.pth) | [log](https://download.openmmlab.com/mmaction/v1.0/recognition/tin/tin_kinetics400-pretrained-tsm-r50_1x1x8-50e_kinetics400-rgb/tin_kinetics400-pretrained-tsm-r50_1x1x8-50e_kinetics400-rgb.log) |
+| 1x1x8 | short-side 256 | 8x4 | ResNet50 | TSM-Kinetics400 | 71.86 | 90.44 | 8 clips x 1 crop | x | 6185 | [config](/configs/recognition/tin/tin_imagenet-pretrained-r50_8xb6-1x1x8-40e_sthv2-rgb.py) | [ckpt](https://download.openmmlab.com/mmaction/v1.0/recognition/tin/tin_kinetics400-pretrained-tsm-r50_1x1x8-50e_kinetics400-rgb/tin_kinetics400-pretrained-tsm-r50_1x1x8-50e_kinetics400-rgb_20220913-7f10d0c0.pth) | [log](https://download.openmmlab.com/mmaction/v1.0/recognition/tin/tin_kinetics400-pretrained-tsm-r50_1x1x8-50e_kinetics400-rgb/tin_kinetics400-pretrained-tsm-r50_1x1x8-50e_kinetics400-rgb.log) |

Here, we use `finetune` to indicate that we use [TSM model](https://download.openmmlab.com/mmaction/v1.0/v1.0/recognition/tsm/tsm_imagenet-pretrained-r50_8xb16-1x1x8-50e_kinetics400-rgb/tsm_imagenet-pretrained-r50_8xb16-1x1x8-50e_kinetics400-rgb_20220831-64d69186.pth) trained on Kinetics-400 to finetune the TIN model on Kinetics-400.

4 changes: 2 additions & 2 deletions configs/recognition/tin/metafile.yml
@@ -66,8 +66,8 @@ Models:
Results:
- Dataset: Kinetics-400
Metrics:
-Top 1 Accuracy: 71.77
-Top 5 Accuracy: 90.36
+Top 1 Accuracy: 71.86
+Top 5 Accuracy: 90.44
Task: Action Recognition
Training Log: https://download.openmmlab.com/mmaction/v1.0/recognition/tin/tin_kinetics400-pretrained-tsm-r50_1x1x8-50e_kinetics400-rgb/tin_kinetics400-pretrained-tsm-r50_1x1x8-50e_kinetics400-rgb.log
Weights: https://download.openmmlab.com/mmaction/v1.0/recognition/tin/tin_kinetics400-pretrained-tsm-r50_1x1x8-50e_kinetics400-rgb/tin_kinetics400-pretrained-tsm-r50_1x1x8-50e_kinetics400-rgb_20220913-7f10d0c0.pth
6 changes: 3 additions & 3 deletions mmaction/models/backbones/resnet_tin.py
@@ -325,6 +325,9 @@ def init_structure(self):
if len(self.non_local_cfg) != 0:
self.make_non_local()

+def _get_wrap_prefix(self):
+return ['.net2']

def make_temporal_interlace(self):
"""Make temporal interlace for some layers."""
num_segment_list = [self.num_segments] * 4
@@ -365,6 +368,3 @@ def make_block_interlace(stage, num_segments, shift_div):
self.shift_div)
self.layer4 = make_block_interlace(self.layer4, num_segment_list[3],
self.shift_div)

-def init_weights(self):
-pass
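
The two hunks in this file add a `_get_wrap_prefix` override and delete a no-op `init_weights`. A rough sketch of what that combination can mean, assuming the TIN backbone subclasses the TSM backbone and that the parent's `init_weights` is what triggers checkpoint loading; the class names and the `loaded_with` attribute are hypothetical stand-ins, not the repository's code:

```python
class BaseBackbone:
    """Hypothetical stand-in for the parent backbone."""

    def _get_wrap_prefix(self):
        # Hook: wrapper suffixes to strip when matching checkpoint keys.
        return ['.net', '.block']

    def init_weights(self):
        # Stand-in for checkpoint loading; records which prefixes were used.
        self.loaded_with = self._get_wrap_prefix()


class ChildBefore(BaseBackbone):
    """Mimics the old state: a no-op override disables the parent's loading."""

    def init_weights(self):
        pass


class ChildAfter(BaseBackbone):
    """Mimics the new state: inherit loading, customize only the prefix hook."""

    def _get_wrap_prefix(self):
        return ['.net2']


before = ChildBefore()
before.init_weights()  # does nothing, so no weights would be loaded

after = ChildAfter()
after.init_weights()   # parent logic runs, using the child's prefixes
```

Under these assumptions, removing the `pass` override lets the parent's loading logic run for the subclass, while the hook keeps the subclass-specific detail in one small method.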
6 changes: 5 additions & 1 deletion mmaction/models/backbones/resnet_tsm.py
@@ -305,6 +305,9 @@ def make_non_local(self):
self.num_segments,
self.non_local_cfg)

+def _get_wrap_prefix(self):
+return ['.net', '.block']

def load_original_weights(self, logger):
"""Load weights from original checkpoint, which required converting
keys."""
@@ -317,7 +320,7 @@ def load_original_weights(self, logger):
for name, module in self.named_modules():
# convert torchvision keys
ori_name = name
-for wrap_prefix in ['.net', '.block']:
+for wrap_prefix in self._get_wrap_prefix():
if wrap_prefix in ori_name:
ori_name = ori_name.replace(wrap_prefix, '')
wrapped_layers_map[ori_name] = name
@@ -352,6 +355,7 @@ def load_original_weights(self, logger):
if layer_name in wrapped_layers_map:
wrapped_name = param_name.replace(
layer_name, wrapped_layers_map[layer_name])
+print(f'wrapped_name {wrapped_name}')
state_dict_torchvision[
wrapped_name] = state_dict_torchvision.pop(param_name)
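
The key-renaming logic in this file can be illustrated in isolation. The sketch below reimplements the two loops from the diff as plain functions over strings; the function names, the example module name `layer1.0.conv1.net2`, and the way `layer_name` is derived from a parameter key are assumptions made for illustration. It also shows one possible reason a dedicated `'.net2'` prefix helps, which the commit itself does not state: stripping `'.net'` from a name ending in `'.net2'` leaves a stray `2` behind.

```python
def build_wrapped_layers_map(module_names, wrap_prefixes):
    """Map un-wrapped (torchvision-style) names to wrapped module names."""
    wrapped_layers_map = {}
    for name in module_names:
        ori_name = name
        for wrap_prefix in wrap_prefixes:
            if wrap_prefix in ori_name:
                ori_name = ori_name.replace(wrap_prefix, '')
                wrapped_layers_map[ori_name] = name
    return wrapped_layers_map


def rename_keys(state_dict, wrapped_layers_map):
    """Rename checkpoint keys so they match the wrapped module names."""
    for param_name in list(state_dict.keys()):
        # Assumption: the layer name is the key without its last component.
        layer_name = '.'.join(param_name.split('.')[:-1])
        if layer_name in wrapped_layers_map:
            wrapped_name = param_name.replace(
                layer_name, wrapped_layers_map[layer_name])
            state_dict[wrapped_name] = state_dict.pop(param_name)
    return state_dict


# Hypothetical wrapped module name and checkpoint key.
names = ['layer1.0.conv1.net2']
checkpoint = {'layer1.0.conv1.weight': 'w'}

tin_map = build_wrapped_layers_map(names, ['.net2'])
tsm_map = build_wrapped_layers_map(names, ['.net', '.block'])

renamed_tin = rename_keys(dict(checkpoint), tin_map)
renamed_tsm = rename_keys(dict(checkpoint), tsm_map)
```

With the `'.net2'` prefix the checkpoint key is renamed to `layer1.0.conv1.net2.weight`; with the `'.net'`/`'.block'` prefixes the map key becomes `layer1.0.conv12`, nothing matches, and the checkpoint key is left unchanged.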
