Pretrained model #2

KimWu1994 · 2022-07-28T13:24:07Z

Can trained models be provided, especially for the GQA dataset?

jeasinema · 2022-08-16T09:18:14Z

Sorry for the late reply. We're working towards an initial release of GQA models. The ETA is roughly 1-2 weeks.

jeasinema · 2022-09-10T16:03:00Z

Hi,

The pre-trained models on GQA have been released as of ae34a63. Sorry for the delay and please let us know if you have any questions.

gulu999 · 2022-10-17T03:11:00Z

Sorry to interrupt you. When I use the pre-trained model (swin_base), the program fails with the following error:
RuntimeError: Error(s) in loading state_dict for DataParallel:
Unexpected key(s) in state_dict: "module.encoder.encoder.proj.last_layer.weight_g","module.encoder.encoder.proj.last_layer.weight_v", "module.encoder.encoder.proj2.last_layer.weight_g", "module.encoder.encoder.proj2.last_layer.weight_v".

jeasinema · 2022-10-17T03:19:49Z

Hi @gulu999, please try adding strict=False to all the load_state_dict function calls. LMK if you still have any questions; it would be very helpful if you could provide the complete error log (with line numbers, etc.).
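
For readers unfamiliar with the flag, here is a minimal sketch of what strict=False does on a model's load_state_dict. It uses a toy nn.Linear rather than the RelViT model, and the extra key name is only an illustrative stand-in for the weight_g/weight_v entries in the error above. The call returns the keys it skipped, so they can be inspected rather than silently ignored.

```python
import torch
from torch import nn

# Toy stand-in for the real model; the RelViT classes are not used here.
model = nn.Linear(4, 2)

# Build a checkpoint with one extra entry, mimicking the unexpected
# "...last_layer.weight_g" keys from the error message (illustrative name).
state_dict = {k: v.clone() for k, v in model.state_dict().items()}
state_dict["proj.last_layer.weight_g"] = torch.zeros(1)

# With the default strict=True this call would raise RuntimeError.
# With strict=False the extra key is skipped and reported back.
result = model.load_state_dict(state_dict, strict=False)
print("unexpected:", result.unexpected_keys)
print("missing:", result.missing_keys)
```

If missing_keys is non-empty, some model parameters keep their initial values, which is worth checking before trusting evaluation numbers.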

gulu999 · 2022-10-17T04:15:28Z

First of all, thanks for such a quick reply.
After I add strict=False, the program fails with the following error:
File "C:/Users/li/Desktop/xgqapaper/relvit/relvitli/main.py", line 281, in main_worker
optimizer.load_state_dict(ckpt['optimizer'],strict=False)
TypeError: load_state_dict() got an unexpected keyword argument 'strict'
And the complete error log without strict=False:
C:\Users\li\anaconda3\python.exe C:/Users/li/Desktop/xgqapaper/relvit/relvitli/main.py
config:
{'train_dataset': 'gqa', 'train_dataset_args': {'root_dir': 'D:\datasets\GQA\relvit\gqa_annotations', 'split': 'train'}, 'test_dataset': 'gqa', 'test_dataset_args': {'root_dir': 'D:\datasets\GQA\relvit\gqa_annotations', 'split': 'val'}, 'model': 'mcan-customized', 'model_args': {'word_emb_path': './cache/gqa_word_embed.npy', 'encoder': 'transparent_superpixel_encoder', 'encoder_args': {'encoder': 'swin_base', 'use_boxes_dim': False}}, 'load_encoder': './cache/swin_base-{}.pth', 'encoder_pretrain': 'imagenet', 'train_batches': 1000000, 'ep_per_batch': 1, 'max_epoch': 12, 'eval_mode': 1, 'relvit': True, 'relvit_weight': 1.0, 'relvit_loss_tau': 0.04, 'relvit_local_only': 2, 'relvit_mode': 1, 'relvit_sample_uniform': True, 'relvit_num_concepts': 1615, 'relvit_moco_m': 0.999, 'relvit_moco_use_queue': False, 'relvit_moco_K': 10, 'relvit_num_tokens': 49, 'optimizer': 'adamw', 'optimizer_args': {'lr': 0.0001, 'weight_decay': 0, 'milestones': [8, 10], 'eps': '1e-8'}, 'print_freq': 10, 'save_epoch': 1, 'eval_epoch': 1, 'grad_norm': 0.5}
set gpu: 0
train dataset: 711945 samples
test dataset: 32509 samples
==> Successfully loaded ./cache/swin_base-imagenet.pth for the enocder.
MCANCustomized(
(encoder): TransparentSuperpixelEncoder(
(encoder): SwinTransformer(
....
(proj_norm): LayerNorm()
(proj): Linear(in_features=1024, out_features=1843, bias=True)
)
Traceback (most recent call last):
File "C:/Users/li/Desktop/xgqapaper/relvit/relvitli/main.py", line 632, in <module>
main(config)
File "C:/Users/li/Desktop/xgqapaper/relvit/relvitli/main.py", line 102, in main
main_worker(args.train_gpu, args.ngpus_per_node, args)
File "C:/Users/li/Desktop/xgqapaper/relvit/relvitli/main.py", line 278, in main_worker
model.load_state_dict(ckpt['state_dict'])
File "C:\Users\li\anaconda3\lib\site-packages\torch\nn\modules\module.py", line 1406, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for DataParallel:
Unexpected key(s) in state_dict: "module.encoder.encoder.proj.last_layer.weight_g", "module.encoder.encoder.proj.last_layer.weight_v", "module.encoder.encoder.proj2.last_layer.weight_g", "module.encoder.encoder.proj2.last_layer.weight_v".

jeasinema · 2022-10-18T00:38:59Z

Hi @gulu999, I found this in the error log

optimizer.load_state_dict(ckpt['optimizer'],strict=False)
TypeError: load_state_dict() got an unexpected keyword argument 'strict'

Sorry for not making it clear, but there is no need to add strict=False to optimizer.load_state_dict; only the model's load_state_dict accepts that argument.

Please let me know if this helps with the issue.

gulu999 · 2022-10-18T00:47:29Z

Thank you again!
After following your suggestion, the program fails with the following error:

Traceback (most recent call last):
File "C:/Users/li/Desktop/xgqapaper/relvit/relvitli/main.py", line 632, in <module>
main(config)
File "C:/Users/li/Desktop/xgqapaper/relvit/relvitli/main.py", line 102, in main
main_worker(args.train_gpu, args.ngpus_per_node, args)
File "C:/Users/li/Desktop/xgqapaper/relvit/relvitli/main.py", line 281, in main_worker
optimizer.load_state_dict(ckpt['optimizer'])
File "C:\Users\li\anaconda3\lib\site-packages\torch\optim\optimizer.py", line 145, in load_state_dict
raise ValueError("loaded state dict contains a parameter group "
ValueError: loaded state dict contains a parameter group that doesn't match the size of optimizer's group

jeasinema · 2022-10-18T00:56:33Z

Hi @gulu999, since we don't have relvit/relvitli/main.py in the original repo, it could be a bit hard to reproduce this error on our end. Quick question: do you want to continue the training or just evaluate/fine-tune the weights? You may simply skip optimizer.load_state_dict(ckpt['optimizer']) if you are not continuing the training. Otherwise, could you share more details on how to reproduce it? Thank you.

gulu999 · 2022-10-18T01:18:43Z

I want to evaluate the weights, and I use PyCharm to run the code on my laptop (only one GPU). The main.py is almost the same as your code (train_gqa.py). The difference is that I changed some parameters:
if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--config-file', default='train_gqa_mcan.yaml')
    parser.add_argument('--svname', default=None)
    parser.add_argument('--save_dir', default='./save_dist')
    parser.add_argument('--tag', default=None)
    # parser.add_argument('--gpu', default='0')
    parser.add_argument('--seed', type=int, default=123)
    parser.add_argument('--workers', type=int, default=8)
    # parser.add_argument('--test_only', action='store_true')
    parser.add_argument('--test_only', default=True)
    # raw string: otherwise \r in the path is read as a carriage return
    parser.add_argument('--test_model', default=r'D:\datasets\GQA\relvit\swin_base_original_gqa.pth')

And in train_gqa_mcan.yaml, I only changed the encoder to swin_base.
Sorry to bother you; my English is not very good. Thank you very much.
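
One way to get the same effect without editing the parser defaults is to keep --test_only as a store_true flag (as in the commented-out line above) and pass the argument list explicitly; the "Parameters" field of a PyCharm run configuration does the same job. This is a minimal sketch, assuming only the two arguments shown; the default of None for --test_model is an assumption, not taken from the repo. The raw string keeps Python from treating \r in the Windows path as a carriage return.

```python
import argparse


def build_parser():
    """Reduced parser with only the two evaluation-related arguments."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--test_only', action='store_true')
    parser.add_argument('--test_model', default=None)  # default is assumed
    return parser


# Equivalent to typing these arguments on the command line.
argv = [
    '--test_only',
    '--test_model', r'D:\datasets\GQA\relvit\swin_base_original_gqa.pth',
]
args = build_parser().parse_args(argv)
print(args.test_only, args.test_model)
```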

jeasinema · 2022-10-18T01:22:26Z

Hi @gulu999, thank you so much for the information! Since you're only evaluating the weights, you can skip optimizer.load_state_dict(ckpt['optimizer']) and add strict=False to the remaining load_state_dict calls.
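
The suggestion above can be sketched as follows. This is an illustration under assumptions, not the code in train_gqa.py: load_for_eval is a hypothetical helper, the model is a toy nn.Linear, and the checkpoint is built in memory. Only the 'state_dict' and 'optimizer' keys are taken from the tracebacks in this thread.

```python
from torch import nn


def load_for_eval(model, ckpt):
    """Load model weights only; the optimizer state is deliberately ignored."""
    result = model.load_state_dict(ckpt['state_dict'], strict=False)
    model.eval()  # switch off dropout and similar training-only behavior
    return result


# Toy model and in-memory checkpoint standing in for the real files.
model = nn.Linear(4, 2)
ckpt = {
    'state_dict': nn.Linear(4, 2).state_dict(),
    'optimizer': {'state': {}, 'param_groups': []},  # present but never read
}

skipped = load_for_eval(model, ckpt)
print(skipped.missing_keys, skipped.unexpected_keys)
```

Because the optimizer entry is never touched, a mismatch between the saved parameter groups and the current optimizer cannot raise during evaluation.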

gulu999 · 2022-10-20T00:56:25Z

Thank you very much, I have successfully run the code and got the correct result.

dal-code · 2022-12-21T06:48:34Z

> Hi,
>
> The pre-trained models on GQA have been released as of ae34a63. Sorry for the delay and please let us know if you have any questions.

Hi, can you provide a model trained on the HICO dataset?
