
Inference on AVA and JHMDB Needs Maintenance and Necessary Files #14

Open
DanLuoNEU opened this issue Jan 18, 2023 · 3 comments
DanLuoNEU commented Jan 18, 2023

For the version I am using, AVA2.1 inference needs several modifications:

1. In the function `loadvideo`, the frame list should be built with the video name, so change

   ```python
   video_frame_list = sorted(glob(video_frame_path + '/*.jpg'))
   ```

   to

   ```python
   video_frame_list = sorted(glob(video_frame_path + vid + '/*.jpg'))
   ```

   (see the sketch after this list).

2. Change the path to the annotation file here:

   ```python
   f = open("/xxx/datasets/ava_val_excluded_timestamps_v2.1.csv")
   ```

3. The fixes above reproduce the number listed in the README table, but TensorBoard still raises an `EOFError`. Add these lines so the writer is closed on the main process:

   ```python
   if cfg.DDP_CONFIG.GPU_WORLD_RANK == 0:
       writer.close()
   ```
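A minimal sketch of how the corrected frame listing might look inside `loadvideo`, assuming `video_frame_path` is the root folder of extracted frames and `vid` is the video id of the current sample; the actual signature and sampling logic in the repo will differ:

```python
import os
from glob import glob

import cv2
import numpy as np


def loadvideo(video_frame_path, vid, num_frames=32):
    """Hypothetical sketch: read the frames of one AVA clip.

    Assumes `video_frame_path` contains one sub-folder of extracted .jpg
    frames per video, named by the video id `vid`.
    """
    # Glob inside the per-video folder, not the root folder.
    video_frame_list = sorted(glob(os.path.join(video_frame_path, vid, '*.jpg')))
    if not video_frame_list:
        raise FileNotFoundError(f'No frames found for video {vid}')

    # Uniformly sample `num_frames` frames from the clip (illustrative only).
    indices = np.linspace(0, len(video_frame_list) - 1, num_frames).astype(int)
    frames = [cv2.imread(video_frame_list[i]) for i in indices]
    return np.stack(frames)  # (T, H, W, C)
```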

AVA2.2 Inference

per_class [0.49119732        nan 0.32108856 0.58690862 0.1453127  0.25250868
 0.05269343 0.55119903 0.47336599 0.58118356 0.83511073 0.85809156
 0.4264426  0.79215918 0.7533182         nan 0.61339698        nan
        nan 0.04726829        nan 0.16529978        nan 0.23965087
        nan 0.04494236 0.306021   0.55275175 0.36725148 0.07057226
        nan        nan        nan 0.12159738        nan 0.03173127
 0.02196539 0.2641557         nan        nan 0.67544085        nan
 0.00367732        nan 0.01473403 0.03833153 0.03002702 0.37160171
 0.53368705        nan 0.21649021 0.1374056         nan 0.29578147
        nan 0.03978733 0.10253565 0.03219929 0.33915299 0.01752664
 0.28362901 0.3223239  0.14873739 0.52285939 0.14770317 0.11950478
 0.44886859 0.17733113 0.06789831 0.27917222        nan 0.46795067
 0.06238106 0.71983267        nan 0.05018591 0.31590126 0.09531384
 0.8376019  0.70844574]
{'PascalBoxes_Precision/[email protected]': 0.30985340450933535, 'PascalBoxes_PerformanceByCategory/[email protected]/bend/bow (at the waist)': 0.4911973183134509, 'PascalBoxes_PerformanceByCategory/[email protected]/crouch/kneel': 0.3210885611841083, 'PascalBoxes_PerformanceByCategory/[email protected]/dance': 0.5869086163647963, 'PascalBoxes_PerformanceByCategory/[email protected]/fall down': 0.14531270272554303, 'PascalBoxes_PerformanceByCategory/[email protected]/get up': 0.25250867821227696, 'PascalBoxes_PerformanceByCategory/[email protected]/jump/leap': 0.05269343043207558, 'PascalBoxes_PerformanceByCategory/[email protected]/lie/sleep': 0.5511990313327797, 'PascalBoxes_PerformanceByCategory/[email protected]/martial art': 0.47336599427812304, 'PascalBoxes_PerformanceByCategory/[email protected]/run/jog': 0.5811835550049768, 'PascalBoxes_PerformanceByCategory/[email protected]/sit': 0.8351107282724392, 'PascalBoxes_PerformanceByCategory/[email protected]/stand': 0.8580915605931295, 'PascalBoxes_PerformanceByCategory/[email protected]/swim': 0.42644259946642094, 'PascalBoxes_PerformanceByCategory/[email protected]/walk': 0.7921591772441756, 'PascalBoxes_PerformanceByCategory/[email protected]/answer phone': 0.7533181965878357, 'PascalBoxes_PerformanceByCategory/[email protected]/carry/hold (an object)': 0.613396976906247, 'PascalBoxes_PerformanceByCategory/[email protected]/climb (e.g., a mountain)': 0.047268291513739374, 'PascalBoxes_PerformanceByCategory/[email protected]/close (e.g., a door, a box)': 0.16529978105316412, 'PascalBoxes_PerformanceByCategory/[email protected]/cut': 0.239650870599096, 'PascalBoxes_PerformanceByCategory/[email protected]/dress/put on clothing': 0.04494235744272522, 'PascalBoxes_PerformanceByCategory/[email protected]/drink': 0.30602100382076136, 'PascalBoxes_PerformanceByCategory/[email protected]/drive (e.g., a car, a truck)': 0.5527517520577403, 'PascalBoxes_PerformanceByCategory/[email protected]/eat': 0.3672514840844659, 'PascalBoxes_PerformanceByCategory/[email protected]/enter': 0.07057225556756908, 'PascalBoxes_PerformanceByCategory/[email protected]/hit (an object)': 0.12159737681929804, 'PascalBoxes_PerformanceByCategory/[email protected]/lift/pick up': 0.03173127096825363, 'PascalBoxes_PerformanceByCategory/[email protected]/listen (e.g., to music)': 0.021965385905557883, 'PascalBoxes_PerformanceByCategory/[email protected]/open (e.g., a window, a car door)': 0.2641556990694153, 'PascalBoxes_PerformanceByCategory/[email protected]/play musical instrument': 0.6754408509957595, 'PascalBoxes_PerformanceByCategory/[email protected]/point to (an object)': 0.0036773150722066972, 'PascalBoxes_PerformanceByCategory/[email protected]/pull (an object)': 0.01473402768023624, 'PascalBoxes_PerformanceByCategory/[email protected]/push (an object)': 0.038331529680086275, 'PascalBoxes_PerformanceByCategory/[email protected]/put down': 0.03002701544153771, 'PascalBoxes_PerformanceByCategory/[email protected]/read': 0.3716017145811048, 'PascalBoxes_PerformanceByCategory/[email protected]/ride (e.g., a bike, a car, a horse)': 0.5336870531261757, 'PascalBoxes_PerformanceByCategory/[email protected]/sail boat': 0.21649020512834088, 'PascalBoxes_PerformanceByCategory/[email protected]/shoot': 0.13740559748226708, 'PascalBoxes_PerformanceByCategory/[email protected]/smoke': 0.2957814682780021, 'PascalBoxes_PerformanceByCategory/[email protected]/take a photo': 0.03978732762876234, 'PascalBoxes_PerformanceByCategory/[email protected]/text on/look at a cellphone': 
0.10253564997258985, 'PascalBoxes_PerformanceByCategory/[email protected]/throw': 0.03219929211064902, 'PascalBoxes_PerformanceByCategory/[email protected]/touch (an object)': 0.33915299353156436, 'PascalBoxes_PerformanceByCategory/[email protected]/turn (e.g., a screwdriver)': 0.017526643108955034, 'PascalBoxes_PerformanceByCategory/[email protected]/watch (e.g., TV)': 0.28362901476702795, 'PascalBoxes_PerformanceByCategory/[email protected]/work on a computer': 0.322323903124391, 'PascalBoxes_PerformanceByCategory/[email protected]/write': 0.1487373880589133, 'PascalBoxes_PerformanceByCategory/[email protected]/fight/hit (a person)': 0.5228593870747025, 'PascalBoxes_PerformanceByCategory/[email protected]/give/serve (an object) to (a person)': 0.14770317484649234, 'PascalBoxes_PerformanceByCategory/[email protected]/grab (a person)': 0.11950477963584528, 'PascalBoxes_PerformanceByCategory/[email protected]/hand clap': 0.44886858836133026, 'PascalBoxes_PerformanceByCategory/[email protected]/hand shake': 0.17733112595251085, 'PascalBoxes_PerformanceByCategory/[email protected]/hand wave': 0.06789830556787521, 'PascalBoxes_PerformanceByCategory/[email protected]/hug (a person)': 0.27917221591712854, 'PascalBoxes_PerformanceByCategory/[email protected]/kiss (a person)': 0.4679506698404774, 'PascalBoxes_PerformanceByCategory/[email protected]/lift (a person)': 0.062381058259554645, 'PascalBoxes_PerformanceByCategory/[email protected]/listen to (a person)': 0.7198326661128859, 'PascalBoxes_PerformanceByCategory/[email protected]/push (another person)': 0.050185914377705816, 'PascalBoxes_PerformanceByCategory/[email protected]/sing to (e.g., self, a person, a group)': 0.31590125934914154, 'PascalBoxes_PerformanceByCategory/[email protected]/take (an object) from (a person)': 0.09531383956904724, 'PascalBoxes_PerformanceByCategory/[email protected]/talk to (e.g., self, a person, a group)': 0.8376018955287321, 'PascalBoxes_PerformanceByCategory/[email protected]/watch (a person)': 0.7084457445779531}
mAP: 0.30985
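The `nan` entries are most likely the AVA categories excluded from the standard 60-class evaluation protocol, and the reported overall [email protected] appears to be the mean over the non-NaN per-class values. A quick check of that reading (only the first few values from the log above are copied here):

```python
import numpy as np

# A few per-class APs copied from the log above; nan marks a category that
# was not evaluated (AVA's standard protocol scores only 60 of the 80 classes).
per_class = np.array([0.49119732, np.nan, 0.32108856, 0.58690862, 0.1453127])

# Mean over the evaluated (non-NaN) categories; applied to the full 80-entry
# vector above, this reproduces the reported [email protected] of ~0.30985.
print(np.nanmean(per_class))
```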

DanLuoNEU commented Feb 2, 2023

For JHMDB Inference

Modify the part that loads the pretrained DETR weights according to the `query_embed` dimensions of the built model, otherwise loading fails on the shape mismatch. Replace the line

```python
pretrained_dict.update({k: v[:query_size]})
```

with

```python
if query_size == model.module.query_embed.weight.shape[0]:
    continue
if v.shape[0] < model.module.query_embed.weight.shape[0]:
    # In case the pretrained checkpoint does not align with the model
    query_embed_zeros = torch.zeros(model.module.query_embed.weight.shape)
    pretrained_dict.update({k: query_embed_zeros})
else:
    pretrained_dict.update({k: v[:model.module.query_embed.weight.shape[0]]})
```
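For context, a minimal sketch of how a checkpoint-loading loop with this patch might look. Only `pretrained_dict` and `model.module.query_embed` are taken from the snippet above; the function name, checkpoint handling, and key filtering are assumptions for illustration and will differ from the actual repo code:

```python
import torch


def load_detr_weights(model, checkpoint_path):
    """Hypothetical sketch: load pretrained DETR weights into the built model,
    reconciling a mismatched number of queries in query_embed."""
    checkpoint = torch.load(checkpoint_path, map_location='cpu')
    pretrained_dict = checkpoint.get('model', checkpoint)

    model_dict = model.module.state_dict()
    num_queries = model.module.query_embed.weight.shape[0]

    for k, v in list(pretrained_dict.items()):
        if 'query_embed' not in k:
            continue
        if v.shape[0] == num_queries:
            # Shapes already agree; keep the checkpoint tensor as-is.
            continue
        if v.shape[0] < num_queries:
            # Checkpoint has fewer queries than the model: fall back to zeros.
            pretrained_dict[k] = torch.zeros(model.module.query_embed.weight.shape)
        else:
            # Checkpoint has more queries than the model: truncate.
            pretrained_dict[k] = v[:num_queries]

    # Keep only entries whose shapes now match the built model, then load.
    pretrained_dict = {k: v for k, v in pretrained_dict.items()
                       if k in model_dict and v.shape == model_dict[k].shape}
    model_dict.update(pretrained_dict)
    model.module.load_state_dict(model_dict)
```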

I got a different mAP from the one shown in the table:

per_class [0.96529908 0.4870422  0.81740977 0.64671594 0.99981187 0.48678173
 0.72522214 0.70157535 0.99132313 0.99332738 0.92539198 0.63780982
 0.6607778  0.89695387 0.78694818 0.42965094 0.26324953 0.94429166
 0.27346689 0.68134081 0.87238637        nan        nan        nan]
{'PascalBoxes_Precision/[email protected]': 0.7231798302410739, 'PascalBoxes_PerformanceByCategory/[email protected]/Basketball': 0.9652990848728149, 'PascalBoxes_PerformanceByCategory/[email protected]/BasketballDunk': 0.4870421987013735, 'PascalBoxes_PerformanceByCategory/[email protected]/Biking': 0.8174097664543525, 'PascalBoxes_PerformanceByCategory/[email protected]/CliffDiving': 0.6467159401389935, 'PascalBoxes_PerformanceByCategory/[email protected]/CricketBowling': 0.9998118686054533, 'PascalBoxes_PerformanceByCategory/[email protected]/Diving': 0.48678173366600064, 'PascalBoxes_PerformanceByCategory/[email protected]/Fencing': 0.7252221388068574, 'PascalBoxes_PerformanceByCategory/[email protected]/FloorGymnastics': 0.7015753486207187, 'PascalBoxes_PerformanceByCategory/[email protected]/GolfSwing': 0.9913231289322941, 'PascalBoxes_PerformanceByCategory/[email protected]/HorseRiding': 0.9933273801597415, 'PascalBoxes_PerformanceByCategory/[email protected]/IceDancing': 0.9253919821730238, 'PascalBoxes_PerformanceByCategory/[email protected]/LongJump': 0.637809816668955, 'PascalBoxes_PerformanceByCategory/[email protected]/PoleVault': 0.6607777957457814, 'PascalBoxes_PerformanceByCategory/[email protected]/RopeClimbing': 0.8969538737505489, 'PascalBoxes_PerformanceByCategory/[email protected]/SalsaSpin': 0.7869481765834933, 'PascalBoxes_PerformanceByCategory/[email protected]/SkateBoarding': 0.42965094009542815, 'PascalBoxes_PerformanceByCategory/[email protected]/Skiing': 0.26324952994810963, 'PascalBoxes_PerformanceByCategory/[email protected]/Skijet': 0.9442916605769802, 'PascalBoxes_PerformanceByCategory/[email protected]/SoccerJuggling': 0.27346688938240526, 'PascalBoxes_PerformanceByCategory/[email protected]/Surfing': 0.681340807090747, 'PascalBoxes_PerformanceByCategory/[email protected]/TennisSwing': 0.8723863740884812, 'PascalBoxes_PerformanceByCategory/[email protected]/TrampolineJumping': nan, 'PascalBoxes_PerformanceByCategory/[email protected]/VolleyballSpiking': nan, 'PascalBoxes_PerformanceByCategory/[email protected]/WalkingWithDog': nan}
mAP: 0.72318

CKK-coder commented:


Thank you for your correction. Did you find any code for video-mAP inference? I want to reproduce the video-mAP on UCF101-24.


FransHk commented Sep 19, 2024

Thanks for taking the time to write this; it helped me greatly. It's a shame that the codebase for this model is such a mess as-is.
