We, as team "TheSSVL" or "EgoMotion-COMPASS", took 2nd place in both the Object State Change Classification (OSCC) and PNR temporal localization tasks of the Ego4D Challenge 2022. Please refer to our validation report for more details on our methodology.
TODO
- Post the technical report on arXiv
- Release the code we used in the Ego4D Challenge 2022
- Release the code of our latest work on egocentric video understanding
In addition to "wandb", we use the same environment as VideoMAE and the Ego4D OSCC I3D-ResNet50 baseline. Please refer to those repositories for more information.
Please refer to the Ego4D instructions and download the required videos for the fho_oscc task. For convenience, we save clips where a state change occurs and clips where no state change occurs in two different directories. You can use the same directory for both kinds of clips if you want.
Our directory structure:
/path/to/ego4d:
/v1
/full_scale
/*.mp4
...
/annotations
/*.json
# for saving clips where state change occurs
/pos
/unique_id
...
# for saving clips where no state change occurs
/neg
/unique_id
...
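Before starting a long run, it can help to check that the layout above is in place. The following is a minimal sketch, not part of this repository: `prepare_layout` is a hypothetical helper, and it assumes that `full_scale` and `annotations` sit under `v1`. Because the location of `pos` and `neg` is up to you, both are passed in explicitly and created if missing.

```python
from pathlib import Path


def prepare_layout(root, pos_dir, neg_dir):
    """Check the downloaded data and create the clip directories.

    Returns a list of human-readable problems; an empty list means
    the videos and annotations were found where this sketch expects them.
    """
    root = Path(root)
    problems = []

    # Assumed nesting: <root>/v1/full_scale/*.mp4 and <root>/v1/annotations/*.json
    videos = root / "v1" / "full_scale"
    annotations = root / "v1" / "annotations"

    if not videos.is_dir() or not any(videos.glob("*.mp4")):
        problems.append(f"no .mp4 files found in {videos}")
    if not annotations.is_dir() or not any(annotations.glob("*.json")):
        problems.append(f"no .json files found in {annotations}")

    # Clip directories are only written to, so create them when absent.
    for clip_dir in (Path(pos_dir), Path(neg_dir)):
        clip_dir.mkdir(parents=True, exist_ok=True)

    return problems


if __name__ == "__main__":
    # Replace the placeholder paths with your own before running.
    for problem in prepare_layout("/path/to/ego4d", "/path/to/ego4d/pos", "/path/to/ego4d/neg"):
        print("WARNING:", problem)
```

Passing the same path for `pos_dir` and `neg_dir` works as well, which matches the option of keeping both kinds of clips in one directory.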
After downloading the required videos, you can start the finetuning experiments by following the instructions below.
- Finetune pretrained weights on Ego4D OSCC and temporal localization at the same time:
- Modify the required parameters, including the dataset path, in config/finetune_vitb_ego4d.yml or config/finetune_vitl_ego4d.yml, e.g.
finetune: "" # path to the pretrained weight
PS: you can download pretrained VideoMAE weights from the VideoMAE repository: vitl, vitb
- Modify the required parameters in ./scripts/finetune_ego4d.sh
- Finally, in ./scripts, run
# finetune on single node
bash finetune_ego4d.sh 0 0.0.0.0
# finetune on two nodes:
# run on first node
bash finetune_ego4d.sh 0 0.0.0.0
# run on second node
bash finetune_ego4d.sh 1 ip_address_of_first_machine
- Test on Ego4D OSCC and temporal localization:
- As in the finetuning step, modify the required parameters, including the dataset path, in config/test_ego4d.yml
- Modify the required parameters in ./scripts/test_ego4d.sh
- Finally, in ./scripts, run
# test on single node
bash test_ego4d.sh 0 0.0.0.0
# test on two nodes:
# run on first node
bash test_ego4d.sh 0 0.0.0.0
# run on second node
bash test_ego4d.sh 1 ip_address_of_first_machine
Note that two JSON files (one for OSCC, one for temporal localization), in the format specified by the Ego4D challenge, will be generated and stored in the directory:
$output_dir/$name
where $output_dir is specified in config/test_ego4d.yml and $name is specified in ./scripts/test_ego4d.sh.
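To inspect the generated files before submitting them, something like the following could be used. It is a minimal sketch under stated assumptions: `load_result_files` is a hypothetical helper, and because the file names are not given here, it simply loads every `*.json` file found in `$output_dir/$name`.

```python
import json
from pathlib import Path


def load_result_files(output_dir, name):
    """Load every JSON file in <output_dir>/<name>, keyed by file name."""
    result_dir = Path(output_dir) / name
    results = {}
    for path in sorted(result_dir.glob("*.json")):
        with path.open() as f:
            results[path.name] = json.load(f)
    return results


if __name__ == "__main__":
    # Replace the placeholders with the values from your config and script.
    results = load_result_files("/path/to/output_dir", "name")
    for file_name, content in results.items():
        print(file_name, type(content).__name__, len(content))
```

After a test run, two entries are expected in the returned dictionary, one per task.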
[1] VideoMAE by Zhan et al.: VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
[2] VideoMAE by Kaiming et al.: Masked Autoencoders As Spatiotemporal Learners
[3] Vanilla MAE: Masked Autoencoders Are Scalable Vision Learners
[4] Ego4D: Ego4D: Around the World in 3,000 Hours of Egocentric Video
If you have any questions about our project or implementation, please open an issue or contact us via email:
Jiachen Lei: [email protected]
We built our code on ego4d-i3dresnet50, VideoMAE, and MAE-pytorch. Thanks to all the contributors of these great repositories.