Make sure you are at the root of the project.
Basic usage is:
python code/main_dist.py $exp_name --arg1=val1 --arg2.subarg1=val2 --arg3.subarg2.subsubarg1=val3
Here the arguments follow the hierarchy defined in the config file.
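To make the dotted-argument convention concrete, the sketch below shows one way a flag such as --mdl.obj_tx.use_rel=True could map onto a nested config. It is only an illustration, not the parsing code in code/main_dist.py; the function set_by_dotted_key and the sample dictionary (including its values) are hypothetical.

```python
def set_by_dotted_key(cfg, dotted_key, value):
    """Set cfg[a][b][c] = value for dotted_key 'a.b.c'; the key must already exist."""
    *parents, leaf = dotted_key.split(".")
    node = cfg
    for name in parents:
        node = node[name]  # raises KeyError if the hierarchy is wrong
    if leaf not in node:
        raise KeyError(f"unknown config key: {dotted_key}")
    node[leaf] = value
    return cfg


# Hypothetical slice of a config with illustrative values;
# the real defaults live in the project's config file.
cfg = {
    "ds": {"exp_setting": "gt5", "conc_type": "spat"},
    "mdl": {"name": "igrnd", "obj_tx": {"use_rel": False}},
}
set_by_dotted_key(cfg, "mdl.name", "vog")
set_by_dotted_key(cfg, "mdl.obj_tx.use_rel", True)
print(cfg["mdl"])
```

Rejecting unknown keys is a design choice of this sketch: a mistyped flag fails immediately instead of silently adding a new entry.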
In most cases, you need to set four things:
- ds.exp_setting as gt5 (using 5 proposals per frame including ground-truth) or p100 (using 100 proposals per frame).
- mdl.name as igrnd (ImgGrnd), vgrnd (VidGrnd), or vog (VOGNet). For the vog model, you need to explicitly add --mdl.obj_tx.use_rel=True and --mdl.mul_tx.use_rel=True, as they are set to False by default in the config file.
- ds.conc_type as svsq, sep, temp, or spat, depending on the concatenation strategy to use.
- train.prob_thresh, which is the probability threshold used for evaluation. Note that svsq and sep use a threshold of 0.
All default hyper-parameters defined in configs/anet_srl_cfg.yml can similarly be changed on the command line.
Note that the string after code/main_dist.py is arbitrary, and you can set it to anything you want.
To run ImgGrnd with the SVSQ strategy:
python code/main_dist.py "svsq_igrnd" --ds.conc_type='svsq' --mdl.name='igrnd' --train.prob_thresh=0.
Similarly, to run ImgGrnd with SEP, TEMP, or SPAT, set --ds.conc_type accordingly:
python code/main_dist.py "sep_igrnd" --ds.conc_type='sep' --mdl.name='igrnd' --train.prob_thresh=0.
python code/main_dist.py "temp_igrnd" --ds.conc_type='temp' --mdl.name='igrnd' --train.prob_thresh=0.2
python code/main_dist.py "spat_igrnd" --ds.conc_type='spat' --mdl.name='igrnd' --train.prob_thresh=0.2
Similarly, to run VidGrnd in SPAT:
python code/main_dist.py "spat_vgrnd" --ds.conc_type='spat' --mdl.name='vgrnd' --train.prob_thresh=0.2
Or, to run VOGNet in SPAT:
python code/main_dist.py "spat_vog" --ds.conc_type='spat' --mdl.name='vog' \
--mdl.obj_tx.use_rel=True --mdl.mul_tx.use_rel=True --train.prob_thresh=0.2
To run with 100 proposals per frame, additionally pass --ds.exp_setting='p100'.
For TEMP and SPAT in p100, we set prob_thresh=0.1 (by tuning on the validation set).
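The settings above can be collected into a small command builder. The following is a minimal sketch, assuming the thresholds quoted in this section (0 for SVSQ and SEP; 0.2 for TEMP and SPAT in gt5, 0.1 in p100); build_train_cmd and PROB_THRESH are hypothetical names and not part of the repository.

```python
import shlex

# Thresholds quoted in the text: SVSQ/SEP use 0; TEMP/SPAT use 0.2 in gt5 and 0.1 in p100.
PROB_THRESH = {
    "gt5": {"svsq": 0.0, "sep": 0.0, "temp": 0.2, "spat": 0.2},
    "p100": {"svsq": 0.0, "sep": 0.0, "temp": 0.1, "spat": 0.1},
}


def build_train_cmd(mdl_name, conc_type, exp_setting="gt5"):
    """Return the argument list for one training run (hypothetical helper)."""
    # The experiment name is arbitrary; this naming scheme is just a convention.
    exp_name = f"{conc_type}_{mdl_name}" + ("_p100" if exp_setting == "p100" else "")
    args = [
        "python", "code/main_dist.py", exp_name,
        f"--ds.exp_setting={exp_setting}",
        f"--ds.conc_type={conc_type}",
        f"--mdl.name={mdl_name}",
        f"--train.prob_thresh={PROB_THRESH[exp_setting][conc_type]}",
    ]
    if mdl_name == "vog":
        # use_rel is False by default in the config, so VOGNet needs both flags.
        args += ["--mdl.obj_tx.use_rel=True", "--mdl.mul_tx.use_rel=True"]
    return args


print(shlex.join(build_train_cmd("vog", "spat", "p100")))
```

Returning a list rather than a single string keeps the arguments safe to hand to subprocess.run without extra shell quoting.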
You can also test models trained with one concatenation type, such as SPAT, on another type, such as TEMP. For instance:
python code/main_dist.py "vog_train_spat_test_temp" --ds.conc_type='temp' --mdl.name='vog' --train.prob_thresh=0.2 --train.resume=True --train.resume_path='./tmp/models/spat_vog.pth' --only_val=True
- No object or multi-modal transformer: use --mdl.name='igrnd'
- Only object transformer: use --mdl.name='vgrnd'
- Only multi-modal transformer: use --mdl.name='vog' and --mdl.obj_tx.to_use=false
- Both object and multi-modal transformer with RPE: use --mdl.name='vog'
To set the number of layers and heads, set n_layers and n_heads under obj_tx and/or mul_tx.
To use relative position encoding, set use_rel=true under obj_tx and/or mul_tx.
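The ablation settings above can be written down as a lookup table of overrides. This is a rough sketch: the ablation labels and the helpers with_depth and to_flags are hypothetical, and the key paths mdl.<block>.n_layers and mdl.<block>.n_heads are an assumption inferred from how use_rel is addressed elsewhere in this README.

```python
# Ablation label -> overrides, copied from the list above. The labels
# themselves are hypothetical names chosen for this sketch.
ABLATIONS = {
    "no_tx": {"mdl.name": "igrnd"},
    "obj_tx_only": {"mdl.name": "vgrnd"},
    "mul_tx_only": {"mdl.name": "vog", "mdl.obj_tx.to_use": "false"},
    "obj_mul_tx_rpe": {
        "mdl.name": "vog",
        "mdl.obj_tx.use_rel": "True",
        "mdl.mul_tx.use_rel": "True",
    },
}


def with_depth(overrides, block, n_layers, n_heads):
    """Return a copy of overrides with layer/head counts for 'obj_tx' or 'mul_tx'."""
    if block not in ("obj_tx", "mul_tx"):
        raise ValueError(f"unknown transformer block: {block}")
    updated = dict(overrides)
    # Assumed key paths: mdl.<block>.n_layers and mdl.<block>.n_heads.
    updated[f"mdl.{block}.n_layers"] = str(n_layers)
    updated[f"mdl.{block}.n_heads"] = str(n_heads)
    return updated


def to_flags(overrides):
    """Render overrides as --key=value command-line flags."""
    return [f"--{key}={value}" for key, value in overrides.items()]


# Example: VOGNet with a 3-layer, 6-head multi-modal transformer.
flags = to_flags(with_depth(ABLATIONS["obj_mul_tx_rpe"], "mul_tx", 3, 6))
print(" ".join(flags))
```

The resulting flags would be appended after the experiment name in the python code/main_dist.py call.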
To transfer GT5-trained models to P100, pass train.resume=True and provide the trained model path via train.resume_path.
You also need to pass only_val=True and set ds.exp_setting='p100'.
For instance, to test an ImgGrnd model in the p100 setting that was trained in the gt5 setting:
python code/main_dist.py "svsq_igrnd_gt5_to_p100" --ds.conc_type='svsq' --mdl.name='igrnd' --train.prob_thresh=0. --train.resume=True --train.resume_path='./tmp/models/svsq_igrnd.pth' --ds.exp_setting='p100' --only_val=True
Here, ./tmp/models/svsq_igrnd.pth is the model path for ImgGrnd trained using SVSQ in the GT5 setting.
For TEMP and SPAT, we found train.prob_thresh=0.5 to give the best results.
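Validation-only runs from a saved checkpoint all share the same flags, so they can be assembled by one helper. The sketch below is illustrative only; build_eval_cmd is a hypothetical function, and the flags it emits are the ones named in this section.

```python
import shlex


def build_eval_cmd(exp_name, mdl_name, conc_type, resume_path,
                   prob_thresh, exp_setting=None):
    """Return the argument list for a validation-only run from a checkpoint."""
    args = [
        "python", "code/main_dist.py", exp_name,
        f"--ds.conc_type={conc_type}",
        f"--mdl.name={mdl_name}",
        f"--train.prob_thresh={prob_thresh}",
        "--train.resume=True",
        f"--train.resume_path={resume_path}",
        "--only_val=True",
    ]
    if exp_setting is not None:
        # Only needed when evaluating in a different proposal setting, e.g. p100.
        args.append(f"--ds.exp_setting={exp_setting}")
    return args


# ImgGrnd trained with SVSQ in gt5, evaluated in p100 (threshold 0 for SVSQ).
cmd = build_eval_cmd(
    "svsq_igrnd_gt5_to_p100", "igrnd", "svsq",
    "./tmp/models/svsq_igrnd.pth", prob_thresh=0.0, exp_setting="p100",
)
print(shlex.join(cmd))
```

For TEMP or SPAT transfers, the same call would take prob_thresh=0.5, following the note above.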
We provide Google Drive links to the best models and their output predictions for all the tables in the paper. Alternatively, you can download them all at once from this drive link.
Additionally, the exact command used to run each model can be found in its log file under cmd.
Page 7, Table 3, Row 1
Conc Strategy | Model Type | ID | Model | Log | Pred |
---|---|---|---|---|---|
SVSQ | ImgGrnd | svsq_igrnd_gt5_2Mar20 | model | log | pred |
SVSQ | VidGrnd | svsq_vgrnd_3Mar20 | model | log | pred |
SVSQ | VOGNet | svsq_vog_3Mar20 | model | log | pred |
SEP | ImgGrnd | sep_igrnd_gt5_2Mar20 | model | log | pred |
SEP | VidGrnd | sep_vgrnd_3Mar20 | model | log | pred |
SEP | VOGNet | sep_vog_3Mar20 | model | log | pred |
TEMP | ImgGrnd | temp_igrnd_2Mar20 | model | log | pred |
TEMP | VidGrnd | temp_vgrnd_3Mar20 | model | log | pred |
TEMP | VOGNet | temp_vog_3Mar20 | model | log | pred |
SPAT | ImgGrnd | spat_igrnd_3Mar20 | model | log | pred |
SPAT | VidGrnd | spat_vgrnd_3Mar20 | model | log | pred |
SPAT | VOGNet | spat_vog_3Mar20 | model | log | pred |
Page 7, Table 3, Row 2
Conc Strategy | Model Type | ID | Model | Log | Pred |
---|---|---|---|---|---|
SVSQ | ImgGrnd | svsq_igrnd_p100_10Mar20 | model | log | pred |
SVSQ | VidGrnd | svsq_vgrnd_p100_10Mar20 | model | log | pred |
SVSQ | VOGNet | svsq_vog_p100_10Mar20 | model | log | pred |
SEP | ImgGrnd | sep_igrnd_p100_11Mar20 | model | log | pred |
SEP | VidGrnd | sep_vgrnd_p100_8Mar20 | model | log | pred |
SEP | VOGNet | sep_vog_p100_6Mar20 | model | log | pred |
TEMP | ImgGrnd | temp_igrnd_p100_11Mar20 | model | log | pred |
TEMP | VidGrnd | temp_vgrnd_p100_8Mar20 | model | log | pred |
TEMP | VOGNet | temp_vog_p100_6Mar20 | model | log | pred |
SPAT | ImgGrnd | spat_igrnd_p100_11Mar20 | model | log | pred |
SPAT | VidGrnd | spat_vgrnd_p100_8Mar20 | model | log | pred |
SPAT | VOGNet | spat_vog_p100_6Mar20 | model | log | pred |
Page 7, Table 4
Train Conc Strategy | Test Conc Strategy | ID | Model | Log | Pred |
---|---|---|---|---|---|
SVSQ | SVSQ | svsq_vog_3Mar20 | model | log | pred |
SVSQ | TEMP | vog_gt5_train_svsq_val_temp_4Mar20 | - | log | pred |
SVSQ | SPAT | vog_gt5_train_svsq_val_spat_4Mar20 | - | log | pred |
TEMP | SVSQ | vog_gt5_train_temp_val_svsq_4Mar20 | - | log | pred |
TEMP | SPAT | vog_gt5_train_temp_val_spat_4Mar20 | - | log | pred |
TEMP | TEMP | temp_vog_3Mar20 | model | log | pred |
SPAT | SVSQ | vog_gt5_train_spat_val_svsq_4Mar20 | - | log | pred |
SPAT | TEMP | vog_gt5_train_spat_val_temp_4Mar20 | - | log | pred |
SPAT | SPAT | spat_vog_3Mar20 | model | log | pred |
Page 7, Table 5
Train Sampling | Test Sampling | Strategy | ID | Model | Log | Pred |
---|---|---|---|---|---|---|
Rnd | CS | SEP | sep_vog_gt5_rand_samp_4Mar20 | model | log | pred |
Rnd | CS | TEMP | temp_vog_gt5_rand_samp_4Mar20 | model | log | pred |
Rnd | CS | SPAT | spat_vog_gt5_rand_samp_4Mar20 | model | log | pred |
CS+Rnd | CS | SEP | sep_vog_p100_6Mar20 | model | log | pred |
CS+Rnd | CS | TEMP | temp_vog_3Mar20 | model | log | pred |
CS+Rnd | CS | SPAT | spat_vog_3Mar20 | model | log | pred |
CS+Rnd | Rnd | SEP | sep_vog_gt5_train_cs_test_rand_6Mar20 | - | log | pred |
CS+Rnd | Rnd | TEMP | temp_vog_gt5_train_cs_test_rand_6Mar20 | - | log | pred |
CS+Rnd | Rnd | SPAT | spat_vog_gt5_train_cs_test_rand_6Mar20 | - | log | pred |
Page 7, Table 6
Num Vids | ID | Model | Log | Pred |
---|---|---|---|---|
2 | spat_vog_2vid_4Mar20 | model | log | pred |
3 | spat_vog_3vid_4Mar20 | model | log | pred |
5 | spat_vog_5vid_4Mar20 | model | log | pred |
Page 7, Table 7
MDL Name | ID | Model | Log | Pred |
---|---|---|---|---|
ImgGrnd | spat_igrnd_3Mar20 | model | log | pred |
ImgGrnd_otx | spat_vgrnd_3Mar20 | model | log | pred |
ImgGrnd_otx_rpe | spat_vgrnd_rel_4Mar20 | model | log | pred |
ImgGrnd_mtx | spat_vog_only_mul_4Mar20 | model | log | pred |
ImgGrnd_mtx_rpe | spat_vog_only_mul_rel_4Mar20 | model | log | pred |
ImgGrnd_3L6H | spat_vgrnd_3L6H_4Mar20 | model | log | pred |
ImgGrnd_otx_mtx_rpe | spat_vog_3Mar20 | model | log | pred |
VOGNet_mtx_3L6H | spat_vog_3L6H_6Mar20 | model | log | pred |
VOGNet_mtx_3L6H_otx_3L6H | spat_vog_obj_3L6H_mul_3L6H_8Mar20 | model | log | pred |