Multi-Modal Interaction Graph Convolutional Network for Temporal Language Localization in Videos

Official implementation for Multi-Modal Interaction Graph Convolutional Network for Temporal Language Localization in Videos

Model Pipeline

[Figure: model pipeline]

Usage

Environment Settings

We use the PyTorch framework.

  • Python version: 3.7.0
  • PyTorch version: 1.4.0
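To confirm that a local setup matches the versions listed above, a small check like the following can be run before training. This is a minimal sketch, not part of the repository: the helper names `major_minor` and `check_environment` are hypothetical, and comparing only the major and minor version numbers is an assumption about how strict the match needs to be.

```python
import sys

# Versions listed in the README, used here as the reference environment.
EXPECTED_PYTHON = (3, 7)
EXPECTED_TORCH = (1, 4)


def major_minor(version):
    """Return (major, minor) from a version string such as '1.4.0+cu100'."""
    parts = version.split("+")[0].split(".")
    return int(parts[0]), int(parts[1])


def check_environment(python_version, torch_version):
    """Collect human-readable mismatches against the README versions."""
    problems = []
    if major_minor(python_version) != EXPECTED_PYTHON:
        problems.append("Python {} differs from 3.7.x".format(python_version))
    if major_minor(torch_version) != EXPECTED_TORCH:
        problems.append("PyTorch {} differs from 1.4.x".format(torch_version))
    return problems


if __name__ == "__main__":
    try:
        import torch  # imported lazily so the helper also runs without PyTorch
        torch_version = str(torch.__version__)
    except ImportError:
        torch_version = "0.0"  # placeholder meaning "not installed"
    python_version = "{}.{}.{}".format(*sys.version_info[:3])
    report = check_environment(python_version, torch_version)
    for line in report or ["Environment matches the README."]:
        print(line)
```

Other versions may also work; the check only reports where the environment differs from the one the authors list.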

Get Code

Clone the repository:

git clone https://github.com/zmzhang2000/MIGCN.git
cd MIGCN

Data Preparation

Charades-STA

ActivityNet

  • Download the preprocessed annotations of ActivityNet.
  • Download the C3D features of ActivityNet.
  • Process the C3D features with process_activitynet_c3d() in data/preprocess/preprocess.py.
  • Save them in data/activitynet.

Pre-trained Models

  • Download the checkpoints of Charades-STA and ActivityNet.
  • Save them in the checkpoints directory.
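Before training or testing, it can help to verify that the directories named in the steps above are in place. The following is a minimal sketch under the assumption that it is run from the repository root; `missing_paths` and `EXPECTED_PATHS` are hypothetical names, and only paths that this README mentions explicitly are listed.

```python
from pathlib import Path

# Directories named in the README, relative to the repository root.
EXPECTED_PATHS = [
    "data/activitynet",
    "checkpoints",
]


def missing_paths(root, expected=EXPECTED_PATHS):
    """Return the expected paths that do not exist under root."""
    root = Path(root)
    return [p for p in expected if not (root / p).exists()]


if __name__ == "__main__":
    absent = missing_paths(".")
    for path in absent:
        print("missing: {}".format(path))
    if not absent:
        print("All expected directories are present.")
```

The check only tests for existence; it does not validate the contents of the feature or checkpoint files.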

Data Generation

We provide the full procedure for generating all data used by MIGCN.

  • The raw data sources are listed in data/raw_data/download.sh.
  • The preprocessing code is in data/preprocess.

Training

Train MIGCN on Charades-STA with I3D features:

python main.py --dataset charades --feature i3d

Train MIGCN on ActivityNet with C3D features:

python main.py --dataset activitynet --feature c3d

Testing

Test MIGCN on Charades-STA with I3D features:

python main.py --dataset charades --feature i3d --test --model_load_path checkpoints/$MODEL_CHECKPOINT

Test MIGCN on ActivityNet with C3D features:

python main.py --dataset activitynet --feature c3d --test --model_load_path checkpoints/$MODEL_CHECKPOINT
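The training and testing commands above differ only in the dataset, the feature type, and the optional checkpoint, so they can be assembled from a script when running several configurations. The sketch below is an illustration, not part of the repository: `build_command` is a hypothetical helper, it uses only the flags shown in this README, and the checkpoint file name in the example is a placeholder.

```python
import shlex
import sys


def build_command(dataset, feature, checkpoint=None):
    """Assemble the main.py argument list shown in the README.

    Passing a checkpoint path switches from training to testing.
    """
    cmd = [sys.executable, "main.py", "--dataset", dataset, "--feature", feature]
    if checkpoint is not None:
        cmd += ["--test", "--model_load_path", checkpoint]
    return cmd


# The two dataset/feature pairs used in the README.
CONFIGS = [("charades", "i3d"), ("activitynet", "c3d")]

for dataset, feature in CONFIGS:
    # "checkpoints/model.pth" is a placeholder file name, not a real checkpoint.
    for cmd in (build_command(dataset, feature),
                build_command(dataset, feature, "checkpoints/model.pth")):
        print(" ".join(shlex.quote(arg) for arg in cmd))

# To execute a command from the repository root instead of printing it:
#   import subprocess
#   subprocess.run(build_command("charades", "i3d"), check=True)
```

Returning a list rather than a single string keeps the arguments safe to pass to `subprocess.run` without shell quoting.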

Other Hyper-parameters

List the other hyper-parameters with:

python main.py -h

Reference

Please cite the following paper if MIGCN is helpful for your research:

@ARTICLE{9547801,
  author={Zhang, Zongmeng and Han, Xianjing and Song, Xuemeng and Yan, Yan and Nie, Liqiang},
  journal={IEEE Transactions on Image Processing}, 
  title={Multi-Modal Interaction Graph Convolutional Network for Temporal Language Localization in Videos}, 
  year={2021},
  volume={30},
  number={},
  pages={8265-8277},
  doi={10.1109/TIP.2021.3113791}}
