
DilateFormer

Official PyTorch implementation of the IEEE Transactions on Multimedia 2023 paper "DilateFormer: Multi-Scale Dilated Transformer for Visual Recognition". [paper] [Project Page]

We currently release the PyTorch code for:

  • ImageNet-1K training

Image classification

Our repository is built on the DeiT repository, with some useful added features:

  1. Calculating accurate FLOPs and parameters with fvcore (see check_model.py).
  2. Auto-resuming.
  3. Saving best models and backup models.
  4. Generating training curve (see generate_tensorboard.py).

Installation

  • Install PyTorch 1.7.0+ and torchvision 0.8.1+

    conda install -c pytorch pytorch torchvision
  • Install other packages

    pip install timm
    pip install fvcore

Training

Simply run the training script as follows, taking dilateformer_tiny as an example:

bash dist_train.sh dilateformer_tiny [other params]

If training is interrupted abnormally, simply rerun the script to auto-resume. If the latest checkpoint was not saved properly, set the checkpoint to resume from explicitly via --resume ${work_path}/ckpt/backup.pth.
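
For illustration, auto-resuming typically follows the pattern below. This is a hedged sketch rather than the repository's exact code: the checkpoint keys ("model", "optimizer", "epoch") and the backup.pth location are assumptions based on the instructions above.

    import os
    import torch

    def auto_resume(work_path, model, optimizer):
        """Restore state from the backup checkpoint if one exists.
        Returns the epoch to start training from."""
        ckpt_path = os.path.join(work_path, "ckpt", "backup.pth")
        if not os.path.isfile(ckpt_path):
            return 0  # no checkpoint found, start from scratch
        state = torch.load(ckpt_path, map_location="cpu")
        model.load_state_dict(state["model"])          # assumed key
        optimizer.load_state_dict(state["optimizer"])  # assumed key
        return state["epoch"] + 1                      # resume at the next epoch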

Generate curves

You can generate the training curves as follows:

python3 generate_tensorboard.py

Note that you should install tensorboardX.
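
As a rough sketch of what generate_tensorboard.py does, the pattern below reads per-epoch metrics and writes them as tensorboardX scalars. The log file name and JSON keys here are assumptions for illustration, not the repository's actual format.

    import json
    from tensorboardX import SummaryWriter

    writer = SummaryWriter(logdir="runs/dilateformer_tiny")  # assumed output dir

    # Suppose each line of log.txt is a JSON record such as
    # {"epoch": 0, "train_loss": 6.9, "test_acc1": 1.2}  (assumed format)
    with open("log.txt") as f:
        for line in f:
            record = json.loads(line)
            writer.add_scalar("train/loss", record["train_loss"], record["epoch"])
            writer.add_scalar("val/acc1", record["test_acc1"], record["epoch"])

    writer.close()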

Calculating FLOPs and Parameters

You can calculate the FLOPs and parameters via:

python3 check_model.py
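
check_model.py relies on fvcore for the counting; a minimal sketch of the underlying logic is shown below. The import path for the model constructor is an assumption for illustration.

    import torch
    from fvcore.nn import FlopCountAnalysis, parameter_count

    from models.dilateformer import dilateformer_tiny  # assumed import path

    model = dilateformer_tiny().eval()
    dummy_input = torch.randn(1, 3, 224, 224)  # standard ImageNet-1K input

    flops = FlopCountAnalysis(model, dummy_input)
    params = parameter_count(model)

    print(f"FLOPs: {flops.total() / 1e9:.2f} G")
    print(f"Params: {params[''] / 1e6:.2f} M")  # '' key holds the whole-model total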

Acknowledgement

This repository is built using the timm library and the DeiT repository.

Citation

If you use this code for a paper, please cite:

DilateFormer

@article{jiao2023dilateformer,
  title   = {DilateFormer: Multi-Scale Dilated Transformer for Visual Recognition},
  author  = {Jiao, Jiayu and Tang, Yu-Ming and Lin, Kun-Yu and Gao, Yipeng and Ma, Jinhua and Wang, Yaowei and Zheng, Wei-Shi},
  journal = {{IEEE} Transactions on Multimedia},
  year    = {2023}
}
