This directory contains the configs and results of PVTv2. More examples can be found in the original repository. Please consider using MMDetection's configs when training new models.
| Method | Backbone | Pretrain | Lr schd | Aug | box AP | mask AP | Config | Download |
|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| ATSS | PVTv2-B2-Li | ImageNet-1K | 3x | Yes | 48.9 | - | config | log & model |
| ATSS | PVTv2-B2 | ImageNet-1K | 3x | Yes | 49.9 | - | config | log & model |
| GFL | PVTv2-B2-Li | ImageNet-1K | 3x | Yes | 49.2 | - | config | log & model |
| GFL | PVTv2-B2 | ImageNet-1K | 3x | Yes | 50.2 | - | config | log & model |
| Sparse R-CNN | PVTv2-B2-Li | ImageNet-1K | 3x | Yes | 48.9 | - | config | log & model |
| Sparse R-CNN | PVTv2-B2 | ImageNet-1K | 3x | Yes | 50.1 | - | config | log & model |
| Cascade Mask R-CNN | PVTv2-B2-Li | ImageNet-1K | 3x | Yes | 50.9 | 44.0 | config | log & model |
| Cascade Mask R-CNN | PVTv2-B2 | ImageNet-1K | 3x | Yes | 51.1 | 44.4 | config | log & model |
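To try one of the released models, a single image can be run through MMDetection's high-level inference API. The snippet below is a minimal sketch: both file paths are hypothetical placeholders for the config and checkpoint linked in the table above.

```python
# Minimal single-image inference sketch using MMDetection's high-level API.
# NOTE: both paths are hypothetical placeholders; substitute the actual
# config file and the checkpoint downloaded from the table above.
from mmdet.apis import init_detector, inference_detector

config_file = 'configs/pvtv2/atss_pvtv2_b2_fpn_3x_coco.py'  # hypothetical path
checkpoint_file = 'checkpoints/atss_pvtv2_b2.pth'           # hypothetical path

model = init_detector(config_file, checkpoint_file, device='cuda:0')
result = inference_detector(model, 'demo/demo.jpg')  # per-class detection results
```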
The current configs use mixed-precision training via MMCV by default. Please install PyTorch >= 1.6.0 to use `torch.cuda.amp`. If you observe a performance difference from apex (which the original authors used), please raise an issue; otherwise, we will remove the apex code path.
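For reference, MMCV-based mixed precision is typically switched on through a top-level `fp16` field in the config, from which the fp16 optimizer hook is built. A minimal sketch (the static loss scale of 512.0 is an assumed, commonly used value; newer MMCV versions also accept `loss_scale='dynamic'`):

```python
# Enable MMCV-based mixed-precision training in a config (minimal sketch).
# loss_scale=512.0 is an assumed, commonly used static scale; newer MMCV
# versions also accept loss_scale='dynamic' for dynamic loss scaling.
fp16 = dict(loss_scale=512.0)
```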
<details>
<summary>Click me to use apex</summary>

To install apex, run:

```shell
git clone https://github.com/NVIDIA/apex
cd apex
python setup.py install --cpp_ext --cuda_ext --user
```
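If the build succeeded, apex's `amp` module should import cleanly; the one-liner below is a quick sanity check (not part of the original instructions):

```python
# Raises ImportError if apex was not installed correctly.
from apex import amp  # noqa: F401
```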
Then modify the configs as follows:

```python
# Switch from MMCV's native amp to the apex-based runner and optimizer hook.
runner = dict(type='EpochBasedRunnerAmp', max_epochs=36)  # 36 epochs = 3x schedule
fp16 = None  # disable MMCV's torch.cuda.amp path
optimizer_config = dict(
    type='ApexOptimizerHook',
    update_interval=1,  # optimizer step every iteration (no gradient accumulation)
    grad_clip=None,
    coalesce=True,      # gradient all-reduce bucketing options
    bucket_size_mb=-1,
    use_fp16=True,
)
```

</details>
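Note that `EpochBasedRunnerAmp` and `ApexOptimizerHook` are not upstream MMCV classes; they are expected to be provided by this repository's custom extensions, so make sure those are importable before launching training.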
PVTv1

```bibtex
@misc{wang2021pyramid,
    title={Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions},
    author={Wenhai Wang and Enze Xie and Xiang Li and Deng-Ping Fan and Kaitao Song and Ding Liang and Tong Lu and Ping Luo and Ling Shao},
    year={2021},
    eprint={2102.12122},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}
```

PVTv2

```bibtex
@misc{wang2021pvtv2,
    title={PVTv2: Improved Baselines with Pyramid Vision Transformer},
    author={Wenhai Wang and Enze Xie and Xiang Li and Deng-Ping Fan and Kaitao Song and Ding Liang and Tong Lu and Ping Luo and Ling Shao},
    year={2021},
    eprint={2106.13797},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}
```