This codebase provides the official PyTorch implementation of our CVPR 2024 paper:
Benchmarking Segmentation Models with Mask-Preserved Attribute Editing
Zijin Yin, Kongming Liang, Bing Li, Zhanyu Ma, Jun Guo
In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024
We generate diverse synthetic samples by editing real images with diffusion models, and use the resulting synthetic-real pairs to evaluate semantic segmentation performance.
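The paired evaluation described above can be illustrated with a minimal sketch. It assumes that, because the edits are mask-preserving, the ground-truth mask of the real image can be reused for its edited counterpart, and it reports the mIoU gap between the two. The function names (`intersection_union`, `mean_iou`, `miou_drop`) and the `ignore_index=255` convention are illustrative assumptions, not part of this repository; masks are given as flat lists of class ids.

```python
import math


def intersection_union(pred, gt, num_classes, ignore_index=255):
    """Per-class intersection and union pixel counts for one flattened mask."""
    inter = [0] * num_classes
    union = [0] * num_classes
    for p, g in zip(pred, gt):
        if g == ignore_index:  # skip unlabeled pixels
            continue
        if p == g:
            inter[g] += 1
            union[g] += 1
        else:
            union[g] += 1  # missed ground-truth pixel
            union[p] += 1  # false-positive pixel for the predicted class
    return inter, union


def mean_iou(preds, gts, num_classes, ignore_index=255):
    """Dataset-level mIoU: accumulate counts over all images, then average."""
    total_i = [0] * num_classes
    total_u = [0] * num_classes
    for pred, gt in zip(preds, gts):
        inter, union = intersection_union(pred, gt, num_classes, ignore_index)
        for c in range(num_classes):
            total_i[c] += inter[c]
            total_u[c] += union[c]
    # Classes that never appear are left out of the average.
    ious = [i / u for i, u in zip(total_i, total_u) if u > 0]
    return sum(ious) / len(ious) if ious else math.nan


def miou_drop(real_preds, edited_preds, gts, num_classes):
    """mIoU on real images minus mIoU on edited images, using shared masks."""
    return (mean_iou(real_preds, gts, num_classes)
            - mean_iou(edited_preds, gts, num_classes))


if __name__ == "__main__":
    gts = [[0, 0, 1, 1]]           # one tiny 4-pixel mask, shared by both images
    real_preds = [[0, 0, 1, 1]]    # prediction on the real image
    edited_preds = [[0, 1, 1, 1]]  # prediction on the edited image
    print(f"mIoU drop: {miou_drop(real_preds, edited_preds, gts, 2):.4f}")
```

A positive drop means the model segments the edited images worse than the real ones under the same masks.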
Install mmsegmentation
pip install -U openmim
mim install mmengine
mim install "mmcv>=2.0.0"
pip install "mmsegmentation>=1.0.0"
pip install "mmdet>=3.0.0rc4"
Install diffusers and transformers.
pip install diffusers==0.17.1
pip install transformers==4.26.1
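After installation, a quick check of the installed versions can catch mismatches early. The script below is a minimal sketch, not part of this repository: it reads installed versions through the standard library's `importlib.metadata` and compares them with the constraints listed above. The helper names are illustrative, and the version parser is deliberately simplified (it ignores pre-release suffixes, so `3.0.0rc4` is treated as `3.0.0`).

```python
from importlib import metadata

# Constraints copied from the install commands above.
REQUIREMENTS = {
    "mmcv": (">=", "2.0.0"),
    "mmsegmentation": (">=", "1.0.0"),
    "mmdet": (">=", "3.0.0rc4"),
    "diffusers": ("==", "0.17.1"),
    "transformers": ("==", "4.26.1"),
}


def parse_version(text):
    """Return the leading numeric components of a version string as a tuple."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
        if digits != piece:  # stop at a suffix such as "0rc4"
            break
    return tuple(parts)


def satisfies(installed, op, required):
    """Compare two versions numerically; op is '>=' or '=='."""
    have, want = parse_version(installed), parse_version(required)
    width = max(len(have), len(want))
    have += (0,) * (width - len(have))  # pad so (2,) compares like (2, 0, 0)
    want += (0,) * (width - len(want))
    return have >= want if op == ">=" else have == want


def check_environment(requirements=REQUIREMENTS):
    """Map each package name to 'missing' or its version plus a status note."""
    report = {}
    for name, (op, required) in requirements.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        ok = satisfies(installed, op, required)
        report[name] = f"{installed} ({'ok' if ok else 'expected ' + op + required})"
    return report


if __name__ == "__main__":
    for name, status in check_environment().items():
        print(f"{name}: {status}")
```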
Please refer to
Step 1. dataset_prepare.md for dataset preparation
Step 2. text_edit.md for image caption editing
Step 3. appear_edit.md for editing image appearance attributes (color, material, ...); our conference paper mainly focuses on this part
Step 4. geo_edit.md for editing object geometry attributes (size, position); our journal extension mainly focuses on this part
Step 5. filter.md for the noise filtering strategy
Our code builds on several excellent research codebases and models, including PnP, LLaMA, and LLaVA. It also borrows the mask filtering strategy from FreeMask and the CLIP directional similarity metric from LANCE. Thanks for their contributions!
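For readers unfamiliar with CLIP directional similarity, the sketch below shows the general idea as it is commonly defined: the cosine similarity between the change in image embeddings (edited minus original) and the change in text embeddings (edited caption minus original caption). It assumes the CLIP embeddings have already been computed and are passed in as plain lists of floats. The function names are illustrative, and this is not the code used in this repository or in LANCE.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def directional_similarity(img_src, img_edit, txt_src, txt_edit):
    """Cosine between the image-edit direction and the caption-edit direction."""
    d_img = [b - a for a, b in zip(img_src, img_edit)]
    d_txt = [b - a for a, b in zip(txt_src, txt_edit)]
    return cosine(d_img, d_txt)


if __name__ == "__main__":
    # Toy 2-D embeddings standing in for real CLIP features (assumption).
    score = directional_similarity([1.0, 0.0], [1.0, 1.0],
                                   [2.0, 0.0], [2.0, 3.0])
    print(f"directional similarity: {score:.3f}")
```

A score near 1 suggests the image changed in the direction the caption edit asked for; a score near 0 or below suggests the edit did not follow the caption.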