This repository contains the code and data for the following paper:
Exposing Text-Image Inconsistency Using Diffusion Models (ICLR 2024)
Please download the dataset here: [Google Drive]. When requesting access, please provide your details and state how you intend to use the dataset. The dataset is intended for research purposes only.
We tested the code with PyTorch 1.13.1 and CUDA 11.7. Install the dependencies with:
pip install -r requirements.txt
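To confirm that an installed environment matches the tested PyTorch version, a small check along these lines may help. This is a minimal sketch, not part of the repository: the helper names `installed_version` and `matches_tested` are hypothetical, and the check only compares the `torch` package version (it does not inspect the CUDA toolkit).

```python
from importlib import metadata

# Version this README reports testing against.
TESTED_TORCH = "1.13.1"


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches_tested(installed, tested=TESTED_TORCH):
    """Compare versions, ignoring a local build suffix such as '+cu117'."""
    if installed is None:
        return False
    return installed.split("+")[0] == tested


if __name__ == "__main__":
    version = installed_version("torch")
    if matches_tested(version):
        print(f"torch {version} matches the tested version {TESTED_TORCH}")
    else:
        print(f"torch {version} differs from the tested version {TESTED_TORCH}")
```

A mismatch does not necessarily mean the code will fail; it only indicates that the setup differs from the one reported here.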
Please see the provided Jupyter notebook for details, or run the model directly with the following code:
import torch
from PIL import Image
from pipeline import DTIILPipeline
im = Image.open('./asset/exampe.jpg').resize((512,512)).convert("RGB")
model_id = "runwayml/stable-diffusion-v1-5"
pipe = DTIILPipeline.from_pretrained(model_id, safety_checker=None)
prompt = "your text here"  # placeholder: replace with the text paired with the image
mask = pipe(prompt, im)['final_mask']
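When running the pipeline over several images, the image preparation step in the snippet above can be wrapped in a small helper. The sketch below assumes only Pillow; `load_image` is a hypothetical name and not part of the repository. It converts to RGB before resizing and, like the snippet above, resizes to a square without preserving the aspect ratio.

```python
from PIL import Image


def load_image(path, size=512):
    """Open an image, convert it to RGB, and resize it to size x size pixels."""
    with Image.open(path) as im:
        # convert() loads the pixel data and returns a new image,
        # so the result remains usable after the file is closed.
        return im.convert("RGB").resize((size, size))
```

The returned image can then be passed to the pipeline in place of `im`.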
If you find our code or dataset useful, please cite:
@inproceedings{
huang2024exposing,
title={Exposing Text-Image Inconsistency Using Diffusion Models},
author={Mingzhen Huang and Shan Jia and Zhou Zhou and Yan Ju and Jialing Cai and Siwei Lyu},
booktitle={The Twelfth International Conference on Learning Representations},
year={2024},
}