Vector Quantized Diffusion Model for Text-to-Image Synthesis

Overview

This is the official repository for the paper [Vector Quantized Diffusion Model for Text-to-Image Synthesis].

VQ-Diffusion is based on a VQ-VAE whose latent space is modeled by a conditional variant of the recently developed Denoising Diffusion Probabilistic Model (DDPM). It produces significantly better text-to-image generation results than autoregressive models with a similar number of parameters. Compared with previous GAN-based methods, VQ-Diffusion handles more complex scenes and improves synthesized image quality by a large margin.
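
To make the idea of diffusion over a VQ-VAE's discrete latent tokens more concrete, here is a minimal Python sketch of one forward (noising) step. It is an illustration under stated assumptions, not this repository's code: the function name `corrupt_tokens`, its parameters, and the choice to reserve one extra index for a mask token are all hypothetical, and the keep/mask/resample scheme is only one simple way to corrupt discrete tokens. The text-conditioned denoising network that would learn to reverse this step is not shown.

```python
import random


def corrupt_tokens(tokens, codebook_size, keep_prob, mask_prob, rng):
    """Apply one forward noising step to a list of discrete VQ token indices.

    Hypothetical scheme for illustration only:
      * with probability keep_prob the token is left unchanged,
      * with probability mask_prob it becomes the special mask index,
      * otherwise it is resampled uniformly from the codebook.
    A token that is already masked stays masked.
    """
    if keep_prob < 0.0 or mask_prob < 0.0 or keep_prob + mask_prob > 1.0 + 1e-9:
        raise ValueError("keep_prob and mask_prob must be >= 0 and sum to at most 1")

    # Assumption: index `codebook_size` (one past the last code) stands for [MASK].
    mask_index = codebook_size

    noisy = []
    for tok in tokens:
        if tok == mask_index:
            # The mask acts as an absorbing state.
            noisy.append(mask_index)
            continue
        u = rng.random()  # uniform in [0.0, 1.0)
        if u < keep_prob:
            noisy.append(tok)
        elif u < keep_prob + mask_prob:
            noisy.append(mask_index)
        else:
            noisy.append(rng.randrange(codebook_size))
    return noisy


if __name__ == "__main__":
    rng = random.Random(0)
    tokens = [3, 17, 42, 8, 25, 11]  # made-up latent token indices
    for step in range(1, 6):
        tokens = corrupt_tokens(tokens, codebook_size=64,
                                keep_prob=0.7, mask_prob=0.2, rng=rng)
        print(step, tokens)
```

Applying the step repeatedly drives the sequence toward noise (here, mostly mask tokens), which is the starting point a reverse, denoising model would generate from.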

Our code and models are ready; however, they are still under company review. We promise to release them in December.

Framework

Samples

More Samples
