diff --git a/README.md b/README.md
index 56f8d180..b3beb699 100644
--- a/README.md
+++ b/README.md
@@ -74,6 +74,7 @@ Projects that were developed in Scenic or used it for their experiments:
 * [A Generative Approach for Wikipedia-Scale Visual Entity Recognition](https://arxiv.org/abs/2403.02041)
 * [Streaming Dense Video Captioning](https://arxiv.org/abs/2404.01297)
 * [Dense Video Object Captioning from Disjoint Supervision](https://arxiv.org/abs/2306.11729)
+* [Semantica: An Adaptable Image-Conditioned Diffusion Model](https://arxiv.org/abs/2405.14857)
 
 More information can be found in [projects](https://github.com/google-research/scenic/tree/main/scenic/projects#list-of-projects-hosted-in-scenic).
 
diff --git a/scenic/projects/README.md b/scenic/projects/README.md
index d5e06c0f..2934776f 100644
--- a/scenic/projects/README.md
+++ b/scenic/projects/README.md
@@ -132,6 +132,14 @@
 > trained on single modalities or tasks.
 > Details can be found in the [paper](https://arxiv.org/abs/2111.12993).
 
+* [Semantica](modified_simple_diffusion)
+
+ > Semantica is an image-conditioned diffusion model that generates
+ > images based on the semantics of a conditioning image. It is trained
+ > exclusively on web-scale image pairs employing pretrained image encoders
+ > and semantic data-filtering.
+ > Details can be found in the [paper](https://arxiv.org/abs/2405.14857).
+
 * [T5](t5)
 
 > Wrappers of T5 models in [t5x](https://github.com/google-research/t5x).