GitHub - WesLee88524/transfusion-pytorch: Pytorch implementation of Transfusion, "Predict the Next Token and Diffuse Images with One Multi-Modal Model", from MetaAI

Transfusion - Pytorch (wip)

PyTorch implementation of Transfusion, "Predict the Next Token and Diffuse Images with One Multi-Modal Model", from Meta AI.

Once completed, this will also be extended to flow matching, as well as to audio and video.

Citations

@inproceedings{Zhou2024TransfusionPT,
    title  = {Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model},
    author = {Chunting Zhou and Lili Yu and Arun Babu and Kushal Tirumala and Michihiro Yasunaga and Leonid Shamis and Jacob Kahn and Xuezhe Ma and Luke Zettlemoyer and Omer Levy},
    year   = {2024},
    url    = {https://api.semanticscholar.org/CorpusID:271909855}
}

