Transfusion - Pytorch (wip)

PyTorch implementation of Transfusion, from the paper "Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model" by Meta AI.

Once completed, this will also be extended to flow matching, as well as to audio and video.

Citations

@inproceedings{Zhou2024TransfusionPT,
    title  = {Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model},
    author = {Chunting Zhou and Lili Yu and Arun Babu and Kushal Tirumala and Michihiro Yasunaga and Leonid Shamis and Jacob Kahn and Xuezhe Ma and Luke Zettlemoyer and Omer Levy},
    year   = {2024},
    url    = {https://api.semanticscholar.org/CorpusID:271909855}
}
