A curated list of the latest research papers, projects, and resources related to DiT/FLUX. Content is automatically updated daily.
Last Update: 2025-01-18 06:25:59
Thanks to @longxiang-ai for the template.
- Image Editing (18 papers) - Papers about image editing with Diffusion Transformer or FLUX
- Image Generation (100 papers) - Papers focusing on image generation with Diffusion Transformer or FLUX
- Video Related (62 papers) - Papers about video generation and editing with Diffusion Transformer or FLUX
- EliGen: Entity-Level Controlled Image Generation with Regional Attention (Published: 2025-01-02)
Authors: Hong Zhang, Zhongjie Duan, Xingjun Wang, Yingda Chen, Yu Zhang
Links:
Keywords: image inpainting, Control, image generation, diffusion transformer, text-to-image
- Context Canvas: Enhancing Text-to-Image Diffusion Models with Knowledge Graph-Based RAG (Published: 2024-12-12)
Authors: Kavana Venkatesh, Yusuf Dalva, Ismini Lourentzou, Pinar Yanardag
Links:
Keywords: Control, text-to-image, FLUX, image editing
- FluxSpace: Disentangled Semantic Editing in Rectified Flow Transformers (Published: 2024-12-12)
Authors: Yusuf Dalva, Kavana Venkatesh, Pinar Yanardag
Links:
Keywords: FLUX, image generation, image editing, Control, rectified flow
- AMO Sampler: Enhancing Text Rendering with Overshooting (Published: 2024-11-28)
Authors: Xixi Hu, Keyang Xu, Bo Liu, Qiang Liu, Hongliang Fei
Links:
Keywords: FLUX, Control, image generation, text-to-image, rectified flow
- Prediction with Action: Visual Policy Learning via Joint Denoising Process (Published: 2024-11-27)
Authors: Yanjiang Guo, Yucheng Hu, Jianke Zhang, Yen-Jen Wang, Xiaoyu Chen, Chaochao Lu, Jianyu Chen
Links:
Keywords: image generation, Control, diffusion transformer, image editing
- HeadRouter: A Training-free Image Editing Framework for MM-DiTs by Adaptively Routing Attention Heads (Published: 2024-11-22)
Authors: Yu Xu, Fan Tang, Juan Cao, Yuxin Zhang, Xiaoyu Kong, Jintao Li, Oliver Deussen, Tong-Yee Lee
Links:
Keywords: image generation, diffusion transformer, image editing
- Stable Flow: Vital Layers for Training-Free Image Editing (Published: 2024-11-21)
Authors: Omri Avrahami, Or Patashnik, Ohad Fried, Egor Nemchinov, Kfir Aberman, Dani Lischinski, Daniel Cohen-Or
Links:
Keywords: Control, diffusion transformer, inversion, image editing
- Oscillation Inversion: Understand the structure of Large Flow Model through the Lens of Inversion Method (Published: 2024-11-17)
Authors: Yan Zheng, Zhenxiao Liang, Xiaoyan Cong, Lanqing Guo, Yuehao Wang, Peihao Wang, Zhangyang Wang
Links:
Keywords: FLUX, image editing, inversion, text-to-image, rectified flow
- Latent Space Disentanglement in Diffusion Transformers Enables Precise Zero-shot Semantic Editing (Published: 2024-11-12)
Authors: Zitao Shuai, Chenwei Wu, Zhengxu Tang, Bowen Song, Liyue Shen
Links:
Keywords: image generation, Control, diffusion transformer, image editing
- Taming Rectified Flow for Inversion and Editing (Published: 2024-11-07)
Authors: Jiangshan Wang, Junfu Pu, Zhongang Qi, Jiayi Guo, Yue Ma, Nisha Huang, Yuxin Chen, Xiu Li, Ying Shan
Links:
Keywords: FLUX, video generation, video editing, diffusion transformer, inversion, rectified flow
- DiT4Edit: Diffusion Transformer for Image Editing (Published: 2024-11-05)
Authors: Kunyu Feng, Yue Ma, Bingyuan Wang, Chenyang Qi, Haozhe Chen, Qifeng Chen, Zeyu Wang
Links:
Keywords: image generation, diffusion transformer, inversion, image editing, Control
- FiTv2: Scalable and Improved Flexible Vision Transformer for Diffusion Model (Published: 2024-10-17)
Authors: ZiDong Wang, Zeyu Lu, Di Huang, Cai Zhou, Wanli Ouyang, Lei Bai
Links:
Keywords: image generation, diffusion transformer, rectified flow
- Semantic Image Inversion and Editing using Rectified Stochastic Differential Equations (Published: 2024-10-14)
Authors: Litu Rout, Yujia Chen, Nataniel Ruiz, Constantine Caramanis, Sanjay Shakkottai, Wen-Sheng Chu
Links:
Keywords: FLUX, image editing, inversion, Control, rectified flow
- Effective Diffusion Transformer Architecture for Image Super-Resolution (Published: 2024-09-29)
Authors: Kun Cheng, Lei Yu, Zhijun Tu, Xiao He, Liyu Chen, Yong Guo, Mingrui Zhu, Nannan Wang, Xinbo Gao, Jie Hu
Links:
Keywords: image generation, diffusion transformer, image super-resolution
- PixWizard: Versatile Image-to-Image Visual Assistant with Open-Language Instructions (Published: 2024-09-23)
Authors: Weifeng Lin, Xinyu Wei, Renrui Zhang, Le Zhuo, Shitian Zhao, Siyuan Huang, Junlin Xie, Yu Qiao, Peng Gao, Hongsheng Li
Links:
Keywords: Control, image generation, diffusion transformer, image editing, Controllable, text-to-image
- Latent Space Disentanglement in Diffusion Transformers Enables Zero-shot Fine-grained Semantic Editing (Published: 2024-08-23)
Authors: Zitao Shuai, Chenwei Wu, Zhengxu Tang, Bowen Song, Liyue Shen
Links:
Keywords: Control, text-to-image, diffusion transformer, image editing
- Lazy Diffusion Transformer for Interactive Image Editing (Published: 2024-04-18)
Authors: Yotam Nitzan, Zongze Wu, Richard Zhang, Eli Shechtman, Daniel Cohen-Or, Taesung Park, Michaël Gharbi
Links:
Keywords: diffusion transformer, image editing
- Lightning-Fast Image Inversion and Editing for Text-to-Image Diffusion Models (Published: 2023-12-19)
Authors: Dvir Samuel, Barak Meiri, Haggai Maron, Yoad Tewel, Nir Darshan, Shai Avidan, Gal Chechik, Rami Ben-Ari
Links:
Keywords: text-to-image, FLUX, inversion, image editing
Showing the latest 50 out of 100 papers
- Learnings from Scaling Visual Tokenizers for Reconstruction and Generation (Published: 2025-01-16)
Authors: Philippe Hansen-Estruch, David Yan, Ching-Yao Chung, Orr Zohar, Jialiang Wang, Tingbo Hou, Tao Xu, Sriram Vishwanath, Peter Vajda, Xinlei Chen
Links:
Keywords: image generation, diffusion transformer, video generation
- Enhancing Image Generation Fidelity via Progressive Prompts (Published: 2025-01-13)
Authors: Zhen Xiong, Yuqi Li, Chuanguang Yang, Tiao Tan, Zhihong Zhu, Siyuan Li, Yue Ma
Links:
Keywords: image generation, Control, diffusion transformer
- 3DIS-FLUX: simple and efficient multi-instance generation with DiT rendering (Published: 2025-01-09)
Authors: Dewei Zhou, Ji Xie, Zongxin Yang, Yi Yang
Links:
Keywords: FLUX, Control, image generation, Controllable, text-to-image
- Circuit Complexity Bounds for Visual Autoregressive Model (Published: 2025-01-08)
Authors: Yekun Ke, Xiaoyu Li, Yingyu Liang, Zhenmei Shi, Zhao Song
Links:
Keywords: image generation, diffusion transformer
- GS-DiT: Advancing Video Generation with Pseudo 4D Gaussian Fields through Efficient Dense 3D Point Tracking (Published: 2025-01-05)
Authors: Weikang Bian, Zhaoyang Huang, Xiaoyu Shi, Yijin Li, Fu-Yun Wang, Hongsheng Li
Links:
Keywords: Control, diffusion transformer, video generation
- EliGen: Entity-Level Controlled Image Generation with Regional Attention (Published: 2025-01-02)
Authors: Hong Zhang, Zhongjie Duan, Xingjun Wang, Yingda Chen, Yu Zhang
Links:
Keywords: image inpainting, Control, image generation, diffusion transformer, text-to-image
- Dual Diffusion for Unified Image Generation and Understanding (Published: 2024-12-31)
Authors: Zijie Li, Henry Li, Yichun Shi, Amir Barati Farimani, Yuval Kluger, Linjie Yang, Peng Wang
Links:
Keywords: image generation, text-to-image, diffusion transformer
- Open-Sora: Democratizing Efficient Video Production for All (Published: 2024-12-29)
Authors: Zangwei Zheng, Xiangyu Peng, Tianji Yang, Chenhui Shen, Shenggui Li, Hongxin Liu, Yukun Zhou, Tianyi Li, Yang You
Links:
Keywords: image generation, text-to-image, diffusion transformer, video generation
- UNIC-Adapter: Unified Image-instruction Adapter with Multi-modal Transformer for Image Generation (Published: 2024-12-25)
Authors: Lunhao Duan, Shanshan Zhao, Wenjun Yan, Yinglun Li, Qing-Guo Chen, Zhao Xu, Weihua Luo, Kaifu Zhang, Mingming Gong, Gui-Song Xia
Links:
Keywords: Control, image generation, diffusion transformer, Controllable, text-to-image
- 1.58-bit FLUX (Published: 2024-12-24)
Authors: Chenglin Yang, Celong Liu, Xueqing Deng, Dongwon Kim, Xing Mei, Xiaohui Shen, Liang-Chieh Chen
Links:
Keywords: image generation, text-to-image, FLUX
- DiTCtrl: Exploring Attention Control in Multi-Modal Diffusion Transformer for Tuning-Free Multi-Prompt Longer Video Generation (Published: 2024-12-24)
Authors: Minghong Cai, Xiaodong Cun, Xiaoyu Li, Wenze Liu, Zhaoyang Zhang, Yong Zhang, Ying Shan, Xiangyu Yue
Links:
Keywords: Control, diffusion transformer, video generation, video editing
- Layer- and Timestep-Adaptive Differentiable Token Compression Ratios for Efficient Diffusion Transformers (Published: 2024-12-22)
Authors: Haoran You, Connelly Barnes, Yuqian Zhou, Yan Kang, Zhenbang Du, Wei Zhou, Lingzhi Zhang, Yotam Nitzan, Xiaoyang Liu, Zhe Lin, Eli Shechtman, Sohrab Amirghodsi, Yingyan Celine Lin
Links:
Keywords: image generation, text-to-image, diffusion transformer
- CLEAR: Conv-Like Linearization Revs Pre-Trained Diffusion Transformers Up (Published: 2024-12-20)
Authors: Songhua Liu, Zhenxiong Tan, Xinchao Wang
Links:
Keywords: image generation, diffusion transformer
- Efficient Scaling of Diffusion Transformers for Text-to-Image Generation (Published: 2024-12-16)
Authors: Hao Li, Shamit Lal, Zhiheng Li, Yusheng Xie, Ying Wang, Yang Zou, Orchid Majumder, R. Manmatha, Zhuowen Tu, Stefano Ermon, Stefano Soatto, Ashwin Swaminathan
Links:
Keywords: image generation, text-to-image, diffusion transformer, Control
- Causal Diffusion Transformers for Generative Modeling (Published: 2024-12-16)
Authors: Chaorui Deng, Deyao Zhu, Kunchang Li, Shi Guang, Haoqi Fan
Links:
Keywords: image generation, diffusion transformer
- Video Diffusion Transformers are In-Context Learners (Published: 2024-12-14)
Authors: Zhengcong Fei, Di Qiu, Changqian Yu, Debang Li, Mingyuan Fan, Xiang Wen
Links:
Keywords: Control, diffusion transformer, video generation, Controllable
- MSC: Multi-Scale Spatio-Temporal Causal Attention for Autoregressive Video Diffusion (Published: 2024-12-13)
Authors: Xunnong Xu, Mengying Cao
Links:
Keywords: Control, diffusion transformer, video generation
- Context Canvas: Enhancing Text-to-Image Diffusion Models with Knowledge Graph-Based RAG (Published: 2024-12-12)
Authors: Kavana Venkatesh, Yusuf Dalva, Ismini Lourentzou, Pinar Yanardag
Links:
Keywords: Control, text-to-image, FLUX, image editing
- FluxSpace: Disentangled Semantic Editing in Rectified Flow Transformers (Published: 2024-12-12)
Authors: Yusuf Dalva, Kavana Venkatesh, Pinar Yanardag
Links:
Keywords: FLUX, image generation, image editing, Control, rectified flow
- Multimodal Latent Language Modeling with Next-Token Diffusion (Published: 2024-12-11)
Authors: Yutao Sun, Hangbo Bao, Wenhui Wang, Zhiliang Peng, Li Dong, Shaohan Huang, Jianyong Wang, Furu Wei
Links:
Keywords: image generation, diffusion transformer
- FlexDiT: Dynamic Token Density Control for Diffusion Transformer (Published: 2024-12-08)
Authors: Shuning Chang, Pichao Wang, Jiasheng Tang, Yi Yang
Links:
Keywords: video generation, Control, image generation, diffusion transformer, text-to-image
- MotionStone: Decoupled Motion Intensity Modulation with Diffusion Transformer for Image-to-Video Generation (Published: 2024-12-08)
Authors: Shuwei Shi, Biao Gong, Xi Chen, Dandan Zheng, Shuai Tan, Zizheng Yang, Yuyuan Li, Jingwen He, Kecheng Zheng, Jingdong Chen, Ming Yang, Yinqiang Zheng
Links:
Keywords: Control, diffusion transformer, video generation
- Self-Guidance: Boosting Flow and Diffusion Generation on Their Own (Published: 2024-12-08)
Authors: Tiancheng Li, Weijian Luo, Zhiyang Chen, Liyuan Ma, Guo-Jun Qi
Links:
Keywords: FLUX, video generation, image generation, diffusion transformer, text-to-image
- Language-Guided Image Tokenization for Generation (Published: 2024-12-08)
Authors: Kaiwen Zha, Lijun Yu, Alireza Fathi, David A. Ross, Cordelia Schmid, Dina Katabi, Xiuye Gu
Links:
Keywords: image generation, text-to-image, diffusion transformer
- Mind the Time: Temporally-Controlled Multi-Event Video Generation (Published: 2024-12-06)
Authors: Ziyi Wu, Aliaksandr Siarohin, Willi Menapace, Ivan Skorokhodov, Yuwei Fang, Varnith Chordia, Igor Gilitschenski, Sergey Tulyakov
Links:
Keywords: Control, diffusion transformer, video generation
- CreatiLayout: Siamese Multimodal Diffusion Transformer for Creative Layout-to-Image Generation (Published: 2024-12-05)
Authors: Hui Zhang, Dexiang Hong, Tingwei Gao, Yitong Wang, Jie Shao, Xinglong Wu, Zuxuan Wu, Yu-Gang Jiang
Links:
Keywords: image generation, Control, diffusion transformer, Controllable
- Navigation World Models (Published: 2024-12-04)
Authors: Amir Bar, Gaoyue Zhou, Danny Tran, Trevor Darrell, Yann LeCun
Links:
Keywords: Control, diffusion transformer, video generation, Controllable
- Seeing Beyond Views: Multi-View Driving Scene Video Generation with Holistic Attention (Published: 2024-12-04)
Authors: Hannan Lu, Xiaohe Wu, Shudong Wang, Xiameng Qin, Xinyu Zhang, Junyu Han, Wangmeng Zuo, Ji Tao
Links:
Keywords: Control, diffusion transformer, video generation
- Panoptic Diffusion Models: co-generation of images and segmentation maps (Published: 2024-12-04)
Authors: Yinghan Long, Kaushik Roy
Links:
Keywords: image generation, Control, diffusion transformer
- Generative Photography: Scene-Consistent Camera Control for Realistic Text-to-Image Synthesis (Published: 2024-12-03)
Authors: Yu Yuan, Xijun Wang, Yichen Sheng, Prateek Chennuri, Xingguang Zhang, Stanley Chan
Links:
Keywords: image generation, text-to-image, FLUX, Control
- World-consistent Video Diffusion with Explicit 3D Modeling (Published: 2024-12-02)
Authors: Qihang Zhang, Shuangfei Zhai, Miguel Angel Bautista, Kevin Miao, Alexander Toshev, Joshua Susskind, Jiatao Gu
Links:
Keywords: image generation, Control, diffusion transformer, video generation
- CPA: Camera-pose-awareness Diffusion Transformer for Video Generation (Published: 2024-12-02)
Authors: Yuelei Wang, Jian Zhang, Pengtao Jiang, Hao Zhang, Jinwei Chen, Bo Li
Links:
Keywords: Control, diffusion transformer, video generation, Controllable
- TinyFusion: Diffusion Transformers Learned Shallow (Published: 2024-12-02)
Authors: Gongfan Fang, Kunjun Li, Xinyin Ma, Xinchao Wang
Links:
Keywords: image generation, diffusion transformer
- AMO Sampler: Enhancing Text Rendering with Overshooting (Published: 2024-11-28)
Authors: Xixi Hu, Keyang Xu, Bo Liu, Qiang Liu, Hongliang Fei
Links:
Keywords: FLUX, Control, image generation, text-to-image, rectified flow
- AC3D: Analyzing and Improving 3D Camera Control in Video Diffusion Transformers (Published: 2024-11-27)
Authors: Sherwin Bahmani, Ivan Skorokhodov, Guocheng Qian, Aliaksandr Siarohin, Willi Menapace, Andrea Tagliasacchi, David B. Lindell, Sergey Tulyakov
Links:
Keywords: Control, diffusion transformer, video generation
- Prediction with Action: Visual Policy Learning via Joint Denoising Process (Published: 2024-11-27)
Authors: Yanjiang Guo, Yucheng Hu, Jianke Zhang, Yen-Jen Wang, Xiaoyu Chen, Chaochao Lu, Jianyu Chen
Links:
Keywords: image generation, Control, diffusion transformer, image editing
- Type-R: Automatically Retouching Typos for Text-to-Image Generation (Published: 2024-11-27)
Authors: Wataru Shimoda, Naoto Inoue, Daichi Haraguchi, Hayato Mitani, Seichi Uchida, Kota Yamaguchi
Links:
Keywords: image generation, text-to-image, FLUX
- Accelerating Vision Diffusion Transformers with Skip Branches (Published: 2024-11-26)
Authors: Guanjie Chen, Xinyu Zhao, Yucheng Zhou, Tianlong Chen, Yu Cheng
Links:
Keywords: image generation, diffusion transformer, video generation
- Identity-Preserving Text-to-Video Generation by Frequency Decomposition (Published: 2024-11-26)
Authors: Shenghai Yuan, Jinfa Huang, Xianyi He, Yunyuan Ge, Yujun Shi, Liuhan Chen, Jiebo Luo, Li Yuan
Links:
Keywords: Control, diffusion transformer, video generation, Controllable
- OminiControl: Minimal and Universal Control for Diffusion Transformer (Published: 2024-11-22)
Authors: Zhenxiong Tan, Songhua Liu, Xingyi Yang, Qiaochu Xue, Xinchao Wang
Links:
Keywords: Control, diffusion transformer
- HeadRouter: A Training-free Image Editing Framework for MM-DiTs by Adaptively Routing Attention Heads (Published: 2024-11-22)
Authors: Yu Xu, Fan Tang, Juan Cao, Yuxin Zhang, Xiaoyu Kong, Jintao Li, Oliver Deussen, Tong-Yee Lee
Links:
Keywords: image generation, diffusion transformer, image editing
- Stable Flow: Vital Layers for Training-Free Image Editing (Published: 2024-11-21)
Authors: Omri Avrahami, Or Patashnik, Ohad Fried, Egor Nemchinov, Kfir Aberman, Dani Lischinski, Daniel Cohen-Or
Links:
Keywords: Control, diffusion transformer, inversion, image editing
- Oscillation Inversion: Understand the structure of Large Flow Model through the Lens of Inversion Method (Published: 2024-11-17)
Authors: Yan Zheng, Zhenxiao Liang, Xiaoyan Cong, Lanqing Guo, Yuehao Wang, Peihao Wang, Zhangyang Wang
Links:
Keywords: FLUX, image editing, inversion, text-to-image, rectified flow
- SmoothCache: A Universal Inference Acceleration Technique for Diffusion Transformers (Published: 2024-11-15)
Authors: Joseph Liu, Joshua Geddes, Ziyu Guo, Haomiao Jiang, Mahesh Kumar Nandwana
Links:
Keywords: image generation, diffusion transformer
- Latent Space Disentanglement in Diffusion Transformers Enables Precise Zero-shot Semantic Editing (Published: 2024-11-12)
Authors: Zitao Shuai, Chenwei Wu, Zhengxu Tang, Bowen Song, Liyue Shen
Links:
Keywords: image generation, Control, diffusion transformer, image editing
- DiT4Edit: Diffusion Transformer for Image Editing (Published: 2024-11-05)
Authors: Kunyu Feng, Yue Ma, Bingyuan Wang, Chenyang Qi, Haozhe Chen, Qifeng Chen, Zeyu Wang
Links:
Keywords: image generation, diffusion transformer, inversion, image editing, Control
- Adaptive Caching for Faster Video Generation with Diffusion Transformers (Published: 2024-11-04)
Authors: Kumara Kahatapitiya, Haozhe Liu, Sen He, Ding Liu, Menglin Jia, Chenyang Zhang, Michael S. Ryoo, Tian Xie
Links:
Keywords: Control, diffusion transformer, video generation
- Training-free Regional Prompting for Diffusion Transformers (Published: 2024-11-04)
Authors: Anthony Chen, Jianjin Xu, Wenzhao Zheng, Gaole Dai, Yida Wang, Renrui Zhang, Haofan Wang, Shanghang Zhang
Links:
Keywords: image generation, text-to-image, diffusion transformer, FLUX
- GameGen-X: Interactive Open-world Game Video Generation (Published: 2024-11-01)
Authors: Haoxuan Che, Xuanhua He, Quande Liu, Cheng Jin, Hao Chen
Links:
Keywords: Control, diffusion transformer, video generation
- In-Context LoRA for Diffusion Transformers (Published: 2024-10-31)
Authors: Lianghua Huang, Wei Wang, Zhi-Fan Wu, Yupeng Shi, Huanzhang Dou, Chen Liang, Yutong Feng, Yu Liu, Jingren Zhou
Links:
Keywords: image generation, text-to-image, diffusion transformer
Showing the latest 50 out of 62 papers
- Learnings from Scaling Visual Tokenizers for Reconstruction and Generation (Published: 2025-01-16)
Authors: Philippe Hansen-Estruch, David Yan, Ching-Yao Chung, Orr Zohar, Jialiang Wang, Tingbo Hou, Tao Xu, Sriram Vishwanath, Peter Vajda, Xinlei Chen
Links:
Keywords: image generation, diffusion transformer, video generation
- Multi-subject Open-set Personalization in Video Generation (Published: 2025-01-10)
Authors: Tsai-Shien Chen, Aliaksandr Siarohin, Willi Menapace, Yuwei Fang, Kwot Sin Lee, Ivan Skorokhodov, Kfir Aberman, Jun-Yan Zhu, Ming-Hsuan Yang, Sergey Tulyakov
Links:
Keywords: diffusion transformer, video generation
- ConceptMaster: Multi-Concept Video Customization on Diffusion Transformer Models Without Test-Time Tuning (Published: 2025-01-08)
Authors: Yuzhou Huang, Ziyang Yuan, Quande Liu, Qiulin Wang, Xintao Wang, Ruimao Zhang, Pengfei Wan, Di Zhang, Kun Gai
Links:
Keywords: diffusion transformer, video generation
- Magic Mirror: ID-Preserved Video Generation in Video Diffusion Transformers (Published: 2025-01-07)
Authors: Yuechen Zhang, Yaoyang Liu, Bin Xia, Bohao Peng, Zexin Yan, Eric Lo, Jiaya Jia
Links:
Keywords: diffusion transformer, video generation
- TransPixar: Advancing Text-to-Video Generation with Transparency (Published: 2025-01-06)
Authors: Luozhou Wang, Yijun Li, Zhifei Chen, Jui-Hsien Wang, Zhifei Zhang, He Zhang, Zhe Lin, Yingcong Chen
Links:
Keywords: diffusion transformer, video generation
- GS-DiT: Advancing Video Generation with Pseudo 4D Gaussian Fields through Efficient Dense 3D Point Tracking (Published: 2025-01-05)
Authors: Weikang Bian, Zhaoyang Huang, Xiaoyu Shi, Yijin Li, Fu-Yun Wang, Hongsheng Li
Links:
Keywords: Control, diffusion transformer, video generation
- Open-Sora: Democratizing Efficient Video Production for All (Published: 2024-12-29)
Authors: Zangwei Zheng, Xiangyu Peng, Tianji Yang, Chenhui Shen, Shenggui Li, Hongxin Liu, Yukun Zhou, Tianyi Li, Yang You
Links:
Keywords: image generation, text-to-image, diffusion transformer, video generation
- Accelerating Diffusion Transformers with Dual Feature Caching (Published: 2024-12-25)
Authors: Chang Zou, Evelyn Zhang, Runlin Guo, Haohang Xu, Conghui He, Xuming Hu, Linfeng Zhang
Links:
Keywords: diffusion transformer, video generation
- DiTCtrl: Exploring Attention Control in Multi-Modal Diffusion Transformer for Tuning-Free Multi-Prompt Longer Video Generation (Published: 2024-12-24)
Authors: Minghong Cai, Xiaodong Cun, Xiaoyu Li, Wenze Liu, Zhaoyang Zhang, Yong Zhang, Ying Shan, Xiangyu Yue
Links:
Keywords: Control, diffusion transformer, video generation, video editing
- FFA Sora, video generation as fundus fluorescein angiography simulator (Published: 2024-12-23)
Authors: Xinyuan Wu, Lili Wang, Ruoyu Chen, Bowen Liu, Weiyi Zhang, Xi Yang, Yifan Feng, Mingguang He, Danli Shi
Links:
Keywords: diffusion transformer, video generation
- Video Diffusion Transformers are In-Context Learners (Published: 2024-12-14)
Authors: Zhengcong Fei, Di Qiu, Changqian Yu, Debang Li, Mingyuan Fan, Xiang Wen
Links:
Keywords: Control, diffusion transformer, video generation, Controllable
- LinGen: Towards High-Resolution Minute-Length Text-to-Video Generation with Linear Computational Complexity (Published: 2024-12-13)
Authors: Hongjie Wang, Chih-Yao Ma, Yen-Cheng Liu, Ji Hou, Tao Xu, Jialiang Wang, Felix Juefei-Xu, Yaqiao Luo, Peizhao Zhang, Tingbo Hou, Peter Vajda, Niraj K. Jha, Xiaoliang Dai
Links:
Keywords: diffusion transformer, video generation
- MSC: Multi-Scale Spatio-Temporal Causal Attention for Autoregressive Video Diffusion (Published: 2024-12-13)
Authors: Xunnong Xu, Mengying Cao
Links:
Keywords: Control, diffusion transformer, video generation
- From Slow Bidirectional to Fast Autoregressive Video Diffusion Models (Published: 2024-12-10)
Authors: Tianwei Yin, Qiang Zhang, Richard Zhang, William T. Freeman, Fredo Durand, Eli Shechtman, Xun Huang
Links:
Keywords: diffusion transformer, video generation
- STIV: Scalable Text and Image Conditioned Video Generation (Published: 2024-12-10)
Authors: Zongyu Lin, Wei Liu, Chen Chen, Jiasen Lu, Wenze Hu, Tsu-Jui Fu, Jesse Allardice, Zhengfeng Lai, Liangchen Song, Bowen Zhang, Cha Chen, Yiran Fei, Yifan Jiang, Lezhi Li, Yizhou Sun, Kai-Wei Chang, Yinfei Yang
Links:
Keywords: diffusion transformer, video generation
- ACDiT: Interpolating Autoregressive Conditional Modeling and Diffusion Transformer (Published: 2024-12-10)
Authors: Jinyi Hu, Shengding Hu, Yuxuan Song, Yufei Huang, Mingxuan Wang, Hao Zhou, Zhiyuan Liu, Wei-Ying Ma, Maosong Sun
Links:
Keywords: diffusion transformer, video generation
- FlexDiT: Dynamic Token Density Control for Diffusion Transformer (Published: 2024-12-08)
Authors: Shuning Chang, Pichao Wang, Jiasheng Tang, Yi Yang
Links:
Keywords: video generation, Control, image generation, diffusion transformer, text-to-image
- MotionStone: Decoupled Motion Intensity Modulation with Diffusion Transformer for Image-to-Video Generation (Published: 2024-12-08)
Authors: Shuwei Shi, Biao Gong, Xi Chen, Dandan Zheng, Shuai Tan, Zizheng Yang, Yuyuan Li, Jingwen He, Kecheng Zheng, Jingdong Chen, Ming Yang, Yinqiang Zheng
Links:
Keywords: Control, diffusion transformer, video generation
- Self-Guidance: Boosting Flow and Diffusion Generation on Their Own (Published: 2024-12-08)
Authors: Tiancheng Li, Weijian Luo, Zhiyang Chen, Liyuan Ma, Guo-Jun Qi
Links:
Keywords: FLUX, video generation, image generation, diffusion transformer, text-to-image
- Mind the Time: Temporally-Controlled Multi-Event Video Generation (Published: 2024-12-06)
Authors: Ziyi Wu, Aliaksandr Siarohin, Willi Menapace, Ivan Skorokhodov, Yuwei Fang, Varnith Chordia, Igor Gilitschenski, Sergey Tulyakov
Links:
Keywords: Control, diffusion transformer, video generation
- Navigation World Models (Published: 2024-12-04)
Authors: Amir Bar, Gaoyue Zhou, Danny Tran, Trevor Darrell, Yann LeCun
Links:
Keywords: Control, diffusion transformer, video generation, Controllable
- Seeing Beyond Views: Multi-View Driving Scene Video Generation with Holistic Attention (Published: 2024-12-04)
Authors: Hannan Lu, Xiaohe Wu, Shudong Wang, Xiameng Qin, Xinyu Zhang, Junyu Han, Wangmeng Zuo, Ji Tao
Links:
Keywords: Control, diffusion transformer, video generation
- SyncFlow: Toward Temporally Aligned Joint Audio-Video Generation from Text (Published: 2024-12-03)
Authors: Haohe Liu, Gael Le Lan, Xinhao Mei, Zhaoheng Ni, Anurag Kumar, Varun Nagaraja, Wenwu Wang, Mark D. Plumbley, Yangyang Shi, Vikas Chandra
Links:
Keywords: video generation
- World-consistent Video Diffusion with Explicit 3D Modeling (Published: 2024-12-02)
Authors: Qihang Zhang, Shuangfei Zhai, Miguel Angel Bautista, Kevin Miao, Alexander Toshev, Joshua Susskind, Jiatao Gu
Links:
Keywords: image generation, Control, diffusion transformer, video generation
- CPA: Camera-pose-awareness Diffusion Transformer for Video Generation (Published: 2024-12-02)
Authors: Yuelei Wang, Jian Zhang, Pengtao Jiang, Hao Zhang, Jinwei Chen, Bo Li
Links:
Keywords: Control, diffusion transformer, video generation, Controllable
- OpenHumanVid: A Large-Scale High-Quality Dataset for Enhancing Human-Centric Video Generation (Published: 2024-11-28)
Authors: Hui Li, Mingwang Xu, Yun Zhan, Shan Mu, Jiaye Li, Kaihui Cheng, Yuxuan Chen, Tan Chen, Mao Ye, Jingdong Wang, Siyu Zhu
Links:
Keywords: diffusion transformer, video generation
- AC3D: Analyzing and Improving 3D Camera Control in Video Diffusion Transformers (Published: 2024-11-27)
Authors: Sherwin Bahmani, Ivan Skorokhodov, Guocheng Qian, Aliaksandr Siarohin, Willi Menapace, Andrea Tagliasacchi, David B. Lindell, Sergey Tulyakov
Links:
Keywords: Control, diffusion transformer, video generation
- Accelerating Vision Diffusion Transformers with Skip Branches (Published: 2024-11-26)
Authors: Guanjie Chen, Xinyu Zhao, Yucheng Zhou, Tianlong Chen, Yu Cheng
Links:
Keywords: image generation, diffusion transformer, video generation
- Identity-Preserving Text-to-Video Generation by Frequency Decomposition (Published: 2024-11-26)
Authors: Shenghai Yuan, Jinfa Huang, Xianyi He, Yunyuan Ge, Yujun Shi, Liuhan Chen, Jiebo Luo, Li Yuan
Links:
Keywords: Control, diffusion transformer, video generation, Controllable
- LetsTalk: Latent Diffusion Transformer for Talking Video Synthesis (Published: 2024-11-24)
Authors: Haojie Zhang, Zhihao Liang, Ruibo Fu, Zhengqi Wen, Xuefei Liu, Chenxing Li, Jianhua Tao, Yaling Liang
Links:
Keywords: diffusion transformer, video generation
- TaQ-DiT: Time-aware Quantization for Diffusion Transformers (Published: 2024-11-21)
Authors: Xinyan Liu, Huihong Shi, Yang Xu, Zhongfeng Wang
Links:
Keywords: diffusion transformer, video generation
- PoM: Efficient Image and Video Generation with the Polynomial Mixer (Published: 2024-11-19)
Authors: David Picard, Nicolas Dufour
Links:
Keywords: diffusion transformer, video generation
- Taming Rectified Flow for Inversion and Editing (Published: 2024-11-07)
Authors: Jiangshan Wang, Junfu Pu, Zhongang Qi, Jiayi Guo, Yue Ma, Nisha Huang, Yuxin Chen, Xiu Li, Ying Shan
Links:
Keywords: FLUX, video generation, video editing, diffusion transformer, inversion, rectified flow
- Adaptive Caching for Faster Video Generation with Diffusion Transformers (Published: 2024-11-04)
Authors: Kumara Kahatapitiya, Haozhe Liu, Sen He, Ding Liu, Menglin Jia, Chenyang Zhang, Michael S. Ryoo, Tian Xie
Links:
Keywords: Control, diffusion transformer, video generation
- GameGen-X: Interactive Open-world Game Video Generation (Published: 2024-11-01)
Authors: Haoxuan Che, Xuanhua He, Quande Liu, Cheng Jin, Hao Chen
Links:
Keywords: Control, diffusion transformer, video generation
- ARLON: Boosting Diffusion Transformers with Autoregressive Models for Long Video Generation (Published: 2024-10-27)
Authors: Zongyi Li, Shujie Hu, Shujie Liu, Long Zhou, Jeongsoo Choi, Lingwei Meng, Xun Guo, Jinyu Li, Hefei Ling, Furu Wei
Links:
Keywords: diffusion transformer, video generation
- Boosting Camera Motion Control for Video Diffusion Transformers (Published: 2024-10-14)
Authors: Soon Yau Cheong, Duygu Ceylan, Armin Mustafa, Andrew Gilbert, Chun-Hao Paul Huang
Links:
Keywords: Control, diffusion transformer, video generation
- Scaling Laws For Diffusion Transformers (Published: 2024-10-10)
Authors: Zhengyang Liang, Hao He, Ceyuan Yang, Bo Dai
Links:
Keywords: image generation, text-to-image, diffusion transformer, video generation
- Pyramidal Flow Matching for Efficient Video Generative Modeling (Published: 2024-10-08)
Authors: Yang Jin, Zhicheng Sun, Ningyuan Li, Kun Xu, Kun Xu, Hao Jiang, Nan Zhuang, Quzhe Huang, Yang Song, Yadong Mu, Zhouchen Lin
Links:
Keywords: diffusion transformer, video generation
- Accelerating Diffusion Transformers with Token-wise Feature Caching (Published: 2024-10-05)
Authors: Chang Zou, Xuyang Liu, Ting Liu, Siteng Huang, Linfeng Zhang
Links:
Keywords: diffusion transformer, video generation
- LoVA: Long-form Video-to-Audio Generation (Published: 2024-09-23)
Authors: Xin Cheng, Xihua Wang, Yihan Wu, Yuyue Wang, Ruihua Song
Links:
Keywords: diffusion transformer, video editing
- Qihoo-T2X: An Efficient Proxy-Tokenized Diffusion Transformer for Text-to-Any-Task (Published: 2024-09-06)
Authors: Jing Wang, Ao Ma, Jiasong Feng, Dawei Leng, Yuhui Yin, Xiaodan Liang
Links:
Keywords: diffusion transformer, video generation
- DiVE: DiT-based Video Generation with Enhanced Control (Published: 2024-09-03)
Authors: Junpeng Jiang, Gangyi Hong, Lijun Zhou, Enhui Ma, Hengtong Hu, Xia Zhou, Jie Xiang, Fan Liu, Kaicheng Yu, Haiyang Sun, Kun Zhan, Peng Jia, Miao Zhang
Links:
Keywords: Control, diffusion transformer, video generation, Controllable
- VQ4DiT: Efficient Post-Training Vector Quantization for Diffusion Transformers (Published: 2024-08-30)
Authors: Juncan Deng, Shuaiting Li, Zeyu Wang, Hong Gu, Kedong Xu, Kejie Huang
Links:
Keywords: image generation, diffusion transformer, video generation
- xGen-VideoSyn-1: High-fidelity Text-to-Video Synthesis with Compressed Representations (Published: 2024-08-22)
Authors: Can Qin, Congying Xia, Krithika Ramakrishnan, Michael Ryoo, Lifu Tu, Yihao Feng, Manli Shu, Honglu Zhou, Anas Awadalla, Jun Wang, Senthil Purushwalkam, Le Xue, Yingbo Zhou, Huan Wang, Silvio Savarese, Juan Carlos Niebles, Zeyuan Chen, Ran Xu, Caiming Xiong
Links:
Keywords: diffusion transformer, video generation
- CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer (Published: 2024-08-12)
Authors: Zhuoyi Yang, Jiayan Teng, Wendi Zheng, Ming Ding, Shiyu Huang, Jiazheng Xu, Yuanming Yang, Wenyi Hong, Xiaohan Zhang, Guanyu Feng, Da Yin, Xiaotao Gu, Yuxuan Zhang, Weihan Wang, Yean Cheng, Ting Liu, Bin Xu, Yuxiao Dong, Jie Tang
Links:
Keywords: diffusion transformer, video generation
- Tora: Trajectory-oriented Diffusion Transformer for Video Generation (Published: 2024-07-31)
Authors: Zhenghao Zhang, Junchao Liao, Menghao Li, Zuozhuo Dai, Bingxue Qiu, Siyu Zhu, Long Qin, Weizhi Wang
Links:
Keywords: Control, diffusion transformer, video generation, Controllable
- MotionCraft: Crafting Whole-Body Motion with Plug-and-Play Multimodal Controls (Published: 2024-07-30)
Authors: Yuxuan Bian, Ailing Zeng, Xuan Ju, Xian Liu, Zhaoyang Zhang, Wei Liu, Qiang Xu
Links:
Keywords: Control, diffusion transformer, video generation
- Diffusion Transformer Captures Spatial-Temporal Dependencies: A Theory for Gaussian Process Data (Published: 2024-07-23)
Authors: Hengyu Fu, Zehao Dou, Jiawei Guo, Mengdi Wang, Minshuo Chen
Links:
Keywords: diffusion transformer, video generation
- Anchored Diffusion for Video Face Reenactment (Published: 2024-07-21)
Authors: Idan Kligvasser, Regev Cohen, George Leifman, Ehud Rivlin, Michael Elad
Links:
Keywords: diffusion transformer, video generation
- Scalable Diffusion Models with Transformers (ICCV 2023)
Authors: William Peebles, Saining Xie
Code: 🔗 GitHub
Keywords: diffusion model, transformer architecture
Feel free to submit Pull Requests to improve this list! Please follow these formats:
- Paper entry format:
**[Paper Title](link)** - Brief description
- Project entry format:
[Project Name](link) - Project description
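For contributors who prefer to generate entries rather than type them, the two formats above can be sketched as small Python helpers. This is only an illustration: the function names `paper_entry` and `project_entry` and the example URL are hypothetical and are not part of this repository's tooling.

```python
def paper_entry(title: str, link: str, description: str) -> str:
    """Return a paper entry: **[Paper Title](link)** - Brief description."""
    return f"**[{title}]({link})** - {description}"


def project_entry(name: str, link: str, description: str) -> str:
    """Return a project entry: [Project Name](link) - Project description."""
    return f"[{name}]({link}) - {description}"


if __name__ == "__main__":
    # Placeholder title, URL, and description, used purely for illustration.
    print(paper_entry("Paper Title", "https://example.com/paper", "Brief description"))
    print(project_entry("Project Name", "https://example.com/project", "Project description"))
```

Pasting the printed lines into a Pull Request keeps new entries consistent with the formats listed above.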