Awesome Diffusion Transformers

| Title | Initial Date | Venue | Task | Resource |
| --- | --- | --- | --- | --- |
| MotionDiffuse: Text-Driven Human Motion Generation with Diffusion Model | 31 Aug 2022 | TPAMI'2024 |  |  |
| All are Worth Words: A ViT Backbone for Diffusion Models | 25 Sep 2022 | CVPR'2023 |  |  |
| Learning to Learn with Generative Models of Neural Network Checkpoints | 26 Sep 2022 | arXiv |  |  |
| Scalable Diffusion Models with Transformers | 19 Dec 2022 | ICCV'2023 |  |  |
| Exploring Vision Transformers as Diffusion Learners | 28 Dec 2022 | arXiv |  |  |
| DLT: Conditioned layout generation with Joint Discrete-Continuous Diffusion Layout Transformer | 07 Mar 2023 | ICCV'2023 |  |  |
| Masked Diffusion Transformer is a Strong Image Synthesizer | 25 Mar 2023 | ICCV'2023 |  |  |
| Diffusion Transformer for Adaptive Text-to-Speech | 03 May 2023 | Interspeech'2023 |  |  |
| VDT: General-purpose Video Diffusion Transformers via Mask Modeling | 22 May 2023 | ICLR'2024 |  |  |
| ViT-TTS: Visual Text-to-Speech with Scalable Diffusion Transformer | 22 May 2023 | EMNLP'2023 |  |  |
| U-DiT TTS: U-Diffusion Vision Transformer for Text-to-Speech | 22 May 2023 | arXiv |  |  |
| Fast Training of Diffusion Models with Masked Transformers | 15 Jun 2023 | TMLR |  |  |
| DiT-3D: Exploring Plain Diffusion Transformers for 3D Shape Generation | 04 Jul 2023 | NeurIPS'2023 |  |  |
| Large-Vocabulary 3D Diffusion Model with Transformer | 14 Sep 2023 | ICLR'2024 |  |  |
| Cartoondiff: Training-free Cartoon Image Generation with Diffusion Transformer Models | 15 Sep 2023 | arXiv |  |  |
| PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis | 30 Sep 2023 | ICLR'2024 |  |  |
| Dolfin: Diffusion Layout Transformers without Autoencoder | 25 Oct 2023 | arXiv |  |  |
| Mapache: Masked parallel transformer for advanced speech editing and synthesis | 03 Dec 2023 | ICASSP'2024 |  |  |
| DiffiT: Diffusion Vision Transformers for Image Generation | 04 Dec 2023 | arXiv |  |  |
| GenTron: Delving Deep into Diffusion Transformers for Image and Video Generation | 07 Dec 2023 | CVPR'2024 |  |  |
| Photorealistic Video Generation with Diffusion Models | 11 Dec 2023 | arXiv |  |  |
| DiT-Head: High-Resolution Talking Head Synthesis using Diffusion Transformers | 11 Dec 2023 | arXiv |  |  |
| Fast Training of Diffusion Transformer with Extreme Masking for 3D Point Clouds Generation | 12 Dec 2023 | arXiv |  |  |
| NViST: In the Wild New View Synthesis from a Single Image with Transformers | 13 Dec 2023 | arXiv |  |  |
| TransDDPM: Transformer-Based Denoising Diffusion Probabilistic Model for Image Restoration | 28 Dec 2023 | PRCV'2023 |  |  |
| Latte: Latent Diffusion Transformer for Video Generation | 05 Jan 2024 | arXiv |  |  |
| PIXART-δ: Fast and Controllable Image Generation with Latent Consistency Models | 10 Jan 2024 | arXiv |  |  |
| SiT: Exploring Flow and Diffusion-based Generative Models with Scalable Interpolant Transformers | 16 Jan 2024 | arXiv |  |  |
| Scalable High-Resolution Pixel-Space Image Synthesis with Hourglass Diffusion Transformers | 21 Jan 2024 | arXiv |  |  |
| Cross-view Masked Diffusion Transformers for Person Image Synthesis | 02 Feb 2024 | arXiv |  |  |
| DiffsFormer: A Diffusion Transformer on Stock Factor Augmentation | 05 Feb 2024 | arXiv |  |  |
| Sora | 15 Feb 2024 | OpenAI |  |  |
| SDiT: Spiking Diffusion Model with Transformer | 18 Feb 2024 | arXiv |  |  |
| FiT: Flexible Vision Transformer for Diffusion Model | 19 Feb 2024 | arXiv |  |  |
| Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis | 22 Feb 2024 | arXiv |  |  |
| OpenDiT | 26 Feb 2024 | GitHub |  |  |
| FineDiffusion: Scaling up Diffusion Models for Fine-grained Image Generation with 10,000 Classes | 28 Feb 2024 | arXiv |  |  |
| Open-Sora-Plan | 01 Mar 2024 | GitHub |  |  |
| Stable Diffusion 3: Research Paper | 05 Mar 2024 | Stability AI |  |  |

Contributing

Your contributions are always welcome!

Feel free to add or update entries in the data.json file.

This README and the website will be updated automatically, powered by GitHub Actions.
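
To give a rough idea of how a data file can drive the table, here is a minimal Python sketch. The entry fields (`title`, `date`, `venue`, `task`, `resource`) and the `render_table` helper are assumptions chosen to mirror the table's column headers; the actual schema of data.json and the actual generation script in this repository may look different, so check an existing entry before adding yours.

```python
import json

# Hypothetical entry format: the keys mirror the table's column headers.
# The real data.json schema may differ.
SAMPLE = """
[
  {"title": "Scalable Diffusion Models with Transformers",
   "date": "19 Dec 2022",
   "venue": "ICCV'2023",
   "task": "",
   "resource": ""}
]
"""

# (key in the JSON entry, column label in the rendered table)
COLUMNS = [
    ("title", "Title"),
    ("date", "Initial Date"),
    ("venue", "Venue"),
    ("task", "Task"),
    ("resource", "Resource"),
]


def render_table(entries):
    """Render a list of entry dicts as a pipe-delimited table string."""
    header = "| " + " | ".join(label for _, label in COLUMNS) + " |"
    separator = "| " + " | ".join("---" for _ in COLUMNS) + " |"
    rows = []
    for entry in entries:
        # Missing keys become empty cells; literal pipes are escaped.
        cells = [str(entry.get(key, "")).replace("|", "\\|") for key, _ in COLUMNS]
        rows.append("| " + " | ".join(cells) + " |")
    return "\n".join([header, separator] + rows)


entries = json.loads(SAMPLE)
table = render_table(entries)
print(table)
```

Keeping the entries in one data file means a new paper is a single added record, and the table layout stays consistent because it is generated rather than edited by hand.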

🚀 🚀 🚀