MotionDiffuse: Text-Driven Human Motion Generation with Diffusion Model | 31 Aug 2022 | TPAMI'2024 | | |
All are Worth Words: A ViT Backbone for Diffusion Models | 25 Sep 2022 | CVPR'2023 | | |
Learning to Learn with Generative Models of Neural Network Checkpoints | 26 Sep 2022 | arXiv | | |
Scalable Diffusion Models with Transformers | 19 Dec 2022 | ICCV'2023 | | |
Exploring Vision Transformers as Diffusion Learners | 28 Dec 2022 | arXiv | | |
DLT: Conditioned layout generation with Joint Discrete-Continuous Diffusion Layout Transformer | 07 Mar 2023 | ICCV'2023 | | |
Masked Diffusion Transformer is a Strong Image Synthesizer | 25 Mar 2023 | ICCV'2023 | | |
Diffusion Transformer for Adaptive Text-to-Speech | 03 May 2023 | Interspeech'2023 | | |
VDT: General-purpose Video Diffusion Transformers via Mask Modeling | 22 May 2023 | ICLR'2024 | | |
ViT-TTS: Visual Text-to-Speech with Scalable Diffusion Transformer | 22 May 2023 | EMNLP'2023 | | |
U-DiT TTS: U-Diffusion Vision Transformer for Text-to-Speech | 22 May 2023 | arXiv | | |
Fast Training of Diffusion Models with Masked Transformers | 15 Jun 2023 | TMLR'2024 | | |
DiT-3D: Exploring Plain Diffusion Transformers for 3D Shape Generation | 04 Jul 2023 | NeurIPS'2023 | | |
Large-Vocabulary 3D Diffusion Model with Transformer | 14 Sep 2023 | ICLR'2024 | | |
Cartoondiff: Training-free Cartoon Image Generation with Diffusion Transformer Models | 15 Sep 2023 | arXiv | | |
PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis | 30 Sep 2023 | ICLR'2024 | | |
Dolfin: Diffusion Layout Transformers without Autoencoder | 25 Oct 2023 | arXiv | | |
Mapache: Masked parallel transformer for advanced speech editing and synthesis | 03 Dec 2023 | ICASSP'2024 | | |
DiffiT: Diffusion Vision Transformers for Image Generation | 04 Dec 2023 | arXiv | | |
GenTron: Delving Deep into Diffusion Transformers for Image and Video Generation | 07 Dec 2023 | CVPR'2024 | | |
Photorealistic Video Generation with Diffusion Models | 11 Dec 2023 | arXiv | | |
DiT-Head: High-Resolution Talking Head Synthesis using Diffusion Transformers | 11 Dec 2023 | arXiv | | |
Fast Training of Diffusion Transformer with Extreme Masking for 3D Point Clouds Generation | 12 Dec 2023 | arXiv | | |
NViST: In the Wild New View Synthesis from a Single Image with Transformers | 13 Dec 2023 | arXiv | | |
TransDDPM: Transformer-Based Denoising Diffusion Probabilistic Model for Image Restoration | 28 Dec 2023 | PRCV'2023 | | |
Latte: Latent Diffusion Transformer for Video Generation | 05 Jan 2024 | arXiv | | |
PIXART-δ: Fast and Controllable Image Generation with Latent Consistency Models | 10 Jan 2024 | arXiv | | |
SiT: Exploring Flow and Diffusion-based Generative Models with Scalable Interpolant Transformers | 16 Jan 2024 | arXiv | | |
Scalable High-Resolution Pixel-Space Image Synthesis with Hourglass Diffusion Transformers | 21 Jan 2024 | arXiv | | |
Cross-view Masked Diffusion Transformers for Person Image Synthesis | 02 Feb 2024 | arXiv | | |
DiffsFormer: A Diffusion Transformer on Stock Factor Augmentation | 05 Feb 2024 | arXiv | | |
Sora | 15 Feb 2024 | OpenAI | | |
SDiT: Spiking Diffusion Model with Transformer | 18 Feb 2024 | arXiv | | |
FiT: Flexible Vision Transformer for Diffusion Model | 19 Feb 2024 | arXiv | | |
Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis | 22 Feb 2024 | arXiv | | |
OpenDiT | 26 Feb 2024 | GitHub | | |
FineDiffusion: Scaling up Diffusion Models for Fine-grained Image Generation with 10,000 Classes | 28 Feb 2024 | arXiv | | |
Open-Sora-Plan | 01 Mar 2024 | GitHub | | |
Stable Diffusion 3: Research Paper | 05 Mar 2024 | Stability AI | | |