Diffusion Models and Representation Learning: A Survey

This repo collects and categorizes papers on diffusion models according to our survey paper, "Diffusion Models and Representation Learning: A Survey".

Given the rapid development of this field, we will continue to update both the arXiv paper and this repo.

Overview

Diffusion models are popular generative models that have attracted significant attention across a wide range of vision tasks. Because they do not rely on label annotations, they can be viewed as a particular instance of self-supervised learning. This survey explores the interplay between diffusion models and representation learning. It provides an overview of the essential aspects of diffusion models, including their mathematical foundations, popular denoising network architectures, and guidance methods. It then details the approaches connecting diffusion models and representation learning: frameworks that leverage representations learned by pre-trained diffusion models for downstream recognition tasks, and methods that use advances in representation and self-supervised learning to improve diffusion models. The survey aims to offer a comprehensive taxonomy of this interplay and to identify key open challenges and directions for future work.
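
As a rough illustration of the first direction above (reusing a pre-trained diffusion model as a feature extractor for recognition), here is a minimal sketch that is not taken from any specific paper in this list: it adds noise at a fixed timestep, runs one denoising forward pass, and pools the U-Net mid-block activations into per-image vectors suitable for, e.g., a linear probe. It assumes the Hugging Face diffusers and PyTorch APIs; the model id, timestep, empty prompt, and mean pooling are illustrative choices only.

```python
# Sketch only: extract intermediate denoising-network features from a
# pre-trained text-to-image diffusion model for a downstream recognition task.
# Model id, timestep, and pooling are illustrative assumptions, not a recipe
# prescribed by the survey or any listed paper.
import torch
from diffusers import StableDiffusionPipeline

device = "cuda" if torch.cuda.is_available() else "cpu"
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float32
).to(device)

features = {}

def hook(_module, _inputs, output):
    # Cache the U-Net mid-block activation (B, C, H, W) as the representation.
    features["mid"] = output.detach()

pipe.unet.mid_block.register_forward_hook(hook)

@torch.no_grad()
def extract_features(images, t=100, prompt=""):
    """images: (B, 3, 512, 512) tensor scaled to [-1, 1]."""
    latents = pipe.vae.encode(images.to(device)).latent_dist.sample()
    latents = latents * pipe.vae.config.scaling_factor
    noise = torch.randn_like(latents)
    timesteps = torch.full((latents.shape[0],), t, device=device, dtype=torch.long)
    noisy = pipe.scheduler.add_noise(latents, noise, timesteps)
    tokens = pipe.tokenizer(
        [prompt] * latents.shape[0], padding="max_length",
        max_length=pipe.tokenizer.model_max_length, return_tensors="pt",
    ).to(device)
    text_emb = pipe.text_encoder(tokens.input_ids)[0]
    # One denoising forward pass; the hook captures the mid-block features.
    pipe.unet(noisy, timesteps, encoder_hidden_states=text_emb)
    # Pool spatial dimensions -> one feature vector per image for a linear probe.
    return features["mid"].mean(dim=(2, 3))
```

Which block and noise level yield the most discriminative features is one of the questions studied by several of the papers below.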

Papers (listed by year)

Diffusion Models for Representation Learning

ADDP: Learning General Representations for Image Recognition and Generation with Alternating Denoising Diffusion Process
C. Tian, C. Tao, J. Dai, H. Li, Z. Li, L. Lu, X. Wang, H. Li, G. Huang, X. Zhu
ICLR 2024.

SD4Match: Learning to Prompt Stable Diffusion Model for Semantic Matching
X. Li, J. Lu, K. Han, V. A. Prisacariu
CVPR 2024.

Smooth Diffusion: Crafting Smooth Latent Spaces in Diffusion Models
J. Guo, X. Xu, Y. Pu, Z. Ni, C. Wang, M. Vasu, S. Song, G. Huang, H. Shi
CVPR 2024.

SODA: Bottleneck Diffusion Models for Representation Learning
D. A. Hudson, D. Zoran, M. Malinowski, A. K. Lampinen, A. Jaegle, J. L. McClelland, L. Matthey, F. Hill, A. Lerchner
CVPR 2024.

Masked Diffusion as Self-Supervised Representation Learner
Z. Pan, J. Chen, Y. Shi
arXiv 2024.

ScribbleGen: Generative Data Augmentation Improves Scribble-Supervised Semantic Segmentation
J. Schnell, J. Wang, L. Qi, V. T. Hu, M. Tang
arXiv 2024.

Deconstructing Denoising Diffusion Models for Self-Supervised Learning
X. Chen, Z. Liu, S. Xie, K. He
arXiv 2024.

Can Generative Models Improve Self-Supervised Representation Learning?
S. Ayromlou, A. Afkanpour, V. R. Khazaie, F. Forghani
arXiv 2024.

Unsupervised Semantic Correspondence Using Stable Diffusion
E. Hedlin, G. Sharma, S. Mahajan, H. Isack, A. Kar, A. Tagliasacchi, K. M. Yi
NeurIPS 2023.

A Tale of Two Features: Stable Diffusion Complements DINO for Zero-Shot Semantic Correspondence
J. Zhang, C. Herrmann, J. Hur, L. P. Cabrera, V. Jampani, D. Sun, M.-H. Yang
NeurIPS 2023.

Emergent Correspondence from Image Diffusion
L. Tang, M. Jia, Q. Wang, C. P. Phoo, B. Hariharan
NeurIPS 2023.

Diffusion Hyperfeatures: Searching Through Time and Space for Semantic Correspondence
G. Luo, L. Dunlap, D. H. Park, A. Holynski, T. Darrell
NeurIPS 2023.

Plug-and-Play Diffusion Features for Text-Driven Image-to-Image Translation
N. Tumanyan, M. Geyer, S. Bagon, T. Dekel
CVPR 2023.

Diversity is Definitely Needed: Improving Model-Agnostic Zero-Shot Classification via Stable Diffusion
J. Shipard, A. Wiliem, K. N. Thanh, W. Xiang, C. Fookes
CVPR 2023.

Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models
J. Xu, S. Liu, A. Vahdat, W. Byeon, X. Wang, S. De Mello
CVPR 2023.

Denoising Diffusion Autoencoders are Unified Self-Supervised Learners
W. Xiang, H. Yang, D. Huang, Y. Wang
ICCV 2023.

Diffusion Models as Masked Autoencoders
C. Wei, K. Mangalam, P.-Y. Huang, Y. Li, H. Fan, H. Xu, H. Wang, C. Xie, A. Yuille, C. Feichtenhofer
ICCV 2023.

Unleashing Text-to-Image Diffusion Models for Visual Perception
W. Zhao, Y. Rao, Z. Liu, B. Liu, J. Zhou, J. Lu
ICCV 2023.

Your Diffusion Model is Secretly a Zero-Shot Classifier
A. C. Li, M. Prabhudesai, S. Duggal, E. Brown, D. Pathak
ICCV 2023.

Diffusion Model as Representation Learner
X. Yang, X. Wang
ICCV 2023.

DreamTeacher: Pretraining Image Backbones with Deep Generative Models
D. Li, H. Ling, A. Kar, D. Acuna, S. W. Kim, K. Kreis, A. Torralba, S. Fidler
ICCV 2023.

InfoDiffusion: Representation Learning Using Information Maximizing Diffusion Models
Y. Wang, Y. Schiff, A. Gokaslan, W. Pan, F. Wang, C. De Sa, V. Kuleshov
ICML 2023.

Learning Data Representations with Joint Diffusion Models
K. Deja, T. Trzciński, J. M. Tomczak
ECML PKDD 2023.

Diffusion Models Beat GANs on Image Classification
S. Mukhopadhyay, M. Gwilliam, V. Agarwal, N. Padmanabhan, A. Swaminathan, S. Hegde, T. Zhou, A. Shrivastava
arXiv 2023.

Do Text-Free Diffusion Models Learn Discriminative Visual Representations?
S. Mukhopadhyay, M. Gwilliam, Y. Yamaguchi, V. Agarwal, N. Padmanabhan, A. Swaminathan, T. Zhou, A. Shrivastava
arXiv 2023.

Unsupervised Representation Learning from Pre-Trained Diffusion Probabilistic Models
Z. Zhang, Z. Zhao, Z. Lin
NeurIPS 2022.

Diffusion Autoencoders: Toward a Meaningful and Decodable Representation
K. Preechakul, N. Chatthee, S. Wizadwongsa, S. Suwajanakorn
CVPR 2022.

Label-Efficient Semantic Segmentation with Diffusion Models
D. Baranchuk, A. Voynov, I. Rubachev, V. Khrulkov, A. Babenko
ICLR 2022.

Prompt-to-Prompt Image Editing with Cross Attention Control
A. Hertz, R. Mokady, J. Tenenbaum, K. Aberman, Y. Pritch, D. Cohen-Or
arXiv 2022.

Your ViT is Secretly a Hybrid Discriminative-Generative Diffusion Model
X. Yang, S.-M. Shih, Y. Fu, X. Zhao, S. Ji
arXiv 2022.

Diffusion Models Beat GANs on Image Synthesis
P. Dhariwal, A. Nichol
NeurIPS 2021.


Representation Learning for Diffusion Model Guidance

Diffusion Handles: Enabling 3D Edits for Diffusion Models by Lifting Activations to 3D
K. Pandey, P. Guerrero, M. Gadelha, Y. Hold-Geoffroy, K. Singh, N. J. Mitra
CVPR 2024.

Readout Guidance: Learning Control from Diffusion Features
G. Luo, T. Darrell, O. Wang, D. B. Goldman, A. Holynski
CVPR 2024.

Depth-Aware Guidance with Self-Estimated Depth Representations of Diffusion Models
G. Kim, W. Jang, G. Lee, S. Hong, J. Seo, S. Kim
Pattern Recognition, vol. 153, 2024.

Diffusion Model with Perceptual Loss
S. Lin, X. Yang
arXiv 2024.

Self-Rectifying Diffusion Sampling with Perturbed-Attention Guidance
D. Ahn, H. Cho, J. Min, W. Jang, J. Kim, S. Kim, H. H. Park, K. H. Jin, S. Kim
arXiv 2024.

Return of Unconditional Generation: A Self-supervised Representation Generation Method
T. Li, D. Katabi, K. He
arXiv 2024.

Rethinking Cluster-Conditioned Diffusion Models
N. Adaloglou, T. Kaiser, F. Michels, M. Kollmann
arXiv 2024.

Diffusion Models and Semi-Supervised Learners Benefit Mutually with Few Labels
Z. You, Y. Zhong, F. Bao, J. Sun, C. Li, J. Zhu
NeurIPS 2023.

Diffusion Self-Guidance for Controllable Image Generation
D. Epstein, A. Jabri, B. Poole, A. Efros, A. Holynski
NeurIPS 2023.

kNN-Diffusion: Image Generation via Large-Scale Retrieval
S. Sheynin, O. Ashual, A. Polyak, U. Singer, O. Gafni, E. Nachmani, Y. Taigman
ICLR 2023.

Self-Guided Diffusion Models
V. T. Hu, D. W. Zhang, Y. M. Asano, G. J. Burghouts, C. G. M. Snoek
CVPR 2023.

Improving Sample Quality of Diffusion Models Using Self-Attention Guidance
S. Hong, G. Lee, W. Jang, S. Kim
ICCV 2023.

Guided Diffusion from Self-Supervised Diffusion Features
V. T. Hu, Y. Chen, M. Caron, Y. M. Asano, C. G. M. Snoek, B. Ommer
arXiv 2023.

Retrieval-Augmented Diffusion Models
A. Blattmann, R. Rombach, K. Oktay, B. Ommer
NeurIPS 2022.

Elucidating the Design Space of Diffusion-Based Generative Models
T. Karras, M. Aittala, T. Aila, S. Laine
NeurIPS 2022.


General

State of the Art on Diffusion Models for Visual Computing
R. Po, W. Yifan, V. Golyanik, K. Aberman, J. T. Barron, A. H. Bermano, E. R. Chan, T. Dekel, A. Holynski, A. Kanazawa, C. K. Liu, L. Liu, B. Mildenhall, M. Nießner, B. Ommer, C. Theobalt, P. Wonka, G. Wetzstein
Computer Graphics Forum 2024.

Diffusion Models in Vision: A Survey
F. Croitoru, V. Hondru, R. Ionescu, M. Shah
IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 45, no. 9, 2023.

Diffusion Models: A Comprehensive Survey of Methods and Applications
L. Yang, Z. Zhang, Y. Song, S. Hong, R. Xu, Y. Zhao, W. Zhang, B. Cui, M.-H. Yang
ACM Computing Surveys, vol. 56, no. 4, 2023.

Improved Techniques for Maximum Likelihood Estimation for Diffusion ODEs
K. Zheng, C. Lu, J. Chen, J. Zhu
ICML 2023.

On the Design Fundamentals of Diffusion Models: A Survey
Z. Chang, G. A. Koulieris, H. P. H. Shum
arXiv 2023.

Understanding Diffusion Models: A Unified Perspective
C. Luo
arXiv 2022.

Progressive Distillation for Fast Sampling of Diffusion Models
T. Salimans, J. Ho
ICLR 2022.

High-Resolution Image Synthesis with Latent Diffusion Models
R. Rombach, A. Blattmann, D. Lorenz, P. Esser, B. Ommer
CVPR 2022.

Dynamic Dual-Output Diffusion Models
Y. Benny, L. Wolf
CVPR 2022.

Variational Diffusion Models
D. Kingma, T. Salimans, B. Poole, J. Ho
NeurIPS 2021.

A Variational Perspective on Diffusion-Based Generative Models and Score Matching
C.-W. Huang, J. H. Lim, A. Courville
NeurIPS 2021.

Score-Based Generative Modeling through Stochastic Differential Equations
Y. Song, J. Sohl-Dickstein, D. P. Kingma, A. Kumar, S. Ermon, B. Poole
ICLR 2021.

Denoising Diffusion Probabilistic Models
J. Ho, A. Jain, P. Abbeel
NeurIPS 2020.

Generative Modeling by Estimating Gradients of the Data Distribution
Y. Song, S. Ermon
NeurIPS 2019.
