# Information-Theoretic Diffusion (ITD)


## Main Contribution

We introduce a new mathematical foundation for diffusion models, inspired by classic results in information theory, that yields an exact unified objective for modeling either continuous or discrete data and provides a justification for ensembling diffusion models.

$$ \log p(x) = - \frac{1}{2} \int_{0}^{\infty} \text{mmse}(x, \gamma) \, d\gamma + \text{constant} \qquad \text{where} \quad \text{mmse}(x, \gamma) = \min_{\hat{x}} \mathbb{E}_{p(z_{\gamma}|x)} \left[ \| x - \hat{x}(z_{\gamma}, \gamma) \|^2 \right] $$

<p align="center", width="100%"> <img width="49%" src="./assets/I-MMSE.svg"> <img width="49%" src="./assets/PEOD.svg"> </p>

## Diffusion Math Comparison

$$ -\log p(x) = \text{constant} + \frac{1}{2} \int_0^\infty \text{mmse}(x, \gamma) \, d\gamma $$

<div align="center">
Variational BoundInformation-Theoretic Bound (ours)
Exact?No, it's an approximationYES, it's an analytic solution
Simple?No, it has non-MSE termsYES, it has only one integral
</div>

## Usage

### Installation

Clone this repository and navigate to './ITdiffusion' as the working directory in a Linux terminal or the Anaconda PowerShell Prompt, then run:

```bash
pip install -e .
```

This installs the 'itdiffusion' Python package that the scripts depend on.

(<span style="color:red">Note</span>: If you run into trouble installing the 'mpi4py' library, please refer here; after fixing the problem, run the command above again.)
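
To verify the install, the package should now be importable (a quick hedged check; 'utilsitd' and 'diffusionmodel' are the module names described under Utilities below):

```python
# Sanity check after `pip install -e .`: the `utilsitd` package and its
# `diffusionmodel` module are the names this README describes.
import utilsitd.diffusionmodel  # noqa: F401

print("itdiffusion utilities importable")
```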

### Utilities

The folder 'utilsitd' contains the utilities for our diffusion model; in particular, the ITD model is wrapped in 'diffusionmodel.py'.

### Preparing Data

We use the CIFAR-10 dataset in our paper. The dataset preprocessing code is provided by dataset generation; for convenience, we include it in 'cifar10.py'. Run it directly to get the processed dataset.
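
For reference, here is a rough sketch of what such preprocessing typically does (the actual 'cifar10.py' may differ in paths and naming): download CIFAR-10 via torchvision and export each split as individual PNG files.

```python
# Hedged sketch of IDDPM-style CIFAR-10 preprocessing; the output folder
# names (cifar_train / cifar_test) match the --data_train_dir and
# --data_test_dir arguments used in the commands below.
import os
import torchvision

for split in ("train", "test"):
    ds = torchvision.datasets.CIFAR10(root="tmp", train=(split == "train"), download=True)
    out_dir = f"cifar_{split}"
    os.makedirs(out_dir, exist_ok=True)
    for i, (image, label) in enumerate(ds):
        # Each sample is a 32x32 PIL image; save it tagged with its class name.
        image.save(os.path.join(out_dir, f"{ds.classes[label]}_{i:05d}.png"))
```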

### Fine-tuning

The following commands are used to run 'fine_tune.py':

1. IDDPM + CIFAR10 + vlb:

   ```bash
   python ./scripts/fine_tune.py \
       --data_train_dir XXX/cifar_train \
       --model_path XXX/iddpm/cifar10_uncond_vlb_50M_500K.pt \
       --image_size 32 --num_channels 128 --num_res_blocks 3 --learn_sigma True --dropout 0.3 \
       --iddpm True --train_batch_size 32 --lr 2.5e-5 --epoch 10
   ```

2. DDPM + CIFAR10:

   ```bash
   python ./scripts/fine_tune.py \
       --data_train_dir XXX/cifar_train \
       --image_size 32 \
       --iddpm False --train_batch_size 64 --lr 1e-4 --epoch 10
   ```
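
Conceptually, each fine-tuning step minimizes the denoising MSE at an importance-sampled log-SNR, which is what drives the integral in our objective down. A minimal sketch of one such step (the function and argument names are illustrative, not 'fine_tune.py''s internals):

```python
import torch

def itd_finetune_step(model, x, optimizer, loc=0.0, scale=2.0):
    """One hypothetical ITD fine-tuning step: sample a log-SNR from a
    logistic distribution, corrupt x through the Gaussian channel, and
    minimize the denoiser's reconstruction MSE at that noise level."""
    u = torch.rand(()).clamp(1e-6, 1.0 - 1e-6)
    s = loc + scale * (torch.log(u) - torch.log1p(-u))  # logistic log-SNR sample
    gamma = torch.exp(s)
    z = torch.sqrt(gamma) * x + torch.randn_like(x)     # channel z_γ = √γ·x + ε
    loss = ((x - model(z, gamma)) ** 2).mean()          # the mmse term at γ
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```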

For evaluation, run 'test.py' directly:

1. IDDPM + CIFAR10 + vlb:

   ```bash
   python ./scripts/test.py \
       --data_train_dir XXX/cifar_train --data_test_dir XXX/cifar_test \
       --model_path ../checkpoints/iddpm/model_epoch10.pt \
       --image_size 32 --num_channels 128 --num_res_blocks 3 --learn_sigma True --dropout 0.3 \
       --iddpm True --test_batch_size 256 --npoints 1000 --soft True
   ```

2. DDPM + CIFAR10:

   ```bash
   python ./scripts/test.py \
       --data_train_dir XXX/cifar_train --data_test_dir XXX/cifar_test \
       --image_size 32 \
       --iddpm False --test_batch_size 256 --npoints 1000 --soft True
   ```
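
Here '--npoints' sets how many log-SNR values are used to evaluate the mmse curve. A hedged sketch of that style of evaluation, using a fixed grid plus the trapezoid rule (the script's actual quadrature and interfaces may differ):

```python
import torch

@torch.no_grad()
def nll_integral_on_grid(x, denoiser, npoints=1000, s_min=-10.0, s_max=10.0):
    """Evaluate (1/2) * integral of mmse(x, γ) dγ on a fixed log-SNR grid.

    With γ = exp(s), the integrand becomes mmse(x, e^s)·e^s, integrated over
    s by the trapezoid rule; `denoiser(z, gamma)` is a stand-in callable."""
    s = torch.linspace(s_min, s_max, npoints)
    integrand = []
    for s_i in s:
        gamma = torch.exp(s_i)
        z = torch.sqrt(gamma) * x + torch.randn_like(x)  # z_γ = √γ·x + ε
        mse = ((x - denoiser(z, gamma)) ** 2).sum()      # one-sample mmse(x, γ)
        integrand.append(mse * gamma)                    # Jacobian dγ/ds = e^s
    return 0.5 * torch.trapezoid(torch.stack(integrand), s)
```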

## Models

## Results

<p align="center"> $$\text{Table: } \mathbb E \left[ -\log p(x) \right] \text{ (bits/dimension)}$$ </p> <div align="center">
ModelTraining ObjectiveVariational BoundIT Bound (ours)
IDDPMVariational-4.05-4.09
IDDPM (tune)Info-Theoretic-3.85-4.28
</div> <p align="center", width="100%"> <img width="55%" src="./assets/cont_density.png"> </p>

## BibTeX

```bibtex
@inproceedings{kong2023informationtheoretic,
    title={Information-Theoretic Diffusion},
    author={Xianghao Kong and Rob Brekelmans and Greg {Ver Steeg}},
    booktitle={International Conference on Learning Representations},
    year={2023},
    url={https://arxiv.org/abs/2302.03792}
}
```
