Denoising Diffusion Models for Plug-and-Play Image Restoration

Yuanzhi Zhu, Kai Zhang, Jingyun Liang, Jiezhang Cao, Bihan Wen, Radu Timofte, Luc Van Gool.

[project page] [paper]

This repository contains the code and data associated with the paper "Denoising Diffusion Models for Plug-and-Play Image Restoration", which was presented at the CVPR workshop NTIRE 2023.

This code is based on OpenAI Guided Diffusion and DPIR.


Abstract

Plug-and-play Image Restoration (IR) has been widely recognized as a flexible and interpretable method for solving various inverse problems by utilizing any off-the-shelf denoiser as the implicit image prior. However, most existing methods focus on discriminative Gaussian denoisers. Although diffusion models have shown impressive performance for high-quality image synthesis, their potential to serve as a generative denoiser prior to the plug-and-play IR methods remains to be further explored. While several other attempts have been made to adopt diffusion models for image restoration, they either fail to achieve satisfactory results or typically require an unacceptable number of Neural Function Evaluations (NFEs) during inference. This paper proposes DiffPIR, which integrates the traditional plug-and-play method into the diffusion sampling framework. Compared to plug-and-play IR methods that rely on discriminative Gaussian denoisers, DiffPIR is expected to inherit the generative ability of diffusion models. Experimental results on three representative IR tasks, including super-resolution, image deblurring, and inpainting, demonstrate that DiffPIR achieves state-of-the-art performance on both the FFHQ and ImageNet datasets in terms of reconstruction faithfulness and perceptual quality with no more than 100 NFEs.

Setting Up

Clone and Install

git clone https://github.com/yuanzhi-zhu/DiffPIR.git
cd DiffPIR
pip install -r requirements.txt

For motion blur, please download https://github.com/LeviBorodenko/motionblur into the DiffPIR folder.

Model Download

Links to the model checkpoints can be found in ./model_zoo/README.md.

You can also download them with:

bash download.sh

Do not forget to rename "ffhq_10m" to "diffusion_ffhq_m" for code consistency.
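
One way to do the rename from Python is sketched below. It assumes the downloaded checkpoint sits in a single directory (for example `./model_zoo`, which is an assumption here) and keeps whatever file extension the download has. `rename_checkpoint` is a hypothetical helper, not part of this repository.

```python
from pathlib import Path


def rename_checkpoint(model_dir, old_stem="ffhq_10m", new_stem="diffusion_ffhq_m"):
    """Rename files named `<old_stem>.<ext>` in model_dir to `<new_stem>.<ext>`.

    Returns the list of new paths. The directory layout is an assumption;
    adjust model_dir to wherever the checkpoint was actually downloaded.
    """
    renamed = []
    for path in sorted(Path(model_dir).glob(old_stem + ".*")):
        # Keep everything after the old stem (e.g. ".pt") unchanged.
        target = path.with_name(new_stem + path.name[len(old_stem):])
        if target.exists():
            raise FileExistsError(f"{target} already exists")
        path.rename(target)
        renamed.append(target)
    return renamed
```

Calling `rename_checkpoint("model_zoo")` returns an empty list if no matching file is found, so it is safe to run more than once.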

Inference Code

python main_ddpir_sisr.py # SR
python main_ddpir_deblur.py # deblur
python main_ddpir_inpainting.py # inpainting

Alternatively:

python main_ddpir.py --opt configs/sisr.yaml # SR
python main_ddpir.py --opt configs/deblur.yaml # deblur
python main_ddpir.py --opt configs/inpaint.yaml # inpainting

Train Your Own Diffusion Models

To train a new diffusion model, please follow OpenAI Guided Diffusion.

Brief Introduction

Compared with earlier iterative image restoration methods such as USRNet, the diffusion sampling framework offers a more systematic way to solve the data sub-problem and the prior sub-problem in an iterative plug-and-play manner.

Start with the following optimization problem:

$$ \hat{\mathbf{x}} = \mathop{\arg\min}_\mathbf{x} \frac{1}{2\sigma_n^2}\|\mathbf{y} - \mathcal{H}(\mathbf{x})\|^2 + \lambda \mathcal{P}(\mathbf{x}) $$

Previous works solve this iteratively by alternating between a prior sub-problem and a data sub-problem:

\begin{align}
\mathbf{{z}}_{k} &= \mathop{\arg\min}_{\mathbf{z}} \frac{1}{2(\sqrt{\lambda/\mu})^2}\|\mathbf{z}-\mathbf{x}_{k}\|^2  + \mathcal{P}(\mathbf{z}) \\
\mathbf{{x}}_{k-1} &= \mathop{\arg\min}_{\mathbf{x}}  \|\mathbf{y} - \mathcal{H}(\mathbf{x})\|^2 + \mu\sigma_n^2\|\mathbf{x}-\mathbf{{z}}_{k} \|^2 
\end{align}
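
As an illustration of this alternation, the sketch below runs the two steps on a 1-D inpainting toy problem, where $\mathcal{H}(\mathbf{x})$ is an element-wise binary mask so that the data sub-problem has a closed form. The `box_denoise` smoother is a hypothetical stand-in for a learned denoiser, and the function names and the default values of `lam`, `mu`, and `iters` are illustrative assumptions, not settings taken from DPIR or from this repository.

```python
import numpy as np


def box_denoise(x, noise_level):
    """Toy stand-in for a learned denoiser: blend x with its 3-tap moving average."""
    padded = np.pad(x, 1, mode="edge")
    smooth = (padded[:-2] + padded[1:-1] + padded[2:]) / 3.0
    w = noise_level / (1.0 + noise_level)  # smooth more when more noise is assumed
    return (1.0 - w) * x + w * smooth


def data_step(y, mask, z, weight):
    """Closed-form minimizer of ||y - mask*x||^2 + weight*||x - z||^2 (binary mask, weight > 0)."""
    return (mask * y + weight * z) / (mask + weight)


def hqs_inpaint(y, mask, sigma_n=0.05, lam=1.0, mu=50.0, iters=30):
    """Alternate the prior step and the data step for a fixed number of iterations."""
    x = mask * y  # start from the observed samples
    for _ in range(iters):
        z = box_denoise(x, np.sqrt(lam / mu))         # prior sub-problem
        x = data_step(y, mask, z, mu * sigma_n ** 2)  # data sub-problem
    return x
```

On observed samples the data step pulls the estimate toward `y`; on missing samples it simply returns the denoiser output `z`.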

In our study, we address the optimization problem stated above with an altered schedule instead:

\begin{align}
\mathbf{{x}}_{0}^{(t)}&=\mathop{\arg\min}_{\mathbf{z}} {\frac{1}{2\bar{\sigma}_t^2}\|\mathbf{z}-\mathbf{x}_{t}\|^2}  + {\mathcal{P}(\mathbf{z})}\\
\mathbf{\hat{x}}_{0}^{(t)}&=\mathop{\arg\min}_{\mathbf{x}}  \|\mathbf{y} - \mathcal{H}(\mathbf{x})\|^2 + \rho_t\|\mathbf{x}-\mathbf{{x}}_{0}^{(t)} \|^2 \\
\mathbf{x}_{t-1} &\longleftarrow \mathbf{\hat{x}}_{0}^{(t)}
\end{align}

where the prior sub-problem (first line) is treated as a Gaussian denoising problem with noise level $\bar{\sigma}_t$, and diffusion models serve as the generative denoisers.
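
One way to see why the first line of the schedule is a denoising problem, assuming the standard DDPM forward process with cumulative schedule $\bar{\alpha}_t$ (a symbol not defined in this README) and assuming that $\bar{\sigma}_t$ denotes the corresponding relative noise level:

\begin{align}
\mathbf{x}_t &= \sqrt{\bar{\alpha}_t}\,\mathbf{x}_0 + \sqrt{1-\bar{\alpha}_t}\,\boldsymbol{\epsilon}, \qquad \boldsymbol{\epsilon}\sim\mathcal{N}(\mathbf{0},\mathbf{I}) \\
\frac{\mathbf{x}_t}{\sqrt{\bar{\alpha}_t}} &= \mathbf{x}_0 + \bar{\sigma}_t\,\boldsymbol{\epsilon}, \qquad \bar{\sigma}_t = \sqrt{\frac{1-\bar{\alpha}_t}{\bar{\alpha}_t}}
\end{align}

Under these assumptions, the rescaled $\mathbf{x}_t$ is the clean image plus Gaussian noise of standard deviation $\bar{\sigma}_t$, which is exactly the setting a Gaussian denoiser is built for.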

<p align="center"> <img src="figs/illustration.png" width="900px"/> <em>Figure 1. Illustration of our plug-and-play sampling method.</em> </p>

In more detail, as illustrated in Figure 1, at every timestep $t$ we first estimate $\mathbf{x}^{(t)}_0$ from $\mathbf{x}_{t}$ by denoising with an off-the-shelf, unconditional, pre-trained diffusion model. We then solve the data sub-problem to obtain an updated $\mathbf{\hat x}^{(t)}_0$ (indicated by the red line).

The overall plug-and-play sampling algorithm can be summarized as follows:

<p align="center"> <img src="figs/algorithm.png" width="600px"/> </p>
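
A minimal NumPy sketch of a sampling loop of this kind for inpainting is given below. It rests on several assumptions that should be checked against the algorithm figure and the paper: the standard DDPM parameterization with cumulative schedule `alphas_bar`, the penalty weight $\rho_t = \lambda\sigma_n^2/\bar{\sigma}_t^2$, and a re-noising step that mixes the implied noise with fresh noise through a parameter `zeta`. The names `dummy_eps_model` and `diffpir_style_inpaint` are hypothetical, and the zero-returning noise predictor only stands in for a pre-trained network.

```python
import numpy as np


def dummy_eps_model(x_t, t):
    """Placeholder for a pre-trained noise predictor eps_theta(x_t, t); returns zeros."""
    return np.zeros_like(x_t)


def diffpir_style_inpaint(y, mask, eps_model, alphas_bar,
                          sigma_n=0.05, lam=8.0, zeta=0.3, seed=0):
    """Sketch of plug-and-play diffusion sampling with a binary inpainting mask.

    Requires sigma_n > 0 and every alphas_bar entry strictly below 1.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(y.shape)  # start from pure noise
    for t in range(len(alphas_bar) - 1, -1, -1):
        ab = alphas_bar[t]
        ab_prev = alphas_bar[t - 1] if t > 0 else 1.0
        # 1) Prior step: predict the clean signal from x_t.
        eps = eps_model(x, t)
        x0 = (x - np.sqrt(1.0 - ab) * eps) / np.sqrt(ab)
        # 2) Data step: closed form of ||y - mask*x||^2 + rho*||x - x0||^2.
        sigma_bar_sq = (1.0 - ab) / ab
        rho = lam * sigma_n ** 2 / sigma_bar_sq  # assumed weighting
        x0_hat = (mask * y + rho * x0) / (mask + rho)
        # 3) Return to noise level t-1, mixing implied and fresh noise (assumed).
        eps_hat = (x - np.sqrt(ab) * x0_hat) / np.sqrt(1.0 - ab)
        fresh = rng.standard_normal(y.shape)
        x = np.sqrt(ab_prev) * x0_hat + np.sqrt(1.0 - ab_prev) * (
            np.sqrt(1.0 - zeta) * eps_hat + np.sqrt(zeta) * fresh)
    return x


# Toy usage with a linear beta schedule (illustrative values only).
betas = np.linspace(1e-4, 0.02, 100)
alphas_bar = np.cumprod(1.0 - betas)
signal = np.linspace(0.0, 1.0, 16)
mask = np.ones(16)
mask[4:8] = 0.0
restored = diffpir_style_inpaint(mask * signal, mask, dummy_eps_model, alphas_bar)
```

With a real noise predictor in place of `dummy_eps_model`, the number of loop iterations corresponds to the number of network evaluations (NFEs).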

For more details, see the paper or the slides.

<!-- In this work we have demonstrated, both theoretically and empirically, that DiffPIR is a more systematic and efficient diffusion sampling approach for image restoration. -->

Results

Quantitative Results

<details open> <summary>Click to display/hide quantitative results tables</summary>

| FFHQ / Method ($\sigma=0.05$) | NFEs $\downarrow$ | Deblur (Gaussian)<br>PSNR $\uparrow$ | Deblur (Gaussian)<br>FID $\downarrow$ | Deblur (Gaussian)<br>LPIPS $\downarrow$ | Deblur (motion)<br>PSNR $\uparrow$ | Deblur (motion)<br>FID $\downarrow$ | Deblur (motion)<br>LPIPS $\downarrow$ | SR ($\times 4$)<br>PSNR $\uparrow$ | SR ($\times 4$)<br>FID $\downarrow$ | SR ($\times 4$)<br>LPIPS $\downarrow$ |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| DiffPIR | 100 | 27.36 | 59.65 | 0.236 | 26.57 | 65.78 | 0.255 | 26.64 | 65.77 | 0.260 |
| DPS [1] | 1000 | 25.46 | 65.57 | 0.247 | 23.31 | 73.31 | 0.289 | 25.77 | 67.01 | 0.256 |
| DDRM [2] | 20 | 25.93 | 101.89 | 0.298 | - | - | - | 27.92 | 89.43 | 0.265 |
| DPIR [3] | $>$20 | 27.79 | 123.99 | 0.450 | 26.41 | 146.44 | 0.467 | 28.03 | 133.39 | 0.456 |

| ImageNet / Method ($\sigma=0.05$) | NFEs $\downarrow$ | Deblur (Gaussian)<br>PSNR $\uparrow$ | Deblur (Gaussian)<br>FID $\downarrow$ | Deblur (Gaussian)<br>LPIPS $\downarrow$ | Deblur (motion)<br>PSNR $\uparrow$ | Deblur (motion)<br>FID $\downarrow$ | Deblur (motion)<br>LPIPS $\downarrow$ | SR ($\times 4$)<br>PSNR $\uparrow$ | SR ($\times 4$)<br>FID $\downarrow$ | SR ($\times 4$)<br>LPIPS $\downarrow$ |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| DiffPIR | 100 | 22.80 | 93.36 | 0.355 | 24.01 | 124.63 | 0.366 | 23.18 | 106.32 | 0.371 |
| DPS [1] | 1000 | 19.58 | 138.80 | 0.434 | 17.75 | 184.45 | 0.491 | 22.16 | 114.93 | 0.383 |
| DDRM [2] | 20 | 22.33 | 160.73 | 0.427 | - | - | - | 23.89 | 118.55 | 0.358 |
| DPIR [3] | $>$20 | 23.86 | 189.92 | 0.476 | 23.60 | 210.31 | 0.489 | 23.99 | 204.83 | 0.475 |

| FFHQ / Method ($\sigma=0.0$) | NFEs $\downarrow$ | Inpaint (box)<br>FID $\downarrow$ | Inpaint (box)<br>LPIPS $\downarrow$ | Inpaint (random)<br>PSNR $\uparrow$ | Inpaint (random)<br>FID $\downarrow$ | Inpaint (random)<br>LPIPS $\downarrow$ | Deblur (Gaussian)<br>PSNR $\uparrow$ | Deblur (Gaussian)<br>FID $\downarrow$ | Deblur (Gaussian)<br>LPIPS $\downarrow$ | Deblur (motion)<br>PSNR $\uparrow$ | Deblur (motion)<br>FID $\downarrow$ | Deblur (motion)<br>LPIPS $\downarrow$ | SR ($\times 4$)<br>PSNR $\uparrow$ | SR ($\times 4$)<br>FID $\downarrow$ | SR ($\times 4$)<br>LPIPS $\downarrow$ |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| DiffPIR | 20 | 35.72 | 0.117 | 34.03 | 30.81 | 0.116 | 30.74 | 46.64 | 0.170 | 37.03 | 20.11 | 0.084 | 29.17 | 58.02 | 0.187 |
| DiffPIR | 100 | 25.64 | 0.107 | 36.17 | 13.68 | 0.066 | 31.00 | 39.27 | 0.152 | 37.53 | 11.54 | 0.064 | 29.52 | 47.80 | 0.174 |
| DPS [1] | 1000 | 43.49 | 0.145 | 34.65 | 33.14 | 0.105 | 27.31 | 51.23 | 0.192 | 26.73 | 58.63 | 0.222 | 27.64 | 59.06 | 0.209 |
| DDRM [2] | 20 | 37.05 | 0.119 | 31.83 | 56.60 | 0.164 | 28.40 | 67.99 | 0.238 | - | - | - | 30.09 | 68.59 | 0.188 |
| DPIR [3] | $>$20 | - | - | - | - | - | 30.52 | 96.16 | 0.350 | 38.39 | 27.55 | 0.233 | 30.41 | 96.16 | 0.362 |

[1]: Chung et al., "Diffusion Posterior Sampling for General Noisy Inverse Problems", 2022
[2]: Kawar et al., "Denoising Diffusion Restoration Models", 2022
[3]: Zhang et al., "Plug-and-play Image Restoration with Deep Denoiser Prior", 2021

</details>
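
For readers who want to sanity-check their own outputs against the PSNR columns, the standard definition of PSNR can be sketched as follows. This is only the textbook formula. The exact evaluation protocol behind the tables (value range, color space, cropping) is not stated here, so `data_range=1.0` is an assumption and numbers computed this way may not match the tables exactly.

```python
import numpy as np


def psnr(reference, estimate, data_range=1.0):
    """Peak signal-to-noise ratio in dB between two arrays of equal shape."""
    ref = np.asarray(reference, dtype=np.float64)
    est = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((ref - est) ** 2)
    if mse == 0:
        return float("inf")  # identical inputs
    return 10.0 * np.log10(data_range ** 2 / mse)
```

FID and LPIPS depend on pre-trained networks and are best computed with their reference implementations.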

Qualitative Results

<details open> <summary>Click to display/hide qualitative results images</summary> <p align="center"> <img src="figs/SR_results.png" width="900px"/> </p> <p align="center"> <img src="figs/deblur_results.png" width="900px"/> </p> <p align="center"> <img src="figs/inpainting_results.png" width="900px"/> </p> </details>

Citation

If you find this repo helpful, please cite:

@inproceedings{zhu2023denoising, % DiffPIR
      title={Denoising Diffusion Models for Plug-and-Play Image Restoration},
      author={Yuanzhi Zhu and Kai Zhang and Jingyun Liang and Jiezhang Cao and Bihan Wen and Radu Timofte and Luc Van Gool},
      booktitle={IEEE Conference on Computer Vision and Pattern Recognition Workshops (NTIRE)},
      year={2023},
}

Acknowledgments

This work was partly supported by the ETH Zurich General Fund (OK), the Alexander von Humboldt Foundation and the Huawei Fund.