<div align="center">

# [ICCV 2023] On the Effectiveness of Spectral Discriminators for Perceptual Quality Improvement

Xin Luo, Yunan Zhu, Shunxin Xu, Dong Liu

[Paper] [Video] [BibTeX] :zap: :rocket: :fire:

</div>
## 📌 Overview
Several recent studies advocate the use of spectral discriminators, which evaluate the Fourier spectra of images, for generative modeling. However, the effectiveness of spectral discriminators has not yet been well understood. We tackle this issue by examining spectral discriminators in the context of perceptual image super-resolution (i.e., GAN-based SR), since SR image quality is susceptible to spectral changes. Our analyses reveal that the spectral discriminator indeed performs better than the ordinary (a.k.a. spatial) discriminator at identifying differences in the high-frequency range, whereas the spatial discriminator holds an advantage in the low-frequency range. Thus, we suggest that the spectral and spatial discriminators be used simultaneously. Moreover, we improve the spectral discriminator by first computing the patchwise Fourier spectrum and then aggregating the spectra with a Transformer. We verify the effectiveness of the proposed method in two ways. On the one hand, thanks to the additional spectral discriminator, our SR images have spectra better aligned with those of the real images, which leads to a better perception-distortion (PD) tradeoff. On the other hand, our ensembled discriminator predicts perceptual quality more accurately, as evidenced on the no-reference image quality assessment task.
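To make the patchwise spectral design concrete, here is a minimal PyTorch sketch (ours, not the repository's implementation; the function name and patch size are illustrative) of the first stage: splitting an image into patches and taking each patch's log-amplitude Fourier spectrum, yielding a sequence of spectral tokens that a Transformer can then aggregate.

```python
# Hedged sketch of the patchwise Fourier spectrum idea described above;
# see this repository's discriminator code for the actual implementation.
import torch

def patchwise_log_spectra(img: torch.Tensor, patch_size: int = 64) -> torch.Tensor:
    """img: (C, H, W); returns (num_patches, C, patch_size, patch_size) log-amplitude spectra."""
    c, _, _ = img.shape
    # Split into non-overlapping patch_size x patch_size patches.
    patches = img.unfold(1, patch_size, patch_size).unfold(2, patch_size, patch_size)
    patches = patches.reshape(c, -1, patch_size, patch_size).transpose(0, 1)
    # 2D FFT per patch; fftshift centers the DC component.
    spectra = torch.fft.fftshift(torch.fft.fft2(patches), dim=(-2, -1))
    return torch.log1p(spectra.abs())  # log-amplitude for numerical stability
```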
## :star: News

- Sept. 28, 2023: Training code is released!
- July 19, 2023: We release our test code and models; training and analysis code will be released at the end of September.
## :sunflower: Main Results
| ESRGAN | SPSR | ESRGAN+LDL | ESRGAN<br>+DualFormer (Ours) |
| --- | --- | --- | --- |
| PSNR/SSIM/LPIPS | PSNR/SSIM/LPIPS | PSNR/SSIM/LPIPS | PSNR/SSIM/LPIPS |
| 28.0465/0.7669/0.1597 | 28.3978/0.7821/0.1069 | 28.2440/0.7758/0.1133 | 29.3049/0.8023/0.1030 |
| <img width="200" src="figures/ESRGAN_img1.png"> | <img width="200" src="figures/SPSR_img1.png"> | <img width="200" src="figures/SPSR_img1.png"> | <img width="200" src="figures/DualFormer_img1.png"> |
| <img width="200" src="figures/ESRGAN_img2.png"> | <img width="200" src="figures/SPSR_img2.png"> | <img width="200" src="figures/SPSR_img2.png"> | <img width="200" src="figures/DualFormer_img2.png"> |
## Installation

This implementation is based on BasicSR; please refer to it for more information on usage.
```shell
# Create a virtual environment [recommended but optional]
conda create -n dual_former python=3.9
conda activate dual_former

# Install necessities (run in the project root, DualFormer/)
pip install --user -e .
```
## :rocket: Usage

Download our pretrained models (for both SR and IQA) and place the contents in `experiments/pretrained_models/` (create the directory first if needed, e.g., run `mkdir -p experiments/pretrained_models` from the project root).

**x4 Super Resolution (Bicubic degradation)**

- Download the DIV2K, BSD100, and Urban100 test datasets, and place them in `datasets/`.
- Evaluate models.

```shell
python basicsr/test.py -opt options/test/test_esrgan_x4_dual_former.yml
```
**x4 Super Resolution (Hard gated degradation model)**

- Download the test dataset from here and place it in `datasets/`.
- [Optional] You may also generate the test dataset yourself with the provided script (the result may differ slightly from the one used in the paper due to randomness in the degradation synthesis):

```shell
python scripts/generate_hgd_dataset.py \
    --input datasets/DIV2K/DIV2K_valid_HR \
    --hr_folder datasets/DIV2K/HGD/HR/X4 \
    --lr_folder datasets/DIV2K/HGD/LR/X4 \
    --scale 4
```
- Evaluate models.

```shell
# ESRGAN version
python basicsr/test.py -opt options/test/test_esrgan_x4_hgd_dual_former.yml

# BebyGAN version
python basicsr/test.py -opt options/test/test_bebygan_x4_hgd_dual_former.yml
```
**Opinion Unaware No-Reference IQA**

- Download the IQA datasets KonIQ-10k, LIVE-itW, and PIPAL, and place them in `datasets/`.
- Start testing.

```shell
bash scripts/test/test_iqa_vgg_specformer.sh
bash scripts/test/test_iqa_dual_former.sh
```
Furthermore, we provide our analysis code to facilitate further research.
<details>
<summary><b>Calculate magnitude RMSE in frequency range for DIV2K validation set</b></summary>

- Download the DIV2K dataset and place it under `datasets/` (or use `ln -s` on Linux to create a soft link).
- Download the officially pretrained Real-ESRNet/Real-ESRGAN models and place them in `experiments/pretrained_models/`.
- Execute the code below to reproduce Tab. 1 in our paper. The three ranges are $[0, \frac{3}{10})$, $[\frac{3}{10}, \frac{8}{10})$, and $[\frac{8}{10}, 1]$, corresponding roughly to the divisions in Fig. 1a of the paper (a NumPy sketch of the band-wise RMSE follows this block).
```shell
# For Real-ESRNet
python scripts/estimate_difference_in_frequency_range.py --model_path experiments/pretrained_models/RealESRNet_x4plus.pth

# For Real-ESRGAN
# The resulting numbers may differ slightly from Tab. 1 of our paper, as the paper used our own reproduced model.
python scripts/estimate_difference_in_frequency_range.py --model_path experiments/pretrained_models/RealESRGAN_x4plus.pth

# Test another dataset
# Refer to options/DIV2K_valid.yml to see how to write a proper dataset configuration.
python scripts/estimate_difference_in_frequency_range.py --dataset_opt other_dataset.yml
```
</details>
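As promised above, here is a minimal NumPy sketch of one plausible way to compute magnitude RMSE within a normalized frequency band. It is our illustration, not the code of `estimate_difference_in_frequency_range.py`; in particular, the radial definition of the bands is an assumption.

```python
# Hedged sketch: RMSE of FFT magnitudes between two grayscale images,
# restricted to a radial band of normalized frequencies (Nyquist -> 1).
import numpy as np

def band_magnitude_rmse(img_a: np.ndarray, img_b: np.ndarray, lo: float, hi: float) -> float:
    """img_a, img_b: 2D arrays of equal shape; band is [lo, hi) in normalized frequency."""
    mag_a = np.abs(np.fft.fftshift(np.fft.fft2(img_a)))
    mag_b = np.abs(np.fft.fftshift(np.fft.fft2(img_b)))
    h, w = img_a.shape
    fy = np.fft.fftshift(np.fft.fftfreq(h))  # cycles/sample in [-0.5, 0.5)
    fx = np.fft.fftshift(np.fft.fftfreq(w))
    radius = np.hypot(fy[:, None], fx[None, :]) / 0.5  # Nyquist frequency maps to 1
    mask = (radius >= lo) & (radius < hi)
    return float(np.sqrt(np.mean((mag_a[mask] - mag_b[mask]) ** 2)))

# The three ranges from Tab. 1 would be (0.0, 0.3), (0.3, 0.8), and (0.8, 1.0).
```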
<details>
<summary><b>Plot the spectral profile of a model on a dataset</b></summary>

- [Optional] Generate datasets (required by the following example).

```shell
# Generate LR images for the DIV2K validation set using the second-order degradation model.
# Note: the resulting dataset will not be exactly the same as the one we used, since no random seed was set beforehand.
bash scripts/generate_realesrgan_dataset.sh  # modify the file to change the paths
```
- Estimate statistics.

```shell
# Estimate statistics of HR images
python scripts/estimate_spectral_statistics.py --input datasets/DIV2K/RealESRGAN/HR/X4 --experiment_name spectral_analysis_G_DIV2K_train_HR_patch_size_256 --mode 1 --patch_size 64

# Estimate statistics of Real-ESRNet's outputs
python scripts/estimate_spectral_statistics.py --input datasets/DIV2K/RealESRGAN/LR/X4 --experiment_name spectral_analysis_G_realesrnet_DIV2K_x4_patch_size_256 --mode 0 --model_path experiments/pretrained_models/RealESRNet_x4plus.pth --patch_size 64

# Estimate statistics of Real-ESRGAN's outputs
python scripts/estimate_spectral_statistics.py --input datasets/DIV2K/RealESRGAN/LR/X4 --experiment_name spectral_analysis_G_realesrgan_DIV2K_x4_patch_size_256 --mode 0 --model_path experiments/pretrained_models/RealESRGAN_x4plus.pth --patch_size 64
```
- Plot the spectral profile (a sketch of the underlying statistic follows this block).

```shell
# Modify the file for your needs
python scripts/plot_spectral_profile.py
```
</details>
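For orientation, the spectral profile of an image is commonly taken to be the azimuthally averaged power spectrum. The sketch below is our NumPy illustration of that statistic, not the exact code of `plot_spectral_profile.py`.

```python
# Hedged sketch: azimuthal (radial) average of the power spectrum.
import numpy as np

def spectral_profile(img: np.ndarray) -> np.ndarray:
    """img: 2D grayscale array; returns mean power per integer radial frequency bin."""
    power = np.abs(np.fft.fftshift(np.fft.fft2(img))) ** 2
    h, w = img.shape
    yy, xx = np.indices((h, w))
    r = np.hypot(yy - h // 2, xx - w // 2).astype(int)  # radius in pixels from the DC bin
    sums = np.bincount(r.ravel(), weights=power.ravel())
    counts = np.bincount(r.ravel())
    return sums / np.maximum(counts, 1)  # average power at each radius
```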
<details>
<summary><b>Evaluate the robustness of a discriminator under frequency masking and noise</b></summary>
<!--
```shell
bash scripts/robustness_analysis.sh
``` -->
</details>
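The robustness analysis script above is currently commented out. Conceptually, frequency masking perturbs an image by zeroing FFT coefficients beyond a cutoff before showing it to the discriminator, so a robust discriminator's score should degrade gracefully. A minimal sketch under that assumption (ours, not the repository's script):

```python
# Hedged sketch of low-pass frequency masking: zero FFT coefficients at or
# above a normalized cutoff, then invert back to the image domain.
import numpy as np

def lowpass_mask(img: np.ndarray, cutoff: float) -> np.ndarray:
    """img: 2D grayscale array; keeps normalized radial frequencies below `cutoff` (Nyquist -> 1)."""
    spec = np.fft.fftshift(np.fft.fft2(img))
    h, w = img.shape
    fy = np.fft.fftshift(np.fft.fftfreq(h))
    fx = np.fft.fftshift(np.fft.fftfreq(w))
    radius = np.hypot(fy[:, None], fx[None, :]) / 0.5
    spec[radius >= cutoff] = 0.0  # discard high-frequency content
    return np.real(np.fft.ifft2(np.fft.ifftshift(spec)))
```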
## :boat: Train

<details>
<summary><b>x4 Super Resolution (Bicubic degradation)</b></summary>

- Prepare the DF2K dataset under the guideline (just ignore the OST part), and organize the data according to the `datasets` item in `options/train/train_esrgan_x4_dual_former.yml`.
- Download the pretrained ESRNet and place it in `experiments/pretrained_models/`.
- Start your training.

```shell
python basicsr/train.py --auto_resume -opt options/train/train_esrgan_x4_dual_former.yml
```
- Test results.

```shell
# Modify pretrain_network_g to your model path
python basicsr/test.py -opt options/test/test_esrgan_x4_dual_former.yml
```
</details>
<details>
<summary><b>x4 Super Resolution (Hard gated degradation model)</b></summary>

- Prepare the DF2K+OST dataset under the guideline, and organize the data according to the `datasets` item in `options/train/train_esrgan_x4_hgd_dual_former.yml`.
- [Optional] Train a PSNR-oriented model. By default, the training step below uses our pretrained PSNR-oriented model; you may modify the option file to use yours.

```shell
CUDA_VISIBLE_DEVICES=0,1,2,3 \
scripts/dist_train_autoresume.sh 4 options/train/train_esrnet_x4_hgd.yml  # requires 4 GPUs
```
- Start your training.

```shell
CUDA_VISIBLE_DEVICES=0,1,2,3 \
scripts/dist_train_autoresume.sh 4 options/train/train_esrgan_x4_hgd_dual_former.yml  # requires 4 GPUs
```
- Test results.

```shell
# Modify pretrain_network_g to your model path
python basicsr/test.py -opt options/test/test_esrgan_x4_hgd_dual_former.yml
```
</details>
<details>
<summary><b>Opinion Unaware No-Reference IQA</b></summary>

- Prepare the DF2K+OST dataset under the guideline, and organize the data according to the `datasets` item in `options/train/train_esrgan_x4_sgd_dual_former.yml`.
- [Optional] Train a PSNR-oriented model. By default, the training step below uses our pretrained PSNR-oriented model; you may modify the option file to use yours.

```shell
CUDA_VISIBLE_DEVICES=0,1 \
scripts/dist_train_autoresume.sh 2 options/train/train_esrnet_x4_sgd.yml  # requires 2 GPUs
```
- Start your training.

```shell
CUDA_VISIBLE_DEVICES=0,1 \
scripts/dist_train_autoresume.sh 2 options/train/train_esrgan_x4_sgd_vgg_specformer.yml  # requires 2 GPUs

CUDA_VISIBLE_DEVICES=0,1 \
scripts/dist_train_autoresume.sh 2 options/train/train_esrgan_x4_sgd_dual_former.yml  # requires 2 GPUs
```
- Test results.

```shell
# Modify these two scripts accordingly (names, paths, etc.)
bash scripts/test/test_iqa_vgg_specformer.sh
bash scripts/test/test_iqa_dual_former.sh
```
</details>
## :heart: Citing Us

If you find this repository or our work useful, please consider giving it a star :star: and a citation :t-rex:; both are greatly appreciated:
```bibtex
@inproceedings{luo2023effectiveness,
  title={On the Effectiveness of Spectral Discriminators for Perceptual Quality Improvement},
  author={Luo, Xin and Zhu, Yunan and Xu, Shunxin and Liu, Dong},
  booktitle={ICCV},
  year={2023}
}
```
## :email: Contact

If you have any questions, please open an issue (the recommended way) or contact us via email.
## License

This work is licensed under the MIT license. See the LICENSE file for details.
## Acknowledgement
Our repository builds upon the excellent framework provided by BasicSR.