Self-Asymmetric Invertible Network (SAIN)

This is the PyTorch implementation of the paper "Self-Asymmetric Invertible Network for Compression-Aware Image Rescaling" (AAAI 2023 Oral). [arXiv]

Abstract

High-resolution (HR) images are usually downscaled to low-resolution (LR) ones for better display and afterward upscaled back to the original size to recover details. Recent work in image rescaling formulates downscaling and upscaling as a unified task and learns a bijective mapping between HR and LR via invertible networks. However, in real-world applications (e.g., social media), most images are compressed for transmission. Lossy compression will lead to irreversible information loss on LR images, hence damaging the inverse upscaling procedure and degrading the reconstruction accuracy. In this paper, we propose the Self-Asymmetric Invertible Network (SAIN) for compression-aware image rescaling. To tackle the distribution shift, we first develop an end-to-end asymmetric framework with two separate bijective mappings for high-quality and compressed LR images, respectively. Then, based on empirical analysis of this framework, we model the distribution of the lost information (including downscaling and compression) using isotropic Gaussian mixtures and propose the Enhanced Invertible Block to derive high-quality/compressed LR images in one forward pass. Besides, we design a set of losses to regularize the learned LR images and enhance the invertibility. Extensive experiments demonstrate the consistent improvements of SAIN across various image rescaling datasets in terms of both quantitative and qualitative evaluation under standard image compression formats (i.e., JPEG and WebP).
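
For readers new to invertible rescaling networks: models in this line of work (e.g., IRN and SAIN) stack invertible blocks whose forward pass can be undone exactly, so the HR image can be recovered from the LR image plus a latent variable. The sketch below is a minimal, generic affine coupling layer in PyTorch for intuition only; it is not the Enhanced Invertible Block proposed in the paper, and the channel split, subnet architecture, and clamping constant are illustrative assumptions.

```python
import torch
import torch.nn as nn


class AffineCoupling(nn.Module):
    """Minimal affine coupling layer, a generic building block of invertible
    networks. Illustrative sketch only -- NOT the Enhanced Invertible Block
    from the SAIN paper."""

    def __init__(self, channels: int, hidden: int = 64, clamp: float = 1.0):
        super().__init__()
        self.split = channels // 2   # assumed even channel split
        self.clamp = clamp           # bounds the log-scale for stability
        # Small conv net predicting per-pixel scale and shift (assumption).
        self.subnet = nn.Sequential(
            nn.Conv2d(self.split, hidden, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(hidden, 2 * (channels - self.split), 3, padding=1),
        )

    def forward(self, x: torch.Tensor, reverse: bool = False) -> torch.Tensor:
        x1, x2 = x[:, : self.split], x[:, self.split:]
        log_s, t = self.subnet(x1).chunk(2, dim=1)
        log_s = self.clamp * torch.tanh(log_s)
        if not reverse:
            y2 = x2 * torch.exp(log_s) + t     # forward transform
        else:
            y2 = (x2 - t) * torch.exp(-log_s)  # exact inverse
        return torch.cat([x1, y2], dim=1)


# The same module runs forward and in reverse, which is what makes the
# learned mapping bijective.
block = AffineCoupling(4)
x = torch.randn(1, 4, 8, 8)
x_rec = block(block(x), reverse=True)
print(torch.allclose(x, x_rec, atol=1e-5))  # True
```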

Framework Overview

(Figure: overview of the SAIN framework.)

Qualitative Results

Qualitative results of image rescaling (×2) on DIV2K under distortion at different JPEG QFs.

Qualitative results of image rescaling (×4) on DIV2K under distortion at different JPEG QFs.

Quantitative Results

Quantitative results (PSNR / SSIM) of image rescaling on DIV2K under distortion at different JPEG QFs.

| Downscaling & Upscaling | Scale | JPEG QF=30 | JPEG QF=50 | JPEG QF=70 | JPEG QF=80 | JPEG QF=90 |
| --- | --- | --- | --- | --- | --- | --- |
| Bicubic & Bicubic | $\times 2$ | 29.38 / 0.8081 | 30.19 / 0.8339 | 30.91 / 0.8560 | 31.38 / 0.8703 | 31.96 / 0.8878 |
| Bicubic & SRCNN | $\times 2$ | 28.01 / 0.7872 | 28.69 / 0.8154 | 29.43 / 0.8419 | 30.01 / 0.8610 | 30.88 / 0.8878 |
| Bicubic & EDSR | $\times 2$ | 28.92 / 0.7947 | 29.93 / 0.8257 | 31.01 / 0.8546 | 31.91 / 0.8753 | 33.44 / 0.9052 |
| Bicubic & RDN | $\times 2$ | 28.95 / 0.7954 | 29.96 / 0.8265 | 31.02 / 0.8549 | 31.91 / 0.8752 | 33.41 / 0.9046 |
| Bicubic & RCAN | $\times 2$ | 28.84 / 0.7932 | 29.84 / 0.8245 | 30.94 / 0.8538 | 31.87 / 0.8749 | 33.44 / 0.9052 |
| CAR & EDSR | $\times 2$ | 27.83 / 0.7602 | 28.66 / 0.7903 | 29.44 / 0.8165 | 30.07 / 0.8347 | 31.31 / 0.8648 |
| IRN | $\times 2$ | 29.24 / 0.8051 | 30.20 / 0.8342 | 31.14 / 0.8604 | 31.86 / 0.8783 | 32.91 / 0.9023 |
| SAIN (Ours) | $\times 2$ | 31.47 / 0.8747 | 33.17 / 0.9082 | 34.73 / 0.9296 | 35.46 / 0.9374 | 35.96 / 0.9419 |

| Downscaling & Upscaling | Scale | JPEG QF=30 | JPEG QF=50 | JPEG QF=70 | JPEG QF=80 | JPEG QF=90 |
| --- | --- | --- | --- | --- | --- | --- |
| Bicubic & Bicubic | $\times 4$ | 26.27 / 0.6945 | 26.81 / 0.7140 | 27.28 / 0.7326 | 27.57 / 0.7456 | 27.90 / 0.7618 |
| Bicubic & SRCNN | $\times 4$ | 25.49 / 0.6819 | 25.91 / 0.7012 | 26.30 / 0.7206 | 26.55 / 0.7344 | 26.84 / 0.7521 |
| Bicubic & EDSR | $\times 4$ | 25.87 / 0.6793 | 26.57 / 0.7052 | 27.31 / 0.7329 | 27.92 / 0.7550 | 28.88 / 0.7889 |
| Bicubic & RDN | $\times 4$ | 25.92 / 0.6819 | 26.61 / 0.7075 | 27.33 / 0.7343 | 27.92 / 0.7556 | 28.84 / 0.7884 |
| Bicubic & RCAN | $\times 4$ | 25.77 / 0.6772 | 26.45 / 0.7031 | 27.21 / 0.7311 | 27.83 / 0.7537 | 28.82 / 0.7884 |
| Bicubic & RRDB | $\times 4$ | 25.87 / 0.6803 | 26.58 / 0.7063 | 27.36 / 0.7343 | 27.99 / 0.7568 | 28.98 / 0.7915 |
| CAR & EDSR | $\times 4$ | 25.25 / 0.6610 | 25.76 / 0.6827 | 26.22 / 0.7037 | 26.69 / 0.7214 | 27.91 / 0.7604 |
| IRN | $\times 4$ | 25.98 / 0.6867 | 26.62 / 0.7096 | 27.24 / 0.7328 | 27.72 / 0.7508 | 28.42 / 0.7777 |
| HCFlow | $\times 4$ | 25.89 / 0.6838 | 26.38 / 0.7029 | 26.79 / 0.7204 | 27.05 / 0.7328 | 27.41 / 0.7485 |
| SAIN (Ours) | $\times 4$ | 27.90 / 0.7745 | 29.05 / 0.8088 | 29.83 / 0.8272 | 30.13 / 0.8331 | 30.31 / 0.8367 |

Cross-dataset evaluation of image rescaling (×2) over standard benchmarks: Set5, Set14, BSD100, and Urban100.

(Figure: cross-dataset evaluation results.)
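
The reported metrics are PSNR (dB) and SSIM between the reconstructed HR image and the ground truth. As a quick reference for the metric itself, below is a minimal PSNR computation in NumPy; the exact evaluation protocol behind the tables (color space, border cropping, etc.) is defined by the released test code, so treat this only as an illustration.

```python
import numpy as np


def psnr(reference: np.ndarray, reconstruction: np.ndarray, max_val: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB. Illustrative only; the repo's test
    script defines the exact protocol used for the reported numbers."""
    ref = reference.astype(np.float64)
    rec = reconstruction.astype(np.float64)
    mse = np.mean((ref - rec) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(max_val ** 2 / mse)


# Toy example: a ground-truth image and a slightly perturbed reconstruction.
gt = np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)
noise = np.random.randint(-5, 6, gt.shape)
rec = np.clip(gt.astype(np.int16) + noise, 0, 255).astype(np.uint8)
print(f"PSNR: {psnr(gt, rec):.2f} dB")
```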

Dependencies and Installation

The code is developed under the following environment:

  1. Python 3.7.1 (Anaconda is recommended)

     conda create -n sain python=3.7.1
     conda activate sain

  2. PyTorch 1.9.0, torchvision 0.10.0, cudatoolkit 11.1

     python -m pip install --upgrade pip
     pip install torch==1.9.0+cu111 torchvision==0.10.0+cu111 -f https://download.pytorch.org/whl/torch_stable.html

  3. Other dependencies

     pip install -r requirements.txt

Dataset Preparation

We use the DIV2K training split for model training and validate on the DIV2K validation split as well as four widely used benchmarks: Set5, Set14, BSDS100, and Urban100.

Please organize the datasets and the code in the following folder structure:

├── datasets
│   ├── BSDS100
│   │   └── *.png
│   ├── DIV2K
│   │   ├── DIV2K_train_HR
│   │   │   └── *.png
│   │   ├── DIV2K_train_LR_bicubic
│   │   │   ├── X2
│   │   │   │   └── *.png
│   │   │   └── X4
│   │   │       └── *.png
│   │   ├── DIV2K_valid_HR
│   │   │   └── *.png
│   │   └── DIV2K_valid_LR_bicubic
│   │       ├── X2
│   │       │   └── *.png
│   │       └── X4
│   │           └── *.png
│   ├── Set5
│   │   ├── GTmod12
│   │   │   └── *.png
│   │   ├── LRbicx2
│   │   │   └── *.png
│   │   └── LRbicx4
│   │       └── *.png
│   ├── Set14
│   │   ├── GTmod12
│   │   │   └── *.png
│   │   ├── LRbicx2
│   │   │   └── *.png
│   │   └── LRbicx4
│   │       └── *.png
│   └── urban100
│       └── *.png
└── SAIN 
    ├── codes
    ├── experiments
    ├── results
    └── tb_logger

To accelerate training, we suggest cropping the 2K-resolution images into sub-images for faster I/O.
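
A minimal sketch of such a cropping step is shown below; the patch size, stride, and output directory are illustrative assumptions rather than values prescribed by this repo (codebases derived from BasicSR typically provide an equivalent sub-image extraction script). The bicubic LR images would need the same treatment at the corresponding scale.

```python
import glob
import os

from PIL import Image

# Hypothetical paths and patch settings -- adjust to your own layout.
SRC_DIR = "datasets/DIV2K/DIV2K_train_HR"
DST_DIR = "datasets/DIV2K/DIV2K_train_HR_sub"
PATCH, STRIDE = 480, 240  # illustrative values, not prescribed by the repo

os.makedirs(DST_DIR, exist_ok=True)
for path in sorted(glob.glob(os.path.join(SRC_DIR, "*.png"))):
    img = Image.open(path)
    w, h = img.size
    stem = os.path.splitext(os.path.basename(path))[0]
    idx = 0
    # Slide a PATCH x PATCH window over the image and save each crop.
    for top in range(0, h - PATCH + 1, STRIDE):
        for left in range(0, w - PATCH + 1, STRIDE):
            idx += 1
            crop = img.crop((left, top, left + PATCH, top + PATCH))
            crop.save(os.path.join(DST_DIR, f"{stem}_s{idx:03d}.png"))
```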

Testing

The pretrained models are available in ./experiments/pretrained_models and the config files are available in ./codes/options for quickly reproducing the results reported in the paper.

For scale x2 with JPEG compression QF=90, change directory to ./codes/ and run

python test.py -opt options/test/test_SAIN_JPEG_g_5_e_5_v_3_x2.yml -format JPEG -qf 90

For scale x4 with JPEG compression QF=90, change directory to ./codes/ and run

python test.py -opt options/test/test_SAIN_JPEG_g_5_e_10_v_6_x4.yml -format JPEG -qf 90

For scale x2 with WebP compression QF=90, change directory to ./codes/ and run

python test.py -opt options/test/test_SAIN_WebP_g_5_e_5_v_3_x2.yml -format WebP -qf 90

The visual results and quantitative reports will be written to ./results.
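
For context, the -format and -qf options select the lossy codec and the quality factor applied to the LR images before upscaling. The snippet below shows how such compression can be reproduced with Pillow; it is a sketch for illustration only, and the degradation actually used at test time is implemented in the released test code.

```python
import io

from PIL import Image


def compress(img: Image.Image, fmt: str = "JPEG", qf: int = 90) -> Image.Image:
    """Round-trip an image through lossy compression at quality factor `qf`.
    Illustrates the LR degradation selected by -format/-qf; not the repo's
    actual test pipeline."""
    buf = io.BytesIO()
    img.save(buf, format=fmt, quality=qf)  # fmt: "JPEG" or "WEBP"
    buf.seek(0)
    return Image.open(buf).convert("RGB")


# Example usage (hypothetical path following the dataset layout above).
lr = Image.open("datasets/DIV2K/DIV2K_valid_LR_bicubic/X2/0801x2.png").convert("RGB")
lr_jpeg90 = compress(lr, "JPEG", 90)
```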

Training

The training configs are included in ./codes/options/train. For example, for scale x2 with JPEG compression, change directory to ./codes/ and run

python train.py -opt options/train/train_SAIN_JPEG_g_5_e_5_v3_x2.yml

Acknowledgement

The code is based on IRN and BasicSR.

Contact

If you have any questions, please create an issue or contact yangjinhai.01@bytedance.com.