Feature Modulation Transformer: Cross-Refinement of Global Representation via High-Frequency Prior for Image Super-Resolution

Ao Li, Le Zhang, Yun Liu and Ce Zhu, "Feature Modulation Transformer: Cross-Refinement of Global Representation via High-Frequency Prior for Image Super-Resolution", ICCV, 2023

[paper] [pretrained models]


Abstract: Transformer-based methods have exhibited remarkable potential in single image super-resolution (SISR) by effectively extracting long-range dependencies. However, most of the current research in this area has prioritized the design of transformer blocks to capture global information, while overlooking the importance of incorporating high-frequency priors, which we believe could be beneficial. In our study, we conducted a series of experiments and found that transformer structures are more adept at capturing low-frequency information, but have limited capacity in constructing high-frequency representations when compared to their convolutional counterparts. Our proposed solution, the cross-refinement adaptive feature modulation transformer (CRAFT), integrates the strengths of both convolutional and transformer structures. It comprises three key components: the high-frequency enhancement residual block (HFERB) for extracting high-frequency information, the shift rectangle window attention block (SRWAB) for capturing global information, and the hybrid fusion block (HFB) for refining the global representation. Our experiments on multiple datasets demonstrate that CRAFT outperforms state-of-the-art methods by up to 0.29dB while using fewer parameters.
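To make the low-/high-frequency distinction in the abstract concrete, here is a minimal sketch that splits a grayscale image into a low-frequency part (a small mean filter) and a high-frequency residual. It is only an illustration, not the analysis used in the paper or the HFERB itself; the function names `box_blur` and `split_frequencies` and the choice of a box filter are assumptions made for this example.

```python
def box_blur(img, radius=1):
    """Low-pass a grayscale image given as a list of rows of floats.

    Uses a (2*radius+1) x (2*radius+1) mean filter. Border pixels average
    only over the neighbours that actually exist.
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total, count = 0.0, 0
            for yy in range(max(0, y - radius), min(h, y + radius + 1)):
                for xx in range(max(0, x - radius), min(w, x + radius + 1)):
                    total += img[yy][xx]
                    count += 1
            out[y][x] = total / count
    return out


def split_frequencies(img, radius=1):
    """Return (low, high) where low is the blurred image and high = img - low."""
    low = box_blur(img, radius)
    high = [
        [pixel - low_pixel for pixel, low_pixel in zip(row, low_row)]
        for row, low_row in zip(img, low)
    ]
    return low, high


if __name__ == "__main__":
    # Toy image with a vertical edge: flat regions carry no high-frequency
    # energy, while the columns next to the edge do.
    image = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
    low, high = split_frequencies(image)
    for row in high:
        print(["%+.2f" % value for value in row])
```

In this toy decomposition the residual is non-zero only near the edge, which is the kind of detail (edges, textures) the abstract refers to as high-frequency information.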

<p align="center"> <img width="700" src="figs/CRAFT.png"> </p>

| HR | LR | SwinIR | ESRT | CRAFT (ours) |
| :---: | :---: | :---: | :---: | :---: |
| <img src="figs/img012_HR.png" height=80> | <img src="figs/img012_LR.png" height=80> | <img src="figs/img012_SWINIR.png" height=80> | <img src="figs/img012_ESRT.png" height=80> | <img src="figs/img012_CRAFT.png" height=80> |
| <img src="figs/YumeiroCooking_HR.png" height=80> | <img src="figs/YumeiroCooking_LR.png" height=80> | <img src="figs/YumeiroCooking_SWINIR.png" height=80> | <img src="figs/YumeiroCooking_ESRT.png" height=80> | <img src="figs/YumeiroCooking_CRAFT.png" height=80> |

Dependencies & Installation

# Clone the GitHub repo and go to the repository directory 'CRAFT-SR'.
git clone https://github.com/AVC2-UESTC/CRAFT-SR.git
cd CRAFT-SR
conda create -n CRAFT python=3.7
conda activate CRAFT
pip install -r requirements.txt
python setup.py develop

Training

Train with DIV2K

Testing

Test images with HR

Results

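PSNR (dB) / SSIM on each benchmark:

| Model | #Parameters | Set5 | Set14 | BSD100 | Urban100 | Manga109 |
| :--- | :---: | :---: | :---: | :---: | :---: | :---: |
| CRAFT-X2 | 737K | 38.23/0.9615 | 33.92/0.9211 | 32.33/0.9016 | 32.86/0.9343 | 39.39/0.9786 |
| CRAFT-X3 | 744K | 34.71/0.9295 | 30.61/0.8469 | 29.24/0.8093 | 28.77/0.8635 | 34.29/0.9491 |
| CRAFT-X4 | 753K | 32.52/0.8989 | 28.85/0.7872 | 27.72/0.7418 | 26.56/0.7995 | 31.18/0.9168 |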
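As a reference for how the PSNR half of each entry is defined, here is a minimal sketch of the standard formula, PSNR = 10 * log10(MAX^2 / MSE). It is not the repository's evaluation script: super-resolution benchmarks commonly evaluate on the luminance channel after cropping a border, and this sketch assumes neither step. The function name `psnr` and the plain list-of-rows image format are choices made for this example.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two same-sized grayscale images.

    Images are lists of rows of numbers. Returns math.inf when the images
    are identical (zero mean squared error).
    """
    if len(reference) != len(restored):
        raise ValueError("images must have the same number of rows")

    squared_error = 0.0
    count = 0
    for ref_row, res_row in zip(reference, restored):
        if len(ref_row) != len(res_row):
            raise ValueError("images must have the same number of columns")
        for ref_pixel, res_pixel in zip(ref_row, res_row):
            squared_error += (ref_pixel - res_pixel) ** 2
            count += 1

    if count == 0:
        raise ValueError("images must not be empty")

    mse = squared_error / count
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    hr = [[100, 120], [140, 160]]
    sr = [[101, 119], [141, 159]]  # every pixel is off by one gray level
    print("PSNR: %.2f dB" % psnr(hr, sr))
```

Because the scale is logarithmic, the sub-dB gaps between methods on these benchmarks correspond to small but consistent reductions in mean squared error.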

Citation

If you find the code helpful in your research or work, please cite the following paper(s).

@inproceedings{li2023craft,
  title={Feature Modulation Transformer: Cross-Refinement of Global Representation via High-Frequency Prior for Image Super-Resolution},
  author={Li, Ao and Zhang, Le and Liu, Yun and Zhu, Ce},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={12514--12524},
  year={2023}
}

Acknowledgements

This code is built on BasicSR, CAT, and Restormer.