<div align="center"> Multi-scale Attention Network for Single Image Super-Resolution </div>
<div align="center">Yan Wang<sup>†</sup>, Yusen Li<sup>†</sup>, Gang Wang, Xiaoguang Liu
</div> <p align="center"> Nankai University </p> <p align="center"> <a href="https://arxiv.org/abs/2209.14145" alt="arXiv"> <img src="https://img.shields.io/badge/arXiv-2209.14145-b31b1b.svg?style=flat" /></a> <a href="https://github.com/icandle/MAN/blob/main/LICENSE" alt="license"> <img src="https://img.shields.io/badge/license-Apache--2.0-%23B7A800" /></a> <a href="https://github.com/icandle/MAN/blob/main/images/man_ntire24.pdf"> <img src="https://img.shields.io/badge/Docs-Slide&Poster-8A2BE2" /></a> </p>

**Overview:** To unleash the potential of ConvNets in super-resolution, we propose a multi-scale attention network (MAN) that couples a classical multi-scale mechanism with the emerging large kernel attention. In particular, we propose multi-scale large kernel attention (MLKA) and a gated spatial attention unit (GSAU). Experimental results show that our MAN performs on par with SwinIR and achieves varied trade-offs between state-of-the-art performance and computation.
This repository contains the PyTorch implementation of MAN (CVPRW 2024).
<details>
<summary>Table of contents</summary>

- Requirements
- Datasets
- Implementation Details
- Train and Test
- Results and Models
- Acknowledgments
- Citation

</details>
## ⚙️ Requirements

This implementation is built on PyTorch and the BasicSR framework; install BasicSR and its dependencies before training or testing.
## 🎈 Datasets

*Testing*: Set5, Set14, BSD100, Urban100, and Manga109 (available via Google Drive/Baidu Netdisk).

*Preparing*: please refer to the Dataset Preparation guide of BasicSR.
## 🔎 Implementation Details

**Network architecture**: the group number (`n_resgroups`) is fixed to 1 for simplicity, while the MAB number (`n_resblocks`) is 5/24/36 and the channel width (`n_feats`) is 48/60/180 for the tiny/light/base MAN, respectively.
<p align="center"> <img src="images/MAN_arch.png" width=100% height=100% > <br /></p> <em> Overview of the proposed MAN constituted of three components: the shallow feature extraction module (SF), the deep feature extraction module (DF) based on multiple multi-scale attention blocks (MAB), and the high-quality image reconstruction module. </em>
**Component details**: three multi-scale decomposition modes are utilized in the MLKA, and a 7×7 depth-wise convolution is used in the GSAU. A conceptual sketch of both components follows the figure below.
<p align="center"> <img src="images/MAN_details.png" width=60% height=60% > <br /></p> <em> Details of Multi-scale Large Kernel Attention (MLKA), Gated Spatial Attention Unit (GSAU), and Large Kernel Attention Tail (LKAT). </em> ▶️ Train and Test
## ▶️ Train and Test

The BasicSR framework is used for both training and testing.
**Training** with the example option:

```sh
CUDA_VISIBLE_DEVICES=0,1,2,3 \
python -m torch.distributed.launch --nproc_per_node=4 --master_port=4321 train.py -opt options/train_MAN.yml --launcher pytorch
```
**Testing** with the example option:

```sh
python test.py -opt options/test_MAN.yml
```
The training and testing results will be saved in the `./experiments` and `./results` folders, respectively.
## 📊 Results and Models
Pretrained models are available at Google Drive and Baidu Netdisk (password: `mans` for all links).
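To sanity-check a downloaded checkpoint before plugging it into BasicSR, here is a minimal sketch (the filename is hypothetical; BasicSR-style checkpoints typically nest the weights under a `params` key):

```python
import torch

# Hypothetical filename: substitute the checkpoint you actually downloaded.
ckpt = torch.load("MAN_light_x4.pth", map_location="cpu")

# BasicSR-style checkpoints usually store the weights under 'params';
# fall back to the raw object if the file is a plain state_dict.
state = ckpt.get("params", ckpt) if isinstance(ckpt, dict) else ckpt

num_params = sum(t.numel() for t in state.values())
print(f"{num_params / 1e3:.0f}K parameters")  # expect ~840K for MAN-light
```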
HR (x4) | MAN-tiny | EDSR-base+ | MAN-light | EDSR+ | MAN |
---|---|---|---|---|---|
<img width=100% height=100% src="images/Visual_Results/U004/HR.png"> | <img width=100% height=100% src="images/Visual_Results/U004/MAN-Tiny.png"> | <img width=100% height=100% src="images/Visual_Results/U004/EDSR-Base.png"> | <img width=100% height=100% src="images/Visual_Results/U004/MAN-Light.png"> | <img width=100% height=100% src="images/Visual_Results/U004/EDSR.png"> | <img width=100% height=100% src="images/Visual_Results/U004/MAN.png"> |
<img width=100% height=100% src="images/Visual_Results/U012/HR.png"> | <img width=100% height=100% src="images/Visual_Results/U012/MAN-Tiny.png"> | <img width=100% height=100% src="images/Visual_Results/U012/EDSR-Base.png"> | <img width=100% height=100% src="images/Visual_Results/U012/MAN-Light.png"> | <img width=100% height=100% src="images/Visual_Results/U012/EDSR.png"> | <img width=100% height=100% src="images/Visual_Results/U012/MAN.png"> |
<img width=100% height=100% src="images/Visual_Results/U044/HR.png"> | <img width=100% height=100% src="images/Visual_Results/U044/MAN-Tiny.png"> | <img width=100% height=100% src="images/Visual_Results/U044/EDSR-Base.png"> | <img width=100% height=100% src="images/Visual_Results/U044/MAN-Light.png"> | <img width=100% height=100% src="images/Visual_Results/U044/EDSR.png"> | <img width=100% height=100% src="images/Visual_Results/U044/MAN.png"> |
<img width=100% height=100% src="images/Visual_Results/D0850/HR.png"> | <img width=100% height=100% src="images/Visual_Results/D0850/MAN-Tiny.png"> | <img width=100% height=100% src="images/Visual_Results/D0850/EDSR-Base.png"> | <img width=100% height=100% src="images/Visual_Results/D0850/MAN-Light.png"> | <img width=100% height=100% src="images/Visual_Results/D0850/EDSR.png"> | <img width=100% height=100% src="images/Visual_Results/D0850/MAN.png"> |
Params/FLOPs | 150K/8G | 1518K/114G | 840K/47G | 43090K/2895G | 8712K/495G |
Results of our MAN-tiny/light/base models. PSNR/SSIM are reported on the Set5 validation set to show overall performance; visual results on all five test sets are linked in the last column.
Methods | Params | FLOPs | PSNR/SSIM (x2) | PSNR/SSIM (x3) | PSNR/SSIM (x4) | Results |
---|---|---|---|---|---|---|
MAN-tiny | 150K | 8.4G | 37.91/0.9603 | 34.23/0.9258 | 32.07/0.8930 | x2/x3/x4 |
MAN-light | 840K | 47.1G | 38.18/0.9612 | 34.65/0.9292 | 32.50/0.8988 | x2/x3/x4 |
MAN+ | 8712K | 495G | 38.44/0.9623 | 34.97/0.9315 | 32.87/0.9030 | x2/x3/x4 |
## 💖 Acknowledgments

We would like to thank VAN and BasicSR for their enlightening work!
## 🎓 Citation
```BibTeX
@inproceedings{wang2024multi,
  title={Multi-scale Attention Network for Single Image Super-Resolution},
  author={Wang, Yan and Li, Yusen and Wang, Gang and Liu, Xiaoguang},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
  year={2024}
}
```

or

```BibTeX
@article{wang2022multi,
  title={Multi-scale Attention Network for Single Image Super-Resolution},
  author={Wang, Yan and Li, Yusen and Wang, Gang and Liu, Xiaoguang},
  journal={arXiv preprint arXiv:2209.14145},
  year={2022}
}
```