
Scale-Adaptive Feature Aggregation for Efficient Space-Time Video Super-Resolution

YouTube | Poster | Enhancement Model | Demo | Introduction in Chinese

Introduction

We aim to increase both video resolution and frame rate end-to-end (end-to-end space-time video super-resolution, STVSR). This project is the implementation of Scale-Adaptive Feature Aggregation for Efficient Space-Time Video Super-Resolution. Our SAFA network outperforms recent state-of-the-art methods such as TMNet and VideoINR by more than 0.5 dB PSNR on average, while requiring less than half the number of parameters and only one third of the computational cost.
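To put the 0.5 dB figure in context, the standard PSNR definition can be sketched as below. This is a minimal illustration on flat sequences of pixel values, not the evaluation code used in this project; the function names `psnr` and `mse_ratio_for_gain` are hypothetical, and an 8-bit peak value of 255 is assumed.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """PSNR in dB between two equal-length sequences of pixel values.

    Assumes an 8-bit peak value of 255 unless max_value is overridden.
    """
    if len(reference) == 0 or len(reference) != len(estimate):
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(max_value ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Factor by which the MSE shrinks for a given PSNR gain in dB."""
    return 10 ** (-gain_db / 10)


if __name__ == "__main__":
    print(psnr([10, 20, 30], [12, 18, 33]))
    print(mse_ratio_for_gain(0.5))
```

Because PSNR is logarithmic in the mean squared error, a gain of 0.5 dB corresponds to an MSE roughly 11% lower at the same peak value.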

We have released dedicated visual-effect models for general users. Some of our insights on multi-scale processing and feature fusion are also reflected in RIFE applications; see Practical-RIFE.

Space-Time Super-Resolution:

slomo_origin -> slomo

CLI Usage

Installation

git clone git@github.com:megvii-research/WACV2024-SAFA.git
cd WACV2024-SAFA
pip3 install -r requirements.txt

Download the pretrained model from Google Drive.

Run

Image Interpolation

python3 inference_img.py --img demo/i0.png demo/i1.png --exp=3

(2^3=8X interpolation results)

python3 inference_img.py --img demo/i0.png demo/i1.png --ratio=0.4

(for an arbitrary timestep)
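The relation between the two options above can be sketched with a little arithmetic. The helper below is hypothetical and is not part of inference_img.py; it assumes that `--exp=N` yields frames at evenly spaced timesteps between the two inputs (the script may realize this differently, for example by recursive bisection), and that `--ratio` is a timestep in [0, 1], where 0 is the first input frame and 1 is the second.

```python
from fractions import Fraction


def interpolation_timesteps(exp):
    """Timesteps of the frames strictly between the two inputs.

    Assumption: 2**exp X interpolation places frames at k / 2**exp
    for k = 1 .. 2**exp - 1 (evenly spaced in time).
    """
    if exp < 1:
        raise ValueError("exp must be a positive integer")
    n = 2 ** exp
    return [Fraction(k, n) for k in range(1, n)]


if __name__ == "__main__":
    steps = interpolation_timesteps(3)
    # 8X interpolation: 7 new frames between the two inputs.
    print(len(steps), [float(t) for t in steps])
```

Under these assumptions, `--exp=3` gives 7 intermediate frames at multiples of 1/8, while a value such as `--ratio=0.4` requests a single frame at a timestep that does not fall on that grid.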

Training and Reproduction

We use 16 CPUs and 4 GPUs for training:

python3 -m torch.distributed.launch --nproc_per_node=4 train.py --world_size=4

The training scheme is mainly adopted from RIFE.

Recommendation

We recommend the following related papers:

ECCV22 - Real-Time Intermediate Flow Estimation for Video Frame Interpolation

CVPR23 - A Dynamic Multi-Scale Voxel Flow Network for Video Prediction

Citation

If you find this project helpful, please feel free to leave a star or cite our paper:

@inproceedings{huang2024safa,
  title={Scale-Adaptive Feature Aggregation for Efficient Space-Time Video Super-Resolution},
  author={Huang, Zhewei and Huang, Ailin and Hu, Xiaotao and Hu, Chen and Xu, Jun and Zhou, Shuchang},
  booktitle={Winter Conference on Applications of Computer Vision (WACV)},
  year={2024}
}

Reference

RIFE, DMVFN, TMNet, ZoomingSlomo, VideoINR