# AMT: All-Pairs Multi-Field Transforms for Efficient Frame Interpolation
This repository contains the official implementation of the following paper:
AMT: All-Pairs Multi-Field Transforms for Efficient Frame Interpolation<br> Zhen Li<sup>*</sup>, Zuo-Liang Zhu<sup>*</sup>, Ling-Hao Han, Qibin Hou, Chun-Le Guo, Ming-Ming Cheng<br> (* denotes equal contribution) <br> Nankai University <br> In CVPR 2023<br>
[Paper] [Project Page] [Web demos] [Video]
AMT is a lightweight, fast, and accurate algorithm for frame interpolation. It aims to provide a practical solution for generating videos from a small number of given frames (at least two).
- More examples can be found on our project page.
## Web demos
Integrated into Hugging Face Spaces 🤗 using Gradio. Try out the Web Demo:
Try AMT to interpolate between two or more images at
## Change Log
- Apr 20, 2023: Our code is publicly available.
## Method Overview
For technical details, please refer to the method.md file, or read the full report on arXiv.
## Dependencies and Installation

1. Clone the repo:

   ```bash
   git clone https://github.com/MCG-NKU/AMT.git
   ```

2. Create the conda environment and install dependencies:

   ```bash
   conda env create -f environment.yaml
   conda activate amt
   ```

3. Download the pretrained models for the demos from [Pretrained Models](#Pretrained) and place them in the `pretrained` folder.
## Quick Demo

Note that the selected pretrained model (`[CKPT_PATH]`) must match the config file (`[CFG]`).
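For convenience, the valid pairings from the Pretrained Models table below can be expressed as a small lookup. This is an illustrative snippet, not something the repo ships; the name `MODEL_ZOO` and the helper are hypothetical.

```python
# Config file -> matching checkpoint, per the Pretrained Models table below.
# (Illustrative only; the repo itself does not ship this mapping.)
MODEL_ZOO = {
    "cfgs/AMT-S.yaml":       "pretrained/amt-s.pth",
    "cfgs/AMT-L.yaml":       "pretrained/amt-l.pth",
    "cfgs/AMT-G.yaml":       "pretrained/amt-g.pth",
    "cfgs/AMT-S_gopro.yaml": "pretrained/gopro_amt-s.pth",
}

def checkpoint_for(cfg: str) -> str:
    """Return the checkpoint matching a config file, or raise if unknown."""
    try:
        return MODEL_ZOO[cfg]
    except KeyError:
        raise ValueError(f"No pretrained checkpoint listed for config {cfg!r}")

print(checkpoint_for("cfgs/AMT-S.yaml"))  # pretrained/amt-s.pth
```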
When creating a video demo, increasing $n$ slows down the motion in the video. (With $m$ input frames, setting `[N_ITER]` $= n$ yields $2^n \times (m-1) + 1$ output frames.)
```bash
python demos/demo_2x.py -c [CFG] -p [CKPT] -n [N_ITER] -i [INPUT] -o [OUT_PATH] -r [FRAME_RATE]

# [INPUT] (-i) can be a video, a glob pattern, multiple images, or a folder of frames:
# -i demo.mp4 (video) / img_*.png (glob pattern) / img0.png img1.png (images) / demo_input (folder)

# e.g. a simple usage:
python demos/demo_2x.py -c cfgs/AMT-S.yaml -p pretrained/amt-s.pth -n 6 -i assets/quick_demo/img0.png assets/quick_demo/img1.png
```
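The frame-count formula above is easy to sanity-check with a tiny helper (hypothetical code, not part of the repo):

```python
def n_output_frames(m: int, n: int) -> int:
    """Output frames after n recursive 2x passes over m input frames.

    Each pass inserts one new frame into every gap between consecutive
    frames, doubling the number of gaps: 2**n * (m - 1) + 1 frames total.
    """
    return 2 ** n * (m - 1) + 1

print(n_output_frames(2, 6))  # 65, matching the example command with -n 6
```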
- Note: enable `--save_images` to save the output images (saving slows down when there are many output frames).
- Supported input types: a video, a glob pattern, multiple images, or a folder containing input frames.
- Results are saved in the `[OUT_PATH]` folder (default: `results/2x`).
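One plausible way the `-i` dispatch described above could work is sketched below. This is a simplified illustration under assumed conventions (the video-extension list and helper name are invented); the actual logic in `demos/demo_2x.py` may differ.

```python
import glob
import os

VIDEO_EXTS = (".mp4", ".avi", ".mov", ".mkv")  # assumed set of video extensions

def parse_input(args):
    """Classify the -i argument(s): video file, glob pattern, folder, or image list."""
    if len(args) == 1:
        arg = args[0]
        if arg.lower().endswith(VIDEO_EXTS):
            return "video", [arg]
        if any(ch in arg for ch in "*?["):  # treat wildcard characters as a glob
            return "images", sorted(glob.glob(arg))
        if os.path.isdir(arg):
            return "images", [os.path.join(arg, f) for f in sorted(os.listdir(arg))]
    # otherwise: one or more explicit image paths
    return "images", list(args)
```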
## Pretrained Models
<p id="Pretrained"></p> <table> <thead> <tr> <th> Dataset </th> <th> :link: Download Links </th> <th> Config file </th> <th> Trained on </th> <th> Arbitrary/Fixed </th> </tr> </thead> <tbody> <tr> <td>AMT-S</td> <th> [<a href="https://drive.google.com/file/d/1WmOKmQmd6pnLpID8EpUe-TddFpJuavrL/view?usp=share_link">Google Drive</a>][<a href="https://pan.baidu.com/s/1yGaNLeb9TG5-81t0skrOUA?pwd=f66n">Baidu Cloud</a>][<a href="https://huggingface.co/lalala125/AMT/resolve/main/amt-s.pth">Hugging Face</a>] </th> <th> [<a href="cfgs/AMT-S.yaml">cfgs/AMT-S</a>] </th> <th>Vimeo90k</th> <th>Fixed</th> </tr> <tr> <td>AMT-L</td> <th>[<a href="https://drive.google.com/file/d/1UyhYpAQLXMjFA55rlFZ0kdiSVTL7oU-z/view?usp=share_link">Google Drive</a>][<a href="https://pan.baidu.com/s/1qI4fBgS405Bd4Wn1R3Gbeg?pwd=nbne">Baidu Cloud</a>][<a href="https://huggingface.co/lalala125/AMT/resolve/main/amt-l.pth">Hugging Face</a>]</th> <th> [<a href="cfgs/AMT-L.yaml">cfgs/AMT-L</a>] </th> <th>Vimeo90k</th> <th>Fixed</th> </tr> <tr> <td>AMT-G</td> <th>[<a href="https://drive.google.com/file/d/1yieLtKh4ei3gOrLN1LhKSP_9157Q-mtP/view?usp=share_link">Google Drive</a>][<a href="https://pan.baidu.com/s/1AjmQVziQut1bXgQnDcDKvA?pwd=caf6">Baidu Cloud</a>][<a href="https://huggingface.co/lalala125/AMT/resolve/main/amt-g.pth">Hugging Face</a>] </th> <th> [<a href="cfgs/AMT-G.yaml">cfgs/AMT-G</a>] </th> <th>Vimeo90k</th> <th>Fixed</th> </tr> <tr> <td>AMT-S</td> <th>[<a href="https://drive.google.com/file/d/1f1xAF0EDm-rjDdny8_aLyeedfM0QL4-C/view?usp=share_link">Google Drive</a>][<a href="https://pan.baidu.com/s/1eZtoULyduQM8AkXeYEBOEw?pwd=8hy3">Baidu Cloud</a>][<a href="https://huggingface.co/lalala125/AMT/resolve/main/gopro_amt-s.pth">Hugging Face</a>] </th> <th> [<a href="cfgs/AMT-S_gopro.yaml">cfgs/AMT-S_gopro</a>] </th> <th>GoPro</th> <th>Arbitrary</th> </tr> </tbody> </table>

## Training and Evaluation
Please refer to develop.md to learn how to benchmark AMT and how to train a new AMT model from scratch.
## Citation
If you find our repo useful for your research, please consider citing our paper:
```bibtex
@inproceedings{licvpr23amt,
  title={AMT: All-Pairs Multi-Field Transforms for Efficient Frame Interpolation},
  author={Li, Zhen and Zhu, Zuo-Liang and Han, Ling-Hao and Hou, Qibin and Guo, Chun-Le and Cheng, Ming-Ming},
  booktitle={IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2023}
}
```
## License

This code is licensed under the Creative Commons Attribution-NonCommercial 4.0 International license for non-commercial use only. Any commercial use of this code requires formal permission prior to use.
## Contact

For technical questions, please contact `zhenli1031[AT]gmail.com` and `nkuzhuzl[AT]gmail.com`.

For commercial licensing, please contact `cmm[AT]nankai.edu.cn`.
## Acknowledgement

We thank Jia-Wen Xiao, Zheng-Peng Duan, Rui-Qi Wu, and Xin Jin for proofreading. We thank Zhewei Huang for his suggestions.
Here are some great resources we benefit from:
- IFRNet and RIFE for data processing, benchmarking, and loss designs.
- RAFT, M2M-VFI, and GMFlow for inspiration.
- FILM for Web demo reference.
If you develop or use AMT in your projects, please let us know; we will list your project in this repository.
We also thank all of our contributors.
<a href="https://github.com/MCG-NKU/AMT/graphs/contributors"> <img src="https://contrib.rocks/image?repo=MCG-NKU/AMT" /> </a>