# Sparse Global Matching for Video Frame Interpolation with Large Motion
<p align="center"> <a href="https://scholar.google.com.hk/citations?hl=zh-CN&view_op=list_works&gmla=AKKJWFe0ZBvfA_4yxMRe8BW79xNafjCwXtxN10finOaqV1EREnZGxSX6DbpZelBUJD0GZmp5S7unCf76xrgOfnS6SVA&user=dvUKnKEAAAAJ" target='_blank'>Chunxu Liu*</a>, <a href="https://scholar.google.com.hk/citations?user=48vfuRAAAAAJ&hl=en" target='_blank'>Guozhen Zhang*</a>, <a href="https://scholar.google.com/citations?user=1c9oQNMAAAAJ&hl=en" target='_blank'>Rui Zhao</a>, <a href="https://scholar.google.com.hk/citations?user=HEuN8PcAAAAJ&hl=en" target='_blank'>Limin Wang</a>, <br> Nanjing University, SenseTime Research </p>

<p align="center"> <a href="http://arxiv.org/abs/2404.06913" target='_blank'> <img src="https://img.shields.io/badge/Paper-π-red"> </a> <a href="https://sgm-vfi.github.io/" target='_blank'> <img src="https://img.shields.io/badge/Project Page-π-blue"> </a> </p>

<p style="font-size:30px;"> <b>TL;DR: </b>We introduce the <b>Sparse Global Matching Pipeline</b> for the Video Frame Interpolation task: </p>

<p style="font-size:25px;"> 0. Estimate intermediate initial flows with local information. <br> 1. Identify flaws in the initial flows. <br> 2. Estimate flow compensation by <b>Sparse Global Matching</b>. <br> 3. Merge the flow compensation with the initial flows. <br> 4. Compute the intermediate frame using the flows from step 3 and keep refining. </p>

<div align="center"> <img src="figs/pipeline.png" width="1200"/> </div>

To evaluate the effectiveness of our method in handling large motion, we carefully curated a more challenging subset from commonly used benchmarks. Experiments show that our method brings improvements on these challenging large-motion benchmarks.
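As a rough illustration of steps 1–3, here is a minimal sketch of merging sparse flow compensation into the flawed regions of an initial flow field. All names here are hypothetical placeholders, not the repo's actual API:

```python
import numpy as np

def merge_flow_compensation(initial_flow, error_map, compensation, top_ratio=0.5):
    """Hypothetical sketch: apply flow compensation only at the most
    flawed pixels of the initial intermediate flow.

    initial_flow: (H, W, 2) dense flow from the local branch
    error_map:    (H, W) per-pixel flaw score for the initial flow
    compensation: (H, W, 2) correction from sparse global matching
    top_ratio:    fraction of most-flawed pixels to correct (e.g. 1/2)
    """
    h, w = error_map.shape
    k = max(1, int(h * w * top_ratio))
    # Select the top-k most flawed pixels.
    idx = np.argpartition(error_map.ravel(), -k)[-k:]
    mask = np.zeros(h * w, dtype=bool)
    mask[idx] = True
    mask = mask.reshape(h, w)
    # Merge: compensate only where the initial flow is judged flawed.
    merged = initial_flow.copy()
    merged[mask] += compensation[mask]
    return merged, mask
```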
<div align="center"> <img src="figs/demo.gif" width="500"/> </div>

## Training Dataset Preparation
We need X4K1000FPS for our sparse global matching branch fine-tuning, and Vimeo90K for our local branch training. After downloading and processing the datasets, you can place them in the following folder structure:
```
.
├── ...
└── datasets
    ├── X4K1000FPS
    │   ├── train
    │   ├── val
    │   └── test
    └── vimeo_triplet (needed if training the local branch)
        ├── ...
        ├── tri_trainlist.txt
        └── sequences
```
## Environment Setup
```shell
conda create -n sgm-vfi python=3.8
pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 --extra-index-url https://download.pytorch.org/whl/cu117
pip install -r requirements.txt
```
## Sparse Global Matching Fine-tuning
### Pretrained File Preparation
We provide the pretrained local branch model for a quicker launch of sparse global matching. You can download the pretrained model here and place it at `[project_folder]/log/ours-local/ckpt/ours-local.pth`.
Furthermore, for the global feature extractor GMFlow, you can download the pretrained model here, then unzip it and place `gmflow_sintel-0c07dcb3.pth` at `[project_folder]/pretrained/gmflow_sintel-0c07dcb3.pth`.
Finally, for fine-tuning the sparse global matching branch, the file folder should look like [this](#jump).
### Fine-tuning
After the preparation, you can modify and check the settings in `config.py`; the default setting is for `ours-small-1/2` fine-tuning.
Finally, you can start the fine-tuning with the following command:
```shell
torchrun --nproc_per_node=4 train_x4k.py --batch_size 8 --need_patch --train_data_path path/to/X4K/train --val_data_path path/to/X4K/val
```
### <span id="jump">Top Related Files</span>
```
.
├── train_x4k.py
├── Trainer_x4k.py
├── dataset_x4k.py
├── config.py
├── pretrained
│   └── gmflow_sintel-0c07dcb3.pth
├── log
│   └── ours-local
│       └── ckpt
│           └── ours-local.pth
└── model
    ├── __init__.py
    ├── flow_estimation_global.py
    ├── matching.py
    ├── gmflow.py
    └── ...
```
## Local Branch Training
We also provide scripts for training the local branch. After preparing the Vimeo90K dataset and checking the settings in `config_base.py` (the default setting is for `ours-local-branch` model training), you can start training with the following command:
```shell
torchrun --nproc_per_node=4 train_base.py --batch_size 8 --data_path ./vimeo_triplet
```
### Top Related Files
```
.
├── train_base.py
├── Trainer_base.py
├── dataset.py
├── config_base.py
└── model
    ├── __init__.py
    ├── flow_estimation_local.py
    └── ...
```
## Evaluation
In our paper, we analyzed the mean motion magnitude and the motion sufficiency (the minimum among the top 5% of per-pixel flow magnitudes) of the most frequently used large-motion benchmarks.
<div align="center"> <img src="figs/chart.png" width="600"/> </div>

### Evaluation Datasets Preparation
As a result, we curated the most challenging half of Xiph and of SNU-FILM hard and extreme, with the help of the `raft-sintel.pth` checkpoint provided by RAFT. The resulting benchmark is available here. Put `top-half-motion-sufficiency_test-hard.txt` and `top-half-motion-sufficiency_test-extreme.txt` in the SNU-FILM dataset folder, and `top-half-motion-sufficiency-gap2.txt` in the Xiph dataset folder.
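The motion-sufficiency statistic used for this curation can be sketched as follows. This is a paraphrase of the paper's definition, not the repo's exact code, and the function name is an assumption:

```python
import numpy as np

def motion_sufficiency(flow, top_ratio=0.05):
    """Sketch: motion sufficiency of a dense flow field, i.e. the minimum
    magnitude among the top 5% largest per-pixel flow magnitudes.

    flow: (H, W, 2) optical flow (e.g. from RAFT)
    """
    # Per-pixel flow magnitude, flattened.
    mag = np.sqrt((flow ** 2).sum(axis=-1)).ravel()
    k = max(1, int(len(mag) * top_ratio))
    # k largest magnitudes; their minimum is the motion sufficiency.
    top = np.partition(mag, -k)[-k:]
    return top.min()
```

A clip whose flow passes a threshold on this statistic has large motion almost everywhere that matters, not just a few fast pixels.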
### Model Checkpoint Preparation
We provide the checkpoints here for evaluation. Please download and place them in the following folder structure:
```
.
├── ...
└── log
    └── ours-1-2-points
        └── ckpt
            └── ours-1-2-points.pth
```
We provide the evaluation scripts for `ours-1-2-points` as follows:
#### XTest-L
```shell
python benchmark/XTest_interval.py --path path/to/XTest/test --exp_name ours-1-2-points --num_key_points 0.5
```
#### SNU-FILM-hard/extreme-L
```shell
python benchmark/SNU_FILM.py --path ./data/SNU-FILM --exp_name ours-1-2-points --num_key_points 0.5
```
(Suggestion: you can use `ln -s path/to/SNUFILM (project folder)/data/SNU-FILM` to avoid extra processing of the input path.)
#### Xiph-L
```shell
python benchmark/Xiph.py --path ./xiph --exp_name ours-1-2-points --num_key_points 0.5
```
(Suggestion: you can use `ln -s path/to/Xiph (project folder)/xiph` to avoid extra processing of the input path.)
## Simple Inference
You can try out our simple 2x inference demo with the following command:
```shell
python demo_2x.py
```
(You need to prepare the model checkpoint at `log/ours-1-2-points/ckpt/ours-1-2-points.pth` and the GMFlow pretrained model at `pretrained/gmflow_sintel-0c07dcb3.pth`.)
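Conceptually, a 2x demo inserts one synthesized frame between each consecutive pair of input frames. A generic sketch, with a hypothetical `interpolate` callable standing in for the model's forward pass:

```python
def interpolate_2x(frames, interpolate):
    """Double the frame rate by inserting one intermediate frame between
    each consecutive pair. `interpolate(a, b)` is a stand-in for the
    interpolation model; it may be any callable returning a middle frame.
    """
    out = []
    for a, b in zip(frames, frames[1:]):
        out.append(a)
        out.append(interpolate(a, b))  # synthesized intermediate frame
    out.append(frames[-1])  # keep the final original frame
    return out
```

For example, with plain numbers standing in for frames, `interpolate_2x([0, 2, 4], lambda a, b: (a + b) / 2)` returns `[0, 1.0, 2, 3.0, 4]`.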
## Citation
If you find this project helpful for your research or applications, please feel free to leave a star ⭐ and cite our paper:
```bibtex
@InProceedings{Liu_2024_CVPR,
    author    = {Liu, Chunxu and Zhang, Guozhen and Zhao, Rui and Wang, Limin},
    title     = {Sparse Global Matching for Video Frame Interpolation with Large Motion},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {19125-19134}
}
```
## License and Acknowledgement
This project is released under the Apache 2.0 license. The code is based on GMFlow, RAFT, EMA-VFI, RIFE, and IFRNet; please also follow their licenses. Thanks for their awesome work!