Sparse Global Matching for Video Frame Interpolation with Large Motion

<p align="center"> <a href="https://scholar.google.com.hk/citations?hl=zh-CN&view_op=list_works&gmla=AKKJWFe0ZBvfA_4yxMRe8BW79xNafjCwXtxN10finOaqV1EREnZGxSX6DbpZelBUJD0GZmp5S7unCf76xrgOfnS6SVA&user=dvUKnKEAAAAJ" target='_blank'>Chunxu Liu*</a>,&nbsp; <a href="https://scholar.google.com.hk/citations?user=48vfuRAAAAAJ&hl=en" target='_blank'>Guozhen Zhang*</a>,&nbsp; <a href="https://scholar.google.com/citations?user=1c9oQNMAAAAJ&hl=en" target='_blank'>Rui Zhao</a>,&nbsp; <a href="https://scholar.google.com.hk/citations?user=HEuN8PcAAAAJ&hl=en" target='_blank'>Limin Wang</a>,&nbsp; <br> Nanjing University, &nbsp; SenseTime Research </p> <p align="center"> <a href="http://arxiv.org/abs/2404.06913" target='_blank'> <img src="https://img.shields.io/badge/Paper-📕-red"> </a> <a href="https://sgm-vfi.github.io/" target='_blank'> <img src="https://img.shields.io/badge/Project Page-🔗-blue"> </a> </p> <p style="font-size:30px;"> <b>TL;DR: </b>We introduce the <b>Sparse Global Matching Pipeline</b> for the Video Frame Interpolation task: </p> <p style="font-size:25px;"> 0. Estimate intermediate initial flows with local information. <br> 1. Identify flaws in the initial flows.<br> 2. Estimate flow compensation by <b>Sparse Global Matching</b>. <br> 3. Merge the flow compensation with the initial flows. <br> 4. Compute the intermediate frame using the flows from 3. and keep refining. </p> <div align="center"> <img src="figs/pipeline.png" width="1200"/> </div>
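
As a rough illustration of steps 1 to 3, the sketch below picks the most flawed pixels and corrects only those. It is a toy in plain Python, not the code in this repository: `select_flawed_points`, `merge_compensation`, the `error_map` scores and the additive merge are all hypothetical stand-ins, since this README does not specify how flaws are scored or how the merge is performed.

```python
import math


def select_flawed_points(error_map, ratio):
    """Step 1 (assumed form): indices of the `ratio` fraction of pixels with the largest error."""
    k = max(1, math.ceil(ratio * len(error_map)))
    ranked = sorted(range(len(error_map)), key=lambda i: error_map[i], reverse=True)
    return sorted(ranked[:k])


def merge_compensation(initial_flow, compensation, points):
    """Step 3 (assumed additive merge): correct the flow only at the selected sparse points."""
    merged = list(initial_flow)
    for i in points:
        u, v = initial_flow[i]
        du, dv = compensation[i]
        merged[i] = (u + du, v + dv)
    return merged


# Toy example on a flattened 4-pixel image.
initial_flow = [(1.0, 0.0), (1.0, 0.0), (1.0, 0.0), (1.0, 0.0)]
error_map = [0.1, 0.9, 0.3, 0.8]                # hypothetical per-pixel flaw scores
compensation = {1: (2.0, 1.0), 3: (-1.0, 0.5)}  # step 2 output, assumed to be given

points = select_flawed_points(error_map, ratio=0.5)
refined_flow = merge_compensation(initial_flow, compensation, points)
```

Pixels that are not selected keep their initial flow, which is the point of a sparse correction: the expensive global matching is only needed where the local estimate looks unreliable.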

To evaluate how well our method handles large motion, we carefully curate a more challenging subset from commonly used benchmarks. Experiments show that our method brings improvements on these challenging large-motion benchmarks.

<div align="center"> <img src="figs/demo.gif" width="500"/> </div>

Training Dataset Preparation

We need X4K1000FPS for fine-tuning the sparse global matching branch and Vimeo90K for training the local branch. After downloading and processing the datasets, place them in the following folder structure:

.
├── ...
└── datasets
    ├── X4K1000FPS
    │   ├── train
    │   ├── val
    │   └── test
    └── vimeo_triplet (needed only when training the local branch)
        ├── ...
        ├── tri_trainlist.txt
        └── sequences
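
Before launching a long training run, it can help to confirm that the layout above is in place. The following is a minimal sketch using only the standard library; `missing_entries` and its constants are hypothetical helpers and not part of this repository.

```python
from pathlib import Path

# Entries taken from the folder structure above.
X4K_ENTRIES = [
    "X4K1000FPS/train",
    "X4K1000FPS/val",
    "X4K1000FPS/test",
]
# Only needed when training the local branch.
VIMEO_ENTRIES = [
    "vimeo_triplet/tri_trainlist.txt",
    "vimeo_triplet/sequences",
]


def missing_entries(datasets_root, include_vimeo=False):
    """Return the expected paths (relative to datasets_root) that do not exist."""
    root = Path(datasets_root)
    wanted = X4K_ENTRIES + (VIMEO_ENTRIES if include_vimeo else [])
    return [rel for rel in wanted if not (root / rel).exists()]
```

For example, `missing_entries("datasets", include_vimeo=True)` returns an empty list when everything is in place, and otherwise lists what still has to be downloaded or moved.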

Environment Setup

conda create -n sgm-vfi python=3.8
conda activate sgm-vfi
pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 --extra-index-url https://download.pytorch.org/whl/cu117
pip install -r requirements.txt

Sparse Global Matching Fine-tuning

Pretrained File Preparation

We provide the pretrained local branch model for a quicker launch of sparse global matching. You can download the pretrained model here and place it in [project_folder]/log/ours-local/ckpt/ours-local.pth.

Furthermore, for the global feature extractor GMFlow, you can download the pretrained model here, then unzip it and place gmflow_sintel-0c07dcb3.pth at [project_folder]/pretrained/gmflow_sintel-0c07dcb3.pth.

Finally, for fine-tuning the sparse global matching branch, the file layout should match the one listed under Top Related Files below.

Fine-tuning

After the preparation, you can check and modify the settings in config.py; the default setting is for ours-small-1/2 fine-tuning.

You can then start the fine-tuning with the following command:

torchrun --nproc_per_node=4 train_x4k.py --batch_size 8 --need_patch --train_data_path path/to/X4K/train --val_data_path path/to/X4K/val

<span id="jump"> Top Related Files </span>

.
├── train_x4k.py
├── Trainer_x4k.py
├── dataset_x4k.py
├── config.py
├── pretrained
│   └── gmflow_sintel-0c07dcb3.pth
├── log
│   └── ours-local
│       └── ckpt
│           └── ours-local.pth
└── model
    ├── __init__.py
    ├── flow_estimation_global.py
    ├── matching.py
    ├── gmflow.py
    └── ...

Local Branch Training

We also provide scripts for training the local branch. After preparing the Vimeo90K dataset and checking the settings in config_base.py (the default setting is for training the ours-local-branch model), you can start training with the following command:

torchrun --nproc_per_node=4 train_base.py --batch_size 8 --data_path ./vimeo_triplet 

Top Related Files

.
├── train_base.py
├── Trainer_base.py
├── dataset.py
├── config_base.py
└── model
    ├── __init__.py
    ├── flow_estimation_local.py
    └── ...

Evaluation

In our paper, we analyzed the mean motion magnitude and motion sufficiency (the minimum of the top 5% of each pixel's flow magnitude) in the most frequently used large motion benchmarks.

<div align="center"> <img src="figs/chart.png" width="600"/> </div>
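
To make the two statistics concrete, here is a minimal sketch in plain Python. It assumes one reading of the definition above: sort the per-pixel flow magnitudes, take the largest 5%, and report the smallest value in that group. The function name `motion_statistics` and the list-of-`(u, v)` input format are illustrative assumptions; a real pipeline would more likely operate on flow tensors predicted by an optical flow model.

```python
import math


def motion_statistics(flow, top_fraction=0.05):
    """Return (mean magnitude, motion sufficiency) for per-pixel (u, v) displacements.

    Motion sufficiency is taken here as the minimum over the `top_fraction`
    largest magnitudes, which is an assumed reading of the definition.
    """
    magnitudes = sorted(math.hypot(u, v) for u, v in flow)
    if not magnitudes:
        raise ValueError("flow must contain at least one pixel")
    mean_magnitude = sum(magnitudes) / len(magnitudes)
    k = max(1, math.ceil(top_fraction * len(magnitudes)))  # size of the top group
    sufficiency = magnitudes[-k]  # smallest magnitude inside the top group
    return mean_magnitude, sufficiency


# 19 slow pixels and 1 fast pixel: the mean stays low, the sufficiency does not.
toy_flow = [(0.0, 1.0)] * 19 + [(3.0, 4.0)]
mean_mag, sufficiency = motion_statistics(toy_flow)
```

The toy input shows why the two numbers are complementary: a small fast-moving region barely moves the mean but dominates the top 5% of magnitudes.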

Evaluation Datasets Preparation

As a result, we curated the most challenging half from Xiph and from the SNU-FILM hard and extreme splits, with the help of the raft-sintel.pth checkpoint provided in RAFT.

The resulting benchmark is available here. You can put top-half-motion-sufficiency_test-hard.txt and top-half-motion-sufficiency_test-extreme.txt in the SNU-FILM dataset folder, and top-half-motion-sufficiency-gap2.txt in the Xiph dataset folder.
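
The selection itself can be sketched as ranking samples by a per-sample score and keeping the top half. The snippet below is only an illustration under assumptions: `top_half_by_score` is a hypothetical helper, the scores are made up, and the actual curation script and the format of the provided .txt files are not described in this README.

```python
def top_half_by_score(scores):
    """Keep the half of the samples with the highest score, in their original order.

    scores: dict mapping a sample name to its motion sufficiency.
    With an odd number of samples the smaller half is kept (an arbitrary choice).
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    keep = set(ranked[: len(ranked) // 2])
    return [name for name in scores if name in keep]


# Hypothetical sample names and scores.
toy_scores = {"seq_a": 1.0, "seq_b": 9.0, "seq_c": 4.0, "seq_d": 7.0}
selected = top_half_by_score(toy_scores)
```

Keeping the original order means the resulting list can be written out line by line and still follow the ordering of the source benchmark.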

Model Checkpoint Preparation

We provide the checkpoints here for evaluation. Please download and place them in the following folder structure:

.
├── ...
└── log
    └── ours-1-2-points
        └── ckpt
            └── ours-1-2-points.pth

We provide the evaluation script of ours-1-2-points as follows:

XTest-L

python benchmark/XTest_interval.py --path path/to/XTest/test --exp_name ours-1-2-points --num_key_points 0.5

SNU-FILM-hard/extreme-L

python benchmark/SNU_FILM.py --path ./data/SNU-FILM --exp_name ours-1-2-points --num_key_points 0.5

(Suggestion: You can use ln -s path/to/SNUFILM (project folder)/data/SNU-FILM to avoid extra processing on the input path name)

Xiph-L

python benchmark/Xiph.py --path ./xiph --exp_name ours-1-2-points --num_key_points 0.5

(Suggestion: You can use ln -s path/to/Xiph (project folder)/xiph to avoid extra processing on the input path name)

Simple Inference

You can try out our simple 2x inference demo with the following command:

python demo_2x.py 

(This requires the model checkpoint at log/ours-1-2-points/ckpt/ours-1-2-points.pth and the GMFlow pretrained model at pretrained/gmflow_sintel-0c07dcb3.pth.)

Citation

If you find this project helpful in your research or applications, please feel free to leave a star ⭐️ and cite our paper:

@InProceedings{Liu_2024_CVPR,
    author    = {Liu, Chunxu and Zhang, Guozhen and Zhao, Rui and Wang, Limin},
    title     = {Sparse Global Matching for Video Frame Interpolation with Large Motion},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {19125-19134}
}

License and Acknowledgement

This project is released under the Apache 2.0 license. The code is based on GMFlow, RAFT, EMA-VFI, RIFE, and IFRNet. Please also follow their licenses. Thanks for their awesome work!