# [CVPR 2022] Inertia-Guided Flow Completion and Style Fusion for Video Inpainting
[Paper] / [Demo] / [Project page] / [Poster] / [Intro]
This repository contains the implementation of the following paper:
Inertia-Guided Flow Completion and Style Fusion for Video Inpainting<br> Kaidong Zhang, Jingjing Fu and Dong Liu<br> IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022<br>
## Overview
<img src="materials/pipeline_isvi.jpg" height="260px"/>

Physical objects have inertia, which resists changes in velocity and motion direction. Inspired by this, we introduce an inertia prior: optical flow, which reflects object motion within a local temporal window, remains roughly unchanged in the adjacent preceding and subsequent frames. Based on this prior, we propose a flow completion network that aligns and aggregates flow features from consecutive flow sequences. The corrupted flows are completed under the supervision of customized losses on reconstruction, flow smoothness, and consistent ternary census transform. The completed high-fidelity flows significantly improve video inpainting quality. However, existing flow-guided cross-frame warping methods fail to account for lighting and sharpness variations across video frames, which leads to spatial incoherence after warping from other frames. To alleviate this problem, we propose the Adaptive Style Fusion Network (ASFN), which uses the style information extracted from the valid regions to guide gradient refinement in the warped regions. Moreover, we design a data simulation pipeline to reduce the training difficulty of ASFN. Extensive experiments show the superiority of our method over state-of-the-art methods, both quantitatively and qualitatively.
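For intuition only, below is a minimal NumPy sketch of the inertia prior itself, not of the paper's learned flow completion network (which aligns and aggregates flow features with trained modules). Under the constant-motion assumption, the flow observed at a pixel in frame t-1 can be forward-splatted to predict the flow at its destination in frame t. All function and variable names here are hypothetical:

```python
import numpy as np

def propagate_flow_inertia(flow_prev, flow_t, mask_t):
    """Fill corrupted flow vectors in frame t using the inertia prior.

    flow_prev : (H, W, 2) flow from frame t-1 to frame t
    flow_t    : (H, W, 2) flow from frame t to t+1, corrupted where mask_t
    mask_t    : (H, W) bool, True where flow_t is missing

    Inertia prior: a pixel moving with flow f between frames t-1 and t
    keeps roughly the same flow between frames t and t+1, so we
    forward-splat flow_prev (nearest neighbour) into the holes.
    """
    H, W, _ = flow_prev.shape
    filled = flow_t.copy()
    ys, xs = np.mgrid[0:H, 0:W]
    # Where each pixel of frame t-1 lands in frame t after applying flow_prev.
    xd = np.rint(xs + flow_prev[..., 0]).astype(int)
    yd = np.rint(ys + flow_prev[..., 1]).astype(int)
    inside = (xd >= 0) & (xd < W) & (yd >= 0) & (yd < H)
    # Keep only splats that land on a corrupted pixel of flow_t.
    hit = inside & mask_t[yd.clip(0, H - 1), xd.clip(0, W - 1)]
    filled[yd[hit], xd[hit]] = flow_prev[hit]
    return filled
```

This toy version resolves splat collisions arbitrarily and can leave disoccluded holes unfilled; closing that gap is exactly what the learned completion network and its reconstruction, smoothness, and ternary census losses are for.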
## Prerequisites
- Linux (we tested our code on Ubuntu 18.04)
- Anaconda
- Python 3.7.6
- PyTorch 1.6.0
To get started, first clone the repo:
```bash
git clone https://github.com/hitachinsk/ISVI.git
```
Then, run the following commands:
```bash
conda create -n ISVI
conda activate ISVI
pip install -r requirements.txt
bash install_dependances.sh
```
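Optionally, you can sanity-check that the environment roughly matches the tested versions before running the demo (a hypothetical check, not shipped with this repo):

```python
# Hypothetical environment check; not part of this repository.
import sys
import torch

print("Python :", sys.version.split()[0])     # tested with 3.7.6
print("PyTorch:", torch.__version__)          # tested with 1.6.0
print("CUDA   :", torch.cuda.is_available())  # whether a CUDA device is visible
```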
## Quick start
- Download the pre-trained models and the data.
- Put the downloaded zip files into the root directory of this project.
- Run
  ```bash
  bash prepare_data.sh
  ```
  to unzip the files.
- Run the object removal demo:
  ```bash
  cd tool
  python video_inpainting.py --path xxx \
      --path_mask xxx \
      --outroot xxx
  ```

If everything works, you will find a `result.mp4` file in `xxx`.
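As a purely illustrative example (these paths are hypothetical and not shipped with the repo), if the video frames were extracted to `demo/frames` and the per-frame masks to `demo/masks`, the call would look like:

```bash
# Hypothetical paths, for illustration only.
cd tool
python video_inpainting.py --path ../demo/frames \
    --path_mask ../demo/masks \
    --outroot ../demo/results
```

The inpainted `result.mp4` would then appear under `../demo/results`.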
## License
This work is licensed under the MIT license; see the LICENSE file for details.
## Citation
If our work inspires your research or parts of the code are useful for your work, please cite our paper:
```bibtex
@InProceedings{Zhang_2022_CVPR,
    author    = {Zhang, Kaidong and Fu, Jingjing and Liu, Dong},
    title     = {Inertia-Guided Flow Completion and Style Fusion for Video Inpainting},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2022},
    pages     = {5982-5991}
}
```
You may also be interested in our other video inpainting papers, FGT and FGT++ (the journal extension of FGT):
```bibtex
@inproceedings{zhang2022flow,
    title        = {Flow-Guided Transformer for Video Inpainting},
    author       = {Zhang, Kaidong and Fu, Jingjing and Liu, Dong},
    booktitle    = {European Conference on Computer Vision},
    pages        = {74--90},
    year         = {2022},
    organization = {Springer}
}
```
```bibtex
@misc{zhang2023exploiting,
    title         = {Exploiting Optical Flow Guidance for Transformer-Based Video Inpainting},
    author        = {Zhang, Kaidong and Peng, Jialun and Fu, Jingjing and Liu, Dong},
    year          = {2023},
    eprint        = {2301.10048},
    archivePrefix = {arXiv},
    primaryClass  = {cs.CV},
    doi           = {10.48550/arXiv.2301.10048},
    url           = {https://arxiv.org/abs/2301.10048}
}
```
## Contact
If you have any questions, please contact us via
## Acknowledgement
Some parts of this repo are based on FGVC and the flow forward warp package, and we adopt RAFT for flow estimation.