Learnable Gated Temporal Shift Module for Deep Video Inpainting
Official PyTorch implementation of "Learnable Gated Temporal Shift Module for Deep Video Inpainting. Chang et al. BMVC 2019." (arXiv)
This repository also includes implementations of some baselines and the Free-form Video Inpainting (FVI) dataset from "Free-form Video Inpainting with 3D Gated Convolution and Temporal PatchGAN. Chang et al. ICCV 2019." (arXiv)
<img src='./doc/gif_teaser3.gif'> <img src='./doc/gif_teaser.gif'> <img src='./doc/gif_teaser2.gif'>
See the YouTube video demo or the full-resolution videos on Google Drive.
Introduction
In "Free-form Video Inpainting with 3D Gated Convolution and Temporal PatchGAN. Chang et al. ICCV 2019.", we proposed 3D gated convolutions, Temporal PatchGAN, and a mask video generation algorithm to handle free-form video inpainting end to end. It is the first deep method for free-form video inpainting and achieves state-of-the-art performance both quantitatively and qualitatively. However, 3D gated convolutions have a large number of parameters, so the model is slow to train and slow at inference.
Therefore, in "Learnable Gated Temporal Shift Module for Deep Video Inpainting. Chang et al. BMVC 2019.", we proposed the Learnable Gated Temporal Shift Module (LGTSM), which builds on the temporal shift module (TSM) for action recognition and reduces the model parameters and training time to about 33% of our previous work. The performance is almost the same as that of our previous work.
This repository contains the source code for both works. Pretrained weights are provided for some of the GTSM models, and the LGTSM code can be found in the LGTSM branch. An implementation of the CombCN baseline is also provided.
Environment Setup
git clone git@github.com:amjltc295/Free-Form-Video-Inpainting.git
cd Free-Form-Video-Inpainting
git submodule update --init --recursive
conda env create -f environment.yaml
source activate free_form_video_inpainting
Training
Please see the training instructions.
Testing
- Download the corresponding pretrained weights from Google Drive.
  - The weights for the ICCV 2019 work
    - The one trained on the FVI dataset is under FFVI_3DGatedConv+TPGAN_trained_on_FVI_dataset, together with its training config.
  - The weights for the BMVC 2019 work (LGTSM)
    - The one trained on the FVI dataset is named v0.2.3_GatedTSM_inplace_noskip_b2_back_L1_vgg_style_TSMSNTPD128_1_1_10_1_VOR_allMasks_load135_e135_pdist0.1256
    - For the one trained on FaceForensics, please refer to the Readme.
- Update the parameters in src/other_configs/inference_example.json:
  - If you want to test on other data, set root_masks_dir for the testing masks and root_videos_dir for the testing frames.
  - If you want to turn on evaluation, set evaluate_score to true.
- Run the following from the src directory:
  python train.py -r <pretrained_weight_path> --dataset_config other_configs/inference_example.json -od test_outputs
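The parameter update in the steps above can also be scripted before running the command. The following is a minimal sketch, not part of this repository: the helper names `set_keys` and `update_config_file` are hypothetical, and because the layout of inference_example.json is not described here, the sketch simply overwrites every occurrence of a key at any nesting depth. If your config defines different directories for different test sets, edit the file by hand instead.

```python
import json
from pathlib import Path


def set_keys(node, updates):
    """Overwrite every occurrence of the keys in `updates`, at any depth.

    Returns the number of values that were replaced.
    """
    changed = 0
    if isinstance(node, dict):
        for key in node:
            if key in updates:
                # Replacing the value of an existing key is safe while iterating.
                node[key] = updates[key]
                changed += 1
            else:
                changed += set_keys(node[key], updates)
    elif isinstance(node, list):
        for item in node:
            changed += set_keys(item, updates)
    return changed


def update_config_file(config_path, updates):
    """Load a JSON config, apply `updates`, and write it back in place."""
    path = Path(config_path)
    config = json.loads(path.read_text())
    changed = set_keys(config, updates)
    if changed == 0:
        raise KeyError(f"none of {sorted(updates)} found in {path}")
    path.write_text(json.dumps(config, indent=4) + "\n")
    return changed


# Example call (the two directory paths are placeholders):
# update_config_file(
#     "src/other_configs/inference_example.json",
#     {
#         "root_masks_dir": "/path/to/masks",
#         "root_videos_dir": "/path/to/frames",
#         "evaluate_score": True,
#     },
# )
```

Note that the file is rewritten with 4-space indentation, so any original formatting of the JSON is not preserved.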
After the run finishes, you should have a directory src/test_outputs/ like:
test_outputs
└── epoch_0
    ├── test_object_like
    │   ├── inputs
    │   │   └── input_0000
    │   └── result_0000
    └── test_object_removal
        ├── inputs
        │   └── input_0000
        └── result_0000
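To confirm that a run produced this layout, the directory can be walked with a few lines of Python. This is a rough sketch rather than a tool shipped with the repository: the function name `summarize_outputs` is hypothetical, and it only assumes the naming shown in the tree above (epoch directories containing one directory per test set, each with `inputs/input_*` and `result_*` entries).

```python
from pathlib import Path


def summarize_outputs(output_root):
    """Map 'epoch/test_set' to the input and result entries found under it."""
    summary = {}
    root = Path(output_root)
    for epoch_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        for test_dir in sorted(p for p in epoch_dir.iterdir() if p.is_dir()):
            inputs_dir = test_dir / "inputs"
            # Entries may be files or directories; only their names are collected.
            inputs = (
                sorted(p.name for p in inputs_dir.glob("input_*"))
                if inputs_dir.is_dir()
                else []
            )
            results = sorted(p.name for p in test_dir.glob("result_*"))
            summary[f"{epoch_dir.name}/{test_dir.name}"] = {
                "inputs": inputs,
                "results": results,
            }
    return summary


# Example call, run from the src directory:
# for name, entries in summarize_outputs("test_outputs").items():
#     print(name, entries)
```

A test set whose `results` list is empty most likely means inference did not finish for that set.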
The following GIFs show the figures that will appear in (top row) test_object_like/result_0000 and test_object_like/inputs/input_0000, and (bottom row) test_object_removal/result_0000 and test_object_removal/inputs/input_0000.
License
This repository is limited to research purposes. For any commercial usage, please contact us.
Authors
Ya-Liang Chang (Allen) amjltc295 yaliangchang@cmlab.csie.ntu.edu.tw
Zhe-Yu Liu Nash2325138 zhe2325138@cmlab.csie.ntu.edu.tw
Please cite our papers if you use this repo in your research:
@article{chang2019free,
  title={Free-form Video Inpainting with 3D Gated Convolution and Temporal PatchGAN},
  author={Chang, Ya-Liang and Liu, Zhe Yu and Lee, Kuan-Ying and Hsu, Winston},
  journal={In Proceedings of the International Conference on Computer Vision (ICCV)},
  year={2019}
}
@article{chang2019learnable,
  title={Learnable Gated Temporal Shift Module for Deep Video Inpainting},
  author={Chang, Ya-Liang and Liu, Zhe Yu and Lee, Kuan-Ying and Hsu, Winston},
  journal={BMVC},
  year={2019}
}
Acknowledgement
This work was supported in part by the Ministry of Science and Technology, Taiwan, under Grant MOST 108-2634-F-002-004. We also benefited from the NVIDIA grants and the DGX-1 AI Supercomputer. We are grateful to the National Center for High-performance Computing.