# DLFormer

This is the implementation of the CVPR 2022 paper *DLFormer: Discrete Latent Transformer for Video Inpainting*.
## Dependencies

Create the conda environment from the provided file:

```bash
conda env create -f environment.yaml
```
## Inference

Set `model.params.ckpt_path` to the path of the transformer checkpoint (download here) and `model.params.first_stage_config.ckpt_path` to the path of the autoencoder checkpoint (download here), then run:

```bash
python test_transformer_sppe.py -c configs/test_breakdance_transformer.yaml -s save_dir
```
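The relevant part of the test config might be laid out as follows; the nesting is an assumption inferred from the dotted option names above, and the paths are placeholders for the downloaded checkpoints:

```yaml
# Sketch of configs/test_breakdance_transformer.yaml (nesting assumed from the
# dotted names above; replace the paths with your downloaded checkpoints).
model:
  params:
    ckpt_path: /path/to/transformer.ckpt      # transformer checkpoint
    first_stage_config:
      ckpt_path: /path/to/autoencoder.ckpt    # first-stage autoencoder checkpoint
```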
## Training your own model

1. Fine-tune the autoencoder together with a codebook on the current video, starting from the pretrained model (download here):

   ```bash
   python train_vqgan.py --base configs/breakdance_vqgan.yaml -t True --gpus 0,
   ```

2. In `configs/breakdance_vqgan.yaml`, set `ckpt_path` to the checkpoint produced in step 1 and set `mode` to `select`, then select the codes used in the current video by running:

   ```bash
   python test_vqgan.py -c configs/breakdance_vqgan.yaml -s save_dir
   ```

3. In `configs/train_breakdance_transformer_lepe.yaml`, set `model.params.first_stage_config.ckpt_path` to the path of the model produced in step 2, and set both `model.params.first_stage_config.n_embed` and `model.params.transformer_config.params.vocab_size` to the number of codes selected in step 2 (see the config sketch after this list). Then train the transformer for code inference:

   ```bash
   python train_transformer.py --base configs/train_breakdance_transformer_lepe.yaml -t True --gpus 0,
   ```

4. Set `model.params.ckpt_path` in `configs/train_breakdance_transformer.yaml` to the transformer checkpoint obtained in step 3 and get the result by running:

   ```bash
   python test_transformer_sppe.py -c configs/train_breakdance_transformer.yaml -s save_dir
   ```
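For step 3, the relevant fields might be arranged as in the sketch below. The exact nesting is an assumption inferred from the dotted option names, and `512` is a placeholder for the actual number of codes selected in step 2:

```yaml
# Sketch of configs/train_breakdance_transformer_lepe.yaml (nesting assumed
# from the dotted names in step 3; values are placeholders).
model:
  params:
    first_stage_config:
      ckpt_path: /path/to/vqgan_step2.ckpt  # autoencoder fine-tuned in steps 1-2
      n_embed: 512                          # number of codes selected in step 2
    transformer_config:
      params:
        vocab_size: 512                     # must match n_embed above
```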
## Citation

```bibtex
@inproceedings{ren2022dlformer,
  title={DLFormer: Discrete Latent Transformer for Video Inpainting},
  author={Ren, Jingjing and Zheng, Qingqing and Zhao, Yuanyuan and Xu, Xuemiao and Li, Chen},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={3511--3520},
  year={2022}
}
```
## Acknowledgement

Our code is based on VQGAN and STTN. We thank the authors for sharing their code.