TecoGAN-PyTorch
Introduction
This is a PyTorch reimplementation of TecoGAN: Temporally Coherent GAN for Video Super-Resolution (VSR). Please refer to the official TensorFlow implementation TecoGAN-TensorFlow for more information.
<p align = "center"> <img src="resources/fire.gif" width="320" /> <img src="resources/pond.gif" width="320" /> </p> <p align = "center"> <img src="resources/foliage.gif" width="320" /> <img src="resources/bridge.gif" width="320" /> </p>
Updates
- 11/2021: Added support for 2x SR.
- 10/2021: Added support for model training/testing on the REDS dataset.
- 07/2021: Upgraded the codebase to support multi-GPU training & testing.
Features
- Better Performance: This repo provides a model that is smaller than the one in the official repo yet performs better. See our Benchmark.
- Multiple Degradations: This repo supports two types of degradation, BI (Matlab's imresize with the bicubic option) & BD (Gaussian blurring + down-sampling).
- Unified Framework: This repo provides a unified framework for distortion-based and perception-based VSR methods.
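To make the BD degradation concrete, here is a minimal, dependency-free sketch of "Gaussian blurring + down-sampling" on a 2-D grayscale image. The values `sigma=1.5` and `radius=6`, the border clamping, and all function names are illustrative assumptions, not necessarily what this repo uses internally.

```python
import math

def gaussian_kernel_1d(sigma, radius):
    """Return a normalized 1-D Gaussian kernel of length 2 * radius + 1."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_rows(image, kernel):
    """Filter every row with the kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    width = len(image[0])
    return [
        [sum(k * row[min(max(x + j - radius, 0), width - 1)]
             for j, k in enumerate(kernel))
         for x in range(width)]
        for row in image
    ]

def transpose(image):
    """Swap rows and columns of a 2-D list."""
    return [list(col) for col in zip(*image)]

def bd_degrade(image, scale=4, sigma=1.5, radius=6):
    """Gaussian-blur a 2-D image, then keep every `scale`-th pixel.

    sigma, radius and the clamped borders are assumptions for illustration.
    """
    kernel = gaussian_kernel_1d(sigma, radius)
    # Separable blur: filter the rows, then the columns (via transposition).
    blurred = transpose(blur_rows(transpose(blur_rows(image, kernel)), kernel))
    return [row[::scale] for row in blurred[::scale]]

hr = [[0.5] * 16 for _ in range(16)]   # a flat 16x16 "ground-truth" frame
lr = bd_degrade(hr)                    # 4x4 low-resolution frame
```

BI degradation differs only in the first step: it uses Matlab's bicubic `imresize` in place of the blur-and-subsample pair.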
Dependencies
- Ubuntu >= 16.04
- NVIDIA GPU + CUDA
- Python >= 3.7
- PyTorch >= 1.4.0
- Python packages: numpy, matplotlib, opencv-python, pyyaml, lmdb
- (Optional) Matlab >= R2016b
Testing
Note: We apply different models according to the degradation type. The following steps are for 4xSR under BD degradation. You can switch to 2xSR or BI degradation by replacing every 4x with 2x and every BD with BI below.
- Download the official Vid4 and ToS3 datasets. In BD mode, only ground-truth data is needed.
bash ./scripts/download/download_datasets.sh BD
You can also manually download these datasets from Google Drive and unzip them under ./data.
- Vid4 Dataset [Ground-Truth] [Low Resolution (BD)] [Low Resolution (BI)]
- ToS3 Dataset [Ground-Truth] [Low Resolution (BD)] [Low Resolution (BI)]
The dataset structure is shown below.
data
  ├─ Vid4
    ├─ GT                # Ground-Truth (GT) sequences
      └─ calendar
        └─ ***.png
    ├─ Gaussian4xLR      # Low Resolution (LR) sequences in BD degradation
      └─ calendar
        └─ ***.png
    └─ Bicubic4xLR       # Low Resolution (LR) sequences in BI degradation
      └─ calendar
        └─ ***.png
  └─ ToS3
    ├─ GT
    ├─ Gaussian4xLR
    └─ Bicubic4xLR
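Before running the tests, it can help to confirm that the folders from the tree above are in place. The helper below is a hypothetical sketch (it is not part of this repo): it lists the expected sub-directories that are missing, requiring only GT in BD mode and additionally Bicubic4xLR in BI mode.

```python
from pathlib import Path

def missing_dataset_dirs(root, degradation="BD", datasets=("Vid4", "ToS3")):
    """Return expected dataset sub-directories that do not exist under `root`.

    Hypothetical helper: BD mode needs only GT, BI mode also needs Bicubic4xLR.
    """
    required = ["GT"]
    if degradation == "BI":
        required.append("Bicubic4xLR")
    root = Path(root)
    return [
        (Path(name) / sub).as_posix()
        for name in datasets
        for sub in required
        if not (root / name / sub).is_dir()
    ]

if __name__ == "__main__":
    missing = missing_dataset_dirs("./data", degradation="BD")
    print("All folders found" if not missing else f"Missing: {missing}")
```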
- Download our pre-trained TecoGAN model.
bash ./scripts/download/download_models.sh BD TecoGAN
You can also download the model from [BD-4x-Vimeo] [BI-4x-Vimeo] [BD-4x-REDS] [BD-2x-REDS] and put it under ./pretrained_models.
- Run TecoGAN for 4x SR. The results will be saved in ./results. You can specify which model and how many GPUs to use in test.sh.
bash ./test.sh BD TecoGAN/TecoGAN_VimeoTecoGAN_4xSR_2GPU
- Evaluate the upsampled results using the official metrics. This code is borrowed from TecoGAN-TensorFlow, with minor modifications to adapt it to the BI degradation.
python ./codes/official_metrics/evaluate.py -m TecoGAN_4x_BD_Vimeo_iter500K
- Profile the model (FLOPs, parameters and speed). You can modify the last argument to specify the size of the LR video.
bash ./profile.sh BD TecoGAN/TecoGAN_VimeoTecoGAN_4xSR_2GPU 3x134x320
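The last argument above reads as channels x height x width of the LR input. As a rough illustration, the sketch below parses such a string and computes the corresponding output size for a given scale. `parse_size` and `sr_output_size` are hypothetical names, and how profile.sh actually parses the argument is not shown here.

```python
def parse_size(size):
    """Parse a 'CxHxW' string such as '3x134x320' into a tuple of ints."""
    parts = size.lower().split("x")
    if len(parts) != 3:
        raise ValueError(f"expected CxHxW, got {size!r}")
    channels, height, width = (int(p) for p in parts)
    return channels, height, width

def sr_output_size(lr_size, scale=4):
    """Return the (C, H, W) of the super-resolved frame for an LR size string."""
    channels, height, width = parse_size(lr_size)
    return channels, height * scale, width * scale

print(sr_output_size("3x134x320"))  # (3, 536, 1280)
```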
Training
Note: Due to the inaccessibility of the VimeoTecoGAN dataset, we recommend using other public datasets, e.g., REDS, for model training. To use REDS as the training dataset, just download it from here and replace VimeoTecoGAN with REDS in the steps below.
- Download the official training dataset according to the instructions in TecoGAN-TensorFlow, rename it to VimeoTecoGAN/Raw, and place it under ./data.
- Generate LMDB for GT data to accelerate IO. The LR counterpart will then be generated on the fly during training.
python ./scripts/create_lmdb.py --dataset VimeoTecoGAN --raw_dir ./data/VimeoTecoGAN/Raw --lmdb_dir ./data/VimeoTecoGAN/GT.lmdb
The following shows the dataset structure after finishing the above two steps.
data
  ├─ VimeoTecoGAN
    ├─ Raw                 # Raw dataset
      ├─ scene_2000
        └─ ***.png
      ├─ scene_2001
        └─ ***.png
      └─ ...
    └─ GT.lmdb             # LMDB dataset
      ├─ data.mdb
      ├─ lock.mdb
      └─ meta_info.pkl     # each key has format: [vid]_[total_frame]x[h]x[w]_[i-th_frame]
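The key format noted for meta_info.pkl can be unpacked with a few lines of Python. The sketch below is an assumption-laden illustration: the example key, the zero-padded frame index, and the `FrameKey`/`parse_key` names are hypothetical. It splits from the right because video names such as scene_2000 contain underscores themselves.

```python
from typing import NamedTuple

class FrameKey(NamedTuple):
    vid: str           # video name, e.g. "scene_2000"
    total_frames: int  # number of frames in the video
    height: int
    width: int
    frame_idx: int     # index of this frame within the video

def parse_key(key):
    """Parse '[vid]_[total_frame]x[h]x[w]_[i-th_frame]' into a FrameKey."""
    # rsplit keeps underscores inside the video name intact.
    vid, dims, idx = key.rsplit("_", 2)
    total_frames, height, width = (int(v) for v in dims.split("x"))
    return FrameKey(vid, total_frames, height, width, int(idx))

# Hypothetical key, made up for illustration only.
print(parse_key("scene_2000_120x288x512_0007"))
```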
- (Optional, this step is only required for BI degradation) Manually generate the LR sequences with Matlab's imresize function, and then create LMDB for them.
# Generate the raw LR video sequences. Results will be saved at ./data/VimeoTecoGAN/Bicubic4xLR
matlab -nodesktop -nosplash -r "cd ./scripts; generate_lr_bi"
# Create LMDB for the LR video sequences
python ./scripts/create_lmdb.py --dataset VimeoTecoGAN --raw_dir ./data/VimeoTecoGAN/Bicubic4xLR --lmdb_dir ./data/VimeoTecoGAN/Bicubic4xLR.lmdb
- Train an FRVSR model first, which can provide a better initialization for the subsequent TecoGAN training. FRVSR has the same generator as TecoGAN, but is trained without the GAN and perceptual losses.
bash ./train.sh BD FRVSR/FRVSR_VimeoTecoGAN_4xSR_2GPU
You can download and use our pre-trained FRVSR models instead of training from scratch. [BD-4x-Vimeo] [BI-4x-Vimeo] [BD-4x-REDS] [BD-2x-REDS]
When the training is complete, set the generator's load_path in experiments_BD/TecoGAN/TecoGAN_VimeoTecoGAN_4xSR_2GPU/train.yml to the latest checkpoint weight of the FRVSR model.
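If the FRVSR run has saved many checkpoints, a small script can pick the latest one to paste into load_path. The sketch below assumes checkpoint files are named like `G_iter<N>.pth`; that naming pattern, the example directory, and the `latest_checkpoint` helper are assumptions for illustration, so adjust the pattern to whatever your training run actually writes.

```python
import re
from pathlib import Path

def latest_checkpoint(ckpt_dir, pattern=r"G_iter(\d+)\.pth"):
    """Return the checkpoint path with the largest iteration number, or None.

    The default file-name pattern is an assumption, not confirmed by the repo.
    """
    regex = re.compile(pattern)
    best_iter, best_path = -1, None
    for path in Path(ckpt_dir).iterdir():
        match = regex.fullmatch(path.name)
        if match and int(match.group(1)) > best_iter:
            best_iter, best_path = int(match.group(1)), path
    return best_path

if __name__ == "__main__":
    # Hypothetical location of the FRVSR checkpoints.
    ckpt_dir = Path("./experiments_BD/FRVSR/FRVSR_VimeoTecoGAN_4xSR_2GPU/train/ckpt")
    if ckpt_dir.is_dir():
        print(latest_checkpoint(ckpt_dir))
```

Comparing the iteration numbers as integers avoids the pitfall of lexicographic sorting, where `G_iter5000.pth` would rank after `G_iter400000.pth`.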
- Train a TecoGAN model. You can specify which GPU to use in train.sh. By default, training runs in the background and the output is logged to ./experiments_BD/TecoGAN/TecoGAN_VimeoTecoGAN/train/train.log.
bash ./train.sh BD TecoGAN/TecoGAN_VimeoTecoGAN_4xSR_2GPU
- Run the following script to monitor the training process and visualize the validation performance.
python ./scripts/monitor_training.py -dg BD -m TecoGAN/TecoGAN_VimeoTecoGAN_4xSR_2GPU -ds Vid4
<p align = "center"> <img src="resources/losses.png" width="1080" /> <img src="resources/metrics.png" width="1080" /> </p>
Note that the validation results are NOT exactly the same as the testing results mentioned above because the metrics are implemented differently. The differences are caused by the cropping policy, the LPIPS version, and some other issues.
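To see why the cropping policy alone can shift a metric, consider the generic PSNR sketch below. It is not the metric code used by this repo or by the official evaluation; the `crop` parameter and the toy images are assumptions chosen only to show the effect of ignoring border pixels.

```python
import math

def psnr(ref, test, crop=0, max_val=255.0):
    """PSNR between two equally sized 2-D images, ignoring `crop` border pixels."""
    height, width = len(ref), len(ref[0])
    if 2 * crop >= min(height, width):
        raise ValueError("crop removes the whole image")
    sq_err = [
        (ref[y][x] - test[y][x]) ** 2
        for y in range(crop, height - crop)
        for x in range(crop, width - crop)
    ]
    mse = sum(sq_err) / len(sq_err)
    return math.inf if mse == 0 else 10.0 * math.log10(max_val ** 2 / mse)

ref = [[0.0] * 8 for _ in range(8)]
out = [row[:] for row in ref]
out[0][0] = 16.0  # a single error located on the image border

print(psnr(ref, out, crop=0))  # finite: the border error is counted
print(psnr(ref, out, crop=1))  # inf: the border error is cropped away
```

The same pair of images yields two different scores, so results are only comparable when both evaluations crop the same way.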
Benchmark
<p align = "center"> <img src="resources/benchmark.png" width="640" /> </p>
<sup>[1]</sup> FLOPs & speed are computed on an RGB sequence with resolution 134*320 on a single NVIDIA 1080Ti GPU.
<sup>[2]</sup> Both FRVSR & TecoGAN use 10 residual blocks, while TecoGAN+ has 16 residual blocks.
License & Citation
If you use this code for your research, please cite the following paper and our project.
@article{tecogan2020,
title={Learning temporal coherence via self-supervision for GAN-based video generation},
author={Chu, Mengyu and Xie, You and Mayer, Jonas and Leal-Taix{\'e}, Laura and Thuerey, Nils},
journal={ACM Transactions on Graphics (TOG)},
volume={39},
number={4},
pages={75--1},
year={2020},
publisher={ACM New York, NY, USA}
}
@misc{tecogan_pytorch,
author={Deng, Jianing and Zhuo, Cheng},
title={PyTorch Implementation of Temporally Coherent GAN (TecoGAN) for Video Super-Resolution},
howpublished="\url{https://github.com/skycrapers/TecoGAN-PyTorch}",
year={2020},
}
Acknowledgements
This code is built on TecoGAN-TensorFlow, BasicSR and LPIPS. We thank the authors for sharing their code.
If you have any questions, feel free to email me at jn.deng@foxmail.com.