VCGAN
VCGAN is a hybrid colorization model that works for both image and video colorization tasks.
link: https://ieeexplore.ieee.org/document/9721653
1 Prerequisites
Note that this project is implemented using Python 3.6, CUDA 8.0, and PyTorch 1.0.0 (minimum requirement).
In addition, the cupy, opencv, and scikit-image libraries are used for this project.
Please build an appropriate environment for the PWC-Net to compute optical flow.
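Before training, it can help to confirm that the libraries listed above are importable. The following is a minimal sketch rather than part of the project: the helper name `check_environment` is hypothetical, and it assumes the usual import names `torch`, `cupy`, `cv2` (opencv), and `skimage` (scikit-image). It only checks that each module can be found; it does not verify versions or the PWC-Net setup.

```python
import importlib.util
import sys

# Assumed import names for the libraries listed above:
# "cv2" is the import name of opencv, "skimage" that of scikit-image.
REQUIRED_MODULES = ["torch", "cupy", "cv2", "skimage"]


def check_environment(modules=REQUIRED_MODULES):
    """Return a dict mapping each module name to True if it can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    for name, found in check_environment().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

Any module reported as MISSING needs to be installed before running the training or testing scripts.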
2 Visual comparisons of VCGAN and the other SOTA video colorization methods
So far, two samples from the main paper are shown here as GIFs so that they can be compared more easily.
2.1 DAVIS dataset - 'horsejump-high'
The representative image is shown below (please compare the significant regions marked by red rectangles):
<img src="./img/main1.png" />
The GIFs are arranged as follows:
Column 1 | Column 2 | Column 3 | Column 4 |
---|---|---|---|
CIC | CIC+BTC | LTBC | LTBC+BTC |
SCGAN | SCGAN+BTC | ChromaGAN | ChromaGAN+BTC |
3DVC | FAVC | VCGAN | |
Grayscale | Ground Truth | | |
<img src="./img/horsejump-high/CIC.gif" width="200" height="117" /><img src="./img/horsejump-high/CIC+BTC.gif" width="200" height="117" /><img src="./img/horsejump-high/LTBC.gif" width="200" height="117" /><img src="./img/horsejump-high/LTBC+BTC.gif" width="200" height="117" />
<img src="./img/horsejump-high/SCGAN.gif" width="200" height="117" /><img src="./img/horsejump-high/SCGAN+BTC.gif" width="200" height="117" /><img src="./img/horsejump-high/ChromaGAN.gif" width="200" height="117" /><img src="./img/horsejump-high/ChromaGAN+BTC.gif" width="200" height="117" />
<img src="./img/horsejump-high/3DVC.gif" width="200" height="117" /><img src="./img/horsejump-high/FAVC.gif" width="200" height="117" /><img src="./img/horsejump-high/VCGAN.gif" width="200" height="117" />
<img src="./img/horsejump-high/Grayscale.gif" width="200" height="117" /><img src="./img/horsejump-high/GT.gif" width="200" height="117" />
2.2 Videvo dataset - 'SkateboarderTableJump'
The representative image is shown below (please compare the significant regions marked by red rectangles):
<img src="./img/main2.png" />
The GIFs are arranged as follows:
Column 1 | Column 2 | Column 3 | Column 4 |
---|---|---|---|
CIC | CIC+BTC | LTBC | LTBC+BTC |
SCGAN | SCGAN+BTC | ChromaGAN | ChromaGAN+BTC |
3DVC | FAVC | VCGAN | |
Grayscale | Ground Truth | | |
<img src="./img/SkateboarderTableJump/CIC.gif" width="200" height="117" /><img src="./img/SkateboarderTableJump/CIC+BTC.gif" width="200" height="117" /><img src="./img/SkateboarderTableJump/LTBC.gif" width="200" height="117" /><img src="./img/SkateboarderTableJump/LTBC+BTC.gif" width="200" height="117" />
<img src="./img/SkateboarderTableJump/SCGAN.gif" width="200" height="117" /><img src="./img/SkateboarderTableJump/SCGAN+BTC.gif" width="200" height="117" /><img src="./img/SkateboarderTableJump/ChromaGAN.gif" width="200" height="117" /><img src="./img/SkateboarderTableJump/ChromaGAN+BTC.gif" width="200" height="117" />
<img src="./img/SkateboarderTableJump/3DVC.gif" width="200" height="117" /><img src="./img/SkateboarderTableJump/FAVC.gif" width="200" height="117" /><img src="./img/SkateboarderTableJump/VCGAN.gif" width="200" height="117" />
<img src="./img/SkateboarderTableJump/Grayscale.gif" width="200" height="117" /><img src="./img/SkateboarderTableJump/GT.gif" width="200" height="117" />
3 Visual comparisons of VCGAN and the other SOTA image colorization methods
Figure 13 of the main paper is reproduced here for a better view.
<img src="./img/image.png" />
4 Download pre-trained models
To train VCGAN, please download the pre-trained ResNet50-Instance-Normalized model at this link and the other pre-trained models at this link. The hyper-parameters follow the settings of the original paper, except for normalization.
To test VCGAN, please download the models at this link. The download contains a folder named models; put it under the train folder. Three models are provided, and model1_*.pth is used by default.
Note that these models were re-trained on a single GPU, which might lead to slightly different results from those of the original.
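To confirm that the downloaded folder landed in the right place, the checkpoint files can be listed by pattern. This is only a sketch under stated assumptions: the helper `find_checkpoints` is hypothetical, and it assumes the script is run from inside the train folder so that the relative path models resolves to the downloaded folder.

```python
import glob
import os


def find_checkpoints(models_dir="models", pattern="model1_*.pth"):
    """Return the sorted checkpoint paths in models_dir that match pattern."""
    return sorted(glob.glob(os.path.join(models_dir, pattern)))


if __name__ == "__main__":
    paths = find_checkpoints()
    if not paths:
        print("No checkpoints found; check that 'models' sits inside 'train'.")
    for path in paths:
        print(path)
```

Passing a different pattern selects one of the other provided models instead of the default model1_*.pth files.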
5 Use the code
Enter the train folder:
cd train
5.1 Training code
Put the pre-trained ResNet50-Instance-Normalized model into the trained_models folder, then change the settings and train VCGAN in the first stage:
python train.py # or: sh first.sh (256x256 image resolution)
After the first-stage model is trained, run the following commands for the second stage:
python train2.py # or: sh second.sh (256p video resolution)
python train2.py # or: sh third.sh (480p video resolution)
5.2 Testing code
For testing, please run the following (note that you need to change the paths to the models):
python test_model_second_stage_by_txt.py # for DAVIS dataset
python test_model_second_stage_by_txt2.py # for Videvo dataset
python test_model_second_stage_by_folder.py # for a single folder
Network interpolation can also be applied across the different models:
python network_interp.py
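As an illustration of the idea, network interpolation is commonly done by linearly blending the parameters of two models that share an architecture. The sketch below assumes that form; it is not taken from network_interp.py, whose actual behavior may differ. The function name `interpolate_weights` and the parameter `alpha` are hypothetical, and plain floats stand in for weight tensors.

```python
def interpolate_weights(weights_a, weights_b, alpha):
    """Blend two parameter dicts: alpha * a + (1 - alpha) * b, key by key."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if weights_a.keys() != weights_b.keys():
        raise KeyError("both models must have the same parameter names")
    return {
        name: alpha * weights_a[name] + (1.0 - alpha) * weights_b[name]
        for name in weights_a
    }


if __name__ == "__main__":
    # Toy stand-ins for two checkpoints (hypothetical parameter names).
    a = {"conv.weight": 1.0, "conv.bias": 0.0}
    b = {"conv.weight": 3.0, "conv.bias": 2.0}
    print(interpolate_weights(a, b, 0.5))
```

The same expression works for values that support scalar multiplication and addition, such as floating-point tensors from a loaded checkpoint; non-floating-point entries (for example integer counters) would need separate handling.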
6 Related Projects
SCGAN: Saliency Map-guided Colorization with Generative Adversarial Network (IEEE TCSVT 2020): Project Paper Github
ChromaGAN: Adversarial Picture Colorization with Semantic Class Distribution (WACV 2020): Paper Github
FAVC: Fully Automatic Video Colorization With Self-Regularization and Diversity (CVPR 2019): Project Paper Github
3DVC: Automatic Video Colorization using 3D Conditional Generative Adversarial Networks (ISVC 2019): Paper
BTC: Learning Blind Video Temporal Consistency (ECCV 2018): Project Paper Github
LRAC: Learning Representations for Automatic Colorization (ECCV 2016): Project Paper Github
CIC: Colorful Image Colorization (ECCV 2016): Project Paper Github
LTBC: Let there be Color!: Joint End-to-end Learning of Global and Local Image Priors for Automatic Image Colorization with Simultaneous Classification (ACM TOG 2016): Project Paper Github
Pix2Pix: Image-to-Image Translation with Conditional Adversarial Nets (CVPR 2017): Project Paper Github
CycleGAN: Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks (ICCV 2017): Project Paper Github