VCGAN

VCGAN is a hybrid colorization model that works for both image and video colorization tasks.

link: https://ieeexplore.ieee.org/document/9721653

1 Pre-requisite

Note that this project is implemented using Python 3.6, CUDA 8.0, and PyTorch 1.0.0 (minimum requirement).

The project also depends on the cupy, opencv, and scikit-image libraries.

PWC-Net is used to compute optical flow, so please set up an environment in which PWC-Net can run.
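A quick way to confirm that the libraries listed above can be imported is sketched below. This is not part of the repository: `check_environment` and `REQUIRED_MODULES` are hypothetical names, and the module names (`torch`, `cupy`, `cv2`, `skimage`) are the usual import names of PyTorch, cupy, opencv, and scikit-image.

```python
import importlib

# Usual import names of the libraries listed above (assumed, not taken from the repo).
REQUIRED_MODULES = ["torch", "cupy", "cv2", "skimage"]


def check_environment(modules=REQUIRED_MODULES):
    """Return {module_name: version string, or None if the import fails}."""
    report = {}
    for name in modules:
        try:
            module = importlib.import_module(name)
        except Exception:  # e.g. cupy may fail with a non-ImportError when CUDA is missing
            report[name] = None
        else:
            # Some modules do not expose __version__; report "unknown" in that case.
            report[name] = getattr(module, "__version__", "unknown")
    return report
```

Calling `check_environment()` and printing the result shows which packages are still missing (value `None`) before you try to train or test.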

2 Visual comparisons of VCGAN and the other SOTA video colorization methods

So far, we show two samples from the main paper here so that they can be inspected more closely.

2.1 DAVIS dataset - 'horsejump-high'

A representative image is shown below (please compare the significant regions marked by red rectangles):

<img src="./img/main1.png" />

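The GIFs below are arranged in the following layout:

| Column 1 | Column 2 | Column 3 | Column 4 |
| --- | --- | --- | --- |
| CIC | CIC+BTC | LTBC | LTBC+BTC |
| SCGAN | SCGAN+BTC | ChromaGAN | ChromaGAN+BTC |
| 3DVC | FAVC | VCGAN | |
| Grayscale | Ground Truth | | |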

<img src="./img/horsejump-high/CIC.gif" width="200" height="117" /><img src="./img/horsejump-high/CIC+BTC.gif" width="200" height="117" /><img src="./img/horsejump-high/LTBC.gif" width="200" height="117" /><img src="./img/horsejump-high/LTBC+BTC.gif" width="200" height="117" />

<img src="./img/horsejump-high/SCGAN.gif" width="200" height="117" /><img src="./img/horsejump-high/SCGAN+BTC.gif" width="200" height="117" /><img src="./img/horsejump-high/ChromaGAN.gif" width="200" height="117" /><img src="./img/horsejump-high/ChromaGAN+BTC.gif" width="200" height="117" />

<img src="./img/horsejump-high/3DVC.gif" width="200" height="117" /><img src="./img/horsejump-high/FAVC.gif" width="200" height="117" /><img src="./img/horsejump-high/VCGAN.gif" width="200" height="117" />

<img src="./img/horsejump-high/Grayscale.gif" width="200" height="117" /><img src="./img/horsejump-high/GT.gif" width="200" height="117" />

2.2 Videvo dataset - 'SkateboarderTableJump'

A representative image is shown below (please compare the significant regions marked by red rectangles):

<img src="./img/main2.png" />

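The GIFs below are arranged in the following layout:

| Column 1 | Column 2 | Column 3 | Column 4 |
| --- | --- | --- | --- |
| CIC | CIC+BTC | LTBC | LTBC+BTC |
| SCGAN | SCGAN+BTC | ChromaGAN | ChromaGAN+BTC |
| 3DVC | FAVC | VCGAN | |
| Grayscale | Ground Truth | | |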

<img src="./img/SkateboarderTableJump/CIC.gif" width="200" height="117" /><img src="./img/SkateboarderTableJump/CIC+BTC.gif" width="200" height="117" /><img src="./img/SkateboarderTableJump/LTBC.gif" width="200" height="117" /><img src="./img/SkateboarderTableJump/LTBC+BTC.gif" width="200" height="117" />

<img src="./img/SkateboarderTableJump/SCGAN.gif" width="200" height="117" /><img src="./img/SkateboarderTableJump/SCGAN+BTC.gif" width="200" height="117" /><img src="./img/SkateboarderTableJump/ChromaGAN.gif" width="200" height="117" /><img src="./img/SkateboarderTableJump/ChromaGAN+BTC.gif" width="200" height="117" />

<img src="./img/SkateboarderTableJump/3DVC.gif" width="200" height="117" /><img src="./img/SkateboarderTableJump/FAVC.gif" width="200" height="117" /><img src="./img/SkateboarderTableJump/VCGAN.gif" width="200" height="117" />

<img src="./img/SkateboarderTableJump/Grayscale.gif" width="200" height="117" /><img src="./img/SkateboarderTableJump/GT.gif" width="200" height="117" />

3 Visual comparisons of VCGAN and the other SOTA image colorization methods

We reproduce Figure 13 of the main paper here for a better view.

<img src="./img/image.png" />

4 Download pre-trained model

If you want to train VCGAN, please download the pre-trained ResNet50-Instance-Normalized model at this link and the other pre-trained models at this link. The hyper-parameters follow the settings of the original paper, except for normalization.

If you want to test VCGAN, please download the models at this link. The download contains a folder named models, which you can put under the train folder. Three models are provided; model1_*.pth is used by default.

Note that these models were re-trained on a single GPU, which might lead to slightly different results from those of the original models.
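To check that the downloaded models folder is in place, a small helper like the one below can list the default checkpoints. It is only a sketch: `find_default_models` is a hypothetical function, not part of the repository, and it assumes the checkpoint files sit directly inside the models folder and follow the model1_*.pth naming mentioned above.

```python
import glob
import os


def find_default_models(models_dir="models", pattern="model1_*.pth"):
    """Return the sorted paths of checkpoint files in models_dir that match pattern."""
    return sorted(glob.glob(os.path.join(models_dir, pattern)))
```

An empty list means the folder is missing or was placed somewhere other than the directory you passed in.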

5 Use the code

Enter the train folder:

cd train

5.1 Training code

Put the pre-trained ResNet50-Instance-Normalized model into the trained_models folder, then adjust the settings and train VCGAN in the first stage:

python train.py  # or: sh first.sh (256x256 image resolution)

After the first-stage model is trained, run the following commands for the second stage:

python train2.py  # or: sh second.sh (256p video resolution)
python train2.py  # or: sh third.sh (480p video resolution)

5.2 Testing code

For testing, please run one of the following scripts (note that you need to change the path to the models):

python test_model_second_stage_by_txt.py # for DAVIS dataset
python test_model_second_stage_by_txt2.py # for Videvo dataset
python test_model_second_stage_by_folder.py # for a single folder

Network interpolation can also be used to blend different models:

python network_interp.py
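For readers unfamiliar with the idea, network interpolation is commonly done by linearly blending the parameters of two models that share the same architecture. The sketch below illustrates that general technique; it is not taken from network_interp.py, whose actual behaviour may differ, and `interpolate_state_dicts` and `alpha` are hypothetical names.

```python
from collections import OrderedDict


def interpolate_state_dicts(state_a, state_b, alpha):
    """Blend two parameter dictionaries: alpha * state_a + (1 - alpha) * state_b.

    Values may be floats, NumPy arrays, or PyTorch tensors, as long as both
    dictionaries have the same keys and matching shapes.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if set(state_a) != set(state_b):
        raise KeyError("the two state dicts must have identical keys")
    blended = OrderedDict()
    for key, value_a in state_a.items():
        # Linear interpolation, applied parameter by parameter.
        blended[key] = alpha * value_a + (1.0 - alpha) * state_b[key]
    return blended
```

With PyTorch, the two dictionaries would typically come from `torch.load(path, map_location="cpu")` and the result would be passed to a model's `load_state_dict`, assuming the .pth files store state dicts rather than whole model objects. Integer buffers, if any, may need separate handling.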

6 Related Projects

SCGAN: Saliency Map-guided Colorization with Generative Adversarial Network (IEEE TCSVT 2020): Project Paper Github

ChromaGAN: Adversarial Picture Colorization with Semantic Class Distribution (WACV 2020): Paper Github

FAVC: Fully Automatic Video Colorization With Self-Regularization and Diversity (CVPR 2019): Project Paper Github

3DVC: Automatic Video Colorization using 3D Conditional Generative Adversarial Networks (ISVC 2019): Paper

BTC: Learning Blind Video Temporal Consistency (ECCV 2018): Project Paper Github

LRAC: Learning Representations for Automatic Colorization (ECCV 2016): Project Paper Github

CIC: Colorful Image Colorization (ECCV 2016): Project Paper Github

LTBC: Let there be Color!: Joint End-to-end Learning of Global and Local Image Priors for Automatic Image Colorization with Simultaneous Classification (ACM TOG 2016): Project Paper Github

Pix2Pix: Image-to-Image Translation with Conditional Adversarial Nets (CVPR 2017): Project Paper Github

CycleGAN: Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks (ICCV 2017): Project Paper Github