<p align="center"> <img width="60%" src="https://raw.githubusercontent.com/POSTECH-CVLab/PyTorch-StudioGAN/master/docs/figures/studiogan_logo.jpg" /> </p>

StudioGAN is a PyTorch library providing implementations of representative Generative Adversarial Networks (GANs) for conditional/unconditional image generation. StudioGAN aims to offer an identical playground for modern GANs so that machine learning researchers can readily compare and analyze new ideas.

Features

Implemented GANs

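| Method | Venue | Architecture | GC | DC | Loss | EMA |
|---|---|---|---|---|---|---|
| DCGAN | arXiv'15 | CNN/ResNet<sup>[1]</sup> | N/A | N/A | Vanilla | False |
| LSGAN | ICCV'17 | CNN/ResNet<sup>[1]</sup> | N/A | N/A | Least Square | False |
| GGAN | arXiv'17 | CNN/ResNet<sup>[1]</sup> | N/A | N/A | Hinge | False |
| WGAN-WC | ICLR'17 | ResNet | N/A | N/A | Wasserstein | False |
| WGAN-GP | NIPS'17 | ResNet | N/A | N/A | Wasserstein | False |
| WGAN-DRA | arXiv'17 | ResNet | N/A | N/A | Wasserstein | False |
| ACGAN-Mod<sup>[2]</sup> | - | ResNet | cBN | AC | Hinge | False |
| ProjGAN | ICLR'18 | ResNet | cBN | PD | Hinge | False |
| SNGAN | ICLR'18 | ResNet | cBN | PD | Hinge | False |
| SAGAN | ICML'19 | ResNet | cBN | PD | Hinge | False |
| BigGAN | ICLR'19 | Big ResNet | cBN | PD | Hinge | True |
| BigGAN-Deep | ICLR'19 | Big ResNet Deep | cBN | PD | Hinge | True |
| BigGAN-Mod<sup>[3]</sup> | - | Big ResNet | cBN | PD | Hinge | True |
| CRGAN | ICLR'20 | Big ResNet | cBN | PD/CL | Hinge | True |
| ICRGAN | arXiv'20 | Big ResNet | cBN | PD/CL | Hinge | True |
| LOGAN | arXiv'19 | Big ResNet | cBN | PD | Hinge | True |
| DiffAugGAN | NeurIPS'20 | Big ResNet | cBN | PD/CL | Hinge | True |
| ADAGAN | NeurIPS'20 | Big ResNet | cBN | PD/CL | Hinge | True |
| ContraGAN | NeurIPS'20 | Big ResNet | cBN | CL | Hinge | True |
| FreezeD | CVPRW'20 | - | - | - | - | - |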

GC/DC indicates how label information is injected into the Generator or Discriminator.

EMA: Exponential Moving Average update to the generator. cBN: conditional Batch Normalization. AC: Auxiliary Classifier. PD: Projection Discriminator. CL: Contrastive Learning.
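To make the EMA column concrete, here is a minimal sketch of an exponential moving average update in plain Python. It is an illustration under stated assumptions, not StudioGAN's implementation: the function name `ema_update`, the list-of-floats stand-in for tensors, and the decay value are all hypothetical.

```python
def ema_update(ema_params, params, decay=0.9999):
    """Return new EMA weights: decay * ema + (1 - decay) * current.

    The default decay is an assumed, illustrative value.
    """
    return [decay * e + (1.0 - decay) * p for e, p in zip(ema_params, params)]


# Toy example with scalar "weights" instead of tensors.
generator = [1.0, -2.0]
generator_ema = list(generator)  # the EMA copy starts as a clone

for step in range(3):
    # Pretend an optimizer step moved every weight by +0.5.
    generator = [w + 0.5 for w in generator]
    generator_ema = ema_update(generator_ema, generator, decay=0.9)

print(generator)      # raw weights after three steps
print(generator_ema)  # smoothed weights lag behind the raw ones
```

The averaged copy is typically the one used for evaluation, because it changes more smoothly than the raw generator weights.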

To be Implemented

| Method | Venue | Architecture | GC | DC | Loss | EMA |
|---|---|---|---|---|---|---|
| StyleGAN2 | CVPR'20 | StyleNet | - | - | Vanilla | True |

Requirements

Please refer to requirements.md for more information.

You can install the recommended environment as follows:

conda env create -f environment.yml -n studiogan

With docker, you can use:

docker pull mgkang/studiogan:latest

The following command creates a container named "studioGAN".

It also publishes port 6006 so that you can connect to TensorBoard.

docker run -it --gpus all --shm-size 128g -p 6006:6006 --name studioGAN -v /home/USER:/root/code --workdir /root/code mgkang/studiogan:latest /bin/bash

Quick Start

CUDA_VISIBLE_DEVICES=0 python3 src/main.py -t -e -c CONFIG_PATH
CUDA_VISIBLE_DEVICES=0,1,2,3 python3 src/main.py -t -e -c CONFIG_PATH

Try python3 src/main.py to see available options.
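If you launch runs from a script, the two commands above can be assembled programmatically. The sketch below is only a convenience wrapper around the flags shown above (`-t`, `-e`, `-c`); the helper name `build_train_command` is hypothetical and not part of StudioGAN.

```python
import os


def build_train_command(gpu_ids, config_path):
    """Return (env, argv) for a train-and-evaluate run on the given GPUs."""
    env = dict(os.environ)
    # Restrict the visible devices exactly as the shell prefix does.
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(g) for g in gpu_ids)
    argv = ["python3", "src/main.py", "-t", "-e", "-c", config_path]
    return env, argv


env, argv = build_train_command([0, 1, 2, 3], "CONFIG_PATH")
print(env["CUDA_VISIBLE_DEVICES"], " ".join(argv))

# To actually launch from the repository root (not executed here):
#   import subprocess
#   subprocess.run(argv, env=env, check=True)
```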

With TensorBoard, you can monitor trends in IS, FID, F_beta, authenticity accuracies, and the largest singular values:

~ PyTorch-StudioGAN/logs/RUN_NAME>>> tensorboard --logdir=./ --port PORT
<p align="center"> <img width="85%" src="https://raw.githubusercontent.com/POSTECH-CVLab/PyTorch-StudioGAN/master/docs/figures/tensorboard_1.png" /> </p>

Dataset

┌── docs
├── src
└── data
    └── ILSVRC2012 or TINY_ILSVRC2012 or CUSTOM
        ├── train
        │   ├── cls0
        │   │   ├── train0.png
        │   │   ├── train1.png
        │   │   └── ...
        │   ├── cls1
        │   └── ...
        └── valid
            ├── cls0
            │   ├── valid0.png
            │   ├── valid1.png
            │   └── ...
            ├── cls1
            └── ...
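
Before training on a custom dataset, it can help to confirm that the folder matches the tree above. The following is a small sketch that assumes only the layout shown (a `train` and a `valid` folder, each containing one sub-folder per class); the function name `check_dataset_layout` is hypothetical and not part of StudioGAN.

```python
import os
import tempfile


def check_dataset_layout(root):
    """Return {split: sorted class names}; raise if a split is missing or empty."""
    layout = {}
    for split in ("train", "valid"):
        split_dir = os.path.join(root, split)
        if not os.path.isdir(split_dir):
            raise FileNotFoundError(f"missing split folder: {split_dir}")
        classes = sorted(
            name for name in os.listdir(split_dir)
            if os.path.isdir(os.path.join(split_dir, name))
        )
        if not classes:
            raise ValueError(f"no class folders found in {split_dir}")
        layout[split] = classes
    return layout


# Build a tiny throwaway tree that mirrors the layout above, then check it.
with tempfile.TemporaryDirectory() as tmp:
    root = os.path.join(tmp, "data", "CUSTOM")
    for split in ("train", "valid"):
        for cls in ("cls0", "cls1"):
            os.makedirs(os.path.join(root, split, cls))
    layout = check_dataset_layout(root)

print(layout)
```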

Supported Training Techniques

※ StudioGAN does not support DDP training for ContraGAN. This is because conducting contrastive learning requires a 'gather' operation to calculate the exact conditional contrastive loss.

Analyzing Generated Images

StudioGAN supports image visualization, k-nearest neighbor analysis, linear interpolation, frequency analysis, and t-SNE analysis. All results are saved in ./figures/RUN_NAME/*.png.
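
As background for the linear interpolation analysis, the sketch below shows generic linear interpolation between two latent vectors. It is an assumption-laden illustration, not necessarily how StudioGAN implements `-itp`; the function names and the toy 2-dimensional latents are hypothetical.

```python
def lerp(z0, z1, t):
    """Linearly interpolate between vectors z0 and z1 at position t in [0, 1]."""
    return [(1.0 - t) * a + t * b for a, b in zip(z0, z1)]


def interpolation_path(z0, z1, num_steps):
    """Return num_steps evenly spaced points from z0 to z1, endpoints included."""
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2")
    return [lerp(z0, z1, i / (num_steps - 1)) for i in range(num_steps)]


# Each point on the path would be fed to the generator to render one image.
path = interpolation_path([0.0, 2.0], [1.0, 0.0], num_steps=5)
for z in path:
    print(z)
```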

CUDA_VISIBLE_DEVICES=0,...,N python3 src/main.py -iv -std_stat --standing_step STANDING_STEP -c CONFIG_PATH --checkpoint_folder CHECKPOINT_FOLDER --log_output_path LOG_OUTPUT_PATH
<p align="center"> <img width="95%" src="https://raw.githubusercontent.com/POSTECH-CVLab/PyTorch-StudioGAN/master/docs/figures/generated_images1.png" /> </p>
CUDA_VISIBLE_DEVICES=0,...,N python3 src/main.py -knn -std_stat --standing_step STANDING_STEP -c CONFIG_PATH --checkpoint_folder CHECKPOINT_FOLDER --log_output_path LOG_OUTPUT_PATH
<p align="center"> <img width="95%" src="https://raw.githubusercontent.com/POSTECH-CVLab/PyTorch-StudioGAN/master/docs/figures/knn_1.png" /> </p>
CUDA_VISIBLE_DEVICES=0,...,N python3 src/main.py -itp -std_stat --standing_step STANDING_STEP -c CONFIG_PATH --checkpoint_folder CHECKPOINT_FOLDER --log_output_path LOG_OUTPUT_PATH
<p align="center"> <img width="95%" src="https://raw.githubusercontent.com/POSTECH-CVLab/PyTorch-StudioGAN/master/docs/figures/interpolated_images.png" /> </p>
CUDA_VISIBLE_DEVICES=0,...,N python3 src/main.py -fa -std_stat --standing_step STANDING_STEP -c CONFIG_PATH --checkpoint_folder CHECKPOINT_FOLDER --log_output_path LOG_OUTPUT_PATH
<p align="center"> <img width="60%" src="https://raw.githubusercontent.com/POSTECH-CVLab/PyTorch-StudioGAN/master/docs/figures/diff_spectrum1.png" /> </p>
CUDA_VISIBLE_DEVICES=0,...,N python3 src/main.py -tsne -std_stat --standing_step STANDING_STEP -c CONFIG_PATH --checkpoint_folder CHECKPOINT_FOLDER --log_output_path LOG_OUTPUT_PATH
<p align="center"> <img width="80%" src="https://raw.githubusercontent.com/POSTECH-CVLab/PyTorch-StudioGAN/master/docs/figures/TSNE_results.png" /> </p>

Metrics

Inception Score (IS)

Inception Score (IS) measures how well a GAN generates high-fidelity and diverse images. Calculating IS requires the pre-trained Inception-V3 network, and recent approaches utilize OpenAI's TensorFlow implementation.

To compute the official IS, first create a "samples.npz" file using the command below:

CUDA_VISIBLE_DEVICES=0,...,N python3 src/main.py -s -c CONFIG_PATH --checkpoint_folder CHECKPOINT_FOLDER --log_output_path LOG_OUTPUT_PATH

This automatically creates the samples.npz file at ./samples/RUN_NAME/fake/npz/samples.npz. After that, run the official TensorFlow IS implementation. Note that we do not split the dataset into ten folds to calculate IS ten times. Instead, we use the entire dataset to compute IS once, which is the evaluation strategy used in the CompareGAN repository.

CUDA_VISIBLE_DEVICES=0,...,N python3 src/inception_tf13.py --run_name RUN_NAME --type "fake"

Keep in mind that you need TensorFlow 1.3 or an earlier version installed!

Note that StudioGAN logs a PyTorch-based IS during training.
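
For intuition about what IS rewards, the sketch below computes the standard definition, exp of the mean KL divergence between each image's class distribution p(y|x) and the marginal p(y), in plain Python. It assumes the class probabilities have already been produced by a classifier; the function name and the toy two-class inputs are hypothetical, and this is not the code StudioGAN or the official implementation runs.

```python
import math


def inception_score(probs):
    """probs: list of per-image class-probability lists (each sums to 1)."""
    n = len(probs)
    num_classes = len(probs[0])
    # Marginal class distribution p(y), averaged over all images.
    marginal = [sum(p[k] for p in probs) / n for k in range(num_classes)]
    kl_sum = 0.0
    for p in probs:
        # KL(p(y|x) || p(y)); terms with zero probability contribute nothing.
        kl_sum += sum(
            pk * math.log(pk / mk) for pk, mk in zip(p, marginal) if pk > 0.0
        )
    return math.exp(kl_sum / n)


# Confident predictions spread over both classes: high fidelity and diversity.
score_sharp = inception_score([[1.0, 0.0], [0.0, 1.0]])
# Uninformative predictions: every image looks equally like both classes.
score_flat = inception_score([[0.5, 0.5], [0.5, 0.5]])
print(score_sharp, score_flat)  # roughly the class count (2) versus 1
```

With single-fold evaluation as described above, the whole sample set is passed in one call rather than averaged over ten splits.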

Frechet Inception Distance (FID)

FID is a widely used metric for evaluating the performance of a GAN model. Calculating FID requires the pre-trained Inception-V3 network, and modern approaches use a TensorFlow-based FID. StudioGAN uses a PyTorch-based FID so that GAN models can be tested in the same PyTorch environment. We show that the PyTorch-based FID implementation provides almost the same results as the TensorFlow implementation (see Appendix F of our paper).
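
For reference, FID is the Fréchet distance between two Gaussians fitted to Inception-V3 features of real and generated images. In the standard definition below, the mean and covariance of the real features are written as \mu_r and \Sigma_r, and those of the generated features as \mu_g and \Sigma_g; these symbol names are chosen here for illustration and do not come from the StudioGAN code.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \mathrm{Tr}\!\left( \Sigma_r + \Sigma_g - 2 \left( \Sigma_r \Sigma_g \right)^{1/2} \right)
```

Lower values mean the two feature distributions are closer, which is why the benchmark tables mark FID with a downward arrow.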

Precision and Recall (PR: F_1/8=Weights Precision, F_8=Weights Recall)

Precision measures how accurately the generator learns the target distribution. Recall measures how completely the generator covers the target distribution. Like IS and FID, calculating Precision and Recall requires the pre-trained Inception-V3 model. StudioGAN uses the same hyperparameter settings as the original Precision and Recall implementation and calculates the F-beta score suggested by Sajjadi et al.
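
The F-beta summary can be sketched as follows, assuming a precision-recall curve has already been computed. The formula is the standard F-beta score, and taking the maximum over the curve for beta = 8 and beta = 1/8 follows the summary proposed by Sajjadi et al.; the function names and the three-point curve are hypothetical, and this is not StudioGAN's code.

```python
def f_beta(precision, recall, beta):
    """Standard F-beta: (1 + beta^2) * p * r / (beta^2 * p + r)."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


def summarize_prd(curve, beta=8.0):
    """Return (F_1/beta, F_beta), each taken as the maximum over the curve."""
    f_inv = max(f_beta(p, r, 1.0 / beta) for p, r in curve)
    f_b = max(f_beta(p, r, beta) for p, r in curve)
    return f_inv, f_b


# Hypothetical (precision, recall) pairs along a precision-recall curve.
curve = [(0.9, 0.4), (0.7, 0.7), (0.3, 0.8)]
f_one_eighth, f_eight = summarize_prd(curve)
print(f_one_eighth, f_eight)
```

On this toy curve, F_1/8 is driven by the high-precision point and F_8 by the high-recall point, which matches the reading of F_1/8 as weighting precision and F_8 as weighting recall.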

Benchmark

※ Contributions are always welcome if you find an incorrect implementation, a bug, or a misreported score.

We report the best IS, FID, and F_beta values of various GANs. B. S. means batch size for training.

CR, ICR, DiffAug, ADA, and LO refer to regularization or optimization techniques: CR (Consistency Regularization), ICR (Improved Consistency Regularization), DiffAug (Differentiable Augmentation), ADA (Adaptive Discriminator Augmentation), and LO (Latent Optimization), respectively.

CIFAR10 (3x32x32)

When training, we used the command below.

With a single TITAN RTX GPU, training BigGAN takes about 13-15 hours.

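CUDA_VISIBLE_DEVICES=0 python3 src/main.py -t -e -l -stat_otf -c CONFIG_PATH --eval_type "test"

| Method | Reference | IS(⭡) | FID(⭣) | F_1/8(⭡) | F_8(⭡) | Cfg | Log | Weights |
|---|---|---|---|---|---|---|---|---|
| DCGAN | StudioGAN | 6.638 | 49.030 | 0.833 | 0.795 | Cfg | Log | Link |
| LSGAN | StudioGAN | 5.577 | 66.686 | 0.757 | 0.720 | Cfg | Log | Link |
| GGAN | StudioGAN | 6.227 | 42.714 | 0.916 | 0.822 | Cfg | Log | Link |
| WGAN-WC | StudioGAN | 2.579 | 159.090 | 0.190 | 0.199 | Cfg | Log | Link |
| WGAN-GP | StudioGAN | 7.458 | 25.852 | 0.962 | 0.929 | Cfg | Log | Link |
| WGAN-DRA | StudioGAN | 6.432 | 41.586 | 0.922 | 0.863 | Cfg | Log | Link |
| ACGAN-Mod | StudioGAN | 6.629 | 45.571 | 0.857 | 0.847 | Cfg | Log | Link |
| ProjGAN | StudioGAN | 7.539 | 33.830 | 0.952 | 0.855 | Cfg | Log | Link |
| SNGAN | StudioGAN | 8.677 | 13.248 | 0.983 | 0.978 | Cfg | Log | Link |
| SAGAN | StudioGAN | 8.680 | 14.009 | 0.982 | 0.970 | Cfg | Log | Link |
| BigGAN | Paper | 9.22<sup>[4]</sup> | 14.73 | - | - | - | - | - |
| BigGAN + CR | Paper | - | 11.5 | - | - | - | - | - |
| BigGAN + ICR | Paper | - | 9.2 | - | - | - | - | - |
| BigGAN + DiffAug | Repo | 9.2<sup>[4]</sup> | 8.7 | - | - | - | - | - |
| BigGAN-Mod | StudioGAN | 9.746 | 8.034 | 0.995 | 0.994 | Cfg | Log | Link |
| BigGAN-Mod + CR | StudioGAN | 10.380 | 7.178 | 0.994 | 0.993 | Cfg | Log | Link |
| BigGAN-Mod + ICR | StudioGAN | 10.153 | 7.430 | 0.994 | 0.993 | Cfg | Log | Link |
| BigGAN-Mod + DiffAug | StudioGAN | 9.775 | 7.157 | 0.996 | 0.993 | Cfg | Log | Link |
| BigGAN-Mod + ADA | StudioGAN | 10.136 | 7.881 | 0.993 | 0.994 | Cfg | Log | Link |
| BigGAN-Mod + LO | StudioGAN | 9.701 | 8.369 | 0.992 | 0.989 | Cfg | Log | Link |
| ContraGAN | StudioGAN | 9.729 | 8.065 | 0.993 | 0.992 | Cfg | Log | Link |
| ContraGAN + CR | StudioGAN | 9.812 | 7.685 | 0.995 | 0.993 | Cfg | Log | Link |
| ContraGAN + ICR | StudioGAN | 10.117 | 7.547 | 0.996 | 0.993 | Cfg | Log | Link |
| ContraGAN + DiffAug | StudioGAN | 9.996 | 7.193 | 0.995 | 0.990 | Cfg | Log | Link |
| ContraGAN + ADA | StudioGAN | 9.411 | 10.830 | 0.990 | 0.964 | Cfg | Log | Link |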

When evaluating, the statistics of batch normalization layers are calculated on the fly (statistics of a batch).

IS, FID, and F_beta values are computed using 10K test images and 10K generated images.

CUDA_VISIBLE_DEVICES=0 python3 src/main.py -e -l -stat_otf -c CONFIG_PATH --checkpoint_folder CHECKPOINT_FOLDER --eval_type "test"

Tiny ImageNet (3x64x64)

When training, we used the command below.

With 4 TITAN RTX GPUs, training BigGAN takes about 2 days.

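CUDA_VISIBLE_DEVICES=0,...,N python3 src/main.py -t -e -l -stat_otf -c CONFIG_PATH --eval_type "valid"

| Method | Reference | IS(⭡) | FID(⭣) | F_1/8(⭡) | F_8(⭡) | Cfg | Log | Weights |
|---|---|---|---|---|---|---|---|---|
| DCGAN | StudioGAN | 5.640 | 91.625 | 0.606 | 0.391 | Cfg | Log | Link |
| LSGAN | StudioGAN | 5.381 | 90.008 | 0.638 | 0.390 | Cfg | Log | Link |
| GGAN | StudioGAN | 5.146 | 102.094 | 0.503 | 0.307 | Cfg | Log | Link |
| WGAN-WC | StudioGAN | 9.696 | 41.454 | 0.940 | 0.735 | Cfg | Log | Link |
| WGAN-GP | StudioGAN | 1.322 | 311.805 | 0.016 | 0.000 | Cfg | Log | Link |
| WGAN-DRA | StudioGAN | 9.564 | 40.655 | 0.938 | 0.724 | Cfg | Log | Link |
| ACGAN-Mod | StudioGAN | 6.342 | 78.513 | 0.668 | 0.518 | Cfg | Log | Link |
| ProjGAN | StudioGAN | 6.224 | 89.175 | 0.626 | 0.428 | Cfg | Log | Link |
| SNGAN | StudioGAN | 8.412 | 53.590 | 0.900 | 0.703 | Cfg | Log | Link |
| SAGAN | StudioGAN | 8.342 | 51.414 | 0.898 | 0.698 | Cfg | Log | Link |
| BigGAN-Mod | StudioGAN | 11.998 | 31.920 | 0.956 | 0.879 | Cfg | Log | Link |
| BigGAN-Mod + CR | StudioGAN | 14.887 | 21.488 | 0.969 | 0.936 | Cfg | Log | Link |
| BigGAN-Mod + ICR | StudioGAN | 5.605 | 91.326 | 0.525 | 0.399 | Cfg | Log | Link |
| BigGAN-Mod + DiffAug | StudioGAN | 17.075 | 16.338 | 0.979 | 0.971 | Cfg | Log | Link |
| BigGAN-Mod + ADA | StudioGAN | 15.158 | 24.121 | 0.953 | 0.942 | Cfg | Log | Link |
| BigGAN-Mod + LO | StudioGAN | 6.964 | 70.660 | 0.857 | 0.621 | Cfg | Log | Link |
| ContraGAN | StudioGAN | 13.494 | 27.027 | 0.975 | 0.902 | Cfg | Log | Link |
| ContraGAN + CR | StudioGAN | 15.623 | 19.716 | 0.983 | 0.941 | Cfg | Log | Link |
| ContraGAN + ICR | StudioGAN | 15.830 | 21.940 | 0.980 | 0.944 | Cfg | Log | Link |
| ContraGAN + DiffAug | StudioGAN | 17.303 | 15.755 | 0.984 | 0.962 | Cfg | Log | Link |
| ContraGAN + ADA | StudioGAN | 8.398 | 55.025 | 0.878 | 0.677 | Cfg | Log | Link |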

When evaluating, the statistics of batch normalization layers are calculated on the fly (statistics of a batch).

IS, FID, and F_beta values are computed using 10K validation images and 10K generated images.

CUDA_VISIBLE_DEVICES=0,...,N python3 src/main.py -e -l -stat_otf -c CONFIG_PATH --checkpoint_folder CHECKPOINT_FOLDER --eval_type "valid"

ImageNet (3x128x128)

When training, we used the command below.

With 8 TESLA V100 GPUs, training BigGAN2048 takes about a month.

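CUDA_VISIBLE_DEVICES=0,...,N python3 src/main.py -t -e -l -sync_bn -stat_otf -c CONFIG_PATH --eval_type "valid"

| Method | Reference | IS(⭡) | FID(⭣) | F_1/8(⭡) | F_8(⭡) | Cfg | Log | Weights |
|---|---|---|---|---|---|---|---|---|
| SNGAN | StudioGAN | 32.247 | 26.792 | 0.938 | 0.913 | Cfg | Log | Link |
| SAGAN | StudioGAN | 29.848 | 34.726 | 0.849 | 0.914 | Cfg | Log | Link |
| BigGAN | Paper | 98.8<sup>[4]</sup> | 8.7 | - | - | - | - | - |
| BigGAN | Paper | - | 21.072 | - | - | Cfg | - | - |
| BigGAN | StudioGAN | 28.633 | 24.684 | 0.941 | 0.921 | Cfg | Log | Link |
| BigGAN | StudioGAN | 99.705 | 7.893 | 0.985 | 0.989 | Cfg | Log | Link |
| ContraGAN | Paper | 31.101 | 19.693 | 0.951 | 0.927 | Cfg | Log | Link |
| ContraGAN | StudioGAN | 25.249 | 25.161 | 0.947 | 0.855 | Cfg | Log | Link |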

When evaluating, the statistics of batch normalization layers are calculated in advance (moving average of the previous statistics).
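
The difference between statistics computed in advance and the on-the-fly batch statistics used for CIFAR10 and Tiny ImageNet can be sketched in plain Python. This is a simplified, one-feature illustration rather than StudioGAN's or PyTorch's batch normalization: the function names, the momentum of 0.1, and the use of the biased variance are assumptions made for the example.

```python
import math


def batch_stats(batch):
    """Mean and (biased) variance of one batch of scalar activations."""
    mean = sum(batch) / len(batch)
    var = sum((x - mean) ** 2 for x in batch) / len(batch)
    return mean, var


def normalize(batch, mean, var, eps=1e-5):
    return [(x - mean) / math.sqrt(var + eps) for x in batch]


def update_running(running, current, momentum=0.1):
    """Moving average of previously seen statistics (assumed momentum)."""
    return (1.0 - momentum) * running + momentum * current


eval_batch = [2.0, 4.0, 6.0]

# On the fly: normalize with the statistics of the evaluation batch itself.
mean, var = batch_stats(eval_batch)
on_the_fly = normalize(eval_batch, mean, var)

# In advance: accumulate a moving average during training, then reuse it.
running_mean, running_var = 0.0, 1.0
for train_batch in ([1.0, 3.0], [3.0, 5.0]):
    m, v = batch_stats(train_batch)
    running_mean = update_running(running_mean, m)
    running_var = update_running(running_var, v)
in_advance = normalize(eval_batch, running_mean, running_var)

print(on_the_fly)
print(in_advance)
```

The two modes give different outputs for the same batch, so reported scores are only comparable when the same mode is used.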

IS, FID, and F_beta values are computed using 50K validation images and 50K generated images.

CUDA_VISIBLE_DEVICES=0,...,N python3 src/main.py -e -l -sync_bn -c CONFIG_PATH --checkpoint_folder CHECKPOINT_FOLDER --eval_type "valid"

StudioGAN thanks the following Repos for the code sharing

Exponential Moving Average: https://github.com/ajbrock/BigGAN-PyTorch

Synchronized BatchNorm: https://github.com/vacancy/Synchronized-BatchNorm-PyTorch

Self-Attention module: https://github.com/voletiv/self-attention-GAN-pytorch

Implementation Details: https://github.com/ajbrock/BigGAN-PyTorch

Architecture Details: https://github.com/google/compare_gan

DiffAugment: https://github.com/mit-han-lab/data-efficient-gans

Adaptive Discriminator Augmentation: https://github.com/rosinality/stylegan2-pytorch

TensorFlow IS: https://github.com/openai/improved-gan

TensorFlow FID: https://github.com/bioinf-jku/TTUR

PyTorch FID: https://github.com/mseitzer/pytorch-fid

TensorFlow Precision and Recall: https://github.com/msmsajjadi/precision-recall-distributions

torchlars: https://github.com/kakaobrain/torchlars

Citation

StudioGAN is established for the following research project. Please cite our work if you use StudioGAN.

@inproceedings{kang2020ContraGAN,
  title   = {{ContraGAN: Contrastive Learning for Conditional Image Generation}},
  author  = {Minguk Kang and Jaesik Park},
  booktitle = {Conference on Neural Information Processing Systems (NeurIPS)},
  year    = {2020}
}

<a name="footnote_1">[1]</a> Experiments on Tiny ImageNet are conducted using the ResNet architecture instead of CNN.

<a name="footnote_2">[2]</a> Our re-implementation of ACGAN (ICML'17) with slight modifications, which substantially improve performance in the CIFAR10 experiments.

<a name="footnote_3">[3]</a> Our re-implementation of BigGAN/BigGAN-Deep (ICLR'19) with slight modifications, which substantially improve performance in the CIFAR10 experiments.

<a name="footnote_4">[4]</a> IS is computed using the official TensorFlow code.