# ComboGAN
This is our ongoing PyTorch implementation for ComboGAN. Code was written by Asha Anoosheh (built upon CycleGAN).
[ComboGAN Paper]
<img src="img/Inference.png" width=420/>

If you use this code for your research, please cite:
ComboGAN: Unrestrained Scalability for Image Domain Translation. Asha Anoosheh, Eirikur Agustsson, Radu Timofte, Luc van Gool. In arXiv, 2017.
<br><br> <img src='img/Paintings.png' align="center" width=900> <br><br>
## Prerequisites
- Linux or macOS
- Python 3
- CPU or NVIDIA GPU + CUDA CuDNN
## Getting Started
### Installation
- Install PyTorch and dependencies from http://pytorch.org
- Install torchvision from source:
```bash
git clone https://github.com/pytorch/vision
cd vision
python setup.py install
```
- Install the Python libraries visdom and dominate:
```bash
pip install visdom
pip install dominate
```
- Clone this repo:
```bash
git clone https://github.com/AAnoosheh/ComboGAN.git
cd ComboGAN
```
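As an optional sanity check (not part of the original instructions), you can confirm that the installed dependencies import cleanly:
```bash
# Optional check: verify PyTorch, torchvision, visdom, and dominate are importable
python -c "import torch, torchvision, visdom, dominate; print(torch.__version__)"
```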
## ComboGAN training
Our ready datasets can be downloaded using `./datasets/download_dataset.sh <dataset_name>`.
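For example, assuming `painters_14` is one of the provided dataset names (the name here is illustrative):
```bash
./datasets/download_dataset.sh painters_14
```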
A pretrained model for the 14-painters dataset can be found HERE. Place it under `./checkpoints/` and test it using the instructions below, with args `--name paint14_pretrained --dataroot ./datasets/painters_14 --n_domains 14 --which_epoch 1150`.
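Putting those arguments into the test command described below, the full invocation would look like this (assuming the painters_14 dataset has been downloaded to `./datasets/painters_14`):
```bash
python test.py --phase test --name paint14_pretrained --dataroot ./datasets/painters_14 --n_domains 14 --which_epoch 1150 --serial_test
```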
Example running scripts can be found in the `scripts` directory.
- Train a model:
```bash
python train.py --name <experiment_name> --dataroot ./datasets/<your_dataset> --n_domains <N> --niter <num_epochs_constant_LR> --niter_decay <num_epochs_decaying_LR>
```
Checkpoints will be saved by default to `./checkpoints/<experiment_name>/`.
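For instance, a hypothetical run on the painters_14 dataset might look like the following (the experiment name and epoch counts are illustrative, not values prescribed by the repository):
```bash
python train.py --name paint14_example --dataroot ./datasets/painters_14 --n_domains 14 --niter 500 --niter_decay 500
```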
- Fine-tuning/Resume training:
```bash
python train.py --continue_train --which_epoch <checkpoint_number_to_load> --name <experiment_name> --dataroot ./datasets/<your_dataset> --n_domains <N> --niter <num_epochs_constant_LR> --niter_decay <num_epochs_decaying_LR>
```
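For example, to resume the hypothetical run above from its checkpoint at epoch 400 (again, the numbers are illustrative):
```bash
python train.py --continue_train --which_epoch 400 --name paint14_example --dataroot ./datasets/painters_14 --n_domains 14 --niter 500 --niter_decay 500
```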
- Test the model:
```bash
python test.py --phase test --name <experiment_name> --dataroot ./datasets/<your_dataset> --n_domains <N> --which_epoch <checkpoint_number_to_load> --serial_test
```
The test results will be saved to an HTML file here: `./results/<experiment_name>/<epoch_number>/index.html`.
## Training/Testing Details
- Flags: see `options/train_options.py` for training-specific flags, `options/test_options.py` for test-specific flags, and `options/base_options.py` for all common flags.
- Dataset format: the desired data directory (provided by `--dataroot`) should contain subfolders of the form `train*/` and `test*/`, and they are loaded in alphabetical order. (Note that a folder named `train10` would be loaded before `train2`, and thus all checkpoints and results would be ordered accordingly.) An example layout is sketched after this list.
- CPU/GPU (default `--gpu_ids 0`): set `--gpu_ids -1` to use CPU mode; set `--gpu_ids 0,1,2` for multi-GPU mode. You need a large batch size (e.g. `--batchSize 32`) to benefit from multiple GPUs.
- Visualization: during training, the current results and loss plots can be viewed using two methods. First, if you set `--display_id` > 0, the results and loss plot will appear on a local graphics web server launched by visdom. To do this, you should have `visdom` installed and a server running via the command `python -m visdom.server`. The default server URL is `http://localhost:8097`. `display_id` corresponds to the window ID that is displayed on the `visdom` server. The `visdom` display functionality is turned on by default; to avoid the extra overhead of communicating with `visdom`, set `--display_id 0`. Second, the intermediate results are also saved to `./checkpoints/<experiment_name>/web/index.html`. To avoid this, set the `--no_html` flag.
- Preprocessing: images can be resized and cropped in different ways using the `--resize_or_crop` option. The default option `'resize_and_crop'` resizes the image to `(opt.loadSize, opt.loadSize)` and then takes a random crop of size `(opt.fineSize, opt.fineSize)`. `'crop'` skips the resizing step and only performs random cropping. `'scale_width'` resizes the image to have width `opt.fineSize` while keeping the aspect ratio. `'scale_width_and_crop'` first resizes the image to have width `opt.loadSize` and then does a random crop of size `(opt.fineSize, opt.fineSize)`.
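As an illustration of the dataset format described above, a hypothetical three-domain dataset (folder names chosen purely for this example, to be used with `--n_domains 3`) could be laid out as:
```
datasets/my_dataset/
├── trainA/   # training images for domain 0
├── trainB/   # training images for domain 1
├── trainC/   # training images for domain 2
├── testA/    # test images for domain 0
├── testB/    # test images for domain 1
└── testC/    # test images for domain 2
```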
NOTE: one should not expect ComboGAN to work on just any combination of input and output datasets (e.g. `dogs <-> houses`). We find it works better if two datasets share similar visual content. For example, `landscape painting <-> landscape photographs` works much better than `portrait painting <-> landscape photographs`.