<img src='imgs/teaser_720.gif' align="right" width=360>

<br><br><br><br>

pix2pixHD

Project | YouTube | Paper <br>

PyTorch implementation of our method for high-resolution (e.g. 2048x1024) photorealistic image-to-image translation. It can be used to turn semantic label maps into photorealistic images or to synthesize portraits from face label maps. <br><br>
High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs
Ting-Chun Wang<sup>1</sup>, Ming-Yu Liu<sup>1</sup>, Jun-Yan Zhu<sup>2</sup>, Andrew Tao<sup>1</sup>, Jan Kautz<sup>1</sup>, Bryan Catanzaro<sup>1</sup>
<sup>1</sup>NVIDIA Corporation, <sup>2</sup>UC Berkeley
In CVPR 2018.

Image-to-image translation at 2k/1k resolution

<p align='center'>
  <img src='imgs/teaser_label.png' width='400'/>
  <img src='imgs/teaser_ours.jpg' width='400'/>
</p>

- Interactive editing results
<p align='center'>
  <img src='imgs/teaser_style.gif' width='400'/>
  <img src='imgs/teaser_label.gif' width='400'/>
</p>

- Additional streetview results
<p align='center'>
  <img src='imgs/cityscapes_1.jpg' width='400'/>
  <img src='imgs/cityscapes_2.jpg' width='400'/>
</p>
<p align='center'>
  <img src='imgs/cityscapes_3.jpg' width='400'/>
  <img src='imgs/cityscapes_4.jpg' width='400'/>
</p>

<p align='center'>
  <img src='imgs/face1_1.jpg' width='250'/>
  <img src='imgs/face1_2.jpg' width='250'/>
  <img src='imgs/face1_3.jpg' width='250'/>
</p>
<p align='center'>
  <img src='imgs/face2_1.jpg' width='250'/>
  <img src='imgs/face2_2.jpg' width='250'/>
  <img src='imgs/face2_3.jpg' width='250'/>
</p>

<p align='center'>
  <img src='imgs/city_short.gif' width='330'/>
  <img src='imgs/face_short.gif' width='450'/>
</p>

Prerequisites

Getting Started

Installation

pip install dominate
git clone https://github.com/NVIDIA/pix2pixHD
cd pix2pixHD

Testing

#!./scripts/test_1024p.sh
python test.py --name label2city_1024p --netG local --ngf 32 --resize_or_crop none

The test results will be saved to an HTML file: ./results/label2city_1024p/test_latest/index.html.
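To locate the results page from a script, the path can be assembled from the experiment name. The sketch below assumes that the layout `./results/<name>/test_latest/index.html` generalizes from the single example given here; the helper `results_index` and its parameters are hypothetical and not part of the repository.

```python
from pathlib import Path


def results_index(name, results_dir="./results", run="test_latest"):
    """Return the expected location of the HTML results page.

    Assumption: the layout mirrors the example path in this README,
    i.e. <results_dir>/<name>/<run>/index.html.
    """
    return Path(results_dir) / name / run / "index.html"


if __name__ == "__main__":
    page = results_index("label2city_1024p")
    # Report whether the page exists before trying to open it in a browser.
    status = "found" if page.is_file() else "not found; run test.py first"
    print(f"{page} ({status})")
```

Checking that the file exists first avoids opening a stale or missing page when the test run has not finished.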

More example scripts can be found in the scripts directory.

Dataset

Training

#!./scripts/train_512p.sh
python train.py --name label2city_512p

Multi-GPU training

#!./scripts/train_512p_multigpu.sh
python train.py --name label2city_512p --batchSize 8 --gpu_ids 0,1,2,3,4,5,6,7

Note: multi-GPU training is not tested; we trained our model on a single GPU only. Please use it at your own discretion.
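The multi-GPU command pairs `--batchSize 8` with eight GPU ids, which works out to one sample per GPU. A minimal sketch that builds the same argument list for a different GPU count follows. The helper `multigpu_args` and the one-sample-per-GPU default are assumptions drawn from that single example, not a documented rule of the repository.

```python
def multigpu_args(name, num_gpus, samples_per_gpu=1):
    """Build a train.py argument list for `num_gpus` devices.

    Assumption: the batch size scales as samples_per_gpu * num_gpus,
    matching the 8-GPU example (batchSize 8, gpu_ids 0..7).
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    # GPU ids are numbered from 0 and passed as a comma-separated string.
    gpu_ids = ",".join(str(i) for i in range(num_gpus))
    return [
        "python", "train.py",
        "--name", name,
        "--batchSize", str(samples_per_gpu * num_gpus),
        "--gpu_ids", gpu_ids,
    ]


if __name__ == "__main__":
    # Print the command for 8 GPUs; pass the list to subprocess.run to launch it.
    print(" ".join(multigpu_args("label2city_512p", 8)))
```

Returning a list rather than a single string keeps the arguments safe to hand to `subprocess.run` without shell quoting.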

Training with Automatic Mixed Precision (AMP) for faster speed

#!./scripts/train_512p_fp16.sh
python -m torch.distributed.launch train.py --name label2city_512p --fp16

In our test case, it trains about 80% faster with AMP on a Volta machine.
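As a rough guide to what this means for wall-clock time: if "about 80% faster" is read as 1.8x throughput (an interpretation, not something stated here), training time shrinks to roughly 56% of the baseline. The helper `amp_hours` below is a hypothetical illustration of that arithmetic only; actual gains depend on hardware and model settings.

```python
def amp_hours(baseline_hours, faster_pct=80.0):
    """Estimate training time with AMP from a baseline duration.

    Assumption: "X% faster" means throughput is multiplied by
    (1 + X / 100), so time is divided by the same factor.
    """
    speedup = 1.0 + faster_pct / 100.0
    return baseline_hours / speedup


if __name__ == "__main__":
    # A run that takes 18 hours without AMP would take about 10 hours.
    print(f"{amp_hours(18.0):.1f} hours")
```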

Training at full resolution

Training with your own dataset

More Training/Test Details

Citation

If you find this useful for your research, please cite the following.

@inproceedings{wang2018pix2pixHD,
  title={High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs},
  author={Ting-Chun Wang and Ming-Yu Liu and Jun-Yan Zhu and Andrew Tao and Jan Kautz and Bryan Catanzaro},  
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  year={2018}
}

Acknowledgments

This code borrows heavily from pytorch-CycleGAN-and-pix2pix.