
<img src='imgs/teaser.gif' align="right" width=360>

<br><br><br><br>

vid2vid

Project | YouTube (short) | YouTube (full) | arXiv | Paper (full)

PyTorch implementation for high-resolution (e.g., 2048x1024) photorealistic video-to-video translation. It can be used to turn semantic label maps into photo-realistic videos, synthesize people talking from edge maps, or generate human motions from poses. The core of video-to-video translation is image-to-image translation; some of our work in that space can be found in pix2pixHD and SPADE.

Video-to-Video Synthesis
Ting-Chun Wang<sup>1</sup>, Ming-Yu Liu<sup>1</sup>, Jun-Yan Zhu<sup>2</sup>, Guilin Liu<sup>1</sup>, Andrew Tao<sup>1</sup>, Jan Kautz<sup>1</sup>, Bryan Catanzaro<sup>1</sup>
<sup>1</sup>NVIDIA Corporation, <sup>2</sup>MIT CSAIL
In Advances in Neural Information Processing Systems (NeurIPS) 2018

Video-to-Video Translation

<p align='center'> <img src='imgs/city_change_styles.gif' width='440'/> <img src='imgs/city_change_labels.gif' width='440'/> </p> <p align='center'> <img src='imgs/face.gif' width='440'/> <img src='imgs/face_multiple.gif' width='440'/> </p> <p align='center'> <img src='imgs/pose.gif' width='550'/> </p> <p align='center'> <img src='imgs/framePredict.gif' width='550'/> </p>

Prerequisites

Getting Started

Installation

```bash
# Install the required Python libraries
pip install dominate requests
# dlib is only needed if you plan to train with face datasets
pip install dlib
# Clone this repo
git clone https://github.com/NVIDIA/vid2vid
cd vid2vid
```

Testing
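
The sketch below shows a typical test workflow for the Cityscapes model. The helper script names follow this repo's scripts/ folder; the expanded test.py flags are assumptions and should be verified against ./scripts/street/test_2048.sh and the options/ folder before running.

```bash
# Download the example dataset and compile a snapshot of FlowNet2
# (used for optical flow estimation)
python scripts/download_datasets.py
python scripts/download_flownet2.py

# Download the pre-trained Cityscapes model and run the test script
python scripts/street/download_models.py
bash ./scripts/street/test_2048.sh
# The script expands to roughly:
#   python test.py --name label2city_2048 --label_nc 35 --loadSize 2048 \
#     --n_scales_spatial 3 --use_instance --fg --use_single_G
# Results are written under ./results/label2city_2048/test_latest/
```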

Dataset

Training with Cityscapes dataset

If you have TensorFlow installed, you can see TensorBoard logs in ./checkpoints/label2city_1024/logs by adding --tf_log to the training scripts.
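
As a rough sketch of a first training run (the authoritative flag values live in ./scripts/street/ and the options/ folder; treat the ones here as assumptions to verify):

```bash
# Download the FlowNet2 checkpoint used for flow estimation during
# training (script name assumed from the repo's scripts/ folder)
python scripts/download_models_flownet2.py

# Coarse-to-fine training: start at 512 x 256; higher resolutions
# (e.g., label2city_1024, label2city_2048) are trained on top of this
# model. Add --tf_log to write the TensorBoard logs mentioned above.
python train.py --name label2city_512 --label_nc 35 --loadSize 512 \
  --use_instance --fg --tf_log
```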

Training with face datasets
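
A sketch of a face-model training run, assuming a dataset laid out under datasets/face/ and flags mirroring the repo's face scripts; verify both against ./scripts/face/ before use:

```bash
# Edge-to-face training at 512 x 512 (these flag values are
# assumptions based on the repo's face scripts; check ./scripts/face/
# for the exact configuration)
python train.py --name edge2face_512 --dataroot datasets/face/ \
  --dataset_mode face --input_nc 15 --loadSize 512 --num_D 3
```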

Training with pose datasets
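
A comparable sketch for pose-driven training; the exact flags are defined in ./scripts/pose/ and options/, so treat every value below as an assumption:

```bash
# Pose-to-body training (a sketch only; consult ./scripts/pose/ for
# the real resolution, crop, and frame-count settings)
python train.py --name pose2body_256p --dataroot datasets/pose \
  --dataset_mode pose --input_nc 6 --loadSize 256 --num_D 2
```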

Training with your own dataset
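
For custom data, the key options are --label_nc (number of label classes when the input is a label map) and --input_nc. A minimal sketch follows; the dataset name my_dataset is hypothetical, and the option names follow the pix2pixHD-style options this repo inherits (see options/base_options.py):

```bash
# Train on a custom RGB dataset: --label_nc 0 bypasses the label-map
# pathway and --input_nc 3 feeds raw RGB frames ("my_dataset" is a
# hypothetical example name)
python train.py --name my_dataset --dataroot datasets/my_dataset \
  --label_nc 0 --input_nc 3 --loadSize 512
```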

More Training/Test Details

Citation

If you find this useful for your research, please cite the following paper.

```
@inproceedings{wang2018vid2vid,
  author    = {Ting-Chun Wang and Ming-Yu Liu and Jun-Yan Zhu and Guilin Liu
               and Andrew Tao and Jan Kautz and Bryan Catanzaro},
  title     = {Video-to-Video Synthesis},
  booktitle = {Advances in Neural Information Processing Systems (NeurIPS)},
  year      = {2018},
}
```

Acknowledgments

We thank Karan Sapra, Fitsum Reda, and Matthieu Le for generating the segmentation maps for us. We also thank Lisa Rhee for allowing us to use her dance videos for training. We thank William S. Peebles for proofreading the paper.<br> This code borrows heavily from pytorch-CycleGAN-and-pix2pix and pix2pixHD.