Visual Object Networks

<img src='imgs/teaser.jpg' width=820>

Project Page | Paper

We present Visual Object Networks (VON), an end-to-end adversarial learning framework that jointly models 3D shapes and 2D images. Our model can synthesize a 3D shape, its intermediate 2.5D depth representation, and a 2D image all at once. The VON not only generates realistic images but also enables several 3D operations.

Visual Object Networks: Image Generation with Disentangled 3D Representation.<br/> Jun-Yan Zhu, Zhoutong Zhang, Chengkai Zhang, Jiajun Wu, Antonio Torralba, Joshua B. Tenenbaum, William T. Freeman.<br/> MIT CSAIL and Google Research.<br/> In NeurIPS 2018.

Example results

(a) Typical examples produced by a recent GAN model [Gulrajani et al., 2017].<br/> (b) Our model produces three outputs: a 3D shape, its 2.5D projection given a viewpoint, and a final image with realistic texture.<br/> (c) Our model allows several 3D applications including editing viewpoint, shape, or texture independently.

<img src='imgs/overview.jpg' width=800>

More samples

Below we show more samples from DCGAN [Radford et al., 2016], LSGAN [Mao et al., 2017], WGAN-GP [Gulrajani et al., 2017], and our VON. For our method, we show both 3D shapes and 2D images. The learned 3D prior helps produce better samples.

<img src='imgs/samples.jpg' width=820>

3D object manipulations

Our VON allows several 3D applications such as (left) changing the viewpoint, texture, or shape independently, and (right) interpolating between two objects in shape space, texture space, or both.

<img src='imgs/app.jpg' width=820>

Transfer texture across objects and viewpoints

VON can transfer the texture of a real image to different shapes and viewpoints.

<img src='imgs/transfer.jpg' width=820>

Prerequisites

Getting Started

Installation

git clone -b master --single-branch https://github.com/junyanz/VON.git
cd VON
conda create --name von --file pkg_specs.txt
source activate von
bash install.sh

We have only tested this step with gcc 6.3.0. If you need to recompile the kernels, run bash clean.sh before recompiling.
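Because the build has only been tested with gcc 6.3.0, it can help to check the local compiler before running install.sh. The following is a minimal sketch and not part of the repository: the names check_gcc_version and tested_gcc are made up for illustration, and the check only prints a warning instead of blocking the build.

```shell
# Hypothetical pre-flight check; not shipped with VON.
# 6.3.0 is the only gcc version the authors report testing.
tested_gcc="6.3.0"

check_gcc_version() {
  # $1: version string of the local compiler
  if [ "$1" = "$tested_gcc" ]; then
    echo "gcc $1 matches the tested version"
  else
    echo "warning: gcc $1 differs from the tested version $tested_gcc"
  fi
}

if command -v gcc >/dev/null 2>&1; then
  # Passing both flags is a common idiom for getting the full version
  # from older and newer gcc releases alike.
  check_gcc_version "$(gcc -dumpfullversion -dumpversion 2>/dev/null)"
else
  echo "warning: gcc not found on PATH"
fi
```

A mismatch does not necessarily mean the build will fail; it only signals that the setup is outside what was tested.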

Alternatively, to build a Docker image, run:

sudo docker build ./../von -t von

To access the container, run:

sudo docker run -it --runtime=nvidia --ipc=host von /bin/bash

Then compile the kernels inside the container:

cd /app/von
source activate von
./install.sh

Generate 3D shapes, 2.5D sketches, and images

bash ./scripts/download_model.sh
bash ./scripts/figures.sh 0 car df

The test results will be saved to an HTML file here: ./results/*/*/index.html.
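To locate the generated pages, the glob pattern above can be expanded directly in the shell. This is a small sketch under the assumption that the results sit exactly two directory levels below the results root, as the pattern suggests; the helper name list_result_pages is hypothetical.

```shell
# Hypothetical helper: print every index.html two levels below a results root.
list_result_pages() {
  # $1: results root directory (defaults to ./results)
  root="${1:-./results}"
  for page in "$root"/*/*/index.html; do
    # When nothing matches, the pattern stays unexpanded; skip it.
    [ -e "$page" ] && echo "$page"
  done
  return 0
}

list_result_pages ./results
```

Each printed path can then be opened in a browser. If nothing has been generated yet, the function prints nothing.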

Model Training

bash ./scripts/download_dataset.sh
bash ./scripts/train_shape.sh 0 car df
bash ./scripts/train_texture_real.sh 0 car df 0
bash ./scripts/train_texture.sh 0 car df 0
bash ./scripts/train_full.sh 0 car df 0
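The commands above can be chained so that a failure in one stage stops the rest. The sketch below assumes the four training scripts are meant to run in the listed order, which the instructions here do not state explicitly. The names run_stage, train_all, and DRY_RUN are illustrative and not part of the repository. Arguments are passed through unchanged, and the trailing 0 on the last three scripts is copied from the commands above.

```shell
# Hypothetical wrapper around the training commands listed above.
# DRY_RUN=1 (the default here) only prints each command; DRY_RUN=0 executes it.
DRY_RUN="${DRY_RUN:-1}"

run_stage() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@" || { echo "stage failed: $*" >&2; return 1; }
  fi
}

train_all() {
  # The three positional arguments are forwarded as-is; their meaning
  # is defined by the repository's scripts, not by this wrapper.
  a="$1"; b="$2"; c="$3"
  run_stage bash ./scripts/train_shape.sh "$a" "$b" "$c" &&
  run_stage bash ./scripts/train_texture_real.sh "$a" "$b" "$c" 0 &&
  run_stage bash ./scripts/train_texture.sh "$a" "$b" "$c" 0 &&
  run_stage bash ./scripts/train_full.sh "$a" "$b" "$c" 0
}

train_all 0 car df
```

With the default dry run, the wrapper only echoes the four commands, which is a cheap way to check the arguments before starting a long training job.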

Citation

If you find this useful for your research, please cite the following paper.

@inproceedings{VON,
  title={Visual Object Networks: Image Generation with Disentangled 3{D} Representations},
  author={Jun-Yan Zhu and Zhoutong Zhang and Chengkai Zhang and Jiajun Wu and Antonio Torralba and Joshua B. Tenenbaum and William T. Freeman},
  booktitle={Advances in Neural Information Processing Systems (NeurIPS)},
  year={2018}
}

Acknowledgements

This work is supported by NSF #1231216, NSF #1524817, ONR MURI N00014-16-1-2007, Toyota Research Institute, Shell, and Facebook. We thank Xiuming Zhang, Richard Zhang, David Bau, and Zhuang Liu for valuable discussions. This code borrows from the CycleGAN & pix2pix repo.