Home

Awesome

Weakly supervised 3D Reconstruction with Adversarial Constraint

This repository contains source code for Weakly supervised 3D Reconstruction with Adversarial Constraint. This is a fork project of our previous work, 3D-R2N2: 3D Recurrent Reconstruction Neural Network. Inspired by visual hull algorithm, we propose to learn 3D reconstruct from 2D silhouettes using backpropable raytrace pooling operation. Additionally, in order to overcome the limitation of visual hull such as concavity, we propose to constrain the reconstruction on unlabeled 3D shapes with adversarial constraint. [Paper][Code][Poster][Slides][Bibtex][Website]

Proposed Method

Overview

Neural network-based 3D reconstruction requires a large scale annotation of ground-truth 3D model for every 2D image, which is infeasible for real-world application. In this paper, we explore relatively inexpensive 2D supervision as an alternative for expensive 3D CAD annotation.

main figure

We propose to learn 3D reconstruction from 2D supervision as following:

Raytrace Pooling

raytracing

In order to learn 3D shapes from 2D masks, we propose a backpropable raytrace pooling layer. Raytrace pooling layer takes viewpoint and 3D shape as an input and renders a corresponding 2D image, maxpooled across voxels hit through each ray of the pixel, as shown in figure above. This layer efficiently bridges the gap between 2D masks and 3D shapes, allowing us to apply loss inspired by visual hull. Moreover, this is an efficient implementation of true raytracing using voxel-ray hit test. Unlike sampling-based approaches, our implementation does not suffer from aliasing artifacts or sampling hyperparameter tuning.

Adversarial Constraint

adversarial_constraint_eq1

adversarial_constraint_eq2

adversarial_constraint_eq3

2D mask constraint, unlike full 3D shape constraint, cannot enforce concavity or symmetry, which may be crucial for 3D reconstruction. Therefore, we apply additional constraint so that the network will generate a "valid shape" as in equation (1) above. In order to solve equation (1), in this paper, we demonstrated that a constrained optimization problem can be tackled using GAN-like network structure and loss. Equation (1) can be re-written as equation (2) using log barrier method where g(x)=1 iff reconstruction x is realistic and 0 otherwise. We observed that the ideal discriminator of GAN g<sup>*</sup>(x), which outputs g<sup>*</sup>(x)=1 iff reconstruction x is realistic, is analogous to the penalty function g(x). Therefore, we train the constraint using GAN-like adversarial loss, as in equation (3).

Installation

Installing required packages

This project requires python3. You can follow the direction below to install virtual environment within the repository or install anaconda for python 3.

git clone https://github.com/jgwak/McRecon.git
cd McRecon
pip install virtualenv
virtualenv -p python3 --system-site-packages py3
source py3/bin/activate
pip install -r requirements.txt
cp .theanorc ~/.theanorc

Using cuDNN

cuDNN may significantly accelerate Theano and is highly recommended for training. To use cuDNN library, you have to download cuDNN from the nvidia website. Then, extract the files to any directory and append the directory to the environment variables like the following. Please replace the /path/to/cuDNN/ to the directory that you extracted cuDNN.

export LD_LIBRARY_PATH=/path/to/cuDNN/lib64:$LD_LIBRARY_PATH
export CPATH=/path/to/cuDNN/include:$CPATH
export LIBRARY_PATH=/path/to/cuDNN/lib64:$LD_LIBRARY_PATH

For more details, please refer to http://deeplearning.net/software/theano/library/sandbox/cuda/dnn.html

Running the code

Quickstart with demo.py

sudo apt-get install meshlab
python demo.py prediction.obj

The demo code takes 3 images of the same chair and generates the following reconstruction.

Image 1Image 2Image 3Reconstruction

Please note that our network successfully reconstructed concavity, which cannot be learned from mask consistency loss.

deactivate

Training the network

source py3/bin/activate
mkdir ShapeNet/
wget ftp://cs.stanford.edu/cs/cvgl/ShapeNetRendering.tgz
wget ftp://cs.stanford.edu/cs/cvgl/ShapeNetVox32.tgz
tar -xzf ShapeNetRendering.tgz -C ShapeNet/
tar -xzf ShapeNetVox32.tgz -C ShapeNet/
./experiments/script/gan_mask_net.sh

Note: It takes a long time (up to 30 minutes) for the network model to compile and start running.

Citing this work

If you find our work helpful, please cite it with the following bibtex.

@inproceedings{gwak2017weakly,
  title={Weakly supervised 3D Reconstruction with Adversarial Constraint},
  author={Gwak, JunYoung and Choy, Christopher B and Chandraker, Manmohan and Garg, Animesh and Savarese, Silvio},
  booktitle = {3D Vision (3DV), 2017 Fifth International Conference on 3D Vision},
  year={2017}
}

License

MIT License<br/> Original work by Christopher B. Choy<br/> Modified work by JunYoung Gwak