pytorch-layoutnet

News: Check out our new project HoHoNet on this task and more!
News: Check out our new project HorizonNet on this task.

This is an unofficial PyTorch implementation of the CVPR 2018 paper "LayoutNet: Reconstructing the 3D Room Layout from a Single RGB Image". The official layout datasets have all been converted to .png, and the official pretrained models have been converted to PyTorch state dicts.
Differences from the official implementation:

Overview of the pipeline:

With this repo, you can:

Requirements

Visualization

1. Preparation

2. Pre-processing (Align camera pose with floor)

3. Layout Prediction with LayoutNet

4. Layout 3D Viewer

Preparation for Training

/pytorch-layoutnet 
  /data
  | /origin
  |   /data  (download and extract from official)
  |   /gt    (download and extract from official)
  /ckpt
    /panofull_*_pretrained.t7  (download and extract from official)
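
Before training, it can help to confirm that the layout above is in place. The following is a minimal sketch, not part of this repo: the helper name `missing_training_files` is hypothetical, and it only checks that the paths shown in the tree exist, not that their contents are valid.

```python
from pathlib import Path


def missing_training_files(root):
    """Return the expected training inputs that are absent under `root`.

    Hypothetical helper: it mirrors the directory tree above and nothing more.
    """
    root = Path(root)
    missing = []
    # Dataset folders extracted from the official release.
    for sub in ("data/origin/data", "data/origin/gt"):
        if not (root / sub).is_dir():
            missing.append(sub)
    # Official pretrained weights matching panofull_*_pretrained.t7.
    if not list((root / "ckpt").glob("panofull_*_pretrained.t7")):
        missing.append("ckpt/panofull_*_pretrained.t7")
    return missing
```

Calling `missing_training_files(".")` from the repository root returns an empty list when everything in the tree is present, and otherwise the entries that still need to be downloaded.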

Training

See python train.py -h for a detailed explanation of the arguments.
The default training strategy is the same as the official one. To launch an experiment with the official "corner+boundary" setting (--id identifies the experiment and can be any name you choose):

python train.py --id exp_default

To train using only the RGB channels as input (no Manhattan line segments):

python train.py --id exp_rgb --input_cat img --input_channels 3

Gradient Ascent Post Optimization

Instead of the official 3D layout optimization with a sampling strategy, this repo implements a gradient ascent algorithm that optimizes an objective similar to the official loss.
The process is summarized below:

  1. Greedily extract the cuboid parameters from the corner/edge probability maps.
    • The cuboid is described by 6 parameters (cx, cy, dx, dy, theta, h).
    • Figures: corner probability map, edge probability map.
  2. Sample points along the cuboid boundary and project them onto the equirectangular corner/edge probability maps.
    • The projected sample points are visualized as green dots.
    • <img src="assert/output/demo_aligned_rgb_all.png" width=300>
  3. For each projected sample point, get a value by bilinear interpolation from the 4 nearest neighbor pixels on the corner/edge probability map.
  4. Reduce all sampled values to a single scalar called the score.
  5. Compute the gradient of the score with respect to the 6 cuboid parameters.
  6. Iteratively apply gradient ascent to maximize the score (repeat steps 2 through 6).

It takes less than 2 seconds on CPU and yields slightly better results than those officially reported.
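
The loop described above can be illustrated with a deliberately simplified toy, which is not the code used in this repo. The assumptions are: a flat 2D rectangle with 4 parameters (cx, cy, dx, dy) stands in for the 6-parameter cuboid, there is no equirectangular projection, the gradient is estimated by central finite differences, and every function name is hypothetical. The sample, interpolate, score, ascend structure is the part that carries over.

```python
import math


def bilinear(prob, x, y):
    """Interpolate prob (a list of rows) at continuous pixel position (x, y)."""
    h, w = len(prob), len(prob[0])
    x = min(max(x, 0.0), w - 1.0)  # clamp to the map
    y = min(max(y, 0.0), h - 1.0)
    x0 = min(int(x), w - 2)
    y0 = min(int(y), h - 2)
    fx, fy = x - x0, y - y0
    top = prob[y0][x0] * (1 - fx) + prob[y0][x0 + 1] * fx
    bot = prob[y0 + 1][x0] * (1 - fx) + prob[y0 + 1][x0 + 1] * fx
    return top * (1 - fy) + bot * fy


def boundary_points(params, n=16):
    """Sample n points on each side of an axis-aligned rectangle."""
    cx, cy, dx, dy = params
    pts = []
    for i in range(n):
        t = -1.0 + 2.0 * i / (n - 1)  # runs from -1 to 1
        pts += [(cx + t * dx, cy - dy), (cx + t * dx, cy + dy),
                (cx - dx, cy + t * dy), (cx + dx, cy + t * dy)]
    return pts


def score(prob, params):
    """Mean interpolated probability over the sampled boundary points."""
    pts = boundary_points(params)
    return sum(bilinear(prob, x, y) for x, y in pts) / len(pts)


def gradient(prob, params, eps=1e-3):
    """Central finite-difference estimate of d(score)/d(params)."""
    grad = []
    for i in range(len(params)):
        up = list(params)
        up[i] += eps
        down = list(params)
        down[i] -= eps
        grad.append((score(prob, up) - score(prob, down)) / (2 * eps))
    return grad


def gradient_ascent(prob, params, lr=2.0, steps=100):
    """Repeatedly move the parameters along the gradient of the score."""
    params = list(params)
    for _ in range(steps):
        g = gradient(prob, params)
        params = [p + lr * gi for p, gi in zip(params, g)]
    return params


def toy_edge_map(size=32, center=(16.0, 16.0), half=(8.0, 6.0), sigma=1.5):
    """Synthetic 'edge probability map': a blurred rectangle outline."""
    rows = []
    for y in range(size):
        row = []
        for x in range(size):
            qx = abs(x - center[0]) - half[0]
            qy = abs(y - center[1]) - half[1]
            outside = math.hypot(max(qx, 0.0), max(qy, 0.0))
            inside = min(max(qx, qy), 0.0)
            d = outside + inside  # signed distance to the outline
            row.append(math.exp(-d * d / (2 * sigma ** 2)))
        rows.append(row)
    return rows


prob = toy_edge_map()
start = [14.5, 17.0, 7.0, 7.0]  # deliberately misaligned initial guess
best = gradient_ascent(prob, start)
print(score(prob, start), score(prob, best))
```

In a PyTorch setting the finite-difference step would typically be replaced by automatic differentiation through the interpolation, which is one reason bilinear sampling is a natural choice: it makes the score differentiable with respect to the sample positions.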

Quantitative Evaluation

See python eval.py -h for a more detailed explanation of the arguments. To get the results from my trained network (link above):

python eval.py --path_prefix ckpt/epoch_30 --flip --rotate 0.333 0.666

To evaluate with gradient ascent post optimization:

python eval.py --path_prefix ckpt/epoch_30 --flip --rotate 0.333 0.666 --post_optimization
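
The commands above appear to use --flip and --rotate as test-time augmentation; check python eval.py -h for the exact semantics. Under that assumption, a horizontal rotation of an equirectangular panorama is a circular shift of its columns, and predictions from the augmented views can be mapped back and averaged. The sketch below uses plain nested lists and hypothetical names (`rotate`, `flip`, `predict_with_augmentation`), with `model` standing for any callable that maps an image to a same-sized prediction map.

```python
def rotate(rows, frac):
    """Circularly shift every row by `frac` of the image width."""
    w = len(rows[0])
    k = int(round(frac * w)) % w
    return [row[-k:] + row[:-k] if k else list(row) for row in rows]


def flip(rows):
    """Left-right flip."""
    return [row[::-1] for row in rows]


def predict_with_augmentation(model, image, flip_lr=True,
                              rotations=(0.333, 0.666)):
    """Average model outputs over augmented views, undoing each transform."""
    # Each view is a (forward transform, inverse transform) pair.
    views = [(lambda m: m, lambda m: m)]
    if flip_lr:
        views.append((flip, flip))
    for r in rotations:
        views.append((lambda m, r=r: rotate(m, r),
                      lambda m, r=r: rotate(m, -r)))
    h, w = len(image), len(image[0])
    total = [[0.0] * w for _ in range(h)]
    for fwd, inv in views:
        pred = inv(model(fwd(image)))  # map the prediction back to the input frame
        for y in range(h):
            for x in range(w):
                total[y][x] += pred[y][x]
    return [[v / len(views) for v in row] for row in total]
```

With an identity `model`, the averaged output equals the input, which is a quick way to check that every inverse transform really undoes its forward transform.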

Dataset - PanoContext

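| exp                     | 3D IoU(%) | Corner error(%) | Pixel error(%) |
| :---------------------- | :-------: | :-------------: | :------------: |
| Official best           |   75.12   |      1.02       |      3.18      |
| ours rgb only           |   71.42   |      1.30       |      3.83      |
| ours rgb only w/ gd opt |   72.52   |      1.50       |      3.66      |
| ours                    |   75.11   |      1.04       |      3.16      |
| ours w/ gd opt          |   76.90   |      0.93       |      2.81      |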

Dataset - Stanford 2D-3D

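| exp                     | 3D IoU(%) | Corner error(%) | Pixel error(%) |
| :---------------------- | :-------: | :-------------: | :------------: |
| Official best           |   77.51   |      0.92       |      2.42      |
| ours rgb only           |   70.39   |      1.50       |      4.28      |
| ours rgb only w/ gd opt |   71.90   |      1.35       |      4.25      |
| ours                    |   75.49   |      0.96       |      3.07      |
| ours w/ gd opt          |   78.90   |      0.88       |      2.78      |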
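
For readers unfamiliar with the metric columns, the sketch below shows how corner error and pixel error are commonly computed for a single image. It assumes the usual definitions (corner error as the mean L2 distance between matched predicted and ground-truth corners, normalized by the image diagonal; pixel error as the fraction of pixels whose surface label differs from the ground truth). The function names are hypothetical, and eval.py may differ in details such as corner matching and averaging.

```python
import math


def corner_error(pred, gt, width, height):
    """Mean corner distance as a percentage of the image diagonal.

    `pred` and `gt` are equally long lists of (x, y) pixel coordinates,
    assumed to be already matched one to one.
    """
    diag = math.hypot(width, height)
    dists = [math.hypot(px - gx, py - gy)
             for (px, py), (gx, gy) in zip(pred, gt)]
    return 100.0 * sum(dists) / len(dists) / diag


def pixel_error(pred_labels, gt_labels):
    """Percentage of pixels whose predicted surface label is wrong."""
    total = wrong = 0
    for prow, grow in zip(pred_labels, gt_labels):
        for p, g in zip(prow, grow):
            total += 1
            wrong += (p != g)
    return 100.0 * wrong / total
```

For example, a single corner that is 5 pixels off in a 30x40 image (diagonal 50) gives a corner error of 10%, and one mislabeled pixel out of four gives a pixel error of 25%.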

References