HEAT: Holistic Edge Attention Transformer for Structured Reconstruction

<img src="https://img.shields.io/badge/PyTorch-EE4C2C?style=for-the-badge&logo=PyTorch&logoColor=white" width="9%" /> License: GPL v3

Official implementation of the paper HEAT: Holistic Edge Attention Transformer for Structured Reconstruction (CVPR 2022).

[Project page], [Arxiv]

Please use the following bib entry to cite the paper if you are using resources from this repo.

@inproceedings{chen2022heat,
     title={HEAT: Holistic Edge Attention Transformer for Structured Reconstruction},
     author={Chen, Jiacheng and Qian, Yiming and Furukawa, Yasutaka},
     booktitle={IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
     year={2022}
} 

Introduction

<img src="assets/img/problem_description.png" width="90%">

This paper focuses on a typical family of structured reconstruction tasks: planar graph reconstruction. Two different tasks are included: 1) outdoor architecture reconstruction from a satellite image; and 2) floorplan reconstruction from a point density image. The figure above shows examples. The key contributions of the paper are:

<img src="assets/img/pipeline.png" width="90%">

As shown by the above figure, the overall pipeline of our method consists of three key steps: 1) edge node initialization; 2) edge image feature fusion and edge filtering; and 3) holistic structural reasoning with two weight-sharing transformer decoders. Please refer to the paper for more details.
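As a rough illustration of step 1, the sketch below enumerates candidate edges from a set of detected corners. It assumes that each candidate is an unordered pair of corners; this is a simplification for illustration, and the function name `enumerate_edge_candidates` is hypothetical rather than taken from the repo. The paper and the code define the actual initialization.

```python
from itertools import combinations


def enumerate_edge_candidates(corners):
    """Return every unordered pair of corner indices as a candidate edge.

    Assumption for this sketch: any two detected corners may form an edge,
    so N corners yield N * (N - 1) / 2 candidates.
    """
    return list(combinations(range(len(corners)), 2))


# Four hypothetical corner coordinates (x, y) in image space.
corners = [(10, 10), (10, 200), (200, 200), (200, 10)]
candidates = enumerate_edge_candidates(corners)
print(len(candidates), "candidate edges")
```

The candidate count grows quadratically with the number of corners, which is one reason a filtering step such as step 2 is useful before the more expensive holistic reasoning.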

This repo provides the code, data, and pre-trained checkpoints of HEAT for the two tasks covered in the paper.

Preparation

Note: The code, data, and pre-trained models in this repo are for non-commercial research purposes only. Please check the LICENSE file for details.

Environment

This repo was developed and tested with Python 3.7.

Install the required packages and compile the deformable-attention modules (from Deformable-DETR):

pip install -r requirements.txt
cd models/ops/
sh make.sh
cd ../..
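After installation, a quick sanity check can confirm that the interpreter version and the key modules are in place. The snippet below is a minimal sketch: the helper `check_environment` is hypothetical, and the extension name `MultiScaleDeformableAttention` is the one used by Deformable-DETR, assumed here to carry over unchanged.

```python
import importlib.util
import sys


def check_environment(module_names, min_python=(3, 7)):
    """Report whether Python is recent enough and which modules can be found."""
    report = {"python_ok": sys.version_info[:2] >= min_python}
    for name in module_names:
        # find_spec returns None when a top-level module is not installed.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    # The extension name below is an assumption based on Deformable-DETR.
    modules = ["torch", "MultiScaleDeformableAttention"]
    for key, ok in check_environment(modules).items():
        print(f"{key}: {'ok' if ok else 'MISSING'}")
```

If the extension is reported as missing, re-running `sh make.sh` inside `models/ops/` and reading its build output is the first thing to try.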

Data

Please download the data for the two tasks from the link here. Extract the data into the ./data directory.

The file structure should be like the following:

data
├── outdoor
│   ├── cities_dataset  # the outdoor architecture dataset from previous works
│   │      ├── annot    # the G.T. planar graphs
│   │      ├── rgb      # the input images
│   │      ├── ......   # dataset splits, miscs
│   │
│   └── det_finals   # corner detection results from previous works (not used by our full method, but used for ablation studies) 
│
└── s3d_floorplan       # the Structured3D floorplan dataset, produced with the scripts from MonteFloor
    ├── annot           # the G.T. planar graphs 
    │
    ├── density         # the point density images
    │
    └── ......          # dataset splits, miscs

Note that the Structured3D floorplan data is generated with the scripts provided by MonteFloor [1]. We thank the authors for kindly sharing the processing scripts. Please cite their paper if you use the corresponding resources.
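To verify that the data was extracted into the layout shown above, a small check like the following can help. It is a sketch only: the helper `missing_data_dirs` is hypothetical, and it tests just the directories named in the tree, not the split files or other contents.

```python
from pathlib import Path

# Directories named in the file structure above, relative to the data root.
EXPECTED_DIRS = [
    "outdoor/cities_dataset/annot",
    "outdoor/cities_dataset/rgb",
    "outdoor/det_finals",
    "s3d_floorplan/annot",
    "s3d_floorplan/density",
]


def missing_data_dirs(data_root="./data"):
    """Return the expected sub-directories that do not exist under data_root."""
    root = Path(data_root)
    return [rel for rel in EXPECTED_DIRS if not (root / rel).is_dir()]


if __name__ == "__main__":
    missing = missing_data_dirs()
    if missing:
        print("Missing directories:")
        for rel in missing:
            print("  ", rel)
    else:
        print("Data layout looks complete.")
```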

Data preprocessing for floorplan reconstruction (Optional)

All the data used in our paper are provided in the download links above. However, if you are interested in the data preparation process for the floorplan reconstruction task, please refer to the s3d_preprocess directory, in which we provide the scripts and a brief doc.

Checkpoints

We provide the checkpoints for our full method under this link. Please download and extract them.

Inference, evaluation, and visualization

We provide the instructions to run the inference, quantitative evaluation, and qualitative visualization in this section.

Outdoor architecture reconstruction

Floorplan reconstruction

Training

Set up the training arguments in arguments.py, and then run the training by:

CUDA_VISIBLE_DEVICES={gpu_ids} python train.py

Or specify the key arguments in the command line and run the outdoor experiment by:

CUDA_VISIBLE_DEVICES={gpu_ids} python train.py  --exp_dataset outdoor  --epochs 800 --lr_drop 600  --batch_size 16  --output_dir ./checkpoints/ckpts_heat_outdoor_256  --image_size 256  --max_corner_num 150  --lambda_corner 0.05  --run_validation

Or run the S3D floorplan experiment by:

CUDA_VISIBLE_DEVICES={gpu_ids} python train.py  --exp_dataset s3d_floorplan  --epochs 400 --lr_drop 300  --batch_size 16  --output_dir ./checkpoints/ckpts_heat_s3d_256  --image_size 256  --max_corner_num 200  --lambda_corner 0.10  --run_validation

With the default setting (e.g., model setup, batch size, etc.), training the full HEAT (i.e., the end-to-end corner and edge modules) needs at least 2 GPUs with ~16GB memory each.
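For scripted or repeated runs, the commands above can be assembled programmatically. The sketch below builds the argument list from the flags shown in this section; the helper `build_train_command` and the GPU ids `0,1` are illustrative assumptions, not part of the repo.

```python
import os


def build_train_command(exp_dataset, epochs, lr_drop, output_dir,
                        batch_size=16, image_size=256,
                        max_corner_num=150, lambda_corner=0.05,
                        run_validation=True):
    """Assemble the train.py argument list using the flags shown above."""
    cmd = [
        "python", "train.py",
        "--exp_dataset", exp_dataset,
        "--epochs", str(epochs),
        "--lr_drop", str(lr_drop),
        "--batch_size", str(batch_size),
        "--output_dir", output_dir,
        "--image_size", str(image_size),
        "--max_corner_num", str(max_corner_num),
        "--lambda_corner", str(lambda_corner),
    ]
    if run_validation:
        cmd.append("--run_validation")
    return cmd


# Settings of the S3D floorplan experiment from the command above.
cmd = build_train_command(
    "s3d_floorplan", epochs=400, lr_drop=300,
    output_dir="./checkpoints/ckpts_heat_s3d_256",
    max_corner_num=200, lambda_corner=0.10,
)
# GPU ids are an assumption; pick the devices available on your machine.
env = dict(os.environ, CUDA_VISIBLE_DEVICES="0,1")
print(" ".join(cmd))
# To launch: import subprocess; subprocess.run(cmd, env=env, check=True)
```

Passing the command as a list avoids shell quoting issues, and setting `CUDA_VISIBLE_DEVICES` through `env` keeps the GPU selection local to the launched process.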

References

[1]. Stekovic, Sinisa, Mahdi Rad, Friedrich Fraundorfer and Vincent Lepetit. “MonteFloor: Extending MCTS for Reconstructing Accurate Large-Scale Floor Plans.” 2021 IEEE/CVF International Conference on Computer Vision (ICCV) (2021): 16014-16023.

[2]. Zhang, Fuyang, Xiangyu Xu, Nelson Nauata and Yasutaka Furukawa. “Structured Outdoor Architecture Reconstruction by Exploration and Classification.” 2021 IEEE/CVF International Conference on Computer Vision (ICCV) (2021): 12407-12415.