BUOL: A Bottom-Up Framework with Occupancy-aware Lifting for Panoptic 3D Scene Reconstruction From A Single Image
Introduction
This is the official release for the paper BUOL: A Bottom-Up Framework with Occupancy-aware Lifting for Panoptic 3D Scene Reconstruction From A Single Image.
<p align="left"> <img width=640 src="resources/framework.png"/> </p>
BUOL: A Bottom-Up Framework with Occupancy-aware Lifting for Panoptic 3D Scene Reconstruction From A Single Image
Tao Chu, Pan Zhang, Qiong Liu, Jiaqi Wang
The IEEE / CVF Computer Vision and Pattern Recognition Conference (CVPR 2023)
Results
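The results of BUOL on each dataset are shown below, together with the released models. PRQ, RSQ, and RRQ are the panoptic reconstruction quality, reconstructed segmentation quality, and reconstructed recognition quality metrics of Dahnert et al.; PRQ_th and PRQ_st report PRQ over thing and stuff classes, respectively.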
dataset | PRQ | RSQ | RRQ | PRQ_th | PRQ_st | Download |
---|---|---|---|---|---|---|
3D FRONT | 54.05 | 63.72 | 83.14 | 49.77 | 73.34 | front3d.pth |
Matterport3D | 14.54 | 45.91 | 31.08 | 11.02 | 25.09 | matterport3d.pth |
Installation
Create the environment.
conda create -n buol -y
conda activate buol
conda install pytorch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0 cudatoolkit=11.1 -c pytorch -c conda-forge -y
Install MinkowskiEngine.
git clone https://github.com/NVIDIA/MinkowskiEngine.git
cd MinkowskiEngine
python setup.py install --blas_include_dirs=${CONDA_PREFIX}/include --blas=openblas --force_cuda
Install PyMarchingCubes.
git clone https://github.com/xheon/PyMarchingCubes.git
cd PyMarchingCubes
git clone https://gitlab.com/libeigen/eigen.git
python setup.py install
Install other dependency packages.
pip install yacs fvcore
pip install opencv-python
conda install -c conda-forge openexr-python -y
pip install pyexr
pip install matplotlib
pip install plyfile
pip install loguru
pip install scipy
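After the steps above, a quick import check can confirm that the environment is complete before running the demo. The following is a minimal sketch, not part of the repository; the import names in `REQUIRED_MODULES` are assumptions inferred from the installed packages (for example, PyMarchingCubes is assumed to import as `marching_cubes` and opencv-python as `cv2`), so adjust them if your versions differ.

```python
import importlib.util

# Assumed import names for the packages installed above (not taken from the repo).
REQUIRED_MODULES = [
    "torch", "torchvision", "MinkowskiEngine", "marching_cubes",
    "yacs", "fvcore", "cv2", "OpenEXR", "pyexr",
    "matplotlib", "plyfile", "loguru", "scipy",
]


def find_missing(modules):
    """Return the names in `modules` that cannot be located in this environment."""
    missing = []
    for name in modules:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # Treat lookup failures the same as a missing module.
            spec = None
        if spec is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All expected modules were found.")
```

The check only locates modules without importing them, so it will not catch a CUDA or compiler mismatch inside MinkowskiEngine; importing the package directly is the stricter test.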
Run
Demo
Download front3d.pth, put it at models/front3d.pth, and run:
python demo.py
Train
Download the datasets and put them in datasets/<dataset_name>, following the structure described under Datasets. Then set GPUS (e.g. GPUS: (0, 1, 2, 3)) and MODEL.EVAL: False in the config file, and train with multiple GPUs:
python -m torch.distributed.launch --nproc_per_node=4 main.py --cfg configs/front.yaml
Test
Download the model or train one yourself, then set MODEL.WEIGHTS to the model path. Set GPUS: (0,) and MODEL.EVAL: True in the config file, and test with one GPU:
python main.py --cfg configs/front.yaml
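The train and test commands above differ only in how many processes are launched. The sketch below builds either command from a GPU tuple; it assumes that `--nproc_per_node` should equal the number of entries in `GPUS`, which matches the four-GPU example above but is not stated explicitly in this README. `build_command` is a hypothetical helper, not a function from the repository.

```python
def build_command(gpus, cfg_path):
    """Build the launch command for the given GPU ids and config file.

    Assumption: one process per GPU listed in the config's GPUS entry.
    """
    num_gpus = len(gpus)
    if num_gpus < 1:
        raise ValueError("at least one GPU id is required")
    if num_gpus == 1:
        # Single-GPU run, as used for testing.
        return ["python", "main.py", "--cfg", cfg_path]
    # Multi-GPU run through torch.distributed.launch, as used for training.
    return [
        "python", "-m", "torch.distributed.launch",
        f"--nproc_per_node={num_gpus}",
        "main.py", "--cfg", cfg_path,
    ]


if __name__ == "__main__":
    print(" ".join(build_command((0, 1, 2, 3), "configs/front.yaml")))
    print(" ".join(build_command((0,), "configs/front.yaml")))
```

The helper only assembles the command line; `GPUS`, `MODEL.EVAL`, and `MODEL.WEIGHTS` still have to be set in the config file itself.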
Datasets
3D FRONT
3D FRONT is a synthetic indoor dataset. We process it in the same way as Dahnert et al. (Panoptic 3D Scene Reconstruction from a Single RGB Image); you can download the processed data or the processing code from that project.
Structure
front3d/
<scene_id>/
├── rgb_<frame_id>.png # Color image: 320x240x3
├── depth_<frame_id>.exr # Depth image: 320x240x1
├── segmap_<frame_id>.mapped.npz # 2D Segmentation: 320x240x2, with 0: pre-mapped semantics, 1: instances
├── geometry_<frame_id>.npz # 3D Geometry: 256x256x256x1, truncated, (unsigned) distance field at 3cm voxel resolution and 12 voxel truncation.
├── segmentation_<frame_id>.mapped.npz # 3D Segmentation: 256x256x256x2, with 0: pre-mapped semantics & instances
├── weighting_<frame_id>.mapped.npz # 3D Weighting mask: 256x256x256x1
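Since each frame is spread over six files, it can help to verify a scene directory before training. The following sketch derives the expected file names from the layout above and reports which ones are absent. The helper names and the example frame id `"0001"` are hypothetical; only the file name patterns come from the structure listing.

```python
from pathlib import Path

# File name patterns copied from the front3d/ structure listing.
FRONT3D_PATTERNS = (
    "rgb_{frame_id}.png",
    "depth_{frame_id}.exr",
    "segmap_{frame_id}.mapped.npz",
    "geometry_{frame_id}.npz",
    "segmentation_{frame_id}.mapped.npz",
    "weighting_{frame_id}.mapped.npz",
)


def expected_files(scene_dir, frame_id):
    """Return the six paths a complete frame should have inside scene_dir."""
    scene_dir = Path(scene_dir)
    return [scene_dir / p.format(frame_id=frame_id) for p in FRONT3D_PATTERNS]


def missing_files(scene_dir, frame_id):
    """Return the expected paths that do not exist on disk."""
    return [p for p in expected_files(scene_dir, frame_id) if not p.is_file()]


if __name__ == "__main__":
    # "<scene_id>" is a placeholder; replace it with a real scene directory.
    for path in missing_files("datasets/front3d/<scene_id>", "0001"):
        print("missing:", path)
```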
Matterport3D
Matterport3D is a real-world indoor dataset. We follow Dahnert et al. to preprocess it. In addition, we generate the depth images and room masks by rendering the 3D scenes instead of using the original versions.
Structure
matterport/
<scene_id>/
├── <image_id>_i<frame_id>.png # Color image: 320x240x3
├── <image_id>_segmap<frame_id>.mapped.npz # 2D Segmentation: 320x240x2, with 0: pre-mapped semantics, 1: instances
├── <image_id>_intrinsics_<camera_id>.png # Intrinsics matrix: 4x4
├── <image_id>_geometry<frame_id>.npz # 3D Geometry: 256x256x256x1, truncated, (unsigned) distance field at 3cm voxel resolution and 12 voxel truncation.
├── <image_id>_segmentation<frame_id>.mapped.npz # 3D Segmentation: 256x256x256x2, with 0: pre-mapped semantics & instances
├── <image_id>_weighting<frame_id>.npz # 3D Weighting mask: 256x256x256x1
matterport_depth_gen/
<scene_id>/
├── <position_id>_d<frame_id>.png # Depth image: 320x240x1
matterport_room_mask/
<scene_id>/
├── <position_id>_rm<frame_id>.png # Room mask: 320x240x1
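For both datasets the 3D geometry is described as a 256x256x256 grid at 3cm voxel resolution with a 12 voxel truncation. The short calculation below converts those numbers into metric quantities. It is only arithmetic on the values stated in the structure listings; the constant names are illustrative, and it makes no claim about the units in which distances are stored inside the .npz files.

```python
# Values taken from the structure listings above.
GRID_DIM = 256          # voxels per axis
VOXEL_SIZE_CM = 3       # edge length of one voxel, in centimetres
TRUNCATION_VOXELS = 12  # truncation of the distance field, in voxels


def grid_extent_cm():
    """Side length of the reconstructed volume, in centimetres."""
    return GRID_DIM * VOXEL_SIZE_CM


def truncation_distance_cm():
    """Distance at which the distance field is truncated, in centimetres."""
    return TRUNCATION_VOXELS * VOXEL_SIZE_CM


def voxel_count():
    """Total number of voxels in one grid."""
    return GRID_DIM ** 3


if __name__ == "__main__":
    print("extent per axis (cm):", grid_extent_cm())
    print("truncation distance (cm):", truncation_distance_cm())
    print("voxels per grid:", voxel_count())
```

So each grid covers 7.68 m per axis, and distances are truncated at 36 cm from the nearest surface.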
Citation
@inproceedings{chu2023buol,
title={BUOL: A Bottom-Up Framework With Occupancy-Aware Lifting for Panoptic 3D Scene Reconstruction From a Single Image},
author={Chu, Tao and Zhang, Pan and Liu, Qiong and Wang, Jiaqi},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={4937--4946},
year={2023}
}