OccDepth: A Depth-aware Method for 3D Semantic Occupancy Network

News

Abstract

In this paper, we propose OccDepth, the first stereo semantic scene completion (SSC) method, which fully exploits implicit depth information from stereo images (or RGBD images) to help recover 3D geometric structures. The Stereo Soft Feature Assignment (Stereo-SFA) module is proposed to better fuse 3D depth-aware features by implicitly learning the correlation between stereo images. In particular, when the input is an RGBD image, virtual stereo images can be generated from the original RGB image and its depth map. In addition, the Occupancy Aware Depth (OAD) module is used to obtain geometry-aware 3D features through knowledge distillation from pre-trained depth models.
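
To make the RGBD case more concrete, here is a minimal sketch of how a virtual right view could be synthesized from a left RGB image and its depth map. It uses a standard disparity-based forward warp (`disparity = focal_px * baseline_m / depth`) and is not taken from the OccDepth code base; the function name `virtual_right_view` and the parameters `focal_px` and `baseline_m` are assumptions made for illustration. Plain Python lists keep the example dependency-free.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]


def virtual_right_view(
    rgb: List[List[Pixel]],
    depth: List[List[float]],
    focal_px: float,
    baseline_m: float,
) -> List[List[Optional[Pixel]]]:
    """Forward-warp a left image into a hypothetical rectified right view.

    A point seen at column u in the left image lands at column
    u - disparity in the right image, where
    disparity = focal_px * baseline_m / depth (in pixels).
    Pixels that receive no source value stay None (holes).
    """
    height, width = len(rgb), len(rgb[0])
    right: List[List[Optional[Pixel]]] = [[None] * width for _ in range(height)]
    # Depth of the closest point written so far, used as a z-buffer.
    nearest = [[float("inf")] * width for _ in range(height)]
    for v in range(height):
        for u in range(width):
            z = depth[v][u]
            if z <= 0:  # skip invalid depth readings
                continue
            disparity = focal_px * baseline_m / z
            u_right = int(round(u - disparity))
            # Keep the nearest surface when several pixels map to one column.
            if 0 <= u_right < width and z < nearest[v][u_right]:
                nearest[v][u_right] = z
                right[v][u_right] = rgb[v][u]
    return right


if __name__ == "__main__":
    row = [[(1, 1, 1), (2, 2, 2), (3, 3, 3), (4, 4, 4)]]
    depths = [[2.0, 2.0, 2.0, 1.0]]
    print(virtual_right_view(row, depths, focal_px=2.0, baseline_m=1.0))
    # prints [[(2, 2, 2), (4, 4, 4), None, None]]
```

In the usage example the last pixel is closer to the camera, so it shifts further and occludes the pixel behind it. The holes left on the right side would need inpainting or masking in a real pipeline.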

Video Demo

Mesh results compared with ground truth on KITTI-08:

<p align="center"> <img src="./assets/demo.gif" alt="video loading..." /> </p>

Voxel results compared with ground truth on KITTI-08:

<p align="center"> <img src="./assets/demo_voxel.gif" alt="video loading..." /> </p>

Full demo videos can be downloaded via `git lfs pull`; they are saved as "assets/demo.mp4" and "assets/demo_voxel.mp4".

Results

Trained models

The following models, trained on GeForce RTX 2080 Ti, are provided:

| Config | Dataset | IoU | mIoU | Download |
|---|---|---|---|---|
| config | SemanticKITTI | 41.60 | 12.84 | model |
| config | NYUv2 | 49.23 | 29.34 | model |

Note: For better results, set `share_2d_backbone_gradient = false`, `backbone_2d_name = tf_efficientnet_b7_ns`, and `feature = feature_2d_oc = 64` (SemanticKITTI). These settings need more GPU memory.

Qualitative Results

<div align="center"> <img width=374 src="./assets/result1-1.png"/><img width=400 src="./assets/result1-2.png"/>

Fig. 1: RGB-based semantic scene completion with and without depth awareness. (a) Our proposed OccDepth method can detect smaller and farther objects. (b) Our proposed OccDepth method completes the road better.

</div>

Quantitative results on SemanticKITTI

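<div align="center"> Table 1. Performance on SemanticKITTI (hidden test set).
<table>
<tr><th>Method</th><th>Input</th><th>SC IoU</th><th>SSC mIoU</th></tr>
<tr><th colspan="4">2.5D/3D</th></tr>
<tr><td>LMSCNet(st)</td><td>OCC</td><td>33.00</td><td>5.80</td></tr>
<tr><td>AICNet(st)</td><td>RGB, DEPTH</td><td>32.8</td><td>6.80</td></tr>
<tr><td>JS3CNet(st)</td><td>PTS</td><td>39.30</td><td>9.10</td></tr>
<tr><th colspan="4">2D</th></tr>
<tr><td>MonoScene</td><td>RGB</td><td>34.16</td><td>11.08</td></tr>
<tr><td>MonoScene(st)</td><td>Stereo RGB</td><td>40.84</td><td>13.57</td></tr>
<tr><td>OccDepth (ours)</td><td>Stereo RGB</td><td>45.10</td><td>15.90</td></tr>
</table>
</div>

The scene completion (SC IoU) and semantic scene completion (SSC mIoU) are reported for modified baselines (marked with "st") and our OccDepth.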
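
For readers unfamiliar with the two metrics, the following sketch shows one common way to compute them from flattened voxel labels. It is an illustration under stated assumptions, not the evaluation code used for the table: label `0` is assumed to mean empty space, label `255` is assumed to mark voxels excluded from scoring, and classes that never appear in either prediction or ground truth are skipped when averaging (some implementations count them as zero instead). The function name `ssc_metrics` is hypothetical.

```python
from typing import Sequence, Tuple

EMPTY = 0     # assumed label for free space
IGNORE = 255  # assumed label for voxels excluded from evaluation


def ssc_metrics(
    pred: Sequence[int], target: Sequence[int], n_classes: int
) -> Tuple[float, float]:
    """Return (SC IoU, SSC mIoU) for flattened voxel label sequences.

    SC IoU treats every non-empty label as "occupied" and scores the
    binary occupancy. SSC mIoU averages the per-class IoU over the
    semantic classes 1..n_classes-1 (the empty class is left out).
    """
    inter_occ = union_occ = 0
    inter = [0] * n_classes
    union = [0] * n_classes
    for p, t in zip(pred, target):
        if t == IGNORE:
            continue
        p_occ, t_occ = p != EMPTY, t != EMPTY
        inter_occ += int(p_occ and t_occ)
        union_occ += int(p_occ or t_occ)
        if p == t:
            inter[p] += 1
            union[p] += 1
        else:
            # A mismatch is a false positive for p and a false negative for t.
            union[p] += 1
            union[t] += 1
    sc_iou = inter_occ / union_occ if union_occ else 0.0
    ious = [inter[c] / union[c] for c in range(1, n_classes) if union[c]]
    ssc_miou = sum(ious) / len(ious) if ious else 0.0
    return sc_iou, ssc_miou


if __name__ == "__main__":
    sc, miou = ssc_metrics([0, 1, 1, 2, 2, 0], [0, 1, 2, 2, 255, 1], n_classes=3)
    print(f"SC IoU={sc:.4f}, SSC mIoU={miou:.4f}")
    # prints SC IoU=0.7500, SSC mIoU=0.4167
```

The example shows why the two numbers can diverge: a voxel predicted as the wrong semantic class still counts as correctly occupied for SC IoU but lowers SSC mIoU.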

Detailed results on SemanticKITTI.

<div align="center"> <img src="./assets/result2.png"/> </div>

Compared with baselines.

<div align="center"> <img width=400 src="./assets/result3.png"/> </div>

Baselines of 2.5D/3D-input methods. "*" means results are cited from MonoScene. "/" means missing results.

Usage

Environment

  1. Create conda environment:
    conda create -y -n occdepth python=3.7
    conda activate occdepth
    conda install pytorch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 pytorch-cuda=11.7 -c pytorch -c nvidia
  2. Install dependencies:
    pip install -r requirements.txt
    conda install -c bioconda tbb=2020.2

Preparing

SemanticKITTI

NYUv2

Settings

  1. Set `DATA_LOG` and `DATA_CONFIG` in `env_{dataset}.sh`, for example:
    ##examples
    export DATA_LOG=$workdir/logdir/semanticKITTI
    export DATA_CONFIG=$workdir/occdepth/config/semantic_kitti/multicam_flospdepth_crp_stereodepth_cascadecls_2080ti.yaml
    
  2. Set `data_root`, `data_preprocess_root` and `data_stereo_depth_root` in the config file (occdepth/config/xxxx.yaml), for example:
    ##examples
    data_root: '/data/dataset/KITTI_Odometry_Semantic'
    data_preprocess_root: '/data/dataset/kitti_semantic_preprocess'
    data_stereo_depth_root: '/data/dataset/KITTI_Odometry_Stereo_Depth'
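
Before launching the scripts, it can help to confirm that the two environment variables from step 1 are actually set in the current shell. The snippet below is a small, optional pre-flight check and not part of the repository; the helper name `missing_settings` is hypothetical, and it only inspects `DATA_LOG` and `DATA_CONFIG` (it does not parse the YAML config).

```python
import os
from pathlib import Path
from typing import List, Mapping

REQUIRED_ENV = ("DATA_LOG", "DATA_CONFIG")


def missing_settings(env: Mapping[str, str]) -> List[str]:
    """Return human-readable problems found in the environment settings."""
    problems = []
    for name in REQUIRED_ENV:
        if not env.get(name):
            problems.append(f"{name} is not set")
    config = env.get("DATA_CONFIG")
    # DATA_CONFIG should point at an existing YAML config file.
    if config and not Path(config).is_file():
        problems.append(f"DATA_CONFIG does not point to a file: {config}")
    return problems


if __name__ == "__main__":
    for problem in missing_settings(os.environ):
        print(problem)
```

Run it after `source env_{dataset}.sh`; no output means both variables are set and the config file exists.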
    

Inference

    cd OccDepth/
    source env_{dataset}.sh
    ## move the trained model to OccDepth/trained_models/occdepth.ckpt
    ## 4 gpus and batch size on each gpu is 1
    python occdepth/scripts/generate_output.py n_gpus=4 batch_size_per_gpu=1

Evaluation

    cd OccDepth/
    source env_{dataset}.sh
    ## move the trained model to OccDepth/trained_models/occdepth.ckpt
    ## 1 gpu and batch size on each gpu is 1
    python occdepth/scripts/eval.py n_gpus=1 batch_size_per_gpu=1

Training

    cd OccDepth/
    source env_{dataset}.sh
    ## 4 gpus and batch size on each gpu is 1
    python occdepth/scripts/train.py logdir=${DATA_LOG} n_gpus=4 batch_size_per_gpu=1

License

This repository is released under the Apache 2.0 license as found in the LICENSE file.

Acknowledgements

Our code is based on these excellent open source projects:

Many thanks to them!

Related Repos

Citation

If you find this project useful in your research, please consider citing:

@article{miao2023occdepth,
Author = {Ruihang Miao and Weizhou Liu and Mingrui Chen and Zheng Gong and Weixin Xu and Chen Hu and Shuchang Zhou},
Title = {OccDepth: A Depth-Aware Method for 3D Semantic Scene Completion},
journal = {arXiv:2302.13540},
Year = {2023},
}

Contact

If you have any questions, feel free to open an issue or contact us at miaoruihang@megvii.com, huchen@megvii.com.