PointOcc: Cylindrical Tri-Perspective View for Point-based 3D Semantic Occupancy Prediction
Sicheng Zuo*, Wenzhao Zheng*†, Yuanhui Huang, Jie Zhou, Jiwen Lu‡

\* Equal contribution&emsp;† Project leader&emsp;‡ Corresponding author
Highlights
- PointOcc enables the use of 2D image backbones for efficient point-based 3D semantic occupancy prediction.
- LiDAR-only PointOcc even outperforms LiDAR & camera multi-modal methods by a large margin.
Introduction
Semantic segmentation in autonomous driving has been evolving from sparse point segmentation to dense voxel segmentation, where the objective is to predict the semantic occupancy of each voxel in the 3D space of interest. The dense nature of the prediction space renders existing efficient 2D-projection-based methods (e.g., bird's-eye view, range view) ineffective, as they can only describe a subspace of the 3D scene. To address this, we propose a cylindrical tri-perspective view (TPV) to represent point clouds effectively and comprehensively, and a PointOcc model to process them efficiently. Considering the distance distribution of LiDAR point clouds, we construct the tri-perspective view in the cylindrical coordinate system for more fine-grained modeling of nearer areas. We employ spatial group pooling to preserve structural details during projection and adopt 2D backbones to efficiently process each TPV plane. Finally, we obtain the features of each point by aggregating its projected features on each of the processed TPV planes, without any post-processing. Extensive experiments on both 3D occupancy prediction and LiDAR segmentation benchmarks demonstrate that the proposed PointOcc achieves state-of-the-art performance at much faster speed. In particular, despite using only LiDAR, PointOcc significantly outperforms all other methods, including multi-modal ones, by a large margin on the OpenOccupancy benchmark.
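The pipeline above (cylindrical projection, pooling onto three TPV planes, per-point aggregation) can be sketched in a few lines of NumPy. This is an illustrative toy, not the repo's API: the function name and grid sizes are made up, and a global max pool along each collapsed axis stands in for the paper's spatial group pooling (a 2D backbone would process each plane between the pooling and gathering steps).

```python
import numpy as np

def cylindrical_tpv_features(points, feats, grid=(16, 16, 8)):
    """Toy cylindrical TPV: pool point features onto three planes,
    then aggregate each point's features from its projections."""
    R, T, Z = grid
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    # Cartesian -> cylindrical coordinates.
    rho = np.sqrt(x**2 + y**2)
    theta = np.arctan2(y, x)  # in (-pi, pi]
    # Discretize each cylindrical axis into grid indices.
    ri = np.clip((rho / (rho.max() + 1e-6) * R).astype(int), 0, R - 1)
    ti = np.clip(((theta + np.pi) / (2 * np.pi) * T).astype(int), 0, T - 1)
    zi = np.clip(((z - z.min()) / (np.ptp(z) + 1e-6) * Z).astype(int), 0, Z - 1)

    C = feats.shape[1]
    # One 2D plane per pair of axes, filled by max pooling over the
    # collapsed third axis (stand-in for spatial group pooling).
    plane_rt = np.full((R, T, C), -np.inf)
    plane_tz = np.full((T, Z, C), -np.inf)
    plane_rz = np.full((R, Z, C), -np.inf)
    for n in range(len(points)):
        plane_rt[ri[n], ti[n]] = np.maximum(plane_rt[ri[n], ti[n]], feats[n])
        plane_tz[ti[n], zi[n]] = np.maximum(plane_tz[ti[n], zi[n]], feats[n])
        plane_rz[ri[n], zi[n]] = np.maximum(plane_rz[ri[n], zi[n]], feats[n])
    # (In PointOcc, 2D image backbones would process each plane here.)
    # Aggregate: each point sums its three projected plane features.
    return plane_rt[ri, ti] + plane_tz[ti, zi] + plane_rz[ri, zi]

pts = np.random.randn(100, 3)
f = np.random.rand(100, 4)
out = cylindrical_tpv_features(pts, f)
print(out.shape)  # (100, 4)
```

Because the grid is cylindrical, bins near the sensor cover less area than distant ones, which is what gives nearer regions finer-grained modeling.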
Results
3D Semantic Occupancy Prediction
LiDAR Segmentation
Installation
- Create a conda environment and activate it.

```shell
conda create -n uniauto python=3.8
conda activate uniauto
```
- Install PyTorch and torchvision following the official instructions.

```shell
pip install torch==1.10.0+cu113 torchvision==0.11.0+cu113 torchaudio==0.10.0 -f https://download.pytorch.org/whl/torch_stable.html
```
- Follow the instructions at https://mmdetection3d.readthedocs.io/en/latest/get_started.html#installation to install mmcv, mmdet, mmsegmentation, and mmdet3d.

```shell
pip install -U openmim
mim install mmengine
mim install 'mmcv>=2.0.0rc4'
mim install 'mmdet>=3.0.0'
mim install mmsegmentation
mim install "mmdet3d>=1.1.0rc0"
```
- Install other packages.

```shell
pip install timm==0.4.12
pip install torch_scatter
pip install spconv-cu113
# pip install numpy==1.19.5
# pip install numba==0.48.0
pip install scikit-image==0.19.3
pip install pandas==1.4.4
```
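After installing, a quick import check can catch missing packages before training. This helper is not part of the repo; the list below maps the pip packages above to their import names (e.g., `scikit-image` imports as `skimage`, `mmsegmentation` as `mmseg`).

```python
import importlib.util

def missing_packages(names):
    """Return the subset of module names that cannot be imported."""
    return [n for n in names if importlib.util.find_spec(n) is None]

required = ["torch", "torchvision", "mmcv", "mmdet", "mmseg", "mmdet3d",
            "timm", "torch_scatter", "spconv", "skimage", "pandas"]
print(missing_packages(required))  # [] when the environment is complete
```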
Preparing
- Download the pretrained weights from here and put them in `pretrain/`.
- Follow the detailed instructions to prepare nuScenes-Occupancy.
- Folder structure:

```
PointOcc
├── pretrain/
│   ├── swin_tiny_patch4_window7_224.pth
├── data/
│   ├── nuscenes/
│   │   ├── maps/
│   │   ├── samples/
│   │   ├── sweeps/
│   │   ├── lidarseg/
│   │   ├── v1.0-test/
│   │   ├── v1.0-trainval/
│   │   ├── nuscenes_occ_infos_train.pkl
│   │   ├── nuscenes_occ_infos_val.pkl
│   ├── nuScenes-Occupancy/
│   │   ├── scene_0ac05652a4c44374998be876ba5cd6fd/
│   │   ├── ...
```
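A small script can verify that the layout above is in place before launching a run. The helper and path list are illustrative (taken from the tree above), not part of the repo.

```python
from pathlib import Path

# Paths from the folder structure above; extend as needed.
REQUIRED = [
    "pretrain/swin_tiny_patch4_window7_224.pth",
    "data/nuscenes/v1.0-trainval",
    "data/nuscenes/nuscenes_occ_infos_train.pkl",
    "data/nuscenes/nuscenes_occ_infos_val.pkl",
    "data/nuScenes-Occupancy",
]

def missing_paths(root="."):
    """Return the required paths that do not exist under root."""
    base = Path(root)
    return [p for p in REQUIRED if not (base / p).exists()]

print(missing_paths())  # [] when the data is fully prepared
```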
Getting Started
Training

- Train PointOcc for the LiDAR segmentation task:

```shell
python train_seg.py --py-config config/pointtpv_nusc_lidarseg.py --work-dir work_dir/nusc_lidarseg/pointtpv
```

- Train PointOcc for the 3D semantic occupancy prediction task:

```shell
python train_occ.py --py-config config/pointtpv_nusc_occ.py --work-dir work_dir/nusc_occ/pointtpv
```
Evaluation

- Evaluate PointOcc for the LiDAR segmentation task:

```shell
python eval_seg.py --py-config config/pointtpv_nusc_lidarseg.py --ckpt-path xxx --log-file work_dir/nusc_lidarseg/pointtpv/eval.log
```

- Evaluate PointOcc for the 3D semantic occupancy prediction task:

```shell
python eval_occ.py --py-config config/pointtpv_nusc_occ.py --ckpt-path xxx --log-file work_dir/nusc_occ/pointtpv/eval.log
```
Related Projects
Our code mainly derives from TPVFormer and is also based on Cylinder3D. Many thanks to them!
Citation
If you find this project helpful, please consider citing the following paper:
```
@article{pointocc,
    title={PointOcc: Cylindrical Tri-Perspective View for Point-based 3D Semantic Occupancy Prediction},
    author={Zuo, Sicheng and Zheng, Wenzhao and Huang, Yuanhui and Zhou, Jie and Lu, Jiwen},
    journal={arXiv preprint arXiv:2308.16896},
    year={2023}
}
```