Home

Awesome

<div align='center'> <h2>PonderV2: Pave the Way for 3D Foundation Model <br>with A Universal Pre-training Paradigm</h2>

Haoyi Zhu<sup>1,4*</sup>, Honghui Yang<sup>1,3*</sup>, Xiaoyang Wu<sup>1,2*</sup>, Di Huang<sup>1*</sup>, Sha Zhang<sup>1,4</sup>, Xianglong He<sup>1</sup>, <br> Hengshuang Zhao<sup>2</sup>, Chunhua Shen<sup>3</sup>, Yu Qiao<sup>1</sup>, Tong He<sup>1</sup>, Wanli Ouyang<sup>1</sup>

<sup>1</sup>Shanghai AI Lab, <sup>2</sup>HKU, <sup>3</sup>ZJU, <sup>4</sup>USTC

PWC<br> PWC<br> PWC<br> PWC

</div> <p align="center"> <img src="assets/radar.png" alt="radar" width="500" /> </p>

This is the official implementation of paper "PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-training Paradigm".

PonderV2 is a comprehensive 3D pre-training framework designed to facilitate the acquisition of efficient 3D representations, thereby establishing a pathway to 3D foundational models. It is a novel universal paradigm to learn point cloud representations by differentiable neural rendering, serving as a bridge between 3D and 2D worlds.

<p align="center"> <img src="assets/pipeline.png" alt="pipeline" width="800" /> </p>

Important Notes:

News:

Installation

Requirements

Conda Environment

conda create -n ponderv2 python=3.8 -y
conda activate ponderv2
# Choose version you want here: https://pytorch.org/get-started/previous-versions/
conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 cudatoolkit=11.3 -c pytorch -y
conda install h5py pyyaml -c anaconda -y
conda install sharedarray tensorboard tensorboardx addict einops scipy plyfile termcolor timm -c conda-forge -y
conda install pytorch-cluster pytorch-scatter pytorch-sparse -c pyg -y
pip install torch-geometric yapf==0.40.1 opencv-python open3d==0.10.0 imageio
pip install git+https://github.com/openai/CLIP.git

# spconv (SparseUNet)
# refer https://github.com/traveller59/spconv
pip install spconv-cu113

# precise eval
cd libs/pointops
# usual
python setup.py install
# docker & multi GPU arch
TORCH_CUDA_ARCH_LIST="ARCH LIST" python  setup.py install
# e.g. 7.5: RTX 3000; 8.0: a100 More available in: https://developer.nvidia.com/cuda-gpus
TORCH_CUDA_ARCH_LIST="7.5 8.0" python  setup.py install
cd ../..

# NeuS renderer
cd libs/smooth-sampler
# usual
python setup.py install
# docker & multi GPU arch
TORCH_CUDA_ARCH_LIST="ARCH LIST" python setup.py install
# e.g. 7.5: RTX 3000; 8.0: a100 More available in: https://developer.nvidia.com/cuda-gpus
TORCH_CUDA_ARCH_LIST="7.5 8.0" python setup.py install
cd ../..

If you want to run instance segmentation downstream tasks with PointGroup, you should also run the following:

conda install -c bioconda google-sparsehash 
cd libs/pointgroup_ops
python setup.py install --include_dirs=${CONDA_PREFIX}/include
cd ../..

Then uncomment # from .point_group import * in ponder/models/__init__.py.

Data Preparation

Please check out docs/data_preparation.md.

Model Zoo

Please check out docs/model_zoo.md.

Quick Start:

Pre-train PonderV2 (indoor) on single ScanNet dataset with 8 GPUs:

# -g: number of GPUs
# -d: dataset
# -c: config file, the final config is ./config/${-d}/${-c}.py
# -n: experiment name
bash scripts/train.sh -g 8 -d scannet -c pretrain-ponder-spunet-v1m1-0-base -n ponderv2-pretrain-sc

Pre-train PonderV2 (indoor) on ScanNet, S3DIS and Structured3D datasets using Point Prompt Training (PPT) with 8 GPUs:

bash scripts/train.sh -g 8 -d scannet -c pretrain-ponder-ppt-v1m1-0-sc-s3-st-spunet -n ponderv2-pretrain-sc-s3-st

Pre-train PonderV2 (outdoor) on single nuScenes dataset with 4 GPUs:

bash scripts/train.sh -g 4 -d nuscenes -c pretrain-ponder-spunet-v1m1-0-base -n ponderv2-pretrain-nu

Finetune PonderV2 on ScanNet semantic segmentation downstream task with PPT:

# -w: path to checkpoint
bash scripts/train.sh -g 8 -d scannet -c semseg-ppt-v1m1-0-sc-s3-st-spunet-lovasz-ft -n ponderv2-semseg-ft -w ${PATH/TO/CHECKPOINT}

Finetune PonderV2 on ScanNet instance segmentation downstream task using PointGroup:

bash scripts/train.sh -g 4 -d scannet -c insseg-ppt-v1m1-0-pointgroup-spunet-ft -n insseg-pointgroup-v1m1-0-spunet-ft -w ${PATH/TO/CHECKPOINT}
# Based on experiment folder created by training script
bash scripts/test.sh -g 8 -d scannet -n ponderv2-semseg-ft -w ${CHECKPOINT/NAME}

You can download our trained checkpoint weights in docs/model_zoo.md.

For more detailed options and examples, please refer to docs/getting_started.md.

For more outdoor pre-training and downstream information, you can also refer to UniPAD.

Todo:

Citation

@article{zhu2023ponderv2,
  title={PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-training Paradigm}, 
  author={Haoyi Zhu and Honghui Yang and Xiaoyang Wu and Di Huang and Sha Zhang and Xianglong He and Tong He and Hengshuang Zhao and Chunhua Shen and Yu Qiao and Wanli Ouyang},
  journal={arXiv preprint arXiv:2310.08586},
  year={2023}
}

@inproceedings{huang2023ponder,
  title={Ponder: Point cloud pre-training via neural rendering},
  author={Huang, Di and Peng, Sida and He, Tong and Yang, Honghui and Zhou, Xiaowei and Ouyang, Wanli},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={16089--16098},
  year={2023}
}

@article{yang2023unipad,
  title={UniPAD: A Universal Pre-training Paradigm for Autonomous Driving}, 
  author={Honghui Yang and Sha Zhang and Di Huang and Xiaoyang Wu and Haoyi Zhu and Tong He and Shixiang Tang and Hengshuang Zhao and Qibo Qiu and Binbin Lin and Xiaofei He and Wanli Ouyang},
  journal={arXiv preprint arXiv:2310.08370},
  year={2023},
}

Acknowledgement

This project is mainly based on the following codebases. Thanks for their great works!