# GD-MAE: Generative Decoder for MAE Pre-training on LiDAR Point Clouds (CVPR 2023)
## NEWS
- [2023-03-31] Code is released.
- [2023-02-28] GD-MAE is accepted at CVPR 2023.
- [2022-12-14] GD-MAE results are reported on the Waymo Leaderboard.
## Installation
We tested this project on NVIDIA A100 GPUs with Ubuntu 18.04.

```bash
conda create -n gd-mae python=3.7
conda activate gd-mae
conda install -y pytorch==1.10.1 torchvision==0.11.2 cudatoolkit=11.1 -c pytorch -c conda-forge
conda install -y -c fvcore -c iopath -c conda-forge fvcore iopath
conda install -y pytorch3d -c pytorch3d
pip install numpy==1.19.5 protobuf==3.19.4 scikit-image==0.19.2 waymo-open-dataset-tf-2-2-0 nuscenes-devkit==1.0.5 spconv-cu111 numba scipy pyyaml easydict fire tqdm shapely matplotlib opencv-python addict pyquaternion awscli open3d pandas future pybind11 tensorboardX tensorboard Cython
pip install torch-scatter -f https://data.pyg.org/whl/torch-1.10.1+cu111.html
git clone https://github.com/Nightmare-n/GD-MAE
cd GD-MAE && python setup.py develop --user
cd pcdet/ops/dcn && python setup.py develop --user
```
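After the build, a quick smoke test (our own suggestion, not part of the official instructions) confirms that PyTorch sees the GPU and the installed packages import cleanly; `pcdet` is the package built by `setup.py develop` above:

```bash
# Hedged sanity check: verify the CUDA build and the key imports
# (torch-scatter, spconv, and the locally built pcdet). An ImportError
# here means the corresponding install step above needs to be redone.
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
python -c "import torch_scatter, spconv, pcdet; print('imports OK')"
```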
## Data Preparation
Please follow the instructions of OpenPCDet to prepare the datasets; a sketch of the preparation commands is given after the directory tree below. For the Waymo dataset, we use the official evaluation toolkit to evaluate detection results. The expected directory layout is:
```
data
│── waymo
│   │── ImageSets/
│   │── raw_data
│   │   │── segment-xxxxxxxx.tfrecord
│   │   │── ...
│   │── waymo_processed_data
│   │   │── segment-xxxxxxxx/
│   │   │── ...
│   │── waymo_processed_data_gt_database_train_sampled_1/
│   │── waymo_processed_data_waymo_dbinfos_train_sampled_1.pkl
│   │── waymo_processed_data_infos_test.pkl
│   │── waymo_processed_data_infos_train.pkl
│   │── waymo_processed_data_infos_val.pkl
│   │── compute_detection_metrics_main
│   │── gt.bin
│── kitti
│   │── ImageSets/
│   │── training
│   │   │── label_2/
│   │   │── velodyne/
│   │   │── ...
│   │── testing
│   │   │── velodyne/
│   │   │── ...
│   │── gt_database/
│   │── kitti_dbinfos_train.pkl
│   │── kitti_infos_test.pkl
│   │── kitti_infos_train.pkl
│   │── kitti_infos_val.pkl
│   │── kitti_infos_trainval.pkl
│── once
│   │── ImageSets/
│   │── data
│   │   │── 000000/
│   │   │── ...
│   │── gt_database/
│   │── once_dbinfos_train.pkl
│   │── once_infos_raw_large.pkl
│   │── once_infos_raw_medium.pkl
│   │── once_infos_raw_small.pkl
│   │── once_infos_train.pkl
│   │── once_infos_val.pkl
│── kitti-360
│   │── data_3d_raw
│   │   │── xxxxxxxx_sync/
│   │   │── ...
│── ckpts
│   │── graph_rcnn_po.pth
│   │── ...
```
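If you are starting from raw data, the `*_infos_*.pkl` files and `gt_database` folders above are produced by the OpenPCDet dataset scripts, and the Waymo metrics binary compares a predictions file against `gt.bin`. A hedged sketch, assuming the upstream OpenPCDet entry points and config paths carry over to this repository unchanged (check `tools/cfgs/dataset_configs/` for the actual files):

```bash
# KITTI: creates the kitti_infos_*.pkl files and gt_database/ (config path assumed).
python -m pcdet.datasets.kitti.kitti_dataset create_kitti_infos \
    tools/cfgs/dataset_configs/kitti_dataset.yaml
# Waymo: processes the raw .tfrecord segments and creates the info .pkl files.
python -m pcdet.datasets.waymo.waymo_dataset --func create_waymo_infos \
    --cfg_file tools/cfgs/dataset_configs/waymo_dataset.yaml
# Waymo offline evaluation: positional arguments are predictions first,
# ground truth second ("my_preds.bin" is a placeholder name).
data/waymo/compute_detection_metrics_main my_preds.bin data/waymo/gt.bin
```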
## Training & Testing
```bash
# MAE pre-training & fine-tuning
bash scripts/dist_ssl_train.sh
# one-stage and two-stage models (trained separately)
bash scripts/dist_ts_train.sh
# one-stage or two-stage model (trained end-to-end)
bash scripts/dist_train.sh
# testing
bash scripts/dist_test.sh
```
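These launchers follow the usual OpenPCDet conventions. A hypothetical invocation, assuming the scripts take the GPU count first and forward the remaining flags to `train.py`/`test.py` (the config and checkpoint paths below are placeholders; check each script's header for its exact interface):

```bash
# Hypothetical usage sketch: 8 GPUs; cfg_file and ckpt paths are placeholders.
bash scripts/dist_train.sh 8 --cfg_file tools/cfgs/waymo_models/gd_mae.yaml
bash scripts/dist_test.sh 8 --cfg_file tools/cfgs/waymo_models/gd_mae.yaml \
    --ckpt output/waymo_models/gd_mae/default/ckpt/checkpoint_epoch_30.pth
```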
## Results
### Waymo
| | Vec_L1 | Vec_L2 | Ped_L1 | Ped_L2 | Cyc_L1 | Cyc_L2 | Model |
|---|---|---|---|---|---|---|---|
| Graph RCNN (w/o PointNet) | 80.6/80.1 | 72.3/71.9 | 82.9/77.3 | 75.0/69.7 | 77.2/76.0 | 74.4/73.3 | log |
| GD-MAE_0.2 (20% labeled data) | 76.2/75.7 | 67.7/67.2 | 80.5/72.3 | 73.2/65.5 | 72.6/71.4 | 69.9/68.7 | log |
| GD-MAE_iou (IoU head) | 79.4/78.9 | 70.9/70.5 | 82.2/75.9 | 74.8/68.8 | 75.8/74.8 | 73.0/72.0 | log |
| GD-MAE_ts (two-stage) | 80.2/79.8 | 72.4/72.0 | 83.1/76.7 | 75.5/69.4 | 77.2/76.2 | 74.4/73.4 | log |

We cannot provide the above pretrained models due to the Waymo Dataset License Agreement.
### KITTI
| | Easy | Moderate | Hard | Model |
|---|---|---|---|---|
| Graph-Vo | 93.29 | 86.08 | 83.15 | ckpt |
| Graph-VoI | 95.80 | 86.72 | 83.93 | ckpt |
| Graph-Po | 93.44 | 86.54 | 83.90 | ckpt |

| | Car | Pedestrian | Cyclist | Model |
|---|---|---|---|---|
| GD-MAE | 82.01 | 48.40 | 67.16 | pretrain/ckpt |
### ONCE
| | Vehicle | Pedestrian | Cyclist | Model |
|---|---|---|---|---|
| CenterPoint-Pillar | 74.10 | 40.94 | 62.17 | ckpt |
| GD-MAE | 76.79 | 48.84 | 69.14 | pretrain/ckpt |
## Citation
If you find this project useful in your research, please consider citing:

```
@inproceedings{yang2023gdmae,
    author    = {Honghui Yang and Tong He and Jiaheng Liu and Hua Chen and Boxi Wu and Binbin Lin and Xiaofei He and Wanli Ouyang},
    title     = {GD-MAE: Generative Decoder for MAE Pre-training on LiDAR Point Clouds},
    booktitle = {CVPR},
    year      = {2023},
}
@inproceedings{yang2022graphrcnn,
    author    = {Honghui Yang and Zili Liu and Xiaopei Wu and Wenxiao Wang and Wei Qian and Xiaofei He and Deng Cai},
    title     = {Graph R-CNN: Towards Accurate 3D Object Detection with Semantic-Decorated Local Graph},
    booktitle = {ECCV},
    year      = {2022},
}
```
## Acknowledgement
This project is mainly based on the following codebases. Thanks for their great work!