# GD-MAE: Generative Decoder for MAE Pre-training on LiDAR Point Clouds (CVPR 2023)
## NEWS
- [2023-03-31] Code is released.
- [2023-02-28] GD-MAE is accepted at CVPR 2023.
- [2022-12-14] GD-MAE results are reported on the Waymo Leaderboard.
## Installation
We tested this project on NVIDIA A100 GPUs with Ubuntu 18.04.

```bash
conda create -n gd-mae python=3.7
conda activate gd-mae
conda install -y pytorch==1.10.1 torchvision==0.11.2 cudatoolkit=11.1 -c pytorch -c conda-forge
conda install -y -c fvcore -c iopath -c conda-forge fvcore iopath
conda install -y pytorch3d -c pytorch3d
pip install numpy==1.19.5 protobuf==3.19.4 scikit-image==0.19.2 waymo-open-dataset-tf-2-2-0 nuscenes-devkit==1.0.5 spconv-cu111 numba scipy pyyaml easydict fire tqdm shapely matplotlib opencv-python addict pyquaternion awscli open3d pandas future pybind11 tensorboardX tensorboard Cython
pip install torch-scatter -f https://data.pyg.org/whl/torch-1.10.1+cu111.html
git clone https://github.com/Nightmare-n/GD-MAE
cd GD-MAE && python setup.py develop --user
cd pcdet/ops/dcn && python setup.py develop --user
```
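After the build, a quick smoke test (our own suggestion, not part of the official instructions) confirms that PyTorch sees the GPU and the installed packages import cleanly; `pcdet` is the package built by `setup.py develop` above:

```bash
# Hedged sanity check: verify the CUDA build and the key imports
# (torch-scatter, spconv, and the locally built pcdet). An ImportError
# here means the corresponding install step above needs to be redone.
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
python -c "import torch_scatter, spconv, pcdet; print('imports OK')"
```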
## Data Preparation
Please follow the instructions of OpenPCDet to prepare the datasets; a sketch of the preparation commands is given after the directory tree below. For the Waymo dataset, we use the official evaluation toolkit to evaluate detection results. The expected directory layout is:
```
data
│── waymo
│   │── ImageSets/
│   │── raw_data
│   │   │── segment-xxxxxxxx.tfrecord
│   │   │── ...
│   │── waymo_processed_data
│   │   │── segment-xxxxxxxx/
│   │   │── ...
│   │── waymo_processed_data_gt_database_train_sampled_1/
│   │── waymo_processed_data_waymo_dbinfos_train_sampled_1.pkl
│   │── waymo_processed_data_infos_test.pkl
│   │── waymo_processed_data_infos_train.pkl
│   │── waymo_processed_data_infos_val.pkl
│   │── compute_detection_metrics_main
│   │── gt.bin
│── kitti
│   │── ImageSets/
│   │── training
│   │   │── label_2/
│   │   │── velodyne/
│   │   │── ...
│   │── testing
│   │   │── velodyne/
│   │   │── ...
│   │── gt_database/
│   │── kitti_dbinfos_train.pkl
│   │── kitti_infos_test.pkl
│   │── kitti_infos_train.pkl
│   │── kitti_infos_val.pkl
│   │── kitti_infos_trainval.pkl
│── once
│   │── ImageSets/
│   │── data
│   │   │── 000000/
│   │   │── ...
│   │── gt_database/
│   │── once_dbinfos_train.pkl
│   │── once_infos_raw_large.pkl
│   │── once_infos_raw_medium.pkl
│   │── once_infos_raw_small.pkl
│   │── once_infos_train.pkl
│   │── once_infos_val.pkl
│── kitti-360
│   │── data_3d_raw
│   │   │── xxxxxxxx_sync/
│   │   │── ...
│── ckpts
│   │── graph_rcnn_po.pth
│   │── ...
```
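If you are starting from raw data, the `*_infos_*.pkl` files and `gt_database` folders above are produced by the OpenPCDet dataset scripts, and the Waymo metrics binary compares a predictions file against `gt.bin`. A hedged sketch, assuming the upstream OpenPCDet entry points and config paths carry over to this repository unchanged (check `tools/cfgs/dataset_configs/` for the actual files):

```bash
# KITTI: creates the kitti_infos_*.pkl files and gt_database/ (config path assumed).
python -m pcdet.datasets.kitti.kitti_dataset create_kitti_infos \
    tools/cfgs/dataset_configs/kitti_dataset.yaml
# Waymo: processes the raw .tfrecord segments and creates the info .pkl files.
python -m pcdet.datasets.waymo.waymo_dataset --func create_waymo_infos \
    --cfg_file tools/cfgs/dataset_configs/waymo_dataset.yaml
# Waymo offline evaluation: positional arguments are predictions first,
# ground truth second ("my_preds.bin" is a placeholder name).
data/waymo/compute_detection_metrics_main my_preds.bin data/waymo/gt.bin
```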
## Training & Testing
```bash
# MAE pre-training & fine-tuning
bash scripts/dist_ssl_train.sh
# one-stage and two-stage models (trained separately)
bash scripts/dist_ts_train.sh
# one-stage or two-stage model (trained end-to-end)
bash scripts/dist_train.sh
# testing
bash scripts/dist_test.sh
```
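These launchers follow the usual OpenPCDet conventions. A hypothetical invocation, assuming the scripts take the GPU count first and forward the remaining flags to `train.py`/`test.py` (the config and checkpoint paths below are placeholders; check each script's header for its exact interface):

```bash
# Hypothetical usage sketch: 8 GPUs; cfg_file and ckpt paths are placeholders.
bash scripts/dist_train.sh 8 --cfg_file tools/cfgs/waymo_models/gd_mae.yaml
bash scripts/dist_test.sh 8 --cfg_file tools/cfgs/waymo_models/gd_mae.yaml \
    --ckpt output/waymo_models/gd_mae/default/ckpt/checkpoint_epoch_30.pth
```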
## Results
### Waymo
| | Vec_L1 | Vec_L2 | Ped_L1 | Ped_L2 | Cyc_L1 | Cyc_L2 | Model |
|---|---|---|---|---|---|---|---|
| Graph RCNN (w/o PointNet) | 80.6/80.1 | 72.3/71.9 | 82.9/77.3 | 75.0/69.7 | 77.2/76.0 | 74.4/73.3 | log |
| GD-MAE_0.2 (20% labeled data) | 76.2/75.7 | 67.7/67.2 | 80.5/72.3 | 73.2/65.5 | 72.6/71.4 | 69.9/68.7 | log |
| GD-MAE_iou (IoU head) | 79.4/78.9 | 70.9/70.5 | 82.2/75.9 | 74.8/68.8 | 75.8/74.8 | 73.0/72.0 | log |
| GD-MAE_ts (two-stage) | 80.2/79.8 | 72.4/72.0 | 83.1/76.7 | 75.5/69.4 | 77.2/76.2 | 74.4/73.4 | log |

We cannot provide the above pretrained models due to the Waymo Dataset License Agreement.
### KITTI
| | Easy | Moderate | Hard | Model |
|---|---|---|---|---|
| Graph-Vo | 93.29 | 86.08 | 83.15 | ckpt |
| Graph-VoI | 95.80 | 86.72 | 83.93 | ckpt |
| Graph-Po | 93.44 | 86.54 | 83.90 | ckpt |

| | Car | Pedestrian | Cyclist | Model |
|---|---|---|---|---|
| GD-MAE | 82.01 | 48.40 | 67.16 | pretrain/ckpt |
### ONCE
| | Vehicle | Pedestrian | Cyclist | Model |
|---|---|---|---|---|
| CenterPoint-Pillar | 74.10 | 40.94 | 62.17 | ckpt |
| GD-MAE | 76.79 | 48.84 | 69.14 | pretrain/ckpt |
## Citation
If you find this project useful in your research, please consider citing:

```
@inproceedings{yang2023gdmae,
    author    = {Honghui Yang and Tong He and Jiaheng Liu and Hua Chen and Boxi Wu and Binbin Lin and Xiaofei He and Wanli Ouyang},
    title     = {GD-MAE: Generative Decoder for MAE Pre-training on LiDAR Point Clouds},
    booktitle = {CVPR},
    year      = {2023},
}
@inproceedings{yang2022graphrcnn,
    author    = {Honghui Yang and Zili Liu and Xiaopei Wu and Wenxiao Wang and Wei Qian and Xiaofei He and Deng Cai},
    title     = {Graph R-CNN: Towards Accurate 3D Object Detection with Semantic-Decorated Local Graph},
    booktitle = {ECCV},
    year      = {2022},
}
```
## Acknowledgement
This project is mainly based on the following codebases. Thanks for their great work!