
PointAugmenting: Cross-Modal Augmentation for 3D Object Detection

GT-Paste data augmentation for cross-modal 3D object detection that preserves consistency between the camera and LiDAR data.

<p align="center"> <img src='docs/aug.jpg' align="center" height="230px"> </p>
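Conceptually, cross-modal GT-Paste copies a sampled ground-truth object into both modalities at once: its LiDAR points are added to the point cloud, and its image crop is pasted at the corresponding 2D projection, so the two views stay consistent. A minimal illustrative sketch follows; the function and argument names are hypothetical and not the repo's API.

import numpy as np

def paste_object(points, image, obj_points, obj_patch, patch_box):
    # points: (N, 4) LiDAR scene; image: (H, W, 3) camera frame
    # obj_points: (M, 4) points of a sampled GT object
    # obj_patch: (h, w, 3) image crop of that same object
    # patch_box: (x1, y1, x2, y2) where the crop projects into `image`
    points = np.concatenate([points, obj_points], axis=0)  # paste into the LiDAR scene
    x1, y1, x2, y2 = patch_box
    image[y1:y2, x1:x2] = obj_patch                        # paste into the camera view
    return points, image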

CVPR 2021 - PointAugmenting: Cross-Modal Augmentation for 3D Object Detection
Chunwei Wang, Chao Ma, Ming Zhu, Xiaokang Yang

@inproceedings{wang2021pointaugmenting,
  title={PointAugmenting: Cross-Modal Augmentation for 3D Object Detection},
  author={Wang, Chunwei and Ma, Chao and Zhu, Ming and Yang, Xiaokang},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={11794--11803},
  year={2021}
}

Main results

3D detection on nuScenes test set

| Method | mAP ↑ | NDS ↑ | Car | Truck | C.V. | Bus | Trailer | Barrier | Motor. | Bicycle | Ped. | T.C. |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| CenterPoint | 60.3 | 67.3 | 85.2 | 53.5 | 20.0 | 63.6 | 56.0 | 71.1 | 59.5 | 30.7 | 84.6 | 78.4 |
| PointAugmenting | 66.8 | 71.0 | 87.5 | 57.3 | 28.0 | 65.2 | 60.7 | 72.6 | 74.3 | 50.9 | 87.9 | 83.6 |

Installation

Please refer to CenterPoint for installation and usage instructions.

Image Backbone

Install DCNv2:

cd det3d/models/img_backbones
git clone https://github.com/CharlesShang/DCNv2
cd DCNv2
sh make.sh
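
Once the build completes, a quick sanity check can confirm the extension imports and runs; this is a sketch assuming a CUDA-capable GPU (`dcn_v2` is the module built by make.sh in the DCNv2 repo):

import torch
from dcn_v2 import DCN  # available once make.sh has built the extension

# run a dummy tensor through a deformable conv layer (DCNv2 requires CUDA)
layer = DCN(64, 64, kernel_size=3, stride=1, padding=1, deformable_groups=2).cuda()
x = torch.randn(2, 64, 32, 32).cuda()
print(layer(x).shape)  # expected: torch.Size([2, 64, 32, 32])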

For 2D image feature extraction, we use the pretrained DLA34 model from CenterTrack. Please download the model and place it in the pretrained_model folder.
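
To confirm the download is usable, the checkpoint should load without errors; the filename below is a placeholder, so use the name of the file you actually downloaded:

import torch

# placeholder filename; the CenterTrack release may name the checkpoint differently
ckpt = torch.load('pretrained_model/dla34.pth', map_location='cpu')
print(type(ckpt))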

Data Preparation

Modified from CenterPoint's original document.

Step 1: Download the data and organize it as follows

# For nuScenes Dataset         
└── NUSCENES_DATASET_ROOT
       ├── samples       <-- key frames
       ├── sweeps        <-- frames without annotation
       ├── maps          <-- unused
       ├── v1.0-trainval <-- metadata
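
Optionally, verify the layout before proceeding; NUSCENES_DATASET_ROOT below is a placeholder for your actual path:

import os

root = 'NUSCENES_DATASET_ROOT'  # replace with your dataset path
for d in ('samples', 'sweeps', 'maps', 'v1.0-trainval'):
    status = 'ok' if os.path.isdir(os.path.join(root, d)) else 'MISSING'
    print(f'{d}: {status}')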

Step 2: Create a symlink to the dataset root

mkdir data && cd data
ln -s DATA_ROOT nuScenes  # link the dataset root into data/ under the name nuScenes

Step 3: Create data

# nuScenes
python tools/create_data.py nuscenes_data_prep --root_path=NUSCENES_TRAINVAL_DATASET_ROOT --version="v1.0-trainval" --nsweeps=10 --rate=1

In the end, the data and info files should be organized as follows

# For nuScenes Dataset 
└── PointAugmenting
       └── data    
              └── nuScenes 
                     ├── samples       <-- key frames
                     ├── sweeps        <-- frames without annotation
                     ├── maps          <-- unused
                     ├── v1.0-trainval <-- metadata and annotations
                     ├── infos_train_10sweeps_withvelo_filter_True_100rate_crossmodal.pkl <-- train annotations
                     ├── infos_val_10sweeps_withvelo_filter_True_crossmodal.pkl <-- val annotations
                     ├── dbinfos_100rate_10sweeps_withvelo_crossmodal.pkl <-- GT database info files
                     └── gt_database_100rate_10sweeps_withvelo_crossmodal <-- GT database
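
As a sanity check on Step 3, the generated info files should load with pickle; this sketch assumes the CenterPoint-style layout in which each file holds one record per sample:

import pickle

path = 'data/nuScenes/infos_train_10sweeps_withvelo_filter_True_100rate_crossmodal.pkl'
with open(path, 'rb') as f:
    infos = pickle.load(f)
print(len(infos), 'training samples')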

Train & Evaluate

Modified from CenterPoint's original document.

Use the following command to start distributed training with 4 GPUs. The models and logs will be saved to work_dirs/CONFIG_NAME

python -m torch.distributed.launch --nproc_per_node=4 ./tools/train.py --config=CONFIG_PATH

For distributed testing with 4 GPUs,

python -m torch.distributed.launch --nproc_per_node=4 ./tools/dist_test.py --config=CONFIG_PATH --work_dir work_dirs/CONFIG_NAME --checkpoint work_dirs/CONFIG_NAME/latest.pth

Acknowledgement

This project would not be possible without multiple great open-source codebases. We list some notable examples below.