# Hindsight is 20/20: Leveraging Past Traversals to Aid 3D Perception
This is the official code release for the ICLR 2022 paper [Hindsight is 20/20: Leveraging Past Traversals to Aid 3D Perception](https://openreview.net/forum?id=qsZoGvFiJn1) by Yurong You, Katie Z Luo, Xiangyu Chen, Junan Chen, Wei-Lun Chao, Wen Sun, Bharath Hariharan, Mark Campbell, and Kilian Q. Weinberger.
## Abstract
Self-driving cars must detect vehicles, pedestrians, and other traffic participants accurately to operate safely. Small, far-away, or highly occluded objects are particularly challenging because there is limited information in the LiDAR point clouds for detecting them. To address this challenge, we leverage valuable information from the past: in particular, data collected in past traversals of the same scene. We posit that these past data, which are typically discarded, provide rich contextual information for disambiguating the above-mentioned challenging cases. To this end, we propose a novel end-to-end trainable Hindsight framework to extract this contextual information from past traversals and store it in an easy-to-query data structure, which can then be leveraged to aid future 3D object detection of the same scene. We show that this framework is compatible with most modern 3D detection architectures and can substantially improve their average precision on multiple autonomous driving datasets, most notably by more than 300% on the challenging cases.
## Citation
```bibtex
@inproceedings{you2022hindsight,
  title = {Hindsight is 20/20: Leveraging Past Traversals to Aid 3D Perception},
  author = {You, Yurong and Luo, Katie Z and Chen, Xiangyu and Chen, Junan and Chao, Wei-Lun and Sun, Wen and Hariharan, Bharath and Campbell, Mark and Weinberger, Kilian Q.},
  booktitle = {Proceedings of the International Conference on Learning Representations (ICLR)},
  year = {2022},
  month = apr,
  url = {https://openreview.net/forum?id=qsZoGvFiJn1}
}
```
## Environment
```bash
conda create --name hindsight python=3.8
conda activate hindsight
conda install pytorch=1.9.0 torchvision torchaudio cudatoolkit=11.1 -c pytorch -c nvidia
pip install opencv-python matplotlib wandb scipy tqdm easydict scikit-learn

# MinkowskiEngine
git clone https://github.com/NVIDIA/MinkowskiEngine.git
cd MinkowskiEngine
git checkout c854f0c  # 0.5.4
python setup.py install
```
To install OpenPCDet, follow downstream/OpenPCDet/docs/INSTALL.md, except that you should install spconv from the code in third_party/spconv.
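After installation, a quick import check can confirm the key dependencies are in place (a minimal sketch, assuming the versions pinned above):

```python
# Minimal environment sanity check (assumes the installs above succeeded).
import torch
import MinkowskiEngine as ME

print(torch.__version__)            # expect 1.9.x
print(torch.cuda.is_available())    # expect True with cudatoolkit 11.1
print(ME.__version__)               # expect 0.5.4
```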
## Data Pre-processing
Please refer to data_preprocessing/lyft/LYFT_PREPROCESSING.md and data_preprocessing/nuscenes/NUSCENES_PREPROCESSING.md.
## Training and Evaluation
We implement the computation of SQuaSH as a submodule in OpenPCDet (as sparse_query) and modify the KITTI dataloader/augmentor to load the past traversals. We include configs for four detection models in downstream/OpenPCDet/tools/cfgs/lyft_models and downstream/OpenPCDet/tools/cfgs/nuscenes_boston_models; use them to train/evaluate the corresponding base-detector and base-detector+Hindsight models.
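At a high level, SQuaSH is a sparse grid of features aggregated from past traversals, which the detector queries at locations from the current scan. The snippet below is a conceptual sketch of that query pattern using MinkowskiEngine sparse tensors; all names and shapes are illustrative and do not reflect the repo's actual sparse_query interface.

```python
import torch
import MinkowskiEngine as ME

# Conceptual sketch only -- illustrative names/shapes, NOT the repo's API.
# SQuaSH-like idea: store per-voxel features aggregated from past
# traversals in a sparse grid, then query it at current-scan locations.
voxels = torch.randint(0, 100, (1000, 3))         # voxelized past-traversal points
feats = torch.randn(1000, 8)                      # hypothetical 8-d history features
coords = ME.utils.batched_coordinates([voxels])   # prepend batch index column
grid = ME.SparseTensor(features=feats, coordinates=coords)

# Query the grid at (float) current-scan locations; column 0 is the batch index.
queries = torch.cat([torch.zeros(5, 1), torch.rand(5, 3) * 100], dim=1)
history_feats = grid.features_at_coordinates(queries)   # shape (5, 8)
```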
### Train

We use 4 GPUs to train detection models by default.

```bash
cd downstream/OpenPCDet/tools
OMP_NUM_THREADS=6 bash scripts/dist_train.sh 4 --cfg_file <cfg> --merge_all_iters_to_one_epoch --fix_random_seed
```
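For example, training a base detector with Hindsight on Lyft might look like the following (the config filename is illustrative; pick an actual file from downstream/OpenPCDet/tools/cfgs/lyft_models):

```bash
# Illustrative config name -- check cfgs/lyft_models for the actual files.
OMP_NUM_THREADS=6 bash scripts/dist_train.sh 4 \
    --cfg_file cfgs/lyft_models/pointpillar_hindsight.yaml \
    --merge_all_iters_to_one_epoch --fix_random_seed
```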
### Evaluation

```bash
cd downstream/OpenPCDet/tools
OMP_NUM_THREADS=6 bash scripts/dist_test.sh 4 --cfg_file <cfg> --ckpt <ckpt_path>
```
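For example (both paths are illustrative; use your own config and a trained or downloaded checkpoint, e.g. one from the tables below):

```bash
# Illustrative paths -- point --ckpt at a trained or downloaded checkpoint.
OMP_NUM_THREADS=6 bash scripts/dist_test.sh 4 \
    --cfg_file cfgs/lyft_models/pointpillar_hindsight.yaml \
    --ckpt path/to/checkpoint.pth
```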
## Checkpoints
### Lyft experiments
| Model | Checkpoint | Config file |
|---|---|---|
| PointPillars | link | cfg |
| PointPillars+Hindsight | link | cfg |
| SECOND | link | cfg |
| SECOND+Hindsight | link | cfg |
| PointRCNN | link | cfg |
| PointRCNN+Hindsight | link | cfg |
| PV-RCNN | link | cfg |
| PV-RCNN+Hindsight | link | cfg |
### nuScenes experiments
| Model | Checkpoint | Config file |
|---|---|---|
| PointPillars | link | cfg |
| PointPillars+Hindsight | link | cfg |
| PointRCNN | link | cfg |
| PointRCNN+Hindsight | link | cfg |
## License
This project is under the MIT License. We use OpenPCDet and spconv in this project, which are under the Apache-2.0 License. We list our changes here.
## Contact
Please open an issue if you have any questions about using this repo.
## Acknowledgement
This work uses OpenPCDet, MinkowskiEngine, and spconv. We thank the authors for open-sourcing these excellent libraries for 3D understanding tasks. We also use scripts from 3D_adapt_auto_driving to convert the Lyft and nuScenes datasets into KITTI format.