Hindsight is 20/20: Leveraging Past Traversals to Aid 3D Perception

This is the official code release for

[ICLR 2022] Hindsight is 20/20: Leveraging Past Traversals to Aid 3D Perception.

by Yurong You, Katie Z Luo, Xiangyu Chen, Junan Chen, Wei-Lun Chao, Wen Sun, Bharath Hariharan, Mark Campbell, and Kilian Q. Weinberger

Video | Paper

Abstract

Self-driving cars must detect vehicles, pedestrians, and other traffic participants accurately to operate safely. Small, far-away, or highly occluded objects are particularly challenging because there is limited information in the LiDAR point clouds for detecting them. To address this challenge, we leverage valuable information from the past: in particular, data collected in past traversals of the same scene. We posit that these past data, which are typically discarded, provide rich contextual information for disambiguating the above-mentioned challenging cases. To this end, we propose a novel end-to-end trainable Hindsight framework to extract this contextual information from past traversals and store it in an easy-to-query data structure, which can then be leveraged to aid future 3D object detection of the same scene. We show that this framework is compatible with most modern 3D detection architectures and can substantially improve their average precision on multiple autonomous driving datasets, most notably by more than 300% on the challenging cases.
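The paper's key component is an easy-to-query spatial store of features from past traversals. As a rough illustration of that idea (not the actual SQuaSH computation, which runs on sparse tensors inside OpenPCDet), here is a toy voxel-hashed feature bank; all names and the voxel-averaging scheme are illustrative assumptions:

```python
from collections import defaultdict

class VoxelFeatureBank:
    """Toy spatial feature store: hash 3D locations to voxel keys,
    accumulate features from past traversals, and query by location."""

    def __init__(self, voxel_size=0.5):
        self.voxel_size = voxel_size
        self.bank = defaultdict(list)  # voxel key -> list of feature vectors

    def _key(self, xyz):
        # Quantize a continuous (x, y, z) location to an integer voxel index.
        return tuple(int(c // self.voxel_size) for c in xyz)

    def insert(self, xyz, feature):
        # Store a feature observed at this location in some past traversal.
        self.bank[self._key(xyz)].append(feature)

    def query(self, xyz):
        # Average all past-traversal features that fell in this voxel;
        # return None where no past traversal covered the location.
        feats = self.bank.get(self._key(xyz), [])
        if not feats:
            return None
        return [sum(col) / len(feats) for col in zip(*feats)]
```

At detection time, a query at a 3D location returns aggregated context from every past drive through that voxel, which is the kind of signal that helps disambiguate sparse or occluded objects.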

Citation

@inproceedings{you2022hindsight,
  title = {Hindsight is 20/20: Leveraging Past Traversals to Aid 3D Perception},
  author = {You, Yurong and Luo, Katie Z and Chen, Xiangyu and Chen, Junan and Chao, Wei-Lun and Sun, Wen and Hariharan, Bharath and Campbell, Mark and Weinberger, Kilian Q.},
  booktitle = {Proceedings of the International Conference on Learning Representations (ICLR)},
  year = {2022},
  month = apr,
  url = {https://openreview.net/forum?id=qsZoGvFiJn1}
}

Environment

conda create --name hindsight python=3.8
conda activate hindsight
conda install pytorch=1.9.0 torchvision torchaudio cudatoolkit=11.1 -c pytorch -c nvidia
pip install opencv-python matplotlib wandb scipy tqdm easydict scikit-learn

# Install MinkowskiEngine (pinned to v0.5.4)
git clone https://github.com/NVIDIA/MinkowskiEngine.git
cd MinkowskiEngine
git checkout c854f0c # 0.5.4
python setup.py install

For OpenPCDet, follow downstream/OpenPCDet/docs/INSTALL.md to install, except that spconv should be installed from the code in third_party/spconv.

Data Pre-processing

Please refer to data_preprocessing/lyft/LYFT_PREPROCESSING.md and data_preprocessing/nuscenes/NUSCENES_PREPROCESSING.md.

Training and Evaluation

We implement the computation of SQuaSH as a submodule in OpenPCDet (as sparse_query) and modify the KITTI dataloader / augmentor to load the history traversals.
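A minimal sketch of what the modified dataloader/augmentor does conceptually: attach per-point context features queried from past traversals as extra input channels. The function name, feature dimension, and dense-array layout are assumptions for illustration; the real pipeline operates on sparse tensors via the sparse_query submodule.

```python
import numpy as np

def append_history_features(points, query_fn, feat_dim=2):
    """Append per-point context features queried from past traversals.

    points:   (N, 4) array of x, y, z, intensity.
    query_fn: maps an (x, y, z) location to a length-feat_dim feature
              vector, or None where no past traversal covered the spot.
    """
    context = np.zeros((points.shape[0], feat_dim), dtype=points.dtype)
    for i, point in enumerate(points):
        feat = query_fn(point[:3])
        if feat is not None:
            context[i] = feat
    # The detector now sees (N, 4 + feat_dim) inputs instead of (N, 4).
    return np.concatenate([points, context], axis=1)
```

Because the context is concatenated as extra channels, most point-cloud detectors can consume it by only widening their input layer, which is why the framework is compatible with many architectures.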

We include the corresponding configs for four detection models in downstream/OpenPCDet/tools/cfgs/lyft_models and downstream/OpenPCDet/tools/cfgs/nuscenes_boston_models. Please use them to train/evaluate the corresponding base detector and base detector + Hindsight models.

Train:

We use 4 GPUs to train detection models by default.

cd downstream/OpenPCDet/tools
OMP_NUM_THREADS=6 bash scripts/dist_train.sh 4 --cfg_file <cfg> --merge_all_iters_to_one_epoch --fix_random_seed

Evaluation:

cd downstream/OpenPCDet/tools
OMP_NUM_THREADS=6 bash scripts/dist_test.sh 4 --cfg_file <cfg> --ckpt <ckpt_path>

Checkpoints

Lyft experiments

| Model | Checkpoint | Config file |
| --- | --- | --- |
| PointPillars | link | cfg |
| PointPillars+Hindsight | link | cfg |
| SECOND | link | cfg |
| SECOND+Hindsight | link | cfg |
| PointRCNN | link | cfg |
| PointRCNN+Hindsight | link | cfg |
| PV-RCNN | link | cfg |
| PV-RCNN+Hindsight | link | cfg |

nuScenes experiments

| Model | Checkpoint | Config file |
| --- | --- | --- |
| PointPillars | link | cfg |
| PointPillars+Hindsight | link | cfg |
| PointRCNN | link | cfg |
| PointRCNN+Hindsight | link | cfg |

License

This project is released under the MIT License. We use OpenPCDet and spconv in this project; both are under the Apache-2.0 License, and we list our changes here.

Contact

Please open an issue if you have any questions about using this repo.

Acknowledgement

This work uses OpenPCDet, MinkowskiEngine, and spconv. We thank their authors for open-sourcing excellent libraries for 3D understanding tasks. We also use the scripts from 3D_adapt_auto_driving to convert the Lyft and nuScenes datasets into KITTI format.