MonoRUn
NEWS: The code of our subsequent work EPro-PnP (CVPR 2022 Best Student Paper) has been released here!
MonoRUn: Monocular 3D Object Detection by Reconstruction and Uncertainty Propagation. CVPR 2021. [paper] <br> Hansheng Chen, Yuyao Huang, Wei Tian*, Zhong Gao, Lu Xiong. (*Corresponding author: Wei Tian.)
This repository is the PyTorch implementation of MonoRUn. The code is based on MMDetection and MMDetection3D, although we use our own data formats. The PnP C++ code is modified from PVNet.
<img src="demo/demo.gif" alt="demo" />
Installation
Please refer to INSTALL.md.
Data preparation
Download the official KITTI 3D object dataset, including left color images, calibration files and training labels.
Download the train/val/test image lists [Google Drive | Baidu Pan, password: cj4u]. For training with LiDAR supervision, download the preprocessed object coordinate maps [Google Drive | Baidu Pan, password: fp3h].
Extract the downloaded archives according to the following folder structure. It is recommended to symlink the dataset root to $MonoRUn_ROOT/data. If your folder structure is different, you may need to change the corresponding paths in the config files.
$MonoRUn_ROOT
├── configs
├── monorun
├── tools
├── data
│   ├── kitti
│   │   ├── testing
│   │   │   ├── calib
│   │   │   ├── image_2
│   │   │   └── test_list.txt
│   │   └── training
│   │       ├── calib
│   │       ├── image_2
│   │       ├── label_2
│   │       ├── obj_crd
│   │       ├── mono3dsplit_train_list.txt
│   │       ├── mono3dsplit_val_list.txt
│   │       └── trainval_list.txt
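The layout above can be sanity-checked with a short script. This is only a sketch and not part of the repository: the helper name `missing_paths` and the constants `EXPECTED` and `LIDAR_ONLY` are hypothetical, and the entries are copied from the folder structure shown above. `obj_crd` is treated as optional because the object coordinate maps are only needed for training with LiDAR supervision.

```python
from pathlib import Path

# Relative paths copied from the folder structure above.
EXPECTED = [
    "data/kitti/testing/calib",
    "data/kitti/testing/image_2",
    "data/kitti/testing/test_list.txt",
    "data/kitti/training/calib",
    "data/kitti/training/image_2",
    "data/kitti/training/label_2",
    "data/kitti/training/mono3dsplit_train_list.txt",
    "data/kitti/training/mono3dsplit_val_list.txt",
    "data/kitti/training/trainval_list.txt",
]
# Only required when training with LiDAR supervision.
LIDAR_ONLY = ["data/kitti/training/obj_crd"]


def missing_paths(root, lidar_supervision=False):
    """Return the expected entries that do not exist under `root`.

    Symlinked directories count as present, since Path.exists() follows links.
    """
    root = Path(root)
    expected = EXPECTED + (LIDAR_ONLY if lidar_supervision else [])
    return [rel for rel in expected if not (root / rel).exists()]
```

For example, `missing_paths(os.environ["MonoRUn_ROOT"], lidar_supervision=True)` should return an empty list once all archives are extracted.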
Run the preparation script to generate image metas:
cd $MonoRUn_ROOT
python tools/prepare_kitti.py
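For reference when inspecting the files in label_2, the sketch below parses one line of the public KITTI object label format (15 whitespace-separated fields, plus an optional 16th score field used in KITTI result files). It is not taken from this repository and says nothing about what prepare_kitti.py does internally; the names `KittiObject` and `parse_kitti_line` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class KittiObject:
    type: str                                   # e.g. "Car", "Pedestrian", "Cyclist"
    truncated: float                            # 0 (fully visible) to 1 (fully truncated)
    occluded: int                               # occlusion state code
    alpha: float                                # observation angle, radians
    bbox: Tuple[float, float, float, float]     # left, top, right, bottom, pixels
    dimensions: Tuple[float, float, float]      # height, width, length, metres
    location: Tuple[float, float, float]        # x, y, z in camera coordinates, metres
    rotation_y: float                           # yaw around the camera Y axis, radians
    score: Optional[float] = None               # only present in result files


def parse_kitti_line(line: str) -> KittiObject:
    """Parse one line of a KITTI object label or result file."""
    fields = line.split()
    if len(fields) not in (15, 16):
        raise ValueError(f"expected 15 or 16 fields, got {len(fields)}")
    v = [float(x) for x in fields[1:]]
    return KittiObject(
        type=fields[0],
        truncated=v[0],
        occluded=int(v[1]),
        alpha=v[2],
        bbox=tuple(v[3:7]),
        dimensions=tuple(v[7:10]),
        location=tuple(v[10:13]),
        rotation_y=v[13],
        score=v[14] if len(v) == 15 else None,
    )
```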
Train
cd $MonoRUn_ROOT
To train without LiDAR supervision:
python train.py configs/kitti_multiclass.py --gpu-ids 0 1
where --gpu-ids 0 1 specifies the GPU IDs. In the paper we use two GPUs for distributed training. The number of GPUs affects the mini-batch size. You may change the samples_per_gpu option in the config file to vary the number of images per GPU. If you encounter an out-of-memory issue, add the arguments --seed 0 --deterministic to save GPU memory.
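The relation between GPU count and mini-batch size can be written out as a small sketch, assuming each listed GPU processes samples_per_gpu images per iteration. The function names are hypothetical and the numbers in the usage note are illustrative, not the values used in the paper.

```python
def total_batch_size(gpu_ids, samples_per_gpu):
    """Total images per iteration = number of distinct GPUs x samples_per_gpu."""
    if samples_per_gpu < 1:
        raise ValueError("samples_per_gpu must be at least 1")
    return len(set(gpu_ids)) * samples_per_gpu


def samples_per_gpu_for(target_total, gpu_ids):
    """samples_per_gpu needed to reach `target_total` on the given GPUs.

    Raises if the target cannot be split evenly across the GPUs.
    """
    n_gpus = len(set(gpu_ids))
    if n_gpus == 0 or target_total % n_gpus != 0:
        raise ValueError("target_total must be a positive multiple of the GPU count")
    return target_total // n_gpus
```

For instance, with `--gpu-ids 0 1` and an illustrative `samples_per_gpu` of 3, `total_batch_size([0, 1], 3)` gives 6; to keep that total on a single GPU, `samples_per_gpu_for(6, [0])` gives 6.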
To train with LiDAR supervision:
python train.py configs/kitti_multiclass_lidar_supv.py --gpu-ids 0 1
To view other training options:
python train.py -h
By default, logs and checkpoints will be saved to $MonoRUn_ROOT/work_dirs. You can run TensorBoard to plot the logs:
tensorboard --logdir $MonoRUn_ROOT/work_dirs
The above configs use the 3712-image split for training and the other split for validation. If you want to train on the full training set (train-val), use the config files with the _trainval suffix.
Test
You can download the pretrained models:
kitti_multiclass.pth [Google Drive | Baidu Pan, password: 6bih], trained on the KITTI training split
kitti_multiclass_lidar_supv.pth [Google Drive | Baidu Pan, password: nmdb], trained on the KITTI training split
kitti_multiclass_lidar_supv_trainval.pth [Google Drive | Baidu Pan, password: hg2r], trained on KITTI train-val
To test and evaluate on the validation set using the config at $CONFIG_PATH and the checkpoint at $CPT_PATH:
python test.py $CONFIG_PATH $CPT_PATH --val-set --gpu-ids 0
To test on the test set and save detection results to $RESULT_DIR:
python test.py $CONFIG_PATH $CPT_PATH --result-dir $RESULT_DIR --gpu-ids 0
You can append the argument --show-dir $SHOW_DIR to save visualized results.
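When evaluating several checkpoints, the commands above can be assembled programmatically. The sketch below only uses the flags named in this section (--val-set, --result-dir, --show-dir, --gpu-ids); the helper `build_test_cmd` is hypothetical and not part of the repository.

```python
def build_test_cmd(config_path, checkpoint_path, val_set=False,
                   result_dir=None, show_dir=None, gpu_ids=(0,)):
    """Build the argument list for test.py from the options described above."""
    cmd = ["python", "test.py", config_path, checkpoint_path]
    if val_set:
        cmd.append("--val-set")                 # evaluate on the validation set
    if result_dir is not None:
        cmd += ["--result-dir", result_dir]     # save detection results
    if show_dir is not None:
        cmd += ["--show-dir", show_dir]         # save visualized results
    cmd += ["--gpu-ids"] + [str(i) for i in gpu_ids]
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True, cwd=monorun_root)`, where `monorun_root` is your own path to the repository root.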
To view other testing options:
python test.py -h
Note: the training and testing scripts in the root directory are wrappers for the original scripts taken from MMDetection, which can be found in $MonoRUn_ROOT/tools. For advanced usage, please refer to the official MMDetection docs.
Demo
We provide a demo script to perform inference on images in a directory and save the visualized results. Example:
python demo/infer_imgs.py $KITTI_RAW_DIR/2011_09_30/2011_09_30_drive_0027_sync/image_02/data configs/kitti_multiclass_lidar_supv_trainval.py checkpoints/kitti_multiclass_lidar_supv_trainval.pth --calib demo/calib.csv --show-dir show/2011_09_30_drive_0027
Citation
If you find this project useful in your research, please consider citing:
@inproceedings{monorun2021,
author = {Hansheng Chen and Yuyao Huang and Wei Tian and Zhong Gao and Lu Xiong},
title = {MonoRUn: Monocular 3D Object Detection by Reconstruction and Uncertainty Propagation},
booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2021}
}