ImVoxelNet: Image to Voxels Projection for Monocular and Multi-View General-Purpose 3D Object Detection

This repository contains an implementation of ImVoxelNet, a monocular and multi-view 3D object detector introduced in our paper:

ImVoxelNet: Image to Voxels Projection for Monocular and Multi-View General-Purpose 3D Object Detection<br> Danila Rukhovich, Anna Vorontsova, Anton Konushin <br> Samsung Research<br> https://arxiv.org/abs/2106.01178

<p align="center"><img src="./resources/scheme.png" alt="drawing" width="90%"/></p>
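The "image to voxels projection" in the title can be illustrated with a small NumPy sketch: each voxel center is projected into the image with a pinhole camera model, and the voxel takes the 2D feature found at that pixel. This is a simplified illustration, not the repository's code. The function name, argument names, and nearest-pixel sampling are all assumptions made for the example.

```python
import numpy as np


def project_features_to_voxels(features, intrinsic, extrinsic, voxel_centers):
    """Fill each voxel with the image feature found at its projected pixel.

    Hypothetical helper for illustration only.
    features:      (C, H, W) 2D feature map.
    intrinsic:     (3, 3) pinhole camera matrix.
    extrinsic:     (3, 4) world-to-camera matrix [R | t].
    voxel_centers: (N, 3) voxel centers in world coordinates.
    Returns (N, C) voxel features and an (N,) boolean validity mask.
    """
    n_channels, height, width = features.shape
    n_voxels = len(voxel_centers)

    # World coordinates -> camera coordinates (homogeneous transform).
    homogeneous = np.concatenate([voxel_centers, np.ones((n_voxels, 1))], axis=1)
    camera_points = homogeneous @ extrinsic.T
    depth = camera_points[:, 2]

    # Camera coordinates -> pixel coordinates (perspective division).
    pixels = camera_points @ intrinsic.T
    safe_depth = np.where(depth > 0, depth, 1.0)  # avoid dividing by zero
    u = np.round(pixels[:, 0] / safe_depth).astype(int)
    v = np.round(pixels[:, 1] / safe_depth).astype(int)

    # A voxel is valid if it is in front of the camera and inside the image.
    valid = (depth > 0) & (u >= 0) & (u < width) & (v >= 0) & (v < height)

    # Invalid voxels keep zero features (nearest-pixel sampling is an assumption).
    voxel_features = np.zeros((n_voxels, n_channels), dtype=features.dtype)
    voxel_features[valid] = features[:, v[valid], u[valid]].T
    return voxel_features, valid
```

For multi-view inputs, one natural way to combine views is to run this per image and average each voxel's features over the views in which it is valid.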

Installation

For convenience, we provide a Dockerfile. Alternatively, you can install all required packages manually.

This implementation is based on the mmdetection3d framework. Please follow the original installation guide, install.md, replacing open-mmlab/mmdetection3d with saic-vul/imvoxelnet. In addition, rotated_iou should be installed with these 4 commands.

Most of the ImVoxelNet-related code is located in the following files: detectors/imvoxelnet.py, necks/imvoxelnet.py, dense_heads/imvoxel_head.py, pipelines/multi_view.py.

Datasets

We support three benchmarks based on the SUN RGB-D dataset.

For ScanNet, please follow the instructions in scannet. For KITTI and nuScenes, please follow the instructions in getting_started.md.

Getting Started

Please see getting_started.md for basic usage examples.

Training

To start training, run dist_train with an ImVoxelNet config. The trailing argument is the number of GPUs:

bash tools/dist_train.sh configs/imvoxelnet/imvoxelnet_kitti.py 8

Testing

Test a pre-trained model using dist_test with an ImVoxelNet config, passing the checkpoint path and the number of GPUs:

bash tools/dist_test.sh configs/imvoxelnet/imvoxelnet_kitti.py \
    work_dirs/imvoxelnet_kitti/latest.pth 8 --eval mAP

Visualization

Visualizations can be created with the test script. For clearer visualizations, consider setting score_thr in the config to 0.15 or higher:

python tools/test.py configs/imvoxelnet/imvoxelnet_kitti.py \
    work_dirs/imvoxelnet_kitti/latest.pth --show \
    --show-dir work_dirs/imvoxelnet_kitti
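As a rough illustration of why a higher score_thr gives cleaner visualizations, the sketch below drops low-confidence detections before they would be drawn. It is a minimal NumPy example, not the repository's code. The function name `filter_by_score`, the 7-value box layout, and the inclusive comparison are assumptions.

```python
import numpy as np


def filter_by_score(boxes, scores, score_thr=0.15):
    """Keep only detections whose confidence is at least score_thr.

    Hypothetical helper: boxes is (N, D), scores is (N,).
    """
    boxes = np.asarray(boxes)
    scores = np.asarray(scores)
    keep = scores >= score_thr  # inclusive comparison is an assumption
    return boxes[keep], scores[keep]


# Three made-up detections, each box stored as 7 placeholder values.
example_boxes = np.zeros((3, 7))
example_scores = np.array([0.90, 0.10, 0.15])
kept_boxes, kept_scores = filter_by_score(example_boxes, example_scores)
```

With the default threshold, the detection scored 0.10 is discarded and the other two are kept.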

Models

v2 adds center sampling for indoor scenarios. v3 simplifies the 3D neck for indoor scenarios. The differences are discussed in the v2 and v3 preprints.

| Dataset | Object Classes | Version | Download |
|:---:|:---:|:---:|:---:|
| SUN RGB-D | 37 from <br> Total3dUnderstanding | v1 \| mAP@0.15: 41.5 <br> v2 \| mAP@0.15: 42.7 <br> v3 \| mAP@0.15: 43.7 | model \| log \| config <br> model \| log \| config <br> model \| log \| config |
| SUN RGB-D | 30 from <br> PerspectiveNet | v1 \| mAP@0.15: 44.9 <br> v2 \| mAP@0.15: 47.2 <br> v3 \| mAP@0.15: 48.7 | model \| log \| config <br> model \| log \| config <br> model \| log \| config |
| SUN RGB-D | 10 from VoteNet | v1 \| mAP@0.25: 38.8 <br> v2 \| mAP@0.25: 39.4 <br> v3 \| mAP@0.25: 40.7 | model \| log \| config <br> model \| log \| config <br> model \| log \| config |
| ScanNet | 18 from VoteNet | v1 \| mAP@0.25: 40.6 <br> v2 \| mAP@0.25: 45.7 <br> v3 \| mAP@0.25: 48.1 | model \| log \| config <br> model \| log \| config <br> model \| log \| config |
| KITTI | Car | v1 \| AP@0.7: 17.8 | model \| log \| config |
| nuScenes | Car | v1 \| AP: 51.8 | model \| log \| config |
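In metrics such as mAP@0.15, mAP@0.25 and AP@0.7, the number after "@" is the 3D IoU threshold a prediction must reach against a ground-truth box to count as a true positive. The sketch below shows that matching criterion for axis-aligned boxes only. This is a simplifying assumption, since the benchmarks may evaluate oriented boxes, and both function names are hypothetical.

```python
import numpy as np


def axis_aligned_iou_3d(box_a, box_b):
    """IoU of two axis-aligned boxes (x_min, y_min, z_min, x_max, y_max, z_max)."""
    box_a = np.asarray(box_a, dtype=float)
    box_b = np.asarray(box_b, dtype=float)
    # Per-axis overlap length; negative values mean no overlap on that axis.
    overlap = np.minimum(box_a[3:], box_b[3:]) - np.maximum(box_a[:3], box_b[:3])
    intersection = np.prod(np.clip(overlap, 0.0, None))
    volume_a = np.prod(box_a[3:] - box_a[:3])
    volume_b = np.prod(box_b[3:] - box_b[:3])
    return intersection / (volume_a + volume_b - intersection)


def is_true_positive(pred_box, gt_box, iou_thr=0.25):
    """A prediction matches a ground-truth box if their IoU reaches iou_thr."""
    return bool(axis_aligned_iou_3d(pred_box, gt_box) >= iou_thr)


# Two 2x2x2 boxes shifted by 1 along x: intersection 4, union 12, IoU 1/3.
pred = (0, 0, 0, 2, 2, 2)
gt = (1, 0, 0, 3, 2, 2)
matches_at_025 = is_true_positive(pred, gt, iou_thr=0.25)
matches_at_07 = is_true_positive(pred, gt, iou_thr=0.7)
```

The same pair of boxes counts as a match at a threshold of 0.25 but not at 0.7, which is why scores reported at different thresholds are not directly comparable.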

Example Detections

<p align="center"><img src="./resources/github.png" alt="drawing" width="90%"/></p>

Citation

If you find this work useful for your research, please cite our paper:

@inproceedings{rukhovich2022imvoxelnet,
  title={Imvoxelnet: Image to voxels projection for monocular and multi-view general-purpose 3d object detection},
  author={Rukhovich, Danila and Vorontsova, Anna and Konushin, Anton},
  booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision},
  pages={2397--2406},
  year={2022}
}