TR3D: Towards Real-Time Indoor 3D Object Detection
News:
- :fire: June, 2023. TR3D is accepted at ICIP2023.
- :rocket: June, 2023. We add a ScanNet-pretrained S3DIS model and log, significantly pushing forward the state of the art.
- February, 2023. TR3D on all 3 datasets is now supported in mmdetection3d as a project.
- :fire: February, 2023. TR3D is now state-of-the-art on paperswithcode on SUN RGB-D and S3DIS.
This repository contains an implementation of TR3D, a 3D object detection method introduced in our paper:
TR3D: Towards Real-Time Indoor 3D Object Detection<br> Danila Rukhovich, Anna Vorontsova, Anton Konushin <br> Samsung Research<br> https://arxiv.org/abs/2302.02858
Installation
For convenience, we provide a Dockerfile.
Alternatively, you can install all required packages manually. This implementation is based on the mmdetection3d framework.
Please refer to the original installation guide, getting_started.md, including the MinkowskiEngine installation, replacing open-mmlab/mmdetection3d with samsunglabs/tr3d.
Most of the TR3D-related code is located in the following files:
detectors/mink_single_stage.py,
detectors/tr3d_ff.py,
dense_heads/tr3d_head.py,
necks/tr3d_neck.py.
Getting Started
Please see getting_started.md for basic usage examples. We follow the mmdetection3d data preparation protocol described in scannet, sunrgbd, and s3dis.
Training
To start training, run train with TR3D configs:
python tools/train.py configs/tr3d/tr3d_scannet-3d-18class.py
Testing
Test a pre-trained model by running test with TR3D configs:
python tools/test.py configs/tr3d/tr3d_scannet-3d-18class.py \
work_dirs/tr3d_scannet-3d-18class/latest.pth --eval mAP
Visualization
Visualizations can be created with the test script.
For better visualizations, you may set score_thr in the configs to 0.3:
python tools/test.py configs/tr3d/tr3d_scannet-3d-18class.py \
work_dirs/tr3d_scannet-3d-18class/latest.pth --eval mAP --show \
--show-dir work_dirs/tr3d_scannet-3d-18class
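To illustrate why a higher score_thr gives cleaner visualizations, here is a minimal sketch of score thresholding. It assumes detections are stored as parallel lists of boxes, scores, and labels. The function name `filter_by_score`, the box layout, and the strict `>` comparison are hypothetical choices for illustration and are not taken from the TR3D or mmdetection3d code.

```python
from typing import List, Sequence, Tuple

# Hypothetical box layout: (x, y, z, dx, dy, dz).
Box = Tuple[float, float, float, float, float, float]


def filter_by_score(
    boxes: Sequence[Box],
    scores: Sequence[float],
    labels: Sequence[int],
    score_thr: float = 0.3,
) -> Tuple[List[Box], List[float], List[int]]:
    """Keep only detections whose confidence score exceeds score_thr."""
    kept = [i for i, score in enumerate(scores) if score > score_thr]
    return (
        [boxes[i] for i in kept],
        [scores[i] for i in kept],
        [labels[i] for i in kept],
    )


if __name__ == "__main__":
    # Made-up detections, for illustration only.
    boxes = [
        (0.0, 0.0, 0.5, 1.0, 1.0, 1.0),
        (2.0, 1.0, 0.4, 0.6, 0.6, 0.8),
        (1.0, 3.0, 0.9, 2.0, 1.0, 1.8),
    ]
    scores = [0.92, 0.05, 0.31]
    labels = [3, 3, 7]
    kept_boxes, kept_scores, kept_labels = filter_by_score(boxes, scores, labels)
    print(len(kept_boxes), kept_scores, kept_labels)
```

With a low threshold, many low-confidence boxes are drawn and clutter the scene; raising it to 0.3 leaves only the more confident predictions.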
Models
The metrics are obtained from 5 training runs, each followed by 5 test runs. We report both the best and the average values (the latter are given in round brackets). Inference speed (scenes per second) is measured on a single NVIDIA RTX 4090. Please note that the ScanNet-pretrained S3DIS model was actually trained in the original open-mmlab/mmdetection3d codebase.
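The "best (average)" notation used in the tables can be reproduced with a short sketch. It assumes each of the 5 trained models is evaluated 5 times, giving 25 values per metric. The helper name `summarize_runs` and the numbers below are hypothetical and are not taken from the released logs.

```python
from statistics import mean
from typing import Sequence


def summarize_runs(maps: Sequence[Sequence[float]]) -> str:
    """Format mAP values as 'best (average)'.

    maps[i][j] is the mAP of the j-th test run of the i-th trained model.
    """
    flat = [value for run in maps for value in run]
    return f"{max(flat):.1f} ({mean(flat):.1f})"


if __name__ == "__main__":
    # Made-up values: 5 training runs x 5 test runs, for illustration only.
    fake_maps = [[60.0, 61.0, 62.0, 61.0, 61.0] for _ in range(5)]
    print(summarize_runs(fake_maps))  # prints "62.0 (61.0)"
```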
TR3D 3D Detection
Dataset | mAP@0.25 | mAP@0.5 | Scenes <br> per sec. | Download |
---|---|---|---|---|
ScanNet | 72.9 (72.0) | 59.3 (57.4) | 23.7 | model / log / config |
SUN RGB-D | 67.1 (66.3) | 50.4 (49.6) | 27.5 | model / log / config |
S3DIS | 74.5 (72.1) | 51.7 (47.6) | 21.0 | model / log / config |
S3DIS <br> ScanNet-pretrained | 75.9 (75.1) | 56.6 (54.8) | 21.0 | model / log / config |
RGB + PC 3D Detection on SUN RGB-D
Model | mAP@0.25 | mAP@0.5 | Scenes <br> per sec. | Download |
---|---|---|---|---|
ImVoteNet | 63.4 | - | 14.8 | instruction |
VoteNet+FF | 64.5 (63.7) | 39.2 (38.1) | - | model / log / config |
TR3D+FF | 69.4 (68.7) | 53.4 (52.4) | 17.5 | model / log / config |
Example Detections
<p align="center"><img src="./resources/github.png" alt="drawing" width="90%"/></p>
Citation
If you find this work useful for your research, please cite our paper:
@misc{rukhovich2023tr3d,
doi = {10.48550/ARXIV.2302.02858},
url = {https://arxiv.org/abs/2302.02858},
author = {Rukhovich, Danila and Vorontsova, Anna and Konushin, Anton},
title = {TR3D: Towards Real-Time Indoor 3D Object Detection},
publisher = {arXiv},
year = {2023}
}