Distinctive 3D local deep descriptors, ICPR 2020
Distinctive 3D local deep descriptors (DIPs) are rotation-invariant compact 3D descriptors computed using a PointNet-based deep neural network. DIPs can be used to register point clouds without requiring an initial alignment. DIPs are generated from point-cloud patches that are canonicalised with respect to their estimated local reference frame (LRF). DIPs can effectively generalise across different sensor modalities because they are learnt end-to-end from locally and randomly sampled points. DIPs (i) achieve comparable results to the state-of-the-art on RGB-D indoor scenes (3DMatch dataset), (ii) outperform state-of-the-art by a large margin in terms of generalisation on laser-scanner outdoor scenes (ETH dataset), and (iii) generalise to indoor scenes reconstructed with the Visual-SLAM system of Android ARCore.
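For illustration, the core idea of canonicalising a patch with respect to its LRF can be sketched in a few lines of NumPy. This is only a generic covariance-based frame with sign disambiguation; the LRF estimation actually used by DIP differs in its details (see the paper), and `canonicalise_patch` is an illustrative name, not a function of this repository.

```python
# Minimal sketch of patch canonicalisation with a local reference frame (LRF).
# Generic covariance-based frame for illustration only; DIP's LRF differs in detail.
import numpy as np

def canonicalise_patch(patch, centre):
    """Express a (N, 3) patch of points in a frame estimated from its covariance."""
    centred = patch - centre
    cov = centred.T @ centred / len(centred)
    _, eigvecs = np.linalg.eigh(cov)          # columns sorted by ascending eigenvalue
    axes = eigvecs[:, ::-1].copy()            # x: largest variance, z: smallest (normal-like)
    for i in range(3):                        # disambiguate axis signs for repeatability
        if np.sum(centred @ axes[:, i]) < 0:
            axes[:, i] = -axes[:, i]
    if np.linalg.det(axes) < 0:               # enforce a right-handed frame
        axes[:, 2] = -axes[:, 2]
    return centred @ axes                     # patch expressed in its LRF
```

In DIP, the canonicalised patch is then fed to the PointNet-based network that outputs the descriptor.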
<p align="center"><img src="assets/teaser.jpg" width="500"></p>

Descriptor quality and generalisation ability
Descriptor quality is assessed using feature-matching recall [6]. See the paper for the references.
3DMatch dataset | Generalisation ability on ETH dataset |
---|---|
<img src="assets/fmr_3dmatch.png" width="400"> | <img src="assets/fmr_eth.png" width="400"> |
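For reference, feature-matching recall counts the fraction of point-cloud pairs whose inlier ratio exceeds a threshold τ2, where a correspondence is an inlier if its nearest-neighbour match in descriptor space lies within a distance τ1 after applying the ground-truth transformation. A minimal sketch, assuming keypoints, descriptors and ground-truth poses are already available; the function names are illustrative and the τ values are those commonly used on 3DMatch, not taken from this repository's evaluation code.

```python
# Minimal sketch of feature-matching recall (FMR); function names are illustrative.
import numpy as np
from scipy.spatial import cKDTree

def pair_is_matched(kp_src, kp_tgt, desc_src, desc_tgt, T_gt, tau1=0.1, tau2=0.05):
    """kp_*: (N, 3) keypoints, desc_*: (N, D) descriptors, T_gt: (4, 4) ground-truth pose."""
    # Nearest neighbour in descriptor space, source -> target.
    nn_idx = cKDTree(desc_tgt).query(desc_src)[1]
    # Bring source keypoints into the target frame with the ground-truth transformation.
    kp_src_gt = (T_gt[:3, :3] @ kp_src.T).T + T_gt[:3, 3]
    dists = np.linalg.norm(kp_src_gt - kp_tgt[nn_idx], axis=1)
    return np.mean(dists < tau1) > tau2       # inlier ratio above tau2 -> pair counts as matched

def feature_matching_recall(results):
    """results: list of booleans from pair_is_matched over all point-cloud pairs."""
    return float(np.mean(results))
```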
Tested with
- Ubuntu 16.04
- CUDA 10.2
- Python 3.6
- PyTorch 1.4
- Open3D 0.8.0
- torch-cluster
- torch-nndistance
Installation
```
git clone https://github.com/fabiopoiesi/dip.git
cd dip
pip install -r requirements.txt
pip install torch-cluster==1.4.5 -f https://pytorch-geometric.com/whl/torch-1.4.0.html
cd torch-nndistance
python build.py install
```
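A quick sanity check that the dependencies import correctly; this assumes the nndistance extension installs under the module name `torch_nndistance`.

```python
# Quick check that the main dependencies import and report their versions.
import torch
import open3d
import torch_cluster
import torch_nndistance   # module name assumed from the torch-nndistance build directory

print('PyTorch:', torch.__version__)
print('Open3D:', open3d.__version__)
print('CUDA available:', torch.cuda.is_available())
```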
Download datasets and preprocessed data
The datasets used in the paper are listed below along with links to their respective original project pages. For convenience and reproducibility, our preprocessed data<sup>1</sup> are available for download. The preprocessed data for the 3DMatchRotated dataset (the augmented version of 3DMatch) are not provided; they must be generated with the preprocessing scripts (see below). After downloading the folders and unzipping the files, the dataset root directory should have the following structure.
```
.
├── 3DMatch_test
├── 3DMatch_test_pre
├── 3DMatch_train
├── 3DMatch_train_pre
├── ETH_test
├── ETH_test_pre
└── VigoHome
```
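A small sketch to verify the layout, where `/path/to/datasets` is a placeholder for your own dataset root:

```python
# Check that the dataset root contains the expected folders.
import os

dataset_root = '/path/to/datasets'   # placeholder, set to your own root
expected = ['3DMatch_test', '3DMatch_test_pre', '3DMatch_train',
            '3DMatch_train_pre', 'ETH_test', 'ETH_test_pre', 'VigoHome']
missing = [d for d in expected if not os.path.isdir(os.path.join(dataset_root, d))]
print('Missing folders:', missing if missing else 'none')
```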
3DMatch dataset
The original dataset can be found here. We used data from the RGB-D Reconstruction Datasets. Point cloud PLYs are generated using Multi-Frame Depth TSDF Fusion from here.
ETH dataset
The original dataset can be found here.
VigoHome dataset
We collected VigoHome with our Android ARCore-based Visual-SLAM App. The dataset can be downloaded here, while the App's apk can be downloaded here (available soon).
Preprocessing
Preprocessing can be used to generate patches and LRFs for training; this greatly reduces training time. Preprocessing requires two steps: the first computes point correspondences between point-cloud pairs using the Iterative Closest Point (ICP) algorithm; the second produces patches along with their LRFs. To preprocess the 3DMatch training data, run preprocess_3dmatch_correspondences_train.py and preprocess_3dmatch_lrf_train.py (same procedure for the test data). Just make sure that the datasets are downloaded and that the paths in the code are set.
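As an illustration of the first step, Open3D's ICP can register a roughly aligned point-cloud pair and return the matched point indices. The sketch below uses the Open3D 0.8 API (newer releases expose it under `open3d.pipelines.registration`); the file names, distance threshold and identity initialisation are placeholders, not the values used by the preprocessing scripts.

```python
# Sketch of ICP-based correspondence search between a point-cloud pair (Open3D 0.8 API).
import numpy as np
import open3d as o3d

source = o3d.io.read_point_cloud('cloud_bin_0.ply')   # placeholder file names
target = o3d.io.read_point_cloud('cloud_bin_1.ply')

threshold = 0.05       # maximum correspondence distance in metres (placeholder value)
init = np.eye(4)       # assumes the pair is already roughly aligned (e.g. via a known pose)
result = o3d.registration.registration_icp(
    source, target, threshold, init,
    o3d.registration.TransformationEstimationPointToPoint())

# correspondence_set is an (M, 2) array of matched (source, target) point indices.
correspondences = np.asarray(result.correspondence_set)
print(correspondences.shape)
```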
Training
Training requires preprocessed data, i.e. patches and LRFs (extracting and computing them at each iteration during training would be too slow). See Preprocessing to create your own preprocessed data, or download ours. To train, set the variable dataset_root in train.py, then run
```
python train.py
```
Training generates checkpoints in the chkpts directory and training logs in the logs directory. Logs can be monitored with TensorBoard by running
```
tensorboard --logdir=logs
```
Demo using pretrained model
We include three demos, one for each dataset evaluated in the paper. The point clouds processed in the demos are in the assets directory and the model trained on the 3DMatch dataset is in the model directory. Run
```
python demo_3dmatch.py
python demo_eth.py
python demo_vigohome.py
```
The result of each demo should look similar to the examples below. Because the registration is estimated with RANSAC, results may differ slightly at each run.
3DMatch dataset | ETH dataset | VigoHome dataset |
---|---|---|
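For reference, the kind of descriptor-based RANSAC registration performed in the demos can be sketched with Open3D as follows (Open3D 0.8 API). The function, its parameters and the 0.05 m threshold are illustrative and do not mirror the demo scripts.

```python
# Sketch of RANSAC registration from precomputed DIP descriptors (Open3D 0.8 API).
import open3d as o3d

def ransac_register(kp_src, kp_tgt, desc_src, desc_tgt, dist_thr=0.05):
    """kp_*: (N, 3) float64 keypoints, desc_*: (N, D) descriptors; returns a 4x4 transformation."""
    pcd_src, pcd_tgt = o3d.geometry.PointCloud(), o3d.geometry.PointCloud()
    pcd_src.points = o3d.utility.Vector3dVector(kp_src)
    pcd_tgt.points = o3d.utility.Vector3dVector(kp_tgt)
    feat_src, feat_tgt = o3d.registration.Feature(), o3d.registration.Feature()
    feat_src.data, feat_tgt.data = desc_src.T, desc_tgt.T   # Open3D expects descriptors as (D, N)
    result = o3d.registration.registration_ransac_based_on_feature_matching(
        pcd_src, pcd_tgt, feat_src, feat_tgt, dist_thr,
        o3d.registration.TransformationEstimationPointToPoint(False), 4,
        [o3d.registration.CorrespondenceCheckerBasedOnDistance(dist_thr)],
        o3d.registration.RANSACConvergenceCriteria(50000, 1000))
    return result.transformation   # RANSAC is stochastic, so different runs can differ slightly
```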
Graphs
Graphs<sup>2,3</sup> of Fig. 6 can be generated by running
```
python graphs/viz_graphs.py
```
Citing our work
Please cite the following paper if you use our code:
```bibtex
@inproceedings{Poiesi2021,
  title = {Distinctive {3D} local deep descriptors},
  author = {Poiesi, Fabio and Boscaini, Davide},
  booktitle = {IEEE Proc. of Int'l Conference on Pattern Recognition},
  address = {Milan, IT},
  month = {Jan},
  year = {2021}
}
```
Acknowledgements
This research has received funding from the Fondazione CARITRO - Ricerca e Sviluppo programme 2018-2020.
We also thank <sup>1</sup>Zan Gojcic, <sup>2</sup>Chris Choy and <sup>3</sup>Xuyang Bai for their support in collecting the data for the paper.