MultiScan: Scalable RGBD scanning for 3D environments with articulated objects
<h4> <a href="https://3dlg-hcvc.github.io/multiscan/">Project</a> | <a href="https://openreview.net/forum?id=YxUdazpgweG">Paper</a> | <a href="https://3dlg-hcvc.github.io/multiscan/read-the-docs/index.html">Docs</a> </h4>MultiScan is a scalable RGBD dataset construction pipeline that leverages commodity mobile devices to scan indoor scenes with articulated objects, together with web-based semantic annotation interfaces for efficiently annotating object and part semantics and part mobility parameters.
<img src="docs/read-the-docs/_static/teaser.png" />The repository includes:
- Source code of iOS and Android scanning apps
- Processing server for 3D reconstruction, texturing and segmentation
- Web interface for browsing scans and initiating processing
- Source code of benchmark dataset preparation
- Source code of data visualization
MultiScan Dataset
Download the MultiScan dataset download script, copy it to the [PROJECT_ROOT]/dataset
directory, and run the script to download the dataset:

```shell
./dataset/download.sh <output_dir>
```
Unzip the files:

```shell
cd <output_dir>
unzip "*.zip"
```
The downloaded dataset follows this file system structure.
MultiScan dataset includes:
MultiScan Benchmark Dataset
Download the MultiScan benchmark dataset download script, copy it to the [PROJECT_ROOT]/dataset
directory, and follow the instructions below to download the MultiScan benchmark dataset:
Object instance segmentation
Download the preprocessed object instance segmentation data:

```shell
./dataset/download_benchmark_dataset.sh -o <output_dir>
```
Part instance segmentation
Download the preprocessed part instance segmentation data:

```shell
./dataset/download_benchmark_dataset.sh -p <output_dir>
```
Mobility prediction
Download the preprocessed articulated objects dataset:

```shell
./dataset/download_benchmark_dataset.sh -a <output_dir>
```
Unzip the files:

```shell
cd <output_dir>
unzip "*.zip"
```
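The three download commands above differ only in the flag passed to the script, so a small wrapper can fetch any subset by name. A minimal sketch in Python, assuming only the flag-to-subset mapping shown above (the function and subset names are illustrative, not part of the released tooling):

```python
import subprocess

# Flags accepted by download_benchmark_dataset.sh, per the commands above.
SUBSET_FLAGS = {
    "object_instance_segmentation": "-o",
    "part_instance_segmentation": "-p",
    "mobility_prediction": "-a",
}


def build_download_command(subset, output_dir):
    """Build the shell command that downloads one benchmark subset."""
    if subset not in SUBSET_FLAGS:
        raise ValueError(f"unknown subset: {subset}")
    return ["./dataset/download_benchmark_dataset.sh", SUBSET_FLAGS[subset], output_dir]


def download_subset(subset, output_dir):
    """Run the download script for one subset (the script must be present)."""
    subprocess.run(build_download_command(subset, output_dir), check=True)
```

For example, `download_subset("mobility_prediction", "data/")` runs the same command as the `-a` invocation above.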
Please check out the benchmark dataset doc for information about downloading the preprocessed datasets and the preprocessing scripts.
Scanner App
The Scanner App collects data using the sensors on an Android/iOS device. The user moves around the scene holding a device with the Scanner app installed. Once scanning is complete, the user can upload the data to the processing server.
- source code for iOS scanning app: iOS code
- documentation for iOS scanning app: iOS doc
- source code for Android scanning app: Android code
- documentation for Android scanning app: Android doc
Processing Server
The processing server has three main functions:
- Stage scans uploaded by devices (iOS or Android) and trigger scan processing. To ensure that scans can be processed automatically, they should be placed in a directory with ample space that is accessible to the scan processor.
- Process staged scans. Handle reconstruction processing requests from the Web-UI when users press its interactive buttons.
- Index staged scans. Go through scan folders and collate information about the scans.
- source code for processing server: server code
- installation doc for processing server: install
- configurations and documentations for processing server: doc
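The indexing step above (walking scan folders and collating information about each scan) can be sketched as follows. This is a hypothetical illustration under the assumption that each scan lives in its own sub-folder of a staging directory, not the server's actual implementation:

```python
import os


def index_staged_scans(staging_dir):
    """Collate basic per-scan information from a staging directory.

    Hypothetical sketch: assumes one sub-folder per scan, with the
    uploaded files stored directly inside it.
    """
    index = []
    for scan_id in sorted(os.listdir(staging_dir)):
        scan_path = os.path.join(staging_dir, scan_id)
        if not os.path.isdir(scan_path):
            continue  # skip stray files in the staging directory
        files = sorted(os.listdir(scan_path))
        total_bytes = sum(
            os.path.getsize(os.path.join(scan_path, f))
            for f in files
            if os.path.isfile(os.path.join(scan_path, f))
        )
        index.append({"scanId": scan_id, "files": files, "totalBytes": total_bytes})
    return index
```

The resulting list of records is the kind of summary the Web-UI can present when browsing staged scans.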
Staging Data Formats
Details about the formats of the uploaded files and the data generated by the processing server are available here.
Web-UI
The Web-UI is an interactive interface for providing an overview of staged scan data, managing scan data, and controlling the reconstruction and mesh annotation pipeline.
- source code to webui server backend: web-server code
- source code to webui client frontend: web-client code
- Web-UI installation and usage documentation: doc
Benchmark
With the MultiScan dataset, we carry out a series of benchmark experiments evaluating recent methods for object instance segmentation, part instance segmentation, and mobility prediction.
Please check out the benchmark dataset doc for information about downloading the preprocessed datasets and the preprocessing scripts.
Benchmark train/val/test split and the selected object/part semantic labels and IDs:
- train/val/test split scans_split.csv
- object label to semantic label/ID mapping object_semantic_label_map.csv
- part label to semantic label/ID mapping part_semantic_label_map.csv
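The split and label-map files above are plain CSVs, so loading them needs only the standard library. A sketch of grouping scan IDs by split, assuming header columns named `scanId` and `split` (check the actual file for the exact column names):

```python
import csv


def load_split(csv_path):
    """Group scan IDs by their train/val/test split.

    Assumes scans_split.csv has header columns "scanId" and "split";
    the real column names may differ.
    """
    splits = {}
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            splits.setdefault(row["split"], []).append(row["scanId"])
    return splits
```

The semantic label map CSVs can be read the same way with `csv.DictReader`, keyed on whatever label/ID columns they define.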
Visualization
Annotations visualization
We provide turntable video visualizations of the semantic label annotations, semantic OBB annotations, articulation annotations, and textured meshes of the scans. Please check out the visualization doc for more information.
Citation
If you use the MultiScan data or code, please cite:
@inproceedings{mao2022multiscan,
author = {Mao, Yongsen and Zhang, Yiming and Jiang, Hanxiao and Chang, Angel X. and Savva, Manolis},
title = {MultiScan: Scalable RGBD scanning for 3D environments with articulated objects},
booktitle = {Advances in Neural Information Processing Systems},
year = {2022}
}
References
Our work is built on top of the ScanNet dataset acquisition framework, and uses Open3D and MVS-Texturing for 3D reconstruction. We use Open3D, Pyrender, MeshLab, and Instant Meshes for rendering and post-processing.
@misc{dai2017scannet,
title={ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes},
author={Angela Dai and Angel X. Chang and Manolis Savva and Maciej Halber and Thomas Funkhouser and Matthias Nießner},
year={2017},
eprint={1702.04405},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
@article{Zhou2018,
author = {Qian-Yi Zhou and Jaesik Park and Vladlen Koltun},
title = {{Open3D}: {A} Modern Library for {3D} Data Processing},
journal = {arXiv:1801.09847},
year = {2018},
}
@inproceedings{Waechter2014Texturing,
title = {Let There Be Color! --- {L}arge-Scale Texturing of {3D} Reconstructions},
author = {Waechter, Michael and Moehrle, Nils and Goesele, Michael},
booktitle= {Proceedings of the European Conference on Computer Vision},
year = {2014},
publisher= {Springer},
}
@article{Jakob2015Instant,
author = {Wenzel Jakob and Marco Tarini and Daniele Panozzo and Olga Sorkine-Hornung},
title = {Instant Field-Aligned Meshes},
journal = {ACM Transactions on Graphics (Proceedings of SIGGRAPH ASIA)},
volume = {34},
number = {6},
year = {2015},
month = nov,
doi = {10.1145/2816795.2818078},
}
@inproceedings{LocalChapterEvents:ItalChap:ItalianChapConf2008:129-136,
booktitle = {Eurographics Italian Chapter Conference},
editor = {Vittorio Scarano and Rosario De Chiara and Ugo Erra},
title = {{MeshLab: an Open-Source Mesh Processing Tool}},
author = {Cignoni, Paolo and Callieri, Marco and Corsini, Massimiliano and Dellepiane, Matteo and Ganovelli, Fabio and Ranzuglia, Guido},
year = {2008},
publisher = {The Eurographics Association},
ISBN = {978-3-905673-68-5},
DOI = {10.2312/LocalChapterEvents/ItalChap/ItalianChapConf2008/129-136}
}