
<p align="center"> <h1 align="center"><img src="assets/lamar_white.svg" width="85"><br><ins>LaMAR</ins><br>Benchmarking Localization and Mapping<br>for Augmented Reality</h1> <p align="center"> <a href="https://psarlin.com/">Paul-Edouard Sarlin*</a> · <a href="https://dsmn.ml/">Mihai Dusmanu*</a> <br> <a href="https://demuc.de/">Johannes L. Schönberger</a> · <a href="https://www.microsoft.com/en-us/research/people/paspecia/">Pablo Speciale</a> · <a href="https://www.microsoft.com/en-us/research/people/lugruber/">Lukas Gruber</a> · <a href="https://vlarsson.github.io/">Viktor Larsson</a> · <a href="http://miksik.co.uk/">Ondrej Miksik</a> · <a href="https://www.microsoft.com/en-us/research/people/mapoll/">Marc Pollefeys</a> </p> <p align="center"> <img src="assets/logos.svg" alt="Logo" height="40"> </p> <h2 align="center">ECCV 2022</h2> <h3 align="center"><a href="https://lamar.ethz.ch/">Project Page</a> | <a href="https://youtu.be/32XsRli2coo">Video</a></h3> <div align="center"></div> </p> <p align="center"> <a href="https://lamar.ethz.ch/"><img src="assets/teaser.svg" alt="Logo" width="80%"></a> <br> <em>LaMAR includes multi-sensor streams recorded by AR devices along hundreds of unconstrained trajectories captured over 2 years in 3 large indoor+outdoor locations.</em> </p>

This repository hosts the source code for LaMAR, a new benchmark for localization and mapping with AR devices in realistic conditions. The contributions of this work are:

A dataset: multi-sensor data streams captured by AR devices and laser scanners
scantools: a processing pipeline to register different user sessions together
A benchmark: a framework to evaluate algorithms for localization and mapping

See our ECCV 2022 tutorial for an overview of LaMAR and of the state of the art of localization and mapping for AR.

Overview

This codebase is composed of the following modules:

<a href="#benchmark">lamar</a>: evaluation pipeline and baselines for localization and mapping
<a href="#processing-pipeline">scantools</a>: data API, processing tools and pipeline
ScanCapture: a data recording app for Apple devices

Data format

We introduce a new data format, called Capture, to handle multi-session and multi-sensor data recorded by different devices. A Capture object corresponds to a capture location. It is composed of multiple sessions, each corresponding to a data recording by a given device. Each session stores the raw sensor data, calibration, poses, and all assets generated during the processing.

from scantools.capture import Capture
capture = Capture.load('data/CAB/')
print(capture.sessions.keys())
session = capture.sessions[session_id]  # each session has a unique id
print(session.sensors.keys())  # each sensor has a unique id
print(session.rigs)  # extrinsic calibration between sensors
keys = session.trajectories.key_pairs()  # all (timestamp, sensor_or_rig_id)
T_w_i = session.trajectories[keys[0]]  # first pose, from sensor/rig to world

More details are provided in the specification document CAPTURE.md.
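
The nesting described above can be pictured with a small stand-in model. The sketch below does not use scantools: `ToyCapture`, `ToySession`, and `ToyTrajectories` are hypothetical classes that only mirror the hierarchy (capture, sessions, sensors, trajectories indexed by timestamp and sensor or rig id), and poses are reduced to a translation for brevity. The actual classes and file layout are the ones defined in CAPTURE.md.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical stand-ins, NOT the scantools classes: they only mirror the
# nesting capture -> sessions -> (sensors, trajectories).
Pose = Tuple[float, float, float]  # simplified: translation only, in meters


@dataclass
class ToyTrajectories:
    # poses[timestamp][sensor_or_rig_id] = pose from sensor/rig to world
    poses: Dict[int, Dict[str, Pose]] = field(default_factory=dict)

    def add(self, timestamp: int, device_id: str, pose: Pose) -> None:
        self.poses.setdefault(timestamp, {})[device_id] = pose

    def key_pairs(self) -> List[Tuple[int, str]]:
        """All (timestamp, sensor_or_rig_id) pairs, sorted by timestamp then id."""
        return [(ts, dev) for ts in sorted(self.poses) for dev in sorted(self.poses[ts])]

    def __getitem__(self, key: Tuple[int, str]) -> Pose:
        timestamp, device_id = key
        return self.poses[timestamp][device_id]


@dataclass
class ToySession:
    sensors: Dict[str, str] = field(default_factory=dict)  # sensor id -> sensor type
    trajectories: ToyTrajectories = field(default_factory=ToyTrajectories)


@dataclass
class ToyCapture:
    sessions: Dict[str, ToySession] = field(default_factory=dict)


# One capture location holding a single session with two sensors.
capture = ToyCapture()
session = ToySession(sensors={"cam0": "camera", "wifi": "radio"})
session.trajectories.add(2000, "cam0", (1.0, 0.0, 0.0))
session.trajectories.add(1000, "cam0", (0.0, 0.0, 0.0))
capture.sessions["query_val_phone"] = session

keys = capture.sessions["query_val_phone"].trajectories.key_pairs()
first_pose = session.trajectories[keys[0]]  # earliest pose of the session
print(keys[0], first_pose)
```

Indexing trajectories by a (timestamp, device id) pair is what lets a single session hold poses for several sensors or rigs recorded at the same time.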

Installation

:one: Install the core dependencies using the provided script, tested on Ubuntu 22.04:

scripts/install_core_dependencies.sh

Alternatively, you can install them manually in the following order:

Python 3.9 / 3.10 (we recommend using a venv virtual environment).
Ceres Solver 2.1
COLMAP 3.8 built from source. Note: do not install libceres-dev, since Ceres Solver was already installed in the previous step.
hloc 1.4 and its dependencies

:two: Install LaMAR libraries as editable packages:

python -m pip install -e .

:three: Optional: the processing pipeline additionally relies on heavier dependencies not required for benchmarking:

Pip dependencies: python -m pip install -e .[scantools]
raybender for raytracing
pcdmeshing for pointcloud meshing

:four: Optional: if you wish to contribute, install the development tools as well:

python -m pip install -e .[dev]

Docker images

The Dockerfile provided in this project has multiple stages, two of which are: scantools and lamar.

Building the Docker Images

You can build the Docker images for these stages using the following commands:

# Build the 'scantools' stage
docker build --target scantools -t lamar:scantools -f Dockerfile ./

# Build the 'lamar' stage
docker build --target lamar -t lamar:lamar -f Dockerfile ./

Pulling the Docker Images from GitHub Docker Registry

Alternatively, if you don't want to build the images yourself, you can pull them from the GitHub Docker Registry using the following commands:

# Pull the 'scantools' image
docker pull ghcr.io/microsoft/lamar-benchmark/scantools:latest

# Pull the 'lamar' image
docker pull ghcr.io/microsoft/lamar-benchmark/lamar:latest

Usage of Docker images

To use the lamar Docker image, you can follow these steps:

Set the DATA_DIR and DOCKER_RUN environment variables:

export DATA_DIR=/path/to/data
export DOCKER_RUN="docker run -it --rm --init -u $(id -u):$(id -g) -v ${DATA_DIR}:${DATA_DIR} ghcr.io/microsoft/lamar-benchmark/lamar:latest "

Note: replace ghcr.io/microsoft/lamar-benchmark/lamar:latest with lamar:lamar if you want to use the image you built locally.

Run the desired command inside the Docker container, for example:

$DOCKER_RUN ls $DATA_DIR
$DOCKER_RUN python pipelines/pipeline_navvis_rig.py --help

The DOCKER_RUN variable is a prefix to the command you would run if the code were installed locally, so that the command runs inside the Docker container instead. DATA_DIR is mounted as a volume at the same path as on the local machine, and the container runs with the local user and group IDs so that output files are not owned by root.
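
When commands are launched from a Python script rather than a shell, the same prefix can be assembled as an argument list. This is a minimal sketch: `docker_run_prefix` and `wrap` are hypothetical helper names, and the flags are copied from the DOCKER_RUN definition above. The block only prints the resulting command and does not invoke Docker.

```python
import shlex
from typing import List

# Image name taken from the instructions above; use "lamar:lamar" for a local build.
DEFAULT_IMAGE = "ghcr.io/microsoft/lamar-benchmark/lamar:latest"


def docker_run_prefix(data_dir: str, uid: int, gid: int,
                      image: str = DEFAULT_IMAGE) -> List[str]:
    """Return the argv prefix equivalent to the DOCKER_RUN variable."""
    return [
        "docker", "run", "-it", "--rm", "--init",
        "-u", f"{uid}:{gid}",            # run as the local user, not root
        "-v", f"{data_dir}:{data_dir}",  # mount the data at the same path
        image,
    ]


def wrap(command: List[str], data_dir: str, uid: int, gid: int) -> List[str]:
    """Prefix a local command so that it runs inside the container."""
    return docker_run_prefix(data_dir, uid, gid) + command


# On Linux, os.getuid() and os.getgid() give the values that $(id -u):$(id -g) expands to.
argv = wrap(["python", "pipelines/pipeline_navvis_rig.py", "--help"],
            "/path/to/data", 1000, 1000)
print(shlex.join(argv))
```

Note that `-it` expects an interactive terminal, so it may need to be dropped when the command is started from a non-interactive job.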

Benchmark

:one: Obtain the evaluation data: visit the dataset page and place the 3 scenes in ./data:

data/
├── CAB/
│   └── sessions/
│       ├── map/                # mapping session
│       ├── query_hololens/     # HoloLens test queries
│       ├── query_phone/        # Phone test queries
│       ├── query_val_hololens/ # HoloLens validation queries
│       └── query_val_phone/    # Phone validation queries
├── HGE/
│   └── ...
└── LIN/
    └── ...

Each scene contains a mapping session and queries for each device type. We provide a small set of validation queries with known ground-truth poses, which can be used for developing algorithms and tuning parameters. The ground-truth poses of the test queries are kept private.
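
Before launching an evaluation, it can help to check that the data was unpacked into the expected layout. The sketch below is one possible check, assuming that HGE and LIN contain the same five session folders as shown for CAB (the tree above elides them); `missing_sessions` is a hypothetical helper, not part of the codebase. It is demonstrated on a temporary directory rather than on the real dataset.

```python
import tempfile
from pathlib import Path
from typing import Dict, List

SCENES = ("CAB", "HGE", "LIN")
# Assumption: every scene uses the session names listed for CAB.
SESSIONS = ("map", "query_hololens", "query_phone",
            "query_val_hololens", "query_val_phone")


def missing_sessions(data_dir: Path) -> Dict[str, List[str]]:
    """Map each scene to the session folders that are absent under data_dir."""
    missing = {}
    for scene in SCENES:
        absent = [s for s in SESSIONS
                  if not (data_dir / scene / "sessions" / s).is_dir()]
        if absent:
            missing[scene] = absent
    return missing


# Demonstration on a temporary directory in which only CAB/sessions/map exists.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "CAB" / "sessions" / "map").mkdir(parents=True)
    report = missing_sessions(root)
    print(report["CAB"])
```

Calling `missing_sessions(Path("data"))` from the repository root returns an empty dictionary when all folders are in place.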

:two: Run the single-frame evaluation with the strongest baseline:

python -m lamar.run \
	--scene $SCENE --ref_id map --query_id $QUERY_ID \
	--retrieval fusion --feature superpoint --matcher superglue

where $SCENE is in {CAB,HGE,LIN} and $QUERY_ID is in {query_phone,query_hololens} for testing and in {query_val_phone,query_val_hololens} for validation. All outputs are written to ./outputs/ by default. For example, to localize validation Phone queries in the CAB scene:

python -m lamar.run \
	--scene CAB --ref_id map --query_id query_val_phone \
	--retrieval fusion --feature superpoint --matcher superglue

This executes two steps:

Create a sparse 3D map using the mapping session via feature extraction, pair selection, feature matching, triangulation
Localize each image of the sequence via feature extraction, pair selection, feature matching, absolute pose estimation
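
To evaluate the baseline on every scene and validation query set, the command above can be generated in a loop. The following sketch only builds and prints the argument lists; `lamar_command` is a hypothetical helper, and the optional flags are the ones listed under "Other evaluation options".

```python
import itertools
import shlex
from typing import List, Optional

SCENES = ("CAB", "HGE", "LIN")
VAL_QUERIES = ("query_val_phone", "query_val_hololens")


def lamar_command(scene: str, query_id: str, use_radios: bool = False,
                  sequence_length_seconds: Optional[int] = None) -> List[str]:
    """Build the argv for one run of the strongest baseline."""
    cmd = [
        "python", "-m", "lamar.run",
        "--scene", scene, "--ref_id", "map", "--query_id", query_id,
        "--retrieval", "fusion", "--feature", "superpoint", "--matcher", "superglue",
    ]
    if use_radios:
        cmd.append("--use_radios")
    if sequence_length_seconds is not None:
        cmd += ["--sequence_length_seconds", str(sequence_length_seconds)]
    return cmd


# One command per (scene, validation query set) combination.
commands = [lamar_command(scene, query)
            for scene, query in itertools.product(SCENES, VAL_QUERIES)]
for cmd in commands:
    print(shlex.join(cmd))
```

Each list can then be passed to `subprocess.run(cmd, check=True)` from the repository root, or the printed lines can be pasted into a shell.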

:three: Obtain the evaluation results:

validation queries: the script prints the localization recall.
test queries: until the benchmark leaderboard is up and running, please send the predicted pose files to <a href="mailto:lamar-benchmark@sympa.ethz.ch">lamar-benchmark@sympa.ethz.ch</a> :warning: we accept at most 2 submissions per user per week.
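
Localization recall is the fraction of queries whose estimated pose falls within a given rotation and translation error of the ground truth. The sketch below illustrates the computation on made-up errors. The two threshold pairs, (1 degree, 0.1 m) and (5 degrees, 1 m), are assumptions chosen for illustration and are not necessarily the ones reported by the script; `localization_recall` is a hypothetical function name.

```python
from typing import Iterable, Tuple


def localization_recall(errors: Iterable[Tuple[float, float]],
                        max_rot_deg: float, max_trans_m: float) -> float:
    """Fraction of queries with rotation AND translation error within the thresholds."""
    errors = list(errors)
    if not errors:
        return 0.0
    hits = sum(1 for rot, trans in errors
               if rot <= max_rot_deg and trans <= max_trans_m)
    return hits / len(errors)


# Made-up (rotation error in degrees, translation error in meters) per query.
errors = [(0.5, 0.05), (2.0, 0.40), (0.8, 0.30), (10.0, 3.00)]
fine = localization_recall(errors, max_rot_deg=1.0, max_trans_m=0.1)
coarse = localization_recall(errors, max_rot_deg=5.0, max_trans_m=1.0)
print(f"recall fine: {fine:.2f}, coarse: {coarse:.2f}")
```

A query counts as localized only if both errors are below their thresholds, which is why the third query passes the coarse test but fails the fine one.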

:four: Workflow: the benchmarking pipeline is designed such that

the mapping and localization process is split into modular steps listed in lamar/tasks/
outputs like features and matches are cached and re-used over multiple similar runs
changing a configuration entry automatically triggers the recomputation of all downstream steps that depend on it
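
One common way to obtain this behavior is to key each cached output by a hash of the step's configuration chained with the key of the step it depends on. The sketch below illustrates the idea on a linear toy pipeline. It is not the lamar implementation: `ToyPipeline`, `step_key`, and the `ransac_px` entry are hypothetical.

```python
import hashlib
import json
from typing import Dict, List, Optional, Tuple


def step_key(name: str, config: Dict, parent_key: Optional[str] = None) -> str:
    """Hash a step's name, its config, and the key of the step it depends on."""
    payload = json.dumps({"name": name, "config": config, "parent": parent_key},
                         sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()[:12]


class ToyPipeline:
    """Hypothetical linear pipeline; NOT the lamar implementation."""

    def __init__(self) -> None:
        self.cache: Dict[str, str] = {}  # key -> cached output
        self.executed: List[str] = []    # names of steps actually recomputed

    def run(self, steps: List[Tuple[str, Dict]]) -> Optional[str]:
        parent = None
        for name, config in steps:
            key = step_key(name, config, parent)
            if key not in self.cache:    # cache miss: (re)compute this step
                self.executed.append(name)
                self.cache[key] = f"{name}-output"
            parent = key                 # downstream keys depend on this one
        return parent


pipeline = ToyPipeline()
steps = [("extraction", {"feature": "superpoint"}),
         ("matching", {"matcher": "superglue"}),
         ("pose_estimation", {"ransac_px": 12})]
pipeline.run(steps)
first_run = list(pipeline.executed)

pipeline.executed.clear()
pipeline.run(steps)                      # identical config: everything is cached
second_run = list(pipeline.executed)

pipeline.executed.clear()
steps[1] = ("matching", {"matcher": "mymatcher"})  # change one entry
pipeline.run(steps)
third_run = list(pipeline.executed)
print(first_run, second_run, third_run)
```

Because each key includes its parent's key, changing the matcher invalidates matching and pose estimation while the extracted features are reused.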

Other evaluation options

<details> <summary>[Click to expand]</summary>

Using radio signals for place recognition:

python -m lamar.run [...] --use_radios

Localization with sequences of 10 seconds instead of single images:

python -m lamar.run [...] --sequence_length_seconds 10

</details>

Adding your own algorithms

<details> <summary>[Click to expand]</summary>

To add a new local feature:

add your feature extractor to hloc in hloc/extractors/myfeature.py
create a configuration entry in lamar.tasks.feature_extraction.FeatureExtraction.methods

To add a new global feature for image retrieval:

add your feature extractor to hloc in hloc/extractors/myfeature.py
create a configuration entry in lamar.tasks.feature_extraction.RetrievalFeatureExtraction.methods

To add a new local feature matcher:

add your feature matcher to hloc in hloc/matchers/mymatcher.py
create a configuration entry in lamar.tasks.feature_matching.FeatureMatching.methods

To add a new pose solver: create a new class that inherits from lamar.tasks.pose_estimation.SingleImagePoseEstimation:

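from lamar.tasks.pose_estimation import SingleImagePoseEstimation

class MyPoseEstimation(SingleImagePoseEstimation):
    method = {'name': 'my_estimator'}  # configuration entry identifying the solver
    def run(self, capture):
        ...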

</details>

Processing pipeline

Each step of the pipeline corresponds to a runfile in scantools/run_*.py that can be used as follows:

executed from the command line: python -m scantools.run_phone_to_capture [--args]
imported as a library:

from scantools import run_phone_to_capture
run_phone_to_capture.run(...)

We provide pipeline scripts that execute all necessary steps:

pipelines/pipeline_scans.py aligns multiple NavVis sessions and merges them into a unique reference session
pipelines/pipeline_sequence.py aligns all AR sequences to the reference session

The raw data is available (see the Raw data section), so anyone can run the processing pipeline without access to capture devices.

Here are runfiles that could be handy for importing and exporting data:

run_phone_to_capture: convert a ScanCapture recording into a Capture session
run_navvis_to_capture: convert a NavVis recording into a Capture session
run_session_to_kapture: convert a Capture session into a Kapture instance
run_capture_to_empty_colmap: convert a Capture session into an empty COLMAP model
run_image_anonymization: anonymize faces and license plates using the Brighter.AI API
run_radio_anonymization: anonymize radio signal IDs
run_combine_sequences: combine multiple sequence sessions into a single session
run_qrcode_detection: detect QR codes in images and store their poses

Raw data

We also release the raw original data, as recorded by the devices (HoloLens, phones, NavVis scanner), with minimal post-processing. Like the evaluation data, the raw data is accessed through the dataset page. More details are provided in the specification document RAW-DATA.md.

Release plan

We are still in the process of fully releasing LaMAR. Here is the release plan:

BibTeX citation

Please consider citing our work if you use any code from this repo or ideas presented in the paper:

@inproceedings{sarlin2022lamar,
  author    = {Paul-Edouard Sarlin and
               Mihai Dusmanu and
               Johannes L. Schönberger and
               Pablo Speciale and
               Lukas Gruber and
               Viktor Larsson and
               Ondrej Miksik and
               Marc Pollefeys},
  title     = {{LaMAR: Benchmarking Localization and Mapping for Augmented Reality}},
  booktitle = {ECCV},
  year      = {2022},
}

Legal Notices

Microsoft and any contributors grant you a license to the Microsoft documentation and other content in this repository under the Creative Commons Attribution 4.0 International Public License, see the LICENSE file, and grant you a license to any code in the repository under the MIT License, see the LICENSE-CODE file.

Microsoft, Windows, Microsoft Azure and/or other Microsoft products and services referenced in the documentation may be either trademarks or registered trademarks of Microsoft in the United States and/or other countries. The licenses for this project do not grant you rights to use any Microsoft names, logos, or trademarks. Microsoft's general trademark guidelines can be found at http://go.microsoft.com/fwlink/?LinkID=254653.

Privacy information can be found at https://privacy.microsoft.com/en-us/

Microsoft and any contributors reserve all other rights, whether under their respective copyrights, patents, or trademarks, whether by implication, estoppel or otherwise.