hloc - the hierarchical localization toolbox

This is hloc, a modular toolbox for state-of-the-art 6-DoF visual localization. It implements Hierarchical Localization, leveraging image retrieval and feature matching, and is fast, accurate, and scalable. This codebase combines and makes easily accessible years of research on image matching and Structure-from-Motion.

With hloc, you can build 3D maps of new scenes with Structure-from-Motion, localize query images against these maps, and evaluate the results on standard benchmarks.

<p align="center"> <a href="https://arxiv.org/abs/1812.03506"><img src="doc/hloc.png" width="60%"/></a> <br /><em>Hierarchical Localization uses both image retrieval and feature matching</em> </p>

Quick start ➡️ Open In Colab

Build 3D maps with Structure-from-Motion and localize any Internet image right from your browser! You can now run hloc and COLMAP in Google Colab with GPU for free. The notebook demo.ipynb shows how to run SfM and localization in just a few steps. Try it with your own data and let us know!

Installation

hloc requires Python >=3.7 and PyTorch >=1.1. Installing the package locally pulls the other dependencies:

git clone --recursive https://github.com/cvg/Hierarchical-Localization/
cd Hierarchical-Localization/
python -m pip install -e .

All dependencies are listed in requirements.txt. Starting with hloc-v1.3, installing COLMAP is not required anymore. This repository includes external local features as git submodules – don't forget to pull submodules with git submodule update --init --recursive.

We also provide a Docker image:

docker build -t hloc:latest .
docker run -it --rm -p 8888:8888 hloc:latest  # for GPU support, add `--runtime=nvidia`
jupyter notebook --ip 0.0.0.0 --port 8888 --no-browser --allow-root

General pipeline

The toolbox is composed of scripts, which roughly perform the following steps:

  1. Extract local features, like SuperPoint or DISK, for all database and query images
  2. Build a reference 3D SfM model
    1. Find covisible database images, with retrieval or a prior SfM model
    2. Match these database pairs with SuperGlue or the faster LightGlue
    3. Triangulate a new SfM model with COLMAP
  3. Find database images relevant to each query, using retrieval
  4. Match the query images
  5. Run the localization
  6. Visualize and debug

The localization can then be evaluated on visuallocalization.net for the supported datasets. When 3D lidar scans are available, such as for the indoor dataset InLoc, step 2 can be skipped.
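Inside hloc, the pose estimation of step 5 is delegated to COLMAP's PnP+RANSAC solver. As a rough illustration of the underlying geometry only (not hloc's actual solver), a direct linear transform (DLT) can recover a projection matrix from 2D-3D correspondences; everything below is synthetic:

```python
import numpy as np

def dlt_pose(points3d, points2d):
    """Estimate a 3x4 projection matrix from >=6 2D-3D correspondences (DLT)."""
    n = points3d.shape[0]
    Xh = np.hstack([points3d, np.ones((n, 1))])  # homogeneous 3D points
    A = []
    for (u, v), X in zip(points2d, Xh):
        # Each correspondence contributes two linear constraints on P.
        A.append(np.concatenate([X, np.zeros(4), -u * X]))
        A.append(np.concatenate([np.zeros(4), X, -v * X]))
    _, _, Vt = np.linalg.svd(np.asarray(A))
    return Vt[-1].reshape(3, 4)  # null vector of A = flattened P (up to scale)

def project(P, points3d):
    Xh = np.hstack([points3d, np.ones((len(points3d), 1))])
    x = (P @ Xh.T).T
    return x[:, :2] / x[:, 2:3]

# Synthetic check: project random points with a known camera, then recover it.
rng = np.random.default_rng(0)
K = np.array([[500.0, 0, 320], [0, 500.0, 240], [0, 0, 1]])
Rt = np.hstack([np.eye(3), np.array([[0.1], [0.2], [4.0]])])  # camera pose
P_true = K @ Rt
pts3d = rng.uniform(-1, 1, (20, 3))
pts2d = project(P_true, pts3d)
P_est = dlt_pose(pts3d, pts2d)
err = np.abs(project(P_est, pts3d) - pts2d).max()  # tiny on noise-free data
```

A real localizer instead runs a calibrated minimal solver (P3P) inside RANSAC to reject the outlier matches that feature matching inevitably produces.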

Structure of the toolbox:

hloc can be imported as an external package with import hloc or called from the command line with:

python -m hloc.name_of_script --arg1 --arg2

Tasks

We provide step-by-step guides to localize with Aachen, InLoc, and to generate reference poses for your own data using SfM. Just download the datasets and you're ready to go!

Aachen – outdoor localization

Have a look at pipeline_Aachen.ipynb for a step-by-step guide on localizing with Aachen. Play with the visualization, try new local features or matchers, and have fun! Don't like notebooks? You can also run all scripts from the command line.

<p align="center"> <a href="https://nbviewer.jupyter.org/github/cvg/Hierarchical-Localization/blob/master/pipeline_Aachen.ipynb"><img src="doc/loc_aachen.svg" width="70%"/></a> </p>

InLoc – indoor localization

The notebook pipeline_InLoc.ipynb shows the steps for localizing with InLoc. It's much simpler since a 3D SfM model is not needed.

<p align="center"> <a href="https://nbviewer.jupyter.org/github/cvg/Hierarchical-Localization/blob/master/pipeline_InLoc.ipynb"><img src="doc/loc_inloc.svg" width="70%"/></a> </p>

SfM reconstruction from scratch

We show in pipeline_SfM.ipynb how to run 3D reconstruction for an unordered set of images. This generates reference poses, and a nice sparse 3D model suitable for localization with the same pipeline as Aachen.

Results

Using NetVLAD for retrieval, we obtain the following best results:

| Methods | Aachen day | Aachen night | Retrieval |
| --- | --- | --- | --- |
| SuperPoint + SuperGlue | 89.6 / 95.4 / 98.8 | 86.7 / 93.9 / 100 | NetVLAD top 50 |
| SuperPoint + NN | 85.4 / 93.3 / 97.2 | 75.5 / 86.7 / 92.9 | NetVLAD top 30 |
| D2Net (SS) + NN | 84.6 / 91.4 / 97.1 | 83.7 / 90.8 / 100 | NetVLAD top 30 |

| Methods | InLoc DUC1 | InLoc DUC2 | Retrieval |
| --- | --- | --- | --- |
| SuperPoint + SuperGlue | 46.5 / 65.7 / 78.3 | 52.7 / 72.5 / 79.4 | NetVLAD top 40 |
| SuperPoint + SuperGlue (temporal) | 49.0 / 68.7 / 80.8 | 53.4 / 77.1 / 82.4 | NetVLAD top 40 |
| SuperPoint + NN | 39.9 / 55.6 / 67.2 | 37.4 / 57.3 / 70.2 | NetVLAD top 20 |
| D2Net (SS) + NN | 39.9 / 57.6 / 67.2 | 36.6 / 53.4 / 61.8 | NetVLAD top 20 |

Check out visuallocalization.net/benchmark for more details and additional baselines.

Supported datasets

We provide in hloc/pipelines/ scripts to run the reconstruction and the localization on the following datasets: Aachen Day-Night (v1.0 and v1.1), InLoc, Extended CMU Seasons, RobotCar Seasons, 4Seasons, Cambridge Landmarks, and 7-Scenes. For example, after downloading the dataset with the instructions given here, we can run the Aachen Day-Night pipeline with SuperPoint+SuperGlue using the command:

python -m hloc.pipelines.Aachen.pipeline [--outputs ./outputs/aachen]

BibTeX Citation

If you report any of the above results in a publication, or use any of the tools provided here, please consider citing both the Hierarchical Localization and SuperGlue papers:

@inproceedings{sarlin2019coarse,
  title     = {From Coarse to Fine: Robust Hierarchical Localization at Large Scale},
  author    = {Paul-Edouard Sarlin and
               Cesar Cadena and
               Roland Siegwart and
               Marcin Dymczyk},
  booktitle = {CVPR},
  year      = {2019}
}

@inproceedings{sarlin2020superglue,
  title     = {{SuperGlue}: Learning Feature Matching with Graph Neural Networks},
  author    = {Paul-Edouard Sarlin and
               Daniel DeTone and
               Tomasz Malisiewicz and
               Andrew Rabinovich},
  booktitle = {CVPR},
  year      = {2020},
}

Going further

Debugging and Visualization

<details> <summary>[Click to expand]</summary>

Each localization run generates a pickle log file. For each query, it contains the selected database images, their matches, and information from the pose solver, such as RANSAC inliers. It can thus be parsed to gather statistics and analyze failure modes or difficult scenarios.
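For instance, a log can be loaded with pickle and scanned for failed queries or inlier counts. The dictionary below is a mock-up with a made-up schema (the exact keys depend on the hloc version), only meant to show the kind of analysis the logs enable:

```python
import io
import pickle

# Hypothetical log structure -- check your hloc version for the real schema.
logs = {"loc": {
    "query/0001.jpg": {"db": ["db/12.jpg", "db/31.jpg"],
                       "PnP_ret": {"success": True, "num_inliers": 142}},
    "query/0002.jpg": {"db": ["db/77.jpg"],
                       "PnP_ret": {"success": False, "num_inliers": 3}},
}}

# Round-trip through pickle, as the pipeline writes and reads the log file.
buf = io.BytesIO()
pickle.dump(logs, buf)
buf.seek(0)
logs = pickle.load(buf)

# Gather simple statistics: which queries failed, and the inlier distribution.
failed = [q for q, r in logs["loc"].items() if not r["PnP_ret"]["success"]]
inliers = [r["PnP_ret"]["num_inliers"] for r in logs["loc"].values()]
```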

We also provide some visualization tools in hloc/visualization.py to visualize some attributes of the 3D SfM model, such as visibility of the keypoints, their track length, or estimated sparse depth (like below).

<p align="center"> <a href="./pipeline_Aachen.ipynb"><img src="doc/depth_aachen.svg" width="60%"/></a> </p> </details>

Using your own local features or matcher

<details> <summary>[Click to expand]</summary>

If your code is based on PyTorch: simply add a new interface in hloc/extractors/ or hloc/matchers/. It needs to inherit from hloc.utils.base_model.BaseModel, take as input a data dictionary, and output a prediction dictionary. Have a look at hloc/extractors/superpoint.py for an example. You can additionally define a standard configuration in hloc/extract_features.py or hloc/match_features.py - it can then be called directly from the command line.

If your code is based on TensorFlow: you will need to either modify hloc/extract_features.py and hloc/match_features.py, or export yourself the features and matches to HDF5 files, described below.

In a feature file, each key corresponds to the relative path of an image w.r.t. the dataset root (e.g. db/1.jpg for Aachen), and has one dataset per prediction (e.g. keypoints and descriptors, with shape Nx2 and DxN).

In a match file, each key corresponds to the string path0.replace('/', '-')+'_'+path1.replace('/', '-') and has a dataset matches0 with shape N. It indicates, for each keypoint in the first image, the index of the matching keypoint in the second image, or -1 if the keypoint is unmatched.
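Putting the two conventions together, here is a small sketch of how a match-file key is formed and how matches0 is interpreted; names_to_pair below is a local stand-in for the naming rule just described:

```python
import numpy as np

def names_to_pair(path0, path1):
    # Key convention of the match file, as described above.
    return path0.replace('/', '-') + '_' + path1.replace('/', '-')

key = names_to_pair('db/1.jpg', 'query/2.jpg')  # 'db-1.jpg_query-2.jpg'

# matches0[i] is the index of the keypoint in the second image that matches
# keypoint i of the first image, or -1 if keypoint i is unmatched.
matches0 = np.array([3, -1, 0, -1, 2])
valid = matches0 > -1
pairs = np.stack([np.where(valid)[0], matches0[valid]], -1)  # (i, j) index pairs
```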

</details>

Using your own image retrieval

<details> <summary>[Click to expand]</summary>

hloc also provides an interface for image retrieval via hloc/extract_features.py. As with local features, simply add a new interface to hloc/extractors/. Alternatively, export the global descriptors into an HDF5 file, in which each key corresponds to the relative path of an image w.r.t. the dataset root and contains a dataset global_descriptor with size D. You can then export the image pairs with hloc/pairs_from_retrieval.py.
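The core of retrieval-based pair selection can be sketched in a few lines of numpy: rank database images by global-descriptor similarity and keep the top-k per query. This illustrates the idea with toy data, not hloc's exact implementation:

```python
import numpy as np

# Toy global descriptors (one row per image), L2-normalized like NetVLAD's.
rng = np.random.default_rng(0)
db = rng.normal(size=(5, 8))
db /= np.linalg.norm(db, axis=1, keepdims=True)
q = rng.normal(size=(2, 8))
q /= np.linalg.norm(q, axis=1, keepdims=True)

sim = q @ db.T                          # cosine similarity, queries x database
topk = np.argsort(-sim, axis=1)[:, :3]  # 3 most similar db images per query
pairs = [(qi, int(di)) for qi in range(len(q)) for di in topk[qi]]
```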

</details>

Reconstruction with known camera parameters

<details> <summary>[Click to expand]</summary>

If the calibration of the camera is known, for example from an external calibration system, you can tell hloc to use these parameters instead of estimating them from EXIF. The camera model names and their parameters are defined by COLMAP. Python API:

opts = dict(camera_model='SIMPLE_RADIAL', camera_params=','.join(map(str, (f, cx, cy, k))))
model = reconstruction.main(..., image_options=opts)

Command-line interface:

python -m hloc.reconstruction [...] --image_options camera_model='"SIMPLE_RADIAL"' camera_params='"256,256,256,0"'

By default, hloc refines the camera parameters during the reconstruction process. To prevent this, add the corresponding options. Python API:

reconstruction.main(..., mapper_options=dict(ba_refine_focal_length=False, ba_refine_extra_params=False))

Command-line interface:

python -m hloc.reconstruction [...] --mapper_options ba_refine_focal_length=False ba_refine_extra_params=False
</details>

Versions

<details> <summary>v1.4 (July 2023)</summary> </details>

<details> <summary>v1.3 (January 2022)</summary> </details>

<details> <summary>v1.2 (December 2021)</summary> </details>

<details> <summary>v1.1 (July 2021)</summary> </details>

<details> <summary>v1.0 (July 2020)</summary>

Initial public version.

</details>

Contributions welcome!

External contributions are very much welcome. Please follow the PEP8 style guidelines using a linter like flake8.

Created and maintained by Paul-Edouard Sarlin with the help of many contributors.