R2D2: Repeatable and Reliable Detector and Descriptor

This repository contains the implementation of the following paper:

@inproceedings{r2d2,
  author    = {Jerome Revaud and Philippe Weinzaepfel and C{\'{e}}sar Roberto de Souza and
               Martin Humenberger},
  title     = {{R2D2:} Repeatable and Reliable Detector and Descriptor},
  booktitle = {NeurIPS},
  year      = {2019},
}

Fast-R2D2

This repository also contains the code needed to train and extract Fast-R2D2 keypoints. Fast-R2D2 is a revised version of R2D2 that is significantly faster and uses less memory, yet achieves precision of the same order as the original network.

License

Our code is released under the Creative Commons BY-NC-SA 3.0 (see LICENSE for more details), available only for non-commercial use.

Getting started

You just need Python 3.6+ equipped with standard scientific packages and PyTorch 1.1+. Typically, conda is one of the easiest ways to get started:

conda install python tqdm pillow numpy matplotlib scipy
conda install pytorch torchvision cudatoolkit=10.1 -c pytorch

Pretrained models

For your convenience, we provide five pre-trained models in the models/ folder:

For more details about the training data, see the dedicated section below. Here is a table that summarizes the performance of each model:

| model name | model size<br>(#weights) | number of<br>keypoints | MMA@3 on<br>HPatches |
|---|---|---|---|
| `r2d2_WAF_N16.pt` | 0.5M | 5K | 0.686 |
| `r2d2_WASF_N16.pt` | 0.5M | 5K | 0.721 |
| `r2d2_WASF_N8_big.pt` | 1.0M | 10K | 0.692 |
| `faster2d2_WASF_N8_big.pt` | 1.0M | 5K | 0.650 |
<!--|`r2d2_WASF_N8_big.pt`| 1.0M | 5K | 0.704 |-->

Feature extraction

To extract keypoints for a given image, simply execute:

python extract.py --model models/r2d2_WASF_N16.pt --images imgs/brooklyn.png --top-k 5000

This also works for multiple images (separated by spaces) or a .txt image list. For each image, this will save the top-k keypoints in a file with the same path as the image and a .r2d2 extension. For example, they will be saved in imgs/brooklyn.png.r2d2 for the sample command above.

The keypoint file is in the npz numpy format and contains 3 fields:

- `keypoints` (N x 3): keypoint positions (x, y and scale), where the scale is the patch diameter in pixels;
- `descriptors` (N x 128): L2-normalized descriptors;
- `scores` (N): keypoint reliability scores.
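A minimal sketch of reading such a keypoint file with NumPy, assuming the fields `keypoints`, `descriptors` and `scores` (check `extract.py` for the exact keys); a synthetic file is written first so the snippet is self-contained:

```python
import numpy as np

# Write a synthetic keypoint file in the assumed .r2d2 npz layout.
# (A file object avoids np.savez appending a .npz suffix to the name.)
N = 10
with open('example.r2d2', 'wb') as fh:
    np.savez(fh,
             keypoints=np.random.rand(N, 3).astype(np.float32),     # x, y, scale
             descriptors=np.random.rand(N, 128).astype(np.float32), # L2-normalized in real output
             scores=np.random.rand(N).astype(np.float32))           # reliability scores

# Load it back the same way an extract.py output file would be loaded.
data = np.load('example.r2d2')
print(data['keypoints'].shape, data['descriptors'].shape, data['scores'].shape)
```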

Note: You can modify the extraction parameters (scale factor, scale range...). Run python extract.py --help for more information. By default, they correspond to what is used in the paper, i.e., a scale factor equal to 2^0.25 (--scale-f 1.189207) and an image size in the range [256, 1024] (--min-size 256 --max-size 1024).

Note 2: You can significantly improve the MMA@3 score (by ~4 points) if you can afford more computation. To do so, simply raise the upper limit on the scale range by replacing --min-size 256 --max-size 1024 with --min-size 0 --max-size 9999 --min-scale 0.3 --max-scale 1.0.
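As a quick sanity check on the default settings, the number of pyramid steps implied by a scale factor of 2^0.25 over the [256, 1024] size range can be computed directly (illustrative arithmetic only, not code from this repository):

```python
import math

# Each pyramid level is 2**0.25 smaller than the previous one,
# and image sizes span [256, 1024] pixels by default.
scale_f = 2 ** 0.25
n_steps = math.log(1024 / 256) / math.log(scale_f)
print(n_steps)  # 8.0 -> eight downscaling steps from 1024 to 256
```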

Feature extraction with kapture datasets

Kapture is a pivot file format, based on text and binary files, used to describe SFM (Structure From Motion) and more generally sensor-acquired data.

It is available at https://github.com/naver/kapture. It includes conversion tools for popular formats, and several popular datasets are directly available in kapture.

It can be installed with:

pip install kapture

Datasets can be downloaded with:

kapture_download_dataset.py update
kapture_download_dataset.py list
# e.g.: install mapping and query of Extended-CMU-Seasons_slice22
kapture_download_dataset.py install "Extended-CMU-Seasons_slice22_*"

If you want to convert your own dataset into kapture, please find some examples here.

Once installed, you can extract keypoints for your kapture dataset with:

python extract_kapture.py --model models/r2d2_WASF_N16.pt --kapture-root pathto/yourkapturedataset --top-k 5000

Run python extract_kapture.py --help for more information on the extraction parameters.

Evaluation on HPatches

The evaluation is based on the code from D2-Net.

git clone https://github.com/mihaidusmanu/d2-net.git
cd d2-net/hpatches_sequences/
bash download.sh
bash download_cache.sh
cd ../..
ln -s d2-net/hpatches_sequences # finally create a soft-link

Once this is done, extract all the features:

python extract.py --model models/r2d2_WAF_N16.pt --images d2-net/image_list_hpatches_sequences.txt

Finally, evaluate using the IPython notebook d2-net/hpatches_sequences/HPatches-Sequences-Matching-Benchmark.ipynb. You should normally obtain an MMA plot matching the results summarized below.

New: we have uploaded some pre-computed plots to the results/ folder; you can visualize them using the aforementioned IPython notebook from d2-net (you need to place them in the d2-net/hpatches_sequences/cache/ folder).

Here is a summary of the results:

| result file | training set | resolution | MMA@3 on<br>HPatches | note |
|---|---|---|---|---|
| `r2d2_W_N16.scale-0.3-1.npy` | W only | full | 0.699 | no annotation whatsoever |
| `r2d2_WAF_N16.size-256-1024.npy` | W+A+F | 1024 px | 0.686 | as in NeurIPS paper |
| `r2d2_WAF_N16.scale-0.3-1.npy` | W+A+F | full | 0.718 | +3.2% just from resolution |
| `r2d2_WASF_N16.size-256-1024.npy` | W+A+S+F | 1024 px | 0.721 | with style transfer |
| `r2d2_WASF_N16.scale-0.3-1.npy` | W+A+S+F | full | 0.758 | +3.7% just from resolution |
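For reference, MMA@t is the mean matching accuracy at a pixel error threshold t: the fraction of matches whose reprojection error is below t pixels, averaged over image pairs. A toy illustration of the per-pair computation (not the d2-net evaluation code):

```python
import numpy as np

def mma_at(errors, t):
    """Fraction of matches whose reprojection error is <= t pixels."""
    errors = np.asarray(errors, dtype=float)
    return float((errors <= t).mean())

# Hypothetical per-match pixel errors for one image pair:
errs = [0.5, 1.2, 2.9, 4.0, 10.0]
print(mma_at(errs, 3))  # 3 of 5 matches within 3 px -> 0.6
```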

Evaluation on visuallocalization.net

In our paper, we report visual localization results on the Aachen Day-Night dataset (nighttime images), available at visuallocalization.net. We used the local feature evaluation pipeline provided at https://github.com/tsattler/visuallocalizationbenchmark/tree/master/local_feature_evaluation. In the meantime, the ground-truth poses and the error thresholds of the Aachen nighttime images (which are used for the local feature evaluation) have been improved and changed on the website; thus, the original results reported in the paper cannot be reproduced.

Training the model

We provide all the code and data to retrain the model as described in the paper.

Downloading training data

The first step is to download the training data. First, create a folder that will host all data in a place where you have sufficient disk space (15 GB required).

DATA_ROOT=/path/to/data
mkdir -p $DATA_ROOT
ln -fs $DATA_ROOT data 
mkdir $DATA_ROOT/aachen

Then, manually download the Aachen dataset here and save it as $DATA_ROOT/aachen/database_and_query_images.zip. Finally, execute the download script to complete the installation. It will download the remaining training data and will extract all files properly.

./download_training_data.sh

The following datasets are now installed:

| full name | tag | Disk | # imgs | # pairs | python instance |
|---|---|---|---|---|---|
| Random Web images | W | 2.7GB | 3125 | 3125 | `auto_pairs(web_images)` |
| Aachen DB images | A | 2.5GB | 4479 | 4479 | `auto_pairs(aachen_db_images)` |
| Aachen style transfer pairs | S | 0.3GB | 8115 | 3636 | `aachen_style_transfer_pairs` |
| Aachen optical flow pairs | F | 2.9GB | 4479 | 4770 | `aachen_flow_pairs` |

Note that you can visualize the content of each dataset using the following command:

python -m tools.dataloader "PairLoader(aachen_flow_pairs)"


Training details

To train the model, simply run this command:

python train.py --save-path /path/to/model.pt 

On a recent GPU, it takes 30 min per epoch, so ~12h for 25 epochs. You should get a model that scores 0.71 +/- 0.01 in MMA@3 on HPatches (this standard-deviation is similar to what is reported in Table 1 of the paper).

If you want to retrain fast-r2d2 architectures, run:

python train.py --save-path /path/to/fast-model.pt --net 'Fast_Quad_L2Net_ConfCFS()'

Note that you can fully configure the training (i.e. select the data sources, change the batch size, learning rate, number of epochs etc.). One easy way to improve the model is to train for more epochs, e.g. --epochs 50. For more details about all parameters, run python train.py --help.