AtLoc: Attention Guided Camera Localization - AAAI 2020 (Oral).

Bing Wang, Changhao Chen, Chris Xiaoxuan Lu, Peijun Zhao, Niki Trigoni, and Andrew Markham

License

Licensed under the CC BY-NC-SA 4.0 license, see LICENSE.

Introduction

This is the PyTorch implementation of AtLoc, a simple and efficient neural architecture for robust visual localization.

Demos and Qualitative Results (click below for the video)

<p align="center"> <a href="https://youtu.be/_8NQXBadklU"><img src="./figures/real.gif" width="100%"></a> </p>

Setup

AtLoc uses a Conda environment that makes it easy to install all dependencies.

  1. Install miniconda with Python 2.7.

  2. Create the AtLoc Conda environment: conda env create -f environment.yml.

  3. Activate the environment: conda activate py27pt04.

  4. Note that our code has been tested with PyTorch v0.4.1 (the environment.yml file should take care of installing the appropriate version).

Data

AtLoc currently supports the 7Scenes and Oxford RobotCar datasets. To use another dataset, write your own PyTorch dataloader and put it in the data directory.
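As a starting point for a custom dataset, here is a minimal sketch of a map-style dataset class. It is not taken from the AtLoc code: the class name `PoseDataset`, the constructor arguments, and the one-line-per-image pose format (image name followed by seven floats) are all assumptions made for illustration, and the interface expected by train.py may differ.

```python
import os


class PoseDataset(object):
    """Hypothetical map-style dataset: one (image path, pose) pair per index."""

    def __init__(self, root, pose_lines, transform=None):
        self.transform = transform
        self.samples = []
        for line in pose_lines:
            fields = line.split()
            if len(fields) != 8:  # skip blank or malformed lines
                continue
            # Assumed line format: <image name> tx ty tz qw qx qy qz
            pose = [float(v) for v in fields[1:]]
            self.samples.append((os.path.join(root, fields[0]), pose))

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        path, pose = self.samples[index]
        # A real loader would decode the image here (for example with PIL)
        # and then apply self.transform; this sketch returns the path itself.
        item = path if self.transform is None else self.transform(path)
        return item, pose
```

`pose_lines` can be any iterable of strings, such as an open text file. Subclassing torch.utils.data.Dataset is the conventional choice when the class is used with a PyTorch DataLoader; the base class is left out here only to keep the sketch free of dependencies.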

Special instructions for RobotCar:

  1. Download this fork of the dataset SDK, and run cd data && ./robotcar_symlinks.sh after editing the ROBOTCAR_SDK_ROOT variable in it appropriately.

  2. For each sequence, you need to download the stereo_centre, vo and gps tar files from the dataset website. The directory for each 'scene' (e.g. loop) has .txt files defining the train/test split.

  3. To make training faster, we pre-processed the images using data/process_robotcar.py. This script undistorts the images using the camera models provided by the dataset, and scales them such that the shortest side is 256 pixels.

  4. Pixel and pose statistics must be computed before any training. Use data/dataset_mean.py, which also saves the statistics to the proper location. We provide pre-computed values for RobotCar and 7Scenes.
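The arithmetic behind steps 3 and 4 can be sketched as follows. This is an illustration only, not the code in data/process_robotcar.py or data/dataset_mean.py: the function names are hypothetical, undistortion is omitted, and which statistics the real script stores (and in what layout) is not shown here.

```python
import math


def shortest_side_size(width, height, target=256):
    """Return (new_width, new_height) so that the shorter side equals target."""
    scale = float(target) / min(width, height)
    return int(round(width * scale)), int(round(height * scale))


def mean_and_std(values):
    """Population mean and standard deviation of a sequence of numbers."""
    values = list(values)
    n = len(values)
    mean = sum(values) / float(n)
    variance = sum((v - mean) ** 2 for v in values) / n
    return mean, math.sqrt(variance)
```

For a 1280x960 input, `shortest_side_size(1280, 960)` gives `(341, 256)`, so the aspect ratio is kept up to rounding. `mean_and_std` would be applied per image channel or per pose dimension to obtain normalization constants.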

Running the code

Training

The executable script is train.py. For example:

python train.py --dataset RobotCar --scene loop --model AtLoc --gpus 0
python train.py --dataset RobotCar --scene loop --model AtLoc --lstm True --gpus 0
python train.py --dataset RobotCar --scene loop --model AtLocPlus --gamma -3.0 --gpus 0

The command-line parameters are documented in train.py, and the hyperparameter values are defined in tools/options.py.
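To illustrate how flags like the ones above are typically parsed, here is a small argparse sketch. It is a hypothetical stand-in, not the contents of tools/options.py: only the flag names come from the commands above, while the types, defaults, and the `str2bool` helper are assumptions.

```python
import argparse


def str2bool(text):
    """Turn 'True'/'False' style strings into booleans."""
    return text.lower() in ("true", "1", "yes")


def build_parser():
    parser = argparse.ArgumentParser(description="hypothetical training options")
    parser.add_argument("--dataset", default=None)
    parser.add_argument("--scene", default=None)
    parser.add_argument("--model", default="AtLoc")
    # type=bool would treat the string "False" as True, hence the converter.
    parser.add_argument("--lstm", type=str2bool, default=False)
    parser.add_argument("--gamma", type=float, default=None)
    parser.add_argument("--gpus", default="0")
    return parser


# Parse the third example command; argparse accepts -3.0 as a value here
# because no option defined on this parser looks like a negative number.
args = build_parser().parse_args(
    "--dataset RobotCar --scene loop --model AtLocPlus --gamma -3.0 --gpus 0".split()
)
```

After this runs, `args.gamma` is the float `-3.0` and `args.lstm` keeps its default of `False`.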

Inference

Trained models for some of the experiments presented in the paper can be downloaded here. The inference script is eval.py. Here are some examples, assuming the models have been downloaded to logs.

python eval.py --dataset RobotCar --scene loop --model AtLoc --gpus 0 --weights ./logs/RobotCar_loop_AtLoc_False/models/epoch_300.pth.tar

To compute the network attention visualizations and save them as a video:

python saliency_map.py --dataset RobotCar --scene loop --model AtLoc --gpus 0 --weights ./logs/RobotCar_loop_AtLoc_False/models/epoch_300.pth.tar 
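Both commands point at the same checkpoint, whose path appears to be composed of the dataset, scene, model name, and LSTM flag. The helper below is a sketch of that inferred pattern; it is an assumption based only on the single path shown above, not a function from the repository, and the real naming scheme (for example any zero-padding of the epoch number) may differ.

```python
def checkpoint_path(dataset, scene, model, lstm=False, epoch=300, logdir="./logs"):
    """Build a checkpoint path following the pattern inferred from the README."""
    # Assumed run directory name: <dataset>_<scene>_<model>_<lstm flag>
    run_name = "{}_{}_{}_{}".format(dataset, scene, model, lstm)
    filename = "epoch_{}.pth.tar".format(epoch)
    return "/".join([logdir, run_name, "models", filename])


weights = checkpoint_path("RobotCar", "loop", "AtLoc")
```

With the defaults above, `weights` matches the value passed to `--weights` in the two example commands.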

Citation

If you find this code useful for your research, please cite our paper:

@article{wang2019atloc,
  title={AtLoc: Attention Guided Camera Localization},
  author={Wang, Bing and Chen, Changhao and Lu, Chris Xiaoxuan and Zhao, Peijun and Trigoni, Niki and Markham, Andrew},
  journal={arXiv preprint arXiv:1909.03557},
  year={2019}
}

Acknowledgements

Our code partially builds on MapNet and PoseLstm.