VOLDOR-SLAM

Author: Zhixiang Min
Contact: zmin1@stevens.edu

VOLDOR-SLAM is a real-time, dense-indirect SLAM system that takes dense optical flow as input and supports monocular, stereo, and RGB-D video sequences. The system logic is written in native Python, which makes it easy to get started with and to modify.

<img src="./teaser1.jpg" width=70% /> <img src="./teaser2.jpg" width=27% />

Related Publications

VOLDOR-SLAM: For the times when feature-based or direct methods are not good enough
Zhixiang Min, Enrique Dunn
ICRA 2021 [paper] [video]

VOLDOR: Visual Odometry from Log-logistic Dense Optical flow Residuals
Zhixiang Min, Yiding Yang, Enrique Dunn
CVPR 2020 [paper] [video]

Installation

Our system is built with CUDA, Cython, and Python. We support compatibility under the following configurations:

Windows 10 + Visual Studio 2017  /  Ubuntu 18.04
CUDA >= 9.0
Python 3.6.X
OpenCV 3.4.X
Ceres 2.0    (Optional for mapping)
pyDBoW3   (Optional for loop closure)

Python Packages

Prebuilt Binaries

Windows

Ubuntu

Prepare Data

Our method takes three inputs. You may download the demo data to run and experiment with the system.

Optical flows are REQUIRED for visual odometry. We recommend MaskFlowNet or PWC-Net as the optical flow estimator.
Supported format: .flo
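The .flo container is the standard Middlebury optical-flow format. A minimal NumPy reader, as a sketch assuming the usual layout (a float32 magic number 202021.25, then int32 width and height, then H×W×2 float32 values), looks like:

```python
import numpy as np

def read_flo(path):
    # Read a Middlebury-format .flo optical flow file into an (H, W, 2) array.
    with open(path, 'rb') as f:
        magic = np.fromfile(f, np.float32, count=1)[0]
        assert magic == 202021.25, 'invalid .flo magic number'
        w = int(np.fromfile(f, np.int32, count=1)[0])
        h = int(np.fromfile(f, np.int32, count=1)[0])
        data = np.fromfile(f, np.float32, count=2 * w * h)
    return data.reshape(h, w, 2)  # flow[y, x] = (u, v) in pixels
```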

Disparity maps are an OPTIONAL input, provided only when stereo pairs or depth sensors are available. For stereo pairs, we recommend running the same optical flow estimator from the left image to the right image. For depth sensors, you need to set a virtual baseline value to convert the depth to disparity as input.
Supported format: .flo, .png (x256 gray-16bit)
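Both supported disparity encodings reduce to a float disparity in pixels. A hedged sketch of the two conversions: the x256 gray-16bit convention divides the raw 16-bit value by 256 (the raw array would come from e.g. `cv2.imread(path, cv2.IMREAD_UNCHANGED)`), and the depth-sensor case uses the virtual basefocal `bf` (baseline in meters times focal in pixels, matching the `--bf` flag). The helper names here are illustrative, not part of the released code.

```python
import numpy as np

def png16_to_disparity(raw_uint16):
    # x256 gray-16bit convention: disparity in pixels = raw value / 256
    return raw_uint16.astype(np.float32) / 256.0

def depth_to_disparity(depth_m, bf, eps=1e-6):
    # disparity (px) = baseline (m) * focal (px) / depth (m) = bf / depth
    return bf / np.maximum(depth_m, eps)
```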

RGB images are an OPTIONAL input that adds a photo-consistency term to enhance frame alignment.
Supported format: all image formats supported by OpenCV

Running

<pre>
cd demo
python demo.py --help
python demo.py \
    --fx 320 --fy 320 --cx 320 --cy 240 --bf 80 \
    --flow_dir 'path_to_flow' \
    --img_dir 'path_to_img' \
    --disp_dir 'path_to_disp' \
    --mode stereo \
    --enable_mapping \
    --enable_loop_closure './ORBvoc.bin' \
    --save_poses './poses.txt' \
    --save_depths './depths'
</pre>

More details can be found in the comments of the demo script. In the demo data, we also provide bash commands in run_demo.txt under each data instance folder.

Tips

Viewer Control Manual

<pre>
q   -> Exit
m   -> Save point cloud to .ply
r   -> Reset view
h   -> Hide cameras / links
f   -> Follow the current camera
w/s -> Increase/Decrease scene pixel size
d/a -> Increase/Decrease scene pixel density
x/z -> Increase/Decrease scene pixel depth range
Mouse scroll up/down       -> Zoom in/out
Mouse drag with left down  -> Scene rotation
Mouse drag with right down -> Scene translation
</pre>

Viewer Range. When running with monocular capture, if the world scale happens to be initialized too large, the viewer may display only a few pixels; adjust the depth range with the x/z keys.

How to pick basefocal. basefocal refers to the baseline (meters) times the focal length (pixels). It determines the scene scale. If the input disparity maps come from rectified stereo, use the baseline of the stereo pair. In the case of a depth camera or monocular input, basefocal acts as a hyper-parameter tuning confidence in the input depth. A proper basefocal value should make the displayed keyframe depth map (tmpkf_depth) show a proper variance (neither pure white nor pure black).
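As a worked example (the numbers below are illustrative, not from any particular dataset calibration): a rectified stereo rig with a 0.54 m baseline and a 718 px focal length gives

```python
baseline_m = 0.54   # stereo baseline in meters (illustrative)
focal_px = 718.0    # focal length in pixels (illustrative)
bf = baseline_m * focal_px    # basefocal, about 387.7
# Sanity check: an object at 10 m should show bf / 10, about 38.8 px disparity
disparity_at_10m = bf / 10.0
```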

Input Frame Rate. Our framework favors a modest baseline between frames so that the optical flows remain informative. For high-frame-rate videos, consider downsampling the frame rate based on how fast the appearance changes. (E.g. KITTI/TartanAir @ 10Hz, TUM-RGBD @ 3Hz.)
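A trivial sketch of that downsampling (`subsample_frames` is a hypothetical helper, not part of the released code): keep every stride-th frame so the effective rate approaches the target before computing optical flow.

```python
def subsample_frames(frame_paths, src_hz, target_hz):
    # Keep every stride-th frame, e.g. a 30 Hz capture with target 10 Hz
    # uses stride 3, giving a modest inter-frame baseline.
    stride = max(1, round(src_hz / target_hz))
    return frame_paths[::stride]
```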

I/O Bottleneck. If you observe choppy GPU load, or the latest map in the viewer is colored gray, disk I/O is probably bottlenecking performance. This affects accuracy by blocking local mapping while VO keeps running. Consider moving your data to an SSD, or restart the software a few times so that your disk caches more of the data.

Parameter Tuning. The default parameters usually work for all datasets. You may try loosening the loop closure thresholds to detect more loop closures. For very challenging datasets, if the full pipeline is not stable, try disabling local mapping. To tune VO parameters, pass them via slam.voldor_user_config; all available parameters are described in voldor/config.h. For SLAM parameters, check the __init__ method of VOLDOR_SLAM.

What is mono-scaled mode? In mono-scaled mode, the disparity map is used only for correcting the world scale.

License & Third Party Acknowledgements

The software is released under the attached LICENSE file.
Parts of the software rely on the following open source projects:
LambdaTwist, OpenCV, pyDBoW3, DBoW3, Ceres, SuiteSparse, Eigen, svd3, PyOpenGL