
Code: Multi-view Stereo by Temporal Nonparametric Fusion

Yuxin Hou · Juho Kannala · Arno Solin

Code for the paper "Multi-view Stereo by Temporal Nonparametric Fusion".

Summary

We propose a novel approach to depth estimation from unstructured multi-view image-pose pairs, where the model can leverage information from previous latent-space encodings of the scene. Pairs of images and poses are passed through an encoder-decoder model for disparity estimation. The novelty lies in soft-constraining the bottleneck layer with a nonparametric Gaussian process prior.
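The temporal fusion idea can be illustrated with a small NumPy sketch: treat each bottleneck dimension as a function over time, put a GP prior on it, and replace each frame's latent code with the GP posterior mean given all frames. This is a simplified stand-in, not the repo's implementation; the paper conditions the kernel on camera-pose distance rather than plain timestamps, and the function and parameter names here (`gp_fuse_latents`, `lengthscale`, `noise`) are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(t, lengthscale=1.0):
    # Pairwise squared-exponential kernel over scalar inputs
    # (frame timestamps here; pose distances in the actual model).
    d = t[:, None] - t[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def gp_fuse_latents(latents, times, noise=0.1, lengthscale=1.0):
    """Fuse per-frame latent codes with a GP prior over time.

    latents: (T, D) array of bottleneck encodings, one row per frame
    times:   (T,) array of frame timestamps
    Returns the GP posterior mean at every frame: (T, D).
    """
    K = rbf_kernel(np.asarray(times, dtype=float), lengthscale)
    A = K + noise * np.eye(len(times))      # K + sigma^2 I
    return K @ np.linalg.solve(A, latents)  # posterior mean, all dims at once

# Toy usage: 5 frames with 4-dimensional latent codes
z = np.random.randn(5, 4)
z_fused = gp_fuse_latents(z, times=np.arange(5.0))
print(z_fused.shape)
```

The fused codes are a temporally smoothed version of the per-frame encodings, which is what lets the decoder produce temporally consistent disparity maps.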

Example depth estimation result running in real-time on an iPad.

Prerequisites

Training

As mentioned in our paper, training uses the split pretrained MVDepthNet model as a starting point. Check the link to get the pretrained model.

python train.py train_dataset_path --pretrained-dict pretrained_mvdepthnet --log-output
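Since the pretrained MVDepthNet checkpoint is a single model that gets split into encoder and decoder parts, the splitting step can be sketched with plain dictionaries. The key names below are hypothetical, not the actual checkpoint layout; the point is only the prefix-based split used when initializing the two sub-modules.

```python
# Hypothetical checkpoint keys; a real state dict maps names to tensors.
pretrained = {
    "encoder.conv1.weight": "...",
    "encoder.conv2.weight": "...",
    "decoder.up1.weight": "...",
}

def split_state_dict(state, prefix):
    # Keep only keys under `prefix`, stripping the prefix so the
    # corresponding sub-module can load the result directly.
    plen = len(prefix) + 1
    return {k[plen:]: v for k, v in state.items() if k.startswith(prefix + ".")}

encoder_sd = split_state_dict(pretrained, "encoder")
decoder_sd = split_state_dict(pretrained, "decoder")
print(sorted(encoder_sd))  # ['conv1.weight', 'conv2.weight']
```

In PyTorch the two resulting dicts would then be passed to each sub-module's `load_state_dict` before fine-tuning.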

Testing

For testing, run:

python test.py formatted_seq_path --savepath disparity.npy --encoder encoder_path --gp gp_path --decoder decoder_path

Our pretrained model can be downloaded here.
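The file written to `--savepath` is a NumPy array of predicted disparities, so it can be inspected directly. Below is a minimal sketch of loading it and converting disparity to depth (depth is the reciprocal of disparity, up to scene scale); the array used here is a toy stand-in so the snippet is self-contained, and the exact shape of the saved array is not specified by this README.

```python
import numpy as np

def disparity_to_depth(disparity, eps=1e-6):
    # Depth is the reciprocal of disparity (up to scale);
    # clip to avoid division by zero in invalid regions.
    return 1.0 / np.clip(disparity, eps, None)

# After running test.py you would load the real predictions, e.g.:
#   disp = np.load("disparity.npy")
# Toy stand-in so this snippet runs on its own:
disp = np.array([[0.5, 2.0],
                 [0.0, 1.0]])
depth = disparity_to_depth(disp)
print(depth)
```

Pixels with zero disparity are clipped to `eps`, so they map to a large finite depth instead of producing a division-by-zero warning.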

Use your own data for testing

The formatted sequence should have the following folder structure:

We also provide one example sequence: redkitchen seq-01-formatted.

Acknowledgements

The encoder/decoder code builds on MVDepthNet. Some utility functions used during training are from SfmLearner. Most of the training data are from DeMoN. We appreciate their work!

License

Copyright Yuxin Hou, Juho Kannala, and Arno Solin.

This software is provided under the MIT License. See the accompanying LICENSE file for details.