MODE: Multi-view Omnidirectional Depth Estimation with 360$^\circ$ Cameras

This repository contains the source code for our paper:

MODE: Multi-view Omnidirectional Depth Estimation with 360$^\circ$ Cameras. ECCV 2022

Dataset Deep360

Deep360 is a large synthetic outdoor dataset for multi-view omnidirectional depth estimation. It contains 2100 frames for training, 300 frames for validation, and 600 frames for testing (3000 frames in total, split across six episodes of 500 frames each). Panoramas, ground-truth disparity maps, and depth maps are provided to train and evaluate omnidirectional depth estimation algorithms. The dataset also contains a "soiled" version of the panoramas, which are affected by three common outdoor factors: mud spots, water drops, and glare.

You can download Deep360 through this link. The folder contains 6 zip files corresponding to the 6 episodes of Deep360, and a README.txt file introducing the dataset.

You can use the following MD5 checksums to verify these files; the checksums are computed as described in RFC 1321 (MD5). A small verification sketch follows the list.

ep1_500frames.zip : 77b93fe5aa01c1c5cf367c033c2d3f68
ep2_500frames.zip : 3bb4e70a381edbf71c8e84bc0c0cb943
ep3_500frames.zip : b2be6d4e03ea73e1979e198d94edb893
ep4_500frames.zip : 648b093022e3e17f35bab1d191754b89
ep5_500frames.zip : bc1acd148ae8d286420ac5e695de900c
ep6_500frames.zip : ac3a763b31d1bd73989a8c428d505438
README.txt : 4932221f7be68c69dd6a88dc0e9658bb
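
A minimal verification sketch in Python (the download directory "Deep360_download" is an assumption; adjust it to wherever you saved the files):

import hashlib
from pathlib import Path

# Expected MD5 digests, copied from the list above.
EXPECTED = {
    "ep1_500frames.zip": "77b93fe5aa01c1c5cf367c033c2d3f68",
    "ep2_500frames.zip": "3bb4e70a381edbf71c8e84bc0c0cb943",
    "ep3_500frames.zip": "b2be6d4e03ea73e1979e198d94edb893",
    "ep4_500frames.zip": "648b093022e3e17f35bab1d191754b89",
    "ep5_500frames.zip": "bc1acd148ae8d286420ac5e695de900c",
    "ep6_500frames.zip": "ac3a763b31d1bd73989a8c428d505438",
    "README.txt": "4932221f7be68c69dd6a88dc0e9658bb",
}

def md5sum(path, chunk_size=1 << 20):
    # Stream the file in chunks so large zip files need not fit in memory.
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

download_dir = Path("Deep360_download")  # assumed download location
for name, expected in EXPECTED.items():
    status = "OK" if md5sum(download_dir / name) == expected else "MISMATCH"
    print(f"{name}: {status}")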

The file structure of Deep360 is as follows:

Deep360
├── README.txt
├── ep1_500frames
│   ├── training
│   │   ├── rgb
│   │   ├── rgb_soiled
│   │   ├── disp
│   │   └── depth
│   ├── validation
│   └── testing
├── ep2_500frames
├── ep3_500frames
├── ep4_500frames
├── ep5_500frames
└── ep6_500frames

Please download and unzip the files, then follow the README.txt in the dataset folder and the dataloader in this repository to use the dataset.
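
As a rough illustration of how the layout above can be traversed (a sketch only; the dataset root path is an assumption, and the per-frame file names inside each modality folder are documented in the dataset's README.txt, so everything is simply globbed here):

from pathlib import Path

root = Path("/path/to/Deep360")  # assumed dataset root after unzipping
splits = ["training", "validation", "testing"]
modalities = ["rgb", "rgb_soiled", "disp", "depth"]

for ep_dir in sorted(root.glob("ep*_500frames")):
    for split in splits:
        for modality in modalities:
            folder = ep_dir / split / modality
            if not folder.is_dir():
                continue
            # Count the frame files of this modality in this split.
            n_files = sum(1 for p in folder.iterdir() if p.is_file())
            print(f"{ep_dir.name}/{split}/{modality}: {n_files} files")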

Introduction

MODE is a two-stage omnidirectional depth estimation framework for multi-view 360° cameras. The framework first estimates depth maps from different camera pairs via omnidirectional stereo matching, and then fuses these depth maps to achieve robustness against mud spots and water drops on camera lenses, as well as glare caused by intense light.

The pipeline of the proposed two-stage MODE
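
A rough sketch of this two-stage flow in code (illustrative only, not the actual implementation; the three callables stand in for the stereo matching network, the disparity-to-depth conversion, and the fusion network described in the paper):

from itertools import combinations

def mode_pipeline(panoramas, estimate_disparity, disparity_to_depth, fuse_depth_maps):
    # Illustrative two-stage flow: per-pair stereo matching, then fusion.
    # The three callables are placeholders for the networks/geometry used in the paper.
    depth_candidates, confidence_maps = [], []
    # Stage 1: omnidirectional stereo matching on every camera pair,
    # producing a disparity map and a confidence map per pair.
    for i, j in combinations(range(len(panoramas)), 2):
        disparity, confidence = estimate_disparity(panoramas[i], panoramas[j])
        depth_candidates.append(disparity_to_depth(disparity, pair=(i, j)))
        confidence_maps.append(confidence)
    # Stage 2: fuse the per-pair depth maps (guided by confidence) into one
    # omnidirectional depth map that is robust to soiled or glared views.
    return fuse_depth_maps(depth_candidates, confidence_maps)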

This video shows the process of generating the Deep360 dataset and the results of MODE on real scenes.

Requirements

Usage

First, compile the spherical convolution extension:

cd models/basic/spherical_conv && bash ./build.sh && cd ../../../

You can train the disparity-stage model with the same protocol as in the paper (i.e., loading the pretrained stacked-hourglass part of PSMNet) using this command:

python train_disparity.py --dataset_root [path to Deep360 folder] --checkpoint_disp [path to pretrained PSMNet ckpt] --loadSHGonly --parallel

or train it from random initialization:

python train_disparity.py --dataset_root [path to Deep360 folder] --parallel

For testing, please run:

python test_disparity.py --dataset_root [path to Deep360 folder] --checkpoint_disp [path to trained ckpt of disparity stage] --parallel --save_output_path [path to save output]

We suggest storing the disparity maps and confidence maps from the stereo matching stage before training the fusion stage, to save training time. Please run:

python save_output_disparity_stage.py --checkpoint_disp [path to trained ckpt of disparity stage] --datapath [path to Deep360 folder] --outpath [path to save predicted disparity and confidence maps]

Please run the following command to train the fusion model:

python train_fusion.py --datapath-dataset [path to Deep360 folder] --datapath-input [path to saved outputs of disparity stage]

Please run the following command to test the fusion model:

python test_fusion.py --datapath-dataset [path to Deep360 folder] --datapath-input [path to saved outputs of disparity stage] --outpath [path to save the fusion results]

For all the commands above, add --soiled to use the soiled version of Deep360.

Pretrained Models

Our pre-trained models are available at the following links:

ModeDisparity

ModeFusion

ModeFusion_soiled

Acknowledgements

The spherical convolution in this paper follows Coors et al. [1].

The code of the stacked hourglass in the stereo matching network is adapted from PSMNet [2].

Citation

If you use the Deep360 dataset, or find this project and paper helpful in your research, please cite our paper. (The citation may change slightly once the paper is formally published. We will update it in time.)

@inproceedings{Li_Jin2022MODE,
  author       = "Li, Ming and Jin, Xueqian and Hu, Xuejiao and Dai, Jingzhao and Du, Sidan and Li, Yang",
  title        = "MODE: Multi-view Omnidirectional Depth Estimation with 360$^\circ$ Cameras",
  booktitle    = "European Conference on Computer Vision (ECCV)",
  month        = "October",
  year         = "2022"
}

References

[1] Coors, B., Condurache, A.P., Geiger, A.: SphereNet: Learning spherical representations for detection and classification in omnidirectional images. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) Computer Vision – ECCV 2018. pp. 525–541. Springer International Publishing, Cham (2018)

[2] Chang, J., Chen, Y.: Pyramid stereo matching network. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 5410–5418 (2018). https://doi.org/10.1109/CVPR.2018.0