Awesome

IGEV++

Our significant extension version of IGEV, named IGEV++, is available at Paper, Code

IGEV-Stereo (CVPR 2023)

This repository contains the source code for our paper:

Iterative Geometry Encoding Volume for Stereo Matching<br/> Gangwei Xu, Xianqi Wang, Xiaohuan Ding, Xin Yang<br/>

Demos

Pretrained models can be downloaded from google drive

We assume the downloaded pretrained weights are located under the pretrained_models directory.

You can demo a trained model on pairs of images. To predict stereo for Middlebury, run

python demo_imgs.py \
--restore_ckpt pretrained_models/sceneflow/sceneflow.pth \
-l=path/to/your/left_imgs \
-r=path/to/your/right_imgs

or you can demo a trained model pairs of images for a video, run:

python demo_video.py \
--restore_ckpt pretrained_models/sceneflow/sceneflow.pth \
-l=path/to/your/left_imgs \
-r=path/to/your/right_imgs

To save the disparity values as .npy files, run any of the demos with the --save_numpy flag.

Comparison with RAFT-Stereo

Method	KITTI 2012 <br> (3-noc)	KITTI 2015 <br> (D1-all)	Memory (G)	Runtime (s)
RAFT-Stereo	1.30 %	1.82 %	1.02	0.38
IGEV-Stereo	1.12 %	1.59 %	0.66	0.18

Environment

NVIDIA RTX 3090
Python 3.8
Pytorch 1.12

Create a virtual environment and activate it.

conda create -n IGEV_Stereo python=3.8
conda activate IGEV_Stereo

Dependencies

conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch -c nvidia
pip install opencv-python
pip install scikit-image
pip install tensorboard
pip install matplotlib 
pip install tqdm
pip install timm==0.5.4

Required Data

To evaluate/train IGEV-Stereo, you will need to download the required datasets.

By default stereo_datasets.py will search for the datasets in these locations.

├── /data
    ├── sceneflow
        ├── frames_finalpass
        ├── disparity
    ├── KITTI
        ├── KITTI_2012
            ├── training
            ├── testing
            ├── vkitti
        ├── KITTI_2015
            ├── training
            ├── testing
            ├── vkitti
    ├── Middlebury
        ├── trainingH
        ├── trainingH_GT
    ├── ETH3D
        ├── two_view_training
        ├── two_view_training_gt
    ├── DTU_data
        ├── dtu_train
        ├── dtu_test

Evaluation

To evaluate on Scene Flow or Middlebury or ETH3D, run

python evaluate_stereo.py --restore_ckpt ./pretrained_models/sceneflow/sceneflow.pth --dataset sceneflow

python evaluate_stereo.py --restore_ckpt ./pretrained_models/sceneflow/sceneflow.pth --dataset middlebury_H

python evaluate_stereo.py --restore_ckpt ./pretrained_models/sceneflow/sceneflow.pth --dataset eth3d

Training

To train on Scene Flow, run

python train_stereo.py --logdir ./checkpoints/sceneflow

To train on KITTI, run

python train_stereo.py --logdir ./checkpoints/kitti --restore_ckpt ./pretrained_models/sceneflow/sceneflow.pth --train_datasets kitti

Submission

For submission to the KITTI benchmark, run

python save_disp.py

MVS training and evaluation

To train on DTU, run

python train_mvs.py

To evaluate on DTU, run

python evaluate_mvs.py

Citation

If you find our work useful in your research, please consider citing our paper:


@inproceedings{xu2023iterative,
  title={Iterative Geometry Encoding Volume for Stereo Matching},
  author={Xu, Gangwei and Wang, Xianqi and Ding, Xiaohuan and Yang, Xin},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={21919--21928},
  year={2023}
}

@article{xu2024igev++,
  title={IGEV++: Iterative Multi-range Geometry Encoding Volumes for Stereo Matching},
  author={Xu, Gangwei and Wang, Xianqi and Zhang, Zhaoxing and Cheng, Junda and Liao, Chunyuan and Yang, Xin},
  journal={arXiv preprint arXiv:2409.00638},
  year={2024}
}

Acknowledgements

This project is based on RAFT-Stereo, and CoEx. We thank the original authors for their excellent works.