IGEV++
IGEV++, a significantly extended version of IGEV, is now available: Paper, Code
IGEV-Stereo (CVPR 2023)
This repository contains the source code for our paper:
Iterative Geometry Encoding Volume for Stereo Matching<br/> Gangwei Xu, Xianqi Wang, Xiaohuan Ding, Xin Yang<br/>
<img src="IGEV-Stereo/IGEV-Stereo.png">

Demos
Pretrained models can be downloaded from Google Drive.
We assume the downloaded pretrained weights are located under the pretrained_models directory.
You can demo a trained model on pairs of images. To predict stereo for Middlebury, run
```Shell
python demo_imgs.py \
--restore_ckpt pretrained_models/sceneflow/sceneflow.pth \
-l=path/to/your/left_imgs \
-r=path/to/your/right_imgs
```
Or, to demo a trained model on a sequence of image pairs (e.g., the frames of a video), run:
```Shell
python demo_video.py \
--restore_ckpt pretrained_models/sceneflow/sceneflow.pth \
-l=path/to/your/left_imgs \
-r=path/to/your/right_imgs
```
To save the disparity values as .npy files, run any of the demos with the `--save_numpy` flag.
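If you want to inspect the saved disparities afterwards, a minimal sketch (not part of the repo; the output filename `disp.npy` is an assumption, check the demo scripts for the actual naming):
```python
# Load a disparity map saved via --save_numpy and visualize it.
# "disp.npy" is a placeholder filename for illustration.
import numpy as np
import matplotlib.pyplot as plt

disp = np.load("disp.npy")  # H x W array of disparities in pixels
plt.imshow(disp, cmap="jet")
plt.colorbar(label="disparity (px)")
plt.title("IGEV-Stereo disparity")
plt.show()
```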
Comparison with RAFT-Stereo
Method | KITTI 2012 <br> (3-noc) | KITTI 2015 <br> (D1-all) | Memory (GB) | Runtime (s) |
---|---|---|---|---|
RAFT-Stereo | 1.30% | 1.82% | 1.02 | 0.38 |
IGEV-Stereo | 1.12% | 1.59% | 0.66 | 0.18 |
Environment
- NVIDIA RTX 3090
- Python 3.8
- PyTorch 1.12
Create a virtual environment and activate it.
```Shell
conda create -n IGEV_Stereo python=3.8
conda activate IGEV_Stereo
```
Dependencies
```Shell
conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch -c nvidia
pip install opencv-python
pip install scikit-image
pip install tensorboard
pip install matplotlib
pip install tqdm
pip install timm==0.5.4
```
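As a quick sanity check of the environment (this snippet is not part of the repo), you can verify that PyTorch sees the GPU and that the pinned `timm` version is installed:
```python
# Verify the installed versions and CUDA availability.
import torch
import timm

print(torch.__version__)          # expect 1.12.x
print(torch.cuda.is_available())  # expect True on an RTX 3090 machine
print(timm.__version__)           # expect 0.5.4
```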
Required Data
To evaluate/train IGEV-Stereo, you will need to download the required datasets.
By default, stereo_datasets.py will search for the datasets in these locations:
```
├── /data
    ├── sceneflow
        ├── frames_finalpass
        ├── disparity
    ├── KITTI
        ├── KITTI_2012
            ├── training
            ├── testing
            ├── vkitti
        ├── KITTI_2015
            ├── training
            ├── testing
            ├── vkitti
    ├── Middlebury
        ├── trainingH
        ├── trainingH_GT
    ├── ETH3D
        ├── two_view_training
        ├── two_view_training_gt
    ├── DTU_data
        ├── dtu_train
        ├── dtu_test
```
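Before launching training or evaluation, it can help to confirm your directory layout matches. A small sketch (not part of the repo; `/data` is the default root shown above):
```python
# Check that the expected dataset directories exist.
from pathlib import Path

ROOT = Path("/data")
expected = [
    "sceneflow/frames_finalpass",
    "sceneflow/disparity",
    "KITTI/KITTI_2012/training",
    "KITTI/KITTI_2015/training",
    "Middlebury/trainingH",
    "ETH3D/two_view_training",
]
for rel in expected:
    status = "ok" if (ROOT / rel).is_dir() else "MISSING"
    print(f"{status:7s} {ROOT / rel}")
```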
Evaluation
To evaluate on Scene Flow, Middlebury, or ETH3D, run one of:
```Shell
python evaluate_stereo.py --restore_ckpt ./pretrained_models/sceneflow/sceneflow.pth --dataset sceneflow
python evaluate_stereo.py --restore_ckpt ./pretrained_models/sceneflow/sceneflow.pth --dataset middlebury_H
python evaluate_stereo.py --restore_ckpt ./pretrained_models/sceneflow/sceneflow.pth --dataset eth3d
```
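For reference, stereo evaluation is typically reported with EPE (mean absolute disparity error) and D1 (the fraction of pixels whose error exceeds both 3 px and 5% of the ground truth). An illustrative sketch of these metrics, not the repo's evaluation code:
```python
# Compute EPE and D1 between a predicted and a ground-truth disparity map.
import numpy as np

def epe_and_d1(pred, gt, valid):
    err = np.abs(pred - gt)[valid]
    epe = err.mean()
    d1 = ((err > 3.0) & (err > 0.05 * gt[valid])).mean()
    return epe, d1

# Toy example with synthetic disparities.
gt = np.random.uniform(1, 192, size=(256, 512)).astype(np.float32)
pred = gt + np.random.normal(0, 1, size=gt.shape).astype(np.float32)
print(epe_and_d1(pred, gt, valid=gt > 0))
```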
Training
To train on Scene Flow, run
```Shell
python train_stereo.py --logdir ./checkpoints/sceneflow
```
To train on KITTI, run
```Shell
python train_stereo.py --logdir ./checkpoints/kitti --restore_ckpt ./pretrained_models/sceneflow/sceneflow.pth --train_datasets kitti
```
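Note that KITTI training warm-starts from the Scene Flow checkpoint via `--restore_ckpt`. Conceptually, that step looks like the sketch below (the checkpoint layout and model class are assumptions; see train_stereo.py for the actual loading logic):
```python
# Load the pretrained Scene Flow weights before fine-tuning.
import torch

checkpoint = torch.load("pretrained_models/sceneflow/sceneflow.pth",
                        map_location="cpu")
print(type(checkpoint))
# If the file is a raw state_dict, loading would look like:
# model.load_state_dict(checkpoint, strict=True)
# Otherwise the weights may sit under a key such as checkpoint["model"]
# (an assumption; verify against the repo's checkpoint format).
```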
Submission
For submission to the KITTI benchmark, run
```Shell
python save_disp.py
```
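The KITTI benchmark expects each disparity map as a 16-bit PNG with values scaled by 256. save_disp.py produces the submission files; the sketch below only illustrates the file format (the filename and array are placeholders):
```python
# Write a disparity map in the KITTI submission format:
# uint16 PNG storing disparity_in_pixels * 256.
import cv2
import numpy as np

disp = np.random.uniform(0, 192, size=(375, 1242)).astype(np.float32)  # stand-in
disp_u16 = np.clip(disp * 256.0, 0, 65535).astype(np.uint16)
cv2.imwrite("000000_10.png", disp_u16)
```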
MVS training and evaluation
To train on DTU, run
```Shell
python train_mvs.py
```
To evaluate on DTU, run
```Shell
python evaluate_mvs.py
```
Citation
If you find our work useful in your research, please consider citing our paper:
```bibtex
@inproceedings{xu2023iterative,
  title={Iterative Geometry Encoding Volume for Stereo Matching},
  author={Xu, Gangwei and Wang, Xianqi and Ding, Xiaohuan and Yang, Xin},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={21919--21928},
  year={2023}
}

@article{xu2024igev++,
  title={IGEV++: Iterative Multi-range Geometry Encoding Volumes for Stereo Matching},
  author={Xu, Gangwei and Wang, Xianqi and Zhang, Zhaoxing and Cheng, Junda and Liao, Chunyuan and Yang, Xin},
  journal={arXiv preprint arXiv:2409.00638},
  year={2024}
}
```
Acknowledgements
This project is based on RAFT-Stereo and CoEx. We thank the original authors for their excellent work.