Awesome-Deep-Stereo-Matching

Welcome to the "Awesome-Deep-Stereo-Matching" repository, a curated list of state-of-the-art deep stereo matching resources maintained by Fabio Tosi, Matteo Poggi and Luca Bartolomei, from the University of Bologna. This repository, inspired by awesome-computer-vision, aims to provide a comprehensive collection of the latest and most influential papers on deep stereo matching published in top-tier computer vision conferences and prestigious journals.

The methods in this repository are grouped by category to make the diverse approaches and techniques used in deep stereo matching research easier to navigate. We also release a reference bib file containing the bib entries for all the works listed on this page.

We use the :triangular_flag_on_post: symbol to highlight the most groundbreaking works.

:rotating_light: :rotating_light: :rotating_light: This repository is closely associated with our surveys on deep stereo matching:

"A Survey on Deep Stereo Matching in the Twenties", Tosi et al., 2024
"On the Synergies between Machine Learning and Binocular Stereo for Depth Estimation from Images: a Survey", Poggi et al., 2021
"On the confidence of stereo matching in a deep-learning era: a quantitative evaluation", Poggi et al., 2022

These surveys provide an in-depth overview of the field, complementing the curated list of resources found in this repository.

Additionally, we presented a tutorial on this topic at CVPR 2024. For more information about the tutorial, including slides and additional resources, please visit our Tutorial Webpage.

If you find this repository valuable, please consider citing it in your work and giving it a star! :star:

Full reference(s):

"A Survey on Deep Stereo Matching in the Twenties", Tosi et al., arXiv pre-print, 2024. [Paper] [Bibtex] [Google Scholar] [Tutorial]
"On the synergies between machine learning and binocular stereo for depth estimation from images: a survey", Poggi et al., IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021. [Paper] [Bibtex] [Google Scholar]
"On the Confidence of Stereo Matching in a Deep-Learning Era: A Quantitative Evaluation", Poggi et al., IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022. [Paper] [Bibtex] [Google Scholar]

How to submit a pull request?

<h2 id="fundamentals"> Surveys & Fundamentals </h2> <details open><summary style="font-size: larger; font-weight: bold;"> Stereo Matching Basics</summary><ul>

"A taxonomy and evaluation of dense two-frame stereo correspondence algorithms", Scharstein & Szeliski, International Journal of Computer Vision (IJCV), 2002. [Paper] [Bibtex] [Google Scholar]
"Evaluation of cost functions for stereo matching", Hirschmuller & Scharstein, CVPR, 2007. [Paper] [Bibtex] [Google Scholar]
SGM: "Stereo processing by semiglobal matching and mutual information", Heiko Hirschmuller, TPAMI, 2007. [Paper] [Bibtex] [Google Scholar]
"Computer Vision: Algorithms and Applications", 2nd Edition - (Chapter 12, Depth Estimation), Richard Szeliski [Slides] [Bibtex] [Google Scholar]
"Stereo Matching", Richard Szeliski, University of Washington [Slides]
"Stereo Vision", Fei-Fei Li, Stanford Vision Lab [Slides]
"Stereo Vision: Algorithms and Applications", Stefano Mattoccia, University of Bologna [Slides] [Bibtex] [Google Scholar]

</ul> </details> <details open><summary style="font-size: larger; font-weight: bold;"> Deep Stereo Matching</summary><ul>

"A Survey on Deep Stereo Matching in the Twenties", Tosi et al., arXiv pre-print, 2024. [Paper] [Bibtex] [Google Scholar] [Tutorial]
"A survey on deep learning techniques for stereo-based depth estimation", Laga et al., IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2020. [Paper] [Bibtex] [Google Scholar]
"On the synergies between machine learning and binocular stereo for depth estimation from images: a survey", Poggi et al., IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021. [Paper] [Bibtex] [Google Scholar]

</ul> </details> <details open><summary style="font-size: larger; font-weight: bold;"> Learned Confidence Estimation </summary><ul>

"Quantitative evaluation of confidence measures in a machine learning world", Poggi et al., ICCV, 2017. [Paper] [Bibtex] [Google Scholar]
"On the Confidence of Stereo Matching in a Deep-Learning Era: A Quantitative Evaluation", Poggi et al., IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022. [Paper] [Bibtex] [Google Scholar]

</ul> </details> <details open><summary style="font-size: larger; font-weight: bold;"> Event-Based Stereo</summary><ul>

"Event-based Stereo Depth Estimation: A Survey", Ghosh et al., arXiv pre-print, 2024. [Paper] [Bibtex] [Google Scholar]

</ul> </details>

CodeBase

OpenStereo: "OpenStereo: A Comprehensive Benchmark for Stereo Matching and Strong Baseline", Guo et al., arXiv, 2023. [Paper] [Code] [Bibtex] [Google Scholar]

:file_cabinet: Datasets

<details open id="real-world"><summary style="font-size: larger; font-weight: bold;">Real-World</summary><ul> <details open id="real-world RGB"><summary style="font-size: larger; font-weight: bold;"> RGB </summary>

KITTI 2012: "Are we ready for Autonomous Driving? The KITTI Vision Benchmark Suite", Geiger et al., CVPR, 2012. [Paper] [Dataset] [Bibtex] [Google Scholar]
KITTI 2015: "Object Scene Flow for Autonomous Vehicles", Menze et al., CVPR, 2015. [Paper] [Dataset] [Bibtex] [Google Scholar]
Middlebury v3: "High-resolution stereo datasets with subpixel-accurate ground truth", Scharstein et al., GCPR, 2014. [Paper] [Dataset] [Bibtex] [Google Scholar]
Cityscapes: "The cityscapes dataset for semantic urban scene understanding", Cordts et al., CVPR, 2016. [Paper] [Dataset] [Bibtex] [Google Scholar]
ETH3D: "A multi-view stereo benchmark with high-resolution images and multi-camera videos", Schops et al., CVPR, 2017. [Paper] [Dataset] [Bibtex] [Google Scholar]
DrivingStereo: "DrivingStereo: A Large-Scale Dataset for Stereo Matching in Autonomous Driving Scenarios", Yang et al., CVPR, 2019. [Paper] [Dataset] [Bibtex] [Google Scholar]
WSVD: "Web stereo video supervision for depth prediction from dynamic scenes", Wang et al., 3DV, 2019. [Paper] [Dataset] [Bibtex] [Google Scholar]
Flickr1024: "Flickr1024: A large-scale dataset for stereo image super-resolution", Wang et al., ICCVW, 2019. [Paper] [Dataset] [Bibtex] [Google Scholar]
ApolloScape: "The apolloscape open dataset for autonomous driving and its application", Huang et al., TPAMI, 2019. [Paper] [Dataset] [Bibtex] [Google Scholar]
Holopix50k: "Holopix50k: A Large-Scale In-the-Wild Stereo Image Dataset", Hua et al., CVPR, 2020. [Paper] [Dataset] [Bibtex] [Google Scholar]
A2D2: "Audi Autonomous Driving Dataset", Geyer et al., arXiv, 2020. [Paper] [Dataset] [Bibtex] [Google Scholar]
InStereo2K: "InStereo2K: A Large Real Dataset for Stereo Matching in Indoor Scenes", Bao et al., Science China Information Sciences, 2020. [Paper] [Github] [Bibtex] [Google Scholar]
Middlebury 2021 Mobile Dataset: [Dataset] [Bibtex]
DurLAR: "DurLAR: A High-Fidelity 128-Channel LiDAR Dataset with Panoramic Ambient and Reflectivity Imagery for Multi-Modal Autonomous Driving Applications", Li et al., 3DV, 2021. [Paper] [Dataset] [Bibtex] [Google Scholar]
Booster: "Open Challenges in Deep Stereo: The Booster Dataset", Ramirez et al., CVPR, 2022. [Paper] [Dataset] [Bibtex] [Google Scholar]
WHU-Stereo: "WHU-Stereo: A challenging benchmark for stereo matching of high-resolution satellite images", Li et al., TGRS, 2023. [Paper] [Dataset] [Bibtex] [Google Scholar]
SID: "SID: Stereo Image Dataset for Autonomous Driving in Adverse Conditions", El-Shair et al., NAECON, 2024. [Paper] [Dataset] [Bibtex] [Google Scholar]
LayeredFlow: "LayeredFlow: A Real-World Benchmark for Non-Lambertian Multi-Layer Optical Flow", Wen et al., ECCV, 2024. [Paper] [Dataset] [Bibtex] [Google Scholar]
South Kensington SV: "Match Stereo Videos via Bidirectional Alignment", Jing et al., arXiv, 2024. [Paper] [Code&Dataset] [Bibtex] [WebPage] [Google Scholar]

</details> <details open><summary style="font-size: larger; font-weight: bold;"> Beyond-RGB </summary>

CATS: "CATS: A Color and Thermal Stereo Benchmark", Treible et al., CVPR, 2017. [Paper] [Dataset] [Bibtex] [Google Scholar]
RGB-NIR-Stereo: "Deep material-aware cross-spectral stereo matching", Zhi et al., CVPR, 2018. [Paper] [Code] [Bibtex] [Google Scholar]
RGB-MS: "RGB-Multispectral Matching: Dataset, Learning Methodology, Evaluation", Tosi et al., CVPR, 2022. [Paper] [Dataset] [Bibtex] [Google Scholar]
Gated Stereo: "Gated Stereo: Joint Depth Estimation from Gated and Wide-Baseline Active Stereo Cues", Walz et al., CVPR, 2023. [Paper] [Dataset] [Bibtex] [Google Scholar]
RPS/IPS: "DPS-Net: Deep Polarimetric Stereo Depth Estimation", Tian et al., ICCV, 2023. [Paper] [Dataset] [Bibtex] [Google Scholar]
MS^2: "Deep Depth Estimation From Thermal Image", Shin et al., CVPR, 2023. [Paper] [Dataset] [Bibtex] [Google Scholar]
<details open><summary style="font-size: larger; font-weight: bold;"> Event Camera Based</summary>
- UZH-RPG: "Semi-Dense 3D Reconstruction with a Stereo Event Camera", Zhou et al., ECCV, 2018. [Paper] [Dataset] [Bibtex] [Google Scholar]
- MVSEC: "The Multivehicle Stereo Event Camera Dataset: An Event Camera Dataset for 3D Perception", Zhu et al., RAL, 2018. [Paper] [Dataset] [Bibtex] [Google Scholar]
- DSEC: "DSEC: A Stereo Event Camera Dataset for Driving Scenarios", Gehrig et al., RAL, 2021. [Paper] [Code] [Dataset] [Bibtex] [Google Scholar]
- SHEF: "Stereo Hybrid Event-Frame (SHEF) Cameras for 3D Perception", Wang et al., IROS, 2021. [Paper] [Dataset] [Bibtex] [Google Scholar]
- TUM-VIE: "TUM-VIE: The TUM Stereo Visual-Inertial Event Data Set", Klenk et al., IROS, 2021. [Paper] [Dataset] [Bibtex] [Google Scholar]
- EVIMO2: "EVIMO2: An Event Camera Dataset for Motion Segmentation, Optical Flow, Structure from Motion, and Visual Inertial Odometry in Indoor Scenes with Monocular or Stereo Algorithms", Burner et al., arXiv, 2022. [Paper] [Dataset] [Bibtex] [Google Scholar]
- VECtor: "VECtor: A Versatile Event-Centric Benchmark for Multi-Sensor SLAM", Gao et al., RAL, 2022. [Paper] [Dataset] [Bibtex] [Google Scholar]
- ESVIO: "ESVIO: Event-based Stereo Visual Inertial Odometry", Chen et al., RAL, 2023. [Paper] [Dataset] [Bibtex] [Google Scholar]
- M3ED: "M3ED: Multi-Robot, Multi-Sensor, Multi-Environment Event Dataset", Chaney et al., CVPRW, 2023. [Paper] [Dataset] [Bibtex] [Google Scholar]
</details>

</details> <details open><summary style="font-size: larger; font-weight: bold;"> Rendered </summary>

The NeRF-Stereo Dataset: "NeRF-Supervised Deep Stereo", Tosi et al., CVPR, 2023. [Paper] [Dataset] [Bibtex] [Google Scholar]

</details> </ul> </details> <details open id="synthetic"> <summary style="font-size: larger; font-weight: bold;">Synthetic</summary>

MPI Sintel: "A naturalistic open source movie for optical flow evaluation", Butler et al., ECCV, 2012. [Paper] [Dataset] [Bibtex] [Google Scholar]
Freiburg SceneFlow: "A Large Dataset to Train Convolutional Networks for Disparity, Optical Flow, and Scene Flow Estimation", Mayer et al., CVPR, 2016. [Paper] [Dataset] [Bibtex] [Google Scholar]
Virtual KITTI: "Virtual worlds as proxy for multi-object tracking analysis", Gaidon et al., CVPR, 2016. [Paper] [Dataset] [Bibtex] [Google Scholar]
Falling Things: "A synthetic dataset for 3d object detection and pose estimation", Tremblay et al., CVPRW, 2018. [Paper] [Dataset] [Bibtex] [Google Scholar]
HR-VS: "Hierarchical deep stereo matching on high-resolution images", Yang et al., CVPR, 2019. [Paper] [Dataset] [Bibtex] [Google Scholar]
Virtual KITTI 2: "Virtual KITTI 2", Cabon et al., arXiv, 2020. [Paper] [Dataset] [Bibtex] [Google Scholar]
TartanAir: "TartanAir: A dataset to push the limits of visual slam", Wang et al., IROS, 2020. [Paper] [Dataset] [Bibtex] [Google Scholar]
Semi-synthesis: "Semi-synthesis: A fast way to produce effective datasets for stereo matching", He et al., ICCVW, 2021. [Paper] [Bibtex] [Google Scholar]
UnrealStereo4K: "SMD-Nets: Stereo Mixture Density Networks", Tosi et al., CVPR, 2021. [Paper] [Dataset] [Bibtex] [Google Scholar]
IRS: "IRS: A large naturalistic indoor robotics stereo dataset to train deep models for disparity and surface normal estimation", Wang et al., ICME, 2021. [Paper] [Dataset] [Bibtex] [Google Scholar]
CREStereo: "Practical stereo matching via cascaded recurrent network with adaptive correlation", Li et al., CVPR, 2022. [Paper] [Dataset] [Bibtex] [Google Scholar]
SimStereo: "Active-Passive SimStereo – Benchmarking the Cross-Generalization Capabilities of Deep Learning-based Stereo Methods", Jospin et al., NeurIPS, 2022. [Paper] [Dataset] [Bibtex] [Google Scholar]
SimSIN: "Toward practical monocular indoor depth estimation", Wu et al., CVPR, 2022. [Paper] [Dataset] [Bibtex] [Google Scholar]
Spring: "Spring: A High-Resolution High-Detail Dataset and Benchmark for Scene Flow, Optical Flow and Stereo", Mehl et al., CVPR, 2023. [Paper] [Dataset] [Bibtex] [Google Scholar]
Dynamic Replica: "DynamicStereo: Consistent Dynamic Depth From Stereo Videos", Karaev et al., CVPR, 2023. [Paper] [Dataset] [Bibtex] [Google Scholar]
All-In-One Drive: "A Comprehensive Perception Dataset with High-Density Long-Range Point Clouds", Weng et al., arXiv, 2023. [Paper] [Dataset] [Bibtex] [Google Scholar]
PLT-D3: "PLT-D3: A High-fidelity Dynamic Driving Simulation Dataset for Stereo Depth and Scene Flow", Tokarsky et al., arXiv, 2024. [Paper] [Dataset] [Bibtex] [Google Scholar]
QPS Dataset: "Disparity Estimation Using a Quad-Pixel Sensor", Wu et al., BMVC, 2024. [Paper] [WebPage] [Dataset] [Bibtex] [Google Scholar]
Infinigen Stereo Video: "Match Stereo Videos via Bidirectional Alignment", Jing et al., arXiv, 2024. [Paper] [Code&Dataset] [Bibtex] [WebPage] [Google Scholar]

</details>

Frameworks

Learning for Stereo Pipeline

<details open id="matching-cost"> <summary style="font-size: larger; font-weight: bold;">Matching Cost</summary>

Deep Embed: "A deep visual correspondence embedding model for stereo matching costs", Chen et al., ICCV, 2015. [Paper] [Bibtex] [Google Scholar]
:triangular_flag_on_post: MC-CNN: "Stereo matching by training a convolutional neural network to compare image patches", Zbontar & LeCun, JMLR, 2016. [Paper] [Code] [Bibtex1] [Bibtex2] [Google Scholar]
Content CNN: "Efficient deep learning for stereo matching", Luo et al., CVPR, 2016. [Paper] [Code] [Bibtex] [Google Scholar]
Per-pixel pyramid-pooling: "Look wider to match image patches with convolutional neural networks", Park et al., SPL, 2016. [Paper] [Bibtex] [Google Scholar]
Consistency and Distinctiveness: "Fundamental principles on learning new features for effective dense matching", Zhang et al., TIP, 2017. [Paper] [Bibtex] [Google Scholar]
MC-CNN-WS: "Weakly supervised learning of deep metrics for stereo reconstruction", Tulyakov et al., ICCV, 2017. [Paper] [Code] [Bibtex] [Google Scholar]
CBMV: "CBMV: A coalesced bidirectional matching volume for disparity estimation", Batsos et al., CVPR, 2018. [Paper] [Bibtex] [Google Scholar]
SDC: "SDC - stacked dilated convolution: A unified descriptor network for dense matching tasks", Schuster et al., CVPR, 2019. [Paper] [Bibtex] [Google Scholar]
Semi-dense Stereo: "Semi-dense Stereo Matching using Dual CNNs", Mao et al., WACV, 2019. [Paper] [Bibtex] [Google Scholar]

</details> <details open id="optimization"> <summary style="font-size: larger; font-weight: bold;">Optimization</summary>

GCP: "Learning to detect ground control points for improving the accuracy of stereo matching", Spyropoulos et al., CVPR, 2014. [Paper] [Bibtex] [Google Scholar]
LevStereo: "Leveraging stereo matching with learning-based confidence measures", Park et al., CVPR, 2015. [Paper] [Bibtex] [Google Scholar]
O1: "Learning a general-purpose confidence measure based on O(1) features and a smarter aggregation strategy for semi global matching", Poggi et al., 3DV, 2016. [Paper] [Bibtex] [Google Scholar]
PBCP: "Patch Based Confidence Prediction for Dense Disparity Map", Seki et al., BMVC, 2016. [Paper] [Bibtex] [Google Scholar]
Sgm-Nets: "Sgm-Nets: Semi-global matching with neural networks", Seki et al., CVPR, 2017. [Paper] [Bibtex] [Google Scholar]
SGM-Forest: "Learning to fuse proposals from multiple scanline optimizations in semi-global matching", Schonberger et al., ECCV, 2018. [Paper] [Bibtex] [Google Scholar]

</details> <details open id="refinement"> <summary style="font-size: larger; font-weight: bold;">Refinement</summary>

RCN: "Improved stereo matching with constant highway networks and reflective confidence learning", Shaked et al., CVPR, 2017. [Paper] [Code] [Bibtex] [Google Scholar]
DRR: "Detect, replace, refine: Deep structured prediction for pixel wise labeling", Gidaris et al., CVPR, 2017. [Paper] [Code] [Bibtex] [Google Scholar]
OSD: "Efficient stereo matching leveraging deep local and context information", Ye et al., IEEE Access, 2017. [Paper] [Bibtex] [Google Scholar]
RecResNet: "Recresnet: A recurrent residual cnn architecture for disparity map enhancement", Batsos et al., 3DV, 2018. [Paper] [Code] [Bibtex] [Google Scholar]
LRCR: "Left-right comparative recurrent model for stereo matching", Jie et al., CVPR, 2018. [Paper] [Bibtex] [Google Scholar]
FD-Fusion: "Fast stereo disparity maps refinement by fusion of data-based and model-based estimations", Ferrera et al., 3DV, 2019. [Paper] [Code] [Bibtex] [Google Scholar]
VRN: "Learned collaborative stereo refinement", Knobelreiter et al., IJCV, 2021. [Paper] [Bibtex] [Google Scholar]
NDR: "Neural disparity refinement for arbitrary resolution stereo", Aleotti et al., 3DV, 2021. [Paper] [Website] [Bibtex]
NDR v2: "Neural disparity refinement", Tosi et al., TPAMI, 2024. [Paper] [Website] [Bibtex]

</details>

End-to-End Architectures

<details open id="foundational"> <summary style="font-size: larger; font-weight: bold;">Foundational Deep Stereo Architectures</summary><ul> <details open> <summary style="font-size: larger; font-weight: bold;">CNN-based Cost Volume Aggregation</summary><ul> <details open> <summary style="font-size: larger; font-weight: bold;">2D Architectures</summary>

:triangular_flag_on_post: DispNet-C: "A large dataset to train convolutional networks for disparity, optical flow, and scene flow estimation", Mayer et al., CVPR, 2016. [Paper] [Bibtex] [Google Scholar]
CNN+CRF: "End-to-end training of hybrid CNN-CRF models for stereo", Knobelreiter et al., CVPR, 2017. [Paper] [Code] [Bibtex] [Google Scholar]
CRL: "Cascade residual learning: A two-stage convolutional neural network for stereo matching", Pang et al., CVPRW, 2017. [Paper] [Code] [Bibtex] [Google Scholar]
iResNet: "Learning for disparity estimation through feature constancy", Liang et al., CVPR, 2018. [Paper] [Code] [Bibtex] [Google Scholar]
DispNet-CSS: "Occlusions, motion and depth boundaries with a generic network for disparity, optical flow or scene flow estimation", Ilg et al., ECCV, 2018. [Paper] [Code] [Bibtex] [Google Scholar]
EdgeStereo: "Edgestereo: A context integrated residual pyramid network for stereo matching", Song et al., ACCV, 2018. [Paper] [Bibtex] [Google Scholar]
AutoDispNet-CSS: "Autodispnet: Improving disparity estimation with automl", Saikia et al., ICCV, 2019. [Paper] [Code] [Bibtex] [Google Scholar]
HD<sup>3</sup>: "Hierarchical discrete distribution decomposition for match density estimation", Yin et al., ICCV, 2019. [Paper] [Code] [Bibtex] [Google Scholar]
AANet: "AANet: Adaptive Aggregation Network for Efficient Stereo Matching", Xu et al., CVPR, 2020. [Paper] [Code] [Bibtex] [Google Scholar]
Bi3D: "Bi3D: Stereo Depth Estimation via Binary Classifications", Badki et al., CVPR, 2020. [Paper] [Code] [Bibtex] [Google Scholar]

</details> <details open class="nested-details"> <summary style="font-size: larger; font-weight: bold;">3D Architectures</summary>

:triangular_flag_on_post: GC-Net: "End-to-end learning of geometry and context for deep stereo regression", Kendall et al., ICCV, 2017. [Paper] [Bibtex] [Google Scholar]
ECA: "Deep stereo matching with explicit cost aggregation sub-architecture", Yu et al., AAAI, 2018. [Paper] [Bibtex] [Google Scholar]
PSMNet: "Pyramid Stereo Matching Network", Chang et al., CVPR, 2018. [Paper] [Code] [Bibtex] [Google Scholar]
PDSNet: "Practical deep stereo (pds): Toward applications-friendly deep stereo matching", Tulyakov et al., NeurIPS, 2018. [Paper] [Code] [Bibtex] [Google Scholar]
HSMNet: "Hierarchical deep stereo matching on high-resolution images", Yang et al., CVPR, 2019. [Paper] [Code] [Bibtex] [Google Scholar]
GWCNet: "Group-wise correlation stereo network", Guo et al., CVPR, 2019. [Paper] [Code] [Bibtex] [Google Scholar]
EMCUA: "Multi-Level Context Ultra-Aggregation for Stereo Matching", Nie et al., CVPR, 2019. [Paper] [Bibtex] [Google Scholar]
CSPN: "Learning depth with convolutional spatial propagation network", Cheng et al., TPAMI, 2019. [Paper] [Code] [Bibtex] [Google Scholar]
GA-Net: "Ga-net: Guided aggregation net for end-to-end stereo matching", Zhang et al., CVPR, 2019. [Paper] [Code] [Bibtex] [Google Scholar]
StereoDRNet: "Stereodrnet: Dilated residual stereonet", Chabra et al., CVPR, 2019. [Paper] [Bibtex] [Google Scholar]
CasStereo: "Cascade Cost Volume for High-Resolution Multi-View Stereo and Stereo Matching", Gu et al., CVPR, 2020. [Paper] [Code] [Bibtex] [Google Scholar]
WaveletStereo: "WaveletStereo: Learning Wavelet Coefficients of Disparity Map in Stereo Matching", Wang et al., CVPR, 2020. [Paper] [Bibtex] [Google Scholar]
CFNet: "CFNet: Cascade and Fused Cost Volume for Robust Stereo Matching", Shen et al., CVPR, 2021. [Paper] [Code] [Bibtex] [Google Scholar]
UASNet: "UASNet: Uncertainty Adaptive Sampling Network for Deep Stereo Matching", Mao et al., ICCV, 2021. [Paper] [Bibtex] [Google Scholar]
PCR: "Parallax contextual representations for stereo matching", Deng et al., ICIP, 2021. [Paper] [Bibtex] [Google Scholar]
PCWNet: "PCW-Net: Pyramid Combination and Warping Cost Volume for Stereo Matching", Shen et al., ECCV, 2022. [Paper] [Code] [Bibtex] [Google Scholar]
ICVP: "Image-Coupled Volume Propagation for Stereo Matching", Kwon et al., ICIP, 2023. [Paper] [Code] [Bibtex] [Google Scholar]
SEDNet: "Learning the distribution of errors in stereo matching for joint disparity and uncertainty estimation", Chen et al., CVPR, 2023. [Paper] [Code] [Bibtex] [Google Scholar]

</details> </ul> </details> <details open class="nested-details"> <summary style="font-size: larger; font-weight: bold;">Neural Architecture Search (NAS)</summary>

LEAStereo: "Hierarchical Neural Architecture Search for Deep Stereo Matching", Cheng et al., NeurIPS, 2020. [Paper] [Code] [Bibtex] [Google Scholar]
EASNet: "EASNet: searching elastic and accurate network architecture for stereo matching", Wang et al., ECCV, 2022. [Paper] [Code] [Bibtex] [Google Scholar]

</details> <details open class="nested-details"> <summary style="font-size: larger; font-weight: bold;">Iterative Optimization-based Architectures</summary>

:triangular_flag_on_post: RAFT-Stereo: "RAFT-Stereo: Multilevel Recurrent Field Transforms for Stereo Matching", Lipson et al., 3DV, 2021. [Paper] [Code] [Bibtex] [Google Scholar]
ORStereo: "Orstereo: Occlusion-aware recurrent stereo matching for 4k-resolution images", Hu et al., IROS, 2021. [Paper] [WebPage] [Bibtex] [Google Scholar]
SCV-Stereo: "SCV-Stereo: Learning Stereo Matching from a Sparse Cost Volume", Wang et al., ICIP, 2021. [Paper] [Code] [Bibtex] [Google Scholar]
CREStereo: "Practical Stereo Matching via Cascaded Recurrent Network with Adaptive Correlation", Li et al., CVPR, 2022. [Paper] [Code] [Bibtex] [Google Scholar]
EAI-Stereo: "EAI-Stereo: Error Aware Iterative Network for Stereo Matching", Zhao et al., ACCV, 2022. [Paper] [Code] [Bibtex] [Google Scholar]
IGEV-Stereo: "Iterative Geometry Encoding Volume for Stereo Matching", Xu et al., CVPR, 2023. [Paper] [Code] [Bibtex] [Google Scholar]
DLNR: "High-Frequency Stereo Matching Network", Zhao et al., CVPR, 2023. [Paper] [Code] [Bibtex] [Google Scholar]
Dynamic Stereo: "DynamicStereo: Consistent Dynamic Depth From Stereo Videos", Karaev et al., CVPR, 2023. [Paper] [Code] [Bibtex] [Google Scholar]
CREStereo++: "Uncertainty Guided Adaptive Warping for Robust and Efficient Stereo Matching", Jing et al., ICCV, 2023. [Paper] [Bibtex] [Google Scholar]
Selective-Stereo: "Selective-Stereo: Adaptive Frequency Information Selection for Stereo Matching", Wang et al., CVPR, 2024. [Paper] [Code] [Bibtex] [Google Scholar]
Any-Stereo: "Any-Stereo: Arbitrary Scale Disparity Estimation for Iterative Stereo Matching", Liang et al., AAAI, 2024. [Paper] [Code] [Bibtex] [Google Scholar]
MC-Stereo: "MC-Stereo: Multi-peak Lookup and Cascade Search Range for Stereo Matching", Feng et al., 3DV, 2024. [Paper] [Code] [Bibtex] [Google Scholar]
ICGNet: "Learning Intra-view and Cross-view Geometric Knowledge for Stereo Matching", Gong et al., CVPR, 2024. [Paper] [Code] [Bibtex] [Google Scholar]
MoCha-Stereo: "MoCha-Stereo: Motif Channel Attention Network for Stereo Matching", Chen et al., CVPR, 2024. [Paper] [Code] [Bibtex] [Google Scholar]
XR-Stereo: "Stereo Matching in Time: 100+ FPS Video Stereo Matching for Extended Reality", Cheng et al., WACV, 2024. [Paper] [Code] [Bibtex] [Google Scholar]
Temporally-Consistent Stereo: "Temporally Consistent Stereo Matching", Zeng et al., ECCV, 2024. [Paper] [Code] [Bibtex] [Google Scholar]
BiDA-Stereo: "Match-Stereo-Videos: Bidirectional Alignment for Consistent Dynamic Stereo Matching", Jing et al., ECCV, 2024. [Paper] [Code] [Bibtex] [WebPage] [Google Scholar]
QPDNet: "Disparity Estimation Using a Quad-Pixel Sensor", Wu et al., BMVC, 2024. [Paper] [WebPage] [Dataset] [Bibtex] [Google Scholar]
IGEV++: "IGEV++: Iterative Multi-range Geometry Encoding Volumes for Stereo Matching", Xu et al., arXiv, 2024. [Paper] [Code] [Bibtex] [Google Scholar]
Stereo Anything: "Stereo Anything: Unifying Stereo Matching with Large-Scale Mixed Data", Guo et al., arXiv, 2024. [Paper] [Code] [Bibtex] [Google Scholar]
Stereo Anywhere: "Stereo Anywhere: Robust Zero-Shot Deep Stereo Matching Even Where Either Stereo or Mono Fail", Bartolomei et al., arXiv, 2024. [Paper] [Code] [WebPage] [Bibtex] [Google Scholar]
AIO-Stereo: "All-in-One: Transferring Vision Foundation Models into Stereo Matching", Zhou et al., AAAI, 2025. [Paper] [Bibtex] [Google Scholar]

</details> <details open class="nested-details"> <summary style="font-size: larger; font-weight: bold;">Transformer-based Architectures</summary>

STTR: "Revisiting Stereo Depth Estimation From a Sequence-to-Sequence Perspective With Transformers", Li et al., ICCV, 2021. [Paper] [Code] [Bibtex] [Google Scholar]
CEST: "Context-enhanced stereo transformer", Guo et al., ECCV, 2022. [Paper] [Code] [Bibtex] [Google Scholar]
Chitransformer: "Chitransformer: Towards Reliable Stereo From Cues", Su et al., CVPR, 2022. [Paper] [Code] [Bibtex] [Google Scholar]
Dynamic Stereo: "DynamicStereo: Consistent Dynamic Depth From Stereo Videos", Karaev et al., CVPR, 2023. [Paper] [Code] [Bibtex] [Google Scholar]
GMStereo: "Unifying Flow, Stereo and Depth Estimation", Xu et al., TPAMI, 2023. [Paper] [Code] [Bibtex] [Google Scholar]
CroCo v2: "CroCo v2: Improved Cross-View Completion Pre-training for Stereo Matching and Optical Flow", Weinzaepfel et al., ICCV, 2023. [Paper] [Code] [Bibtex] [Google Scholar]
ELFNet: "Elfnet: Evidential local-global fusion for stereo matching", Lou et al., ICCV, 2023. [Paper] [Code] [Bibtex] [Google Scholar]
GOAT: "Global Occlusion-Aware Transformer for Robust Stereo Matching", Liu et al., WACV, 2024. [Paper] [Code] [Bibtex] [Google Scholar]
FormerStereo: "Learning Representations from Foundation Models for Domain Generalized Stereo Matching", Zhang et al., ECCV, 2024. [Paper] [Bibtex] [Google Scholar]

</details> <details open class="nested-details"> <summary style="font-size: larger; font-weight: bold;">Markov Random Field-based Architectures</summary>

LBPS: "Belief Propagation Reloaded: Learning BP-Layers for Labeling Problems", Knobelreiter et al., CVPR, 2020. [Paper] [Code] [Bibtex] [Google Scholar]
NMRF: "Neural Markov Random Field for Stereo Matching", Guan et al., CVPR, 2024. [Paper] [Code] [Bibtex] [Google Scholar]

</details> </ul> </details> <details open id="efficient-oriented"> <summary style="font-size: larger; font-weight: bold;">Efficient-Oriented Deep Stereo Architectures</summary><ul> <details open> <summary style="font-size: larger; font-weight: bold;">Compact Cost Volume Representation</summary>

StereoNet: "Stereonet: Guided hierarchical refinement for real-time edge-aware depth prediction", Khamis et al., ECCV, 2018. [Paper] [Code] [Bibtex] [Google Scholar]
Fast DS-CS: "Fast Deep Stereo with 2D Convolutional Processing of Cost Signatures", Yee et al., WACV, 2020. [Paper] [Code] [Bibtex] [Google Scholar]
DecNet: "A Decomposition Model for Stereo Matching", Yao et al., CVPR, 2021. [Paper] [Code] [Bibtex] [Google Scholar]
BTC: "Soft Cross Entropy Loss and Bottleneck Tri-Cost Volume For Efficient Stereo Depth Prediction", Nuanes et al., CVPRW, 2021. [Paper] [Bibtex] [Google Scholar]
ACVNet: "Attention Concatenation Volume for Accurate and Efficient Stereo Matching", Xu et al., CVPR, 2022. [Paper] [Code] [Bibtex] [Google Scholar]
PCVNet: "Parameterized Cost Volume for Stereo Matching", Zeng et al., ICCV, 2023. [Paper] [Code] [Bibtex] [Google Scholar]
IINet: "IINet: Implicit Intra-inter Information Fusion for Real-Time Stereo Matching", Li et al., AAAI, 2024. [Paper] [Code] [Bibtex] [Google Scholar]

</details> <details open> <summary style="font-size: larger; font-weight: bold;">Efficient Cost Volume Processing</summary>

DeepPruner: "Deeppruner: Learning efficient stereo matching via differentiable patchmatch", Duggal et al., ICCV, 2019. [Paper] [Code] [Bibtex] [Google Scholar]
CasStereo: "Cascade Cost Volume for High-Resolution Multi-View Stereo and Stereo Matching", Gu et al., CVPR, 2020. [Paper] [Code] [Bibtex] [Google Scholar]
MABNet: "MABNet: a lightweight stereo network based on multibranch adjustable bottleneck module", Xing et al., ECCV, 2020. [Paper] [Code] [Bibtex] [Google Scholar]
BGNet: "Bilateral Grid Learning for Stereo Matching Networks", Xu et al., CVPR, 2021. [Paper] [Code] [Bibtex] [Google Scholar]
Separable-Stereo: "Separable Convolutions for Optimizing 3D Stereo Networks", Rahim et al., ICIP, 2021. [Paper] [Code] [Bibtex] [Google Scholar]
TemporalStereo: "TemporalStereo: Efficient Spatial-Temporal Stereo Matching Network", Zhang et al., IROS, 2023. [Paper] [Code] [Bibtex] [Google Scholar]

</details> <details open> <summary style="font-size: larger; font-weight: bold;">Efficient Inference Schemes</summary>

Anytime: "Anytime stereo image depth estimation on mobile devices", Wang et al., ICRA, 2019. [Paper] [Code] [Bibtex] [Google Scholar]
StereoVAE: "StereoVAE: A lightweight stereo-matching system using embedded GPUs", Chang et al., ICRA, 2023. [Paper] [Bibtex] [Google Scholar]

</details> <details open> <summary style="font-size: larger; font-weight: bold;">Lightweight Network Architecture Design</summary>

NVStereoNet: "On the importance of stereo for accurate depth estimation: An efficient semi-supervised deep neural network approach", Smolyanskiy et al., CVPRW, 2018. [Paper] [Code] [Bibtex] [Google Scholar]
MadNet: "Real-Time Self-Adaptive Deep Stereo", Tonioni et al., CVPR, 2019. [Paper] [Code] [Bibtex] [Google Scholar]
FADNet: "Fadnet: A Fast and Accurate Network for Disparity Estimation", Wang et al., ICRA, 2020. [Paper] [Code] [Bibtex] [Google Scholar]
AAFS: "Attention-Aware Feature Aggregation for Real-time Stereo Matching on Edge Devices", Chang et al., ACCV, 2020. [Paper] [Code] [Bibtex] [Google Scholar]
HITNet: "HITNet: Hierarchical Iterative Tile Refinement Network for Real-time Stereo Matching", Tankovich et al., CVPR, 2021. [Paper] [Code] [Bibtex] [Google Scholar]
CoEX: "Correlate-and-Excite: Real-Time Stereo Matching via Guided Cost Volume Excitation", Bangunharcana et al., IROS, 2021. [Paper] [Code] [Bibtex] [Google Scholar]
RLStereo: "RLStereo: Real-time stereo matching based on reinforcement learning", Yang et al., TIP, 2021. [Paper] [Bibtex] [Google Scholar]
MobileStereoNet: "MobileStereoNet: Towards Lightweight Deep Networks for Stereo Matching", Shamsafar et al., WACV, 2022. [Paper] [Code] [Bibtex] [Google Scholar]
PBCStereo: "PBCStereo: A Compressed Stereo Network with Pure Binary Convolutional Operations", Cai et al., ACCV, 2022. [Paper] [Bibtex] [Google Scholar]
MadNet2: "Federated Online Adaptation for Deep Stereo", Poggi et al., CVPR, 2024. [Bibtex]
Distill-And-Prune: "Distill-then-prune: An Efficient Compression Framework for Real-time Stereo Matching Network on Edge Devices", Pan et al., ICRA, 2024. [Paper] [Bibtex] [Google Scholar]

</details> </ul> </details> <details open id="multi-task"> <summary style="font-size: larger; font-weight: bold;">Multi-Task Deep Stereo Architectures</summary><ul> <details open> <summary style="font-size: larger; font-weight: bold;">Normal-Assisted Stereo Matching</summary>

NA-Stereo: "Normal Assisted Stereo Depth Estimation", Kusupati et al., CVPR, 2020. [Paper] [Code] [Bibtex] [Google Scholar]
HITNet: "HITNet: Hierarchical Iterative Tile Refinement Network for Real-time Stereo Matching", Tankovich et al., CVPR, 2021. [Paper] [Code] [Bibtex] [Google Scholar]

</details> <details open> <summary style="font-size: larger; font-weight: bold;">Joint Stereo Matching and Optical Flow</summary>

Multi-Task Learning Using Uncertainty: "Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics", Kendall et al., CVPR, 2018. [Paper] [Bibtex] [Google Scholar]
BridgeDepthFlow: "Bridging Stereo Matching and Optical Flow via Spatiotemporal Correspondence", Lai et al., CVPR, 2019. [Paper] [Code] [Bibtex] [Google Scholar]
UnOS: "UnOS: Unified Unsupervised Optical-Flow and Stereo-Depth Estimation by Watching Videos", Wang et al., CVPR, 2019. [Paper] [Code] [Bibtex] [Google Scholar]
Feature-Level Collaboration: "Feature-Level Collaboration: Joint Unsupervised Learning of Optical Flow, Stereo Depth and Camera Motion", Chi et al., CVPR, 2021. [Paper] [Bibtex]
StereoFlowGAN: "StereoFlowGAN: Co-training for Stereo and Flow with Unsupervised Domain Adaptation", Xiong et al., BMVC, 2023. [Paper] [Bibtex] [Google Scholar]

</details> <details open> <summary style="font-size: larger; font-weight: bold;">Joint Stereo Matching and Semantic Segmentation</summary>

Segstereo: "Segstereo: Exploiting semantic information for disparity estimation", Yang et al., ECCV, 2018. [Paper] [Code] [Bibtex] [Google Scholar]
Multi-Task Learning Using Uncertainty: "Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics", Kendall et al., CVPR, 2018. [Paper] [Bibtex] [Google Scholar]
DSNet: "DSNet: Joint learning for scene segmentation and disparity estimation", Zhan et al., ICRA, 2019. [Paper] [Bibtex] [Google Scholar]
Dispsegnet: "Dispsegnet: Leveraging semantics for end-to-end learning of disparity estimation from stereo imagery", Zhang et al., RAL, 2019. [Paper] [Bibtex] [Google Scholar]
SSPCV-Net: "Semantic stereo matching with pyramid cost volumes", Wu et al., ICCV, 2019. [Paper] [Bibtex] [Google Scholar]
RSS-Net: "Real-time semantic stereo matching", Dovesi et al., ICRA, 2020. [Paper] [Bibtex] [Google Scholar]
SGNet: "SGNet: Semantics Guided Deep Stereo Matching", Chen et al., ACCV, 2020. [Paper] [Bibtex] [Google Scholar]

</details> <details open> <summary style="font-size: larger; font-weight: bold;">Joint Stereo Matching and Uncertainty</summary>

RCN: "Improved stereo matching with constant highway networks and reflective confidence learning", Shaked et al., CVPR, 2017. [Paper] [Code] [Bibtex] [Google Scholar]
UCN: "Unified confidence estimation networks for robust stereo matching", Kim et al., TIP, 2018. [Paper] [Bibtex] [Google Scholar]
ACN: "Adversarial confidence estimation networks for robust stereo matching", Kim et al., T-ITS, 2020. [Paper] [Bibtex] [Google Scholar]
AcfNet: "Adaptive Unimodal Cost Volume Filtering for Deep Stereo Matching", Zhang et al., AAAI, 2020. [Paper] [Code] [Bibtex] [Google Scholar]
Weak Adversarial Learning: "Leveraging a weakly adversarial paradigm for joint learning of disparity and confidence estimation", Poggi et al., ICPR, 2021. [Paper] [Bibtex] [Google Scholar]
Bayesian: "Joint estimation of depth and its uncertainty from stereo images using bayesian deep learning", Mehltretter, ISPRS, 2022. [Paper] [Bibtex] [Google Scholar]
SEDNet: "Learning the distribution of errors in stereo matching for joint disparity and uncertainty estimation", Chen et al., CVPR, 2023. [Paper] [Code] [Bibtex] [Google Scholar]

</details> <details open> <summary style="font-size: larger; font-weight: bold;"> Scene Flow </summary>

:triangular_flag_on_post: FlowNet3.0: "Occlusions, motion and depth boundaries with a generic network for disparity, optical flow or scene flow estimation", Ilg et al., ECCV, 2018. [Paper] [Code] [Bibtex] [Google Scholar]
DRISF: "Deep Rigid Instance Scene Flow", Ma et al., CVPR, 2019. [Paper] [Bibtex] [Google Scholar]
DeblurringSF: "Joint stereo video deblurring, scene flow estimation and moving object segmentation", Pan et al., TIP, 2019. [Paper] [Bibtex] [Google Scholar]
IOSF: "Learning Independent Object Motion From Unlabelled Stereoscopic Videos", Cao et al., TPAMI, 2019. [Paper] [Bibtex] [Google Scholar]
EPC++: "Every pixel counts++: Joint learning of geometry and motion with 3d holistic understanding", Luo et al., TPAMI, 2019. [Paper] [Code] [Bibtex] [Google Scholar]
SENSE: "Sense: A shared encoder network for scene-flow estimation", Jiang et al., ICCV, 2019. [Paper] [Code] [Bibtex] [Google Scholar]
StereoExpansion: "Upgrading Optical Flow to 3D Scene Flow through Optical Expansion", Yang et al., CVPR, 2020. [Paper] [Code] [Bibtex] [Google Scholar]
DWARF: "Learning end-to-end scene flow by distilling single tasks knowledge", Aleotti et al., AAAI, 2020. [Paper] [Code] [Bibtex] [Google Scholar]
SceneFlowFields++: "SceneFlowFields++: Multi-frame matching, visibility prediction, and robust interpolation for scene flow estimation", Schuster et al., IJCV, 2020. [Paper] [Bibtex] [Google Scholar]
Effiscene: "Effiscene: Efficient per-pixel rigidity inference for unsupervised joint learning of optical flow, depth, camera pose and motion segmentation", Jiao et al., CVPR, 2021. [Paper] [Bibtex] [Google Scholar]
RAFT-3D: "RAFT-3D: Scene Flow using Rigid-Motion Embeddings", Teed et al., CVPR, 2021. [Paper] [Code] [Bibtex] [Google Scholar]
RigidMask: "Learning to Segment Rigid Motions from Two Frames", Yang et al., CVPR, 2021. [Paper] [Code] [Bibtex] [Google Scholar]
Self-superflow: "Self-superflow: self-supervised scene flow prediction in stereo sequences", Bendig et al., ICIP, 2022. [Paper] [Bibtex] [Google Scholar]
CamLiFlow: "Learning optical flow and scene flow with bidirectional camera-lidar fusion", Liu et al., TPAMI, 2023. [Paper] [Code] [Bibtex] [Google Scholar]
M-FUSE: "M-fuse: Multi-frame fusion for scene flow estimation", Mehl et al., WACV, 2023. [Paper] [Code] [Bibtex] [Google Scholar]
OpticalExpansion: "Learning Optical Expansion from Scale Matching", Ling et al., CVPR, 2023. [Paper] [Bibtex] [Google Scholar]

</details> </ul> </details> <details open id="multi-modal"> <summary style="font-size: larger; font-weight: bold;">Beyond Visual Spectrum Deep Stereo Architectures</summary><ul> <details open> <summary style="font-size: larger; font-weight: bold;">Depth-Guided Sensor Stereo Networks</summary>

LidarStereoFusion: "High-precision depth estimation with the 3d lidar and stereo fusion", Park et al., ICRA, 2018. [Paper] [Bibtex] [Google Scholar]
GSD: "Guided stereo matching", Poggi et al., CVPR, 2019. [Paper] [Code] [Bibtex] [Google Scholar]
LidarStereoNet: "Noise-Aware Unsupervised Deep Lidar-Stereo Fusion", Cheng et al., CVPR, 2019. [Paper] [Code] [Bibtex] [Google Scholar]
Stereo-LiDAR-CCVNorm: "3d lidar and stereo fusion using stereo matching network with conditional cost volume normalization", Wang et al., IROS, 2019. [Paper] [Code] [Bibtex] [Google Scholar]
Pseudo-LiDAR++: "Pseudo-LiDAR++: Accurate Depth for 3D Object Detection in Autonomous Driving", You et al., ICLR, 2020. [Paper] [Code] [Bibtex] [Google Scholar]
Listereo: "Listereo: Generate dense depth maps from lidar and stereo imagery", Zhang et al., ICRA, 2020. [Paper] [Bibtex] [Google Scholar]
S<sup>3</sup>: "S<sup>3</sup>: Learnable sparse signal superdensity for guided depth estimation", Huang et al., CVPR, 2021. [Paper] [Bibtex] [Google Scholar]
LSMD-Net: "LSMD-Net: LiDAR-Stereo Fusion with Mixture Density Network for Depth Sensing", Yin et al., ACCV, 2022. [Paper] [Bibtex] [Google Scholar]
CamLiFlow: "Learning optical flow and scene flow with bidirectional camera-lidar fusion", Liu et al., TPAMI, 2023. [Paper] [Code] [Bibtex] [Google Scholar]
Active Disparity Sampling: "Active Disparity Sampling for Stereo Matching With Adjoint Network", Zhang et al., TIP, 2023. [Paper] [Bibtex] [Google Scholar]
VPP: "Active Stereo Without Pattern Projector", Bartolomei et al., ICCV, 2023. [Paper] [Code] [Bibtex] [Google Scholar]
SDG-Depth: "Stereo-LiDAR Depth Estimation with Deformable Propagation and Learned Disparity-Depth Conversion", Li et al., ICRA, 2024. [Paper] [Code] [Bibtex] [Google Scholar]
VPP-Extended: "Stereo-Depth Fusion through Virtual Pattern Projection", Bartolomei et al., arXiv, 2024. [Paper] [Code] [WebPage] [Bibtex] [Google Scholar]
D3RoMa: "D3RoMa: Disparity Diffusion-based Depth Sensing for Material-Agnostic Robotic Manipulation", Wei et al., CoRL, 2024. [Paper] [WebPage] [Bibtex] [Google Scholar]

</details> <details open> <summary style="font-size: larger; font-weight: bold;">Pattern Projection-Based Stereo Networks</summary>

ActiveStereoNet: "ActiveStereoNet: End-to-End Self-Supervised Learning for Active Stereo Systems", Zhang et al., ECCV, 2018. [Paper] [Bibtex] [Google Scholar]
Polka Lines: "Polka Lines: Learning Structured Illumination and Reconstruction for Active Stereo", Baek et al., CVPR, 2021. [Paper] [Bibtex] [Google Scholar]
Activezero: "Activezero: Mixed domain learning for active stereovision with zero annotation", Liu et al., CVPR, 2022. [Paper] [Code] [Bibtex] [Google Scholar]
MonoStereoFusion: "Depth Estimation by Combining Binocular Stereo and Monocular Structured-Light", Xu et al., CVPR, 2022. [Paper] [Code] [Bibtex] [Google Scholar]
Activezero++: "Activezero++: Mixed domain learning stereo and confidence-based depth completion with zero annotation", Chen et al., TPAMI, 2023. [Paper] [Bibtex] [Google Scholar]
ASGrasp: "ASGrasp: Generalizable Transparent Object Reconstruction and 6-DoF Grasp Detection from RGB-D Active Stereo Camera", Shi et al., ICRA, 2024. [Paper] [WebPage] [Bibtex] [Google Scholar]

</details> <details open> <summary style="font-size: larger; font-weight: bold;">Cross-Spectral Stereo Networks</summary>

CS-Stereo: "Deep material-aware cross-spectral stereo matching", Zhi et al., CVPR, 2018. [Paper] [Code] [Bibtex] [Google Scholar]
UCSS: "Unsupervised cross-spectral stereo matching by learning to synthesize", Liang et al., AAAI, 2019. [Paper] [Code - Unofficial] [Bibtex] [Google Scholar]
SS-MCE: "There and back again: Self-supervised multispectral correspondence estimation", Walters et al., ICRA, 2021. [Paper] [Bibtex] [Google Scholar]
RGB-MS: "RGB-Multispectral matching: Dataset, learning methodology, evaluation", Tosi et al., CVPR, 2022. [Paper] [Code] [Bibtex] [Google Scholar]
DPS-Net: "DPS-Net: Deep Polarimetric Stereo Depth Estimation", Tian et al., ICCV, 2023. [Paper] [Code] [Bibtex] [Google Scholar]
CrossSP: "Unsupervised Cross-Spectrum Depth Estimation by Visible-Light and Thermal Cameras", Guo et al., T-ITS, 2023. [Paper] [Code] [Bibtex] [Google Scholar]
Gated-RCCB: "Cross-spectral Gated-RGB Stereo Depth Estimation", Brucker et al., CVPR, 2024. [Paper] [WebPage] [Bibtex] [Google Scholar]

</details> <details open> <summary style="font-size: larger; font-weight: bold;">Event Stereo Networks</summary>

Event-IntensityStereo: "Event-Intensity Stereo: Estimating Depth by the Best of Both Worlds", Mostafavi et al., ICCV, 2021. [Paper] [Code] [Bibtex] [Google Scholar]
SE-CFF: "Stereo Depth From Events Cameras: Concentrate and Focus on the Future", Nam et al., CVPR, 2022. [Paper] [Code] [Bibtex] [Google Scholar]
SCSNet: "Selection and Cross Similarity for Event-Image Deep Stereo", Cho et al., ECCV, 2022. [Paper] [Code] [Bibtex] [Google Scholar]
DTC-SPADE: "Discrete Time Convolution for Fast Event-Based Stereo", Zhang et al., CVPR, 2022. [Paper] [Bibtex] [Google Scholar]
EFS: "Event-image fusion stereo using cross-modality feature propagation", Cho et al., AAAI, 2022. [Paper] [Bibtex] [Google Scholar]
ADES: "Learning Adaptive Dense Event Stereo From the Image Domain", Cho et al., CVPR, 2023. [Paper] [Bibtex] [Google Scholar]
SAFE: "Depth From Asymmetric Frame-Event Stereo: A Divide-and-Conquer Approach", Chen et al., WACV, 2024. [Paper] [Bibtex] [Google Scholar]
TemporalEventStereo: "Temporal Event Stereo via Joint Learning with Stereoscopic Flow", Cho et al., ECCV, 2024. [Paper] [Code] [Bibtex] [Google Scholar]
EventVPPStereo: "LiDAR-Event Stereo Fusion with Hallucinations", Bartolomei et al., ECCV, 2024. [Paper] [WebPage] [Code] [Bibtex] [Google Scholar]
ZEST: "Zero-Shot Event-Intensity Asymmetric Stereo via Visual Prompting from Image Domain", Lou et al., NeurIPS, 2024. [Paper] [Code] [Bibtex] [Google Scholar]

</details> <details open> <summary style="font-size: larger; font-weight: bold;">Gated Stereo Networks</summary>

GatedStereo: "Gated Stereo: Joint Depth Estimation from Gated and Wide-Baseline Active Stereo Cues", Walz et al., CVPR, 2023. [Paper] [WebPage] [Bibtex] [Google Scholar]
Gated-RCCB: "Cross-spectral Gated-RGB Stereo Depth Estimation", Brucker et al., CVPR, 2024. [Paper] [WebPage] [Bibtex] [Google Scholar]

</details> <details open> <summary style="font-size: larger; font-weight: bold;">Stereo Networks with Echoes </summary>

StereoEchoes: "Stereo Depth Estimation with Echoes", Zhang et al., ECCV, 2022. [Paper] [Bibtex] [Google Scholar]

</details> </ul> </details>

Architectural Analysis

OpenStereo: "OpenStereo: A Comprehensive Benchmark for Stereo Matching and Strong Baseline", Guo et al., arXiv, 2023. [Paper] [Code] [Bibtex] [Google Scholar]
"Exploring the Usage of Pre-trained Features for Stereo Matching", Zhang et al., IJCV, 2024. [Paper] [Bibtex] [Google Scholar]

Challenges & Solutions

<details open id="over-smoothing"> <summary style="font-size: larger; font-weight: bold;">Addressing the Over-Smoothing Issue</summary>

SM-CDE: "On the over-smoothing problem of cnn based disparity estimation", Chen et al., ICCV, 2019. [Paper] [Bibtex] [Google Scholar]
AcfNet: "Adaptive Unimodal Cost Volume Filtering for Deep Stereo Matching", Zhang et al., AAAI, 2020. [Paper] [Code] [Bibtex] [Google Scholar]
CDN: "Wasserstein Distances for Stereo Disparity Estimation", Garg et al., NeurIPS, 2020. [Paper] [Code] [Bibtex] [Google Scholar]
SMD-Nets: "SMD-Nets: Stereo Mixture Density Networks", Tosi et al., CVPR, 2021. [Paper] [Code] [Bibtex] [Google Scholar]
NDR: "Neural disparity refinement for arbitrary resolution stereo", Aleotti et al., 3DV, 2021. [Paper] [Website] [Bibtex] [Google Scholar]
LaC: "Local similarity pattern and cost self-reassembling for deep stereo matching networks", Liu et al., AAAI, 2022. [Paper] [Code] [Bibtex] [Google Scholar]
ADL: "Adaptive Multi-Modal Cross-Entropy Loss for Stereo Matching", Xu et al., CVPR, 2024. [Paper] [Code] [Bibtex] [Google Scholar]
NDR v2: "Neural disparity refinement", Tosi et al., TPAMI, 2024. [Paper] [Website] [Bibtex] [Google Scholar]
Stereo Risk: "Stereo Risk: A Continuous Modeling Approach to Stereo Matching", Liu et al., ICML, 2024. [Paper] [Bibtex] [Google Scholar]

</details> <details open id="missing-gt"> <summary style="font-size: larger; font-weight: bold;">Missing Ground Truth Depth</summary><ul> <details open> <summary style="font-size: larger; font-weight: bold;">Self-Supervised</summary>

:triangular_flag_on_post: MonoDepth/StereoDepth: "Unsupervised monocular depth estimation with left-right consistency", Godard et al., CVPR, 2017. [Paper] [Code] [Bibtex] [Google Scholar]
USM: "Unsupervised learning of stereo matching", Zhou et al., ICCV, 2017. [Paper] [Bibtex] [Google Scholar]
OASM-Net: "Occlusion aware stereo matching via cooperative unsupervised learning", Li et al., ACCV, 2018. [Paper] [Bibtex] [Google Scholar]
UnOS: "UnOS: Unified Unsupervised Optical-Flow and Stereo-Depth Estimation by Watching Videos", Wang et al., CVPR, 2019. [Paper] [Code] [Bibtex] [Google Scholar]
BridgeDepthFlow: "Bridging Stereo Matching and Optical Flow via Spatiotemporal Correspondence", Lai et al., CVPR, 2019. [Paper] [Code] [Bibtex] [Google Scholar]
Correspondence Consistency: "Unsupervised stereo matching using confidential correspondence consistency", Joung et al., T-ITS, 2019. [Paper] [Bibtex] [Google Scholar]
Flow2Stereo: "Flow2Stereo: Effective Self-Supervised Learning of Optical Flow and Stereo Matching", Liu et al., CVPR, 2020. [Paper] [Code] [Bibtex] [Google Scholar]
PASMNet: "Parallax attention for unsupervised stereo correspondence learning", Wang et al., TPAMI, 2020. [Paper] [Code] [Bibtex] [Google Scholar]
MultiscopicVision: "Stereo matching by self-supervision of multiscopic vision", Yuan et al., IROS, 2021. [Paper] [WebPage] [Bibtex] [Google Scholar]
Feature-Level Collaboration: "Feature-Level Collaboration: Joint Unsupervised Learning of Optical Flow, Stereo Depth and Camera Motion", Chi et al., CVPR, 2021. [Paper] [Bibtex]
Occlusion-Aware Stereo: "Unsupervised Occlusion-Aware Stereo Matching With Directed Disparity Smoothing", Li et al., T-ITS, 2022. [Paper] [Bibtex] [Google Scholar]

</details> <details open > <summary style="font-size: larger; font-weight: bold;">Cross-Framework/Proxy Supervision</summary>

Reversing-Stereo: "Reversing the cycle: self-supervised deep stereo through enhanced monocular distillation", Aleotti et al., ECCV, 2020. [Paper] [Code] [Bibtex] [Google Scholar]
Revealing-Stereo: "Revealing the Reciprocal Relations between Self-Supervised Stereo and Monocular Depth Estimation", Chen et al., ICCV, 2021. [Paper] [Bibtex] [Google Scholar]
TiO-Depth: "Two-in-one depth: Bridging the gap between monocular and binocular self-supervised depth estimation", Zhou et al., ICCV, 2023. [Paper] [Code] [Bibtex] [Google Scholar]
NeRF-Supervised Stereo: "NeRF-Supervised Deep Stereo", Tosi et al., CVPR, 2023. [Paper] [Website] [Code] [Bibtex]
SAG: "Self-Assessed Generation: Trustworthy Label Generation for Optical Flow and Stereo Matching in Real-world", Ling et al., arXiv, 2024. [Paper] [Code] [Bibtex] [Google Scholar]

</details> </ul> </details> <details open id="domain-shift"> <summary style="font-size: larger; font-weight: bold;">Domain Shift</summary><ul> <details open> <summary style="font-size: larger; font-weight: bold;">Zero-shot Generalization</summary><ul> <details open> <summary style="font-size: larger; font-weight: bold;">Domain-Agnostic Feature Modeling</summary>

:triangular_flag_on_post: DSM-Net: "Domain-invariant Stereo Matching Networks", Zhang et al., ECCV, 2020. [Paper] [Code] [Bibtex] [Google Scholar]
FCStereo: "Revisiting Domain Generalized Stereo Matching Networks From a Feature Consistency Perspective", Zhang et al., CVPR, 2022. [Paper] [Code] [Bibtex] [Google Scholar]
GraftNet: "GraftNet: Towards Domain Generalized Stereo Matching With a Broad-Spectrum and Task-Oriented Feature", Liu et al., CVPR, 2022. [Paper] [Code] [Bibtex] [Google Scholar]
ITSA: "ITSA: An Information-Theoretic Approach to Automatic Shortcut Avoidance and Domain Generalization in Stereo Matching Networks", Chuah et al., CVPR, 2022. [Paper] [Code] [Bibtex] [Google Scholar]
HVT: "Domain Generalized Stereo Matching via Hierarchical Visual Transformation", Chang et al., CVPR, 2023. [Paper] [Code] [Bibtex] [Google Scholar]
MRL-Stereo: "Masked representation learning for domain generalized stereo matching", Rao et al., CVPR, 2023. [Paper] [Bibtex] [Google Scholar]
HODC: "Hierarchical Object-Aware Dual-Level Contrastive Learning for Domain Generalized Stereo Matching", Miao et al., NeurIPS, 2024. [Paper] [Website] [Bibtex] [Google Scholar]

</details> <details open> <summary style="font-size: larger; font-weight: bold;">Non-parametric Cost Volumes</summary>

MS-Nets: "Matching-space Stereo Networks for Cross-domain Generalization", Cai et al., 3DV, 2020. [Paper] [Code] [Bibtex] [Google Scholar]
ARStereo: "Revisiting Non-Parametric Matching Cost Volumes for Robust and Generalizable Stereo Matching", Cheng et al., NeurIPS, 2022. [Paper] [Code] [Bibtex] [Google Scholar]

</details> <details open> <summary style="font-size: larger; font-weight: bold;">Integration of Additional Geometric Cues</summary>

NDR: "Neural disparity refinement for arbitrary resolution stereo", Aleotti et al., 3DV, 2021. [Paper] [Website] [Bibtex] [Google Scholar]
EVHS: "Expansion of Visual Hints for Improved Generalization in Stereo Matching", Pilzer et al., WACV, 2023. [Paper] [Bibtex] [Google Scholar]
NDR v2: "Neural disparity refinement", Tosi et al., TPAMI, 2024. [Paper] [Website] [Bibtex] [Google Scholar]

</details> <details open> <summary style="font-size: larger; font-weight: bold;">Real-World Monocular to Synthetic Stereo Data</summary>

LSSI: "Learning Stereo from Single Images", Watson et al., ECCV, 2020. [Paper] [Code] [Bibtex] [Google Scholar]
NeRF-Supervised Stereo: "NeRF-Supervised Deep Stereo", Tosi et al., CVPR, 2023. [Paper] [Website] [Code] [Bibtex] [Google Scholar]
SAG: "Self-Assessed Generation: Trustworthy Label Generation for Optical Flow and Stereo Matching in Real-world", Ling et al., arXiv, 2024. [Paper] [Code] [Bibtex] [Google Scholar]

</details> <details open> <summary style="font-size: larger; font-weight: bold;">Knowledge Transfer</summary>

DKT-Stereo: "Robust Synthetic-to-Real Transfer for Stereo Matching", Zhang et al., CVPR, 2024. [Paper] [Code] [Bibtex] [Google Scholar]

</details> <details open> <summary style="font-size: larger; font-weight: bold;">Monocular-Stereo Integration</summary>

Stereo Anywhere: "Stereo Anywhere: Robust Zero-Shot Deep Stereo Matching Even Where Either Stereo or Mono Fail", Bartolomei et al., arXiv, 2024. [Paper] [Code] [WebPage] [Bibtex] [Google Scholar]

</details> <details open> <summary style="font-size: larger; font-weight: bold;">Data Augmentation Analysis </summary>

NLCA-Net_v2: "Rethinking training strategy in stereo matching", Rao et al., TNNLS, 2022. [Paper] [Code] [Bibtex] [Google Scholar]

</details> </ul> </details> <details open> <summary style="font-size: larger; font-weight: bold;">Offline Adaptation</summary>

Confidence-guided Adaptation: "Unsupervised adaptation for deep stereo", Tonioni et al., ICCV, 2017. [Paper] [Code] [Bibtex1] [Bibtex2] [Google Scholar]
Open-World Stereo: "Open-world stereo video matching with deep rnn", Zhong et al., ECCV, 2018. [Paper] [Bibtex] [Google Scholar]
ZOLE: "Zoom and Learn: Generalizing Deep Stereo Matching to Novel Domain", Pang et al., CVPR, 2018. [Paper] [Code] [Bibtex] [Google Scholar]
StereoGAN: "StereoGAN: Bridging Synthetic-to-Real Domain Gap by Joint Optimization of Domain Translation and Stereo Matching", Liu et al., CVPR, 2020. [Paper] [Code] [Bibtex] [Google Scholar]
AdaStereo: "AdaStereo: A Simple and Efficient Approach for Adaptive Stereo Matching", Song et al., CVPR, 2021. [Paper] [Bibtex] [Google Scholar]
UnDAF: "UnDAF: A General Unsupervised Domain Adaptation Framework for Disparity or Optical Flow Estimation", Wang et al., CVPR, 2021. [Paper] [Code] [Bibtex] [Google Scholar]
RAG: "Continual Stereo Matching of Continuous Driving Scenes With Growing Architecture", Zhang et al., CVPR, 2022. [Paper] [Bibtex] [Google Scholar]
UCFNet: "Digging Into Uncertainty-Based Pseudo-Label for Robust Stereo Matching", Shen et al., TPAMI, 2023. [Paper] [Code] [Bibtex] [Google Scholar]
StereoFlowGAN: "StereoFlowGAN: Co-training for Stereo and Flow with Unsupervised Domain Adaptation", Xiong et al., BMVC, 2023. [Paper] [Bibtex] [Google Scholar]
Few-Shot Stereo Matching: "Few-Shot Stereo Matching with High Domain Adaptability Based on Adaptive Recursive Network", Wu et al., IJCV, 2023. [Paper] [Code] [Bibtex] [Google Scholar]
RAG-Continual: "Reusable Architecture Growth for Continual Stereo Matching", Zhang et al., TPAMI, 2024. [Paper] [Code] [Bibtex] [Google Scholar]

</details> <details open> <summary style="font-size: larger; font-weight: bold;">Online Continual Adaptation</summary>

:triangular_flag_on_post: MadNet: "Real-Time Self-Adaptive Deep Stereo", Tonioni et al., CVPR, 2019. [Paper] [Code] [Bibtex] [Google Scholar]
Learning2Adapt: "Learning to adapt for stereo", Tonioni et al., CVPR, 2019. [Paper] [Code] [Bibtex] [Google Scholar]
AoHNet: "Faster Self-adaptive Deep Stereo", Wang et al., ACCV, 2020. [Paper] [Bibtex] [Google Scholar]
Continual Adaptation for Deep Stereo: "Continual adaptation for deep stereo", Poggi et al., TPAMI, 2021. [Paper] [Code] [Bibtex] [Google Scholar]
PointFix: "PointFix: Learning to Fix Domain Bias for Robust Online Stereo Adaptation", Kim et al., ECCV, 2022. [Paper] [Bibtex] [Google Scholar]
FedStereo: "Federated Online Adaptation for Deep Stereo", Poggi et al., CVPR, 2024. [Paper] [Code] [Bibtex] [Google Scholar]

</details> </details> <details open id="adverse-weather"> <summary style="font-size: larger; font-weight: bold;">Adverse Weather</summary>

FoggyStereo: "FoggyStereo: Stereo Matching with Fog Volume Representation", Yao et al., CVPR, 2022. [Paper] [Code] [Bibtex] [Google Scholar]
DDF: "Dusk Till Dawn: Self-supervised Nighttime Stereo Depth Estimation using Visual Foundation Models", Vankadari et al., ICRA, 2024. [Paper] [Code] [Bibtex] [Google Scholar]

</details> <details open id="tom"> <summary style="font-size: larger; font-weight: bold;">Transparent and Reflective (ToM) Surfaces </summary>

DDF: "Deep Depth Fusion for Black, Transparent, Reflective and Texture-Less Objects", Chai et al., ICRA, 2020. [Paper] [Bibtex] [Google Scholar]
TA-Stereo: "Transparent Objects: A Corner Case in Stereo Matching", Wu et al., ICRA, 2023. [Paper] [Code] [Bibtex] [Google Scholar]
Depth4ToM: "Learning Depth Estimation for Transparent and Mirror Surfaces", Costanzino et al., ICCV, 2023. [Paper] [Code] [Bibtex] [Google Scholar]
ASGrasp: "ASGrasp: Generalizable Transparent Object Reconstruction and 6-DoF Grasp Detection from RGB-D Active Stereo Camera", Shi et al., ICRA, 2024. [Paper] [WebPage] [Bibtex] [Google Scholar]
D3RoMa: "D3RoMa: Disparity Diffusion-based Depth Sensing for Material-Agnostic Robotic Manipulation", Wei et al., CoRL, 2024. [Paper] [WebPage] [Bibtex] [Google Scholar]
Stereo Anywhere: "Stereo Anywhere: Robust Zero-Shot Deep Stereo Matching Even Where Either Stereo or Mono Fail", Bartolomei et al., arXiv, 2024. [Paper] [Code] [WebPage] [Bibtex] [Google Scholar]

</details> <details open id="asymmetric"> <summary style="font-size: larger; font-weight: bold;">Asymmetric Stereo </summary>

Visually-Imbalanced Stereo: "Visually Imbalanced Stereo Matching", Liu et al., CVPR, 2020. [Paper] [Code] [Bibtex]
NDR: "Neural disparity refinement for arbitrary resolution stereo", Aleotti et al., 3DV, 2021. [Paper] [Website] [Bibtex] [Google Scholar]
DA-AS: "Degradation-agnostic Correspondence from Resolution-asymmetric Stereo", Chen et al., CVPR, 2022. [Paper] [Bibtex] [Google Scholar]
SASS: "Unsupervised Deep Asymmetric Stereo Matching with Spatially-Adaptive Self-Similarity", Song et al., CVPR, 2023. [Paper] [Bibtex] [Google Scholar]
NDR v2: "Neural disparity refinement", Tosi et al., TPAMI, 2024. [Paper] [Website] [Bibtex] [Google Scholar]

</details> <details open id="temporal-consistency"> <summary style="font-size: larger; font-weight: bold;">Temporal Consistency</summary>

TemporalStereo: "TemporalStereo: Efficient Spatial-Temporal Stereo Matching Network", Zhang et al., IROS, 2023. [Paper] [Code] [Bibtex] [Google Scholar]
Dynamic Stereo: "DynamicStereo: Consistent Dynamic Depth From Stereo Videos", Karaev et al., CVPR, 2023. [Paper] [Code] [Bibtex] [Google Scholar]
XR-Stereo: "Stereo Matching in Time: 100+ FPS Video Stereo Matching for Extended Reality", Cheng et al., WACV, 2024. [Paper] [Code] [Bibtex] [Google Scholar]
TemporalEventStereo: "Temporal Event Stereo via Joint Learning with Stereoscopic Flow", Cho et al., ECCV, 2024. [Paper] [Code] [Bibtex] [Google Scholar]
Temporally-Consistent Stereo: "Temporally Consistent Stereo Matching", Zeng et al., ECCV, 2024. [Paper] [Code] [Bibtex] [Google Scholar]
BiDA-Stereo: "Match-Stereo-Videos: Bidirectional Alignment for Consistent Dynamic Stereo Matching", Jing et al., ECCV, 2024. [Paper] [Code] [Bibtex] [WebPage] [Google Scholar]
BiDA-Video: "Match Stereo Videos via Bidirectional Alignment", Jing et al., arXiv, 2024. [Paper] [Code] [Bibtex] [WebPage] [Google Scholar]

</details> <details open id="continuous-estimation"> <summary style="font-size: larger; font-weight: bold;">Continuous Estimation Problem</summary>

SMD-Nets: "SMD-Nets: Stereo Mixture Density Networks", Tosi et al., CVPR, 2021. [Paper] [Code] [Bibtex] [Google Scholar]
NDR: "Neural disparity refinement for arbitrary resolution stereo", Aleotti et al., 3DV, 2021. [Paper] [Website] [Bibtex] [Google Scholar]
NDR v2: "Neural disparity refinement", Tosi et al., TPAMI, 2024. [Paper] [Website] [Bibtex] [Google Scholar]

</details> </ul> </details>

Confidence Estimation

<details open> <summary style="font-size: larger; font-weight: bold;">Machine Learning Approaches</summary><ul> <details open> <summary style="font-size: larger; font-weight: bold;">Disparity-based</summary>

ENS7: "Ensemble learning for confidence measures in stereo vision", Haeusler et al., CVPR, 2013. [Paper] [Bibtex] [Google Scholar]
O1: "Learning a general-purpose confidence measure based on O(1) features and a smarter aggregation strategy for semi global matching", Poggi et al., 3DV, 2016. [Paper] [Bibtex1] [Bibtex2] [Google Scholar]

</details> <details open> <summary style="font-size: larger; font-weight: bold;">Cost Volume-based</summary>

ENS23: "Ensemble learning for confidence measures in stereo vision", Haeusler et al., CVPR, 2013. [Paper] [Bibtex] [Google Scholar]
GCP: "Learning to detect ground control points for improving the accuracy of stereo matching", Spyropoulos et al., CVPR, 2014. [Paper] [Bibtex1] [Bibtex2] [Google Scholar]
LEV: "Leveraging stereo matching with learning-based confidence measures", Park et al., CVPR, 2015. [Paper] [Bibtex1] [Bibtex2] [Google Scholar]
FA: "Feature augmentation for learning confidence measure in stereo matching", Kim et al., TIP, 2017. [Paper] [Bibtex] [Google Scholar]

</details> <details open> <summary style="font-size: larger; font-weight: bold;">Model-based</summary>

Multi-Task Learning Using Uncertainty: "Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics", Kendall et al., CVPR, 2018. [Paper] [Bibtex] [Google Scholar]

</details> <details open> <summary style="font-size: larger; font-weight: bold;">SGM-specific</summary>

SGMForest: "Learning to fuse proposals from multiple scanline optimizations in semi-global matching", Schonberger et al., ECCV, 2018. [Paper] [Bibtex] [Google Scholar]

</details> </ul> </details> <details open> <summary style="font-size: larger; font-weight: bold;">Deep Learning Approaches</summary><ul> <details open> <summary style="font-size: larger; font-weight: bold;">Disparity-based</summary>

CCNN: "Learning from scratch a confidence measure", Poggi et al., BMVC, 2016. [Paper] [Code] [Bibtex] [Google Scholar]
PBCP: "Patch Based Confidence Prediction for Dense Disparity Map", Seki et al., BMVC, 2016. [Paper] [Bibtex] [Google Scholar]
EFN/LFN: "Stereo matching confidence learning based on multi-modal convolution neural networks", Fu et al., RFMI, 2017. [Paper] [Bibtex] [Google Scholar]
MMC: "Learning confidence measures by multi-modal convolutional neural networks", Fu et al., WACV, 2018. [Paper] [Bibtex] [Google Scholar]
LGC/ConfNet: "Beyond local reasoning for stereo confidence estimation with deep learning", Tosi et al., ECCV, 2018. [Paper] [Code] [Bibtex] [Google Scholar]
Self-adapting Confidence: "Self-adapting confidence estimation for stereo", Poggi et al., ECCV, 2020. [Paper] [Code] [Bibtex] [Google Scholar]
SEDNet: "Learning the distribution of errors in stereo matching for joint disparity and uncertainty estimation", Chen et al., CVPR, 2023. [Paper] [Code] [Bibtex] [Google Scholar]

</details> <details open> <summary style="font-size: larger; font-weight: bold;">Cost Volume-based</summary>

RCN: "Improved stereo matching with constant highway networks and reflective confidence learning", Shaked et al., CVPR, 2017. [Paper] [Code] [Bibtex] [Google Scholar]
MPN: "Deep stereo confidence prediction for depth estimation", Kim et al., ICIP, 2017. [Paper] [Bibtex] [Google Scholar]
UCN: "Unified confidence estimation networks for robust stereo matching", Kim et al., TIP, 2018. [Paper] [Bibtex] [Google Scholar]
LAF: "LAF-Net: Locally adaptive fusion networks for stereo confidence estimation", Kim et al., CVPR, 2019. [Paper] [Bibtex] [Google Scholar]
CRNN: "Pixel-Wise Confidences for Stereo Disparities Using Recurrent Neural Networks", Gul et al., BMVC, 2019. [Paper] [Bibtex] [Google Scholar]
CVA: "CNN-based cost volume analysis as confidence measure for dense matching", Mehltretter et al., ICCVW, 2019. [Paper] [Bibtex] [Google Scholar]
Disparity Plane Sweep: "Modeling Stereo-Confidence Out of the End-to-End Stereo-Matching Network via Disparity Plane Sweep", Lee et al., AAAI, 2024. [Paper] [Bibtex] [Google Scholar]
ACN: "Adversarial confidence estimation networks for robust stereo matching", Kim et al., T-ITS, 2020. [Paper] [Bibtex] [Google Scholar]

</details> <details open> <summary style="font-size: larger; font-weight: bold;">Multiple Confidence Fusion</summary>

Learning Local Consistency: "Learning to predict stereo reliability enforcing local consistency of confidence maps", Poggi et al., CVPR, 2017. [Paper] [Bibtex] [Google Scholar]
EMC: "Even More Confident Predictions With Deep Machine-Learning", Poggi et al., CVPRW, 2017. [Paper] [Bibtex] [Google Scholar]

</details> <details open> <summary style="font-size: larger; font-weight: bold;">Sensor-based</summary>

Lidar-Confidence: "Unsupervised confidence for LiDAR depth maps and applications", Conti et al., IROS, 2022. [Paper] [Code] [Bibtex] [Google Scholar]

</details> </ul> </details>

Applications

(Not an exhaustive list)

Deep3D: "Deep3D: Fully automatic 2D-to-3D video conversion with deep convolutional neural networks", Xie et al., ECCV, 2016. [Paper] [Code] [Bibtex] [Google Scholar]
Geometry to the Rescue: "Unsupervised CNN for single view depth estimation: Geometry to the rescue", Garg et al., ECCV, 2016. [Paper] [Bibtex] [Google Scholar]
MonoDepth/StereoDepth: "Unsupervised monocular depth estimation with left-right consistency", Godard et al., CVPR, 2017. [Paper] [Code] [Bibtex] [Google Scholar]
SVSM: "Single View Stereo Matching", Luo et al., CVPR, 2018. [Paper] [Code] [Bibtex] [Google Scholar]
MonoResMatch: "Learning monocular depth estimation infusing traditional stereo knowledge", Tosi et al., CVPR, 2019. [Paper] [Code] [Bibtex] [Google Scholar]
IDA-3D: "IDA-3D: Instance-depth-aware 3D object detection from stereo vision for autonomous driving", Peng et al., CVPR, 2020. [Paper] [Code] [Bibtex] [Google Scholar]
LIGA-Stereo: "LIGA-Stereo: Learning LiDAR Geometry Aware Representations for Stereo-based 3D Detector", Guo et al., ICCV, 2021. [Paper] [Code] [Bibtex] [Google Scholar]
StereoPIFu: "StereoPIFu: Depth aware clothed human digitization via stereo vision", Hong et al., CVPR, 2021. [Paper] [Code] [Bibtex] [Google Scholar]
Smart Glasses: "A Practical Stereo Depth System for Smart Glasses", Wang et al., CVPR, 2023. [Paper] [Bibtex] [Google Scholar]
Cross Attention Renderer: "Learning to render novel views from wide-baseline stereo pairs", Du et al., CVPR, 2023. [Paper] [Code] [Bibtex] [Google Scholar]
SDCNet: "Stereo-augmented depth completion from a single RGB-LiDAR image", Choi et al., ICRA, 2021. [Paper] [Bibtex] [Google Scholar]
VPPDC: "Revisiting Depth Completion from a Stereo Matching Perspective for Cross-domain Generalization", Bartolomei et al., 3DV, 2024. [Paper] [Code] [Bibtex] [Google Scholar]
CoPoNeRF: "Unifying Correspondence, Pose and NeRF for Pose-Free Novel View Synthesis from Stereo Pairs", Hong et al., CVPR, 2024. [Paper] [Code] [Bibtex] [Google Scholar]
DSGN: "Deep Stereo Geometry Network for 3D Object Detection", Chen et al., CVPR, 2020. [Paper] [Code] [Bibtex] [Google Scholar]
StereoNeRF: "Generalizable Novel-View Synthesis using a Stereo Camera", Lee et al., CVPR, 2024. [Paper] [WebSite] [Bibtex] [Google Scholar]
Online Stereo Rectification: "Flow-Guided Online Stereo Rectification for Wide Baseline Stereo", Kumar et al., CVPR, 2024. [Paper] [WebSite] [Bibtex] [Google Scholar]
StereoDiffusion: "StereoDiffusion: Training-Free Stereo Image Generation Using Latent Diffusion Models", Wang et al., CVPRW, 2024. [Paper] [Code] [Bibtex] [Google Scholar]
GS2Mesh: "GS2Mesh: Surface Reconstruction from Gaussian Splatting via Novel Stereo Views", Wolf et al., ECCV, 2024. [Paper] [WebPage] [Bibtex] [Google Scholar]
StereoGS: "Self-Evolving Depth-Supervised 3D Gaussian Splatting from Rendered Stereo Pairs", Safadoust et al., BMVC, 2024. [Paper] [WebPage] [Bibtex] [Google Scholar]
Binocular3DGS: "Binocular3DGS: Binocular-Guided 3D Gaussian Splatting with View Consistency for Sparse View Synthesis", Han et al., NeurIPS, 2024. [Paper] [WebPage] [Code] [Bibtex] [Google Scholar]

Workshops

NTIRE 2024: HR Depth from Images of Specular and Transparent Surfaces. P. Z. Ramirez, F. Tosi, L. Di Stefano, R. Timofte, A. Costanzino, M. Poggi, S. Salti, S. Mattoccia; CVPRW 2024, Seattle, US [Website]
NTIRE 2023: HR Depth from Images of Specular and Transparent Surfaces. P. Z. Ramirez, F. Tosi, L. Di Stefano, R. Timofte, A. Costanzino, M. Poggi, S. Salti, S. Mattoccia; CVPRW 2023, Vancouver, Canada [Website]
Robust Vision Challenge (ROB), Zendel et al., ECCV 2022 [Website]

<h2 id="tutorials-talks"> Tutorials & Talks </h2>

Deep Stereo Matching in the Twenties. M. Poggi, F. Tosi; CVPR 2024, Seattle, US [Website]
Facing depth estimation in-the-wild with deep networks. M. Poggi, F. Tosi, F. Aleotti, K. Batsos, P. Mordohai, S. Mattoccia; ECCV 2020, SEC, Glasgow [Website]
Learning and understanding single image depth estimation in the wild. M. Poggi, F. Tosi, F. Aleotti, S. Mattoccia, C. Godard, J. Watson, M. Firman, G.J. Brostow; CVPR 2020, Seattle, Washington, US [Website]
Learning-based depth estimation from stereo and monocular images: successes, limitations and future challenges. M. Poggi, F. Tosi, K. Batsos, P. Mordohai, S. Mattoccia; CVPR 2019, Long Beach, California, US [Website]
Learning-based depth estimation from stereo and monocular images: successes, limitations and future challenges. M. Poggi, F. Tosi, K. Batsos, P. Mordohai, S. Mattoccia; 3DV 2018, Verona, Italy [Website]
Lecture: Computer Vision (Prof. Andreas Geiger, University of Tübingen). [Preliminaries] [Block Matching] [Siamese Networks] [Spatial Regularization] [End-to-End Learning]

<h2 id="demos"> Demos </h2>

Robust depth perception through Virtual Pattern Projection (VPP). L. Bartolomei, M. Poggi, F. Tosi, A. Conti, S. Mattoccia; CVPR 2024 DEMO, Seattle, US [Website] [Code] [Flyer]

:fountain_pen: Citation

Please consider citing this list if you find this repository useful:

@article{tosi2024survey,
  title={A Survey on Deep Stereo Matching in the Twenties},
  author={Fabio Tosi and Luca Bartolomei and Matteo Poggi},
  journal={arXiv preprint arXiv:2407.07816},
  year={2024},
  url={https://arxiv.org/abs/2407.07816},
  note={Extended version of CVPR 2024 Tutorial "Deep Stereo Matching in the Twenties" (https://sites.google.com/view/stereo-twenties)},
}

@article{poggi2021synergies,
  title={On the synergies between machine learning and binocular stereo for depth estimation from images: a survey},
  author={Poggi, Matteo and Tosi, Fabio and Batsos, Konstantinos and Mordohai, Philippos and Mattoccia, Stefano},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  volume={44},
  number={9},
  pages={5314--5334},
  year={2021},
  publisher={IEEE}
}

@article{poggi2021confidence,
  title={On the confidence of stereo matching in a deep-learning era: a quantitative evaluation},
  author={Poggi, Matteo and Kim, Seungryong and Tosi, Fabio and Kim, Sunok and Aleotti, Filippo and Min, Dongbo and Sohn, Kwanghoon and Mattoccia, Stefano},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  volume={44},
  number={9},
  pages={5293--5313},
  year={2021},
  publisher={IEEE}
}