NVDS (ICCV 2023) & NVDS+ (TPAMI 2024) 🚀🚀🚀

🎉🎉🎉 Welcome to the NVDS GitHub repository! 🎉🎉🎉

This repository is the official PyTorch implementation of the ICCV 2023 paper "Neural Video Depth Stabilizer" (NVDS).

Authors: Yiran Wang<sup>1</sup>, Min Shi<sup>1</sup>, Jiaqi Li<sup>1</sup>, Zihao Huang<sup>1</sup>, Zhiguo Cao<sup>1</sup>, Jianming Zhang<sup>2</sup>, Ke Xian<sup>3</sup>, Guosheng Lin<sup>3</sup>

Project Page | arXiv | Video | Video (Chinese) | Poster | Supp | VDW Dataset | VDW Toolkits

TPAMI 2024 "NVDS+: Towards Efficient and Versatile Neural Stabilizer for Video Depth Estimation" (NVDS+)

Authors: Yiran Wang<sup>1</sup>, Min Shi<sup>1</sup>, Jiaqi Li<sup>1</sup>, Chaoyi Hong<sup>1</sup>, Zihao Huang<sup>1</sup>, Juewen Peng<sup>3</sup>, Zhiguo Cao<sup>1</sup>, Jianming Zhang<sup>2</sup>, Ke Xian<sup>1</sup>, Guosheng Lin<sup>3</sup>

Institutes: <sup>1</sup>Huazhong University of Science and Technology, <sup>2</sup>Adobe Research, <sup>3</sup>Nanyang Technological University

Paper | arXiv | Video | Video (Chinese) | Supp

😎 Highlights

NVDS is the first plug-and-play stabilizer that removes flicker from the predictions of any single-image depth model without extra effort. We also introduce a large-scale dataset, Video Depth in the Wild (VDW), which consists of 14,203 videos with over two million frames, making it the largest natural-scene video depth dataset to our knowledge. Don't forget to star this repo if you find it interesting!

💦 License and Releasing Policy

⚡ Updates and Todo List

🌼 Abstract

Video depth estimation aims to infer temporally consistent depth. Some methods achieve temporal consistency by finetuning a single-image depth model during test time using geometry and re-projection constraints, which is inefficient and not robust. An alternative approach is to learn how to enforce temporal consistency from data, but this requires well-designed models and sufficient video depth data. To address these challenges, we propose a plug-and-play framework called Neural Video Depth Stabilizer (NVDS) that stabilizes inconsistent depth estimations and can be applied to different single-image depth models without extra effort. We also introduce a large-scale dataset, Video Depth in the Wild (VDW), which consists of 14,203 videos with over two million frames, making it the largest natural-scene video depth dataset to our knowledge. We evaluate our method on the VDW dataset as well as two public benchmarks and demonstrate significant improvements in consistency, accuracy, and efficiency compared to previous approaches. Our work serves as a solid baseline and provides a data foundation for learning-based video depth models. We will release our dataset and code for future research.

<p align="center"> <img src="PDF/fig1-pipeline.PNG" width="100%"> </p>
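To make the notion of flicker concrete, here is a minimal sketch of how temporal inconsistency in per-frame depth can be measured and reduced. It is an illustration only: the `flicker` and `stabilize` helpers and the sample values are hypothetical, and the moving average is a simple stand-in, not the learned NVDS network.

```python
def flicker(depths):
    """Mean absolute change between consecutive depth values of one pixel."""
    diffs = [abs(b - a) for a, b in zip(depths, depths[1:])]
    return sum(diffs) / len(diffs)


def stabilize(depths, radius=1):
    """Stand-in stabilizer: sliding-window mean over neighboring frames.

    NOT the NVDS network; it only mimics the plug-and-play idea of
    post-processing the output of an arbitrary single-image depth model.
    """
    out = []
    for i in range(len(depths)):
        window = depths[max(0, i - radius): i + radius + 1]
        out.append(sum(window) / len(window))
    return out


# Hypothetical depth of one pixel in a static scene, predicted frame by frame
# by a single-image model (values invented for illustration).
raw = [2.0, 2.4, 1.9, 2.5, 2.0, 2.4, 1.9]
smooth = stabilize(raw)

print("flicker before:", flicker(raw))
print("flicker after: ", flicker(smooth))
```

The stabilizer here only sees depth values, so it works with any upstream depth model. A plain average also blurs real depth changes in dynamic scenes, which is one reason to learn the stabilization from video depth data instead.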

🔨 Installation

🔥 Demo & Inference

πŸ” Evaluations on NYUDV2

🎯 Evaluations on VDW Test Set

🍻 Star History

Star History Chart

🍭 Acknowledgement

We thank the authors for releasing PyTorch, MiDaS, DPT, GMFlow, SegFormer, VSS-CFFM, Mask2Former, PySceneDetect, and FFmpeg. Thanks for their solid contributions and cheers to the community.

📧 Citation

@InProceedings{NVDS,
    author    = {Wang, Yiran and Shi, Min and Li, Jiaqi and Huang, Zihao and Cao, Zhiguo and Zhang, Jianming and Xian, Ke and Lin, Guosheng},
    title     = {Neural Video Depth Stabilizer},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2023},
    pages     = {9466-9476}}

@ARTICLE{NVDSPLUS,
  author={Wang, Yiran and Shi, Min and Li, Jiaqi and Hong, Chaoyi and Huang, Zihao and Peng, Juewen and Cao, Zhiguo and Zhang, Jianming and Xian, Ke and Lin, Guosheng},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence}, 
  title={NVDS$^{\mathbf{+}}$: Towards Efficient and Versatile Neural Stabilizer for Video Depth Estimation}, 
  year={2024},
  pages={1-18},
  doi={10.1109/TPAMI.2024.3476387}}