Temporally Consistent Online Depth Estimation in Dynamic Scenes

Project Page | Paper

This is the official repo for our work Temporally Consistent Online Depth Estimation in Dynamic Scenes, accepted at WACV 2023.

If you find CODD relevant, please cite

@inproceedings{li2023temporally,
  title={Temporally consistent online depth estimation in dynamic scenes},
  author={Li, Zhaoshuo and Ye, Wei and Wang, Dilin and Creighton, Francis X and Taylor, Russell H and Venkatesh, Ganesh and Unberath, Mathias},
  booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision},
  pages={3018--3027},
  year={2023}
}

Environment Setup

CODD builds on several excellent open-source libraries, including PyTorch, PyTorch3D, MMCV, MMSegmentation, and lietorch.

Example setup commands (tested on Ubuntu 20.04 and 22.04):

conda create --name codd python=3.8 -y
conda activate codd
pip install scipy pyyaml terminaltables natsort
pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu113 # pytorch
pip install --no-index --no-cache-dir pytorch3d -f https://dl.fbaipublicfiles.com/pytorch3d/packaging/wheels/py38_cu113_pyt1121/download.html # pytorch3d
pip install mmcv-full==1.7.0 -f https://download.openmmlab.com/mmcv/dist/cu113/torch1.12/index.html # mmcv
pip install mmsegmentation # mmseg
pip install git+https://github.com/princeton-vl/lietorch.git # lietorch -- this will take a while
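
After installation, a quick sanity check (not part of the repo) confirms that the core dependencies import and that CUDA is visible:

# quick sanity check (not part of CODD): verify the core dependencies import and CUDA is visible
import torch, torchvision, mmcv, mmseg, pytorch3d, lietorch
print(torch.__version__, torch.cuda.is_available())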

Pretrained Weights

Dataset Used

Configuration

For more details and examples, please see the configs folder.

Network

In CODD, you can configure your model in a modular manner. The network is typically specified as follows:

model = dict(
    type='ConsistentOnlineDynamicDepth',
    stereo=dict(
        type='HITNetMF',  # enter your choice of stereo network
        ...  # model specific configs
    ),
    motion=dict(
        type="Motion",  # enter your choice of motion network
        ...  # model specific configs
    ),
    fusion=dict(
        type="Fusion",  # enter your choice of fusion network
        ...  # model specific configs
    )
)

If only the stereo network is needed, you can simply comment out the motion and fusion networks, as sketched below. You can also swap out any of the individual networks with your own implementation.
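
For example, a stereo-only variant keeps just the stereo entry (a sketch based on the structure above; the model-specific configs are still elided):

# stereo-only sketch: the motion and fusion entries are simply commented out
model = dict(
    type='ConsistentOnlineDynamicDepth',
    stereo=dict(
        type='HITNetMF',  # enter your choice of stereo network
        ...  # model specific configs
    ),
    # motion=dict(...),  # disabled: no motion network
    # fusion=dict(...),  # disabled: no fusion network
)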

Dataset

In each dataset config, there are several variables that need to be specified.

The rest of the variables are already set, but feel free to adjust them if you want to customize; an illustrative sketch follows below.
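
A minimal sketch of what a dataset config might contain; the key names below are illustrative assumptions, not the repo's actual schema, so check the files under configs for the real fields:

# illustrative only -- the actual field names live in the configs folder
data = dict(
    train=dict(
        data_root='PATH_TO_DATASET',      # root folder of the images and ground truth
        split_file='PATH_TO_SPLIT_FILE',  # see "Split Files" below for the line format
    ),
    ...  # val/test entries and the remaining preset variables
)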

Train/Inference

The training config has the following format:

_base_ = [
    'PATH_TO_MODEL_CONFIG', 'PATH_TO_DATA_CONFIG',
    'default_runtime.py', 'PATH_TO_SCHEDULE_CONFIG'
]

Modify configs/train_config.py to point to the desired model and dataset configs, for example:
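
A hypothetical instance of train_config.py: the model config path matches configs/models/codd.py used elsewhere in this README, while the data and schedule paths remain placeholders:

# hypothetical example -- substitute the data and schedule configs you actually use
_base_ = [
    'models/codd.py', 'PATH_TO_DATA_CONFIG',
    'default_runtime.py', 'PATH_TO_SCHEDULE_CONFIG'
]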

The inference config has the following format:

_base_ = [
    'PATH_TO_MODEL_CONFIG', 'PATH_TO_DATA_CONFIG',
    'default_runtime.py'
]

Modify configs/inference.py to point to the desired model and dataset configs.

Training

Inference

There are two inference modes

To run inference

Optional arguments:

Others

Split Files

The split files use the following format:

LEFT_IMAGE RIGHT_IMAGE DISPARITY_IMAGE OPTICAL_FLOW DISPARITY_CHANGE OPTICAL_FLOW_OCCLUSION DISPARITY_FRAME2_in_FRAME1 DISPARITY_OCCLUSION

The split files can be generated using utils/generate_split_files.py.
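
As a reference for the layout above, here is a minimal sketch (not part of the repo) that reads a split file into per-sample dictionaries, assuming the fields are whitespace-separated as shown:

# minimal sketch (not part of CODD): parse the 8 whitespace-separated columns listed above
FIELDS = [
    'left_image', 'right_image', 'disparity_image', 'optical_flow',
    'disparity_change', 'optical_flow_occlusion',
    'disparity_frame2_in_frame1', 'disparity_occlusion',
]

def read_split(path):
    samples = []
    with open(path) as f:
        for line in f:
            parts = line.split()
            if not parts:
                continue  # skip blank lines
            samples.append(dict(zip(FIELDS, parts)))
    return samples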

Visualize Point Cloud

To visualize the 3D point cloud generated from a depth map, use the script utils/vis_point_cloud.py.
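
For reference, back-projecting a depth map with pinhole intrinsics follows the standard formula X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy. A minimal NumPy sketch of that step, independent of utils/vis_point_cloud.py:

# minimal sketch (independent of utils/vis_point_cloud.py): back-project a depth map to 3D points
import numpy as np

def depth_to_points(depth, fx, fy, cx, cy):
    """depth: (H, W) array in metric units; returns (H*W, 3) points in the camera frame."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinates
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)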

Benchmark Speed

To benchmark speed, run the following command:

python benchmark.py configs/models/codd.py
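
If you prefer a standalone measurement, a generic GPU timing loop (not the repo's benchmark.py) looks like the sketch below; model and inputs are placeholders for any torch module and its input tensors:

# generic timing sketch (not benchmark.py): average forward-pass latency of a torch model on GPU
import time
import torch

def measure_latency(model, inputs, warmup=10, iters=100):
    model.eval()
    with torch.no_grad():
        for _ in range(warmup):   # warm-up excludes CUDA init / cuDNN autotuning
            model(*inputs)
        torch.cuda.synchronize()
        start = time.perf_counter()
        for _ in range(iters):
            model(*inputs)
        torch.cuda.synchronize()
    return (time.perf_counter() - start) / iters  # seconds per forward pass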

Disclaimer

The majority of CODD is licensed under CC-BY-NC; however, portions of the project are available under separate license terms: https://github.com/princeton-vl/RAFT-3D is licensed under the BSD-3-Clause license.