<p align="center">
  <h1 align="center">[ECCV24] OGNI-DC: Robust Depth Completion with Optimization-Guided Neural Iterations</h1>
  <p align="center">
    <a href="https://zuoym15.github.io/"><strong>Yiming Zuo</strong></a>
    ·
    <a href="https://www.cs.princeton.edu/~jiadeng/"><strong>Jia Deng</strong></a>
  </p>
  <p align="center">
    <a href="https://pvl.cs.princeton.edu/">Princeton Vision & Learning Lab (PVL)</a>
  </p>
</p>
<h3 align="center"><a href="https://arxiv.org/abs/2406.11711">Paper</a></h3>
<p align="center">
  <a href="https://arxiv.org/abs/2406.11711">
    <img src="./figures/Pipeline.png" alt="Logo" width="98%">
  </a>
</p>

## Environment Setup
We recommend creating a Python environment with Anaconda:

```bash
conda create -n OGNIDC python=3.8
conda activate OGNIDC

# For CUDA Version == 11.3
conda install pytorch==1.10.1 torchvision==0.11.2 torchaudio==0.10.1 cudatoolkit=11.3 -c pytorch -c conda-forge
pip install mmcv==1.4.4 -f https://download.openmmlab.com/mmcv/dist/cu113/torch1.10/index.html
pip install mmsegmentation==0.22.1
pip install timm tqdm thop tensorboardX tensorboard opencv-python ipdb h5py ipython Pillow==9.5.0 plyfile einops
```
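Before moving on, an optional quick check that the install is usable (the exact version strings printed depend on your setup):

```bash
# Optional sanity check: PyTorch version, the CUDA runtime it was built with,
# whether a GPU is visible, and the installed mmcv version
python -c "import torch, mmcv; print(torch.__version__, torch.version.cuda, torch.cuda.is_available(), mmcv.__version__)"
```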
### NVIDIA Apex

We use NVIDIA Apex for multi-GPU training. Apex can be installed as follows:

```bash
git clone https://github.com/NVIDIA/apex
cd apex
git reset --hard 4ef930c1c884fdca5f472ab2ce7cb9b505d26c1a
conda install cudatoolkit-dev=11.3 -c conda-forge
pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
```

You may run into the error `ImportError: cannot import name 'container_abcs' from 'torch._six'`. In that case, change line 14 of `apex/apex/_amp_state.py` to `import collections.abc as container_abcs` and re-install Apex.
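A minimal import check after installation (the `amp_C` module is only present if the `--cuda_ext` build succeeded):

```bash
# Optional: verify that the Apex Python package and its fused CUDA extensions import cleanly
python -c "from apex import amp; import amp_C; print('apex OK')"
```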
### (Optional) Deformable Convolution V2 (DCNv2)

Needed only if you use NLSPN. Build and install the DCN module:

```bash
cd src/model/deformconv
sh make.sh
```
## Datasets

Create a folder named `datasets` and put all datasets under it.
### NYUv2

We use the preprocessed NYUv2 HDF5 dataset provided by Fangchang Ma:

```bash
cd datasets
wget http://datasets.lids.mit.edu/sparse-to-dense/data/nyudepthv2.tar.gz
tar -xvf nyudepthv2.tar.gz
```

After that, you will get a data structure as follows:

```
nyudepthv2_h5
├── train
│   ├── basement_0001a
│   │   ├── 00001.h5
│   │   └── ...
│   └── ...
└── val
    └── official
        ├── 00001.h5
        └── ...
```
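To confirm the extraction worked, you can peek at one sample; the sparse-to-dense HDF5 files store `rgb` and `depth` arrays (the path below assumes the layout shown above):

```bash
# Optional: print the dataset keys and array shapes of one validation sample
python -c "import h5py; f = h5py.File('datasets/nyudepthv2_h5/val/official/00001.h5', 'r'); print({k: f[k].shape for k in f.keys()})"
```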
### KITTI

Download the following files and unzip them under the `kitti_depth` folder: `data_depth_annotated`, `data_depth_velodyne`, and `data_depth_selection`.
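As a sketch, assuming the three zips from the official KITTI depth completion benchmark page have already been downloaded into `datasets/kitti_depth`, one way to unpack them is shown below; check the result against the directory tree further down and move folders if an archive already ships its own top-level directory:

```bash
cd datasets/kitti_depth
# -d places each archive's contents into the folder name expected by the layout below
unzip data_depth_annotated.zip -d data_depth_annotated
unzip data_depth_velodyne.zip  -d data_depth_velodyne
unzip data_depth_selection.zip -d data_depth_selection
```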
Finally, download the KITTI raw images:

```bash
cd datasets/kitti_depth
wget https://github.com/youmi-zym/CompletionFormer/files/12575038/kitti_archives_to_download.txt
wget -i kitti_archives_to_download.txt -P kitti_raw/
cd kitti_raw
unzip "*.zip"
```
The overall data directory is structured as follows:

```
kitti_depth
├── data_depth_annotated
│   ├── train
│   └── val
├── data_depth_velodyne
│   ├── train
│   └── val
├── data_depth_selection
│   ├── test_depth_completion_anonymous
│   ├── test_depth_prediction_anonymous
│   └── val_selection_cropped
└── kitti_raw
    ├── 2011_09_26
    ├── 2011_09_28
    ├── 2011_09_29
    ├── 2011_09_30
    └── 2011_10_03
```
### VOID

First, download the zip files into the `datasets` folder (you can use `gdown`):

https://drive.google.com/open?id=1rzTFD35OCxMIguxLDcBxuIdhh5T2G7h4
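If the share link resolves to a Drive folder of zip files, a possible command-line download looks like the sketch below (assumes `gdown` 4.x; if the link is a single file instead, drop `--folder` and pass the file id directly):

```bash
cd datasets
pip install gdown
# Assumes the id above is a Drive *folder* containing the void_*.zip archives
gdown --folder "https://drive.google.com/drive/folders/1rzTFD35OCxMIguxLDcBxuIdhh5T2G7h4"
```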
Then, under the `datasets` folder, run:

```bash
sh ../unzip_void.sh
```
Finally, the file structure should be:

```
void_release
├── void_150
│   ├── data
│   │   ├── birthplace_of_internet
│   │   └── ...
│   ├── test_absolute_pose.txt
│   └── ...
├── void_500
│   └── ...
└── void_1500
    └── ...
```
### DDAD

We use the dataset pre-processed by the VPP4DC authors. You can download it from:

https://drive.google.com/open?id=1y8Rt3Hld8zVTSKxx9d9yYXSzr5niKN7i

Unzip it to get the following file structure:
```
DDAD
└── pregenerated
    └── val
        ├── gt
        ├── hints
        ├── intrinsics
        └── rgb
```
## Reproduce Results in the Paper

Download the checkpoints from:

https://drive.google.com/drive/folders/1LWrb1uFcJ5SGJdS8a9aqeyzkRMYyECRt?usp=sharing

and put them under the `checkpoints` folder.
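As with the VOID download above, `gdown` can fetch a Drive folder from the command line; a sketch (run from the repository root so the `-O` target matches the expected `checkpoints` folder):

```bash
pip install gdown
# Download the released checkpoints into ./checkpoints
gdown --folder "https://drive.google.com/drive/folders/1LWrb1uFcJ5SGJdS8a9aqeyzkRMYyECRt" -O checkpoints
```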
### Testing

```bash
cd src

# NYU in-domain (500 points)
sh testing_scripts/test_nyu.sh

# NYUv2 sparsity-level generalization (5~20,000 points)
sh testing_scripts/test_nyu_sparse_inputs.sh

# KITTI sparsity-level generalization (8~64 lines)
sh testing_scripts/test_kitti_sparse_inputs.sh

# Zero-shot test on VOID
sh testing_scripts/test_void.sh

# Zero-shot test on DDAD
sh testing_scripts/test_ddad.sh

# Example command for generating a KITTI online server submission file
sh testing_scripts/test_kitti_server_submit.sh
```
## Training from Scratch

### Resource requirements

Training on NYU requires one 24GB GPU (e.g., RTX 3090) and takes ~3 days. Training on KITTI requires eight 48GB GPUs (e.g., RTX A6000) and takes ~7 days.
### Training

```bash
cd src

# NYU
sh training_scripts/train_nyu_generalizable.sh

# KITTI
sh training_scripts/train_kitti_generalizable.sh

# NYU best in-domain performance
sh training_scripts/train_nyu_best_performance.sh

# KITTI online server (L1)
sh training_scripts/train_kitti_best_performance_l1.sh

# KITTI online server (L1+L2)
sh training_scripts/train_kitti_best_performance_l1+l2.sh
```
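The pip requirements above include tensorboard/tensorboardX, so if your training run writes TensorBoard summaries you can monitor progress with something like the command below; `<experiment_dir>` is a placeholder for wherever your run's outputs are written:

```bash
# Replace <experiment_dir> with the output directory used by your training run
tensorboard --logdir <experiment_dir>
```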
## Acknowledgement

This codebase is developed based on [CompletionFormer](https://github.com/youmi-zym/CompletionFormer) by Youmin Zhang et al. We thank the authors for making their code public.
## Citation

If you find our work helpful, please consider citing our paper:

```bibtex
@inproceedings{ognidc,
  title={OGNI-DC: Robust Depth Completion with Optimization-Guided Neural Iterations},
  author={Zuo, Yiming and Deng, Jia},
  booktitle={European Conference on Computer Vision (ECCV)},
  pages={78--95},
  year={2024}
}
```