<div align="center"> <h2>🏡Know Your Neighbors: Improving Single-View Reconstruction via Spatial Vision-Language Reasoning</h2>Rui Li<sup>1</sup> · Tobias Fischer<sup>1</sup> · Mattia Segu<sup>1</sup> · Marc Pollefeys<sup>1</sup> <br> Luc Van Gool<sup>1</sup> · Federico Tombari<sup>2,3</sup>
<sup>1</sup>ETH Zürich · <sup>2</sup>Google · <sup>3</sup>Technical University of Munich
CVPR 2024
<a href="https://arxiv.org/abs/2404.03658"><img src='https://img.shields.io/badge/arXiv-KYN-red' alt='Paper PDF'></a> <a href='https://ruili3.github.io/kyn/'><img src='https://img.shields.io/badge/Project_Page-KYN-green' alt='Project Page'></a> <a href='https://huggingface.co/'><img src='https://img.shields.io/badge/Hugging_Face-KYN (coming soon)-yellow' alt='Hugging Face'></a>
</div>

This work presents Know-Your-Neighbors (KYN), a single-view 3D reconstruction method that disambiguates occluded scene geometry by utilizing Vision-Language semantics and spatial reasoning.
🔗 Environment Setup
# python virtual environment
python -m venv kyn
source kyn/bin/activate
pip install -r requirements.txt
🚀 Quick Start
Download our pre-trained model and the LSeg model, and put them into ./checkpoints. Then run the demo:
python scripts/demo.py --img media/example/0000.png --model_path checkpoints/kyn.pt --save_path /your/save/path
Here, --img specifies the input image path, --model_path is the model checkpoint path, and --save_path is where the resulting depth map, BEV map, and 3D voxel grids are stored.
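Before launching the demo, it can help to confirm that the input image and the checkpoint are actually in place. The following is a minimal sketch and not part of the repository: `build_demo_command` is a hypothetical helper that checks the two input paths and assembles the command shown above, using only the `--img`, `--model_path`, and `--save_path` flags documented here.

```python
import sys
from pathlib import Path


def build_demo_command(img, model_path, save_path):
    """Return the demo command as an argument list after checking the inputs.

    Hypothetical helper (not part of the KYN repository). It only uses the
    flags documented in this README and does not run anything itself.
    """
    img, model_path = Path(img), Path(model_path)
    # Collect every missing input so the error reports all of them at once.
    missing = [str(p) for p in (img, model_path) if not p.is_file()]
    if missing:
        raise FileNotFoundError("Missing input file(s): " + ", ".join(missing))
    return [
        sys.executable, "scripts/demo.py",
        "--img", str(img),
        "--model_path", str(model_path),
        "--save_path", str(save_path),
    ]
```

From the repository root, the returned list can be passed to `subprocess.run(cmd, check=True)`, for example with `media/example/0000.png` and `checkpoints/kyn.pt` as inputs.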
📁 Dataset Setup
We use the KITTI-360 dataset and process it as follows:
- Register at https://www.cvlibs.net/datasets/kitti-360/index.php and download perspective images, fisheye images, raw Velodyne scans, calibrations, and vehicle poses. The required KITTI-360 official scripts & data are:
download_2d_fisheye.zip
download_2d_perspective.zip
download_3d_velodyne.zip
calibration.zip
data_poses.zip
- Preprocess with the Python script below. It rectifies the fisheye views, resizes all images, and stores them in separate folders:
python datasets/kitti_360/preprocess_kitti_360.py --data_path ./KITTI-360 --save_path ./KITTI-360
- The final folder structure should look like:
KITTI-360
├── calibration
├── data_poses
├── data_2d_raw
│   ├── 2013_05_28_drive_0003_sync
│   │   ├── image_00
│   │   │   ├── data_192x640
│   │   │   └── data_rect
│   │   ├── image_01
│   │   ├── image_02
│   │   │   ├── data_192x640_0x-15
│   │   │   └── data_rgb
│   │   └── image_03
│   └── ...
└── data_3d_raw
    ├── 2013_05_28_drive_0003_sync
    └── ...
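A quick way to confirm that preprocessing produced the layout above is to check for the listed directories. The sketch below is an illustration rather than a script shipped with the repository: `missing_kitti360_dirs` is a hypothetical helper, and it only checks the folders that appear in the tree for one sequence. The tree does not show the contents of `image_01` and `image_03`, so only their existence is checked.

```python
from pathlib import Path

# Directory names copied from the folder tree in this README.
EXPECTED_TOP_LEVEL = ["calibration", "data_poses", "data_2d_raw", "data_3d_raw"]
EXPECTED_PER_SEQUENCE = [
    "image_00/data_192x640",
    "image_00/data_rect",
    "image_01",
    "image_02/data_192x640_0x-15",
    "image_02/data_rgb",
    "image_03",
]


def missing_kitti360_dirs(root, sequence="2013_05_28_drive_0003_sync"):
    """Return the expected directories (relative to root) that do not exist.

    Hypothetical helper: an empty list means the layout matches the tree
    for the given sequence.
    """
    root = Path(root)
    expected = [root / name for name in EXPECTED_TOP_LEVEL]
    seq_2d = root / "data_2d_raw" / sequence
    expected += [seq_2d / rel for rel in EXPECTED_PER_SEQUENCE]
    expected.append(root / "data_3d_raw" / sequence)
    return [p.relative_to(root).as_posix() for p in expected if not p.is_dir()]
```

Calling `missing_kitti360_dirs("./KITTI-360")` after preprocessing should return an empty list if the example sequence is complete; any returned entries name the folders that still need to be downloaded or regenerated.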
📊 Evaluation
Quantitative Evaluation
- The data directory is set to ./KITTI-360 by default.
- Download and unzip the pre-computed GT occupancy maps into ./KITTI-360. You can also compute and store your customized GT occupancy maps by setting read_gt_occ_path: '' and specifying save_gt_occ_map_path in configs/eval_kyn.yaml.
- Download and unzip the object labels to ./KITTI-360.
- Download our pre-trained model and the LSeg model, and put them into ./checkpoints.
- Run the following command for evaluation:
python eval.py -cn eval_kyn
Voxel Visualization
Run the following command to generate 3D voxel models on the KITTI-360 test set:
python scripts/gen_kitti360_voxel.py -cn gen_voxel
💻 Training
Download the LSeg model and put it into ./checkpoints. Then run:
torchrun --nproc_per_node=<num_of_gpus> train.py -cn train_kyn
where <num_of_gpus> denotes the number of available GPUs. Models will be saved in ./result by default.
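If the GPUs for a run are selected through the `CUDA_VISIBLE_DEVICES` environment variable, the value for `<num_of_gpus>` can be derived from it. The sketch below is one possible way to do this and is not part of the repository: all three function names are hypothetical, and the count assumes every comma-separated entry names a valid device. When the variable is unset, pass the GPU count explicitly instead.

```python
import os


def count_visible_gpus(cuda_visible_devices):
    """Count the entries in a CUDA_VISIBLE_DEVICES-style string such as "0,1,3"."""
    return len([d for d in cuda_visible_devices.split(",") if d.strip()])


def build_train_command(num_gpus):
    """Assemble the training command from this README for num_gpus processes."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return ["torchrun", f"--nproc_per_node={num_gpus}", "train.py", "-cn", "train_kyn"]


def train_command_from_env(env=None):
    """Build the command from CUDA_VISIBLE_DEVICES (assumed to be set)."""
    env = os.environ if env is None else env
    if "CUDA_VISIBLE_DEVICES" not in env:
        raise KeyError("CUDA_VISIBLE_DEVICES is not set; pass the GPU count explicitly")
    return build_train_command(count_visible_gpus(env["CUDA_VISIBLE_DEVICES"]))
```

The returned list can be handed to `subprocess.run(cmd, check=True)` from the repository root, which keeps the process count in step with the devices made visible to the job.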
📰 Citation
Please cite our paper if you use the code in this repository:
@inproceedings{li2024know,
title={Know Your Neighbors: Improving Single-View Reconstruction via Spatial Vision-Language Reasoning},
author={Li, Rui and Fischer, Tobias and Segu, Mattia and Pollefeys, Marc and Van Gool, Luc and Tombari, Federico},
booktitle={CVPR},
year={2024}
}