EmbodiedOcc: Embodied 3D Occupancy Prediction for Vision-based Online Scene Understanding
Paper | Project Page
Yuqi Wu, Wenzhao Zheng$\dagger$, Sicheng Zuo, Yuanhui Huang, Jie Zhou, Jiwen Lu
$\dagger$ Project leader
EmbodiedOcc formulates an embodied 3D occupancy prediction task and proposes a Gaussian-based framework to accomplish it.
Overview
Targeting progressive embodied exploration in indoor scenarios, we formulate an embodied 3D occupancy prediction task and propose a Gaussian-based framework, EmbodiedOcc, to address it. EmbodiedOcc maintains an explicit Gaussian memory of the current scene and updates this memory as the scene is explored. Quantitative and visualization results show that EmbodiedOcc outperforms existing methods on local occupancy prediction and accomplishes the embodied occupancy prediction task with high accuracy and strong expandability.
Getting Started
Installation
Follow instructions HERE to prepare the environment.
Data Preparation
- Prepare posed_images and gathered_data following the Occ-ScanNet dataset instructions and move them to data/occscannet.
- Download global_occ_package and streme_occ_new_package from EmbodiedOcc-ScanNet. Unzip them and move them to data/scene_occ.
Folder structure
EmbodiedOcc
├── ...
├── data/
│ ├── occscannet/
│ │ ├── gathered_data/
│ │ ├── posed_images/
│ │ ├── train_final.txt
│ │ ├── train_mini_final.txt
│ │ ├── test_final.txt
│ │ ├── test_mini_final.txt
│ ├── scene_occ/
│ │ ├── global_occ_package/
│ │ ├── streme_occ_new_package/
│ │ ├── train_online.txt
│ │ ├── train_mini_online.txt
│ │ ├── test_online.txt
│ │ ├── test_mini_online.txt
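Before training, it can help to confirm that the data directory matches the layout above. The following is a minimal sketch, not part of the EmbodiedOcc code base: the function name `missing_entries` and the two constant lists are hypothetical, and the script assumes it is run from the repository root so that `data` resolves to `EmbodiedOcc/data`. It only checks that the listed entries exist and does not validate their contents.

```python
from pathlib import Path

# Directories and split files copied from the folder structure above.
EXPECTED_DIRS = [
    "occscannet/gathered_data",
    "occscannet/posed_images",
    "scene_occ/global_occ_package",
    "scene_occ/streme_occ_new_package",
]
EXPECTED_FILES = [
    "occscannet/train_final.txt",
    "occscannet/train_mini_final.txt",
    "occscannet/test_final.txt",
    "occscannet/test_mini_final.txt",
    "scene_occ/train_online.txt",
    "scene_occ/train_mini_online.txt",
    "scene_occ/test_online.txt",
    "scene_occ/test_mini_online.txt",
]


def missing_entries(data_root):
    """Return the expected directories and files that are absent under data_root."""
    root = Path(data_root)
    missing = [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (root / f).is_file()]
    return missing


# Assumes the current working directory is the repository root.
problems = missing_entries("data")
if problems:
    print("Missing entries under data/:")
    for entry in problems:
        print("  " + entry)
else:
    print("Data layout looks complete.")
```

An empty result means every entry in the tree above was found; any reported path points at a preparation step that still needs to be done.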
Train
- Train the local occupancy prediction module using 8 GPUs on Occ-ScanNet and Occ-ScanNet-mini2:
$ cd EmbodiedOcc
$ torchrun --nproc_per_node=8 train_mono.py --py-config config/train_mono_config.py
$ torchrun --nproc_per_node=8 train_mono.py --py-config config/train_mono_mini_config.py
- Train EmbodiedOcc using 8 GPUs on EmbodiedOcc-ScanNet and 4 GPUs on EmbodiedOcc-ScanNet-mini:
$ cd EmbodiedOcc
$ torchrun --nproc_per_node=8 train_embodied.py --py-config config/train_embodied_config.py
$ torchrun --nproc_per_node=4 train_embodied.py --py-config config/train_embodied_mini_config.py
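The four training commands differ only in script, config file, and GPU count, so they can be generated from a small table when scripting runs. This is a hedged sketch: the helper `torchrun_command` and the `RUNS` table are hypothetical names introduced here, and the only flags used are the ones that appear in the commands above (`--nproc_per_node` and `--py-config`). The GPU counts are the ones stated in this README and may need adjusting for other hardware.

```python
import shlex


def torchrun_command(script, config, num_gpus):
    """Build the argument list for one single-node torchrun training launch."""
    return [
        "torchrun",
        f"--nproc_per_node={num_gpus}",
        script,
        "--py-config",
        config,
    ]


# (script, config, GPU count) triples taken from the commands in this section.
RUNS = [
    ("train_mono.py", "config/train_mono_config.py", 8),
    ("train_mono.py", "config/train_mono_mini_config.py", 8),
    ("train_embodied.py", "config/train_embodied_config.py", 8),
    ("train_embodied.py", "config/train_embodied_mini_config.py", 4),
]

# Print the commands so they can be reviewed before anything is launched.
for script, config, gpus in RUNS:
    print(shlex.join(torchrun_command(script, config, gpus)))
```

To actually launch a run, the returned list could be passed to `subprocess.run(cmd, check=True)` from inside the EmbodiedOcc directory; the sketch only prints the commands so that nothing is started by accident.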
Visualize
- Local occupancy prediction:
$ cd EmbodiedOcc
$ torchrun --nproc_per_node=1 vis_mono.py --work-dir workdir/train_mono
$ torchrun --nproc_per_node=1 vis_mono.py --work-dir workdir/train_mono_mini
- Embodied occupancy prediction:
$ cd EmbodiedOcc
$ torchrun --nproc_per_node=1 vis_embodied.py --work-dir workdir/train_embodied
$ torchrun --nproc_per_node=1 vis_embodied.py --work-dir workdir/train_embodied_mini
Pass the same work-dir path that was used in the corresponding training setting.
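A quick way to avoid pointing a visualization script at a work directory that was never trained is to check which of the directories exist first. The sketch below is illustrative only: `WORK_DIRS` and `existing_work_dirs` are hypothetical names, and the pairing of scripts with directories is inferred from the `--work-dir` values in the commands above rather than read from the training configs, so it should be treated as an assumption.

```python
from pathlib import Path

# Script-to-work-dir pairing inferred from the visualization commands above
# (an assumption; the configs themselves are not inspected here).
WORK_DIRS = {
    "vis_mono.py": ["workdir/train_mono", "workdir/train_mono_mini"],
    "vis_embodied.py": ["workdir/train_embodied", "workdir/train_embodied_mini"],
}


def existing_work_dirs(repo_root="."):
    """Map each visualization script to the work directories that exist on disk."""
    root = Path(repo_root)
    return {
        script: [d for d in dirs if (root / d).is_dir()]
        for script, dirs in WORK_DIRS.items()
    }


# Assumes the current working directory is the repository root.
for script, dirs in existing_work_dirs().items():
    status = ", ".join(dirs) if dirs else "no work directory found"
    print(f"{script}: {status}")
```

A script listed with no work directory has nothing to visualize yet, which usually means the matching training run has not been completed or wrote its output elsewhere.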
Related Projects
Our work is inspired by these excellent open-source repos: GaussianFormer and ISO.
Our code is based on GaussianFormer.
Citation
If you find this project helpful, please consider citing the following paper:
@article{wu2024embodiedoccembodied3doccupancy,
title={EmbodiedOcc: Embodied 3D Occupancy Prediction for Vision-based Online Scene Understanding},
author={Yuqi Wu and Wenzhao Zheng and Sicheng Zuo and Yuanhui Huang and Jie Zhou and Jiwen Lu},
journal={arXiv preprint arXiv:2412.04380},
year={2024}
}