<div align="center"> <h1>Monocular Occupancy Prediction for Scalable Indoor Scenes</h1>

Hongxiao Yu<sup>1,2</sup> · Yuqi Wang<sup>1,2</sup> · Yuntao Chen<sup>3</sup> · Zhaoxiang Zhang<sup>1,2,3</sup>

<sup>1</sup>School of Artificial Intelligence, University of Chinese Academy of Sciences (UCAS)

<sup>2</sup>NLPR, MAIS, Institute of Automation, Chinese Academy of Sciences (CASIA)

<sup>3</sup>Centre for Artificial Intelligence and Robotics (HKISI_CAS)

ECCV 2024

<img src="NYUv2.gif" width="800" height="200" /> </div>

Performance

Here we compare our ISO with the previous best models, NDC-Scene and MonoScene.

| Method | IoU | ceiling | floor | wall | window | chair | bed | sofa | table | tvs | furniture | object | mIoU |
|:---|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| MonoScene | 42.51 | 8.89 | 93.50 | 12.06 | 12.57 | 13.72 | 48.19 | 36.11 | 15.13 | 15.22 | 27.96 | 12.94 | 26.94 |
| NDC-Scene | 44.17 | 12.02 | **93.51** | 13.11 | 13.77 | 15.83 | 49.57 | 39.87 | 17.17 | 24.57 | 31.00 | 14.96 | 29.03 |
| Ours | **47.11** | **14.21** | 93.47 | **15.89** | **15.14** | **18.35** | **50.01** | **40.82** | **18.25** | **25.90** | **34.08** | **17.67** | **31.25** |

We highlight the best results in bold.

Pretrained models on NYUv2 can be downloaded as described in the Pretrained Models section below.

Preparing ISO

Installation

1. Create the conda environment:

```bash
$ conda create -n iso python=3.9 -y
$ conda activate iso
```

2. This code was implemented with Python 3.9, PyTorch 2.0.0, and CUDA 11.7, though the command below installs a newer PyTorch 2.2.0 / CUDA 11.8 build. Install PyTorch:

```bash
$ conda install pytorch==2.2.0 torchvision==0.17.0 torchaudio==2.2.0 pytorch-cuda=11.8 -c pytorch -c nvidia
```

3. Install the additional dependencies:

```bash
$ git clone --recursive https://github.com/hongxiaoy/ISO.git
$ cd ISO/
$ pip install -r requirements.txt
```

:bulb:Note

Change L140 in depth_anything/metric_depth/zoedepth/models/base_models/dpt_dinov2/dpt.py to

```python
self.pretrained = torch.hub.load('facebookresearch/dinov2', 'dinov2_{:}14'.format(encoder), pretrained=False)
```

Then, download the Depth-Anything pre-trained model and metric depth model checkpoints to checkpoints/.
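
For reference, a hedged sketch of what checkpoints/ might look like afterwards; the exact filenames depend on the Depth-Anything release you download and are an assumption here:

```
checkpoints/
├── depth_anything_vitl14.pth                # pre-trained model (assumed filename)
└── depth_anything_metric_depth_indoor.pt    # metric depth model (assumed filename)
```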

4. Install tbb:

```bash
$ conda install -c bioconda tbb=2020.2
```

5. Finally, install ISO:

```bash
$ pip install -e ./
```

:bulb:Note

If you move the ISO directory to another place, you should run

```bash
$ pip cache purge
```

then run pip install -e ./ again.
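
After installation, a quick sanity check of the environment (standard PyTorch calls only; we assume the package installed by pip install -e ./ is importable as iso):

```bash
$ python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
$ python -c "import iso; print('ISO import OK')"
```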

Datasets

NYUv2

  1. Download the NYUv2 dataset.

2. Create a folder to store the preprocessed NYUv2 data at /path/to/NYU/preprocess/folder.

3. Store paths in environment variables for faster access:

```bash
$ export NYU_PREPROCESS=/path/to/NYU/preprocess/folder
$ export NYU_ROOT=/path/to/NYU/depthbin
```

:bulb:Note

We recommend appending the exports to ~/.bashrc so they persist across sessions:

```bash
$ echo "export NYU_PREPROCESS=/path/to/NYU/preprocess/folder" >> ~/.bashrc
```

4. Preprocess the data to generate labels at a lower scale, which are used to compute the ground truth relation matrices:

```bash
$ cd ISO/
$ python iso/data/NYU/preprocess.py NYU_root=$NYU_ROOT NYU_preprocess_root=$NYU_PREPROCESS
```

Occ-ScanNet

1. Download the Occ-ScanNet dataset, which includes:

   • posed_images
   • gathered_data
   • train_subscenes.txt
   • val_subscenes.txt

2. Create a root folder for the Occ-ScanNet dataset at /path/to/Occ/ScanNet/folder and move all dataset files into it, extracting the zip files (the expected layout is sketched after this list).

3. Store the path in an environment variable for faster access:

```bash
$ export OCC_SCANNET_ROOT=/path/to/Occ/ScanNet/folder
```

:bulb:Note

We recommend appending the export to ~/.bashrc so it persists across sessions:

```bash
$ echo "export OCC_SCANNET_ROOT=/path/to/Occ/ScanNet/folder" >> ~/.bashrc
```
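
As referenced in step 2, a sketch of the expected root-folder layout after extraction (the four entries come from the download list above; any structure inside the folders is not shown):

```
/path/to/Occ/ScanNet/folder
├── posed_images/
├── gathered_data/
├── train_subscenes.txt
└── val_subscenes.txt
```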

Pretrained Models

Download the ISO models pretrained on NYUv2, then put them in the folder /path/to/ISO/trained_models:

```bash
$ huggingface-cli download --repo-type model hongxiaoy/ISO
```

If you haven't installed huggingface-cli before, please follow the official instructions.
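
By default the command above downloads into the Hugging Face cache. To place the files directly into the expected folder, the standard --local-dir flag of huggingface-cli download can be used; the target path below is just the folder named above:

```bash
$ huggingface-cli download --repo-type model hongxiaoy/ISO --local-dir /path/to/ISO/trained_models
```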

Running ISO

Training

NYUv2

1. Create a folder to store training logs at /path/to/NYU/logdir.

2. Store the path in an environment variable:

```bash
$ export NYU_LOG=/path/to/NYU/logdir
```

3. Train ISO using 2 GPUs with a batch size of 4 (2 items per GPU) on NYUv2:

```bash
$ cd ISO/
$ python iso/scripts/train_iso.py \
    dataset=NYU \
    NYU_root=$NYU_ROOT \
    NYU_preprocess_root=$NYU_PREPROCESS \
    logdir=$NYU_LOG \
    n_gpus=2 batch_size=4
```
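
The trailing key=value pairs are Hydra-style overrides, so other settings follow the same pattern. For example, a single-GPU run (the halved batch size is our assumption to keep per-GPU load comparable, not a setting from the paper):

```bash
$ python iso/scripts/train_iso.py \
    dataset=NYU \
    NYU_root=$NYU_ROOT \
    NYU_preprocess_root=$NYU_PREPROCESS \
    logdir=$NYU_LOG \
    n_gpus=1 batch_size=2
```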

Occ-ScanNet

1. Create a folder to store training logs at /path/to/OccScanNet/logdir.

2. Store the path in an environment variable:

```bash
$ export OCC_SCANNET_LOG=/path/to/OccScanNet/logdir
```

3. Train ISO using 2 GPUs with a batch size of 4 (2 items per GPU) on Occ-ScanNet (the dataset name should match the config file name used in train_iso.py):

```bash
$ cd ISO/
$ python iso/scripts/train_iso.py \
    dataset=OccScanNet \
    OccScanNet_root=$OCC_SCANNET_ROOT \
    logdir=$OCC_SCANNET_LOG \
    n_gpus=2 batch_size=4
```

Evaluating

NYUv2

To evaluate ISO on the NYUv2 test set, run:

```bash
$ cd ISO/
$ python iso/scripts/eval_iso.py \
    dataset=NYU \
    NYU_root=$NYU_ROOT \
    NYU_preprocess_root=$NYU_PREPROCESS \
    n_gpus=1 batch_size=1
```

Inference

Please create a folder /path/to/iso/output to store the ISO outputs, and store its path in an environment variable:

```bash
$ export ISO_OUTPUT=/path/to/iso/output
```

NYUv2

To generate the predictions on the NYUv2 test set, run:

```bash
$ cd ISO/
$ python iso/scripts/generate_output.py \
    +output_path=$ISO_OUTPUT \
    dataset=NYU \
    NYU_root=$NYU_ROOT \
    NYU_preprocess_root=$NYU_PREPROCESS \
    n_gpus=1 batch_size=1
```
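
The outputs are saved as .pkl files, which the visualization script below consumes. A hedged way to peek at one file's structure; the filename is hypothetical and the snippet deliberately avoids assuming particular dictionary keys:

```bash
$ python - <<'EOF'
import os, pickle

# Hypothetical filename; check $ISO_OUTPUT for the actual naming scheme.
path = os.path.join(os.environ['ISO_OUTPUT'], 'example.pkl')
with open(path, 'rb') as f:
    out = pickle.load(f)

# Print the top-level structure without relying on specific keys.
if isinstance(out, dict):
    for k, v in out.items():
        print(k, type(v).__name__, getattr(v, 'shape', ''))
else:
    print(type(out).__name__)
EOF
```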

Visualization

You need to create a new Anaconda environment for visualization.

```bash
$ conda create -n mayavi_vis python=3.7 -y
$ conda activate mayavi_vis
$ pip install omegaconf hydra-core PyQt5 mayavi
```

If you meet problems when installing mayavi, please refer to the mayavi installation instructions.

NYUv2

```bash
$ cd ISO/
$ python iso/scripts/visualization/NYU_vis_pred.py +file=/path/to/output/file.pkl
```
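
If you are on a headless server, one common generic workaround for mayavi's GUI requirement (not ISO-specific; requires the xvfb package) is to render inside a virtual framebuffer:

```bash
$ xvfb-run -a python iso/scripts/visualization/NYU_vis_pred.py +file=/path/to/output/file.pkl
```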

Acknowledgement

This project is built upon MonoScene. Please refer to https://github.com/astra-vision/MonoScene for more documentation and details.

We would like to thank the creators, maintainers, and contributors of MonoScene, NDC-Scene, ZoeDepth, and Depth Anything for their invaluable work. Their dedication and open-source spirit have been instrumental in our development.

Citation

```bibtex
@article{yu2024monocular,
  title={Monocular Occupancy Prediction for Scalable Indoor Scenes},
  author={Yu, Hongxiao and Wang, Yuqi and Chen, Yuntao and Zhang, Zhaoxiang},
  journal={arXiv preprint arXiv:2407.11730},
  year={2024}
}
```