ODIN: A Single Model for 2D and 3D Segmentation

Authors: Ayush Jain, Pushkal Katara, Nikolaos Gkanatsios, Adam W. Harley, Gabriel Sarch, Kriti Aggarwal, Vishrav Chaudhary, Katerina Fragkiadaki.

Official implementation of "ODIN: A Single Model for 2D and 3D Segmentation", CVPR 2024 (Highlight).

<div align="center"> <img src="https://odin-seg.github.io/data/teaser_v6-1.png" width="100%" height="100%"/> </div><br/>

Installation

Make sure you are using GCC 9.2.0 or newer.
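The GCC requirement can be checked before building. The following is a minimal sketch, not part of the ODIN codebase: the helper names `parse_version` and `gcc_is_new_enough` are hypothetical, and it assumes `gcc` on your `PATH` is the compiler that will be used for the build.

```python
import re
import shutil
import subprocess

MIN_GCC = (9, 2, 0)  # minimum version stated in this README


def parse_version(text):
    """Extract the first version number in text as a (major, minor, patch) tuple."""
    match = re.search(r"(\d+)(?:\.(\d+))?(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    # Missing minor/patch components are treated as 0.
    return tuple(int(part) if part else 0 for part in match.groups())


def gcc_is_new_enough(version_text, minimum=MIN_GCC):
    """Return True if the reported GCC version meets the minimum."""
    return parse_version(version_text) >= minimum


if __name__ == "__main__":
    gcc = shutil.which("gcc")
    if gcc is None:
        print("gcc not found on PATH")
    else:
        reported = ""
        # Try the full version first, then fall back to the short form.
        for flag in ("-dumpfullversion", "-dumpversion"):
            result = subprocess.run([gcc, flag], capture_output=True, text=True)
            reported = result.stdout.strip()
            if reported:
                break
        if not reported:
            print("could not determine the GCC version")
        else:
            status = "OK" if gcc_is_new_enough(reported) else "too old"
            print(f"gcc {reported}: {status}")
```

If the check reports a version that is too old, switch to a newer compiler before running the installation commands.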

export TORCH_CUDA_ARCH_LIST="6.0 6.1 6.2 7.0 7.2 7.5 8.0 8.6"
conda create -n odin python=3.10
conda activate odin
pip install torch==2.2.0+cu118 torchvision==0.17.0+cu118 --extra-index-url https://download.pytorch.org/whl/cu118
pip install torch-scatter -f https://data.pyg.org/whl/torch-2.2.0+cu118.html
pip install -r requirements.txt
sh init.sh
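After the commands above finish, a quick sanity check can confirm that the core packages are importable and show which compute capability your GPU reports, which is the value that should be covered by `TORCH_CUDA_ARCH_LIST`. This is a sketch under assumptions: `find_missing` is a hypothetical helper, and the module list covers only the packages named explicitly above, not everything in `requirements.txt`.

```python
import importlib.util

# Import names for the packages installed explicitly in the commands above.
REQUIRED_MODULES = ("torch", "torchvision", "torch_scatter")


def find_missing(modules=REQUIRED_MODULES):
    """Return the module names that cannot be located in this environment."""
    missing = []
    for name in modules:
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        import torch

        print("torch", torch.__version__, "| CUDA build:", torch.version.cuda)
        print("CUDA available:", torch.cuda.is_available())
        if torch.cuda.is_available():
            # Compare this against the entries in TORCH_CUDA_ARCH_LIST.
            major, minor = torch.cuda.get_device_capability(0)
            print(f"GPU compute capability: {major}.{minor}")
```

Run it inside the activated `odin` environment. If the reported compute capability is not in the exported arch list, the compiled extensions may not include kernels for your GPU.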

Data Preparation

Please refer to the README in the data_preparation folder for each dataset, for example the ScanNet data preparation README.

Usage

We provide training scripts for various datasets in the scripts folder. Please refer to these scripts for training ODIN.

Model Zoo

ScanNet Instance Segmentation

| Dataset | mAP | mAP@25 | Config | Checkpoint |
|---|---|---|---|---|
| ScanNet val (ResNet50) | 47.8 | 83.6 | config | checkpoint |
| ScanNet val (Swin-B) | 50.0 | 83.6 | config | checkpoint |
| ScanNet test (Swin-B) | 47.7 | 86.2 | config | checkpoint |

ScanNet Semantic Segmentation

| Dataset | mIoU | Config | Checkpoint |
|---|---|---|---|
| ScanNet val (ResNet50) | 73.3 | config | checkpoint |
| ScanNet val (Swin-B) | 77.8 | config | checkpoint |
| ScanNet test (Swin-B) | 74.4 | config | checkpoint |
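For readers unfamiliar with the mIoU metric used in the semantic segmentation tables, the sketch below shows the standard per-class intersection-over-union averaged over classes. It is an illustration only, not the evaluation code used to produce the numbers in this README: `mean_iou` is a hypothetical helper, and benchmark scripts typically add details such as ignored labels that are omitted here.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes that appear in pred or target.

    pred and target are equal-length sequences of integer class labels,
    one per point (or pixel).
    """
    ious = []
    for c in range(num_classes):
        # Points labeled c in both the prediction and the ground truth.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Points labeled c in either the prediction or the ground truth.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    pred = [0, 0, 1, 1]
    target = [0, 1, 1, 1]
    # Class 0: IoU = 1/2, class 1: IoU = 2/3.
    print(f"mIoU = {100 * mean_iou(pred, target, num_classes=2):.1f}")
```

The result is scaled by 100 in the usage example so that it reads as a percentage.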

Joint 2D-3D on ScanNet and COCO

| Model | mAP (ScanNet) | mAP@25 (ScanNet) | mAP (COCO) | Config | Checkpoint |
|---|---|---|---|---|---|
| ODIN | 49.1 | 83.1 | 41.2 | config | checkpoint |

ScanNet200 Instance Segmentation

| Dataset | mAP | mAP@25 | Config | Checkpoint |
|---|---|---|---|---|
| ScanNet200 val (ResNet50) | 25.6 | 36.9 | config | checkpoint |
| ScanNet200 val (Swin-B) | 31.5 | 45.3 | config | checkpoint |
| ScanNet200 test (Swin-B) | 27.2 | 39.4 | config | checkpoint |

ScanNet200 Semantic Segmentation

| Dataset | mIoU | Config | Checkpoint |
|---|---|---|---|
| ScanNet200 val (ResNet50) | 35.8 | config | checkpoint |
| ScanNet200 val (Swin-B) | 40.5 | config | checkpoint |
| ScanNet200 test (Swin-B) | 36.8 | config | checkpoint |

AI2THOR Semantic and Instance Segmentation

| Dataset | mAP | mAP@25 | mIoU | Config | Checkpoint |
|---|---|---|---|---|---|
| AI2THOR val (ResNet) | 63.8 | 80.2 | 71.5 | config | checkpoint |
| AI2THOR val (Swin) | 64.3 | 78.6 | 71.4 | config | checkpoint |

Matterport3D Instance Segmentation

| Dataset | mAP | mAP@25 | Config | Checkpoint |
|---|---|---|---|---|
| Matterport3D val (ResNet) | 11.5 | 27.6 | config | checkpoint |
| Matterport3D val (Swin) | 14.5 | 36.8 | config | checkpoint |

Matterport3D Semantic Segmentation

| Dataset | mIoU | mAcc | Config | Checkpoint |
|---|---|---|---|---|
| Matterport3D val (ResNet) | 22.4 | 28.5 | config | checkpoint |
| Matterport3D val (Swin) | 28.6 | 38.2 | config | checkpoint |

S3DIS Instance Segmentation

| Dataset | mAP | mAP@25 | Config | Checkpoint |
|---|---|---|---|---|
| S3DIS Area5 (ResNet50-Scratch) | 36.3 | 61.2 | config | checkpoint |
| S3DIS Area5 (ResNet50-Fine-Tuned) | 44.7 | 67.5 | config | checkpoint |
| S3DIS Area5 (Swin-B) | 43.0 | 70.0 | config | checkpoint |

S3DIS Semantic Segmentation

| Dataset | mIoU | Config | Checkpoint |
|---|---|---|---|
| S3DIS (ResNet50) | 59.7 | config | checkpoint |
| S3DIS (Swin-B) | 68.6 | config | checkpoint |

Training Logs

Please find training logs for all models here

<a name="CitingODIN"></a>Citing ODIN

If you find ODIN useful in your research, please consider citing:

@misc{jain2024odin,
      title={ODIN: A Single Model for 2D and 3D Perception}, 
      author={Ayush Jain and Pushkal Katara and Nikolaos Gkanatsios and Adam W. Harley and Gabriel Sarch and Kriti Aggarwal and Vishrav Chaudhary and Katerina Fragkiadaki},
      year={2024},
      eprint={2401.02416},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

License

The majority of ODIN is licensed under an MIT License.

Acknowledgement

Parts of this code were based on the codebase of Mask2Former and Mask3D.