<br> <p align="center"> <h1 align="center"><strong>OV-PARTS: Towards Open-Vocabulary Part Segmentation</strong></h1> <p align="center"> <a href='https://github.com/kellyiss/' target='_blank'>Meng Wei</a>&emsp; <a href='https://yuexy.github.io/' target='_blank'>Xiaoyu Yue</a>&emsp; <a href='http://zhangwenwei.cn/' target='_blank'>Wenwei Zhang</a>&emsp; <a href='https://xh-liu.github.io/' target='_blank'>Xihui Liu</a>&emsp; <a href='https://aimerykong.github.io/' target='_blank'>Shu Kong</a>&emsp; <a href='https://oceanpang.github.io/' target='_blank'>Jiangmiao Pang*</a>&emsp; <br> Shanghai AI Laboratory&emsp;The University of Hong Kong&emsp;The University of Sydney&emsp;University of Macau&emsp;Texas A&M University </p> </p> <p align="center"> <a href="https://openreview.net/forum?id=EFl8zjjXeX&" target='_blank'> <img src="https://img.shields.io/badge/Paper-📖-blue?"> </a> </p>

🏠 About

<div style="text-align: center;"> <img src="assets/ov_parts.jpg" width=98%> </div>

OV-PARTS is a benchmark for Open-Vocabulary Part Segmentation that leverages the capabilities of large-scale Vision-Language Models (VLMs).

🔥 News

We are organizing the Open Vocabulary Part Segmentation (OV-PARTS) Challenge at the Visual Perception via Learning in an Open World (VPLOW) Workshop. Please check out our website!

🛠 Getting Started

Installation

  1. Clone this repository

    git clone https://github.com/OpenRobotLab/OV_PARTS.git
    cd OV_PARTS
    
  2. Create a conda environment with Python 3.8+ and install the Python requirements (an optional sanity check follows this list)

    conda create -n ovparts python=3.8
    conda activate ovparts
    pip install -r requirements.txt
    
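Optionally, you can verify the environment before moving on. This is only a convenience check and assumes that `requirements.txt` installs PyTorch:

```bash
# Optional sanity check: confirm PyTorch is importable and CUDA is visible
# (assumes requirements.txt installs torch; skip if you train on CPU).
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```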

Data Preparation

After downloading the two benchmark datasets, please extract the files by running the following commands and place the extracted folders under the "Datasets" directory.

tar -xzf PascalPart116.tar.gz
tar -xzf ADE20KPart234.tar.gz

The Datasets folder should follow this structure:

Datasets/
├─Pascal-Part-116/
│ ├─train_16shot.json
│ ├─images/
│ │ ├─train/
│ │ └─val/
│ ├─annotations_detectron2_obj/
│ │ ├─train/
│ │ └─val/
│ └─annotations_detectron2_part/
│   ├─train/
│   └─val/
└─ADE20K-Part-234/
  ├─images/
  │ ├─training/
  │ └─validation/
  ├─train_16shot.json
  ├─ade20k_instance_train.json
  ├─ade20k_instance_val.json
  └─annotations_detectron2_part/
    ├─training/
    └─validation/

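If you want to confirm that the extracted data matches this layout, a quick shell check like the following can help (a convenience sketch; the paths are taken from the tree above):

```bash
# Verify that the expected top-level dataset folders exist (paths as in the tree above).
for d in Datasets/Pascal-Part-116/images \
         Datasets/Pascal-Part-116/annotations_detectron2_obj \
         Datasets/Pascal-Part-116/annotations_detectron2_part \
         Datasets/ADE20K-Part-234/images \
         Datasets/ADE20K-Part-234/annotations_detectron2_part; do
  if [ -d "$d" ]; then echo "OK       $d"; else echo "MISSING  $d"; fi
done
```
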
Create {train/val}_{obj/part}_label_count.json files for Pascal-Part-116.

python baselines/data/datasets/mask_cls_collect.py Datasets/Pascal-Part-116/annotations_detectron2_{obj/part}/{train/val} Datasets/Pascal-Part-116/annotations_detectron2_part/{train/val}_{obj/part}_label_count.json
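
The `{train/val}` and `{obj/part}` placeholders stand for the four split/annotation combinations, so the script has to be run once per combination. A small loop such as the following covers them all (paths copied from the command above):

```bash
# Run mask_cls_collect.py for every {train,val} x {obj,part} combination.
for SPLIT in train val; do
  for ANN in obj part; do
    python baselines/data/datasets/mask_cls_collect.py \
      Datasets/Pascal-Part-116/annotations_detectron2_${ANN}/${SPLIT} \
      Datasets/Pascal-Part-116/annotations_detectron2_part/${SPLIT}_${ANN}_label_count.json
  done
done
```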

Training

  1. Training the two-stage baseline ZSseg+.

    Please first download the CLIP model fine-tuned with CPTCoOp.

    Then run the training command (an example of filling in `${SETTING}` and `${DATASET}` follows this list):

    python train_net.py --num-gpus 8 --config-file configs/${SETTING}/zsseg+_R50_coop_${DATASET}.yaml
    
  2. Training the one-stage baselines CLIPSeg and CATSeg.

    Please first download the pre-trained object models of CLIPSeg and CATSeg and place them under the "pretrain_weights" directory.

    | Models  | Pre-trained checkpoint |
    |---------|------------------------|
    | CLIPSeg | download               |
    | CATSeg  | download               |

    Then run the training command:

    # For CATseg.
    python train_net.py --num-gpus 8 --config-file configs/${SETTING}/catseg_${DATASET}.yaml
    
    # For CLIPseg.
    python train_net.py --num-gpus 8 --config-file configs/${SETTING}/clipseg_${DATASET}.yaml
    
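In all of the commands above, `${SETTING}` and `${DATASET}` select a config file under `configs/`. The values below are only an illustration (hypothetical names; check the `configs/` directory for the actual setting folders and dataset suffixes):

```bash
# Hypothetical placeholder values -- adjust to the yaml files actually present in configs/.
SETTING=zero_shot        # assumed setting folder name; check configs/
DATASET=pascal_part_116  # assumed dataset suffix; check configs/
python train_net.py --num-gpus 8 --config-file configs/${SETTING}/catseg_${DATASET}.yaml
```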

Evaluation

We provide the trained weights for the three baseline models reported in the paper.

| Models  | Setting       | Pascal-Part-116 checkpoint | ADE20K-Part-234 checkpoint |
|---------|---------------|----------------------------|----------------------------|
| ZSSeg+  | Zero-shot     | download                   | download                   |
| CLIPSeg | Zero-shot     | download                   | download                   |
| CATSeg  | Zero-shot     | download                   | download                   |
| CLIPSeg | Few-shot      | download                   | download                   |
| CLIPSeg | Cross-dataset | -                          | download                   |

To evaluate the trained models, add --eval-only to the training command.

For example:

  python train_net.py --num-gpus 8 --config-file configs/${SETTING}/catseg_${DATASET}.yaml --eval-only MODEL.WEIGHTS ${WEIGHT_PATH}

📝 Benchmark Results

🔗 Citation

If you find our work helpful, please cite:

@inproceedings{wei2023ov,
  title={OV-PARTS: Towards Open-Vocabulary Part Segmentation},
  author={Wei, Meng and Yue, Xiaoyu and Zhang, Wenwei and Kong, Shu and Liu, Xihui and Pang, Jiangmiao},
  booktitle={Thirty-seventh Conference on Neural Information Processing Systems Datasets and Benchmarks Track},
  year={2023}
}

👏 Acknowledgements

We would like to express our gratitude to the open-source projects and their contributors, including ZSSeg, CATSeg and CLIPSeg. Their valuable work has greatly contributed to the development of our codebase.