Panoptic-PartFormer ECCV-2022 [Introduction Video], [Poster]
Xiangtai Li, Shilin Xu, Yibo Yang, Guangliang Cheng, Yunhai Tong, Dacheng Tao.
Source code and models of our ECCV 2022 paper. Our re-implementation achieves slightly better results than the originally submitted paper.
News
🔥🔥 Our extension work, PanopticPartFormer++: A Unified and Decoupled View for Panoptic Part Segmentation, has finally been accepted by T-PAMI after one and a half years of review. It is a stronger baseline for part-aware panoptic segmentation.
Note
We found ground-truth bugs in the PPP dataset on our local machine; the fixed results are higher than the previous version. Thanks to the original PPS dataset author Daan de Geus for the reminder.
Introduction
Panoptic Part Segmentation (PPS) aims to unify panoptic segmentation and part segmentation into one task. Previous work mainly uses separate approaches to handle thing, stuff, and part predictions individually, without any shared computation or task association. In this work, we unify these tasks at the architectural level, designing the first end-to-end unified method, named Panoptic-PartFormer.
We model things, stuff, and parts as object queries and directly learn to optimize all three predictions as a unified mask prediction and classification problem. We design a decoupled decoder to generate the part features and the thing/stuff features, respectively.
Installation
It requires the following OpenMMLab packages:
- MMCV-full == v1.3.18
- MMDetection == v2.18.0
- panopticapi
These packages are listed in the requirements.txt file.
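A minimal setup sketch, assuming a working PyTorch environment; the mmcv-full wheel index URL depends on your CUDA/PyTorch versions (cu111/torch1.9.0 below is only an example):

# install pinned OpenMMLab packages (adjust the cu/torch tags to your environment)
pip install mmcv-full==1.3.18 -f https://download.openmmlab.com/mmcv/dist/cu111/torch1.9.0/index.html
pip install mmdet==2.18.0
# panopticapi is installed from the official COCO repository
pip install git+https://github.com/cocodataset/panopticapi.git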
Usage
Data Preparation
Prepare the data following MMDetection. For the Cityscapes Panoptic Part (CPP) and Pascal Context Panoptic Part (PPP) datasets, the basic data structure looks like below. The extra annotations for the PPP dataset can be downloaded here.
data/
├── VOCdevkit                  # pascal context part dataset
│   ├── gt_panoptic
│   ├── labels_57part
│   └── VOC2010
│       └── JPEGImages
└── cityscapes                 # cityscapes panoptic part dataset
    ├── annotations
    │   ├── cityscapes_panoptic_{train,val}.json
    │   ├── instancesonly_filtered_gtFine_{train,val}.json
    │   └── val_images.json
    ├── gtFine
    ├── gtFinePanopticParts
    └── leftImg8bit
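If the datasets already live elsewhere on disk, one way to match this layout is to symlink them under data/ (the source paths below are hypothetical):

# link existing dataset folders into the expected layout
mkdir -p data
ln -s /path/to/VOCdevkit data/VOCdevkit
ln -s /path/to/cityscapes data/cityscapes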
Pretrained Models:
Cityscapes Pre-trained Model
Note that the Cityscapes model results can be improved via large-crop fine-tuning; see the config city_r50_fam_ft.py.
R-50: link
Swin-base: link
Pascal Context Pretrained Model on COCO
R-50: link
Swin-base: link
Trained Models
CPP
R-50: link, PartPQ: 57.5
Swin-base: link, PartPQ: 61.9
PPP
R-50: link, PartPQ: 45.1
R-101: link, PartPQ: 45.6
Swin-base: link, PartPQ: 52.3
Training and Testing
For a single machine with multiple GPUs, run the commands below. To reproduce the reported performance, make sure you have loaded the pretrained checkpoint correctly.
# train
sh ./tools/dist_train.sh $CONFIG $NUM_GPU
# test
sh ./tools/dist_test.sh $CONFIG $CHECKPOINT --eval panoptic part
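For example, to train and test the CPP R-50 model on 8 GPUs (the config path is hypothetical; use the actual file under configs/):

sh ./tools/dist_train.sh configs/panoptic_part/cpp_r50.py 8
sh ./tools/dist_test.sh configs/panoptic_part/cpp_r50.py work_dirs/cpp_r50/latest.pth --eval panoptic part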
Note: for PPP dataset training, it is better to add the --no-validate flag, since online evaluation takes long and costs more CPU/memory; then evaluate the model in an offline manner, as sketched below.
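A sketch of that offline workflow (config and checkpoint paths are hypothetical):

# train without online evaluation
sh ./tools/dist_train.sh configs/panoptic_part/ppp_r50.py 8 --no-validate
# evaluate offline afterwards
sh ./tools/dist_test.sh configs/panoptic_part/ppp_r50.py work_dirs/ppp_r50/latest.pth --eval panoptic part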
For multiple machines, we use Slurm:
sh ./tools/slurm_train.sh $PARTITION $JOB_NAME $CONFIG $WORK_DIR
sh ./tools/slurm_test.sh $PARTITION $JOB_NAME $CONFIG $CHECKPOINT --eval panoptic part
- PARTITION: the Slurm partition you are using
- CHECKPOINT: the path of a checkpoint downloaded from our model zoo or trained by yourself
- WORK_DIR: the working directory to save configs, logs, and checkpoints
- CONFIG: the config file under the configs/ directory
- JOB_NAME: the job name, required by Slurm
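For example (partition, job name, and config path are hypothetical; the GPUS variable follows the usual MMDetection Slurm script convention):

GPUS=8 sh ./tools/slurm_train.sh mypartition partformer_cpp configs/panoptic_part/cpp_r50.py work_dirs/cpp_r50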
You can also run training and testing without Slurm, by directly using mim for instance/semantic segmentation or tools/train.py for panoptic segmentation, like below:
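A minimal single-GPU sketch following standard MMDetection conventions (adjust config and checkpoint paths):

python tools/train.py $CONFIG
python tools/test.py $CONFIG $CHECKPOINT --eval panoptic part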
Demo Visualization
For Cityscapes Panoptic Part:
python demo/image_demo.py $IMG_FILE $CONFIG $CHECKPOINT $OUT_FILE
For Pascal Panoptic Part:
python demo/image_demo.py $IMG_FILE $CONFIG $CHECKPOINT $OUT_FILE --datasetspec_path=$DATASET_SPEC --evalspec_path=$EVAL_SPEC
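For example (image, config, and checkpoint paths are hypothetical):

python demo/image_demo.py demo/demo.png configs/panoptic_part/cpp_r50.py work_dirs/cpp_r50/latest.pth demo_out.png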
Visual Results
Acknowledgement
We build our codebase on K-Net and MMDetection; many thanks for their open-sourced code. We mainly refer to K-Net's implementation of the thing/stuff kernel (query) interaction.
Citation
If you find this repo useful for your research, please consider citing our paper:
@inproceedings{li2022panoptic,
  title={Panoptic-PartFormer: Learning a Unified Model for Panoptic Part Segmentation},
  author={Li, Xiangtai and Xu, Shilin and Yang, Yibo and Cheng, Guangliang and Tong, Yunhai and Tao, Dacheng},
  booktitle={ECCV},
  year={2022}
}
License
MIT license