SimSeg

[CVPR'23] A Simple Framework for Text-Supervised Semantic Segmentation

<p align="center"> <img src="./docs/cvpr23_simseg.png"> </p>

Links

Here are [Paper] and [Video].

Performance

| Method | Backbone | PASCAL VOC | PASCAL Context | COCO Stuff |
| --- | --- | --- | --- | --- |
| SimSeg | ViT-S | 56.6 | 25.8 | 27.2 |
| SimSeg | ViT-B | 57.4 | 26.2 | 29.7 |

Checkpoints

SimSeg checkpoints: Google Drive
Please save the .pth files under the ckpts/ folder.

SimSeg
├── ckpts
│   ├── simseg.vit-b.pth
│   ├── simseg.vit-s.pth

Dataset

We follow the MMSegmentation Dataset Preparation guide to download and set up the test sets.
We recommend arranging the datasets as follows.
If your folder structure is different, you may need to change the corresponding paths in config files.

SimSeg
├── data
│   ├── label_category
│   │   ├── pascal_voc.txt
│   │   ├── pascal_context.txt
│   │   ├── coco_stuff.txt
│   ├── VOCdevkit
│   │   ├── VOC2012
│   │   │   ├── JPEGImages
│   │   │   ├── SegmentationClass
│   │   │   ├── ImageSets
│   │   │   │   ├── Segmentation
│   │   │   │   │   ├── train.txt
│   │   │   │   │   ├── val.txt
│   │   ├── VOC2010
│   │   │   ├── JPEGImages
│   │   │   ├── SegmentationClassContext
│   │   │   ├── ImageSets
│   │   │   │   ├── SegmentationContext
│   │   │   │   │   ├── train.txt
│   │   │   │   │   ├── val.txt
│   │   │   ├── trainval_merged.json
│   ├── coco_stuff164k
│   │   ├── images
│   │   │   ├── train2017
│   │   │   ├── val2017
│   │   ├── annotations
│   │   │   ├── train2017
│   │   │   ├── val2017
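Before launching an evaluation, it can help to confirm that the folders match the layout above. The following is a minimal sketch, not part of SimSeg: the helper name `missing_paths` and the grouping keys are hypothetical, and it assumes the script is run from the repository root. It checks only the validation-side entries of the tree, on the assumption that evaluation does not need the training splits.

```python
from pathlib import Path

# Paths copied from the recommended layout above, grouped by test set.
# The group names are labels for the report only (an assumption of this sketch).
EXPECTED_LAYOUT = {
    "pascal_voc": [
        "label_category/pascal_voc.txt",
        "VOCdevkit/VOC2012/JPEGImages",
        "VOCdevkit/VOC2012/SegmentationClass",
        "VOCdevkit/VOC2012/ImageSets/Segmentation/val.txt",
    ],
    "pascal_context": [
        "label_category/pascal_context.txt",
        "VOCdevkit/VOC2010/JPEGImages",
        "VOCdevkit/VOC2010/SegmentationClassContext",
        "VOCdevkit/VOC2010/ImageSets/SegmentationContext/val.txt",
    ],
    "coco_stuff": [
        "label_category/coco_stuff.txt",
        "coco_stuff164k/images/val2017",
        "coco_stuff164k/annotations/val2017",
    ],
}


def missing_paths(data_root="data", layout=EXPECTED_LAYOUT):
    """Map each dataset name to the expected paths that do not exist yet."""
    root = Path(data_root)
    report = {}
    for name, rel_paths in layout.items():
        absent = [p for p in rel_paths if not (root / p).exists()]
        if absent:
            report[name] = absent
    return report


if __name__ == "__main__":
    for name, absent in missing_paths().items():
        print(f"{name}: missing {len(absent)} path(s)")
        for p in absent:
            print(f"  data/{p}")
```

A dataset that does not appear in the report has every listed path in place. If your data lives elsewhere, pass a different `data_root` and remember to update the paths in the config files as well.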

Pascal VOC

Pascal VOC 2012 can be downloaded from here.

Pascal Context

The training and validation sets of Pascal Context can be downloaded from here.

To split the original dataset into training and validation sets, download trainval_merged.json from here.

Please install the Detail API, then run the following command to convert the annotations into the proper format.

python tools/convert_datasets/pascal_context.py data/VOCdevkit data/VOCdevkit/VOC2010/trainval_merged.json

COCO Stuff

For the COCO Stuff 164k dataset, run the following commands to download and convert the augmented dataset.

# download
mkdir coco_stuff164k && cd coco_stuff164k
wget http://images.cocodataset.org/zips/train2017.zip
wget http://images.cocodataset.org/zips/val2017.zip
wget http://calvin.inf.ed.ac.uk/wp-content/uploads/data/cocostuffdataset/stuffthingmaps_trainval2017.zip

# unzip
unzip train2017.zip -d images/
unzip val2017.zip -d images/
unzip stuffthingmaps_trainval2017.zip -d annotations/

# --nproc 8 runs the conversion with 8 processes; the flag can be omitted.
python tools/convert_datasets/coco_stuff164k.py data/coco_stuff164k --nproc 8

Details of this dataset can be found here.

Environment

Requirements:

Install requirements:

pip install -r requirements.txt
pip install git+https://github.com/lucasb-eyer/pydensecrf.git

mim install mmcv-full==1.7.0

Evaluation

Once you have

  1. downloaded the pre-trained checkpoints, and
  2. prepared the evaluation data,

the models can be evaluated by running the following scripts.

Pascal VOC

python3 -m torch.distributed.launch --nproc_per_node=1 --master_port=65533 tools/seg_evaluation.py --ckpt_path=ckpts/simseg.vit-s.pth --cfg=configs/clip/simseg.vit-s.yaml

Pascal Context

python3 -m torch.distributed.launch --nproc_per_node=1 --master_port=65533 tools/seg_evaluation.py --ckpt_path=ckpts/simseg.vit-s.pth --cfg=configs/clip/simseg.vit-s.yaml data.valid_name=[pascal_context]

COCO Stuff

python3 -m torch.distributed.launch --nproc_per_node=1 --master_port=65533 tools/seg_evaluation.py --ckpt_path=ckpts/simseg.vit-s.pth --cfg=configs/clip/simseg.vit-s.yaml data.valid_name=[coco_stuff]

To switch to the ViT-Base backbone, change

--ckpt_path=ckpts/simseg.vit-s.pth --cfg=configs/clip/simseg.vit-s.yaml

to

--ckpt_path=ckpts/simseg.vit-b.pth --cfg=configs/clip/simseg.vit-b.yaml
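The evaluation commands above differ only in the backbone name and an optional `data.valid_name` override, so they can be generated rather than edited by hand. The following is a minimal sketch under that assumption; `build_eval_command` is a hypothetical helper, not part of SimSeg, and it only reproduces the flags and paths shown in this README.

```python
import shlex


def build_eval_command(backbone="vit-s", dataset=None, master_port=65533):
    """Assemble the evaluation command shown above as an argument list.

    backbone: "vit-s" or "vit-b", matching the checkpoint and config names.
    dataset:  None for the command without an override (the Pascal VOC
              command above), or "pascal_context" / "coco_stuff" to append
              a data.valid_name override.
    """
    if backbone not in ("vit-s", "vit-b"):
        raise ValueError(f"unknown backbone: {backbone!r}")
    cmd = [
        "python3", "-m", "torch.distributed.launch",
        "--nproc_per_node=1",
        f"--master_port={master_port}",
        "tools/seg_evaluation.py",
        f"--ckpt_path=ckpts/simseg.{backbone}.pth",
        f"--cfg=configs/clip/simseg.{backbone}.yaml",
    ]
    if dataset is not None:
        cmd.append(f"data.valid_name=[{dataset}]")
    return cmd


if __name__ == "__main__":
    # Print the three ViT-Base commands, quoted for safe pasting into a shell.
    for dataset in (None, "pascal_context", "coco_stuff"):
        print(shlex.join(build_eval_command("vit-b", dataset)))
```

To run a command directly instead of printing it, the returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root. Because `shlex.join` quotes the bracketed override, the printed form is also safe in shells that treat `[...]` as a glob pattern.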

Acknowledgement

This work is based on ZeroVL (ECCV 2022).

Citation

If you use SimSeg in your research, please use the following BibTeX entry.

@inproceedings{yi2023simseg,
    author={Yi, Muyang and Cui, Quan and Wu, Hao and Yang, Cheng and Yoshie, Osamu and Lu, Hongtao},
    title={A Simple Framework for Text-Supervised Semantic Segmentation},
    booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    year={2023},
    pages={7071-7080}
}

License

SimSeg is released under the MIT license. See LICENSE for details.