

[CVPR-2024] AllSpark: Reborn Labeled Features from Unlabeled in Transformer for Semi-Supervised Semantic Segmentation


This repo is the official implementation of AllSpark: Reborn Labeled Features from Unlabeled in Transformer for Semi-Supervised Semantic Segmentation, which was accepted at CVPR 2024.

<p align="center"> <img src="./docs/allspark.jpg" width=39% height=65% class="center"> <img src="./docs/framework.png" width=60% height=65% class="center"> </p>

The AllSpark is a powerful Cybertronian artifact in the Transformers film series. It was used to revive Optimus Prime in Transformers: Revenge of the Fallen, which aligns well with our core idea.

💥 Motivation

In this work, we found that simply converting existing semi-supervised segmentation methods into a pure-transformer framework is ineffective.

<p align="center"> <img src="./docs/backbone.png" width=50% height=80% class="center"> <img src="./docs/issue.jpg" width=35% height=65% class="center"> </p>

Thus, we propose to intervene and diversify the labeled data flow with unlabeled data in the feature domain, leading to improvements in generalizability.

🛠️ Usage

‼️ IMPORTANT: This version is not the final version. We made some mistakes when re-organizing the code. We will release the correct version soon. Sorry for any inconvenience this may cause.

1. Environment

First, clone this repo:

git clone https://github.com/xmed-lab/AllSpark.git
cd AllSpark/

Then, create a new environment and install the requirements:

conda create -n allspark python=3.7
conda activate allspark
pip install torch==1.12.0+cu116 torchvision==0.13.0+cu116 torchaudio==0.12.0 --extra-index-url https://download.pytorch.org/whl/cu116
pip install tensorboard
pip install six
pip install pyyaml
pip install -U openmim
mim install mmcv==1.6.2
pip install einops
pip install timm
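
After installation, a quick import check can confirm that the packages resolved correctly. The sketch below is not part of the repository: the helper name `installed_versions` is hypothetical, and the list uses the usual import names for the packages installed above (for example `yaml` for `pyyaml`). It only relies on the standard library, so it also runs on Python 3.7.

```python
import importlib


def installed_versions(module_names):
    """Map each module name to its version string, or None if it cannot be imported."""
    versions = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except Exception:  # ImportError, or a broken binary dependency
            versions[name] = None
            continue
        # Not every module exposes __version__; fall back to a placeholder.
        versions[name] = getattr(module, "__version__", "unknown")
    return versions


if __name__ == "__main__":
    # Import names assumed for the packages installed in this section.
    expected = ["torch", "torchvision", "mmcv", "einops", "timm",
                "yaml", "six", "tensorboard"]
    for name, version in installed_versions(expected).items():
        status = version if version is not None else "NOT IMPORTABLE"
        print(f"{name:12s} {status}")
```

Any line reporting `NOT IMPORTABLE` points at the install command to re-run.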

2. Data Preparation & Pre-trained Weights

2.1 Pascal VOC 2012 Dataset

Download the dataset with wget:

wget https://hkustconnect-my.sharepoint.com/:u:/g/personal/hwanggr_connect_ust_hk/EcgD_nffqThPvSVXQz6-8T0B3K9BeUiJLkY_J-NvGscBVA\?e\=2b0MdI\&download\=1 -O pascal.zip
unzip pascal.zip

2.2 Cityscapes Dataset

Download the dataset with wget:

wget https://hkustconnect-my.sharepoint.com/:u:/g/personal/hwanggr_connect_ust_hk/EWoa_9YSu6RHlDpRw_eZiPUBjcY0ZU6ZpRCEG0Xp03WFxg\?e\=LtHLyB\&download\=1 -O cityscapes.zip
unzip cityscapes.zip

2.3 COCO Dataset

Download the dataset with wget:

wget https://hkustconnect-my.sharepoint.com/:u:/g/personal/hwanggr_connect_ust_hk/EXCErskA_WFLgGTqOMgHcAABiwH_ncy7IBg7jMYn963BpA\?e\=SQTCWg\&download\=1 -O coco.zip
unzip coco.zip

Then your file structure will be like:

├── VOC2012
    ├── JPEGImages
    └── SegmentationClass
├── cityscapes
    ├── leftImg8bit
    └── gtFine
├── coco
    ├── train2017
    ├── val2017
    └── masks

Next, download the following pretrained weights.

├── ./pretrained_weights
    ├── mit_b2.pth
    ├── mit_b3.pth
    ├── mit_b4.pth
    └── mit_b5.pth

For example, to download the MiT-B5 weights:

mkdir -p pretrained_weights
wget https://hkustconnect-my.sharepoint.com/:u:/g/personal/hwanggr_connect_ust_hk/ET0iubvDmcBGnE43-nPQopMBw9oVLsrynjISyFeGwqXQpw\?e\=9wXgso\&download\=1 -O ./pretrained_weights/mit_b5.pth
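
Downloads from shared links can sometimes fail quietly and leave an HTML page or an empty file under the requested name. The following sketch is a heuristic sanity check for the archives and weights fetched in this section. It is an assumption-based helper, not part of the repository: `download_looks_valid` is a hypothetical name, and the check only inspects the first bytes of the file and, for `.zip` files, the archive structure.

```python
import zipfile
from pathlib import Path


def download_looks_valid(path):
    """Heuristic check that a downloaded file is real data, not an error page."""
    path = Path(path)
    if not path.is_file() or path.stat().st_size == 0:
        return False
    with path.open("rb") as handle:
        head = handle.read(64).lstrip().lower()
    # A failed download is often an HTML page saved under the requested name.
    if head.startswith(b"<!doctype") or head.startswith(b"<html"):
        return False
    # Zip archives get an additional structural check.
    if path.suffix == ".zip":
        return zipfile.is_zipfile(str(path))
    return True


if __name__ == "__main__":
    candidates = ["pascal.zip", "cityscapes.zip", "coco.zip",
                  "pretrained_weights/mit_b5.pth"]
    for name in candidates:
        if Path(name).exists():
            print(name, "ok" if download_looks_valid(name) else "looks broken")
```

A file that passes is not guaranteed to be complete, so a failing `unzip` or checkpoint load still means the download should be repeated.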

3. Training & Evaluating

# use torch.distributed.launch
sh scripts/train.sh <num_gpu> <port>
# to fully reproduce our results, set <num_gpu> to 4 on all three datasets
# otherwise, adjust the learning rate accordingly

# or use slurm
# sh scripts/slurm_train.sh <num_gpu> <port> <partition>
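
The comments above say the learning rate needs adjusting when `<num_gpu>` is not 4, but they do not say how. One common heuristic, assumed here and not confirmed by this repository, is the linear scaling rule: if the per-GPU batch size stays fixed, the total batch size grows with the GPU count, and the learning rate is scaled by the same factor. The function name and the `base_lr` value below are placeholders for illustration.

```python
def scaled_learning_rate(base_lr, num_gpu, reference_gpus=4):
    """Scale a learning rate linearly with the GPU count (an assumed heuristic)."""
    if num_gpu < 1:
        raise ValueError("num_gpu must be at least 1")
    return base_lr * num_gpu / reference_gpus


if __name__ == "__main__":
    base_lr = 1e-3  # placeholder; use the value from the repository's config
    for gpus in (1, 2, 4, 8):
        print(gpus, scaled_learning_rate(base_lr, gpus))
```

Treat the result as a starting point and validate it on a short run, since the rule is only a rough guide.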

To train on other datasets or splits, modify dataset and split in scripts/train.sh.

4. Results

Model weights and training logs will be released soon.

4.1 PASCAL VOC 2012 original

<p align="left"> <img src="./docs/pascal_org.png" width=60% class="center"> </p>
Weights of AllSpark: 76.07, 78.41, 79.77, 80.75, 82.12
Reproduced: 76.06 (log), 78.41, 79.93 (log), 80.70 (log), 82.56 (log)

4.2 PASCAL VOC 2012 augmented

<p align="left"> <img src="./docs/pascal_aug.png" width=60% class="center"> </p>
Weights of AllSpark: 78.32, 79.98, 80.42, 81.14

4.3 Cityscapes

<p align="left"> <img src="./docs/cityscapes.png" width=60% class="center"> </p>
Weights of AllSpark: 78.33, 79.24, 80.56, 81.39

4.4 COCO

<p align="left"> <img src="./docs/coco.png" width=60% class="center"> </p>
Weights of AllSpark: 34.10 (log), 41.65 (log), 45.48 (log), 49.56 (log)


If you find this project useful, please consider citing:

  title={AllSpark: Reborn Labeled Features from Unlabeled in Transformer for Semi-Supervised Semantic Segmentation},
  author={Wang, Haonan and Zhang, Qixiang and Li, Yi and Li, Xiaomeng},


AllSpark is built upon UniMatch and SegFormer. We thank their authors for making the source code publicly available.