This repo is a PyTorch implementation of the paper *Progressive Feature Self-Reinforcement for Weakly Supervised Semantic Segmentation*.
## Data Preparation
### PASCAL VOC 2012
1. Download

```bash
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
```
2. Segmentation Labels

The augmented annotations come from the SBD dataset; a download link for them is available at DropBox. After downloading `SegmentationClassAug.zip`, unzip it and move it to `VOCdevkit/VOC2012/`. The expected directory layout is:
```
VOCdevkit
└── VOC2012
    ├── Annotations
    ├── ImageSets
    ├── JPEGImages
    ├── SegmentationClass
    ├── SegmentationClassAug
    └── SegmentationObject
```
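To catch path mistakes early, here is a minimal sketch (assuming only the folder names above) that verifies the layout:

```python
# Minimal sanity check (illustrative): verify the VOC2012 layout above.
from pathlib import Path

root = Path("VOCdevkit/VOC2012")
for name in ["Annotations", "ImageSets", "JPEGImages",
             "SegmentationClass", "SegmentationClassAug", "SegmentationObject"]:
    assert (root / name).is_dir(), f"missing {root / name}"
print("VOC2012 layout looks complete")
```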
### MSCOCO 2014
1. Download

```bash
wget http://images.cocodataset.org/zips/train2014.zip
wget http://images.cocodataset.org/zips/val2014.zip
```
2. Segmentation Labels

To generate VOC-style segmentation labels for COCO, you can use the scripts provided at this repo, or just download the generated masks from Google Drive. The expected directory layout is shown below, followed by a sketch of the conversion idea.
```
COCO
├── JPEGImages
│   ├── train2014
│   └── val2014
└── SegmentationClass
    ├── train2014
    └── val2014
```
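For intuition, a hypothetical sketch of the conversion using `pycocotools` (the linked scripts are the reference; `coco_to_class_mask` and all paths here are illustrative assumptions):

```python
# Hypothetical sketch: build VOC-style class masks from COCO annotations
# with pycocotools. The repo's linked scripts are the reference.
import numpy as np
from pycocotools.coco import COCO

def coco_to_class_mask(coco, img_id):
    info = coco.loadImgs(img_id)[0]
    mask = np.zeros((info["height"], info["width"]), dtype=np.uint8)
    for ann in coco.loadAnns(coco.getAnnIds(imgIds=img_id)):
        m = coco.annToMask(ann)            # binary instance mask
        mask[m == 1] = ann["category_id"]  # later instances overwrite earlier ones
    return mask

coco = COCO("annotations/instances_train2014.json")
first_id = coco.getImgIds()[0]
class_mask = coco_to_class_mask(coco, first_id)  # save as PNG under SegmentationClass/
```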
## Requirements

Please refer to `requirements.txt`.

Our implementation incorporates a regularization term for segmentation. Please download and compile the Python extension.
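For intuition only, a simplified pairwise smoothness term can be written in plain PyTorch; this is not the compiled extension, and `pairwise_smoothness` is an illustrative name:

```python
# Illustrative only: a simplified pairwise smoothness term in plain PyTorch.
# The actual regularized loss is evaluated by the compiled extension.
import torch

def pairwise_smoothness(probs: torch.Tensor) -> torch.Tensor:
    # probs: (B, C, H, W) softmax outputs; penalize probability changes
    # between neighboring pixels along height and width.
    dh = (probs[:, :, 1:, :] - probs[:, :, :-1, :]).abs().mean()
    dw = (probs[:, :, :, 1:] - probs[:, :, :, :-1]).abs().mean()
    return dh + dw
```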
## Train

The encoder is `vit_base_patch16_224` pretrained on ImageNet. Download the weights to `./pretrained/`.
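One possible way to fetch compatible weights is through `timm` (an assumption; the repo may instead point to a specific checkpoint file):

```python
# One possible way to fetch ImageNet-pretrained ViT-B/16 weights via timm
# (assumption: the repo may expect a specific checkpoint in ./pretrained/).
import os
import timm
import torch

os.makedirs("pretrained", exist_ok=True)
model = timm.create_model("vit_base_patch16_224", pretrained=True)
torch.save(model.state_dict(), "pretrained/vit_base_patch16_224.pth")
```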
```bash
CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --nproc_per_node 4 train_voc.py --data_folder [VOCdevkit/VOC2012]

CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --nproc_per_node 4 train_coco.py --data_folder [COCO]
```
Arguments most relevant to this project (a sketch of the EMA update and block masking follows the list):

- `--cls_depth`: number of aggregation modules
- `--out_dim`: dimension of the projector output
- `--momentum`: EMA update parameter for the teacher
- `--use_mim`: whether to enable masking
- `--block_size`: masking block size; must be a multiple of the ViT patch size
- `--mask_ratio`: masking ratio
- `--w_class`: FSR loss weight for the aggregated token
- `--w_patch`: FSR loss weight for masked patch tokens
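As referenced above, here is a minimal sketch of how an EMA teacher update (`--momentum`) and block-wise masking (`--block_size`, `--mask_ratio`) are typically implemented in DINO-style training; function names and default values are illustrative assumptions, not the repo's actual code:

```python
import torch

@torch.no_grad()
def ema_update(teacher, student, momentum=0.9995):
    # Teacher parameters drift toward the student: t <- m * t + (1 - m) * s.
    for t, s in zip(teacher.parameters(), student.parameters()):
        t.mul_(momentum).add_(s, alpha=1.0 - momentum)

def random_block_mask(img_size=224, patch_size=16, block_size=32, mask_ratio=0.4):
    # Patch-level mask built from square blocks; block_size must be a
    # multiple of the ViT patch size so blocks align with the patch grid.
    assert block_size % patch_size == 0 and img_size % block_size == 0
    bgrid = img_size // block_size        # blocks per side
    per_block = block_size // patch_size  # patches per block side
    n_blocks = bgrid * bgrid
    n_masked = int(n_blocks * mask_ratio)
    flat = torch.zeros(n_blocks, dtype=torch.bool)
    flat[torch.randperm(n_blocks)[:n_masked]] = True
    mask = flat.view(bgrid, bgrid)
    # Expand each block entry to its patches -> (img_size/patch, img_size/patch).
    return mask.repeat_interleave(per_block, 0).repeat_interleave(per_block, 1)
```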
## Evaluation

`infer_*.py` will apply dense CRF to the predicted segmentation labels. A sketch of the CRF step follows the commands below.
```bash
python infer_voc.py --checkpoint [PATH_TO_CHECKPOINT] --data_folder [VOCdevkit/VOC2012] --infer_set [val | test] --save_cam [True | False]

python infer_coco.py --checkpoint [PATH_TO_CHECKPOINT] --data_folder [COCO] --infer_set val --save_cam [True | False]
```
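A hedged sketch of the CRF step, assuming `pydensecrf` (parameter values are illustrative, not necessarily those used in `infer_*.py`):

```python
# Sketch of dense CRF post-processing with pydensecrf; parameter values
# are illustrative, not necessarily those used in infer_*.py.
import numpy as np
import pydensecrf.densecrf as dcrf
from pydensecrf.utils import unary_from_softmax

def crf_refine(image, probs, n_iters=10):
    # image: (H, W, 3) uint8 RGB; probs: (C, H, W) float softmax probabilities.
    c, h, w = probs.shape
    d = dcrf.DenseCRF2D(w, h, c)
    d.setUnaryEnergy(unary_from_softmax(probs.astype(np.float32)))
    d.addPairwiseGaussian(sxy=3, compat=3)
    d.addPairwiseBilateral(sxy=80, srgb=13,
                           rgbim=np.ascontiguousarray(image), compat=10)
    q = d.inference(n_iters)
    return np.argmax(np.array(q).reshape(c, h, w), axis=0)  # (H, W) labels
```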
## Acknowledgement

This repo is built upon ToCo, and our work is greatly inspired by DINO. Many thanks for their brilliant works!