Tree Energy Loss: Towards Sparsely Annotated Semantic Segmentation

Introduction

This repository is an official implementation of the CVPR 2022 paper Tree Energy Loss: Towards Sparsely Annotated Semantic Segmentation.

[TEL overview figure]

Abstract. Sparsely annotated semantic segmentation (SASS) aims to train a segmentation network with coarse-grained (i.e., point-, scribble-, and block-wise) supervision, where only a small proportion of pixels are labeled in each image. In this paper, we propose a novel tree energy loss for SASS that provides semantic guidance for unlabeled pixels. The tree energy loss represents images as minimum spanning trees to model both low-level and high-level pairwise affinities. By sequentially applying these affinities to the network prediction, soft pseudo labels for unlabeled pixels are generated in a coarse-to-fine manner, achieving dynamic online self-training. The tree energy loss is effective and easy to incorporate into existing frameworks by combining it with a traditional segmentation loss.
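The core idea above can be sketched in a few lines of NumPy/SciPy: build a minimum spanning tree over a pixel grid whose edge weights come from a guide feature (e.g. intensity), turn tree distances into pairwise affinities, and filter the network's class probabilities through those affinities to obtain soft pseudo labels for unlabeled pixels. This is a toy, O(n²) illustration of the mechanism, not the paper's implementation (which uses efficient GPU tree-filtering kernels from TreeFilter-Torch and cascades low- and high-level affinities); the function name and the `sigma` bandwidth are made up for the example.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import minimum_spanning_tree, shortest_path

def tree_filter(features, probs, sigma=0.5):
    """Toy tree-filtering step.

    features: (H, W) low-level guide map (e.g. grayscale intensity in [0, 1])
    probs:    (H, W, C) per-pixel class probabilities from the network
    Returns soft pseudo labels of shape (H, W, C).
    """
    H, W = features.shape
    n = H * W
    f = features.ravel()
    # 4-connected grid graph; edge weight = guide-feature difference.
    # A tiny epsilon keeps zero-weight edges from being dropped by the
    # sparse MST routine, so the tree stays connected.
    rows, cols, vals = [], [], []
    for i in range(H):
        for j in range(W):
            u = i * W + j
            if j + 1 < W:
                rows.append(u); cols.append(u + 1)
                vals.append(abs(f[u] - f[u + 1]) + 1e-8)
            if i + 1 < H:
                rows.append(u); cols.append(u + W)
                vals.append(abs(f[u] - f[u + W]) + 1e-8)
    graph = csr_matrix((vals, (rows, cols)), shape=(n, n))
    mst = minimum_spanning_tree(graph)
    # Pairwise distances along the tree (dense all-pairs: toy sizes only).
    dist = shortest_path(mst, directed=False)
    # Affinity decays with tree distance; normalize per pixel.
    w = np.exp(-dist / sigma)
    w /= w.sum(axis=1, keepdims=True)
    # Soft pseudo labels = affinity-weighted average of class probabilities.
    return (w @ probs.reshape(n, -1)).reshape(H, W, -1)
```

On a toy two-region image, the filtered probabilities stay valid distributions and pixels aggregate evidence mostly from their own region, since crossing the region boundary costs a large tree distance. In the paper, the unsupervised term then penalizes the gap between the prediction and these pseudo labels on unlabeled pixels, alongside a standard segmentation loss on the labeled ones.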

News

(03/03/2022) Tree Energy Loss has been accepted by CVPR 2022.

(15/03/2022) Update codes and models.

Main Results

| Method | Backbone | Dataset | Annotation | mIoU | Model |
| --- | --- | --- | --- | --- | --- |
| HRNet | HRNet_w48 | Cityscapes | block50 | 72.2 | google |
| HRNet | HRNet_w48 | Cityscapes | block20 | 66.8 | google |
| HRNet | HRNet_w48 | Cityscapes | block10 | 61.8 | google |
| HRNet | HRNet_w48 | ADE20k | block50 | 40.3 | google |
| HRNet | HRNet_w48 | ADE20k | block20 | 36.5 | google |
| HRNet | HRNet_w48 | ADE20k | block10 | 34.7 | google |
| DeeplabV3+ | ResNet101 | VOC2012 | point | 65.4 | google |
| LTF | ResNet101 | VOC2012 | point | 68.0 | google |
| DeeplabV3+ | ResNet101 | VOC2012 | scribble | 77.6 | google |
| LTF | ResNet101 | VOC2012 | scribble | 77.4 | google |

Requirements

Installation

This implementation is built upon openseg.pytorch and TreeFilter-Torch. Many thanks to the authors for their efforts.

Sparse Annotation Preparation

After preparing the sparse annotations, the dataset directory should look like:

$DATA_ROOT
├── cityscapes
│   ├── train
│   │   ├── image
│   │   ├── label
│   │   └── sparse_label
│   │       ├── block10
│   │       ├── block20
│   │       └── block50
│   ├── val
│   │   ├── image
│   │   └── label
├── ade20k
│   ├── train
│   │   ├── image
│   │   ├── label
│   │   └── sparse_label
│   │       ├── block10
│   │       ├── block20
│   │       └── block50
│   ├── val
│   │   ├── image
│   │   └── label
├── voc2012
│   ├── voc_scribbles.zip
│   ├── voc_whats_the_point.json
│   └── voc_whats_the_point_bg_from_scribbles.json

Block-Supervised Setting

(1) To evaluate the released models:

bash scripts/cityscapes/hrnet/demo.sh val block50

(2) To train and evaluate your own models:

bash scripts/cityscapes/hrnet/train.sh train model_name

bash scripts/cityscapes/hrnet/train.sh val model_name

Point-Supervised and Scribble-Supervised Settings

(1) To evaluate the released models:

bash scripts/voc2012/deeplab/demo.sh val scribble

(2) To train and evaluate your own models:

bash scripts/voc2012/deeplab/train.sh train model_name

bash scripts/voc2012/deeplab/train.sh val model_name