FPTrans: Feature-Proxy Transformer for Few-Shot Segmentation

Jian-Wei Zhang, Yifan Sun, Yi Yang, Wei Chen

[arXiv][Bibtex]

This repository contains the PyTorch implementation. A PaddlePaddle implementation is available here.

Framework

Installation

Create a virtual environment and install the required packages.

conda create -n fptrans python=3.9.7
conda activate fptrans
conda install numpy=1.21.2
conda install pytorch==1.10.0 torchvision==0.11.1 cudatoolkit=11.3 -c pytorch
conda install tqdm scipy pyyaml
pip install git+https://github.com/IDSIA/sacred.git@0.8.3
pip install dropblock pycocotools opencv-python
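After installation, it can be useful to confirm that the pinned versions are the ones actually present in the environment. The following is a minimal sketch, not part of the repository: the helper name check_pins is hypothetical, and it assumes the conda packages expose standard Python package metadata under the names numpy, torch and torchvision.

```python
from importlib import metadata

# Versions pinned by the conda commands above.
PINNED = {"numpy": "1.21.2", "torch": "1.10.0", "torchvision": "0.11.1"}


def check_pins(pins):
    """Return {package: (installed_version_or_None, matches_pin)}.

    Hypothetical helper for illustration; it only reads package metadata
    and never imports the packages themselves.
    """
    report = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Some builds carry a local suffix such as "+cu113"; compare the prefix.
        ok = installed is not None and installed.split("+")[0] == wanted
        report[name] = (installed, ok)
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_pins(PINNED).items():
        status = "OK" if ok else "MISMATCH"
        print(f"{name}: installed={installed} pinned={PINNED[name]} [{status}]")
```

A package that is not installed is reported with an installed version of None rather than raising an error, so the check can run in a partially prepared environment.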

Put the following bash function in ~/.bashrc to simplify setting CUDA_VISIBLE_DEVICES.

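function cuda()
{
    # Usage: cuda <gpu_ids> <command> [args...]
    if [ "$#" -eq 0 ]; then
        return
    fi
    local GPU_ID=$1
    shift 1
    # Quote "$@" so that arguments containing spaces are passed through intact.
    CUDA_VISIBLE_DEVICES="$GPU_ID" "$@"
}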

Now we can use "cuda 0 python" to run on a single GPU and "cuda 0,1 python" to run on multiple GPUs.
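When runs are launched from a Python script rather than an interactive shell, the same idea can be expressed with subprocess. This is a minimal sketch under that assumption; run_with_gpus is a hypothetical helper and is not part of the FPTrans code base.

```python
import os
import subprocess
import sys


def run_with_gpus(gpu_ids, command):
    """Run `command` (a list of arguments) with CUDA_VISIBLE_DEVICES set.

    Hypothetical helper mirroring the bash `cuda` function above.
    """
    env = dict(os.environ)  # copy, so the parent environment is untouched
    env["CUDA_VISIBLE_DEVICES"] = gpu_ids
    return subprocess.run(command, env=env, capture_output=True, text=True)


if __name__ == "__main__":
    # The child process simply echoes the variable it received.
    child = [sys.executable, "-c",
             "import os; print(os.environ['CUDA_VISIBLE_DEVICES'])"]
    result = run_with_gpus("0,1", child)
    print(result.stdout.strip())
```

As with the bash function, the variable is set only for the child process, so other commands started later are not affected.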

Getting Started

See Preparing Datasets and Pretrained Backbones for FPTrans.

Usage for inference with our pretrained models

Download our pretrained FPTrans checkpoints from Google Drive or Baidu Drive (code: FPTr), and put the pretrained models (the numbered folders) into ./output/.

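| Dataset | Backbone | #Shots | Experiment ID (Split 0 - Split 3) |
| --- | --- | --- | --- |
| PASCAL-5i | ViT-B/16 | 1-shot | 1, 2, 3, 4 |
| PASCAL-5i | DeiT-B/16 | 1-shot | 5, 6, 7, 8 |
| PASCAL-5i | DeiT-S/16 | 1-shot | 9, 10, 11, 12 |
| PASCAL-5i | DeiT-T/16 | 1-shot | 13, 14, 15, 16 |
| PASCAL-5i | ViT-B/16 | 5-shot | 17, 18, 19, 20 |
| PASCAL-5i | DeiT-B/16 | 5-shot | 21, 22, 23, 24 |
| COCO-20i | ViT-B/16 | 1-shot | 25, 26, 27, 28 |
| COCO-20i | DeiT-B/16 | 1-shot | 29, 30, 31, 32 |
| COCO-20i | ViT-B/16 | 5-shot | 33, 34, 35, 36 |
| COCO-20i | DeiT-B/16 | 5-shot | 37, 38, 39, 40 |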
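Because each group of checkpoints occupies four consecutive IDs (one per split), the experiment ID can be computed from the first ID of the group. The sketch below transcribes the table above into a lookup; the names FIRST_EXP_ID and experiment_id, and the short dataset keys "pascal" and "coco", are hypothetical and chosen only for illustration.

```python
# First experiment ID (split 0) of each released checkpoint group,
# transcribed from the table above. Splits 1-3 follow consecutively.
FIRST_EXP_ID = {
    ("pascal", "ViT-B/16", 1): 1,
    ("pascal", "DeiT-B/16", 1): 5,
    ("pascal", "DeiT-S/16", 1): 9,
    ("pascal", "DeiT-T/16", 1): 13,
    ("pascal", "ViT-B/16", 5): 17,
    ("pascal", "DeiT-B/16", 5): 21,
    ("coco", "ViT-B/16", 1): 25,
    ("coco", "DeiT-B/16", 1): 29,
    ("coco", "ViT-B/16", 5): 33,
    ("coco", "DeiT-B/16", 5): 37,
}


def experiment_id(dataset, backbone, shot, split):
    """Return the checkpoint's experiment ID for the given setting.

    Hypothetical helper: raises KeyError for a combination that has no
    released checkpoint and ValueError for a split outside 0-3.
    """
    if split not in range(4):
        raise ValueError("split must be 0, 1, 2 or 3")
    return FIRST_EXP_ID[(dataset, backbone, shot)] + split


if __name__ == "__main__":
    print(experiment_id("pascal", "ViT-B/16", 5, 0))
```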

Run the test command:

# PASCAL ViT 1shot
cuda 0 python run.py test with configs/pascal_vit.yml exp_id=1 split=0

# PASCAL ViT 5shot
cuda 0 python run.py test with configs/pascal_vit.yml exp_id=17 split=0 shot=5

# COCO to PASCAL 1shot (cross-domain; no training needed, test only)
# Load a model trained on COCO and test it on PASCAL.
# Note: the code uses splits different from PASCAL-5i so that the PASCAL test
#       classes do not appear in the COCO training data.
cuda 0 python run.py test with configs/coco2pascal_vit.yml exp_id=29 split=0
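To evaluate all four splits of one setting, the test commands above can be generated in a loop. This is a sketch only: build_test_commands is a hypothetical helper, and it assumes (as the 1-shot examples above suggest) that shot=... may be omitted for 1-shot runs and that the IDs of a checkpoint group are consecutive across splits 0-3.

```python
def build_test_commands(config, first_exp_id, shot=1, gpu="0"):
    """Build one test command per split (0-3) for a checkpoint group.

    Hypothetical helper: `first_exp_id` is the experiment ID of split 0,
    and the remaining splits are assumed to use the next three IDs.
    """
    commands = []
    for split in range(4):
        parts = ["cuda", gpu, "python", "run.py", "test", "with", config,
                 f"exp_id={first_exp_id + split}", f"split={split}"]
        if shot != 1:
            # 1-shot commands in this README omit the argument.
            parts.append(f"shot={shot}")
        commands.append(" ".join(parts))
    return commands


if __name__ == "__main__":
    for cmd in build_test_commands("configs/pascal_vit.yml", 17, shot=5):
        print(cmd)
```

The printed lines can be pasted into a shell in which the cuda function is defined.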

Usage for training from scratch

Run the train command (adjust the batch size bs to fit your GPU memory):

# PASCAL 1shot
cuda 0 python run.py train with split=0 configs/pascal_vit.yml

# PASCAL 5shot
cuda 0,1 python run.py train with split=0 configs/pascal_vit.yml shot=5

# COCO 1shot
cuda 0,1 python run.py train with split=0 configs/coco_vit.yml

# COCO 5shot
cuda 0,1,2,3 python run.py train with split=0 configs/coco_vit.yml shot=5 bs=8

Optional arguments:

Please refer to the Sacred documentation for the complete command-line interface.

Performance

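PASCAL-5i

| Backbone | Method | 1-shot | 5-shot |
| --- | --- | --- | --- |
| ResNet-50 | HSNet | 64.0 | 69.5 |
| ResNet-50 | BAM | 67.8 | 70.9 |
| ViT-B/16-384 | FPTrans | 64.7 | 73.7 |
| DeiT-T/16 | FPTrans | 59.7 | 68.2 |
| DeiT-S/16 | FPTrans | 65.3 | 74.2 |
| DeiT-B/16-384 | FPTrans | 68.8 | 78.0 |

COCO-20i

| Backbone | Method | 1-shot | 5-shot |
| --- | --- | --- | --- |
| ResNet-50 | HSNet | 39.2 | 46.9 |
| ResNet-50 | BAM | 46.2 | 51.2 |
| ViT-B/16-384 | FPTrans | 42.0 | 53.8 |
| DeiT-B/16-384 | FPTrans | 47.0 | 58.9 |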

Note that these results were obtained on NVIDIA A100/V100 GPUs. We find that the results may fluctuate slightly on an NVIDIA GeForce RTX 3090, even with exactly the same model and environment.

Citing FPTrans

@inproceedings{zhang2022FPTrans,
  title={Feature-Proxy Transformer for Few-Shot Segmentation},
  author={Jian-Wei Zhang and Yifan Sun and Yi Yang and Wei Chen},
  booktitle={NeurIPS},
  year={2022}
}