Cost Aggregation with 4D Convolutional Swin Transformer for Few-Shot Segmentation (ECCV'22)
Check out the [Project Page] and the paper on [arXiv]. Updated pretrained weights are available here: Link
The ECCV'22 camera-ready version is available here: [arXiv].
Code for semantic matching is available on the semantic-matching branch.
Check out our new TPAMI (TBA) paper! CATs++: https://github.com/KU-CVLAB/CATs-PlusPlus
Network
Our model VAT is illustrated below:
Environment Settings
git clone https://github.com/Seokju-Cho/Volumetric-Aggregation-Transformer.git
cd Volumetric-Aggregation-Transformer
conda env create -f environment.yaml
Preparing Few-Shot Segmentation Datasets
Download the following datasets:
1. PASCAL-5<sup>i</sup>
Download PASCAL VOC2012 devkit (train/val data):
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
Download PASCAL VOC2012 SDS extended mask annotations from our [Google Drive].
2. COCO-20<sup>i</sup>
Download COCO2014 train/val images and annotations:
wget http://images.cocodataset.org/zips/train2014.zip
wget http://images.cocodataset.org/zips/val2014.zip
wget http://images.cocodataset.org/annotations/annotations_trainval2014.zip
Download the COCO2014 train/val mask annotations from our Google Drive: [train2014.zip], [val2014.zip], and place both train2014/ and val2014/ under the annotations/ directory.
3. FSS-1000
Download FSS-1000 images and annotations from our [Google Drive].
Create a directory '../Datasets_VAT' for the three few-shot segmentation datasets above and arrange each dataset to match the following directory structure:
../                                 # parent directory
└── Datasets_VAT/
    ├── VOC2012/                    # PASCAL VOC2012 devkit
    │   ├── Annotations/
    │   ├── ImageSets/
    │   ├── ...
    │   └── SegmentationClassAug/
    ├── COCO2014/
    │   ├── annotations/
    │   │   ├── train2014/          # (dir.) training masks (from Google Drive)
    │   │   ├── val2014/            # (dir.) validation masks (from Google Drive)
    │   │   └── ..some json files..
    │   ├── train2014/
    │   └── val2014/
    └── FSS-1000/                   # (dir.) contains 1000 object classes
        ├── abacus/
        ├── ...
        └── zucchini/
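Before training, it can help to confirm that the datasets landed where the tree above expects them. The following is a minimal sketch, not part of this repository: the helper name `missing_dirs` and the list `EXPECTED_DIRS` are hypothetical, and the list only covers directories that the tree names explicitly.

```python
from pathlib import Path

# Sub-directories named explicitly in the directory tree above.
# (Hypothetical helper list; it is not defined anywhere in this repository.)
EXPECTED_DIRS = [
    "VOC2012/Annotations",
    "VOC2012/ImageSets",
    "VOC2012/SegmentationClassAug",
    "COCO2014/annotations/train2014",
    "COCO2014/annotations/val2014",
    "COCO2014/train2014",
    "COCO2014/val2014",
    "FSS-1000",
]


def missing_dirs(root):
    """Return the expected sub-directories that do not exist under `root`."""
    root = Path(root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs("../Datasets_VAT")
    if missing:
        print("Missing directories:")
        for d in missing:
            print("  " + d)
    else:
        print("Dataset layout looks complete.")
```

Run it from the repository root so that '../Datasets_VAT' resolves to the parent directory. It checks directory names only, not their contents.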
Training
Training on PASCAL-5<sup>i</sup>:
python train.py --config "config/pascal_resnet{50, 101}/pascal_resnet{50, 101}_fold{0, 1, 2, 3}/config.yaml"
Training on COCO-20<sup>i</sup>:
python train.py --config "config/coco_resnet50/coco_resnet50_fold{0, 1, 2, 3}/config.yaml"
Training on FSS-1000:
python train.py --config "config/fss_resnet{50, 101}/config.yaml"
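The braces in the commands above are placeholders for one backbone and one fold, not literal path text. As a rough sketch, the concrete config paths can be enumerated as follows. It assumes that both brace groups in a path take the same backbone value; the table `BENCHMARKS` and the function `config_paths` are hypothetical names used only for illustration.

```python
from itertools import product

# benchmark prefix -> (backbones, folds); folds is None when there is no fold split.
# Values are read off the training commands above.
BENCHMARKS = {
    "pascal": (("resnet50", "resnet101"), (0, 1, 2, 3)),
    "coco": (("resnet50",), (0, 1, 2, 3)),
    "fss": (("resnet50", "resnet101"), None),
}


def config_paths(benchmark):
    """List every concrete config path for one benchmark."""
    backbones, folds = BENCHMARKS[benchmark]
    if folds is None:
        return [f"config/{benchmark}_{b}/config.yaml" for b in backbones]
    # Assumption: the backbone is the same in both path components.
    return [
        f"config/{benchmark}_{b}/{benchmark}_{b}_fold{k}/config.yaml"
        for b, k in product(backbones, folds)
    ]


# Print one training command per concrete PASCAL config.
for path in config_paths("pascal"):
    print(f'python train.py --config "{path}"')
```

Each printed line is a single training run; pick the one matching the backbone and fold you want.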
Evaluation
- Download the pretrained weights from Link
Results on PASCAL-5<sup>i</sup>:
python test.py --load "/path_to_pretrained_model/pascal_resnet{50, 101}/pascal_resnet{50, 101}_fold{0, 1, 2, 3}/"
Results on COCO-20<sup>i</sup>:
python test.py --load "/path_to_pretrained_model/coco_resnet50/coco_resnet50_fold{0, 1, 2, 3}/"
Results on FSS-1000:
python test.py --load "/path_to_pretrained_model/fss_resnet{50, 101}/"
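To evaluate every fold of one benchmark in turn, the commands above can be generated rather than typed by hand. This is a sketch under assumptions: the function `eval_commands` is hypothetical, it assumes the downloaded weights keep the directory naming shown in the commands above, and it only prints the commands (a dry run).

```python
import shlex
import sys
from pathlib import PurePosixPath


def eval_commands(weights_root, benchmark, backbone, folds=(0, 1, 2, 3)):
    """Build one `test.py` argument list per fold; pass folds=None for FSS-1000."""
    base = PurePosixPath(weights_root) / f"{benchmark}_{backbone}"
    if folds is None:
        load_dirs = [base]
    else:
        load_dirs = [base / f"{benchmark}_{backbone}_fold{k}" for k in folds]
    # The trailing slash mirrors the commands listed above.
    return [[sys.executable, "test.py", "--load", f"{d}/"] for d in load_dirs]


# Dry run: print the commands without executing them.
for argv in eval_commands("/path_to_pretrained_model", "coco", "resnet50"):
    print(shlex.join(argv))
```

To actually run the folds, pass each argument list to `subprocess.run(argv, check=True)` from the repository root.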
Acknowledgement
We borrow code from several public projects, mainly HSNet. Huge thanks to all of them.
BibTeX
If you find this research useful, please consider citing:
@inproceedings{hong2022cost,
title={Cost aggregation with 4d convolutional swin transformer for few-shot segmentation},
author={Hong, Sunghwan and Cho, Seokju and Nam, Jisu and Lin, Stephen and Kim, Seungryong},
booktitle={European Conference on Computer Vision},
pages={108--126},
year={2022},
organization={Springer}
}