ReSTR: Convolution-Free Referring Image Segmentation Using Transformers

This repository contains the official source code for our paper:
ReSTR: Convolution-Free Referring Image Segmentation Using Transformers
Namyup Kim<sup>1</sup>, Dongwon Kim<sup>1</sup>, Cuiling Lan<sup>2</sup>, Wenjun Zeng<sup>2</sup>, and Suha Kwak<sup>1</sup> <br> <sup>1</sup>POSTECH CSE, <sup>2</sup>Microsoft Research Asia<br> CVPR, 2022.

Environment setup

Instructions:

conda env create -f restr.yaml
conda activate restr

Data Setup

1. Setting

<!-- - Run `mkdir external`. Download, git clone, or use symlink, such that [TF-resnet](https://github.com/chenxi116/TF-resnet) and [TF-deeplab](https://github.com/chenxi116/TF-deeplab) are under `external`. Then strictly follow the `Example Usage` section of their README -->

2. Data preparation

python build_batches.py -d Gref -t train --img-size 480
python build_batches.py -d Gref -t val --img-size 480
python build_batches.py -d unc -t train --img-size 480
python build_batches.py -d unc -t val --img-size 480
python build_batches.py -d unc -t testA --img-size 480
python build_batches.py -d unc -t testB --img-size 480
python build_batches.py -d unc+ -t train --img-size 480
python build_batches.py -d unc+ -t val --img-size 480
python build_batches.py -d unc+ -t testA --img-size 480
python build_batches.py -d unc+ -t testB --img-size 480
python build_batches.py -d referit -t trainval --img-size 480
python build_batches.py -d referit -t test --img-size 480
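The twelve commands above differ only in dataset and split, so they can be generated from a small table. The following is a minimal sketch, not part of the repository: the `SPLITS` table and the `build_batch_commands` helper are hypothetical names, and the only flags used are the `-d`, `-t`, and `--img-size` flags shown above.

```python
# Dataset -> splits, copied from the commands listed above.
SPLITS = {
    "Gref": ["train", "val"],
    "unc": ["train", "val", "testA", "testB"],
    "unc+": ["train", "val", "testA", "testB"],
    "referit": ["trainval", "test"],
}


def build_batch_commands(img_size=480):
    """Return one argument list per build_batches.py invocation (hypothetical helper)."""
    return [
        ["python", "build_batches.py", "-d", dataset, "-t", split,
         "--img-size", str(img_size)]
        for dataset, splits in SPLITS.items()
        for split in splits
    ]


if __name__ == "__main__":
    for cmd in build_batch_commands():
        # Print only; to execute, pass cmd to subprocess.run(cmd, check=True).
        print(" ".join(cmd))
```

Printing the commands first makes it easy to review them before running the whole preparation in one go.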

3. Directory Structure After Setup and Data Preparation

├─ ./data              
      ├─ mscoco   
      │   ├─ Gref_480_batch
      │   │   ├─ train_batch
      │   │   │     ├─ Gref_train_0.npz
      │   │   │     ├─ Gref_train_1.npz
      │   │   │     └─ ...
      │   │   ├─ train_image
      │   │   ├─ train_label 
      │   │   ├─ val_batch
      │   │   ├─ val_image
      │   │   └─ val_label
      │   ├─ unc_480_batch
      │   └─ unc+_480_batch
      ├─ referit
      │   └─ referit_480_batch
      │       ├─ trainval_batch
      │       └─ test_batch
      ├─ Gref_emb.npy
      ├─ referit_emb.npy
      ├─ vocabulary_Gref.txt
      └─ vocabulary_referit.txt
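Before training, it can help to confirm that data preparation produced the expected layout. The sketch below is an illustration only: `missing_split_dirs` is a hypothetical helper, and it assumes every split yields the three `<split>_batch`, `<split>_image`, and `<split>_label` directories shown for `Gref_480_batch` in the tree above, which may not hold for every dataset.

```python
from pathlib import Path


def missing_split_dirs(batch_root, splits, suffixes=("batch", "image", "label")):
    """Return the expected <split>_<suffix> directories that are absent under batch_root."""
    root = Path(batch_root)
    expected = [root / f"{split}_{suffix}" for split in splits for suffix in suffixes]
    return [path for path in expected if not path.is_dir()]


if __name__ == "__main__":
    # Path and split names follow the Gref entry in the tree above.
    for path in missing_split_dirs("./data/mscoco/Gref_480_batch", ["train", "val"]):
        print("missing:", path)
```

An empty result means every directory assumed by the sketch is present; it does not check the contents of the `.npz` files.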

Training

python train_restr.py --data_dir ./data/mscoco/Gref_480_batch --adamW
python train_restr.py --data_dir ./data/mscoco/unc_480_batch --adamW
python train_restr.py --data_dir ./data/mscoco/unc+_480_batch --adamW
python train_restr.py --data_dir ./data/referit/referit_480_batch --set trainval --valset test --adamW

Evaluation

cd eval
python evaluate.py --data_dir ../data/mscoco/Gref_480_batch --restore_refseg ../weights/test --set val --iters 25000 --input-size 480,480 --is_vis

Citation

@inproceedings{kim2022restr,
  title={Restr: Convolution-free referring image segmentation using transformers},
  author={Kim, Namyup and Kim, Dongwon and Lan, Cuiling and Zeng, Wenjun and Kwak, Suha},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={18145--18154},
  year={2022}
}

Acknowledgement

This code is built upon the following public repositories.