# RIS-DMMI
This repository provides the PyTorch implementation of DMMI from the following paper:

**Beyond One-to-One: Rethinking the Referring Image Segmentation** (ICCV 2023)
## News
- 2023.10.03: The final version of our dataset has been released. Please remember to download the latest version.
- 2023.10.03: We release our code.
## Dataset
We collect a new comprehensive dataset, Ref-ZOM (Zero/One/Many), which contains image-text pairs under one-to-zero, one-to-one, and one-to-many conditions. Similar to RefCOCO, RefCOCO+, and G-Ref, all images in Ref-ZOM are selected from the COCO dataset. Here we provide the text, image, and annotation information of Ref-ZOM, which should be used together with COCO_trainval2014.

Our dataset can be downloaded from: [Baidu Cloud] [Google Drive]

Remember to download the original COCO dataset from: [COCO Download]
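If you want to sanity-check the download, the two annotation files can be opened with the Python standard library. This is only a sketch: it assumes the files have been placed under `refer/data/ref-zom/` as described in the Prepare steps below, and the fields inside each entry are not documented here, so inspect one record to see what is available.

```python
# Minimal sanity check of the Ref-ZOM annotation files (a sketch; paths assume
# the layout from the Prepare steps below, and the internal fields are not
# documented here -- print one record to inspect them).
import json
import pickle

with open('refer/data/ref-zom/refs(final).p', 'rb') as f:
    refs = pickle.load(f)            # referring-expression entries (structure assumed)

with open('refer/data/ref-zom/instances.json', 'r') as f:
    instances = json.load(f)         # COCO-style annotation file

print(type(refs), len(refs))
print(list(instances.keys()) if isinstance(instances, dict) else type(instances))
print(refs[0])                       # look at one entry to see the available fields
```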
## Code
### Prepare
- Download COCO_train2014 and COCO_val2014, and merge the two datasets into a new folder named `trainval2014`. Then, at Line-52 of `/refer/refer.py`, assign the path of this folder to `self.Image_DIR` (see the sketch after this list).
- Download "Ref-ZOM(final).p" and rename it to "refs(final).p". Then put `refs(final).p` and `instances.json` into `/refer/data/ref-zom/`.
- Prepare BERT in the same way as LAVT.
- Prepare RefCOCO, RefCOCO+, and RefCOCOg in the same way as LAVT.
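For the first step, the change in `refer/refer.py` might look like the sketch below; the exact attribute name and surrounding code should be taken from the file itself, and the path is a placeholder for your own setup.

```python
# refer/refer.py, around Line-52 -- a sketch of the edit only; match the
# attribute name actually used in the file and replace the placeholder path.
self.Image_DIR = '/path/to/coco/trainval2014'   # merged train2014 + val2014 images
```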
### Train
- Remember to change `--output_dir` and `--pretrained_backbone` to your own paths.
- Use `--model` to select the backbone: `dmmi-swin` for Swin-Base and `dmmi_res` for ResNet-50.
- Use `--dataset`, `--splitBy`, and `--split` to select the dataset as follows:
```
# RefCOCO
--dataset refcoco, --splitBy unc, --split val
# RefCOCO+
--dataset refcoco+, --splitBy unc, --split val
# RefCOCOg (umd)
--dataset refcocog, --splitBy umd, --split val
# RefCOCOg (google)
--dataset refcocog, --splitBy google, --split val
# Ref-ZOM
--dataset ref-zom, --splitBy final, --split test
```
- Begin training!

```bash
sh train.sh
```
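For reference, a filled-in training command might look like the sketch below. The Python entry-point name (`train.py`), the single-process launch, and the backbone checkpoint filename are assumptions on our side; the repository's own `train.sh` is authoritative, so only substitute your paths and the flags described above.

```bash
# A sketch of a filled-in training command, NOT the repository's actual train.sh.
# Entry-point name, launch mode, and checkpoint filename are assumptions.
python train.py \
    --model dmmi-swin \
    --dataset refcoco --splitBy unc --split val \
    --output_dir ./checkpoints/dmmi_swin_refcoco \
    --pretrained_backbone ./pretrained/swin_base_patch4_window12_384_22k.pth
```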
### Test
- Remember to change `--test_parameter` to your own path. Meanwhile, set `--model`, `--dataset`, `--splitBy`, and `--split` properly.
- Begin testing!
```bash
sh test.sh
```
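Similarly, a filled-in test command might look like the sketch below; the entry-point name (`test.py`) and the checkpoint filename are assumptions, so keep whatever the repository's `test.sh` already contains and only substitute your own values.

```bash
# A sketch of a filled-in test command, NOT the repository's actual test.sh.
# Entry-point name and checkpoint filename are assumptions.
python test.py \
    --model dmmi-swin \
    --dataset ref-zom --splitBy final --split test \
    --test_parameter ./checkpoints/dmmi_swin_refzom/model_best.pth
```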
## Parameters
**RefCOCOg (umd)**

| Backbone | oIoU | mIoU | Google Drive | Baidu Cloud |
| --- | --- | --- | --- | --- |
| ResNet-101 | 59.02 | 62.59 | Link | Link |
| Swin-Base | 63.46 | 66.48 | Link | Link |
**Ref-ZOM**

| Backbone | oIoU | mIoU | Google Drive | Baidu Cloud |
| --- | --- | --- | --- | --- |
| Swin-Base | 68.77 | 68.25 | Link | Link |
## Acknowledgements
We sincerely appreciate the excellent work of LAVT; our code is partially built on that code base. If you find our work helpful, we suggest you also refer to LAVT and cite it as well.
## Citation
If you find our work helpful and want to cite it, please use the following citation information.
```bibtex
@InProceedings{Hu_2023_ICCV,
    author    = {Hu, Yutao and Wang, Qixiong and Shao, Wenqi and Xie, Enze and Li, Zhenguo and Han, Jungong and Luo, Ping},
    title     = {Beyond One-to-One: Rethinking the Referring Image Segmentation},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2023},
    pages     = {4067-4077}
}
```