VATEX | Vision-Aware Text Features in Referring Image Segmentation: From Object Understanding to Context Understanding | WACV 2025 | [code] [webpage] |
Shared-RIS | A Simple Baseline with Single-encoder for Referring Image Segmentation | arXiv 24.08 | [code] |
ASDA | Adaptive Selection based Referring Image Segmentation | ACM MM 2024 | [code] |
NeMo | Finding NeMo: Negative-mined Mosaic Augmentation for Referring Image Segmentation | ECCV 2024 | [webpage] [code] |
ReMamber | ReMamber: Referring Image Segmentation with Mamba Twister | ECCV 2024 | [code] |
GTMS | GTMS: A Gradient-driven Tree-guided Mask-free Referring Image Segmentation Method | ECCV 2024 | [code] |
SAM4MLLM | SAM4MLLM: Enhance Multi-Modal Large Language Model for Referring Expression Segmentation | ECCV 2024 | [code] |
Pseudo-RIS | Pseudo-RIS: Distinctive Pseudo-supervision Generation for Referring Image Segmentation | ECCV 2024 | [code] |
SafaRi | SafaRi: Adaptive Sequence Transformer for Weakly Supervised Referring Expression Segmentation | ECCV 2024 | [webpage] |
CM-MaskSD | CM-MaskSD: Cross-Modality Masked Self-Distillation for Referring Image Segmentation | TMM 2024 | |
Prompt-RIS | Prompt-Driven Referring Image Segmentation with Instance Contrasting | CVPR 2024 | |
LQMFormer | LQMFormer: Language-aware Query Mask Transformer for Referring Image Segmentation | CVPR 2024 | |
PPT | Curriculum Point Prompting for Weakly-Supervised Referring Image Segmentation | CVPR 2024 | |
GSVA | GSVA: Generalized Segmentation via Multimodal Large Language Models | CVPR 2024 | [code] |
RMSIN | Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation | CVPR 2024 | [code] |
MRES | Unveiling Parts Beyond Objects: Towards Finer-Granularity Referring Expression Segmentation | CVPR 2024 | [code] [webpage] |
MagNet | Mask Grounding for Referring Image Segmentation | CVPR 2024 | [webpage] |
LISA | LISA: Reasoning Segmentation via Large Language Model | CVPR 2024 | [code] |
RefSegformer | Towards Robust Referring Image Segmentation | TIP 2024 | [code] |
JMCELN | Referring Image Segmentation via Joint Mask Contextual Embedding Learning and Progressive Alignment Network | EMNLP 2023 | [code] |
CVMN | Unsupervised Domain Adaptation for Referring Semantic Segmentation | ACM MM 2023 | [code] |
CARIS | CARIS: Context-Aware Referring Image Segmentation | ACM MM 2023 | [code] |
TAS | Text Augmented Spatial-aware Zero-shot Referring Image Segmentation | EMNLP 2023 | |
BKINet | Bilateral Knowledge Interaction Network for Referring Image Segmentation | TMM 2023 | [code] |
Group-RES | Advancing Referring Expression Segmentation Beyond Single Image | ICCV 2023 | [code] |
- | Weakly Supervised Referring Image Segmentation with Intra-Chunk and Inter-Chunk Consistency | ICCV 2023 | |
- | Shatter and Gather: Learning Referring Image Segmentation with Text Supervision | ICCV 2023 | |
TRIS | Referring Image Segmentation Using Text Supervision | ICCV 2023 | [code] |
RIS-DMMI | Beyond One-to-One: Rethinking the Referring Image Segmentation | ICCV 2023 | [code] |
ETRIS | Bridging Vision and Language Encoders: Parameter-Efficient Tuning for Referring Image Segmentation | ICCV 2023 | [code] |
SEEM | Segment Everything Everywhere All at Once | arXiv 23.04 | [code] |
SLViT | SLViT: Scale-Wise Language-Guided Vision Transformer for Referring Image Segmentation | IJCAI 2023 | [code] |
WiCo | WiCo: Win-win Cooperation of Bottom-up and Top-down Referring Image Segmentation | IJCAI 2023 | |
M3Att | Multi-Modal Mutual Attention and Iterative Interaction for Referring Image Segmentation | TIP 2023 | |
X-Decoder | X-Decoder: Generalized Decoding for Pixel, Image and Language | CVPR 2023 | [code] [project] |
Partial-RES | Learning to Segment Every Referring Object Point by Point | CVPR 2023 | [code] |
MCRES | Meta Compositional Referring Expression Segmentation | CVPR 2023 | |
Global-Local CLIP | Zero-shot Referring Image Segmentation with Global-Local Context Features | CVPR 2023 | [code] |
PolyFormer | PolyFormer: Referring Image Segmentation as Sequential Polygon Generation | CVPR 2023 | [code] [project] |
GRES | GRES: Generalized Referring Expression Segmentation | CVPR 2023 | [code] [dataset] [project] |
CGFormer | Contrastive Grouping with Transformer for Referring Image Segmentation | CVPR 2023 | [code] |
SADLR | Semantics-Aware Dynamic Localization and Refinement for Referring Image Segmentation | AAAI 2023 | |
R-RIS | Towards Robust Referring Image Segmentation | arXiv 22.09 | [code] [project] |
- | Learning From Box Annotations for Referring Image Segmentation | TNNLS 2022 | [code] |
- | Instance-Specific Feature Propagation for Referring Segmentation | TMM 2022 | |
LAVT | LAVT: Language-Aware Vision Transformer for Referring Image Segmentation | CVPR 2022 | [code] |
CRIS | CRIS: CLIP-Driven Referring Image Segmentation | CVPR 2022 | [code] |
ReSTR | ReSTR: Convolution-free Referring Image Segmentation Using Transformers | CVPR 2022 | [project] |
TV-Net | Two-stage Visual Cues Enhancement Network for Referring Image Segmentation | ACM MM 2021 | [code] |
VLT | Vision-Language Transformer and Query Generation for Referring Segmentation | ICCV 2021 | [code] |
MDETR | MDETR - Modulated Detection for End-to-End Multi-Modal Understanding | ICCV 2021 | [code] [project] |
CEFNet | Encoder Fusion Network with Co-Attention Embedding for Referring Image Segmentation | CVPR 2021 | [code] |
BUSNet | Bottom-Up Shift and Reasoning for Referring Image Segmentation | CVPR 2021 | [code] |
LTS | Locate then Segment: A Strong Pipeline for Referring Image Segmentation | CVPR 2021 | |
CGAN | Cascade Grouped Attention Network for Referring Expression Segmentation | ACM MM 2020 | |
LSCM | Linguistic Structure Guided Context Modeling for Referring Image Segmentation | ECCV 2020 | [code] |
CMPC-Refseg | Referring Image Segmentation via Cross-Modal Progressive Comprehension | CVPR 2020 | [code] |
BRINet | Bi-directional Relationship Inferring Network for Referring Image Segmentation | CVPR 2020 | [code] |
PhraseCut | PhraseCut: Language-based Image Segmentation in the Wild | CVPR 2020 | [code] [project] |
MCN | Multi-task Collaborative Network for Joint Referring Expression Comprehension and Segmentation | CVPR 2020 | [code] |
- | Dual Convolutional LSTM Network for Referring Image Segmentation | TMM 2020 | |
STEP | See-Through-Text Grouping for Referring Image Segmentation | ICCV 2019 | |
lang2seg | Referring Expression Object Segmentation with Caption-Aware Consistency | BMVC 2019 | [code] |
CMSA | Cross-Modal Self-Attention Network for Referring Image Segmentation | CVPR 2019 | [code] |
KWA | Key-Word-Aware Network for Referring Expression Image Segmentation | ECCV 2018 | [code] |
DMN | Dynamic Multimodal Instance Segmentation Guided by Natural Language Queries | ECCV 2018 | [code] |
RRN | Referring Image Segmentation via Recurrent Refinement Networks | CVPR 2018 | [code] |
MAttNet | MAttNet: Modular Attention Network for Referring Expression Comprehension | CVPR 2018 | [code] [demo] |
RMI | Recurrent Multimodal Interaction for Referring Image Segmentation | ICCV 2017 | [code] |
LSTM-CNN | Segmentation from Natural Language Expressions | ECCV 2016 | [code] [project] |