Awesome

`Referring Camouflaged Object Detection`

Authors: Xuying Zhang, Bowen Yin, Zheng Lin, Qibin Hou, Deng-Ping Fan, Ming-Ming Cheng.

Introduction

This repo contains the official dataset and source code of the paper Referring Camouflaged Object Detection.
In this paper, we consider the problem of referring camouflaged object detection (Ref-COD), a new task that aims to segment specified camouflaged objects based on a small set of referring images with salient target objects.

<img src="figs/refcod.png" width="450"/> Fig. 1: Visual comparison between the standard COD and our Ref-COD. Given an image containing multiple camouflaged objects, the COD model tends to find all possible camouflaged objects that are blended into the background without discrimination, while the Ref-COD model attempts to identify the camouflaged objects under the condition of a set of referring images.

For technical questions, feel free to contact zhangxuying1004@gmail.com and bowenyin@mail.nankai.edu.cn. For commercial licensing, please contact cmm@nankai.edu.cn. If you find our work helpful, please cite it (BibTeX) and star this project. Thank you!

Note that the following code will be uploaded later:

The embedding process of the common representations of target objects;
The attribution evaluation of different COD / Ref-COD methods;
Visualization;
Other tools.

In the meantime, if you are interested in the Ref-COD topic, you can use our processed representations from the dataset link below.

Environment setup

conda env create -f environment.yml
conda activate refcod

Get Started

1. Dataset.

<img src="figs/r2c7k.png" width="970"/> Fig. 2. Examples from our R2C7K dataset. Note that the camouflaged objects in Camo-subset are masked with their annotations in orange.

Download our ensembled R2C7K dataset with access code 2013 on Baidu Netdisk.

├── R2C7K  
    ├── Camo  
        ├── train                # training set of camo-subset with 64 categories.  
        └── test                 # testing set of camo-subset with 64 categories.  
    ├── Ref          
        ├── Images               # all images of ref-subset with 64 categories.
        ├── RefFeat_ICON-R       # all object representations of ref-subset with 64 categories.  
        └── Saliency_ICON-R      # all foreground maps of ref-subset with 64 categories.

Update the 'data_root' param with your R2C7K location in train.py, infer.py and test.py.
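
Before training, it can help to confirm that the extracted dataset matches the layout above. The following is a minimal sketch and not part of the released code: the helper name `missing_r2c7k_dirs` and the example root `./R2C7K` are hypothetical, and the check only covers the sub-directory names listed in the tree.

```python
from pathlib import Path

# Sub-directories listed in the R2C7K tree above (relative to the dataset root).
EXPECTED_DIRS = [
    "Camo/train",
    "Camo/test",
    "Ref/Images",
    "Ref/RefFeat_ICON-R",
    "Ref/Saliency_ICON-R",
]


def missing_r2c7k_dirs(data_root):
    """Return the expected sub-directories that are absent under data_root."""
    root = Path(data_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    # Hypothetical location; replace with the value you use for 'data_root'.
    missing = missing_r2c7k_dirs("./R2C7K")
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("R2C7K layout looks complete.")
```

An empty result means all five folders are present; it does not verify the file contents or the number of categories.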

2. Framework

<img src="figs/r2cnet.png" width="950"/> Fig. 3. Overall architecture of our R2CNet framework, which is composed of two branches, i.e., reference branch in green and segmentation branch in orange. In the reference branch, the common representation of a specified object from images is obtained by masking and pooling the visual features with the foreground map generated by a SOD network. In the segmentation branch, the visual features from the last three layers of the encoder are employed to represent the given image. Then, these two kinds of feature representations are fused and compared in the well-designed RMG module to generate a mask prior, which is used to enrich the visual feature among different scales to highlight the camouflaged targets in our RFE module. Finally, the enriched features are fed into the decoder to generate the final segmentation map. DSF: Dual-source Information Fusion, MSF: Multi-scale Feature Fusion, TM: Target Matching.
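
To make the "masking and pooling" step of the reference branch concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the R2CNet implementation: the caption does not say which pooling is used, so masked average pooling is assumed; the foreground map is assumed to share the spatial size of the feature map; and averaging the pooled vectors over several referring images is also an assumption. The function names are hypothetical.

```python
import numpy as np


def masked_average_pool(features, fg_map, eps=1e-6):
    """Pool a (C, H, W) feature map into a (C,) vector over the foreground.

    features: float array of shape (C, H, W), visual features of one referring image.
    fg_map:   float array of shape (H, W) with values in [0, 1], e.g. from a SOD network.
    """
    weights = fg_map[None, :, :]                         # (1, H, W), broadcast over channels
    masked_sum = (features * weights).sum(axis=(1, 2))   # (C,), foreground-weighted sum
    return masked_sum / (weights.sum() + eps)            # normalise by foreground area


def common_representation(feature_list, fg_map_list):
    """Average the pooled vectors of several referring images (an assumption)."""
    pooled = [masked_average_pool(f, m) for f, m in zip(feature_list, fg_map_list)]
    return np.mean(pooled, axis=0)
```

Background pixels receive zero weight, so the resulting vector describes only the salient target. The `eps` term guards against division by zero when a foreground map is empty.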

3. Infer.

Download the pre-trained r2cnet.pth checkpoint with access code 2023 on Baidu Netdisk.
Put the checkpoint file in './snapshot/saved_models/'.
Run python infer.py to generate the foreground maps of R2CNet.
Alternatively, you can directly use our predictions, R2CNet-Maps, available with access code 2023 on Baidu Netdisk.

4. Test.

Make sure that the pre-trained r2cnet.pth checkpoint file has been placed in './snapshot/saved_models/'.
Run python test.py to evaluate the performance of R2CNet.
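
The checkpoint requirement above can be checked with a small pre-flight script before launching infer.py or test.py. This is a sketch only: the helper `require_checkpoint` is hypothetical and not part of the repository, and it assumes the directory and file name given in the steps above.

```python
from pathlib import Path

# Directory and file name taken from the steps above.
CHECKPOINT_DIR = Path("./snapshot/saved_models")
CHECKPOINT_NAME = "r2cnet.pth"


def require_checkpoint(ckpt_dir=CHECKPOINT_DIR, name=CHECKPOINT_NAME):
    """Return the checkpoint path, or raise FileNotFoundError if it is absent."""
    path = Path(ckpt_dir) / name
    if not path.is_file():
        raise FileNotFoundError(f"Expected checkpoint at {path}")
    return path
```

Calling `require_checkpoint()` from the repository root fails early with a clear message when the file is missing.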

5. Ref-COD Benchmark Results.

Tab. 1. Comparison of the COD models with their Ref-COD counterparts. All models are evaluated on an NVIDIA RTX 3090 GPU. ‘R-50’: ResNet-50 [82], ‘E-B4’: EfficientNet-B4 [86], ‘R2-50’: Res2Net-50 [87], ‘R3-50’: Triple ResNet-50 [2]. ‘-Ref’: the model with image references composed of salient objects. ‘Attribute’: the attribute of each network, ‘Single-obj’: the scene of a single camouflaged object, ‘Multi-obj’: the scene of multiple camouflaged objects, ‘Overall’: all scenes containing camouflaged objects. <img src="figs/benchmarks.png" width="1000"/>

Acknowledgement

This repo is mainly built on SINet-V2, PFENet and MethodsCmp. Thanks for their great work!
