SMAC: Learning Selective Mutual Attention and Contrast for RGB-D Saliency Detection
arXiv version: https://arxiv.org/abs/2010.05537
Citing our work
If you find our work helpful, please cite:
@article{liu2021learning,
title={Learning Selective Mutual Attention and Contrast for RGB-D Saliency Detection},
author={Liu, Nian and Zhang, Ni and Shao, Ling and Han, Junwei},
journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
year={2021}
}
The Proposed RGB-D Salient Object Detection Dataset
ReDWeb-S
We construct a new large-scale, challenging dataset, ReDWeb-S, which contains 3179 images in total, covering various real-world scenes with high-quality depth maps. We split the dataset into a training set of 2179 RGB-D image pairs and a testing set of the remaining 1000 image pairs.
The dataset can be downloaded here: [baidu pan fetch code: rp8b | Google drive]
Dataset Statistics and Comparisons
We analyze the proposed ReDWeb-S dataset from several statistical aspects and compare it with other existing RGB-D SOD datasets.
Fig.1. Top 60% scene and object category distributions of our proposed ReDWeb-S dataset.
Fig.2. Comparison of nine RGB-D SOD datasets in terms of the distributions of global contrast and interior contrast.
Fig.3. Comparison of the average annotation maps for nine RGB-D SOD benchmark datasets.
Fig.4. Comparison of the distribution of object size for nine RGB-D SOD benchmark datasets.
SOTA Results on Our Proposed Dataset
We provide other SOTA RGB-D methods' results and scores on our proposed dataset. You can directly download all results [here ov08].
No. | Pub. | Name | Title | Download |
---|---|---|---|---|
00 | TIP2023 | Caver | Caver: Cross-modal view-mixed transformer for bi-modal salient object detection | results, 2kfm |
01 | TCSVT2022 | HRTransNet | HRTransNet: HRFormer-Driven Two-Modality Salient Object Detection | results, azjb |
02 | TCSVT2021 | SwinNet | SwinNet: Swin Transformer Drives Edge-Aware RGB-D and RGB-T Salient Object Detection | results, zf9s |
03 | ICCV2021 | CMINet | RGB-D Saliency Detection via Cascaded Mutual Information Minimization | results, maav |
04 | ICCV2021 | VST | Visual Saliency Transformer | results, rkq9 |
05 | ICCV2021 | SPNet | Specificity-preserving RGB-D Saliency Detection | results, wwup |
06 | CVPR2021 | DCF | Calibrated RGB-D Salient Object Detection | results, 3kn9 |
07 | ECCV2020 | PGAR | Progressively Guided Alternate Refinement Network for RGB-D Salient Object Detection | results, mwtr |
08 | ECCV2020 | HDFNet | Hierarchical Dynamic Filtering Network for RGB-D Salient Object Detection | results, b98z |
09 | ECCV2020 | DANet | A Single Stream Network for Robust and Real-time RGB-D Salient Object Detection | results, 1luj |
10 | ECCV2020 | CoNet | Accurate RGB-D Salient Object Detection via Collaborative Learning | results, bqq6 |
11 | ECCV2020 | CMWNet | Cross-Modal Weighting Network for RGB-D Salient Object Detection | results, ztv9 |
12 | ECCV2020 | cmMS | RGB-D Salient Object Detection with Cross-Modality Modulation and Selection | results, kwe5 |
13 | ECCV2020 | BBS-Net | BBS-Net: RGB-D Salient Object Detection with a Bifurcated Backbone Strategy Network | results, ya5v |
14 | ECCV2020 | ATSA | Asymmetric Two-Stream Architecture for Accurate RGB-D Saliency Detection | results, k750 |
15 | CVPR2020 | S2MA | Learning Selective Self-Mutual Attention for RGB-D Saliency Detection | results, g0pgx |
16 | CVPR2020 | JL-DCF | JL-DCF: Joint Learning and Densely-Cooperative Fusion Framework for RGB-D Salient Object Detection | results, xh9p |
17 | CVPR2020 | UCNet | UC-Net: Uncertainty Inspired RGB-D Saliency Detection via Conditional Variational Autoencoders | results, 6o93 |
18 | CVPR2020 | A2dele | A2dele: Adaptive and Attentive Depth Distiller for Efficient RGB-D Salient Object Detection | results, swv5 |
19 | CVPR2020 | SSF-RGBD | Select, Supplement and Focus for RGB-D Saliency Detection | results, oshl |
20 | TIP2020 | DisenFusion | RGBD Salient Object Detection via Disentangled Cross-Modal Fusion | results, h3hc |
21 | TNNLS2020 | D3Net | D3Net: Rethinking RGB-D Salient Object Detection: Models, Datasets, and Large-Scale Benchmarks | results, tetn |
22 | ICCV2019 | DMRA | Depth-induced multi-scale recurrent attention network for saliency detection | results, kqq4 |
23 | CVPR2019 | CPFP | Contrast Prior and Fluid Pyramid Integration for RGBD Salient Object Detection | results, 0v2c |
24 | TIP2019 | TANet | Three-stream attention-aware network for RGB-D salient object detection | results, hsy9 |
25 | CVPR2018 | PCF | Progressively Complementarity-Aware Fusion Network for RGB-D Salient Object Detection | results, qzhm |
26 | PR2019 | MMCI | Multi-modal fusion network with multiscale multi-path and cross-modal interactions for RGB-D salient object detection | results, c90m |
27 | TCyb2017 | CTMF | CNNs-based RGB-D saliency detection via cross-view transfer and multiview fusion | results, i0zb |
28 | Access2019 | AFNet | Adaptive fusion for rgb-d salient object detection | results, 54zc |
29 | TIP2017 | DF | Rgbd salient object detection via deep fusion | results, d7sc |
30 | ICME2016 | SE | Salient object detection for rgb-d image via saliency evolution | results, h10s |
31 | SPL2016 | DCMC | Saliency detection for stereoscopic images based on depth confidence analysis and multiple cues fusion | results, 18po |
32 | CVPR2016 | LBE | Local background enclosure for rgb-d salient object detection | results, iiz5 |
Methods | S-measure | maxF | E-measure | MAE |
---|---|---|---|---|
S2MA | 0.711 | 0.696 | 0.781 | 0.139 |
JL-DCF | 0.734 | 0.727 | 0.805 | 0.128 |
UCNet | 0.713 | 0.710 | 0.794 | 0.130 |
A2dele | 0.641 | 0.603 | 0.672 | 0.160 |
SSF-RGBD | 0.595 | 0.558 | 0.710 | 0.189 |
DisenFusion | 0.675 | 0.658 | 0.760 | 0.160 |
D3Net | 0.689 | 0.673 | 0.768 | 0.149 |
DMRA | 0.592 | 0.579 | 0.721 | 0.188 |
CPFP | 0.685 | 0.645 | 0.744 | 0.142 |
TANet | 0.656 | 0.623 | 0.741 | 0.165 |
PCF | 0.655 | 0.627 | 0.743 | 0.166 |
MMCI | 0.660 | 0.641 | 0.754 | 0.176 |
CTMF | 0.641 | 0.607 | 0.739 | 0.204 |
AFNet | 0.546 | 0.549 | 0.693 | 0.213 |
DF | 0.595 | 0.579 | 0.683 | 0.233 |
SE | 0.435 | 0.393 | 0.587 | 0.283 |
DCMC | 0.427 | 0.348 | 0.549 | 0.313 |
LBE | 0.637 | 0.629 | 0.730 | 0.253 |
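For readers who want to sanity-check downloaded saliency maps against the ground truth, two of the reported metrics can be sketched in a few lines of NumPy. This is only a minimal illustration, not the evaluation code used to produce the scores above: the function names `mae` and `max_f_measure` are hypothetical, the F-measure uses the commonly adopted weight beta^2 = 0.3 (an assumption), and the maximum is taken per image rather than over dataset-averaged precision and recall, so the numbers it produces may differ from the table.

```python
import numpy as np


def mae(pred, gt):
    """Mean absolute error between a saliency map and its ground truth.

    Both inputs are assumed to be arrays of the same shape scaled to [0, 1].
    """
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    return float(np.mean(np.abs(pred - gt)))


def max_f_measure(pred, gt, beta2=0.3, num_thresholds=255):
    """Per-image maximum F-measure over a sweep of binarization thresholds.

    beta2 = 0.3 is an assumed (commonly used) weighting, not a value
    taken from this repository.
    """
    pred = np.asarray(pred, dtype=np.float64)
    gt_mask = np.asarray(gt) > 0.5  # binarize the ground truth
    best = 0.0
    for t in np.linspace(0.0, 1.0, num_thresholds, endpoint=False):
        binary = pred > t
        tp = np.logical_and(binary, gt_mask).sum()
        if tp == 0:
            continue  # precision and recall are both zero here
        precision = tp / binary.sum()
        recall = tp / gt_mask.sum()
        f = (1 + beta2) * precision * recall / (beta2 * precision + recall)
        best = max(best, float(f))
    return best


if __name__ == "__main__":
    # Toy 2x2 example standing in for a predicted map and its annotation.
    gt = np.array([[1.0, 0.0], [0.0, 1.0]])
    pred = np.array([[0.9, 0.1], [0.2, 0.8]])
    print("MAE:", mae(pred, gt))
    print("maxF:", max_f_measure(pred, gt))
```

In practice the maps would be loaded from the downloaded result images, scaled to [0, 1], and resized to the ground-truth resolution before calling these functions; lower MAE and higher maxF are better.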
Acknowledgement
We thank all annotators for helping us construct the proposed dataset. Our dataset is based on the ReDWeb dataset, a state-of-the-art dataset proposed for monocular image depth estimation. We also thank its authors for providing the ReDWeb dataset.
Contact
If you have any questions, please feel free to contact me. (nnizhang.1995@gmail.com)