Light Field Salient Object Detection: A Review and Benchmark [PDF][Chinese version]

:fire: News!

:running: We will keep updating this list with state-of-the-art works! :running:

Contents:

1. Light Field
   i. Multi-view Images and Focal Stacks
2. Light Field SOD
   i. Traditional Models
   ii. Deep Learning-based Models
   iii. Other Review Works
3. Light Field SOD Datasets
4. Benchmarking Results
   i. RGB-D SOD Models in Our Tests
   ii. Quantitative Comparison
   iii. All Models' Saliency Maps
   iv. Qualitative Comparison
5. Citation

Light Field

Multi-view Images and Focal Stacks

*(Figures: examples of multi-view images and a focal stack from a light field.)*
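As a quick illustration of how these two data forms relate, a focal stack can be synthesized from the multi-view (sub-aperture) images by classic shift-and-add refocusing. Below is a minimal NumPy sketch; the array layout, integer shifts, and wrap-around borders are simplifications of our own, not code from any surveyed work:

```python
import numpy as np

def refocus(light_field, alpha):
    """Synthesize one slice of a focal stack by shift-and-add refocusing.

    light_field: array of shape (U, V, H, W, 3) holding the multi-view
    (sub-aperture) images; alpha controls the synthetic focal plane.
    Illustrative only: integer shifts and wrap-around borders for brevity.
    """
    U, V, H, W, _ = light_field.shape
    cu, cv = (U - 1) / 2.0, (V - 1) / 2.0
    out = np.zeros((H, W, 3), dtype=np.float64)
    for u in range(U):
        for v in range(V):
            # Shift each view proportionally to its offset from the center view.
            du = int(round((u - cu) * (1.0 - 1.0 / alpha)))
            dv = int(round((v - cv) * (1.0 - 1.0 / alpha)))
            out += np.roll(light_field[u, v], shift=(du, dv), axis=(0, 1))
    return out / (U * V)

# Sweeping alpha over a range of values yields a focal stack whose
# slices are focused at different depths, e.g.:
# focal_stack = [refocus(lf, a) for a in (0.8, 1.0, 1.2)]
```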

Light Field SOD

Traditional Models

Table I: Overview of traditional LFSOD models.

| No. | Year | Model | Pub. | Title | Links |
|:---:|:---:|:---:|:---:|:---|:---:|
| 1 | 2014 | LFS | CVPR | Saliency Detection on Light Field | Paper/Project |
| 2 | 2015 | WSC | CVPR | A Weighted Sparse Coding Framework for Saliency Detection | Paper/Project |
| 3 | 2015 | DILF | IJCAI | Saliency Detection with a Deeper Investigation of Light Field | Paper/Project |
| 4 | 2016 | RL | ICASSP | Relative location for light field saliency detection | Paper/Project |
| 5 | 2017 | BIF | NPL | A Two-Stage Bayesian Integration Framework for Salient Object Detection on Light Field | Paper/Project |
| 6 | 2017 | LFS | TPAMI | Saliency Detection on Light Field | Paper/Project |
| 7 | 2017 | MA | TOMM | Saliency Detection on Light Field: A Multi-Cue Approach | Paper/Project |
| 8 | 2018 | SDDF | MTAP | Accurate saliency detection based on depth feature of 3D images | Paper/Project |
| 9 | 2018 | SGDC | CVPR | Salience Guided Depth Calibration for Perceptually Optimized Compressive Light Field 3D Display | Paper/Project |
| 10 | 2020 | RDFD | MTAP | Region-based depth feature descriptor for saliency detection on light field | Paper/Project |
| 11 | 2020 | DCA | TIP | Saliency Detection via Depth-Induced Cellular Automata on Light Field | Paper/Project |
| 12 | 2023 | CDCA | ENTROPY | Exploring Focus and Depth-Induced Saliency Detection for Light Field | Paper/Project |
| 13 | 2023 | TFSF | AO | Two-way focal stack fusion for light field saliency detection | Paper/Project |

Deep Learning-based Models


Fig. 1 Frameworks of deep light field SOD models. (a) Late-fusion (DLLF, MTCNet). (b) Middle-fusion (MoLF, LFNet). (c) Knowledge distillation-based (ERNet). (d) Reconstruction-based (DLSD). (e) Single-stream (MAC). Note that (a) utilizes the focal stack/multi-view images and all-in-focus/center view image, while (b)-(c) utilize the focal stack and all-in-focus image. (d)-(e) utilize the center-view image and micro-lens image array.
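To make the late- vs. middle-fusion distinction in Fig. 1 (a) and (b) concrete, here is a minimal PyTorch sketch. The layer choices, channel widths, and the number of focal slices are hypothetical placeholders and do not reproduce any model in Table II:

```python
import torch
import torch.nn as nn

class LateFusionSOD(nn.Module):
    """Scheme (a): separate streams for the focal stack and the all-in-focus
    image, combined only at the prediction stage."""
    def __init__(self, n_slices=12, feat=64):
        super().__init__()
        # Focal stack slices are stacked along the channel dimension here;
        # real models often process slices recurrently or with attention.
        self.focal_stream = nn.Sequential(
            nn.Conv2d(3 * n_slices, feat, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat, 1, 3, padding=1))
        self.rgb_stream = nn.Sequential(
            nn.Conv2d(3, feat, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat, 1, 3, padding=1))

    def forward(self, focal_stack, all_in_focus):
        # focal_stack: (B, 3*n_slices, H, W); all_in_focus: (B, 3, H, W)
        return torch.sigmoid(self.focal_stream(focal_stack)
                             + self.rgb_stream(all_in_focus))

class MiddleFusionSOD(nn.Module):
    """Scheme (b): features from both streams are merged mid-network and
    decoded jointly into the saliency map."""
    def __init__(self, n_slices=12, feat=64):
        super().__init__()
        self.focal_enc = nn.Conv2d(3 * n_slices, feat, 3, padding=1)
        self.rgb_enc = nn.Conv2d(3, feat, 3, padding=1)
        self.decoder = nn.Sequential(
            nn.ReLU(), nn.Conv2d(2 * feat, feat, 3, padding=1),
            nn.ReLU(), nn.Conv2d(feat, 1, 3, padding=1))

    def forward(self, focal_stack, all_in_focus):
        fused = torch.cat([self.focal_enc(focal_stack),
                           self.rgb_enc(all_in_focus)], dim=1)
        return torch.sigmoid(self.decoder(fused))
```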

Table II: Overview of deep learning-based LFSOD models.

| No. | Year | Model | Pub. | Title | Links |
|:---:|:---:|:---:|:---:|:---|:---:|
| 1 | 2019 | DLLF | ICCV | Deep Learning for Light Field Saliency Detection | Paper/Project |
| 2 | 2019 | DLSD | IJCAI | Deep Light-field-driven Saliency Detection from a Single View | Paper/Project |
| 3 | 2019 | MoLF | NIPS | Memory-oriented Decoder for Light Field Salient Object Detection | Paper/Project |
| 4 | 2020 | ERNet | AAAI | Exploit and Replace: An Asymmetrical Two-Stream Architecture for Versatile Light Field Saliency Detection | Paper/Project |
| 5 | 2020 | LFNet | TIP | LFNet: Light Field Fusion Network for Salient Object Detection | Paper/Project |
| 6 | 2020 | MAC | TIP | Light Field Saliency Detection with Deep Convolutional Networks | Paper/Project |
| 7 | 2020 | MTCNet | TCSVT | A Multi-Task Collaborative Network for Light Field Salient Object Detection | Paper/Project |
| 8 | 2021 | DUT-LFSaliency | Arxiv | DUT-LFSaliency: Versatile Dataset and Light Field-to-RGB Saliency Detection | Paper/Project |
| 9 | 2021 | OBGNet | ACM MM | Occlusion-aware Bi-directional Guided Network for Light Field Salient Object Detection | Paper/Project |
| 10 | 2021 | DLGLRG | ICCV | Light Field Saliency Detection with Dual Local Graph Learning and Reciprocative Guidance | Paper/Project |
| 11 | 2021 | GAGNN | IEEE TIP | Geometry Auxiliary Salient Object Detection for Light Fields via Graph Neural Networks | Paper/Project |
| 12 | 2021 | SANet | BMVC | Learning Synergistic Attention for Light Field Salient Object Detection | Paper/Project |
| 13 | 2021 | TCFANet | IEEE SPL | Three-Stream Cross-Modal Feature Aggregation Network for Light Field Salient Object Detection | Paper/Project |
| 14 | 2021 | PANet | IEEE TCYB | PANet: Patch-Aware Network for Light Field Salient Object Detection | Paper/Project |
| 15 | 2021 | MGANet | IEEE ICMEW | Multi-Generator Adversarial Networks For Light Field Saliency Detection | Paper/Project |
| 16 | 2022 | MEANet | Neurocomputing | MEANet: Multi-Modal Edge-Aware Network for Light Field Salient Object Detection | Paper/Project |
| 17 | 2022 | DGENet | IVC | Dual guidance enhanced network for light field salient object detection | Paper/Project |
| 18 | 2022 | NoiseLF | CVPR | Learning from Pixel-Level Noisy Label: A New Perspective for Light Field Saliency Detection | Paper/Project |
| 19 | 2022 | ARFNet | IEEE Systems Journal | ARFNet: Attention-Oriented Refinement and Fusion Network for Light Field Salient Object Detection | Paper/Project |
| 20 | 2022 | -- | IEEE TIP | Weakly-Supervised Salient Object Detection on Light Fields | Paper/Project |
| 21 | 2022 | ESCNet | IEEE TIP | Exploring Spatial Correlation for Light Field Saliency Detection: Expansion From a Single View | Paper/Project |
| 22 | 2022 | LFBCNet | ACM MM | LFBCNet: Light Field Boundary-aware and Cascaded Interaction Network for Salient Object Detection | Paper/Project |
| 23 | 2023 | TENet | IVC | TENet: Accurate light-field salient object detection with a transformer embedding network | Paper/Project |
| 24 | 2023 | -- | TPAMI | A Thorough Benchmark and a New Model for Light Field Saliency Detection | Paper/Project |
| 25 | 2023 | GFRNet | ICME | Guided Focal Stack Refinement Network For Light Field Salient Object Detection | Paper/Project |
| 26 | 2023 | FESNet | IEEE TMM | Fusion-Embedding Siamese Network for Light Field Salient Object Detection | Paper/Project |
| 27 | 2023 | LFTransNet | IEEE TCSVT | LFTransNet: Light Field Salient Object Detection via a Learnable Weight Descriptor | Paper/Project |
| 28 | 2023 | CDINet | IEEE TCSVT | Light Field Salient Object Detection with Sparse Views via Complementary and Discriminative Interaction Network | Paper/Project |
| 29 | 2024 | LF-Tracy | arXiv | LF Tracy: A Unified Single-Pipeline Approach for Salient Object Detection in Light Field Cameras | Paper/Project |
| 30 | 2024 | PANet | IEEE SPL | Parallax-Aware Network for Light Field Salient Object Detection | Paper/Project |
| :fire: 31 | 2024 | -- | Research | Masked Generative Light Field Prompting for Pixel-Level Structure Segmentations | Paper/Project |
| :fire: 32 | 2024 | -- | TVC | Light field salient object detection based on discrete viewpoint selection and multi-feature fusion | Paper/Project |
| :fire: 33 | 2024 | -- | SPIC | A foreground-context dual-guided network for light-field salient object detection | Paper/Project |
| :fire: 34 | 2024 | GMERNet | TOMM | Gated Multi-Modal Edge Refinement Network for Light Field Salient Object Detection | Paper/Project |
| :fire: 35 | 2024 | IRNet | TOMM | Rethinking Feature Mining for Light Field Salient Object Detection | Paper/Project |
| :fire: 36 | 2024 | FAENet | ICME | Focal Stack Alignment Enhancement Network For Light Field Salient Object Detection | Paper/Project |
| :fire: 37 | 2024 | -- | TCSVT | Spatial Attention-Guided Light Field Salient Object Detection Network with Implicit Neural Representation | Paper/Project |
| :fire: 38 | 2024 | FPT | PRCV | Focal Perception Transformer for Light Field Salient Object Detection | Paper/Project |
| :fire: 39 | 2024 | LRNet | Scientific Reports | LRNet: lightweight attention-oriented residual fusion network for light field salient object detection | Paper/Project |
| :fire: 40 | 2024 | LFSamba | IEEE SPL | LFSamba: Marry SAM With Mamba for Light Field Salient Object Detection | Paper/Project |
| :fire: 41 | 2024 | TLFNet | TIP | Transformer-based Light Field Salient Object Detection and Its Application to Autofocus | Paper/Project |

Other Review Works

Table III: Overview of reviews and surveys related to LFSOD.

| No. | Year | Model | Pub. | Title | Links |
|:---:|:---:|:---:|:---:|:---|:---:|
| 1 | 2015 | CS | NEURO | Light field saliency vs. 2D saliency: A comparative study | Paper/Project |
| 2 | 2020 | RGBDS | CVM | RGB-D Salient Object Detection: A Survey | Paper/Project |

Light Field SOD Datasets

Table IV: Overview of light field SOD datasets. Abbreviations: MOP = Multiple-Object Proportion (the percentage of images in the dataset that contain more than one object per image; a sketch for estimating it from GT masks follows the table), FS = Focal Stacks, MV = Multi-view images, DE = Depth maps, ML = Micro-lens images, GT = Ground truth, Raw = Raw light field data. The FS, MV, DE, ML, GT, and Raw columns indicate the data forms provided by each dataset: '✓' denotes data forms provided in the original datasets, while '✔️' indicates data forms generated by us. Original as well as supplemental data forms can be downloaded at 'Download' with the fetch code 'lfso'. You can also download the original datasets via 'Original Link'.

| No. | Dataset | Year | Pub. | Size | MOP | FS | MV | DE | ML | GT | Raw | Download | Original Link |
|:---:|:---|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| 1 | LFSD | 2014 | CVPR | 100 | 0.04 | ✓ | ✔️ | ✓ | ✔️ | ✓ | ✓ | Link | Link |
| 2 | HFUT-Lytro | 2017 | ACM TOMM | 255 | 0.29 | ✓ | ✓ | ✓ | ✔️ | ✓ | | Link | Link |
| 3 | DUTLF-FS | 2019 | ICCV | 1462 | 0.05 | ✓ | | ✓ | | ✓ | | Link | Link |
| 4 | DUTLF-MV | 2019 | IJCAI | 1580 | 0.04 | | ✓ | | | ✓ | | Link | Link |
| 5 | Lytro Illum | 2020 | IEEE TIP | 640 | 0.15 | ✔️ | ✔️ | ✔️ | ✓ | ✓ | ✓ | Link | Link |
| 6 | DUTLF-V2 | 2021 | Arxiv | 4204 | | | | | | | | | Link |
| :fire: 7 | CITYU-Lytro | 2021 | IEEE TIP | 817 | | | | | | | | | |
| :fire: 8 | PKU-LF | 2021 | TPAMI | 5000 | | | | | | | | | Link |
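As referenced in the caption, the MOP column can be reproduced from the GT masks by counting sufficiently large connected foreground components per image. A hypothetical sketch (the folder layout, PNG mask format, and the minimum-area filter are our assumptions, not the survey's exact protocol):

```python
import glob
import numpy as np
from PIL import Image
from scipy import ndimage

def mop(gt_dir, min_area=100):
    """Multiple-Object Proportion: fraction of GT masks containing more
    than one (sufficiently large) connected foreground region."""
    paths = sorted(glob.glob(gt_dir + "/*.png"))
    multi = 0
    for p in paths:
        mask = np.array(Image.open(p).convert("L")) > 127
        labels, n = ndimage.label(mask)                       # connected components
        sizes = ndimage.sum(mask, labels, range(1, n + 1))    # pixels per component
        if np.sum(sizes >= min_area) > 1:                     # ignore tiny specks
            multi += 1
    return multi / max(len(paths), 1)
```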

Benchmarking Results

<span id="appendix">RGB-D SOD Models in Our Tests</span>

Table V: Overview of RGB-D SOD models in our tests.

| No. | Year | Model | Pub. | Title | Links |
|:---:|:---:|:---:|:---:|:---|:---:|
| 1 | 2020 | BBS | ECCV | BBS-Net: RGB-D Salient Object Detection with a Bifurcated Backbone Strategy Network | Paper/Project |
| 2 | 2020 | JLDCF | CVPR | JL-DCF: Joint learning and densely-cooperative fusion framework for RGB-D salient object detection | Paper/Project |
| 3 | 2020 | SSF | CVPR | Select, supplement and focus for RGB-D saliency detection | Paper/Project |
| 4 | 2020 | UCNet | CVPR | UC-Net: Uncertainty Inspired RGB-D Saliency Detection via Conditional Variational Autoencoders | Paper/Project |
| 5 | 2020 | D3Net | IEEE TNNLS | Rethinking RGB-D salient object detection: models, datasets, and large-scale benchmarks | Paper/Project |
| 6 | 2020 | S2MA | CVPR | Learning selective self-mutual attention for RGB-D saliency detection | Paper/Project |
| 7 | 2020 | cmMS | ECCV | RGB-D salient object detection with cross-modality modulation and selection | Paper/Project |
| 8 | 2020 | HDFNet | ECCV | Hierarchical Dynamic Filtering Network for RGB-D Salient Object Detection | Paper/Project |
| 9 | 2020 | ATSA | ECCV | Asymmetric Two-Stream Architecture for Accurate RGB-D Saliency Detection | Paper/Project |

Quantitative Comparison

Table VI: Quantitative measures: S-measure (S<sub>α</sub>), max F-measure (F<sub>β</sub><sup>max</sup>), mean F-measure (F<sub>β</sub><sup>mean</sup>), adaptive F-measure (F<sub>β</sub><sup>adp</sup>), max E-measure (E<sub>Φ</sub><sup>max</sup>), mean E-measure (E<sub>Φ</sub><sup>mean</sup>), adaptive E-measure (E<sub>Φ</sub><sup>adp</sup>), and MAE (M) of nine light field SOD models (i.e., LFS, WSC, DILF, RDFD, DLSD, MoLF, ERNet, LFNet, MAC) and nine SOTA RGB-D based SOD models (i.e., BBS, JLDCF, SSF, UCNet, D3Net, S2MA, cmMS, HDFNet, and ATSA).

Note that in the table, light field SOD models are marked by "†". The symbol "N/T" indicates that a model was trained on a large portion of images from the corresponding dataset and is therefore not tested on it. The top three models among light field and RGB-D based SOD models are highlighted in red, blue, and green, respectively. ↑/↓ denotes that a larger/smaller value is better.

Other important notes:

1. There are TWO types of ground truth (GT), corresponding to focal stacks and multi-view images, due to an unavoidable issue of the software used for generation, and there are usually shifts between them. Models evaluated against different GT types are not directly comparable. Luckily, this problem is avoided here: in the table, all models on the same dataset are evaluated against the SINGLE GT type of the focal stacks.
2. On LFSD, MAC is tested on micro-lens image arrays, whose GT is different from that of the other models. As a remedy, we find the transformation between the two GT types and transform MAC's results to align them with the other GT type. Generally, the differences in metric numbers before and after the transformation are quite small.
3. Also regarding MAC: on DUTLF-FS and HFUT-Lytro, it is tested on single up-sampled all-in-focus images.
4. On HFUT-Lytro, ERNet is tested on only 155 images, since it used the remaining 100 images for training. The obtained metric numbers are thus only for reference, as the other models are tested on all 255 images.
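As a reference for how two of these numbers are computed, below is a minimal NumPy sketch of MAE and the adaptive F-measure. The 2 × mean adaptive threshold and β² = 0.3 are the conventional choices in the SOD literature; the exact evaluation scripts behind the table may differ in details:

```python
import numpy as np

def mae(sal, gt):
    """Mean Absolute Error: average per-pixel difference, both maps in [0, 1]."""
    return float(np.mean(np.abs(sal - gt)))

def adaptive_fmeasure(sal, gt, beta2=0.3):
    """Adaptive F-measure: binarize at twice the mean saliency value, then
    compute the beta-weighted harmonic mean of precision and recall."""
    pred = sal >= min(2.0 * sal.mean(), 1.0)
    g = gt > 0.5
    tp = np.logical_and(pred, g).sum()
    precision = tp / max(pred.sum(), 1)
    recall = tp / max(g.sum(), 1)
    return (1 + beta2) * precision * recall / max(beta2 * precision + recall, 1e-8)
```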

:fire: Table VII: Quantitative measures: S-measure (S<sub>α</sub>), max F-measure (F<sub>β</sub><sup>max</sup>), max E-measure (E<sub>Φ</sub><sup>max</sup>), and MAE (M) of one retrained light field SOD model (ERNet) and seven retrained RGB-D based SOD models (i.e., BBS, SSF, ATSA, S2MA, D3Net, HDFNet, and JLDCF). Note that in the table, the results of the original models are taken from Table VI, and the retrained models are marked by *. The best results of retrained models are highlighted in bold. ↑/↓ denotes that a larger/smaller value is better.


Fig. 2 PR curves on four datasets ((a) LFSD, (b) HFUT-Lytro, (c) Lytro Illum, and (d) DUTLF-FS) for nine light field SOD models (i.e., LFS, WSC, DILF, RDFD, DLSD, MoLF, ERNet, LFNet, MAC) and nine SOTA RGB-D based SOD models (i.e., BBS, JLDCF, SSF, UCNet, D3Net, S2MA, cmMS, HDFNet, and ATSA). Note that in this figure, the solid lines and dashed lines represent the PR curves of RGB-D based SOD models and light field SOD models, respectively.
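The PR curves themselves are obtained by sweeping a fixed set of binarization thresholds over each saliency map and averaging precision and recall over the dataset. A minimal NumPy sketch under the same assumptions as the metric code above (inputs are lists of same-sized float arrays in [0, 1]):

```python
import numpy as np

def pr_curve(sal_maps, gts, n_thresh=256):
    """Dataset-averaged precision-recall pairs over a threshold sweep."""
    precision = np.zeros(n_thresh)
    recall = np.zeros(n_thresh)
    thresholds = np.linspace(0.0, 1.0, n_thresh)
    for sal, gt in zip(sal_maps, gts):
        g = gt > 0.5
        for i, t in enumerate(thresholds):
            pred = sal >= t
            tp = np.logical_and(pred, g).sum()
            precision[i] += tp / max(pred.sum(), 1)
            recall[i] += tp / max(g.sum(), 1)
    n = max(len(sal_maps), 1)
    return precision / n, recall / n
```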

<span id="saliency-download">All Models' Saliency Maps</span>

All models' saliency maps generated and used for our evaluation table are now publicly available at Baidu Pan (code: lfso) or Google Drive.

Qualitative Comparison


Fig. 3 Visual comparison of five light field SOD models (i.e., LFS, DILF, DLSD, MoLF, and ERNet, bounded in the green box) and three SOTA RGB-D based SOD models (i.e., JLDCF, BBS, and ATSA, bounded in the red box). The first two rows in this figure show easy cases, while the third to fifth rows show cases with complex backgrounds or sophisticated boundaries. The last row gives an example with low color contrast between foreground and background.


Fig. 4 Visual comparison of five light field SOD models (i.e., LFS, DILF, DLSD, MoLF, and ERNet, bounded in the green box) and three SOTA RGB-D based SOD models (i.e., JLDCF, BBS, and ATSA, bounded in the red box) on detecting multiple objects (first three rows) and small objects (remaining rows).

Citation

Please cite our paper if you find the work useful:

```bibtex
@article{Fu2021lightfieldSOD,
  title={Light Field Salient Object Detection: A Review and Benchmark},
  author={Fu, Keren and Jiang, Yao and Ji, Ge-Peng and Zhou, Tao and Zhao, Qijun and Fan, Deng-Ping},
  journal={Computational Visual Media},
  volume={8},
  number={4},
  pages={509--534},
  year={2022},
  publisher={Springer}
}
```