Semantically-aware Neural Radiance Fields for Visual Scene Understanding: A Comprehensive Review [Journal Pre-print]
Welcome to the official repository of our journal paper:
Semantically-aware Neural Radiance Fields for Visual Scene Understanding: A Comprehensive Review
Thang-Anh-Quan Nguyen*, Amine Bourki*, Mátyás Macudzinski, Anthony Brunel, and Mohammed Bennamoun
(*: Denotes equal contribution).
¹: Most up-to-date version, with an extended bibliography and additional content.
1. Introduction
This repository presents a comprehensive review of recent works on Neural Radiance Fields (NeRFs), with a specific focus on the integration of semantic information for enhanced visual scene understanding. NeRFs have demonstrated the potential of coordinate-based neural representations, also known as neural fields or implicit neural representations. Our review provides a detailed analysis of advancements in this area, highlighting the significance of semantically-aware NeRFs across a range of applications.
We invite you to explore the information and insights offered by this repository, which includes a curated list of papers, datasets, and comprehensive benchmark results related to semantically-aware NeRFs in the context of visual scene understanding.
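To make the core idea concrete, here is a minimal sketch (our own illustration in PyTorch, not the implementation of any surveyed method; the class name `SemanticFieldSketch`, the layer sizes, and the simplified view-independent color head are all assumptions) of a coordinate-based field with a semantic head, where per-class semantic logits are composited along rays in the same way as color:

```python
# Illustrative sketch only: a coordinate-based field that maps a 3D point to
# density, color, and semantic logits, in the spirit of Semantic-NeRF-style
# approaches covered in this review.
import torch
import torch.nn as nn

class SemanticFieldSketch(nn.Module):
    """Maps a 3D point to density, color, and per-class semantic logits."""
    def __init__(self, num_classes: int, hidden: int = 256):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.sigma_head = nn.Linear(hidden, 1)          # volume density
        self.rgb_head = nn.Linear(hidden, 3)            # simplified, view-independent color
        self.sem_head = nn.Linear(hidden, num_classes)  # semantic logits

    def forward(self, xyz):
        h = self.trunk(xyz)                             # (num_samples, hidden)
        sigma = torch.relu(self.sigma_head(h))          # (num_samples, 1)
        rgb = torch.sigmoid(self.rgb_head(h))           # (num_samples, 3)
        sem = self.sem_head(h)                          # (num_samples, num_classes)
        return sigma, rgb, sem

def composite(sigma, values, deltas):
    """Standard volume rendering: w_i = T_i * (1 - exp(-sigma_i * delta_i)).
    `values` can be color (C=3) or semantic logits (C=num_classes)."""
    alpha = 1.0 - torch.exp(-sigma.squeeze(-1) * deltas)            # (num_samples,)
    trans = torch.cumprod(
        torch.cat([torch.ones(1), 1.0 - alpha + 1e-10], dim=0), dim=0
    )[:-1]                                                          # transmittance T_i
    weights = alpha * trans
    return (weights.unsqueeze(-1) * values).sum(dim=0)              # (C,)

# Hypothetical usage on samples along a single ray:
field = SemanticFieldSketch(num_classes=20)
xyz = torch.rand(64, 3)              # 64 samples along the ray
deltas = torch.full((64,), 0.02)     # inter-sample distances
sigma, rgb, sem = field(xyz)
pixel_rgb = composite(sigma, rgb, deltas)   # rendered color
pixel_sem = composite(sigma, sem, deltas)   # rendered semantic logits
```

Positional encoding of the input coordinates and view-dependent color are omitted for brevity; both are used in practice for high-fidelity results.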
Contact
Please feel free to contact me or open a GitHub issue if you have suggestions for improvements or insights, or if you'd like to contribute new results or references!
2. Comparative Analysis of Previous NeRF Studies
'Semantic Tasks' legend: G = 3D Geometry Enhancement, S = Segmentation, E = Editable NeRFs, O = Object Detection and 6D Pose, H = Holistic Decomposition, L = NeRFs and Language; '.' denotes a missing task. 'Semantic Focus' indicates whether the primary focus of the study is on semantics. '*' marks an interesting reference that is not a journal paper.
3. Taxonomy of our Study on Semantically-aware NeRFs (SRFs)
4. Datasets
Overview of existing datasets for SRF-based multi-view scene understanding.
Legend: 'Centricity' indicates whether a dataset is scene-centric and/or object-centric, denoted with S and O respectively in the table below.
Dataset | Venue | #Scenes | #Imgs | Centricity | Type | Data Modalities | Annotations |
---|---|---|---|---|---|---|---|
3DMV-VQA | CVPR 2023 | 5000 | 600K | S+O | Indoor | RGB | Visual question & answer |
NeRDS 360 | ICCV 2023 | 75 | 15K | S+O | Urban | Synthetic | 3D object boxes; 2D panoptic segmentation |
ScanNet++ | ICCV 2023 | 460 | 3.7M | S | Indoor | RGB-D | 2D/3D panoptic segmentation |
KITTI-360 | PAMI 2022 | 10 | 150K | S+O | Urban | RGB & LiDAR | 2D/3D object boxes; 2D panoptic segmentation |
SHIFT | CVPR 2022 | 4850 | 2.5M | S+O | Urban | Synthetic | 2D/3D object boxes; 2D panoptic segmentation |
HM3D Sem | arXiv 2022 | 216 | - | S | Indoor | Mesh | 3D semantic segmentation |
3D-FRONT | ICCV 2021 | 18968 | - | S+O | Indoor | Synthetic | 3D semantic segmentation |
HyperSim | ICCV 2021 | 461 | 77.4K | S+O | Indoor | Synthetic | 2D/3D object boxes; 2D/3D panoptic segmentation |
Waymo | CVPR 2020 | 1150 | 1M | S+O | Urban | RGB & LiDAR | 2D/3D object boxes; 2D panoptic segmentation |
nuScenes | CVPR 2020 | 1000 | 1.4M | S+O | Urban | RGB & LiDAR | 3D object boxes; 2D panoptic segmentation |
Replica | arXiv 2019 | 18 | - | S | Indoor | Mesh | 2D/3D panoptic segmentation |
Matterport 3D | 3DV 2017 | 90 | 194.4K | S | Indoor | RGB-D | 2D/3D panoptic segmentation |
CLEVR | CVPR 2017 | - | 100K | O | Indoor | Synthetic | Visual question & answer |
ScanNet | CVPR 2017 | 1513 | 2.5M | S+O | Indoor | RGB-D | 3D object boxes; 2D/3D panoptic segmentation |
Virtual KITTI | CVPR 2016 | 5 | 17K | S+O | Urban | Synthetic | 2D/3D object boxes; 2D panoptic segmentation |
SUN RGB-D | CVPR 2015 | 47 | 10.3K | S+O | Indoor | RGB-D | 2D/3D object boxes; 2D panoptic segmentation |
ShapeNet | arXiv 2015 | - | - | O | Objects | CAD models | 3D part segmentation |
KITTI | CVPR 2012 | 22 | 15K | S+O | Urban | RGB & LiDAR | 2D/3D object boxes; 2D panoptic segmentation |
5. Benchmarks
Performance overview of the main state-of-the-art SRF methods that jointly address semantic, instance-level, and panoptic segmentation.
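For context on how these benchmark numbers are obtained, semantic segmentation entries are typically reported as mean Intersection-over-Union (mIoU). Below is a minimal NumPy sketch of that metric (our own illustration, not the evaluation code used by any of the benchmarked methods):

```python
# Illustrative mIoU computation, as commonly used for the semantic
# segmentation numbers reported in segmentation benchmarks.
import numpy as np

def mean_iou(pred: np.ndarray, gt: np.ndarray, num_classes: int) -> float:
    """pred, gt: integer label maps of identical shape."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        if union > 0:                 # skip classes absent from both maps
            ious.append(inter / union)
    return float(np.mean(ious)) if ious else 0.0
```

Instance-level and panoptic results are instead commonly scored with metrics such as Panoptic Quality (PQ), which additionally matches predicted segments to ground-truth segments.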
Citation
If you find this work useful, please consider citing it in your research as follows:
```bibtex
@article{SRFsota2024,
  title         = {Semantically-aware Neural Radiance Fields for Visual Scene Understanding: A Comprehensive Review},
  author        = {Thang-Anh-Quan Nguyen and Amine Bourki and M\'aty\'as Macudzinski and Anthony Brunel and Mohammed Bennamoun},
  year          = {2024},
  eprint        = {2402.11141},
  archivePrefix = {arXiv},
  primaryClass  = {cs.CV}
}
```
Latest Updates
- 04/10/2024: Added SoTA benchmarks for semantic, instance-level, and panoptic segmentation SRFs for the Replica and ScanNet datasets :rocket:.
- 02/17/2024: Initial release.