Awesome Scene Understanding

A curated list of awesome scene understanding papers, inspired by awesome-computer-vision.

Related Resources

Workshops and Tutorials

Survey

| Papers | Venue | Links |
| --- | --- | --- |
| Neural Fields in Robotics: A Survey | arXiv 2024 | - |
| Advances in Data-Driven Analysis and Synthesis of 3D Indoor Scenes | CGF 2023 | - |
| State-of-the-art in Automatic 3D Reconstruction of Structured Indoor Environments | CGF 2020 | [project] |
| Indoor Scene Understanding in 2.5/3D for Autonomous Agents: A Survey | IEEE Access 2019 | - |
| RGBD Datasets: Past, Present and Future | CVPR Workshop 2016 | [project] |

Dataset

Realistic Dataset

| Papers | Venue | Links |
| --- | --- | --- |
| ScanNet++: A High-Fidelity Dataset of 3D Indoor Scenes | ICCV 2023 | [project] |
| ARKitScenes: A Diverse Real-World Dataset For 3D Indoor Scene Understanding Using Mobile RGB-D Data | NeurIPS 2021 Dataset Track | [code] |
| Zillow Indoor Dataset: Annotated Floor Plans With 360° Panoramas and 3D Room Layouts | CVPR 2021 | [code] |
| HoliCity: A City-Scale Data Platform for Learning Holistic 3D Structures | CoRR 2020 | [project] |
| OASIS: A Large-Scale Dataset for Single Image 3D in the Wild | CVPR 2020 | [project] |
| 3D Scene Graph: A Structure for Unified Semantics, 3D Space, and Camera | ICCV 2019 | [project] |
| The Replica Dataset: A Digital Replica of Indoor Spaces | CoRR 2019 | [code] |
| Matterport3D: Learning from RGB-D Data in Indoor Environments | 3DV 2017 | [project] |
| Joint 2D-3D-Semantic Data for Indoor Scene Understanding | CoRR 2017 | [project] |
| ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes | CVPR 2017 | [project] |
| SceneNN: a Scene Meshes Dataset with aNNotations | 3DV 2016 | [project] |
| SUN RGB-D: A RGB-D Scene Understanding Benchmark Suite | CVPR 2015 | [project] |
| SUN3D: A Database of Big Spaces Reconstructed using SfM and Object Labels | ICCV 2013 | [project] |
| Indoor Segmentation and Support Inference from RGBD Images | ECCV 2012 | [project] |

Synthetic Dataset

| Papers | Venue | Links |
| --- | --- | --- |
| Infinigen Indoors: Photorealistic Indoor Scenes using Procedural Generation | CVPR 2024 | [project] |
| R3DS: Reality-linked 3D Scenes for Panoramic Scene Understanding | CoRR 2024 | [project] |
| FurniScene: A Large-scale 3D Room Dataset with Intricate Furnishing Scenes | CoRR 2024 | - |
| GeoSynth: A Photorealistic Synthetic Indoor Dataset for Scene Understanding | VR 2023 | [code] |
| MINERVAS: Massive INterior EnviRonments VirtuAl Synthesis | CGF 2022 | [project] |
| 3D-FRONT: 3D Furnished Rooms with layOuts and semaNTics | ICCV 2021 | [project] |
| Hypersim: A Photorealistic Synthetic Dataset for Holistic Indoor Scene Understanding | ICCV 2021 | [project] |
| OpenRooms: An End-to-End Open Framework for Photorealistic Indoor Scene Datasets | CVPR 2021 | [project] |
| Structured3D: A Large Photo-realistic Dataset for Structured 3D Modeling | ECCV 2020 | [project] |
| InteriorNet: Mega-scale Multi-sensor Photo-realistic Indoor Scenes Dataset | BMVC 2018 | [project] |
| SceneNet RGB-D: Can 5M Synthetic Images Beat Generic ImageNet Pre-training on Indoor Segmentation? | ICCV 2017 | [project] |
| Semantic Scene Completion from a Single Depth Image | CVPR 2017 | - |
| SceneNet: Understanding Real World Indoor Scenes With Synthetic Data | CVPR 2016 | [project] |
| The SYNTHIA Dataset: A Large Collection of Synthetic Images for Semantic Segmentation of Urban Scenes | CVPR 2016 | [project] |

Holistic Scene Understanding

Perspective Image

| Papers | Venue | Links |
| --- | --- | --- |
| Single-view 3D Scene Reconstruction with High-fidelity Shape and Texture | 3DV 2024 | [project] |
| Towards High-Fidelity Single-view Holistic Reconstruction of Indoor Scenes | ECCV 2022 | [code] |
| Holistic 3D Scene Understanding from a Single Image with Implicit Representation | CVPR 2021 | [project] [code] |
| Total3DUnderstanding: Joint Layout, Object Pose and Mesh Reconstruction for Indoor Scenes from a Single Image | CVPR 2020 | [code] |
| PerspectiveNet: 3D Object Detection from a Single RGB Image via Perspective Points | NeurIPS 2019 | - |
| Holistic++ Scene Understanding: Single-view 3D Holistic Scene Parsing and Human Pose Estimation with Human-Object Interaction and Physical Commonsense | ICCV 2019 | [project] [code] |
| Complete 3D Scene Parsing from an RGBD Image | IJCV 2018 | - |
| Cooperative Holistic Scene Understanding: Unifying 3D Object, Layout, and Camera Pose Estimation | NeurIPS 2018 | [project] [code] |
| Holistic 3D Scene Parsing and Reconstruction from a Single RGB Image | ECCV 2018 | [project] [code] |
| Factoring Shape, Pose, and Layout from the 2D Image of a 3D Scene | CVPR 2018 | [project] [code] |
| Im2CAD | CVPR 2018 | [project] |
| DeepContext: Context-Encoding Neural Pathways for 3D Holistic Scene Understanding | ICCV 2017 | [project] |
| Emptying, Refurnishing, and Relighting Indoor Spaces | SIGGRAPH Asia 2016 | [project] |
| Scene Parsing by Integrating Function, Geometry and Appearance Models | CVPR 2013 | - |
| Understanding Indoor Scenes using 3D Geometric Phrases | CVPR 2013 | - |
| Recovering Free Space of Indoor Scenes from a Single Image | CVPR 2012 | - |
| Efficient Exact Inference for 3D Indoor Scene Understanding | ECCV 2012 | - |
| Efficient Structured Prediction for 3D Indoor Scene Understanding | CVPR 2012 | - |
| Estimating Spatial Layout of Rooms using Volumetric Reasoning about Objects and Surfaces | NeurIPS 2010 | - |
| Thinking Inside the Box: Using Appearance Models and Context Based on Room Geometry | ECCV 2010 | - |

Panoramic Image

| Papers | Venue | Links |
| --- | --- | --- |
| PanoContext-Former: Panoramic Total Scene Understanding with a Transformer | CVPR 2024 | - |
| PanelNet: Understanding 360 Indoor Environment via Panel Representation | CVPR 2023 | - |
| DeepPanoContext: Panoramic 3D Scene Understanding with Holistic Scene Context Graph and Relation-based Optimization | ICCV 2021 | [code] |
| HoHoNet: 360 Indoor Holistic Understanding with Latent Horizontal Features | CVPR 2021 | [code] |
| Automatic 3D Indoor Scene Modeling from Single Panorama | CVPR 2018 | - |
| Pano2CAD: Room Layout From A Single Panorama Image | WACV 2017 | - |
| PanoContext: A Whole-room 3D Context Model for Panoramic Scene Understanding | ECCV 2014 | [project] |

Room Layout Estimation

Perspective Image

(AW: Atlanta world, SS: single-floor and single-ceiling, PP: piece-wise planarity.)

| Dataset | Year | Modality | #Frames | Prior | Source |
| --- | --- | --- | --- | --- | --- |
| CAD-Estate | 2023 | RGB Video | - | Generic | RealEstate-10K |
| Matterport3D-Layout | 2020 | RGB-D | 7360 | PP | Matterport |
| ScanNet-Layout | 2020 | RGB-D | 293 | PP | ScanNet |
| Structured3D | 2020 | RGB-D | 82027 | AW+SS | Structured3D |
| LSUN Room Layout | 2016 | RGB | 5394 | Cuboid | SUN |
| SUN RGB-D | 2015 | RGB-D | 10335 | AW+SS | NYUv2, Berkeley B3DO, and SUN3D |
| NYUv2 303 | 2013 | RGB-D | 303 | Cuboid | NYUv2 |
| Hedau | 2009 | RGB | 366 | Cuboid | - |

| Papers | Venue | Links |
| --- | --- | --- |
| Polygon Detection for Room Layout Estimation using Heterogeneous Graphs and Wireframes | ICCV Workshop 2023 | [code] |
| ST-RoomNet: Learning Room Layout Estimation From Single Image Through Unsupervised Spatial Transformations | CVPR Workshop 2023 | - |
| Learning to Reconstruct 3D Non-Cuboid Room Layout from a Single RGB Image | WACV 2022 | [code] |
| RoomStructNet: Learning to Rank Non-Cuboidal Room Layouts From Single View | CoRR 2021 | - |
| GeoLayout: Geometry Driven Room Layout Estimation Based on Depth Maps of Planes | ECCV 2020 | [Matterport3D Layout Dataset] |
| Structural Deep Metric Learning for Room Layout Estimation | ECCV 2020 | - |
| General 3D Room Layout from a Single View by Render-and-Compare | ECCV 2020 | [project] [ScanNet-Layout Dataset] [code] |
| Smart Hypothesis Generation for Efficient and Robust Room Layout Estimation | WACV 2020 | - |
| Flat2Layout: Flat Representation for Estimating Layout of General Room Types | CoRR 2019 | - |
| Thinking Outside the Box: Generation of Unconstrained 3D Room Layouts | ACCV 2018 | - |
| RoomNet: End-to-End Room Layout Estimation | ICCV 2017 | - |
| Physics Inspired Optimization on Semantic Transfer Features: An Alternative Method for Room Layout Estimation | CVPR 2017 | [project] |
| A Coarse-to-Fine Indoor Layout Estimation (CFILE) Method | ACCV 2016 | - |
| DeLay: Robust Spatial Layout Estimation for Cluttered Indoor Scenes | CVPR 2016 | - |
| Learning Informative Edge Maps for Indoor Scene Layout Prediction | ICCV 2015 | - |
| Rent3D: Floor-Plan Priors for Monocular Layout Estimation | CVPR 2015 | [project] |
| Box In the Box: Joint 3D Layout and Object Reasoning from Single Images | CVPR 2013 | - |
| Estimating the 3D Layout of Indoor Scenes and its Clutter from Depth Sensors | ICCV 2013 | [project] |
| Recovering the Spatial Layout of Cluttered Rooms | ICCV 2009 | - |

Panoramic Image

(MW: Manhattan world, AW: Atlanta world, SS: single-floor and single-ceiling.)

| Dataset | Year | Modality | #Frames | Prior | Source |
| --- | --- | --- | --- | --- | --- |
| ZInD | 2021 | RGB | 71474 | AW+SS | ZInD |
| MatterportLayout | 2020 | RGB-D | 2295 | MW+SS | Matterport |
| Structured3D | 2020 | RGB-D | 196515 | AW+SS | Structured3D |
| LayoutMP3D | 2020 | RGB-D | 2505 | MW+SS | Matterport |
| 2D-3D-S | 2018 | RGB-D | 571 | Cuboid | 2D-3D-S |
| PanoContext | 2014 | RGB | 500 | Cuboid | SUN360 |

| Papers | Venue | Links |
| --- | --- | --- |
| No More Ambiguity in 360° Room Layout via Bi-Layout Estimation | CVPR 2024 | - |
| Seg2Reg: Differentiable 2D Segmentation to 1D Regression Rendering for 360 Room Layout Reconstruction | CVPR 2024 | - |
| iBARLE: imBalance-Aware Room Layout Estimation | CoRR 2023 | - |
| 📷 GPR-Net: Multi-view Layout Estimation via a Geometry-aware Panorama Registration Network | CVPR Workshop 2023 | - |
| Shape-Net: Room Layout Estimation from Panoramic Images Robust to Occlusion using Knowledge Distillation with 3D Shapes as Additional Inputs | CVPR Workshop 2023 | - |
| U2RLE: Uncertainty-Guided 2-Stage Room Layout Estimation | CVPR 2023 | - |
| Disentangling Orthogonal Planes for Indoor Panoramic Room Layout Estimation with Cross-Scale Distortion Awareness | CVPR 2023 | [code] |
| 📷 360-MLC: Multi-view Layout Consistency for Self-training and Hyper-parameter Tuning | NeurIPS 2022 | [project] |
| 3D Room Layout Estimation from a Cubemap of Panorama Image via Deep Manhattan Hough Transform | ECCV 2022 | [code] |
| 3D Room Layout Recovery Generalizing across Manhattan and Non-Manhattan Worlds | CVPR 2022 | - |
| 📷 PSMNet: Position-aware Stereo Merging Network for Room Layout Estimation | CVPR 2022 | [code] |
| Self-supervised 360° Room Layout Estimation | CoRR 2022 | [code] |
| LGT-Net: Indoor Panoramic Room Layout Estimation with Geometry-Aware Transformer Network | CVPR 2022 | - |
| Deep3DLayout: 3D Reconstruction of an Indoor Layout from a Spherical Panoramic Image | SIGGRAPH Asia 2021 | [project] |
| Transferable End-to-end Room Layout Estimation via Implicit Encoding | CoRR 2021 | [project] |
| OmniLayout: Room Layout Reconstruction from Indoor Spherical Panoramas | CVPR Workshop 2021 | [code] |
| LED²-Net: Monocular 360° Layout Estimation via Differentiable Depth Rendering | CVPR 2021 | [project] [code] |
| SSLayout360: Semi-Supervised Indoor Layout Estimation from 360 Panorama | CVPR 2021 | - |
| Single-Shot Cuboids: Geodesics-based End-to-end Manhattan Aligned Layout Estimation from Spherical Panoramas | Image and Vision Computing 2021 | [project] [code] |
| Manhattan Room Layout Reconstruction from a Single 360 image: A Comparative Study of State-of-the-art Methods | IJCV 2021 | [code] [MatterportLayout Dataset] |
| Training and Post Processing 3D Room Layout Beyond the Manhattan World Assumption | ECCV Workshop 2020 | - |
| Joint 3D Layout and Depth Prediction from a Single Indoor Panorama Image | ECCV 2020 | - |
| AtlantaNet: Inferring the 3D Indoor Layout from a Single 360 Image Beyond the Manhattan World Assumption | ECCV 2020 | [project] [code] |
| Corners for Layout: End-to-End Layout Recovery from 360 Images | ICRA 2019 | [project] [code] |
| DuLa-Net: A Dual-Projection Network for Estimating Room Layouts from a Single RGB Panorama | CVPR 2019 | [project] |
| HorizonNet: Learning Room Layout with 1D Representation and Pano Stretch Data Augmentation | CVPR 2019 | [code] |
| Layouts from Panoramic Images with Geometry and Deep Learning | IROS 2018 | [code] |
| LayoutNet: Reconstructing the 3D Room Layout from a Single RGB Image | CVPR 2018 | [code] |
| Efficient 3D Room Shape Recovery From a Single Panorama | CVPR 2016 | [code] |

Floorplan

| Papers | Venue | Links |
| --- | --- | --- |
| 🎲 FRI-Net: Floorplan Reconstruction via Room-wise Implicit Representation | ECCV 2024 | [code] |
| 🎲 PolyRoom: Room-aware Transformer for Floorplan Reconstruction | ECCV 2024 | [code] |
| 🎲 PolyDiffuse: Polygonal Shape Reconstruction via Guided Set Diffusion Models | NeurIPS 2023 | [project] |
| 🎲 Connecting the Dots: Floorplan Reconstruction Using Two-Level Queries | CVPR 2023 | [project] [code] |
| 📷 Floorplan Restoration by Structure Hallucinating Transformer Cascades | CoRR 2022 | - |
| 📷 MVLayoutNet: 3D Layout Reconstruction with Multi-View Panoramas | CoRR 2021 | - |
| 📷 Extreme Structure From Motion for Indoor Panoramas Without Visual Overlaps | ICCV 2021 | [code] |
| 🎲 MonteFloor: Extending MCTS for Reconstructing Accurate Large-Scale Floor Plans | ICCV 2021 | - |
| 🎲 Scan2Plan: Efficient Floorplan Generation from 3D Scans of Indoor Scenes | CoRR 2020 | - |
| 🎲 Floor-SP: Inverse CAD for Floorplans by Sequential Room-wise Shortest Path | ICCV 2019 | [project] [code] |
| 📷 Floorplan-Jigsaw: Jointly Estimating Scene Layout and Aligning Partial Scans | ICCV 2019 | [project] |
| 🎲 DeepPerimeter: Indoor Boundary Estimation from Posed Monocular Sequences | CoRR 2019 | - |
| 📷 FloorNet: A unified framework for floorplan reconstruction from 3D scans | ECCV 2018 | [project] [code] |

Floorplan Vectorization

| Papers | Venue | Links |
| --- | --- | --- |
| VectorFloorSeg: Two-Stream Graph Attention Network for Vectorized Roughcast Floorplan Segmentation | CVPR 2023 | [code] |
| Parsing Line Segments of Floor Plan Images Using Graph Neural Networks | CoRR 2023 | - |
| Residential floor plan recognition and reconstruction | CVPR 2021 | - |
| Versailles-FP dataset: Wall Detection in Ancient Floor Plans | CoRR 2021 | - |
| Deep Floor Plan Recognition using a Multi-task Network with Room-boundary-Guided Attention | ICCV 2019 | [project] |
| CubiCasa5K: A Dataset and an Improved Multi-Task Model for Floorplan Image Analysis | Scandinavian Conference on Image Analysis 2019 | [code] |
| Raster-to-Vector: Revisiting Floorplan Transformation | ICCV 2017 | [project] [code] |

Visual Localization

| Papers | Venue | Links |
| --- | --- | --- |
| SPVLoc: Semantic Panoramic Viewport Matching for 6D Camera Localization in Unseen Environments | ECCV 2024 | [project] [code] |
| LaLaLoc++: Global Floor Plan Comprehension for Layout Localisation in Unvisited Environments | ECCV 2022 | [code] |
| LASER: LAtent SpacE Rendering for 2D Visual Localization | CVPR 2022 | - |
| LaLaLoc: Latent Layout Localisation in Dynamic, Unvisited Environments | ICCV 2021 | - |

Primitive

Junction

| Papers | Venue | Links |
| --- | --- | --- |
| Manhattan Junction Catalogue for Spatial Reasoning of Indoor Scenes | CVPR 2013 | - |

Line Segment and Wireframe

| Papers | Venue | Links |
| --- | --- | --- |
| 📷 Volumetric Wireframe Parsing from Neural Attraction Fields | CoRR 2023 | [code] |
| 📷 NEF: Neural Edge Fields for 3D Parametric Curve Reconstruction from Multi-view Images | CVPR 2023 | [project] |
| DeepLSD: Line Segment Detection and Refinement with Deep Image Gradients | CoRR 2022 | [code] |
| Holistically-Attracted Wireframe Parsing: From Supervised to Self-Supervised Learning | CoRR 2022 | - |
| 🎲 Learning to Construct 3D Building Wireframes from 3D Line Clouds | BMVC 2022 | [code] |
| HoW-3D: Holistic 3D Wireframe Perception from a Single Image | 3DV 2022 | [code] |
| Semantic Room Wireframe Detection from a Single View | ICPR 2022 | [code] |
| Towards Real-time and Light-weight Line Segment Detection | AAAI 2022 | [code] |
| Hole-robust Wireframe Detection | WACV 2022 | - |
| Fully Convolutional Line Parsing | Neurocomputing 2022 | [code] |
| ELSD: Efficient Line Segment Detector and Descriptor | ICCV 2021 | - |
| SOLD²: Self-supervised Occlusion-aware Line Description and Detection | CVPR 2021 | [code] |
| Line Segment Detection Using Transformers without Edges | CVPR 2021 | [code] |
| PlueckerNet: Learn to Register 3D Line Reconstructions | CVPR 2020 | [code] |
| LGNN: A Context-aware Line Segment Detector | ACM MM 2020 | - |
| TP-LSD: Tri-Points Based Line Segment Detector | ECCV 2020 | [code] |
| Deep Hough-Transform Line Priors | ECCV 2020 | [code] |
| Deep Hough Transform for Semantic Line Detection | ECCV 2020 | [code] |
| Holistically-Attracted Wireframe Parsing | CVPR 2020 | [code] |
| Learning to Reconstruct 3D Manhattan Wireframes from a Single Image | ICCV 2019 | [code] |
| End-to-End Wireframe Parsing | ICCV 2019 | [code] |
| PPGNet: Learning Point-Pair Graph for Line Segment Detection | CVPR 2019 | [code] |
| Learning Attraction Field Representation for Robust Line Segment Detection | CVPR 2019 | [code] |
| Novel Single View Constraints for Manhattan 3D Line Reconstruction | 3DV 2018 | - |
| Learning to Parse Wireframes in Images of Man-Made Environments | CVPR 2018 | [code] |
| A Novel Linelet-Based Representation for Line Segment Detection | TPAMI 2018 | - |
| MCMLSD: A Dynamic Programming Approach to Line Segment Detection | CVPR 2017 | - |
| Lifting 3D Manhattan Lines from a Single Image | ICCV 2013 | - |
| LSD: A Fast Line Segment Detector with a False Detection Control | TPAMI 2010 | - |

Outdoor Architecture

| Papers | Venue | Links |
| --- | --- | --- |
| HEAT: Holistic Edge Attention Transformer for Structured Reconstruction | CVPR 2022 | [project] |
| Structured Outdoor Architecture Reconstruction by Exploration and Classification | ICCV 2021 | [project] |
| Roof-GAN: Learning to Generate Roof Geometry and Relations for Residential Houses | CVPR 2021 | [code] |
| Vectorizing World Buildings: Planar Graph Reconstruction by Primitive Detection and Relationship Inference | ECCV 2020 | [project] |
| Conv-MPN: Convolutional Message Passing Neural Network for Structured Outdoor Architecture Reconstruction | CVPR 2020 | [project] |

Plane

| Papers | Venue | Links |
| --- | --- | --- |
| MonoPlane: Exploiting Monocular Geometric Cues for Generalizable 3D Plane Reconstruction | IROS 2024 | [code] |
| 📷 UniPlane: Unified Plane Detection and Reconstruction from Posed Monocular Videos | CoRR 2024 | - |
| 📷 AirPlanes: Accurate Plane Estimation via 3D-Consistent Embeddings | CVPR 2024 | [project] |
| PlaneRecTR: Unified Query learning for 3D Plane Recovery from a Single View | ICCV 2023 | [code] |
| 📷 NOPE-SAC: Neural One-Plane RANSAC for Sparse-View Planar 3D Reconstruction | CoRR 2022 | [code] |
| 📷 PlaneFormers: From Sparse View Planes to 3D Reconstruction | ECCV 2022 | [project] [code] |
| 📷 PlanarRecon: Real-time 3D Plane Detection and Reconstruction from Posed Monocular Videos | CVPR 2022 | [project] |
| PlaneRecNet: Multi-Task Learning with Cross-Task Consistency for Piece-Wise Plane Detection and Reconstruction from a Single RGB Image | BMVC 2021 | [code] |
| PlaneTR: Structure-Guided Transformers for 3D Plane Recovery | ICCV 2021 | [code] |
| 📷 Planar Surface Reconstruction From Sparse Views | ICCV 2021 | [project] [code] |
| Indoor Panorama Planar 3D Reconstruction via Divide and Conquer | CVPR 2021 | [code] |
| Learning Pairwise Inter-Plane Relations for Piecewise Planar Reconstruction | ECCV 2020 | [code] |
| Peek-a-Boo: Occlusion Reasoning in Indoor Scenes with Plane Representations | CVPR 2020 | [project] |
| Single-Image Piece-wise Planar 3D Reconstruction via Associative Embedding | CVPR 2019 | [code] |
| PlaneRCNN: 3D Plane Detection and Reconstruction from a Single Image | CVPR 2019 | [project] [code] |
| Recovering 3D Planes from a Single Image via Convolutional Neural Networks | ECCV 2018 | [code] |
| PlaneNet: Piece-wise Planar Reconstruction from a Single RGB Image | CVPR 2018 | [project] [code] |

Vanishing Point

| Papers | Venue | Links |
| --- | --- | --- |
| Vanishing Point Estimation in Uncalibrated Images with Prior Gravity Direction | ICCV 2023 | [code] |
| Transformer Based Line Segment Classifier with Image Context for Real-Time Vanishing Point Detection in Manhattan World | CVPR 2022 | - |
| Deep Vanishing Point Detection: Geometric Priors Make Dataset Variations Vanish | CVPR 2022 | - |
| VaPiD: A Rapid Vanishing Point Detector via Learned Optimizers | ICCV 2021 | - |
| NeurVPS: Neural Vanishing Point Scanning via Conic Convolution | NeurIPS 2021 | [code] |