# 3D Mapping

Welcome! This repository contains all software used to create CarlaSC (our dynamic scene completion data set), MotionSC (our local semantic mapping network), and all baselines. Below is an overview of our project. For more information on the data and method, see our website. For more information on the software and files involved, see the Wiki.

## Table of Contents

- [Data: CarlaSC](#data-carlasc)
- [Networks: MotionSC](#networks-motionsc)
- [Use 3DMapping](#use-3dmapping)
- [Results](#results)
- [Acknowledgement](#acknowledgement)
- [Reference](#reference)

## Data: CarlaSC

A novel data set with accurate, complete dynamic outdoor scenes for semantic scene completion or mapping. We use the CARLA simulator (1) to gather clean ground truth information, randomly placing a large number of sensors in each scene, which guarantees theoretical convergence to the true scene and network generalization to the full scene. Dynamic maps are difficult to build due to occlusions and the traces left behind by moving objects. Below is a comparison of a frame from our data set with a similar frame from Semantic KITTI (4).

A fuller introduction to the CarlaSC dataset, along with download links, is available on our website.

<p align="center"> <img width="740" src="https://user-images.githubusercontent.com/91337470/175836944-f4c91add-95bc-490e-aaee-248005bf1351.png"> </p>
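As a rough illustration of this sampling strategy, the sketch below uses the CARLA Python API to scatter semantic lidar sensors at random poses in a world. The blueprint attributes, sensor count, and placement bounds here are our assumptions for illustration only; the actual data generation scripts and their parameters are in this repo and documented in the wiki.

```python
# Illustrative sketch only: randomly scatter semantic lidar sensors in a
# CARLA world, in the spirit of the CarlaSC collection strategy. The
# sensor count, range, and placement bounds below are hypothetical.
import random
import carla

client = carla.Client("localhost", 2000)
client.set_timeout(10.0)
world = client.get_world()

lidar_bp = world.get_blueprint_library().find("sensor.lidar.ray_cast_semantic")
lidar_bp.set_attribute("range", "50")

sensors = []
for _ in range(10):  # hypothetical number of randomly placed sensors
    transform = carla.Transform(
        carla.Location(x=random.uniform(-50.0, 50.0),
                       y=random.uniform(-50.0, 50.0),
                       z=random.uniform(1.0, 3.0)))
    sensor = world.spawn_actor(lidar_bp, transform)
    # Save each measurement; many overlapping viewpoints let the merged
    # point clouds approach the complete ground-truth scene.
    sensor.listen(lambda data: data.save_to_disk(f"_out/{data.frame}.ply"))
    sensors.append(sensor)
```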

## Networks: MotionSC

We create a network using ideas from MotionNet (2) and LMSCNet (3), which we call MotionSC. MotionSC extends semantic scene completion to a higher level of scene understanding by integrating the structure of a view-volume scene completion network with the temporal backbone of an object tracking network. It builds on the idea that semantic scene completion is fundamentally similar to 3D semantic mapping, the major difference being the use of temporal information, which is readily available in robotic applications. We compare against the scene completion baselines LMSCNet (3), JS3CNet (5), and SSCNet (6), and show that the number of past scans (T) correlates with quantitative and qualitative improvements in semantic and geometric completeness. Note that in the video below, the ego vehicle is the stationary Tesla in the bottom right and is excluded from the complete semantic scenes.

<p align="center"> <img width="740" src="https://user-images.githubusercontent.com/91337470/175836958-8e08f5d7-2017-476e-a826-803e983cffa9.gif"> </p>
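The exact MotionSC architecture is defined in this repo; the minimal PyTorch sketch below only illustrates the core idea of feeding T past voxelized scans to a completion backbone. The layer sizes and tensor shapes are hypothetical, not copied from the network.

```python
# Minimal sketch of the temporal-stacking idea behind MotionSC, not the
# actual network: T past occupancy grids are collapsed into the channel
# axis of a BEV-style 2D backbone, which predicts per-voxel class scores.
import torch
import torch.nn as nn

class TemporalCompletionSketch(nn.Module):
    def __init__(self, T: int, height: int = 32, num_classes: int = 11):
        super().__init__()
        # Fold temporal and height axes into 2D-CNN channels, in the
        # spirit of BEV backbones such as MotionNet/LMSCNet.
        self.backbone = nn.Sequential(
            nn.Conv2d(T * height, 128, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(128, num_classes * height, kernel_size=3, padding=1),
        )
        self.height, self.num_classes = height, num_classes

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, T, height, X, Y) stack of T past occupancy grids
        B, T, H, X, Y = x.shape
        logits = self.backbone(x.reshape(B, T * H, X, Y))
        # -> per-voxel class scores: (B, num_classes, height, X, Y)
        return logits.reshape(B, self.num_classes, H, X, Y)

grids = torch.zeros(1, 5, 32, 128, 128)  # e.g. T=5 past frames
print(TemporalCompletionSketch(T=5)(grids).shape)
```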

## Use 3DMapping

### Dependencies

We provide an `environment.yml` that you can use to create a conda environment:

```bash
conda env create -f environment.yml
conda activate MotionSC
```

### Data Generation and Visualization

We provide scripts to generate the dataset, as well as scripts for visualizing the data. Detailed explanations of the relevant scripts and parameters are in our wiki. Check the TODO blocks in the scripts for parameters that can be adjusted.
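As a rough illustration of the kind of inspection these scripts perform, the sketch below loads one frame as a dense label grid. The file name, dtype, and grid shape are assumptions for illustration; the wiki documents the actual file layout.

```python
# Hypothetical loader sketch, not code from this repo: we only assume a
# dense voxel grid stored as a flat label array, to show a quick
# per-frame sanity check of class balance.
import numpy as np

GRID_SHAPE = (128, 128, 8)  # assumed (X, Y, Z) voxel resolution

labels = np.fromfile("frame_000000.label", dtype=np.uint32)  # hypothetical path
grid = labels.reshape(GRID_SHAPE)

# Count voxels per semantic class in this frame
classes, counts = np.unique(grid, return_counts=True)
for c, n in zip(classes, counts):
    print(f"class {c}: {n} voxels")
```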

### Training and Testing

We provide training and testing scripts for MotionSC, LMSCNet, and SSCNet in our repo. Check the TODO blocks in the scripts for parameters that can be adjusted; usage instructions are on our wiki. Due to the special dependencies of JS3CNet, please use our forked JS3CNet to run it.

## Results

We trained our model on the CarlaSC and SemanticKITTI data sets. Note that while newer semantic scene completion networks continue to be released with better results, our method of incorporating temporal information may be applied to those networks as well for enhanced mapping capabilities.

### CarlaSC

For MotionSC on CarlaSC, we also list performance as a function of the amount of temporal information provided, T (the number of past frames).

#### Inference Time (measured on an NVIDIA GeForce RTX 3090)

| Method | Latency (ms) |
|---|---|
| LMSCNet SS | 4.86 |
| SSCNet Full | 2.18 |
| JS3C-Net | 166.2 |
| MotionSC (T=1) | 5.72 |
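For reference, per-frame latencies like these are typically measured by timing repeated forward passes with GPU synchronization. The sketch below shows one such protocol; it is our assumption for illustration, not necessarily the exact procedure used.

```python
# Sketch of a common GPU latency-measurement protocol: warm up, then
# average wall-clock time over synchronized forward passes.
import time
import torch

def measure_latency_ms(model: torch.nn.Module, example: torch.Tensor,
                       warmup: int = 10, iters: int = 100) -> float:
    model.eval().cuda()
    example = example.cuda()
    with torch.no_grad():
        for _ in range(warmup):          # warm up kernels and allocator
            model(example)
        torch.cuda.synchronize()         # make sure queued work finished
        start = time.perf_counter()
        for _ in range(iters):
            model(example)
        torch.cuda.synchronize()
    return (time.perf_counter() - start) * 1000.0 / iters
```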

#### Semantic Completeness

| Method | Mean IoU | Accuracy | Free | Building | Barrier | Other | Pedestrian | Pole | Road | Ground | Sidewalk | Vegetation | Vehicles |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| LMSCNet SS | 42.53 | 94.64 | 97.41 | 25.61 | 3.35 | 11.31 | 33.76 | 43.54 | 85.96 | 21.15 | 52.64 | 39.99 | 53.09 |
| SSCNet Full | 41.91 | 94.11 | 96.02 | 27.04 | 1.82 | 13.65 | 29.69 | 27.02 | 88.45 | 25.89 | 65.36 | 33.29 | 52.78 |
| JS3C-Net | 48.95 | 95.48 | 96.78 | 34.68 | 3.03 | 22.94 | 43.64 | 44.50 | 93.31 | 30.90 | 75.15 | 34.35 | 59.21 |
| MotionSC (T=1) | 46.31 | 95.11 | 97.42 | 31.59 | 2.63 | 14.77 | 39.87 | 42.11 | 90.57 | 25.89 | 60.77 | 42.41 | 61.37 |
| MotionSC (T=5) | 45.35 | 95.00 | 97.43 | 29.48 | 2.54 | 17.48 | 41.87 | 43.43 | 90.90 | 22.08 | 58.43 | 35.79 | 59.41 |
| MotionSC (T=10) | 47.01 | 95.15 | 97.44 | 32.29 | 2.35 | 19.82 | 44.06 | 45.47 | 90.19 | 27.35 | 62.48 | 36.92 | 58.80 |
| MotionSC (T=16) | 47.45 | 95.57 | 97.60 | 34.91 | 2.66 | 22.86 | 37.78 | 43.87 | 90.12 | 28.31 | 66.20 | 41.59 | 56.08 |
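For clarity, per-class IoU and mean IoU over dense voxel grids can be computed as in the sketch below. It mirrors the standard definition for integer class-id grids and is not code copied from this repo.

```python
# Standard per-class IoU over dense voxel label grids; mean IoU is the
# average over classes that appear in prediction or ground truth.
import numpy as np

def per_class_iou(pred: np.ndarray, gt: np.ndarray, num_classes: int):
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        ious.append(inter / union if union > 0 else float("nan"))
    return ious  # mean IoU: np.nanmean(ious)
```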

#### Geometric Completeness

| Method | Precision | Recall | IoU |
|---|---|---|---|
| LMSCNet SS | 95.62 | 98.95 | 85.98 |
| SSCNet Full | 85.87 | 93.05 | 80.69 |
| JS3C-Net | 89.43 | 93.02 | 83.80 |
| MotionSC (T=1) | 93.32 | 92.16 | 86.46 |
| MotionSC (T=5) | 94.76 | 90.57 | 86.25 |
| MotionSC (T=10) | 93.17 | 92.43 | 86.56 |
| MotionSC (T=16) | 94.61 | 91.77 | 87.21 |
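The geometric metrics above treat the scene as a binary occupied/free grid. A minimal sketch of precision, recall, and IoU under that assumption:

```python
# Geometric completeness metrics from boolean occupancy grids of the
# same shape: pred_occ and gt_occ mark voxels predicted/labeled occupied.
import numpy as np

def geometric_metrics(pred_occ: np.ndarray, gt_occ: np.ndarray):
    tp = np.logical_and(pred_occ, gt_occ).sum()
    fp = np.logical_and(pred_occ, ~gt_occ).sum()
    fn = np.logical_and(~pred_occ, gt_occ).sum()
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    iou = tp / (tp + fp + fn)
    return precision, recall, iou
```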

### SemanticKITTI

For the SemanticKITTI dataset, we only compare with T=1, as the competition rules require a single input frame. Our results can be found on the Semantic KITTI leaderboard under the user "tigeriv4". The results of the other models are taken from their papers.

#### Semantic Completeness

| Method | Mean IoU | Accuracy | Road | Sidewalk | Parking | Other-ground | Building | Car | Truck | Bicycle | Motorcycle | Other-vehicle | Vegetation | Trunk | Terrain | Person | Bicyclist | Motorcyclist | Fence | Pole | Traffic-sign |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| LMSCNet SS | 17.62 | 56.72 | 64.80 | 34.68 | 29.02 | 4.62 | 38.08 | 30.89 | 1.47 | 0 | 0 | 0.81 | 41.31 | 19.89 | 32.05 | 0 | 0 | 0 | 21.32 | 15.01 | 0.84 |
| SSCNet Full | 16.14 | 49.98 | 51.15 | 30.76 | 27.12 | 6.44 | 34.53 | 24.26 | 1.18 | 0.54 | 0.78 | 4.34 | 35.25 | 18.17 | 29.01 | 0.25 | 0.25 | 0.03 | 19.87 | 13.10 | 6.73 |
| JS3C-Net | 23.8 | 56.6 | 64.7 | 39.9 | 34.9 | 14.1 | 37.4 | 33.3 | 7.2 | 14.4 | 8.8 | 12.7 | 43.1 | 19.6 | 40.5 | 8.0 | 5.1 | 0.4 | 30.4 | 18.9 | 15.9 |
| MotionSC (T=1) | 18.4 | 56.9 | 66.0 | 36.5 | 29.6 | 7.0 | 39.0 | 31.4 | 1.0 | 0 | 0 | 3.6 | 40.0 | 19.0 | 30.0 | 0 | 0 | 0 | 23.4 | 20.0 | 3.4 |

## Acknowledgement

We utilize data and code from:

1. CARLA Simulator
2. MotionNet
3. LMSCNet
4. Semantic KITTI
5. JS3CNet
6. SSCNet

## Reference

If you find our work useful in your research, consider citing our paper!

```bibtex
@misc{wilson2022motionsc,
  doi       = {10.48550/ARXIV.2203.07060},
  url       = {https://arxiv.org/abs/2203.07060},
  author    = {Wilson, Joey and Song, Jingyu and Fu, Yuewei and Zhang, Arthur and Capodieci, Andrew and Jayakumar, Paramsothy and Barton, Kira and Ghaffari, Maani},
  keywords  = {Computer Vision and Pattern Recognition (cs.CV), Robotics (cs.RO), FOS: Computer and information sciences},
  title     = {MotionSC: Data Set and Network for Real-Time Semantic Mapping in Dynamic Environments},
  publisher = {arXiv},
  year      = {2022},
  copyright = {Creative Commons Attribution 4.0 International}
}
```