
<p align="right">English | <a href="./README_CN.md">简体中文</a></p> <p align="center"> <img src="docs/figs/logo.png" align="center" width="22.5%"> <h3 align="center"><strong>Robo3D: Towards Robust and Reliable 3D Perception against Corruptions</strong></h3> <p align="center"> <a href="https://scholar.google.com/citations?user=-j1j7TkAAAAJ" target='_blank'>Lingdong Kong</a><sup>1,2,*</sup>&nbsp;&nbsp;&nbsp; <a href="https://github.com/youquanl" target='_blank'>Youquan Liu</a><sup>1,3,*</sup>&nbsp;&nbsp;&nbsp; <a href="https://scholar.google.com/citations?user=7atts2cAAAAJ" target='_blank'>Xin Li</a><sup>1,4,*</sup>&nbsp;&nbsp;&nbsp; <a href="https://scholar.google.com/citations?user=Uq2DuzkAAAAJ" target='_blank'>Runnan Chen</a><sup>1,5</sup>&nbsp;&nbsp;&nbsp; <a href="https://scholar.google.com/citations?user=QDXADSEAAAAJ" target='_blank'>Wenwei Zhang</a><sup>1,6</sup> <br> <a href="https://scholar.google.com/citations?user=YUKPVCoAAAAJ" target='_blank'>Jiawei Ren</a><sup>6</sup>&nbsp;&nbsp;&nbsp; <a href="https://scholar.google.com/citations?user=lSDISOcAAAAJ" target='_blank'>Liang Pan</a><sup>6</sup>&nbsp;&nbsp;&nbsp; <a href="https://scholar.google.com/citations?user=eGD0b7IAAAAJ" target='_blank'>Kai Chen</a><sup>1</sup>&nbsp;&nbsp;&nbsp; <a href="https://scholar.google.com/citations?user=lc45xlcAAAAJ" target='_blank'>Ziwei Liu</a><sup>6</sup> <br> <sup>1</sup>Shanghai AI Laboratory&nbsp;&nbsp;&nbsp; <sup>2</sup>National University of Singapore&nbsp;&nbsp;&nbsp; <sup>3</sup>Hochschule Bremerhaven&nbsp;&nbsp;&nbsp; <sup>4</sup>East China Normal University&nbsp;&nbsp;&nbsp; <sup>5</sup>The University of Hong Kong&nbsp;&nbsp;&nbsp; <sup>6</sup>S-Lab, Nanyang Technological University </p> </p> <p align="center"> <a href="https://arxiv.org/abs/2303.17597" target='_blank'> <img src="https://img.shields.io/badge/Paper-%F0%9F%93%83-slategray"> </a> <a href="https://ldkong.com/Robo3D" target='_blank'> <img src="https://img.shields.io/badge/Project-%F0%9F%94%97-lightblue"> </a> <a href="" target='_blank'> <img src="https://img.shields.io/badge/Demo-%F0%9F%8E%AC-pink"> </a> <a href="https://zhuanlan.zhihu.com/p/672935761" target='_blank'> <img src="https://img.shields.io/badge/%E4%B8%AD%E8%AF%91%E7%89%88-%F0%9F%90%BC-red"> </a> <a href="" target='_blank'> <img src="https://visitor-badge.laobi.icu/badge?page_id=ldkong1205.Robo3D&left_color=gray&right_color=firebrick"> </a> </p>

About

Robo3D is an evaluation suite for robust and reliable 3D perception in autonomous driving. With it, we probe the robustness of 3D detectors and segmentors under out-of-distribution (OoD) scenarios, i.e., against corruptions that occur in real-world environments. Specifically, we consider natural corruptions that arise in the following cases:

  1. Adverse weather conditions, such as fog, wet ground, and snow;
  2. External disturbances that are caused by motion blur or result in LiDAR beam missing;
  3. Internal sensor failure, including crosstalk, possible incomplete echo, and cross-sensor scenarios.
<img src="docs/figs/teaser/clean.png" width="240"><img src="docs/figs/teaser/fog.png" width="240"><img src="docs/figs/teaser/wet_ground.png" width="240">
CleanFogWet Ground
<img src="docs/figs/teaser/snow.png" width="240"><img src="docs/figs/teaser/motion_blur.png" width="240"><img src="docs/figs/teaser/beam_missing.png" width="240">
SnowMotion BlurBeam Missing
<img src="docs/figs/teaser/crosstalk.png" width="240"><img src="docs/figs/teaser/incomplete_echo.png" width="240"><img src="docs/figs/teaser/cross_sensor.png" width="240">
CrosstalkIncomplete EchoCross-Sensor

Visit our project page to explore more examples. :oncoming_automobile:

Updates

Outline

Taxonomy

<img src="docs/figs/demo/bev_fog.gif" width="180"><img src="docs/figs/demo/bev_wet_ground.gif" width="180"><img src="docs/figs/demo/bev_snow.gif" width="180"><img src="docs/figs/demo/bev_motion_blur.gif" width="180">
<img src="docs/figs/demo/rv_fog.gif" width="180"><img src="docs/figs/demo/rv_wet_ground.gif" width="180"><img src="docs/figs/demo/rv_snow.gif" width="180"><img src="docs/figs/demo/rv_motion_blur.gif" width="180">
FogWet GroundSnowMotion Blur
<img src="docs/figs/demo/bev_beam_missing.gif" width="180"><img src="docs/figs/demo/bev_crosstalk.gif" width="180"><img src="docs/figs/demo/bev_incomplete_echo.gif" width="180"><img src="docs/figs/demo/bev_cross_sensor.gif" width="180">
<img src="docs/figs/demo/rv_beam_missing.gif" width="180"><img src="docs/figs/demo/rv_crosstalk.gif" width="180"><img src="docs/figs/demo/rv_incomplete_echo.gif" width="180"><img src="docs/figs/demo/rv_cross_sensor.gif" width="180">
Beam MissingCrosstalkIncomplete EchoCross-Sensor

Video Demo

| Demo 1 | Demo 2 | Demo 3 |
|:-:|:-:|:-:|
| <img width="100%" src="docs/figs/demo1.png"> | <img width="100%" src="docs/figs/demo2.png"> | <img width="100%" src="docs/figs/demo3.png"> |
| Link <sup>:arrow_heading_up:</sup> | Link <sup>:arrow_heading_up:</sup> | Link <sup>:arrow_heading_up:</sup> |

Installation

For details related to installation, kindly refer to INSTALL.md.

Data Preparation

Our datasets are hosted by OpenDataLab.

<img src="https://raw.githubusercontent.com/opendatalab/dsdl-sdk/2ae5264a7ce1ae6116720478f8fa9e59556bed41/resources/opendatalab.svg" width="32%"/><br> OpenDataLab is a pioneering open data platform for the large AI model era, making datasets accessible. By using OpenDataLab, researchers can obtain free formatted datasets in various fields.

Kindly refer to DATA_PREPARE.md for the details to prepare the <sup>1</sup>KITTI, <sup>2</sup>KITTI-C, <sup>3</sup>SemanticKITTI, <sup>4</sup>SemanticKITTI-C, <sup>5</sup>nuScenes, <sup>6</sup>nuScenes-C, <sup>7</sup>WOD, and <sup>8</sup>WOD-C datasets.
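As a quick sanity check after arranging the data, the snippet below reads a single SemanticKITTI-style scan and its label file with NumPy. It is a minimal sketch, not part of this codebase: the paths are placeholders for wherever DATA_PREPARE.md instructs you to place the data, and it assumes the standard SemanticKITTI binary layout (N×4 float32 points; uint32 labels whose lower 16 bits encode the semantic class). The corrupted scans in SemanticKITTI-C should be readable in the same way.

```python
import numpy as np

# Placeholder paths -- substitute the locations described in DATA_PREPARE.md.
scan_path = "data/SemanticKITTI/sequences/08/velodyne/000000.bin"
label_path = "data/SemanticKITTI/sequences/08/labels/000000.label"

# SemanticKITTI scans are flat float32 arrays: x, y, z, intensity per point.
points = np.fromfile(scan_path, dtype=np.float32).reshape(-1, 4)

# Labels are uint32; the lower 16 bits hold the semantic class id,
# the upper 16 bits the instance id.
labels = np.fromfile(label_path, dtype=np.uint32) & 0xFFFF

assert len(points) == len(labels), "each point should carry exactly one label"
print(points.shape, labels.shape)
```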

Getting Started

To learn more about how to use this codebase, kindly refer to GET_STARTED.md.

Model Zoo

<details open> <summary>&nbsp;<b>LiDAR Semantic Segmentation</b></summary>
</details>
<details open> <summary>&nbsp;<b>LiDAR Panoptic Segmentation</b></summary>
</details>
<details open> <summary>&nbsp;<b>3D Object Detection</b></summary>
</details>

Benchmark

LiDAR Semantic Segmentation

The mean Intersection-over-Union (mIoU) is consistently used as the main indicator for evaluating model performance in our LiDAR semantic segmentation benchmark. The following two metrics are adopted to compare models' robustness: the mean Corruption Error (mCE) and the mean Resilience Rate (mRR).
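For orientation, the sketch below shows how such robustness metrics are typically aggregated from per-corruption scores: mCE averages a model's error relative to the baseline model (marked with :star: in the tables below), and mRR averages the ratio of corrupted to clean performance. This is only an illustrative reimplementation, not the evaluation script of this repository; it folds the per-severity averaging of the paper into a single score per corruption, and the function names are ours. Please consult the paper and GET_STARTED.md for the exact definitions.

```python
from typing import Dict, Sequence

def mean_corruption_error(model_err: Dict[str, float],
                          baseline_err: Dict[str, float]) -> float:
    """Average, over corruption types, of the model's error (1 - score)
    relative to the baseline's error; 100 means "as robust as the baseline"."""
    ratios = [model_err[c] / baseline_err[c] for c in model_err]
    return 100.0 * sum(ratios) / len(ratios)

def mean_resilience_rate(corrupted_scores: Sequence[float],
                         clean_score: float) -> float:
    """Average ratio of corrupted score to clean score, in percent."""
    return 100.0 * sum(s / clean_score for s in corrupted_scores) / len(corrupted_scores)

# Toy numbers (not taken from the tables below):
model_err    = {"fog": 1 - 0.43, "snow": 1 - 0.50}
baseline_err = {"fog": 1 - 0.56, "snow": 1 - 0.53}
print(mean_corruption_error(model_err, baseline_err))        # > 100: less robust than the baseline
print(mean_resilience_rate([0.43, 0.50], clean_score=0.60))  # closer to 100: smaller degradation
```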

:red_car:  SemanticKITTI-C

<p align="center"> <img src="docs/figs/stat/metrics_semkittic.png" align="center" width="100%"> </p>
| Model | mCE (%) | mRR (%) | Clean | Fog | Wet Ground | Snow | Motion Blur | Beam Missing | Cross-Talk | Incomplete Echo | Cross-Sensor |
|:-|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
| SqueezeSeg | 164.87 | 66.81 | 31.61 | 18.85 | 27.30 | 22.70 | 17.93 | 25.01 | 21.65 | 27.66 | 7.85 |
| SqueezeSegV2 | 152.45 | 65.29 | 41.28 | 25.64 | 35.02 | 27.75 | 22.75 | 32.19 | 26.68 | 33.80 | 11.78 |
| RangeNet<sub>21</sub> | 136.33 | 73.42 | 47.15 | 31.04 | 40.88 | 37.43 | 31.16 | 38.16 | 37.98 | 41.54 | 18.76 |
| RangeNet<sub>53</sub> | 130.66 | 73.59 | 50.29 | 36.33 | 43.07 | 40.02 | 30.10 | 40.80 | 46.08 | 42.67 | 16.98 |
| SalsaNext | 116.14 | 80.51 | 55.80 | 34.89 | 48.44 | 45.55 | 47.93 | 49.63 | 40.21 | 48.03 | 44.72 |
| FIDNet<sub>34</sub> | 113.81 | 76.99 | 58.80 | 43.66 | 51.63 | 49.68 | 40.38 | 49.32 | 49.46 | 48.17 | 29.85 |
| CENet<sub>34</sub> | 103.41 | 81.29 | 62.55 | 42.70 | 57.34 | 53.64 | 52.71 | 55.78 | 45.37 | 53.40 | 45.84 |
| FRNet | 96.80 | 80.04 | 67.55 | 47.61 | 62.15 | 57.08 | 56.80 | 62.54 | 40.94 | 58.11 | 47.30 |
| KPConv | 99.54 | 82.90 | 62.17 | 54.46 | 57.70 | 54.15 | 25.70 | 57.35 | 53.38 | 55.64 | 53.91 |
| PIDS<sub>NAS1.25x</sub> | 104.13 | 77.94 | 63.25 | 47.90 | 54.48 | 48.86 | 22.97 | 54.93 | 56.70 | 55.81 | 52.72 |
| PIDS<sub>NAS2.0x</sub> | 101.20 | 78.42 | 64.55 | 51.19 | 55.97 | 51.11 | 22.49 | 56.95 | 57.41 | 55.55 | 54.27 |
| WaffleIron | 109.54 | 72.18 | 66.04 | 45.52 | 58.55 | 49.30 | 33.02 | 59.28 | 22.48 | 58.55 | 54.62 |
| PolarNet | 118.56 | 74.98 | 58.17 | 38.74 | 50.73 | 49.42 | 41.77 | 54.10 | 25.79 | 48.96 | 39.44 |
| <sup>:star:</sup>MinkUNet<sub>18</sub> | 100.00 | 81.90 | 62.76 | 55.87 | 53.99 | 53.28 | 32.92 | 56.32 | 58.34 | 54.43 | 46.05 |
| MinkUNet<sub>34</sub> | 100.61 | 80.22 | 63.78 | 53.54 | 54.27 | 50.17 | 33.80 | 57.35 | 58.38 | 54.88 | 46.95 |
| Cylinder3D<sub>SPC</sub> | 103.25 | 80.08 | 63.42 | 37.10 | 57.45 | 46.94 | 52.45 | 57.64 | 55.98 | 52.51 | 46.22 |
| Cylinder3D<sub>TSC</sub> | 103.13 | 83.90 | 61.00 | 37.11 | 53.40 | 45.39 | 58.64 | 56.81 | 53.59 | 54.88 | 49.62 |
| SPVCNN<sub>18</sub> | 100.30 | 82.15 | 62.47 | 55.32 | 53.98 | 51.42 | 34.53 | 56.67 | 58.10 | 54.60 | 45.95 |
| SPVCNN<sub>34</sub> | 99.16 | 82.01 | 63.22 | 56.53 | 53.68 | 52.35 | 34.39 | 56.76 | 59.00 | 54.97 | 47.07 |
| RPVNet | 111.74 | 73.86 | 63.75 | 47.64 | 53.54 | 51.13 | 47.29 | 53.51 | 22.64 | 54.79 | 46.17 |
| CPGNet | 107.34 | 81.05 | 61.50 | 37.79 | 57.39 | 51.26 | 59.05 | 60.29 | 18.50 | 56.72 | 57.79 |
| 2DPASS | 106.14 | 77.50 | 64.61 | 40.46 | 60.68 | 48.53 | 57.80 | 58.78 | 28.46 | 55.84 | 50.01 |
| GFNet | 108.68 | 77.92 | 63.00 | 42.04 | 56.57 | 56.71 | 58.59 | 56.95 | 17.14 | 55.23 | 49.48 |

Note: Symbol <sup>:star:</sup> denotes the baseline model adopted in mCE calculation.

:blue_car:  nuScenes-C

<p align="center"> <img src="docs/figs/stat/metrics_nusc_seg.png" align="center" width="100%"> </p>
| Model | mCE (%) | mRR (%) | Clean | Fog | Wet Ground | Snow | Motion Blur | Beam Missing | Cross-Talk | Incomplete Echo | Cross-Sensor |
|:-|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
| FIDNet<sub>34</sub> | 122.42 | 73.33 | 71.38 | 64.80 | 68.02 | 58.97 | 48.90 | 48.14 | 57.45 | 48.76 | 23.70 |
| CENet<sub>34</sub> | 112.79 | 76.04 | 73.28 | 67.01 | 69.87 | 61.64 | 58.31 | 49.97 | 60.89 | 53.31 | 24.78 |
| FRNet | 98.63 | 77.48 | 77.65 | 69.14 | 76.58 | 69.49 | 54.49 | 68.32 | 41.43 | 58.74 | 43.13 |
| WaffleIron | 106.73 | 72.78 | 76.07 | 56.07 | 73.93 | 49.59 | 59.46 | 65.19 | 33.12 | 61.51 | 44.01 |
| PolarNet | 115.09 | 76.34 | 71.37 | 58.23 | 69.91 | 64.82 | 44.60 | 61.91 | 40.77 | 53.64 | 42.01 |
| <sup>:star:</sup>MinkUNet<sub>18</sub> | 100.00 | 74.44 | 75.76 | 53.64 | 73.91 | 40.35 | 73.39 | 68.54 | 26.58 | 63.83 | 50.95 |
| MinkUNet<sub>34</sub> | 96.37 | 75.08 | 76.90 | 56.91 | 74.93 | 37.50 | 75.24 | 70.10 | 29.32 | 64.96 | 52.96 |
| Cylinder3D<sub>SPC</sub> | 111.84 | 72.94 | 76.15 | 59.85 | 72.69 | 58.07 | 42.13 | 64.45 | 44.44 | 60.50 | 42.23 |
| Cylinder3D<sub>TSC</sub> | 105.56 | 78.08 | 73.54 | 61.42 | 71.02 | 58.40 | 56.02 | 64.15 | 45.36 | 59.97 | 43.03 |
| SPVCNN<sub>18</sub> | 106.65 | 74.70 | 74.40 | 59.01 | 72.46 | 41.08 | 58.36 | 65.36 | 36.83 | 62.29 | 49.21 |
| SPVCNN<sub>34</sub> | 97.45 | 75.10 | 76.57 | 55.86 | 74.04 | 41.95 | 74.63 | 68.94 | 28.11 | 64.96 | 51.57 |
| 2DPASS | 98.56 | 75.24 | 77.92 | 64.50 | 76.76 | 54.46 | 62.04 | 67.84 | 34.37 | 63.19 | 45.83 |
| GFNet | 92.55 | 83.31 | 76.79 | 69.59 | 75.52 | 71.83 | 59.43 | 64.47 | 66.78 | 61.86 | 42.30 |

Note: Symbol <sup>:star:</sup> denotes the baseline model adopted in mCE calculation.

:taxi:  WOD-C

<p align="center"> <img src="docs/figs/stat/metrics_wod_seg.png" align="center" width="100%"> </p>
| Model | mCE (%) | mRR (%) | Clean | Fog | Wet Ground | Snow | Motion Blur | Beam Missing | Cross-Talk | Incomplete Echo | Cross-Sensor |
|:-|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
| <sup>:star:</sup>MinkUNet<sub>18</sub> | 100.00 | 91.22 | 69.06 | 66.99 | 60.99 | 57.75 | 68.92 | 64.15 | 65.37 | 63.36 | 56.44 |
| MinkUNet<sub>34</sub> | 96.21 | 91.80 | 70.15 | 68.31 | 62.98 | 57.95 | 70.10 | 65.79 | 66.48 | 64.55 | 59.02 |
| Cylinder3D<sub>TSC</sub> | 106.02 | 92.39 | 65.93 | 63.09 | 59.40 | 58.43 | 65.72 | 62.08 | 62.99 | 60.34 | 55.27 |
| SPVCNN<sub>18</sub> | 103.60 | 91.60 | 67.35 | 65.13 | 59.12 | 58.10 | 67.24 | 62.41 | 65.46 | 61.79 | 54.30 |
| SPVCNN<sub>34</sub> | 98.72 | 92.04 | 69.01 | 67.10 | 62.41 | 57.57 | 68.92 | 64.67 | 64.70 | 64.14 | 58.63 |

Note: Symbol <sup>:star:</sup> denotes the baseline model adopted in mCE calculation.

3D Object Detection

The mean average precision (mAP) and the nuScenes detection score (NDS) are consistently used as the main indicators for evaluating model performance in our 3D object detection benchmark. The same two metrics as above, the mean Corruption Error (mCE) and the mean Resilience Rate (mRR), are adopted to compare models' robustness.

:red_car:  KITTI-C

<p align="center"> <img src="docs/figs/stat/metrics_kittic.png" align="center" width="100%"> </p>
| Model | mCE (%) | mRR (%) | Clean | Fog | Wet Ground | Snow | Motion Blur | Beam Missing | Cross-Talk | Incomplete Echo | Cross-Sensor |
|:-|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
| PointPillars | 110.67 | 74.94 | 66.70 | 45.70 | 66.71 | 35.77 | 47.09 | 52.24 | 60.01 | 54.84 | 37.50 |
| SECOND | 95.93 | 82.94 | 68.49 | 53.24 | 68.51 | 54.92 | 49.19 | 54.14 | 67.19 | 59.25 | 48.00 |
| PointRCNN | 91.88 | 83.46 | 70.26 | 56.31 | 71.82 | 50.20 | 51.52 | 56.84 | 65.70 | 62.02 | 54.73 |
| PartA2<sub>Free</sub> | 82.22 | 81.87 | 76.28 | 58.06 | 76.29 | 58.17 | 55.15 | 59.46 | 75.59 | 65.66 | 51.22 |
| PartA2<sub>Anchor</sub> | 88.62 | 80.67 | 73.98 | 56.59 | 73.97 | 51.32 | 55.04 | 56.38 | 71.72 | 63.29 | 49.15 |
| PVRCNN | 90.04 | 81.73 | 72.36 | 55.36 | 72.89 | 52.12 | 54.44 | 56.88 | 70.39 | 63.00 | 48.01 |
| <sup>:star:</sup>CenterPoint | 100.00 | 79.73 | 68.70 | 53.10 | 68.71 | 48.56 | 47.94 | 49.88 | 66.00 | 58.90 | 45.12 |
| SphereFormer | - | - | - | - | - | - | - | - | - | - | - |

Note: Symbol <sup>:star:</sup> denotes the baseline model adopted in mCE calculation.

:blue_car:  nuScenes-C

<p align="center"> <img src="docs/figs/stat/metrics_nusc_det.png" align="center" width="100%"> </p>
| Model | mCE (%) | mRR (%) | Clean | Fog | Wet Ground | Snow | Motion Blur | Beam Missing | Cross-Talk | Incomplete Echo | Cross-Sensor |
|:-|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
| PointPillars<sub>MH</sub> | 102.90 | 77.24 | 43.33 | 33.16 | 42.92 | 29.49 | 38.04 | 33.61 | 34.61 | 30.90 | 25.00 |
| SECOND<sub>MH</sub> | 97.50 | 76.96 | 47.87 | 38.00 | 47.59 | 33.92 | 41.32 | 35.64 | 40.30 | 34.12 | 23.82 |
| <sup>:star:</sup>CenterPoint | 100.00 | 76.68 | 45.99 | 35.01 | 45.41 | 31.23 | 41.79 | 35.16 | 35.22 | 32.53 | 25.78 |
| CenterPoint<sub>LR</sub> | 98.74 | 72.49 | 49.72 | 36.39 | 47.34 | 32.81 | 40.54 | 34.47 | 38.11 | 35.50 | 23.16 |
| CenterPoint<sub>HR</sub> | 95.80 | 75.26 | 50.31 | 39.55 | 49.77 | 34.73 | 43.21 | 36.21 | 40.98 | 35.09 | 23.38 |
| SphereFormer | - | - | - | - | - | - | - | - | - | - | - |

Note: Symbol <sup>:star:</sup> denotes the baseline model adopted in mCE calculation.

:taxi:  WOD-C

<p align="center"> <img src="docs/figs/stat/metrics_wod_det.png" align="center" width="100%"> </p>
| Model | mCE (%) | mRR (%) | Clean | Fog | Wet Ground | Snow | Motion Blur | Beam Missing | Cross-Talk | Incomplete Echo | Cross-Sensor |
|:-|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
| PointPillars | 127.53 | 81.23 | 50.17 | 31.24 | 49.75 | 46.07 | 34.93 | 43.93 | 39.80 | 43.41 | 36.67 |
| SECOND | 121.43 | 81.12 | 53.37 | 32.89 | 52.99 | 47.20 | 35.98 | 44.72 | 49.28 | 46.84 | 36.43 |
| PVRCNN | 104.90 | 82.43 | 61.27 | 37.32 | 61.27 | 60.38 | 42.78 | 49.53 | 59.59 | 54.43 | 38.73 |
| <sup>:star:</sup>CenterPoint | 100.00 | 83.30 | 63.59 | 43.06 | 62.84 | 58.59 | 43.53 | 54.41 | 60.32 | 57.01 | 43.98 |
| PVRCNN++ | 91.60 | 84.14 | 67.45 | 45.50 | 67.18 | 62.71 | 47.35 | 57.83 | 64.71 | 60.96 | 47.77 |
| SphereFormer | - | - | - | - | - | - | - | - | - | - | - |

Note: Symbol <sup>:star:</sup> denotes the baseline model adopted in mCE calculation.

:vertical_traffic_light: More Benchmarking Results

For more detailed experimental results and visual comparisons, please refer to RESULTS.md.

Create Corruption Set

You can create your own "Robo3D" corruption sets on other LiDAR-based point cloud datasets using our defined corruption types! Follow the instructions listed in CREATE.md.
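To give a flavour of what such a corruption looks like, here is a stand-alone sketch of a "beam missing"-style corruption that randomly drops a subset of LiDAR beams from a point cloud. It is only an illustration of the idea; the calibrated corruption generators (including severity levels and physically based weather simulation) are the ones documented in CREATE.md, and the function and variable names below are ours.

```python
import numpy as np

def drop_beams(points: np.ndarray, beam_ids: np.ndarray, num_keep: int,
               rng: np.random.Generator) -> np.ndarray:
    """Illustrative 'beam missing'-style corruption: keep only `num_keep`
    randomly chosen beams of an (N, 4) point cloud.

    `beam_ids` maps every point to its laser ring/beam; some datasets ship
    this index directly, otherwise it can be estimated from elevation angles.
    """
    kept = rng.choice(np.unique(beam_ids), size=num_keep, replace=False)
    return points[np.isin(beam_ids, kept)]

# Toy usage with a random cloud and a fake 64-beam assignment.
rng = np.random.default_rng(0)
points = rng.standard_normal((100_000, 4)).astype(np.float32)
beam_ids = rng.integers(0, 64, size=len(points))
corrupted = drop_beams(points, beam_ids, num_keep=16, rng=rng)
print(points.shape, "->", corrupted.shape)
```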

TODO List

Citation

If you find this work helpful, please kindly consider citing our paper:

@inproceedings{kong2023robo3d,
    author = {Lingdong Kong and Youquan Liu and Xin Li and Runnan Chen and Wenwei Zhang and Jiawei Ren and Liang Pan and Kai Chen and Ziwei Liu},
    title = {Robo3D: Towards Robust and Reliable 3D Perception against Corruptions},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    pages = {19994--20006},
    year = {2023},
}
@misc{kong2023robo3d_benchmark,
  title = {The Robo3D Benchmark for Robust and Reliable 3D Perception},
  author = {Lingdong Kong and Youquan Liu and Xin Li and Runnan Chen and Wenwei Zhang and Jiawei Ren and Liang Pan and Kai Chen and Ziwei Liu},
  howpublished = {\url{https://github.com/ldkong1205/Robo3D}},
  year = {2023},
}

License

<a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-nc-sa/4.0/80x15.png" /></a> <br /> This work is under the <a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/4.0/">Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License</a>, while some specific operations in this codebase might be with other licenses. Please refer to LICENSE.md for a more careful check, if you are using our code for commercial matters.

Acknowledgements

This work is developed based on the MMDetection3D codebase.

<img src="https://github.com/open-mmlab/mmdetection3d/blob/main/resources/mmdet3d-logo.png" width="30%"/><br> MMDetection3D is an open source object detection toolbox based on PyTorch, towards the next-generation platform for general 3D detection. It is a part of the OpenMMLab project developed by MMLab.

:heart: We thank Jiangmiao Pang and Tai Wang for their insightful discussions and feedback. We thank the OpenDataLab platform for hosting our datasets.