FlashOcc: Fast and Memory-Efficient Occupancy Prediction via Channel-to-Height Plugin
<div align="left"> <img src="figs/performance_flashocc.jpg" width="1500px" /> </div><br/>

* Please note that the FPS here is measured with RTX3090 TensorRT FP16.

Panoptic-FlashOcc: An Efficient Baseline to Marry Semantic Occupancy with Panoptic via Instance Center
<div align="left"> <img src="figs/performance.png" width="1500px"/> </div><br/>

* Please note that the FPS here is measured with A100 GPU (PyTorch fp32 backend).

News
- 2024.09.16 Technical Report: FlashOcc can be inserted into BEVDet at a cost of 1.1 ms, with the two tasks benefiting each other.
- 2024.09.16 Selected as the reference algorithm for occupancy on Horizon J6E/M.
- 2024.06.10 Release the code for Panoptic-FlashOCC
- 2024.04.17 Support for ray-iou metric
- 2024.03.22 Release the code for FlashOCCV2
- 2024.02.03 Release the training code for FlashOcc on UniOcc
- 2024.01.20 TensorRT implementation written in C++ with CUDA acceleration
- 2023.12.23 Release the quick testing code via TensorRT in MMDeploy.
- 2023.11.28 Release the training code for FlashOCC.
This repository is an official implementation of FlashOCC.
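The channel-to-height idea named in the title can be illustrated with a small, dependency-free sketch: a 2D BEV feature map whose channel axis has length `C * Z` is reinterpreted as `C` channels over `Z` height bins, so no 3D convolution is needed to obtain a voxel-shaped output. The function name `channel_to_height`, the nested-list representation, and the channel-major ordering are assumptions made for illustration only; the code in this repository may order the axes differently, and in a tensor framework the whole step is a single reshape.

```python
def channel_to_height(bev, num_z):
    """Split the channel axis of a BEV feature map into (channels, height).

    bev:   nested list shaped [C * num_z][H][W]  (hypothetical layout)
    num_z: number of height bins Z
    Returns a nested list shaped [C][num_z][H][W].
    """
    total_channels = len(bev)
    if total_channels % num_z != 0:
        raise ValueError("channel count must be divisible by num_z")
    num_c = total_channels // num_z
    # Channel index c * num_z + z maps to (c, z), mirroring a row-major reshape.
    return [[bev[c * num_z + z] for z in range(num_z)] for c in range(num_c)]


# Toy input: 6 channels on a 1x1 BEV grid, interpreted as 2 channels x 3 height bins.
toy_bev = [[[float(i)]] for i in range(6)]
voxels = channel_to_height(toy_bev, num_z=3)
print(len(voxels), len(voxels[0]))  # number of channels, number of height bins
```

No values are copied or recomputed, only regrouped, so the step adds almost no cost on top of the 2D network.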
<div align="center"> <img src="figs/overview.png"/> </div><br/>
<div align="center"> <img src="figs/panoptic_flashOcc.jpg"/> </div><br/>

Main Results
1. FlashOCC
Config | Backbone | Input <br/>Size | mIoU | FPS<br/>(Hz) | Flops<br/>(G) | Params<br/>(M) | Model | Log |
---|---|---|---|---|---|---|---|---|
BEVDetOCC (1f) | R50 | 256x704 | 31.60 | 92.1 | 241.76 | 29.02 | gdrive | log |
M0: FlashOCC (1f) | R50 | 256x704 | 31.95 | 197.6 | 154.1 | 39.94 | gdrive | log |
M1: FlashOCC (1f) | R50 | 256x704 | 32.08 | 152.7 | 248.57 | 44.74 | gdrive | log |
BEVDetOCC-4D-Stereo (2f) | R50 | 256x704 | 36.1 | - | - | - | baidu | log |
M2: FlashOCC-4D-Stereo (2f) | R50 | 256x704 | 37.84 | - | - | - | gdrive | log |
BEVDetOCC-4D-Stereo (2f) | Swin-T | 512x1408 | 42.0 | - | - | - | baidu | log |
M3: FlashOCC-4D-Stereo (2f) | Swin-T | 512x1408 | 43.52 | - | 1490.77 | 144.99 | gdrive | log |
FPS is measured via TensorRT on an RTX 3090 with FP16 precision. Please refer to Tab. 2 in the paper for the detailed model settings of each M-number.
2. Panoptic-FlashOCC
In Panoptic-FlashOCC, we have made the following four adjustments to FlashOCC:
- Training without the camera mask. Using the mask significantly improves prediction in the visible region, but at the expense of prediction in the invisible region.
- Using category balancing.
- Using stronger loss settings.
- Introducing instance centers for panoptic occupancy.
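To make the first two adjustments concrete, the following is a minimal sketch of a per-voxel cross-entropy with an optional camera mask and optional per-class weights. It is not the loss used in this repository: the function name `voxel_loss`, its arguments, and the weighted-mean normalization are all assumptions chosen for illustration. It shows only that masked training ignores invisible voxels entirely, while class weights let rare categories contribute more.

```python
import math


def voxel_loss(probs, labels, camera_mask=None, class_weights=None):
    """Weighted mean negative log-likelihood over voxels (illustrative only).

    probs:         list of per-voxel class-probability lists
    labels:        list of integer class ids, one per voxel
    camera_mask:   optional list of bools, True = voxel visible to a camera
    class_weights: optional list of per-class weights (hypothetical balancing)
    """
    total, norm = 0.0, 0.0
    for i, (p, y) in enumerate(zip(probs, labels)):
        # With a camera mask, invisible voxels contribute nothing to the loss.
        if camera_mask is not None and not camera_mask[i]:
            continue
        w = 1.0 if class_weights is None else class_weights[y]
        total += -w * math.log(p[y])
        norm += w
    return total / norm if norm > 0 else 0.0


# Three voxels, two classes; the third voxel is invisible and poorly predicted.
probs = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
labels = [0, 1, 0]
mask = [True, True, False]

print(voxel_loss(probs, labels, camera_mask=mask))  # supervises visible voxels only
print(voxel_loss(probs, labels))                    # supervises every voxel
```

In this toy case the unmasked loss is larger because the uncertain invisible voxel is now penalized, which is the mechanism behind the trade-off described in the first bullet.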
More results for different configurations will be released soon.
Config | Backbone | Input <br/>Size | RayIoU | RayPQ | mIoU | FPS<br/>(Hz) | Flops<br/>(G) | Params<br/>(M) | Model | Log |
---|---|---|---|---|---|---|---|---|---|---|
M1: FlashOCC (1f) | R50 | 256x704 | - | - | 15.41 | - | 248.57 | 44.74 | gdrive | log |
Panoptic-FlashOCC-Depth-tiny (1f) | R50 | 256x704 | 34.57 | - | 28.83 | 43.9 | 175.00 | 45.32 | gdrive | log |
Panoptic-FlashOCC-Depth-tiny-Pano (1f) | R50 | 256x704 | 34.81 | 12.9 | 29.14 | 39.8 | 175.00 | 45.32 | gdrive | log |
Panoptic-FlashOCC-Depth (1f) | R50 | 256x704 | 34.93 | - | 28.91 | 38.7 | 269.47 | 50.12 | gdrive | log |
Panoptic-FlashOCC-Depth-Pano (1f) | R50 | 256x704 | 35.22 | 13.2 | 29.39 | 35.2 | 269.47 | 50.12 | gdrive | log |
Panoptic-FlashOCC-4D-Depth (2f) | R50 | 256x704 | 35.99 | - | 29.57 | 35.9 | - | - | gdrive | log |
Panoptic-FlashOCC-4D-Depth-Pano (2f) | R50 | 256x704 | 36.76 | 14.5 | 30.31 | 30.4 | - | - | gdrive | log |
Panoptic-FlashOCC-4DLongterm-Depth (8f) | R50 | 256x704 | 38.51 | - | 31.49 | 35.6 | - | - | gdrive | log |
Panoptic-FlashOCC-4DLongterm-Depth-Pano (8f) | R50 | 256x704 | 38.50 | 16.0 | 31.57 | 30.2 | - | - | gdrive | log |
- Please note that the FPS here is measured with A100 GPU (PyTorch fp32 backend).
Get Started
Backend | mIoU | FPS(Hz) |
---|---|---|
PyTorch-FP32 | 31.95 | - |
TRT-FP32 | 30.78 | 96.2 |
TRT-FP16 | 30.78 | 197.6 |
TRT-FP16+INT8(PTQ) | 29.60 | 383.7 |
TRT-INT8(PTQ) | 29.59 | 397.0 |
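The speed and accuracy trade-off in the table above can be read off with a few lines of arithmetic. The sketch below copies the table values and reports each backend's speedup relative to TRT-FP32 and its mIoU drop relative to PyTorch-FP32. The helper name `compare_backends` and the choice of baselines are assumptions made for illustration.

```python
# (backend, mIoU, FPS) rows copied from the table above; FPS is None where not reported.
BACKENDS = [
    ("PyTorch-FP32", 31.95, None),
    ("TRT-FP32", 30.78, 96.2),
    ("TRT-FP16", 30.78, 197.6),
    ("TRT-FP16+INT8(PTQ)", 29.60, 383.7),
    ("TRT-INT8(PTQ)", 29.59, 397.0),
]


def compare_backends(rows, fps_baseline="TRT-FP32", miou_baseline="PyTorch-FP32"):
    """Return {backend: (speedup vs fps_baseline, mIoU drop vs miou_baseline)}."""
    by_name = {name: (miou, fps) for name, miou, fps in rows}
    base_fps = by_name[fps_baseline][1]
    base_miou = by_name[miou_baseline][0]
    report = {}
    for name, miou, fps in rows:
        speedup = None if fps is None else fps / base_fps
        report[name] = (speedup, base_miou - miou)
    return report


for name, (speedup, drop) in compare_backends(BACKENDS).items():
    speed_text = "n/a" if speedup is None else f"{speedup:.2f}x"
    print(f"{name:<20} speedup={speed_text:<6} mIoU drop={drop:.2f}")
```

On these numbers, TRT-FP16 is roughly 2.1x faster than TRT-FP32 at the same mIoU, while full INT8 post-training quantization reaches roughly 4.1x at a cost of about 2.4 mIoU relative to PyTorch-FP32.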
- [flashocc]: A detailed video can be found at baidu.
- [panoptic-flashocc]: The first row is our prediction and the second row is the ground truth.
Acknowledgement
Many thanks to the authors of BEVDet, FB-BEV, RenderOcc and SparseBEV.
Bibtex
If this work is helpful for your research, please consider citing the following BibTeX entry.
@article{yu2024ultimatedo,
title={UltimateDO: An Efficient Framework to Marry Occupancy Prediction with 3D Object Detection via Channel2height},
author={Yu, Zichen and Shu, Changyong},
journal={arXiv preprint arXiv:2409.11160},
year={2024}
}
@article{yu2024panoptic,
title={Panoptic-FlashOcc: An Efficient Baseline to Marry Semantic Occupancy with Panoptic via Instance Center},
author={Yu, Zichen and Shu, Changyong and Sun, Qianpu and Linghu, Junjie and Wei, Xiaobao and Yu, Jiangyong and Liu, Zongdai and Yang, Dawei and Li, Hui and Chen, Yan},
journal={arXiv preprint arXiv:2406.10527},
year={2024}
}
@article{yu2023flashocc,
title={FlashOcc: Fast and Memory-Efficient Occupancy Prediction via Channel-to-Height Plugin},
author={Zichen Yu and Changyong Shu and Jiajun Deng and Kangjie Lu and Zongdai Liu and Jiangyong Yu and Dawei Yang and Hui Li and Yan Chen},
year={2023},
eprint={2311.12058},
archivePrefix={arXiv},
primaryClass={cs.CV}
}