<div align='center'> <img src="assets/title_gemap.png" width="88%" height="auto"></img> </div> <div align="center"> <h3>[ECCV'24] Online Vectorized HD Map Construction Using Geometry </h3>Zhixin Zhang<sup>1</sup>, Yiyuan Zhang<sup>2</sup>, Xiaohan Ding<sup>3</sup>, Fusheng Jin<sup>1*</sup>, Xiangyu Yue<sup>2</sup>
<sup>1</sup>Beijing Institute of Technology, <sup>2</sup>CUHK, <sup>3</sup>Tencent AI Lab
Website | arXiv | YouTube | Bilibili | Zhihu
</div> <div align='center'> <img src='assets/demo_x0.5.gif' alt='framework' width='88%' height='auto'></img> </div>
News
We're working on more powerful and efficient models. Please stay tuned.
- (2024/7/2) GeMap was accepted to ECCV 2024, and we released a new GeMap model with 76.0 mAP.
- (2023/12/7) We released the first version of GeMap (with pre-trained checkpoints and evaluation).
- (2023/12/7) GeMap is released on arXiv.
Motivation
- Recent efforts have built strong baselines for the online vectorized HD map construction task. However, the shapes and relations of instances in urban road systems, such as parallelism, perpendicularity, and rectangular shapes, are still under-explored.
- As the ego vehicle moves, the shape of a specific instance or the relations between two instances will remain unchanged. To accurately represent such geometric features, invariance to rigid transformation is a fundamental property.
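The invariance argument above can be checked numerically. The following is a minimal sketch, not GeMap's implementation: the rectangle, the helper names (`rigid_transform`, `edge_lengths`, `turning_angles`), and the chosen rotation and translation are all hypothetical. It shows that raw coordinates change under a rigid transformation while segment lengths and the angles between consecutive segments do not.

```python
import math

def rigid_transform(points, theta, tx, ty):
    """Rotate 2D points by theta (radians), then translate by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]

def edge_lengths(points):
    """Lengths of consecutive segments of a polyline."""
    return [math.dist(p, q) for p, q in zip(points, points[1:])]

def turning_angles(points):
    """Signed angles (radians) between consecutive segments."""
    angles = []
    for a, b, c in zip(points, points[1:], points[2:]):
        ux, uy = b[0] - a[0], b[1] - a[1]
        vx, vy = c[0] - b[0], c[1] - b[1]
        # atan2(cross, dot) gives the signed angle from u to v.
        angles.append(math.atan2(ux * vy - uy * vx, ux * vx + uy * vy))
    return angles

def close(xs, ys, tol=1e-9):
    """Element-wise comparison of two equal-length float sequences."""
    return len(xs) == len(ys) and all(abs(x - y) <= tol for x, y in zip(xs, ys))

# Hypothetical rectangular pedestrian crossing as a closed polyline (metres).
crossing = [(0.0, 0.0), (4.0, 0.0), (4.0, 2.0), (0.0, 2.0), (0.0, 0.0)]
# The same crossing seen from a different ego pose (assumed values).
moved = rigid_transform(crossing, theta=math.radians(37.0), tx=12.5, ty=-3.0)

coords_changed = not close([v for p in crossing for v in p],
                           [v for p in moved for v in p])
lengths_kept = close(edge_lengths(crossing), edge_lengths(moved))
angles_kept = close(turning_angles(crossing), turning_angles(moved))
print(coords_changed, lengths_kept, angles_kept)  # True True True
```

A loss defined on absolute coordinates would treat `crossing` and `moved` as different targets, whereas one defined on lengths and angles sees the same shape.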
Highlights
This work contributes from two perspectives:
- GeMap achieves new state-of-the-art performance on the NuScenes and Argoverse 2 datasets. Notably, it reaches 71.8% mAP on the large-scale Argoverse 2 dataset, outperforming MapTR V2 by 4.4% and surpassing the 70% mAP threshold for the first time.
- GeMap learns Euclidean shapes and relations of map instances end to end, beyond basic perception. Specifically, we design a geometric loss based on angle and distance clues, which is robust to rigid transformations. We also decouple self-attention to handle Euclidean shapes and relations independently.
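One way to picture the decoupled self-attention is as two complementary attention masks over the point tokens of all map instances. The sketch below is an illustration under assumptions, not the code in this repository: the function name `decoupled_masks`, the token layout (points grouped contiguously by instance), and the convention that `True` means "may attend" are all hypothetical.

```python
def decoupled_masks(num_instances, points_per_instance):
    """Build two boolean attention masks over instance-point tokens.

    Token k is assumed to belong to instance k // points_per_instance.
    In both masks, True means "query i may attend to key j".
    """
    n = num_instances * points_per_instance
    owner = [k // points_per_instance for k in range(n)]
    # Shape attention: tokens only see tokens of the same instance.
    shape_mask = [[owner[i] == owner[j] for j in range(n)] for i in range(n)]
    # Relation attention: tokens only see tokens of other instances.
    relation_mask = [[owner[i] != owner[j] for j in range(n)] for i in range(n)]
    return shape_mask, relation_mask

shape_mask, relation_mask = decoupled_masks(num_instances=2, points_per_instance=3)
for row in shape_mask:
    print("".join("x" if allowed else "." for allowed in row))
# xxx...
# xxx...
# xxx...
# ...xxx
# ...xxx
# ...xxx
```

The shape mask is block-diagonal and the relation mask is its complement. If such masks were passed to PyTorch's `torch.nn.MultiheadAttention`, they would need to be inverted first, because a boolean `attn_mask` there marks positions that are not allowed to attend.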
Quantitative Results
NuScenes
Model | Objective | Backbone | Epoch | mAP | FPS | Config / Log | Checkpoint |
---|---|---|---|---|---|---|---|
GeMap | simple | R50 | 110 | 62.7 | 15.6 | config/log | model |
GeMap | simple | Camera(R50) & LiDAR(SEC) | 110 | 66.5 | 6.8 | config/log | model |
GeMap | full | R50 | 110 | 69.4 | 13.3 | config/log | model |
GeMap | full | Swin-T | 110 | 72.0 | 10.0 | config/log | model |
GeMap | full | V2-99 | 110 | 72.2 | 9.5 | config/log | model |
GeMap | full | V2-99(DD3D) | 110 | 76.0 | 9.5 | config/log | model |
Argoverse 2
Model | Objective | Backbone | Epoch | mAP | FPS | Config / Log | Checkpoint |
---|---|---|---|---|---|---|---|
GeMap | simple | R50 | 6 | 63.9 | 13.5 | config/log | model |
GeMap | simple | R50 | 24 | 68.2 | 13.5 | config/log | model |
GeMap | full | R50 | 24 | 71.8 | 12.1 | config/log | model |
* All models are trained on 8 NVIDIA RTX 3090 GPUs. Speed (frames per second, FPS) is measured on a single RTX 3090 GPU.
Visualization Results
Comparison Video
GeMap exhibits more robust predictions in occluded and rotated scenarios, especially under rainy weather conditions.
<div align='center'> <video src='https://github.com/cnzzx/GeMap-dev/assets/71703448/f5213adb-15a3-49a4-94c1-f4fe8e43babd.mp4' width='88%' height='auto'></video> </div>
More Cases of GeMap
<div align='center'> <img src="assets/doc_pres.png" width="88%" height="auto"></img> </div>
Getting Started
TODO
- Faster implementation for inference of GeMap.
- More powerful LiDAR and Camera + LiDAR models.
- Lighter and faster models with 30+ FPS.
Acknowledgements
GeMap is based on mmdetection3d. It is also greatly inspired by the following outstanding contributions to the open-source community: LSS, GKT, Swin-Transformer, VoVNet, BEVFormer, MapTR, BeMapNet, HDMapNet.
Citation
If the paper and code help your research, please kindly cite:
@article{zhang2023online,
title={Online Vectorized HD Map Construction using Geometry},
author={Zhang, Zhixin and Zhang, Yiyuan and Ding, Xiaohan and Jin, Fusheng and Yue, Xiangyu},
journal={arXiv preprint arXiv:2312.03341},
year={2023}
}