BEVDepth

BEVDepth is a new 3D object detector with trustworthy depth estimation. For more details, please refer to our paper on arXiv.

<img src="assets/bevdepth.png" width="1000" >

BEVStereo

BEVStereo is a new multi-view 3D object detector that uses temporal stereo to enhance depth estimation.

<img src="assets/bevstereo.png" width="1000" >

MatrixVT

MatrixVT is a novel view transformer for the BEV paradigm that is highly efficient and requires no customized operators. For more details, please refer to our paper on arXiv. Try MatrixVT on CPU by running this file!

<img src="assets/matrixvt.jpg" width="1000" >

Quick Start

Installation

Step 0. Install PyTorch (v1.9.0).

Step 1. Install MMDetection3D (v1.0.0rc4).

Step 2. Install requirements.

pip install -r requirements.txt

Step 3. Install BEVDepth (GPU required).

python setup.py develop
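Before building, it can help to confirm that the pinned versions from Steps 0 and 1 are the ones actually installed. The following is a minimal sketch, not part of this repository; it assumes the distributions are registered under the names `torch` and `mmdet3d`, and the helper names are hypothetical.

```python
from importlib import metadata

# Versions pinned in the installation steps; the distribution names are assumptions.
REQUIRED = {"torch": "1.9.0", "mmdet3d": "1.0.0rc4"}


def installed_version(dist_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def check_requirements(required=REQUIRED, lookup=installed_version):
    """Map each package name to a tuple (expected, found, ok)."""
    report = {}
    for name, expected in required.items():
        found = lookup(name)
        # Ignore local build tags such as "+cu111" when comparing versions.
        ok = found is not None and found.split("+")[0] == expected
        report[name] = (expected, found, ok)
    return report


if __name__ == "__main__":
    for name, (expected, found, ok) in check_requirements().items():
        status = "OK" if ok else "MISMATCH"
        print(f"{name}: expected {expected}, found {found} [{status}]")
```

A mismatch here does not necessarily mean the build will fail, but these are the versions the steps above were written for.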

Data preparation

Step 0. Download nuScenes official dataset.

Step 1. Symlink the dataset root to ./data/.

ln -s [nuscenes root] ./data/

The directory structure should look as follows.

BEVDepth
├── data
│   ├── nuScenes
│   │   ├── maps
│   │   ├── samples
│   │   ├── sweeps
│   │   ├── v1.0-test
│   │   ├── v1.0-trainval
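A quick way to confirm the symlink resolved to the layout above is to check for each expected sub-directory. This is a small sketch under the assumption that it is run from the repository root; the function name is hypothetical and not part of this repository.

```python
from pathlib import Path

# Sub-directories listed in the directory tree above.
EXPECTED_SUBDIRS = ("maps", "samples", "sweeps", "v1.0-test", "v1.0-trainval")


def missing_nuscenes_dirs(data_root="data/nuScenes"):
    """Return the expected sub-directories that are absent under data_root."""
    root = Path(data_root)
    # is_dir() follows symlinks, so a linked dataset root is handled as well.
    return [name for name in EXPECTED_SUBDIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_nuscenes_dirs()
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("Dataset layout looks complete.")
```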

Step 2. Prepare infos.

python scripts/gen_info.py

Tutorials

Train.

python [EXP_PATH] --amp_backend native -b 8 --gpus 8

Eval.

python [EXP_PATH] --ckpt_path [CKPT_PATH] -e -b 8 --gpus 8
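When launching runs from a script, the two commands above can be assembled programmatically. The sketch below only rearranges the flags already shown in this section; `build_command` is a hypothetical helper, and `path/to/exp.py` and `path/to/model.ckpt` are placeholder paths standing in for `[EXP_PATH]` and `[CKPT_PATH]`.

```python
import shlex


def build_command(exp_path, batch_size=8, gpus=8, ckpt_path=None):
    """Assemble the train command, or the eval command when ckpt_path is given."""
    cmd = ["python", exp_path]
    if ckpt_path is None:
        # Training flags as shown in the Train step.
        cmd += ["--amp_backend", "native"]
    else:
        # Evaluation flags as shown in the Eval step.
        cmd += ["--ckpt_path", ckpt_path, "-e"]
    cmd += ["-b", str(batch_size), "--gpus", str(gpus)]
    return cmd


if __name__ == "__main__":
    # Placeholder paths; substitute a real experiment file and checkpoint.
    print(shlex.join(build_command("path/to/exp.py")))
    print(shlex.join(build_command("path/to/exp.py", ckpt_path="path/to/model.ckpt")))
```

The returned list can be passed directly to `subprocess.run`, which avoids shell quoting issues.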

Benchmark

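| Exp | EMA | CBGS | mAP | mATE | mASE | mAOE | mAVE | mAAE | NDS | weights |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| BEVDepth | | | 0.3304 | 0.7021 | 0.2795 | 0.5346 | 0.5530 | 0.2274 | 0.4355 | github |
| BEVDepth | | | 0.3329 | 0.6832 | 0.2761 | 0.5446 | 0.5258 | 0.2259 | 0.4409 | github |
| BEVDepth | | | 0.3484 | 0.6159 | 0.2716 | 0.4144 | 0.4402 | 0.1954 | 0.4805 | github |
| BEVDepth | | | 0.3589 | 0.6119 | 0.2692 | 0.5074 | 0.4086 | 0.2009 | 0.4797 | github |
| BEVStereo | | | 0.3456 | 0.6589 | 0.2774 | 0.5500 | 0.4980 | 0.2278 | 0.4516 | github |
| BEVStereo | | | 0.3494 | 0.6671 | 0.2785 | 0.5606 | 0.4686 | 0.2295 | 0.4543 | github |
| BEVStereo | | | 0.3427 | 0.6560 | 0.2784 | 0.5982 | 0.5347 | 0.2228 | 0.4423 | github |
| BEVStereo | | | 0.3435 | 0.6585 | 0.2757 | 0.5792 | 0.5034 | 0.2163 | 0.4485 | github |
| BEVStereo | | | 0.3576 | 0.6071 | 0.2684 | 0.4157 | 0.3928 | 0.2021 | 0.4902 | github |
| BEVStereo | | | 0.3721 | 0.5980 | 0.2701 | 0.4381 | 0.3672 | 0.1898 | 0.4997 | github |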
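The NDS column can be reproduced from the other metrics using the nuScenes detection score definition, NDS = (5 · mAP + Σ (1 − min(1, mTP))) / 10, where the sum runs over the five true-positive errors. The sketch below applies it to the first BEVDepth row; the function name is ours and this is not the official evaluation code.

```python
def nds(m_ap, tp_errors):
    """nuScenes detection score from mAP and the five TP errors.

    tp_errors holds mATE, mASE, mAOE, mAVE and mAAE, in any order.
    """
    # Each error is clipped at 1 and converted into a score in [0, 1].
    tp_scores = [1.0 - min(1.0, err) for err in tp_errors]
    return (5.0 * m_ap + sum(tp_scores)) / 10.0


# First BEVDepth row of the benchmark.
score = nds(0.3304, [0.7021, 0.2795, 0.5346, 0.5530, 0.2274])
print(round(score, 4))  # → 0.4355
```

Because mAP carries half of the total weight, a gain in mAP moves NDS five times as much as the same gain in any single error term.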

FAQ

EMA

Because of how EMA (exponential moving average) works, the model parameters saved in the checkpoint differ from the model parameters used during training.

We use a customized EMA callback, and this function is not supported for now.
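To illustrate why the two sets of parameters diverge, here is a generic EMA update in plain Python. It is a sketch of the general technique only, not the customized callback used in this repository; the decay value and the fake "optimizer step" are assumptions chosen for illustration.

```python
def ema_update(shadow, params, decay=0.999):
    """Blend the current parameters into the running (shadow) average."""
    return [decay * s + (1.0 - decay) * p for s, p in zip(shadow, params)]


# Toy setup: one scalar parameter, with the shadow copy starting equal to it.
params = [0.0]
shadow = list(params)

for step in range(5):
    # Stand-in for an optimizer step that changes the training weights.
    params = [p + 1.0 for p in params]
    # The shadow weights lag behind, smoothing over recent steps.
    shadow = ema_update(shadow, params, decay=0.9)

print("training weights:", params)
print("EMA weights:     ", shadow)
```

After a few steps the training weights and the EMA weights are clearly different, which is the effect described above: a checkpoint that stores one set will not match the other.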

Cite BEVDepth & BEVStereo & MatrixVT

If you use BEVDepth, BEVStereo, or MatrixVT in your research, please cite our work using the following BibTeX entries:

@article{li2022bevdepth,
  title={BEVDepth: Acquisition of Reliable Depth for Multi-view 3D Object Detection},
  author={Li, Yinhao and Ge, Zheng and Yu, Guanyi and Yang, Jinrong and Wang, Zengran and Shi, Yukang and Sun, Jianjian and Li, Zeming},
  journal={arXiv preprint arXiv:2206.10092},
  year={2022}
}
@article{li2022bevstereo,
  title={Bevstereo: Enhancing depth estimation in multi-view 3d object detection with dynamic temporal stereo},
  author={Li, Yinhao and Bao, Han and Ge, Zheng and Yang, Jinrong and Sun, Jianjian and Li, Zeming},
  journal={arXiv preprint arXiv:2209.10248},
  year={2022}
}
@article{zhou2022matrixvt,
  title={MatrixVT: Efficient Multi-Camera to BEV Transformation for 3D Perception},
  author={Zhou, Hongyu and Ge, Zheng and Li, Zeming and Zhang, Xiangyu},
  journal={arXiv preprint arXiv:2211.10593},
  year={2022}
}