MonoCD: Monocular 3D Object Detection with Complementary Depths

<h5 align="center">

Longfei Yan, Pei Yan, Shengzhou Xiong, Xuanyu Xiang, Yihua Tan

arXiv License: MIT

</h5>

This repository contains the official implementation of the paper MonoCD: Monocular 3D Object Detection with Complementary Depths, built on the excellent work MonoFlex. In this work, we first point out a coupling phenomenon in existing multi-depth predictions: the predicted depths tend to overestimate or underestimate the true depth together, which limits the accuracy of the combined depth. We propose to increase the complementarity of the depths to alleviate this problem.
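
To make the coupling argument concrete, here is a toy numeric sketch. It is not the method from the paper: the depth values are invented, and a plain average stands in for whatever combination scheme a real detector uses. It only shows that errors with the same sign survive averaging, while errors with opposite signs partially cancel.

```python
# Toy illustration only: the numbers below are made up, and a plain average
# is an assumed stand-in for a real depth-combination scheme.

def combined_error(true_depth, estimates):
    """Return the signed error of the averaged depth estimate."""
    combined = sum(estimates) / len(estimates)
    return combined - true_depth

true_depth = 20.0  # metres (hypothetical ground truth)

# Coupled: both branches overestimate, so averaging cannot remove the bias.
coupled = [21.0, 22.0]
# Complementary: one branch overestimates and the other underestimates.
complementary = [21.0, 19.5]

print(combined_error(true_depth, coupled))        # 1.5
print(combined_error(true_depth, complementary))  # 0.25
```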

Installation

git clone https://github.com/dragonfly606/MonoCD.git
cd MonoCD

conda create -n monocd python=3.7
conda activate monocd

# Install the PyTorch build that matches your local CUDA version. We use torch 1.4.0+cu101.
conda install pytorch torchvision cudatoolkit
pip install -r requirements.txt

cd model/backbone/DCNv2
sh make.sh
# If the DCNv2 compilation fails, you can replace it with the version from https://github.com/lbin/DCNv2 that matches your PyTorch version, and then try recompiling.

cd ../../..
python setup.py develop
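
After installation, a quick check of the environment can save a failed training run. The script below is a hypothetical helper, not part of this repository. It uses only the standard library to see whether torch and torchvision are importable, and reports the CUDA status if they are.

```python
# Hypothetical post-install check; not part of the MonoCD repository.
import importlib.util

def missing_packages(names):
    """Return the subset of top-level package names that cannot be found."""
    return [n for n in names if importlib.util.find_spec(n) is None]

if __name__ == "__main__":
    missing = missing_packages(["torch", "torchvision"])
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        import torch
        print("torch", torch.__version__,
              "| CUDA available:", torch.cuda.is_available())
```

If CUDA is reported as unavailable, the installed PyTorch build probably does not match the local CUDA version.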

Data Preparation

Please download the KITTI dataset and organize the data as follows:

#ROOT		
  |training/
    |calib/
    |image_2/
    |label/
    |planes/
    |ImageSets/
  |testing/
    |calib/
    |image_2/
    |ImageSets/

The road planes for Horizon Heatmap training can be downloaded from HERE. Then set DATA_DIR = "/path/to/your/kitti/" in config/paths_catalog.py to match your data path.
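
Before training, it can help to confirm that the directory layout matches the tree above. The following is a minimal sketch of such a check. The helper name `missing_dirs` is hypothetical, and the list of sub-directories is copied from the layout in this section.

```python
from pathlib import Path

# Sub-directories taken from the layout listed in this section.
EXPECTED = {
    "training": ["calib", "image_2", "label", "planes", "ImageSets"],
    "testing": ["calib", "image_2", "ImageSets"],
}

def missing_dirs(data_dir):
    """Return expected KITTI sub-directories that do not exist under data_dir."""
    root = Path(data_dir)
    return [
        f"{split}/{sub}"
        for split, subs in EXPECTED.items()
        for sub in subs
        if not (root / split / sub).is_dir()
    ]

if __name__ == "__main__":
    # Replace with the same path you set as DATA_DIR.
    for entry in missing_dirs("/path/to/your/kitti/"):
        print("missing:", entry)
```

An empty result means every expected folder is present. It does not verify the files inside them.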

Get Started

Train

Training with one GPU.

CUDA_VISIBLE_DEVICES=0 python tools/plain_train_net.py --batch_size 8 --config runs/monocd.yaml --output output/exp

Test

The model is evaluated periodically during training. You can also evaluate an already trained checkpoint with:

CUDA_VISIBLE_DEVICES=0 python tools/plain_train_net.py --config runs/monocd.yaml --ckpt YOUR_CKPT --eval
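
To evaluate several checkpoints in a row, the command above can be wrapped in a small script. This is a sketch that uses only the flags shown in this section (`--config`, `--ckpt`, `--eval`). The function names and the checkpoint paths in the comment are hypothetical, and the script assumes it is run from the repository root.

```python
import os
import subprocess

def eval_command(ckpt, config="runs/monocd.yaml"):
    """Build the evaluation command from this section as an argument list."""
    return ["python", "tools/plain_train_net.py",
            "--config", config, "--ckpt", ckpt, "--eval"]

def evaluate(ckpt, gpu="0"):
    """Run evaluation for one checkpoint on the given GPU."""
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=gpu)
    subprocess.run(eval_command(ckpt), env=env, check=True)

# Example with hypothetical checkpoint paths:
# for ckpt in ["output/exp/ckpt_a.pth", "output/exp/ckpt_b.pth"]:
#     evaluate(ckpt)
```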

Model and log

We provide the model trained on KITTI and the corresponding logs.

| Models | AP40@Easy | AP40@Mod. | AP40@Hard | Logs/Ckpts |
| --- | --- | --- | --- | --- |
| MonoFlex | 23.64 | 17.51 | 14.83 | - |
| MonoFlex + Ours (paper) | 24.22 | 18.27 | 15.42 | - |
| MonoFlex + Ours (reproduced) | 25.99 | 19.12 | 16.03 | log/ckpt |
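
The gains over the MonoFlex baseline follow from the rows above by subtraction. The snippet below copies the AP40 values from the table and computes the absolute improvement per difficulty level; the helper name `gains` is only for illustration.

```python
# AP40 values copied from the table above, in the order (Easy, Mod., Hard).
baseline = (23.64, 17.51, 14.83)    # MonoFlex
paper = (24.22, 18.27, 15.42)       # MonoFlex + Ours (paper)
reproduced = (25.99, 19.12, 16.03)  # MonoFlex + Ours (reproduced)

def gains(ours, base):
    """Absolute AP40 improvement per difficulty, rounded to two decimals."""
    return [round(o - b, 2) for o, b in zip(ours, base)]

print("paper:     ", gains(paper, baseline))
print("reproduced:", gains(reproduced, baseline))
```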

Citation

If you find our work useful in your research, please consider giving us a star and citing:

@inproceedings{yan2024monocd,
  title={MonoCD: Monocular 3D Object Detection with Complementary Depths},
  author={Yan, Longfei and Yan, Pei and Xiong, Shengzhou and Xiang, Xuanyu and Tan, Yihua},
  booktitle={CVPR},
  pages={10248--10257},
  year={2024}
}

Acknowledgement

This project benefits from the excellent work of MonoFlex and MonoGround. Please also consider citing them.

Contact

If you have any questions about this project, please feel free to contact longfeiyan@hust.edu.cn.