(ICCV2023) MonoNeRD: NeRF-like Representations for Monocular 3D Object Detection

Introduction

This is the implementation of MonoNeRD: NeRF-like Representations for Monocular 3D Object Detection (ICCV'23) by Junkai Xu, Liang Peng, Haoran Cheng, Hao Li, Wei Qian, Ke Li, Wenxiao Wang and Deng Cai. It also provides a simple way to obtain 3D geometry and occupancy from monocular images.

[Paper] [Supp]

Installation

Requirements

All training and evaluation code was tested in the following environment:

Installation Steps

a. Clone this repository.

git clone https://github.com/cskkxjk/MonoNeRD.git

b. Install the dependent libraries as follows:

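pip install -r requirements.txt
# Build spconv v1.2.1 from source at the pinned commit.
git clone https://github.com/traveller59/spconv
cd spconv
git reset --hard f22dd9
git submodule update --init --recursive
python setup.py bdist_wheel
# The exact wheel filename depends on your Python version and platform.
pip install ./dist/spconv-1.2.1-*.whl
cd ..
# Install the KITTI fork of mmdetection in development mode.
git clone https://github.com/xy-guo/mmdetection_kitti
cd mmdetection_kitti
python setup.py develop
cd ..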

c. Install this library by running the following command:

python setup.py develop
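
After the three installation steps, a quick check that the packages can be found may save a failed training run. The following is a minimal sketch, not part of this repository: `find_missing` is a hypothetical helper, and the module names are assumptions (`torch` as the presumed deep-learning backend, `spconv` and `mmdet` from step b, `mononerd` from step c).

```python
import importlib.util
import sys

# Assumed top-level module names; adjust them if your installation differs.
REQUIRED_MODULES = ("torch", "spconv", "mmdet", "mononerd")


def find_missing(modules=REQUIRED_MODULES):
    """Return the names in `modules` that Python cannot locate for import."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    missing = find_missing()
    print("Missing modules:", ", ".join(missing) if missing else "none")
```

The check only locates the modules without importing them, so it does not verify that compiled extensions such as spconv actually load.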

Getting Started

Dataset Preparation

For KITTI, dataset configs are located within configs/stereo/dataset_configs, and the model configs are located within configs/stereo/kitti_models.

For Waymo, dataset configs are located within configs/mono/dataset_configs, and the model configs are located within configs/mono/waymo_models.

MonoNeRD_PATH
├── data
│   ├── kitti
│   │   │── ImageSets
│   │   │── training
│   │   │   ├──calib & velodyne & label_2 & image_2 & image_3
│   │   │── testing
│   │   │   ├──calib & velodyne & image_2
│   ├── waymo
│   │   │── ImageSets
│   │   │── training
│   │   │   ├──calib & velodyne & label_2 & image_2 ...
│   │   │── validation
│   │   │   ├──calib & velodyne & label_2 & image_2 ...
├── configs
├── mononerd
├── tools

If the datasets are stored elsewhere, symlink them into ./data:

YOUR_KITTI_DATA_PATH=~/data/kitti_object
ln -s $YOUR_KITTI_DATA_PATH/ImageSets/ ./data/kitti/
ln -s $YOUR_KITTI_DATA_PATH/training/ ./data/kitti/
ln -s $YOUR_KITTI_DATA_PATH/testing/ ./data/kitti/

YOUR_WAYMO_DATA_PATH=~/data/waymo
ln -s $YOUR_WAYMO_DATA_PATH/ImageSets/ ./data/waymo/
ln -s $YOUR_WAYMO_DATA_PATH/training/ ./data/waymo/
ln -s $YOUR_WAYMO_DATA_PATH/validation/ ./data/waymo/

Then generate the data infos and ground-truth databases:

python -m mononerd.datasets.kitti.lidar_kitti_dataset create_kitti_infos
python -m mononerd.datasets.kitti.lidar_kitti_dataset create_gt_database_only

python -m mononerd.datasets.waymo.lidar_waymo_dataset creat_waymo_infos
python -m mononerd.datasets.waymo.lidar_waymo_dataset create_gt_database_only
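
Before running the info-generation commands, it can help to confirm that the KITTI directories match the tree shown earlier. The sketch below is an illustration only: `missing_kitti_dirs` and `KITTI_LAYOUT` are hypothetical names, and the expected sub-directories are copied from the KITTI part of that tree.

```python
from pathlib import Path

# Expected sub-directories, taken from the KITTI part of the directory tree.
KITTI_LAYOUT = {
    "ImageSets": [],
    "training": ["calib", "velodyne", "label_2", "image_2", "image_3"],
    "testing": ["calib", "velodyne", "image_2"],
}


def missing_kitti_dirs(kitti_root):
    """Return the expected directories that do not exist under `kitti_root`."""
    root = Path(kitti_root)
    missing = []
    for top, children in KITTI_LAYOUT.items():
        for rel in [top] + [f"{top}/{child}" for child in children]:
            # is_dir() follows symlinks, so linked dataset folders count as present.
            if not (root / rel).is_dir():
                missing.append(rel)
    return missing


if __name__ == "__main__":
    for rel in missing_kitti_dirs("./data/kitti"):
        print("missing:", rel)
```

Run it from the repository root; it prints nothing when every expected directory is present.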

Training & Testing

Train a model

./scripts/dist_train.sh ${NUM_GPUS} 'exp_name' ./configs/stereo/kitti_models/mononerd.yaml

Test and evaluate the pretrained models

./scripts/dist_test_ckpt.sh ${NUM_GPUS} ./configs/stereo/kitti_models/mononerd.yaml ./ckpt/pretrained_mononerd.pth

Pretrained Models

KITTI 3D Object Detection Baselines

The results below report BEV / 3D detection performance (AP@R40) for the Car class on the KITTI val set.

| Model | Training Time | Easy@R40 | Moderate@R40 | Hard@R40 | Download |
|---|---|---|---|---|---|
| mononerd | ~13 hours | 29.03 / 20.64 | 22.03 / 15.44 | 19.41 / 13.99 | Google-drive / Baidu Netdisk |
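
The @R40 suffix conventionally denotes average precision interpolated at 40 recall positions (1/40, 2/40, ..., 1). The sketch below illustrates that metric under the assumption that precision/recall points are already available. `ap_r40` is a hypothetical helper rather than this repository's evaluation code, and it returns a fraction in [0, 1], whereas the table presumably reports percentages.

```python
from typing import Sequence


def ap_r40(recalls: Sequence[float], precisions: Sequence[float]) -> float:
    """Average interpolated precision over recall positions 1/40, 2/40, ..., 1."""
    if len(recalls) != len(precisions):
        raise ValueError("recalls and precisions must have the same length")
    total = 0.0
    for k in range(1, 41):
        r = k / 40
        # Interpolated precision: the best precision achieved at any recall >= r,
        # or 0 if the detector never reaches that recall level.
        reachable = [p for rec, p in zip(recalls, precisions) if rec >= r - 1e-9]
        total += max(reachable, default=0.0)
    return total / 40
```

For example, a detector with perfect precision that only reaches 50% recall scores `ap_r40([0.5], [1.0]) == 0.5`, since half of the 40 recall positions contribute zero.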

Citation

@inproceedings{xu2023mononerd,
  title={{MonoNeRD}: {NeRF}-like Representations for Monocular {3D} Object Detection},
  author={Xu, Junkai and Peng, Liang and Cheng, Haoran and Li, Hao and Qian, Wei and Li, Ke and Wang, Wenxiao and Cai, Deng},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={6814--6824},
  year={2023}
}

Acknowledgements

This project benefits from the following codebases. Thanks to their authors for the great work!