Home

Awesome

D4LCN: Learning Depth-Guided Convolutions for Monocular 3D Object Detection (CVPR 2020)

Mingyu Ding, Yuqi Huo, Hongwei Yi, Zhe Wang, Jianping Shi, Zhiwu Lu, Ping Luo

image

Introduction

Our framework is implemented and tested with Ubuntu 16.04, CUDA 8.0/9.0, Python 3, Pytorch 0.4/1.0/1.1, NVIDIA Tesla V100/TITANX GPU.

If you find our work useful in your research please consider citing our paper:

@inproceedings{ding2020learning,
  title={Learning Depth-Guided Convolutions for Monocular 3D Object Detection},
  author={Ding, Mingyu and Huo, Yuqi and Yi, Hongwei and Wang, Zhe and Shi, Jianping and Lu, Zhiwu and Luo, Ping},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={11672--11681},
  year={2020}
}

Requirements

Data preparation

Download and unzip the full KITTI detection dataset to the folder /path/to/kitti/. Then place a softlink (or the actual data) in data/kitti/. There are two widely used training/validation set splits for the KITTI dataset. Here we only show the setting of split1, you can set split2 accordingly.

cd D4LCN
ln -s /path/to/kitti data/kitti
ln -s /path/to/kitti/testing data/kitti_split1/testing

Our method uses DORN (or other monocular depth models) to extract depth maps for all images. You can download and unzip the depth maps extracted by DORN here and put them (or softlink) to the folder data/kitti/depth_2/. (You can also change the path in the scripts setup_depth.py)

Then use the following scripts to extract the data splits, which use softlinks to the above directory for efficient storage.

python data/kitti_split1/setup_split.py
python data/kitti_split1/setup_depth.py

Next, build the KITTI devkit eval for split1.

sh data/kitti_split1/devkit/cpp/build.sh

Lastly, build the nms modules

cd lib/nms
make

Training

We use visdom for visualization and graphs. Optionally, start the server by command line

sh visdom.sh

The port can be customized in config files. The training monitor can be viewed at http://localhost:9891.

You can change the batch_size according to the number of GPUs, default: 4 GPUs with batch_size = 8.

If you want to utilize the resnet backbone pre-trained on the COCO dataset, it can be downloaded from git or Google Drive, default: ImageNet pretrained pytorch model. You can also set use_corner and corner_in_3d to False for quick training.

See the configurations in scripts/config/depth_guided_config and scripts/train.py for details.

sh train.sh

Testing

We provide the weights, model and config file on the val1 data split available to download.

Testing requires paths to the configuration file and model weights, exposed variables near the top scripts/test.py. To test a configuration and model, simply update the variables and run the test file as below.

sh test.sh

Acknowledgements

We thank Garrick Brazil for his great works and repos.

Contact

For questions regarding D4LCN, feel free to post here or directly contact the authors (mingyuding@hku.hk).