
MT-DETR

This repository contains the code for the WACV 2023 paper: MT-DETR: Robust End-to-end Multimodal Detection with Confidence Fusion by Shih-Yun Chu, Ming-Sui Lee. You can find more visualized results and details in the supplementary material.

<div align=center> <img src='figure/snow_day.png' width="47%"> <img src='figure/densefog_day.png' width="47%"> <img src='figure/lightfog_night.png' width="47%"> <img src='figure/clear_night.png' width="47%"> </div>

Brief Introduction

In autonomous driving, unexpected adverse conditions (fog, snow, night) can occur in outdoor environments and make detection less effective. This paper therefore proposes a novel multimodal object detection network called MT-DETR. It achieves state-of-the-art performance using camera, lidar, and radar data together with additional time information. The experimental results demonstrate that MT-DETR is robust and performs well in various weather conditions. Its good generalization and scalability suggest future applicability to different multimodal tasks.

Getting Started

The repository is based on mmdetection and cbnetv2. Many thanks to their authors for these awesome open-source projects.

To run the code:

First, set up the environment by following the cbnetv2 (https://github.com/VDIGPKU/CBNetV2) and mmdetection (https://github.com/open-mmlab/mmdetection) tutorials.
Download the dataset and model checkpoints by following the instructions in data/ and checkpoint/.
After preparation, run the following command in your terminal:

bash run_script/${script_name}

You can comment out the training or inference block in the shell scripts if you only want to run one of them.
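If you are unsure which value to use for ${script_name}, a small helper can list the candidates. The following is a minimal sketch rather than part of the repository: the function name list_run_scripts is hypothetical, and it assumes the scripts in run_script/ use the .sh extension.

```python
from pathlib import Path


def list_run_scripts(repo_root="."):
    """Return one 'bash run_script/<name>' command per script found."""
    script_dir = Path(repo_root) / "run_script"
    if not script_dir.is_dir():
        # Nothing to list if the directory is absent.
        return []
    # Assumption: run scripts are shell files ending in .sh.
    scripts = sorted(script_dir.glob("*.sh"))
    return [f"bash run_script/{script.name}" for script in scripts]


if __name__ == "__main__":
    # Run from the repository root to print the available commands.
    for command in list_run_scripts():
        print(command)
```

Each printed line has the same form as the command shown above and can be pasted into the terminal.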

The following are the important directories of this project:

data: download the dataset here
checkpoint: download model weights here
run_script: shell scripts for running the models; set your paths and GPU_id here
configs: model configs; adjust model settings here
mmdet/models/backbones/mt_detr.py, mmdet/models/backbones/fusion_module.py: core model architecture of MT-DETR (this paper)
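Before launching a script, it can help to confirm that the paths listed above are present. The sketch below is an illustration, not a tool shipped with the repository: the names EXPECTED_PATHS and find_missing_paths are hypothetical. It only checks that each path exists; it does not verify that the dataset or the model weights were actually downloaded.

```python
from pathlib import Path

# Paths taken from the directory list above, relative to the repository root.
EXPECTED_PATHS = [
    "data",
    "checkpoint",
    "run_script",
    "configs",
    "mmdet/models/backbones/mt_detr.py",
    "mmdet/models/backbones/fusion_module.py",
]


def find_missing_paths(repo_root="."):
    """Return the expected paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [path for path in EXPECTED_PATHS if not (root / path).exists()]


if __name__ == "__main__":
    # Run from the repository root.
    missing = find_missing_paths()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All expected paths are present.")
```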

BibTeX

If you find our work useful in your research, please consider citing our paper.

@InProceedings{Chu_2023_WACV,
    author    = {Chu, Shih-Yun and Lee, Ming-Sui},
    title     = {MT-DETR: Robust End-to-End Multimodal Detection With Confidence Fusion},
    booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
    month     = {January},
    year      = {2023},
    pages     = {5252-5261}
}