Teach-DETR

Teach-DETR: Better Training DETR with Teachers<br> Linjiang Huang (CUHK), Kaixin Lu (Shanghai University), Guanglu Song (SenseTime Research), Liang Wang (CASIA), Si Liu (Beihang University), Yu Liu (SenseTime Research), Hongsheng Li (CUHK)

Coming soon.

Introduction

In this paper, we present a novel training scheme, namely Teach-DETR, to learn better DETR-based detectors from versatile teacher detectors. We show that the predicted boxes from teacher detectors, which can be either RCNN-based or DETR-based, are an effective medium for transferring their knowledge to train a more accurate and robust DETR model. This training scheme can easily incorporate the predicted boxes from multiple teacher detectors, each of which provides parallel supervision to the student DETR. Our strategy introduces no additional parameters and adds negligible computational cost to the original detector during training. During inference, Teach-DETR brings zero additional overhead and maintains the merit of requiring no non-maximum suppression. Extensive experiments show that our method leads to consistent improvements across various DETR-based detectors. Specifically, we improve the state-of-the-art detector DINO, with a Swin-Large backbone, 4 scales of feature maps and a 36-epoch training schedule, from 57.8% to 58.9% mean average precision on the MSCOCO 2017 validation set.

<div align="center"> <img src="figures/pipeline.png" width="70%"> </div>
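At its core, the scheme treats boxes predicted by teacher detectors as additional, independent groups of (noisy) ground truth that supervise the student's predictions in parallel with the real annotations. The sketch below illustrates the idea in plain Python: a greedy IoU-based matching stands in for DETR's Hungarian matching, and a teacher-score-weighted L1 loss stands in for the full set-prediction losses. All names here are illustrative, not the paper's exact formulation.

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def auxiliary_box_loss(student_boxes, teacher_boxes, teacher_scores):
    """Greedily match each teacher box to the best-overlapping unmatched
    student prediction, then accumulate a teacher-score-weighted L1 loss
    on the matched pairs. Low-confidence teacher boxes contribute less."""
    unmatched = set(range(len(student_boxes)))
    loss = 0.0
    for t_box, t_score in zip(teacher_boxes, teacher_scores):
        if not unmatched:
            break
        # pick the unmatched student box with the highest IoU to this teacher box
        best = max(unmatched, key=lambda i: iou(student_boxes[i], t_box))
        unmatched.remove(best)
        l1 = sum(abs(s - t) for s, t in zip(student_boxes[best], t_box))
        loss += t_score * l1
    return loss
```

In training, a term like this would be added on top of the usual losses against ground-truth annotations, once per teacher, so each teacher supplies a parallel supervision signal.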

Results of DETR-based detectors on COCO

| Model | Backbone | Epochs | Queries | AP |
| :--- | :---: | :---: | :---: | :---: |
| Conditional-DETR-DC5 | R101 | 50 | 300 | 45.0 |
| Conditional-DETR-DC5 + Aux | R101 | 50 | 300 | 46.7 $\color{green}{(+1.7)}$ |
| DAB-DETR-DC5 | R101 | 50 | 300 | 45.8 |
| DAB-DETR-DC5 + Aux | R101 | 50 | 300 | 48.5 $\color{green}{(+2.7)}$ |
| DN-DETR-DC5 | R101 | 50 | 300 | 47.3 |
| DN-DETR-DC5 + Aux | R101 | 50 | 300 | 49.9 $\color{green}{(+2.6)}$ |

Results of two atypical DETR-based detectors on COCO

| Model | Backbone | Epochs | Queries | AP |
| :--- | :---: | :---: | :---: | :---: |
| YOLOS | DeiT-S | 150 | 100 | 35.6 |
| YOLOS + Aux | DeiT-S | 150 | 100 | 38.0 $\color{green}{(+2.4)}$ |
| ViDT | Swin-S | 50 | 100 | 47.2 |
| ViDT + Aux | Swin-S | 50 | 100 | 49.0 $\color{green}{(+1.8)}$ |

Results of Deformable-DETR-based detectors on COCO

| Model | Backbone | Epochs | Queries | AP |
| :--- | :---: | :---: | :---: | :---: |
| Deformable-DETR | Swin-S | 36 | 300 | 50.7 |
| Deformable-DETR + Aux | Swin-S | 36 | 300 | 53.2 $\color{green}{(+2.5)}$ |
| Deformable-DETR + tricks $\dagger$ | Swin-S | 36 | 300 | 53.8 |
| Deformable-DETR + tricks $\dagger$ + Aux | Swin-S | 36 | 300 | 55.5 $\color{green}{(+1.7)}$ |
| H-Deformable-DETR | R50 | 36 | 300 | 50.0 |
| H-Deformable-DETR + Aux | R50 | 36 | 300 | 51.9 $\color{green}{(+1.9)}$ |
| H-Deformable-DETR | Swin-S | 36 | 300 | 54.2 |
| H-Deformable-DETR + Aux | Swin-S | 36 | 300 | 55.8 $\color{green}{(+1.6)}$ |
| H-Deformable-DETR | Swin-L (IN-22K) | 36 | 300 | 57.1 |
| H-Deformable-DETR + Aux | Swin-L (IN-22K) | 36 | 300 | 58.0 $\color{green}{(+0.9)}$ |
| H-Deformable-DETR $\ddagger$ | Swin-L (IN-22K) | 36 | 900 | 57.6 |
| H-Deformable-DETR $\ddagger$ + Aux | Swin-L (IN-22K) | 36 | 900 | 58.5 $\color{green}{(+0.9)}$ |
| DINO $\ddagger$ | Swin-L (IN-22K, 384) | 36 | 900 | 57.8 |
| DINO $\ddagger$ + Aux | Swin-L (IN-22K, 384) | 36 | 900 | 58.9 $\color{green}{(+1.1)}$ |

Note: all Deformable-DETR-based detectors are trained in the two-stage manner.

$\dagger$ tricks denote a dropout rate of 0 within the transformer, mixed query selection, and look forward twice.

$\ddagger$ using the top 300 predictions for evaluation.

Update

Installation

We tested our models under python=3.7.10, pytorch=1.10.1, cuda=10.2. Other versions may work as well.

1. Clone this repo

   ```shell
   git clone https://github.com/LeonHLJ/Teach-DETR.git
   cd Teach-DETR
   ```

2. Install PyTorch and torchvision

   Follow the instructions on https://pytorch.org/get-started/locally/.

   ```shell
   # an example:
   conda install -c pytorch pytorch torchvision
   ```

3. Install other required packages

   ```shell
   pip install -r requirements.txt
   pip install openmim
   mim install mmcv-full
   pip install mmdet
   ```

4. Compile CUDA operators

   ```shell
   cd models/ops
   python setup.py build install
   # unit test (should see all checking is True)
   python test.py
   cd ../..
   ```

Data

Please download the COCO 2017 dataset and organize it as follows:

```
coco_path/
  ├── train2017/
  ├── val2017/
  └── annotations/
  	├── instances_train2017.json
  	└── instances_val2017.json
```
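Before launching training, it can save time to confirm the dataset actually matches this layout. The helper below is a hypothetical convenience snippet, not part of the repo:

```python
import os

def missing_coco_entries(coco_path):
    """Return the expected COCO 2017 entries that are absent under coco_path.

    An empty list means the directory matches the layout expected above.
    """
    expected = [
        "train2017",
        "val2017",
        os.path.join("annotations", "instances_train2017.json"),
        os.path.join("annotations", "instances_val2017.json"),
    ]
    return [e for e in expected if not os.path.exists(os.path.join(coco_path, e))]
```

For example, `missing_coco_entries("/data/coco")` returns `[]` when everything is in place, and otherwise lists the missing files or directories.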

Run

To train a model using 8 GPUs:

```shell
GPUS_PER_NODE=8 ./tools/run_dist_launch.sh 8 <config path> \
    --coco_path <coco path> --ensemble
```

To train or evaluate a model with a Swin Transformer backbone, you need to download the backbone weights from the official repo first and specify the argument `--pretrained_backbone_path`, as in the H-DETR configs.

To evaluate a model using 8 GPUs:

```shell
GPUS_PER_NODE=8 ./tools/run_dist_launch.sh 8 <config path> \
    --coco_path <coco path> --eval --resume <checkpoint path> --ensemble
```

Distributed Run

You can refer to Deformable-DETR to enable training on multiple nodes.

License

This project is released under the MIT license. Please see the LICENSE file for more information.

Citation

```
@article{huang2023teachdetr,
      title={Teach-DETR: Better Training DETR with Teachers},
      author={Linjiang Huang and Kaixin Lu and Guanglu Song and Liang Wang and Si Liu and Yu Liu and Hongsheng Li},
      journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
      year={2023},
      publisher={IEEE}
}
```