DETRs with Collaborative Hybrid Assignments Training

[📖 Paper] [🤗 Hugging Face Model]

Introduction

In this paper, we present a novel collaborative hybrid assignments training scheme, namely Co-DETR, to learn more efficient and effective DETR-based detectors from diverse label assignment schemes.

  1. Encoder optimization: The proposed training scheme can easily enhance the encoder's learning ability in end-to-end detectors by training multiple parallel auxiliary heads supervised by one-to-many label assignments.
  2. Decoder optimization: We construct extra customized positive queries by extracting positive coordinates from these auxiliary heads to improve the attention learning of the decoder.
  3. State-of-the-art performance: Co-DETR with ViT-L (304M parameters) is the first model to achieve 66.0 AP on COCO test-dev.
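
The combined objective behind points 1 and 2 can be sketched in a few lines. This is an illustrative simplification in plain Python, with hypothetical names (`co_detr_total_loss`, `weights`), not the repository's actual loss code:

```python
# Illustrative sketch: Co-DETR's total loss combines the primary
# one-to-one (DETR-style) matching loss with weighted losses from K
# parallel auxiliary heads trained under one-to-many assignments
# (e.g. ATSS- or Faster R-CNN-style). Names here are hypothetical.

def co_detr_total_loss(one_to_one_loss, aux_losses, weights):
    """Combine the primary one-to-one loss with weighted auxiliary
    one-to-many losses from K parallel heads."""
    assert len(aux_losses) == len(weights)
    return one_to_one_loss + sum(w * l for w, l in zip(weights, aux_losses))

# Example: primary loss 1.2, two auxiliary heads with equal weight 1.0.
total = co_detr_total_loss(1.2, [0.8, 0.9], [1.0, 1.0])
```

At inference time only the original one-to-one branch is kept, so the auxiliary heads add no serving cost.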

Model Zoo

Objects365 pre-trained Co-DETR

| Model | Backbone | Aug | Dataset | box AP (val) | mask AP (val) | box AP (test) | mask AP (test) | Config | Download |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Co-DINO | Swin-L | DETR | COCO | 64.1 | - | - | - | config | model |
| Co-DINO | ViT-L | DETR | COCO | 65.9 | - | 66.0 | - | config | model |
| Co-DINO | Swin-L | LSJ | LVIS | 64.5 | - | - | - | config (test) | model |
| Co-DINO | ViT-L | LSJ | LVIS | 68.0 | - | - | - | config (test) | model |
| Co-DINO-Inst | ViT-L | LSJ | LVIS | 67.3 | 60.7 | - | - | config (test) | model |

Co-DETR with ResNet-50

| Model | Backbone | Epochs | Aug | Dataset | box AP | Config | Download |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Co-DINO | R50 | 12 | DETR | COCO | 52.1 | config | model |
| Co-DINO | R50 | 12 | LSJ | COCO | 52.1 | config | model |
| Co-DINO-9enc | R50 | 12 | LSJ | COCO | 52.6 | config | model |
| Co-DINO | R50 | 36 | LSJ | COCO | 54.8 | config | model |
| Co-DINO-9enc | R50 | 36 | LSJ | COCO | 55.4 | config | model |

Co-DETR with Swin-L

| Model | Backbone | Epochs | Aug | Dataset | box AP | Config | Download |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Co-DINO | Swin-L | 12 | DETR | COCO | 58.9 | config | model |
| Co-DINO | Swin-L | 24 | DETR | COCO | 59.8 | config | model |
| Co-DINO | Swin-L | 36 | DETR | COCO | 60.0 | config | model |
| Co-DINO | Swin-L | 12 | LSJ | COCO | 59.3 | config | model |
| Co-DINO | Swin-L | 24 | LSJ | COCO | 60.4 | config | model |
| Co-DINO | Swin-L | 36 | LSJ | COCO | 60.7 | config | model |
| Co-DINO | Swin-L | 36 | LSJ | LVIS | 56.9 | config (test) | model |

Co-Deformable-DETR

| Model | Backbone | Epochs | Queries | box AP | Config | Download |
| --- | --- | --- | --- | --- | --- | --- |
| Co-Deformable-DETR | R50 | 12 | 300 | 49.5 | config | model \| log |
| Co-Deformable-DETR | Swin-T | 12 | 300 | 51.7 | config | model \| log |
| Co-Deformable-DETR | Swin-T | 36 | 300 | 54.1 | config | model \| log |
| Co-Deformable-DETR | Swin-S | 12 | 300 | 53.4 | config | model \| log |
| Co-Deformable-DETR | Swin-S | 36 | 300 | 55.3 | config | model \| log |
| Co-Deformable-DETR | Swin-B | 12 | 300 | 55.5 | config | model \| log |
| Co-Deformable-DETR | Swin-B | 36 | 300 | 57.5 | config | model \| log |
| Co-Deformable-DETR | Swin-L | 12 | 300 | 56.9 | config | model \| log |
| Co-Deformable-DETR | Swin-L | 36 | 900 | 58.5 | config | model \| log |

Running

Install

We implement Co-DETR with MMDetection V2.25.3 and MMCV V1.5.0. The MMDetection source code is included in this repo, so you only need to build MMCV following the official instructions. We test our models with python=3.7.11, pytorch=1.11.0, cuda=11.3; other versions may not be compatible.
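
As one concrete example of the MMCV step, a pre-built wheel matching the versions above can typically be installed from the OpenMMLab wheel index; this is a sketch assuming CUDA 11.3 and PyTorch 1.11.0 as stated, so adjust the URL to your environment (or build from source per the official instructions):

```shell
# Install a pre-built mmcv-full wheel matching torch 1.11.0 + CUDA 11.3.
pip install mmcv-full==1.5.0 \
    -f https://download.openmmlab.com/mmcv/dist/cu113/torch1.11.0/index.html
```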

Data

The COCO and LVIS datasets should be organized as follows:

Co-DETR
└── data
    ├── coco
    │   ├── annotations
    │   │      ├── instances_train2017.json
    │   │      └── instances_val2017.json
    │   ├── train2017
    │   └── val2017
    │
    └── lvis_v1
        ├── annotations
        │      ├── lvis_v1_train.json
        │      └── lvis_v1_val.json
        ├── train2017
        └── val2017
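
A small helper like the following can verify this layout before launching training. This is an illustrative sketch (the function name `missing_paths` is not part of the repo); the paths mirror the tree above:

```python
# Check that the expected COCO/LVIS dataset layout exists under a root
# directory, returning any paths that are missing.
from pathlib import Path

EXPECTED = [
    "data/coco/annotations/instances_train2017.json",
    "data/coco/annotations/instances_val2017.json",
    "data/coco/train2017",
    "data/coco/val2017",
    "data/lvis_v1/annotations/lvis_v1_train.json",
    "data/lvis_v1/annotations/lvis_v1_val.json",
    "data/lvis_v1/train2017",
    "data/lvis_v1/val2017",
]

def missing_paths(root):
    """Return the expected dataset paths that do not exist under root."""
    root = Path(root)
    return [p for p in EXPECTED if not (root / p).exists()]
```

Running `missing_paths(".")` from the repo root and getting an empty list means the layout matches the tree above.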

Training

Train Co-Deformable-DETR + ResNet-50 with 8 GPUs:

sh tools/dist_train.sh projects/configs/co_deformable_detr/co_deformable_detr_r50_1x_coco.py 8 path_to_exp

Train using Slurm:

sh tools/slurm_train.sh partition job_name projects/configs/co_deformable_detr/co_deformable_detr_r50_1x_coco.py path_to_exp

Testing

Test Co-Deformable-DETR + ResNet-50 with 8 GPUs, and evaluate:

sh tools/dist_test.sh projects/configs/co_deformable_detr/co_deformable_detr_r50_1x_coco.py path_to_checkpoint 8 --eval bbox

Test using Slurm:

sh tools/slurm_test.sh partition job_name projects/configs/co_deformable_detr/co_deformable_detr_r50_1x_coco.py path_to_checkpoint --eval bbox

Cite Co-DETR

If you find this repository useful, please cite it with the following BibTeX entry.

@inproceedings{zong2023detrs,
  title={{DETRs} with Collaborative Hybrid Assignments Training},
  author={Zong, Zhuofan and Song, Guanglu and Liu, Yu},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={6748--6758},
  year={2023}
}

License

This project is released under the MIT license. Please see the LICENSE file for more information.