<div align="center"> <h2> Semi-DETR: Semi-Supervised Object Detection with Detection Transformers </h2> </div>

This repo is the official implementation of the CVPR 2023 paper "Semi-DETR: Semi-Supervised Object Detection with Detection Transformers". Semi-DETR is the first semi-supervised object detection method designed for detection transformers.

Update

Usage

Our code is based on the awesome codebase provided by Soft-Teacher[1].

Requirements

<!-- #### Notes - The project should be compatible to the latest version of `mmdetection`. If you want to switch to the same version `mmdetection` as ours, run `cd thirdparty/mmdetection && git checkout v2.16.0` -->

Installation

This project is built on mmdetection. Please install mmdet in editable mode first:

cd thirdparty/mmdetection && python -m pip install -e .

Following mmdetection, we package our detection transformer module and semi-supervised module in a similar way, and they also need to be installed (to use different module names, edit 'detr_od' and 'detr_ssod' in 'setup.py'):

cd ../../ && python -m pip install -e .

These commands install 'mmdet', 'detr_od' and 'detr_ssod' in your environment. You also need to compile the CUDA ops for deformable attention:

cd detr_od/models/utils/ops
python setup.py build install
# optional unit test (every check should report True)
python test.py
cd ../../..
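
After the steps above, a quick import check can confirm that the three packages are visible to Python. This is a minimal sketch, not part of the repo: the helper name `missing_packages` is hypothetical, and the package names are the ones stated in this README (adjust them if you renamed the modules in 'setup.py'). It does not check the compiled CUDA ops.

```python
import importlib.util

# Package names as stated in this README; adjust if you renamed them in setup.py.
REQUIRED_PACKAGES = ("mmdet", "detr_od", "detr_ssod")


def missing_packages(names=REQUIRED_PACKAGES):
    """Return the names that cannot be located on the current Python path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Not importable:", ", ".join(missing))
    else:
        print("All packages are importable.")
```

An empty result means all three editable installs were picked up by the active environment.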

Data Preparation

# YOUR_DATA should be a directory that contains the COCO dataset.
# For example:
# YOUR_DATA/
#  coco/
#     train2017/
#     val2017/
#     unlabeled2017/
#     annotations/
ln -s ${YOUR_DATA} data
bash tools/dataset/prepare_coco_data.sh conduct

For details on what to download, see lines 11-24 of tools/dataset/prepare_coco_data.sh. You can also download our generated semi-supervised dataset splits from semi-coco-splits.
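
Before running the preparation script, it may help to check that the linked `data` directory matches the layout shown above. The following is a small sketch under that assumption; `missing_coco_dirs` is a hypothetical helper, not a tool shipped with this repo.

```python
from pathlib import Path

# Sub-directories listed in the COCO layout comment above.
EXPECTED_COCO_DIRS = ("train2017", "val2017", "unlabeled2017", "annotations")


def missing_coco_dirs(data_root="data"):
    """Return expected sub-directories that are absent under <data_root>/coco."""
    coco_root = Path(data_root) / "coco"
    # is_dir() follows symlinks, so a symlinked `data` directory works too.
    return [d for d in EXPECTED_COCO_DIRS if not (coco_root / d).is_dir()]


if __name__ == "__main__":
    missing = missing_coco_dirs()
    print("Missing:", missing if missing else "none")
```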

wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
tar -xf VOCtrainval_06-Nov-2007.tar
tar -xf VOCtest_06-Nov-2007.tar
tar -xf VOCtrainval_11-May-2012.tar

# resulting format
# YOUR_DATA/
#   - VOCdevkit
#     - VOC2007
#       - Annotations
#       - JPEGImages
#       - ...
#     - VOC2012
#       - Annotations
#       - JPEGImages
#       - ...

Following prior work, we convert the PASCAL VOC dataset into COCO format and evaluate model performance with COCO-style mAP. Run the following command to convert the dataset format:

python scripts/voc_to_coco.py --devkit_path ${VOCdevkit-PATH} --out-dir ${VOCdevkit-PATH}
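
To illustrate what such a conversion involves, here is a minimal sketch of turning the objects in one VOC XML annotation into COCO-style entries. It is not the code in scripts/voc_to_coco.py and may differ from it in details (for example, some converters shift VOC's 1-based pixel coordinates). VOC stores each box as corner coordinates (xmin, ymin, xmax, ymax), while COCO stores [x, y, width, height]. The function name and the `category_name` field are hypothetical; real COCO files use an integer `category_id`.

```python
import xml.etree.ElementTree as ET


def voc_objects_to_coco(xml_text):
    """Parse one VOC annotation string and return COCO-style object dicts."""
    root = ET.fromstring(xml_text)
    annotations = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")  # direct child only, so nested parts are skipped
        xmin, ymin, xmax, ymax = (
            float(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        width, height = xmax - xmin, ymax - ymin
        annotations.append(
            {
                "category_name": obj.findtext("name"),  # illustrative field only
                "bbox": [xmin, ymin, width, height],  # COCO: [x, y, w, h]
                "area": width * height,
                "iscrowd": 0,
            }
        )
    return annotations
```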

Training

We implement DINO in mmdetection, following the original official repo. To train the fully supervised DINO model yourself and check our implementation, run:

sh tools/dist_train_detr_od.sh dino_detr 8

This trains DINO with a batch size of 16 for 12 epochs. We also provide the resulting checkpoint dino_sup_12e_ckpt and the training log dino_sup_12e_log for this fully supervised model.

sh tools/dist_train_detr_ssod.sh dino_detr_ssod ${FOLD} ${PERCENT} ${GPUS}

For example, run the following script to train our model on 10% labeled data with 8 GPUs on the 1st split:

sh tools/dist_train_detr_ssod.sh dino_detr_ssod 1 10 8
sh tools/dist_train_detr_ssod_coco_full.sh <NUM_GPUS>

For example, to train our R50 model with 8 GPUs:

sh tools/dist_train_detr_ssod_coco_full.sh 8
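
To repeat the partially labeled experiments over several splits and label ratios, the `${FOLD} ${PERCENT} ${GPUS}` arguments of tools/dist_train_detr_ssod.sh can be generated programmatically. The sketch below only prints the commands. The helper `build_commands` is hypothetical, the fold range 1-5 is an assumption (this README only shows fold 1), and the percentages are the partially labeled COCO settings reported in this README.

```python
import shlex

SCRIPT = "tools/dist_train_detr_ssod.sh"
CONFIG = "dino_detr_ssod"


def build_commands(folds, percents, gpus):
    """Return one training command (argument list) per (fold, percent) pair."""
    return [
        ["sh", SCRIPT, CONFIG, str(fold), str(percent), str(gpus)]
        for fold in folds
        for percent in percents
    ]


if __name__ == "__main__":
    # Folds 1-5 are an assumption; percents are the reported COCO settings.
    for cmd in build_commands(folds=range(1, 6), percents=(1, 5, 10), gpus=8):
        print(shlex.join(cmd))
```

Printing the commands first makes it easy to review them before launching any long-running job.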

Evaluation

python tools/test.py <CONFIG_FILE_PATH> <CHECKPOINT_PATH> --eval bbox

We also provide some of our trained models below:

COCO:

| Setting | mAP | Weights |
| --- | --- | --- |
| 1% Data | 30.50 $\pm$ 0.30 | ckpt |
| 5% Data | 40.10 $\pm$ 0.15 | ckpt |
| 10% Data | 43.5 $\pm$ 0.10 | ckpt |
| Full Data | 50.5 | ckpt |

VOC:

| Setting | AP50 | mAP | Weights |
| --- | --- | --- | --- |
| Unlabel: VOC12 | 86.1 | 65.2 | ckpt |
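
For readers unfamiliar with the two metrics reported for VOC: AP50 counts a detection as correct when its IoU with a ground-truth box is at least 0.50, while COCO-style mAP averages AP over ten IoU thresholds from 0.50 to 0.95. The sketch below shows the IoU computation for boxes in COCO [x, y, width, height] format. It is an illustration only, not the evaluation code used by tools/test.py, and `iou_xywh` is a hypothetical helper name.

```python
def iou_xywh(box_a, box_b):
    """IoU of two boxes given in COCO format [x, y, width, height]."""
    ax1, ay1, aw, ah = box_a
    bx1, by1, bw, bh = box_b
    ax2, ay2, bx2, by2 = ax1 + aw, ay1 + ah, bx1 + bw, by1 + bh
    # Overlap is clamped at zero when the boxes do not intersect.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


# COCO-style mAP averages AP over these ten IoU thresholds; AP50 uses only 0.50.
COCO_IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]
```

For instance, two 10x10 boxes offset horizontally by 5 pixels overlap with an IoU of 1/3, so that pair would not count as a match even at the loosest threshold of 0.50.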

[1] End-to-End Semi-Supervised Object Detection with Soft Teacher

Citation

If you find our repo useful for your research, please cite us:

@inproceedings{zhang2023semi,
  title={Semi-DETR: Semi-Supervised Object Detection With Detection Transformers},
  author={Zhang, Jiacheng and Lin, Xiangru and Zhang, Wei and Wang, Kuo and Tan, Xiao and Han, Junyu and Ding, Errui and Wang, Jingdong and Li, Guanbin},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={23809--23818},
  year={2023}
}