
<div align="center"> <h2> Semi-DETR: Semi-Supervised Object Detection with Detection Transformers </h2> </div>

This repo is the official implementation of the CVPR 2023 paper "Semi-DETR: Semi-Supervised Object Detection with Detection Transformers". Semi-DETR is the first work on semi-supervised object detection designed for detection transformers.

Update

2024/08/09 We release the prepared conda environment to help you run our code. You can download the environment we used from the Google Drive link: semidetr_miniconda_cuda12.1_torch1.9.0+cu111_mmcv-full1.3.16.tar. We have validated this environment on a Tesla A100 with CUDA driver 12.1, so you should be able to run our code without environment-related errors.
- Usage: Download the environment tar file and place it in the envs directory of your anaconda/miniconda installation, where the virtual environments are stored. Extract the archive there, then run conda init to prepare the environment. Note that you may need to update the Python interpreter path in some files under this environment for it to work correctly; for example, set the interpreter path in semidetr/bin/pip to your local path so that pip works properly.
2024/08/09 We reshare our model weight files via Google Drive. You can download them via the following links:
2024/11/26 We uploaded the supervised baseline to Google Drive as requested:
- DINO-MMDet 12 epochs (Google Drive)

Usage

Our code is based on the awesome codebase provided by Soft-Teacher[1].

Requirements

Ubuntu 18.04
Anaconda3 with python=3.8
PyTorch=1.9.0
mmdetection=2.16.0+fe46ffe
mmcv=1.3.16
cuda=10.2
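
As a quick sanity check, the pinned versions above can be compared against what is actually installed. The following is a minimal sketch and not part of this repo: `EXPECTED` and `check_versions` are hypothetical names, and the distribution names (`torch`, `mmcv-full`, `mmdet`) are assumptions about how the packages are registered in your environment.

```python
from importlib import metadata

# Pinned versions taken from the requirements list above.
# The distribution names are assumptions; adjust them if your install differs.
EXPECTED = {"torch": "1.9.0", "mmcv-full": "1.3.16", "mmdet": "2.16.0"}


def check_versions(expected, lookup=metadata.version):
    """Return {package: (expected, installed_or_None, ok)}."""
    report = {}
    for name, want in expected.items():
        try:
            have = lookup(name)
        except metadata.PackageNotFoundError:
            have = None
        # Compare only the part before '+' so '1.9.0+cu111' matches '1.9.0'.
        ok = have is not None and have.split("+")[0] == want
        report[name] = (want, have, ok)
    return report


if __name__ == "__main__":
    for name, (want, have, ok) in check_versions(EXPECTED).items():
        status = "OK" if ok else "MISMATCH"
        print(f"{name}: expected {want}, found {have} -> {status}")
```

Local build tags such as `+cu111` or `+fe46ffe` are ignored by the comparison, so only the base version number is checked.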

Installation

This project is built on mmdetection. Please install mmdet in editable mode first:

cd thirdparty/mmdetection && python -m pip install -e .

Following mmdetection, we develop our detection transformer module and semi-supervised module in a similar way, and they also need to be installed (set the module name in the 'setup.py' file to 'detr_od' and 'detr_ssod' in turn):

cd ../../ && python -m pip install -e .

These steps install 'mmdet', 'detr_od' and 'detr_ssod' in the environment. You also need to compile the CUDA ops for deformable attention:

cd detr_od/models/utils/ops
python setup.py build install
# Optional unit test (all checks should report True)
python test.py
cd ../../..

Data Preparation

Download the COCO dataset
Execute the following command to generate data set splits:

# YOUR_DATA should be a directory that contains the COCO dataset.
# For example:
# YOUR_DATA/
#  coco/
#     train2017/
#     val2017/
#     unlabeled2017/
#     annotations/
ln -s ${YOUR_DATA} data
bash tools/dataset/prepare_coco_data.sh conduct

For details on what should be downloaded, please refer to lines 11-24 of tools/dataset/prepare_coco_data.sh. You can also download our generated semi-supervised data set splits in semi-coco-splits.
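
Before running the preparation script, it can help to confirm that the directory layout matches the one described above. This is a minimal sketch under that assumption; `missing_coco_dirs` and `COCO_SUBDIRS` are hypothetical names and not part of the repo.

```python
from pathlib import Path

# Sub-directories listed in the layout comment above.
COCO_SUBDIRS = ("train2017", "val2017", "unlabeled2017", "annotations")


def missing_coco_dirs(data_root):
    """Return the expected sub-directories that are absent under data_root/coco."""
    coco = Path(data_root) / "coco"
    return [name for name in COCO_SUBDIRS if not (coco / name).is_dir()]


if __name__ == "__main__":
    # 'data' is the symlink created by `ln -s ${YOUR_DATA} data`.
    missing = missing_coco_dirs("data")
    print("layout looks complete" if not missing else f"missing: {missing}")
```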

Download the PASCAL VOC dataset
Execute the following commands to download and extract the dataset:

wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
tar -xf VOCtrainval_06-Nov-2007.tar
tar -xf VOCtest_06-Nov-2007.tar
tar -xf VOCtrainval_11-May-2012.tar

# resulting format
# YOUR_DATA/
#   - VOCdevkit
#     - VOC2007
#       - Annotations
#       - JPEGImages
#       - ...
#     - VOC2012
#       - Annotations
#       - JPEGImages
#       - ...

Following prior works, we convert the PASCAL VOC dataset into COCO format and evaluate model performance with COCO-style mAP. Execute the following command to convert the dataset format:

python scripts/voc_to_coco.py --devkit_path ${VOCdevkit-PATH} --out-dir ${VOCdevkit-PATH}
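
The central step in any such conversion is the change of box encoding: VOC XML stores corner coordinates (xmin, ymin, xmax, ymax), while COCO stores [x, y, width, height]. The sketch below illustrates that step only. It is not the code in scripts/voc_to_coco.py, the function name `voc_objects_to_coco` and the output keys are hypothetical, and it ignores details such as category ids, image ids, and whether a one-pixel offset is applied.

```python
import xml.etree.ElementTree as ET


def voc_objects_to_coco(xml_text):
    """Parse one VOC annotation and return dicts with COCO-style [x, y, w, h] boxes."""
    root = ET.fromstring(xml_text)
    annotations = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            float(box.find(key).text) for key in ("xmin", "ymin", "xmax", "ymax")
        )
        width, height = xmax - xmin, ymax - ymin
        annotations.append(
            {
                "category": obj.find("name").text,
                "bbox": [xmin, ymin, width, height],  # COCO encoding
                "area": width * height,
            }
        )
    return annotations
```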

Training

To train the model in the fully supervised setting (optional):

We implement DINO with mmdetection following the original official repo. If you want to train the fully supervised DINO model yourself and check our implementation, you can run:

sh tools/dist_train_detr_od.sh dino_detr 8

This trains DINO with batch size 16 for 12 epochs. We also provide the resulting checkpoint dino_sup_12e_ckpt and our training log dino_sup_12e_log for this fully supervised model.
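
If the batch size of 16 is the total across the 8 GPUs in the command above and is split evenly (an assumption, since the per-GPU setting lives in the config), each GPU processes 2 images per step. A small sketch of that arithmetic, useful when running on a different number of GPUs; `per_gpu_batch` is a hypothetical helper:

```python
def per_gpu_batch(total_batch, num_gpus):
    """Images per GPU when a total batch is split evenly across GPUs."""
    if num_gpus <= 0 or total_batch % num_gpus != 0:
        raise ValueError("total batch must divide evenly across a positive number of GPUs")
    return total_batch // num_gpus


print(per_gpu_batch(16, 8))  # 2
```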

To train the model in the partially labeled data setting:

sh tools/dist_train_detr_ssod.sh dino_detr_ssod ${FOLD} ${PERCENT} ${GPUS}

For example, you can run the following script to train our model on 10% labeled data with 8 GPUs on the 1st split:

sh tools/dist_train_detr_ssod.sh dino_detr_ssod 1 10 8
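
To run several splits and label percentages, the same positional arguments (FOLD, PERCENT, GPUS) can be generated programmatically. This is a sketch only: `train_command` is a hypothetical helper, and the fold range 1 to 5 is an assumption for illustration rather than something this README specifies.

```python
import shlex

SCRIPT = "tools/dist_train_detr_ssod.sh"


def train_command(fold, percent, gpus, job="dino_detr_ssod"):
    """Build the argument list for one partially labeled training run."""
    return ["sh", SCRIPT, job, str(fold), str(percent), str(gpus)]


if __name__ == "__main__":
    # Fold range 1..5 is an assumption; adjust it to the splits you generated.
    for percent in (1, 5, 10):
        for fold in range(1, 6):
            print(shlex.join(train_command(fold, percent, 8)))
```

The printed lines can be pasted into a shell or passed one at a time to `subprocess.run`.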

To train the model in the fully labeled data setting:

sh tools/dist_train_detr_ssod_coco_full.sh <NUM_GPUS>

For example, to train our R50 model with 8 GPUs:

sh tools/dist_train_detr_ssod_coco_full.sh 8

Evaluation

python tools/test.py <CONFIG_FILE_PATH> <CHECKPOINT_PATH> --eval bbox

We also provide some models trained by us below:

COCO:

Setting	mAP	Weights
1% Data	30.50 $\pm$ 0.30	ckpt
5% Data	40.10 $\pm$ 0.15	ckpt
10% Data	43.5 $\pm$ 0.10	ckpt
Full Data	50.5	ckpt
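
The ± entries are presumably the spread of mAP across data splits. Assuming they are a mean and sample standard deviation over folds (the README does not state which estimator is used), your own runs can be summarized in the same form. The helper name `summarize` is hypothetical, and the input values in the example are placeholders, not results.

```python
import statistics


def summarize(per_fold_map):
    """Format per-fold mAP values as 'mean +/- sample standard deviation'."""
    mean = statistics.mean(per_fold_map)
    std = statistics.stdev(per_fold_map)  # needs at least two folds
    return f"{mean:.2f} +/- {std:.2f}"


# Placeholder values for illustration only, not measured results.
print(summarize([1.0, 2.0, 3.0]))  # 2.00 +/- 1.00
```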

VOC:

Setting	AP50	mAP	Weights
Unlabel: VOC12	86.1	65.2	ckpt

[1] End-to-End Semi-Supervised Object Detection with Soft Teacher

Citation

If you find our repo useful for your research, please cite us:

@inproceedings{zhang2023semi,
  title={Semi-DETR: Semi-Supervised Object Detection With Detection Transformers},
  author={Zhang, Jiacheng and Lin, Xiangru and Zhang, Wei and Wang, Kuo and Tan, Xiao and Han, Junyu and Ding, Errui and Wang, Jingdong and Li, Guanbin},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={23809--23818},
  year={2023}
}