Enhancing Your Trained DETRs with Box Refinement

Yiqun Chen, Qiang Chen, Peize Sun, Shoufa Chen, Jingdong Wang, Jian Cheng

We present a conceptually simple, efficient, and general framework for localization problems in DETR-like models. We add plugins to well-trained models instead of inefficiently designing new models and training them from scratch. The method, called RefineBox, refines the outputs of DETR-like detectors with lightweight refinement networks. RefineBox is easy to implement and train because it only uses the features and predicted boxes from the well-trained detection models. Our method is also efficient because we freeze the trained detectors during training. In addition, RefineBox generalizes to various trained detection models without any modification. We conduct experiments on COCO and LVIS 1.0. Experimental results indicate the effectiveness of RefineBox for DETR and its representative variants. For example, the performance gains for DETR, Conditional-DETR, DAB-DETR, and DN-DETR are 2.4 AP, 2.5 AP, 1.9 AP, and 1.6 AP, respectively. We hope our work will draw the attention of the detection community to the localization bottleneck of current DETR-like models and highlight the potential of the RefineBox framework. Code and models are publicly available at https://github.com/YiqunChen1999/RefineBox.
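To make the idea of refining an already predicted box concrete, here is a minimal pure-Python sketch. It assumes a residual update in logit (inverse-sigmoid) space on normalized `(cx, cy, w, h)` boxes, a convention used for iterative box refinement in some DETR variants. This is an illustrative assumption, not necessarily the exact formulation used by RefineBox; the names `refine_box` and `delta` are hypothetical.

```python
import math


def inverse_sigmoid(x, eps=1e-5):
    """Map a value in (0, 1) to logit space, clamped for numerical safety."""
    x = min(max(x, eps), 1.0 - eps)
    return math.log(x / (1.0 - x))


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def refine_box(box, delta):
    """Apply a residual `delta` (hypothetical refinement output) to `box`.

    `box` is (cx, cy, w, h) normalized to (0, 1), as predicted by a frozen
    detector. The update happens in logit space, so the refined box stays
    inside (0, 1). This update rule is an assumption for illustration.
    """
    return tuple(sigmoid(inverse_sigmoid(b) + d) for b, d in zip(box, delta))


coarse = (0.50, 0.50, 0.20, 0.20)
refined = refine_box(coarse, (0.10, 0.00, -0.05, 0.00))
print(refined)
```

In this sketch only the residual would come from the trainable refinement network; the coarse box comes from the frozen detector and is never modified in place.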

Results

More details can be found in our paper.

COCO

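| Model | Backbone | AP | AR$_{10}$ | AR$_{100}$ |
|---|---|---|---|---|
| DETR | ResNet-50 | 42.0 | 53.3 | 57.5 |
| + RefineBox (Ours) | ResNet-50 | 44.4 | 56.4 | 61.2 |
| $\Delta$ | | +2.4 | +3.1 | +3.7 |
| DETR | ResNet-101 | 43.5 | 54.9 | 59.0 |
| + RefineBox (Ours) | ResNet-101 | 45.5 | 57.5 | 62.1 |
| $\Delta$ | | +2.0 | +2.6 | +3.1 |
| Conditional-DETR | ResNet-50 | 41.0 | 55.4 | 61.0 |
| + RefineBox (Ours) | ResNet-50 | 43.5 | 58.8 | 65.1 |
| $\Delta$ | | +2.5 | +3.4 | +4.1 |
| Conditional-DETR | ResNet-101 | 42.9 | 56.7 | 62.1 |
| + RefineBox (Ours) | ResNet-101 | 45.0 | 59.6 | 65.6 |
| $\Delta$ | | +2.1 | +2.9 | +3.5 |
| DAB-DETR | ResNet-50 | 43.3 | 57.5 | 62.9 |
| + RefineBox (Ours) | ResNet-50 | 45.2 | 60.0 | 65.9 |
| $\Delta$ | | +1.9 | +2.5 | +3.0 |
| DAB-DETR | ResNet-101 | 44.0 | 59.0 | 65.7 |
| + RefineBox (Ours) | ResNet-101 | 45.4 | 61.0 | 68.4 |
| $\Delta$ | | +1.4 | +2.0 | +2.7 |
| DAB-DETR | Swin-Tiny | 45.2 | 58.4 | 63.5 |
| + RefineBox (Ours) | Swin-Tiny | 47.1 | 60.9 | 66.5 |
| $\Delta$ | | +1.9 | +2.5 | +3.0 |
| DN-DETR | ResNet-50 | 44.3 | 58.3 | 63.4 |
| + RefineBox (Ours) | ResNet-50 | 45.9 | 60.3 | 66.0 |
| $\Delta$ | | +1.6 | +2.0 | +2.6 |
| Group-DETR $*$ | ResNet-50 | 37.6 | 52.5 | 58.3 |
| + RefineBox (Ours) | ResNet-50 | 40.3 | 56.1 | 62.7 |
| $\Delta$ | | +2.7 | +3.6 | +4.4 |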

$*$ We train Group-Conditional-DETR with 11 training groups for 12 epochs.

LVIS

| Model | AP | AR |
|---|---|---|
| DAB-DETR-R50 | 19.9 | 30.1 |
| + RefineBox (Ours) | 21.8 | 32.8 |
| $\Delta$ | +1.9 | +2.7 |
| DAB-DETR-R50 + FedLoss | 26.0 | 37.6 |
| + RefineBox (Ours) | 28.8 | 41.4 |
| $\Delta$ | +2.8 | +3.8 |

Installation

We assume that you are using Linux and have Anaconda or Miniconda installed.

Clone this project.

git clone https://github.com/YiqunChen1999/RefineBox.git
cd RefineBox

Install dependencies in one line.

bash -i ./install.sh

This will create a conda environment named refinebox and install all dependencies.

Finally, activate the environment.

conda activate refinebox

NOTE: Our code is tested with PyTorch 1.13.1 and Python 3.10. Other versions may also work.

Dataset Preparation

We mainly conduct experiments on COCO. Downloading LVIS 1.0 is optional.

COCO

Please download the COCO 2017 dataset from the COCO website. The dataset should be organized as follows:

THIS_PROJECT
      |_ datasets
            |_ coco
                  |_ annotations
                        |_ instances_train2017.json
                        |_ instances_val2017.json
                  |_ train2017
                        |_ ... # images
                  |_ val2017
                        |_ ... # images
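Before training, it can help to confirm that the layout above is in place. The following is a small sketch using only the standard library; the function name `missing_coco_entries` is hypothetical and not part of this project. It only checks that the listed files and directories exist, not that the images are complete.

```python
from pathlib import Path

# Relative paths taken from the directory tree above.
COCO_REQUIRED = [
    "annotations/instances_train2017.json",
    "annotations/instances_val2017.json",
    "train2017",
    "val2017",
]


def missing_coco_entries(project_root="."):
    """Return the required COCO paths that are absent under datasets/coco."""
    coco_root = Path(project_root) / "datasets" / "coco"
    return [rel for rel in COCO_REQUIRED if not (coco_root / rel).exists()]
```

Calling `missing_coco_entries()` from the project root returns an empty list when everything is in place, and otherwise the relative paths that still need to be created or downloaded.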

LVIS (Optional)

Please download the LVIS 1.0 dataset from the LVIS website. The dataset should be organized as follows:

THIS_PROJECT
      |_ datasets
            |_ lvis
                  |_ lvis_v1_train_cat_info.json
                  |_ lvis_v1_train.json
                  |_ lvis_v1_val.json
                  |_ train2017
                        |_ ... # images
                  |_ val2017
                        |_ ... # images

lvis_v1_train_cat_info.json is used by the federated loss. Create it by running:

python tools/get_lvis_cat_info.py --ann datasets/lvis/lvis_v1_train.json
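For orientation, federated loss implementations commonly rely on per-category frequency statistics from the training set. The sketch below counts, for each category, the distinct images that contain at least one annotation of it, given COCO/LVIS-style annotation dicts. It illustrates that kind of statistic only; it is not the content of tools/get_lvis_cat_info.py, and the function name is hypothetical.

```python
from collections import defaultdict


def category_image_counts(annotations):
    """Count, for each category id, the distinct images that contain it.

    `annotations` is a list of dicts with `image_id` and `category_id`
    keys, as found in COCO/LVIS-style annotation files.
    """
    images_per_category = defaultdict(set)
    for ann in annotations:
        images_per_category[ann["category_id"]].add(ann["image_id"])
    return {cat: len(imgs) for cat, imgs in sorted(images_per_category.items())}


# Toy annotations: category 7 appears twice in image 1 and once in image 2.
toy = [
    {"image_id": 1, "category_id": 7},
    {"image_id": 1, "category_id": 7},
    {"image_id": 2, "category_id": 7},
    {"image_id": 2, "category_id": 9},
]
print(category_image_counts(toy))  # {7: 2, 9: 1}
```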

Getting Started

Download Pre-trained Models

RefineBox improves trained DETRs, so download the pre-trained DETRs first.

Then set train.init_checkpoint in the config file to the path of the pre-trained DETR checkpoint.

Taking DN-DETR as an example:

Our trained Group-Conditional-DETR and DAB-DETR (for LVIS 1.0) models can be downloaded here.
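As an alternative to editing the config file, detectron2-style lazy configs usually accept `key=value` overrides appended to the command line, as detrex's training script does. Whether tools/train_net.py in this project accepts them in the same way is an assumption. The sketch below validates a checkpoint path and formats such an override; the function name `init_checkpoint_override` is hypothetical.

```python
from pathlib import Path


def init_checkpoint_override(checkpoint_path):
    """Return a `train.init_checkpoint=...` override for an existing file.

    Raises FileNotFoundError early, so a wrong path is caught before a
    multi-GPU job is launched.
    """
    path = Path(checkpoint_path).expanduser().resolve()
    if not path.is_file():
        raise FileNotFoundError(f"Pre-trained DETR checkpoint not found: {path}")
    return f"train.init_checkpoint={path}"
```

The returned string would be appended after the other arguments of the training command, under the assumption stated above.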

Train RefineBox

Train RefineBox by running:

python tools/train_net.py --num-gpus 8 --config-file PATH_TO_RB_CONFIG_FILE

For example (DN-DETR):

python tools/train_net.py --num-gpus 8 --config-file projects/rb_dn_detr/configs/rb_dn_detr_r50_12ep.py

NOTE: We load the pre-trained DETR and RefineBox parameters separately, so you may see "parameters not found" warnings.

Evaluate RefineBox

Evaluate RefineBox by running:

python tools/train_net.py --num-gpus 8 --config-file PATH_TO_RB_CONFIG_FILE --eval-only

For example (DN-DETR):

python tools/train_net.py --num-gpus 8 --config-file projects/rb_dn_detr/configs/rb_dn_detr_r50_12ep.py --eval-only

NOTE: We load the pre-trained DETR and RefineBox parameters separately, so you may see "parameters not found" warnings.
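The training and evaluation commands above differ only in the --eval-only flag. If you launch runs from a script, a small helper such as the following sketch can assemble either command. The helper name `build_command` is hypothetical; the script path and flags are the ones shown above.

```python
import sys


def build_command(config_file, num_gpus=8, eval_only=False):
    """Assemble the train or eval command as an argument list."""
    cmd = [
        sys.executable, "tools/train_net.py",
        "--num-gpus", str(num_gpus),
        "--config-file", config_file,
    ]
    if eval_only:
        cmd.append("--eval-only")
    return cmd


config = "projects/rb_dn_detr/configs/rb_dn_detr_r50_12ep.py"
print(" ".join(build_command(config, eval_only=True)))
```

Passing the returned list to `subprocess.run(..., check=True)` from the project root would start the run with the interpreter of the active environment.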

Some Tips

License

The model is licensed under the Apache 2.0 license.

Cite RefineBox

If you find this work helpful, please cite:

@misc{chen2023enhancing,
      title={Enhancing Your Trained DETRs with Box Refinement}, 
      author={Yiqun Chen and Qiang Chen and Peize Sun and Shoufa Chen and Jingdong Wang and Jian Cheng},
      year={2023},
      eprint={2307.11828},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Acknowledgement

This project is based on detrex and detectron2. We thank the authors for their great work. We also thank Detic for the criterion with Federated Loss.