
Enhancing Your Trained DETRs with Box Refinement

Yiqun Chen, Qiang Chen, Peize Sun, Shoufa Chen, Jingdong Wang, Jian Cheng

We present a conceptually simple, efficient, and general framework for localization problems in DETR-like models. We add plugins to well-trained models instead of inefficiently designing new models and training them from scratch. The method, called RefineBox, refines the outputs of DETR-like detectors with lightweight refinement networks. RefineBox is easy to implement and train because it only uses the features and predicted boxes from the well-trained detection models. Our method is also efficient because we freeze the trained detectors during training. In addition, RefineBox generalizes to various trained detection models without any modification. We conduct experiments on COCO and LVIS 1.0. Experimental results indicate the effectiveness of RefineBox for DETR and its representative variants. For example, the performance gains for DETR, Conditional-DETR, DAB-DETR, and DN-DETR are 2.4 AP, 2.5 AP, 1.9 AP, and 1.6 AP, respectively. We hope our work will draw the attention of the detection community to the localization bottleneck of current DETR-like models and highlight the potential of the RefineBox framework. Code and models are publicly available at https://github.com/YiqunChen1999/RefineBox.

Results

More details can be found in our paper.

COCO

Model	Backbone	AP	AR $_{10}$	AR $_{100}$
DETR	ResNet-50	42.0	53.3	57.5
+ RefineBox (Ours)	ResNet-50	44.4	56.4	61.2
$\Delta$		+2.4	+3.1	+3.7
DETR	ResNet-101	43.5	54.9	59.0
+ RefineBox (Ours)	ResNet-101	45.5	57.5	62.1
$\Delta$		+2.0	+2.6	+3.1
Conditional-DETR	ResNet-50	41.0	55.4	61.0
+ RefineBox (Ours)	ResNet-50	43.5	58.8	65.1
$\Delta$		+2.5	+3.4	+4.1
Conditional-DETR	ResNet-101	42.9	56.7	62.1
+ RefineBox (Ours)	ResNet-101	45.0	59.6	65.6
$\Delta$		+2.1	+2.9	+3.5
DAB-DETR	ResNet-50	43.3	57.5	62.9
+ RefineBox (Ours)	ResNet-50	45.2	60.0	65.9
$\Delta$		+1.9	+2.5	+3.0
DAB-DETR	ResNet-101	44.0	59.0	65.7
+ RefineBox (Ours)	ResNet-101	45.4	61.0	68.4
$\Delta$		+1.4	+2.0	+2.7
DAB-DETR	Swin-Tiny	45.2	58.4	63.5
+ RefineBox (Ours)	Swin-Tiny	47.1	60.9	66.5
$\Delta$		+1.9	+2.5	+3.0
DN-DETR	ResNet-50	44.3	58.3	63.4
+ RefineBox (Ours)	ResNet-50	45.9	60.3	66.0
$\Delta$		+1.6	+2.0	+2.6
Group-DETR $*$	ResNet-50	37.6	52.5	58.3
+ RefineBox (Ours)	ResNet-50	40.3	56.1	62.7
$\Delta$		+2.7	+3.6	+4.4

$*$ We train Group-Conditional-DETR with 11 training groups for 12 epochs.
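The $\Delta$ rows are plain differences between the refined and baseline scores. As a small sanity check, the sketch below recomputes the AP gains for the four ResNet-50 models named in the abstract. The dictionary name `COCO_AP_R50` and the helper `ap_gain` are illustrative and not part of this repository; the numbers are copied from the table above.

```python
# (baseline AP, AP with RefineBox) for ResNet-50 rows of the COCO table.
COCO_AP_R50 = {
    "DETR": (42.0, 44.4),
    "Conditional-DETR": (41.0, 43.5),
    "DAB-DETR": (43.3, 45.2),
    "DN-DETR": (44.3, 45.9),
}


def ap_gain(baseline, refined):
    """Absolute AP gain, rounded to one decimal place like the table."""
    return round(refined - baseline, 1)


gains = {name: ap_gain(*pair) for name, pair in COCO_AP_R50.items()}

for name, gain in gains.items():
    print(f"{name}: +{gain:.1f} AP")
```

The same helper applies to the AR columns and to the LVIS table.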

LVIS

Model	AP	AR
DAB-DETR-R50	19.9	30.1
+ RefineBox (Ours)	21.8	32.8
$\Delta$	+1.9	+2.7
DAB-DETR-R50 + FedLoss	26.0	37.6
+ RefineBox (Ours)	28.8	41.4
$\Delta$	+2.8	+3.8

Installation

These instructions assume that Anaconda or Miniconda is installed and that you are using Linux.

Clone this project.

git clone https://github.com/YiqunChen1999/RefineBox.git
cd RefineBox

Install dependencies in one line.

bash -i ./install.sh

This will create a conda environment named refinebox and install all dependencies.

Finally, activate the environment.

conda activate refinebox

NOTE: Our code is tested on PyTorch 1.13.1 and Python 3.10. Other versions may also work.

Dataset Preparation

We mainly conduct experiments on COCO. You can optionally download LVIS 1.0.

COCO

Please download the COCO 2017 dataset from the COCO website. The dataset should be organized as follows:

THIS_PROJECT
      |_ datasets
            |_ coco
                  |_ annotations
                        |_ instances_train2017.json
                        |_ instances_val2017.json
                  |_ train2017
                        |_ ... # images
                  |_ val2017
                        |_ ... # images
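
Before training, it can help to confirm that the files are where the layout above expects them. The following is a minimal sketch using only the standard library; `COCO_REQUIRED` and `missing_coco_paths` are hypothetical names introduced here, not utilities shipped with this project.

```python
from pathlib import Path

# Paths relative to the project root, taken from the layout above.
COCO_REQUIRED = [
    "datasets/coco/annotations/instances_train2017.json",
    "datasets/coco/annotations/instances_val2017.json",
    "datasets/coco/train2017",
    "datasets/coco/val2017",
]


def missing_coco_paths(project_root):
    """Return the required COCO paths that do not exist under project_root."""
    root = Path(project_root)
    return [rel for rel in COCO_REQUIRED if not (root / rel).exists()]


if __name__ == "__main__":
    # Run from the project root (THIS_PROJECT in the layout above).
    missing = missing_coco_paths(".")
    if missing:
        print("Missing COCO paths:")
        for rel in missing:
            print(f"  {rel}")
    else:
        print("COCO layout looks complete.")
```

The check only tests that the paths exist; it does not validate the annotation files or count images.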

LVIS (Optional)

Please download the LVIS 1.0 dataset from the LVIS website. The dataset should be organized as follows:

THIS_PROJECT
      |_ datasets
            |_ lvis
                  |_ lvis_v1_train_cat_info.json
                  |_ lvis_v1_train.json
                  |_ lvis_v1_val.json
                  |_ train2017
                        |_ ... # images
                  |_ val2017
                        |_ ... # images

lvis_v1_train_cat_info.json is used by the Federated Loss. Create it by running:

python tools/get_lvis_cat_info.py --ann datasets/lvis/lvis_v1_train.json

Getting Started

Download Pre-trained Models

RefineBox improves trained DETRs, so download the pre-trained DETRs first.

Then set train.init_checkpoint in the corresponding config file to the path of the downloaded pre-trained DETR.

Taking DN-DETR as an example:

Download the pre-trained DN-DETR from the detrex model zoo and save it on your machine. The path to the model is denoted as PATH_TO_DN_DETR.
Replace the value of train.init_checkpoint in projects/rb_dn_detr/configs/rb_dn_detr_r50_12ep.py with PATH_TO_DN_DETR.

Our trained Group-Conditional-DETR and our DAB-DETR for LVIS 1.0 can be downloaded here.

Train RefineBox

Train RefineBox by running:

python tools/train_net.py --num-gpus 8 --config-file PATH_TO_RB_CONFIG_FILE

For example (DN-DETR):

python tools/train_net.py --num-gpus 8 --config-file projects/rb_dn_detr/configs/rb_dn_detr_r50_12ep.py

NOTE: We load the pre-trained DETR and RefineBox parameters separately, so you may see "parameters not found" warnings. These are expected.

Evaluate RefineBox

Evaluate RefineBox by running:

python tools/train_net.py --num-gpus 8 --config-file PATH_TO_RB_CONFIG_FILE --eval-only

For example (DN-DETR):

python tools/train_net.py --num-gpus 8 --config-file projects/rb_dn_detr/configs/rb_dn_detr_r50_12ep.py --eval-only

NOTE: We load the pre-trained DETR and RefineBox parameters separately, so you may see "parameters not found" warnings. These are expected.
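
Training and evaluation use the same entry point and differ only in the --eval-only flag. When launching several configs from a script, the command can be assembled as in the sketch below. `build_command` is a hypothetical helper, not part of this repository, and it uses only the flags shown in the commands above.

```python
import shlex


def build_command(config_file, num_gpus=8, eval_only=False):
    """Assemble the tools/train_net.py command as an argument list."""
    cmd = [
        "python", "tools/train_net.py",
        "--num-gpus", str(num_gpus),
        "--config-file", config_file,
    ]
    if eval_only:
        # Evaluation reuses the training script with one extra flag.
        cmd.append("--eval-only")
    return cmd


config = "projects/rb_dn_detr/configs/rb_dn_detr_r50_12ep.py"
print(shlex.join(build_command(config)))
print(shlex.join(build_command(config, eval_only=True)))
```

The returned list can be passed directly to `subprocess.run(cmd, check=True)` from the project root.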

Some Tips

If you want to run evaluation on LVIS 1.0, please make sure your NumPy version is lower than 1.24.0, because the np.float alias was removed in NumPy 1.24.0.
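
The NumPy constraint can be checked before launching an LVIS evaluation. The sketch below reads the installed version through the standard library without importing NumPy itself; `release_tuple` and `numpy_ok_for_lvis` are illustrative names, not part of this project.

```python
from importlib.metadata import PackageNotFoundError, version


def release_tuple(version_string):
    """Return the leading numeric components, e.g. '1.23.5' -> (1, 23, 5)."""
    parts = []
    for piece in version_string.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def numpy_ok_for_lvis(version_string):
    """True if the given NumPy version is lower than 1.24."""
    return release_tuple(version_string) < (1, 24)


if __name__ == "__main__":
    try:
        installed = version("numpy")
    except PackageNotFoundError:
        print("NumPy is not installed in this environment.")
    else:
        status = "OK" if numpy_ok_for_lvis(installed) else "too new for LVIS evaluation"
        print(f"NumPy {installed}: {status}")
```

If the check fails, downgrading NumPy inside the refinebox environment to a release below 1.24 is one way to satisfy the constraint.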

License

The model is licensed under the Apache 2.0 license.

Cite RefineBox

If you find this work helpful, please cite:

@misc{chen2023enhancing,
      title={Enhancing Your Trained DETRs with Box Refinement}, 
      author={Yiqun Chen and Qiang Chen and Peize Sun and Shoufa Chen and Jingdong Wang and Jian Cheng},
      year={2023},
      eprint={2307.11828},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Acknowledgement

This project is based on detrex and detectron2. We thank the authors for their great work. We also thank Detic for the Criterion with Federated Loss.