

ASAG: Building Strong One-Decoder-Layer Sparse Detectors via Adaptive Sparse Anchor Generation

This is the official PyTorch implementation of ASAG (ICCV 2023).


1 Introduction

<center> <figure> <img src=".github/ap_fps.png" width="48%" /> <img src=".github/anchor_generator.png" width="48%" /> </figure> </center>

2 Model Zoo

<table>
<thead>
<tr> <th></th> <th>name</th> <th>backbone</th> <th>epoch</th> <th>#queries</th> <th>box AP</th> <th>Where in <a href="http://arxiv.org/abs/2308.09242">Our Paper</a></th> </tr>
</thead>
<tbody>
<tr> <th>1</th> <td>ASAG-A</td> <td>R50</td> <td>12</td> <td>107</td> <td>42.6</td> <td>Table 2</td> </tr>
<tr> <th>2</th> <td>ASAG-A</td> <td>R50</td> <td>12</td> <td>329</td> <td>43.6</td> <td>Table 2</td> </tr>
<tr> <th>3</th> <td>ASAG-A</td> <td>R50</td> <td>36</td> <td>102</td> <td>45.3</td> <td>Table 4</td> </tr>
<tr> <th>4</th> <td>ASAG-A</td> <td>R50</td> <td>36</td> <td>312</td> <td>46.3</td> <td>Table 4</td> </tr>
<tr> <th>5</th> <td>ASAG-A</td> <td>R101</td> <td>36</td> <td>296</td> <td>47.5</td> <td>Table 4</td> </tr>
<tr> <th>6</th> <td>ASAG-S</td> <td>R50</td> <td>36</td> <td>100</td> <td>43.9</td> <td>Table 3 & 4</td> </tr>
<tr> <th>7</th> <td>ASAG-S</td> <td>R50</td> <td>36</td> <td>312</td> <td>45.0</td> <td>Table 3 & 4</td> </tr>
<tr> <th>8</th> <td>ASAG-A-dn</td> <td>R50</td> <td>12</td> <td>106</td> <td>43.1</td> <td>Table A-1</td> </tr>
<tr> <th>9</th> <td>ASAG-A-crosscl</td> <td>R50</td> <td>12</td> <td>103</td> <td>43.8</td> <td></td> </tr>
</tbody>
</table>

3 Data preparation

Download and extract COCO 2017 train and val images with annotations from here.

We expect the directory structure to be the following:

  annotations/  # annotation json files
  train2017/    # train images
  val2017/      # val images
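Before launching a run, it can help to confirm that the dataset root matches the layout above. The following is a minimal sketch, not part of the ASAG code base: the helper name `missing_coco_subdirs` is hypothetical, and it only checks that the three listed sub-directories exist, not their contents.

```python
from pathlib import Path

# Sub-directories listed in the expected layout above.
EXPECTED_SUBDIRS = ("annotations", "train2017", "val2017")


def missing_coco_subdirs(coco_path):
    """Return the expected sub-directories that are absent under coco_path.

    Hypothetical helper for illustration; it does not validate the
    annotation files or the images themselves.
    """
    root = Path(coco_path)
    return [name for name in EXPECTED_SUBDIRS if not (root / name).is_dir()]
```

An empty list means the directory passed as `--coco_path` has all three sub-directories in place.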

4 Usage

<details>
<summary>ASAG-A (1x, R50, 100 queries)</summary>
<p> Training </p>
<code> python -m torch.distributed.launch --nproc_per_node=4 --use_env main.py --coco_path YOUR_COCO_PATH --batch_size 4 --output_dir output --backbone resnet50 --pretrained_checkpoint YOUR_DOWNLOADED_CHECKPOINT </code>
<p> Inference </p>
<code> python -m torch.distributed.launch --nproc_per_node=4 --use_env main.py --coco_path YOUR_COCO_PATH --batch_size 4 --output_dir output --backbone resnet50 --eval --resume ASAG_A_r50_1x_100.pth --used_head aux_2 </code>
</details>
<details>
<summary>ASAG-A (1x, R50, 300 queries)</summary>
<p> Training </p>
<code> python -m torch.distributed.launch --nproc_per_node=4 --use_env main.py --coco_path YOUR_COCO_PATH --batch_size 4 --output_dir output --backbone resnet50 --pretrained_checkpoint YOUR_DOWNLOADED_CHECKPOINT --num_query 300 </code>
<p> Inference </p>
<code> python -m torch.distributed.launch --nproc_per_node=4 --use_env main.py --coco_path YOUR_COCO_PATH --batch_size 4 --output_dir output --backbone resnet50 --eval --resume ASAG_A_r50_1x_300.pth --used_head aux_2 --num_query 300 </code>
</details>
<details>
<summary>ASAG-A (3x, R50, 100 queries)</summary>
<p> Training </p>
<code> python -m torch.distributed.launch --nproc_per_node=4 --use_env main.py --coco_path YOUR_COCO_PATH --batch_size 4 --output_dir output --backbone resnet50 --pretrained_checkpoint YOUR_DOWNLOADED_CHECKPOINT --training_schedule 3x </code>
<p> Inference </p>
<code> python -m torch.distributed.launch --nproc_per_node=4 --use_env main.py --coco_path YOUR_COCO_PATH --batch_size 4 --output_dir output --backbone resnet50 --eval --resume ASAG_A_r50_3x_100.pth --used_head main </code>
</details>
<details>
<summary>ASAG-A (3x, R50, 300 queries)</summary>
<p> Training </p>
<code> python -m torch.distributed.launch --nproc_per_node=4 --use_env main.py --coco_path YOUR_COCO_PATH --batch_size 4 --output_dir output --backbone resnet50 --pretrained_checkpoint YOUR_DOWNLOADED_CHECKPOINT --num_query 300 --training_schedule 3x </code>
<p> Inference </p>
<code> python -m torch.distributed.launch --nproc_per_node=4 --use_env main.py --coco_path YOUR_COCO_PATH --batch_size 4 --output_dir output --backbone resnet50 --eval --resume ASAG_A_r50_3x_300.pth --used_head aux_2 --num_query 300 </code>
</details>
<details>
<summary>ASAG-A (3x, R101, 300 queries)</summary>
<p> Training </p>
<code> python -m torch.distributed.launch --nproc_per_node=4 --use_env main.py --coco_path YOUR_COCO_PATH --batch_size 4 --output_dir output --backbone resnet101 --pretrained_checkpoint YOUR_DOWNLOADED_CHECKPOINT --num_query 300 --training_schedule 3x </code>
<p> Inference </p>
<code> python -m torch.distributed.launch --nproc_per_node=4 --use_env main.py --coco_path YOUR_COCO_PATH --batch_size 4 --output_dir output --backbone resnet101 --eval --resume ASAG_A_r101_3x_300.pth --used_head aux_2 --num_query 300 </code>
</details>
<details>
<summary>ASAG-S (3x, R50, 100 queries)</summary>
<p> Training </p>
<code> python -m torch.distributed.launch --nproc_per_node=4 --use_env main.py --coco_path YOUR_COCO_PATH --batch_size 4 --output_dir output --backbone resnet50 --pretrained_checkpoint YOUR_DOWNLOADED_CHECKPOINT --training_schedule 3x --decoder_type SparseRCNN </code>
<p> Inference </p>
<code> python -m torch.distributed.launch --nproc_per_node=4 --use_env main.py --coco_path YOUR_COCO_PATH --batch_size 4 --output_dir output --backbone resnet50 --eval --decoder_type SparseRCNN --resume ASAG_S_r50_3x_100.pth --used_head aux_2 </code>
</details>
<details>
<summary>ASAG-S (3x, R50, 300 queries)</summary>
<p> Training </p>
<code> python -m torch.distributed.launch --nproc_per_node=4 --use_env main.py --coco_path YOUR_COCO_PATH --batch_size 4 --output_dir output --backbone resnet50 --pretrained_checkpoint YOUR_DOWNLOADED_CHECKPOINT --num_query 300 --training_schedule 3x --decoder_type SparseRCNN </code>
<p> Inference </p>
<code> python -m torch.distributed.launch --nproc_per_node=4 --use_env main.py --coco_path YOUR_COCO_PATH --batch_size 4 --output_dir output --backbone resnet50 --eval --resume ASAG_S_r50_3x_300.pth --used_head aux_2 --num_query 300 --decoder_type SparseRCNN </code>
</details>
<details>
<summary>ASAG-A+dn (1x, R50, 100 queries)</summary>
<p> Training </p>
<code> python -m torch.distributed.launch --nproc_per_node=4 --use_env main.py --coco_path YOUR_COCO_PATH --batch_size 4 --output_dir output --backbone resnet50 --pretrained_checkpoint YOUR_DOWNLOADED_CHECKPOINT --use_dn --fix_noise_scale </code>
<p> Inference </p>
<code> python -m torch.distributed.launch --nproc_per_node=4 --use_env main.py --coco_path YOUR_COCO_PATH --batch_size 4 --output_dir output --backbone resnet50 --eval --resume ASAG_A_r50_1x_100_dn.pth --used_head aux_2 </code>
</details>
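
The commands above share a common prefix and differ only in a handful of flags (`--backbone`, `--num_query`, `--training_schedule`, `--decoder_type`, and the checkpoint arguments). As a rough sketch, the variants could be assembled programmatically as shown below. The function `build_command` and its parameter names are hypothetical and not part of the repository; only the flags themselves are taken from the commands listed above, and the denoising flags (`--use_dn`, `--fix_noise_scale`) are left out for brevity.

```python
import shlex


def build_command(coco_path, backbone="resnet50", num_query=None,
                  schedule=None, decoder_type=None, checkpoint=None,
                  resume=None, used_head=None, nproc=4, batch_size=4,
                  output_dir="output"):
    """Assemble a main.py launch command as a list of arguments.

    Hypothetical helper: passing `resume` yields an inference command,
    otherwise a training command is produced.
    """
    cmd = ["python", "-m", "torch.distributed.launch",
           f"--nproc_per_node={nproc}", "--use_env", "main.py",
           "--coco_path", coco_path, "--batch_size", str(batch_size),
           "--output_dir", output_dir, "--backbone", backbone]
    if resume is not None:
        # Inference: evaluate a trained checkpoint with the chosen head.
        cmd += ["--eval", "--resume", resume]
        if used_head is not None:
            cmd += ["--used_head", used_head]
    elif checkpoint is not None:
        # Training: start from the downloaded pretrained checkpoint.
        cmd += ["--pretrained_checkpoint", checkpoint]
    if num_query is not None:
        cmd += ["--num_query", str(num_query)]
    if schedule is not None and resume is None:
        # The listed inference commands do not pass a training schedule.
        cmd += ["--training_schedule", schedule]
    if decoder_type is not None:
        cmd += ["--decoder_type", decoder_type]
    return cmd
```

For example, `shlex.join(build_command("YOUR_COCO_PATH", num_query=300, schedule="3x", checkpoint="YOUR_DOWNLOADED_CHECKPOINT"))` reproduces the ASAG-A (3x, R50, 300 queries) training command, and the returned list can be passed directly to `subprocess.run`.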

5 Efficient inference

6 CrowdHuman Results

<table>
<thead>
<tr> <th></th> <th>name</th> <th>AP(↑)</th> <th>mMR(↓)</th> <th>R(↑)</th> <th>Where in <a href="http://arxiv.org/abs/2308.09242">Our Paper</a></th> </tr>
</thead>
<tbody>
<tr> <th>1</th> <td>Deformable DETR</td> <td>86.7</td> <td>54.0</td> <td>92.5</td> <td>Table 6</td> </tr>
<tr> <th>2</th> <td>Sparse RCNN</td> <td>89.2</td> <td>48.3</td> <td>95.9</td> <td>Table 6</td> </tr>
<tr> <th>3</th> <td>ASAG-S</td> <td>91.3</td> <td>43.5</td> <td>96.9</td> <td>Table 6</td> </tr>
</tbody>
</table>

7 Equipping with stronger backbone

<table> <thead> <tr> <th></th> <th>backbone</th> <th>AP</th> <th>APs</th> <th>APm</th> <th>APl</th> </tr> </thead> <tbody> <tr> <th>1</th> <td>torchvision R50</td> <td>42.6</td> <td>25.9</td> <td>45.8</td> <td>56.9</td> </tr> <tr> <th>2</th> <td>CrossCL R50</td> <td>43.8</td> <td>26.1</td> <td>47.4</td> <td>59.3</td> </tr> </tbody> </table>

8 License

ASAG is released under the Apache 2.0 license. Please see the LICENSE file for more information.

9 Bibtex

If you find our work helpful for your research, please consider citing the following BibTeX entries.

  title={ASAG: Building Strong One-Decoder-Layer Sparse Detectors via Adaptive Sparse Anchor Generation},
  author={Fu, Shenghao and Yan, Junkai and Gao, Yipeng and Xie, Xiaohua and Zheng, Wei-Shi},

  title={Self-supervised Cross-stage Regional Contrastive Learning for Object Detection},
  author={Yan, Junkai and Yang, Lingxiao and Gao, Yipeng and Zheng, Wei-Shi},

10 Acknowledgement

Our ASAG is heavily inspired by many outstanding prior works, including

We thank the authors of the above projects for open-sourcing their implementations!