SoCo

[NeurIPS 2021 Spotlight] Aligning Pretraining for Detection via Object-Level Contrastive Learning

By Fangyun Wei*, Yue Gao*, Zhirong Wu, Han Hu, Stephen Lin.

* Equal contribution.

Introduction

Image-level contrastive representation learning has proven to be highly effective as a generic model for transfer learning. Such generality for transfer learning, however, sacrifices specificity if we are interested in a certain downstream task. We argue that this could be sub-optimal and thus advocate a design principle which encourages alignment between the self-supervised pretext task and the downstream task. In this paper, we follow this principle with a pretraining method specifically designed for the task of object detection. We attain alignment in the following three aspects:

  1. object-level representations are introduced via selective search bounding boxes as object proposals;
  2. the pretraining network architecture incorporates the same dedicated modules used in the detection pipeline (e.g. FPN);
  3. the pretraining is equipped with object detection properties such as object-level translation invariance and scale invariance.

Our method, called Selective Object COntrastive learning (SoCo), achieves state-of-the-art results for transfer performance on COCO detection using a Mask R-CNN framework.
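As a rough illustration of the object-level view, the sketch below maps one proposal box from image coordinates into two differently cropped and resized views, so the same object can be compared at different positions and scales. It is a minimal sketch under stated assumptions: the helper `map_box_to_view`, the `min_visible` threshold, and the crop-then-resize augmentation are hypothetical and are not taken from the SoCo code, whose actual augmentation pipeline may differ.

```python
from typing import Optional, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def map_box_to_view(
    box: Box,
    crop: Box,
    out_size: Tuple[int, int],
    min_visible: float = 0.5,  # assumed threshold, for illustration only
) -> Optional[Box]:
    """Map a proposal box into a view made by cropping `crop` and resizing to `out_size`.

    Returns None when the box is degenerate or when less than `min_visible`
    of its area lies inside the crop.
    """
    bx1, by1, bx2, by2 = box
    cx1, cy1, cx2, cy2 = crop
    box_area = (bx2 - bx1) * (by2 - by1)
    if box_area <= 0:
        return None

    # Clip the box to the crop window.
    ix1, iy1 = max(bx1, cx1), max(by1, cy1)
    ix2, iy2 = min(bx2, cx2), min(by2, cy2)
    if ix2 <= ix1 or iy2 <= iy1:
        return None
    if (ix2 - ix1) * (iy2 - iy1) / box_area < min_visible:
        return None

    # Shift into crop coordinates, then scale to the output resolution.
    out_w, out_h = out_size
    sx = out_w / (cx2 - cx1)
    sy = out_h / (cy2 - cy1)
    return ((ix1 - cx1) * sx, (iy1 - cy1) * sy, (ix2 - cx1) * sx, (iy2 - cy1) * sy)


proposal = (100.0, 100.0, 150.0, 150.0)
view_a = map_box_to_view(proposal, (50.0, 50.0, 162.0, 162.0), (224, 224))  # zoomed in
view_b = map_box_to_view(proposal, (0.0, 0.0, 448.0, 448.0), (224, 224))    # zoomed out
```

Here the same proposal covers a 100×100 region in `view_a` and a 25×25 region in `view_b`; features pooled from those two regions would form a positive pair in an object-level contrastive setup.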

Architecture

Main results

The pretrained and finetuned models, along with their logs, are available on Google Drive and Baidu Pan (code: 4662).

The paths listed below are relative to the shared folder.

SoCo pre-trained models

| Model | Arch | Epochs | Scripts | Pretrained Model (relative path) |
|---|---|---|---|---|
| SoCo | ResNet50-C4 | 100 | SoCo_C4_100ep | log (pretrain/SoCo_C4_100ep/log.txt) <br/> raw model (pretrain/SoCo_C4_100ep/ckpt_epoch_100.pth) <br/> converted d2 model (pretrain/SoCo_C4_100ep/current_detectron2_C4.pkl) |
| SoCo | ResNet50-C4 | 400 | SoCo_C4_400ep | log (pretrain/SoCo_C4_400ep/log.txt) <br/> raw model (pretrain/SoCo_C4_400ep/ckpt_epoch_400.pth) <br/> converted d2 model (pretrain/SoCo_C4_400ep/current_detectron2_C4.pkl) |
| SoCo | ResNet50-FPN | 100 | SoCo_FPN_100ep | log (pretrain/SoCo_FPN_100ep/log.txt) <br/> raw model (pretrain/SoCo_FPN_100ep/ckpt_epoch_100.pth) <br/> converted d2 model (pretrain/SoCo_FPN_100ep/current_detectron2_Head.pkl) |
| SoCo | ResNet50-FPN | 400 | SoCo_FPN_400ep | log (pretrain/SoCo_FPN_400ep/log.txt) <br/> raw model (pretrain/SoCo_FPN_400ep/ckpt_epoch_400.pth) <br/> converted d2 model (pretrain/SoCo_FPN_400ep/current_detectron2_Head.pkl) |
| SoCo* | ResNet50-FPN | 400 | SoCo_FPN_Star_400ep | log (pretrain/SoCo_FPN_Star_400ep/log.txt) <br/> raw model (pretrain/SoCo_FPN_Star_400ep/ckpt_epoch_400.pth) <br/> converted d2 model (pretrain/SoCo_FPN_Star_400ep/current_detectron2_Head.pkl) |

Results on LVIS with MaskRCNN R50-FPN

| Methods | Epoch | AP<sup>bb</sup> | AP<sup>bb</sup><sub>50</sub> | AP<sup>bb</sup><sub>75</sub> | AP<sup>mk</sup> | AP<sup>mk</sup><sub>50</sub> | AP<sup>mk</sup><sub>75</sub> | config | Detectron2 trained (relative path) |
|---|---|---|---|---|---|---|---|---|---|
| Supervised | 90 | 20.4 | 32.9 | 21.7 | 19.4 | 30.6 | 20.5 | -- | -- |
| SoCo* | 400 | 26.3 | 41.2 | 27.8 | 25.0 | 38.5 | 26.8 | config | log (finetune/mask_rcnn_lvis_SoCo_FPN_Star_400ep_1x/log.txt) <br/> model (finetune/mask_rcnn_lvis_SoCo_FPN_Star_400ep_1x/model_final.pth) |

Results on COCO with MaskRCNN R50-FPN

| Methods | Epoch | AP<sup>bb</sup> | AP<sup>bb</sup><sub>50</sub> | AP<sup>bb</sup><sub>75</sub> | AP<sup>mk</sup> | AP<sup>mk</sup><sub>50</sub> | AP<sup>mk</sup><sub>75</sub> | config | Detectron2 trained (relative path) |
|---|---|---|---|---|---|---|---|---|---|
| Scratch | - | 31.0 | 49.5 | 33.2 | 28.5 | 46.8 | 30.4 | -- | -- |
| Supervised | 90 | 38.9 | 59.6 | 42.7 | 35.4 | 56.5 | 38.1 | -- | -- |
| SoCo | 100 | 42.3 | 62.5 | 46.5 | 37.6 | 59.1 | 40.5 | config | log (finetune/mask_rcnn_coco_SoCo_FPN_100ep_1x/log.txt) <br/> model (finetune/mask_rcnn_coco_SoCo_FPN_100ep_1x/model_final.pth) |
| SoCo | 400 | 43.0 | 63.3 | 47.1 | 38.2 | 60.2 | 41.0 | config | log (finetune/mask_rcnn_coco_SoCo_FPN_400ep_1x/log.txt) <br/> model (finetune/mask_rcnn_coco_SoCo_FPN_400ep_1x/model_final.pth) |
| SoCo* | 400 | 43.2 | 63.5 | 47.4 | 38.4 | 60.2 | 41.4 | config | log (finetune/mask_rcnn_coco_SoCo_FPN_Star_400ep_1x/log.txt) <br/> model (finetune/mask_rcnn_coco_SoCo_FPN_Star_400ep_1x/model_final.pth) |

Results on COCO with MaskRCNN R50-C4

| Methods | Epoch | AP<sup>bb</sup> | AP<sup>bb</sup><sub>50</sub> | AP<sup>bb</sup><sub>75</sub> | AP<sup>mk</sup> | AP<sup>mk</sup><sub>50</sub> | AP<sup>mk</sup><sub>75</sub> | config | Detectron2 trained (relative path) |
|---|---|---|---|---|---|---|---|---|---|
| Scratch | - | 26.4 | 44.0 | 27.8 | 29.3 | 46.9 | 30.8 | -- | -- |
| Supervised | 90 | 38.2 | 58.2 | 41.2 | 33.3 | 54.7 | 35.2 | -- | -- |
| SoCo | 100 | 40.4 | 60.4 | 43.7 | 34.9 | 56.8 | 37.0 | config | log (finetune/mask_rcnn_coco_SoCo_C4_100ep_1x/log.txt) <br/> model (finetune/mask_rcnn_coco_SoCo_C4_100ep_1x/model_final.pth) |
| SoCo | 400 | 40.9 | 60.9 | 44.3 | 35.3 | 57.5 | 37.3 | config | log (finetune/mask_rcnn_coco_SoCo_C4_400ep_1x/log.txt) <br/> model (finetune/mask_rcnn_coco_SoCo_C4_400ep_1x/model_final.pth) |

Get started

Requirements

A Dockerfile is included; refer to it for the required environment.

Prepare data with Selective Search

  1. Generate Selective Search proposals
    python selective_search/generate_imagenet_ss_proposals.py
    
  2. Filter out invalid proposals using the filter strategy
    python selective_search/filter_ss_proposals_json.py
    
  3. Post-process images that have no proposals
    python selective_search/filter_ss_proposals_json_post_no_prop.py
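
To give a feel for what a proposal filter in the steps above might do, here is a minimal sketch that keeps boxes whose aspect ratio and relative scale fall within a range. The function `filter_proposals`, the `(x, y, w, h)` box format, and all threshold defaults are assumptions chosen for illustration; the actual criteria live in `selective_search/filter_ss_proposals_json.py` and may differ.

```python
import math
from typing import List, Tuple

XYWH = Tuple[float, float, float, float]  # (x, y, width, height), assumed format


def filter_proposals(
    boxes: List[XYWH],
    image_w: float,
    image_h: float,
    min_aspect: float = 1 / 3,  # illustrative thresholds, not read from the repo
    max_aspect: float = 3.0,
    min_scale: float = 0.3,
    max_scale: float = 1.0,
) -> List[XYWH]:
    """Keep boxes with a reasonable aspect ratio and size relative to the image."""
    image_side = math.sqrt(image_w * image_h)
    kept = []
    for x, y, w, h in boxes:
        if w <= 0 or h <= 0:
            continue  # degenerate box
        aspect = w / h
        scale = math.sqrt(w * h) / image_side
        if min_aspect <= aspect <= max_aspect and min_scale <= scale <= max_scale:
            kept.append((x, y, w, h))
    return kept


candidates = [
    (0, 0, 200, 200),  # kept: square, half the image side
    (0, 0, 20, 20),    # dropped: too small
    (0, 0, 390, 100),  # dropped: too elongated
    (10, 10, 0, 50),   # dropped: zero width
]
kept = filter_proposals(candidates, image_w=400, image_h=400)
```

An image for which `kept` ends up empty is the case the third step handles separately.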
    

Pretrain with SoCo

The following uses SoCo FPN with 100 epochs as an example.

bash ./tools/SoCo_FPN_100ep.sh

Finetune detector

  1. Copy the folder detectron2_configs to the root folder of Detectron2
  2. Train the detectors with Detectron2
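
For the second step, one possible way to launch training is through Detectron2's stock `tools/train_net.py` script, overriding `MODEL.WEIGHTS` with a converted SoCo checkpoint. The sketch below only assembles the command line. The helper `build_train_command`, the placeholder config path, the local weights location, and the GPU count are all assumptions; substitute the config you copied from `detectron2_configs` and the `.pkl` file you downloaded.

```python
import shlex
from typing import List


def build_train_command(config_file: str, weights: str, num_gpus: int = 8) -> List[str]:
    """Assemble an argv list for Detectron2's tools/train_net.py.

    Trailing KEY VALUE pairs override config options; MODEL.WEIGHTS points
    the detector at the converted pretrained backbone.
    """
    return [
        "python", "tools/train_net.py",
        "--config-file", config_file,
        "--num-gpus", str(num_gpus),
        "MODEL.WEIGHTS", weights,
    ]


cmd = build_train_command(
    config_file="path/to/your_config.yaml",    # placeholder, not a real file name
    weights="current_detectron2_Head.pkl",     # converted d2 model, assumed local copy
)
print(shlex.join(cmd))  # run this from the Detectron2 root folder
```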

Citation

@article{wei2021aligning,
  title={Aligning Pretraining for Detection via Object-Level Contrastive Learning},
  author={Wei, Fangyun and Gao, Yue and Wu, Zhirong and Hu, Han and Lin, Stephen},
  journal={arXiv preprint arXiv:2106.02637},
  year={2021}
}