GFR-DSOD
We have also merged the GFR-DSOD code into the DSOD repo at https://github.com/szq0214/DSOD. Check it out there if you are interested in DSOD.
We have also seen some very promising results on the PASCAL VOC Comp3 leaderboard, such as https://github.com/kuangliu/torchcv. However, we found that they still initialize from ImageNet pre-trained models (https://github.com/kuangliu/torchcv/issues/11). Please note that the Comp3 challenge only allows training on the VOC12 dataset, without pre-trained models. Please check your training process carefully.
If you find this helps your research, please consider citing:
@inproceedings{shenimproving,
title={Improving Object Detection from Scratch via Gated Feature Reuse},
author={Shen, Zhiqiang and Shi, Honghui and Yu, Jiahui and Phan, Hai and Feris, Rogerio and Cao, Liangliang and Liu, Ding and Wang, Xinchao and Huang, Thomas and Savvides, Marios},
booktitle={The British Machine Vision Conference (BMVC)},
year={2019}
}
Introduction
In GFR-DSOD, we propose a recurrent feature-pyramid structure that squeezes rich spatial and semantic features into a single prediction layer, which further reduces the number of parameters to learn (DSOD needs to learn 1/2, whereas GFR-DSOD needs only 1/3). Our new model is therefore better suited to learning from scratch and converges faster than DSOD. We also introduce a novel gate-controlled prediction strategy in GFR-DSOD that adaptively enhances or attenuates feature activations at different scales based on the input object size.
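To make the idea of gate-controlled enhancement and attenuation concrete, here is a minimal sketch in plain Python. It is not the GFR-DSOD implementation (which lives in the Caffe-based DSOD repo); it assumes a simplified, squeeze-and-excitation-style gate with one scalar weight and bias per channel. The function name `channel_gate` and the parameter values are hypothetical and chosen only for illustration.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def channel_gate(feature_map, weights, biases):
    """Scale each channel of `feature_map` by a learned gate in (0, 1).

    feature_map: list of channels, each a 2D list of floats.
    weights, biases: hypothetical per-channel gate parameters
    (a real gate would typically use small fully connected layers).
    Returns the gated feature map and the gate value of each channel.
    """
    gated, gates = [], []
    for channel, w, b in zip(feature_map, weights, biases):
        values = [v for row in channel for v in row]
        pooled = sum(values) / len(values)   # global average pooling
        g = sigmoid(w * pooled + b)          # gate close to 1 keeps, close to 0 suppresses
        gates.append(g)
        gated.append([[v * g for v in row] for row in channel])
    return gated, gates


# Toy input: two channels of size 2x2.
features = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[2.0, 2.0], [2.0, 2.0]],
]
gated, gates = channel_gate(features, weights=[4.0, -4.0], biases=[0.0, 0.0])
print(gates)
```

With these illustrative parameters the first channel passes through almost unchanged while the second is strongly attenuated, which is the behaviour a gate would learn to apply per scale.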
<div align=center> <img src="https://user-images.githubusercontent.com/3794909/36568688-8d176d42-17f0-11e8-85d6-054d90ed5bfc.jpg" width="580"> </div>
<div align=center> Figure 1: Illustration of the motivation of GFR-DSOD. </div>
<div align=center> <img src="https://user-images.githubusercontent.com/3794909/36566300-ad4d9c2e-17e8-11e8-9808-a4c3602d21b1.jpg" width="740"> </div>
<div align=center> Figure 2: An overview of GFR-DSOD together with three one-stage detector methods. </div>

Visualization
- Visualizations of network structures (tools from ethereon, please ignore the warning messages):
Results & Models
Our PASCAL VOC LMDB files:
Method | LMDBs |
---|---|
Train on VOC07+12 and test on VOC07 | Download |
Train on VOC07++12 and test on VOC12 (Comp4) | Download |
Train on VOC12 and test on VOC12 (Comp3) | Download |
The tables below show the results on PASCAL VOC 2007, 2012 and 2012 Comp3 (training on VOC 2012 only).
PASCAL VOC test results:
Method | VOC 2007 test mAP | # params | Models |
---|---|---|---|
GFR-DSOD300 (07+12) | 78.5 | 14.1M | Download (56.5M) |
GFR-DSOD320 (07+12) | 78.7 | 14.2M | Download (56.8M) |
GFR-DSOD320* (07+12) | 79.0 | 16.0M | Download (63.9M) |

Method | VOC 2012 test mAP | # params | Models |
---|---|---|---|
GFR-DSOD320* (12) | 72.5 (VOC Comp3) | 16.0M | Download (63.9M) |
GFR-DSOD320 (07++12) | 77.0 | 14.2M | Download (56.8M) |
GFR-DSOD320* (07++12) | -- | -- | -- |
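The parameter counts and download sizes in the tables above are consistent with each other if one assumes the weights are stored as 32-bit floats (4 bytes per parameter) and that the "M" in the download size means megabytes. Both are assumptions, not statements from the authors. A quick sanity check under those assumptions:

```python
BYTES_PER_FLOAT32 = 4  # assumption: weights are stored as 32-bit floats


def approx_model_size_mb(num_params_millions):
    """Approximate weight-file size in MB for a float32 model."""
    return num_params_millions * BYTES_PER_FLOAT32


# (model name, # params in millions, listed download size in MB), from the tables
models = [
    ("GFR-DSOD300", 14.1, 56.5),
    ("GFR-DSOD320", 14.2, 56.8),
    ("GFR-DSOD320*", 16.0, 63.9),
]

estimates = {}
for name, params_m, listed_mb in models:
    estimates[name] = approx_model_size_mb(params_m)
    print(f"{name}: estimated {estimates[name]:.1f} MB, listed {listed_mb} MB")
```

The estimates land within about 0.1 MB of the listed sizes; the small differences are plausibly rounding in the reported parameter counts or file-format overhead.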
Contact
Zhiqiang Shen (zhiqiangshen0214 at gmail.com)
Any comments or suggestions are welcome!