deep learning object detection

A list of papers on object detection using deep learning. I wrote this page with reference to this survey paper and a lot of searching.

Last updated: 2020/09/22

Update log

2018/9/18 - update all recent papers and add a diagram of the history of object detection using deep learning.
2018/9/26 - update code links of papers (official and unofficial).
2018/october - update 5 papers and performance table.
2018/november - update 9 papers.
2018/december - update 8 papers and performance table and add new diagram (2019 version!!).
2019/january - update 4 papers and add commonly used datasets.
2019/february - update 3 papers.
2019/march - update figure and code links.
2019/april - remove author's names and update ICLR 2019 & CVPR 2019 papers.
2019/may - update CVPR 2019 papers.
2019/june - update CVPR 2019 papers and dataset paper.
2019/july - update BMVC 2019 papers and some of ICCV 2019 papers.
2019/september - update NeurIPS 2019 papers and ICCV 2019 papers.
2019/november - update some of AAAI 2020 papers and other papers.
2020/january - update ICLR 2020 papers and other papers.
2020/may - update CVPR 2020 papers and other papers.
2020/june - update arxiv papers.
2020/august - update paper links.

Table of Contents

Paper list from 2014 to now(2019)

Papers highlighted in red are the ones I consider "must-read". This is only my personal opinion, and the other papers are important too, so I recommend reading them if you have time.

<p align="center"> <img width="1000" src="/assets/deep_learning_object_detection_history.PNG" alt="Example of object detection."> </p>

Performance table

FPS (speed) depends on the hardware (e.g. CPU, GPU, RAM), so it is hard to compare models on equal terms. A fair comparison would require measuring every model on hardware with equivalent specifications, which is very difficult and time consuming.
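To make the hardware dependence concrete, here is a minimal sketch of how FPS is typically measured. `measure_fps` and `dummy_detector` are hypothetical names used only for illustration; a real measurement would call an actual detector's forward pass instead of the placeholder workload.

```python
import time


def measure_fps(infer, inputs, warmup=5):
    """Return the average frames per second of `infer` over `inputs`.

    `infer` is any callable that takes one input (a stand-in for a detector).
    """
    # Warm-up runs are excluded so one-time setup cost does not skew the result.
    for x in inputs[:warmup]:
        infer(x)
    start = time.perf_counter()
    for x in inputs:
        infer(x)
    # Guard against a zero reading on very coarse timers.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return len(inputs) / elapsed


def dummy_detector(image):
    # Placeholder workload; a real detector would run inference here.
    return sum(image)


frames = [list(range(1000)) for _ in range(50)]
fps = measure_fps(dummy_detector, frames)
print(f"{fps:.1f} FPS on this machine")
```

Running the same script on two different machines generally prints two different numbers, which is why FPS values reported in different papers are not directly comparable.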

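| Detector | VOC07 (mAP@IoU=0.5) | VOC12 (mAP@IoU=0.5) | COCO (mAP@IoU=0.5:0.95) | Published In |
|---|---|---|---|---|
| R-CNN | 58.5 | - | - | CVPR'14 |
| SPP-Net | 59.2 | - | - | ECCV'14 |
| MR-CNN | 78.2 (07+12) | 73.9 (07+12) | - | ICCV'15 |
| Fast R-CNN | 70.0 (07+12) | 68.4 (07++12) | 19.7 | ICCV'15 |
| Faster R-CNN | 73.2 (07+12) | 70.4 (07++12) | 21.9 | NIPS'15 |
| YOLO v1 | 66.4 (07+12) | 57.9 (07++12) | - | CVPR'16 |
| G-CNN | 66.8 | 66.4 (07+12) | - | CVPR'16 |
| AZNet | 70.4 | - | 22.3 | CVPR'16 |
| ION | 80.1 | 77.9 | 33.1 | CVPR'16 |
| HyperNet | 76.3 (07+12) | 71.4 (07++12) | - | CVPR'16 |
| OHEM | 78.9 (07+12) | 76.3 (07++12) | 22.4 | CVPR'16 |
| MPN | - | - | 33.2 | BMVC'16 |
| SSD | 76.8 (07+12) | 74.9 (07++12) | 31.2 | ECCV'16 |
| GBDNet | 77.2 (07+12) | - | 27.0 | ECCV'16 |
| CPF | 76.4 (07+12) | 72.6 (07++12) | - | ECCV'16 |
| R-FCN | 79.5 (07+12) | 77.6 (07++12) | 29.9 | NIPS'16 |
| DeepID-Net | 69.0 | - | - | PAMI'16 |
| NoC | 71.6 (07+12) | 68.8 (07+12) | 27.2 | TPAMI'16 |
| DSSD | 81.5 (07+12) | 80.0 (07++12) | 33.2 | arXiv'17 |
| TDM | - | - | 37.3 | CVPR'17 |
| FPN | - | - | 36.2 | CVPR'17 |
| YOLO v2 | 78.6 (07+12) | 73.4 (07++12) | - | CVPR'17 |
| RON | 77.6 (07+12) | 75.4 (07++12) | 27.4 | CVPR'17 |
| DeNet | 77.1 (07+12) | 73.9 (07++12) | 33.8 | ICCV'17 |
| CoupleNet | 82.7 (07+12) | 80.4 (07++12) | 34.4 | ICCV'17 |
| RetinaNet | - | - | 39.1 | ICCV'17 |
| DSOD | 77.7 (07+12) | 76.3 (07++12) | - | ICCV'17 |
| SMN | 70.0 | - | - | ICCV'17 |
| Light-Head R-CNN | - | - | 41.5 | arXiv'17 |
| YOLO v3 | - | - | 33.0 | arXiv'18 |
| SIN | 76.0 (07+12) | 73.1 (07++12) | 23.2 | CVPR'18 |
| STDN | 80.9 (07+12) | - | - | CVPR'18 |
| RefineDet | 83.8 (07+12) | 83.5 (07++12) | 41.8 | CVPR'18 |
| SNIP | - | - | 45.7 | CVPR'18 |
| Relation-Network | - | - | 32.5 | CVPR'18 |
| Cascade R-CNN | - | - | 42.8 | CVPR'18 |
| MLKP | 80.6 (07+12) | 77.2 (07++12) | 28.6 | CVPR'18 |
| Fitness-NMS | - | - | 41.8 | CVPR'18 |
| RFBNet | 82.2 (07+12) | - | - | ECCV'18 |
| CornerNet | - | - | 42.1 | ECCV'18 |
| PFPNet | 84.1 (07+12) | 83.7 (07++12) | 39.4 | ECCV'18 |
| Pelee | 70.9 (07+12) | - | - | NIPS'18 |
| HKRM | 78.8 (07+12) | - | 37.8 | NIPS'18 |
| M2Det | - | - | 44.2 | AAAI'19 |
| R-DAD | 81.2 (07++12) | 82.0 (07++12) | 43.1 | AAAI'19 |
| ScratchDet | 84.1 (07++12) | 83.6 (07++12) | 39.1 | CVPR'19 |
| Libra R-CNN | - | - | 43.0 | CVPR'19 |
| Reasoning-RCNN | 82.5 (07++12) | - | 43.2 | CVPR'19 |
| FSAF | - | - | 44.6 | CVPR'19 |
| AmoebaNet + NAS-FPN | - | - | 47.0 | CVPR'19 |
| Cascade-RetinaNet | - | - | 41.1 | CVPR'19 |
| HTC | - | - | 47.2 | CVPR'19 |
| TridentNet | - | - | 48.4 | ICCV'19 |
| DAFS | 85.3 (07+12) | 83.1 (07++12) | 40.5 | ICCV'19 |
| Auto-FPN | 81.8 (07++12) | - | 40.5 | ICCV'19 |
| FCOS | - | - | 44.7 | ICCV'19 |
| FreeAnchor | - | - | 44.8 | NeurIPS'19 |
| DetNAS | 81.5 (07++12) | - | 42.0 | NeurIPS'19 |
| NATS | - | - | 42.0 | NeurIPS'19 |
| AmoebaNet + NAS-FPN + AA | - | - | 50.7 | arXiv'19 |
| SpineNet | - | - | 52.1 | arXiv'19 |
| CBNet | - | - | 53.3 | AAAI'20 |
| EfficientDet | - | - | 52.6 | CVPR'20 |
| DetectoRS | - | - | 54.7 | arXiv'20 |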
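
The column headers refer to IoU (intersection over union) thresholds: `IoU=0.5` counts a detection as correct when its box overlaps the ground truth with IoU of at least 0.5, while `IoU=0.5:0.95` averages over ten thresholds from 0.50 to 0.95 in steps of 0.05. The sketch below only illustrates the overlap criterion, not a full mAP computation; the `(x1, y1, x2, y2)` box format and the example boxes are assumptions chosen for illustration.

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Width or height is clamped to zero when the boxes do not overlap.
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# The ten thresholds behind "IoU=0.5:0.95": 0.50, 0.55, ..., 0.95.
COCO_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]

gt = (0, 0, 10, 10)    # hypothetical ground-truth box
pred = (0, 0, 9, 8)    # hypothetical predicted box, fully inside gt
score = iou(gt, pred)  # 72 / 100 = 0.72
passed = [t for t in COCO_THRESHOLDS if score >= t]
print(score, passed)
```

Here the prediction counts as a match at the VOC-style threshold of 0.5 but fails at the stricter thresholds from 0.75 upward, which is why COCO numbers in the table are much lower than VOC numbers for the same detector.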

2014

2015

2016

2017

2018

2019

2020

Dataset Papers

Statistics of commonly used object detection datasets. The table is taken from this survey paper.

<table>
<thead>
<tr>
<th rowspan=2>Challenge</th>
<th rowspan=2 width=80>Object Classes</th>
<th colspan=3>Number of Images</th>
<th colspan=2>Number of Annotated Objects</th>
</tr>
<tr>
<th>Train</th> <th>Val</th> <th>Test</th> <th>Train</th> <th>Val</th>
</tr>
</thead>
<tbody>
<!-- PASCAL VOC Object Detection Challenge -->
<tr><th colspan=7>PASCAL VOC Object Detection Challenge</th></tr>
<tr><td> VOC07 </td><td> 20 </td><td> 2,501 </td><td> 2,510 </td><td> 4,952 </td><td> 6,301 (7,844) </td><td> 6,307 (7,818) </td></tr>
<tr><td> VOC08 </td><td> 20 </td><td> 2,111 </td><td> 2,221 </td><td> 4,133 </td><td> 5,082 (6,337) </td><td> 5,281 (6,347) </td></tr>
<tr><td> VOC09 </td><td> 20 </td><td> 3,473 </td><td> 3,581 </td><td> 6,650 </td><td> 8,505 (9,760) </td><td> 8,713 (9,779) </td></tr>
<tr><td> VOC10 </td><td> 20 </td><td> 4,998 </td><td> 5,105 </td><td> 9,637 </td><td> 11,577 (13,339) </td><td> 11,797 (13,352) </td></tr>
<tr><td> VOC11 </td><td> 20 </td><td> 5,717 </td><td> 5,823 </td><td> 10,994 </td><td> 13,609 (15,774) </td><td> 13,841 (15,787) </td></tr>
<tr><td> VOC12 </td><td> 20 </td><td> 5,717 </td><td> 5,823 </td><td> 10,991 </td><td> 13,609 (15,774) </td><td> 13,841 (15,787) </td></tr>
<!-- ILSVRC Object Detection Challenge -->
<tr><th colspan=7>ILSVRC Object Detection Challenge</th></tr>
<tr><td> ILSVRC13 </td><td> 200 </td><td> 395,909 </td><td> 20,121 </td><td> 40,152 </td><td> 345,854 </td><td> 55,502 </td></tr>
<tr><td> ILSVRC14 </td><td> 200 </td><td> 456,567 </td><td> 20,121 </td><td> 40,152 </td><td> 478,807 </td><td> 55,502 </td></tr>
<tr><td> ILSVRC15 </td><td> 200 </td><td> 456,567 </td><td> 20,121 </td><td> 51,294 </td><td> 478,807 </td><td> 55,502 </td></tr>
<tr><td> ILSVRC16 </td><td> 200 </td><td> 456,567 </td><td> 20,121 </td><td> 60,000 </td><td> 478,807 </td><td> 55,502 </td></tr>
<tr><td> ILSVRC17 </td><td> 200 </td><td> 456,567 </td><td> 20,121 </td><td> 65,500 </td><td> 478,807 </td><td> 55,502 </td></tr>
<!-- MS COCO Object Detection Challenge -->
<tr><th colspan=7>MS COCO Object Detection Challenge</th></tr>
<tr><td> MS COCO15 </td><td> 80 </td><td> 82,783 </td><td> 40,504 </td><td> 81,434 </td><td> 604,907 </td><td> 291,875 </td></tr>
<tr><td> MS COCO16 </td><td> 80 </td><td> 82,783 </td><td> 40,504 </td><td> 81,434 </td><td> 604,907 </td><td> 291,875 </td></tr>
<tr><td> MS COCO17 </td><td> 80 </td><td> 118,287 </td><td> 5,000 </td><td> 40,670 </td><td> 860,001 </td><td> 36,781 </td></tr>
<tr><td> MS COCO18 </td><td> 80 </td><td> 118,287 </td><td> 5,000 </td><td> 40,670 </td><td> 860,001 </td><td> 36,781 </td></tr>
<!-- Open Images Object Detection Challenge -->
<tr><th colspan=7>Open Images Object Detection Challenge</th></tr>
<tr><td> OID18 </td><td> 500 </td><td> 1,743,042 </td><td> 41,620 </td><td> 125,436 </td><td> 12,195,144 </td><td> ― </td></tr>
</tbody>
</table>

The papers related to datasets used mainly in Object Detection are as follows.

Contact & Feedback

If you have any suggestions about papers, feel free to mail me :)