Feature Pyramid Network on caffe

This is an unofficial Caffe implementation of Feature Pyramid Networks for Object Detection (https://arxiv.org/abs/1612.03144).

results

The FPN (ResNet-50) end-to-end results below were obtained without OHEM, training on PASCAL VOC 2007 + 2012 and testing on VOC 2007.

merged rcnn

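| mAP@0.5 | aeroplane | bicycle | bird | boat | bottle | bus | car | cat | chair | cow |
|---|---|---|---|---|---|---|---|---|---|---|
| 0.788 | 0.8079 | 0.8036 | 0.8010 | 0.7293 | 0.6743 | 0.8680 | 0.8766 | 0.8967 | 0.6122 | 0.8646 |

| diningtable | dog | horse | motorbike | person | pottedplant | sheep | sofa | train | tv |
|---|---|---|---|---|---|---|---|---|---|
| 0.7330 | 0.8855 | 0.8760 | 0.8063 | 0.7999 | 0.5138 | 0.7905 | 0.7755 | 0.8637 | 0.7736 |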

shared rcnn

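| mAP@0.5 | aeroplane | bicycle | bird | boat | bottle | bus | car | cat | chair | cow |
|---|---|---|---|---|---|---|---|---|---|---|
| 0.7833 | 0.8585 | 0.8001 | 0.7970 | 0.7174 | 0.6522 | 0.8668 | 0.8768 | 0.8929 | 0.5842 | 0.8658 |

| diningtable | dog | horse | motorbike | person | pottedplant | sheep | sofa | train | tv |
|---|---|---|---|---|---|---|---|---|---|
| 0.7022 | 0.8891 | 0.8680 | 0.7991 | 0.7944 | 0.5065 | 0.7896 | 0.7707 | 0.8697 | 0.7653 |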
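
As a quick sanity check on the two result tables above, the reported mAP@0.5 should equal the unweighted mean of the 20 per-class AP values. The sketch below recomputes it; the list and function names are illustrative and not part of this repository.

```python
# Per-class AP values copied from the result tables above (VOC 2007 test, IoU 0.5).
CLASSES = [
    "aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat",
    "chair", "cow", "diningtable", "dog", "horse", "motorbike", "person",
    "pottedplant", "sheep", "sofa", "train", "tv",
]
MERGED_AP = [
    0.8079, 0.8036, 0.8010, 0.7293, 0.6743, 0.8680, 0.8766, 0.8967, 0.6122, 0.8646,
    0.7330, 0.8855, 0.8760, 0.8063, 0.7999, 0.5138, 0.7905, 0.7755, 0.8637, 0.7736,
]
SHARED_AP = [
    0.8585, 0.8001, 0.7970, 0.7174, 0.6522, 0.8668, 0.8768, 0.8929, 0.5842, 0.8658,
    0.7022, 0.8891, 0.8680, 0.7991, 0.7944, 0.5065, 0.7896, 0.7707, 0.8697, 0.7653,
]


def mean_ap(per_class_ap):
    """Unweighted mean over classes (hypothetical helper name)."""
    return sum(per_class_ap) / len(per_class_ap)


merged_map = mean_ap(MERGED_AP)
shared_map = mean_ap(SHARED_AP)
print("merged rcnn mAP@0.5: %.4f" % merged_map)
print("shared rcnn mAP@0.5: %.4f" % shared_map)
```

Both recomputed means agree with the reported values after rounding.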

framework

merged rcnn framework

Network overview: link

shared rcnn

Network overview: link

The parts drawn in red and yellow are shared parameters.

about the anchor size setting

In the paper, the anchor setting is ratios: [0.5, 1, 2], scales: [8].

With this setting and levels P2~P6, the anchor sizes are [32, 64, 128, 256, 512], but this setting suits the COCO dataset, which contains many small objects.

Objects in the VOC dataset, however, fall in the size range [128, 256, 512].

So we design the anchor setting as ratios: [0.5, 1, 2], scales: [8, 16]. This is very important for the VOC dataset.
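
To make the effect of the scale setting concrete, the sketch below computes the anchor side lengths per pyramid level as stride × scale. It assumes feature strides of 4, 8, 16, 32, 64 for P2~P6, as in the FPN paper; the function names are illustrative and this is not the anchor generator used in this repository (which may round differently).

```python
# Feature strides of P2..P6 relative to the input image.
# Assumption: values taken from the FPN paper, not read from this repo's config.
FEATURE_STRIDES = {"P2": 4, "P3": 8, "P4": 16, "P5": 32, "P6": 64}


def anchor_sizes(scales, strides=FEATURE_STRIDES):
    """Return the square-anchor side lengths (stride * scale) for each level."""
    return {level: [stride * s for s in scales] for level, stride in strides.items()}


def anchor_shapes(size, ratios=(0.5, 1, 2)):
    """Return (width, height) per aspect ratio (height / width), keeping area == size**2."""
    shapes = []
    for ratio in ratios:
        width = size / ratio ** 0.5
        height = size * ratio ** 0.5
        shapes.append((width, height))
    return shapes


paper_setting = anchor_sizes([8])      # scales from the paper
voc_setting = anchor_sizes([8, 16])    # scales used here for VOC

for level in FEATURE_STRIDES:
    print(level, paper_setting[level], voc_setting[level])
```

Under these assumptions, scales [8, 16] add a second, twice-as-large anchor at every level, so more anchors land in the 128 to 512 range where VOC objects concentrate.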

usage

Download the VOC07 and VOC12 datasets and ResNet50.caffemodel, then rename the model to ResNet50.v2.caffemodel.

cp ResNet50.v2.caffemodel data/pretrained_model/

In my experiments, the code requires ~10G of GPU memory for training and ~6G for testing. You can choose a suitable image size, minibatch size, and rcnn batch size for your GPUs.
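
When picking an image size for a smaller GPU, it helps to see what resolution the network actually receives. The sketch below follows the resizing rule commonly used in Faster R-CNN style pipelines (scale the short side to a target, then cap the long side). It is an assumption that this repository resizes the same way, and the function name and the example values 600/1000 are illustrative only.

```python
def resized_shape(height, width, target_short_side, max_long_side):
    """Scale so the short side hits the target, unless the long side would exceed the cap."""
    scale = target_short_side / min(height, width)
    if round(scale * max(height, width)) > max_long_side:
        scale = max_long_side / max(height, width)
    return round(height * scale), round(width * scale)


# A typical 375 x 500 VOC image with short side 600 and long side capped at 1000.
h, w = resized_shape(375, 500, 600, 1000)
print(h, w, "pixels:", h * w)

# Halving the target sizes cuts the pixel count to about a quarter.
h_small, w_small = resized_shape(375, 500, 300, 500)
print(h_small, w_small, "pixels:", h_small * w_small)
```

Activation memory grows roughly with the number of input pixels, so lowering the target sizes is usually the first knob to try before reducing the minibatch or rcnn batch size.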

compile caffe & lib

cd caffe-fpn
mkdir build
cd build
cmake ..
make -j16 all
cd lib
make 

train & test

shared rcnn

./experiments/scripts/FP_Net_end2end.sh 1 FPN pascal_voc
./test.sh 1 FPN pascal_voc

merged rcnn

 ./experiments/scripts/FP_Net_end2end_merge_rcnn.sh 0 FPN pascal_voc
 ./test_mergercnn.sh 0 FPN pascal_voc

The first argument (0 or 1 in the commands above) is the GPU id.

TODO List

feature pyramid networks for object detection

Lin, T. Y., Dollár, P., Girshick, R., He, K., Hariharan, B., & Belongie, S. (2016). Feature pyramid networks for object detection. arXiv preprint arXiv:1612.03144.