
An implementation of DetNet: A Backbone network for Object Detection. Because time was limited, I initially trained and tested only on the PASCAL VOC dataset. In these experiments, detnet59 scores slightly higher mAP than FPN101 (see the benchmarks below).

Introduction

First, I spent about one week training detnet59 on the ImageNet dataset. The classification performance of detnet59 is slightly better than that of the original resnet50. I then used the pretrained detnet59 to train and test on PASCAL VOC.

This code is based on FPN_Pytorch, with the FPN101 backbone replaced by detnet59.

Update 2019/01/01

Fixed bugs in demo.py, so you can now run it. Note that the default demo.py supports only the pascal_voc categories. To adapt it to your own dataset, change pascal_classes in demo.py. For more details, see the Usage section.
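
As a rough illustration of that change, the sketch below builds a class list for a custom dataset. Only the name pascal_classes comes from this README. The build_classes helper, the example category names, and the convention that '__background__' sits at index 0 are assumptions borrowed from common Faster R-CNN codebases, so check demo.py before relying on them.

```python
# Hypothetical helper: builds the list that would replace `pascal_classes`
# in demo.py. Assumes index 0 is reserved for the background class, as in
# many Faster R-CNN codebases (verify against demo.py).
def build_classes(foreground_names):
    names = [n.strip().lower() for n in foreground_names]
    if len(set(names)) != len(names):
        raise ValueError("duplicate class names")
    if "__background__" in names:
        raise ValueError("'__background__' is added automatically")
    return ["__background__"] + names


# Example categories for an imaginary dataset (not from this repository).
custom_classes = build_classes(["helmet", "vest", "person"])
print(len(custom_classes), custom_classes)
```

The class order presumably has to match the order used when the model was trained, otherwise the predicted labels will be shifted.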

Update 2018/8/21

Trained and tested on COCO2017.

Update

Added soft_nms. It requires no re-training of existing models: you only need to enable soft_nms during testing to improve performance.
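
To show why no re-training is needed, here is a minimal pure-Python sketch of Gaussian Soft-NMS, which only post-processes detections. It is not the implementation used in this repository. The function names, the (x1, y1, x2, y2) box format, and the default sigma and score_thresh values are assumptions chosen for illustration.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS: decay the scores of overlapping boxes instead of
    discarding them. Returns (box, score) pairs in selection order."""
    candidates = list(zip(boxes, scores))
    kept = []
    while candidates:
        # Pick the highest-scoring remaining box.
        best_idx = max(range(len(candidates)), key=lambda i: candidates[i][1])
        best_box, best_score = candidates.pop(best_idx)
        kept.append((best_box, best_score))
        # Decay every remaining score by its overlap with the picked box.
        rescored = []
        for box, score in candidates:
            overlap = iou(best_box, box)
            score *= math.exp(-(overlap ** 2) / sigma)
            if score >= score_thresh:
                rescored.append((box, score))
        candidates = rescored
    return kept


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
result = soft_nms(boxes, scores)
for box, score in result:
    print(box, round(score, 3))
```

In this example the second box overlaps the first heavily, so its score is reduced rather than removed, while the distant third box keeps its score.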

Benchmarking

I benchmarked this code on PASCAL VOC 2007 and 07+12, with additional results on ImageNet and COCO2017. The results are below:

0). ImageNet (tested on the validation set)

backbone	Top-1 error (%)
pytorch resnet50	23.9
detnet59 in this code	23.8
detnet59 in the original paper	23.5

1). PASCAL VOC 2007 (Train/Test: 07trainval/07test, scale=600, ROI Align)

model (FPN)	GPUs	Batch Size	lr	lr_decay	max_epoch	Time/epoch	Memory/GPU	mAP
ResNet-101	1 GTX 1080 (Ti)	2	1e-3	10	12	1.44hr	6137MB	75.7
DetNet59	1 GTX 1080 (Ti)	2	1e-3	10	12	1.07hr	5412MB	75.9

2). PASCAL VOC 07+12 (Train/Test: 07+12trainval/07test, scale=600, ROI Align)

model (FPN)	GPUs	Batch Size	lr	lr_decay	max_epoch	Time/epoch	Memory/GPU	mAP
ResNet-101	1 GTX 1080 (Ti)	1	1e-3	10	12	3.96hr	9011MB	80.5
DetNet59	1 GTX 1080 (Ti)	1	1e-3	10	12	2.33hr	8015MB	80.7
ResNet-101 (using soft_nms when testing)	1 GTX 1080 (Ti)	\	\	\	\	\	\	81.2
DetNet59 (using soft_nms when testing)	1 GTX 1080 (Ti)	\	\	\	\	\	\	81.6

3). COCO2017 (Train/Test: COCO2017train/COCO2017val, scale=800, max_size=1200, ROI Align)

model	#GPUs	batch size	lr	lr_decay	max_epoch	time/epoch	mem/GPU	mAP
DetNet59	2	4	4e-3	4	11	\	9000	36.0

Preparation

First, clone the code:

git clone https://github.com/guoruoqian/DetNet_Pytorch.git

Then, create a folder:

cd DetNet_Pytorch && mkdir data

Prerequisites

Python 2.7 or 3.6
PyTorch 0.2.0 or higher (versions >= 0.4.0 are not supported)
CUDA 8.0 or higher
tensorboardX

Data Preparation

VOC2007: Please follow the instructions in py-faster-rcnn to prepare the VOC datasets; any equivalent guide also works. After downloading the data, create symlinks in the data/ folder.
VOC 07 + 12: Please follow the instructions in YuwenXiong/py-R-FCN. I find these instructions more helpful for preparing the VOC datasets.
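
For the symlink step, a minimal Python 3 sketch follows. The link name VOCdevkit2007 follows the py-faster-rcnn convention and should be checked against the dataset code in this repository. The source path is a placeholder, and link_dataset is a hypothetical helper, not part of this repository.

```python
import os


def link_dataset(source_dir, data_dir="data", link_name="VOCdevkit2007"):
    """Create data_dir/link_name as a symlink to source_dir (hypothetical helper)."""
    source_dir = os.path.abspath(source_dir)
    if not os.path.isdir(source_dir):
        raise FileNotFoundError(source_dir)
    os.makedirs(data_dir, exist_ok=True)
    link_path = os.path.join(data_dir, link_name)
    if os.path.islink(link_path):
        os.remove(link_path)  # replace a stale link
    os.symlink(source_dir, link_path)
    return link_path


# Example with a placeholder path; replace it with your own download location:
# link_dataset("/path/to/VOCdevkit")
```

Running it from the repository root has the same effect as creating the link by hand with ln -s.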

Pretrained Model

You can download the detnet59 model that I trained on ImageNet from:

detnet59: dropbox, baiduyun

Download it and put it in data/pretrained_model/.

Compilation

As pointed out by ruotianluo/pytorch-faster-rcnn, choose the right -arch value in the make.sh file to compile the CUDA code:

GPU model	Architecture
TitanX (Maxwell/Pascal)	sm_52
GTX 960M	sm_50
GTX 1080 (Ti)	sm_61
Grid K520 (AWS g2.2xlarge)	sm_30
Tesla K80 (AWS p2.xlarge)	sm_37
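
If you script the build, the table can be expressed as a small lookup. The sketch below is only an illustration: ARCH_BY_GPU and arch_flag are hypothetical names, the entries are copied from the table above, and GPUs that are not listed need their compute capability looked up separately.

```python
# Mapping copied from the table above; GPUs not listed there are not covered.
ARCH_BY_GPU = {
    "TitanX (Maxwell/Pascal)": "sm_52",
    "GTX 960M": "sm_50",
    "GTX 1080 (Ti)": "sm_61",
    "Grid K520 (AWS g2.2xlarge)": "sm_30",
    "Tesla K80 (AWS p2.xlarge)": "sm_37",
}


def arch_flag(gpu_model):
    """Return the nvcc -arch flag for a GPU listed in the table (hypothetical helper)."""
    try:
        return "-arch=" + ARCH_BY_GPU[gpu_model]
    except KeyError:
        raise KeyError("no entry for %r; look up its compute capability" % gpu_model)


print(arch_flag("GTX 1080 (Ti)"))  # prints -arch=sm_61
```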

Install all the python dependencies using pip:

pip install -r requirements.txt

Compile the CUDA dependencies using the following commands:

cd lib
sh make.sh

This compiles all the modules you need, including NMS, ROI_Pooling, ROI_Align and ROI_Crop. The default version is compiled with Python 2.7; please compile it yourself if you are using a different Python version.

Usage

train voc2007:

CUDA_VISIBLE_DEVICES=3 python3 trainval_net.py exp_name --dataset pascal_voc --net detnet59 --bs 2 --nw 4 --lr 1e-3 --epochs 12 --save_dir weights --cuda --use_tfboard True

test voc2007:

CUDA_VISIBLE_DEVICES=3 python3 test_net.py exp_name --dataset pascal_voc --net detnet59 --checksession 1 --checkepoch 7 --checkpoint 5010 --cuda --load_dir weights

run demo.py:

Before running the demo, create a directory named 'demo_images' and put images (VOC images) in it. You can download the pretrained models listed in the tables above.

CUDA_VISIBLE_DEVICES=0 python3 demo.py exp_name --dataset pascal_voc --net detnet59 --checksession 1 --checkepoch 7 --checkpoint 5010 --cuda --load_dir weights --image_dir demo_images --result_dir vis_results

using soft_nms when testing:

CUDA_VISIBLE_DEVICES=3 python3 test_net.py exp_name --dataset pascal_voc --net detnet59 --checksession 1 --checkepoch 7 --checkpoint 5010 --cuda --load_dir weights --soft_nms

Before training on voc07+12, you must set ASPECT_CROPPING to False in detnet59.yml, or you will encounter errors during training.

train voc07+12:

CUDA_VISIBLE_DEVICES=3 python3 trainval_net.py exp_name2 --dataset pascal_voc_0712 --net detnet59 --bs 1 --nw 4 --lr 1e-3 --epochs 12 --save_dir weights --cuda --use_tfboard True

train coco:

CUDA_VISIBLE_DEVICES=6,7 python3 trainval_net.py detnetv1.0 --dataset coco --net detnet59 --bs 4 --nw 4 --lr 4e-3 --epochs 12 --save_dir weights --cuda --lscale --mGPUs

test coco:

CUDA_VISIBLE_DEVICES=2 python3 test_net.py detnetv1.0 --dataset coco --net detnet59 --checksession 1 --checkepoch 7 --checkpoint 58632 --cuda --load_dir weights --ls

TODO

Train and test on COCO (Done)