R2CNN_Faster_RCNN_Tensorflow

Abstract

This is a tensorflow re-implementation of R<sup>2</sup>CNN: Rotational Region CNN for Orientation Robust Scene Text Detection.
It should be noted that this is not an exact re-implementation of the paper; we only adopted its core idea.

This project is based on Faster-RCNN and was completed by YangXue and YangJirui.

DOTA test results


Comparison

Part of the results are taken from the DOTA paper.

Task1 - Oriented Leaderboard

| Approaches | mAP | PL | BD | BR | GTF | SV | LV | SH | TC | BC | ST | SBF | RA | HA | SP | HC |
|------------|-----|----|----|----|-----|----|----|----|----|----|----|-----|----|----|----|----|
| SSD | 10.59 | 39.83 | 9.09 | 0.64 | 13.18 | 0.26 | 0.39 | 1.11 | 16.24 | 27.57 | 9.23 | 27.16 | 9.09 | 3.03 | 1.05 | 1.01 |
| YOLOv2 | 21.39 | 39.57 | 20.29 | 36.58 | 23.42 | 8.85 | 2.09 | 4.82 | 44.34 | 38.35 | 34.65 | 16.02 | 37.62 | 47.23 | 25.5 | 7.45 |
| R-FCN | 26.79 | 37.8 | 38.21 | 3.64 | 37.26 | 6.74 | 2.6 | 5.59 | 22.85 | 46.93 | 66.04 | 33.37 | 47.15 | 10.6 | 25.19 | 17.96 |
| FR-H | 36.29 | 47.16 | 61 | 9.8 | 51.74 | 14.87 | 12.8 | 6.88 | 56.26 | 59.97 | 57.32 | 47.83 | 48.7 | 8.23 | 37.25 | 23.05 |
| FR-O | 52.93 | 79.09 | 69.12 | 17.17 | 63.49 | 34.2 | 37.16 | 36.2 | 89.19 | 69.6 | 58.96 | 49.4 | 52.52 | 46.69 | 44.8 | 46.3 |
| R<sup>2</sup>CNN | 60.67 | 80.94 | 65.75 | 35.34 | 67.44 | 59.92 | 50.91 | 55.81 | 90.67 | 66.92 | 72.39 | 55.06 | 52.23 | 55.14 | 53.35 | 48.22 |
| RRPN | 61.01 | 88.52 | 71.20 | 31.66 | 59.30 | 51.85 | 56.19 | 57.25 | 90.81 | 72.84 | 67.38 | 56.69 | 52.84 | 53.08 | 51.94 | 53.58 |
| ICN | 68.20 | 81.40 | 74.30 | 47.70 | 70.30 | 64.90 | 67.80 | 70.00 | 90.80 | 79.10 | 78.20 | 53.60 | 62.90 | 67.00 | 64.20 | 50.20 |
| R<sup>2</sup>CNN++ | 71.16 | 89.66 | 81.22 | 45.50 | 75.10 | 68.27 | 60.17 | 66.83 | 90.90 | 80.69 | 86.15 | 64.05 | 63.48 | 65.34 | 68.01 | 62.05 |

Task2 - Horizontal Leaderboard

| Approaches | mAP | PL | BD | BR | GTF | SV | LV | SH | TC | BC | ST | SBF | RA | HA | SP | HC |
|------------|-----|----|----|----|-----|----|----|----|----|----|----|-----|----|----|----|----|
| SSD | 10.94 | 44.74 | 11.21 | 6.22 | 6.91 | 2 | 10.24 | 11.34 | 15.59 | 12.56 | 17.94 | 14.73 | 4.55 | 4.55 | 0.53 | 1.01 |
| YOLOv2 | 39.2 | 76.9 | 33.87 | 22.73 | 34.88 | 38.73 | 32.02 | 52.37 | 61.65 | 48.54 | 33.91 | 29.27 | 36.83 | 36.44 | 38.26 | 11.61 |
| R-FCN | 47.24 | 79.33 | 44.26 | 36.58 | 53.53 | 39.38 | 34.15 | 47.29 | 45.66 | 47.74 | 65.84 | 37.92 | 44.23 | 47.23 | 50.64 | 34.9 |
| FR-H | 60.46 | 80.32 | 77.55 | 32.86 | 68.13 | 53.66 | 52.49 | 50.04 | 90.41 | 75.05 | 59.59 | 57 | 49.81 | 61.69 | 56.46 | 41.85 |
| R<sup>2</sup>CNN | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| FPN | 72.00 | 88.70 | 75.10 | 52.60 | 59.20 | 69.40 | 78.80 | 84.50 | 90.60 | 81.30 | 82.60 | 52.50 | 62.10 | 76.60 | 66.30 | 60.10 |
| ICN | 72.50 | 90.00 | 77.70 | 53.40 | 73.30 | 73.50 | 65.00 | 78.20 | 90.80 | 79.10 | 84.80 | 57.20 | 62.10 | 73.50 | 70.20 | 58.10 |
| R<sup>2</sup>CNN++ | 75.35 | 90.18 | 81.88 | 55.30 | 73.29 | 72.09 | 77.65 | 78.06 | 90.91 | 82.44 | 86.39 | 64.53 | 63.45 | 75.77 | 78.21 | 60.11 |

Face Detection

Environment: NVIDIA GeForce GTX 1060

ICDAR2015


Requirements

1. tensorflow >= 1.2
2. CUDA 8.0
3. python 2.7 (Anaconda2 recommended)
4. opencv (cv2)
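
A quick way to confirm the environment matches this list is a small Python check like the sketch below (it only probes the interpreter, TensorFlow, and OpenCV; it does not verify the CUDA toolkit):

# Minimal environment check mirroring the requirement list above.
import sys
from distutils.version import LooseVersion

import cv2
import tensorflow as tf

assert sys.version_info[:2] == (2, 7), 'python2.7 is expected by this project'
assert LooseVersion(tf.__version__) >= LooseVersion('1.2'), 'tensorflow >= 1.2 is required'

print('python:     %s' % sys.version.split()[0])
print('tensorflow: %s' % tf.__version__)
print('opencv:     %s' % cv2.__version__)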

Download Model

1. Download the resnet50_v1 and resnet101_v1 models pre-trained on ImageNet and put them in data/pretrained_weights.
2. Download the mobilenet_v2 model pre-trained on ImageNet and put it in data/pretrained_weights/mobilenet.
3. Download the model trained by this project and put it in output/trained_weights.
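
A small sketch to confirm the downloads landed in the expected folders (run it from the project root; it only checks that the directories exist and are non-empty):

# Sanity check for the weight folders referenced in the steps above.
import os

EXPECTED_DIRS = [
    'data/pretrained_weights',            # resnet50_v1 / resnet101_v1
    'data/pretrained_weights/mobilenet',  # mobilenet_v2
    'output/trained_weights',             # weights trained by this project
]

for d in EXPECTED_DIRS:
    status = 'ok' if os.path.isdir(d) and os.listdir(d) else 'MISSING or empty'
    print('%-40s %s' % (d, status))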

Data Preparation

1. Download the DOTA dataset.
2. Crop the data, for reference (a sketch of the cropping idea is shown at the end of this section):

cd $PATH_ROOT/data/io/DOTA
python train_crop.py 
python val_crop.py

3. Data format:

├── VOCdevkit
│   ├── VOCdevkit_train
│   │   ├── Annotation
│   │   ├── JPEGImages
│   ├── VOCdevkit_test
│   │   ├── Annotation
│   │   ├── JPEGImages
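
The crop scripts tile the very large DOTA images into training-sized patches. The snippet below only illustrates that sliding-window idea; the patch size, stride, and file naming are assumptions, and the real train_crop.py / val_crop.py also crop the corresponding annotations:

# Illustrative sliding-window cropping for large aerial images.
# Patch size and stride are assumed values, not those used by train_crop.py.
import os

import cv2

def crop_image(img_path, out_dir, patch=800, stride=600):
    img = cv2.imread(img_path)
    h, w = img.shape[:2]
    name = os.path.splitext(os.path.basename(img_path))[0]
    for y in range(0, max(h - patch, 0) + 1, stride):
        for x in range(0, max(w - patch, 0) + 1, stride):
            tile = img[y:y + patch, x:x + patch]
            # keep the offset in the file name so boxes can be shifted back later
            cv2.imwrite(os.path.join(out_dir, '%s_%04d_%04d.png' % (name, x, y)), tile)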

Compile

cd $PATH_ROOT/libs/box_utils/
python setup.py build_ext --inplace
cd $PATH_ROOT/libs/box_utils/cython_utils
python setup.py build_ext --inplace

Demo

Select a configuration file in the folder (libs/configs/) and copy its contents into cfgs.py, then download the corresponding weights.
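
Activating a configuration is just a file copy, for example (the source file name below is only a placeholder for whichever config you picked):

# Example: overwrite cfgs.py with the chosen config (file name is illustrative).
import shutil
shutil.copy('libs/configs/cfgs_res101_dota_v2.py', 'libs/configs/cfgs.py')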

DOTA

python demo_rh.py --src_folder='/PATH/TO/DOTA/IMAGES_ORIGINAL/' \
                  --image_ext='.png' \
                  --des_folder='/PATH/TO/SAVE/RESULTS/' \
                  --save_res=False \
                  --gpu='0'

FDDB

python camera_demo.py --gpu='0'

Eval

python eval.py --img_dir='/PATH/TO/DOTA/IMAGES/' \
               --image_ext='.png' \
               --test_annotation_path='/PATH/TO/TEST/ANNOTATION/' \
               --gpu='0'

Inference

python inference.py --data_dir='/PATH/TO/DOTA/IMAGES_CROP/' \
                    --gpu='0'

Train

1. If you want to train on your own data, please note:

(1) Modify parameters (such as CLASS_NUM, DATASET_NAME, VERSION, etc.) in $PATH_ROOT/libs/configs/cfgs.py
(2) Add category information in $PATH_ROOT/libs/label_name_dict/lable_dict.py (see the sketch below this list)
(3) Add data_name to line 75 of $PATH_ROOT/data/io/read_tfrecord.py
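
For step (2), a category mapping might look roughly like the sketch below; the variable and class names are illustrative, so check lable_dict.py for the exact structure the project expects:

# Hypothetical sketch of a category mapping for a custom dataset.
# The real structure lives in libs/label_name_dict/lable_dict.py;
# keep CLASS_NUM in cfgs.py consistent with the classes listed here.
NAME_LABEL_MAP = {
    'back_ground': 0,  # background is conventionally label 0
    'ship': 1,
    'plane': 2,
}

# reverse mapping, handy when visualizing predictions
LABEL_NAME_MAP = {v: k for k, v in NAME_LABEL_MAP.items()}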

2. Make tfrecord:

cd $PATH_ROOT/data/io/  
python convert_data_to_tfrecord.py --VOC_dir='/PATH/TO/VOCdevkit/VOCdevkit_train/' \
                                   --xml_dir='Annotation' \
                                   --image_dir='JPEGImages' \
                                   --save_name='train' \
                                   --img_format='.png' \
                                   --dataset='DOTA'
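
After conversion you can sanity-check the record count with plain TensorFlow 1.x; the tfrecord path below is an assumption, so use whatever path convert_data_to_tfrecord.py actually writes:

# Count the examples written to a tfrecord file (path is an assumed example).
import tensorflow as tf

TFRECORD_PATH = '../tfrecord/DOTA_train.tfrecord'
num_examples = sum(1 for _ in tf.python_io.tf_record_iterator(TFRECORD_PATH))
print('%s contains %d examples' % (TFRECORD_PATH, num_examples))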

3. Train:

cd $PATH_ROOT/tools
python train.py

Tensorboard

cd $PATH_ROOT/output/summary
tensorboard --logdir=.

Citation

Some related publications based on this code:

@article{yang2018position,
	url={https://ieeexplore.ieee.org/document/8464244},
	title={Position Detection and Direction Prediction for Arbitrary-Oriented Ships via Multitask Rotation Region Convolutional Neural Network},
	author={Yang, Xue and Sun, Hao and Sun, Xian and  Yan, Menglong and Guo, Zhi and Fu, Kun},
	journal={IEEE Access},
	volume={6},
	pages={50839-50849},
	year={2018},
	publisher={IEEE}
}

@article{yang2018r-dfpn,
	url={http://www.mdpi.com/2072-4292/10/1/132},
	title={Automatic ship detection in remote sensing images from google earth of complex scenes based on multiscale rotation dense feature pyramid networks},
	author={Yang, Xue and Sun, Hao and Fu, Kun and Yang, Jirui and Sun, Xian and Yan, Menglong and Guo, Zhi},
	journal={Remote Sensing},
	volume={10},
	number={1},
	pages={132},
	year={2018},
	publisher={Multidisciplinary Digital Publishing Institute}
}