RRPN++: Guidance Towards More Accurate Scene Text Detection
Report can be viewed at: https://arxiv.org/abs/2009.13118
Highlights
- 89.5 F-measure at single scale on the ICDAR 2015 benchmark (https://rrc.cvc.uab.es/?ch=4&com=evaluation&view=method_info&task=1&m=78081).
- 92.0 F-measure at single scale on ICDAR 2013; testing speed reaches 13.3 FPS at 640px (tested on a single GTX 1080Ti GPU).
- Adopts RRoI Align in Detectron2.
- Text Spotter with Transformer (training and testing).
- Support for higher PyTorch versions (>= 1.7).
- Fixed a bug where rboxes with angles over 45° could not be detected.
Environments
- Ubuntu 16.04
- CUDA 9 or 10
- Python >= 3.5
- PyTorch >= 1.7 (higher versions supported)
- Other packages such as numpy and cv2 (OpenCV).
Installation
Check INSTALL.md for installation instructions.
Configuring your dataset
- Your dataset path can be set in $RRPN_ROOT/maskrcnn_benchmark/config/paths_catalog.py. We implemented interfaces for {IC13, IC15, IC17mlt, LSVT, ArT} for common use (starting from line 96):
...
"RRPN_train": { # including IC13 and IC15
'dataset_list':{
# 'IC13': 'Your dataset path',
...
},
"split": 'train'
},
...
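For illustration, a filled-in entry might look like the following sketch; the enclosing DATASETS name follows maskrcnn_benchmark's paths_catalog convention, and the paths are hypothetical placeholders:
DATASETS = {
    "RRPN_train": {  # including IC13 and IC15
        'dataset_list': {
            'IC13': '/data/ICDAR2013/train',  # hypothetical path
            'IC15': '/data/ICDAR2015/train',  # hypothetical path
        },
        "split": 'train'
    },
}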
- To add your dataset to the detector, form a dict array as follows:
im_info = {
    'gt_classes': your class_id array,
    'max_classes': your class_id array,
    'image': path to access one image,
    'boxes': rotated boxes in {cx, cy, w, h, θ},
    'flipped': not supported, just set to False,
    'gt_overlaps': overlaps filled with 1 (gt with gt),
    'seg_areas': h * w of each rbox,
    'height': height of the image,
    'width': width of the image,
    'max_overlaps': overlaps filled with 1 (gt with gt),
    'rotated': just set to True
}
Examples can be seen in $RRPN_ROOT/maskrcnn_benchmark/data/rotation_series.py. Your data API should be added to the variable DATASET:
DATASET = {
'IC13':get_ICDAR2013,
'IC15':get_ICDAR2015_RRC_PICK_TRAIN,
'IC17mlt':get_ICDAR2017_mlt,
...
'Your Dataset Name': 'Your Dataset API'
}
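A minimal sketch of such a data API is given below; the function name get_MyDataset, its (mode, dataset_dir) signature, and the single placeholder box are hypothetical illustrations, while the dict fields follow the detector format above:
import os

import cv2
import numpy as np

def get_MyDataset(mode, dataset_dir):
    # Hypothetical API: walk an image folder and emit one im_info dict
    # per image in the detector format described above.
    im_infos = []
    for fname in sorted(os.listdir(dataset_dir)):
        if not fname.lower().endswith(('.jpg', '.png')):
            continue
        img_path = os.path.join(dataset_dir, fname)
        h, w = cv2.imread(img_path).shape[:2]
        # One placeholder rbox {cx, cy, w, h, θ}; replace with your annotations.
        boxes = np.array([[w / 2., h / 2., w / 4., h / 8., 0.]], dtype=np.float32)
        num_gt = boxes.shape[0]
        im_infos.append({
            'gt_classes': np.ones(num_gt, dtype=np.int64),
            'max_classes': np.ones(num_gt, dtype=np.int64),
            'image': img_path,
            'boxes': boxes,
            'flipped': False,
            'gt_overlaps': np.ones((num_gt, 1), dtype=np.float32),
            'seg_areas': boxes[:, 2] * boxes[:, 3],
            'height': h,
            'width': w,
            'max_overlaps': np.ones(num_gt, dtype=np.float32),
            'rotated': True,
        })
    return im_infos
The API would then be registered as DATASET['MyDataset'] = get_MyDataset, matching the dict literal above.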
- To add your dataset to the spotter, form a dict array as follows:
im_info = {
    'gt_classes': your class_id array,
    'max_classes': your class_id array,
    'image': path to access one image,
    'boxes': rotated boxes in {cx, cy, w, h, θ},
    'flipped': not supported, just set to False,
    'gt_overlaps': overlaps filled with 1 (gt with gt),
    'seg_areas': h * w of each rbox,
    'height': height of the image,
    'width': width of the image,
    'gt_words': words (transcriptions) of each box,
    'max_overlaps': overlaps filled with 1 (gt with gt),
    'rotated': just set to True
}
Examples can be seen in $RRPN_ROOT/maskrcnn_benchmark/data/rrpn_e2e_series.py. Your data API should be added to the variable DATASET:
DATASET = {
'IC13':get_ICDAR2013,
'IC15':get_ICDAR2015_RRC_PICK_TRAIN,
'IC17mlt':get_ICDAR2017_mlt,
...
'Your Dataset Name': 'Your Dataset API'
}
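The spotter format only adds per-box transcriptions on top of the detector format; a small sketch (the helper name and the example words are placeholders):
def add_transcriptions(im_info, words):
    # Attach one transcription string per rotated box to a detector-format dict.
    im_info['gt_words'] = list(words)
    return im_info

# e.g. spotter_info = add_transcriptions(im_info, ['HELLO', 'WORLD'])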
Training
# create your data cache directory
mkdir data_cache
Train a detector of RRPN++
# In your root of RRPN
python tools/train_net.py --config-file=configs/arpn/e2e_rrpn_R_50_C4_1x_train_AFPN_RT_LERB.yaml
Train a spotter of RRPN++ (used in the RRPN++ report; we strongly recommend this)
# In your root of RRPN
python tools/train_net.py --config-file=configs/arpn_E2E/e2e_rrpn_R_50_C4_1x_train_AFPN_RT_LERB_Spotter.yaml
- Multi-GPU training has not been tested yet; be careful when using more than one GPU.
Testing
- Use $RRPN_ROOT/demo/ICDAR19_eval_script.py or $RRPN_ROOT/demo/rrpn_e2e_infer.py (strongly recommended) to test the images you want. The demo will generate a text file containing the detected coordinates.
- Show the detected image by setting the variable vis to True.
- Add the following settings to your config yaml to test on a dataset, or re-implement the file to test your own images.
- One config file we recommend is $RRPN_ROOT/configs/arpn_E2E/e2e_rrpn_R_50_C4_1x_test_AFPN_RT_LERB_Spotter.yaml.
- Choose the dataset you want to evaluate on:
TEST:
DATASET_NAME: "IC15" # Choice can be "IC15", "LSVT" and so on
MODE: "DET" # DET for detection evaluation or E2E for recognition results in the spotter
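If you prefer driving this from code, the following sketch assumes the yacs-based config API exposed by maskrcnn_benchmark (which this repo builds on); TEST.DATASET_NAME and TEST.MODE are the keys shown above:
from maskrcnn_benchmark.config import cfg

# Load the recommended test config, then override the evaluation
# dataset and mode instead of editing the yaml by hand.
cfg.merge_from_file("configs/arpn_E2E/e2e_rrpn_R_50_C4_1x_test_AFPN_RT_LERB_Spotter.yaml")
cfg.merge_from_list(["TEST.DATASET_NAME", "IC15", "TEST.MODE", "DET"])
cfg.freeze()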
Final
- Enjoy it with all the code.
- Please cite us if you find this work useful in your projects:
@article{ma2020rrpn++,
title={RRPN++: Guidance Towards More Accurate Scene Text Detection},
author={Ma, Jianqi},
journal={arXiv preprint arXiv:2009.13118},
year={2020}
}
Special Thanks
- My family, for their FIRM SUPPORT of devices and power supply.
- Jingye Chen (https://github.com/JinGyeSetBirdsFree), for his support on the code and report.