Shape Robust Text Detection with Progressive Scale Expansion Network

A reimplementation of PSENet with MXNet Gluon. So far it has only been trained on ICPR.

Thanks to the author (@whai362) for the great work!

While reimplementing PSENet in Gluon, I ran into the following problems.

First, I suspected that the kernel labels were incorrect. However, I verified them again and they are correct.
Second, I suspected that gradients could not be propagated back through mx.nd.split. However, the dice loss on the score map obtained via split behaves well, so split is not the cause.
Third, the network is based on resnet50 and the output of the FPN is input_size/4, so the smallest kernel map (min_kernel_map) may not contain any text instance. I therefore set the number of kernels to 3.
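To illustrate why the smallest kernel can vanish at input_size/4, here is a minimal sketch based on the shrink rule from the PSENet paper: r_i = 1 - (1 - m)(n - i)/(n - 1) and d_i = Area * (1 - r_i^2) / Perimeter. The function names, the example box size and the minimal scale m = 0.5 are assumptions made for illustration, not values taken from this repository. The box is treated as an axis-aligned rectangle, whereas real labels are polygons shrunk with a clipping library.

```python
def kernel_scales(n, m):
    """Scale ratios r_1..r_n: r_i = 1 - (1 - m) * (n - i) / (n - 1)."""
    if n < 2:
        raise ValueError("n must be at least 2")
    return [1.0 - (1.0 - m) * (n - i) / (n - 1) for i in range(1, n + 1)]


def shrink_offset(area, perimeter, r):
    """Inward offset d = Area * (1 - r^2) / Perimeter for scale ratio r."""
    return area * (1.0 - r * r) / perimeter


def shrunk_rect(width, height, r):
    """Size of an axis-aligned rectangle after shrinking every side inward by d.

    Simplification (assumption): text regions are polygons in practice.
    """
    d = shrink_offset(width * height, 2.0 * (width + height), r)
    return max(width - 2.0 * d, 0.0), max(height - 2.0 * d, 0.0)


# Hypothetical example: a 100x12 px text line seen at 1/4 resolution is 25x3 px.
for r in kernel_scales(3, 0.5):
    w, h = shrunk_rect(25, 3, r)
    print("r=%.2f -> %.2f x %.2f px" % (r, w, h))
```

Under these assumptions the smallest kernel of a thin text line is less than one pixel high on the 1/4-resolution map, which matches the concern above about an empty min_kernel_map.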

Upsampling the output to input_size may be a better choice. I will try it in my spare time.

Dataset	Recall	Precision	F1-score	Speed
ICPR(max_side=784)	0.56	0.67	0.61	45ms/image
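As a quick consistency check on the row above, the F1-score is the harmonic mean of precision and recall. The helper name below is only illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# Values from the ICPR row: precision 0.67, recall 0.56.
print(round(f1_score(0.67, 0.56), 2))  # 0.61
```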

The backbone is resnet50_v1b from the GluonCV model zoo; you can replace it with another model. By default, pretrained models are stored under ~/.mxnet/
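A minimal sketch of fetching the backbone through GluonCV's get_model, assuming gluoncv and mxnet are installed. The wrapper name load_backbone is hypothetical and is not part of this repository; the training script may load the weights differently.

```python
def load_backbone(name="resnet50_v1b"):
    """Return a pretrained GluonCV model by name.

    The import is done lazily so that this file can be loaded without
    gluoncv installed. Weights are downloaded on first use and cached locally.
    """
    from gluoncv.model_zoo import get_model

    return get_model(name, pretrained=True)


if __name__ == "__main__":
    net = load_backbone()
    print(type(net).__name__)
```

Passing another model-zoo name to load_backbone swaps the backbone, provided the rest of the network is adapted to its feature maps.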

You can also download maskrcnn_coco from the GluonCV model zoo to get a warm start.

cd pse
make

Here I add -Wl,-undefined,dynamic_lookup to avoid some compile errors; this differs from the original PSENet.

python scripts/train.py $data_path $ckpt

data_path: path to the dataset; each image and its annotation must share the same filename prefix, for example a.jpg and a.txt
ckpt: the filename of the pretrained model
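Before training, it can help to check that every image has an annotation with the same prefix. The sketch below is a hypothetical helper, not part of this repository; the set of accepted image extensions is an assumption.

```python
import os

# Assumption: which extensions count as images; adjust to your dataset.
IMAGE_EXTS = {".jpg", ".jpeg", ".png"}


def find_pairs(data_path):
    """Return (pairs, missing) for a flat dataset directory.

    pairs   -- list of (image_name, annotation_name) sharing the same prefix
    missing -- list of image names that have no matching .txt annotation
    """
    names = sorted(os.listdir(data_path))
    # Map prefix -> annotation filename, e.g. "a" -> "a.txt".
    annotations = {}
    for name in names:
        stem, ext = os.path.splitext(name)
        if ext.lower() == ".txt":
            annotations[stem] = name

    pairs, missing = [], []
    for name in names:
        stem, ext = os.path.splitext(name)
        if ext.lower() not in IMAGE_EXTS:
            continue
        if stem in annotations:
            pairs.append((name, annotations[stem]))
        else:
            missing.append(name)
    return pairs, missing
```

Calling find_pairs on $data_path and inspecting the second return value lists the images that would have no labels during training.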


Training plots (figures): Text loss, Kernel loss, All_loss, Pixel_accuracy

python eval.py $data_path $ckpt $output_dir $gpu_or_cpu