awesome-deep-text-detection-recognition

A curated list of awesome deep learning based papers on text detection and recognition.

<p align='center'> <img src = '/overall_pi_chart.png' height="300px"> <img src = '/overall_histogram.png' height="450px"> </p>

Text Detection

| Conf. | Date | Title | Scores (IC13, IC15) | Resources |
| --- | --- | --- | --- | --- |
| '14-ECCV | 14/10/07 | Robust Scene Text Detection with Convolution Neural Network Induced MSER Trees | | |
| '15-CVPR | 15/06/01 | Symmetry-based text line detection in natural scenes | 0.8043 | PRJ <br> CODE |
| '16-TIP | 15/10/12 | Text-Attentional Convolutional Neural Networks for Scene Text Detection | 0.8165 | |
| '15-ICCV | 15/12/13 | Text Flow: A Unified Text Detection System in Natural Scene Images | 0.8025 | |
| '16-arXiv | 16/03/31 | Accurate Text Localization in Natural Image with Cascaded Convolutional Text Network | 0.86 | |
| '16-CVPR | 16/04/14 | Multi-Oriented Text Detection with Fully Convolutional Networks | 0.83, 0.54 | *TORCH(M) |
| '16-CVPR | 16/04/22 | Synthetic Data for Text Localisation in Natural Images | 0.847 <br> (L)0.8359 | CODE <br> DB |
| '16-arXiv | 16/06/29 | Scene Text Detection Via Holistic, Multi-Channel Prediction | 0.8433, 0.6477 | |
| '16-ECCV | 16/09/12 | Detecting Text in Natural Image with Connectionist Text Proposal Network | 0.8215, 0.6085 | *CAFFE(M) <br> CAFFE <br> TF(M) <br> TF <br> DEMO <br> BLOG(CH) |
| '17-AAAI | 16/11/21 | TextBoxes: A fast text detector with a single deep neural network | 0.85 <br> (L)0.8767 | *CAFFE(M) <br> TF <br> BLOG(KR) |
| '18-TM | 17/03/03 | Arbitrary-Oriented Scene Text Detection via Rotation Proposals | 0.9125, 0.8020 | *CAFFE |
| '17-CVPR | 17/03/04 | Deep Matching Prior Network: Toward Tighter Multi-oriented Text Detection | 0.7064 | |
| '17-CVPR | 17/03/19 | Detecting Oriented Text in Natural Images by Linking Segments | 0.853, 0.75 <br> (L)0.7636 | *TF(M) <br> TF(M) <br> SLIDE <br> VIDEO |
| '17-arXiv | 17/03/24 | Deep Direct Regression for Multi-Oriented Scene Text Detection | 0.86, 0.81 | |
| '17-arXiv | 17/04/03 | Cascaded Segmentation-Detection Networks for Word-Level Text Spotting | 0.86, 0.71 | |
| '17-CVPR | 17/04/11 | EAST: An Efficient and Accurate Scene Text Detector | 0.8072 <br> (L)0.8038 | TF(M) <br> TF <br> PYTORCH(M) <br> PYTORCH <br> DEMO <br> KERAS(M) <br> VIDEO |
| '17-ICIP | 17/05/15 | WordFence: Text Detection in Natural Images with Border Awareness | 0.86 | |
| '17-arXiv | 17/06/30 | R2CNN: Rotational Region CNN for Orientation Robust Scene Text Detection | 0.8773, 0.8254 | TF(M) <br> CAFFE(M) |
| '17-CVPR | 17/07/21 | Multi-scale FCN with Cascaded Instance Aware Segmentation for Arbitrary Oriented Word Spotting In The Wild | 0.85, 0.63 | |
| '17-arXiv | 17/08/17 | Deep Scene Text Detection with Connected Component Proposals | 0.919 | |
| '17-ICCV | 17/08/22 | WordSup: Exploiting Word Annotations for Character based Text Detection | 0.9064, 0.7816 | |
| '17-ICCV | 17/09/01 | Single Shot Text Detector with Regional Attention | 0.8704, 0.7691 | *CAFFE(M) <br> PYTORCH <br> VIDEO |
| '17-arXiv | 17/09/11 | Fused Text Segmentation Networks for Multi-oriented Scene Text Detection | 0.8414 | |
| '17-ICCV | 17/10/13 | WeText: Scene Text Detection under Weak Supervision | 0.869 <br> (L)0.8313 | |
| '17-ICCV | 17/10/22 | Self-organized Text Detection with Minimal Post-processing via Border Learning | 0.84 | *KERAS(M) |
| '17-ICDAR | 17/11/11 | Deep Residual Text Detection Network for Scene Text | 0.9117 <br> (L)0.8925 | |
| '18-AAAI | 17/11/12 | Feature Enhancement Network: A Refined Scene Text Detector | 0.9161 | |
| '17-arXiv | 17/11/30 | ArbiText: Arbitrary-Oriented Text Detection in Unconstrained Scene | 0.759 | |
| '18-AAAI | 18/01/04 | PixelLink: Detecting Scene Text via Instance Segmentation | 0.881, 0.8519 | *TF(M) <br> TF |
| '18-CVPR | 18/01/05 | FOTS: Fast Oriented Text Spotting with a Unified Network | 0.925, 0.8984 | PYTORCH <br> PYTORCH <br> VIDEO |
| '18-TIP | 18/01/09 | TextBoxes++: A Single-Shot Oriented Scene Text Detector | 0.88, 0.829 <br> (L)0.8475 | *CAFFE(M) |
| '18-CVPR | 18/02/27 | Multi-Oriented Scene Text Detection via Corner Localization and Region Segmentation | 0.88, 0.843 | *PYTORCH(M) |
| '18-CVPR | 18/03/09 | An end-to-end TextSpotter with Explicit Alignment and Attention | 0.9, 0.87 | *CAFFE(M) |
| '18-CVPR | 18/03/14 | Rotation-Sensitive Regression for Oriented Scene Text Detection | 0.89, 0.838 | *CAFFE(M) |
| '18-arXiv | 18/04/08 | Detecting Multi-Oriented Text with Corner-based Region Proposals | 0.876, 0.845 | *CAFFE(M) |
| '18-arXiv | 18/04/24 | An Anchor-Free Region Proposal Network for Faster R-CNN based Text Detection Approaches | 0.92, 0.86 | |
| '18-IJCAI | 18/05/03 | IncepText: A New Inception-Text Module with Deformable PSROI Pooling for Multi-Oriented Scene Text Detection | 0.9047 | |
| '18-arXiv | 18/06/07 | Shape Robust Text Detection with Progressive Scale Expansion Network | 0.8721 | PRJ |
| '18-ECCV | 18/07/04 | TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes | 0.826 | PYTORCH |
| '18-ECCV | 18/07/06 | Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes | 0.917, 0.86 | |
| '18-ECCV | 18/07/10 | Accurate Scene Text Detection through Border Semantics Awareness and Bootstrapping | 0.892 | |
| '19-AAAI | 18/11/21 | Scene Text Detection with Supervised Pyramid Context Network | 0.921, 0.872 | |
| '19-TIP | 18/12/04 | TextField: Learning A Deep Direction Field for Irregular Scene Text Detection | 0.824 | *CAFFE(M) |
| '19-CVPR | 19/03/21 | Towards Robust Curve Text Detection with Conditional Spatial Expansion | | |
| '19-CVPR | 19/03/28 | Shape Robust Text Detection with Progressive Scale Expansion Network | 0.857 | TF(M) |
| '19-CVPR | 19/04/03 | Character Region Awareness for Text Detection | 0.952, 0.869 | *PYTORCH(M) <br> VIDEO <br> PYTORCH <br> TF(M) <br> KERAS <br> BLOG_CH <br> BLOG_KR <br> BLOG_KR <br> BLOG_KR |
| '19-CVPR | 19/04/13 | Look More Than Once: An Accurate Detector for Text of Arbitrary Shapes | 0.877 | |
| '19-CVPR | 19/06/16 | Learning Shape-Aware Embedding for Scene Text Detection | 0.877 | |
| '19-CVPR | 19/06/16 | Arbitrary Shape Scene Text Detection with Adaptive Text Region Representation | 0.917, 0.876 | |
| '19-ICCV | 19/08/16 | Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network | 0.829 | |
| '19-ICCV | 19/09/02 | Geometry Normalization Networks for Accurate Scene Text Detection | 0.8852 | |
| '19-AAAI | 19/11/20 | Real-time Scene Text Detection with Differentiable Binarization | 0.847 | |
<p align='center'> <img src = '/detection_ic13_results.png' height = '550px'> <img src = '/detection_ic15_results.png' height = '550px'> </p>

Text Recognition

| Conf. | Date | Title | Scores (SVT, IIIT5k, IC03, IC13) | Resources |
| --- | --- | --- | --- | --- |
| '15-ICLR | 14/12/18 | Deep structured output learning for unconstrained text recognition | 0.717, 0.896, 0.818 | TF <br> SLIDE <br> VIDEO |
| '16-IJCV | 15/05/07 | Reading text in the wild with convolutional neural networks | 0.807, 0.933, 0.908 | KERAS |
| '16-AAAI | 15/06/14 | Reading Scene Text in Deep Convolutional Sequences | | |
| '17-TPAMI | 15/07/21 | An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition | 0.808, 0.782, 0.894, 0.867 | TORCH(M) <br> TF <br> TF <br> TF <br> TF <br> PYTORCH <br> PYTORCH(M) <br> BLOG(KR) |
| '16-CVPR | 16/03/09 | Recursive Recurrent Nets with Attention Modeling for OCR in the Wild | 0.807, 0.784, 0.887, 0.9 | |
| '16-CVPR | 16/03/12 | Robust scene text recognition with automatic rectification | 0.819, 0.819, 0.901, 0.886 | PYTORCH <br> PYTORCH |
| '16-CVPR | 16/06/27 | CNN-N-Gram for Handwriting Word Recognition | 0.8362 | VIDEO |
| '16-BMVC | 16/09/19 | STAR-Net: A SpaTial Attention Residue Network for Scene Text Recognition | 0.836, 0.833, 0.899, 0.891 | |
| '17-arXiv | 17/07/27 | STN-OCR: A single Neural Network for Text Detection and Text Recognition | 0.798, 0.86, 0.903 | *MXNET(M) <br> PRJ <br> BLOG |
| '17-IJCAI | 17/08/19 | Learning to Read Irregular Text with Attention Mechanisms | | |
| '17-arXiv | 17/09/06 | Scene Text Recognition with Sliding Convolutional Character Models | 0.765, 0.816, 0.845, 0.852 | |
| '17-ICCV | 17/09/07 | Focusing Attention: Towards Accurate Text Recognition in Natural Images | 0.859, 0.874, 0.942, 0.933 | |
| '18-CVPR | 17/11/12 | AON: Towards Arbitrarily-Oriented Text Recognition | 0.828, 0.87, 0.915 | TF |
| '17-NIPS | 17/12/04 | Gated Recurrent Convolution Neural Network for OCR | 0.815, 0.808, 0.978 | *TORCH(M) |
| '18-AAAI | 18/01/04 | Char-Net: A Character-Aware Neural Network for Distorted Scene Text Recognition | 0.844, 0.836, 0.915, 0.908 | |
| '18-AAAI | 18/01/04 | SqueezedText: A Real-time Scene Text Recognition by Binary Convolutional Encoder-decoder Network | 0.87, 0.931, 0.929 | |
| '18-CVPR | 18/05/09 | Edit Probability for Scene Text Recognition | 0.875, 0.883, 0.946, 0.944 | |
| '18-TPAMI | 18/06/25 | ASTER: An Attentional Scene Text Recognizer with Flexible Rectification | 0.936, 0.934, 0.945, 0.918 | *TF(M) <br> PYTORCH |
| '18-ECCV | 18/09/08 | Synthetically Supervised Feature Learning for Scene Text Recognition | 0.871, 0.894, 0.947, 0.94 | |
| '19-AAAI | 18/09/18 | Scene Text Recognition from Two-Dimensional Perspective | 0.821, 0.92, 0.914 | |
| '19-AAAI | 18/11/02 | Show, Attend and Read: A Simple and Strong Baseline for Irregular Text Recognition | 0.845, 0.915, 0.91 | *TORCH(M) |
| '19-CVPR | 18/12/14 | ESIR: End-to-end Scene Text Recognition via Iterative Image Rectification | 0.902, 0.933, 0.913 | PRJ |
| '19-PR | 19/01/10 | MORAN: A Multi-Object Rectified Attention Network for Scene Text Recognition | 0.883, 0.912, 0.950, 0.924 | *PYTORCH(M) |
| '19-ICCV | 19/04/03 | What is wrong with scene text recognition model comparisons? dataset and model analysis | 0.875, 0.949, 0.936 | *PYTORCH(M) <br> BLOG_KR |
| '19-CVPR | 19/04/18 | Aggregation Cross-Entropy for Sequence Recognition | 0.826, 0.823, 0.921, 0.897 | *PYTORCH |
| '19-CVPR | 19/06/16 | Sequence-to-Sequence Domain Adaptation Network for Robust Text Image Recognition | 0.845, 0.838, 0.921, 0.918 | |
| '19-ICCV | 19/08/06 | Symmetry-constrained Rectification Network for Scene Text Recognition | 0.889, 0.944, 0.95, 0.939 | |
| '20-AAAI | 19/12/28 | TextScanner: Reading Characters in Order for Robust Scene Text Recognition | 0.895, 0.926, 0.925 | |
| '20-AAAI | 19/12/21 | Decoupled Attention Network for Text Recognition | 0.892, 0.943, 0.95, 0.939 | *PYTORCH(M) |
| '20-AAAI | 20/02/04 | GTC: Guided Training of CTC | 0.929, 0.955, 0.952, 0.943 | |
<p align='center'> <img src = '/recognition_ic13_results.png' height = '550px'> <img src = '/recognition_iiit5k_results.png' height = '550px'> </p>

End-to-End Text Recognition

| Conf. | Date | Title | Scores (IC03, IC13, IC15) | Resources |
| --- | --- | --- | --- | --- |
| '12-ICPR | 12/11/11 | End-to-end text recognition with convolutional neural networks | 0.67 | *CODE |
| '14-ECCV | 14/09/06 | Deep Features for Text Spotting | 0.75 | PRJ <br> MATLAB |
| '15-IJCV | 15/05/07 | Reading Text in the Wild with Convolutional Neural Networks | 0.70, 0.77 | KERAS |
| '15-TPAMI | 15/10/30 | Real-time Lexicon-free Scene Text Localization and Recognition | 0.542, 0.156 | |
| '16-arXiv | 16/04/10 | TextProposals: a Text-specific Selective Search Algorithm for Word Spotting in the Wild | 0.6843, 0.4718 <br> (L)0.533 | *CAFFE(M) |
| '17-AAAI | 16/11/21 | TextBoxes: A fast text detector with a single deep neural network | 0.84 | TF <br> *CAFFE(M) <br> BLOG_KR |
| '17-ICCV | 17/07/13 | Towards End-to-end Text Spotting with Convolution Recurrent Neural Network | 0.8459 | VIDEO |
| '17-ICCV | 17/10/22 | Deep TextSpotter An End-to-End Trainable Scene Text Localization and Recognition Framework | 0.77, 0.47 | VIDEO <br> *CAFFE(M) |
| '18-CVPR | 18/01/05 | FOTS: Fast Oriented Text Spotting with a Unified Network | 0.8477, 0.6533 | VIDEO <br> TF(M) |
| '18-TIP | 18/01/09 | TextBoxes++: A Single-Shot Oriented Scene Text Detector | 0.8465, 0.519 | *CAFFE(M) |
| '18-CVPR | 18/03/09 | An end-to-end TextSpotter with Explicit Alignment and Attention | 0.86, 0.63 | *CAFFE(M) |
| '18-TPAMI | 18/06/25 | ASTER: An Attentional Scene Text Recognizer with Flexible Rectification | 0.64 | *TF(M) |
| '18-ECCV | 18/07/06 | Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes | 0.865, 0.624 | |
| '19-ICCV | 19/08/24 | Towards Unconstrained End-to-End Text Spotting | 0.6994 | BLOG_KR |
| '19-ICCV | 19/10/17 | Convolutional Character Networks | 0.7108 | *PYTORCH(M) |
| '19-ICCV | 19/10/27 | TextDragon: An End-to-End Framework for Arbitrary Shaped Text Spotting | 0.6537 | |
| '20-AAAI | 19/11/21 | All You Need Is Boundary: Toward Arbitrary-Shaped Text Spotting | 0.841, 0.641 | |
| '20-AAAI | 20/02/12 | Text Perceptron: Towards End-to-End Arbitrary-Shaped Text Spotting | 0.858, 0.651 | |
<p align='center'> <img src = '/end2end_ic13_ic15_results.png' height = '400px'> </p>

Others

| Conf. | Date | Title | Description | Resources |
| --- | --- | --- | --- | --- |
| '14-NIPS | 14/06/09 | Synthetic Data and Artificial Neural Networks for Natural Scene Text Recognition | Dataset | PRJ |
| '17-ECCV | 17/02/13 | End-to-End Interpretation of the French Street Name Signs Dataset | Dataset (FSNS) | *TF(M) |
| '17-arXiv | 17/04/11 | Attention-based Extraction of Structured Information from Street View Imagery | FSNS | *TF(M) <br> TF <br> TF <br> LUA <br> BLOG_KR |
| '17-CVPR | 17/07/21 | Unambiguous Text Localization and Retrieval for Cluttered Scenes | Text Retrieval | |
| '17-AAAI | 17/10/22 | Detection and Recognition of Text Embedded in Online Images via Neural Context Models | Dataset | PRJ |
| '18-CVPR | 17/11/17 | Separating Style and Content for Generalized Style Transfer | Font Style | |
| '17-arXiv | 17/12/06 | Detecting Curve Text in the Wild New Dataset and New Solution | Dataset (CTW 1500) | PRJ |
| '18-AAAI | 17/12/14 | SEE: Towards Semi-Supervised End-to-End Scene Text Recognition | FSNS | PRJ <br> *CHAINER(M) |
| '17-CVPR | 18/06/07 | Learning to Extract Semantic Structure from Documents Using Multimodal Fully Convolutional Neural Networks | Document Layout | PRJ |
| '18-CVPR | 18/06/19 | DocUNet: Document Image Unwarping via A Stacked U-Net | Document Dewarping | PRJ |
| '18-CVPR | 18/06/19 | Document Enhancement using Visibility Detection | Document Enhancement | PRJ |
| '18-IJCAI | 18/06/22 | Multi-Task Handwritten Document Layout Analysis | Document Layout | |
| '18-ECCV | 18/07/09 | Verisimilar Image Synthesis for Accurate Detection and Recognition of Texts in Scenes | Dataset | PRJ |
| '19-AAAI | 18/12/03 | EnsNet: Ensconce Text in the Wild | Text Removal | DB |
| '19-CVPR | 18/12/14 | Spatial Fusion GAN for Image Synthesis | Dataset | DB |
| '19-AAAI | 19/01/27 | Hierarchical Encoder with Auxiliary Supervision for Table-to-text Generation: Learning Better Representation for Tables | TableToText | |
| '19-AAAI | 19/01/27 | A Radical-aware Attention-based Model for Chinese Text Classification | Chinese Character Classification | |
| '19-CVPR | 19/02/25 | Handwriting Recognition in Low-resource Scripts using Adversarial Learning | Handwriting Recognition | TF |
| '19-CVPR | 19/03/27 | Tightness-aware Evaluation Protocol for Scene Text Detection | Evaluation | CODE |
| '19-ICCV | 19/05/31 | Scene Text Visual Question Answering | Dataset | ICDAR_DB |
| '19-CVPR | 19/06/16 | DynTypo: Example-based Dynamic Text Effects Transfer | Text Effects | PRJ <br> VIDEO |
| '19-CVPR | 19/06/16 | Typography with Decor: Intelligent Text Style Transfer | Text Effects | *PYTORCH(M) |
| '19-CVPR | 19/06/16 | An Alternative Deep Feature Approach to Line Level Keyword Spotting | Keyword Spotting | |
| '19-ICCV | 19/07/23 | GA-DAN: Geometry-Aware Domain Adaptation Network for Scene Text Detection and Recognition | Domain Adaptation | |
| '19-ICCV | 19/09/17 | Chinese Street View Text: Large-scale Chinese Text Reading with Partially Supervised Learning | Dataset | ICDAR_DB |
| '19-ICCV | 19/10/02 | Large-scale Tag-based Font Retrieval with Generative Feature Learning | Font Retrieval | |
| '19-ICCV | 19/10/27 | TextPlace: Visual Place Recognition and Topological Localization Through Reading Scene Texts | Place Recognition | DB |
| '19-ICCV | 19/10/27 | DewarpNet: Single-Image Document Unwarping With Stacked 3D and 2D Regression Networks | Document Dewarping | *PYTORCH(M) |

Other lists

Tutorial Materials

Acknowledgment