Case-Sensitive-Scene-Text-Recognition-Datasets
This project is part of the research work of the following paper:
UnrealText: Synthesizing Realistic Scene Text Images from the Unreal World (CVPR 2020) [GitHubRepo]
If you find this project useful in your research, you are encouraged to cite our paper:
@inproceedings{long2020unreal,
title={UnrealText: Synthesizing Realistic Scene Text Images from the Unreal World},
author={Long, Shangbang and Yao, Cong},
booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2020}
}
Background
Four of the most popular scene text recognition datasets, IIIT5K, SVT, SVTP, and CUTE-80, have incomplete annotations: they provide only case-insensitive labels and omit punctuation marks.
To support a better understanding of scene text recognition models, we re-annotated these datasets and release the new annotations here.
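To illustrate why the distinction matters, here is a minimal sketch of how word accuracy can differ between a strict comparison (case-sensitive, punctuation kept) and the relaxed comparison that the original annotations allow. The function names `normalize` and `word_accuracy` and the sample strings are hypothetical and chosen only for illustration; they are not part of this repository or of any dataset.

```python
import string


def normalize(text, case_sensitive=True, keep_punctuation=True):
    """Prepare a label or prediction for comparison (hypothetical helper)."""
    if not keep_punctuation:
        # Drop ASCII punctuation, as the original annotations do.
        text = "".join(ch for ch in text if ch not in string.punctuation)
    if not case_sensitive:
        text = text.lower()
    return text


def word_accuracy(predictions, labels, case_sensitive=True, keep_punctuation=True):
    """Fraction of predictions that exactly match their labels after normalization."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        return 0.0
    correct = sum(
        normalize(p, case_sensitive, keep_punctuation)
        == normalize(g, case_sensitive, keep_punctuation)
        for p, g in zip(predictions, labels)
    )
    return correct / len(labels)


# Made-up examples, not taken from any of the datasets.
labels = ["Coffee", "SALE!", "exit"]
predictions = ["coffee", "SALE", "exit"]

strict = word_accuracy(predictions, labels)
relaxed = word_accuracy(
    predictions, labels, case_sensitive=False, keep_punctuation=False
)
print(f"strict: {strict:.3f}, relaxed: {relaxed:.3f}")
```

In this toy example the relaxed score counts all three predictions as correct, while the strict score counts only one, which is the kind of gap that case-insensitive annotations hide.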
Dataset Statistics
Dataset Name | #Images |
---|---|
CUTE80 | 288 |
IIIT5K test set | 3000 |
IIIT5K training set | 2000 |
SVT test set | 647 |
SVT training set | 257 |
SVTP test set | 645 |