EATEN: Entity-aware Attention for Single Shot Visual Text Extraction
Accepted to ICDAR 2019 (arXiv)
Authors: He Guo, Xiameng Qin, Jiaming Liu, Junyu Han, Jingtuo Liu and Errui Ding
Abstract
This repository is designed to provide an open-source dataset for Visual Text Extraction.
Samples
Train ticket
Real images
Synthetic images
Some clean images
Some hard images
Passport
Some images
Some hard images
Business card
Downloads
The dataset can be downloaded through the following link:
baiduyun, PASSWORD: e4z1
Some details:
Scene | Number of images | Size | Google Drive link |
---|---|---|---|
train ticket | 300k synth + 1.9k real | 13G | dataset_trainticket.tar |
passport | 100k synth | 5.8G | dataset_passport.tar |
business card | 200k synth | 19G | dataset_business.tar.0 dataset_business.tar.1 dataset_business.tar.2 dataset_business.tar.3 |
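The business card set is listed as four files, `dataset_business.tar.0` to `dataset_business.tar.3`. The following is a minimal sketch for unpacking it, assuming these files are consecutive byte-wise pieces of one tar archive (this README does not confirm that; if each part turns out to be a standalone archive, extract them one by one instead). The helper names `join_parts` and `extract_archive` and the output paths are hypothetical and not part of this repository.

```python
import shutil
import tarfile
from pathlib import Path


def join_parts(parts, joined_path):
    """Concatenate archive parts, in the given order, into one file."""
    joined_path = Path(joined_path)
    with joined_path.open("wb") as out:
        for part in parts:
            with Path(part).open("rb") as src:
                # Stream in chunks so multi-gigabyte parts are never held in memory.
                shutil.copyfileobj(src, out)
    return joined_path


def extract_archive(archive_path, dest_dir):
    """Extract a tar archive into dest_dir and return the member names."""
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    # Use the safer "data" extraction filter when this Python version supports it.
    kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    with tarfile.open(archive_path) as tar:
        names = tar.getnames()
        tar.extractall(dest_dir, **kwargs)
    return names


if __name__ == "__main__":
    # Assumption: the four parts were downloaded into the current directory.
    parts = [f"dataset_business.tar.{i}" for i in range(4)]
    if all(Path(p).exists() for p in parts):
        archive = join_parts(parts, "dataset_business.tar")
        extract_archive(archive, "dataset_business")
```

The single-file archives (`dataset_trainticket.tar`, `dataset_passport.tar`) can be passed to `extract_archive` directly. Joining the business card parts needs roughly another 19G of free disk space for the combined file.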
Limitations and Todo
- [A large amount of training data]
  Todo:
  - Use CycleGAN or domain adaptation to synthesize data for training EATEN.
  - Introduce STR (scene text recognition) datasets to EATEN.
- [Generalization on complex scenes]
  Todo:
  - Add bounding box annotations of ToIs to EATEN, as in the ICCV 2019 oral paper "Towards Unconstrained End-to-End Text Spotting".
- [Engineering]
  - Merge the several decoders into one.
  - Decode in parallel.