<h1 align="left">MASTER-mmocr</h1>
<!-- TABLE OF CONTENTS -->
<details open="open">
  <summary><h2 style="display: inline-block">Contents</h2></summary>
  <ol>
    <li>
      <a href="#about-the-project">About The Project</a>
      <ul>
        <li><a href="#Dependency">Dependency</a></li>
      </ul>
    </li>
    <li>
      <a href="#getting-started">Getting Started</a>
      <ul>
        <li><a href="#prerequisites">Prerequisites</a></li>
        <li><a href="#installation">Installation</a></li>
      </ul>
    </li>
    <li><a href="#usage">Usage</a></li>
    <li><a href="#result">Result</a></li>
    <li><a href="#coming-soon">Coming Soon</a></li>
    <li><a href="#license">License</a></li>
    <li><a href="#Citations">Citations</a></li>
    <li><a href="#acknowledgements">Acknowledgements</a></li>
  </ol>
</details>
<!-- ABOUT THE PROJECT -->About The Project
This project is a re-implementation of MASTER: Multi-Aspect Non-local Network for Scene Text Recognition using MMOCR, an open-source toolbox based on PyTorch. The overall architecture is shown below.
Dependency
<!-- GETTING STARTED -->Getting Started
Prerequisites
- Training uses the synthetic image datasets SynthText (Synth800k) and MJSynth (Synth90k).
- Testing uses the real image datasets IIIT5K, SVT, IC03, IC13, IC15, SVTP, and CUTE80.
- Dataset download link.
- Change the dataset path in the MASTER config.
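Before pointing the config at a dataset root, it can help to confirm that the expected directories are actually there. The sketch below is a hypothetical helper, not part of this project: `check_datasets` and the one-sub-directory-per-dataset layout are assumptions, so adapt the names to however your data is stored.

```shell
# Hypothetical helper: report which dataset directories exist under a root.
# Assumes one sub-directory per dataset name; this layout is an assumption,
# not something this project prescribes.
check_datasets() {
  local root="$1"; shift
  local missing=0
  local name
  for name in "$@"; do
    if [ -d "$root/$name" ]; then
      echo "found:   $root/$name"
    else
      echo "missing: $root/$name"
      missing=$((missing + 1))
    fi
  done
  # Return success only if every listed directory was found.
  [ "$missing" -eq 0 ]
}
```

For example, `check_datasets /path/to/data IIIT5K SVT IC03 IC13 IC15 SVTP CUTE80` lists each directory and returns a non-zero status if any is missing.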
Installation
- Install mmdetection. Click here for details.
    # We embed the mmdetection-2.11.0 source code into this project.
    # You can cd into it and install it (recommended).
    cd ./mmdetection-2.11.0
    pip install -v -e .
- Install mmocr. Click here for details.
    # install mmocr
    cd ./MASTER_mmocr
    pip install -v -e .
- Install mmcv-full-1.3.4. Click here for details.
    pip install mmcv-full=={mmcv_version} -f https://download.openmmlab.com/mmcv/dist/{cu_version}/{torch_version}/index.html
    # install mmcv-full-1.3.4 with torch version 1.8.0 cuda_version 10.2
    pip install mmcv-full==1.3.4 -f https://download.openmmlab.com/mmcv/dist/cu102/torch1.8.0/index.html
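The mmcv-full index URL above is a template with three placeholders. As a minimal sketch, the two functions below fill in that template and print the resulting install command so it can be checked before running it. The function names `mmcv_index_url` and `mmcv_install_cmd` are hypothetical; only the URL pattern comes from the command above.

```shell
# Fill in the mmcv-full wheel index URL template.
mmcv_index_url() {
  local cu_version="$1"     # e.g. cu102
  local torch_version="$2"  # e.g. torch1.8.0
  echo "https://download.openmmlab.com/mmcv/dist/${cu_version}/${torch_version}/index.html"
}

# Print (rather than run) the pip command for a given mmcv-full version.
mmcv_install_cmd() {
  local mmcv_version="$1"   # e.g. 1.3.4
  echo "pip install mmcv-full==${mmcv_version} -f $(mmcv_index_url "$2" "$3")"
}
```

Calling `mmcv_install_cmd 1.3.4 cu102 torch1.8.0` prints the same command as the concrete example above. Whether a prebuilt wheel exists for other CUDA and PyTorch combinations has to be checked against the index itself.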
Usage
The usage of this project is consistent with MMOCR-0.2.0. Click here for MMOCR usage details.
For training, run the following command:
CUDA_VISIBLE_DEVICES={device_id} PORT={port_number} ./tools/dist_train.sh {config_path} {work_dir} {gpu_number}
# example
CUDA_VISIBLE_DEVICES=0 PORT=29500 ./tools/dist_train.sh ./configs/textrecog/master/master_ResnetExtra_academic_dataset_dynamic_mmfp16.py /expr/mmocr_text_line_recognition/ 1
PS:
- As mentioned in the Prerequisites section, we use the synthetic image datasets for training and the real image datasets for evaluation. The 7 real image datasets listed there are evaluated at each evaluation interval.
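The training command template above takes the device list and the GPU count as separate arguments, which are easy to get out of sync. The sketch below is a hypothetical wrapper (`train_cmd` is not part of this project) that derives the GPU count from the device list and prints the command instead of running it. It assumes one training process per listed device.

```shell
# Hypothetical wrapper: assemble the dist_train.sh command line and print it.
train_cmd() {
  local device_ids="$1"  # comma-separated, e.g. 0 or 0,1,2,3
  local port="$2"        # e.g. 29500
  local config="$3"      # path to the config file
  local work_dir="$4"    # output directory for logs and checkpoints
  # Count the listed devices (assumption: one process per device).
  local gpu_number
  gpu_number="$(echo "$device_ids" | tr ',' '\n' | grep -c .)"
  echo "CUDA_VISIBLE_DEVICES=${device_ids} PORT=${port} ./tools/dist_train.sh ${config} ${work_dir} ${gpu_number}"
}
```

For instance, `train_cmd 0,1,2,3 29500 {config_path} {work_dir}` prints a command ending in `4`. Once the printed line looks right, it can be pasted into the shell to start training.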
Result
Dataset | Paper reported accuracy | Our accuracy |
---|---|---|
IIIT5K | 95.0 | 95.07 |
SVT | 90.6 | 90.42 |
IC03 | 96.4 | 95.58 |
IC13 | 95.3 | 96.03 |
IC15 | 79.4 | 80.95 |
SVTP | 84.5 | 84.34 |
CUTE80 | 87.5 | 90.62 |
Coming Soon
- 1st place solution for the ICDAR 2021 Competition on Scientific Table Image Recognition to LaTeX.
License
This project is licensed under the MIT License. See LICENSE for more details.
<!-- Citations -->Citations
If you find MASTER useful, please cite the paper:
@article{Lu2021MASTER,
title={{MASTER}: Multi-Aspect Non-local Network for Scene Text Recognition},
author={Ning Lu and Wenwen Yu and Xianbiao Qi and Yihao Chen and Ping Gong and Rong Xiao and Xiang Bai},
journal={Pattern Recognition},
year={2021}
}
<!-- ACKNOWLEDGEMENTS -->