SRNet-Datagen - A data generator of SRNet

 

Introduction

This is a data generator for SRNet, the model described in the paper Editing Text in the Wild.

Our TensorFlow reproduction of SRNet: https://github.com/youdao-ai/SRNet

Original paper: Editing Text in the Wild by Liang Wu, Chengquan Zhang, Jiaming Liu, Junyu Han, Jingtuo Liu, Errui Ding and Xiang Bai.

This data generator is a simplified combination of the following two projects.

Synthtext: we extracted the rendering part of the project and ported it to Python 3 to produce i_s, t_t, t_f and mask_t.

Skeletonization-of-Digital-Patterns: we ported this project to Python 3 to skeletonize mask_t and produce t_sk.

 

Generate data

First, prepare a directory of fonts and a dataset of background images that contain no text. You can also prepare a word corpus for rendering.

Write the absolute path of each image in the background dataset on its own line in a text file, then set the bg_filepath parameter in Synthtext/data_cfg.py to the path of that file.
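
The file list described above can be built with a few lines of Python. This is a minimal sketch, not part of the project: the function name write_bg_filelist and the set of accepted image extensions are assumptions made for illustration, so adjust them to match your background dataset.

```python
import os

# Assumed set of image extensions; extend it if your backgrounds use other formats.
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".bmp"}


def write_bg_filelist(bg_dir, out_path):
    """Write the absolute path of every image under bg_dir, one per line.

    Hypothetical helper: returns the number of paths written.
    """
    paths = []
    for root, _dirs, files in os.walk(bg_dir):
        for name in files:
            # Keep only files whose extension looks like an image.
            if os.path.splitext(name)[1].lower() in IMAGE_EXTS:
                paths.append(os.path.abspath(os.path.join(root, name)))
    paths.sort()  # stable order makes the list reproducible
    with open(out_path, "w", encoding="utf-8") as f:
        for p in paths:
            f.write(p + "\n")
    return len(paths)
```

Calling write_bg_filelist("backgrounds", "bg_list.txt") would produce a file whose path can then be assigned to bg_filepath; both names here are placeholders.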

You can adjust other data configurations in Synthtext/data_cfg.py. The following is a description of some parameters.

Then adjust the generation settings in cfg.py, including the output directory, the amount of data to generate and the number of processes to use.

Finally, run python3 datagen.py to start generating.

You can also use this project to generate data online while training SRNet.
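
One common way to generate data online is to let a background worker fill a bounded queue while the training loop consumes batches from it. The sketch below only illustrates that pattern and is not the project's own interface: make_sample is a hypothetical stand-in for a call into the generator, and run_online, producer and batches are names invented for this example.

```python
import queue
import threading


def make_sample(index):
    # Hypothetical stand-in for one call into the data generator.
    # A real sample would hold images such as i_s, i_t, t_sk, t_t, t_b, t_f, mask_t.
    return {"index": index}


def producer(q, num_samples):
    """Generate samples and push them into the queue."""
    for i in range(num_samples):
        q.put(make_sample(i))  # blocks while the queue is full
    q.put(None)  # sentinel: no more samples


def batches(q, batch_size):
    """Yield lists of samples until the sentinel arrives."""
    batch = []
    while True:
        item = q.get()
        if item is None:
            break
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # final, possibly smaller batch
        yield batch


def run_online(num_samples, batch_size, queue_size=8):
    """Run the producer in the background and collect the batches it feeds."""
    q = queue.Queue(maxsize=queue_size)
    worker = threading.Thread(target=producer, args=(q, num_samples), daemon=True)
    worker.start()
    collected = [b for b in batches(q, batch_size)]  # a training step would go here
    worker.join()
    return collected
```

The bounded queue keeps the generator from running far ahead of training. Because rendering is CPU-bound, a process-based queue from the multiprocessing module may be a better fit than a thread in practice.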

Example outputs, from left to right and top to bottom: i_s, i_t, t_sk, t_t, t_b, t_f, mask_t.

 

Requirements

 

Reference