Home

Awesome

Data Augmentation for Scene Text Recognition

(Pronounced as "strog")

Paper

Why it matters?

Scene Text Recognition (STR) requires data augmentation functions that are different from object recognition. STRAug is data augmentation designed for STR. It offers 36 data augmentation functions that are sorted into 8 groups. Each function supports 3 levels or magnitudes of severity or intensity.

Given a source image:

it can be transformed as follows:

  1. warp.py - to generate Curve, Distort, Stretch (or Elastic) deformations
CurveDistortStretch
  1. geometry.py - to generate Perspective, Rotation, Shrink deformations
PerspectiveRotationShrink
  1. pattern.py - to create different grids: Grid, VGrid, HGrid, RectGrid, EllipseGrid
GridVGridHGridRectGridEllipseGrid
  1. blur.py - to generate synthetic blur: GaussianBlur, DefocusBlur, MotionBlur, GlassBlur, ZoomBlur
GaussianBlurDefocusBlurMotionBlurGlassBlurZoomBlur
  1. noise.py - to add noise: GaussianNoise, ShotNoise, ImpulseNoise, SpeckleNoise
GaussianNoiseShotNoiseImpulseNoiseSpeckleNoise
  1. weather.py - to simulate certain weather conditions: Fog, Snow, Frost, Rain, Shadow
FogSnowFrostRainShadow
  1. camera.py - to simulate camera sensor tuning and image compression/resizing: Contrast, Brightness, JpegCompression, Pixelate
ContrastBrightnessJpegCompressionPixelate
  1. process.py - all other image processing issues: Posterize, Solarize, Invert, Equalize, AutoContrast, Sharpness, Color
PosterizeSolarizeInvertEqualize
AutoContrastSharpnessColor

Pip install

pip3 install straug

How to use

Command line (e.g. input image is nokia.png):

>>> from straug.warp import Curve
>>> from PIL import Image
>>> img = Image.open("nokia.png")
>>> img = Curve()(img, mag=3)
>>> img.save("curved_nokia.png")

Python script (see test.py):

python3 test.py --image=<target image>

For example:

python3 test.py --image=images/telekom.png

The corrupted images are in results directory.

If you want to randomly apply only the desired augmentation types among multiple augmentations, see test_random_aug.py

Reference

Citation

If you find this work useful, please cite:

@inproceedings{atienza2021data,
  title={Data Augmentation for Scene Text Recognition},
  author={Atienza, Rowel},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={1561--1570},
  year={2021}
}