subdetection

Subtitle detection library based on Tesseract-OCR and Qt5.

Version used:

If you are unsure how to compile Tesseract on Windows, see my tutorials:

http://broija.blogspot.fr/2014/10/build-tesseract-ocr-30202-with-qt-51.html

http://broija.blogspot.fr/2014/10/compile-tesseract-ocr-30202-with-mingw.html

Description

This library contains several classes, structures and functions to help detect subtitles in images.

All tunable parameters are held in the Parameters structure. You can also use a ParameterManager to save these parameters to a file, load them from one, or work with them through the signal/slot mechanism.

The Detector class is the one to use for text detection.

Parameters

Instructions

Text detection


Detector usage is quite straightforward:

If text has been detected, you'll find it in the QStringList passed as a parameter.

Language files


Tesseract language files can be used by putting them in a tessdata directory.

Then, instantiate a Detector and pass it the path of the directory that contains tessdata, along with the three-character code referring to your language files (for example, eng).
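As a rough illustration of the directory layout described above, the sketch below builds the path where Tesseract would be expected to look for a language file, assuming the usual `<code>.traineddata` naming inside tessdata. `trainedDataPath` is a hypothetical helper written for this example only; it is not part of subdetection and relies only on the C++ standard library.

```cpp
#include <string>

// Hypothetical helper (not part of subdetection): given the directory that
// CONTAINS tessdata and a language code such as "eng", return the path where
// the matching language file is expected, assuming <code>.traineddata naming.
std::string trainedDataPath(const std::string& tessdataParent,
                            const std::string& lang)
{
    std::string path = tessdataParent;

    // Append a separator only if the parent path does not already end with one.
    if (!path.empty() && path.back() != '/' && path.back() != '\\')
        path += '/';

    return path + "tessdata/" + lang + ".traineddata";
}
```

Checking that the returned path exists before constructing a Detector may help tell a missing language file apart from a detection problem.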

You can download language files here:

https://code.google.com/p/tesseract-ocr/downloads/list

Tests

Application


A test project is available in the test/app/ directory.

Unit tests


Unit tests are available in the test/unit_tests/ directory.