tesseract-ocr

This package is a wrapper for Tesseract OCR. It helps read characters from an image.

Installation

  1. Add the dependency to your shard.yml:

    dependencies:
      tesseract-ocr:
        github: dannnylo/tesseract-ocr-crystal
    
  2. Run shards install

Usage

require "tesseract-ocr"

Basic usage:

Tesseract::Ocr.read("spec/resources/world.png") # => "world"

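With options (the keys appear to mirror Tesseract's command-line flags, such as -l for the language and --oem for the OCR engine mode):

Tesseract::Ocr.read("spec/resources/world.png", { :l => "por", :oem => "1" }) # => "world"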

Convert an image to a readable PDF:

Tesseract::Ocr.to_pdf("spec/resources/world.png", { :oem => "1" }) # => "/tmp/RANDOM_NAME.pdf"

Read the image and get word positions:

Tesseract::OcrWords.read("spec/resources/world.png") # => [{word: "world", confidence: 95, x_start: 2, y_start: 2, x_end: 185, y_end: 56}]
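
Each entry returned by Tesseract::OcrWords.read carries a confidence value and the corners of the word's bounding box, so the result can be filtered and measured with ordinary collection methods. The following is a minimal sketch, not part of the shard: it works on hard-coded sample data shaped like the result shown above (so it runs without Tesseract installed), and the second entry, the min_confidence threshold, and the variable names are assumptions made for illustration.

```crystal
# Sample data shaped like the documented result of Tesseract::OcrWords.read.
# The second entry is invented for illustration only.
words = [
  {word: "world", confidence: 95, x_start: 2, y_start: 2, x_end: 185, y_end: 56},
  {word: "w0rld", confidence: 40, x_start: 2, y_start: 60, x_end: 185, y_end: 110},
]

# Hypothetical threshold; the shard itself does not define one.
min_confidence = 80

# Keep only the words recognized with at least the chosen confidence.
confident = words.select { |w| w[:confidence] >= min_confidence }

# Derive each bounding box's size from its corner coordinates.
confident.each do |w|
  width = w[:x_end] - w[:x_start]
  height = w[:y_end] - w[:y_start]
  puts "#{w[:word]}: #{width}x#{height} at (#{w[:x_start]}, #{w[:y_start]})"
end
```

With real input, the words array would come from Tesseract::OcrWords.read instead of the literal above; whether confidence is an integer or a float in practice is not stated here, and the comparison works for either.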

Contributing

  1. Fork it (https://github.com/dannnylo/tesseract-ocr-crystal/fork)
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create a new Pull Request

Contributors