pdf2djvu-ocr
IMPORTANT (QUALITY) DISCLAIMER
This script is still young, and the resulting .djvu
files are not very good: they are often larger than the original PDF and of medium to low quality.
I hope people will help me improve it.
Before converting a large number of documents, benchmark performance and output quality on a small sample.
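As a rough starting point for such a benchmark, the sketch below compares the size of a converted file with its original. The helper name `size_percent` and the file names in the usage line are hypothetical and not part of this script; it only measures size, so visual and OCR quality still need to be checked by eye.

```shell
# Hypothetical helper, not part of pdf2djvu-ocr.
# size_percent ORIGINAL CONVERTED
# Prints the size of CONVERTED as an integer percentage of ORIGINAL.
size_percent() {
    # wc -c may pad its output with spaces on some systems, so strip them.
    orig_bytes=$(wc -c < "$1" | tr -d ' ')
    conv_bytes=$(wc -c < "$2" | tr -d ' ')
    if [ "$orig_bytes" -eq 0 ]; then
        echo "original file is empty: $1" >&2
        return 1
    fi
    echo $(( conv_bytes * 100 / orig_bytes ))
}
```

For example, `size_percent scan.pdf scan.djvu` (assumed file names) prints a value above 100 when the DjVu file is larger than the PDF.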
Description
This script follows the discussion on SuperUser and helps convert scanned PDFs to DjVu+OCR.
Dependencies
- stylerc: bash output style ;
- pdfsandwich ;
- tesseract-ocr ;
- pdf2djvu.
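To check that the command-line dependencies above are installed before running the script, something like the following sketch may help. The helper name `missing_commands` is hypothetical. It assumes the tesseract-ocr package provides a `tesseract` executable, and it does not cover stylerc, which is a style file rather than a command.

```shell
# Hypothetical helper, not part of pdf2djvu-ocr.
# missing_commands NAME...
# Prints each NAME that cannot be found on PATH, one per line.
missing_commands() {
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "$cmd"
        fi
    done
}
```

Running `missing_commands pdfsandwich tesseract pdf2djvu` prints nothing when all three tools are available.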
Usage
By default, i.e. when called without arguments, the script looks for PDF files in the current working directory (glob: ./*.pdf):
pdf2djvu-ocr
Otherwise, you can specify a path:
pdf2djvu-ocr /path/to/files/**/*.pdf
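Note that the shell, not the script, expands this pattern, and in bash `**` only matches recursively when the `globstar` option is enabled (`shopt -s globstar`, bash 4 or later). Where that option is unavailable, a sketch like the one below collects the same files with `find`. The helper name `list_pdfs` is hypothetical.

```shell
# Hypothetical helper, not part of pdf2djvu-ocr.
# list_pdfs DIR
# Prints every regular file ending in .pdf under DIR, recursively, sorted.
list_pdfs() {
    find "$1" -type f -name '*.pdf' | sort
}
```

Assuming the script also accepts a single file path, the result can be fed to it one file at a time: `list_pdfs /path/to/files | while IFS= read -r f; do pdf2djvu-ocr "$f"; done`.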