Code Yarns ‍👨‍💻

How to add OCR to PDF using PDFOCR

📅 2014-Jun-09 ⬩ ✍️ Ashwin Nanjappa

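PDFOCR is a Ruby script that adds an OCR text layer to a scanned PDF file, so that its text can be searched and selected. To use it, install the OCR tools, clone the script and run it on the scanned PDF: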

$ sudo apt install tesseract-ocr tesseract-ocr-eng exactimage
$ git clone https://github.com/gkovacs/pdfocr
$ ruby pdfocr/pdfocr.rb -i foo.pdf -o out.pdf
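
To process a whole folder of scans, the same command can be wrapped in a small shell loop. This is only a sketch: the `ocr_all` function, the `PDFOCR` variable and the `-ocr` output suffix are names made up for this example, and the default path assumes the script was cloned into `pdfocr/` in the current directory.

```shell
# Hypothetical helper, not part of pdfocr itself.
# PDFOCR is an assumed variable holding the command that runs the script;
# change it if the repository was cloned somewhere else.
PDFOCR="${PDFOCR:-ruby pdfocr/pdfocr.rb}"

# ocr_all SRC_DIR DST_DIR
# Runs pdfocr on every *.pdf in SRC_DIR and writes NAME-ocr.pdf into DST_DIR.
ocr_all() {
    src_dir="$1"
    dst_dir="$2"
    mkdir -p "$dst_dir" || return 1
    for f in "$src_dir"/*.pdf; do
        # Skip the unexpanded pattern when the directory has no PDFs
        [ -e "$f" ] || continue
        out="$dst_dir/$(basename "$f" .pdf)-ocr.pdf"
        # Only the -i and -o options from the command above are used
        $PDFOCR -i "$f" -o "$out" || echo "failed: $f" >&2
    done
}
```

For example, `ocr_all scans scans-ocr` would read `scans/foo.pdf` and write `scans-ocr/foo-ocr.pdf`. A file that fails is reported on stderr and the loop moves on to the next one.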

Tried with: Ubuntu 14.04