Code Yarns ‍👨‍💻

How to add OCR to PDF using PDFOCR

📅 2014-Jun-09 ⬩ ✍️ Ashwin Nanjappa ⬩ 🏷️ ocr, pdf, pdfocr, tesseract

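PDFOCR is a Ruby script that adds an OCR text layer to a scanned PDF file, making its text searchable. The commands below install Tesseract and ExactImage, fetch the script, and run it on foo.pdf: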

$ sudo apt install tesseract-ocr tesseract-ocr-eng exactimage
$ git clone https://github.com/gkovacs/pdfocr
$ ruby pdfocr/pdfocr.rb -i foo.pdf -o out.pdf

Tried with: Ubuntu 14.04
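
To OCR a whole folder of scans, the single command above could be wrapped in a small loop. This is only a rough sketch and was not part of the original test: the function name ocr_all_pdfs, the PDFOCR_CMD variable, the "-ocr" output suffix and the assumption that the repository was cloned into ./pdfocr are all made up for illustration. Only the -i and -o flags come from the command shown above.

```shell
#!/bin/sh
# Sketch: run pdfocr on every PDF in a directory.
# Assumption: the script lives at pdfocr/pdfocr.rb relative to the current
# directory. Set PDFOCR_CMD to point at it if it is somewhere else.
PDFOCR_CMD="${PDFOCR_CMD:-ruby pdfocr/pdfocr.rb}"

# ocr_all_pdfs IN_DIR OUT_DIR (hypothetical helper, not part of pdfocr)
ocr_all_pdfs() {
    in_dir="$1"
    out_dir="$2"
    mkdir -p "$out_dir" || return 1
    for pdf in "$in_dir"/*.pdf; do
        # If nothing matches, the glob stays literal, so skip it.
        [ -e "$pdf" ] || continue
        name=$(basename "$pdf" .pdf)
        # PDFOCR_CMD is left unquoted on purpose so that it splits into
        # the command and its arguments (sh and bash behaviour).
        $PDFOCR_CMD -i "$pdf" -o "$out_dir/$name-ocr.pdf" \
            || echo "failed: $pdf" >&2
    done
}
```

Calling `ocr_all_pdfs scans searchable` would then write searchable/foo-ocr.pdf for each scans/foo.pdf. A file that fails is reported on stderr and the loop moves on to the next one.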


© 2022 Ashwin Nanjappa • All writing under CC BY-SA license