Several OCR (optical character recognition) engines are available for Linux, but most share a major drawback: they can only export the recognized text as plain text, and they cannot embed that text in the PDF to produce a searchable PDF.
By searchable PDF, we mean a scanned PDF document that carries an invisible layer of OCR'ed text over the scanned image. The text must be sized so that it sits over the text portions of the image: every word in the text layer should exactly overlay the part of the image that contains that word.
Here are two software solutions that can create searchable PDFs. One is a native Linux OCR engine, and the other is a free PDF reader with OCR capabilities that runs under Wine.
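As a rough illustration of the native route, here is a minimal Python sketch that shells out to the Tesseract command-line tool, whose `pdf` output mode writes a PDF with the scanned image and an invisible text layer. Tesseract is an assumption here (the text above does not name the engine), and the function names, file names and the `eng` language default are hypothetical choices made for the example.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_pdf_command(image_path, output_base, language="eng"):
    """Build the argument list for: tesseract IMAGE OUTPUTBASE -l LANG pdf

    Tesseract appends ".pdf" to OUTPUTBASE itself, so no extension is passed.
    """
    return ["tesseract", str(image_path), str(output_base), "-l", language, "pdf"]


def ocr_to_searchable_pdf(image_path, output_base, language="eng"):
    """Run Tesseract (assumed to be installed) and return the expected PDF path."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract was not found on PATH")
    command = build_tesseract_pdf_command(image_path, output_base, language)
    # check=True raises CalledProcessError if Tesseract exits with an error.
    subprocess.run(command, check=True)
    return Path(str(output_base) + ".pdf")
```

Calling `ocr_to_searchable_pdf("scan.png", "scan_ocr")` should leave a `scan_ocr.pdf` next to the input, provided Tesseract and the matching language data are installed. Keeping the command builder separate from the runner makes it easy to inspect or log the exact command before anything is executed.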
Thursday, 31 December 2015
How to OCR to searchable PDF in Linux
Labels:
Linux,
Linux - How to,
OCR,
Software,
Software - PDF