Electrical, Mechanical, Civil Engineering Objective Questions And Answers And Short Questions Answers For Exam, Tests and Interview Selections

Thursday 31 December 2015

How to OCR to searchable PDF in Linux

Several OCR (optical character recognition) engines are available for Linux, but most share a major drawback: they export only the plain text recognized from the image and cannot embed that text into a PDF to make it searchable.

By searchable PDF, we mean a scanned PDF document that carries an invisible layer of OCR'ed text on top of the scanned image. The text must be sized and positioned so that every word in the text layer lies exactly over the part of the image that shows that word.
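To make the overlay requirement concrete, here is a minimal sketch of the geometry involved. It assumes the OCR step reports each word as a pixel bounding box with the origin at the top-left of the image, and that the scan resolution (DPI) is known. The names `WordBox` and `to_pdf_placement` are hypothetical, and using the box height as the font size is a simplification; real tools also stretch each word horizontally to match the box width.

```python
from dataclasses import dataclass

# PDF user space uses units of 1/72 inch by default, with the origin at the
# bottom-left corner of the page.
POINTS_PER_INCH = 72.0


@dataclass
class WordBox:
    """One recognized word and its bounding box in image pixels (origin top-left)."""
    text: str
    left: int
    top: int
    right: int
    bottom: int


def to_pdf_placement(box: WordBox, image_height_px: int, dpi: float) -> dict:
    """Convert a pixel bounding box into a position and size in PDF points.

    Assumption: the PDF page has the same physical size as the scanned image.
    """
    def px_to_pt(pixels: float) -> float:
        return pixels * POINTS_PER_INCH / dpi

    return {
        "text": box.text,
        # Horizontal position is the same in both systems, only rescaled.
        "x": px_to_pt(box.left),
        # The vertical axis is flipped: measure from the bottom of the page
        # up to the bottom edge of the word.
        "y": px_to_pt(image_height_px - box.bottom),
        "width": px_to_pt(box.right - box.left),
        # Simplification: treat the box height as the font size.
        "font_size": px_to_pt(box.bottom - box.top),
    }


if __name__ == "__main__":
    # A word on a US Letter page scanned at 300 DPI (2550 x 3300 pixels).
    word = WordBox("Linux", left=300, top=600, right=600, bottom=650)
    print(to_pdf_placement(word, image_height_px=3300, dpi=300))
```

A PDF writer would then draw each word at `(x, y)` with that font size using text rendering mode 3 (neither filled nor stroked), which keeps the text invisible but still selectable and searchable.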

Here are two programs that can create searchable PDFs. One is a native Linux OCR engine; the other is a free PDF reader with OCR capabilities that runs under Wine.
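This excerpt does not name the two programs, so the following is only an illustrative sketch using Tesseract, which may or may not be the native engine the article has in mind. It assumes the `tesseract` command-line tool (version 3.03 or later, which ships the `pdf` output config) is installed and on the PATH. The helper names `build_tesseract_command` and `make_searchable_pdf` are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, output_base, language="eng"):
    """Build the argument list for: tesseract IMAGE OUTPUTBASE -l LANG pdf

    The trailing "pdf" config asks Tesseract to write OUTPUTBASE.pdf, with the
    scanned image on the page and the recognized text as an invisible layer.
    """
    return ["tesseract", str(image_path), str(output_base), "-l", language, "pdf"]


def make_searchable_pdf(image_path, output_base, language="eng"):
    """Run Tesseract on one scanned image and return the path of the PDF."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract was not found on PATH")
    command = build_tesseract_command(image_path, output_base, language)
    # check=True raises CalledProcessError if Tesseract exits with an error.
    subprocess.run(command, check=True)
    return Path(str(output_base) + ".pdf")


if __name__ == "__main__":
    # Print the command only; call make_searchable_pdf() to actually run it.
    print(" ".join(build_tesseract_command("scan.png", "scan")))
```

Calling `make_searchable_pdf("scan.png", "scan")` would write `scan.pdf` next to the input. Text in the result can then be selected or searched in any PDF viewer.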
