OCR CONVERSION TOOL

A tool has been developed for converting image-only PDFs to searchable PDF/A format using Optical Character Recognition (OCR). The tool is implemented in Python 3.6.3, with HTML used for the front end. OCRmyPDF analyzes each page of the PDF to determine the colorspace and resolution (DPI) needed to capture all of the information on that page without losing content. It uses Ghostscript to rasterize the page, then performs OCR on the rasterized image to create an OCR text layer. That layer is then grafted back onto the original PDF to produce the searchable PDF/A output.
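The conversion step described above can be sketched as a thin wrapper around the OCRmyPDF command-line program. This is only an illustration, not the tool's actual code: the function names `build_ocr_command` and `convert_to_searchable_pdfa`, the default language `eng`, and the file names are assumptions, and the sketch assumes that `ocrmypdf` (with Ghostscript and Tesseract) is installed and on the PATH.

```python
import subprocess
from typing import List


def build_ocr_command(input_pdf: str, output_pdf: str, language: str = "eng") -> List[str]:
    """Build the ocrmypdf command line for a searchable PDF/A output.

    The function name and defaults are hypothetical; only the ocrmypdf
    options themselves come from the OCRmyPDF command-line interface.
    """
    return [
        "ocrmypdf",
        "--output-type", "pdfa",  # ask for PDF/A output
        "--language", language,   # Tesseract language code, e.g. "eng"
        input_pdf,
        output_pdf,
    ]


def convert_to_searchable_pdfa(input_pdf: str, output_pdf: str, language: str = "eng") -> None:
    """Run OCRmyPDF; raises CalledProcessError if the conversion fails."""
    subprocess.run(build_ocr_command(input_pdf, output_pdf, language), check=True)
```

A call such as `convert_to_searchable_pdfa("scan.pdf", "scan_ocr.pdf")` would then run the whole pipeline (page analysis, rasterization, OCR, grafting) in one step, since OCRmyPDF handles those stages internally.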

Upload the file to be OCRed (PDF files only)
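One way the PDF-only restriction could be enforced on the server side is to check both the file name and the first bytes of the upload. The sketch below is an assumption about how such a check might look, not the tool's actual validation: the helper name `is_pdf_upload` is hypothetical, and it strictly requires the `%PDF-` header at the very start of the file.

```python
PDF_MAGIC = b"%PDF-"  # a PDF file begins with this header


def is_pdf_upload(filename: str, head: bytes) -> bool:
    """Return True if the upload looks like a PDF.

    filename: name supplied by the browser (hypothetical parameter).
    head: the first few bytes read from the uploaded file.
    """
    has_pdf_extension = filename.lower().endswith(".pdf")
    has_pdf_header = head.startswith(PDF_MAGIC)
    return has_pdf_extension and has_pdf_header
```

Checking the header as well as the extension avoids passing a renamed image or text file to the OCR step, where it would fail later with a less helpful error.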