OCR can save many hours of labor when printed materials must be converted into electronic format. There are many motivations for digitizing documents. Digitizing paper forms allows anyone on the Web to complete and submit them online, saving time, paper, and postage. Once old paper documents are converted to electronic format, one can save space, make unlimited copies, share them with customers, clients, and employees, edit them, publish them on the Web, convert them to other useful formats, and index them for searchable databases.
OCR can also help preserve invaluable manuscripts and historical texts, which otherwise have a limited life. Because their primary concern is the study of texts, historical researchers will welcome any technique that makes their work more efficient. Thus, when applied well, OCR programs justify their use in a historical project.
OCR has three real benefits, all of which may represent huge gains for any business with a lot of paperwork. These benefits are:
Reduced Storage :
Once scanned, a paper document may be either recycled or stored in an inexpensive facility. In either case, the paper is no longer of immediate concern. Simply scanning paper, however, gives only a limited benefit over keeping the paper itself, because a scan is just an image of the page. OCR converts the scans into intelligent documents, thereby making scanning a more useful option in the first place.
Intelligent Text :
Once a scan is converted into actual text, its file usually takes up less hard disk space and its contents become more valuable. With actual text, one can copy and paste the document into other applications like a word processor or spreadsheet. Once the text is in use in an application, there is no limit to what one can do with it. One can add numbers, check spelling, search for words, format text, and so on.
Text recognition is roughly 10 to 12 times faster than manual retyping. To quote just one hard figure: a (very) fast secretary types some 200 characters per minute, whereas OCR software recognizes several hundred characters per second. (Some time must be added for scanning and for operating the software.)
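The figures quoted above can be checked with a little arithmetic. The following is only a rough sketch: the typing rate of 200 characters per minute comes from the text, while the OCR rate of 300 characters per second (one reading of "several hundred"), the page size of 2,000 characters, and the 50 seconds of scanning and handling overhead are illustrative assumptions, not measured values.

```python
# Rough comparison of manual retyping and OCR for one page of text.
# Only the typing rate comes from the text; the other values are assumptions.

TYPING_CHARS_PER_MIN = 200   # from the text: a very fast typist
OCR_CHARS_PER_SEC = 300      # assumption: one reading of "several hundred"


def typing_seconds(num_chars):
    """Seconds needed to retype num_chars by hand."""
    return num_chars * 60 / TYPING_CHARS_PER_MIN


def ocr_seconds(num_chars, overhead_seconds=0.0):
    """Seconds needed to recognize num_chars, plus scanning/handling overhead."""
    return num_chars / OCR_CHARS_PER_SEC + overhead_seconds


def speedup(num_chars, overhead_seconds=0.0):
    """How many times faster OCR is than retyping for the same text."""
    return typing_seconds(num_chars) / ocr_seconds(num_chars, overhead_seconds)


page_chars = 2000  # assumption: characters on one printed page

print(f"Raw recognition speedup: {speedup(page_chars):.1f}x")
print(f"Speedup with 50 s overhead: {speedup(page_chars, overhead_seconds=50):.1f}x")
```

Under these assumptions, recognition alone comes out about 90 times faster than retyping, and roughly 50 seconds of scanning and handling per page brings the overall gain down to about 10.6 times. This suggests, though the text does not say so explicitly, that the 10 to 12 times figure is meant to include that overhead.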