Optical character recognition (OCR) programs generally work best with images of printed media, such as books, and with computer-generated copies from laser and inkjet printers. Some documents are not suitable for conversion to an editable format. Identify images with poor quality or other issues before you start, so you can determine whether OCR is right for your project. Common problem cases include:
# 1 - Handwritten or hand-stamped sheets are not suitable for automated processing. Pages containing annotations and cross-outs produce a high error rate. In addition, originals that mix text, pictures and graphics tend to have recognition problems, although these can usually be corrected with some manual adjustment.
# 2 - Scans of old documents that have lost contrast, color definition and clarity will not produce optimal results. Pages generated by fax machines and dot matrix printers also tend to give poor results.
# 3 - Hard copies typed on a typewriter with a worn ribbon, carbon copies and sheets with light characters do not produce good results with optical character recognition. By the end of the 1980s, computer word processing applications had largely replaced typewriters. However, many archives still contain a large number of typewritten pages.
# 4 - Lightweight paper stocks that crease or crumple and jam the scanner are another issue you may encounter. Fragile or poor quality originals can be scanned on a flatbed scanner, or copied on a photocopy machine first, to avoid further damage to the original. Another solution is to capture the pages with a digital camera. However, there is no guarantee that the extra work and effort will produce acceptable output.
# 5 - Hard copies without consistent formatting and clearly defined columns are not suitable for output to Excel. In such cases, it is faster and more accurate to key in the data manually. However, OCR scanning to Excel spreadsheet format works well for sheets whose columns are cleanly separated, as if delimited with tabs. The tabulated data should closely resemble a tab-delimited or CSV (comma-separated values) file.
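One rough way to judge whether OCR output is regular enough for a spreadsheet is to check how many rows share the same number of columns. The sketch below is only an illustration using Python's standard library. The function name `column_consistency` and the sample text are hypothetical, and the threshold at which you would fall back to manual entry is a judgment call for your own project.

```python
import csv
import io
from collections import Counter

def column_consistency(text, delimiter="\t"):
    """Return (most common column count, fraction of rows with that count)."""
    # Parse the OCR text as delimited data and skip completely blank lines.
    rows = [row for row in csv.reader(io.StringIO(text), delimiter=delimiter) if row]
    if not rows:
        return 0, 0.0
    # Count how many rows have each column width.
    counts = Counter(len(row) for row in rows)
    width, matching = counts.most_common(1)[0]
    return width, matching / len(rows)

# Hypothetical OCR output: three clean rows and one row that lost its tabs.
sample = (
    "Name\tQty\tPrice\n"
    "Widget\t4\t2.50\n"
    "Gadget\t10\t1.25\n"
    "Broken row with no tabs\n"
)

width, ratio = column_consistency(sample)
print(f"{width} columns in {ratio:.0%} of rows")
```

A ratio close to 1.0 suggests the page converted into clean columns. A low ratio suggests the layout was lost and the sheet may be a candidate for manual keying.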
In cases where the originals are inappropriate for OCR software, manual data entry is a better solution. Automated processing does not save resources when you need to go back and substantially correct the output. It is much easier and more accurate to do it right the first time. You may be surprised to find that outsourcing your project to a scanning company with offshore BPO (business process outsourcing) services is both a timely and affordable solution. This is due to the lower cost of offshore labor, which makes manual correction and re-typing cost effective.
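The trade-off between OCR with correction and straight manual entry can be estimated with simple arithmetic. The sketch below is a back-of-the-envelope model, not a measured benchmark. Every number in it (characters per page, error rates, seconds per correction, typing time) is an assumed placeholder that you should replace with figures from a small trial on your own documents.

```python
def ocr_minutes_per_page(chars_per_page, error_rate, seconds_per_fix, scan_minutes):
    """Estimate minutes per page for OCR plus manual correction."""
    # Expected number of wrongly recognized characters on one page.
    errors = chars_per_page * error_rate
    # Time to scan the page plus time to fix each error by hand.
    return scan_minutes + errors * seconds_per_fix / 60

# Assumed placeholder values, for illustration only.
CHARS_PER_PAGE = 2000        # typical text page (assumption)
SECONDS_PER_FIX = 10         # time to find and fix one error (assumption)
SCAN_MINUTES = 0.5           # time to scan one page (assumption)
MANUAL_MINUTES_PER_PAGE = 8  # time to type one page from scratch (assumption)

for error_rate in (0.02, 0.10):
    ocr = ocr_minutes_per_page(CHARS_PER_PAGE, error_rate, SECONDS_PER_FIX, SCAN_MINUTES)
    better = "OCR" if ocr < MANUAL_MINUTES_PER_PAGE else "manual entry"
    print(f"error rate {error_rate:.0%}: OCR takes {ocr:.1f} min/page, choose {better}")
```

Under these assumed figures, a clean original favors OCR, while a poor original with a high error rate makes manual entry the faster option.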