Multilingual OCR

Many projects funded by government and industry are underway to scan hundreds of thousands of Indic-script documents and manuscripts, creating large digital library archives that preserve these treasures for posterity.

Optical character recognition (OCR) is a key enabling technology for making these archives practically accessible to researchers and lay users alike, because it produces searchable indexes and machine-readable text repositories of the scanned documents. This book provides a technical overview of the latest research and the current state of the art in OCR for the different Indic scripts, and it also covers other issues in the creation of accessible digital libraries for these scripts.