| Starting date of the project: December 2006
 
| Month of the Year | Progress |  
| December-06 | A detailed study of the existing Gurmukhi OCR has been made and its limitations and areas of improvement have been noted. The following observations have been made about the present Gurmukhi OCR:
 Strengths:
 Limitations: susceptible to noise; works well only on clean documents; touching consonants not recognized; does not recognize digits and some special symbols; works only for single-column text.
 Work initiated for development of Corpus for training and testing the OCR.
 |  
| January-07 | Twenty-five books representing different fonts, time periods, publishers and print quality identified for development of the corpus. Around 1000 pages scanned for the corpus. Segmentation algorithms for overlapping text lines and merged characters are being developed.
 |  |  |