Punjabi Text To Speech Help for People with Cognitive Disabilities
Department of Science & Technology
Year 2011 to 2013
The proposed project is a multidisciplinary project involving Cognitive Science, Computer Science and Linguistics. The objective of the project is to develop a text-to-speech (TTS) synthesis system for the Punjabi language as an aid for persons with cognitive disabilities such as dyslexia, visual comprehension difficulties and other learning disabilities. This TTS system will be used as an add-on tool embedded in web browsers, enabling the browser to read a website aloud in Punjabi. With more and more electronic data becoming available online, software that includes this TTS system as an add-on will help information dissemination: a user who cannot read Punjabi but can understand it will be able to get the information contained in a document or webpage by listening to it. This type of assistive technology can be particularly helpful to individuals with cognitive disabilities, visually impaired persons and elderly people who find it difficult to read from a computer screen.
The project work will be carried out as per the following steps:
Text pre-processing: The input text will first be normalized to resolve the various ambiguities present in it, so that the system is able to speak the input text intelligently. This will include:
Transforming the words into the form in which native speakers pronounce them. Native speakers do not pronounce words exactly as they are written, so the words will be transformed to their spoken form to increase naturalness.
Schwa deletion will be performed on the words of the input text.
Syllabification of the words of the input text.
Preparation of Punjabi Speech Database: A Punjabi speech database of syllable sounds will be developed after a thorough statistical analysis of Punjabi syllables over a carefully selected Punjabi corpus.
Development of a system that will concatenate the sounds of the syllables of the input Punjabi words.
Normalisation of the synthesized speech to eliminate or reduce the discontinuities at the concatenation points using Digital Signal Processing techniques.
Embedding Emotions: Applying rules to make the system capable of producing output speech in different voices, such as male or female, and with different emotions.
Enabling this TTS system to be integrated with existing software such as web browsers, editors etc.
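The syllabification, concatenation and smoothing steps listed above can be illustrated with a minimal sketch. It is not the project's implementation and rests on several assumptions: words are written in a romanized toy alphabet rather than Gurmukhi, the syllable rule is a generic consonant-vowel heuristic rather than a Punjabi-specific one, the "speech database" is a hypothetical dictionary of synthetic sine tones, and a linear crossfade stands in for the unspecified Digital Signal Processing techniques. The names syllabify, crossfade_concat, synthesize and toy_db are illustrative only.

```python
import math

VOWELS = set("aeiou")  # assumption: romanized toy alphabet, not Gurmukhi


def syllabify(word):
    """Split a romanized word into syllables with a simple CV heuristic."""
    # Locate vowel nuclei as maximal runs of vowel letters.
    nuclei, i = [], 0
    while i < len(word):
        if word[i] in VOWELS:
            j = i
            while j < len(word) and word[j] in VOWELS:
                j += 1
            nuclei.append((i, j))
            i = j
        else:
            i += 1
    if not nuclei:
        return [word]
    # One boundary per consonant cluster between two nuclei: a single
    # consonant opens the next syllable; in a longer cluster the first
    # consonant closes the previous syllable.
    bounds = [0]
    for (_, end), (start, _) in zip(nuclei, nuclei[1:]):
        bounds.append(end if start - end <= 1 else end + 1)
    bounds.append(len(word))
    return [word[a:b] for a, b in zip(bounds, bounds[1:])]


def crossfade_concat(units, overlap=80):
    """Join sample lists, blending `overlap` samples at each joint."""
    if not units:
        return []
    out = list(units[0])
    for unit in units[1:]:
        k = min(overlap, len(out), len(unit))
        base = len(out) - k
        for i in range(k):
            w = (i + 1) / (k + 1)  # weight of the incoming unit
            out[base + i] = (1 - w) * out[base + i] + w * unit[i]
        out.extend(unit[k:])
    return out


def synthesize(word, database, overlap=80):
    """Look up each syllable's samples and concatenate them smoothly."""
    # A syllable missing from the database raises KeyError.
    units = [database[s] for s in syllabify(word)]
    return crossfade_concat(units, overlap)


def tone(freq, n=400, rate=8000):
    """Synthetic stand-in for a recorded syllable (assumption)."""
    return [math.sin(2 * math.pi * freq * t / rate) for t in range(n)]


# Hypothetical database: one placeholder tone per syllable.
toy_db = {"pan": tone(220), "ja": tone(330), "bi": tone(440)}
samples = synthesize("panjabi", toy_db)
```

Here the word "panjabi" is split into three syllables, each mapped to a 400-sample placeholder, and the two joints are blended over 80 samples each. A real system would replace the heuristic with Punjabi syllabification rules applied after schwa deletion, and the tones with recorded syllable sounds.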
Time Schedule of Activities (Bar Diagram)
1) Text preprocessing
2) Creation of speech database
3) Transcription module
4) Synthesis module
5) Normalisation and Emotion Embedding
6) Integration with web browser, OCR and word processor