Text to Speech

Welcome to the Text to Speech Thesis Project home page!

I am currently a fourth-year Computer Systems Engineering student creating a text to speech database for my thesis. The database will consist of diphones and will use the MBrola speech engine for text to speech conversion.

Spoken language can be broken down into its constituents, much as a material can be broken down into atoms. In speech, the phoneme can be regarded as the fundamental building block. However, speech produced from a phoneme-based database usually sounds disjointed, because such a database 'voices' only distinct phonemes, whereas in human speech the transition from one phoneme to the next is gradual. Diphones mimic this gradual transition: each one is essentially the end half of one phoneme joined to the beginning half of the next. Diphones can therefore be concatenated with one another to create the appropriate, and often realistic-sounding, words.
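
As a rough illustration of the idea, the sketch below lists the diphones needed to cover a word, given its phonemes. It is only a minimal example and not part of the thesis database: the function name, the phoneme labels ("k", "ae", "t") and the "_" silence symbol are all assumptions made for this example, and a real diphone database may name its units differently.

```python
def phonemes_to_diphones(phonemes, silence="_"):
    """List the diphone names that cover a phoneme sequence.

    Each diphone spans the end half of one phoneme and the beginning
    half of the next. The word is padded with a silence symbol at both
    ends so that the first and last phonemes are covered in full.
    The "left-right" naming scheme is hypothetical.
    """
    padded = [silence] + list(phonemes) + [silence]
    # Pair every unit with its successor: (_, k), (k, ae), (ae, t), (t, _)
    return [f"{left}-{right}" for left, right in zip(padded, padded[1:])]


# Hypothetical phoneme labels for the word "cat".
word = ["k", "ae", "t"]
print(phonemes_to_diphones(word))
# prints ['_-k', 'k-ae', 'ae-t', 't-_']
```

A word of n phonemes therefore needs n + 1 diphones when it is padded with silence at both ends, and every join between two diphones falls in the middle of a phoneme rather than at the transition between phonemes.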

I hope to create a database that can generate about 500 words with good intelligibility. This database will then be processed by the MBrola speech engine to generate the required words. The next step is to compare the MBrola speech engine with one I write myself, and to build a front page for either MBrola or my own speech engine.


Last updated 13/03/97 (will be improved soon)

Kevin Lee
Department of Electrical Engineering (Computer Systems)
University of Queensland 4067
Email : e9329095@student.uq.edu.au