Can a computer produce emotional speech?

Text-to-speech synthesizers are widely used. At ISTC, the Speech and Multimodal Communication Laboratory (SMCL) is working to extend this technology so that synthesized speech can also convey emotion.

Electronic speech synthesizers started to spread in the early 1980s. The first results were often barely comprehensible, but the "ability" of computers to talk is now well known. In recent years, research in this field has focused strongly on making speech synthesizers sound less robotic and reproduce human speech faithfully. At ISTC, the Speech and Multimodal Communication Laboratory (SMCL) is working in this direction. One of its most important results is the Italian version of FESTIVAL, a multilingual text-to-speech system developed by the Centre for Speech Technology Research at the University of Edinburgh.

SMCL's aim was to move from a neutral "narrative style" to a more varied "emotive style". To do so, its voice processing algorithms for emotional speech synthesis focused on controlling phoneme duration and pitch, which are the main parameters for voice quality.
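
As a rough illustration of what controlling phoneme duration and pitch can mean in practice, the sketch below rescales both parameters for each phoneme according to a target style. It is a minimal sketch under stated assumptions, not SMCL's or FESTIVAL's actual algorithm: the `Phoneme` type, the `apply_style` function, the style names, and the scaling factors are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phoneme:
    """One phoneme with its prosodic parameters (hypothetical structure)."""
    symbol: str
    duration_ms: float
    pitch_hz: float  # 0.0 marks an unvoiced phoneme


# Hypothetical (duration scale, pitch scale) pairs per style.
# These numbers are illustrative assumptions, not measured values.
STYLE_FACTORS = {
    "neutral": (1.0, 1.0),
    "joy": (0.9, 1.2),       # assumed: faster and higher
    "sadness": (1.25, 0.9),  # assumed: slower and lower
}


def apply_style(phonemes, style):
    """Return new phonemes with duration and pitch rescaled for `style`."""
    duration_scale, pitch_scale = STYLE_FACTORS[style]
    return [
        Phoneme(
            symbol=p.symbol,
            duration_ms=p.duration_ms * duration_scale,
            pitch_hz=p.pitch_hz * pitch_scale,  # unvoiced (0.0) stays 0.0
        )
        for p in phonemes
    ]


if __name__ == "__main__":
    # A made-up neutral rendering of the word "ciao".
    neutral = [
        Phoneme("tS", 80.0, 0.0),
        Phoneme("a", 100.0, 200.0),
        Phoneme("o", 120.0, 180.0),
    ]
    for p in apply_style(neutral, "sadness"):
        print(p.symbol, round(p.duration_ms, 1), round(p.pitch_hz, 1))
```

A real system would vary these factors across the utterance and shape the whole pitch contour instead of applying one constant per style; the single multiplier here only makes the two control parameters concrete.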

FESTIVAL now also speaks Italian, and it does so almost like a native speaker. This emotional speech synthesizer has applications ranging from assistive technology for users with impairments to electronic games.


Contact: Piero Cosi

ISTC Group: Speech and Multimodal Communication Laboratory

Relevant Publications