Control of Voice Quality for Emotional Speech Synthesis

Speech production in general, and emotional speech in particular, is characterized by a wide variety of phonation modalities. Voice quality, the term commonly used in the field, plays an important role in the communication of emotions through speech, and nonmodal phonation modalities (for example soft, breathy, whispery, or creaky) are commonly found in emotional speech corpora. In this paper, we describe a voice synthesis framework that allows control of a set of acoustic parameters relevant to the simulation of nonmodal voice qualities. The synthesizer's controls include standard controls for the duration and pitch of phonemes, plus additional controls for intensity, spectral emphasis, fast and slow variations of the duration and amplitude of the waveform periods (for voiced frames), frequency-axis warping to change formant positions, and aspiration noise level. We give some guidelines for combining these signal transformations to reproduce several nonmodal voice qualities, including soft, loud, breathy, whispery, hoarse, and tremulous voice. We also discuss how these voice qualities characterize emotional speech. The system described here is based on the FESTIVAL speech synthesis framework and on the MBROLA diphone concatenation acoustic back-end. Finally, we address the possibility of including affective tags in the input text to be converted.

Publication Type: 
Contribution in volume
Author or Creator: 
Drioli C.
Tesser F.
Tisato G.
Cosi P.
Marchetto E.
Publisher: 
EDK Editore, Torriana, ITA
Source: 
AISV 2004 - Misura di Parametri. Aspetti tecnologici ed implicazioni nei modelli linguistici, edited by Piero Cosi, pp. 789–798. Torriana: EDK Editore, 2005
Date: 
2005
Resource Identifier: 
http://www.cnr.it/prodotto/i/139741
http://scholar.google.it/citations?view_op=view_citation&hl=it&user=LC0j4ekAAAAJ&pagesize=100&citation_for_view=LC0j4ekAAAAJ:M3NEmzRMIkIC
urn:isbn:88-88974-69-5
Language: 
Eng