Investigating HMMs as a parametric model for expressive speech synthesis in German

Sacha Krstulovic; Anna Hunecke; Marc Schröder

In: Proceedings of the XVIth International Congress of Phonetic Sciences (ICPhS), Saarbrücken, Germany, 8/2007.


The paper investigates the potential of HMM-based synthesis to support the parameterisation of expressive speech in German. First, we review the assets of HMMs in light of previous work in speech modelling and speech transformation. It is shown that HMMs define a flexible parametric model of speech acoustics, which readily integrates several levels of speech modelling, such as distinct predictors for prosody and voice quality. HMM-based synthesis has also supported cross-speaker and cross-speaking-style transformations with a good level of perceptual quality, albeit in languages other than German and over a limited range of styles. To test these considerations in our research framework, we have therefore performed a preliminary application of HMM technology to the synthesis of excited football announcements in German. It is shown that a highly intelligible voice can be obtained, but that the rendering of the prosodic and voice quality correlates of excitement could benefit from improvement in well-identified areas.

krstulovic_etal2007a.pdf (pdf, 106 KB)
