DFKI-LT - Synthesis of listener vocalisations with imposed intonation contours
Seventh ISCA Tutorial and Research Workshop on Speech Synthesis, Kyoto, Japan, ISCA, 2010
Synthesis of listener vocalisations is one of the research areas focused on improving emotionally coloured conversational speech synthesis. To communicate different intentions, a synthesiser should be capable of generating a broad range of vocalisations with different kinds of acoustic properties. However, the data collected for corpus-based methods is necessarily limited in acoustic variability. This paper describes our approach to increasing the acoustic variability of vocalisations in terms of intonation. After selecting the best candidate for a given target from among the available vocalisations, we use prosody modification techniques to impose a target intonation contour. In an experiment, we combine markedly distinct intonation contours with vocalisations differing in segmental form, using the prosody modification techniques MLSA vocoding, FD-PSOLA, and HNM. In a listening test, we evaluate the perceived naturalness of the resulting synthesised vocalisations, and assess the effect of segmental form, intonation contour, and modification technique on perceived meaning.
Files: BibTeX, vocalisation_f0mod.pdf