DFKI-LT - Voice Quality Interpolation for Emotional Text-To-Speech Synthesis

Oytun Türk, Marc Schröder, Baris Bozkurt, Levent M. Arslan
Voice Quality Interpolation for Emotional Text-To-Speech Synthesis
Proc. Interspeech 2005, Pages 797-800, Lisbon, Portugal, 2005
 
Synthesizing desired emotions with concatenative algorithms relies on the collection of large databases. This paper focuses on the development and assessment of a simple algorithm that interpolates the intended vocal effort in existing databases in order to create new databases with intermediate levels of vocal effort. Three German diphone databases with soft, modal, and loud voice qualities are processed with a spectral interpolation algorithm. A listening test is performed to evaluate the intended vocal effort in the original databases as well as in the interpolated ones. The results show that, given the original databases, the interpolation algorithm can create the intended intermediate levels of vocal effort, independent of the language background of the subjects.
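
The abstract does not spell out how the spectral interpolation is computed, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes that each voice quality is represented by a magnitude spectral envelope sampled at the same frequency bins and that an intermediate level is obtained by linear interpolation in the log-magnitude domain. The function name `interpolate_envelopes`, the weight `alpha`, and the toy envelope values are all hypothetical.

```python
import math


def interpolate_envelopes(env_a, env_b, alpha):
    """Interpolate two magnitude envelopes linearly in the log domain.

    alpha = 0.0 reproduces env_a, alpha = 1.0 reproduces env_b.
    Assumption: both envelopes are strictly positive and share the same bins.
    """
    if len(env_a) != len(env_b):
        raise ValueError("envelopes must have the same number of bins")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    result = []
    for a, b in zip(env_a, env_b):
        # Weighted mean of log magnitudes, i.e. a weighted geometric mean.
        log_mix = (1.0 - alpha) * math.log(a) + alpha * math.log(b)
        result.append(math.exp(log_mix))
    return result


# Toy envelopes (made-up values, not taken from the paper).
soft = [1.0, 0.5, 0.25, 0.125]
modal = [1.0, 0.8, 0.6, 0.4]

# An envelope halfway between the soft and modal examples.
halfway = interpolate_envelopes(soft, modal, 0.5)
```

Under these assumptions, sweeping `alpha` from 0 to 1 yields a continuum of envelopes between the two recorded voice qualities. Interpolating in the log domain is a design choice of this sketch. The paper may operate on a different spectral representation.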
 