Expressive Speech Synthesis: Past, Present, and Possible Futures

Marc Schröder

In: Jianhua Tao, Tieniu Tan. Affective Information Processing. Pages 111-126, Springer London, 2009.


Approaches to adding expressivity to synthetic speech have changed considerably over the last 20 years. Early systems, including formant and diphone systems, were built around "explicit control" models; early unit selection systems adopted a "playback" approach. Currently, various approaches are being pursued to increase flexibility of expression while maintaining the quality of state-of-the-art systems, among them a new "implicit control" paradigm in statistical parametric speech synthesis, which provides control over expressivity by combining and interpolating between statistical models trained on different expressive databases. This chapter provides an overview of past and present approaches and ventures a look into possible future developments.


Deutsches Forschungszentrum für Künstliche Intelligenz
German Research Center for Artificial Intelligence