Publication

Facial expression as an input annotation modality for affective speech-to-speech translation

Éva Székely; Zeeshan Ahmed; Ingmar Steiner; Julie Carson-Berndsen

In: Ronald Böck; Francesca Bonin; Nick Campbell (Eds.). International Workshop on Multimodal Analyses for Human Machine Interaction (MA3). International Workshop on Multimodal Analyses for Human Machine Interaction (MA3-12), located at International Conference on Intelligent Virtual Agents, September 15, Santa Cruz, CA, USA, Online, 9/2012.

Abstract

One of the challenges of speech-to-speech translation is to accurately preserve the paralinguistic information in the speaker’s message. In this work we explore the use of automatic facial expression analysis as an input annotation modality to transfer paralinguistic information at a symbolic level from input to output in speech-to-speech translation. To evaluate the feasibility of this approach, a prototype system, FEAST (Facial Expression-based Affective Speech Translation), has been developed. FEAST classifies the emotional state of the user and uses it to render the translated output in an appropriate voice style, using expressive speech synthesis.

Further links

http://fastnet.netsoc.ie/ma3_2012/program_files/6_Paper.pdf

6_Paper.pdf (PDF, 381 KB)