Speech animation using electromagnetic articulography as motion capture data

Ingmar Steiner; Korin Richmond; Slim Ouni

In: Slim Ouni; Frédéric Berthomier; Alexandra Jesse (Eds.). 12th International Conference on Auditory-Visual Speech Processing. International Conference on Auditory-Visual Speech Processing (AVSP-13), located at Interspeech, August 29 - September 1, Annecy, France, Pages 55-60, Inria, 8/2013.


Electromagnetic articulography (EMA) captures the position and orientation of a number of markers, attached to the articulators, during speech. As such, it performs the same function for speech that conventional motion capture does for full-body movements acquired with optical modalities, a long-standing staple technique of the animation industry. In this paper, EMA data is processed from a motion-capture perspective and applied to the visualization of an existing multimodal corpus of articulatory data, creating a kinematic 3D model of the tongue and teeth by adapting a conventional motion-capture-based animation paradigm. This is accomplished using off-the-shelf, open-source software. Such an animated model can then be easily integrated into multimedia applications as a digital asset, allowing the analysis of speech production in an intuitive and accessible manner. The processing of the EMA data, its co-registration with 3D data from vocal tract magnetic resonance imaging (MRI) and dental scans, and the modeling workflow are presented in detail, and several issues are discussed.
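
As a rough illustration of treating an EMA marker like a motion-capture target, the following minimal sketch converts one marker reading into a position plus a direction vector, which could then drive one element of a kinematic model. It is not taken from the paper: the `EmaSample` layout, the two-angle orientation convention (azimuth and elevation in degrees), the millimetre units, and the `bone_tail` helper are all assumptions made for this example.

```python
import math
from dataclasses import dataclass


@dataclass
class EmaSample:
    """One hypothetical EMA marker reading (layout assumed for illustration)."""
    x: float              # position, assumed to be in mm
    y: float
    z: float
    azimuth_deg: float    # assumed: rotation about the z axis
    elevation_deg: float  # assumed: angle above the x-y plane


def orientation_vector(sample):
    """Convert the two orientation angles into a unit direction vector."""
    az = math.radians(sample.azimuth_deg)
    el = math.radians(sample.elevation_deg)
    return (
        math.cos(el) * math.cos(az),
        math.cos(el) * math.sin(az),
        math.sin(el),
    )


def bone_tail(sample, length_mm=5.0):
    """Return a point `length_mm` away from the marker along its orientation.

    In a rigged model, the marker position and this point could serve as the
    head and tail of a bone (hypothetical usage, not the paper's workflow).
    """
    dx, dy, dz = orientation_vector(sample)
    return (
        sample.x + length_mm * dx,
        sample.y + length_mm * dy,
        sample.z + length_mm * dz,
    )
```

Applied frame by frame to each marker, such a conversion would yield the per-frame head and tail points that a conventional skeletal animation setup expects; the actual data format and rigging used by the authors are described in the paper itself.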

Further links

paper.pdf (pdf, 1 MB); poster.pdf (pdf, 8 MB)
