VoluProf implements an application that integrates a maximally realistic, interactive image of the teacher, together with his or her voice, into an MR environment (e.g. the user's living room). The text provided by the lecturer is synthesized via text-to-speech, with a newly developed machine-learning algorithm reproducing the voice faithfully to the original. The avatar's facial expression, lip movement, and body language are automatically adapted to what is being said. To this end, a photo-realistic avatar of the teacher is generated from high-quality volumetric video data. The core idea of volumetric video is to capture a person or object with multiple cameras in a 360-degree multi-camera system and to create a dynamic 3D model from this material, which can then be superimposed on the real world in an MR application.
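The coupling of synthesized speech to lip movement described above can be sketched as a phoneme-to-viseme mapping: each speech sound drives a corresponding mouth shape on the avatar. The phoneme labels, viseme names, and timing below are illustrative assumptions, not the project's actual animation interface.

```python
# Hypothetical sketch: derive lip-sync keyframes for an avatar from the
# phoneme sequence produced by a text-to-speech engine. The phoneme set
# (ARPAbet-like) and viseme names are placeholders for illustration.

PHONEME_TO_VISEME = {
    "AA": "open",       # as in "father"
    "IY": "smile",      # as in "see"
    "UW": "round",      # as in "too"
    "M": "closed", "B": "closed", "P": "closed",
    "F": "teeth_lip", "V": "teeth_lip",
}

def lip_sync_track(phonemes, frame_duration=0.08):
    """Turn a phoneme sequence into (time, viseme) keyframes."""
    track = []
    t = 0.0
    for ph in phonemes:
        viseme = PHONEME_TO_VISEME.get(ph, "neutral")
        # Collapse consecutive identical visemes to avoid jitter.
        if not track or track[-1][1] != viseme:
            track.append((round(t, 2), viseme))
        t += frame_duration
    return track

if __name__ == "__main__":
    # "ma" -> closed lips, then open mouth
    print(lip_sync_track(["M", "AA", "AA"]))  # [(0.0, 'closed'), (0.08, 'open')]
```

In practice a TTS engine would also supply per-phoneme durations, and the keyframes would be blended rather than switched discretely; the sketch only shows the basic mapping step.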
To achieve a high level of interactivity in application scenarios, DFKI is developing multimodal learning-bot scripts based on the animatable volumetric videos. These make it possible to model discourse between avatars and users and thus to realize various forms of dialogue, e.g. Socratic dialogues, examination-style dialogues, or user inquiries. Established technologies (e.g. knowledge representation via ontologies, AIML scripts) can be used for this purpose, extended with modeling elements for controlling the avatar's facial expressions and gestures.
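The idea of extending pattern-based dialogue scripts with animation control can be sketched as follows. The patterns, responses, and cue names (`gesture`, `expression`) are hypothetical stand-ins for the modeling elements described above, not an actual DFKI or AIML format.

```python
# Illustrative sketch of an AIML-style pattern/response script in which
# each response carries animation cues alongside the spoken text.
import re

SCRIPT = [
    # (pattern, spoken response, animation cues)
    (r"WHAT IS VOLUMETRIC VIDEO",
     "Volumetric video captures a person with a 360-degree multi-camera rig.",
     {"gesture": "open_palms", "expression": "neutral"}),
    (r"CAN YOU TEST ME.*",
     "Certainly. First question: how is the dynamic 3D model created?",
     {"gesture": "nod", "expression": "attentive"}),
]

def respond(utterance):
    """Match a user utterance and return (answer, animation cues)."""
    text = utterance.strip().upper().rstrip("?!.")
    for pattern, answer, cues in SCRIPT:
        if re.fullmatch(pattern, text):
            return answer, cues
    # Fallback clarification request, with matching body language.
    return "Could you rephrase that?", {"gesture": "shrug",
                                        "expression": "puzzled"}

if __name__ == "__main__":
    answer, cues = respond("Can you test me on this topic?")
    print(answer)
    print(cues["gesture"])  # nod
```

A real system would use an AIML interpreter or an ontology-backed dialogue manager instead of flat regex rules; the sketch only shows how animation directives can travel with each scripted response.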