[mary-users] MaryTTS Viseme data

Joan Pere Sanchez kaiserjp at gmail.com
Sun Apr 16 01:01:54 CEST 2017


Hi Dave,

You can take a look at this example to see how to extract from MaryTTS the
duration of each phoneme together with its SAMPA transcription:

https://github.com/marytts/marytts-txt2wav

MaryTTS accepts several input types (plain text, BML, SSML, and many
others) and offers several output types as well. You can run the demo
build in its client-server configuration and browse the available options
through the interface (there are a lot).
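
To make the duration extraction concrete, here is a minimal sketch (not taken from the txt2wav example). The MaryTTS calls in the comment assume the 5.x MaryInterface API, and the "end-time number phone" line format is an assumption about what the REALISED_DURATIONS output looks like; the parser itself is self-contained:

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Parses REALISED_DURATIONS-style output into (phone, start, end) segments.
 * The duration text would come from a call such as (assuming the 5.x API):
 *
 *   MaryInterface mary = new LocalMaryInterface();
 *   mary.setOutputType("REALISED_DURATIONS");
 *   String durations = mary.generateText("testing");
 */
public class PhoneTimeline {

    public static class Segment {
        public final String phone;
        public final double start, end; // seconds

        Segment(String phone, double start, double end) {
            this.phone = phone;
            this.start = start;
            this.end = end;
        }
    }

    /**
     * Each non-comment line is assumed to read "endTime number phone";
     * the start of a phone is the end time of the previous one.
     */
    public static List<Segment> parse(String durations) {
        List<Segment> segments = new ArrayList<>();
        double previousEnd = 0.0;
        for (String line : durations.split("\n")) {
            line = line.trim();
            if (line.isEmpty() || line.startsWith("#")) continue;
            String[] fields = line.split("\\s+");
            double end = Double.parseDouble(fields[0]);
            segments.add(new Segment(fields[2], previousEnd, end));
            previousEnd = end;
        }
        return segments;
    }

    public static void main(String[] args) {
        // Hypothetical duration output, for illustration only.
        String sample = "#\n0.10 125 _\n0.17 125 t\n0.29 125 E\n0.38 125 s\n";
        for (Segment s : parse(sample)) {
            System.out.println(s.phone + " " + s.start + " -> " + s.end);
        }
    }
}
```

Since the start of each phone is just the end time of the previous one, the resulting list can drive an animation timeline directly.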

Best,


2017-04-15 22:45 GMT+02:00 idoor <idoorlab88 at gmail.com>:

> Hi Joan,
>
> Thanks for your response. Do you have any pointers or references I can
> read and study? Does MaryTTS provide any audio data for the analysis of
> phonemes and visemes? MaryTTS can generate a .wav file; is it possible to
> find a library or tool to analyze the wave file and get phoneme info? I
> found this javadoc:
> http://elckerlyc.sourceforge.net/javadoc/Hmi/hmi/tts/mary/MaryTTSGenerator.html
> but I could not find the source code for it. Do you happen to have the
> library jar file or source code for this?
>
> Thanks again for sharing some thoughts with me.
>
>
>
>
> On Sat, Apr 15, 2017 at 2:05 PM, Joan Pere Sanchez <kaiserjp at gmail.com>
> wrote:
>
>> Hi Dave,
>>
>> This task is the main goal of my PhD thesis. I'm doing lip-sync from the
>> input text, driven by the duration estimates produced while the speech is
>> generated. You can develop your own strategy for lip/mouth synchronization,
>> but this is often an avatar- (or interface-) dependent task; I'm using a
>> talking head too. So, if you are using an avatar, it depends on whether
>> you can use blend shapes to interpolate from the initial pose to the next
>> one. Most MPEG-4 systems are able to do that automatically.
>> On one hand, you have each phoneme with its start and end time. On the
>> other hand, you can define a set of visemes for the basic mouth expressions
>> (no more than 15 are needed) and then choose the sequence corresponding to
>> each word you are generating. It's the simplest and most efficient way to
>> get effective lip synchronization.
>> Don't hesitate to contact me if you want more info or refs about.
>>
>> Best regards,
>>
>>
>> 2017-04-15 18:27 GMT+02:00 idoor Du <idoorlab88 at gmail.com>:
>>
>>> Hi all,
>>>
>>> I am new to MaryTTS, tried to call its API via:
>>>
>>> AudioInputStream audio = mary.generateAudio("testing");
>>>
>>> Now I want to animate mouth/lip shapes at runtime based on the audio
>>> sound, how to achieve that? are there any viseme data associated with
>>> the audio?
>>>
>>> Thanks in advance.
>>>
>>> Dave
>>>
>>> _______________________________________________
>>> Mary-users mailing list
>>> Mary-users at dfki.de
>>> http://www.dfki.de/mailman/cgi-bin/listinfo/mary-users
>>>
>>>
>>
>>
>> --
>> *Joan Pere Sànchez Pellicer*
>> kaiserjp at gmail.com
>> www.chamaleon.net
>> +34 625 012 741 <+34%20625%2001%2027%2041>
>>
>
>
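
As a rough illustration of the phoneme-to-viseme approach described in the quoted messages above: map each phone onto one of a small set of viseme classes, then interpolate blend-shape weights between consecutive mouth poses. All phone symbols, viseme names, and the tiny mapping table here are placeholder assumptions, not MaryTTS data; a real system would cover the full SAMPA inventory with its own ~15 visemes:

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Toy phoneme-to-viseme mapping with linear blend-shape interpolation.
 * The mapping table and viseme names are illustrative placeholders.
 */
public class VisemeTrack {

    // A tiny sample mapping; a real system covers the whole phone inventory.
    private static final Map<String, String> PHONE_TO_VISEME = new HashMap<>();
    static {
        PHONE_TO_VISEME.put("p", "BILABIAL");
        PHONE_TO_VISEME.put("b", "BILABIAL");
        PHONE_TO_VISEME.put("m", "BILABIAL");
        PHONE_TO_VISEME.put("A", "OPEN");
        PHONE_TO_VISEME.put("i", "SPREAD");
        PHONE_TO_VISEME.put("u", "ROUNDED");
        PHONE_TO_VISEME.put("_", "REST");
    }

    public static String visemeFor(String phone) {
        return PHONE_TO_VISEME.getOrDefault(phone, "REST");
    }

    /**
     * Blend-shape style interpolation: at time t between two viseme
     * keyframes, returns the weight of the target pose in [0, 1].
     */
    public static double blendWeight(double t, double keyStart, double keyEnd) {
        if (t <= keyStart) return 0.0;
        if (t >= keyEnd) return 1.0;
        return (t - keyStart) / (keyEnd - keyStart);
    }

    public static void main(String[] args) {
        System.out.println(visemeFor("b"));
        // Fraction of the way from the current pose to the target pose:
        System.out.println(blendWeight(0.15, 0.1, 0.3));
    }
}
```

Feeding the phone timings from the synthesis step into keyframes like these is what lets MPEG-4-style avatars blend between mouth shapes automatically.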


-- 
*Joan Pere Sànchez Pellicer*
kaiserjp at gmail.com
www.chamaleon.net
+34 625 012 741

