[mary-users] MaryTTS Viseme data
Ingmar Steiner
ingmar.steiner at dfki.de
Mon Apr 24 10:33:33 CEST 2017
Just for the record, the REALISED_DURATIONS output follows the ESPS
Xwaves lab format. This means that each row represents one phonetic
segment: the first field is the *end* time in seconds, the second field
has *no significance*, and the third provides the segment label.
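
As a rough illustration of the format (not code taken from MaryTTS), the
Java sketch below parses such rows and derives each segment's start time
from the end time of the previous row. The names LabParser and Segment are
hypothetical, and it assumes that the first segment starts at 0.0 s and
that any line with fewer than three fields (such as the "#" header) can be
skipped. The sample rows are the ones from the output quoted below.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class LabParser {

    /** One phonetic segment with start and end times in seconds. */
    public static final class Segment {
        public final String label;
        public final double start;
        public final double end;

        Segment(String label, double start, double end) {
            this.label = label;
            this.start = start;
            this.end = end;
        }

        public double duration() {
            return end - start;
        }
    }

    /**
     * Parses rows of the form "endTime ignored label".
     * Assumption: the first segment starts at 0.0 s.
     */
    public static List<Segment> parse(String lab) {
        List<Segment> segments = new ArrayList<>();
        double previousEnd = 0.0;
        for (String line : lab.split("\\R")) {
            String[] fields = line.trim().split("\\s+");
            if (fields.length < 3) {
                continue; // blank line or header such as "#"
            }
            double end;
            try {
                end = Double.parseDouble(fields[0]);
            } catch (NumberFormatException e) {
                continue; // not a data row
            }
            // The second field (fields[1]) carries no information and is ignored.
            segments.add(new Segment(fields[2], previousEnd, end));
            previousEnd = end;
        }
        return segments;
    }

    public static void main(String[] args) {
        String lab = String.join("\n",
                "#",
                "0.075 125 h",
                "0.24000001 125 aU",
                "0.275 125 A",
                "0.345 125 r",
                "0.435 125 j",
                "0.58000004 125 u",
                "0.75500005 125 _");
        for (Segment s : parse(lab)) {
            System.out.printf(Locale.ROOT, "%-3s start=%.3f end=%.3f duration=%.3f%n",
                    s.label, s.start, s.end, s.duration());
        }
    }
}
```

For lip-sync, the start time and duration of each segment computed this
way are what you would feed to the animation, rather than the raw first
field.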
Best wishes,
-Ingmar
On 21.04.17 21:57, Joan Pere Sanchez wrote:
> Hi Idoor,
>
> These values correspond to the allophone features used to represent the
> corresponding phone, in SAMPA notation. The first value is the
> timestamp at which the sound for this phoneme starts. As for the second
> value, here 125, I'm not entirely sure, but I believe it is the f0 or
> central frequency of the sound.
> In fact, if you look closely, the sequence of first values is:
> 0.075 - 0.24 - 0.275 - 0.345 - 0.435 - 0.580 - 0.755 (the last one,
> labelled "_", is a final silence)
> That is, the point where each phone starts. In practice it comes to the
> same thing: you can do fully accurate lip-sync with that.
>
> Good luck!
>
>
> 2017-04-21 19:22 GMT+02:00 idoor <idoorlab88 at gmail.com>:
>
> Hi Joan,
>
> Thanks for your advice. I got the result back for the text "How are
> you." (attached), but I am not sure what some of the values mean.
> For "h", for example, there are two values: 0.075 and 125. Does
> 0.075 mean how long it takes to speak "h", in seconds? Also, 125 is a
> hardcoded value in the source code; what does it mean for "h"?
>
> Thanks for your help!
>
> text: #
> 0.075 125 h
> 0.24000001 125 aU
> 0.275 125 A
> 0.345 125 r
> 0.435 125 j
> 0.58000004 125 u
> 0.75500005 125 _
>
>
>
>
> On Tue, Apr 18, 2017 at 6:13 AM, Joan Pere Sanchez
> <kaiserjp at gmail.com> wrote:
>
> Hello again,
>
> If you want to obtain the phonemes and their durations for lip-sync,
> you must call:
>
> mary.setOutputType("REALISED_DURATIONS");
>
> This gives you each phoneme and its duration. You can also use
> another output type to see the features of the tokens:
>
> mary.setOutputType("TARGETFEATURES");
>
> In both lines, 'mary' is the instance of the 'LocalMaryInterface'
> class that manages your input.
>
> Best,
>
>
>
> 2017-04-16 2:41 GMT+02:00 idoor <idoorlab88 at gmail.com>:
>
> Joan,
>
> Thanks for your response again!
> I looked at marytts-txt2wav before; I tested it and got the
> double array:
> double[] samples = MaryAudioUtils.getSamplesAsDoubleArray(audio);
>
> But having got that far, I do not know what to do next to get the
> phonemes. Is this double[] related to phonemes?
> Best regards,
>
>
> On Sat, Apr 15, 2017 at 7:01 PM, Joan Pere Sanchez
> <kaiserjp at gmail.com> wrote:
>
> Hi Dave,
>
> You can take a look at this example to see how to extract from
> MaryTTS the duration of each phoneme, together with the phonemes
> transcribed in SAMPA notation:
>
> https://github.com/marytts/marytts-txt2wav
>
> MaryTTS accepts several input types (text, BML, SSML, and many
> others), and there are also several output types. You can run the
> demo with the server-client setup and browse the options through
> the interface (there are a lot).
>
> Best,
>
>
> 2017-04-15 22:45 GMT+02:00 idoor <idoorlab88 at gmail.com>:
>
> Hi Joan,
>
> Thanks for your response. Do you have any pointers or references
> I can read and study? Does MaryTTS provide any audio data for
> analyzing phonemes and visemes? MaryTTS can generate a .wav file;
> is it possible to find a library or tool that analyzes the wave
> file and extracts phoneme information? I found this javadoc
> http://elckerlyc.sourceforge.net/javadoc/Hmi/hmi/tts/mary/MaryTTSGenerator.html
> but I could not find the source code for it. Have you happened to
> see the library jar file or source code for it?
>
> Thanks again for sharing some thoughts with me.
>
>
>
>
> On Sat, Apr 15, 2017 at 2:05 PM, Joan Pere Sanchez
> <kaiserjp at gmail.com> wrote:
>
> Hi Dave,
>
> This task is the main goal of my PhD thesis. I'm doing lip-sync
> from the input text, based on the duration estimates produced while
> the speech is generated. You can develop your own strategy for
> lip/mouth synchronization, but this is often a task that depends on
> the avatar (or interface; I'm using a talking head too). So, if you
> are using an avatar, it depends on whether you can use blend shapes
> to interpolate from the initial pose to the next one. Most MPEG-4
> systems are able to do that automatically.
> On the one hand, you have each phoneme and its start and finish
> time. On the other hand, you can define a set of visemes for each
> basic expression (no more than 15 are needed) and then choose the
> sequence corresponding to each word you are generating. It's the
> simplest and most efficient way to get effective lip
> synchronization.
> Don't hesitate to contact me if you want more info or references.
>
> Best regards,
>
>
> 2017-04-15 18:27 GMT+02:00 idoor Du <idoorlab88 at gmail.com>:
>
> Hi all,
>
> I am new to MaryTTS and tried to call its API via:
>
> AudioInputStream audio = mary.generateAudio("testing");
>
> Now I want to animate mouth/lip shapes at runtime based on the
> audio. How can I achieve that? Is there any viseme data
> associated with the audio?
>
> Thanks in advance.
>
> Dave
>
> _______________________________________________
> Mary-users mailing list
> Mary-users at dfki.de <mailto:Mary-users at dfki.de>
> http://www.dfki.de/mailman/cgi-bin/listinfo/mary-users
> <http://www.dfki.de/mailman/cgi-bin/listinfo/mary-users>
>
>
>
>
> --
> *Joan Pere Sànchez Pellicer*
> kaiserjp at gmail.com <mailto:kaiserjp at gmail.com>
> www.chamaleon.net <http://www.chamaleon.net>
> +34 625 012 741 <tel:+34%20625%2001%2027%2041>
>
>
>
>
>