[mary-users] client Java application

Ingmar Steiner ingmar.steiner at inria.fr
Mon Feb 27 10:33:11 CET 2012


Dear Nikolai,

first of all, please don't send me any Word files. If you want to send 
me a screenshot, then attach the .jpg directly to the email.

Having said that, perhaps my last message was a bit too technical. To 
clarify: MaryXML is the data format that all of the components in the 
modular synthesis platform Mary use to communicate with each other. 
Therefore, *all* of the standard input types (WORDS, TOKENS, 
ACOUSTPARAMS, etc.) are MaryXML. Because of the modular design, you can 
actually override parts of the processing pipeline and provide your own 
data (in MaryXML), if you know what you're doing and can do it better.
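
To make the idea of entering the pipeline partway concrete, here is a
minimal sketch in Java. It is a simplified model, not Mary's actual module
lookup: the type and module names are copied from the en_US debug log
quoted further down in this thread, while the class and the two helper
methods are hypothetical names introduced only for illustration.

```java
import java.util.Arrays;
import java.util.List;

public class MaryPipelineSketch {
    // Data types in the order they appear in the quoted en_US debug log.
    static final List<String> TYPES = Arrays.asList(
        "TEXT", "RAWMARYXML", "TOKENS", "FREETTS_TOKENS", "FREETTS_WORDS", "WORDS",
        "PARTSOFSPEECH", "PHONEMES", "INTONATION", "ALLOPHONES", "ACOUSTPARAMS", "AUDIO");

    // MODULES.get(i) converts TYPES.get(i) into TYPES.get(i + 1), as in the log.
    static final List<String> MODULES = Arrays.asList(
        "TextToMaryXML", "JTokeniser", "XML2Utt TokensEn", "TokenToWords",
        "Utt2XML WordsEn", "OpenNLPPosTagger", "JPhonemiser", "Prosody",
        "PronunciationModel", "AcousticModeller", "Synthesis");

    /** Hypothetical helper: modules still needed to get from inputType to AUDIO. */
    static List<String> remainingModules(String inputType) {
        int start = TYPES.indexOf(inputType);
        if (start < 0) {
            throw new IllegalArgumentException("Unknown input type: " + inputType);
        }
        return MODULES.subList(start, MODULES.size());
    }

    /** Hypothetical helper: true for the types this thread describes as MaryXML. */
    static boolean isMaryXml(String type) {
        // TEXT is plain text, AUDIO is sound data, FREETTS_* is the FreeTTS detour.
        return TYPES.contains(type)
            && !type.equals("TEXT")
            && !type.equals("AUDIO")
            && !type.startsWith("FREETTS_");
    }

    public static void main(String[] args) {
        System.out.println(remainingModules("WORDS"));
        System.out.println(remainingModules("ACOUSTPARAMS"));
    }
}
```

In this model, supplying WORDS leaves only the modules from
OpenNLPPosTagger onward, and supplying ACOUSTPARAMS leaves only Synthesis.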

If all you're looking for is a way to get Mary to speak using your own 
application that generates SSML, then by all means, you're good to go.

However, if you discover that some specific details of your input are 
not handled as expected and you can't figure out how to encode them in 
SSML, then (depending on the nature of the potential problem) it just 
might be a solution to provide appropriate MaryXML input where those 
details are specified and handled correctly.

But for regular use, SSML should work just as well.
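
As a rough illustration of what sending SSML from your own application
could look like, here is a sketch that only builds the request URL for a
Mary server's HTTP interface, using nothing but the JDK. Please treat the
details as assumptions to check against your installation: the /process
path, the query parameter names, the WAVE_FILE audio type, and port 59125
are not confirmed anywhere in this thread, and the class and method names
are made up for the example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class MarySsmlRequestSketch {
    // URL-encode one query value as UTF-8.
    static String enc(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a Java platform.
            throw new IllegalStateException(e);
        }
    }

    /**
     * Hypothetical helper: builds a synthesis request URL.
     * The path and parameter names are assumptions, not taken from this thread.
     */
    static String buildProcessUrl(String host, int port, String inputType,
                                  String inputText, String locale) {
        return "http://" + host + ":" + port + "/process"
            + "?INPUT_TYPE=" + enc(inputType)
            + "&OUTPUT_TYPE=AUDIO"
            + "&AUDIO=WAVE_FILE"
            + "&LOCALE=" + enc(locale)
            + "&INPUT_TEXT=" + enc(inputText);
    }

    public static void main(String[] args) {
        // A small SSML document with one prosody element.
        String ssml = "<?xml version=\"1.0\"?>"
            + "<speak version=\"1.0\" xmlns=\"http://www.w3.org/2001/10/synthesis\""
            + " xml:lang=\"en-US\">"
            + "Hello <prosody rate=\"slow\">world</prosody>."
            + "</speak>";
        // Assumed default host and port of a locally running server.
        System.out.println(buildProcessUrl("localhost", 59125, "SSML", ssml, "en_US"));
    }
}
```

Switching to MaryXML input later would then only mean passing a different
input type (for example RAWMARYXML) and a MaryXML document instead of the
SSML string.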

Best wishes,

-Ingmar

On 2/27/12 03:17, Nikolai Kouznetsov wrote:
>
> Dear Ingmar,
>
> Attached is a Microsoft Word file with an image of the MARY TTS client
> application. Here is my question: which option should I use so that I
> can type in a MaryXML script?
>
> My best regards,
> Nikolai
>
> ----- Original Message ----- From: "Ingmar Steiner"
> <ingmar.steiner at inria.fr>
> To: "Nikolai Kouznetsov" <kzntsv at rogers.com>
> Cc: <mary-users at dfki.de>
> Sent: Saturday, February 25, 2012 7:39 AM
> Subject: Re: [mary-users] client Java application
>
>
>> Dear Nikolai,
>>
>> if you're asking about the INPUT_TYPE parameter (whether in the Java
>> client, browser interface, or whatever else), all of the synthesis
>> modules use MaryXML to describe the data being processed, adding
>> information over the synthesis pipeline, viz.
>>
>>> 2012-02-25 13:32:03,800 [I/O dispatcher 1] DEBUG
>>> marytts.ModuleRegistry Module TextToMaryXML converts TEXT into
>>> RAWMARYXML (locale en_US, voice cmu-slt-hsmm)
>>> 2012-02-25 13:32:03,826 [I/O dispatcher 1] DEBUG
>>> marytts.ModuleRegistry Module JTokeniser converts RAWMARYXML into
>>> TOKENS (locale en_US, voice cmu-slt-hsmm)
>>> 2012-02-25 13:32:03,826 [I/O dispatcher 1] DEBUG
>>> marytts.ModuleRegistry Module XML2Utt TokensEn converts TOKENS into
>>> FREETTS_TOKENS (locale en_US, voice cmu-slt-hsmm)
>>> 2012-02-25 13:32:03,826 [I/O dispatcher 1] DEBUG
>>> marytts.ModuleRegistry Module TokenToWords converts FREETTS_TOKENS
>>> into FREETTS_WORDS (locale en_US, voice cmu-slt-hsmm)
>>> 2012-02-25 13:32:03,826 [I/O dispatcher 1] DEBUG
>>> marytts.ModuleRegistry Module Utt2XML WordsEn converts FREETTS_WORDS
>>> into WORDS (locale en_US, voice cmu-slt-hsmm)
>>> 2012-02-25 13:32:03,826 [I/O dispatcher 1] DEBUG
>>> marytts.ModuleRegistry Module OpenNLPPosTagger converts WORDS into
>>> PARTSOFSPEECH (locale en_US, voice cmu-slt-hsmm)
>>> 2012-02-25 13:32:03,827 [I/O dispatcher 1] DEBUG
>>> marytts.ModuleRegistry Module JPhonemiser converts PARTSOFSPEECH into
>>> PHONEMES (locale en_US, voice cmu-slt-hsmm)
>>> 2012-02-25 13:32:03,827 [I/O dispatcher 1] DEBUG
>>> marytts.ModuleRegistry Module Prosody converts PHONEMES into
>>> INTONATION (locale en_US, voice cmu-slt-hsmm)
>>> 2012-02-25 13:32:03,827 [I/O dispatcher 1] DEBUG
>>> marytts.ModuleRegistry Module PronunciationModel converts INTONATION
>>> into ALLOPHONES (locale en_US, voice cmu-slt-hsmm)
>>> 2012-02-25 13:32:03,827 [I/O dispatcher 1] DEBUG
>>> marytts.ModuleRegistry Module AcousticModeller converts ALLOPHONES
>>> into ACOUSTPARAMS (locale en_US, voice cmu-slt-hsmm)
>>> 2012-02-25 13:32:03,827 [I/O dispatcher 1] DEBUG
>>> marytts.ModuleRegistry Module Synthesis converts ACOUSTPARAMS into
>>> AUDIO (locale en_US, voice cmu-slt-hsmm)
>>
>> With the exception of TEXT and the detour from TOKENS to WORDS by way
>> of FREETTS_*, all of those input types are MaryXML. If you provide
>> input as RAWMARYXML, the modules will look at the input, and decide
>> whether they need to process it or just pass it through.
>>
>> Best wishes,
>>
>> -Ingmar
>>
>> On 2/24/12 13:50, Nikolai Kouznetsov wrote:
>>> Ingmar,
>>>
>>> in the drop-down menu of your Java client application, which menu item
>>> should I click to indicate that I want to use MaryXML? I have found SSML
>>> and SABLE there, but not MaryXML.
>>>
>>> Please advise.
>>>
>>> Thanks,
>>> Nikolai
>>>
>>>
>>> ----- Original Message ----- From: "Ingmar Steiner"
>>> <ingmar.steiner at inria.fr>
>>> To: "Nikolai Kouznetsov" <kzntsv at rogers.com>
>>> Cc: <mary-users at dfki.de>
>>> Sent: Friday, February 24, 2012 3:39 AM
>>> Subject: Re: [mary-users] client Java application
>>>
>>>
>>>> Dear Nikolai,
>>>>
>>>> I don't think the SABLE code has been touched in a while, but with
>>>> version 4.3, Mary added prosody specification support like in SSML 1.1
>>>> (see http://mary.opendfki.de/wiki/ProsodySpecificationSupport).
>>>> Obviously, using actual MaryXML would offer full control over all
>>>> synthesis parameters, but if you're forced to use either SABLE or
>>>> SSML, I would recommend SSML.
>>>>
>>>> Best wishes,
>>>>
>>>> -Ingmar
>>>>
>>>> On 24.02.2012 04:01, Nikolai Kouznetsov wrote:
>>>>> Hello, Marc,
>>>>>
>>>>> Here is a question: which markup is better for controlling MARY TTS
>>>>> performance, in your opinion: SABLE or SSML? I am just wondering
>>>>> whether all tags from these two markups have been implemented in
>>>>> MARY TTS.
>>>>>
>>>>> Thanks in advance,
>>>>> Nikolai Kouznetsov, PhD
>>>>>
>>>>> _______________________________________________
>>>>> Mary-users mailing list
>>>>> Mary-users at dfki.de
>>>>> http://www.dfki.de/mailman/cgi-bin/listinfo/mary-users
>>>>
>>>> --
>>>> Ingmar Steiner
>>>> Postdoctoral Researcher
>>>>
>>>> LORIA Speech Group, Nancy, France
>>>> National Institute for Research in
>>>> Computer Science and Control (INRIA)
>>>
>>
>> --
>> Ingmar Steiner
>> Postdoctoral Researcher
>>
>> LORIA Speech Group, Nancy, France
>> National Institute for Research in
>> Computer Science and Control (INRIA)
>

-- 
Ingmar Steiner
Postdoctoral Researcher

LORIA Speech Group, Nancy, France
National Institute for Research in
Computer Science and Control (INRIA)

