[mary-users] SSML or SABLE break element not working correctly

Saurav Chakraborty sauravchk at gmail.com
Fri Feb 17 21:05:53 CET 2012


Hi Ingmar,
Thanks for your response.
I will open a bug on the same.
Regards
Saurav

On Fri, Feb 17, 2012 at 3:47 AM, Ingmar Steiner <ingmar.steiner at inria.fr> wrote:

> Dear Saurav,
>
> I think the problem isn't actually with the SSML not passing through the
> boundary duration attribute, but downstream, with the boundary not being
> realized as requested.
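
One way to check where the requested duration gets lost is to ask the server for an intermediate XML output (the second URL quoted further down uses OUTPUT_TYPE=PHONEMES) and look at the boundary elements. The following is only a minimal sketch: the element name `boundary` and attribute name `duration` are assumptions based on the wording above, the sample document is hand-written rather than real server output, and `boundary_durations` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Hand-written stand-in for an intermediate MaryXML document.
# This is NOT real server output; names and values are assumptions.
SAMPLE = """<maryxml xmlns="http://mary.dfki.de/2002/MaryXML" xml:lang="en-US">
  <p><s>
    <t>Welcome</t>
    <boundary breakindex="4" duration="2500"/>
    <t>to</t>
  </s></p>
</maryxml>"""


def boundary_durations(maryxml_text):
    """Return the duration attribute of every boundary element, in document order."""
    root = ET.fromstring(maryxml_text)
    durations = []
    for element in root.iter():
        # Compare local names only, so the sketch does not depend on the namespace URI.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "boundary":
            durations.append(element.get("duration"))  # None if the attribute is absent
    return durations


print(boundary_durations(SAMPLE))
```

If the requested value shows up at this stage but the pause in the audio is still short, that would be consistent with the boundary being dropped further downstream.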
>
> Unfortunately, I don't have time to debug this further at the moment...
>
> Best wishes,
>
> -Ingmar
>
>
> On 17.02.2012 03:35, Saurav Chakraborty wrote:
>
>> Hi Ingmar,
>> Thanks a lot for your response.
>> I will open a ticket for the SABLE issue as you suggested.
>> As for the SSML issue, I am seeing it with the voice cmu-slt-hsmm
>> en_US female hmm (voice code cmu-slt-hsmm).
>>
>> Please find my request URL below:
>>
>> http://localhost:59125/process?INPUT_TYPE=SSML&OUTPUT_TYPE=AUDIO&INPUT_TEXT=%3C%3Fxml%20version%3D%221.0%22%20encoding%3D%22UTF-8%22%20%3F%3E%0A%3Cspeak%20version%3D%221.0%22%20xmlns%3D%22http%3A%2F%2Fwww.w3.org%2F2001%2F10%2Fsynthesis%22%0A%20%20xmlns%3Axsi%3D%22http%3A%2F%2Fwww.w3.org%2F2001%2FXMLSchema-instance%22%0A%20%20xsi%3AschemaLocation%3D%22http%3A%2F%2Fwww.w3.org%2F2001%2F10%2Fsynthesis%20http%3A%2F%2Fwww.w3.org%2FTR%2Fspeech-synthesis%2Fsynthesis.xsd%22%0A%20%20xml%3Alang%3D%22en-US%22%3E%0AWelcome%3Cbreak%20time%3D%222500ms%22%2F%3Eto%20the%20world%20of%20speech%20synthesis%21%0A%3C%2Fspeak%3E%0A&OUTPUT_TEXT=&VOICE_SELECTIONS=cmu-slt-hsmm%20en_US%20female%20hmm&AUDIO_OUT=WAVE_FILE&LOCALE=en_US&VOICE=cmu-slt-hsmm&AUDIO=WAVE_FILE
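
Since the percent-encoded request above is hard to read, here is a rough sketch of how such a URL could be assembled from plain SSML in Python. The parameter names and values are taken from the request above; the SSML is simplified (the xsi schema attributes are left out), and `build_process_url` is a hypothetical helper, not part of MARY.

```python
from urllib.parse import urlencode

# Simplified version of the SSML document from the request above
# (the xsi:schemaLocation attributes are omitted for brevity).
SSML = (
    '<?xml version="1.0" encoding="UTF-8" ?>\n'
    '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis"\n'
    '  xml:lang="en-US">\n'
    'Welcome<break time="2500ms"/>to the world of speech synthesis!\n'
    '</speak>\n'
)


def build_process_url(ssml, base="http://localhost:59125/process",
                      voice="cmu-slt-hsmm"):
    """Build a GET URL for the /process endpoint (hypothetical helper)."""
    params = {
        "INPUT_TYPE": "SSML",
        "OUTPUT_TYPE": "AUDIO",
        "INPUT_TEXT": ssml,
        "LOCALE": "en_US",
        "VOICE": voice,
        "AUDIO": "WAVE_FILE",
    }
    # urlencode percent-encodes the markup and encodes spaces as '+'.
    return base + "?" + urlencode(params)


url = build_process_url(SSML)
print(url)
```

Building the URL this way keeps the break element readable in the source, so a malformed attribute is easier to spot before the request is sent.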
>>
>>
>> I appreciate your help on this.
>> Thanks and Regards
>> Saurav
>>
>>
>> On Thu, Feb 16, 2012 at 4:03 AM, Ingmar Steiner <ingmar.steiner at inria.fr> wrote:
>>
>>    Dear Saurav,
>>
>>    the SSML specification seems to be working for me.
>>
>>    http://mary.dfki.de:59125/process?INPUT_TEXT=%3C%3Fxml+version%3D%221.0%22+encoding%3D%22UTF-8%22+%3F%3E+%3Cspeak+version%3D%221.0%22+xmlns%3D%22http%3A%2F%2Fwww.w3.org%2F2001%2F10%2Fsynthesis%22+++xmlns%3Axsi%3D%22http%3A%2F%2Fwww.w3.org%2F2001%2FXMLSchema-instance%22+++xsi%3AschemaLocation%3D%22http%3A%2F%2Fwww.w3.org%2F2001%2F10%2Fsynthesis+http%3A%2F%2Fwww.w3.org%2FTR%2Fspeech-synthesis%2Fsynthesis.xsd%22+++xml%3Alang%3D%22en-US%22%3E+Welcome%3Cbreak+time%3D%225000ms%22%2F%3Eto+the+world+of+speech+synthesis%21+%3C%2Fspeak%3E&INPUT_TYPE=SSML&OUTPUT_TYPE=AUDIO&LOCALE=en_US&AUDIO=WAVE_FILE
>>    <http://mary.dfki.de:59125/process?INPUT_TEXT=%3C%3Fxml+version%3D%221.0%22+encoding%3D%22UTF-8%22+%3F%3E+%3Cspeak+version%3D%221.0%22+xmlns%3D%22http%3A%2F%2Fwww.w3.org%2F2001%2F10%2Fsynthesis%22+++xmlns%3Axsi%3D%22http%3A%2F%2Fwww.w3.org%2F2001%2FXMLSchema-instance%22+++xsi%3AschemaLocation%3D%22http%3A%2F%2Fwww.w3.org%2F2001%2F10%2Fsynthesis+http%3A%2F%2Fwww.w3.org%2FTR%2Fspeech-synthesis%2Fsynthesis.xsd%22+++xml%3Alang%3D%22en-US%22%3E+Welcome%3Cbreak+time%3D%22500ms%22%2F%3Eto+the+world+of+speech+synthesis%21+%3C%2Fspeak%3E&INPUT_TYPE=SSML&OUTPUT_TYPE=PHONEMES&LOCALE=en_US&AUDIO=WAVE_FILE>
>>
>>
>>    Are you using INPUT_TYPE=SSML? Which voice?
>>
>>    As for SABLE, I can confirm that the MSEC attribute is dropped, not
>>    sure why at this point.
>>    It'd be great if you could file a bug report on this. We're
>>    currently in the process of shifting development over to github, and
>>    we're still optimizing the workflow, but given that you would need
>>    an opendfki account to open a traditional Trac ticket, maybe you
>>    could go ahead and open an issue at
>>    https://github.com/marc1s/marytts/issues/new
>>
>>    Best wishes,
>>
>>    -Ingmar
>>
>>
>>    On 16.02.2012 04:15, Saurav Chakraborty wrote:
>>
>>        Hi,
>>        I need to control the pause between sentences and paragraphs
>>        by using the break element to set the duration of the pause.
>>        However, MARY seems to be ignoring the duration specified in <break
>>        time="500ms"/> (SSML) or <break MSEC="500"/> (SABLE).
>>        Has anyone faced a problem like this?
>>        Could you kindly confirm whether MARY fully supports SABLE and
>>        SSML input?
>>        Thanks for your help in advance.
>>        Regards
>>        Saurav
>>
>>
>>        _______________________________________________
>>        Mary-users mailing list
>>        Mary-users at dfki.de
>>        http://www.dfki.de/mailman/cgi-bin/listinfo/mary-users
>>
>>
>>    --
>>    Ingmar Steiner
>>    Postdoctoral Researcher
>>
>>    LORIA Speech Group, Nancy, France
>>    National Institute for Research in
>>    Computer Science and Control (INRIA)
>>
>>
>>
> --
> Ingmar Steiner
> Postdoctoral Researcher
>
> LORIA Speech Group, Nancy, France
> National Institute for Research in
> Computer Science and Control (INRIA)
>