[mary-users] Fwd: Issues on MaryTTS with SSML Prosody Tag
Ingmar Steiner
ingmar.steiner at dfki.de
Tue Nov 8 11:03:17 CET 2016
Hi Gavin,
On 08.11.16 01:26, 姚晋 wrote:
> Hello Dr. Steiner,
>
> Thanks for your detailed answer.
>
> For the link to the prosody documentation, I checked at different times
> (last Friday and today) and with different people (myself and my
> friends), and the connection still fails. Is it possible
> that you can access it because you are on the same intranet as the
> server, while it might be different for external visitors?
I've tried from different external networks, with different browsers,
and the only problem I noticed was a single timeout, which went away
after refreshing. I suspect that either your browser is too impatient,
or that it doesn't trust the certificate, but I can't reproduce the issue.
If the problem persists, please contact admin at opendfki.de.
>
> I tried to check the documentation for the mapping from MaryXML
> prosody value ranges to measurement units, for example:
>
> - What is the default value for volume in dB? What is the default value
> for speaking rate in syllables/s? And so on.
Because the volume is not processed in any way, it would be whatever was
in the original recordings.
> - The range for manipulating volume in the Mary GUI is [0.0, 10.0];
> does one step on that scale correspond to +6 dB?
If you are referring to the "Audio Effects" in the browser (or java
client) GUI, that has nothing to do with the synthesis, and is simply a
DSP filter applied to the output audio. As you can see in the source
code [^1], the volume there is simply a linear scaling of the sample values.
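
To make the dB question concrete, here is a minimal sketch of what a linear scaling of sample values implies. It assumes the effect is a plain multiplication of each sample by the GUI amount and ignores any clipping; the function names `gain_to_db` and `scale_samples` are made up for illustration and are not part of MaryTTS.

```python
import math


def gain_to_db(amount: float) -> float:
    """Convert a linear amplitude scale factor to a level change in dB."""
    if amount <= 0.0:
        # A factor of zero silences the signal; there is no finite dB value.
        return float("-inf")
    return 20.0 * math.log10(amount)


def scale_samples(samples, amount):
    """Multiply every sample by the same factor (assumed behaviour, no clipping)."""
    return [s * amount for s in samples]


if __name__ == "__main__":
    # Walk through a few values from the GUI range [0.0, 10.0].
    for amount in (0.5, 1.0, 2.0, 4.0, 10.0):
        print(f"amount={amount:>4}: {gain_to_db(amount):+.2f} dB")
```

Under that assumption, an amount of 1.0 leaves the audio unchanged, each doubling of the amount adds roughly 6 dB, and the upper end of the range (10.0) corresponds to +20 dB, so one step on the scale is not a fixed +6 dB.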
>
> As I still have the connection problem with that website, could you
> share a PDF of the documentation with me by email, if you have one?
I've merged that page into the GitHub wiki for convenience:
https://github.com/marytts/marytts/wiki/ProsodySpecificationSupport
Best wishes,
-Ingmar
[^1]:
https://github.com/marytts/marytts/blob/d7b57e81d0e8e9abbd4f8be4ca66766ff700687c/marytts-runtime/src/main/java/marytts/signalproc/effects/VolumeEffect.java#L54-L64
>
> Best Regards,
> Gavin
>
>
>
>
> On 4 November 2016 at 14:31, Ingmar Steiner <ingmar.steiner at dfki.de> wrote:
>
> Hi Gavin,
>
> thanks for your detailed message!
>
> On 04.11.16 12:57, 姚晋 wrote:
>
> Hello all,
>
> I am using MaryTTS in an English prosody study, but have run into some
> problems, listed below. Please help check whether the problem is in
> my SSML or in Mary's handling of SSML. Thanks in advance:
>
> 1. The link to the user documentation doesn't work (404 Not Found)
> (http://mary.opendfki.de/trac/wiki/ProsodySpecificationSupport),
> as shown in attachment 1.
>
>
> The URL works fine for me.
>
>
> 2. I use the MaryTTS GUI directly for synthesizing speech with SSML,
> and find that the "volume" attribute works in the Bing Speech API but
> does not seem to work in Mary. It does work if I change the volume
> with the "Audio Effects" GUI directly. Please refer to attachment 2.1
> for the SSML, and 2.2 for the analysis with Praat.
>
>
> The most important thing to be aware of is that SSML is a
> recommendation, not a standard, and it's up to each "vendor" whether
> and how the various SSML features are implemented (I had an
> insightful discussion with Paul Bagshaw about this a few weeks ago).
>
> In MaryTTS, SSML is parsed and transformed into MaryXML using XSLT
> [^1]. The resulting prosody attributes are then available to the
> MaryTTS modules for processing.
>
> As it happens, any volume attribute is completely ignored by the
> class that handles the prosody element [^2]. In other words,
> fine-grained volume control is not currently possible with MaryXML.
>
>
> 3. The "pitch" attribute works for the HMM-based voice, but does not
> work accurately for the unit-selection voice. Please refer to
> attachment 3.1 for the SSML, and 3.2 for the analysis with Praat.
>
>
> Unit-selection synthesis, by its very nature, does not allow
> fine-grained prosody control. It concatenates the selected units
> from a voice database without modification and offers high
> naturalness by sacrificing flexibility. There have been experiments
> with signal manipulation after concatenation, but this tends to
> introduce unacceptable artifacts.
>
> If you need fine-grained control over prosody, you are better off
> using statistical parametric synthesis or diphone synthesis (e.g.,
> MBROLA). Note that the last version of MaryTTS that supported MBROLA
> was v4.3.1 -- we plan to support MBROLA again in the near future,
> but that's still work in progress. In the meantime, v4.3.1 should
> work just fine for you under Windows.
>
> If you have further questions, please feel free to engage the
> developers on our issue tracker [^3] and post technical questions or
> bug reports as appropriate.
>
> Best wishes,
>
> -Ingmar
>
>
> Best Regards,
> Gavin
>
>
>
> _______________________________________________
> Mary-users mailing list
> Mary-users at dfki.de
> http://www.dfki.de/mailman/cgi-bin/listinfo/mary-users
>
>
> [^1]:
> https://github.com/marytts/marytts/blob/v5.2/marytts-runtime/src/main/resources/marytts/modules/ssml-to-mary.xsl
>
> [^2]:
> https://github.com/marytts/marytts/blob/v5.2/marytts-runtime/src/main/java/marytts/modules/acoustic/ProsodyElementHandler.java
>
> [^3]: https://github.com/marytts/marytts/issues
>
> --
> /**
> * Dr. Ingmar Steiner
> *
> * Head of Independent Research Group
> * Multimodal Speech Processing
> * Cluster of Excellence MMCI
> *
> * Senior Researcher
> * Multilingual Technologies Lab
> * German Research Center for
> * Artificial Intelligence (DFKI GmbH)
> *
> * Principal Investigator
> * Collaborative Research Center SFB-1102
> * Information Density and Linguistic Encoding
> *
> * Department of Computer Science
> * Department of Computational Linguistics & Phonetics
> * Saarland University
> *
> * Campus C7.4, Room 2.01
> * D-66123 Saarbrücken
> * @tel: +49-681-302-70028
> * @fax: +49-681-302-4317
> * @web: http://coli.uni-saarland.de/~steiner/
> */
>
>