[mary-users] Difference hsmm and unit selection voices
Ingmar Steiner
ingmar.steiner at dfki.de
Wed Dec 28 08:07:59 CET 2016
Dear Rijk,
Hidden semi-Markov models (HSMMs) can be used for statistical parametric
speech synthesis: they predict the parameters of a vocoder, which then
generates a speech waveform "from scratch" based on the text input.
Unit-selection synthesis, on the other hand, retrieves the small snippets
of recorded speech that best match the text input from a database and
concatenates them. If you're interested in the scientific background,
Paul Taylor's textbook is a good entry point. [^1]
From a practical perspective, HMM-based synthesis offers high flexibility
and a low memory footprint at the expense of perceived naturalness.
Unit selection offers high naturalness but limited flexibility, depending
on the database and application domain, and its memory footprint grows
with the database size.
In MaryTTS, resources can be loaded from the classpath (conventional
Java software design) or from the filesystem, based on properties such
as these:
> voice.cmu-slt.cartFile = jar:/marytts/voice/CmuSlt/cart.mry
> voice.cmu-slt.audioTimelineFile = MARY_BASE/lib/voices/cmu-slt/timeline_waveforms.mry
The `jar:` prefix triggers classpath loading, while `MARY_BASE` is
expanded to the filesystem path provided by the corresponding property.
Instead of `MARY_BASE`, the value could also be any other valid
filesystem path.
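To illustrate the distinction, here is a minimal Java sketch of how such a
property value could be resolved. It is not the actual MaryTTS loader: the
class `ResourceLocator`, its methods, and the `baseDir` parameter are
hypothetical names chosen for this example, and only the `jar:` prefix and
the `MARY_BASE` placeholder are taken from the properties above.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;

/** Illustrative sketch only; not the actual MaryTTS resource loader. */
final class ResourceLocator {

    private static final String JAR_PREFIX = "jar:";
    private static final String BASE_PLACEHOLDER = "MARY_BASE";

    /** Hypothetical helper: true if the value refers to a classpath resource. */
    static boolean isClasspathResource(String value) {
        return value.startsWith(JAR_PREFIX);
    }

    /** Hypothetical helper: replace a leading MARY_BASE with a base directory. */
    static String expandBase(String value, String baseDir) {
        if (value.startsWith(BASE_PLACEHOLDER)) {
            return baseDir + value.substring(BASE_PLACEHOLDER.length());
        }
        return value; // any other path is used as given
    }

    /** Open the resource either from the classpath or from the filesystem. */
    static InputStream open(String value, String baseDir) throws IOException {
        if (isClasspathResource(value)) {
            // Strip the prefix; the remainder is an absolute classpath name.
            String resource = value.substring(JAR_PREFIX.length());
            InputStream in = ResourceLocator.class.getResourceAsStream(resource);
            if (in == null) {
                throw new FileNotFoundException("Not on classpath: " + resource);
            }
            return in;
        }
        return Files.newInputStream(Paths.get(expandBase(value, baseDir)));
    }
}
```

With this sketch, `jar:/marytts/voice/CmuSlt/cart.mry` would be looked up on
the classpath, while `MARY_BASE/lib/voices/cmu-slt/timeline_waveforms.mry`
would be opened as a regular file below whatever base directory is passed in.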
In the unit-selection case, the audio data (particularly the
`timeline_waveforms.mry` file) is almost always too big to be
efficiently loaded into memory, so our solution is to keep it on the
filesystem and read the required units directly from disk at runtime.
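The general idea of reading units from disk on demand can be sketched in
Java as follows. This is only an illustration of random-access reading and
makes no claim about the real layout of `timeline_waveforms.mry`: the class
`UnitReader` is hypothetical, and it assumes that the byte offset and length
of each unit are already known from some index.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

/** Illustrative sketch only; the real timeline file format is not modelled. */
final class UnitReader implements AutoCloseable {

    private final RandomAccessFile file;

    UnitReader(String path) throws IOException {
        // Open read-only; nothing is loaded into memory at this point.
        this.file = new RandomAccessFile(path, "r");
    }

    /** Read exactly `length` bytes starting at byte `offset` (both assumed known). */
    byte[] readUnit(long offset, int length) throws IOException {
        byte[] buffer = new byte[length];
        file.seek(offset);      // jump straight to the unit
        file.readFully(buffer); // throws EOFException if the file is too short
        return buffer;
    }

    @Override
    public void close() throws IOException {
        file.close();
    }
}
```

Only the bytes of the requested units are ever held in memory, so the memory
footprint stays small even when the file on disk is very large.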
I hope this helps. If you have further questions, please open issues on
GitHub as appropriate.
Best wishes,
-Ingmar
P.S. TypeTalk looks very cool. We might be in touch about that. =)
[^1]: https://scholar.google.com/scholar?q=paul+taylor+text+to+speech
On 27.12.16 17:56, Rijk Theodoor Oosterhoff wrote:
> Hello,
>
> I was wondering what the difference is between a hidden semi-Markov
> model voice and a unit selection voice?
>
> And how do I get these on the classpath? The HSMM voices are simply
> added using a Maven or Gradle dependency. But a unit selection voice
> needs some other files as well. How do I get these on the classpath?
>
> Btw, we are building a MaryTTS frontend. You can find our effort at:
> http://typetalk.github.io/TypeTalk/
>
>
> Kind regards,
> Rijk