[mary-dev] Very strict reliability for FeatureMaker
Fabio Tesser
fabio.tesser at gmail.com
Tue Nov 23 20:09:24 CET 2010
Hello,
I am running the FeatureMaker program (point 5 of
http://mary.opendfki.de/wiki/NewLanguageSupport) for Italian.
I have used the strict reliability option because I would like to select
only words inside my lexicon (otherwise I obtain a lot of non-Italian
words and acronyms).
But even with this option, my selection still contains some words that
are not in the lexicon.
Some examples:
al-ʿAzīz
MA-31PG
Mini-DSLAM
Z-Man
Note that all of these words contain the '-' character.
The reliability option description says that:
"With setting strict, only those sentences that contain words in the
lexicon or words that were transcribed by the preprocessor can be
selected for the synthesis script;"
So I suppose these words are transcribed by "the preprocessor".
But if I try to transcribe these words using the maryserver, the result
says that they were transcribed by the lexicon (g2p_method), even though
they are not in the lexicon.
The marytts.tools.dbselection.FeatureMaker.checkReliability() method
confirms that.
I have some questions about these words:
- Which component transcribes these words (the preprocessor)? And
how does it work?
- Is it possible to assign them a different g2p_method label? That way
it should be possible to have a "very strict reliability" option in
checkReliability()...
- If this is not possible, does anyone have other suggestions for how to
assign, in the context of FeatureMaker, the sentences that contain
these words to the unreliable set?
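To make the last two questions concrete, here is a minimal sketch of the
kind of "very strict" check I have in mind. It does not use the MaryTTS
API: the class VeryStrictCheck and the method isVeryStrictlyReliable()
are hypothetical names, and I assume the sentence is already split into
word tokens (no punctuation) and that the lexicon is available as a set
of lowercase entries. A sentence is accepted only if every token, taken
as a whole surface form, is in the lexicon.

```java
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class VeryStrictCheck {

    // Hypothetical helper, not part of MaryTTS.
    // Returns true only if every token is found in the lexicon as a
    // complete surface form, so "Mini-DSLAM" is rejected even when a
    // part of it ("mini") would be found on its own.
    // Assumption: lexicon entries are stored in lowercase.
    public static boolean isVeryStrictlyReliable(List<String> tokens,
                                                 Set<String> lexicon) {
        for (String token : tokens) {
            if (!lexicon.contains(token.toLowerCase(Locale.ROOT))) {
                return false; // one unknown word makes the sentence unreliable
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Toy lexicon for illustration only.
        Set<String> lexicon = Set.of("il", "router", "funziona", "mini");

        System.out.println(isVeryStrictlyReliable(
                List.of("Il", "router", "funziona"), lexicon));
        System.out.println(isVeryStrictlyReliable(
                List.of("Il", "Mini-DSLAM", "funziona"), lexicon));
    }
}
```

With the toy lexicon above, the first sentence is accepted and the
second is rejected because of the token "Mini-DSLAM". In FeatureMaker,
a check like this would have to run in addition to the g2p_method test,
since that label alone does not separate these words from real lexicon
entries.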
Thank you,
Fabio.