[mary-dev] Very strict reliability for FeatureMaker

Fabio Tesser fabio.tesser at gmail.com
Tue Nov 23 20:09:24 CET 2010


Hello,

I am running the FeatureMaker program (point 5 of 
http://mary.opendfki.de/wiki/NewLanguageSupport) for Italian.
I have used the strict reliability option because I would like to select 
only words inside my lexicon (otherwise I obtain a lot of non-Italian 
words and acronyms).
But even with this option, my selection still contains some words that 
are not in the lexicon.
Some examples:
al-ʿAzīz
MA-31PG
Mini-DSLAM
Z-Man

Note that all of these words contain the '-' character.

The reliability option description says that:
"With setting strict, only those sentences that contain words in the 
lexicon or words that were transcribed by the preprocessor can be 
selected for the synthesis script;"

So I suppose these words are transcribed by "the preprocessor".
But if I transcribe these words with the maryserver, the output reports 
them as transcribed by the lexicon (g2p_method), even though they are 
not in the lexicon.

The marytts.tools.dbselection.FeatureMaker.checkReliability() method 
confirms that.

I have some questions about these words:
- Which component transcribes these words (the preprocessor)? And 
how does it work?
- Is it possible to assign them a different g2p_method label? That way 
it should be possible to add a "very strict reliability" option to 
checkReliability()...
- If this is not possible, does anyone have other suggestions for how to 
put, in the context of FeatureMaker, the sentences that contain 
these words into the unreliable set?
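
To make the last question concrete, here is a minimal sketch of the kind 
of check I have in mind. It is not MaryTTS code: the class 
VeryStrictFilter and the method isVeryStrictReliable are hypothetical 
names, and it assumes the lexicon headwords are available as a plain 
lowercase set of strings. It ignores the g2p_method label completely and 
looks up each whole token (hyphens included) directly.

```java
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of MaryTTS.
class VeryStrictFilter {

    /**
     * Returns true only if every token of the sentence is a headword of the
     * lexicon. Assumption: the lexicon set holds lowercase entries, so each
     * token is lowercased before the lookup.
     */
    static boolean isVeryStrictReliable(List<String> tokens, Set<String> lexicon) {
        for (String token : tokens) {
            // Look up the whole token, so "Mini-DSLAM" is rejected unless
            // the lexicon contains exactly that entry.
            if (!lexicon.contains(token.toLowerCase(Locale.ROOT))) {
                return false;
            }
        }
        return true;
    }
}
```

With a lexicon containing "il", "la" and "casa", the tokens "La", "casa" 
would pass, while a sentence containing "Mini-DSLAM" would go to the 
unreliable set.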

Thank you,
Fabio.






