[mary-dev] Problem trying to add pt_BR - PhoneUnitLabelComputer step

Fabio Tesser fabio.tesser at gmail.com
Mon Feb 28 11:30:42 CET 2011


Hello Ingmar and Fábio,

I would like to add my 2 cents on this:

When I tried to build a new voice on a machine with an Italian locale, 
I remember getting a similar error.

If I remember correctly, the component that wrote a comma instead of 
"." was, I believe, LabelPauseDeleter, but I haven’t investigated this 
very deeply.
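
To illustrate how a comma can end up in a lab file, here is a minimal Java sketch. It is not taken from the Mary code; the class name DecimalSeparatorDemo and the method formatTime are hypothetical, and I am only assuming that the component formats times with a locale-sensitive call such as String.format.

```java
import java.util.Locale;

// Hypothetical demo class, not part of Mary.
public class DecimalSeparatorDemo {

    // Formats a time in seconds using the decimal separator of the given locale.
    static String formatTime(double seconds, Locale locale) {
        return String.format(locale, "%f", seconds);
    }

    public static void main(String[] args) {
        // Italian locale: comma as decimal separator, as in the malformed line.
        System.out.println(formatTime(0.07, Locale.ITALY)); // prints 0,070000
        // US locale: dot as decimal separator, as the lab file parser expects.
        System.out.println(formatTime(0.07, Locale.US));    // prints 0.070000
    }
}
```

If a component calls String.format without an explicit Locale argument, the JVM default locale is used, which would explain why the output depends on the machine.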

I fixed the problem by running the voice import tools from a terminal 
with a different locale.

For example, in GNOME on Ubuntu you can do this:

$ env LANG=en_GB.UTF-8 gnome-terminal --disable-factory
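
The same effect can probably be had inside the JVM. The following is only a sketch under that assumption: the class LocaleOverrideDemo and its method are hypothetical, and I have not checked whether the voice import tools offer a hook for this.

```java
import java.util.Locale;

// Hypothetical demo class, not part of Mary.
public class LocaleOverrideDemo {

    // Formats a time in seconds with whatever the current JVM default locale is.
    static String formatWithDefaultLocale(double seconds) {
        return String.format("%f", seconds);
    }

    public static void main(String[] args) {
        // Simulate a machine whose default locale uses a comma.
        Locale.setDefault(Locale.ITALY);
        System.out.println(formatWithDefaultLocale(0.07)); // prints 0,070000

        // Override the default, similar to starting the terminal with LANG=en_GB.UTF-8.
        Locale.setDefault(Locale.UK);
        System.out.println(formatWithDefaultLocale(0.07)); // prints 0.070000
    }
}
```

Passing the standard system properties -Duser.language=en -Duser.country=GB on the java command line should have the same effect without touching any code.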

Best,
Fabio.



On 02/27/2011 02:56 PM, Fábio Marinho wrote:
> Well, I think I found the problem. It seems that the regex used in 
> XwavesLabelfileDataSource for parsing the lab files expects "." as 
> the decimal separator. I don't know why, maybe because of my locale, 
> but the lab files were generated using "," as the decimal separator. 
> Maybe a good bug fix would be to use the decimal separator from the 
> locale of the machine executing the code.
>
> I will keep on trying here. Thank you.



On 02/28/2011 09:22 AM, Ingmar Steiner wrote:
> Dear Fábio,
>
> glad to see you found the cause of the issue so quickly!
>
> I agree that it would be slightly more elegant to support those locales
> that use a comma as the decimal separator (BP, German, etc.).
>
> However, I'm not convinced (just yet) that this is a clean solution. As
> far as I'm aware, the Xwaves lab file format does not allow commas as
> decimal separators, and Mary's XwavesLabelfileDataSource class is
> certainly not the only program that rejects files as malformed that do
> not adhere to the format.
>
> The question that arises is how your lab file was created. Was it some
> component of the Mary voicebuilding toolchain?
>
> Best wishes,
>
> -Ingmar
>
> On 26.02.2011 13:57, Fábio Marinho wrote:
>> Hello,
>>
>> I am a Java developer and personally interested in building a natural
>> voice for Brazilian Portuguese (pt_BR).
>>
>> I am following all steps in "Adding New Language Support" development
>> page of Open MARY.
>>
>> Everything was OK so far, but then I got stuck at the
>> PhoneUnitLabelComputer step of the ImportVoice GUI:
>>
>> TRACE:
>>
>> Computing unit labels for 15 files.
>>   From phonetic label files:
>> /home/fmarinho/Desenvolvimento/TTS/minhavoz/lab/*.lab
>> To       unit label files:
>> /home/fmarinho/Desenvolvimento/TTS/minhavoz/phonelab/*.lab
>> Malformed line found outside of header:
>> 0,070000 125 _
>> java.lang.Exception: The component PhoneUnitLabelComputer produced the following exception:
>> at marytts.tools.voiceimport.DatabaseImportMain$8.run(DatabaseImportMain.java:294)
>> Caused by: java.io.IOException
>> at marytts.util.data.text.XwavesLabelfileDataSource.parseLabels(XwavesLabelfileDataSource.java:157)
>> at marytts.util.data.text.XwavesLabelfileDataSource.<init>(XwavesLabelfileDataSource.java:71)
>> at marytts.util.data.text.XwavesLabelfileDataSource.<init>(XwavesLabelfileDataSource.java:58)
>> at marytts.tools.voiceimport.PhoneUnitLabelComputer.computePhoneLabel(PhoneUnitLabelComputer.java:138)
>> at marytts.tools.voiceimport.PhoneUnitLabelComputer.compute(PhoneUnitLabelComputer.java:119)
>> at marytts.tools.voiceimport.DatabaseImportMain$8.run(DatabaseImportMain.java:291)
>>
>>
>> MY ENVIRONMENT:
>> Ubuntu 10.04 LTS - Lucid Lynx
>> java version "1.6.0_22"
>> Java(TM) SE Runtime Environment (build 1.6.0_22-b04)
>> Java HotSpot(TM) Server VM (build 17.1-b03, mixed mode)
>>
>>
>> I think that maybe it could be a simple detail that I am missing. So
>> before trying to debug the code in Eclipse, I would appreciate any help
>> on that.
>>
>> Thank you in advance.

