[mary-dev] Problem trying to add pt_BR - PhoneUnitLabelComputer step

Fábio Marinho fabiomarinho at gmail.com
Mon Feb 28 14:00:44 CET 2011


Dear folks,

Thank you for your quick answer!
I think my problem was similar to what happened to Fabio Tesser. Anyway,
I solved it "manually" by just running the Linux sed command.
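
For anyone who hits the same error, a minimal Java sketch of this kind of
manual fix could look like the following. It assumes the only problem is a
decimal comma in the leading time field of each label line (as in
"0,070000 125 _" from the trace quoted below). The names LabDecimalFix,
normalizeLine and formatTime are hypothetical and not part of MARY.

```java
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical helper, not part of MARY TTS.
class LabDecimalFix {
    // Matches a leading time field written with a decimal comma,
    // e.g. the "0,070000" in "0,070000 125 _".
    private static final Pattern COMMA_TIME =
            Pattern.compile("^(\\s*\\d+),(\\d+)(?=\\s|$)");

    // Rewrites "0,070000 125 _" as "0.070000 125 _".
    // Lines without a leading decimal comma are returned unchanged.
    static String normalizeLine(String line) {
        return COMMA_TIME.matcher(line).replaceFirst("$1.$2");
    }

    // Illustration only: formatting a number under a given locale.
    // A pt-BR locale yields a decimal comma, Locale.US a decimal point.
    static String formatTime(Locale locale, double seconds) {
        return String.format(locale, "%f", seconds);
    }

    public static void main(String[] args) {
        System.out.println(normalizeLine("0,070000 125 _"));
        System.out.println(formatTime(Locale.forLanguageTag("pt-BR"), 0.07));
        System.out.println(formatTime(Locale.US, 0.07));
    }
}
```

normalizeLine could be applied to every line of each .lab file before
running PhoneUnitLabelComputer again. formatTime only illustrates how a
comma can end up in such a file when numbers are formatted under a pt_BR
locale; it does not claim to identify which tool wrote the file.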

As I am new to TTS development, is there someone who could give me advice
and references for building a natural BP voice? How many sentences should
I record, for instance? I would also like to collaborate in any way I can.

Here are some of my circumstances, objectives and motivations, to help you help me:

- I just discovered the Open MARY site last week and got excited about some
of the possibilities.

- Normally, I work as a Java developer, coding CRUD and other boring things.

- I couldn't find anyone at my former university who could lead a research
project on a new open natural BP speech solution.

- My first motivation is to create a natural voice for my wife. Yes, for my
wife! She has retinal degeneration and can't read a complete word, just
letter by letter and with a 20x magnifier. Therefore, she depends strongly
on voice synthesis for her studies (long readings of psychoanalysis texts).
I have thought about unit selection synthesis so far, but without a
domain... how many sentences?

- A lot of people in Brazil can't afford to buy commercial software for
sight accessibility. There are nice open solutions like NVDA and Orca, but
they lack good natural voices (at least for BP). The simple robotic voices
are sufficient to operate basic functions; however, for reading long texts,
they can cause headaches.

- We lead a non-profit association in our state (http://www.retinaminas.org/)
that basically gives information about new research concerning retina
treatments and accessibility for sight-impaired people. It seems like a good
idea to focus my development skills on creating new, free and open
accessibility solutions.

That's it! I dream of creating a good natural-sounding voice for sight
accessibility, sharing the experience and earning nothing for it. Could I
use MARY TTS? Am I crazy?

Well, I would be forever grateful for any help with that.



2011/2/28 Fabio Tesser <fabio.tesser at gmail.com>

> Hello Ingmar and Fábio,
>
> I would like to add my 2 cents on this:
>
> When I tried to build a new voice on a machine with an Italian locale, I
> remember that I got a similar error.
>
> If I remember correctly, the component that used the "," instead of the "."
> was, in my opinion, LabelPauseDeleter, but I haven’t investigated very deeply.
>
> I fixed the problem by running the voice import tools from a terminal with
> a different locale.
>
> For example, in GNOME on Ubuntu you can do this:
>
> $ env LANG=en_GB.UTF-8 gnome-terminal --disable-factory
>
> Best,
> Fabio.
>
>
>
>
> On 02/27/2011 02:56 PM, Fábio Marinho wrote:
>
>> Well, I think I found the problem. It seems that the regex used in
>> XwavesLabelfileDataSource for parsing the lab files expects "." as the
>> decimal separator. For some reason, maybe because of my locale, the lab
>> files were generated using "," as the decimal separator. Maybe a good bug
>> fix would be to use the decimal separator from the locale of the machine
>> executing the code.
>>
>> I will keep on trying here. Thank you.
>>
>
>
>
> On 02/28/2011 09:22 AM, Ingmar Steiner wrote:
>
>> Dear Fábio,
>>
>> glad to see you found the cause of the issue so quickly!
>>
>> I agree that it would be slightly more elegant to support those locales
>> that use a comma as the decimal separator (BP, German, etc.).
>>
>> However, I'm not convinced (just yet) that this is a clean solution. As
>> far as I'm aware, the Xwaves lab file format does not allow commas as
>> decimal separators, and Mary's XwavesLabelfileDataSource class is
>> certainly not the only program that rejects files as malformed that do
>> not adhere to the format.
>>
>> The question that arises is how your lab file was created. Was it some
>> component of the Mary voicebuilding toolchain?
>>
>> Best wishes,
>>
>> -Ingmar
>>
>> On 26.02.2011 13:57, Fábio Marinho wrote:
>>
>>> Hello,
>>>
>>> I am a Java developer and personally interested in building a natural
>>> voice for Brazilian Portuguese (pt_BR).
>>>
>>> I am following all steps in "Adding New Language Support" development
>>> page of Open MARY.
>>>
>>> Everything was OK so far, but then I got stuck at the
>>> PhoneUnitLabelComputer step of the ImportVoice GUI:
>>>
>>> TRACE:
>>>
>>> Computing unit labels for 15 files.
>>>  From phonetic label files:
>>> /home/fmarinho/Desenvolvimento/TTS/minhavoz/lab/*.lab
>>> To       unit label files:
>>> /home/fmarinho/Desenvolvimento/TTS/minhavoz/phonelab/*.lab
>>> Malformed line found outside of header:
>>> 0,070000 125 _
>>> java.lang.Exception: The component PhoneUnitLabelComputer produced the following exception:
>>>     at marytts.tools.voiceimport.DatabaseImportMain$8.run(DatabaseImportMain.java:294)
>>> Caused by: java.io.IOException
>>>     at marytts.util.data.text.XwavesLabelfileDataSource.parseLabels(XwavesLabelfileDataSource.java:157)
>>>     at marytts.util.data.text.XwavesLabelfileDataSource.<init>(XwavesLabelfileDataSource.java:71)
>>>     at marytts.util.data.text.XwavesLabelfileDataSource.<init>(XwavesLabelfileDataSource.java:58)
>>>     at marytts.tools.voiceimport.PhoneUnitLabelComputer.computePhoneLabel(PhoneUnitLabelComputer.java:138)
>>>     at marytts.tools.voiceimport.PhoneUnitLabelComputer.compute(PhoneUnitLabelComputer.java:119)
>>>     at marytts.tools.voiceimport.DatabaseImportMain$8.run(DatabaseImportMain.java:291)
>>>
>>>
>>> MY ENVIRONMENT:
>>> Ubuntu 10.04 LTS - Lucid Lynx
>>> java version "1.6.0_22"
>>> Java(TM) SE Runtime Environment (build 1.6.0_22-b04)
>>> Java HotSpot(TM) Server VM (build 17.1-b03, mixed mode)
>>>
>>>
>>> I think it may be a simple detail that I am missing. So before trying
>>> to debug the code in Eclipse, I would appreciate any help with this.
>>>
>>> Thank you in advance.