[mary-dev] [mary-users] How to create initial support files without Transcription GUI?

Ingmar Steiner ingmar.steiner at dfki.de
Mon Jun 14 09:11:51 CEST 2010


Dear Igor,

POS tagging is not performed during lexical lookup, but by a dedicated module such as the OpenNLPPosTagger, which uses OpenNLP (http://opennlp.sourceforge.net/api/). Of course, you could create a different module, as long as it converts WORDS into PARTSOFSPEECH.
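
To make the WORDS-to-PARTSOFSPEECH step concrete, here is a minimal sketch in Python. It is not MARY code and does not use the MARY module API: the function name tag_words, the toy lexicon, and the fallback tag "NN" are all assumptions made for illustration. It only shows the shape of the task, which is to take tokens and attach a pos attribute to each <t> element, as in the cmu-slt-hsmm output quoted below.

```python
import xml.etree.ElementTree as ET

# Hypothetical toy lexicon. The tags follow the Penn Treebank style that
# appears in the quoted cmu-slt-hsmm output (DT, NN, VBZ, JJ, ".").
TOY_LEXICON = {"the": "DT", "house": "NN", "is": "VBZ", "big": "JJ", ".": "."}


def tag_words(tokens, lexicon, default_tag="NN"):
    """Return an <s> element whose <t> children each carry a pos attribute.

    Unknown tokens receive default_tag; this fallback is an assumption of
    the sketch, not documented MARY behaviour.
    """
    sentence = ET.Element("s")
    for token in tokens:
        t = ET.SubElement(sentence, "t")
        t.set("pos", lexicon.get(token.lower(), default_tag))
        t.text = token
    return sentence


sentence = tag_words(["The", "house", "is", "big", "."], TOY_LEXICON)
print(ET.tostring(sentence, encoding="unicode"))
```

A real tagger would use context to disambiguate words with several possible tags, which a plain lookup like this cannot do.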

Or, if you have a dictionary with POS, you could experiment with a module which converts WORDS directly into PHONEMES (although it could provide an alternative constructor producing PARTSOFSPEECH to handle explicit requests for that output type).
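
As a rough illustration of the dictionary-based route, the following Python sketch parses a lexicon with three whitespace-separated columns (word, transcription, POS), as in the example entries quoted further down, and returns both the transcription and the POS label for each word. The names parse_lexicon and lookup are hypothetical, and handling of unknown words (for which MARY would fall back on letter-to-sound rules) is deliberately left out.

```python
# Entries copied from the example lexicon quoted later in this thread:
# orthography, transcription, POS label, separated by whitespace.
LEXICON_TEXT = """\
a 'a article
apurar a-pu-'raX verb
aquela a-'kE-la det
branco 'bra~-ku adjective
brasil bra-'ziw noun
"""


def parse_lexicon(text):
    """Map each word to a (transcription, pos) pair."""
    entries = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        word, transcription, pos = fields
        entries[word] = (transcription, pos)
    return entries


def lookup(words, entries):
    """Yield (word, transcription, pos); both are None for unknown words."""
    for word in words:
        transcription, pos = entries.get(word.lower(), (None, None))
        yield word, transcription, pos


entries = parse_lexicon(LEXICON_TEXT)
for row in lookup(["Aquela", "casa"], entries):
    print(row)
```

Note that this stores a single POS per word, so homographs with different tags would need a richer entry format.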

Best wishes,

/**
 * Ingmar Steiner
 * Researcher, Language Technology
 * German Research Center for Artificial Intelligence
 *
 * Campus D3 1 +1.18
 * D-66123 Saarbrücken
 * Germany
 * Phone: ++49-681-857-75-5263 (NEW!)
 * Email: ingmar.steiner at dfki.de
 *
 * Deutsches Forschungszentrum für Künstliche Intelligenz GmbH
 * Trippstadter Straße 122, D-67663 Kaiserslautern, Germany
 * Geschäftsführung:
 * Prof. Dr. Dr. h.c. mult. Wolfgang Wahlster (Vorsitzender)
 * Dr. Walter Olthoff
 * Vorsitzender des Aufsichtsrats:
 * Prof. Dr. h.c. Hans A. Aukes
 * Amtsgericht Kaiserslautern, HRB 2313
 */

On 11 Jun 2010, at 21:58, Igor wrote:

> Ok Marc.
> 
> I will study the TranscriptionGUI code to learn more.
> 
> One question remains: how did you handle POS tagging for English? I found a cmudict at https://cmusphinx.svn.sourceforge.net/svnroot/cmusphinx/trunk/cmudict/ and it doesn't have tags. I downloaded the cmu-slt-hsmm female voice and checked that POS tagging is already supported, as shown below.
> 
> 
> <voice name="cmu-slt-hsmm">
> <s>
> <t pos="DT">
> The
> </t>
> <t pos="NN">
> house
> </t>
> <t pos="VBZ">
> is
> </t>
> <t pos="JJ">
> big
> </t>
> <t pos=".">
> .
> </t> 
> 
> 
> At which stage of the process is POS tagging performed for English?
> 
> Sorry for the many questions, but I don't have much knowledge of TTS systems yet.
> 
> Thanks for your attention.
> 
> 2010/6/11 Marc Schroeder <marc.schroeder at dfki.de>
> That's an interesting idea -- creating a slightly less trivial POS tagger than the mere function/content word distinction we support in the Transcription GUI. That is not yet the full POS tagging support found in state-of-the-art systems (e.g., using trigram models trained on annotated corpora), but in many unambiguous cases it should get you some way down the line.
> 
> I guess here you are on your own. Dig into the TranscriptionGUI code to see where function word annotation is used, and see how you need to extend it to support a custom list of POS tags.
> 
> If you do this, it would be nice if you could feed back the code into MARY TTS.
> 
> Any discussions on this should probably be held on the mary-dev mailing list, so that we don't swamp the users.
> 
> Cheers,
> Marc
> 
> 
> On 10.06.10 23:03, Igor wrote:
> Thanks for the help, Marc.
> 
> Some more questions.
> 
> Currently, I can create a Brazilian Portuguese dictionary in my software,
> like:
> 
> a 'a functional
> apurar a-pu-'raX functional
> aquela a-'kE-la functional
> branco 'bra~-ku functional
> brasil bra-'ziw functional
> 
> Besides creating the fst and lts files inside the software, I would like to add a
> POS tagger. Is the tag /functional/ related to POS tagging? For example:
> 
> a 'a article
> apurar a-pu-'raX verb
> aquela a-'kE-la det
> branco 'bra~-ku adjective
> brasil bra-'ziw noun
> 
> Is it correct? If yes, which tags does MARY accept? Verb, noun, article, etc.?
> 
> 
> 2010/6/9 Marc Schroeder <marc.schroeder at dfki.de>
> 
> 
>    Hi,
> 
>    that is what we do for US English: we convert the CMUDict into our
>    format. Here is the source code:
> 
>    http://mary.opendfki.de/browser/tags/4.0.0/java/marytts/tools/newlanguage/en_US/CMUDict2MaryFST.java
> 
>    Hope that helps, best regards,
>    Marc
> 
>    On 08.06.10 14:59, Igor wrote:
>     > Hi,
>     >
>     > I have already created a Brazilian voice, but I have some questions. Is
>     > there any way to create the fst and lts files without the Transcription GUI?
>     > Currently, I use a dictionary (generated by my software) and this tool to
>     > build the initial support (basic NLP) for Portuguese. Is it possible to
>     > create these files directly in my software?
>     >
>     > --
>     > Eng. Igor Costa do Couto
>     > Engenharia da Computação - UFPa
>     > LaPS - UFPa
>     > igorcouto at gmail.com
>     > Skype: iccouto
>     >
>     >
>     >
>     > _______________________________________________
>     > Mary-users mailing list
>     > Mary-users at dfki.de
>     > http://www.dfki.de/mailman/cgi-bin/listinfo/mary-users
> 
>    --
>    please note my NEW phone number: +49-681-85775-5303
> 
>    Dr. Marc Schröder, Senior Researcher at DFKI GmbH
>    Coordinator EU FP7 Project SEMAINE http://www.semaine-project.eu
>    Project leader for DFKI in SSPNet http://sspnet.eu
>    Project leader PAVOQUE http://mary.dfki.de/pavoque
>    Associate Editor IEEE Trans. Affective Computing http://computer.org/tac
>    Editor W3C EmotionML Working Draft http://www.w3.org/TR/emotionml/
>    Portal Editor http://emotion-research.net
>    Team Leader DFKI TTS Group http://mary.dfki.de
> 
>    Homepage: http://www.dfki.de/~schroed
>    Email: marc.schroeder at dfki.de
> 
>    Phone: +49-681-85775-5303
>    Postal address: DFKI GmbH, Campus D3_2, Stuhlsatzenhausweg 3, D-66123
>    Saarbrücken, Germany
>    --
>    Official DFKI coordinates:
>    Deutsches Forschungszentrum fuer Kuenstliche Intelligenz GmbH
>    Trippstadter Strasse 122, D-67663 Kaiserslautern, Germany
>    Geschaeftsfuehrung:
>    Prof. Dr. Dr. h.c. mult. Wolfgang Wahlster (Vorsitzender)
>    Dr. Walter Olthoff
>    Vorsitzender des Aufsichtsrats: Prof. Dr. h.c. Hans A. Aukes
>    Amtsgericht Kaiserslautern, HRB 2313
> 
> 
> 
> 
> 
> _______________________________________________
> Mary-dev mailing list
> Mary-dev at dfki.de
> http://www.dfki.de/mailman/cgi-bin/listinfo/mary-dev


