[mary-users] = sign in front of phoneme

Ingmar Steiner ingmar.steiner at dfki.de
Fri Apr 12 11:35:33 CEST 2013


Dear Ricardo,

On 10.04.13 22:05, Ricardo Duarte wrote:
> Dear Ingmar,
>
> Thank you very much for the quick reply; I got lost in a few places.
>
> I made this table a while back and have now updated it. Here is the link:
> https://www.sugarsync.com/pf/D6330330_64065670_614445
> I found the documentation a bit short on this matter, and since I am not
> an expert in the speech area, maybe you could look over the xls file
> (and notify me of missing or wrong matches, or just fix them yourself
> if you prefer) and upload it to the GitHub wiki for MaryTTS, or put it
> in the wiki as HTML. If the file is correct, I don't mind doing it
> myself, as this might help a few people.

Thanks for sharing that table! As you suggest, we could add a page in 
the wiki on github. But I think the documentation of the corresponding 
language modules might be an even better place; I'll put it on my todo 
list for the maven site generation.

>
> So in Mary, "r=" is equivalent to "@r"? Are there any other
> equivalences?

Mary "r=" is equivalent to SAMPA "r=". Also, SAMPA has other symbols, 
such as "@r", for pretty much the same sound (but with different 
phonological properties), and Mary pragmatically uses "r=" for those as 
well. Sorry for the confusion.
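
For the notational differences in the stress marks, a minimal Python sketch of a Mary-to-SAMPA conversion could look like the following. It assumes the transcription is a space-separated string, as in the ph attributes of your example, with the stress marks as separate tokens. The name mary_to_sampa is only illustrative and not part of MaryTTS. Symbols such as "r=" are passed through unchanged, since Mary's "r=" cannot be mapped back to "@r" without further information.

```python
# Illustrative sketch only; this function is not part of MaryTTS.
# Mary marks primary stress with ' and secondary stress with ,
# whereas SAMPA uses " and % respectively.
MARY_TO_SAMPA_STRESS = {"'": '"', ",": "%"}


def mary_to_sampa(ph: str) -> str:
    """Replace Mary stress marks with their SAMPA counterparts.

    Assumes a space-separated transcription in which stress marks
    are standalone tokens. All other tokens (phones, the "-"
    syllable separator) are left untouched.
    """
    tokens = ph.split()
    return " ".join(MARY_TO_SAMPA_STRESS.get(tok, tok) for tok in tokens)


if __name__ == "__main__":
    # Transcription of "Charisma" taken from the MaryXML in this thread.
    print(mary_to_sampa("k r= - ' I z - m @"))
```

Only the two stress symbols are touched, so anything else that differs between the two notations would need its own entry in the mapping.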

>
> You also said that stress is marked with "'" and "," for primary
> and secondary stress, as opposed to SAMPA's "\"" and "%", but closer to
> IPA. But in the ACOUSTPARAMS output, there is a stress parameter in the
> syllable tag.
>
> example:
> <syllable ph="k r=">
> <ph d="122" end="0.7250624" p="k" units="k_L w0012 12425 0.07425; k_R
> w0010 12106 0.0471875"/>
> <ph d="114" end="0.83956236" f0="(0,94) (50,91) (100,99)" p="r="
> units="r=_L w0010 12107 0.0556875; r=_R w0010 12108 0.0588125"/>
> </syllable>

Although I don't see it in your XML snippet, I think you're referring to 
the stress attribute of syllable elements, which are generated from the 
token transcriptions (ph attributes in the t elements) at the ALLOPHONES 
stage. These in turn are taken from the lexicon (or generated from 
grapheme2phoneme rules), and stress at this level is lexical in nature. 
But as phrase-level prosody comes into play, the syllables with lexical 
stress are the "anchor points" for accents.

To answer your question, the stress attributes are essentially a helper 
construct to expose the lexical stress in the XML document. The ph 
attributes are not removed (so they become redundant), but IIRC, they 
are ignored after the XML syllable structure is generated.
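
To illustrate how the syllable structure relates to the ph attribute, here is a rough Python sketch. It is not the actual MaryTTS code; it only assumes what the transcriptions in your example show: tokens separated by spaces, syllables separated by "-", and an optional "'" or "," token marking stress. The stress values 1 and 2 (0 for unstressed) and the function name split_syllables are assumptions made for illustration.

```python
# Illustrative sketch only; this is not how MaryTTS itself is implemented.
# Assumed stress values: 1 = primary, 2 = secondary, 0 = unstressed.
STRESS_MARKS = {"'": 1, ",": 2}


def split_syllables(ph):
    """Split a Mary ph string into (phones, stress) pairs, one per syllable."""
    syllables = []
    phones, stress = [], 0
    for tok in ph.split():
        if tok == "-":
            # Syllable boundary: store the current syllable and start a new one.
            if phones:
                syllables.append((phones, stress))
            phones, stress = [], 0
        elif tok in STRESS_MARKS:
            # The stress mark applies to the syllable it appears in.
            stress = STRESS_MARKS[tok]
        else:
            phones.append(tok)
    if phones:
        syllables.append((phones, stress))
    return syllables


if __name__ == "__main__":
    # Transcription of "Charisma" taken from the MaryXML in this thread.
    for phones, stress in split_syllables("k r= - ' I z - m @"):
        print(" ".join(phones), stress)
```

In this sketch the stress mark ends up as a property of the whole syllable, not of an individual phone, which mirrors the idea of a stress attribute on the syllable element.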

Best wishes,

-Ingmar

>
> Can stress apply to both phonemes and syllables?
> Are the stress marks you mention embedded in the phoneme attribute value,
> as in p=",r" or p=" ' "?
>
> Thanks for your help.
>
> Ricardo
>
>
>
>
> On 10 April 2013 18:39, Ingmar Steiner <ingmar.steiner at dfki.de> wrote:
>
>     Dear Ricardo,
>
>     the "r=" is indeed the syllabic r in SAMPA, but also used for the
>     rhotacized open-mid central vowel ɝ, which in canonical SAMPA would
>     be "@r" [1]; you can refer to Mary's allophone set for US English [2].
>
>     Conversion from Mary to SAMPA should be quite straightforward; there
>     are only very few notational differences (such as "'" and "," for
>     primary and secondary stress, as opposed to SAMPA's "\"" and "%",
>     but closer to IPA). Conversion to CMUDict's Arpabet should also be
>     easy to implement, but the class you mention has a different purpose
>     [3].
>
>     Best wishes,
>
>     -Ingmar
>
>     [1] see http://www.phon.ucl.ac.uk/home/sampa/american.htm, Note 1, ii.
>     [2] https://github.com/marytts/marytts/blob/master/marytts-lang-en/src/main/resources/marytts/language/en_US/lexicon/allophones.en_US.xml#L17
>     [3] http://mary.dfki.de/javadoc/4.3.0/marytts/tools/newlanguage/en_US/CMUDict2MaryFST.html
>
>
>     On 4/10/13 19:09, Ricardo Duarte wrote:
>
>         Hi all,
>
>         I am implementing a coarticulation framework. I managed to match
>         most phonemes, but one phoneme, "r=", has an '=' after the 'r'.
>         Is this '=' a diacritic? Are the diacritic symbols ('~' and '=')
>         the same as in SAMPA (http://www.phon.ucl.ac.uk/home/sampa/)?
>         Is there a way to convert from Mary phonemes to CMUDict or
>         SAMPA? I just found the class CMUDict2MaryFST; will this class
>         do it?
>
>
>         Here is the output of the intonation file:
>         <?xml version="1.0" encoding="UTF-8"?>
>         <maryxml xmlns="http://mary.dfki.de/2002/MaryXML"
>         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" version="0.5"
>         xml:lang="en-GB">
>         <p>
>         <voice name="dfki-spike">
>         <s>
>         <phrase>
>         <t accent="L+H*" g2p_method="lexicon" ph="' w E l - k @ m" pos="UH">
>         Welcome
>         </t>
>         <t g2p_method="lexicon" ph="' t u" pos="TO">
>         to
>         </t>
>         <t accent="!H*" g2p_method="lexicon" ph="k r= - ' I z - m @"
>         pos="NNP">
>         Charisma
>         </t>
>         <boundary breakindex="5" tone="L-L%"/>
>         </phrase>
>         </s>
>         </voice>
>         </p>
>         </maryxml>
>
>
>         I would appreciate any help.
>
>         Ricardo
>
>
>         _________________________________________________
>         Mary-users mailing list
>         Mary-users at dfki.de
>         http://www.dfki.de/mailman/cgi-bin/listinfo/mary-users
>
>
>     /**
>       * Dr. Ingmar Steiner
>       *
>       * Head of Independent Research Group
>       * Multimodal Speech Processing
>       * Cluster of Excellence MMCI
>       *
>       * Senior Researcher
>       * Language Technology Lab
>       * German Research Center for
>       * Artificial Intelligence (DFKI GmbH)
>       *
>       * Adjunct Assistant Professor
>       * Department of Computer Science
>       * Saarland University
>       *
>       * Campus C7.4, Room 3.01
>       * D-66123 Saarbrücken
>       * @tel: +49-681-302-70028
>       * @fax: +49-681-302-4317
>       * @web: http://coli.uni-saarland.de/~steiner/
>       */
>
>

