[mary-users] English Language Prosody - Preprocessing Idea?

Stephen Mack sbmack7 at comcast.net
Wed Aug 27 15:41:11 CEST 2008

Thanks for the reply.
Right, any pre-processor would require some kind of expert system engine to apply the rules.
But here is where it gets tricky.  The current English instruction for a comma is to minimize its use, except when not using a comma could confuse the reader.  So using the comma more liberally as a prosody management device is tossed out the window.  BTW, I studied German. “Defenestrate” (Fenster) is a great English word that means to toss someone out a window.
So firstly, a baseline comma augmentation engine would have to implement current rules.  For example, replicating these:
 http://grammar.ccc.commnet.edu/grammar/commas.htm
With an option to apply augmented commas based on arguments found in here:
 http://grammar.wikia.com/wiki/Serial_comma
Incidentally, try cutting and pasting the 19th and 21st century examples into Mary for comparison.
And finally, for the most exhaustive comma augmentation, the developer would have to isolate why the novice contemporary writer often uses "too many" commas.  So he would have to interact with an English linguist or read documentation that explains that.  What I mean is that an unskilled English writer may use an "unneeded" comma because he is probably playing out the text prosodically in his mind, and inserts additional commas that map to vocal pauses rather than to what simple understanding requires.
The developer would have to classify those ungrammatical pause-related commas and add them as comma insertion rules to the Mary preprocessor, because although they may be grammatical errors, they could be prosodic requirements.
I am thinking on the fly here.  So the CommaAugmentation module would have 4 parametric levels of augmentation:
*         None - Text spoken as is
*         Standard - Text comma augmented based on contemporary written rules
*         Obsolete - Standard plus Obsolete written usage that improves prosody 
*         Writer's Intent - Obsolete plus additional comma insertion based on grammatically uncodified rules to support the prosodic pause.
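To make the cumulative levels concrete, here is a rough Python sketch, thinking on the fly again. Everything in it is an assumption for illustration: the names CommaLevel, RULES and augment are hypothetical, nothing here is part of Mary, and the three regular expressions are toy placeholders standing in for real rule sets that a linguist would have to supply. The only point is that each rule carries a minimum level, so a higher level automatically includes the rules of the levels below it.

```python
import re
from enum import IntEnum


class CommaLevel(IntEnum):
    """Hypothetical augmentation levels; each level includes the ones below it."""
    NONE = 0
    STANDARD = 1
    OBSOLETE = 2
    WRITERS_INTENT = 3


# (minimum level, pattern, replacement). These are toy placeholder rules,
# not a real grammar: one rule per level, just to show the layering.
RULES = [
    # Standard: comma after a sentence-initial conjunctive adverb.
    (CommaLevel.STANDARD,
     re.compile(r"^(However|Therefore|Meanwhile) "), r"\1, "),
    # Obsolete: serial comma before the last item of a list.
    (CommaLevel.OBSOLETE,
     re.compile(r"(\w+, \w+) (and|or) "), r"\1, \2 "),
    # Writer's Intent: pause before "but" / "so" even when grammar does not ask for it.
    (CommaLevel.WRITERS_INTENT,
     re.compile(r"(\w) (but|so) "), r"\1, \2 "),
]


def augment(text: str, level: CommaLevel) -> str:
    """Apply every rule whose minimum level is at or below the selected level."""
    for min_level, pattern, replacement in RULES:
        if level >= min_level:
            text = pattern.sub(replacement, text)
    return text


sample = "However we bought apples, pears and plums but nobody ate them."
for lvl in CommaLevel:
    print(f"{lvl.name:15} {augment(sample, lvl)}")
```

With level NONE the text passes through untouched, which matches "Text spoken as is" above.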
So usage would work like this.
1)      Plug some text into Mary
2)      Select a comma augment level.
3)      Listen to Mary play a section of the document (assuming prosodic homogeneity throughout the document, a single section should be enough)
4)      Select an alternative comma augment if necessary
5)      Mary writes the comma augmented text file (with markers to show changes)
6)      User builds final prosody edits into the Mary augmented text
7)      Deliver modified text to Mary
Does this make sense?
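For step 5, the "markers to show changes" could be as simple as bracketing every comma the augmenter added. A minimal Python sketch, assuming the augmenter only ever inserts commas and changes nothing else; the function name mark_inserted_commas and the "[,]" marker are made up for illustration and are not anything Mary provides:

```python
def mark_inserted_commas(original: str, augmented: str, marker: str = "[,]") -> str:
    """Return the augmented text with each inserted comma replaced by a marker.

    Walks both strings in parallel. Assumes the augmented text equals the
    original plus inserted commas; anything else raises ValueError.
    """
    out = []
    i = 0  # position in the original text
    for ch in augmented:
        if i < len(original) and ch == original[i]:
            out.append(ch)      # character was already in the original
            i += 1
        elif ch == ",":
            out.append(marker)  # comma added by the augmenter
        else:
            raise ValueError("augmented text differs by more than inserted commas")
    if i != len(original):
        raise ValueError("augmented text is missing part of the original")
    return "".join(out)


before = "We bought apples, pears and plums but nobody ate them."
after = "We bought apples, pears, and plums, but nobody ate them."
print(mark_inserted_commas(before, after))
```

The user could then search for the marker in a text editor during step 6, keep the commas he agrees with and delete the rest.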
-----Original Message-----
From: Marc Schroeder [mailto:schroed at dfki.de] 
Sent: Wednesday, August 27, 2008 7:14 AM
To: Stephen Mack
Cc: mary-users at dfki.de
Subject: Re: [mary-users] English Language Prosody - Preprocessing Idea?
Thanks for the interesting thought.
I guess it boils down to the question, what information do you need (in 
terms of text analysis) in order to know where the commas used to be... 
would a full syntactic parse be required? Or is it typically before or 
after certain kinds of words (e.g., conjunctions: or, and, ...)?
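As a quick way to explore the "certain kinds of words" option without a parser, one could list every coordinating conjunction that is not already preceded by punctuation and check by ear whether a break belongs there. The following Python sketch is only an assumption about how such an experiment might look; the word list and the name candidate_breaks are illustrative and unrelated to the MARY code.

```python
import re

# Illustrative list of coordinating conjunctions; a real rule set would need tuning.
CONJUNCTIONS = {"and", "or", "but", "nor", "so", "yet"}


def candidate_breaks(sentence: str):
    """Return (token_index, token) for conjunctions not preceded by punctuation."""
    # Words (with an optional apostrophe part) or single punctuation marks.
    tokens = re.findall(r"\w+(?:'\w+)?|[^\w\s]", sentence)
    hits = []
    for idx, tok in enumerate(tokens):
        if idx == 0 or tok.lower() not in CONJUNCTIONS:
            continue
        # Previous token is a word, so no comma (or other mark) is there yet.
        if tokens[idx - 1][-1].isalnum():
            hits.append((idx, tok))
    return hits


print(candidate_breaks("The voice reads quickly and it skips pauses, but listeners notice."))
```

A purely lexical test like this will over-generate (for instance on "salt and pepper"), which is exactly the kind of case where it would become clear whether more syntactic information is needed.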
There is a component that currently does this kind of analysis, so it 
would be interesting to see if it is possible to add some rules for 
breaks without commas; the component is (in MARY 3.6) 
de.dfki.lt.mary.modules.en.Prosody, and it reads rules from the file 
MARY\ TTS/lib/modules/en/prosody/tobipredparams_english.xml
Volunteers for improving these rules are welcome! :-)
Stephen Mack schrieb:
> Marc,
> You have done a wonderful job developing Mary up to her present state.
> But here is a challenge that I have encountered that is not a function 
> of Mary, but rather the evolving nature of (American) English 
> punctuation, specifically the disappearing use of the comma.  There has 
> been no change in spoken English cadence given fewer commas in written 
> form.  That presents the TTS prosody problem of the voice not 
> recognizing natural breaks in actual elocution.  The result is that if a 
> block of contemporary English text is delivered to Mary without any 
> preprocessing, playback is generally much too rapid and unnatural.  Many 
> of the natural breaks of spoken English are missing because they are not 
> represented in written form as commas.
> So the onus (/Verpflichtung/) falls on the user to carefully markup his 
> text, adding additional punctuation and/or SSML <break> elements.  But 
> that requires a lot of back and forth work between a text editor and 
> Mary to get the playback right.
> There has been much debate on comma use by English grammarians.  Which 
> suggests the development of a Mary preprocessing module that would 
> insert additional commas.  A general comma augmentation scheme could be 
> set as a parameter, “none”, “moderate”, “strict”.   Rule sets would 
> correspond to gradations from lax to strict use models argued by the 
> grammarians.
> Say for example, a CommaAugment pre-processing routine were available.  
> Then the user would select a CommaAugment parameter, do an initial 
> playback and output the comma augmented transcript that is closest to 
> what he wants as a playback end-state.   He could then text-edit a final 
> mark-up with much less prosody customization.
> I suppose a user could write a pre-processing routine on his own by 
> coding a comma-based rule set in an external application.  But it seems 
> that it would be much more efficient to have the routine integrated 
> directly into Mary.
> Thoughts?
> Regards,
> Steve Mack
> Washington, DC
Dr. Marc Schröder, Senior Researcher at DFKI GmbH
Coordinator EU FP7 Project SEMAINE http://www.semaine-project.eu
Chair W3C Emotion ML Incubator http://www.w3.org/2005/Incubator/emotion
Portal Editor http://emotion-research.net
Team Leader DFKI Speech Group http://mary.dfki.de
Project Leader DFG project PAVOQUE http://mary.dfki.de/pavoque
Homepage: http://www.dfki.de/~schroed
Email: schroed at dfki.de
Phone: +49-681-302-5303
Postal address: DFKI GmbH, Campus D3_2, Stuhlsatzenhausweg 3, D-66123 
Saarbrücken, Germany