[mary-users] Duration and intonation modeling

Ingmar Steiner ingmar.steiner at dfki.de
Fri Apr 12 12:57:29 CEST 2013


Dear Trang,

On 12.04.13 12:44, Trang Nguyen wrote:
> Dear all,
>
> I have adopted Mary TTS in our research for Vietnamese, and thank you
> for a nice framework.

Excellent, glad you were able to make use of Mary! Would you be willing 
to share your code for Vietnamese by pushing it to a feature branch of 
your fork on github?

>
> I have some questions about duration and intonation modeling in Mary
> TTS:
> 1. How is duration modeled and synthesized in Mary TTS? I found that
> the durations in my voice are not well modeled; they are very short… How
> could I change or adapt this for our language? Do you have any materials
> that discuss this problem?

Conventionally, duration is modeled using CARTs (classification and 
regression trees). This might not be the best approach for some 
languages. Some debugging might uncover what is going wrong, but it's 
hard to say without knowing the details.
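
To make the CART idea concrete, here is a minimal sketch of a regression
tree that maps symbolic features to a duration. It is not the actual
MaryTTS implementation: the class names, the feature names
("phone_is_vowel", "stressed"), and the leaf values are all hypothetical
and chosen only for illustration.

```java
import java.util.Map;

/** Toy regression tree (CART) mapping symbolic features to a duration in seconds. */
final class DurationCartSketch {

    /** A tree node: either a leaf holding a duration or a question about one feature. */
    interface Node {
        double predict(Map<String, String> features);
    }

    /** Leaf: in a trained tree, typically the mean duration of the segments that reached it. */
    static final class Leaf implements Node {
        private final double seconds;

        Leaf(double seconds) {
            this.seconds = seconds;
        }

        @Override
        public double predict(Map<String, String> features) {
            return seconds;
        }
    }

    /** Decision node: asks whether a feature equals a given value, then descends. */
    static final class Decision implements Node {
        private final String feature;
        private final String value;
        private final Node yes;
        private final Node no;

        Decision(String feature, String value, Node yes, Node no) {
            this.feature = feature;
            this.value = value;
            this.yes = yes;
            this.no = no;
        }

        @Override
        public double predict(Map<String, String> features) {
            // A missing feature yields null, which never equals value, so we take the "no" branch.
            Node next = value.equals(features.get(feature)) ? yes : no;
            return next.predict(features);
        }
    }

    /** Hand-built example tree; feature names and leaf values are invented, not trained. */
    static Node exampleTree() {
        return new Decision("phone_is_vowel", "yes",
                new Decision("stressed", "yes", new Leaf(0.140), new Leaf(0.090)),
                new Leaf(0.060));
    }

    public static void main(String[] args) {
        Node tree = exampleTree();
        double d = tree.predict(Map.of("phone_is_vowel", "yes", "stressed", "no"));
        System.out.println("predicted duration: " + d + " s");
    }
}
```

Since a tree like this can only return values stored in its leaves,
checking the durations in the training data would be one reasonable
first debugging step when the predictions come out too short.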

Alternatively, we do have code for different models in place (HMMs, 
Sums-of-Products), which might well produce better results. See 
marytts-runtime/src/main/java/marytts/modules/acoustic for details.

> 2. From the source code of Mary TTS, it seems to allow processing of
> continuous-valued features, but I have found no way to do that. I have
> already developed a continuous prosody model, but I can't integrate it
> into Mary TTS to extract features for training and synthesis.

First of all, the prosody modelling in Mary can always be improved. But 
you raise an interesting point; could you please explain the problems 
you ran into, and the details of the continuous prosody model?

Best wishes,

-Ingmar

>
> If you have any suggestions or clues, I would be very grateful :).
>
> Thanks in advance.
>
> Trang.

-- 
/**
  * Dr. Ingmar Steiner
  *
  * Head of Independent Research Group
  * Multimodal Speech Processing
  * Cluster of Excellence MMCI
  *
  * Senior Researcher
  * Language Technology Lab
  * German Research Center for
  * Artificial Intelligence (DFKI GmbH)
  *
  * Adjunct Assistant Professor
  * Department of Computer Science
  * Saarland University
  *
  * Campus C7.4, Room 3.01
  * D-66123 Saarbrücken
  * @tel: +49-681-302-70028
  * @fax: +49-681-302-4317
  * @web: http://coli.uni-saarland.de/~steiner/
  */

