[mary-users] [mary-users at dfki.de] Target cost weight training

Jerome Perri jerome.perri at hotmail.com
Tue Apr 30 13:09:46 CEST 2013


Dear Ingmar,

I would like to give it a try with my approach anyway.
Perhaps my ignorance results in something not-seen-yet :-)

One thing that I am not sure of:

The festvox component produces the MCEPs. Okay. 
It produces 12 MCEPs at each pitchmark.

Now suppose, for example, that we have two instances of the phoneme "a:".
Both have the same number of pitchmarks (let's say 30).

I would simply say:

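float sum1 = 0;
for (int i = 0; i < 30; i++)        // 30 pitchmarks
    for (int j = 0; j < 12; j++)    // 12 MCEPs per pitchmark
        sum1 += MCEPs_For_Phoneme_1(i, j);

float sum2 = 0;
for (int i = 0; i < 30; i++)        // 30 pitchmarks
    for (int j = 0; j < 12; j++)    // 12 MCEPs per pitchmark
        sum2 += MCEPs_For_Phoneme_2(i, j);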

Now I want to calculate the spectral distance (how much they differ):

float distance = Math.abs(sum1 - sum2);
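
One possible weakness of summing everything first is that differences with opposite signs cancel out. A frame-by-frame comparison avoids this. The following is only a rough sketch under my own assumptions: the class and method names (McepDistance, sumDistance, meanFrameDistance) are made up for illustration and are not part of MaryTTS or festvox, and the MCEPs are assumed to be available as plain arrays with one row per pitchmark and the same number of rows for both units.

```java
public class McepDistance {

    /** The "sum everything, then subtract" distance described above. */
    public static double sumDistance(double[][] a, double[][] b) {
        return Math.abs(sumAll(a) - sumAll(b));
    }

    private static double sumAll(double[][] unit) {
        double sum = 0.0;
        for (double[] frame : unit) {
            for (double coefficient : frame) {
                sum += coefficient;
            }
        }
        return sum;
    }

    /**
     * Mean frame-wise Euclidean distance between two MCEP matrices.
     * Each row is one pitchmark (frame), each column one coefficient.
     * Assumption: both units have the same number of frames and coefficients.
     */
    public static double meanFrameDistance(double[][] a, double[][] b) {
        if (a.length == 0 || a.length != b.length) {
            throw new IllegalArgumentException("units need the same, non-zero number of frames");
        }
        double total = 0.0;
        for (int i = 0; i < a.length; i++) {
            if (a[i].length != b[i].length) {
                throw new IllegalArgumentException("frames need the same number of coefficients");
            }
            double squared = 0.0;
            for (int j = 0; j < a[i].length; j++) {
                double diff = a[i][j] - b[i][j];
                squared += diff * diff;
            }
            total += Math.sqrt(squared); // Euclidean distance for this frame
        }
        return total / a.length; // average over all frames
    }

    public static void main(String[] args) {
        // Toy data: 2 frames x 2 coefficients (real units would be 30 x 12).
        double[][] unit1 = { {1.0, 0.0}, {0.0, 1.0} };
        double[][] unit2 = { {0.0, 1.0}, {1.0, 0.0} };
        System.out.println("sum-based:  " + sumDistance(unit1, unit2));
        System.out.println("frame-wise: " + meanFrameDistance(unit1, unit2));
    }
}
```

With this toy data the sum-based distance is zero although the two units clearly differ, while the frame-wise distance is not. Units with different numbers of pitchmarks would need some kind of time alignment first, which this sketch does not attempt.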

I am not experienced with acoustic distances at all, so there is a good chance that the math is completely garbage.

Greetings,
Jerome





> Date: Tue, 30 Apr 2013 11:52:27 +0200
> From: ingmar.steiner at dfki.de
> To: jerome.perri at hotmail.com
> CC: mary-users at dfki.de
> Subject: Re: [mary-users] [mary-users at dfki.de] Target cost weight training
> 
> Dear Jerome,
> 
> in essence, you're giving a simplified description of how the acoustic 
> models are trained. But tuning target cost weights for unit selection is 
> a different question... In fact, even in the industry, it's mostly done 
> by hand for each voice (AFAIA). Efficient *automatic* weight tuning is 
> still an active research topic (see, e.g., 
> http://dx.doi.org/10.1016/j.specom.2011.01.004).
> 
> Best wishes,
> 
> -Ingmar
> 
> On 29.04.13 20:14, Jerome Perri wrote:
> > Hello!
> >
> > Can somebody tell me how I can automatically train target cost weights?
> >
> > My proposal is:
> >
> > - Make a list of all instances of the phoneme "a:".
> > - Compare the durations, MCEPs, and F0 values of all "a:" in this list.
> > - Look at how much a feature (for example "pos_in_word") influences the
> > differences.
> >
> > This would tell us which weight the feature "pos_in_word" should have.
> >
> > Is this a good approach?
> >
> > Jerome Perri
> 
> -- 
> /**
>   * Dr. Ingmar Steiner
>   *
>   * Head of Independent Research Group
>   * Multimodal Speech Processing
>   * Cluster of Excellence MMCI
>   *
>   * Senior Researcher
>   * Language Technology Lab
>   * German Research Center for
>   * Artificial Intelligence (DFKI GmbH)
>   *
>   * Adjunct Assistant Professor
>   * Department of Computer Science
>   * Saarland University
>   *
>   * Campus C7.4, Room 3.01
>   * D-66123 Saarbrücken
>   * @tel: +49-681-302-70028
>   * @fax: +49-681-302-4317
>   * @web: http://coli.uni-saarland.de/~steiner/
>   */
 		 	   		  