[mary-users] [mary-users at dfki.de] Target cost weight training
Jerome Perri
jerome.perri at hotmail.com
Tue Apr 30 13:09:46 CEST 2013
Dear Ingmar,
I would like to give it a try with my approach anyway.
Perhaps my ignorance will lead to something not seen yet :-)
One thing that I am not sure of:
The festvox component produces the MCEPs. Okay.
It produces 12 MCEPs at each pitchmark.
Now suppose, for example, we have two instances of the phoneme "a:".
Both have the same number of pitchmarks (let's say 30).
I would simply say:
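float sum1 = 0;                      // initialise before accumulating
for (int i = 0; i < 30; i++)         // 30 pitchmarks
    for (int j = 0; j < 12; j++)     // 12 MCEPs per pitchmark
        sum1 += MCEPs_For_Phoneme_1(i, j);
float sum2 = 0;
for (int i = 0; i < 30; i++)
    for (int j = 0; j < 12; j++)
        sum2 += MCEPs_For_Phoneme_2(i, j);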
Now I want to calculate the spectral distance (how much they differ):
float distance=abs(sum1-sum2);
I am not experienced with acoustic distances at all, so there is a good chance that the math is complete garbage.
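A note on the sum-based distance above: adding up all coefficients first lets differences cancel out, so two quite different spectra can end up with a distance of 0. A common alternative is to compare the two units frame by frame, for example with a Euclidean distance per pitchmark, averaged over the unit. The following is only a minimal sketch in Python under that assumption; the function names and the toy numbers are made up for illustration and are not part of MaryTTS or festvox.

```python
import math


def framewise_mcep_distance(mceps_a, mceps_b):
    """Mean Euclidean distance between corresponding frames.

    Each argument is a list of frames (one per pitchmark); each frame is a
    list of MCEP coefficients. Equal frame counts are assumed, as in the
    example with 30 pitchmarks per phoneme.
    """
    if not mceps_a or len(mceps_a) != len(mceps_b):
        raise ValueError("both units need the same, non-zero number of frames")
    total = 0.0
    for frame_a, frame_b in zip(mceps_a, mceps_b):
        # Euclidean distance between the two coefficient vectors of this frame
        total += math.sqrt(sum((a - b) ** 2 for a, b in zip(frame_a, frame_b)))
    return total / len(mceps_a)


def sum_distance(mceps_a, mceps_b):
    """The sum-based distance from the message, kept for comparison."""
    return abs(sum(map(sum, mceps_a)) - sum(map(sum, mceps_b)))


# Two toy "units" with 2 frames of 2 coefficients each (made-up numbers).
unit_1 = [[1.0, 0.0], [0.0, 1.0]]
unit_2 = [[0.0, 1.0], [1.0, 0.0]]

print(sum_distance(unit_1, unit_2))             # sums are equal, so this is 0.0
print(framewise_mcep_distance(unit_1, unit_2))  # frames differ, so this is > 0
```

With real data, each unit would hold 30 frames of 12 coefficients; units with different numbers of pitchmarks would need some form of time alignment first, which this sketch does not attempt.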
Greetings,
Jerome
> Date: Tue, 30 Apr 2013 11:52:27 +0200
> From: ingmar.steiner at dfki.de
> To: jerome.perri at hotmail.com
> CC: mary-users at dfki.de
> Subject: Re: [mary-users] [mary-users at dfki.de] Target cost weight training
>
> Dear Jerome,
>
> in essence, you're giving a simplified description of how the acoustic
> models are trained. But tuning target cost weights for unit selection is
> a different question... In fact, even in the industry, it's mostly done
> by hand for each voice (AFAIA). Efficient *automatic* weight tuning is
> still an active research topic (see, e.g.,
> http://dx.doi.org/10.1016/j.specom.2011.01.004).
>
> Best wishes,
>
> -Ingmar
>
> On 29.04.13 20:14, Jerome Perri wrote:
> > Hello!
> >
> > Can somebody tell me how I can automatically train target cost weights?
> >
> > My proposal is:
> >
> > - Make a list of all instances of the phoneme "a:".
> > - Compare the durations, MCEPs and F0 values of all "a:" in this list.
> > - Look at how much a feature (for example "pos_in_word") influences the
> > differences.
> >
> > This would tell us which weight the feature "pos_in_word" should have.
> >
> > Is this a good approach?
> >
> > Jerome Perri
>
> --
> /**
> * Dr. Ingmar Steiner
> *
> * Head of Independent Research Group
> * Multimodal Speech Processing
> * Cluster of Excellence MMCI
> *
> * Senior Researcher
> * Language Technology Lab
> * German Research Center for
> * Artificial Intelligence (DFKI GmbH)
> *
> * Adjunct Assistant Professor
> * Department of Computer Science
> * Saarland University
> *
> * Campus C7.4, Room 3.01
> * D-66123 Saarbrücken
> * @tel: +49-681-302-70028
> * @fax: +49-681-302-4317
> * @web: http://coli.uni-saarland.de/~steiner/
> */