[mary-users] F0CartTrainer: Long overlooked bug?
Ingmar Steiner
ingmar.steiner at dfki.de
Fri May 10 09:33:13 CEST 2013
Dear Jerome,
throughout the code, datagram boundaries are interpreted as pitchmarks
(however created). As a corollary, the inverse of the datagram duration
in seconds is by definition the F0 in Hz.
Of course, the interesting question is how this affects voiced-unvoiced
related processing, since this distinction is not preserved in the above
encoding... Solving this issue has been in the backlog for several
years, but we're looking for a way to maintain backward compatibility
with existing unit selection data files.
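
To make the relation concrete, here is a minimal sketch and not the actual
MaryTTS code. The class DatagramF0 and both of its methods are hypothetical
names introduced only for illustration. It assumes the duration is given in
samples, as in the quoted F0CartTrainer expression (sample rate divided by
getDuration()). The range check is only one conceivable heuristic for the
voiced-unvoiced question, and the 50-500 Hz bounds are arbitrary.

```java
// Hypothetical helper, not part of MaryTTS: illustrates the datagram/F0 relation only.
final class DatagramF0 {

    /** F0 in Hz for a datagram spanning one pitch period; duration is in samples. */
    static float f0FromDuration(long durationInSamples, int sampleRate) {
        if (durationInSamples <= 0 || sampleRate <= 0) {
            throw new IllegalArgumentException("duration and sample rate must be positive");
        }
        // duration in seconds = durationInSamples / sampleRate, so F0 is its inverse
        return sampleRate / (float) durationInSamples;
    }

    /** Assumed heuristic: only values inside the given range are treated as voiced F0. */
    static boolean isPlausibleF0(float f0, float minHz, float maxHz) {
        return f0 >= minHz && f0 <= maxHz;
    }

    public static void main(String[] args) {
        // A 160-sample datagram at 16 kHz lasts 10 ms, i.e. 100 Hz.
        float f0 = f0FromDuration(160, 16000);
        System.out.println(f0);                            // prints 100.0
        System.out.println(isPlausibleF0(f0, 50f, 500f));  // prints true
    }
}
```

A datagram from an unvoiced stretch still yields a number from this
formula, which is exactly why the encoding alone cannot tell the two cases
apart.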
Best wishes,
-Ingmar
On 08.05.13 20:18, Jerome Perri wrote:
> Dear Ingmar,
>
> thank you.
> I have not taken a closer look at how the datagrams are created.
> Do you mean that it is possible to get the f0 values from the pitchmarks
> as created by Praat?
> Or do you process the pitchmarks afterwards in such a way that one can
> read the f0 values from it?
>
> Greetings,
> Jerome
>
>
> > Date: Mon, 6 May 2013 11:50:34 +0200
> > From: ingmar.steiner at dfki.de
> > To: jerome.perri at hotmail.com
> > CC: mary-users at dfki.de
> > Subject: Re: [mary-users] F0CartTrainer: Long overlooked bug?
> >
> > Dear Jerome,
> >
> > thank you for your message.
> >
> > The Datagrams are pitch-synchronous sample packets, so during voiced
> > segments, the F0 is the inverse of their duration. I agree that this
> > relation is not explicit at first glance, and that the code could be
> > refactored for clarity.
> >
> > Best wishes,
> >
> > -Ingmar
> >
> > On 04.05.13 16:55, Jerome Perri wrote:
> > > Hello!
> > >
> > > In the F0CartTrainer I have seen something strange:
> > >
> > > Datagram[] midDatagrams =
> > >     waveTimeline.getDatagrams(unitFile.getUnit(mid), unitFile.getSampleRate());
> > > Datagram[] leftDatagrams =
> > >     waveTimeline.getDatagrams(unitFile.getUnit(first), unitFile.getSampleRate());
> > > Datagram[] rightDatagrams =
> > >     waveTimeline.getDatagrams(unitFile.getUnit(last), unitFile.getSampleRate());
> > > if (midDatagrams != null && midDatagrams.length > 0
> > >         && leftDatagrams != null && leftDatagrams.length > 0
> > >         && rightDatagrams != null && rightDatagrams.length > 0) {
> > >     float midF0 = waveTimeline.getSampleRate() /
> > >         (float) midDatagrams[midDatagrams.length/2].getDuration();
> > >
> > > waveTimeline is the file "timeline_waveforms.mry".
> > >
> > > In the above one can see that midF0 is calculated by:
> > >
> > > float midF0 = waveTimeline.getSampleRate() / (float)
> > > midDatagrams[midDatagrams.length/2].getDuration();
> > >
> > > The timeline ("waveTimeline" in the upper code) is fed by:
> > >
> > > /* Feed the datagram to the timeline */
> > > waveTimeline.feed(new Datagram(duration, buff.toByteArray()), globSampleRate);
> > >
> > > So the waveTimeline contains an index, a byte pointer and a duration.
> > >
> > > But it does not contain any F0 values.
> > >
> > > Jerome
> > >
> > >
> > > _______________________________________________
> > > Mary-users mailing list
> > > Mary-users at dfki.de
> > > http://www.dfki.de/mailman/cgi-bin/listinfo/mary-users
> > >