[mary-users] F0CartTrainer: Long overlooked bug?

Ingmar Steiner ingmar.steiner at dfki.de
Fri May 10 09:33:13 CEST 2013


Dear Jerome,

throughout the code, datagram boundaries are interpreted as pitchmarks 
(however created). As a corollary, the inverse of the datagram duration in 
seconds is by definition the F0 in Hz.
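
To make that relation concrete, here is a minimal sketch. It assumes the 
datagram duration is a sample count at the timeline sample rate, as the 
snippet quoted below suggests. The class DatagramF0 and the method 
f0FromDuration are hypothetical names for illustration only and are not 
part of the MaryTTS code base.

```java
/** Hypothetical helper (not part of MaryTTS): F0 from a pitch-synchronous datagram's duration. */
final class DatagramF0 {
    private DatagramF0() {}

    /**
     * @param durationInSamples datagram duration, assumed to be a sample count
     * @param sampleRate        timeline sample rate in Hz
     * @return F0 in Hz, i.e. the inverse of the datagram duration in seconds
     */
    static float f0FromDuration(long durationInSamples, int sampleRate) {
        if (durationInSamples <= 0 || sampleRate <= 0) {
            throw new IllegalArgumentException("duration and sample rate must be positive");
        }
        // duration [s] = durationInSamples / sampleRate,
        // hence F0 [Hz] = 1 / duration = sampleRate / durationInSamples
        return sampleRate / (float) durationInSamples;
    }

    public static void main(String[] args) {
        // A 160-sample datagram at 16 kHz lasts 10 ms, which corresponds to 100 Hz.
        System.out.println(f0FromDuration(160, 16000));
    }
}
```

Note that this value is only a meaningful F0 where the datagram boundaries 
really are pitchmarks, i.e. during voiced segments.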

Of course, the interesting question is how this affects voiced-unvoiced 
related processing, since this distinction is not preserved in the above 
encoding... Solving this issue has been in the backlog for several 
years, but we're looking for a way to maintain backward compatibility 
with existing unit selection data files.

Best wishes,

-Ingmar

On 08.05.13 20:18, Jerome Perri wrote:
> Dear Ingmar,
>
> thank you.
> I have not taken a closer look at how the datagrams are created.
> Do you mean that it is possible to get the f0 values from the pitchmarks
> as created by Praat?
> Or do you process the pitchmarks afterwards in such a way that one can
> read the f0 values from it?
>
> Greetings,
> Jerome
>
>
>  > Date: Mon, 6 May 2013 11:50:34 +0200
>  > From: ingmar.steiner at dfki.de
>  > To: jerome.perri at hotmail.com
>  > CC: mary-users at dfki.de
>  > Subject: Re: [mary-users] F0CartTrainer: Long overlooked bug?
>  >
>  > Dear Jerome,
>  >
>  > thank you for your message.
>  >
>  > The Datagrams are pitch-synchronous sample packets, so during voiced
>  > segments, the F0 is the inverse of their duration. I agree that this
>  > relation is not explicit at first glance, and that the code could be
>  > refactored for clarity.
>  >
>  > Best wishes,
>  >
>  > -Ingmar
>  >
>  > On 04.05.13 16:55, Jerome Perri wrote:
>  > > Hello!
>  > >
>  > > In the F0CartTrainer I have seen something strange:
>  > >
>  > > Datagram[] midDatagrams =
>  > >     waveTimeline.getDatagrams(unitFile.getUnit(mid), unitFile.getSampleRate());
>  > > Datagram[] leftDatagrams =
>  > >     waveTimeline.getDatagrams(unitFile.getUnit(first), unitFile.getSampleRate());
>  > > Datagram[] rightDatagrams =
>  > >     waveTimeline.getDatagrams(unitFile.getUnit(last), unitFile.getSampleRate());
>  > > if (midDatagrams != null && midDatagrams.length > 0
>  > >         && leftDatagrams != null && leftDatagrams.length > 0
>  > >         && rightDatagrams != null && rightDatagrams.length > 0) {
>  > >     float midF0 = waveTimeline.getSampleRate() /
>  > >         (float) midDatagrams[midDatagrams.length/2].getDuration();
>  > >
>  > > waveTimeline is read from the file "timeline_waveforms.mry".
>  > >
>  > > In the above one can see that midF0 is calculated by:
>  > >
>  > > float midF0 = waveTimeline.getSampleRate() / (float)
>  > > midDatagrams[midDatagrams.length/2].getDuration();
>  > >
>  > > The timeline ("waveTimeline" in the code above) is fed by:
>  > >
>  > > /* Feed the datagram to the timeline */
>  > > waveTimeline.feed( new Datagram(duration,
>  > > buff.toByteArray()), globSampleRate );
>  > >
>  > > So the waveTimeline contains an index, a byte pointer and a duration.
>  > >
>  > > But it does not contain any F0 values.
>  > >
>  > > Jerome
>  > >
>  > >
>  > > _______________________________________________
>  > > Mary-users mailing list
>  > > Mary-users at dfki.de
>  > > http://www.dfki.de/mailman/cgi-bin/listinfo/mary-users
>  > >

