[mary-users] exact Audio Format

Marc Schroeder schroed at dfki.de
Mon Mar 5 17:40:16 CET 2007


Hi Pierre,

walking down the corridor would have been just as easy, but this way we
can share this bit of information with others on the list ;-)

As the wav data is generated as an actual wav file, you could of course
get the exact audio format by parsing the wav header with your favourite
tool. By the way, make sure you are using MARY 3.0.3 or 3.1beta1,
because earlier versions had some trouble with the WAV headers in
streaming mode.
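
If you would rather read the header from code than with an external
tool, something like the following might do. It is only a minimal sketch
using the standard javax.sound.sampled API; the class name
WavHeaderProbe and the helper sampleWav are made up for illustration,
and sampleWav merely stands in for the bytes you would get back from the
MARY client. Note that a WAV container normally stores 16-bit PCM as
little-endian, so the header may well report little-endian even if the
synthesis code works with a big-endian format internally.

```java
import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.UnsupportedAudioFileException;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

class WavHeaderProbe {

    /** Reads the audio format from the header of WAV data held in memory. */
    static AudioFormat formatOf(byte[] wavBytes) {
        try (AudioInputStream in =
                 AudioSystem.getAudioInputStream(new ByteArrayInputStream(wavBytes))) {
            return in.getFormat();
        } catch (UnsupportedAudioFileException | IOException e) {
            throw new IllegalStateException("not readable as audio data", e);
        }
    }

    /**
     * Hypothetical stand-in for server output: a silent WAV file built from
     * 16-bit signed, mono, big-endian PCM at the given sample rate.
     */
    static byte[] sampleWav(float sampleRate, int frames) {
        AudioFormat fmt = new AudioFormat(AudioFormat.Encoding.PCM_SIGNED,
                sampleRate, 16, 1, 2, sampleRate, true);
        byte[] pcm = new byte[frames * fmt.getFrameSize()];
        AudioInputStream ais =
                new AudioInputStream(new ByteArrayInputStream(pcm), fmt, frames);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            AudioSystem.write(ais, AudioFileFormat.Type.WAVE, out);
        } catch (IOException e) {
            throw new IllegalStateException("could not write WAV data", e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        AudioFormat fmt = formatOf(sampleWav(16000f, 160));
        System.out.println("encoding:        " + fmt.getEncoding());
        System.out.println("sample rate:     " + fmt.getSampleRate());
        System.out.println("bits per sample: " + fmt.getSampleSizeInBits());
        System.out.println("channels:        " + fmt.getChannels());
        System.out.println("frame size:      " + fmt.getFrameSize());
        System.out.println("frame rate:      " + fmt.getFrameRate());
        System.out.println("big-endian:      " + fmt.isBigEndian());
    }
}
```

With real data you would pass the bytes received from the server to
formatOf instead of calling sampleWav.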

But I can also tell you the audio format directly. Take a look at the unit
selection code under
http://mary.opendfki.de/browser/trunk/java/de/dfki/lt/mary/unitselection/concat/BaseUnitConcatenator.java

The sample rate depends on the actual voice -- usually, it is either
16000 or 22050 Hz. The rest should be the same for all voices: signed
16-bit PCM, mono, big-endian.

        this.audioformat = new AudioFormat(AudioFormat.Encoding.PCM_SIGNED,
                sampleRate, // samples per second
                16, // bits per sample
                1, // mono
                2, // nr. of bytes per frame
                sampleRate, // nr. of frames per second
                true); // big-endian
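
For the conversion to U-Law at 8 kHz, one possible route is to let Java
Sound do it in two stages: first resample the PCM data to 8000 Hz, then
encode it as mu-law. The sketch below assumes that your Java runtime
ships a sample rate converter (it checks with
AudioSystem.isConversionSupported and gives up otherwise) and that the
input is raw PCM in the format of the constructor call above, without a
WAV header. The names UlawConversionSketch, maryFormat, toUlaw8k and
convert are hypothetical and not part of MARY.

```java
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

class UlawConversionSketch {

    /** 8-bit mu-law, mono, 8000 Hz: the telephony target from the question. */
    static final AudioFormat ULAW_8K = new AudioFormat(
            AudioFormat.Encoding.ULAW, 8000f, 8, 1, 1, 8000f, false);

    /** Same parameters as the constructor call quoted above. */
    static AudioFormat maryFormat(float sampleRate) {
        return new AudioFormat(AudioFormat.Encoding.PCM_SIGNED,
                sampleRate, 16, 1, 2, sampleRate, true);
    }

    /** Converts a mono 16-bit PCM stream to mu-law at 8000 Hz in two stages. */
    static AudioInputStream toUlaw8k(AudioInputStream source) {
        AudioFormat src = source.getFormat();
        // Stage 1: same encoding and byte order, but resampled to 8000 Hz.
        AudioFormat pcm8k = new AudioFormat(AudioFormat.Encoding.PCM_SIGNED,
                8000f, 16, 1, 2, 8000f, src.isBigEndian());
        if (!AudioSystem.isConversionSupported(pcm8k, src)) {
            throw new UnsupportedOperationException("no resampler for " + src);
        }
        AudioInputStream resampled = AudioSystem.getAudioInputStream(pcm8k, source);
        // Stage 2: encode the 8000 Hz PCM as mu-law, one byte per sample.
        return AudioSystem.getAudioInputStream(ULAW_8K, resampled);
    }

    /** Convenience wrapper: raw PCM bytes in, mu-law bytes out. */
    static byte[] convert(byte[] pcm, float sampleRate) {
        AudioFormat fmt = maryFormat(sampleRate);
        AudioInputStream in = new AudioInputStream(
                new ByteArrayInputStream(pcm), fmt, pcm.length / fmt.getFrameSize());
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        try (AudioInputStream ulaw = toUlaw8k(in)) {
            int n;
            while ((n = ulaw.read(buf)) > 0) { // stop at end of stream
                out.write(buf, 0, n);
            }
        } catch (IOException e) {
            throw new IllegalStateException("conversion failed", e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // One second of a 440 Hz tone at 16000 Hz as test input.
        byte[] pcm = new byte[2 * 16000];
        for (int i = 0; i < 16000; i++) {
            short s = (short) (8000 * Math.sin(2 * Math.PI * 440 * i / 16000.0));
            pcm[2 * i] = (byte) (s >> 8); // big-endian: high byte first
            pcm[2 * i + 1] = (byte) s;
        }
        byte[] ulaw = convert(pcm, 16000f);
        System.out.println("target format: " + ULAW_8K);
        System.out.println("mu-law bytes:  " + ulaw.length);
    }
}
```

If you start from the WAV output of the client instead of raw PCM, you
can obtain the source stream with AudioSystem.getAudioInputStream and
pass it to toUlaw8k directly.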

Cheers,
Marc

Pierre Lison schrieb:
> Hello all,
> 
> I'm a hiwi student at DFKI (in the Cognitive Systems project), and  
> I'm currently using Mary for the TTS module of our talking robots.  I  
> have a small technical question to ask you: what is the *exact* audio  
> format used in the output stream produced by the Mary client (i.e. by
> the process method), when "WAV" is chosen as the audio type?
> 
>    What I would need to know is:
> - the encoding format (ie. PCM, U-LAW, etc.) ;
> - the frame size ;
> - the frame rate ;
> - the sample rate ;
> - is it a big or little endian format ?
> 
> I want to pipe the output of the TTS to a stream over a (SIP-based)
> phone connection, where the audio format is U-Law 8 kHz, and in
> order to convert the audio data into this particular format, I need
> to know the exact initial format first! And I didn't find this
> information anywhere in the docs.
> 
> Thanks for your help !
> 
> 
> --
> Pierre Lison (student in Computational Linguistics)
> Blumenstraße 17,        Web:    www.coli.uni-saarland.de/~pierrel
> 66111 Saarbrücken       Mobile: +49.177.305.306.1
> 
> 
> 

-- 
Dr. Marc Schröder, Senior Researcher
DFKI GmbH, Stuhlsatzenhausweg 3, D-66123 Saarbrücken, Germany
http://www.dfki.de/~schroed
Here. Now. Real, first-person experience. Am I there to witness it?
--
official DFKI coordinates:
Deutsches Forschungszentrum fuer Kuenstliche Intelligenz GmbH
Trippstadter Strasse 122, D-67663 Kaiserslautern, Germany
Geschaeftsfuehrung:
Prof. Dr. Dr. h.c. mult. Wolfgang Wahlster (Vorsitzender)
Dr. Walter Olthoff
Vorsitzender des Aufsichtsrats: Prof. Dr. h.c. Hans A. Aukes
Amtsgericht Kaiserslautern, HRB 2313


