[mary-users] New voice creation

Mon Nov 9 21:01:31 CET 2015

Hi Ales,

Thank you for your detailed response. I have saved your notes for future
reference and I hope to get started on custom voice creation once more.
Thanks again!

On Sun, Nov 8, 2015 at 5:16 AM Aleś Bułojčyk <alex73mail at gmail.com> wrote:

> Hi Clifton.
>
> Yes, the existing voice creation software is not user friendly at all.
> Below are my notes on creating a voice, step by step with exact commands.
> I hope they help somebody create a new voice.
>
> Keep in mind that although I did create a voice, its quality is still far
> from acceptable. It looks like I got some parameters wrong, or even skipped
> a required step.
>
> WBR, Alex.
>
> Voice creation
> ==============
>
> These are my notes on HTS voice creation.
> The process is not simple: it requires a lot of software, most of which
> has to be compiled from source.
> Voice creation depends on speech recognition, which is why we need speech
> recognition software (HTK) too.
>
> Useful links
> ------------
>
> Some demos that show what quality of voice synthesis is possible:
>   http://www.cstr.ed.ac.uk/projects/festival/onlinedemo.html
>   http://festvox.org/voicedemos.html
>
> Some docs:
>   http://htk.eng.cam.ac.uk/docs/docs.shtml
>   http://www.voxforge.org/home/dev/acousticmodels/linux/create/htkjulius/tutorial
>
> Docs from MaryTTS:
>   https://github.com/marytts/marytts/wiki/HMMVoiceCreation
>   https://github.com/marytts/marytts/wiki/VoiceImportToolsTutorial#step-by-step-procedure
>
> Software
> --------
>
> I used Ubuntu 15.04 32-bit desktop for voice compiling.
>
> 1. Prepare system
>
>     sudo mkdir /voice
>     sudo chown <username> /voice
>     sudo apt-get install mc libc6-dev-i386 libx11-dev libncurses5-dev git \
>         sox tcl-snack g++
>
> 2. Compile HTS
>
> * Download
> http://hts.sp.nitech.ac.jp/archives/2.2/HTS-2.2_for_HTK-3.4.1.tar.bz2
> from http://hts.sp.nitech.ac.jp/?Download to /voice/sources/
> * Download
> http://htk.eng.cam.ac.uk/ftp/software/hdecode/HDecode-3.4.1.tar.gz from
> http://htk.eng.cam.ac.uk/prot-docs/hdecode.shtml to /voice/sources/
> * Download http://htk.eng.cam.ac.uk/ftp/software/HTK-3.4.1.tar.gz from
> http://htk.eng.cam.ac.uk/download.shtml to /voice/sources/
>
>     cd /voice/sources/
>     tar xf HTK-3.4.1.tar.gz
>     tar xf HDecode-3.4.1.tar.gz
>     mkdir hts
>     tar xf HTS-2.2_for_HTK-3.4.1.tar.bz2 -C hts
>     cd htk
>     patch -p1 -d . < ../hts/HTS-2.2_for_HTK-3.4.1.patch
>     ./configure --prefix=/voice/soft/hts
>     make all hdecode
>     make install install-hdecode
>
> Result: files /voice/soft/hts/bin/H*
>
> 3. Compile hts_engine 1.05
>
> * Download
> http://sourceforge.net/projects/hts-engine/files/hts_engine%20API/hts_engine_API-1.05/hts_engine_API-1.05.tar.gz/download
> from
> http://sourceforge.net/projects/hts-engine/files/hts_engine%20API/hts_engine_API-1.05/
> to /voice/sources/
>
>     cd /voice/sources/
>     tar xf hts_engine_API-1.05.tar.gz
>     cd hts_engine_API-1.05
>     ./configure --prefix=/voice/soft/hts_engine
>     make
>     make install
>
> Result: /voice/soft/hts_engine/bin/hts_engine
>
> 4. Install speech tools
>
> * Download
> http://festvox.org/packed/festival/2.4/speech_tools-2.4-release.tar.gz
> from http://festvox.org/festival/ to /voice/sources/
>
>     cd /voice/sources/
>     tar xf speech_tools-2.4-release.tar.gz
>     cd speech_tools/
>     ./configure
>     make
>     make test
>
> Result: the last command should show “Test OK”
>
> 5. Install Festvox
>
> * Download http://festvox.org/festvox-2.7/festvox-2.7.0-release.tar.gz
> from http://festvox.org/download.html to /voice/sources/
>
>     cd /voice/sources/
>     tar xf festvox-2.7.0-release.tar.gz
>     cd festvox/
>     ./configure
>     make
>
>
> 6. Install Festival
> * Download
> http://festvox.org/packed/festival/2.4/festival-2.4-release.tar.gz from
> http://festvox.org/festival/ to /voice/sources/
>
>     cd /voice/sources/
>     tar xf festival-2.4-release.tar.gz
>     cd festival/
>     ./configure
>     make
>
> 7. Install SPTK
> * Download http://downloads.sourceforge.net/sp-tk/SPTK-3.4.1.tar.gz to
> /voice/sources/
>
>     cd /voice/sources/
>     tar xf SPTK-3.4.1.tar.gz
>     cd SPTK-3.4.1/
>     ./configure --prefix=/voice/soft/SPTK
>     make
>     make install
>
> Result: files /voice/soft/SPTK/bin/*
>
> 8. Install Praat
> * Download http://www.fon.hum.uva.nl/praat/praat5417_linux64.tar.gz from
> http://www.fon.hum.uva.nl/praat/download_linux.html to /voice/sources/
>
>     cd /voice/sources/
>     mkdir /voice/soft/praat
>     tar xf praat5417_linux64.tar.gz -C /voice/soft/praat/
>
> 9. Install JDK to /voice/soft/java
>
>     cd /voice/soft
>     tar xf /voice/sources/jdk-8u60-linux-x64.tar.gz
>     mv jdk1.8.0_60/ java/
>
> Note: your JDK version may differ; adjust the archive and directory names
> accordingly.
>
> 10. Install maven
>
> * Download
> http://ftp.byfly.by/pub/apache.org/maven/maven-3/3.3.3/binaries/apache-maven-3.3.3-bin.tar.gz
> from https://maven.apache.org/download.cgi to /voice/sources/
>
>     cd /voice/soft/
>     tar xf /voice/sources/apache-maven-3.3.3-bin.tar.gz
>     mv apache-maven-3.3.3 maven
>
> 11. MaryTTS
>
> I used snapshot 65ee75ee8473e0d65853b0f8ccac1582ea0cb783.
>
>     cd /voice/sources/
>     git clone https://github.com/marytts/marytts.git
>     cd marytts/lib/external/ehmm
>     make
>     cd ..
>     PATH=$PATH:/voice/soft/hts/bin:/voice/soft/hts_engine/bin:/voice/soft/SPTK/bin:/voice/sources/marytts/lib/external/ehmm/bin \
>         ./check_install_external_programs.sh -check
>
> Result: the file externalBinaries.config. The check should report OK for
> every program.
>
>     cd /voice/sources/marytts
>     JAVA_HOME=/voice/soft/java /voice/soft/maven/bin/mvn install
>
> Result:
> /voice/sources/marytts/target/marytts-builder-5.2-SNAPSHOT/bin/voiceimport.sh
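
Each build step above names its result files, so one way to confirm the
installs before moving on is to check that the expected binaries exist and
are executable. The function below is only a sketch of that idea:
"missing_tools" is a hypothetical helper name of my own, not part of any of
the toolkits, and which tool names you pass in is up to you.

```shell
# Print every tool from the argument list that is missing from a bin directory.
# Usage: missing_tools DIR TOOL...
# (hypothetical helper, not shipped with HTK, HTS, SPTK or MaryTTS)
missing_tools() {
    dir=$1
    shift
    for tool in "$@"; do
        # -x is true only if the file exists and is executable
        [ -x "$dir/$tool" ] || printf '%s\n' "$tool"
    done
}
```

For example, "missing_tools /voice/soft/hts/bin HHEd HERest HDecode" should
print nothing if step 2 succeeded, and "missing_tools
/voice/soft/hts_engine/bin hts_engine" does the same for step 3.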
>
> Voice compiling: prepare
> ------------------------
>
> 1. Put the audio files into /voice/data/wav and the transcriptions into
> /voice/data/txt.done.data
>
> 2. Run the MaryTTS server in a separate terminal window:
>
>     cd /voice/sources/marytts/target/marytts-5.2-SNAPSHOT/bin
>     PATH=/voice/soft/java/bin:$PATH ./marytts-server
>
> 3. Run the voice import tool:
>
>     cd /voice/sources/marytts/target/marytts-builder-5.2-SNAPSHOT/bin
>     PATH=$PATH:/voice/soft/java/bin/:/voice/soft/praat:/voice/sources/speech_tools/main/:/voice/soft/hts/bin:/voice/soft/hts_engine/bin:/voice/soft/SPTK/bin:/voice/sources/marytts/lib/external/ehmm/bin \
>         ./voiceimport.sh
>
> 4. Choose directory /voice/data
>
> 5. Set the parameters (I was compiling a Russian voice named elena):
>
>     db.marybase                 = /voice/sources/marytts
>     db.estDir                   = /voice/sources/speech_tools/
>     db.samplingrate             = (as in source files)
>     db.locale                   = ru
>     HMMVoiceConfigure.dataSet   = rusian_set_name
>     HMMVoiceConfigure.speaker   = elena
>     HMMVoiceConfigure.lowerF0   = 80 (male=40, female=80)
>     HMMVoiceConfigure.upperF0   = 350 (male=280, female=350)
>     HMMVoiceConfigure.freqWarp  = 0.53
>     HMMVoiceConfigure.sampfreq  = 44100
>     HMMVoiceConfigure.frameLen, frameShift: see
>       http://www.dfki.de/pipermail/mary-users/2012-April/001189.html
>     HMMVoiceCompiler.mavenBin   = /voice/soft/maven/bin/mvn
>     EHMMLabeler.ehmmDir         = /voice/sources/marytts/lib/external/ehmm
>
> 6. Close the UI, then start voiceimport.sh again as in step 3. Check that
> the console output contains "Reading external binaries config file
> /voice/sources/marytts/lib/external/externalBinaries.config"
>
> 7. Run "PraatPitchmarker". Result: voice/pm/* files
>
> 8. Run "MCEPMaker". Result: voice/mcep/*.mcep files
>
> 9. Run "Festvox2MaryTranscripts". Result: voice/text/*.txt files
> (converted from voice/txt.done.data)
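
Before running the import components, it may be worth confirming that every
recording from step 1 has a matching prompt, since a missing transcription
only shows up much later. The sketch below assumes the usual Festvox prompt
layout, one line per utterance of the form ( utt_id "text" ), and that
utterance ids contain no regular expression metacharacters.
"wavs_without_prompt" is a hypothetical helper name of my own.

```shell
# Print the id of every WAV file that has no entry in txt.done.data.
# Usage: wavs_without_prompt WAV_DIR TXT_DONE_DATA
# (hypothetical helper; assumes prompt lines look like: ( utt_id "text" ) )
wavs_without_prompt() {
    wav_dir=$1
    prompts=$2
    for wav in "$wav_dir"/*.wav; do
        # If the directory has no .wav files the glob stays unexpanded; skip it
        [ -e "$wav" ] || continue
        id=$(basename "$wav" .wav)
        # Match "(", optional spaces, then the id followed by a space
        grep -q "^( *$id " "$prompts" || printf '%s\n' "$id"
    done
}
```

With the layout used here, the call would be "wavs_without_prompt
/voice/data/wav /voice/data/txt.done.data"; empty output means every
recording has a prompt.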
>
> Voice compiling: autolabeling
> -----------------------------
>
> 10. Run "AllophonesExtractor". Result: voice/prompt_allophones/*.xml
>
> 11. Run "EHMMLabeler". Result: voice/ehmm/* files. See
> https://github.com/marytts/marytts/wiki/HMMVoiceCreation for some
> details. This step takes several hours.
>
> 12. Run "LabelPauseDeleter" (set threshold=10 in settings). Result:
> voice/lab/*.lab
>
> 13. Run "PhoneUnitLabelComputer". Result: voice/phonelab/*.lab
>
> 14. Run "TranscriptionAligner". Result: voice/allphones/*.xml
>
> 15. Run "FeatureSelection". Result: voice/mary/features.txt
>
> 16. Run "PhoneUnitFeatureComputer". Result: voice/phonefeatures/*.pfeats
>
> 17. Run "PhoneLabelFeatureAligner". Check the console output: it should
> display "0 problems"
>
> After the previous steps you should have the following, the first three
> inside your voice building directory:
>
>     phonefeatures directory
>     phonelab directory
>     mary/features.txt file
>     $MARY_BASE/lib/external/externalBinaries.config
>
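The checklist above can be verified from the command line before starting
the long training steps. This is a minimal sketch, not a MaryTTS tool:
"check_labeling_outputs" is a hypothetical name, and it only tests that the
entries exist, not that their contents are valid.

```shell
# Check that the outputs of the labeling steps are in place.
# Usage: check_labeling_outputs VOICE_DIR MARY_BASE
# Prints one line per missing item; returns 0 only if nothing is missing.
# (hypothetical helper, not part of MaryTTS)
check_labeling_outputs() {
    voice_dir=$1
    mary_base=$2
    status=0
    for d in phonefeatures phonelab; do
        [ -d "$voice_dir/$d" ] || { echo "missing directory: $d"; status=1; }
    done
    [ -f "$voice_dir/mary/features.txt" ] \
        || { echo "missing file: mary/features.txt"; status=1; }
    [ -f "$mary_base/lib/external/externalBinaries.config" ] \
        || { echo "missing file: externalBinaries.config"; status=1; }
    return $status
}
```

With the paths used in these notes, the call would be
"check_labeling_outputs /voice/data /voice/sources/marytts".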
>
> Voice compiling: voice training
> -------------------------------
>
> 18. Run "HMMVoiceDataPreparation"
>
> 19. Run "HMMVoiceConfigure"
>
> 20. Run "HMMVoiceFeatureSelection". Result: mary/hmmFeatures.txt
>
> 21. Run "HMMVoiceMakeData"
>
> 22. Run "HMMVoiceMakeVoice". This step takes several hours.
>
> 23. Run "HMMVoiceCompiler". If this step fails with an error, go to
> /voice/data/mary/voice-my_voice-hsmm, change the version in pom.xml from
> 5.1.2 to 5.2-SNAPSHOT, and then compile manually with the command "cd
> /voice/data/mary/voice-my_voice-hsmm; JAVA_HOME=/voice/soft/java
> /voice/soft/maven/bin/mvn install"
>
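The manual pom.xml edit in step 23 could be scripted roughly as follows.
This is a sketch under an assumption: it replaces every occurrence of 5.1.2
in the file, which is broader than editing a single version element by hand,
so compare the result with the backup before building. "bump_mary_version"
is a hypothetical helper name.

```shell
# Replace version 5.1.2 with 5.2-SNAPSHOT throughout a pom.xml.
# Usage: bump_mary_version POM_FILE
# (hypothetical helper; rewrites ALL occurrences of 5.1.2, so review the result)
bump_mary_version() {
    # -i.bak edits the file in place and keeps the original as POM_FILE.bak
    sed -i.bak 's/5\.1\.2/5.2-SNAPSHOT/g' "$1"
}
```

After running "bump_mary_version /voice/data/mary/voice-my_voice-hsmm/pom.xml",
the original file remains available as pom.xml.bak, and the mvn install
command from step 23 can be run as written.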
>
> Once the voice is compiled, follow the instructions in
> Publishing-a-MARY-TTS-Voice to install the voice.
>
>