[mary-users] New voice creation
Aleś Bułojčyk
alex73mail at gmail.com
Sun Nov 8 14:15:44 CET 2015
Hi Clifton.
Yes, the existing voice creation software is not user friendly at all. Below
are my notes about creating a voice, step by step, with exact commands. I hope
they will help somebody create a new voice.
But keep in mind that although I created a voice, its quality is still far
from normal. It looks like I made a mistake in some parameters, or even
skipped a required step.
WBR, Alex.
Voice creation
==============
These are my notes about HTS voice creation.
It is not a simple process, because it requires a lot of software, most of
which has to be compiled from source.
Voice creation depends on speech recognition, so we need speech recognition
software too.
Useful links
------------
Some demos that show the quality voice synthesis can reach:
http://www.cstr.ed.ac.uk/projects/festival/onlinedemo.html
http://festvox.org/voicedemos.html
Some docs:
http://htk.eng.cam.ac.uk/docs/docs.shtml
http://www.voxforge.org/home/dev/acousticmodels/linux/create/htkjulius/tutorial
Docs from MaryTTS:
https://github.com/marytts/marytts/wiki/HMMVoiceCreation
https://github.com/marytts/marytts/wiki/VoiceImportToolsTutorial#step-by-step-procedure
Software
--------
I used Ubuntu 15.04 32-bit desktop for voice compiling.
1. Prepare system
sudo mkdir /voice
sudo chown <username> /voice
sudo apt-get install mc libc6-dev-i386 libx11-dev libncurses5-dev git sox tcl-snack g++
2. Compile HTS
* Download
http://hts.sp.nitech.ac.jp/archives/2.2/HTS-2.2_for_HTK-3.4.1.tar.bz2 from
http://hts.sp.nitech.ac.jp/?Download to /voice/sources/
* Download
http://htk.eng.cam.ac.uk/ftp/software/hdecode/HDecode-3.4.1.tar.gz from
http://htk.eng.cam.ac.uk/prot-docs/hdecode.shtml to /voice/sources/
* Download http://htk.eng.cam.ac.uk/ftp/software/HTK-3.4.1.tar.gz from
http://htk.eng.cam.ac.uk/download.shtml to /voice/sources/
cd /voice/sources/
tar xf HTK-3.4.1.tar.gz
tar xf HDecode-3.4.1.tar.gz
mkdir hts
tar xf HTS-2.2_for_HTK-3.4.1.tar.bz2 -C hts
cd htk
patch -p1 -d . < ../hts/HTS-2.2_for_HTK-3.4.1.patch
./configure --prefix=/voice/soft/hts
make all hdecode
make install install-hdecode
Result: files /voice/soft/hts/bin/H*
3. Compile hts_engine 1.05
* Download
http://sourceforge.net/projects/hts-engine/files/hts_engine%20API/hts_engine_API-1.05/hts_engine_API-1.05.tar.gz/download
from
http://sourceforge.net/projects/hts-engine/files/hts_engine%20API/hts_engine_API-1.05/
to /voice/sources/
cd /voice/sources/
tar xf hts_engine_API-1.05.tar.gz
cd hts_engine_API-1.05
./configure --prefix=/voice/soft/hts_engine
make
make install
Result: /voice/soft/hts_engine/bin/hts_engine
4. Install speech tools
* Download
http://festvox.org/packed/festival/2.4/speech_tools-2.4-release.tar.gz from
http://festvox.org/festival/ to /voice/sources/
cd /voice/sources/
tar xf speech_tools-2.4-release.tar.gz
cd speech_tools/
./configure
make
make test
Result: the last command should print “Test OK”
5. Install Festvox
* Download http://festvox.org/festvox-2.7/festvox-2.7.0-release.tar.gz from
http://festvox.org/download.html to /voice/sources/
cd /voice/sources/
tar xf festvox-2.7.0-release.tar.gz
cd festvox/
./configure
make
6. Install Festival
* Download
http://festvox.org/packed/festival/2.4/festival-2.4-release.tar.gz from
http://festvox.org/festival/ to /voice/sources/
cd /voice/sources/
tar xf festival-2.4-release.tar.gz
cd festival/
./configure
make
7. Install SPTK
* Download http://downloads.sourceforge.net/sp-tk/SPTK-3.4.1.tar.gz to
/voice/sources/
cd /voice/sources/
tar xf SPTK-3.4.1.tar.gz
cd SPTK-3.4.1/
./configure --prefix=/voice/soft/SPTK
make
make install
Result: files /voice/soft/SPTK/bin/*
8. Install Praat
* Download http://www.fon.hum.uva.nl/praat/praat5417_linux64.tar.gz from
http://www.fon.hum.uva.nl/praat/download_linux.html to /voice/sources/
cd /voice/sources/
mkdir /voice/soft/praat
tar xf praat5417_linux64.tar.gz -C /voice/soft/praat/
9. Install JDK to /voice/soft/java
cd /voice/soft
tar xf /voice/sources/jdk-8u60-linux-x64.tar.gz
mv jdk1.8.0_60/ java/
Note: you may have a different JDK version.
10. Install maven
* Download
http://ftp.byfly.by/pub/apache.org/maven/maven-3/3.3.3/binaries/apache-maven-3.3.3-bin.tar.gz
from https://maven.apache.org/download.cgi to /voice/sources/
cd /voice/soft/
tar xf /voice/sources/apache-maven-3.3.3-bin.tar.gz
mv apache-maven-3.3.3 maven
11. MaryTTS
cd /voice/sources/
git clone https://github.com/marytts/marytts.git
(I used snapshot 65ee75ee8473e0d65853b0f8ccac1582ea0cb783)
cd marytts/lib/external/ehmm
make
cd ..
PATH=$PATH:/voice/soft/hts/bin:/voice/soft/hts_engine/bin:/voice/soft/SPTK/bin:/voice/sources/marytts/lib/external/ehmm/bin
./check_install_external_programs.sh -check
Result: the file externalBinaries.config. The check should report OK for
every program.
cd /voice/sources/marytts
JAVA_HOME=/voice/soft/java /voice/soft/maven/bin/mvn install
Result:
/voice/sources/marytts/target/marytts-builder-5.2-SNAPSHOT/bin/voiceimport.sh
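Each of the steps above lists a "Result" location, so it may help to check them all in one go before moving on. Below is a minimal sketch of such a check. The function name missing_outputs is a hypothetical helper, not part of MaryTTS or HTS; the paths are the ones used in these notes, relative to a root that defaults to /voice.

```shell
# Hypothetical helper, not part of MaryTTS or HTS.
# Lists build results from the steps above that are missing under the
# given root (default /voice). Returns 0 if all are present, 1 otherwise.
missing_outputs() {
    root=${1:-/voice}
    rc=0
    # Steps 2 and 7 install many tools, so only check that the glob matches.
    for pattern in "soft/hts/bin/H*" "soft/SPTK/bin/*"; do
        set -- "$root"/$pattern
        [ -e "$1" ] || { echo "missing: $root/$pattern"; rc=1; }
    done
    # Steps 3 and 11 produce single, known files.
    for f in \
        soft/hts_engine/bin/hts_engine \
        sources/marytts/lib/external/externalBinaries.config \
        sources/marytts/target/marytts-builder-5.2-SNAPSHOT/bin/voiceimport.sh
    do
        [ -e "$root/$f" ] || { echo "missing: $root/$f"; rc=1; }
    done
    return $rc
}
```

Calling missing_outputs without arguments checks /voice; no output and exit status 0 mean every listed result was found. It only checks that files exist, not that the programs work.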
Voice compiling: prepare
------------------------
1. Put audio files into /voice/data/wav, text into /voice/data/txt.done.data
2. Run the MaryTTS server in a separate terminal window:
cd /voice/sources/marytts/target/marytts-5.2-SNAPSHOT/bin
PATH=/voice/soft/java/bin:$PATH ./marytts-server
3. Run the voice import tool:
cd /voice/sources/marytts/target/marytts-builder-5.2-SNAPSHOT/bin;
PATH=$PATH:/voice/soft/java/bin/:/voice/soft/praat:/voice/sources/speech_tools/main/:/voice/soft/hts/bin:/voice/soft/hts_engine/bin:/voice/soft/SPTK/bin:/voice/sources/marytts/lib/external/ehmm/bin
./voiceimport.sh
4. Choose the directory /voice/data
5. Set parameters (I was compiling the Russian voice elena):
db.marybase = /voice/sources/marytts
db.estDir = /voice/sources/speech_tools/
db.samplingrate = as in source files
db.locale = ru
HMMVoiceConfigure.dataSet = rusian_set_name
HMMVoiceConfigure.speaker = elena
HMMVoiceConfigure.lowerF0 = 80 (male=40, female=80)
HMMVoiceConfigure.upperF0 = 350 (male=280, female=350)
HMMVoiceConfigure.freqWarp = 0.53
HMMVoiceConfigure.sampfreq = 44100
HMMVoiceConfigure.frameLen, HMMVoiceConfigure.frameShift = see
http://www.dfki.de/pipermail/mary-users/2012-April/001189.html
HMMVoiceCompiler.mavenBin = /voice/soft/maven/bin/mvn
EHMMLabeler.ehmmDir = /voice/sources/marytts/lib/external/ehmm
6. Close the UI, then start voiceimport.sh again as in step 3. Check that
the console output contains "Reading external binaries config file
/voice/sources/marytts/lib/external/externalBinaries.config"
7. Run "PraatPitchmarker". Result: voice/pm/* files
8. Run "MCEPMaker". Result: voice/mcep/*.mcep files
9. Run "Festvox2MaryTranscripts". Result: voice/text/*.txt files
(converted from voice/txt.done.data)
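Since the recordings in wav/ and the lines of txt.done.data are prepared by hand in step 1, it may be worth checking that they refer to the same utterances. The sketch below is one way to do that. The function names are hypothetical helpers, and the script assumes the usual Festvox layout of txt.done.data, one utterance per line in the form ( utt_id "text" ), with recordings named utt_id.wav.

```shell
# Hypothetical helpers, not part of MaryTTS or Festvox.

# Print the utterance IDs listed in a Festvox-style txt.done.data file.
# Assumes one utterance per line in the form: ( utt_id "text" )
transcript_ids() {
    awk '$1 == "(" && NF >= 3 { print $2 }' "$1" | sort
}

# Print the base names (without .wav) of the recordings in a directory.
wav_ids() {
    for f in "$1"/*.wav; do
        [ -e "$f" ] || continue
        b=${f##*/}
        echo "${b%.wav}"
    done | sort
}

# Report IDs that have a recording but no transcript line, and vice versa.
# The data directory defaults to /voice/data; output is empty on a match.
unmatched_ids() {
    data=${1:-/voice/data}
    t1=$(mktemp)
    t2=$(mktemp)
    wav_ids "$data/wav" > "$t1"
    transcript_ids "$data/txt.done.data" > "$t2"
    comm -23 "$t1" "$t2" | sed 's/^/no transcript: /'
    comm -13 "$t1" "$t2" | sed 's/^/no recording: /'
    rm -f "$t1" "$t2"
}
```

Running unmatched_ids with no arguments inspects /voice/data. Lines in txt.done.data that do not follow the assumed form are silently skipped, so they show up as recordings without a transcript.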
Voice compiling: autolabeling
-----------------------------
10. Run "AllophonesExtractor". Result: voice/prompt_allophones/*.xml
11. Run "EHMMLabeler". Result: voice/ehmm/* files. See
https://github.com/marytts/marytts/wiki/HMMVoiceCreation for some details.
This step takes several hours.
12. Run "LabelPauseDeleter" (set threshold=10 in settings). Result:
voice/lab/*.lab
13. Run "PhoneUnitLabelComputer". Result: voice/phonelab/*.lab
14. Run "TranscriptionAligner". Result: voice/allophones/*.xml
15. Run "FeatureSelection". Result: voice/mary/features.txt
16. Run "PhoneUnitFeatureComputer". Result: voice/phonefeatures/*.pfeats
17. Run "PhoneLabelFeatureAligner". Check the console output: it should
display "0 problems"
As a result of the previous steps, you should have in your voice building
directory:
phonefeatures directory
phonelab directory
mary/features.txt file
$MARY_BASE/lib/external/externalBinaries.config
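The training steps run for hours, so a quick check of the items listed above before starting may save time. Here is a minimal sketch. The function names are hypothetical helpers, and the comparison of file counts rests on my assumption that there is one .pfeats file per .lab file; it is not something MaryTTS documents in this form.

```shell
# Hypothetical helpers, not part of MaryTTS.

# Count the existing files among the arguments (pass an unquoted glob).
count_files() {
    n=0
    for f in "$@"; do
        [ -e "$f" ] && n=$((n + 1))
    done
    echo "$n"
}

# Report what is missing before the training steps.
# $1 = voice building directory, $2 = MaryTTS base directory (db.marybase).
# Returns 0 if everything looks complete, 1 otherwise.
check_training_inputs() {
    voice=${1:-/voice/data}
    mary=${2:-/voice/sources/marytts}
    rc=0
    [ -d "$voice/phonefeatures" ] || { echo "missing: phonefeatures directory"; rc=1; }
    [ -d "$voice/phonelab" ] || { echo "missing: phonelab directory"; rc=1; }
    [ -f "$voice/mary/features.txt" ] || { echo "missing: mary/features.txt"; rc=1; }
    [ -f "$mary/lib/external/externalBinaries.config" ] ||
        { echo "missing: externalBinaries.config"; rc=1; }
    # Assumption: every .lab file should have a matching .pfeats file.
    labs=$(count_files "$voice"/phonelab/*.lab)
    feats=$(count_files "$voice"/phonefeatures/*.pfeats)
    [ "$labs" -eq "$feats" ] ||
        { echo "count mismatch: $labs .lab files vs $feats .pfeats files"; rc=1; }
    return $rc
}
```

With the layout from these notes, check_training_inputs can be called without arguments. It does not replace the "0 problems" check of PhoneLabelFeatureAligner; it only catches steps that were skipped.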
Voice compiling: voice training
-------------------------------
18. Run "HMMVoiceDataPreparation"
19. Run "HMMVoiceConfigure"
20. Run "HMMVoiceFeatureSelection". Result: mary/hmmFeatures.txt
21. Run "HMMVoiceMakeData"
22. Run "HMMVoiceMakeVoice". This step takes several hours.
23. Run "HMMVoiceCompiler". If this step finishes with an error, go to
/voice/data/mary/voice-my_voice-hsmm, edit pom.xml to change the version
from 5.1.2 to 5.2-SNAPSHOT, then compile manually with the command "cd
/voice/data/mary/voice-my_voice-hsmm; JAVA_HOME=/voice/soft/java
/voice/soft/maven/bin/mvn install"
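The pom.xml edit in step 23 can also be scripted. The sketch below is one possible way, not something the MaryTTS tools provide. The function name bump_pom_version is hypothetical, and it assumes the old version appears in pom.xml as element text, for example <version>5.1.2</version>; if your file writes it differently, edit it by hand.

```shell
# Hypothetical helper, not part of MaryTTS.
# Replaces the element text 5.1.2 with 5.2-SNAPSHOT in the given pom.xml.
# Assumes the version is written as >5.1.2< between tags; every such
# occurrence is changed. The original is kept as pom.xml.bak.
bump_pom_version() {
    pom=$1
    cp "$pom" "$pom.bak" &&
    sed 's|>5\.1\.2<|>5.2-SNAPSHOT<|g' "$pom.bak" > "$pom"
}
```

For the directory used in these notes the call would be bump_pom_version /voice/data/mary/voice-my_voice-hsmm/pom.xml, followed by the manual mvn install command from step 23. Compare pom.xml with pom.xml.bak afterwards to see what was changed.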
Once the voice is compiled, follow the instructions in
Publishing-a-MARY-TTS-Voice to install the voice