[mary-users] New voice creation

Aleś Bułojčyk alex73mail at gmail.com
Sun Nov 8 14:15:44 CET 2015


Hi Clifton.

Yes, the existing voice creation software is not user-friendly at all. Below
are my notes on creating a voice, step by step, with exact commands. I hope
they will help somebody create a new voice.

But keep in mind that although I created a voice, its quality is still far
from normal. It looks like I got some parameters wrong, or even skipped a
required step.

WBR, Alex.

Voice creation
==============

These are my notes on HTS voice creation.
It is not a simple process, because it requires a lot of software, and most
of it has to be compiled from source.
Voice creation depends on speech recognition, which is why we need speech
recognition software too.

Useful links
------------

Some demos that show the quality voice synthesis can reach:
  http://www.cstr.ed.ac.uk/projects/festival/onlinedemo.html
  http://festvox.org/voicedemos.html

Some docs:
  http://htk.eng.cam.ac.uk/docs/docs.shtml
  http://www.voxforge.org/home/dev/acousticmodels/linux/create/htkjulius/tutorial

Docs from MaryTTS:
  https://github.com/marytts/marytts/wiki/HMMVoiceCreation
  https://github.com/marytts/marytts/wiki/VoiceImportToolsTutorial#step-by-step-procedure

Software
--------

I used Ubuntu 15.04 32-bit desktop for voice compiling.

1. Prepare system

    sudo mkdir /voice
    sudo chown <username> /voice
    mkdir -p /voice/sources /voice/soft
    sudo apt-get install mc libc6-dev-i386 libx11-dev libncurses5-dev git \
        sox tcl-snack g++

2. Compile HTS

* Download
http://hts.sp.nitech.ac.jp/archives/2.2/HTS-2.2_for_HTK-3.4.1.tar.bz2 from
http://hts.sp.nitech.ac.jp/?Download to /voice/sources/
* Download
http://htk.eng.cam.ac.uk/ftp/software/hdecode/HDecode-3.4.1.tar.gz from
http://htk.eng.cam.ac.uk/prot-docs/hdecode.shtml to /voice/sources/
* Download http://htk.eng.cam.ac.uk/ftp/software/HTK-3.4.1.tar.gz from
http://htk.eng.cam.ac.uk/download.shtml to /voice/sources/

    cd /voice/sources/
    tar xf HTK-3.4.1.tar.gz
    tar xf HDecode-3.4.1.tar.gz
    mkdir hts
    tar xf HTS-2.2_for_HTK-3.4.1.tar.bz2 -C hts
    cd htk
    patch -p1 -d . < ../hts/HTS-2.2_for_HTK-3.4.1.patch
    ./configure --prefix=/voice/soft/hts
    make all hdecode
    make install install-hdecode

Result: files /voice/soft/hts/bin/H*
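
To confirm this result, a small shell helper like the one below could check
that the expected programs were installed. This is only a sketch: the
function name check_tools is made up for illustration, and the program names
passed to it are whatever you expect the build to have produced.

```shell
# check_tools DIR NAME...
# Prints one line per program and returns non-zero if any of them is
# missing from DIR or is not executable. (Hypothetical helper.)
check_tools() {
    dir=$1
    shift
    missing=0
    for name in "$@"; do
        if [ -x "$dir/$name" ]; then
            echo "ok $dir/$name"
        else
            echo "MISSING $dir/$name"
            missing=1
        fi
    done
    return "$missing"
}
```

For example, "check_tools /voice/soft/hts/bin HCopy HERest HHEd HVite HDecode"
should print an ok line for each program after a successful build. The same
helper can be pointed at the hts_engine and SPTK install directories from the
later steps.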

3. Compile hts_engine 1.05

* Download
http://sourceforge.net/projects/hts-engine/files/hts_engine%20API/hts_engine_API-1.05/hts_engine_API-1.05.tar.gz/download
from
http://sourceforge.net/projects/hts-engine/files/hts_engine%20API/hts_engine_API-1.05/
to /voice/sources/

    cd /voice/sources/
    tar xf hts_engine_API-1.05.tar.gz
    cd hts_engine_API-1.05
    ./configure --prefix=/voice/soft/hts_engine
    make
    make install

Result: /voice/soft/hts_engine/bin/hts_engine

4. Install speech tools

* Download
http://festvox.org/packed/festival/2.4/speech_tools-2.4-release.tar.gz from
http://festvox.org/festival/ to /voice/sources/

    cd /voice/sources/
    tar xf speech_tools-2.4-release.tar.gz
    cd speech_tools/
    ./configure
    make
    make test

Result: the last command should show “Test OK”

5. Install Festvox

* Download http://festvox.org/festvox-2.7/festvox-2.7.0-release.tar.gz from
http://festvox.org/download.html to /voice/sources/

    cd /voice/sources/
    tar xf festvox-2.7.0-release.tar.gz
    cd festvox/
    ./configure
    make


6. Install Festival

* Download
http://festvox.org/packed/festival/2.4/festival-2.4-release.tar.gz from
http://festvox.org/festival/ to /voice/sources/

    cd /voice/sources/
    tar xf festival-2.4-release.tar.gz
    cd festival/
    ./configure
    make

7. Install SPTK

* Download http://downloads.sourceforge.net/sp-tk/SPTK-3.4.1.tar.gz to
/voice/sources/

    cd /voice/sources/
    tar xf SPTK-3.4.1.tar.gz
    cd SPTK-3.4.1/
    ./configure --prefix=/voice/soft/SPTK
    make
    make install

Result: files /voice/soft/SPTK/bin/*

8. Install Praat

* Download http://www.fon.hum.uva.nl/praat/praat5417_linux64.tar.gz from
http://www.fon.hum.uva.nl/praat/download_linux.html to /voice/sources/

    cd /voice/sources/
    mkdir /voice/soft/praat
    tar xf praat5417_linux64.tar.gz -C /voice/soft/praat/

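9. Install JDK to /voice/soft/java

* Download a JDK archive (I used jdk-8u60-linux-x64.tar.gz) to
/voice/sources/

    cd /voice/soft
    tar xf /voice/sources/jdk-8u60-linux-x64.tar.gz
    mv jdk1.8.0_60/ java/

Note: your JDK version may differ; adjust the file and directory names
accordingly.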

10. Install Maven

* Download
http://ftp.byfly.by/pub/apache.org/maven/maven-3/3.3.3/binaries/apache-maven-3.3.3-bin.tar.gz
from https://maven.apache.org/download.cgi to /voice/sources/

    cd /voice/soft/
    tar xf /voice/sources/apache-maven-3.3.3-bin.tar.gz
    mv apache-maven-3.3.3 maven

11. MaryTTS

    cd /voice/sources/
    git clone https://github.com/marytts/marytts.git
    # I used snapshot 65ee75ee8473e0d65853b0f8ccac1582ea0cb783
    cd marytts/lib/external/ehmm
    make
    cd ..
    PATH=$PATH:/voice/soft/hts/bin:/voice/soft/hts_engine/bin:/voice/soft/SPTK/bin:/voice/sources/marytts/lib/external/ehmm/bin \
        ./check_install_external_programs.sh -check

Result: the file externalBinaries.config. The check should report OK for
every program.

    cd /voice/sources/marytts
    JAVA_HOME=/voice/soft/java /voice/soft/maven/bin/mvn install

Result:
/voice/sources/marytts/target/marytts-builder-5.2-SNAPSHOT/bin/voiceimport.sh
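
The PATH prefixes used in these notes are long and easy to mistype. One
possible way to avoid retyping them is a small file that is sourced in each
terminal. This is a sketch only: the file name /voice/env.sh, the VOICE_ROOT
variable and the voice_tool_path function are assumptions for illustration.
The directory list just collects the locations that these notes put on PATH
when running check_install_external_programs.sh and voiceimport.sh.

```shell
# Sketch of a hypothetical /voice/env.sh; use it with:  . /voice/env.sh
VOICE_ROOT=${VOICE_ROOT:-/voice}

# voice_tool_path ROOT
# Prints the tool directories under ROOT as one colon-separated string.
voice_tool_path() {
    root=$1
    first=1
    for d in soft/java/bin soft/praat sources/speech_tools/main \
             soft/hts/bin soft/hts_engine/bin soft/SPTK/bin \
             sources/marytts/lib/external/ehmm/bin; do
        if [ "$first" -eq 1 ]; then
            printf '%s' "$root/$d"
            first=0
        else
            printf ':%s' "$root/$d"
        fi
    done
    printf '\n'
}

# The notes append these directories to PATH, so this sketch does the same.
JAVA_HOME=$VOICE_ROOT/soft/java
PATH=$PATH:$(voice_tool_path "$VOICE_ROOT")
export JAVA_HOME PATH
```

After sourcing such a file, the tools can be started without the inline
PATH=... prefix. Directories that do not exist yet are harmless on PATH, so
the file can be created before all the software is installed.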

Voice compiling: prepare
------------------------

1. Put audio files into /voice/data/wav, text into /voice/data/txt.done.data

2. Run the MaryTTS server in a separate terminal window:

    cd /voice/sources/marytts/target/marytts-5.2-SNAPSHOT/bin
    PATH=/voice/soft/java/bin:$PATH ./marytts-server

3. Run the voice import tool:

    cd /voice/sources/marytts/target/marytts-builder-5.2-SNAPSHOT/bin
    PATH=$PATH:/voice/soft/java/bin/:/voice/soft/praat:/voice/sources/speech_tools/main/:/voice/soft/hts/bin:/voice/soft/hts_engine/bin:/voice/soft/SPTK/bin:/voice/sources/marytts/lib/external/ehmm/bin \
        ./voiceimport.sh

4. Choose directory /voice/data

5. Set parameters (I was compiling a Russian voice named elena):

    db.marybase                  = /voice/sources/marytts
    db.estDir                    = /voice/sources/speech_tools/
    db.samplingrate              = as in the source files
    db.locale                    = ru
    HMMVoiceConfigure.dataSet    = rusian_set_name
    HMMVoiceConfigure.speaker    = elena
    HMMVoiceConfigure.lowerF0    = 80  (male=40,  female=80)
    HMMVoiceConfigure.upperF0    = 350 (male=280, female=350)
    HMMVoiceConfigure.freqWarp   = 0.53
    HMMVoiceConfigure.sampfreq   = 44100
    HMMVoiceConfigure.frameLen, HMMVoiceConfigure.frameShift: see
        http://www.dfki.de/pipermail/mary-users/2012-April/001189.html
    HMMVoiceCompiler.mavenBin    = /voice/soft/maven/bin/mvn
    EHMMLabeler.ehmmDir          = /voice/sources/marytts/lib/external/ehmm

6. Close the UI, then start voiceimport.sh again as in step 3. Check that the
console output contains "Reading external binaries config file
/voice/sources/marytts/lib/external/externalBinaries.config"

7. Run "PraatPitchmarker". Result: voice/pm/* files

8. Run "MCEPMaker". Result: voice/mcep/*.mcep files

9. Run "Festvox2MaryTranscripts". Result: voice/text/*.txt files
(converted from voice/txt.done.data)
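
Before running the import components, it can be worth checking that every
utterance listed in txt.done.data actually has a recording in the wav
directory. The sketch below assumes the usual Festvox layout, one utterance
per line in the form ( utt_id "text" ), and assumes each recording is named
utt_id.wav. The function name missing_wavs is made up for illustration.

```shell
# missing_wavs TXT_DONE_DATA WAV_DIR
# Prints the id of every utterance that has no WAV_DIR/<id>.wav file.
# Assumes lines of the form:  ( utt_id "text" )
missing_wavs() {
    prompts=$1
    wav_dir=$2
    # Field 1 is the opening parenthesis, field 2 is the utterance id.
    awk '$1 == "(" { print $2 }' "$prompts" | while read -r id; do
        [ -f "$wav_dir/$id.wav" ] || echo "$id"
    done
}
```

With the layout from step 1, "missing_wavs /voice/data/txt.done.data
/voice/data/wav" prints nothing when every utterance has a recording.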

Voice compiling: autolabeling
-----------------------------

10. Run "AllophonesExtractor". Result: voice/prompt_allophones/*.xml

11. Run "EHMMLabeler". Result: voice/ehmm/* files. See
https://github.com/marytts/marytts/wiki/HMMVoiceCreation for some details.
This step takes several hours.

12. Run "LabelPauseDeleter" (set threshold=10 in settings). Result:
voice/lab/*.lab

13. Run "PhoneUnitLabelComputer". Result: voice/phonelab/*.lab

14. Run "TranscriptionAligner". Result: voice/allphones/*.xml

15. Run "FeatureSelection". Result: voice/mary/features.txt

16. Run "PhoneUnitFeatureComputer". Result: voice/phonefeatures/*.pfeats

17. Run "PhoneLabelFeatureAligner". Check the console output; it should
display "0 problems"

After the previous steps, you should have the following (paths without
$MARY_BASE are relative to your voice building directory):

    phonefeatures directory
    phonelab directory
    mary/features.txt file
    $MARY_BASE/lib/external/externalBinaries.config
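
The checklist above can also be verified with a few lines of shell before
starting the training steps. A minimal sketch, assuming the voice building
directory and the MaryTTS checkout are passed as arguments; the function
name training_inputs_ready is hypothetical.

```shell
# training_inputs_ready VOICE_DIR MARY_BASE
# Reports each missing item from the checklist and returns non-zero if
# anything is missing.
training_inputs_ready() {
    voice_dir=$1
    mary_base=$2
    rc=0
    for d in "$voice_dir/phonefeatures" "$voice_dir/phonelab"; do
        [ -d "$d" ] || { echo "missing directory: $d"; rc=1; }
    done
    for f in "$voice_dir/mary/features.txt" \
             "$mary_base/lib/external/externalBinaries.config"; do
        [ -f "$f" ] || { echo "missing file: $f"; rc=1; }
    done
    return "$rc"
}
```

With the paths used in these notes, the call would be
"training_inputs_ready /voice/data /voice/sources/marytts".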


Voice compiling: voice training
-------------------------------

18. Run "HMMVoiceDataPreparation"

19. Run "HMMVoiceConfigure"

20. Run "HMMVoiceFeatureSelection". Result: mary/hmmFeatures.txt

21. Run "HMMVoiceMakeData"

22. Run "HMMVoiceMakeVoice". This step takes several hours.

23. Run "HMMVoiceCompiler". If this step fails with an error, go to
/voice/data/mary/voice-my_voice-hsmm, change the version in pom.xml from
5.1.2 to 5.2-SNAPSHOT, then compile manually:

    cd /voice/data/mary/voice-my_voice-hsmm
    JAVA_HOME=/voice/soft/java /voice/soft/maven/bin/mvn install
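
The manual pom.xml fix in step 23 can be scripted. The sketch below is an
assumption about one way to do it, not something the voice import tools
provide: the hypothetical set_pom_version function replaces every occurrence
of the old version string in the file, so check the result before building.

```shell
# set_pom_version POM OLD NEW
# Rewrites POM in place, replacing each literal OLD version string with NEW.
# Assumes neither version string contains "/" or "&".
set_pom_version() {
    pom=$1
    # Escape the dots so that "5.1.2" only matches literal dots.
    old=$(printf '%s' "$2" | sed 's/[.]/\\./g')
    new=$3
    sed "s/$old/$new/g" "$pom" > "$pom.tmp" && mv "$pom.tmp" "$pom"
}
```

Run inside /voice/data/mary/voice-my_voice-hsmm, the call would be
"set_pom_version pom.xml 5.1.2 5.2-SNAPSHOT", followed by the mvn install
command from step 23.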


Once the voice is compiled, follow the instructions in
Publishing-a-MARY-TTS-Voice to install the voice.