Qualitative Evaluation and Error Analysis of Phonetic Segmentation

Arif Khan, Ingmar Steiner

In: Jürgen Trouvain, Ingmar Steiner, Bernd Möbius (eds.). 28th Conference on Electronic Speech Signal Processing (ESSV). Elektronische Sprachsignalverarbeitung (ESSV), 28th, March 15-17, Saarbrücken, Germany, pages 138-144, TUD Press, Dresden, 3/2017.


Speech segmentation is the process of identifying the boundaries between units of speech such as words, syllables, and phones, and splitting the signal at those boundaries. This paper focuses on the automatic phonetic segmentation of speech and the methods used for its evaluation. We explain the current methods used for the evaluation of speech segmentation and highlight the details that have not been sufficiently addressed in the literature. Several metrics are explained for the analysis. The phones are grouped into several classes and the transitions between phone classes are examined. We found that most of the errors come from class transitions that are also difficult for humans to segment.
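
As a rough illustration of the kind of analysis described above, the following Python sketch groups boundary deviations by phone-class transition. It is not the authors' implementation: the phone-to-class table, the function name `transition_errors`, the input layout, and the 20 ms tolerance default are all assumptions made for this example (a 20 ms threshold is a common convention in the segmentation literature, but the abstract does not state which thresholds or metrics the paper uses).

```python
from collections import defaultdict

# Hypothetical phone-to-class mapping, for illustration only;
# the paper's actual phone classes are not given in the abstract.
PHONE_CLASS = {
    "a": "vowel", "i": "vowel",
    "p": "plosive", "t": "plosive",
    "s": "fricative",
    "n": "nasal",
    "_": "silence",
}


def transition_errors(reference, predicted, tolerance=0.020):
    """Summarize boundary deviations per phone-class transition.

    Both inputs are lists of (phone, end_time_in_seconds) over the same
    phone sequence. The boundary between phone k and phone k+1 is taken
    to be the end time of phone k.
    """
    if [p for p, _ in reference] != [p for p, _ in predicted]:
        raise ValueError("phone sequences must match")

    # Collect absolute deviations keyed by (left class, right class).
    deviations = defaultdict(list)
    for k in range(len(reference) - 1):
        left_phone, ref_time = reference[k]
        right_phone = reference[k + 1][0]
        pred_time = predicted[k][1]
        key = (PHONE_CLASS[left_phone], PHONE_CLASS[right_phone])
        deviations[key].append(abs(pred_time - ref_time))

    # Per transition: boundary count, mean absolute deviation, and the
    # fraction of boundaries that fall within the tolerance.
    stats = {}
    for key, devs in deviations.items():
        within = sum(d <= tolerance for d in devs)
        stats[key] = {
            "count": len(devs),
            "mean_abs_dev": sum(devs) / len(devs),
            "within_tolerance": within / len(devs),
        }
    return stats
```

Sorting the resulting dictionary by `within_tolerance` would show which class transitions contribute most of the errors, which is the kind of breakdown the abstract refers to.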

