

EM-basierte maschinelle Lernverfahren für natürliche Sprachen (EM-Based Machine Learning Methods for Natural Languages)

Detlef Prescher
PhD thesis, Universität Stuttgart, Institut für Maschinelle Sprachverarbeitung (IMS), AIMS Report, No. 8(1), 2002.


This thesis presents the Expectation-Maximization algorithm (EM algorithm; Dempster et al., 1977) in both its theoretical and practical aspects. The EM algorithm is the stochastic basis of many machine learning algorithms for natural language processing. The theoretical part explains the stochastic basis of linguistics and the formal basis of the EM algorithm. The practical part presents a probabilistic clustering method for multivariate linguistic data and the stochastic modeling of lexicalized grammars.
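
As a rough illustration of the kind of EM-based probabilistic clustering the abstract refers to, the following minimal sketch fits a latent class model of the form p(x, y) = Σ_c p(c) p(x|c) p(y|c) to counts of word pairs. The model form, the function name `em_latent_class`, and the toy verb-noun counts are assumptions made for illustration; they are not taken from the thesis and may differ from the method it actually develops.

```python
import math
import random
from collections import defaultdict


def em_latent_class(pair_counts, num_classes, iterations=20, seed=0):
    """Fit p(x, y) = sum_c p(c) p(x|c) p(y|c) by EM (illustrative sketch).

    pair_counts maps (x, y) pairs to positive frequencies.
    Returns ((p_c, p_x, p_y), history), where history[i] is the data
    log-likelihood under the parameters used in iteration i.
    """
    rng = random.Random(seed)
    xs = sorted({x for x, _ in pair_counts})
    ys = sorted({y for _, y in pair_counts})

    def random_dist(keys):
        # Strictly positive random weights, normalized to sum to 1.
        weights = [rng.random() + 0.1 for _ in keys]
        total = sum(weights)
        return {k: w / total for k, w in zip(keys, weights)}

    p_c = list(random_dist(range(num_classes)).values())
    p_x = [random_dist(xs) for _ in range(num_classes)]
    p_y = [random_dist(ys) for _ in range(num_classes)]
    history = []

    for _ in range(iterations):
        exp_c = [0.0] * num_classes
        exp_x = [defaultdict(float) for _ in range(num_classes)]
        exp_y = [defaultdict(float) for _ in range(num_classes)]
        log_likelihood = 0.0

        # E-step: distribute each observed count over the latent classes
        # in proportion to the posterior p(c | x, y).
        for (x, y), freq in pair_counts.items():
            joint = [p_c[c] * p_x[c][x] * p_y[c][y] for c in range(num_classes)]
            marginal = sum(joint)
            log_likelihood += freq * math.log(marginal)
            for c in range(num_classes):
                expected = freq * joint[c] / marginal
                exp_c[c] += expected
                exp_x[c][x] += expected
                exp_y[c][y] += expected

        history.append(log_likelihood)

        # M-step: re-estimate parameters as normalized expected counts.
        total = sum(exp_c)
        for c in range(num_classes):
            p_c[c] = exp_c[c] / total
            p_x[c] = {x: exp_x[c][x] / exp_c[c] for x in xs}
            p_y[c] = {y: exp_y[c][y] / exp_c[c] for y in ys}

    return (p_c, p_x, p_y), history


# Hypothetical toy data: verb-noun pair frequencies.
toy_counts = {
    ("eat", "apple"): 4, ("eat", "bread"): 3,
    ("drink", "water"): 5, ("drink", "tea"): 2,
    ("eat", "soup"): 1, ("drink", "soup"): 1,
}

(p_c, p_x, p_y), history = em_latent_class(toy_counts, num_classes=2)
print("class priors:", p_c)
print("log-likelihood, first and last iteration:", history[0], history[-1])
```

The recorded log-likelihood values should never decrease from one iteration to the next, which is the standard monotonicity property of EM. The sketch omits smoothing and underflow protection, so it is only suitable for small examples.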