Skip to main content Skip to main navigation


A Uniform Method for Automatically Extracting Stochastic Lexicalized Tree Grammars from Treebanks and HPSG

Günter Neumann
In: o.A.. Building and using Parsed Corpora. Language and Speech series, Dordrecht, Kluwer, 2003.


We present a uniform method for the extraction of stochastic lexicalized tree grammars (SLTG) of different complexities from existing treebanks as well as from competence-based grammars , which allows us to analyze the relationship of a grammar automatically induced from a treebank with respect to its size, its complexity, and its predictive power on unseen data. Processing of different SLTG is performed by a stochastic version of the two-step Early-based parsing strategy introduced in Schabes and Joshi, 1991.