DFKI-LT - A Uniform Method for Automatically Extracting Stochastic Lexicalized Tree Grammars from Treebanks and HPSG
A Uniform Method for Automatically Extracting Stochastic Lexicalized Tree Grammars from Treebanks and HPSG
1 Building and using Parsed Corpora,
We present a uniform method for the extraction of stochastic lexicalized tree grammars (SLTG) of different complexities from existing treebanks as well as from competence-based grammars , which allows us to analyze the relationship of a grammar automatically induced from a treebank with respect to its size, its complexity, and its predictive power on unseen data. Processing of different SLTG is performed by a stochastic version of the two-step Early-based parsing strategy introduced in Schabes and Joshi, 1991.
Files: BibTeX, kluwer-tag+.pdf