Data-driven Approaches to Head-driven Phrase Structure Grammar

Günter Neumann

In: Rens Bod, Remko Scha (eds.). DATA-ORIENTED PARSING. Center for the Study of Language and Information - Studies in Computat CSLI Publications, University of Chicago Press 2003.


We present HPSG-DOP, a method for automatically extracting a Stochastic Lexicalized Tree Grammar (SLTG) from an HPSG source grammar and a given corpus. An SLTG is processed by a specialized fast parser. The approach has been tested on a large English grammar and has been shown to achieve an additional performance increase compared to parsing with a highly tuned HPSG parser. Our approach is simple and transparent. The extracted grammars are declaratively represented and have a high degree of practical applicability.

hpsg-dop-main.pdf (pdf, 283 KB)

