Exploring HPSG-based Treebanks for Probabilistic Parsing

Günter Neumann, Berthold Crysmann

In: International Conference on Language Resources and Evaluation (LREC) 2006.


We describe a method for automatically extracting a Stochastic Lexicalized Tree Insertion Grammar from a linguistically rich HPSG Treebank. The extraction method is strongly guided by HPSG-based head and argument decomposition rules. The tree anchors correspond to lexical labels encoding fine-grained information. The approach has been tested on a German corpus, achieving a labeled recall of 77.33% and a labeled precision of 78.27%, which is competitive with recent results reported for German parsing using the Negra Treebank.
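For readers unfamiliar with the evaluation measures quoted above, the following is a minimal sketch of how labeled precision and recall are commonly computed over constituent brackets. It is not the authors' evaluation code: the function name `labeled_scores`, the `(label, start, end)` span encoding, and the toy data are assumptions made for illustration only.

```python
from collections import Counter


def labeled_scores(gold, predicted):
    """Return (precision, recall, f1) over labeled spans.

    Each span is a hypothetical (label, start, end) tuple; a predicted
    span counts as correct only if label and both boundaries match.
    """
    gold_counts = Counter(gold)
    pred_counts = Counter(predicted)
    # Multiset intersection: each gold span can be matched at most once.
    matched = sum((gold_counts & pred_counts).values())
    n_pred = sum(pred_counts.values())
    n_gold = sum(gold_counts.values())
    precision = matched / n_pred if n_pred else 0.0
    recall = matched / n_gold if n_gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy example (invented data): one NP bracket has the wrong right boundary.
gold = [("S", 0, 4), ("NP", 0, 2), ("VP", 2, 4), ("NP", 3, 4)]
predicted = [("S", 0, 4), ("NP", 0, 1), ("VP", 2, 4), ("NP", 3, 4)]
precision, recall, f1 = labeled_scores(gold, predicted)
```

In practice, scores are accumulated over all sentences of a test set before dividing, and evaluation tools differ in details such as whether punctuation or the root bracket is counted, so figures are only comparable under the same settings.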


Deutsches Forschungszentrum für Künstliche Intelligenz
German Research Center for Artificial Intelligence