Publication

Large-Scale Corpus-Driven PCFG Approximation of an HPSG

Yi Zhang, Hans-Ulrich Krieger

In: 12th International Conference on Parsing Technologies. International Conference on Parsing Technologies (IWPT-2011), main conference, October 5-7, Dublin, Ireland, SigPARSE, 2011.

Abstract

We present a novel corpus-driven approach towards grammar approximation for a linguistically deep Head-driven Phrase Structure Grammar. With an unlexicalized probabilistic context-free grammar obtained by Maximum Likelihood Estimate on a large-scale automatically annotated corpus, we are able to achieve parsing accuracy higher than the original HPSG-based model. Different ways of enriching the annotations carried by the approximating PCFG are proposed and compared. Comparison to the state-of-the-art latent-variable PCFG shows that our approach is more suitable for the grammar approximation task where training data can be acquired automatically. The best approximating PCFG achieved ParsEval F1 accuracy of 84.13%. The high robustness of the PCFG suggests it is a viable way of achieving full coverage parsing with the hand-written deep linguistic grammars.

Projects

Deutsches Forschungszentrum für Künstliche Intelligenz
German Research Center for Artificial Intelligence