Correlating Natural Language Parser Performance with Statistical Measures of the Text

Yi Zhang, Rui Wang

In: Proceedings of KI 2009, German Conference on Artificial Intelligence (KI-2009), September 15-18, Paderborn, Germany. Springer, 2009.


Natural language parsing, one of the central tasks in natural language processing, is widely used in many AI fields. In this paper, we address one issue in parser performance evaluation: how performance varies across datasets. We propose three simple statistical measures to characterize the datasets and evaluate their correlation with parser performance. The results clearly show that different parsers differ in how much their performance varies and in how sensitive they are to these measures. The method can be used to guide the choice of natural language parsers for new domain applications, as well as their systematic combination for better parsing accuracy.

Deutsches Forschungszentrum für Künstliche Intelligenz
German Research Center for Artificial Intelligence