DFKI-LT - The Corpus and the Lexicon: Standardising Deep Lexical Acquisition Evaluation

Yi Zhang, Timothy Baldwin, Valia Kordoni
The Corpus and the Lexicon: Standardising Deep Lexical Acquisition Evaluation
Proceedings of the ACL 2007 Workshop on Deep Linguistic Processing, pages 152-159, Prague, Czech Republic, June 2007
 
This paper is concerned with the standardisation of evaluation metrics for lexical acquisition over precision grammars, which are attuned to actual parser performance. Specifically, we investigate the impact that lexicons at varying levels of lexical item precision and recall have on the performance of pre-existing broad-coverage precision grammars in parsing, i.e., on their coverage and accuracy. The grammars used for the experiments reported here are the LinGO English Resource Grammar (ERG; Flickinger, 2000) and JaCY (Siegel and Bender, 2002), precision grammars of English and Japanese, respectively. Our results show convincingly that traditional F-score-based evaluation of lexical acquisition does not correlate with actual parsing performance. What we argue for, therefore, is a recall-heavy interpretation of F-score in designing and optimising automated lexical acquisition algorithms.
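
As a rough illustration of what a recall-heavy F-score can look like, the sketch below uses the standard weighted F-measure (F-beta), where beta > 1 weights recall more heavily than precision. The abstract does not say which weighting the authors adopt, so the function name `f_beta`, the choice beta = 2, and the two example lexicons are assumptions made purely for illustration.

```python
def f_beta(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (standard F-beta).

    beta = 1 gives the traditional balanced F-score; beta > 1 favours recall.
    """
    if precision < 0 or recall < 0 or beta <= 0:
        raise ValueError("precision and recall must be >= 0 and beta must be > 0")
    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        return 0.0  # both precision and recall are zero
    return (1 + beta ** 2) * precision * recall / denominator


# Two hypothetical lexicons (precision, recall); the figures are invented.
high_precision_lexicon = (0.9, 0.5)
high_recall_lexicon = (0.5, 0.9)

for name, (p, r) in [("high-precision", high_precision_lexicon),
                     ("high-recall", high_recall_lexicon)]:
    print(f"{name}: F1 = {f_beta(p, r):.3f}, F2 = {f_beta(p, r, beta=2.0):.3f}")
```

Under the balanced F1 the two hypothetical lexicons are indistinguishable, whereas F2 ranks the high-recall lexicon higher, which is the kind of preference a recall-heavy reading would express.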
 
Files: BibTeX, evaldla-final.pdf