Discriminant Ranking for Efficient Treebanking

Yi Zhang, Valia Kordoni

In: Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010). Coling 2010 Organizing Committee, 2010.


Treebank annotation is a labor-intensive and time-consuming task. In this paper, we show that a simple statistical ranking model can significantly improve treebanking efficiency by directing human annotators, who are well trained in disambiguation tasks for treebanking but not necessarily grammar experts, to the most relevant linguistic disambiguation decisions. Experiments were carried out to evaluate the impact of such techniques on annotation efficiency and quality. A detailed analysis of the ranking model's output shows a strong correlation with human annotator behavior. When integrated into the treebanking environment, the model brings a significant annotation speed-up with improved inter-annotator agreement.


Further Links

C10-2166.pdf (pdf, 550 KB)

Deutsches Forschungszentrum für Künstliche Intelligenz
German Research Center for Artificial Intelligence