Publication

Ensemble-style Self-training on Citation Classification

Cailing Dong; Ulrich Schäfer
In: Proceedings of the 5th International Joint Conference on Natural Language Processing. International Joint Conference on Natural Language Processing (IJCNLP-2011), November 8-13, Chiang Mai, Thailand, Pages 623-631, ISBN 978-974-466-564-5, Association for Computational Linguistics, 11/2011.

Abstract

Classification of citations into categories such as use, refutation, comparison, etc. has several potential applications in digital libraries, such as paper browsing aids, reading recommendations, qualified citation indexing, or fine-grained impact factor calculation. Most citation classification approaches described so far rely heavily on rule systems and patterns tailored to specific science domains. We focus on a less manual approach by learning domain-insensitive features from textual, physical, and syntactic aspects. Our experiments show the effectiveness of this feature set with various machine learning algorithms on datasets of different sizes. Furthermore, we build an ensemble-style self-training classification model and achieve better classification performance with only a small amount of training data, which substantially reduces the manual annotation work in this task.
