DFKI-LT - Bootstrapping a Domain-specific Terminological Taxonomy from Scientific Text
Proceedings of the 9th International Conference on Terminology and Artificial Intelligence (TIA)
We present an approach to the automated extraction of a taxonomy of domain-specific terms from scientific discourse. The approach was developed and evaluated in the domain of computational linguistics. Concept pairs in an is-a relation were extracted from a subset of the ACL Anthology and WeScience. The correctness of the resource was verified by crowdsourcing: to attract domain experts to identify correct and invalid is-a pairs, we used "games with a purpose". The popular games Tetris and Invaders were modified to support concurrent and efficient annotation of domain term pairs during play. High quality of the resulting annotations was ensured by exploiting redundancy: at least five-way agreement was required for a candidate is-a pair to be considered correctly extracted. Based on the crowdsourced evaluation, the extraction method achieved a precision of around 80%.
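The redundancy criterion described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The data layout (annotator, term pair, vote), the function names, and the example term pairs are hypothetical, and computing precision as the share of candidate pairs that reach the agreement threshold is an assumption about how the reported figure might be derived.

```python
from collections import defaultdict

# Threshold taken from the abstract: at least five-way agreement.
MIN_AGREEMENT = 5


def accepted_pairs(judgments, min_agreement=MIN_AGREEMENT):
    """Return the is-a pairs that enough distinct annotators marked as valid.

    `judgments` is an iterable of (annotator, pair, is_valid) tuples, where
    `pair` is a (hyponym, hypernym) tuple. This layout is hypothetical.
    """
    supporters = defaultdict(set)
    for annotator, pair, is_valid in judgments:
        if is_valid:
            # A set ensures each annotator counts at most once per pair.
            supporters[pair].add(annotator)
    return {
        pair
        for pair, annotators in supporters.items()
        if len(annotators) >= min_agreement
    }


def precision(judgments, candidates, min_agreement=MIN_AGREEMENT):
    """Share of candidate pairs confirmed by the agreement threshold (assumed definition)."""
    unique_candidates = set(candidates)
    if not unique_candidates:
        return 0.0
    confirmed = accepted_pairs(judgments, min_agreement) & unique_candidates
    return len(confirmed) / len(unique_candidates)


# Illustrative data only: two invented candidate pairs, five invented annotators.
candidates = [("parser", "tool"), ("corpus", "algorithm")]
judgments = [(f"player{i}", ("parser", "tool"), True) for i in range(5)]
judgments += [(f"player{i}", ("corpus", "algorithm"), i < 2) for i in range(5)]

print(accepted_pairs(judgments))
print(precision(judgments, candidates))
```

Counting distinct annotators rather than raw votes keeps a single player from confirming a pair through repeated judgments, which is one plausible reading of "five-way agreement".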
Files: BibTeX, TIA05.pdf