DFKI-LT - Unsupervised Monolingual and Bilingual Word-Sense Disambiguation of Medical Documents using UMLS
Proceedings of the ACL 2003 Workshop on Natural Language Processing in Biomedicine, volume 13
This paper describes techniques for unsupervised word sense disambiguation of English and German medical documents using UMLS. We present both monolingual techniques, which rely only on the structure of UMLS, and bilingual techniques, which also rely on the availability of parallel corpora. The best results are obtained using relations between terms given by UMLS, a method which achieves 74% precision and 66% coverage for English and 79% precision and 73% coverage for German on the evaluation corpora, and over 83% coverage over the whole corpus. The success of this technique for German shows that a lexical resource giving relations between concepts used to index an English document collection can be used for high-quality disambiguation in another language.
Files: BibTeX, biomed-wsd.pdf