A Light-weight & Robust System for Clinical Concept Disambiguation

Dirk Weißenborn, Roland Roller, Feiyu Xu, Hans Uszkoreit, Enrique Garcia Perez

In: Proceedings of the 7th International Symposium on Semantic Mining in Biomedicine (SMBM-16), August 4-5, Potsdam, Germany, 8/2016.


This paper presents a system for the normalization of concept mentions in clinical narratives. We evaluate and compare it against a popular open-source solution that is frequently used for natural language processing of clinical text. The evaluation is based on a manually annotated dataset of 72 discharge summaries taken from the i2b2 corpus. In addition to demonstrating and evaluating our system, we provide an in-depth corpus analysis that guided its development. Our focus lies on the task of concept disambiguation, for which we combine two unsupervised approaches that are easy to implement and computationally inexpensive. We show that some ambiguities can only be resolved by adapting to annotation guidelines and preferences, which we address by introducing heuristics. Finally, we present an online demo that gives insights into the individual parts of the normalization pipeline.


Deutsches Forschungszentrum für Künstliche Intelligenz
German Research Center for Artificial Intelligence