Unsupervised and domain-independent extraction of technical terms from scientific articles in digital libraries

Kathrin Eichler, Holmer Hemsen, Günter Neumann

In: Thomas Mandl, Ingo Frommholz (eds.). Proceedings of the Workshop "Information Retrieval" (WIR) of the FG-IR, organized as part of LWA, September 21-23, Darmstadt, Germany, TU Darmstadt, 2009.


A central issue for making the contents of documents in a digital library accessible to the user is the identification and extraction of technical terms. We propose a method to approach this task in an unsupervised, domain-independent way: We use a nominal group chunker to extract term candidates and select the technical terms from these candidates based on string frequencies retrieved using the MSN search engine.
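
The selection step described above can be illustrated with a minimal sketch. Everything here is an assumption for illustration only: the abstract does not state the actual decision rule, so the threshold test, the function names `select_technical_terms` and `toy_lookup`, and the counts in `TOY_COUNTS` are all hypothetical. The candidates are assumed to come from a nominal group chunker that is not implemented here, and the lookup function stands in for a search-engine frequency query.

```python
from typing import Callable, Iterable, List


def select_technical_terms(
    candidates: Iterable[str],
    lookup_frequency: Callable[[str], int],
    max_frequency: int,
) -> List[str]:
    """Keep candidates whose looked-up frequency is at most max_frequency.

    Hypothetical rule: the source abstract only says that selection is
    based on string frequencies, not how they are compared.
    """
    selected: List[str] = []
    seen = set()
    for candidate in candidates:
        key = candidate.strip().lower()
        # Skip empty strings and candidates already processed.
        if not key or key in seen:
            continue
        seen.add(key)
        if lookup_frequency(key) <= max_frequency:
            selected.append(key)
    return selected


# Made-up counts standing in for search-engine results (not real data).
TOY_COUNTS = {
    "nominal group chunker": 1_200,
    "the user": 90_000_000,
    "digital library": 450_000,
}


def toy_lookup(phrase: str) -> int:
    """Hypothetical stand-in for a search-engine frequency query."""
    return TOY_COUNTS.get(phrase, 0)


terms = select_technical_terms(
    ["Nominal group chunker", "the user", "digital library", "the user"],
    toy_lookup,
    max_frequency=1_000_000,
)
print(terms)
```

Passing the lookup as a function keeps the sketch independent of any particular search service, so a real frequency source could be substituted without changing the selection logic.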


Further links

lwa-submit_dilia_eichler_etal.pdf (pdf, 216 KB)

Deutsches Forschungszentrum für Künstliche Intelligenz
German Research Center for Artificial Intelligence