Unsupervised and domain-independent extraction of technical terms from scientific articles in digital libraries

Kathrin Eichler; Holmer Hemsen; Günter Neumann
In: Thomas Mandl; Ingo Frommholz (Eds.). Proceedings of the Workshop "Information Retrieval". Workshop "Information Retrieval" der FG-IR (WIR), organized as part of LWA, September 21-23, Darmstadt, Germany, TU Darmstadt, 2009.


A central issue for making the contents of documents in a digital library accessible to the user is the identification and extraction of technical terms. We propose a method to approach this task in an unsupervised, domain-independent way: We use a nominal group chunker to extract term candidates and select the technical terms from these candidates based on string frequencies retrieved using the MSN search engine.
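
The two-step idea in the abstract (chunk nominal groups into candidates, then filter them by string frequency) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy chunker over pre-tagged tokens, the names `extract_candidates` and `select_terms`, the `hit_count` callback standing in for a search engine lookup, and in particular the selection rule (keep candidates whose frequency is at or below a threshold), which the abstract does not specify.

```python
from typing import Callable, List, Tuple

TaggedToken = Tuple[str, str]  # (word, coarse part-of-speech tag)


def _flush(run: List[TaggedToken], out: List[str]) -> None:
    """Append the run as one candidate, trimmed so that it ends in a noun."""
    while run and run[-1][1] != "NOUN":
        run = run[:-1]
    if run:
        out.append(" ".join(word for word, _ in run))


def extract_candidates(tagged: List[TaggedToken]) -> List[str]:
    """Toy stand-in for a nominal group chunker.

    Collects maximal runs of adjectives and nouns; a real chunker
    would be considerably more elaborate.
    """
    candidates: List[str] = []
    current: List[TaggedToken] = []
    for word, tag in tagged:
        if tag in ("ADJ", "NOUN"):
            current.append((word, tag))
        else:
            _flush(current, candidates)
            current = []
    _flush(current, candidates)
    return candidates


def select_terms(
    candidates: List[str],
    hit_count: Callable[[str], int],
    max_hits: int,
) -> List[str]:
    """Keep each distinct candidate whose frequency is at most max_hits.

    The threshold rule is an assumption for this sketch, not the
    selection criterion of the paper.
    """
    seen = set()
    selected: List[str] = []
    for cand in candidates:
        if cand in seen:
            continue
        seen.add(cand)
        if hit_count(cand) <= max_hits:
            selected.append(cand)
    return selected


# Hypothetical input: tokens tagged by hand, and made-up frequency counts
# in place of real search engine results.
tagged_sentence = [
    ("We", "PRON"), ("use", "VERB"), ("a", "DET"),
    ("nominal", "ADJ"), ("group", "NOUN"), ("chunker", "NOUN"),
    ("for", "ADP"), ("the", "DET"), ("new", "ADJ"), ("method", "NOUN"),
    (".", "PUNCT"),
]
fake_hits = {"nominal group chunker": 120, "new method": 5_000_000}

candidates = extract_candidates(tagged_sentence)
terms = select_terms(candidates, lambda s: fake_hits.get(s, 0), max_hits=10_000)
```

With these made-up counts, `candidates` holds both nominal groups, and only the rarer one survives the filter. Passing the frequency lookup as a callback keeps the sketch independent of any particular search engine API.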



