Using Suffix Arrays for Efficiently Recognition of Named Entities in Large Scale

Benjamin Adrian; Sven Schwarz
In: Andreas König; Andreas Dengel; Knut Hinkelmann; Koichi Kise; Robert J. Howlett; Lakhmi C. Jain (Eds.). Knowledge-Based and Intelligent Information and Engineering Systems. International Conference on Knowledge-Based and Intelligent Information & Engineering Systems (KES-2011), 15th, September 12-14, Kaiserslautern, Germany, Pages 420-429, Lecture Notes in Computer Science, Vol. 6882, ISBN 978-3-642-23862-8, Springer, 2011.


In this paper, we present an efficient method for comparing text with RDF data in order to recognize named entities. Here, a named entity is a text sequence that refers to a URI reference within an RDF graph. We use suffix arrays as the representation format for text and a relational database schema to represent Semantic Web data. With these representations, named entity recognition runs in linear time and does not require holding the names of existing entities in memory. Both properties are needed to implement named entity recognition at the scale of, for instance, the DBpedia database.
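
The general idea can be illustrated with a minimal Python sketch. It is not the implementation from the paper: the token-level suffix array is built naively (not in linear time), an in-memory SQLite table stands in for the relational store of entity labels, and the table name `labels`, the function names, the `max_len` parameter, and the `example.org` URIs are all hypothetical choices made for illustration.

```python
import sqlite3


def build_suffix_array(tokens):
    """Return the start indices of all token-level suffixes in sorted order.

    Naive construction by sorting token slices; for illustration only.
    """
    return sorted(range(len(tokens)), key=lambda i: tokens[i:])


def recognize(tokens, conn, max_len=3):
    """Look up every phrase of up to max_len tokens in the labels table.

    Entity names stay in the database; only the text is held in memory.
    Returns sorted (start, length, uri) tuples.
    """
    matches = []
    for start in build_suffix_array(tokens):
        for length in range(1, max_len + 1):
            if start + length > len(tokens):
                break
            phrase = " ".join(tokens[start:start + length])
            rows = conn.execute(
                "SELECT uri FROM labels WHERE label = ?", (phrase,)
            ).fetchall()
            for (uri,) in rows:
                matches.append((start, length, uri))
    return sorted(matches)


# Hypothetical sample data standing in for labels extracted from an RDF graph.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE labels (label TEXT, uri TEXT)")
conn.execute("CREATE INDEX idx_labels_label ON labels(label)")
conn.executemany(
    "INSERT INTO labels VALUES (?, ?)",
    [
        ("Kaiserslautern", "http://example.org/resource/Kaiserslautern"),
        ("Semantic Web", "http://example.org/resource/Semantic_Web"),
    ],
)

tokens = "The Semantic Web group meets in Kaiserslautern".split()
result = recognize(tokens, conn)
for start, length, uri in result:
    print(" ".join(tokens[start:start + length]), "->", uri)
```

Because the suffix array orders suffixes lexicographically, identical phrases end up next to each other, so a fuller version could skip repeated database lookups for phrases it has just seen. That optimization is left out here to keep the sketch short.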



