Using Suffix Arrays for Efficiently Recognition of Named Entities in Large Scale

Benjamin Adrian, Sven Schwarz

In: Andreas König, Andreas Dengel, Knut Hinkelmann, Koichi Kise, Robert J. Howlett, Lakhmi C. Jain (eds.). Knowledge-Based and Intelligent Information and Engineering Systems. 15th International Conference on Knowledge-Based and Intelligent Information & Engineering Systems (KES-2011), September 12-14, Kaiserslautern, Germany. Pages 420-429, Lecture Notes in Computer Science 6882, ISBN 978-3-642-23862-8, Springer, 2011.


In this paper, we present an efficient comparison of text and RDF data for recognizing named entities. Here, a named entity is a text sequence that refers to a URI reference within an RDF graph. We use suffix arrays as the representation format for text and a relational database schema to represent Semantic Web data. With these representations, named entity recognition runs in linear time and does not require holding the names of existing entities in memory. Both properties are needed to implement named entity recognition at the scale of, for instance, the DBpedia database.
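To make the idea of a suffix array over text concrete, the following is a minimal sketch and not the implementation from the paper. It assumes whitespace tokenization, builds the array by naive sorting (so construction here is not linear time), and uses hypothetical helper names `build_suffix_array` and `find_occurrences`. In the paper, entity names come from a relational database; in this sketch a name is simply passed in as a token list.

```python
def build_suffix_array(tokens):
    """Return the start offsets of all token suffixes in lexicographic order.

    Naive construction by sorting whole suffixes: adequate for a small
    illustration, but not a linear-time construction algorithm.
    """
    return sorted(range(len(tokens)), key=lambda i: tokens[i:])


def find_occurrences(tokens, suffix_array, name_tokens):
    """Return the sorted token offsets at which name_tokens occurs in tokens."""
    n = len(name_tokens)

    def prefix(k):
        # First n tokens of the k-th suffix in sorted order.
        start = suffix_array[k]
        return tokens[start:start + n]

    # Binary search for the first suffix whose n-token prefix is >= the name.
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) < name_tokens:
            lo = mid + 1
        else:
            hi = mid

    # All suffixes starting with the name are adjacent in the sorted order.
    hits = []
    while lo < len(suffix_array) and prefix(lo) == name_tokens:
        hits.append(suffix_array[lo])
        lo += 1
    return sorted(hits)


# Illustrative input; the sentence is made up for this sketch.
text = ("Benjamin Adrian works in Kaiserslautern with "
        "Sven Schwarz in Kaiserslautern")
tokens = text.split()
suffix_array = build_suffix_array(tokens)
kaiserslautern_offsets = find_occurrences(tokens, suffix_array, ["Kaiserslautern"])
sven_offsets = find_occurrences(tokens, suffix_array, ["Sven", "Schwarz"])
```

Because all suffixes that begin with the same name are adjacent in the sorted order, a single binary search locates every occurrence of a name, including multi-token names, without scanning the text again.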


