Publication

Determining the Origin and Structure of Person Names

Yu Fu; Feiyu Xu; Hans Uszkoreit
In: Nicoletta Calzolari; Khalid Choukri; Bente Maegaard; Joseph Mariani; Jan Odijk; Stelios Piperidis; Mike Rosner; Daniel Tapias (eds.). Proceedings of the 7th International Conference on Language Resources and Evaluation. International Conference on Language Resources and Evaluation (LREC-2010), May 19-21, Valletta, Malta, ISBN 2-9517408-6-7, European Language Resources Association (ELRA), 5/2010.

Abstract

This paper presents HENNA (Hybrid Person Name Analyzer), a novel system for identifying the language origin of person names and analyzing their linguistic structure. We apply maximum-entropy (ME) classification to language origin identification and achieve very promising performance. We show that word-internal character sequences provide surprisingly strong evidence for predicting the language origin of a person name. Our approach is context-, language- and domain-independent and can thus be easily adapted to person names in or from other languages. Furthermore, we provide a novel strategy for handling origin ambiguity and multiple origins within a single name. HENNA also provides a person name parser that analyzes the linguistic and knowledge structures of person names. All knowledge about a person name in HENNA is modelled in a person-name ontology, which covers the relationships between language origins, the linguistic features and grammars of person names in a specific language, and the interpretation of name elements. The approaches presented here are useful extensions of the named entity recognition task.
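
To make the idea of classifying names by word-internal character sequences concrete, here is a minimal sketch. It is not HENNA's implementation, which the abstract does not detail. It assumes that "ME" stands for maximum entropy, realized here as a plain multinomial logistic regression trained by stochastic gradient ascent. The feature scheme (character bigrams and trigrams with `^` and `$` boundary markers), the hyperparameters, the function names and the four toy training names are all hypothetical choices made for illustration.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(name, n_min=2, n_max=3):
    """Count character n-grams inside each token of a name.

    Tokens are padded with '^' and '$' so that prefixes and suffixes
    become distinct features (an illustrative choice, not from the paper).
    """
    feats = Counter()
    for token in name.lower().split():
        padded = "^" + token + "$"
        for n in range(n_min, n_max + 1):
            for i in range(len(padded) - n + 1):
                feats[padded[i:i + n]] += 1
    return feats


def predict_proba(weights, labels, feats):
    """Return p(label | features) under a log-linear (maximum-entropy) model."""
    scores = {y: sum(weights[y][f] * v for f, v in feats.items()) for y in labels}
    top = max(scores.values())  # subtract the max score for numerical stability
    exps = {y: math.exp(s - top) for y, s in scores.items()}
    total = sum(exps.values())
    return {y: e / total for y, e in exps.items()}


def train_maxent(examples, epochs=50, lr=0.1):
    """Fit feature weights by stochastic gradient ascent on the log-likelihood."""
    labels = sorted({y for _, y in examples})
    weights = {y: defaultdict(float) for y in labels}
    data = [(char_ngrams(name), y) for name, y in examples]
    for _ in range(epochs):
        for feats, gold in data:
            probs = predict_proba(weights, labels, feats)
            for y in labels:
                # Gradient per feature: observed count minus expected count.
                err = (1.0 if y == gold else 0.0) - probs[y]
                for f, v in feats.items():
                    weights[y][f] += lr * err * v
    return weights, labels


# Hypothetical toy data, for illustration only (not taken from the paper).
TOY = [
    ("Schmidt", "German"),
    ("Müller", "German"),
    ("Tanaka", "Japanese"),
    ("Suzuki", "Japanese"),
]

if __name__ == "__main__":
    weights, labels = train_maxent(TOY)
    for name in ["Schmidt", "Tanaka", "Schneider"]:
        print(name, predict_proba(weights, labels, char_ngrams(name)))
```

Because `predict_proba` returns a full distribution over origins rather than a single label, a caller could flag names whose top two probabilities are close as ambiguous. Whether HENNA's ambiguity strategy works this way is not stated in the abstract.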
