Publication

Accessing Multilingual Data on the Web for the Semantic Annotation of Cultural Heritage Texts

Karlheinz Moerth, Thierry Declerck, Piroska Lendvai, Tamás Váradi

In: Elena Montiel-Ponsoda, John McCrae, Paul Buitelaar, Philipp Cimiano (eds.). Proceedings of the 2nd International Workshop on the Multilingual Semantic Web. International Workshop on the Multilingual Semantic Web (MSW-2011), located at the 10th International Semantic Web Conference (ISWC2011), October 23-27, Bonn, Germany. Springer, 10/2011.

Abstract

Our study targets interoperable semantic annotation of Cultural Heritage or eHumanities texts in German and Hungarian. One semantic resource we focus on is the Thompson Motif-index of folk-literature (TMI), whose labels are available only in English. We investigate the use of German and Hungarian lexical data on the Web to support semi-automatic translation of TMI, specifically the lexical resources offered by Wiktionary and accessed via the Lexvo service, and we discuss shortcomings of those resources. We present an approach for mapping the XML dump of Wiktionary onto a TEI- and MAF-compliant representation, and discuss improvements in the representation of Wiktionary data for exploiting its multilingual value within the LOD framework.

Projects

MSW2-ICLTT-FIN_PL-TD-KH.pdf (pdf, 137 KB)

German Research Center for Artificial Intelligence