Accessing Multilingual Data on the Web for the Semantic Annotation of Cultural Heritage Texts

Karlheinz Moerth; Thierry Declerck; Piroska Lendvai; Tamás Váradi

In: Elena Montiel-Ponsoda; John McCrae; Paul Buitelaar; Philipp Cimiano (eds.). Proceedings of the 2nd International Workshop on the Multilingual Semantic Web. International Workshop on the Multilingual Semantic Web (MSW-2011), co-located with the 10th International Semantic Web Conference (ISWC2011), October 23-27, Bonn, Germany, Springer, 10/2011.


Our study targets the interoperable semantic annotation of Cultural Heritage or eHumanities texts in German and Hungarian. One semantic resource we focus on is the Thompson Motif-index of folk-literature (TMI), whose labels are available only in English. We investigate the use of German and Hungarian lexical data on the Web to support the semi-automatic translation of TMI, specifically the lexical resources offered by Wiktionary and accessed via the Lexvo service, and we discuss the shortcomings of those resources. We present an approach for mapping the XML dump of Wiktionary onto a TEI- and MAF-compliant representation, and we discuss improvements in the representation of Wiktionary data that would help exploit its multilingual value within the LOD framework.



Deutsches Forschungszentrum für Künstliche Intelligenz
German Research Center for Artificial Intelligence