DFKI-LT - Accessing Multilingual Data on the Web for the Semantic Annotation of Cultural Heritage Texts

Karlheinz Moerth, Thierry Declerck, Piroska Lendvai, Tamás Váradi
Accessing Multilingual Data on the Web for the Semantic Annotation of Cultural Heritage Texts
in: Elena Montiel-Ponsoda, John McCrae, Paul Buitelaar, Philipp Cimiano (eds.):
Proceedings of the 2nd International Workshop on the Multilingual Semantic Web, Bonn, Germany, Springer, 10/2011
 
Our study targets the interoperable semantic annotation of Cultural Heritage or eHumanities texts in German and Hungarian. One semantic resource we focus on is the Thompson Motif-index of folk-literature (TMI), whose labels are available only in English. We investigate the use of German and Hungarian lexical data on the Web to support the semi-automatic translation of TMI, specifically the lexical resources offered by Wiktionary and accessed via the Lexvo service, and we discuss the shortcomings of those resources. We present an approach for mapping the XML dump of Wiktionary onto a TEI- and MAF-compliant representation, and we discuss improvements in the representation of Wiktionary data for exploiting its multilingual value within the Linked Open Data (LOD) framework.
 
Files: BibTeX, MSW2-ICLTT-FIN_PL-TD-KH.pdf