DFKI-LT - Improving Machine Translation through Linked Data

Ankit Srivastava, Georg Rehm, Felix Sasaki
Improving Machine Translation through Linked Data
in: Ondřej Bojar, Alexander M. Fraser, Lucia Specia, Mikel L. Forcada (eds.):
The Prague Bulletin of Mathematical Linguistics, volume 108, pages 355-366, Charles University (Prague, Czech Republic), 6/2017
With the ever-increasing availability of linked multilingual lexical resources, there is a renewed interest in extending Natural Language Processing (NLP) applications so that they can make use of the vast set of lexical knowledge bases available in the Semantic Web. In the case of Machine Translation, MT systems can potentially benefit from such a resource. Unknown words and ambiguous translations are among the most common sources of error. In this paper, we attempt to minimise these types of errors by interfacing Statistical Machine Translation (SMT) models with Linked Open Data (LOD) resources such as DBpedia and BabelNet. We perform several experiments based on the SMT system Moses and evaluate multiple strategies for exploiting knowledge from multilingual linked data in automatically translating named entities. We conclude with an analysis of best practices for multilingual linked data sets in order to optimise their benefit to multilingual and cross-lingual applications.