Using Domain-specific and Collaborative Resources for Term Translation

Mihael Arcan, Christian Federmann, Paul Buitelaar

In: Proceedings of the Sixth Workshop on Syntax, Semantics and Structure in Statistical Translation. Workshop on Syntax, Semantics and Structure in Statistical Translation (SSST-12), July 12, Jeju, South Korea, pages 86-94, Association for Computational Linguistics, 7/2012.


In this article we investigate the translation of terms from English into German and vice versa in the isolation of an ontology vocabulary. For this study we built new domain-specific resources from the translation search engine Linguee and from the online encyclopedia Wikipedia. We learned that a domain-specific resource produces better results than a bigger but more general one. The first finding of our research is that the vocabulary and the structure of the parallel corpus are important. By integrating the multilingual knowledge base Wikipedia, we further improved the translation with respect to the domain-specific resources, so that on some translation evaluation metrics the results surpassed those of Google Translate. This finding leads us to the conclusion that a hybrid translation system, a combination of bilingual terminological resources and statistical machine translation, can help to improve the translation of domain-specific terms.


German Research Center for Artificial Intelligence