Publication

Information Density and Quality Estimation Features as Translationese Indicators for Human Translation Classification

Raphael Rubino, Ekaterina Lapshinova-Koltunski, Josef van Genabith

In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Annual Conference of the North American Chapter of the Association for Computational Linguistics (HLT-NAACL-2016), 15th, June 12-17, San Diego, CA, United States, 2016.

Abstract

This paper introduces features inspired by information density and machine translation quality estimation to automatically detect and classify human-translated texts. We investigate two settings: discriminating between translations and comparable originally authored texts, and distinguishing two levels of translation professionalism. Our framework is based on delexicalised sentence-level dense feature vector representations combined with a supervised machine learning approach. The results show state-of-the-art performance for mixed-domain translationese detection with features based on information density and quality estimation, while results on translation expertise classification are mixed.
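
As a rough illustration of what a delexicalised, sentence-level dense feature vector based on information density might look like, the following minimal sketch computes surprisal statistics over part-of-speech tag sequences. Everything specific here is an assumption made for illustration and is not taken from the paper: the use of POS tags as the delexicalised representation, the add-one-smoothed bigram model, the three chosen features, and the function names `train_bigram`, `surprisals`, and `feature_vector`.

```python
import math
from collections import Counter


def train_bigram(tag_sequences):
    """Count tag bigrams and their left contexts (hypothetical helper)."""
    unigrams, bigrams = Counter(), Counter()
    vocab = set()
    for tags in tag_sequences:
        # Pad with sentence-boundary markers so edges are modelled too.
        padded = ["<s>"] + list(tags) + ["</s>"]
        vocab.update(padded)
        for prev, cur in zip(padded, padded[1:]):
            unigrams[prev] += 1
            bigrams[(prev, cur)] += 1
    return unigrams, bigrams, len(vocab)


def surprisals(tags, model):
    """Per-transition surprisal in bits, with add-one smoothing (an assumption)."""
    unigrams, bigrams, vocab_size = model
    padded = ["<s>"] + list(tags) + ["</s>"]
    return [
        -math.log2((bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size))
        for prev, cur in zip(padded, padded[1:])
    ]


def feature_vector(tags, model):
    """Dense sentence-level vector: [length, mean surprisal, max surprisal]."""
    s = surprisals(tags, model)
    return [float(len(tags)), sum(s) / len(s), max(s)]


# Toy training data: tag sequences only, no word forms (delexicalised).
model = train_bigram([
    ["DET", "NOUN", "VERB"],
    ["DET", "NOUN", "VERB", "ADV"],
])
seen = feature_vector(["DET", "NOUN", "VERB"], model)
unseen = feature_vector(["VERB", "DET", "ADV"], model)
```

Vectors of this kind could then be passed to any supervised classifier; a tag order the model has not seen receives a higher mean surprisal than a familiar one.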

German Research Center for Artificial Intelligence