

Learning from human judgments of machine translation output

Maja Popovic; Eleftherios Avramidis; Aljoscha Burchardt; Sabine Hunsicker; Sven Schmeier; Cindy Tscherwinka; David Vilar; Hans Uszkoreit
In: Proceedings of the MT Summit XIV. Machine Translation Summit (MT Summit-2013), September 2-6, Nice, France, The European Association for Machine Translation, Allschwil / Switzerland, 2013.


Human translators are the key to evaluating machine translation (MT) quality and also to addressing the so far unanswered question of when and how to use MT in professional translation workflows. Usually, human judgments come in the form of rankings of the outputs of different translation systems; recently, post-edits of MT output have also come into focus. This paper describes the results of a detailed large-scale human evaluation consisting of three tightly connected tasks: ranking, error classification, and post-editing. Translation outputs from three domains and six translation directions, generated by five distinct translation systems, have been analysed with the goal of gaining relevant insights for further improving MT quality and applicability.