Machine learning methods for comparative and time-oriented Quality Estimation of Machine Translation output

Eleftherios Avramidis, Maja Popovic

In: Proceedings of the Eighth Workshop on Statistical Machine Translation. Workshop on Statistical Machine Translation (WMT-13), August 8-9, Sofia, Bulgaria, pages 329-336, Association for Computational Linguistics, 8/2013.


This paper describes a set of experiments on two sub-tasks of Quality Estimation of Machine Translation (MT) output. Sentence-level ranking of alternative MT outputs is done with pairwise classifiers using Logistic Regression with black-box features originating from PCFG Parsing, language models and various counts. Post-editing time prediction uses regression models, additionally fed with new elaborate features from the Statistical MT decoding process. These seem to be better indicators of post-editing time than black-box features. Prior to training the models, feature scoring with ReliefF and Information Gain is used to choose feature sets of decent size and avoid computational complexity.
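
As an illustration of the pairwise ranking idea mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes each MT output is described by a small numeric feature vector (the two features and all values below are invented toy data), trains a logistic regression on feature differences between two outputs, and ranks the alternatives of one source sentence by how many pairwise comparisons each wins. The function names, the plain gradient-ascent training loop, and the absence of a bias term are choices made for this example only.

```python
import math
from itertools import combinations


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_pairwise_logreg(pairs, labels, epochs=200, lr=0.1):
    """Fit logistic regression weights on feature differences (a - b).

    pairs  : list of (features_a, features_b) tuples
    labels : 1 if output a was judged better than output b, else 0
    """
    dim = len(pairs[0][0])
    w = [0.0] * dim
    for _ in range(epochs):
        for (a, b), y in zip(pairs, labels):
            diff = [ai - bi for ai, bi in zip(a, b)]
            p = sigmoid(sum(wi * di for wi, di in zip(w, diff)))
            # Stochastic gradient step on the log-likelihood.
            for i in range(dim):
                w[i] += lr * (y - p) * diff[i]
    return w


def prob_first_better(w, a, b):
    """Estimated probability that output a is better than output b."""
    return sigmoid(sum(wi * (ai - bi) for wi, ai, bi in zip(w, a, b)))


def rank_outputs(w, candidates):
    """Return candidate indices ordered best first by pairwise wins."""
    wins = [0] * len(candidates)
    for i, j in combinations(range(len(candidates)), 2):
        if prob_first_better(w, candidates[i], candidates[j]) >= 0.5:
            wins[i] += 1
        else:
            wins[j] += 1
    return sorted(range(len(candidates)), key=lambda k: -wins[k])


# Toy data (hypothetical): two features per output, higher is better.
train_pairs = [
    ([0.9, 0.8], [0.2, 0.1]),
    ([0.1, 0.3], [0.7, 0.9]),
    ([0.6, 0.5], [0.4, 0.2]),
    ([0.3, 0.2], [0.8, 0.6]),
]
train_labels = [1, 0, 1, 0]

weights = train_pairwise_logreg(train_pairs, train_labels)
alternatives = [[0.2, 0.3], [0.9, 0.7], [0.5, 0.5]]
ranking = rank_outputs(weights, alternatives)
print(ranking)
```

Training on feature differences keeps the classifier antisymmetric: swapping the two outputs flips the predicted preference. Counting wins is only one way to turn pairwise decisions into a ranking; the abstract does not say how the paper resolves ties or conflicting decisions.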


Further links

WMT40.pdf (pdf, 189 KB)

Deutsches Forschungszentrum für Künstliche Intelligenz
German Research Center for Artificial Intelligence