DFKI-LT - Machine learning methods for comparative and time-oriented Quality Estimation of Machine Translation output

Eleftherios Avramidis, Maja Popovic
Machine learning methods for comparative and time-oriented Quality Estimation of Machine Translation output
Proceedings of the Eighth Workshop on Statistical Machine Translation, Pages 329-336, Sofia, Bulgaria, Association for Computational Linguistics, 8/2013
 
This paper describes a set of experiments on two sub-tasks of Quality Estimation of Machine Translation (MT) output. Sentence-level ranking of alternative MT outputs is done with pairwise classifiers using Logistic Regression with black-box features originating from PCFG parsing, language models and various counts. Post-editing time prediction uses regression models, additionally fed with new elaborate features from the Statistical MT decoding process. These decoding features seem to be better indicators of post-editing time than the black-box features. Prior to training the models, feature scoring with ReliefF and Information Gain is used to select feature sets of manageable size and limit computational cost.
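
The pairwise ranking idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a hand-rolled logistic regression trained on feature differences between two MT outputs, two hypothetical features (a language model log-probability and a count of unparsed tokens), invented toy values, and a simple "count the pairwise wins" rule for turning pairwise decisions into a ranking, which the abstract does not specify.

```python
import math
from itertools import combinations


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_pairwise_logreg(pairs, epochs=100, lr=0.1):
    """Fit weights on feature differences.

    pairs: list of (features_a, features_b, label), where label is 1
    if output a was judged better than output b, else 0.
    """
    dim = len(pairs[0][0])
    w = [0.0] * dim
    for _ in range(epochs):
        for fa, fb, y in pairs:
            diff = [a - b for a, b in zip(fa, fb)]
            p = sigmoid(sum(wi * di for wi, di in zip(w, diff)))
            # Stochastic gradient step on the log-likelihood.
            for i in range(dim):
                w[i] += lr * (y - p) * diff[i]
    return w


def rank_outputs(w, candidates):
    """Rank alternative outputs of one source sentence (1 = best)."""
    wins = [0] * len(candidates)
    for i, j in combinations(range(len(candidates)), 2):
        diff = [a - b for a, b in zip(candidates[i], candidates[j])]
        p = sigmoid(sum(wi * di for wi, di in zip(w, diff)))
        if p >= 0.5:
            wins[i] += 1
        else:
            wins[j] += 1
    order = sorted(range(len(candidates)), key=lambda k: -wins[k])
    ranks = [0] * len(candidates)
    for rank, k in enumerate(order, start=1):
        ranks[k] = rank
    return ranks


# Hypothetical toy data: [LM log-probability, number of unparsed tokens].
training_pairs = [
    ([-10.0, 1.0], [-20.0, 4.0], 1),
    ([-25.0, 5.0], [-12.0, 0.0], 0),
    ([-8.0, 0.0], [-15.0, 3.0], 1),
]
weights = train_pairwise_logreg(training_pairs)

# Three alternative translations of the same source sentence.
candidates = [[-9.0, 1.0], [-22.0, 4.0], [-14.0, 2.0]]
ranks = rank_outputs(weights, candidates)
```

Training on feature differences keeps the classifier symmetric: swapping the two outputs flips the sign of the input and therefore the predicted preference. A real system would replace the toy features with the full feature set and use an established library implementation of logistic regression.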
 
Files: BibTeX, W13-2240, WMT40.pdf