DFKI-LT - Machine learning methods for comparative and time-oriented Quality Estimation of Machine Translation output
Proceedings of the Eighth Workshop on Statistical Machine Translation
This paper describes a set of experiments on two sub-tasks of Quality Estimation of Machine Translation (MT) output. Sentence-level ranking of alternative MT outputs is performed by pairwise classifiers based on Logistic Regression, using black-box features derived from PCFG parsing, language models and various counts. Post-editing time prediction uses regression models that are additionally given new, more elaborate features from the Statistical MT decoding process; these appear to be better indicators of post-editing time than the black-box features. Before the models are trained, features are scored with ReliefF and Information Gain in order to select feature sets of manageable size and limit computational cost.
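The pairwise ranking idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that each pair of alternative translations is represented by the difference of their feature vectors, that the logistic regression is fitted by plain batch gradient descent, and that the final ranking is obtained by counting pairwise wins. The function names, the two toy features and all numbers are hypothetical.

```python
import math
from itertools import combinations


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_pairs(groups):
    """Build pairwise training examples from ranked alternatives.

    `groups` holds one list per source sentence; each entry is a
    (feature_vector, rank) tuple, where a lower rank means a better output.
    Assumption: a pair is encoded as the feature difference a - b.
    """
    X, y = [], []
    for group in groups:
        for (fa, ra), (fb, rb) in combinations(group, 2):
            if ra == rb:
                continue  # ties carry no preference, so they are skipped
            diff = [a - b for a, b in zip(fa, fb)]
            label = 1 if ra < rb else 0  # 1 means "a is better than b"
            X.append(diff)
            y.append(label)
            # Add the mirrored pair so the classifier is symmetric.
            X.append([-d for d in diff])
            y.append(1 - label)
    return X, y


def train_logreg(X, y, lr=0.1, epochs=500):
    # Bias-free logistic regression fitted with batch gradient descent.
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)))
            for j, xj in enumerate(xi):
                grad[j] += (p - yi) * xj
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w


def rank_alternatives(w, candidates):
    # Compare every pair of candidates and rank by number of pairwise wins.
    wins = [0] * len(candidates)
    for i, j in combinations(range(len(candidates)), 2):
        diff = [a - b for a, b in zip(candidates[i], candidates[j])]
        if sigmoid(sum(wj * d for wj, d in zip(w, diff))) >= 0.5:
            wins[i] += 1
        else:
            wins[j] += 1
    order = sorted(range(len(candidates)), key=lambda k: -wins[k])
    ranks = [0] * len(candidates)
    for position, k in enumerate(order, start=1):
        ranks[k] = position
    return ranks


# Hypothetical features: [language model score, parse ambiguity count].
train = [
    [([0.9, 3.0], 1), ([0.5, 5.0], 2), ([0.1, 8.0], 3)],
    [([0.8, 2.0], 1), ([0.3, 6.0], 2)],
]
weights = train_logreg(*make_pairs(train))
print(rank_alternatives(weights, [[0.2, 7.0], [0.95, 2.5], [0.6, 4.0]]))
# prints [3, 1, 2]
```

Counting wins is only one way to turn pairwise decisions into a ranking; weighting each win by the classifier's probability is a common alternative when ties in the win count need to be broken.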
Files: BibTeX, W13-2240, WMT40.pdf