DFKI-LT - Quality Estimation for Machine Translation output using linguistic analysis and decoding features
Proceedings of the Seventh Workshop on Statistical Machine Translation, Montreal, Canada, Association for Computational Linguistics, 6/2012
We describe a submission to the WMT12 Quality Estimation task, including extensive Machine Learning experimentation. Data were augmented with features from linguistic analysis and statistical features from the SMT search graph. Several Feature Selection algorithms were employed. The Quality Estimation problem was addressed both as a regression task and as a discretised classification task, but the latter did not generalise well to the unseen test set. The most successful regression methods had an RMSE of 0.86 and were trained with a feature set given by Correlation-based Feature Selection. We also observed indications that RMSE is not always sufficient for measuring performance.
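As a rough illustration of the last point, the sketch below computes RMSE for two hypothetical predictors that reach the same error but differ in how well they preserve the ordering of the reference scores. The score lists, the function names `rmse` and `concordant_pairs`, and the choice of pairwise ordering as a second criterion are assumptions made for this example; they are not taken from the paper and may not match what the authors observed.

```python
import math


def rmse(predicted, reference):
    """Root mean squared error between two equal-length score lists."""
    if len(predicted) != len(reference):
        raise ValueError("score lists must have the same length")
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


def concordant_pairs(predicted, reference):
    """Count item pairs that the predictor orders the same way as the reference."""
    count = 0
    for i in range(len(reference)):
        for j in range(i + 1, len(reference)):
            # A positive product means both lists agree on which item is better.
            if (predicted[i] - predicted[j]) * (reference[i] - reference[j]) > 0:
                count += 1
    return count


# Hypothetical quality scores, invented for illustration only.
reference = [1.0, 2.0, 3.0, 4.0, 5.0]
offset = [2.0, 3.0, 4.0, 5.0, 6.0]   # every score is one point too high
swapped = [2.0, 1.0, 4.0, 3.0, 6.0]  # same error size, neighbours swapped

for name, scores in [("offset", offset), ("swapped", swapped)]:
    print(name, rmse(scores, reference), concordant_pairs(scores, reference))
```

Both predictors have an RMSE of 1.0 here, yet the first orders all 10 item pairs correctly while the second orders only 8 of them correctly, which is one way a single RMSE figure can hide a difference that matters when sentences are ranked by estimated quality.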
Files: BibTeX, W12-3108.pdf