DFKI-LT - Efforts on Machine Learning over Human-mediated Translation Edit Rate

Eleftherios Avramidis
Proceedings of the Ninth Workshop on Statistical Machine Translation, pages 302-306, Baltimore, Maryland, USA, Association for Computational Linguistics, 6/2014
 
In this paper we describe experiments on predicting HTER (Human-mediated Translation Edit Rate), carried out as part of our submission to the Shared Task on Quality Estimation at the Ninth Workshop on Statistical Machine Translation. We test whether HTER prediction improves when four individual regression models are trained, one for each edit type (deletions, insertions, substitutions, shifts); this yielded no improvement. Adding more data from other datasets that were non-minimally post-edited or freely translated also yielded no improvement. The best HTER prediction was achieved by adding deduplicated WMT13 data and additional features: (a) rule-based language corrections (LanguageTool), (b) PCFG parsing statistics and counts of tree labels, (c) position statistics of parsing labels, and (d) position statistics of trigrams with low probability.
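
As a rough illustration of the per-edit-type idea in the abstract, the sketch below shows how four separately predicted edit counts could be combined into a single HTER estimate. It is not the paper's implementation. The names `EditCounts`, `hter` and `combine_predictions` are hypothetical, and so is the `reference_length` parameter: the sketch assumes HTER is the total number of edits divided by the number of words in the post-edited translation, a length that would itself have to be estimated at prediction time (for example from the length of the MT output).

```python
from dataclasses import dataclass


@dataclass
class EditCounts:
    """Edits needed to turn the MT output into its post-edited version.

    Hypothetical container; the values may be gold counts or the outputs
    of four separate regression models, one per edit type.
    """
    deletions: float
    insertions: float
    substitutions: float
    shifts: float


def hter(edits: EditCounts, reference_length: int) -> float:
    """Total edits divided by the word count of the post-edited reference."""
    if reference_length <= 0:
        raise ValueError("reference_length must be positive")
    total = (edits.deletions + edits.insertions
             + edits.substitutions + edits.shifts)
    return total / reference_length


def combine_predictions(predicted: EditCounts, reference_length: int) -> float:
    """Combine four per-edit-type predictions into one HTER estimate.

    Regression outputs can be negative, so each count is clipped at zero
    before summing (an assumption of this sketch, not taken from the paper).
    """
    clipped = EditCounts(
        deletions=max(0.0, predicted.deletions),
        insertions=max(0.0, predicted.insertions),
        substitutions=max(0.0, predicted.substitutions),
        shifts=max(0.0, predicted.shifts),
    )
    return hter(clipped, reference_length)
```

For instance, `combine_predictions(EditCounts(0.8, 2.1, 1.1, 0.0), 10)` sums the four predicted counts and divides by an assumed reference length of ten words. The alternative reported as best in the abstract is a single regression model that predicts HTER directly.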
 