DFKI-LT - Correlating decoding events with errors in Statistical Machine Translation

Eleftherios Avramidis, Maja Popovic
Correlating decoding events with errors in Statistical Machine Translation
in: Rajeev Sangal, Jyoti Pawar, Dipti Misra Sharma (eds.):
Proceedings of the 11th International Conference on Natural Language Processing, Goa, India, Natural Language Processing Association, India, International Institute of Information Technology, 2014
 
This work investigates situations in the decoding process of phrase-based SMT that cause particular errors in the translation output. A set of translations post-edited by professional translators is used to identify errors automatically, based on edit distance. Binary classifiers that predict whether a sentence contains a given error are fitted with logistic regression on features from the decoding search graph. Models are fitted for 3 common error types and 6 language pairs. The statistically significant coefficients of the logistic function are then used to analyze which parts of the decoding process are related to each error type.
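The pipeline summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract names neither the error taxonomy, the search-graph features, nor the software used, so everything below is an assumption made for illustration. The function names `edit_ops` and `fit_logit`, the toy sentences, the single numeric feature, and the choice of "at least one substituted word" as the error label are all hypothetical. The sketch derives a sentence-level binary label from the word-level edit distance between a translation and its post-edit, fits a one-feature logistic regression by Newton-Raphson, and reports the Wald z-statistic of the coefficient.

```python
import math

def edit_ops(hyp, ref):
    """Word-level Levenshtein distance from hyp to ref.
    Returns (substitutions, insertions, deletions) of one minimal edit script."""
    h, r = hyp.split(), ref.split()
    # dp[i][j] = (cost, subs, ins, dels) for turning h[:i] into r[:j]
    dp = [[None] * (len(r) + 1) for _ in range(len(h) + 1)]
    dp[0][0] = (0, 0, 0, 0)
    for i in range(1, len(h) + 1):
        c = dp[i - 1][0]
        dp[i][0] = (c[0] + 1, c[1], c[2], c[3] + 1)
    for j in range(1, len(r) + 1):
        c = dp[0][j - 1]
        dp[0][j] = (c[0] + 1, c[1], c[2] + 1, c[3])
    for i in range(1, len(h) + 1):
        for j in range(1, len(r) + 1):
            d, u, l = dp[i - 1][j - 1], dp[i - 1][j], dp[i][j - 1]
            if h[i - 1] == r[j - 1]:
                diag = d                                  # words match, no cost
            else:
                diag = (d[0] + 1, d[1] + 1, d[2], d[3])   # substitution
            dele = (u[0] + 1, u[1], u[2], u[3] + 1)       # drop a word of hyp
            ins = (l[0] + 1, l[1], l[2] + 1, l[3])        # add a word of ref
            dp[i][j] = min(diag, dele, ins)               # lowest cost wins
    return dp[-1][-1][1:]

def fit_logit(x, y, iters=25):
    """Logistic regression with intercept and one feature (Newton-Raphson).
    Returns (intercept, coefficient, Wald z of the coefficient)."""
    b0 = b1 = 0.0
    for _ in range(iters + 1):
        p = [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in x]
        w = [pi * (1.0 - pi) for pi in p]
        # Entries of the 2x2 information matrix and its determinant
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        # Gradient of the log-likelihood
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    # Standard error from the inverse information matrix of the last iteration
    se1 = math.sqrt(h00 / det)
    return b0, b1, b1 / se1

# Hypothetical toy data: (MT output, post-edit, value of one search-graph feature)
sentences = [
    ("a b c", "a b c", 1), ("a b c", "a b c", 2), ("a b c", "a x c", 3),
    ("a b c", "a b c", 4), ("a b c", "a x c", 5), ("a b c", "a b c", 6),
    ("a b c", "x b c", 7), ("a b c", "a b x", 8),
]
# Assumed label: 1 if the post-editor substituted at least one word
labels = [1 if edit_ops(mt, pe)[0] > 0 else 0 for mt, pe, _ in sentences]
feature = [float(f) for _, _, f in sentences]
b0, b1, z1 = fit_logit(feature, labels)
print(f"coefficient={b1:.3f}  wald_z={z1:.3f}")
```

In a setup like this, a coefficient would conventionally be called significant at the 5% level when the absolute value of its z-statistic exceeds roughly 1.96. With only eight toy sentences the numbers printed here carry no meaning beyond showing the mechanics.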
 
Files: BibTeX, icon2014.pdf