

Correlating decoding events with errors in Statistical Machine Translation

Eleftherios Avramidis; Maja Popovic
In: Rajeev Sangal; Jyoti Pawar; Dipti Misra Sharma (Eds.). Proceedings of the 11th International Conference on Natural Language Processing. International Conference on Natural Language Processing (ICON-2014), 11th, December 18-21, Goa, India, Natural Language Processing Association, India, 2014.


This work investigates situations in the decoding process of phrase-based SMT that cause particular errors in the translation output. A set of translations post-edited by professional translators is used to identify errors automatically, based on edit distance. Binary classifiers that predict whether a sentence contains a given error are fitted with logistic regression on features from the decoding search graph. Models are fitted for 3 common error types and 6 language pairs. The statistically significant coefficients of the logistic function are then used to analyze which parts of the decoding process are related to each error type.
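
As a rough illustration of the edit-distance step, the sketch below aligns a machine translation output with its post-edited version at the word level and derives binary sentence-level labels of the kind a classifier could be trained on. It is not the authors' implementation: the function names `edit_ops` and `sentence_labels` are hypothetical, and the three labels used here (substitution, extra word, missing word) are assumptions chosen for simplicity, since the abstract does not name the 3 error types that were modelled.

```python
from collections import Counter


def edit_ops(hyp, ref):
    """Count word-level edit operations turning `hyp` (MT output) into `ref` (post-edit)."""
    n, m = len(hyp), len(ref)
    # dist[i][j] = Levenshtein distance between hyp[:i] and ref[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i
    for j in range(1, m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if hyp[i - 1] == ref[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j - 1] + cost,  # match or substitution
                dist[i - 1][j] + 1,         # word only in the MT output
                dist[i][j - 1] + 1,         # word only in the post-edit
            )

    # Trace one optimal path back from the end to recover the operations.
    ops = Counter()
    i, j = n, m
    while i > 0 or j > 0:
        differs = i > 0 and j > 0 and hyp[i - 1] != ref[j - 1]
        if i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + int(differs):
            if differs:
                ops["substitution"] += 1
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            ops["extra"] += 1    # MT produced a word the post-editor removed
            i -= 1
        else:
            ops["missing"] += 1  # post-editor added a word the MT lacked
            j -= 1
    return ops


def sentence_labels(mt_output, post_edit):
    """Binary labels: does the sentence contain at least one error of each kind?"""
    ops = edit_ops(mt_output.split(), post_edit.split())
    return {kind: int(ops[kind] > 0) for kind in ("substitution", "extra", "missing")}


if __name__ == "__main__":
    print(sentence_labels("the cat sat on mat", "the cat sat on the mat"))
```

Each label produced this way could serve as the target of one binary classifier, with the features taken from the decoder's search graph for the same sentence. Whitespace tokenization is used here only to keep the sketch short.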