Correlating decoding events with errors in Statistical Machine Translation
Proceedings of the 11th International Conference on Natural Language Processing, Goa, India, Natural Language Processing Association, India, International Institute of Information Technology, 2014
This work investigates situations in the decoding process of phrase-based SMT that cause particular errors in the translation output. A set of translations post-edited by professional translators is used to identify errors automatically, based on edit distance. Binary classifiers that predict whether an error occurs in a sentence are fitted with logistic regression on features extracted from the decoding search graph. Models are fitted for 3 common error types and 6 language pairs. The statistically significant coefficients of the logistic function are then used to analyze which parts of the decoding process are related to each error type.
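The pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the three error categories (substitution, deletion, insertion), the function names `edit_ops`, `sentence_labels` and `fit_logistic`, and the single numeric feature are assumptions made for illustration, and the significance testing of coefficients is left out. The sketch derives sentence-level binary labels from a word-level edit distance between the MT output and its post-edit, then fits a logistic regression by plain gradient descent.

```python
import math

def edit_ops(hyp, ref):
    """Count word-level edit operations that turn hyp (MT output) into ref (post-edit)."""
    n, m = len(hyp), len(ref)
    # dist[i][j] = edit distance between hyp[:i] and ref[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dist[i][0] = i
    for j in range(m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if hyp[i - 1] == ref[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j] + 1,         # delete a hyp word
                             dist[i][j - 1] + 1,         # insert a ref word
                             dist[i - 1][j - 1] + cost)  # match or substitute
    # Backtrace one optimal path and count each operation type.
    ops = {"substitution": 0, "deletion": 0, "insertion": 0}
    i, j = n, m
    while i > 0 or j > 0:
        differs = i > 0 and j > 0 and hyp[i - 1] != ref[j - 1]
        if i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + int(differs):
            if differs:
                ops["substitution"] += 1
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            ops["deletion"] += 1   # post-editor removed a word from the MT output
            i -= 1
        else:
            ops["insertion"] += 1  # post-editor added a word missing from the MT output
            j -= 1
    return ops

def sentence_labels(hyp, ref):
    """Binary sentence-level labels: 1 if at least one error of that type occurs."""
    return {name: int(count > 0) for name, count in edit_ops(hyp, ref).items()}

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit logistic regression by batch gradient descent; returns [intercept, w1, ...]."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
            err = 1.0 / (1.0 + math.exp(-z)) - yi  # predicted probability minus label
            grad[0] += err
            for k, xj in enumerate(xi):
                grad[k + 1] += err * xj
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad)]
    return w

# Toy data: (MT output, post-edit, one hypothetical search-graph feature value).
data = [
    ("the cat sat on mat", "the cat sat on the mat", 0.9),
    ("a big house", "a big house", 0.1),
    ("he runs fast", "he runs fast", 0.2),
    ("she reads book", "she reads a book", 0.8),
]
X = [[feature] for _, _, feature in data]
y = [sentence_labels(h.split(), r.split())["insertion"] for h, r, _ in data]
weights = fit_logistic(X, y)
print(y, weights)
```

In this toy setup a positive coefficient on the feature means that higher feature values go together with sentences where the post-editor had to add words. An analysis like the one in the abstract would additionally test each coefficient for statistical significance before interpreting it.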
Files: BibTeX, icon2014.pdf