Statistical Machine Transliteration with Multi-to-Multi Joint Source Channel Model

Yu Chen, Rui Wang, Yi Zhang

In: Proceedings of the Named Entities Workshop Shared Task on Machine Transliteration. The IJCNLP Named Entities Workshop Shared Task on Machine Transliteration (NEWS-2011), located at the International Joint Conference on Natural Language Processing, November 12, Chiang Mai, Thailand. Association for Computational Linguistics, 11/2011.


This paper describes DFKI's participation in the NEWS2011 shared task on machine transliteration. Our primary system participated in the evaluation for the English-Chinese and Chinese-English language pairs. We extended the joint source-channel model for the transliteration task into a multi-to-multi joint source-channel model, which allows alignments between substrings of arbitrary lengths in both the source and target strings. When the model is integrated into a modified phrase-based statistical machine translation system, an improvement of around 20% is observed. The primary system achieved a top-1 accuracy of 0.320 on English-Chinese and 0.133 on Chinese-English.
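
To illustrate what "alignments between substrings of arbitrary lengths" means, the following minimal sketch enumerates every monotone way to split a source and a target string into the same number of aligned substring pairs. It is not the authors' implementation: the function name `co_segmentations` and the `max_len` cap are assumptions made for this example, and the sketch only lists candidate alignments. A joint source-channel model would additionally assign a probability to each sequence of substring pairs, which is not shown here.

```python
from typing import Iterator, List, Tuple

# One alignment is a left-to-right list of (source substring, target substring) pairs.
Alignment = List[Tuple[str, str]]


def co_segmentations(source: str, target: str, max_len: int = 3) -> Iterator[Alignment]:
    """Yield every monotone multi-to-multi alignment of source and target.

    Both strings are split into the same number of non-empty substrings,
    each at most max_len characters long (max_len is a hypothetical cap
    introduced here to keep the enumeration small).
    """
    if not source and not target:
        # Both strings are fully consumed: one valid (empty) continuation.
        yield []
        return
    if not source or not target:
        # Only one side is consumed: no valid alignment on this path.
        return
    for i in range(1, min(max_len, len(source)) + 1):
        for j in range(1, min(max_len, len(target)) + 1):
            for rest in co_segmentations(source[i:], target[j:], max_len):
                yield [(source[:i], target[:j])] + rest


if __name__ == "__main__":
    # Toy strings, not data from the shared task.
    for alignment in co_segmentations("abc", "xy"):
        print(alignment)
```

With the toy strings above, the pairs `("a", "x")`, `("bc", "y")` and `("abc", "xy")` can all appear, whereas a one-to-one character alignment could not pair strings of different lengths at all.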


NEWS2011.pdf (PDF, 149 KB)

Deutsches Forschungszentrum für Künstliche Intelligenz
German Research Center for Artificial Intelligence