DFKI-LT - Advancements in Arabic-to-English Hierarchical Machine Translation

Matthias Huck, David Vilar Torres, Daniel Stein, Hermann Ney
Advancements in Arabic-to-English Hierarchical Machine Translation
in: Mikel L. Forcada, Heidi Depraetere, Vincent Vandeghinste (eds.):
15th Annual Conference of the European Association for Machine Translation, Pages 273-280, Leuven, Belgium, European Association for Machine Translation, 5/2011
 
In this paper we study several advanced techniques and models for Arabic-to-English statistical machine translation. We examine how the challenges imposed by this particular language pair and translation direction can be successfully tackled within the framework of hierarchical phrase-based translation. We extend the state of the art with a novel cross-system and cross-paradigm lightly-supervised training approach. In addition, we provide a concise review, an empirical evaluation, and an in-depth analysis of the following recently developed techniques: soft syntactic labels, a discriminative word lexicon model, additional reorderings, and shallow rules. We thus bring together complementary methods that have previously been investigated only in isolation and mostly on different language pairs. Combinations of the methods yield significant improvements over a baseline that uses a standard set of models. The resulting hierarchical systems perform competitively on the large-scale NIST Arabic-to-English translation task.
 