DFKI-LT - Investigating Genre and Method Variation in Translation Using Text Classification

Marcos Zampieri, Ekaterina Lapshinova-Koltunski
Investigating Genre and Method Variation in Translation Using Text Classification
Proceedings of the 18th International Conference on Text, Speech and Dialogue (TSD2015),
Lecture Notes in Artificial Intelligence, Pilsen, Czech Republic, Springer, 2015

 
In this paper, we propose the use of automatic text classification methods to analyse variation in English-German translations from both a quantitative and a qualitative perspective. The experiments are carried out in two steps: we train classifiers to 1) discriminate between different genres (fiction, political essays, etc.) and 2) identify the translation method (machine vs. human). Using semi-delexicalized models (excluding all nouns), we report results of up to 60.5% F-measure in distinguishing human and machine translations and 45.4% in discriminating between seven different genres. More than the classification performance itself, we argue that text classification methods can level out discriminative features of different variables (genres and translation methods), thus enabling researchers to investigate the properties of each of them in more detail.
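
As an illustration of what "semi-delexicalized" could mean in practice, the following is a minimal sketch, not the authors' implementation. It assumes that tokens are already part-of-speech tagged, that noun tags begin with "N" (as NN and NE do in the STTS tagset for German), and that each noun is replaced by a placeholder before n-gram features are counted. The names `delexicalize`, `ngram_features` and the `<NOUN>` symbol are hypothetical; the abstract does not specify which features or classifier were used.

```python
from collections import Counter

# Hypothetical placeholder symbol; the abstract does not name one.
NOUN_PLACEHOLDER = "<NOUN>"


def delexicalize(tagged_tokens, placeholder=NOUN_PLACEHOLDER):
    """Replace every noun with a placeholder and lowercase everything else.

    Assumption: tagged_tokens is a list of (word, tag) pairs and all noun
    tags start with "N" (e.g. NN, NE in STTS).
    """
    return [
        placeholder if tag.startswith("N") else word.lower()
        for word, tag in tagged_tokens
    ]


def ngram_features(tokens, n=2):
    """Count word n-grams over a token list (bag-of-n-grams features)."""
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter(" ".join(gram) for gram in grams)


# Toy example sentence with hand-assigned STTS tags.
tagged = [
    ("Die", "ART"),
    ("Regierung", "NN"),
    ("plant", "VVFIN"),
    ("neue", "ADJA"),
    ("Gesetze", "NN"),
]
tokens = delexicalize(tagged)
features = ngram_features(tokens, n=2)
```

Feature counts of this kind could then be passed to any standard classifier. Removing the nouns keeps topic-specific vocabulary from dominating the model, so that the features which remain are more likely to reflect genre or translation method.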
 
Files: BibTeX, tsd2015.pdf