Improving German Image Captions using Machine Translation and Transfer Learning

Rajarshi Biswas, Michael Barz, Mareike Hartmann, Daniel Sonntag

In: Luis Espinosa-Anke, Carlos Martin-Vide, Irena Spasic (Eds.). Statistical Language and Speech Processing SLSP 2021. International Conference on Statistical Language and Speech Processing (SLSP), 8th-9th November 22-26, Cardiff, United Kingdom. Lecture Notes in Computer Science / Lecture Notes in Artificial Intelligence (LNCS / LNAI), Springer, Heidelberg, 11/2021.


Image captioning is a complex artificial intelligence task that involves many fundamental questions of data representation, learning, and natural language processing. Most of the work in this domain addresses English, because far more annotated training data is available for it than for other languages. We therefore investigate methods for image captioning in German that transfer knowledge from English training data. We explore four methods for generating image captions in German: two baseline methods and two more advanced ones based on transfer learning. The baseline methods use a state-of-the-art model that we train on a translated version of the English MS COCO dataset and on the smaller German Multi30K dataset, respectively. Both advanced methods are pre-trained on the translated MS COCO dataset and fine-tuned for German on the Multi30K dataset. One of them uses an alternative attention mechanism from the literature that showed good performance in English image captioning. We compare all methods on the German Multi30K test set using common automatic evaluation metrics. We show that our advanced method with the alternative attention mechanism sets a new baseline for German BLEU, ROUGE, CIDEr, and SPICE scores, and achieves a relative improvement of 21.2 % in BLEU-4 score over the current state of the art in German image captioning.
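
The 21.2 % figure above is a relative improvement, meaning the gain is expressed as a fraction of the previous score and not as a difference in BLEU points. A minimal sketch of that calculation follows. The function name `relative_improvement` and both scores are hypothetical placeholders chosen for illustration; they are not the BLEU-4 values reported in the paper.

```python
def relative_improvement(new_score: float, old_score: float) -> float:
    """Return the relative improvement of new_score over old_score, in percent."""
    if old_score <= 0:
        raise ValueError("old_score must be positive")
    return (new_score - old_score) / old_score * 100.0


# Hypothetical placeholder scores, NOT the values from the paper.
previous_bleu4 = 10.0
new_bleu4 = 12.12

gain = relative_improvement(new_bleu4, previous_bleu4)
print(f"Relative BLEU-4 improvement: {gain:.1f} %")
```

With these placeholder values, an absolute gain of 2.12 BLEU points corresponds to a relative improvement of about 21.2 %, which shows why the two ways of reporting a gain should not be confused.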


SLSP2021Paper.pdf (pdf, 586 KB)

Deutsches Forschungszentrum für Künstliche Intelligenz
German Research Center for Artificial Intelligence