A data augmentation approach for sign-language-to-text translation in-the-wild

Fabrizio Nunnari, Cristina España-Bonet, Eleftherios Avramidis

In: Proceedings of the 3rd Conference on Language, Data and Knowledge. Conference on Language, Data and Knowledge (LDK-2021), September 1-4, Zaragoza/Hybrid, Spain. OpenAccess Series in Informatics (OASIcs) 93, Dagstuhl Publishing, 9/2021.


In this paper, we describe the current main approaches to sign language translation that use deep neural networks with videos as input and text as output. We highlight that, in our view, their main weakness is a lack of generalization to daily-life contexts. Our goal is to build a state-of-the-art system for the automatic interpretation of sign language in unpredictable video framing conditions. Our main contribution is the shift from image features to landmark positions, in order to reduce the size of the input data and to facilitate the combination of data augmentation techniques for landmarks. We describe the set of hypotheses on which such a system is built and the list of experiments that will allow us to verify them.
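
To illustrate what data augmentation on landmark positions can look like, the following is a minimal sketch and not the method of the paper. It assumes each video frame has already been reduced to a list of 2D landmark coordinates normalized to the range [0, 1], and it applies one random rotation, scaling and shift to mimic a change in camera framing. The function name `augment_landmarks` and all parameter names and default values are hypothetical.

```python
import math
import random


def augment_landmarks(landmarks, max_rotation_deg=15.0,
                      scale_range=(0.8, 1.2), max_shift=0.1, rng=None):
    """Apply one random rotation, scale and shift to a list of (x, y) points.

    Hypothetical helper: the parameter names and defaults are assumptions
    chosen for illustration, not values taken from the paper.
    """
    if rng is None:
        rng = random.Random()

    # Draw a single random similarity transform for the whole frame.
    angle = math.radians(rng.uniform(-max_rotation_deg, max_rotation_deg))
    scale = rng.uniform(scale_range[0], scale_range[1])
    dx = rng.uniform(-max_shift, max_shift)
    dy = rng.uniform(-max_shift, max_shift)

    # Rotate and scale around the centroid so the pose stays roughly in place.
    cx = sum(x for x, _ in landmarks) / len(landmarks)
    cy = sum(y for _, y in landmarks) / len(landmarks)
    cos_a, sin_a = math.cos(angle), math.sin(angle)

    augmented = []
    for x, y in landmarks:
        px, py = x - cx, y - cy
        rx = scale * (cos_a * px - sin_a * py)
        ry = scale * (sin_a * px + cos_a * py)
        augmented.append((rx + cx + dx, ry + cy + dy))
    return augmented


# Usage: three made-up landmarks, augmented with a fixed seed.
frame = [(0.40, 0.30), (0.60, 0.30), (0.50, 0.55)]
print(augment_landmarks(frame, rng=random.Random(0)))
```

Because the transform acts on a few coordinates per frame rather than on pixels, several such operations can be chained cheaply. In a video setting one would probably draw the transform once per clip rather than once per frame, so that the motion between frames is preserved.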


Further Links

Camera_Ready_A_data_augmentation_approach_for_sign_language_to_text_translation_in_the_wild.pdf (pdf, 914 KB)

Deutsches Forschungszentrum für Künstliche Intelligenz
German Research Center for Artificial Intelligence