Multilinguality and Language Technology

D&R Group

Language resources are the basis for building, customizing, and/or evaluating language technologies such as machine translation engines, chatbots, QA systems, etc.

The Data & Resources Group of DFKI’s Research Department Multilinguality and Language Technology focusses on the collection and processing of language data in Europe and beyond, turning this data into valuable resources for the development of intelligent language technologies. All language resources, as well as the corresponding language tools and services, are made available through the ELRC-SHARE Repository for re-use by research, industry and others.

To ensure the sustainability and success of its data collection activities, the Data & Resources Group set up the ELRC National Anchor Points and has collaborated closely with them ever since. This unique network brings together technology experts and public service representatives from all EU Member States, Iceland and Norway.


Some current projects:

LT-BRIDGE

“Bridging the technology gap: Integrating Malta into European Research and Innovation efforts for AI-based language technologies”.
H2020-WIDESPREAD-2020-5 Grant Agreement No. 952194

Project Page

Fair Forward

Consulting services to Gesellschaft für Internationale Zusammenarbeit (GIZ) on technical aspects of AI in international cooperation including natural language processing (NLP), training data and data access for FAIR Forward – Artificial Intelligence for All. GIZ Project No. 19.2010.7-003.00

Project Page

ELG

European Language Grid (ELG).
H2020-EU.2.1.1. Grant Agreement No. 825627

Project Page

CEF AT Tools and Services

Study on service portfolio development and implementation of the “service desk” component of the CEF Automated Translation Platform (CEF AT Tools and Services).
SMART 2016/0103

Project Page

ELRC

European Language Resource Coordination (ELRC).
SMART 2019/1083

Project Page


Selected recent publications

  • Lilli Smal, Andrea Lösch, Josef van Genabith, Maria Giagkou, Thierry Declerck, Stephan Busemann: “Language Data Sharing in European Public Services – Overcoming Obstacles and Creating Sustainable Data Sharing Infrastructures” in: Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Christopher Cieri, Khalid Choukri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis (eds.): Proceedings of the Twelfth International Conference on Language Resources and Evaluation (LREC 2020), Pages 3443-3448, Marseille, France, ELRA, Paris, 5/2020
  • Thierry Declerck; John McCrae; Matthias Hartung; Jorge Gracia; Christian Chiarcos; Elena Montiel; Philipp Cimiano; Artem Revenko; Roser Sauri; Deirdre Lee; Stefania Racioppa; Jamal Nasir; Matthias Orlikowski; Marta Lanau-Coronas; Christian Fäth; Mariano Rico; Mohammad Fazleh Elahi; Maria Khvalchik; Meritxell Gonzalez; Katharine Cooney: „Recent Developments for the Linguistic Linked Open Data Infrastructure“, in: Nicoletta Calzolari; Frédéric Béchet; Philippe Blache; Christopher Cieri; Khalid Choukri; Thierry Declerck; Sara Goggi; Hitoshi Isahara; Bente Maegaard; Joseph Mariani; Hélène Mazo; Asuncion Moreno; Jan Odijk; Stelios Piperidis (eds.): Proceedings of the Twelfth International Conference on Language Resources and Evaluation (LREC 2020), Pages 5660-5667, Marseille, France, ELRA, Paris, 5/2020
  • Christian Chiarcos, Maxim Ionov, Jesse de Does, Katrien Depuydt, Anas Fahad Khan, Sander Stolk, Thierry Declerck, John Philip McCrae: „Modelling Frequency and Attestations for OntoLex-Lemon“ in: Ilan Kernerman, Simon Krek, John P. McCrae, Jorge Gracia, Sina Ahmadi, Besim Kabashi (eds.): Proceedings of the 2020 Globalex Workshop on Linked Lexicography, Pages 1-9, Marseille, France, ELRA, Paris, 5/2020
  • Sina Ahmadi, John P. McCrae, Sanni Nimb, Fahad Khan, Monica Monachini, Bolette S. Pedersen, Thierry Declerck, Tanja Wissik, Andrea Bellandi, Irene Pisani, Thomas Troelsgård, Sussi Olsen, Simon Krek, Veronika Lipp, Tamás Váradi, László Simon, András Győrffy, Carole Tiberius, Tanneke Schoonheim, Yifat Ben Moshe, Maya Rudich, Raya Abu Ahmad, Dorielle Lonke, Kira Kovalenko, Margit Langemets, Jelena Kallas, Oksana Dereza, Theodorus Fransen, David Cillessen, David Lindemann, Mikel Alonso, Ana Salgado, José Luis Sancho, Rafael-J. Ureña-Ruiz, Jordi Porta Zamorano, Kiril Simov, Petya Osenova, Zara Kancheva, Ivaylo Radev, Ranka Stanković, Andrej Perdih, Dejan Gabrovšek: “A Multilingual Evaluation Dataset for Monolingual Word Sense Alignment” in: Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis (eds.): Proceedings of the Twelfth International Conference on Language Resources and Evaluation (LREC 2020), Pages 3232-3242, Marseille, France, ELRA, Paris, 5/2020
  • Thierry Declerck; Stefania Racioppa (2019): „Porting Multilingual Morphological Resources to OntoLex-Lemon“, in: Galia Angelova; Ruslan Mitkov; Ivelina Nikolova; Irina Temnikova (eds.): Proceedings of the International Conference Recent Advances in Natural Language Processing (RANLP 2019), Varna, Bulgaria, INCOMA Ltd., Shoumen, Bulgaria, 9/2019
  • Andrea Lösch, Valérie Mapelli, Stelios Piperidis, Andrejs Vasiļjevs, Lilli Smal, Thierry Declerck, Eileen Schnur, Khalid Choukri, Josef van Genabith: “European Language Resource Coordination: Collecting Language Resources for Public Sector Multilingual Information Management”, in: Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Miyazaki, Japan, ELRA, Paris, France, 5/2018
  • Thierry Declerck, Kseniya Egorova, Eileen Schnur: “An Integrated Formal Representation of Terminological and Lexical Data included in Classification Schemes” in: Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Miyazaki, Japan, ELRA, Paris, France, 5/2018

German Research Center for Artificial Intelligence
Deutsches Forschungszentrum für Künstliche Intelligenz