Project

Prêt-à-LLOD

Scalable Open Linked Data environment

Language technologies increasingly rely on large amounts of data, and better access to and usage of language resources will enable multilingual solutions that support the emerging Digital Single Market in Europe. However, data is rarely ‘ready-to-use’, and language technology specialists spend over 80% of their time on cleaning, organizing and collecting datasets. Reducing this effort promises huge cost savings for all sectors where language technologies are required. An essential part of the Extract-Transform-Load process involves linking datasets to existing schemas, yet few specialists take advantage of linked data technologies to perform this task. In this project we aim to increase the uptake of language technologies by exploiting the combination of linked data and language technologies, that is, Linguistic Linked Open Data (LLOD), to create ready-to-use multilingual data. Prêt-à-LLOD aims to achieve this by creating a new methodology for building data value chains that is applicable to a wide range of sectors and applications. The methodology is based around language resources and language technologies that can be integrated by means of semantic technologies, in particular LLOD. The project will develop novel tools for the transformation and linking of datasets, and apply these to both data and metadata in order to provide multi-portal access to heterogeneous data repositories. We will study how licenses can be analyzed automatically in order to deduce how data may be lawfully used and sold by language resource providers. Finally, we will provide tools to combine language services and resources into complex pipelines by use of semantic technologies. This will lead to sustainable data offers and services that can be deployed to many platforms, including as-yet-unknown platforms, and can be self-described with linked data semantics.
This toolkit will be validated in four pilots, in which novel data value chains will be built for pharmaceutical applications, technology providers, and government services. It will increase the uptake of language technologies by removing barriers to their use, and will provide cost savings that benefit both public and private sector users, such as SMEs.

Partners

  • NATIONAL UNIVERSITY OF IRELAND GALWAY (Coordinator) -- Ireland
  • UNIVERSIDAD DE ZARAGOZA -- Spain
  • UNIVERSIDAD POLITECNICA DE MADRID -- Spain
  • UNIVERSITAET BIELEFELD -- Germany
  • JOHANN WOLFGANG GOETHE-UNIVERSITAT FRANKFURT AM MAIN -- Germany
  • DEUTSCHES FORSCHUNGSZENTRUM FUR KUNSTLICHE INTELLIGENZ GMBH -- Germany
  • SEMALYTIX GMBH -- Germany
  • THE CHANCELLOR, MASTERS AND SCHOLARS OF THE UNIVERSITY OF OXFORD -- United Kingdom
  • SEMANTIC WEB COMPANY GMBH -- Austria
  • DERILINX LIMITED -- Ireland

Publications about the project

Lenka Bajčetić, Thierry Declerck

In: Zoe Gavriilidou, Maria Mitsiaki, Asimakis Fliatouras (editors). Proceedings of XIX EURALEX Congress. EURALEX International Congress (EURALEX-2020) Lexicography for Inclusion September 7-11 Pages 73-80 1 ISBN 978-618-85138-1-5 Euralex 11/2020.

Tanja Wissik, Thierry Declerck

In: Actes de la conférence TOTh 2019. Terminology & Ontology: Theories and applications (TOTh-2019) June 6-7 Le Bourget du Lac France Pages 365-381 ISBN 978-2-919732-80-7 Presses Universitaires Savoie Mont Blanc Chambéry 7/2020.

Thierry Declerck, John McCrae, Matthias Hartung, Jorge Gracia, Christian Chiarcos, Elena Montiel, Philipp Cimiano, Artem Revenko, Roser Sauri, Deirdre Lee, Stefania Racioppa, Jamal Nasir, Matthias Orlikowski, Marta Lanau-Coronas, Christian Fäth, Mariano Rico, Mohammad Fazleh Elahi, Maria Khvalchik, Meritxell Gonzalez, Katharine Cooney

In: Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Christopher Cieri, Khalid Choukri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis (editors). Proceedings of the Twelfth International Conference on Language Resources and Evaluation (LREC 2020). International Conference on Language Resources and Evaluation (LREC-2020) May 11-16 Marseille France Pages 5660-5667 ISBN 979-10-95546-34-4 ELRA Paris 5/2020.


German Research Center for Artificial Intelligence
Deutsches Forschungszentrum für Künstliche Intelligenz