As laid down in the European Strategy for Data of 2020, data is the fuel of the 21st century. This is why the European Commission supports the development of Common European Data Spaces in strategic economic sectors and domains of public interest. These data spaces will bring together relevant data infrastructures and governance frameworks to facilitate data pooling and sharing. This will allow data from across the EU to be made available and exchanged in a trustworthy and secure manner, keeping companies and individuals in control of the data they generate.
One of the Data Spaces currently under development is the Language Data Space (LDS). It will give relevant stakeholders, e.g., from the publishing, language technology or press industries, a single platform on which to exchange and monetise their language data and other language resources (e.g., language models), taking EU values and compliance with EU rules fully into account. As a result, the LDS will significantly increase the much-needed availability of clean, high-quality, compliant language data to support the development of state-of-the-art language technologies (LT) and AI-based LT services for a range of businesses.
The creation of the LDS platform aims to mark a turning point in the approach to the collection of language resources: the LDS will help European industry to compete globally with the language technology services provided by US or Chinese companies, and to build trust throughout the language data sharing process.