RELATOR

DFKI


Distributed European Linguistic Resource Repository



Introduction

The RELATOR project is a CEC-funded initiative which addresses the vital area of linguistic resources for spoken and written language processing. The RELATOR Consortium is currently establishing a repository of existing linguistic resources and welcomes industrial participation at this crucial pilot stage. This document describes the RELATOR initiative and its importance to the European linguistic community.


What is RELATOR?

RELATOR is a Europe-wide consortium of researchers who, with the support of the European Commission, are striving to establish a European repository of linguistic resources. Linguistic resources comprise a variety of spoken and written language materials, including lexicons, grammars, corpora, and spoken language databases. RELATOR will focus on -- but will by no means be limited to -- the nine working languages of the European Union. Initially, the RELATOR Consortium will address existing linguistic resources. Over time, it is envisaged that the RELATOR initiative will be expanded to encompass the development of new materials. RELATOR will be positioned as the European counterpart to the US Linguistic Data Consortium, an ARPA-funded initiative which has been addressing the needs of American researchers. RELATOR will ensure that the requirements of the European language processing community receive commensurate attention.


Who is behind RELATOR?

The Coordinating Partner of the RELATOR project is the Institute for Computational Linguistics at the University of Pisa. Additional Partners include CNRS-LIMSI, The Centre for Cognitive Science (Edinburgh), and the German Institute for AI (DFKI) at the University of Saarland. Associate partners are the Danish Center for Language Technology, INESC Lisbon, and the Institut de la Communication Parlee at Grenoble. The RELATOR Consortium comprises representatives of major coordinating bodies and associations, most notably ELSNET, ESCA, and EACL, and will build on the previous like-minded efforts of these groups. The Consortium is assisted by an Industrial Steering Committee, which is made up of representatives of leading IT companies, publishers, and other providers of electronic information services. RELATOR is supported by the European Commission within the framework of the Linguistic Research and Engineering program. This four-year program funds a select number of European research and technical development projects in the field of language processing. RELATOR is seen as a vital supplementary activity to the program's varied array of basic research, technical development, and evaluation-oriented projects.


Why was RELATOR established?

Language processing is an enabling technology which has the potential to make a profound impact across a vast spectrum of modern-day life, including information technology, education, and telecommunications. However, a major bottleneck in the development of applications in these sectors has been the lack of linguistic resources, which are essential for building portable and robust systems with broad coverage. Being an immature science, language processing also requires a substantial amount of further basic research, much of which will depend on the availability of large-scale linguistic resources. These matters are slowly being addressed for the English language, but for other languages the problem remains acute. In the past, industry has tackled some of these needs, but efforts have been piecemeal at best -- and proprietary. Meanwhile, fundamental economic changes have been taking place in the IT industry as a whole, resulting in less funding available for the medium- and long-term investments required to build commercial language processing applications, a problem seen even in the world's largest concerns. Moreover, the problem will only get worse, as the globalization of the economy makes ever-increasing demands on national language support. Given this troubling context, any duplication of effort in the development of resources must be avoided. RELATOR was therefore established to provide a mechanism for distributing this burden among the providers and beneficiaries.


How will RELATOR achieve its goals?

The RELATOR Consortium's first actions were to appoint an industrial steering committee, an editor/coordinator, and three high-level consultants. The next step will be to attract interest from a representative cross-section of industry. In the start-up phase, the emphasis will be on identifying existing resources, defining collective requirements, and establishing a prototypical repository. Subsequently, the RELATOR Consortium will experiment with disseminating these materials by means of a distributed electronic network. It will also investigate the possibility of establishing CD-ROM pressing facilities in-house. Among the most complex and urgent issues RELATOR will be grappling with are the intellectual property considerations which surround the collection and distribution of linguistic resources. At the behest of the RELATOR Consortium, the Institut de Recherches Comparatives sur les Institutions et le Droit (Paris) will identify the relevant legal problems and compile a report containing suggestions for conditions governing use, production, and brokerage of linguistic resources. It will also supply model licensing agreements and suggest possible roles for national and international funding agencies. For RELATOR to succeed, it must ultimately satisfy the needs of a broad range of European industry. For this reason, the Consortium will be dependent on extensive input and feedback from industry throughout the entire trajectory of the pilot project and beyond.


Timing

The European Commission has funded RELATOR for an initial timeframe of eighteen months, starting in January 1994. Spring 1994 will see the first meeting of the Industrial Steering Committee. A campaign will also be launched at this time to attract a small number of industrial partners. Within the first six months, a workshop will be held to mobilize the European linguistic community regarding RELATOR. Fall 1994 should see the establishment of an experimental distributed database. A second workshop, possibly in conjunction with the International Forum and Conference on Language Engineering and Industry (when?), will be held to gain feedback on the interim report. A final workshop will gather recommendations to be made in the final report. While the initiative could receive further EU funding in subsequent years, RELATOR will ultimately take the form of an independent, self-sustaining entity.


Why you/your company should participate in RELATOR

Compiling linguistic resources for spoken and written language processing applications is an expensive and time-consuming undertaking. Because linguistic resources are essential, they are largely pre-competitive. RELATOR aims to make such materials available in a useful form at low cost to its members, thereby freeing their resources for strategic implementation issues, such as domain-specific requirements, performance, functionality, user and program interfaces, and hardware platforms. Moreover, having access to standardized linguistic resources which your company has helped define could radically shorten your time-to-market -- and improve your commercial chances -- in new language areas. An investment of time and resources in RELATOR today could mean manifold returns for your company in the future.




Christoph Weyers (weyers@dfki.de)