Data Category Registry: morpho-syntactic and syntactic profiles

Gil Francopoulo, Thierry Declerck, Virach Sornlertlamvanich, Eric de la Clergerie, Monica Monachini

In: Andreas Witt, Felix Sasaki, Elke Teich, Nicoletta Calzolari, Peter Wittenburg (eds.). Proceedings of the LREC 2008 Workshop "Uses and usage of language resource-related standards". International Conference on Language Resources and Evaluation (LREC-08), the 6th edition of the Language Resources and Evaluation Conference, May 27-30, Marrakech, Morocco, ELRA/ELDA, 5/2008.


After a brief presentation of the data model, we describe work in progress to define an initial set of morpho-syntactic and syntactic data categories dedicated to NLP applications. The aim is to improve interoperability among language resources and to optimize the process leading to their integration into applications. The main point is to ensure that when a language resource uses a value, other language resources and programs interpret that value in the same way. In practice, these values are collected from existing lists, discussed, extended, and then recorded in a freely accessible database: the ISO Data Category Registry.


LREC2008MarrakechStandardsWorkshopDCRProfiles.pdf (pdf, 116 KB)

Deutsches Forschungszentrum für Künstliche Intelligenz
German Research Center for Artificial Intelligence