Modelling Frequency and Attestations for OntoLex-Lemon

Christian Chiarcos, Maxim Ionov, Jesse de Does, Katrien Depuydt, Anas Fahad Khan, Sander Stolk, Thierry Declerck, John Philip McCrae

In: Ilan Kernerman, Simon Krek, John P. McCrae, Jorge Gracia, Sina Ahmadi, Besim Kabashi (editors). Proceedings of the 2020 Globalex Workshop on Linked Lexicography. GLOBALEX (GLOBALEX-2020), located at the 12th Language Resources and Evaluation Conference, May 12, Marseille, France. Pages 1-9, ISBN 979-10-95546-46-7, ELRA, Paris, 5/2020.


The OntoLex vocabulary enjoys increasing popularity as a means of publishing lexical resources with RDF and as Linked Data. The recent publication of a new OntoLex module for lexicography, lexicog, reflects its increasing importance for digital lexicography. However, not all aspects of digital lexicography have been covered to the same extent. In particular, supplementary information drawn from corpora, such as frequency information, links to attestations, and collocation data, was considered to be beyond the scope of lexicog. Therefore, the OntoLex community has put forward a proposal for a novel module for frequency, attestation and corpus information (FrAC) that not only covers the requirements of digital lexicography, but also accommodates essential data structures for lexical information in natural language processing. This paper introduces the current state of the OntoLex-FrAC vocabulary and describes its structure, selected use cases, elementary concepts and fundamental definitions, with a focus on frequency and attestations.


2020.globalex-1.1.pdf (pdf, 564 KB)

German Research Center for Artificial Intelligence
Deutsches Forschungszentrum für Künstliche Intelligenz