Publication

Transforming a Data Monolith into a Network of FAIR Digital Objects

Christian Backe
Conference presentation (FAIR in Action 2025), Zenodo, 10/2025.

Abstract

This talk, held at FAIR in Action 2025, presents the transformation of a monolithic data corpus into a network of FAIR Digital Objects (FDOs). The transformation addresses two main objectives:

- First, the corpus is divided into data elements that are addressable at a granular scale. This allows, for example, precise linking of propositions to specific observations, accurate error reporting and versioning, and flexible recombination of dataset components.
- Second, controlled semantics are established across all levels of the data model. The goal is to eliminate ambiguity for data consumers in order to increase reuse efficiency and minimize the risk of misinterpretation.

The data was originally acquired within the RoBivaL project, which investigated different mobile robot designs in an agricultural setting. Data collection included high-resolution sensor measurements from several modalities, field logbooks containing structured experiment documentation, and specifications providing metadata and context about research methodology, data structures, and the equipment used. The corpus was first made available on Zenodo in an effort to comply with the FAIR principles. While this initial version featured a clear layout and open formats to facilitate reuse, it was provided as a monolith, which limited its interoperability. Further, although rich semantics were explicitly documented in the initial version, they were not codified in a standardized fashion.

The transformed version implements the corpus as a network of FDOs, using semantic web technologies for data modeling and Nanopublications for distribution. The data model has three layers:

- The Experimental Research Ontology (ERO) forms the foundational layer. It models fundamental aspects of experimental data creation in general and is aligned with several upper ontologies.
- The RoBivaL Specification layer uses ERO to specify RoBivaL's project methodology, define the structure and semantics of the payload data, and provide information about the equipment used.
- The RoBivaL Payload Data layer uses the Specification layer to capture the values of experiment parameters and sensor measurements.

The transformation was conducted as a use case of the project FDO Connect, which develops tools and methodologies to bridge traditional data management practices with emerging FDO ecosystem requirements in order to facilitate the broader adoption of FAIR principles in research communities. The dataset transformation showcases modular multi-layered data modeling, a practical implementation of the FDO specifications, and a best practice for FAIR-compliant usage of semantic web technologies for distributed scientific data networks.
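The three-layer data model described in the abstract can be illustrated with a minimal sketch. It is not taken from the talk or the dataset: the namespaces, term names, and the helper `dangling_references` are hypothetical, and plain Python tuples stand in for RDF triples. The sketch only shows the general idea that each layer refers to terms defined in the layer beneath it, and that each payload element carries its own identifier so it can be addressed individually.

```python
# Hypothetical sketch: all IRIs and term names are invented for illustration
# and do not come from ERO, the RoBivaL dataset, or the FDO specifications.
ERO = "https://example.org/ero#"              # stand-in for the foundational layer
SPEC = "https://example.org/robival/spec#"    # stand-in for the Specification layer
DATA = "https://example.org/robival/data#"    # stand-in for the Payload Data layer

RDF_TYPE = "rdf:type"
RDFS_CLASS = "rdfs:Class"

# Layer 1: general vocabulary for experimental data.
ontology = {
    (ERO + "Parameter", RDF_TYPE, RDFS_CLASS),
    (ERO + "Measurement", RDF_TYPE, RDFS_CLASS),
}

# Layer 2: project-specific terms, typed with layer-1 classes.
specification = {
    (SPEC + "parameterA", RDF_TYPE, ERO + "Parameter"),
    (SPEC + "sensorChannelA", RDF_TYPE, ERO + "Measurement"),
}

# Layer 3: values, each under its own identifier, described with layer-2 terms.
# Literal values are kept as strings to keep the sketch simple.
payload = {
    (DATA + "run-001/obs-0001", SPEC + "parameterA", "1.0"),
    (DATA + "run-001/obs-0002", SPEC + "sensorChannelA", "2.5"),
}


def dangling_references(layer, namespace, lower_layer):
    """Return terms from `namespace` that `layer` uses as predicate or object
    but that are not defined (do not appear as a subject) in `lower_layer`."""
    used = {
        term
        for triple in layer
        for term in triple[1:]
        if term.startswith(namespace)
    }
    defined = {subject for subject, _, _ in lower_layer}
    return used - defined


print(dangling_references(specification, ERO, ontology))   # set()
print(dangling_references(payload, SPEC, specification))   # set()
```

An empty result means every term a layer uses is defined one layer down, which is one simple way to check that semantics stay controlled across levels. In practice an RDF library and published ontology IRIs would replace the tuples and placeholder namespaces.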
