Publication
Toward Consistent Data Quality Assessment in Heterogeneous Smart Living Systems
Tobias Dreesbach; Léon Dankert; Constantin Brîncoveanu; K. Valerie Carl; Oliver Hinz; Oliver Thomas
In: 27th International Conference on Business Informatics. IEEE Conference on Business Informatics (CBI-2025), September 9-12, Lisbon, Portugal, IEEE, 2025.
Abstract
As data-driven applications gain traction in smart living environments, ensuring high data quality becomes a critical prerequisite for reliable analytics and Artificial Intelligence (AI)-based services. However, due to the wide range of hardware and software providers as well as the variety of use cases in the smart living domain, the resulting data are characterized by highly heterogeneous structures and nesting levels. These differences, combined with strong contextual dependencies, make it difficult to automatically assess data quality consistently across datasets. This paper addresses the need for a consistent assessment of data quality within heterogeneous smart living datasets. We present a prototype that generates quality reports for datasets with an unknown structure based on established data quality metrics, including completeness, accuracy and consistency. A Large Language Model component generates structured metadata describing the dataset's structure and suggesting applicable quality checks, which are then interpreted by tailored scripts to apply standardized quality checks. This approach not only supports the assessment of data quality within individual organizations but also facilitates cross-organizational assessments, enabling better comparability and evaluation of datasets from different sources or providers. The prototype was evaluated using a variety of synthetic and real-world smart living datasets. The results demonstrate the feasibility of the proposed approach, although certain limitations remain, which are discussed in detail within the paper.
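The metadata-driven workflow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' prototype: the metadata layout, the field paths, the sample records, and the function names (`get_path`, `completeness`, `range_validity`, `quality_report`) are all assumptions made for illustration. The metadata is hand-written here to stand in for what a Large Language Model component might produce, and a simple range check serves as a hypothetical plausibility check.

```python
from typing import Any, Optional


def get_path(record: dict, path: str) -> Any:
    """Follow a dotted path through nested dicts; return None if a key is missing."""
    current: Any = record
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def completeness(records: list, path: str) -> float:
    """Share of records in which the field at `path` is present and not None."""
    if not records:
        return 0.0
    present = sum(1 for r in records if get_path(r, path) is not None)
    return present / len(records)


def range_validity(records: list, path: str, lo: float, hi: float) -> Optional[float]:
    """Share of present values that are numeric and lie within [lo, hi]."""
    values = [v for r in records if (v := get_path(r, path)) is not None]
    if not values:
        return None
    valid = sum(
        1 for v in values
        if isinstance(v, (int, float)) and not isinstance(v, bool) and lo <= v <= hi
    )
    return valid / len(values)


def quality_report(records: list, metadata: dict) -> dict:
    """Apply the checks listed in the metadata to each described field."""
    report: dict = {}
    for field in metadata["fields"]:
        path = field["path"]
        result: dict = {}
        for check in field["checks"]:
            if check["type"] == "completeness":
                result["completeness"] = completeness(records, path)
            elif check["type"] == "range":
                result["range_validity"] = range_validity(
                    records, path, check["min"], check["max"]
                )
        report[path] = result
    return report


# Hypothetical nested smart living records (invented for this sketch).
records = [
    {"device": {"id": "a1"}, "reading": {"temperature_c": 21.5}},
    {"device": {"id": "a2"}, "reading": {"temperature_c": 180.0}},
    {"device": {"id": "a3"}, "reading": {}},
    {"device": {"id": "a4"}, "reading": {"temperature_c": 19.0}},
]

# Hypothetical metadata, standing in for LLM-generated output.
metadata = {
    "fields": [
        {
            "path": "reading.temperature_c",
            "checks": [
                {"type": "completeness"},
                {"type": "range", "min": -40, "max": 60},
            ],
        },
        {"path": "device.id", "checks": [{"type": "completeness"}]},
    ]
}

report = quality_report(records, metadata)
print(report)
```

Because the checks are driven only by the metadata, the same scripts could in principle be reused on a differently structured dataset by swapping in a new metadata description, which is the kind of comparability the abstract aims for.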
