INFLUENCE OF DATA CLEANING TECHNIQUES ON SUB-FIELD YIELD PREDICTIONS
Cristhian Sanchez; Deepak Kumar Pathak; Miro Miranda Lorenz; Marcela Charfuelan Oliva; Patrick Helber; Marlon Nuske; Benjamin Bischke; Peter Habelitz; Nafisur Rahman; Francisco Mena; Hiba Najjar; Jayanth Siddamsetty; Diego Arenas; Michaela Vollmer; Andreas Dengel
In: 2023 IEEE International Geoscience and Remote Sensing Symposium (IGARSS). IEEE International Geoscience and Remote Sensing Symposium (IGARSS-2023), TU3.R3: Spatio-temporal Data Harmonization I, July 16-21, Pasadena, California, USA, IEEE, 10/2023.
Modern combine harvesters can collect geo-located, real-time yield measurements while harvesting. These data can be used to train machine learning models that predict yield at the sub-field level from remote sensing input data. The performance of these models, however, depends heavily on the quality of the yield data. It is therefore important to develop automatic cleaning techniques that correct common errors in combine harvester yield maps. In this work, we compare different combinations of data cleaning techniques by evaluating their impact on yield-prediction model performance at the field and sub-field levels. Our findings indicate that basic cleaning techniques such as absolute thresholds are sufficient at the field level, whereas performance at the sub-field level improves when more sophisticated statistical cleaning methods are applied.
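
To illustrate the distinction the abstract draws between absolute thresholds and statistical cleaning, the following is a minimal Python sketch. It is not the authors' pipeline: the abstract does not state which thresholds or statistical methods were used, so the function names, the bounds (0.5 and 15.0, taken here as t/ha), the standard-deviation rule, the factor k, and the sample values are all illustrative assumptions.

```python
from statistics import mean, stdev


def absolute_threshold_filter(yields, lower=0.5, upper=15.0):
    """Keep measurements inside fixed bounds (assumed values, in t/ha)."""
    return [y for y in yields if lower <= y <= upper]


def std_filter(yields, k=2.0):
    """Keep measurements within k sample standard deviations of the mean.

    This is one common statistical rule, chosen here only as an example.
    """
    if len(yields) < 2:
        return list(yields)
    mu = mean(yields)
    sigma = stdev(yields)
    if sigma == 0:
        return list(yields)
    return [y for y in yields if abs(y - mu) <= k * sigma]


# Hypothetical yield points from one field, with a zero reading,
# an implausibly high spike, and a low but physically possible value.
field = [7.8, 8.1, 7.9, 8.3, 0.0, 8.0, 42.0, 7.7, 2.1, 8.2]

# Step 1: the absolute threshold removes physically implausible values.
after_threshold = absolute_threshold_filter(field)

# Step 2: the statistical rule removes values that are plausible in
# absolute terms but unusual relative to the rest of the field.
after_statistical = std_filter(after_threshold, k=2.0)
```

In this toy example the threshold step drops the zero reading and the spike, while the low value of 2.1 survives until the statistical step. That mirrors, in a simplified way, why a relative rule can matter more when individual sub-field points are the prediction target rather than a field-level aggregate.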