Publication

Scalable Detection of Concept Drifts on Data Streams with Parallel Adaptive Windowing

Philipp Grulich; René Saitenmacher; Jonas Traub; Sebastian Breß; Tilmann Rabl; Volker Markl
In: Advances in Database Technology; EDBT 2018. International Conference on Extending Database Technology (EDBT-2018), 21st, March 26-29, Vienna, Austria, ISBN 978-3-89318-078-3, OpenProceedings, 2018.

Abstract

Machine learning techniques for data stream analysis suffer from concept drifts such as changed user preferences, varying weather conditions, or economic changes. These concept drifts cause wrong predictions and lead to incorrect business decisions. Concept drift detection methods such as adaptive windowing (Adwin) allow for adapting to concept drifts on the fly. In this paper, we examine Adwin in detail and point out its throughput bottlenecks. We then introduce several parallelization alternatives to address these bottlenecks. Our optimizations lead to a speedup of two orders of magnitude over the original Adwin implementation. Thus, we explore parallel adaptive windowing to provide scalable concept drift detection for high-velocity data streams with millions of tuples per second.

Projects