Publication

Scalable Detection of Concept Drifts on Data Streams with Parallel Adaptive Windowing

Philipp Grulich, René Saitenmacher, Jonas Traub, Sebastian Breß, Tilmann Rabl, Volker Markl

In: Advances in Database Technology; EDBT 2018. 21st International Conference on Extending Database Technology (EDBT-2018), March 26-29, Vienna, Austria. ISBN 978-3-89318-078-3, OpenProceedings, 2018.

Abstract

Machine learning techniques for data stream analysis suffer from concept drifts such as changed user preferences, varying weather conditions, or economic changes. These concept drifts cause wrong predictions and lead to incorrect business decisions. Concept drift detection methods such as adaptive windowing (Adwin) allow for adapting to concept drifts on the fly. In this paper, we examine Adwin in detail and point out its throughput bottlenecks. We then introduce several parallelization alternatives to address these bottlenecks. Our optimizations lead to a speedup of two orders of magnitude over the original Adwin implementation. Thus, we explore parallel adaptive windowing to provide scalable concept drift detection for high-velocity data streams with millions of tuples per second.

Projects

grulich-Scalable-Detection-of-Concept-Drifts-on-Data-Streams-with-Parallel-Adaptive-Windowing.pdf (pdf, 799 KB)

German Research Center for Artificial Intelligence