Meta-Learning, Model Selection, and Example Selection in Machine Learning Domains with Concept Drift

Ralf Klinkenberg

Abstract

For many tasks where data is collected over an extended period of time, the underlying distribution is likely to change. A typical example is information filtering, i.e. the adaptive classification of documents with respect to a particular user interest, where the interest of the user may change over time. Machine learning approaches handling concept drift have been shown to outperform more static approaches that ignore it, both in experiments with different types of simulated concept drift on real-world text data and in experiments on real-world data exhibiting real concept drift for the task of classifying phases of business cycles. While previous concept drift handling approaches use only a single base learning algorithm and employ this same base learner at each step in time, this paper proposes a meta-learning approach that allows the use of alternative learners and automatically selects the most promising base learner at each step in time. This work in progress investigates whether such a context-dependent selection of the base learner leads to a better adaptation to the drifting concept, i.e. to lower classification error rates, than approaches based on a single base learner only. Furthermore, it investigates how much the proposed meta-learning approach can speed up the selection process and how much of the gained reduction in error rate may be lost by that speed-up. The approaches with and without base learner selection and meta-learning are to be compared in experiments using real-world data from the above-mentioned domains with simulated and real concept drifts, respectively.
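The abstract does not specify how the most promising base learner is chosen at each step; the following is a minimal illustrative sketch, assuming selection by cross-validated accuracy on a recent window of examples. The function name `select_base_learner` and the particular candidate learners are hypothetical, not taken from the paper.

```python
# Hypothetical sketch (not the paper's actual algorithm): at each step in
# time, estimate each candidate base learner's accuracy on the most recent
# window of labeled examples via cross-validation, pick the best candidate,
# and retrain it on the whole window.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import SGDClassifier

def select_base_learner(X_window, y_window, learners, cv=3):
    """Return (name, fitted model) of the candidate with the highest
    cross-validated accuracy on the recent data window."""
    scores = {
        name: cross_val_score(make(), X_window, y_window, cv=cv).mean()
        for name, make in learners.items()
    }
    best = max(scores, key=scores.get)
    model = learners[best]().fit(X_window, y_window)
    return best, model

# Candidate base learners, given as factories so that every time step
# trains a fresh model rather than reusing a stale one.
learners = {
    "naive_bayes": GaussianNB,
    "tree": lambda: DecisionTreeClassifier(max_depth=3, random_state=0),
    "linear_sgd": lambda: SGDClassifier(random_state=0),
}

# Synthetic window of recent examples following a simple linear concept;
# under drift, this window would slide forward and the winner may change.
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 5))
y = (X[:, 0] > 0).astype(int)

name, model = select_base_learner(X, y, learners)
```

The windowed re-evaluation is what makes the selection context-dependent: as the concept drifts, the learner that scores best on the current window can differ from the one that won earlier.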
