TY - GEN
T1 - Semi-supervised learning in nonstationary environments
AU - Ditzler, Gregory
AU - Polikar, Robi
N1 - Copyright:
Copyright 2011 Elsevier B.V., All rights reserved.
PY - 2011
Y1 - 2011
N2 - Learning in nonstationary environments, also called learning concept drift, has been receiving increasing attention due to the growing number of applications that generate data with drifting distributions. These applications are usually associated with streaming data, arriving either online or in batches, and concept drift algorithms are trained to detect and track the drifting concepts. While concept drift itself is a significantly more complex problem than the traditional machine learning paradigm of data drawn from a fixed distribution, the problem is further complicated when obtaining labeled data is expensive and training must rely, in part, on unlabeled data. Independently of concept drift research, semi-supervised approaches have been developed for learning from (limited) labeled and (abundant) unlabeled data; however, such approaches have been largely absent from the concept drift literature. In this contribution, we describe an ensemble-of-classifiers-based approach that takes advantage of both labeled and unlabeled data in addressing concept drift: available labeled data are used to generate classifiers, whose voting weights are determined based on the distances between Gaussian mixture model components trained on both labeled and unlabeled data in a drifting environment.
AB - Learning in nonstationary environments, also called learning concept drift, has been receiving increasing attention due to the growing number of applications that generate data with drifting distributions. These applications are usually associated with streaming data, arriving either online or in batches, and concept drift algorithms are trained to detect and track the drifting concepts. While concept drift itself is a significantly more complex problem than the traditional machine learning paradigm of data drawn from a fixed distribution, the problem is further complicated when obtaining labeled data is expensive and training must rely, in part, on unlabeled data. Independently of concept drift research, semi-supervised approaches have been developed for learning from (limited) labeled and (abundant) unlabeled data; however, such approaches have been largely absent from the concept drift literature. In this contribution, we describe an ensemble-of-classifiers-based approach that takes advantage of both labeled and unlabeled data in addressing concept drift: available labeled data are used to generate classifiers, whose voting weights are determined based on the distances between Gaussian mixture model components trained on both labeled and unlabeled data in a drifting environment.
UR - http://www.scopus.com/inward/record.url?scp=80054770829&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=80054770829&partnerID=8YFLogxK
U2 - 10.1109/IJCNN.2011.6033578
DO - 10.1109/IJCNN.2011.6033578
M3 - Conference contribution
AN - SCOPUS:80054770829
SN - 9781457710865
T3 - Proceedings of the International Joint Conference on Neural Networks
SP - 2741
EP - 2748
BT - 2011 International Joint Conference on Neural Networks, IJCNN 2011 - Final Program
T2 - 2011 International Joint Conference on Neural Networks, IJCNN 2011
Y2 - 31 July 2011 through 5 August 2011
ER -