TY - GEN
T1 - Core support extraction for learning from initially labeled nonstationary environments using COMPOSE
AU - Capo, Robert
AU - Sanchez, Anthony
AU - Polikar, Robi
N1 - Publisher Copyright:
© 2014 IEEE.
PY - 2014/9/3
Y1 - 2014/9/3
N2 - Learning in nonstationary environments, also called concept drift, requires an algorithm to track and learn from streaming data drawn from a nonstationary (drifting) distribution. When data arrive continuously, a concept drift algorithm is required to maintain an up-to-date hypothesis that evolves with the changing environment. A more difficult problem that has received less attention, however, is learning from so-called initially labeled nonstationary environments, where the environment provides only unlabeled data after initialization. Since the labels for such data never become available, learning in this setting is also referred to as extreme verification latency, where the algorithm must use only unlabeled data to keep the hypothesis current. In this contribution, we analyze COMPOSE, a framework recently proposed for learning in such environments. One of the central processes of COMPOSE is core support extraction, where the algorithm predicts which data instances will be useful and relevant for classification in future time steps. We compare two different options for core support extraction, namely Gaussian mixture model (GMM) based maximum a posteriori sampling and α-shape compaction, and analyze their effects on both the accuracy and the computational complexity of the algorithm. Our findings point to a trade-off, as is the case in most engineering problems: α-shapes are more versatile in most situations, but they are far more computationally complex, especially as the dimensionality of the dataset increases. Our proposed GMM procedure allows COMPOSE to operate on datasets of substantially larger dimensionality without affecting its classification performance.
AB - Learning in nonstationary environments, also called concept drift, requires an algorithm to track and learn from streaming data drawn from a nonstationary (drifting) distribution. When data arrive continuously, a concept drift algorithm is required to maintain an up-to-date hypothesis that evolves with the changing environment. A more difficult problem that has received less attention, however, is learning from so-called initially labeled nonstationary environments, where the environment provides only unlabeled data after initialization. Since the labels for such data never become available, learning in this setting is also referred to as extreme verification latency, where the algorithm must use only unlabeled data to keep the hypothesis current. In this contribution, we analyze COMPOSE, a framework recently proposed for learning in such environments. One of the central processes of COMPOSE is core support extraction, where the algorithm predicts which data instances will be useful and relevant for classification in future time steps. We compare two different options for core support extraction, namely Gaussian mixture model (GMM) based maximum a posteriori sampling and α-shape compaction, and analyze their effects on both the accuracy and the computational complexity of the algorithm. Our findings point to a trade-off, as is the case in most engineering problems: α-shapes are more versatile in most situations, but they are far more computationally complex, especially as the dimensionality of the dataset increases. Our proposed GMM procedure allows COMPOSE to operate on datasets of substantially larger dimensionality without affecting its classification performance.
UR - http://www.scopus.com/inward/record.url?scp=84908494508&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84908494508&partnerID=8YFLogxK
U2 - 10.1109/IJCNN.2014.6889917
DO - 10.1109/IJCNN.2014.6889917
M3 - Conference contribution
AN - SCOPUS:84908494508
T3 - Proceedings of the International Joint Conference on Neural Networks
SP - 602
EP - 608
BT - Proceedings of the International Joint Conference on Neural Networks
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2014 International Joint Conference on Neural Networks, IJCNN 2014
Y2 - 6 July 2014 through 11 July 2014
ER -