TY - JOUR
T1 - Extensions to Online Feature Selection Using Bagging and Boosting
AU - Ditzler, Gregory
AU - LaBarck, Joseph
AU - Ritchie, James
AU - Rosen, Gail
AU - Polikar, Robi
N1 - Funding Information:
Manuscript received March 14, 2016; revised October 28, 2016, April 12, 2017, and August 16, 2017; accepted August 17, 2017. Date of publication October 11, 2017; date of current version August 20, 2018. This work was supported in part by the National Science Foundation under Grant 1120622 and Grant 1310496 and in part by Drexel’s University Research Computing Facility. (Corresponding author: Gregory Ditzler.) G. Ditzler is with the Department of Electrical and Computer Engineering, The University of Arizona, Tucson, AZ 85721 USA (e-mail: gregory.ditzler@gmail.com).
Publisher Copyright:
© 2017 IEEE.
PY - 2018/9
Y1 - 2018/9
AB - Feature subset selection can be used to sieve through large volumes of data and discover the most informative subset of variables for a particular learning problem. Yet, due to memory and other resource constraints (e.g., CPU availability), many state-of-the-art feature subset selection methods cannot be extended to high-dimensional data or to data sets with an extremely large number of instances. In this brief, we extend online feature selection (OFS), a recently introduced approach that uses partial feature information, by developing an ensemble of online linear models to make predictions. The OFS approach employs a linear model as the base classifier, which allows the l0-norm of the parameter vector to be constrained to perform feature selection, leading to sparse linear models. We demonstrate that the proposed ensemble model typically yields a smaller error rate than any single linear model, while maintaining the same level of sparsity and complexity at the time of testing.
UR - http://www.scopus.com/inward/record.url?scp=85031778862&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85031778862&partnerID=8YFLogxK
U2 - 10.1109/TNNLS.2017.2746107
DO - 10.1109/TNNLS.2017.2746107
M3 - Article
C2 - 29028210
AN - SCOPUS:85031778862
SN - 2162-237X
VL - 29
SP - 4504
EP - 4509
JO - IEEE Transactions on Neural Networks and Learning Systems
JF - IEEE Transactions on Neural Networks and Learning Systems
IS - 9
M1 - 8065078
ER -