We report on our recent progress in developing an ensemble-of-classifiers-based algorithm for addressing the missing feature problem. Inspired in part by the random subspace method, and in part by an AdaBoost-type distribution update rule for creating a sequence of classifiers, the proposed algorithm generates an ensemble of classifiers, each trained on a different subset of the available features. An instance with missing features is then classified using only those classifiers whose training data did not include the currently missing features. Within this framework, we experiment with several bootstrap sampling strategies, each using a slightly different distribution update rule. We also analyze the effect of the algorithm's primary free parameter, the number of features used to train each classifier, on its performance. We show that the algorithm can accommodate data with up to 30% missing features with little or no significant drop in performance.
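The core idea described above can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: it uses plain uniform random feature subsets (omitting the AdaBoost-style distribution update over features), decision trees as the base classifiers, and NaN to mark missing values; all names and parameter values are hypothetical choices for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
# Synthetic data standing in for a real training set (assumption for the demo)
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

n_classifiers = 25
n_sub = 4  # features per classifier: the algorithm's primary free parameter

# Train each classifier on a different random subset of the features
ensemble = []
for _ in range(n_classifiers):
    feats = rng.choice(X.shape[1], size=n_sub, replace=False)
    clf = DecisionTreeClassifier(random_state=0).fit(X[:, feats], y)
    ensemble.append((feats, clf))

def predict(x):
    """Majority vote over classifiers whose feature subset avoids the
    instance's missing features (encoded here as NaN)."""
    missing = np.isnan(x)
    votes = [clf.predict(x[feats].reshape(1, -1))[0]
             for feats, clf in ensemble
             if not missing[feats].any()]
    # None signals that every classifier needed at least one missing feature
    return np.bincount(votes).argmax() if votes else None

x = X[0].copy()
x[3] = np.nan  # simulate one missing feature
label = predict(x)
```

With `n_sub = 4` of 10 features, most classifiers avoid any single missing feature, so the vote remains well populated; as more features go missing, fewer classifiers remain usable, which is why the choice of `n_sub` drives the trade-off the abstract refers to.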