Random feature subset selection for analysis of data with missing features

Joseph DePasquale, Robi Polikar

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

2 Scopus citations

Abstract

We discuss an ensemble-of-classifiers based algorithm for the missing feature problem. The proposed approach is inspired in part by the random subspace method, and in part by the incremental learning algorithm Learn++. The premise is to generate an adequately large number of classifiers, each trained on a different, random combination of features drawn from an iteratively updated distribution. To classify an instance with missing features, only those classifiers whose training data did not include any of the currently missing features are used. These classifiers are combined using a majority voting rule to obtain the final classification of the given instance. We had previously presented preliminary results on a similar approach, which could handle up to 10% missing data. In this study, we expand our work to include different types of rules for updating the distribution, and also examine the effect of the algorithm's primary free parameter (the number of features used to train each classifier in the ensemble) on the overall classification performance. We show that this algorithm can now accommodate up to 30% missing features without a significant drop in performance.
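The procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the base learner (`CentroidClassifier`), the function names, the `decay` parameter, and in particular the distribution update rule (down-weighting features that were just selected) are all assumptions made for illustration, since the abstract does not specify them. Missing features are assumed to be encoded as NaN.

```python
import numpy as np

class CentroidClassifier:
    """Tiny nearest-centroid base learner (a stand-in; the abstract names no base classifier)."""

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.centroids_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        return self

    def predict(self, X):
        # Squared Euclidean distance from every row to every class centroid.
        d = ((X[:, None, :] - self.centroids_[None, :, :]) ** 2).sum(axis=2)
        return self.classes_[d.argmin(axis=1)]


def train_ensemble(X, y, n_classifiers=50, n_features_per_clf=2, decay=0.5, seed=0):
    """Train classifiers on random feature subsets drawn from an updated distribution."""
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    dist = np.full(n_features, 1.0 / n_features)  # start from a uniform distribution
    ensemble = []
    for _ in range(n_classifiers):
        feats = np.sort(rng.choice(n_features, size=n_features_per_clf,
                                   replace=False, p=dist))
        ensemble.append((feats, CentroidClassifier().fit(X[:, feats], y)))
        # Hypothetical update rule (an assumption, not taken from the paper):
        # make the features just used less likely to be drawn next time.
        dist[feats] *= decay
        dist /= dist.sum()
    return ensemble


def predict_with_missing(ensemble, x):
    """Majority vote over classifiers that used none of the missing (NaN) features."""
    missing = np.isnan(x)
    votes = [clf.predict(x[feats][None, :])[0]
             for feats, clf in ensemble
             if not missing[feats].any()]
    if not votes:
        return None  # no classifier avoids all missing features
    labels, counts = np.unique(votes, return_counts=True)
    return labels[counts.argmax()]
```

Calling `train_ensemble(X, y)` on a complete training set and then `predict_with_missing(ensemble, x)` on an instance containing NaN entries returns the majority label, or `None` when every classifier depends on at least one missing feature. In this sketch, `n_features_per_clf` plays the role of the free parameter mentioned in the abstract: smaller subsets leave more classifiers usable when features are missing, at the cost of each classifier seeing less information.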

Original language: English (US)
Title of host publication: The 2007 International Joint Conference on Neural Networks, IJCNN 2007 Conference Proceedings
Pages: 2379-2384
Number of pages: 6
State: Published - 2007
Event: 2007 International Joint Conference on Neural Networks, IJCNN 2007 - Orlando, FL, United States
Duration: Aug 12 2007 to Aug 17 2007

Publication series

Name: IEEE International Conference on Neural Networks - Conference Proceedings
ISSN (Print): 1098-7576

Other

Other: 2007 International Joint Conference on Neural Networks, IJCNN 2007
Country/Territory: United States
City: Orlando, FL
Period: 8/12/07 to 8/17/07

All Science Journal Classification (ASJC) codes

  • Software
