Random feature subset selection for ensemble based classification of data with missing features

Joseph DePasquale, Robi Polikar

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

8 Scopus citations

Abstract

We report on our recent progress in developing an ensemble-of-classifiers-based algorithm for addressing the missing feature problem. Inspired in part by the random subspace method, and in part by an AdaBoost-type distribution update rule for creating a sequence of classifiers, the proposed algorithm generates an ensemble of classifiers, each trained on a different subset of the available features. An instance with missing features is then classified using only those classifiers whose training dataset did not include the currently missing features. Within this framework, we experiment with several bootstrap sampling strategies, each using a slightly different distribution update rule. We also analyze the effect of the algorithm's primary free parameter (the number of features used to train each classifier) on its performance. We show that the algorithm is able to accommodate data with up to 30% missing features, with little or no significant performance drop.
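
The core idea in the abstract (train each classifier on a feature subset, then classify an incomplete instance using only the classifiers that never saw the missing features) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the nearest-centroid base classifier, the uniform random choice of feature subsets (in place of the AdaBoost-type distribution update rule), the use of `None` to mark a missing value, and the majority vote are all assumptions made for the example.

```python
import random
from collections import Counter


def train_centroid_classifier(X, y, features):
    """Fit a nearest-centroid classifier on the given feature indices only.

    Stand-in base classifier (an assumption of this sketch).
    """
    sums, counts = {}, Counter()
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for k, f in enumerate(features):
            acc[k] += row[f]
        counts[label] += 1
    centroids = {c: [s / counts[c] for s in acc] for c, acc in sums.items()}
    return tuple(features), centroids


def predict_one(model, x):
    """Return the class whose centroid is closest on the model's features."""
    features, centroids = model

    def squared_distance(label):
        return sum((x[f] - m) ** 2 for f, m in zip(features, centroids[label]))

    return min(centroids, key=squared_distance)


def train_ensemble(X, y, n_classifiers, n_features_per_classifier, seed=0):
    """Train classifiers on random feature subsets.

    Subsets are drawn uniformly here (an assumption); the paper instead
    uses a distribution update rule to choose them.
    """
    rng = random.Random(seed)
    n_total = len(X[0])
    ensemble = []
    for _ in range(n_classifiers):
        subset = sorted(rng.sample(range(n_total), n_features_per_classifier))
        ensemble.append(train_centroid_classifier(X, y, subset))
    return ensemble


def classify_with_missing(ensemble, x):
    """Vote using only classifiers whose features are all present in x.

    Missing values are marked with None. Returns None when no classifier
    is usable. Majority voting is an assumption of this sketch.
    """
    missing = {i for i, value in enumerate(x) if value is None}
    usable = [model for model in ensemble if missing.isdisjoint(model[0])]
    if not usable:
        return None
    votes = Counter(predict_one(model, x) for model in usable)
    return votes.most_common(1)[0][0]
```

In this sketch, `n_features_per_classifier` plays the role of the free parameter discussed in the abstract: smaller subsets leave more classifiers usable when features are missing, but each classifier then sees less of the data.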

Original language: English (US)
Title of host publication: Multiple Classifier Systems - 7th International Workshop, MCS 2007, Proceedings
Publisher: Springer Verlag
Pages: 251-260
Number of pages: 10
ISBN (Print): 9783540724810
DOIs
State: Published - 2007
Externally published: Yes
Event: 7th International Workshop on Multiple Classifier Systems, MCS 2007 - Prague, Czech Republic
Duration: May 23 2007 → May 25 2007

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 4472 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 7th International Workshop on Multiple Classifier Systems, MCS 2007
Country/Territory: Czech Republic
City: Prague
Period: 5/23/07 → 5/25/07

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • General Computer Science
