Random feature subset selection for ensemble based classification of data with missing features

Joseph DePasquale, Robi Polikar

    Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

    7 Scopus citations

    Abstract

    We report on our recent progress in developing an ensemble-of-classifiers-based algorithm for addressing the missing feature problem. Inspired in part by the random subspace method and in part by an AdaBoost-type distribution update rule for creating a sequence of classifiers, the proposed algorithm generates an ensemble of classifiers, each trained on a different subset of the available features. An instance with missing features is then classified using only those classifiers whose training data did not include the currently missing features. Within this framework, we experiment with several bootstrap sampling strategies, each using a slightly different distribution update rule. We also analyze the effect of the algorithm's primary free parameter (the number of features used to train each classifier) on its performance. We show that the algorithm can accommodate data with up to 30% missing features with little or no significant performance drop.
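
    The core idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it draws feature subsets uniformly at random rather than with the AdaBoost-type distribution update rule, omits the bootstrap sampling strategies, and uses a simple nearest-centroid base learner. The names `CentroidClassifier`, `train_ensemble`, and `classify`, and the use of `None` to mark a missing feature, are all assumptions made for illustration.

```python
import random
from collections import Counter


class CentroidClassifier:
    """Nearest-centroid base learner restricted to a fixed feature subset (illustrative)."""

    def __init__(self, features):
        self.features = tuple(features)  # indices of the features this classifier uses
        self.centroids = {}

    def fit(self, X, y):
        # Accumulate per-class sums over the selected features only.
        sums, counts = {}, Counter(y)
        for row, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(self.features))
            for k, f in enumerate(self.features):
                acc[k] += row[f]
        self.centroids = {c: [s / counts[c] for s in acc] for c, acc in sums.items()}
        return self

    def predict(self, row):
        # Squared Euclidean distance computed on the selected features only.
        def dist(centroid):
            return sum((row[f] - m) ** 2 for f, m in zip(self.features, centroid))

        return min(self.centroids, key=lambda c: dist(self.centroids[c]))


def train_ensemble(X, y, n_classifiers, n_features_per_classifier, seed=0):
    """Train each classifier on a random feature subset.

    Assumption: subsets are drawn uniformly; the paper instead uses a
    distribution update rule that is not reproduced here.
    """
    rng = random.Random(seed)
    n_total = len(X[0])
    ensemble = []
    for _ in range(n_classifiers):
        subset = rng.sample(range(n_total), n_features_per_classifier)
        ensemble.append(CentroidClassifier(subset).fit(X, y))
    return ensemble


def classify(ensemble, row):
    """Majority vote among classifiers that use none of the missing features."""
    missing = {i for i, value in enumerate(row) if value is None}
    usable = [clf for clf in ensemble if missing.isdisjoint(clf.features)]
    if not usable:
        return None  # every classifier depends on at least one missing feature
    votes = Counter(clf.predict(row) for clf in usable)
    return votes.most_common(1)[0][0]
```

    In this sketch, the number of features per classifier plays the role of the free parameter mentioned in the abstract: smaller subsets make it more likely that some classifier remains usable for a given pattern of missing features, but each classifier then sees less information.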

    Original language: English (US)
    Title of host publication: Multiple Classifier Systems - 7th International Workshop, MCS 2007, Proceedings
    Publisher: Springer Verlag
    Pages: 251-260
    Number of pages: 10
    ISBN (Print): 9783540724810
    State: Published - 2007
    Event: 7th International Workshop on Multiple Classifier Systems, MCS 2007 - Prague, Czech Republic
    Duration: May 23 2007 - May 25 2007

    Publication series

    Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
    Volume: 4472 LNCS
    ISSN (Print): 0302-9743
    ISSN (Electronic): 1611-3349

    Other

    Other: 7th International Workshop on Multiple Classifier Systems, MCS 2007
    Country/Territory: Czech Republic
    City: Prague
    Period: 5/23/07 - 5/25/07

    All Science Journal Classification (ASJC) codes

    • Theoretical Computer Science
    • Computer Science (all)
