Nearest hyperdisk methods for high-dimensional classification

Hakan Cevikalp, Bill Triggs, Robi Polikar

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

42 Scopus citations

Abstract

In high-dimensional classification problems it is infeasible to include enough training samples to cover the class regions densely. Irregularities in the resulting sparse sample distributions cause local classifiers such as Nearest Neighbors (NN) and kernel methods to have irregular decision boundaries. One solution is to "fill in the holes" by building a convex model of the region spanned by the training samples of each class and classifying examples based on their distances to these approximate models. Methods of this kind based on affine and convex hulls and bounding hyperspheres have already been studied. Here we propose a method based on the bounding hyperdisk of each class - the intersection of the affine hull and the smallest bounding hypersphere of its training samples. We argue that in many cases hyperdisks are preferable to affine and convex hulls and hyperspheres: they bound the classes more tightly than affine hulls or hyperspheres while avoiding much of the sample overfitting and computational complexity that is inherent in high-dimensional convex hulls. We show that the hyperdisk method can be kernelized to provide nonlinear classifiers based on non-Euclidean distance metrics. Experiments on several classification problems show promising results.
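The geometric idea in the abstract can be illustrated with a minimal linear (non-kernelized) sketch. It is not the authors' implementation: the function names (`approx_bounding_sphere`, `fit_hyperdisk`, `hyperdisk_distance`, `classify`) are hypothetical, and the smallest bounding hypersphere is only approximated here with the Badoiu-Clarkson iteration, a swapped-in technique chosen to keep the example dependency-free. Because the approximate center stays inside the convex hull of the samples, it also lies in their affine hull, so the distance to the hyperdisk splits into a component perpendicular to the affine hull and an in-hull excess beyond the sphere radius.

```python
import numpy as np

def approx_bounding_sphere(X, n_iter=1000):
    """Approximate the smallest enclosing ball (Badoiu-Clarkson iteration).

    Assumption: this stands in for an exact solver; the center is always a
    convex combination of the samples, so it lies in their affine hull.
    """
    c = X[0].astype(float).copy()
    for k in range(1, n_iter + 1):
        far = X[np.argmax(np.linalg.norm(X - c, axis=1))]
        c += (far - c) / (k + 1)  # step toward the current farthest sample
    r = np.linalg.norm(X - c, axis=1).max()  # radius that covers every sample
    return c, r

def fit_hyperdisk(X, tol=1e-10):
    """Return (center, radius, U), with U an orthonormal basis of the
    directions spanned by the affine hull of the rows of X."""
    X = np.asarray(X, dtype=float)
    c, r = approx_bounding_sphere(X)
    _, s, Vt = np.linalg.svd(X - X.mean(axis=0), full_matrices=False)
    U = Vt[s > tol * max(s.max(), 1.0)].T  # keep non-degenerate directions
    return c, r, U

def hyperdisk_distance(q, model):
    """Euclidean distance from query q to the hyperdisk model."""
    c, r, U = model
    d = np.asarray(q, dtype=float) - c
    in_hull = U @ (U.T @ d)          # component inside the affine hull
    perp = d - in_hull               # component orthogonal to the hull
    excess = max(0.0, np.linalg.norm(in_hull) - r)  # overshoot past the sphere
    return float(np.hypot(np.linalg.norm(perp), excess))

def classify(q, models):
    """Assign q to the class whose hyperdisk is nearest."""
    return min(models, key=lambda label: hyperdisk_distance(q, models[label]))
```

A usage sketch with made-up toy data: `models = {"a": fit_hyperdisk(Xa), "b": fit_hyperdisk(Xb)}` followed by `classify(q, models)`. A kernelized variant would replace the explicit inner products with kernel evaluations and is not shown here.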

Original language: English (US)
Title of host publication: Proceedings of the 25th International Conference on Machine Learning
Publisher: Association for Computing Machinery (ACM)
Pages: 120-127
Number of pages: 8
ISBN (Print): 9781605582054
State: Published - 2008
Event: 25th International Conference on Machine Learning - Helsinki, Finland
Duration: Jul 5 2008 to Jul 9 2008

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Human-Computer Interaction
  • Software
