Cluster tree based multi-label classification for protein function prediction

Qingyao Wu, Yunming Ye, Xiaofeng Zhang, Shen Shyang Ho

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Automatically assigning functions to uncharacterized proteins is a key task in computational biology. A protein typically performs several functions and therefore belongs to multiple classes, so protein function prediction has often been cast as a multi-label learning problem. This paper proposes a Cluster Tree based Multi-label Learning algorithm (CTML) for protein function prediction. The main idea is to build a cluster tree with k-means clustering and, at each node, to compute a set of predictive labels using predictive functions learned from the training data at that node. Because the predictive labels are propagated from the root node down to the leaf nodes, correlations between labels are preserved. Experimental results on two benchmark datasets (genbase and yeast) show that CTML is effective in predicting protein functions and that its classification performance is competitive with baseline multi-label learning algorithms.
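
The abstract does not spell out how CTML works, so the following Python sketch only illustrates the general idea of a cluster tree whose nodes carry predictive labels. It is not the authors' implementation. The names `build`, `predict`, `threshold`, and `min_size` are hypothetical, and the rule "keep a label if at least a `threshold` fraction of the node's samples carry it" is an assumption standing in for the paper's predictive functions. The k-means routine uses a deterministic farthest-first initialization purely to keep the example reproducible.

```python
from dataclasses import dataclass, field

def _dist2(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def _mean(points):
    """Component-wise mean of a non-empty list of feature vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]

def kmeans(X, k, iters=20):
    """Lloyd's k-means with farthest-first initialization (assumed choice)."""
    centroids = [list(X[0])]
    while len(centroids) < k:
        far = max(X, key=lambda x: min(_dist2(x, c) for c in centroids))
        centroids.append(list(far))
    assign = [0] * len(X)
    for _ in range(iters):
        assign = [min(range(k), key=lambda c: _dist2(x, centroids[c])) for x in X]
        for c in range(k):
            members = [x for x, a in zip(X, assign) if a == c]
            if members:
                centroids[c] = _mean(members)
    return assign

@dataclass
class Node:
    centroid: list                      # mean of the samples at this node
    labels: set                         # predictive labels at this node
    children: list = field(default_factory=list)

def node_labels(Y, inherited, threshold):
    """Inherited labels plus labels carried by >= threshold of the samples."""
    counts = {}
    for label_set in Y:
        for label in label_set:
            counts[label] = counts.get(label, 0) + 1
    frequent = {l for l, c in counts.items() if c / len(Y) >= threshold}
    return set(inherited) | frequent

def build(X, Y, inherited=frozenset(), k=2, min_size=4, threshold=0.9):
    """Recursively cluster the data; labels propagate from parent to child."""
    node = Node(_mean(X), node_labels(Y, inherited, threshold))
    if len(X) >= min_size and len(X) > k:
        assign = kmeans(X, k)
        groups = [[i for i, a in enumerate(assign) if a == c] for c in range(k)]
        if all(groups):  # split only when every cluster is non-empty
            for g in groups:
                node.children.append(
                    build([X[i] for i in g], [Y[i] for i in g],
                          node.labels, k, min_size, threshold))
    return node

def predict(node, x):
    """Descend to the nearest child at each level and return the leaf's labels."""
    while node.children:
        node = min(node.children, key=lambda ch: _dist2(x, ch.centroid))
    return node.labels
```

In this sketch, `build(X, Y)` takes a list of feature vectors and a parallel list of label sets, and `predict(tree, x)` returns the label set stored at the leaf reached by `x`. Labels that are frequent high in the tree are carried down to every descendant, which is one simple way to mimic the root-to-leaf propagation described above.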

Original language: English (US)
Title of host publication: Proceedings - 2013 IEEE International Conference on Bioinformatics and Biomedicine, IEEE BIBM 2013
Pages: 513-516
Number of pages: 4
State: Published - 2013
Externally published: Yes
Event: 2013 IEEE International Conference on Bioinformatics and Biomedicine, IEEE BIBM 2013 - Shanghai, China
Duration: Dec 18 2013 → Dec 21 2013

Publication series

Name: Proceedings - 2013 IEEE International Conference on Bioinformatics and Biomedicine, IEEE BIBM 2013

Other

Other: 2013 IEEE International Conference on Bioinformatics and Biomedicine, IEEE BIBM 2013
Country/Territory: China
City: Shanghai
Period: 12/18/13 → 12/21/13

All Science Journal Classification (ASJC) codes

  • Biomedical Engineering
