Incremental Semi-Supervised Learning for Functional Analysis of Protein Sequences

Mali Halac, Bahrad Sokhansanj, William L. Trimble, Thomas Coard, Norman C. Sabin, Emrecan Ozdogan, Robi Polikar, Gail L. Rosen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Scopus citation

Abstract

Current approaches for the functional annotation of proteins rely on training a classifier based on a fixed reference database. As more genes are sequenced, the size of the reference database grows and classifiers are retrained with the old and some new data. Considering the ever-increasing volume of (meta-)genomic data, repeating this process is computationally expensive. An alternative is to update the classifier continuously based on a stream of data. Thus, in this study, we propose an incremental and semi-supervised learning approach to train a classifier for the functional analysis of protein sequences. Our method proves to have a low computational cost while maintaining high accuracy in predicting protein functions.
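
The abstract does not specify the features or the classifier used, so the following is only a minimal sketch of the general idea of updating a classifier from a stream that mixes labeled and unlabeled sequences. It assumes k-mer frequency profiles, a nearest-centroid rule with cosine similarity, and confidence-thresholded self-training; these choices, and every name in the block (`kmer_profile`, `IncrementalCentroidClassifier`, `threshold`), are hypothetical illustrations rather than the authors' method.

```python
import math
from collections import Counter, defaultdict


def kmer_profile(seq, k=2):
    """Return the normalized k-mer frequencies of a sequence as a dict."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: c / total for kmer, c in counts.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b.get(key, 0.0) for key, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


class IncrementalCentroidClassifier:
    """Hypothetical nearest-centroid classifier updated batch by batch."""

    def __init__(self, k=2, threshold=0.8):
        self.k = k
        self.threshold = threshold        # assumed pseudo-label confidence cutoff
        self.sums = defaultdict(Counter)  # label -> summed k-mer profile
        self.counts = Counter()           # label -> number of absorbed sequences

    def predict(self, seq):
        """Return (best_label, similarity); the label is None before any training."""
        profile = kmer_profile(seq, self.k)
        best_label, best_score = None, -1.0
        for label, summed in self.sums.items():
            score = cosine(profile, summed)  # cosine ignores the scale of the sum
            if score > best_score:
                best_label, best_score = label, score
        return best_label, best_score

    def partial_fit(self, batch):
        """Absorb one batch of (sequence, label or None) pairs.

        Unlabeled sequences are pseudo-labeled only when the current model
        is confident enough; otherwise they are skipped. Earlier batches are
        never revisited. Returns the number of pseudo-labeled sequences.
        """
        n_pseudo = 0
        for seq, label in batch:
            if label is None:
                guess, score = self.predict(seq)
                if guess is None or score < self.threshold:
                    continue
                label = guess
                n_pseudo += 1
            self.sums[label].update(kmer_profile(seq, self.k))
            self.counts[label] += 1
        return n_pseudo


# Toy stream: one labeled batch, then one unlabeled batch.
clf = IncrementalCentroidClassifier(k=2, threshold=0.8)
clf.partial_fit([("AAAAAA", "f1"), ("CCCCCC", "f2")])
n_pseudo = clf.partial_fit([("AAAAAC", None), ("GTGTGT", None)])
print(n_pseudo, dict(clf.counts))
```

Because each batch only adds to per-class running sums, the cost of an update depends on the batch size rather than on the size of the accumulated reference data, which is the property the abstract contrasts with full retraining.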

Original language: English (US)
Title of host publication: 2021 IEEE Symposium Series on Computational Intelligence, SSCI 2021 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781728190488
State: Published - 2021
Externally published: Yes
Event: 2021 IEEE Symposium Series on Computational Intelligence, SSCI 2021 - Orlando, United States
Duration: Dec 5 2021 - Dec 7 2021

Publication series

Name: 2021 IEEE Symposium Series on Computational Intelligence, SSCI 2021 - Proceedings

Conference

Conference: 2021 IEEE Symposium Series on Computational Intelligence, SSCI 2021
Country/Territory: United States
City: Orlando
Period: 12/5/21 - 12/7/21

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Computer Science Applications
  • Decision Sciences (miscellaneous)
  • Safety, Risk, Reliability and Quality
  • Control and Optimization
