TY - JOUR
T1 - Integrating human omics data to prioritize candidate genes
AU - Chen, Yong
AU - Wu, Xuebing
AU - Jiang, Rui
N1 - Funding Information: The authors would like to thank Prof. Fengzhu Sun and Prof. Xuegong Zhang for critical reading of this manuscript and useful suggestions. This research was partially supported by the National Basic Research Program of China (2012CB316504), the National High Technology Research and Development Program of China (2012AA020401), the National Natural Science Foundation of China (61175002, 60805010, and 61273228), and the Open Research Fund of State Key Laboratory of Bioelectronics, Southeast University.
PY - 2013/12/18
Y1 - 2013/12/18
AB - Background: The identification of genes involved in human complex diseases remains a great challenge in computational systems biology. Although methods have been developed to use disease phenotypic similarities with a protein-protein interaction network for the prioritization of candidate genes, other valuable omics data sources have been largely overlooked in these methods. Methods: With this understanding, we proposed a method called BRIDGE to prioritize candidate genes by integrating disease phenotypic similarities with omics data such as protein-protein interactions, gene sequence similarities, gene expression patterns, gene ontology annotations, and gene pathway memberships. BRIDGE uses a multiple regression model with a lasso penalty to automatically weight different data sources and is capable of discovering genes associated with diseases whose genetic bases are completely unknown. Results: We conducted large-scale cross-validation experiments and demonstrated that more than 60% of known disease genes can be ranked first by BRIDGE in simulated linkage intervals, suggesting the superior performance of this method. We further performed two comprehensive case studies by applying BRIDGE to predict novel genes and transcriptional networks involved in obesity and type II diabetes. Conclusion: The proposed method provides an effective and scalable way of integrating multi-omics data to infer disease genes. Further applications of BRIDGE will help to identify novel disease genes and the underlying mechanisms of human diseases.
UR - http://www.scopus.com/inward/record.url?scp=84890419221&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84890419221&partnerID=8YFLogxK
U2 - 10.1186/1755-8794-6-57
DO - 10.1186/1755-8794-6-57
M3 - Article
C2 - 24344781
AN - SCOPUS:84890419221
SN - 1755-8794
VL - 6
JO - BMC Medical Genomics
JF - BMC Medical Genomics
IS - 1
M1 - 57
ER -