TY - JOUR
T1 - Multi-Constrained Joint Non-Negative Matrix Factorization with Application to Imaging Genomic Study of Lung Metastasis in Soft Tissue Sarcomas
AU - Deng, Jin
AU - Zeng, Weiming
AU - Kong, Wei
AU - Shi, Yuhu
AU - Mou, Xiaoyang
AU - Guo, Jian
N1 - Funding Information:
Manuscript received July 19, 2019; revised September 15, 2019 and October 15, 2019; accepted November 15, 2019. Date of publication November 21, 2019; date of current version June 18, 2020. This work was supported in part by the National Natural Science Foundation of China under Grant 31870979 and Grant 61906117, in part by the Natural Science Foundation of Shanghai under Grant 18ZR1417200, and in part by the Shanghai Sailing Program under Grant 19YF1419000. (Corresponding author: Weiming Zeng.) J. Deng, W. Kong, Y. Shi, and J. Guo are with the Laboratory of Digital Imaging and Intelligent Computing, Information Engineering College, Shanghai Maritime University.
Publisher Copyright:
© 1964-2012 IEEE.
PY - 2020/7
Y1 - 2020/7
AB - Objective: Studying pathogenic mechanisms at the genetic level with imaging genetics methods can effectively reveal the association between histopathology and genetics. However, effective and accurate tools for establishing association models from the macroscopic to the microscopic level are lacking. Methods: Multi-constrained joint non-negative matrix factorization (MCJNMF) was developed to integrate genomic data and image data simultaneously and identify common modules related to disease. The two types of data matrices were projected onto a common feature space, in which heterogeneous variables with large coefficients in the same projected direction form a common module. Meanwhile, the correlation between original data features was incorporated through regularization constraints to improve biological relevance. Sparsity and orthogonality constraints were imposed on the decomposition factors to minimize redundancy between different bases and to reduce algorithm complexity. Results: The algorithm was successfully applied to module identification for lung metastasis in soft tissue sarcomas (STSs) by integrating FDG-PET image features and DNA methylation data features. Multilevel analysis of the top extracted modules revealed that these modules were closely related to lung metastasis. In particular, several genes with diagnostic potential for lung metastasis can be discovered from high-scoring modules. Conclusion: This method can be applied to the accurate identification of patterns related to the pathogenic mechanisms of diseases, and it also has significant implications for discovering protein biomarkers. Significance: This method provides avenues for further studies that identify complex association patterns of diseases from different types of biological data.
UR - http://www.scopus.com/inward/record.url?scp=85086747721&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85086747721&partnerID=8YFLogxK
U2 - 10.1109/TBME.2019.2954989
DO - 10.1109/TBME.2019.2954989
M3 - Article
C2 - 31751222
AN - SCOPUS:85086747721
VL - 67
SP - 2110
EP - 2118
JO - IEEE Transactions on Biomedical Engineering
JF - IEEE Transactions on Biomedical Engineering
SN - 0018-9294
IS - 7
M1 - 8908811
ER -