Abstract
Numerous single-cell transcriptomic datasets from identical tissues or cell lines are generated by different laboratories or with different single-cell RNA sequencing (scRNA-seq) protocols. Denoising these datasets to eliminate batch effects is crucial for data integration, ensuring accurate interpretation and comprehensive analysis of biological questions. Although many scRNA-seq data integration methods exist, most are inefficient and/or not conducive to downstream analysis. Here, DeepBID is introduced: a novel deep learning-based method that concurrently performs batch effect correction, non-linear dimensionality reduction, embedding, and cell clustering. DeepBID utilizes a negative binomial-based autoencoder with dual Kullback–Leibler divergence loss functions, aligning cells from different batches within a consistent low-dimensional latent space and progressively mitigating batch effects through iterative clustering. Extensive validation on multiple-batch scRNA-seq datasets demonstrates that DeepBID surpasses existing tools in removing batch effects and in clustering accuracy. When integrating multiple scRNA-seq datasets from patients with Alzheimer's disease, DeepBID significantly improves cell clustering, effectively annotates unidentified cells, and detects cell-specific differentially expressed genes.
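
The abstract names two loss ingredients, a negative binomial reconstruction term and Kullback–Leibler divergence terms tied to clustering, without giving their exact form. The sketch below illustrates one common formulation of each and is not taken from the DeepBID implementation. It assumes a mean/inverse-dispersion parameterization of the negative binomial and a DEC-style clustering loss (Student's t soft assignments with a sharpened target distribution). All function names are hypothetical, and how DeepBID combines its two KL terms is not shown here.

```python
import math


def nb_neg_log_likelihood(x, mu, theta):
    """Negative log-likelihood of a count x under a negative binomial
    with mean mu and inverse dispersion theta (assumed parameterization)."""
    log_p = (
        math.lgamma(x + theta) - math.lgamma(theta) - math.lgamma(x + 1)
        + theta * math.log(theta / (theta + mu))
        + x * math.log(mu / (theta + mu))
    )
    return -log_p


def soft_assignments(z, centroids):
    """Soft cluster assignment q_j of one latent point z, using a
    Student's t kernel with one degree of freedom (DEC-style assumption)."""
    weights = [
        1.0 / (1.0 + sum((a - b) ** 2 for a, b in zip(z, c)))
        for c in centroids
    ]
    total = sum(weights)
    return [w / total for w in weights]


def target_distribution(q):
    """Sharpened target p_ij proportional to q_ij^2 / f_j,
    where f_j = sum_i q_ij, normalized over clusters for each cell."""
    n_clusters = len(q[0])
    f = [sum(row[j] for row in q) for j in range(n_clusters)]
    p = []
    for row in q:
        w = [row[j] ** 2 / f[j] for j in range(n_clusters)]
        s = sum(w)
        p.append([v / s for v in w])
    return p


def kl_divergence(p, q):
    """KL(P || Q) summed over all cells and clusters."""
    return sum(
        pij * math.log(pij / qij)
        for prow, qrow in zip(p, q)
        for pij, qij in zip(prow, qrow)
        if pij > 0
    )


# Toy usage: three cells in a 2-D latent space, two cluster centroids.
latent = [[0.0, 0.1], [0.2, -0.1], [3.0, 3.1]]
centroids = [[0.0, 0.0], [3.0, 3.0]]
q = [soft_assignments(z, centroids) for z in latent]
p = target_distribution(q)
clustering_loss = kl_divergence(p, q)
reconstruction_loss = nb_neg_log_likelihood(x=4, mu=2.5, theta=3.0)
```

In a training loop of this kind, the reconstruction term would be summed over genes and cells, and the clustering term would be recomputed as the centroids and latent embedding are updated. That iteration is what "iterative clustering" usually refers to in DEC-style methods.
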
Original language | English (US) |
---|---|
Article number | 2308934 |
Journal | Advanced Science |
Volume | 11 |
Issue number | 29 |
State | Published - Aug 7 2024 |
All Science Journal Classification (ASJC) codes
- Medicine (miscellaneous)
- General Chemical Engineering
- General Materials Science
- Biochemistry, Genetics and Molecular Biology (miscellaneous)
- General Engineering
- General Physics and Astronomy