Integrating Deep Supervised, Self-Supervised and Unsupervised Learning for Single-Cell RNA-seq Clustering and Annotation
TL;DR: This paper proposes a novel end-to-end cell supervised clustering and annotation framework called scAnCluster, which fully utilizes the cell type labels available from reference data to facilitate cell clustering and annotation on the unlabeled target data.
Abstract: As single-cell RNA sequencing technologies mature, massive gene expression profiles can be obtained. Consequently, cell clustering and annotation have become two crucial and fundamental procedures that affect downstream analyses. Most existing single-cell RNA-seq (scRNA-seq) clustering algorithms do not take into account the cell annotation results available for the same tissues or organisms from other laboratories, even though such data could assist and guide the clustering process on the target dataset. Manually annotating large numbers of cells by identifying marker genes through differential expression analysis is also costly in labor and resources. Therefore, in this paper, we propose a novel end-to-end cell supervised clustering and annotation framework called scAnCluster, which fully utilizes the cell type labels available from reference data to facilitate cell clustering and annotation on the unlabeled target data. Our algorithm integrates deep supervised learning, self-supervised learning and unsupervised learning techniques, and it outperforms other customized scRNA-seq supervised clustering methods on both simulated and real data. Notably, our method performs well on the challenging task of discovering novel cell types that are absent from the reference data.
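To make the idea of combining three learning signals concrete, here is a minimal NumPy sketch of a joint objective. It is not the scAnCluster implementation: the specific loss forms (cross-entropy on reference labels, a pairwise-similarity loss with pseudo-labels on target cells, and prediction-entropy minimization), the function names, the thresholds `upper` and `lower`, and the weights `w_self` and `w_unsup` are all assumptions chosen for illustration.

```python
import numpy as np

EPS = 1e-12  # numerical floor to keep logarithms finite


def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def supervised_loss(probs_ref, labels_ref):
    """Cross-entropy on labeled reference cells."""
    idx = np.arange(len(labels_ref))
    return -np.mean(np.log(probs_ref[idx, labels_ref] + EPS))


def self_supervised_loss(probs_tgt, embed_tgt, upper=0.95, lower=0.45):
    """Pairwise loss on target cells using pseudo-labels (assumed form).

    Pairs with high cosine similarity are treated as 'same type',
    pairs with low similarity as 'different type'; pairs in between
    are ignored.
    """
    norms = np.linalg.norm(embed_tgt, axis=1, keepdims=True)
    unit = embed_tgt / (norms + EPS)
    sim = unit @ unit.T
    off_diag = ~np.eye(len(embed_tgt), dtype=bool)
    pos = (sim > upper) & off_diag
    neg = (sim < lower) & off_diag
    n_pairs = pos.sum() + neg.sum()
    if n_pairs == 0:
        return 0.0
    # Probability that two cells receive the same predicted label.
    agree = np.clip(probs_tgt @ probs_tgt.T, EPS, 1.0 - EPS)
    total = -(np.log(agree[pos]).sum() + np.log(1.0 - agree[neg]).sum())
    return total / n_pairs


def unsupervised_loss(probs_tgt):
    """Mean prediction entropy on target cells (lower means more confident)."""
    return -np.mean(np.sum(probs_tgt * np.log(probs_tgt + EPS), axis=1))


def combined_loss(logits_ref, labels_ref, logits_tgt, embed_tgt,
                  w_self=1.0, w_unsup=1.0):
    """Weighted sum of the three terms; the weights are hypothetical."""
    probs_ref = softmax(logits_ref)
    probs_tgt = softmax(logits_tgt)
    return (supervised_loss(probs_ref, labels_ref)
            + w_self * self_supervised_loss(probs_tgt, embed_tgt)
            + w_unsup * unsupervised_loss(probs_tgt))


# Toy example: 2 labeled reference cells, 3 unlabeled target cells.
labels_ref = np.array([0, 1])
embed_tgt = np.array([[1.0, 0.0], [1.0, 0.01], [0.0, 1.0]])
confident = combined_loss(np.array([[5.0, 0.0], [0.0, 5.0]]), labels_ref,
                          np.array([[4.0, 0.0], [4.0, 0.0], [0.0, 4.0]]),
                          embed_tgt)
uninformed = combined_loss(np.zeros((2, 2)), labels_ref,
                           np.zeros((3, 2)), embed_tgt)
print(confident, uninformed)
```

In the toy example, predictions that match the reference labels and respect the similarity structure of the target cells yield a lower combined loss than uniform predictions. In a real model the logits and embeddings would come from a trained network rather than hand-written arrays.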
Citations
Machine Learning and Big Data Provide Crucial Insight for Future Biomaterials Discovery and Research.
TL;DR: Machine learning has been widely adopted in a variety of fields, including engineering, science, and medicine, revolutionizing how data is collected, used, and stored. This has led to a drastic increase in the number of computational models for the prediction of various numerical, categorical, or association events given input variables.
58
Deep learning applications in single-cell genomics and transcriptomics data analysis.
Nafiseh Erfanian, A. Ali Heydari, Pablo Ianez, Afshin Derakhshani, Mohammad GhasemiGol, Mohsen Farahpour, Seyyed Mohammad Razavi, Saeed Nasseri, Hossein Safarpour, Amirhossein Sahebkar +9 more
TL;DR: In this article, the authors examine DL applications in genomics, transcriptomics, spatial transcriptomics and multi-omics integration, and address whether DL techniques will prove to be advantageous or whether the single-cell omics domain poses unique challenges.
47
Single-cell RNA-seq data semi-supervised clustering and annotation via structural regularized domain adaptation.
TL;DR: A flexible single-cell semi-supervised clustering and annotation framework, scSemiCluster, which integrates the reference data and target data and incorporates pairwise constraints in the feature learning process such that cells belonging to the same cluster are close to each other, and cells belonging to different clusters are far from each other, in the latent space.
43
OUP accepted manuscript
06 Jan 2022
TL;DR: The authors propose a novel scRNA-seq clustering algorithm called scNAME, which incorporates a mask estimation task for gene pertinence mining and a neighborhood contrastive learning framework for cell intrinsic structure exploitation.
41
Application of Deep Learning on Single-cell RNA Sequencing Data Analysis: A Review
TL;DR: Deep learning has also emerged as a promising tool for scRNA-seq data analysis, as it can extract informative and compact features from noisy, heterogeneous, and high-dimensional scRNA-seq data to improve downstream analysis.
References
•Posted Content
UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction
Leland McInnes, John Healy +1 more
TL;DR: The UMAP algorithm is competitive with t-SNE for visualization quality, and arguably preserves more of the global structure with superior run time performance.
9.9K
UMAP: Uniform Manifold Approximation and Projection
Leland McInnes, John Healy, Nathaniel Saul, Lukas Großberger +3 more
02 Sep 2018
TL;DR: Uniform Manifold Approximation and Projection (UMAP) is a dimension reduction technique that can be used for visualisation similarly to t-SNE, but also for general non-linear dimension reduction.
Highly Parallel Genome-wide Expression Profiling of Individual Cells Using Nanoliter Droplets
Evan Z. Macosko, Anindita Basu, Rahul Satija, James Nemesh, Karthik Shekhar, Melissa Goldman, Itay Tirosh, Allison R. Bialas, Nolan Kamitaki, Emily M. Martersteck, John J. Trombetta, David A. Weitz, Joshua R. Sanes, Alex K. Shalek, Aviv Regev, Steven A. McCarroll +26 more
TL;DR: Drop-seq will accelerate biological discovery by enabling routine transcriptional profiling at single-cell resolution. It separates cells into nanoliter-sized aqueous droplets, associates a different barcode with each cell's RNAs, and sequences them all together.
7.3K
SCANPY: large-scale single-cell gene expression data analysis
TL;DR: This work presents Scanpy, a scalable toolkit for analyzing single-cell gene expression data that includes methods for preprocessing, visualization, clustering, pseudotime and trajectory inference, differential expression testing, and simulation of gene regulatory networks, and AnnData, a generic class for handling annotated data matrices.
Fast, sensitive and accurate integration of single-cell data with Harmony.
Ilya Korsunsky, Nghia Millard, Jean Fan, Kamil Slowikowski, Fan Zhang, Kevin Wei, Yuriy Baglaenko, Michael B. Brenner, Po-Ru Loh, Soumya Raychaudhuri +11 more
TL;DR: Harmony, for the integration of single-cell transcriptomic data, identifies broad and fine-grained populations, scales to large datasets, and can integrate sequencing- and imaging-based data.