Fast, sensitive and accurate integration of single-cell data with Harmony.
Ilya Korsunsky, Nghia Millard, Jean Fan, Kamil Slowikowski, Fan Zhang, Kevin Wei, Yuriy Baglaenko, Michael B. Brenner, Po-Ru Loh, Soumya Raychaudhuri
TL;DR: Harmony, for the integration of single-cell transcriptomic data, identifies broad and fine-grained populations, scales to large datasets, and can integrate sequencing- and imaging-based data.
Abstract: The emerging diversity of single-cell RNA-seq datasets allows for the full transcriptional characterization of cell types across a wide variety of biological and clinical conditions. However, it is challenging to analyze them together, particularly when datasets are assayed with different technologies, because biological and technical differences are interspersed. We present Harmony (https://github.com/immunogenomics/harmony), an algorithm that projects cells into a shared embedding in which cells group by cell type rather than dataset-specific conditions. Harmony simultaneously accounts for multiple experimental and biological factors. In six analyses, we demonstrate the superior performance of Harmony to previously published algorithms while requiring fewer computational resources. Harmony enables the integration of ~10^6 cells on a personal computer. We apply Harmony to peripheral blood mononuclear cells from datasets with large experimental differences, five studies of pancreatic islet cells, mouse embryogenesis datasets and the integration of scRNA-seq with spatial transcriptomics data.
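To make the idea of a shared embedding "in which cells group by cell type rather than dataset-specific conditions" concrete, the following is a deliberately simplified toy sketch, not the Harmony algorithm itself. It assumes cluster labels are already known and removes, within each cluster, the offset between each batch centroid and the cluster centroid. The function name `center_batches_within_clusters` and the simulated data are hypothetical and chosen only for illustration.

```python
import numpy as np


def center_batches_within_clusters(embedding, batch, cluster):
    """Toy correction (NOT Harmony): within each cluster, shift every batch
    so that its centroid coincides with the overall cluster centroid."""
    embedding = np.asarray(embedding, dtype=float)
    batch = np.asarray(batch)
    cluster = np.asarray(cluster)
    corrected = embedding.copy()
    for c in np.unique(cluster):
        in_cluster = cluster == c
        cluster_centroid = embedding[in_cluster].mean(axis=0)
        for b in np.unique(batch[in_cluster]):
            members = in_cluster & (batch == b)
            # Offset of this batch's centroid from the cluster centroid.
            offset = embedding[members].mean(axis=0) - cluster_centroid
            corrected[members] -= offset
    return corrected


# Simulated 2-D embedding: two cell types measured in two batches,
# where the second batch carries an artificial technical shift.
rng = np.random.default_rng(0)
cell_type = np.repeat([0, 1, 0, 1], 50)
batch = np.repeat([0, 0, 1, 1], 50)
centers = np.array([[0.0, 0.0], [5.0, 5.0]])[cell_type]
batch_shift = np.array([[0.0, 0.0], [2.0, -2.0]])[batch]
embedding = centers + batch_shift + rng.normal(scale=0.3, size=(200, 2))

corrected = center_batches_within_clusters(embedding, batch, cell_type)
```

After the correction, cells of the same type from both batches share a centroid, while the two cell types remain separated. A real integration method cannot assume the cluster labels up front, which is the part this sketch leaves out.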
Citations
Multi-omics data integration using ratio-based quantitative profiling with Quartet reference materials
Yuanting Zheng, Yaqing Liu, Jingcheng Yang, Dong Li, Rui Zhang, Sha Tian, Ying Yu, Luyao Ren, Wanwan Hou, Zhen-Hua Feng, Yuanbang Mai, Jinxiong Han, Lijun Zhang, Hui Jiang, Gang Li, Jingwei Lou, Ruiqiang Li, Jingchao Lin, Huafen Liu, Ziqing Kong, Depeng Wang, Fangping Dai, Ding Bao, Zehui Cao, Qiaochu Chen, Qingwang Chen, Xingdong Chen, Yuechen Gao, He Jiang, Bin Li, Bingying Li, Jingjing Li, Ruimei Liu, Tao Qing, Erfei Shang, Jun Shang, Shanyue Sun, Haiyan Wang, Xiaolin Wang, Naixin Zhang, Peipei Zhang, Ruolan Zhang, Sibo Zhu, Andreas Scherer, Jiucun Wang, Jing Wang, Yinbo Huo, Gang Liu, Chengming Cao, Shao Li, Joshua Xu, Huixiao Hong, Wenming Xiao, Xiaozhen Liang, Daru Lu, Li Jin, Weida Tong, Ding Chen, Jinming Li, Xiang Fang, Leming Shi
TL;DR: Multi-omics data integration using ratio-based quantitative profiling with Quartet reference materials provides a solution to the problem of irreproducibility in multi-omics data by establishing a common reference point and enabling absolute quantification of features.
Cross-organ single-cell transcriptome profiling reveals macrophage and dendritic cell heterogeneity in zebrafish.
TL;DR: This study comprehensively profiles the heterogeneity of TRMs and DCs across adult zebrafish organs via single-cell RNA sequencing, and identifies two macrophage subsets: pro-inflammatory macrophages with potent phagocytosis signatures and pro-remodeling macrophages with tissue regeneration signatures in barrier tissues, liver, and heart.
Integrating Deep Supervised, Self-Supervised and Unsupervised Learning for Single-Cell RNA-seq Clustering and Annotation
TL;DR: This paper proposes a novel end-to-end supervised cell clustering and annotation framework called scAnCluster, which fully utilizes the cell type labels available from reference data to facilitate cell clustering and annotation on the unlabeled target data.
Integrating spatial transcriptomics data across different conditions, technologies, and developmental stages
TL;DR: This paper develops STAligner, a graph attention neural network for integrating and aligning spatial transcriptomics (ST) datasets, enabling spatially-aware data integration, simultaneous spatial domain identification, and downstream comparative analysis.
Histone lactylation couples cellular metabolism with developmental gene regulatory networks
Fjodor Merkuri, Megan Rothstein, Marcos Simoes-Costa
TL;DR: It is shown that histone lactylation couples metabolism and transcription during neural crest cell differentiation in the early embryo, an epigenetic mechanism that integrates cellular metabolism with the GRNs that orchestrate embryonic development.
References
STAR: ultrafast universal RNA-seq aligner
Alexander Dobin, Carrie A. Davis, Felix Schlesinger, Jorg Drenkow, Chris Zaleski, Sonali Jha, Philippe Batut, Mark Chaisson, Thomas R. Gingeras
TL;DR: The Spliced Transcripts Alignment to a Reference (STAR) software is based on a previously undescribed RNA-seq alignment algorithm that uses sequential maximum mappable seed search in uncompressed suffix arrays, followed by a seed clustering and stitching procedure. It outperforms other aligners by a factor of >50 in mapping speed.
Gene Ontology: tool for the unification of biology
M Ashburner, Catherine A. Ball, Judith A. Blake, David Botstein, Heather Butler, J. M. Cherry, Allan Peter Davis, Kara Dolinski, Selina S. Dwight, J.T. Eppig, Midori A. Harris, David P. Hill, Laurie Issel-Tarver, Andrew Kasarskis, Suzanna E. Lewis, John C. Matese, Joel E. Richardson, M. Ringwald, Gerald M. Rubin, Gavin Sherlock
TL;DR: The goal of the Gene Ontology Consortium is to produce a dynamic, controlled vocabulary that can be applied to all eukaryotes even as knowledge of gene and protein roles in cells is accumulating and changing.
limma powers differential expression analyses for RNA-sequencing and microarray studies
Matthew E. Ritchie, Belinda Phipson, Di Wu, Yifang Hu, Charity W. Law, Wei Shi, Gordon K. Smyth
TL;DR: The philosophy and design of the limma package is reviewed, summarizing both new and historical features, with an emphasis on recent enhancements and features that have not been previously described.
Fast unfolding of communities in large networks
Vincent D. Blondel, Jean-Loup Guillaume, Renaud Lambiotte, Etienne Lefebvre
TL;DR: This work proposes a heuristic method that is shown to outperform all other known community detection methods in terms of computation time, while the quality of the communities detected, as measured by modularity, is very good.
Integrating single-cell transcriptomic data across different conditions, technologies, and species.
TL;DR: An analytical strategy for integrating scRNA-seq data sets based on common sources of variation is introduced, enabling the identification of shared populations across data sets and downstream comparative analysis.
Related Papers (5)
Grace X.Y. Zheng, Jessica M. Terry, Phillip Belgrader, Paul Ryvkin, Zachary Bent, Ryan Wilson, Solongo B. Ziraldo, Tobias Daniel Wheeler, Geoffrey P. McDermott, Junjie Zhu, Mark T. Gregory, Joe Shuga, Luz Montesclaros, Jason G. Underwood, Donald A. Masquelier, Stefanie Y. Nishimura, Michael Schnall-Levin, Paul Wyatt, Christopher Hindson, Rajiv Bharadwaj, Alexander Wong, Kevin D. Ness, Lan Beppu, H. Joachim Deeg, Christopher McFarland, Keith R. Loeb, William J. Valente, Nolan G. Ericson, Emily A. Stevens, Jerald P. Radich, Tarjei S. Mikkelsen, Benjamin J. Hindson, Jason H. Bielas
Evan Z. Macosko, Anindita Basu, Rahul Satija, James Nemesh, Karthik Shekhar, Melissa Goldman, Itay Tirosh, Allison R. Bialas, Nolan Kamitaki, Emily M. Martersteck, John J. Trombetta, David A. Weitz, Joshua R. Sanes, Alex K. Shalek, Aviv Regev, Steven A. McCarroll