Journal Article, DOI: 10.1016/j.ymeth.2024.09.010
AtML: An Arabidopsis thaliana root cell identity recognition tool for medicinal ingredient accumulation
Shicong Yu, Lijia Liu, Hao Wang, Yan Shen, Shuqin Zheng, Jing Ning, Ruxian Luo, Xiangzheng Fu, Xiaoshu Deng +8 more
TL;DR: Researchers developed AtML, a machine learning tool that identifies Arabidopsis root cell stages and biomarkers. On an external dataset it achieved 96.50% accuracy and 96.51% recall, and its interpretability analysis identified 160 important marker genes that support cell type annotation in studies of medicinal compound accumulation.
Abstract: Arabidopsis thaliana synthesizes various medicinal compounds, and serves as a model plant for medicinal plant research. Single-cell transcriptomics technologies are essential for understanding the developmental trajectory of plant roots, facilitating the analysis of synthesis and accumulation patterns of medicinal compounds in different cell subpopulations. Although methods for interpreting single-cell transcriptomics data are rapidly advancing in Arabidopsis, challenges remain in precisely annotating cell identity due to the lack of marker genes for certain cell types. In this work, we trained a machine learning system, AtML, using sequencing datasets from six cell subpopulations, comprising a total of 6000 cells, to predict Arabidopsis root cell stages and identify biomarkers through complete model interpretability. Performance testing using an external dataset revealed that AtML achieved 96.50% accuracy and 96.51% recall. Through the interpretability provided by AtML, our model identified 160 important marker genes, contributing to the understanding of cell type annotations. In conclusion, we trained AtML to efficiently identify Arabidopsis root cell stages, providing a new tool for elucidating the mechanisms of medicinal compound accumulation in Arabidopsis roots.
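The abstract reports accuracy and recall for a multi-class cell-identity classifier but does not say how recall is averaged across classes. The following is a minimal sketch of how the two metrics could be computed, assuming macro-averaged recall (the unweighted mean of per-class recall). The function names, class labels, and toy predictions are hypothetical and are not taken from the paper or from AtML.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of cells whose predicted label equals the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_recall(y_true, y_pred):
    """Unweighted mean of per-class recall (an assumed averaging scheme)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    support = Counter(y_true)  # number of true cells per class
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)  # correct per class
    return sum(hits[c] / support[c] for c in support) / len(support)


# Hypothetical toy labels for three cell classes (not data from the paper).
y_true = ["class_a"] * 4 + ["class_b"] * 4 + ["class_c"] * 2
y_pred = [
    "class_a", "class_a", "class_a", "class_b",  # 3 of 4 class_a recovered
    "class_b", "class_b", "class_b", "class_b",  # 4 of 4 class_b recovered
    "class_c", "class_a",                        # 1 of 2 class_c recovered
]

print(accuracy(y_true, y_pred))      # 8 correct out of 10 cells
print(macro_recall(y_true, y_pred))  # mean of 0.75, 1.0 and 0.5
```

With balanced classes the two metrics tend to be close, which would be consistent with the near-identical figures reported in the abstract, though the paper's exact evaluation protocol is not given here.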