Evaluation of microarray data normalization procedures using spike-in experiments
Patrik Rydén, Henrik Andersson, Mattias Landfors, Linda Näslund, Blanka Hartmanová, Laila Noppa, Anders Sjöstedt
TL;DR: Spike-in experiments are a powerful approach for evaluating microarray preprocessing procedures; the analyzed data are characterized by properties of the observed log-ratios and by the analysis' ability to detect differentially expressed genes.
Abstract: Recently, a large number of methods for the analysis of microarray data have been proposed, but there are few comparisons of their relative performance. By using so-called spike-in experiments, it is possible to characterize the analyzed data and thereby enable comparisons of different analysis methods. A spike-in experiment using eight in-house produced arrays was used to evaluate established and novel methods for filtration, background adjustment, scanning, channel adjustment, and censoring. The S-plus package EDMA, a stand-alone tool providing characterization of analyzed cDNA-microarray data obtained from spike-in experiments, was developed and used to evaluate 252 normalization methods. For all analyses, the sensitivities at low false positive rates were observed together with estimates of the overall bias and the standard deviation. In general, there was a trade-off between the ability of the analyses to identify differentially expressed genes (i.e. the analyses' sensitivities) and their ability to provide unbiased estimators of the desired ratios. Virtually all analyses underestimated the magnitude of the regulations; often less than 50% of the true regulations were observed. Moreover, the bias depended on the underlying mRNA concentration; low concentration resulted in high bias. Many of the analyses had relatively low sensitivities, but analyses that used either the constrained model (i.e. a procedure that combines data from several scans) or partial filtration (a novel method for treating data from so-called not-found spots) had, with few exceptions, high sensitivities. These methods gave considerably higher sensitivities than some commonly used analysis methods. The use of spike-in experiments is a powerful approach for evaluating microarray preprocessing procedures. Analyzed data are characterized by properties of the observed log-ratios and the analysis' ability to detect differentially expressed genes. If bias is not a major problem, we recommend the use of either the CM-procedure or partial filtration.
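The summary measures named in the abstract (overall bias, standard deviation, and sensitivity at a low false positive rate) can be illustrated with a minimal Python sketch. This is not the EDMA implementation, which is an S-plus package whose internals are not described here; the function names, the scoring convention (larger score means stronger evidence of regulation), and all numbers are hypothetical and chosen only for illustration.

```python
import math


def bias_and_sd(observed, true_log_ratio):
    """Return (bias, sd) for observed log-ratios of spikes with a known true log-ratio.

    bias is the mean observed log-ratio minus the true value;
    sd is the sample standard deviation of the observations.
    """
    n = len(observed)
    mean = sum(observed) / n
    bias = mean - true_log_ratio
    sd = math.sqrt(sum((x - mean) ** 2 for x in observed) / (n - 1))
    return bias, sd


def sensitivity_at_fpr(scores_de, scores_null, fpr):
    """Fraction of truly regulated spikes called at a given false positive rate.

    The cutoff is set on the unregulated (null) spikes so that at most
    int(fpr * number of nulls) of them score strictly above it.
    Requires 0 <= fpr < 1.
    """
    ranked_null = sorted(scores_null, reverse=True)
    allowed = int(fpr * len(ranked_null))  # false positives tolerated
    cutoff = ranked_null[allowed]          # only scores above this are called
    called = sum(1 for s in scores_de if s > cutoff)
    return called / len(scores_de)


# Hypothetical data: spikes with a true log2-ratio of 2 (four-fold regulation).
observed = [0.9, 1.1, 0.7, 0.8, 1.0]
bias, sd = bias_and_sd(observed, 2.0)

# Hypothetical test scores for regulated and unregulated spikes.
scores_de = [5.0, 3.2, 2.5, 1.0]
scores_null = [0.1, 0.5, 2.0, 0.3, 1.5, 0.2, 0.8, 0.4, 3.0, 0.6]
sens = sensitivity_at_fpr(scores_de, scores_null, 0.1)

print(f"bias={bias:.2f} sd={sd:.3f} sensitivity={sens:.2f}")
```

In this made-up example the mean observed log-ratio is less than half of the true value, which mirrors the kind of underestimation the abstract reports, while most regulated spikes are still detected. That is the trade-off between bias and sensitivity in miniature.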
Citations
A unique program for cell death in xylem fibers of Populus stem
Charleen L. Courtois-Moreau, Edouard Pesquet, Andreas Sjödin, Luis Muñiz, Benjamin Bollhöner, Minako Kaneda, Lacey Samuels, Stefan Jansson, Hannele Tuominen
TL;DR: High-resolution microarray analysis in the vascular tissues of Populus stem suggests the involvement of several previously uncharacterized transcription factors, ethylene, sphingolipids and light signaling as well as autophagy in the control of fiber cell death.
OnPLS integration of transcriptomic, proteomic and metabolomic data shows multi-level oxidative stress responses in the cambium of transgenic hipI- superoxide dismutase Populus plants
Vaibhav Srivastava, Ogonna Obudulu, Joakim Bygdell, Tommy Löfstedt, Patrik Rydén, Robert Nilsson, Maria Ahnlund, Annika I. Johansson, Pär Jonsson, Eva Freyhult, Johanna Qvarnström, Jan Karlsson, Michael Melzer, Thomas Moritz, Johan Trygg, Torgeir R. Hvidsten, Gunnar Wingsle
TL;DR: System responses to oxidative stress in Populus are presented by integrating data from analyses of the cambial region of wild-type controls and plants expressing high-isoelectric-point superoxide dismutase (hipI-SOD) transcripts in antisense orientation showing a higher production of superoxide.
Classification of microarrays; synergistic effects between normalization, gene selection and machine learning
TL;DR: There is a synergistic relationship between these methods and gene selection based on the T-test and the selection of a relatively high number of genes, and these methods benefit significantly from using normalized data, although it is hard to draw general conclusions about the relative performance of different normalization procedures.
Gene array identification of Ipf1/Pdx1 -/- regulated genes in pancreatic progenitor cells
Per Svensson, Cecilia Williams, Joakim Lundeberg, Patrik Rydén, Ingela Bergqvist, Helena Edlund
TL;DR: This microarray analysis has identified a number of candidate genes that are differentially expressed in Ipf1/Pdx1-/- pancreatic buds; several of them were known to be important for pancreatic progenitor cell proliferation and differentiation, whereas others have not previously been associated with pancreatic development.
Validation and characterization of DNA microarray gene expression data distribution and associated moments
TL;DR: The null hypotheses for goodness of fit for all considered univariate theoretical probability distributions (including the Normal distribution) are rejected for more than 50% of probe sets on the Affymetrix microarray platform at a 95% confidence level, suggesting that under the tested conditions a priori assumption of any of these distributions across all probe sets is not valid.
References
Bioconductor: open software development for computational biology and bioinformatics
Robert Gentleman, Vincent J. Carey, Douglas M. Bates, Benjamin M. Bolstad, Marcel Dettling, Sandrine Dudoit, Byron Ellis, Laurent Gautier, Yongchao Ge, Jeff Gentry, Kurt Hornik, Torsten Hothorn, Wolfgang Huber, Stefano Maria Iacus, Rafael A. Irizarry, Friedrich Leisch, Cheng Li, Martin Maechler, A. J. Rossini, Günther Sawitzki, Colin A. Smith, Gordon K. Smyth, Luke Tierney, Jean Yang, Jianhua Zhang
TL;DR: Details of the aims and methods of Bioconductor, the collaborative creation of extensible software for computational biology and bioinformatics, and current challenges are described.
Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation
TL;DR: This article proposes normalization methods that are based on robust local regression and account for intensity and spatial dependence in dye biases for different types of cDNA microarray experiments.
BioArray Software Environment (BASE): a platform for comprehensive management and analysis of microarray data
Lao H. Saal, Carl Troein, Johan Vallon-Christersson, Sofia Gruvberger, Åke Borg, Carsten Peterson
TL;DR: This work presents a web-based customizable bioinformatics solution called BioArray Software Environment (BASE) for the management and analysis of all areas of microarray experimentation.
Towards sound epistemological foundations of statistical methods for high-dimensional biology.
TL;DR: This work offers a framework in which the epistemological foundation of proposed statistical methods can be evaluated and is hopeful that it will help clarify the role of data consistency in the development of statistical techniques.
Empirical evaluation of data transformations and ranking statistics for microarray analysis
Li-Xuan Qin, Kathleen F. Kerr
TL;DR: Findings support the use of an intensity-based normalization procedure, indicate that local background subtraction can be detrimental for effectively detecting differential expression, and show that the choice of image analysis software can also substantially influence experimental conclusions.