Evaluation of microarray data normalization procedures using spike-in experiments

doi:10.1186/1471-2105-7-300

•Open Access•Journal Article•10.1186/1471-2105-7-300

Patrik Rydén, +7 more

- 14 Jun 2006

- BMC Bioinformatics

- Vol. 7, Iss: 1, pp 300-300

24

TL;DR: Spike-in experiments are a powerful approach for evaluating microarray preprocessing procedures; the analyzed data are characterized by properties of the observed log-ratios and by the analysis' ability to detect differentially expressed genes.

Abstract: Recently, a large number of methods for the analysis of microarray data have been proposed, but there are few comparisons of their relative performances. By using so-called spike-in experiments, it is possible to characterize the analyzed data and thereby enable comparisons of different analysis methods. A spike-in experiment using eight in-house produced arrays was used to evaluate established and novel methods for filtration, background adjustment, scanning, channel adjustment, and censoring. The S-plus package EDMA, a stand-alone tool providing characterization of analyzed cDNA-microarray data obtained from spike-in experiments, was developed and used to evaluate 252 normalization methods. For all analyses, the sensitivities at low false positive rates were observed together with estimates of the overall bias and the standard deviation. In general, there was a trade-off between the ability of the analyses to identify differentially expressed genes (i.e. the analyses' sensitivities) and their ability to provide unbiased estimators of the desired ratios. Virtually all analyses underestimated the magnitude of the regulations; often less than 50% of the true regulations were observed. Moreover, the bias depended on the underlying mRNA concentration; low concentration resulted in high bias. Many of the analyses had relatively low sensitivities, but analyses that used either the constrained model (i.e. a procedure that combines data from several scans) or partial filtration (a novel method for treating data from so-called not-found spots) had, with few exceptions, high sensitivities. These methods gave considerably higher sensitivities than some commonly used analysis methods. The use of spike-in experiments is a powerful approach for evaluating microarray preprocessing procedures. Analyzed data are characterized by properties of the observed log-ratios and the analysis' ability to detect differentially expressed genes. If bias is not a major problem, we recommend the use of either the CM-procedure or partial filtration.
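
The quantities named in the abstract (sensitivity at a low false positive rate, overall bias, and standard deviation) can be illustrated with a minimal Python sketch. This is not the EDMA package or the authors' procedure. The function name `characterize`, the calling of genes by the absolute observed log-ratio, and the exact definitions of bias and spread are all assumptions chosen for illustration, with a spike's true log-ratio taken to be zero when it is not differentially expressed.

```python
from statistics import mean, stdev


def characterize(observed, true, fpr=0.05):
    """Summarize observed log-ratios against known spike-in log-ratios.

    Hypothetical helper, not part of EDMA. `observed` and `true` are
    equal-length sequences of log-ratios; true == 0 marks a spike that
    is not differentially expressed (DE).
    """
    de = [(o, t) for o, t in zip(observed, true) if t != 0]
    null = [o for o, t in zip(observed, true) if t == 0]

    # Pick a cutoff on |log-ratio| that lets through at most
    # floor(fpr * number of non-DE spikes) false positives.
    null_abs = sorted((abs(o) for o in null), reverse=True)
    allowed = int(fpr * len(null_abs))
    cutoff = null_abs[allowed] if allowed < len(null_abs) else 0.0

    # Sensitivity: share of truly DE spikes called at that cutoff.
    sensitivity = sum(abs(o) > cutoff for o, _ in de) / len(de)

    # Signed error in the direction of the true regulation:
    # negative values mean the regulation is underestimated.
    errors = [(o - t) * (1 if t > 0 else -1) for o, t in de]

    return {
        "sensitivity": sensitivity,
        "bias": mean(errors),
        "sd": stdev(errors),  # needs at least two DE spikes
        # Assumed measure: average share of the true log-ratio observed.
        "fraction_observed": mean(o / t for o, t in de),
    }


if __name__ == "__main__":
    # Made-up toy values for illustration, not data from the paper.
    observed = [1.0, 0.8, -1.2, 0.1, -0.05, 0.2, 0.0, -0.1]
    true = [2.0, 2.0, -2.0, 0, 0, 0, 0, 0]
    print(characterize(observed, true, fpr=0.2))
```

In this toy input every differentially expressed spike is detected even though only about half of each true log-ratio is observed, which mirrors the point in the abstract that sensitivity and bias are separate properties of an analysis.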

Citations

•Journal Article•10.1111/J.1365-313X.2008.03777.X

A unique program for cell death in xylem fibers of Populus stem

Charleen L. Courtois-Moreau, +8 more

- 01 Apr 2009

- Plant Journal

TL;DR: High-resolution microarray analysis in the vascular tissues of Populus stem suggests the involvement of several previously uncharacterized transcription factors, ethylene, sphingolipids and light signaling as well as autophagy in the control of fiber cell death.

169

•Journal Article•10.1186/1471-2164-14-893

OnPLS integration of transcriptomic, proteomic and metabolomic data shows multi-level oxidative stress responses in the cambium of transgenic hipI- superoxide dismutase Populus plants

Vaibhav Srivastava, +20 more

- 17 Dec 2013

- BMC Genomics

TL;DR: System responses to oxidative stress in Populus are presented by integrating data from analyses of the cambial region of wild-type controls and plants expressing high-isoelectric-point superoxide dismutase (hipI-SOD) transcripts in antisense orientation showing a higher production of superoxide.

75

•Journal Article•10.1186/1471-2105-12-390

Classification of microarrays; synergistic effects between normalization, gene selection and machine learning

Jenny Önskog, +5 more

- 07 Oct 2011

- BMC Bioinformatics

TL;DR: There is a synergistic relationship between these methods and gene selection based on the T-test and the selection of a relatively high number of genes, and these methods benefit significantly from using normalized data, although it is hard to draw general conclusions about the relative performance of different normalization procedures.

39

•Journal Article•10.1186/1471-213X-7-129

Gene array identification of Ipf1/Pdx1 -/- regulated genes in pancreatic progenitor cells

Per Svensson, +6 more

- 23 Nov 2007

- BMC Developmental Biology

TL;DR: This microarray analysis has identified a number of candidate genes that are differentially expressed in Ipf1/Pdx1-/- pancreatic buds; some were already known to be important for pancreatic progenitor cell proliferation and differentiation, whereas others have not previously been associated with pancreatic development.

32

•Journal Article•10.1186/1471-2105-11-576

Validation and characterization of DNA microarray gene expression data distribution and associated moments

Reuben Thomas, +3 more

- 24 Nov 2010

- BMC Bioinformatics

TL;DR: The null hypotheses for goodness of fit for all considered univariate theoretical probability distributions (including the Normal distribution) are rejected for more than 50% of probe sets on the Affymetrix microarray platform at a 95% confidence level, suggesting that under the tested conditions a priori assumption of any of these distributions across all probe sets is not valid.

21

References

•Journal Article•10.1186/GB-2004-5-10-R80

Bioconductor: open software development for computational biology and bioinformatics

Robert Gentleman, +24 more

- 15 Sep 2004

- Genome Biology

TL;DR: Details of the aims and methods of Bioconductor, the collaborative creation of extensible software for computational biology and bioinformatics, and current challenges are described.

13.2K

•Journal Article•10.1093/NAR/30.4.E15

Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation

Yee Hwa Yang, +6 more

- 15 Feb 2002

- Nucleic Acids Research

TL;DR: This article proposes normalization methods that are based on robust local regression and account for intensity and spatial dependence in dye biases for different types of cDNA microarray experiments.

3.8K

•Journal Article•10.1186/GB-2002-3-8-SOFTWARE0003

BioArray Software Environment (BASE): a platform for comprehensive management and analysis of microarray data

Lao H. Saal, +6 more

- 15 Jul 2002

- Genome Biology

TL;DR: This work presents a web-based customizable bioinformatics solution called BioArray Software Environment (BASE) for the management and analysis of all areas of microarray experimentation.

523

•Journal Article•10.1038/NG1422

Towards sound epistemological foundations of statistical methods for high-dimensional biology.

Tapan Mehta, +2 more

- 01 Sep 2004

- Nature Genetics

TL;DR: This work offers a framework in which the epistemological foundation of proposed statistical methods can be evaluated and is hopeful that it will help clarify the role of data consistency in the development of statistical techniques.

126

•Journal Article•10.1093/NAR/GKH866

Empirical evaluation of data transformations and ranking statistics for microarray analysis

Li-Xuan Qin, +1 more

- 01 Jan 2004

- Nucleic Acids Research

TL;DR: Findings support the use of an intensity-based normalization procedure and indicate that local background subtraction can be detrimental for effectively detecting differential expression, and find that choice of image analysis software can also substantially influence experimental conclusions.

87