Comparison of Affymetrix GeneChip expression measures
TL;DR: Background correction, one of the main steps in preprocessing, has the largest effect on performance: it appears to improve accuracy but, in general, worsens precision.
Abstract: Motivation: In the Affymetrix GeneChip system, preprocessing occurs before one obtains expression level measurements. Because the number of competing preprocessing methods was large and growing, we developed a benchmark to help users identify the best method for their application. A webtool was made available for developers to benchmark their procedures. At the time of writing, over 50 methods had been submitted.
Results: We benchmarked 31 probe set algorithms using a U95A dataset of spike in controls. Using this dataset, we found that background correction, one of the main steps in preprocessing, has the largest effect on performance. In particular, background correction appears to improve accuracy but, in general, worsen precision. The benchmark results put this balance in perspective. Furthermore, we have improved some of the original benchmark metrics to provide more detailed information regarding precision and accuracy. A handful of methods stand out as providing the best balance using spike-in data with the older U95A array, although different experiments on more current arrays may benchmark differently.
Availability: The affycomp package, now version 1.5.2, continues to be available as part of the Bioconductor project (http://www.bioconductor.org). The webtool continues to be available at http://affycomp.biostat.jhsph.edu
Contact: [email protected]
Supplementary information: Supplementary data are available at Bioinformatics online.
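The accuracy versus precision trade-off described in the Results can be illustrated with a small simulation. The sketch below is not the affycomp benchmark and does not use the U95A spike-in data: it is written in Python for illustration only, the function names (`simulate_intensities`, `expression`, `benchmark`) are hypothetical, and the noise model (additive background of about 100 units plus multiplicative signal noise) is an assumption. Accuracy is approximated by the slope of log2 expression against log2 nominal concentration (ideal value 1), and precision by the median standard deviation across replicates.

```python
import math
import random
import statistics


def simulate_intensities(concentrations, n_reps, background=100.0,
                         bg_sd=15.0, noise_sd=0.1, rng=None):
    """Simulate probe intensities: noisy signal plus noisy additive background.

    All parameter values are illustrative assumptions, not estimates from real arrays.
    """
    rng = rng or random.Random(0)
    data = {}
    for c in concentrations:
        reps = []
        for _ in range(n_reps):
            signal = c * math.exp(rng.gauss(0.0, noise_sd))  # multiplicative noise
            bg = rng.gauss(background, bg_sd)                # additive background
            reps.append(signal + bg)
        data[c] = reps
    return data


def expression(intensity, bg_estimate=0.0, floor=1.0):
    """log2 expression measure after subtracting a background estimate.

    Values at or below zero after subtraction are truncated to `floor`.
    """
    return math.log2(max(intensity - bg_estimate, floor))


def slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def benchmark(data, bg_estimate):
    """Return (accuracy slope, median replicate SD) for a given background estimate."""
    xs, ys, sds = [], [], []
    for c, reps in data.items():
        vals = [expression(v, bg_estimate) for v in reps]
        xs.extend([math.log2(c)] * len(vals))
        ys.extend(vals)
        sds.append(statistics.stdev(vals))
    return slope(xs, ys), statistics.median(sds)


# Nominal concentrations 1, 2, 4, ..., 1024 (arbitrary units), 20 replicates each.
data = simulate_intensities([2 ** k for k in range(11)], n_reps=20,
                            rng=random.Random(0))
slope_raw, sd_raw = benchmark(data, bg_estimate=0.0)    # no background correction
slope_bg, sd_bg = benchmark(data, bg_estimate=100.0)    # subtract assumed background
print(f"no correction:   slope={slope_raw:.2f}, median SD={sd_raw:.2f}")
print(f"with correction: slope={slope_bg:.2f}, median SD={sd_bg:.2f}")
```

Under these assumptions, subtracting the background moves the slope closer to 1 (better accuracy) while inflating the replicate SD at low concentrations (worse precision). Here the true mean background is supplied directly; real preprocessing methods must estimate it, which this sketch does not attempt.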
Citations
More to Hi-C than meets the eye
Myong-Hee Sung,Gordon L. Hager +1 more
TL;DR: It is shown that data generated using the Hi-C approach contain hidden features of interchromosomal DNA interactions, which are revealed through analysis with an integrated probabilistic model that corrects for multiple sources of bias in the data.
miR-190, CDK1, MCM10 and NDC80 predict the prognosis of the patients with lung cancer
Li-Wei Gao,Guo-Liang Wang +1 more
TL;DR: The prognostic mechanisms of LC were revealed by building a protein-protein interaction (PPI) network and identifying significant network modules, which might act in LC by targeting the DEGs.
Data-driven analysis and druggability assessment methods to accelerate the identification of novel cancer targets
TL;DR: This paper presents a bioinformatic approach for identifying novel cancer drug targets, using statistical analysis to ascertain quantitative changes in expression levels between protein-coding genes and co-expression networks to classify these genes into groups.
Meta-analysis of cancer gene-profiling data.
Xinan Yang,Xiao Sun +1 more
TL;DR: This chapter introduces the R implementation of OrderedList, a method that was specially proposed for cancer gene expression data meta-analysis, on real data sets to identify biomarkers for adenocarcinoma lung cancer.
Summary of contributions to GAW15 Group 16: processing/normalization of expression traits.
TL;DR: It is concluded that preprocessing statistical analyses may have an important impact on eQTL analyses and on the identification of cis‐/trans‐regulators and/or major biological pathways.