TL;DR: This article proposes normalization methods that are based on robust local regression and account for intensity and spatial dependence in dye biases for different types of cDNA microarray experiments.
Abstract: There are many sources of systematic variation in cDNA microarray experiments which affect the measured gene expression levels (e.g. differences in labeling efficiency between the two fluorescent dyes). The term normalization refers to the process of removing such variation. A constant adjustment is often used to force the distribution of the intensity log ratios to have a median of zero for each slide. However, such global normalization approaches are not adequate in situations where dye biases can depend on spot overall intensity and/or spatial location within the array. This article proposes normalization methods that are based on robust local regression and account for intensity and spatial dependence in dye biases for different types of cDNA microarray experiments. The selection of appropriate controls for normalization is discussed and a novel set of controls (microarray sample pool, MSP) is introduced to aid in intensity-dependent normalization. Lastly, to allow for comparisons of expression levels across slides, a robust method based on maximum likelihood estimation is proposed to adjust for scale differences among slides.
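The contrast between a constant (global) adjustment and an intensity-dependent one can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the `span` parameter, and the plain tricube-weighted local linear fit are assumptions made here, and the fit omits the robustness iterations that a robust local regression would add.

```python
import math
from statistics import median


def ma_values(red, green):
    """Per-spot log2 ratio M and mean log2 intensity A."""
    m = [math.log2(r / g) for r, g in zip(red, green)]
    a = [0.5 * (math.log2(r) + math.log2(g)) for r, g in zip(red, green)]
    return m, a


def global_median_normalize(m):
    """Subtract a single constant so the log ratios have median zero."""
    centre = median(m)
    return [x - centre for x in m]


def local_linear_fit(a, m, span=0.4):
    """Tricube-weighted local linear fit of M on A at every spot.

    Assumption: a plain, non-robust fit; a robust version would
    iteratively down-weight spots with large residuals.
    """
    n = len(a)
    k = min(n, max(3, math.ceil(span * n)))  # neighbours per local fit
    fitted = []
    for a0 in a:
        # Bandwidth: distance to the k-th nearest spot (self included).
        h = sorted(abs(x - a0) for x in a)[k - 1] or 1e-12
        w = [(1.0 - min(abs(x - a0) / h, 1.0) ** 3) ** 3 for x in a]
        sw = sum(w)
        xbar = sum(wi * x for wi, x in zip(w, a)) / sw
        ybar = sum(wi * y for wi, y in zip(w, m)) / sw
        sxx = sum(wi * (x - xbar) ** 2 for wi, x in zip(w, a))
        sxy = sum(wi * (x - xbar) * (y - ybar) for wi, x, y in zip(w, a, m))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(ybar + slope * (a0 - xbar))
    return fitted


def intensity_dependent_normalize(m, a, span=0.4):
    """Subtract the fitted intensity-dependent trend from each log ratio."""
    trend = local_linear_fit(a, m, span)
    return [mi - ti for mi, ti in zip(m, trend)]
```

When the dye bias grows with spot intensity, the global adjustment centres the log ratios but leaves the trend in place, whereas the local fit removes it. Spatial dependence could be handled in the same spirit by fitting separately within each print-tip group, which this sketch does not attempt.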
TL;DR: This article demonstrates that ANOVA methods can be used to normalize microarray data and provide estimates of changes in gene expression that are corrected for potential confounding effects, establishing a framework for the general analysis and interpretation of microarray data.
Abstract: Spotted cDNA microarrays are emerging as a powerful and cost-effective tool for large-scale analysis of gene expression. Microarrays can be used to measure the relative quantities of specific mRNAs in two or more tissue samples for thousands of genes simultaneously. While the power of this technology has been recognized, many open questions remain about appropriate analysis of microarray data. One question is how to make valid estimates of the relative expression for genes that are not biased by ancillary sources of variation. Recognizing that there is inherent "noise" in microarray data, how does one estimate the error variation associated with an estimated change in expression, i.e., how does one construct the error bars? We demonstrate that ANOVA methods can be used to normalize microarray data and provide estimates of changes in gene expression that are corrected for potential confounding effects. This approach establishes a framework for the general analysis and interpretation of microarray data.
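How a balanced design lets simple averages correct for confounding effects can be sketched as follows. This is an illustration under stated assumptions, not the model from the paper: it assumes a two-array dye-swap layout and an additive model on the log scale with array, dye, variety, gene, and spot effects, none of which the abstract specifies. The function name and record layout are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def variety_gene_contrasts(records, variety_a, variety_b):
    """Estimate gene-specific differences between two varieties.

    records: iterable of (array, dye, variety, gene, log_intensity).
    Assumes a balanced design (e.g. a dye swap), so that array, dye and
    spot effects average out of each variety-by-gene mean.
    """
    cells = defaultdict(list)
    for _array, _dye, variety, gene, y in records:
        cells[(variety, gene)].append(y)
    genes = sorted({gene for _variety, gene in cells})
    # Per-gene difference of variety means; array and dye effects cancel.
    raw = {g: mean(cells[(variety_a, g)]) - mean(cells[(variety_b, g)])
           for g in genes}
    # Removing the average difference across genes plays the role of
    # normalization: it strips the overall variety effect.
    overall = mean(list(raw.values()))
    return {g: raw[g] - overall for g in genes}
```

In a dye swap each variety is measured once with each dye and once on each array, so the per-gene estimate is free of dye and array bias, which a log ratio from a single array is not. Error bars would additionally require replication and a residual variance estimate, which this sketch omits.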
TL;DR: A theoretical model is derived that explains certain biases observed in the two-color microarray hybridization experiments reported in the literature and is used to validate the microarray methodology by determining the differential expression of four select Arabidopsis genes and two human genes as a function of the amount of target arrayed.
Abstract: We derived a theoretical model that explains certain biases observed in the two-color microarray hybridization experiments reported in the literature. We show that true competition is achieved only when the hybridization kinetics of the two differentially labeled probes are the same. If the hybridization kinetics of the two differentially labeled probes are different, which can occur when the labeling and hybridization conditions for the two probes are dissimilar, then the observed differential expression becomes a function of the amount of the target (i.e., DNA spotted on the slide). We use this model to validate the microarray methodology by determining the differential expression of four select Arabidopsis genes and two human genes (beta-actin and GAPDH) as a function of the amount of target arrayed. We show through both modeling and experiments that the rate constants for Cy5- and Cy3-labeled probes are the same under our experimental conditions. Therefore, the target concentrations need not greatly exceed the probe concentration. It is obvious from the data presented that a simple treatment of an individual hybridization rate calculation does not fully describe what is occurring in today's complex, multispecies experiments. The method of validation is easily implemented to ensure data reliability in two-color microarray experiments.
TL;DR: EDGE3 is an open-source, web-based application that allows for the storage, analysis, and controlled sharing of transcription-based microarray data generated on the Agilent DNA platform and provides a means for managing RNA samples and arrays during the hybridization process.
Abstract: Background
The ability to generate transcriptional data on the scale of entire genomes has been a boon both in the improvement of biological understanding and in the amount of data generated. The volume of data, however, has implications for effective storage, analysis, and sharing. A number of software tools have been developed to store, analyze, and share microarray data. However, a majority of these tools do not offer all of these features, nor do they specifically target the commonly used two-color Agilent DNA microarray platform. Thus, the motivating factor for the development of EDGE3 was to incorporate the storage, analysis, and sharing of microarray data in a manner that would provide a means for research groups to collaborate on Agilent-based microarray experiments without a large investment in software-related expenditures or extensive training of end-users.
TL;DR: This article proposes two criteria to address the robustness of microarray designs against missing observations and demonstrates the simultaneous use of efficiency and robustness criteria to select good microarray designs for both one-factor and multi-factor experiments.
Abstract: The main goal of microarray experiments is to select a small subset of genes that are differentially expressed among competing mRNA samples. For a given set of such mRNA samples, it is possible to consider a number of two-color cDNA microarray designs with a fixed number of arrays. Appropriate criteria can be used to select an efficient design from such a set of alternative experimental designs. In practice, however, microarray expression data often contain missing observations and the most efficient design (with complete observations) for a specific setup may not be efficient in the presence of missing observations. In this article, we propose two criteria to address the robustness of microarray designs against missing observations. We demonstrate the simultaneous use of efficiency and robustness criteria to select good microarray designs for both one-factor and multi-factor experiments.
Contact: mlatif@isrt.ac.bd
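One simple way to think about robustness against missing observations is to ask whether every comparison between mRNA samples stays estimable when any single array is lost. The sketch below checks this through graph connectivity, treating samples as nodes and two-color arrays as edges. It is an illustrative proxy only: the abstract does not state the two proposed criteria, and the function names and example designs are assumptions made here.

```python
def is_connected(nodes, edges):
    """True if every node is reachable from every other node via edges."""
    nodes = list(nodes)
    if not nodes:
        return True
    adjacent = {n: set() for n in nodes}
    for u, v in edges:
        adjacent[u].add(v)
        adjacent[v].add(u)
    seen = {nodes[0]}
    stack = [nodes[0]]
    while stack:
        for neighbour in adjacent[stack.pop()]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return len(seen) == len(nodes)


def survives_any_single_loss(samples, arrays):
    """Check that the design stays connected after losing any one array.

    arrays: list of (sample_a, sample_b) pairs, one per two-color array.
    """
    return all(
        is_connected(samples, arrays[:i] + arrays[i + 1:])
        for i in range(len(arrays))
    )


# Two hypothetical four-array designs for samples A-D.
loop_design = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]
reference_design = [("Ref", "A"), ("Ref", "B"), ("Ref", "C"), ("Ref", "D")]

loop_ok = survives_any_single_loss(["A", "B", "C", "D"], loop_design)
reference_ok = survives_any_single_loss(
    ["Ref", "A", "B", "C", "D"], reference_design
)
```

Under this proxy the loop design tolerates the loss of any one array, while the reference design does not, because each sample is tied to the reference by a single array. Connectivity says nothing about how much precision is lost, so an efficiency measure would still be needed alongside it.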