Top 9 papers published on the topic of Bioconductor in 2003

Open source software for the analysis of microarray data.

Sandrine Dudoit¹, Robert Gentleman², John Quackenbush³

University of California, Berkeley¹, Harvard University², George Washington University³

01 Mar 2003-BioTechniques

TL;DR: Three of the most widely used and comprehensive open source systems for microarray data analysis are reviewed: the statistical analysis tools written in R through the Bioconductor project, the Java-based TM4 software system available from The Institute for Genomic Research, and BASE, the Web-based system developed at Lund University.

Abstract: DNA microarray assays represent the first widely used application that attempts to build upon the information provided by genome projects in the study of biological questions. One of the greatest challenges with working with microarrays is collecting, managing, and analyzing data. Although several commercial and noncommercial solutions exist, there is a growing body of freely available, open source software that allows users to analyze data using a host of existing techniques and to develop their own and integrate them within the system. Here we review three of the most widely used and comprehensive systems, the statistical analysis tools written in R through the Bioconductor project (http://www.bioconductor.org), the Java-based TM4 software system available from The Institute for Genomic Research (http://www.tigr.org/software), and BASE, the Web-based system developed at Lund University (http://base.thep.lu.se).

313 citations

Journal Article • DOI: 10.1093/BIOINFORMATICS/19.1.155

An extensible application for assembling annotation for genomic data.

Jianhua Zhang¹, Vincent J. Carey¹, Robert Gentleman¹

Harvard University¹

01 Jan 2003-Bioinformatics

TL;DR: The system currently provides parsers to process annotation data from LocusLink, Gene Ontology Consortium, and Human Gene Project and can be extended to new data sources via user defined parsers.

Abstract: SUMMARY: AnnBuilder is an R package for assembling genomic annotation data. The system currently provides parsers to process annotation data from LocusLink, Gene Ontology Consortium, and Human Gene Project and can be extended to new data sources via user defined parsers. AnnBuilder differs from other existing systems in that it provides users with unlimited ability to assemble data from user selected sources. The products of AnnBuilder are files in XML format that can be easily used by different systems. AVAILABILITY: http://www.bioconductor.org. Open source.

65 citations

Proceedings Article

Expression array annotation using the BioMediator biological data integration system and the BioConductor analytic platform.

Hao Mei¹, Peter Tarczy-Hornoch, Peter Mork, Anthony J. Rossini, Ron Shaker, Loren Donelson

University of Washington¹

1 Jan 2003

TL;DR: The model presented addresses the need for annotation sources identified during BioConductor's development, and provides well-curated genomic background knowledge for expression array analysis and interpretation.

Abstract: This paper presents the implementation of a model for expression array annotation (EAA) using the BioMediator biological data integration system along with BioConductor, an analytic tools platform. The model presented addresses the need for annotation sources identified during BioConductor’s development. Annotation provides us with well-curated genomic background knowledge for expression array analysis and interpretation. Annotation requests are constructed and posted to the query interface of the EAA package (the EAA model implemented as a component of BioConductor). The software enumerates all possible annotation paths for queries. These are then transformed to PQL queries and processed by BioMediator. Annotation entities returned from the EAA package answer the annotation request.

20 citations

Distributed Storage and Analysis of Microarray Data in the Terabyte Range: An Alternative to Bioconductor

Christian Stratowa¹

Boehringer Ingelheim¹

1 Jan 2003

TL;DR: ROOT is introduced, an object-oriented framework that has been developed at CERN for distributed data warehousing and data mining of particle data in the petabyte range, and how R could be easily extended to access ROOT from within R is emphasized.

Abstract: Novel high-throughput technologies such as DNA microarray analyses are allowing biologists to generate sets of data in the terabyte realm. Many of these data will be deposited in the public domain, necessitating a common standard. Currently available database systems are not appropriate for these intentions. In this paper, I will introduce ROOT (http://root.cern.ch), an object-oriented framework that has been developed at CERN for distributed data warehousing and data mining of particle data in the petabyte range. Data are stored as sets of objects in machine-independent files, and specialized methods are used to get direct access to separate attributes of selected data objects. ROOT has been designed in such a way that it can query its databases in parallel on SMP/MPP machines, on clusters of PCs, or using common GRID services. In order to demonstrate the applicability of ROOT to microarray data, I will present a functional prototype system, called XPS - eXpression Profiling System, which can be considered to be an alternative to the Bioconductor project. The current implementation handles the storage of Affymetrix GeneChip schemes and data, and the pre-processing, normalization and filtering of GeneChip data. Based on this system, I will propose a novel standard for the distributed storage of microarray data. Finally, I will emphasize the similarities between R and ROOT, and show how R could be easily extended to access ROOT from within R.

4 citations

A Graphical Users Interface to Normalize Microarray Data

Fátima Sánchez Cabo, Zlatko Trajanoski, Kwang-Hyun Cho, Olaf Wolkenhauer

1 Jan 2003

TL;DR: In order to extract valuable information from the large amount of data that microarray experiments generate, suitable and powerful statistical and computational methods are required.

Abstract: Microarray technology is becoming an essential tool in functional genomics. The possibility of monitoring the expression level of thousands of genes simultaneously, as the response to a particular biological condition, gives biologists the chance to widen the aims of their experiments and opens a door to the understanding of cellular transcription processes. In order to extract valuable information from the large amount of data that microarray experiments generate, suitable and powerful statistical and computational methods are required. An example of the effort of statisticians and computer scientists is the release of the first Bioconductor software and the increasing number of functions for microarray data analysis implemented

3 citations

RDBMS in Bioinformatics: The Bioconductor Experience

Vincent J. Carey¹

Harvard University¹

1 Jan 2003

TL;DR: The role played by RDBMS in Bioconductor is less pronounced than had been anticipated, but this will change as requirements for query optimization, data structure standardization, and greater volumes of data and metadata emerge.

Abstract: Bioconductor (http://www.bioconductor.org/) is an open source collection of resources aimed at transparently advancing the theory and practice of bioinformatics, with a focus on expression arrays and the R statistical computing environment. I will sketch the key data structures and data flow processes addressed in Bioconductor thus far. I will review the role played by RDBMS in the development and curation of packaged annotation networks and in the analysis of Serial Analysis of Gene Expression (SAGE) libraries. Non-relational database technologies such as BerkeleyDB and HDF5 have also played a role in tools for archiving and navigating expression array data. At present the role of RDBMS in Bioconductor is less pronounced than had been anticipated. This will change as requirements for query optimization, data structure standardization, and greater volumes of data and metadata emerge.

Book Chapter • DOI: 10.1007/0-387-21679-0_3

Bioconductor R Packages for Exploratory Analysis and Normalization of cDNA Microarray Data

Sandrine Dudoit, Jean Y.H. Yang

1 Jan 2003

TL;DR: This chapter describes a collection of four R packages for exploratory analysis and normalization of two-color cDNA microarray fluorescence intensity data, developed as part of the Bioconductor project, to produce an open-source and open-development statistical computing framework for the analysis of genomic data.

Abstract: This chapter describes a collection of four R packages for exploratory analysis and normalization of two-color cDNA microarray fluorescence intensity data. R’s object-oriented class/method mechanism is exploited to allow efficient and systematic representation and manipulation of large microarray datasets of multiple types. The marrayClasses package contains class definitions and associated methods for pre- and postnormalization intensity data for batches of arrays. The marrayInput package provides functions and tcltk widgets to automate data input and the creation of microarray-specific R objects for storing these data. Functions for diagnostic plots of microarray spot statistics, such as boxplots, scatterplots, and spatial color images, are provided in marrayPlots. Finally, the marrayNorm package implements robust adaptive location and scale normalization procedures, which correct for different types of dye biases (e.g., intensity, spatial, plate biases) and allow the use of control sequences spotted onto the array and possibly spiked into the mRNA samples. The four new packages were developed as part of the Bioconductor project, which aims more generally to produce an open-source and open-development statistical computing framework for the analysis of genomic data.

Book Chapter • DOI: 10.1007/0-387-21679-0_4

An R Package for Analyses of Affymetrix Oligonucleotide Arrays

Rafael A. Irizarry, Laurent Gautier, Leslie Cope

1 Jan 2003

TL;DR: An extensible, interactive environment for data analysis and exploration of Affymetrix oligonucleotide array probe-level data is described, and some examples are provided demonstrating that having access to and methods for probe-level data results in improvements to quality control assessments, normalization, and expression measures.

Abstract: We describe an extensible, interactive environment for data analysis and exploration of Affymetrix oligonucleotide array probe-level data. The software utilities provided with the Affymetrix analysis suite summarize the probe set intensities and make available only one expression measure for each gene. We have developed this package because much can be learned from studying the individual probe intensities or, as we call them, the probe-level data. We provide some examples demonstrating that having access to and methods for probe-level data results in improvements to quality control assessments, normalization, and expression measures. The software is implemented as an add-on package, conveniently named affy, to the freely available and widely used statistical language/software R (Ihaka and Gentleman, 1996). The development of this software as an add-on to R allows us to take advantage of the basic mathematical and statistical functions and powerful graphics capabilities that are provided with R. Our package is distributed as open source code for Linux, Unix, and Microsoft Windows. It is released under the GNU General Public License. It is part of the Bioconductor project and can be obtained from http://www.bioconductor.org.

Journal Article • DOI: 10.2202/1544-6115.1008

Parameter estimation for the calibration and variance stabilization of microarray data.

Wolfgang Huber¹, Anja von Heydebreck², Holger Sültmann¹, Annemarie Poustka¹, Martin Vingron²

German Cancer Research Center¹, Max Planck Society²

05 Apr 2003-Statistical Applications in Genetics and Molecular Biology

TL;DR: This paper derives and validates an estimator for the parameters of a transformation for the joint calibration (normalization) and variance stabilization of microarray intensity data and finds that the error decreases with the square root of the number of probes per array and that the estimation is robust against the presence of differentially expressed genes.

Abstract: We derive and validate an estimator for the parameters of a transformation for the joint calibration (normalization) and variance stabilization of microarray intensity data. With this, the variances of the transformed intensities become approximately independent of their expected values. The transformation is similar to the logarithm in the high intensity range, but has a smaller slope for intensities close to zero. Applications have shown better sensitivity and specificity for the detection of differentially expressed genes. In this paper, we describe the theoretical aspects of the method. We incorporate calibration and variance-mean dependence into a statistical model and use a robust variant of the maximum-likelihood method to estimate the transformation parameters. Using simulations, we investigate the size of the estimation error and its dependence on sample size and the presence of outliers. We find that the error decreases with the square root of the number of probes per array and that the estimation is robust against the presence of differentially expressed genes. Software is publicly available as an R package through the Bioconductor project (http://www.bioconductor.org).
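
The shape of the transformation described in this abstract (logarithm-like at high intensities, smaller slope near zero) can be illustrated with a short sketch. It assumes an inverse hyperbolic sine of the form arsinh(a + b*x); the abstract itself does not name the function, and the parameter names a and b and the helper names glog and numeric_slope are hypothetical. The sketch does not implement the parameter estimator, only the curve.

```python
import math


def glog(x, a=0.0, b=1.0):
    """Assumed variance-stabilizing form: arsinh(a + b*x).

    a (offset) and b (scale) are illustrative calibration parameters,
    not values estimated from data.
    """
    return math.asinh(a + b * x)


def numeric_slope(f, x, h=1e-6):
    """Central-difference estimate of the slope of f at x."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


if __name__ == "__main__":
    # Compare the assumed transformation with the plain logarithm.
    for x in (0.01, 1.0, 100.0, 10000.0):
        print(
            f"x={x:>8}: arsinh slope={numeric_slope(glog, x):.5f}, "
            f"log slope={numeric_slope(math.log, x):.5f}"
        )
    # At high intensity, arsinh(x) approaches log(2x).
    print(glog(1e4) - math.log(2e4))
```

Under this assumption the slope stays bounded near zero while the slope of the logarithm grows without limit, and at high intensities the two curves differ only by the constant log 2.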
