Sharing and Reusing Gene Expression Profiling Data in Neuroscience
Xiang Wan, Paul Pavlidis
TL;DR: This review covers the public availability of high-throughput expression data in neuroscience, how those data have been reused, and the tools developed to facilitate reuse.
Abstract: As public availability of gene expression profiling data increases, it is natural to ask how these data can be used by neuroscientists. Here we review the public availability of high-throughput expression data in neuroscience and how it has been reused, and tools that have been developed to facilitate reuse. There is increasing interest in making expression data reuse a routine part of the neuroscience tool-kit, but there are a number of challenges. Data must become more readily available in public databases; efforts to encourage investigators to make data available are important, as is education on the benefits of public data release. Once released, data must be better-annotated. Techniques and tools for data reuse are also in need of improvement. Integration of expression profiling data with neuroscience-specific resources such as anatomical atlases will further increase the value of expression data.
Citations
Data reuse and the open data citation advantage
TL;DR: There is a direct effect of third-party data reuse that persists for years beyond the time when researchers have published most of the papers reusing their own data, and a robust citation benefit from open data is found, although a smaller one than previously reported.
Data from: Data reuse and the open data citation advantage
Heather A. Piwowar
01 Jan 2013
TL;DR: In a multivariate regression on 10,555 studies that created gene expression microarray data, studies that made data available in a public repository received 9% (95% confidence interval) more citations than similar studies for which the data were not made available.
Recommendations for the collection and use of multiplexed functional data for clinical variant interpretation
Hannah Gelman, Jennifer N. Dines, Jonathan S. Berg, Alice H. Berger, Sarah E. Brnich, Fuki M. Hisama, Richard G. James, Alan F. Rubin, Jay Shendure, Brian H. Shirts, Douglas M. Fowler, Lea M. Starita
TL;DR: Recommendations to experimentalists and clinicians on how to evaluate the quality of multiplexed functional datasets, and how different datasets could be incorporated into the ACMG/AMP variant-interpretation framework, will hopefully clarify whether and how such data should be used.
The reuse of public datasets in the life sciences: potential risks and rewards.
TL;DR: The prodigious potential of reusing publicly available datasets and the associated challenges, limitations and risks are reviewed and a checklist to determine the reuse value and potential of a particular dataset is provided.
BgeeDB, an R package for retrieval of curated expression datasets and for gene list expression localization enrichment tests.
Andrea Komljenovic, Julien Roux, Julien Wollbrett, Marc Robinson-Rechavi, Frederic B. Bastian
TL;DR: BgeeDB is a collection of functions to import into R re-annotated, quality-controlled and re-processed expression data available in the Bgee database, which includes a new gene set enrichment test for preferred localization of expression of genes in anatomical structures (“TopAnat”).