Open Access · Posted Content
A Framework for Implementing Machine Learning on Omics Data
Geoffroy Dubourg-Felonneau, Timothy I. Cannings, Fergal Cotter, Hannah Thompson, Nirmesh Patel, John W. Cassidy, Harry W. Clifford +6 more
TL;DR: This paper presents a framework for combining -omics data sets and handling high-dimensional data, making -omics research more accessible to machine learning applications. The framework is demonstrated through the integration and analysis of multi-analyte data for a set of 3,533 breast cancers.
Abstract: The potential benefits of applying machine learning methods to -omics data are becoming increasingly apparent, especially in clinical settings. However, the unique characteristics of these data are not always well suited to machine learning techniques. These data are often generated across different technologies in different labs, and frequently with high dimensionality. In this paper we present a framework for combining -omics data sets, and for handling high dimensional data, making -omics research more accessible to machine learning applications. We demonstrate the success of this framework through integration and analysis of multi-analyte data for a set of 3,533 breast cancers. We then use this data-set to predict breast cancer patient survival for individuals at risk of an impending event, with higher accuracy and lower variance than methods trained on individual data-sets. We hope that our pipelines for data-set generation and transformation will open up -omics data to machine learning researchers. We have made these freely available for noncommercial use at this http URL.
Citations
Computational Techniques and Tools for Omics Data Analysis: State-of-the-Art, Challenges, and Future Directions
TL;DR: This paper presents the critical review of state-of-the-art techniques for omics, multi-omics, radiomics data analysis themed at disease prediction, disease recurrence, survival analysis, and biomarker discovery.
60
BSense: A parallel Bayesian hyperparameter optimized Stacked ensemble model for breast cancer survival prediction
TL;DR: In this article, a parallel Bayesian hyperparameter optimized Stacked Ensemble (BSense) model is proposed to predict the occurrence, reoccurrence, and survival in breast cancer.
25
An Integrated Multi-Disciplinary Perspective for Addressing Challenges of the Human Gut Microbiome.
Rohan M. Shah, Elizabeth J. McKenzie, Magda Rosin, Snehal R. Jadhav, Shakuntla V. Gondalia, Douglas Rosendale, David J. Beale +7 more
TL;DR: A perspective review of the recent literature on the challenges of exploring the human gut microbiome. With a strong focus on an integrated perspective, the authors contextualize the experimental and technical challenges of undertaking such studies and provide a framework for capitalizing on the breadth of insight such approaches afford.
21
Applying a GAN-based classifier to improve transcriptome-based prognostication in breast cancer
TL;DR: In this paper, a Wasserstein Generative Adversarial Network (GAN) with gradient penalty and an embedded auxiliary classifier was used to distinguish low- from high-risk patients.
Applying GAN-based data augmentation to improve transcriptome-based prognostication in breast cancer
TL;DR: GAN-based data augmentation allowed generating a robust classifier capable of stratifying low- vs high-risk patients based on full transcriptome data and across independent and heterogeneous breast cancer cohorts.
References
•Journal Article
Visualizing Data using t-SNE
TL;DR: Presents t-SNE, a new technique that visualizes high-dimensional data by giving each datapoint a location in a two- or three-dimensional map. It is a variation of Stochastic Neighbor Embedding that is much easier to optimize and produces significantly better visualizations by reducing the tendency to crowd points together in the center of the map.
•Proceedings Article
Making a Science of Model Search: Hyperparameter Optimization in Hundreds of Dimensions for Vision Architectures
James Bergstra, Daniel L. K. Yamins, David D. Cox +2 more
- 16 Jun 2013
TL;DR: This work proposes a meta-modeling approach to support automated hyperparameter optimization, with the goal of providing practical tools that replace hand-tuning with a reproducible and unbiased optimization process.
GISTIC2.0 facilitates sensitive and confident localization of the targets of focal somatic copy-number alteration in human cancers
Craig H. Mermel, Steven E. Schumacher, Barbara Hill, Matthew Meyerson, Rameen Beroukhim, Gad Getz +5 more
TL;DR: By separating SCNA profiles into underlying arm-level and focal alterations, the estimation of background rates for each category is improved, and a probabilistic method for defining the boundaries of selected-for SCNA regions with user-defined confidence is described.
1.6K
Cellular heterogeneity and molecular evolution in cancer.
TL;DR: Summarizes important considerations related to intratumor heterogeneity during tumor evolution, discusses experimental approaches commonly used to infer intratumor heterogeneity, and describes how these methodologies can be translated into clinical practice.
519
