Integrating different data types by regularized unsupervised multiple kernel learning with application to cancer subtype discovery.

doi:10.1093/BIOINFORMATICS/BTV244

Open Access•Journal Article•10.1093/BIOINFORMATICS/BTV244

Nora K. Speicher, +1 more

- 15 Jun 2015

- Bioinformatics

- Vol. 31, Iss: 12, pp 268-275

179

TL;DR: Current multiple kernel learning approaches for dimensionality reduction are applied and extended, and it is shown that one can even use several kernels per data type and thereby relieve the user of having to choose the best kernel functions and kernel parameters for each data type beforehand.

Abstract:

Motivation: Despite ongoing cancer research, available therapies are still limited in quantity and effectiveness, and making treatment decisions for individual patients remains a hard problem. Established subtypes, which help guide these decisions, are mainly based on individual data types. However, the analysis of multidimensional patient data involving the measurements of various molecular features could reveal intrinsic characteristics of the tumor. Large-scale projects accumulate this kind of data for various cancer types, but we still lack the computational methods to reliably integrate this information in a meaningful manner. Therefore, we apply and extend current multiple kernel learning approaches for dimensionality reduction. On the one hand, we add a regularization term to avoid overfitting during the optimization procedure, and on the other hand, we show that one can even use several kernels per data type and thereby relieve the user of having to choose the best kernel functions and kernel parameters for each data type beforehand.

Results: We have identified biologically meaningful subgroups for five different cancer types. Survival analysis has revealed significant differences between the survival times of the identified subtypes, with P values comparable to or even better than those of state-of-the-art methods. Moreover, our resulting subtypes reflect combined patterns from the different data sources, and we demonstrate that input kernel matrices carrying only little information have less impact on the integrated kernel matrix. Our subtypes show different responses to specific therapies, which could eventually assist in treatment decision making.

Availability and implementation: An executable is available upon request.

Contact: nora@mpi-inf.mpg.de or npfeifer@mpi-inf.mpg.de
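
The kernel integration idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the weights here are a uniform placeholder rather than the result of the regularized optimization, and a kernel PCA style embedding stands in for the paper's dimensionality reduction step. The function names, toy data, and bandwidth values are all assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(X, gamma):
    """Gaussian (RBF) kernel matrix for the rows of X."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))

def combine_kernels(kernels, weights):
    """Convex combination of kernel matrices (weights >= 0, summing to 1)."""
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0) or not np.isclose(weights.sum(), 1.0):
        raise ValueError("weights must be non-negative and sum to 1")
    return sum(w * K for w, K in zip(weights, kernels))

def kernel_embedding(K, n_components):
    """Kernel PCA style embedding (a swapped-in stand-in, see lead-in)."""
    n = K.shape[0]
    H = np.eye(n) - np.full((n, n), 1.0 / n)   # centering matrix
    Kc = H @ K @ H
    eigvals, eigvecs = np.linalg.eigh(Kc)      # eigenvalues in ascending order
    idx = np.argsort(eigvals)[::-1][:n_components]
    return eigvecs[:, idx] * np.sqrt(np.maximum(eigvals[idx], 0.0))

# Toy stand-ins for two data types measured on the same patients (hypothetical).
rng = np.random.default_rng(0)
n_patients = 30
expression = rng.normal(size=(n_patients, 50))
methylation = rng.normal(size=(n_patients, 40))

# Several kernels per data type, differing only in the bandwidth parameter.
kernels = [rbf_kernel(X, gamma)
           for X in (expression, methylation)
           for gamma in (0.001, 0.01, 0.1)]

# Uniform placeholder weights; the paper learns these, this sketch does not.
weights = np.full(len(kernels), 1.0 / len(kernels))
K_combined = combine_kernels(kernels, weights)
embedding = kernel_embedding(K_combined, n_components=5)
print(K_combined.shape, embedding.shape)
```

The rows of `embedding` could then be passed to any standard clustering routine to propose patient subgroups. Replacing the uniform weights with learned ones is where an actual multiple kernel learning procedure would come in.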

Citations

•Journal Article•10.1177/1177932219899051

Multi-omics Data Integration, Interpretation, and Its Application.

Indhupriya Subramanian, +4 more

- 31 Jan 2020

- Bioinformatics and Biology Insights

TL;DR: This review collects the tools and methods that adopt an integrative approach to analyzing multiple omics data and summarizes their ability to address applications such as disease subtyping, biomarker prediction, and deriving insights into the data.

1K

•Journal Article•10.3389/FGENE.2017.00084

More Is Better: Recent Progress in Multi-Omics Data Integration Methods.

Sijia Huang, +4 more

- 16 Jun 2017

- Frontiers in Genetics

TL;DR: This review outlines the progress made in multi-omics integration and the comprehensive tools developed so far in this field, and discusses integration methods for predicting patient survival.

647

•Journal Article•10.1016/J.BIOTECHADV.2021.107739

Using machine learning approaches for multi-omics data analysis: A review

Parminder Singh Reel, +4 more

- 29 Mar 2021

- Biotechnology Advances

TL;DR: In this article, the authors explore different integrative machine learning methods which have been used to provide an in-depth understanding of biological systems during normal physiological functioning and in the presence of a disease.

494

•Journal Article•10.1093/NAR/GKY889

Multi-omic and multi-view clustering algorithms: review and cancer benchmark

Nimrod Rappoport, +1 more

- 16 Nov 2018

- Nucleic Acids Research

TL;DR: This review covers methods developed specifically for omic data as well as generic multi-view methods developed in the machine learning community for joint clustering of multiple data types, providing the first systematic comparison of leading multi-omics and multi-view clustering algorithms.

407

•Journal Article•10.1016/J.CSBJ.2021.06.030

Integration strategies of multi-omics data for machine learning analysis.

Milan Picard, +4 more

- 01 Jan 2021

- Computational and structural biotechnolo...

TL;DR: In this article, the authors focus on challenges and existing multi-omics integration strategies, paying special attention to machine learning applications, and group the most recent data integration methods and frameworks into five integration strategies: early, mixed, intermediate, late and hierarchical.

341

References

•Journal Article•10.1016/J.CELL.2011.02.013

Hallmarks of cancer: the next generation.

Douglas Hanahan, +1 more

- 04 Mar 2011

- Cell

TL;DR: Recognition of the widespread applicability of these concepts will increasingly affect the development of new means to treat human cancer.

63.3K

•Journal Article•10.1016/0377-0427(87)90125-7

Silhouettes: a graphical aid to the interpretation and validation of cluster analysis

Peter J. Rousseeuw

- 01 Nov 1987

- Journal of Computational and Applied Mat...

TL;DR: A new graphical display is proposed for partitioning techniques, where each cluster is represented by a so-called silhouette, which is based on the comparison of its tightness and separation, and provides an evaluation of clustering validity.

19K

•Journal Article•10.1007/S11222-007-9033-Z

A tutorial on spectral clustering

Ulrike von Luxburg

- 01 Dec 2007

- Statistics and Computing

TL;DR: In this article, the authors present the most common spectral clustering algorithms, and derive those algorithms from scratch by several different approaches, and discuss the advantages and disadvantages of these algorithms.

11.1K

•Journal Article•10.1016/J.CCR.2009.12.020

Integrated Genomic Analysis Identifies Clinically Relevant Subtypes of Glioblastoma Characterized by Abnormalities in PDGFRA, IDH1, EGFR, and NF1

Roel G.W. Verhaak, +34 more

- 19 Jan 2010

- Cancer Cell

TL;DR: A robust gene expression-based molecular classification of GBM into Proneural, Neural, Classical, and Mesenchymal subtypes is described and multidimensional genomic data is integrated to establish patterns of somatic mutations and DNA copy number.

7.2K

•Journal Article•10.1080/01621459.1971.10482356

Objective Criteria for the Evaluation of Clustering Methods

William M. Rand

- 01 Dec 1971

- Journal of the American Statistical Asso...

TL;DR: This article proposes several criteria which isolate specific aspects of the performance of a method, such as its retrieval of inherent structure, its sensitivity to resampling and the stability of its results in the light of new data.

6.7K