Multi-source learning with block-wise missing data for Alzheimer's disease prediction

doi:10.1145/2487575.2487594

Proceedings Article•10.1145/2487575.2487594

Shuo Xiang, +5 more

- 11 Aug 2013

- pp 185-193

95

TL;DR: This paper first investigates the complete-data setting and presents a unified "bi-level" learning model for multi-source data, then extends the model to the more challenging case of incomplete data.

Abstract: With advances in, and the increasing sophistication of, data collection techniques, we are faced with large amounts of data collected from multiple heterogeneous sources in many applications. For example, in the study of Alzheimer's Disease (AD), different types of measurements such as neuroimages, gene/protein expression data, and genetic data are often collected and analyzed together for improved predictive power. Joint learning from multiple data sources is believed to be beneficial because different sources may contain complementary information, and feature pruning and data source selection are critical for learning interpretable models from high-dimensional data. Very often the collected data come with block-wise missing entries; for example, a patient without an MRI scan will have no information in the MRI data block, making his/her overall record incomplete. There has been growing interest in the data mining community in extending traditional techniques for single-source complete data analysis to the study of multi-source incomplete data. The key challenge is how to effectively integrate information from multiple heterogeneous sources in the presence of block-wise missing data. In this paper we first investigate the situation of complete data and present a unified "bi-level" learning model for multi-source data. We then give a natural extension of this model to the more challenging case with incomplete data. Our major contributions are threefold: (1) the proposed models handle both feature-level and source-level analysis in a unified formulation and include several existing feature learning approaches as special cases; (2) the model for incomplete data avoids direct imputation of the missing elements and thus provides superior performance; moreover, it can be easily generalized to other applications with block-wise missing data sources; (3) efficient optimization algorithms are presented for both the complete and incomplete models. We have performed comprehensive evaluations of the proposed models on the application of AD diagnosis. Our proposed models compare favorably against existing approaches.
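
To make the notion of block-wise missing data concrete, the following is a minimal Python sketch that groups subjects by which source blocks they actually have. It is not the paper's bi-level model or its optimization algorithm; it only illustrates the data layout the abstract describes. The function name `group_by_availability`, the source names "MRI" and "protein", the all-NaN-row convention for a missing block, and the toy values are all assumptions made for illustration.

```python
import numpy as np


def group_by_availability(blocks):
    """Group sample indices by the set of sources observed for each sample.

    blocks: dict mapping a source name to a 2-D array of shape
        (n_samples, n_features_of_that_source). By assumption, a sample
        that lacks a source has an all-NaN row in that source's block.
    Returns a dict mapping a tuple of available source names to the list
    of sample indices sharing that availability pattern.
    """
    names = list(blocks)
    n_samples = len(next(iter(blocks.values())))
    groups = {}
    for i in range(n_samples):
        # A source counts as available if its row is not entirely NaN.
        pattern = tuple(
            name for name in names if not np.isnan(blocks[name][i]).all()
        )
        groups.setdefault(pattern, []).append(i)
    return groups


# Hypothetical toy data: 4 subjects, two sources.
mri = np.array([[1.0, 2.0], [np.nan, np.nan], [0.5, 0.1], [np.nan, np.nan]])
protein = np.array([[0.3], [0.7], [np.nan], [0.9]])

groups = group_by_availability({"MRI": mri, "protein": protein})
for pattern, indices in groups.items():
    print(pattern, indices)
```

Each availability pattern defines a sub-population with complete data on its own subset of sources, which is the kind of structure a method can exploit when it avoids imputing the missing blocks directly.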

Citations

Journal Article•10.1109/TPAMI.2015.2417578

Multi-View Intact Space Learning

Chang Xu, +2 more

- 01 Dec 2015

- IEEE Transactions on Pattern Analysis an...

TL;DR: In this paper, the authors proposed the Multi-view Intact Space Learning (MISL) algorithm, which integrates the encoded complementary information in multiple views to discover a latent intact representation of the data.

429

Journal Article•10.1109/TPAMI.2018.2879108

Late Fusion Incomplete Multi-View Clustering

Xinwang Liu, +8 more

- 01 Oct 2019

- IEEE Transactions on Pattern Analysis an...

TL;DR: This work proposes Late Fusion Incomplete Multi-view Clustering (LF-IMVC), which effectively and efficiently integrates the incomplete clustering matrices generated by incomplete views, develops a three-step iterative algorithm that solves the resulting optimization problem with linear computational complexity, and theoretically proves its convergence.

374

Journal Article•10.1109/TPAMI.2019.2892416

Multiple Kernel $k$-Means with Incomplete Kernels

Xinwang Liu, +9 more

- 01 May 2020

- IEEE Transactions on Pattern Analysis an...

TL;DR: This work integrates imputation and clustering into a unified learning procedure that does not require at least one complete base kernel matrix over all the samples.

358

Journal Article•10.1109/TPAMI.2020.2974828

Efficient and Effective Regularized Incomplete Multi-View Clustering

Xinwang Liu, +7 more

- 01 Aug 2021

- IEEE Transactions on Pattern Analysis an...

TL;DR: This paper proposes the Efficient and Effective Incomplete Multi-view Clustering (EE-IMVC) algorithm, which imputes each incomplete base matrix generated by incomplete views with a learned consensus clustering matrix in order to address intensive computational and storage complexity, over-complicated optimization, and limited improvement in clustering performance.

273

Proceedings Article

Fortune teller: predicting your career path

Ye Liu, +4 more

- 12 Feb 2016

TL;DR: This work systematically studies the feasibility of career path prediction from social network data, fusing information from multiple social networks to comprehensively describe a user and to characterize the progressive properties of his or her career path.

262

References

Book Chapter•10.1017/CBO9781139207249.009

I and J

William Marsden

- 01 Jan 2012

154.7K

Journal Article•10.1111/J.2517-6161.1996.TB02080.X

Regression Shrinkage and Selection via the Lasso

Robert Tibshirani

- 01 Jan 1996

- Journal of the Royal Statistical Society...

TL;DR: A new method for estimation in linear models, called the lasso, is proposed; it minimizes the residual sum of squares subject to the sum of the absolute values of the coefficients being less than a constant.

45.4K

Journal Article•10.1137/080716542

A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems

Amir Beck, +1 more

- 01 Jan 2009

- SIAM Journal on Imaging Sciences

TL;DR: This paper presents a fast iterative shrinkage-thresholding algorithm (FISTA) that preserves the computational simplicity of ISTA while achieving a global rate of convergence that is proven to be significantly better, both theoretically and practically.

14.3K

Journal Article•10.1214/009053604000000067

Least angle regression

Bradley Efron, +19 more

- 01 Apr 2004

- Annals of Statistics

TL;DR: A publicly available algorithm that requires only the same order of magnitude of computational effort as ordinary least squares applied to the full set of covariates is described.

9.4K

Journal Article•10.1111/J.1467-9868.2005.00532.X

Model selection and estimation in regression with grouped variables

Ming Yuan, +1 more

- 01 Feb 2006

- Journal of the Royal Statistical Society...

TL;DR: In this paper, instead of selecting factors by stepwise backward elimination, the authors focus on the accuracy of estimation and consider extensions of the lasso, the LARS algorithm and the non-negative garrotte for factor selection.

8.8K