Proceedings Article, DOI: 10.1145/2487575.2487594
Multi-source learning with block-wise missing data for Alzheimer's disease prediction
Shuo Xiang, Lei Yuan, Wei Fan, Yalin Wang, Paul M. Thompson, Jieping Ye
- 11 Aug 2013
- pp 185-193
TL;DR: This paper investigates the situation of complete data, presents a unified "bi-level" learning model for multi-source data, and gives a natural extension of this model to the more challenging case with incomplete data.
Abstract: With advances and increasing sophistication in data collection techniques, we are faced with large amounts of data collected from multiple heterogeneous sources in many applications. For example, in the study of Alzheimer's Disease (AD), different types of measurements such as neuroimages, gene/protein expression data, and genetic data are often collected and analyzed together for improved predictive power. It is believed that joint learning from multiple data sources is beneficial because different data sources may contain complementary information, and that feature pruning and data source selection are critical for learning interpretable models from high-dimensional data. Very often the collected data comes with block-wise missing entries; for example, a patient without an MRI scan will have no information in the MRI data block, making his/her overall record incomplete. There has been growing interest in the data mining community in extending traditional techniques for single-source complete data analysis to the study of multi-source incomplete data. The key challenge is how to effectively integrate information from multiple heterogeneous sources in the presence of block-wise missing data. In this paper we first investigate the situation of complete data and present a unified "bi-level" learning model for multi-source data. Then we give a natural extension of this model to the more challenging case with incomplete data. Our major contributions are threefold: (1) the proposed models handle both feature-level and source-level analysis in a unified formulation and include several existing feature learning approaches as special cases; (2) the model for incomplete data avoids direct imputation of the missing elements and thus provides superior performance; moreover, it can be easily generalized to other applications with block-wise missing data sources; (3) efficient optimization algorithms are presented for both the complete and incomplete models.
We have performed comprehensive evaluations of the proposed models on the application of AD diagnosis. Our proposed models compare favorably against existing approaches.
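To make the notion of block-wise missing data concrete, the following minimal Python sketch builds a toy multi-source matrix and groups samples by which sources they have. The source names, their feature counts, and the helper names source_slices and group_by_availability are illustrative assumptions and do not come from the paper; the sketch shows only the data layout described in the abstract, not the authors' learning model.

```python
import numpy as np

# Hypothetical sources and feature counts (assumptions for illustration only).
SOURCES = {"MRI": 3, "PROTEIN": 2, "GENETIC": 4}

def source_slices(sources):
    """Map each source name to the column slice it occupies in the stacked matrix."""
    slices, start = {}, 0
    for name, width in sources.items():
        slices[name] = slice(start, start + width)
        start += width
    return slices

def group_by_availability(X, slices):
    """Group row indices by the tuple of sources whose block is observed.

    A source counts as missing for a sample when every entry in its
    block is NaN, which is the block-wise missing pattern.
    """
    groups = {}
    for i, row in enumerate(X):
        present = tuple(
            name for name, sl in slices.items() if not np.isnan(row[sl]).all()
        )
        groups.setdefault(present, []).append(i)
    return groups

rng = np.random.default_rng(0)
slices = source_slices(SOURCES)
X = rng.normal(size=(5, sum(SOURCES.values())))
X[1, slices["MRI"]] = np.nan       # sample 1 has no MRI scan
X[3, slices["MRI"]] = np.nan       # sample 3 has no MRI scan
X[4, slices["GENETIC"]] = np.nan   # sample 4 has no genetic data

groups = group_by_availability(X, slices)
for present, rows in groups.items():
    print(present, rows)
```

Grouping samples by availability pattern is one simple way to work with such data without filling in the missing blocks; whether and how the paper's incomplete-data model uses such a grouping is not stated in the abstract.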
Citations
Multi-View Intact Space Learning
Chang Xu, Dacheng Tao, Chao Xu
TL;DR: In this paper, the authors proposed the Multi-view Intact Space Learning (MISL) algorithm, which integrates the encoded complementary information in multiple views to discover a latent intact representation of the data.
429
Late Fusion Incomplete Multi-View Clustering
Xinwang Liu, Xinzhong Zhu, Miaomiao Li, Lei Wang, Chang Tang, Jianping Yin, Dinggang Shen, Huaimin Wang, Wen Gao
TL;DR: This work proposes Late Fusion Incomplete Multi-view Clustering (LF-IMVC), which effectively and efficiently integrates the incomplete clustering matrices generated by incomplete views, develops a three-step iterative algorithm that solves the resulting optimization problem with linear computational complexity, and theoretically proves its convergence.
374
Multiple Kernel $k$-Means with Incomplete Kernels
Xinwang Liu, Xinzhong Zhu, Miaomiao Li, Lei Wang, En Zhu, Tongliang Liu, Marius Kloft, Dinggang Shen, Jianping Yin, Wen Gao
TL;DR: This work integrates imputation and clustering into a unified learning procedure that does not require at least one complete base kernel matrix over all the samples.
358
Efficient and Effective Regularized Incomplete Multi-View Clustering
TL;DR: This paper proposes an Efficient and Effective Incomplete Multi-view Clustering (EE-IMVC) algorithm, which imputes each incomplete base matrix generated by incomplete views with a learned consensus clustering matrix, to address the issues of intensive computational and storage complexity, over-complicated optimization, and only limited improvement in clustering performance.
273
Proceedings Article
Fortune teller: predicting your career path
Ye Liu, Luming Zhang, Liqiang Nie, Yan Yan, David S. Rosenblum
- 12 Feb 2016
TL;DR: This work systematically studies the feasibility of career path prediction from social network data and fuses information from multiple social networks to comprehensively describe a user and characterize the progressive properties of his or her career path.
References
Regression Shrinkage and Selection via the Lasso
TL;DR: A new method for estimation in linear models called the lasso, which minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a constant, is proposed.
A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems
Amir Beck, Marc Teboulle
TL;DR: A new fast iterative shrinkage-thresholding algorithm (FISTA) which preserves the computational simplicity of ISTA but with a global rate of convergence which is proven to be significantly better, both theoretically and practically.
14.3K
Least angle regression
Bradley Efron, Trevor Hastie, Iain M. Johnstone, Robert Tibshirani, Hemant Ishwaran, Keith Knight, Jean-Michel Loubes, Pascal Massart, David Madigan, Greg Ridgeway, Saharon Rosset, Ji Zhu, Robert A. Stine, Berwin A. Turlach, Sanford Weisberg
TL;DR: A publicly available algorithm that requires only the same order of magnitude of computational effort as ordinary least squares applied to the full set of covariates is described.
Model selection and estimation in regression with grouped variables
Ming Yuan, Yi Lin
TL;DR: In this paper, instead of selecting factors by stepwise backward elimination, the authors focus on the accuracy of estimation and consider extensions of the lasso, the LARS algorithm and the non-negative garrotte for factor selection.
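The lasso and FISTA entries in the reference list above can be illustrated together. The sketch below applies FISTA to the penalized (Lagrangian) form of the lasso, min over b of 0.5 * ||Xb - y||^2 + lam * ||b||_1, which is the standard counterpart of the constrained form quoted in the lasso summary. The function names, the fixed iteration count, and the toy inputs are assumptions made for illustration; this is not the optimization algorithm of the paper on this page.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1: shrink each entry toward zero by tau."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def fista_lasso(X, y, lam, n_iter=500):
    """Minimize 0.5 * ||X b - y||^2 + lam * ||b||_1 with FISTA.

    Uses a constant step size 1/L, where L is the squared spectral norm
    of X, i.e. the Lipschitz constant of the gradient of the smooth part.
    """
    L = np.linalg.norm(X, 2) ** 2
    b = np.zeros(X.shape[1])
    z = b.copy()          # extrapolated point
    t = 1.0               # momentum parameter
    for _ in range(n_iter):
        grad = X.T @ (X @ z - y)
        b_next = soft_threshold(z - grad / L, lam / L)
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        z = b_next + ((t - 1.0) / t_next) * (b_next - b)
        b, t = b_next, t_next
    return b

# With an identity design the lasso solution is soft_threshold(y, lam),
# which gives a simple sanity check.
y_demo = np.array([3.0, -0.5, 1.5])
b_demo = fista_lasso(np.eye(3), y_demo, lam=1.0)
print(b_demo)
```

The second coefficient is shrunk exactly to zero, which is the feature-selection behavior that makes L1-type penalties relevant to learning interpretable models from high-dimensional data.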