Open Access
Sparse solutions for linear prediction problems
Dennis Shasha, Tyler Neylon
- 01 Jan 2006
TL;DR: This thesis is concerned with finding linear identities among time series, and asking how to bound the generalization error by using sparse vectors as hypotheses in the machine learning versions of these problems.
Abstract: The simplicity of an idea has long been regarded as a sign of elegance and, when shown to coincide with accuracy, a hallmark of profundity.
In this thesis our ideas are vectors used as predictors, and sparsity is our measure of simplicity. A vector is sparse when it has few nonzero elements. We begin by asking the question: given a matrix of n time series (vectors which evolve in a "sliding" manner over time) as columns, what are the simplest linear identities among them? Under basic learning assumptions, we justify that such simple identities are likely to persist in the future. It is easily seen that our question is akin to finding sparse vectors in the null space of this matrix. Hence we are confronted with the problem of finding an optimally sparse basis for any vector space. This is a computationally challenging problem with many promising applications, such as iterative numerical optimization, fast dimensionality reduction, graph algorithms on cycle spaces, and of course the time series work of this thesis.
In part I, we give a brief exposition of the questions to be addressed here: finding linear identities among time series, and asking how we may bound the generalization error by using sparse vectors as hypotheses in the machine learning versions of these problems. In part II, we focus on the theoretical justification for maximizing sparsity as a means of learning or prediction. We'll look at sample compression schemes as a means of correlating sparsity with the capacity of a hypothesis set, as well as examining learning error bounds which support sparsity. Finally, in part III, we'll illustrate an increasingly sophisticated toolkit of incremental algorithms for discovering sparse patterns among evolving time series.
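The abstract's observation that a linear identity among time series is a sparse vector in the null space of the matrix whose columns are those series can be illustrated with a small sketch. The code below is not the thesis's algorithm: it is a brute-force search over small column subsets, and the function name `sparsest_identity`, the tolerance, and the synthetic series are all assumptions made for illustration. It also assumes the matrix has more rows (time steps) than the support size being tested.

```python
import itertools
import numpy as np


def sparsest_identity(A, max_support=3, tol=1e-9):
    """Brute-force search for a nonzero x with A @ x ~ 0 and minimal support.

    Hypothetical helper for illustration only; the cost grows
    combinatorially with max_support.
    """
    n = A.shape[1]
    for k in range(1, max_support + 1):
        for cols in itertools.combinations(range(n), k):
            sub = A[:, cols]
            # If the smallest singular value is numerically zero, the last
            # right singular vector is a null vector of these k columns.
            _, s, vt = np.linalg.svd(sub, full_matrices=False)
            if s[-1] <= tol * max(s[0], 1.0):
                x = np.zeros(n)
                x[list(cols)] = vt[-1]
                return x
    return None


# Synthetic time series as columns; d is planted as 2*a - 3*b.
t = np.arange(50, dtype=float)
a = np.sin(0.3 * t)
b = np.cos(0.2 * t)
c = 0.01 * t ** 2
d = 2.0 * a - 3.0 * b
A = np.column_stack([a, b, c, d])

x = sparsest_identity(A)
print("support:", np.flatnonzero(x))
print("residual norm:", np.linalg.norm(A @ x))
```

On this toy input the search should return a vector supported on the columns for a, b, and d, proportional to the planted identity 2a - 3b - d = 0. Exhaustive search is only feasible for very small supports, which is consistent with the abstract's remark that the general problem is computationally challenging.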
Citations
Gradient descent with sparsification: an iterative algorithm for sparse recovery with restricted isometry property
Rahul Garg, Rohit Khandekar
- 14 Jun 2009
TL;DR: The Matlab implementation of GraDeS (Gradient Descent with Sparsification) outperforms previously proposed algorithms like Subspace Pursuit, StOMP, OMP, and Lasso by an order of magnitude, and it uncovers cases where L1-regularized regression (Lasso) fails but GraDeS finds the correct solution.
Combining sparse coding and time-domain features for heart sound classification
Bradley M. Whitaker, Pradyumna B. Suresha, Chengyu Liu, Gari D. Clifford, David V. Anderson
TL;DR: The results show that sparse coding is an effective way to define spectral features of the cardiac cycle and its sub-cycles for the purpose of classification and can be combined with additional feature extraction methods to improve classification accuracy.
A geometric approach to sample compression
TL;DR: The sample compression conjecture of Littlestone & Warmuth has remained unsolved for a quarter century. Two promising ways forward are embedding maximal classes into maximum classes with at most a polynomial increase to VC dimension, and compression via operating on geometric representations.
Supersparse Linear Integer Models for Interpretable Classification
TL;DR: Presents the Supersparse Linear Integer Model (SLIM), an off-the-shelf tool for creating scoring systems that are both accurate and interpretable. SLIM is formulated as a discrete optimization problem that minimizes the 0-1 loss to encourage a high level of accuracy.
Matrix Sparsification and the Sparse Null Space Problem
Lee-Ad Gottlieb, Tyler Neylon
TL;DR: In this paper, the authors revisited the matrix problems of sparse null space and matrix sparsification, and showed that they are equivalent, and gave a powerful tool to extend algorithms and heuristics for sparse approximation theory to these problems.
References
Statistical learning theory
Vladimir Vapnik
- 01 Jan 1998
TL;DR: Presenting a method for determining the necessary and sufficient conditions for consistency of learning process, the author covers function estimates from small data pools, applying these estimations to real-life problems, and much more.
A tutorial on support vector regression
TL;DR: This tutorial gives an overview of the basic ideas underlying Support Vector (SV) machines for function estimation, and includes a summary of currently used algorithms for training SV machines, covering both the quadratic programming part and advanced methods for dealing with large datasets.
Atomic Decomposition by Basis Pursuit
TL;DR: Basis Pursuit (BP) is a principle for decomposing a signal into an "optimal" superposition of dictionary elements, where optimal means having the smallest l1 norm of coefficients among all such decompositions.
Decoding by linear programming
Emmanuel J. Candès, Terence Tao
TL;DR: The input vector f can be recovered exactly by solving a simple convex optimization problem (which one can recast as a linear program). Numerical experiments suggest that this recovery procedure works unreasonably well: f is recovered exactly even in situations where a significant fraction of the output is corrupted.
Decoding by Linear Programming
Emmanuel J. Candès, Terence Tao
TL;DR: In this paper, it was shown that under suitable conditions on the coding matrix, the input vector can be recovered exactly by solving a simple convex optimization problem (which one can recast as a linear program).