Open Access •Posted Content
Distributed Adaptive Sampling for Kernel Matrix Approximation
TL;DR: SQUEAK is the first RLS sampling algorithm for kernel approximation that never constructs the whole kernel matrix; it runs in time linear in the number of samples $n$ and requires only a single pass over the dataset.
Abstract: Most kernel-based methods, such as kernel or Gaussian process regression, kernel PCA, ICA, or $k$-means clustering, do not scale to large datasets, because constructing and storing the kernel matrix $\mathbf{K}_n$ requires at least $\mathcal{O}(n^2)$ time and space for $n$ samples. Recent works show that sampling points with replacement according to their ridge leverage scores (RLS) generates small dictionaries of relevant points with strong spectral approximation guarantees for $\mathbf{K}_n$. The drawback of RLS-based methods is that computing exact RLS requires constructing and storing the whole kernel matrix. In this paper, we introduce SQUEAK, a new algorithm for kernel approximation based on RLS sampling that sequentially processes the dataset, storing a dictionary that yields accurate kernel matrix approximations with a number of points that depends only on the effective dimension $d_{eff}(\gamma)$ of the dataset. Moreover, since all the RLS estimations are efficiently performed using only the small dictionary, SQUEAK is the first RLS sampling algorithm that never constructs the whole matrix $\mathbf{K}_n$, runs in linear time $\widetilde{\mathcal{O}}(nd_{eff}(\gamma)^3)$ w.r.t. $n$, and requires only a single pass over the dataset. We also propose a parallel and distributed version of SQUEAK that scales linearly across multiple machines, achieving similar accuracy in as little as $\widetilde{\mathcal{O}}(\log(n)d_{eff}(\gamma)^3)$ time.
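To make the quantities in the abstract concrete, here is a minimal sketch of the exact-RLS baseline that the abstract describes as expensive. It is not SQUEAK itself: it builds the full matrix $\mathbf{K}_n$, which is exactly what SQUEAK avoids. It assumes the standard definitions $\tau_i(\gamma) = [\mathbf{K}_n(\mathbf{K}_n + \gamma\mathbf{I})^{-1}]_{ii}$ and $d_{eff}(\gamma) = \sum_i \tau_i(\gamma)$. The Gaussian kernel, the function names (`rbf_kernel`, `exact_rls`, `sample_dictionary`, `nystrom_approx`), the values of `sigma` and `gamma`, and the dictionary size are all illustrative assumptions rather than choices taken from the paper.

```python
import numpy as np

def rbf_kernel(X, Y, sigma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of X and the rows of Y."""
    sq = (X**2).sum(axis=1)[:, None] + (Y**2).sum(axis=1)[None, :] - 2.0 * (X @ Y.T)
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def exact_rls(K, gamma):
    """Exact ridge leverage scores: the diagonal of K (K + gamma I)^{-1}.

    Needs the full n x n matrix, i.e. O(n^2) memory (the cost the abstract
    identifies as the drawback of exact RLS methods).
    """
    n = K.shape[0]
    # (K + gamma I)^{-1} K and K (K + gamma I)^{-1} coincide, so solve() suffices.
    return np.diag(np.linalg.solve(K + gamma * np.eye(n), K)).copy()

def sample_dictionary(tau, m, rng):
    """Draw m indices with replacement, with probability proportional to the RLS."""
    return rng.choice(tau.shape[0], size=m, replace=True, p=tau / tau.sum())

def nystrom_approx(K, idx):
    """Plain Nystrom approximation from the selected columns (duplicates dropped)."""
    idx = np.unique(idx)
    C = K[:, idx]
    W = K[np.ix_(idx, idx)]
    # rcond is an assumed cutoff to keep the pseudo-inverse numerically stable.
    return C @ np.linalg.pinv(W, rcond=1e-10, hermitian=True) @ C.T

# Illustrative synthetic data (assumption: 200 points in 5 dimensions).
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
K = rbf_kernel(X, X)
gamma = 1.0

tau = exact_rls(K, gamma)
d_eff = tau.sum()  # effective dimension d_eff(gamma)
idx = sample_dictionary(tau, m=int(np.ceil(2 * d_eff)), rng=rng)
K_hat = nystrom_approx(K, idx)
rel_err = np.linalg.norm(K - K_hat) / np.linalg.norm(K)
```

The dictionary size here is tied to `d_eff` only to echo the abstract's claim that the number of stored points depends on the effective dimension; the constant factor of 2 is arbitrary and carries none of the paper's guarantees.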
Citations
•Posted Content
On Fast Leverage Score Sampling and Optimal Learning
TL;DR: In this article, leverage score sampling for positive definite matrices defined by a kernel is studied, and a leverage score sampling algorithm for kernel ridge regression is proposed. Performing leverage score sampling, however, is a challenge in its own right and requires further approximations.
•Proceedings Article
An Iterative, Sketching-based Framework for Ridge Regression
Agniva Chowdhury, Jiasen Yang, Petros Drineas
- 03 Jul 2018
TL;DR: It is proved that accurate approximations can be achieved by a sample whose size depends on the degrees of freedom of the ridge-regression problem rather than the dimensions of the design matrix, which is a fundamental and well-understood primitive of randomized linear algebra.
•Proceedings Article
Gaussian Process Optimization with Adaptive Sketching: Scalable and No Regret
Daniele Calandriello, Luigi Carratino, Alessandro Lazaric, Michal Valko, Lorenzo Rosasco
- 25 Jun 2019
TL;DR: BKB (budgeted kernelized bandit) is a new approximate GP algorithm for optimization under bandit feedback that achieves near-optimal regret (and hence near-optimal convergence rate) with near-constant per-iteration complexity and, remarkably, no assumption on the input space or covariance of the GP.
•Posted Content
Convergence of Sparse Variational Inference in Gaussian Processes Regression
TL;DR: It is shown that the KL-divergence between the approximate model and the exact posterior can be made arbitrarily small for a Gaussian-noise regression model, and the work addresses how M needs to grow with N to ensure high-quality approximations.
•Proceedings Article
Improved large-scale graph learning through ridge spectral sparsification
Daniele Calandriello, Ioannis Koutis, Alessandro Lazaric, Michal Valko
- 01 Jan 2018
TL;DR: By constructing a spectrally similar graph, this paper bounds the error induced by the sparsification for a variety of downstream tasks (e.g., SSL), empirically validates the theoretical guarantees on the Amazon co-purchase graph, and compares to state-of-the-art heuristics.