Estimating the Support of a High-Dimensional Distribution
TL;DR: The authors propose a method for estimating the support of a distribution from unlabeled data by fitting a function f that is positive on an estimated region S of input space and negative on its complement. The functional form of f is given by a kernel expansion in terms of a potentially small subset of the training data; it is regularized by controlling the length of the weight vector in an associated feature space.
Abstract: Suppose you are given some data set drawn from an underlying probability distribution P and you want to estimate a "simple" subset S of input space such that the probability that a test point drawn from P lies outside of S equals some a priori specified value between 0 and 1. We propose a method to approach this problem by trying to estimate a function f that is positive on S and negative on the complement. The functional form of f is given by a kernel expansion in terms of a potentially small subset of the training data; it is regularized by controlling the length of the weight vector in an associated feature space. The expansion coefficients are found by solving a quadratic programming problem, which we do by carrying out sequential optimization over pairs of input patterns. We also provide a theoretical analysis of the statistical performance of our algorithm. The algorithm is a natural extension of the support vector algorithm to the case of unlabeled data.
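The quadratic program and the pairwise optimization mentioned in the abstract can be illustrated with a small sketch. It assumes the commonly cited dual form of the one-class problem, which the abstract itself does not spell out: minimise 0.5 * sum_ij a_i a_j k(x_i, x_j) subject to 0 <= a_i <= 1/(nu * n) and sum_i a_i = 1, with decision function f(x) = sum_i a_i k(x_i, x) - rho. The Gaussian kernel, the `gamma` value, the maximal-violating-pair selection rule, and all function names are illustrative assumptions, not the authors' implementation.

```python
import math
import random

def rbf(x, y, gamma):
    """Gaussian (RBF) kernel; gamma is an illustrative width parameter."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def fit_one_class(X, nu=0.1, gamma=0.5, tol=1e-4, max_iter=100000):
    """Minimise 0.5 * a^T K a  s.t.  0 <= a_i <= 1/(nu*n), sum(a) = 1,
    by repeatedly optimising over one pair of coefficients at a time."""
    if not 0.0 < nu < 1.0:
        raise ValueError("this sketch assumes 0 < nu < 1")
    n = len(X)
    C = 1.0 / (nu * n)                      # upper bound on each coefficient
    K = [[rbf(xi, xj, gamma) for xj in X] for xi in X]
    alpha = [1.0 / n] * n                   # feasible start: sums to 1
    # g[i] = sum_k alpha_k K[i][k] is the gradient of the objective
    g = [sum(alpha[k] * K[i][k] for k in range(n)) for i in range(n)]
    eps = 1e-12

    def extremes():
        # lo: lowest gradient among coefficients that can still grow
        # hi: highest gradient among coefficients that can still shrink
        lo = min((i for i in range(n) if alpha[i] < C - eps), key=lambda t: g[t])
        hi = max((j for j in range(n) if alpha[j] > eps), key=lambda t: g[t])
        return lo, hi

    for _ in range(max_iter):
        i, j = extremes()
        if g[j] - g[i] < tol:               # no pair violates optimality
            break
        eta = K[i][i] + K[j][j] - 2.0 * K[i][j]
        delta = (g[j] - g[i]) / max(eta, 1e-12)
        delta = min(delta, C - alpha[i], alpha[j])   # stay inside the box
        alpha[i] += delta                   # the pair's sum is unchanged
        alpha[j] -= delta
        for k in range(n):
            g[k] += delta * (K[k][i] - K[k][j])
    i, j = extremes()
    rho = 0.5 * (g[i] + g[j])               # offset, taken at the margin
    return alpha, rho

def decision(x, X, alpha, rho, gamma=0.5):
    """f(x): positive inside the estimated region, negative outside."""
    return sum(a * rbf(xi, x, gamma) for a, xi in zip(alpha, X) if a > 0.0) - rho

random.seed(0)
X = [(random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)) for _ in range(40)]
nu = 0.1
alpha, rho = fit_one_class(X, nu=nu)
n_sv = sum(1 for a in alpha if a > 1e-12)
n_out = sum(1 for x in X if decision(x, X, alpha, rho) < -1e-3)
print("support vectors:", n_sv, "training outliers:", n_out)
print("f(far point) =", decision((100.0, 100.0), X, alpha, rho))
```

Under these assumptions, the box and sum constraints mean at least nu * n coefficients are nonzero and at most nu * n can sit at the upper bound, which is how nu controls the fraction of training points left outside the estimated region. Only the points with nonzero coefficients enter the kernel expansion of f.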
Citations
Laplacian twin parametric-margin support vector machine for semi-supervised classification
Zhiji Yang, Yitian Xu
TL;DR: This work proposes a Laplacian twin parametric-margin support vector machine (LTPMSVM) for semi-supervised classification, which exploits the geometric information of the marginal distribution embedded in unlabeled data to construct a more reasonable classifier.
37
Deep super-class learning for long-tail distributed image classification
Yucan Zhou, Qinghua Hu, Yu Wang
TL;DR: A block-structured sparse constraint is designed and attached on top of a convolutional neural network to accomplish representation learning, classifier training, and super-class construction in a unified end-to-end learning procedure, tackling the problem of long-tail distributed image classification.
37
A two-stage flow-based intrusion detection model for next-generation networks
TL;DR: This paper proposes a two-stage flow-based intrusion detection system for next-generation networks which uses an enhanced unsupervised one-class support vector machine and a self-organizing map which automatically groups malicious flows into different alert clusters.
Data compression by volume prototypes for streaming data
TL;DR: A one-pass algorithm is presented for obtaining volume prototypes from a data stream, along with an application to classification; an oblivion (forgetting) mechanism is also incorporated to adapt to concept drift.
37
A Cyber-Security Methodology for a Cyber-Physical Industrial Control System Testbed
TL;DR: In this article, a real-time testbed for cyber-physical industrial control systems is proposed, in which the Tennessee Eastman process is simulated in real time on a PC and closed-loop controllers are implemented on Siemens PLCs.
References
•Book
Elements of information theory
Thomas M. Cover, Joy A. Thomas
- 01 Jan 1991
TL;DR: The authors examine the role of entropy, inequalities, and randomness in the design and construction of codes.
•Book
The Nature of Statistical Learning Theory
Vladimir Vapnik
- 01 Jan 1995
TL;DR: Covers the setting of the learning problem; consistency of learning processes; bounds on the rate of convergence of learning processes; controlling the generalization ability of learning processes; constructing learning algorithms; and what is important in learning theory.
46K
Statistical learning theory
Vladimir Vapnik
- 01 Jan 1998
TL;DR: Presenting a method for determining the necessary and sufficient conditions for consistency of learning processes, the author covers function estimation from small data pools, the application of these estimates to real-life problems, and much more.
30.4K
A training algorithm for optimal margin classifiers
Bernhard E. Boser, Isabelle Guyon, Vladimir Vapnik
- 01 Jul 1992
TL;DR: A training algorithm that maximizes the margin between the training patterns and the decision boundary is presented; it is applicable to a wide variety of classification functions, including Perceptrons, polynomials, and Radial Basis Functions.
A tutorial on support vector regression
TL;DR: This tutorial gives an overview of the basic ideas underlying Support Vector (SV) machines for function estimation, and includes a summary of currently used algorithms for training SV machines, covering both the quadratic programming part and advanced methods for dealing with large datasets.