Estimating the Support of a High-Dimensional Distribution
- 1 July 2001
- journal article
- Published by MIT Press in Neural Computation
- Vol. 13 (7), 1443-1471
- https://doi.org/10.1162/089976601750264965
Abstract
Suppose you are given some data set drawn from an underlying probability distribution P and you want to estimate a “simple” subset S of input space such that the probability that a test point drawn from P lies outside of S equals some a priori specified value between 0 and 1. We propose a method to approach this problem by trying to estimate a function f that is positive on S and negative on the complement. The functional form of f is given by a kernel expansion in terms of a potentially small subset of the training data; it is regularized by controlling the length of the weight vector in an associated feature space. The expansion coefficients are found by solving a quadratic programming problem, which we do by carrying out sequential optimization over pairs of input patterns. We also provide a theoretical analysis of the statistical performance of our algorithm. The algorithm is a natural extension of the support vector algorithm to the case of unlabeled data.
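The kernel expansion, the quadratic program, and the pairwise optimization mentioned in the abstract can be illustrated with a small sketch. It is not the authors' implementation, and the abstract itself does not state the formulation. The sketch assumes the commonly stated dual form of the one-class problem (minimize 0.5 * sum_ij alpha_i alpha_j k(x_i, x_j) subject to 0 <= alpha_i <= 1/(nu*n) and sum_i alpha_i = 1), a Gaussian kernel, and a naive sweep over all pairs rather than a working-set heuristic. The names fit_one_class, decision, dual_objective, nu, and gamma are illustrative choices.

```python
import math
import random


def rbf(x, y, gamma=0.5):
    """Gaussian kernel k(x, y) = exp(-gamma * ||x - y||^2) (assumed choice)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def dual_objective(X, alpha, gamma=0.5):
    """Value of 0.5 * sum_ij alpha_i alpha_j k(x_i, x_j)."""
    n = len(X)
    return 0.5 * sum(alpha[i] * alpha[j] * rbf(X[i], X[j], gamma)
                     for i in range(n) for j in range(n))


def fit_one_class(X, nu=0.2, gamma=0.5, sweeps=200, tol=1e-9):
    """Pairwise optimization of the assumed dual; returns (alpha, rho)."""
    n = len(X)
    C = 1.0 / (nu * n)                      # upper bound on each coefficient
    K = [[rbf(xi, xj, gamma) for xj in X] for xi in X]
    alpha = [1.0 / n] * n                   # feasible start, since 1/n <= C
    # out[i] = sum_k alpha_k k(x_i, x_k), kept up to date incrementally
    out = [sum(alpha[k] * K[i][k] for k in range(n)) for i in range(n)]

    for _ in range(sweeps):
        biggest = 0.0
        for i in range(n):
            for j in range(i + 1, n):
                eta = K[i][i] + K[j][j] - 2.0 * K[i][j]
                if eta <= 1e-12:            # (near-)duplicate points: skip
                    continue
                s = alpha[i] + alpha[j]     # conserved, keeps sum(alpha) = 1
                new_j = alpha[j] + (out[i] - out[j]) / eta
                # clip so that both coefficients stay inside [0, C]
                new_j = min(max(new_j, max(0.0, s - C)), min(C, s))
                d = new_j - alpha[j]
                if abs(d) < tol:
                    continue
                alpha[i], alpha[j] = s - new_j, new_j
                for k in range(n):
                    out[k] += d * (K[k][j] - K[k][i])
                biggest = max(biggest, abs(d))
        if biggest < tol:                   # no pair moved: stop early
            break

    # Offset rho: outputs of coefficients strictly inside (0, C) equal rho.
    free = [out[i] for i in range(n) if 1e-8 < alpha[i] < C - 1e-8]
    if free:
        rho = sum(free) / len(free)
    else:                                   # fallback if none are strictly inside
        rho = max(out[i] for i in range(n) if alpha[i] >= C - 1e-8)
    return alpha, rho


def decision(x, X, alpha, rho, gamma=0.5):
    """f(x) = sum_i alpha_i k(x_i, x) - rho; negative means outside S."""
    return sum(a * rbf(xi, x, gamma) for a, xi in zip(alpha, X) if a > 0) - rho


if __name__ == "__main__":
    random.seed(0)
    data = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(20)]
    coeffs, offset = fit_one_class(data, nu=0.2)
    print("non-zero coefficients:", sum(a > 1e-8 for a in coeffs))
    print("f at a distant point:", decision((100.0, 100.0), data, coeffs, offset))
```

In this sketch nu plays the role of the specified outlier fraction: a smaller nu raises the bound 1/(nu*n) and lets the estimated region cover more of the training data. Only training points with non-zero alpha contribute to f, which corresponds to the "potentially small subset of the training data" in the abstract.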