An optimal algorithm for approximate nearest neighbor searching fixed dimensions

1 November 1998

journal article
research article
Published by Association for Computing Machinery (ACM) in Journal of the ACM

Vol. 45 (6), 891-923
https://doi.org/10.1145/293347.293348

Abstract

Consider a set S of n data points in real d-dimensional space, R-d, where distances are measured using any Minkowski metric. In nearest neighbor searching, we preprocess S into a data structure, so that given any query point q epsilon R-d, is the closest point of S to q can be reported quickly. Given any positive real epsilon, a data point p is a (1 + epsilon)-approximate nearest neighbor of q if its distance from q is within a factor of (1 + epsilon) Of the distance to the true nearest neighbor. We show that it is possible to preprocess a set of n points in R-d in O(dn log n) time and O(dn) space, so that given a query point q epsilon R-d, and epsilon > 0, a (1 + epsilon)-approximate nearest neighbor of q can be computed in O(c(d,epsilon) log n) time, where c(d,epsilon) less than or equal to d [1 + 6d/epsilon](d) is a factor depending only on dimension and epsilon. In general, we show that given an integer k greater than or equal to 1 (1 + epsilon)-approximations to the k nearest neighbors of q cart be computed in additional O(kd log n) time.

Keywords

This publication has 42 references indexed in Scilit:

The TV-tree: An index structure for high-dimensional data
The VLDB Journal, 1994
An optimal algorithm for the on-line closest-pair problem
Algorithmica, 1994
Fast closest codeword search algorithms for vector quantisation
IEE Proceedings - Vision, Image, and Signal Processing, 1994
Equal-average hyperplane partitioning method for vector quantization of image data
Pattern Recognition Letters, 1992
Refinements to nearest-neighbor searching ink-dimensional trees
Algorithmica, 1991
Indexing by latent semantic analysis
Journal of the American Society for Information Science, 1990
Rate-distortion performance of DPCM schemes for autoregressive sources
IEEE Transactions on Information Theory, 1985
A data structure for dynamic trees
Journal of Computer and System Sciences, 1983
Optimal Expected-Time Algorithms for Closest Point Problems
ACM Transactions on Mathematical Software, 1980
Analysis of an Algorithm for Finding Nearest Neighbors in Euclidean Space
ACM Transactions on Mathematical Software, 1979

Cited by 1537 articles