K‐means clustering: A half‐century synthesis
- 1 May 2006
- journal article
- review article
- Published by Wiley in British Journal of Mathematical and Statistical Psychology
- Vol. 59 (1), 1-34
- https://doi.org/10.1348/000711005x48266
Abstract
This paper synthesizes the results, methodology, and research conducted concerning the K‐means clustering method over the last fifty years. The K‐means method is first introduced, various formulations of the minimum variance loss function and alternative loss functions within the same class are outlined, and different methods of choosing the number of clusters and initialization, variable preprocessing, and data reduction schemes are discussed. Theoretic statistical results are provided and various extensions of K‐means using different metrics or modifications of the original algorithm are given, leading to a unifying treatment of K‐means and some of its extensions. Finally, several future studies are outlined that could enhance the understanding of numerous subtleties affecting the performance of the K‐means method.