An Iterative Clustering Procedure

Abstract
In many remote sensing applications, millions of measurements can be made from a satellite at one time, and much of the data is often of marginal value. In these situations, clustering techniques might save much data transmission without loss of information, since cluster codes may be transmitted instead of multidimensional data points. Data points within a cluster are highly similar, so a cluster code can be meaningfully interpreted by knowing what sort of data point is typical of its cluster. We introduce an iterative clustering technique; the procedure suboptimally minimizes the probability of differences between the binary reconstructions from the cluster codes and the original binary data. The iterative clustering technique was programmed for the GE 635 KANDIDATS (Kansas Digital Image Data System) and tested on two data sets. The first was a multi-image set: twelve images of the northern part of Yellowstone Park, taken by the Michigan scanner system, were reduced and run through the program. Thirty thousand data points, each consisting of a binary vector of 25 components, were clustered into four clusters. The reconstructed binary data differed from the original binary data in 20 percent of the components. The second data set consisted of measurements of the frequency content of the signals from lightning discharges. One hundred and thirty-four data measurements, each consisting of a binary vector of 32 components, were clustered into four clusters.
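
The abstract does not spell out the steps of the procedure, so the following is only a minimal sketch of one way an iterative clustering of binary vectors could work, not the authors' algorithm. It assumes a k-modes-style loop: each point is assigned to the nearest binary prototype by Hamming distance, and each prototype is then reset to the component-wise majority of its members. The names `cluster_binary`, `hamming`, and `mismatch_fraction` are hypothetical, and random initialization from the data is an assumption. The mismatch fraction corresponds to the kind of figure the abstract reports (the share of components that differ between the reconstruction and the original).

```python
import random

def hamming(a, b):
    """Number of components in which two binary vectors differ."""
    return sum(x != y for x, y in zip(a, b))

def cluster_binary(points, k, iterations=20, seed=0):
    """Assumed k-modes-style loop; returns (cluster codes, binary prototypes)."""
    rng = random.Random(seed)
    # Assumption: initial prototypes are k points drawn from the data.
    prototypes = [list(p) for p in rng.sample(points, k)]
    codes = None
    for _ in range(iterations):
        # Assignment step: each point gets the code of its nearest prototype.
        new_codes = [
            min(range(k), key=lambda c: hamming(p, prototypes[c]))
            for p in points
        ]
        # Update step: majority vote per component; empty clusters are kept.
        for c in range(k):
            members = [p for p, code in zip(points, new_codes) if code == c]
            if not members:
                continue
            n = len(members)
            prototypes[c] = [1 if 2 * sum(col) > n else 0 for col in zip(*members)]
        if new_codes == codes:
            break  # assignments stopped changing
        codes = new_codes
    return codes, prototypes

def mismatch_fraction(points, codes, prototypes):
    """Share of components where the reconstruction differs from the data."""
    differing = sum(hamming(p, prototypes[c]) for p, c in zip(points, codes))
    return differing / (len(points) * len(points[0]))

if __name__ == "__main__":
    # Toy data (made up for illustration): noisy copies of two patterns.
    data = [
        [1, 1, 1, 1, 0, 0, 0, 0],
        [1, 1, 1, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 1, 1, 1, 1],
        [0, 0, 0, 1, 1, 1, 1, 1],
    ]
    codes, prototypes = cluster_binary(data, k=2)
    print(codes, prototypes, mismatch_fraction(data, codes, prototypes))
```

Under this scheme, a receiver that holds the prototypes reconstructs each point from its code alone. With four clusters, a code fits in 2 bits, compared with 25 bits for each raw vector in the Yellowstone data set, setting aside the one-time cost of sending the prototypes.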
