Analysis of the clustering properties of the Hilbert space-filling curve

Abstract
Several schemes for the linear mapping of a multidimensional space have been proposed for various applications, such as access methods for spatio-temporal databases and image compression. In these applications, one of the most desired properties from such linear mappings is clustering, which means the locality between objects in the multidimensional space being preserved in the linear space. It is widely believed that the Hilbert space-filling curve achieves the best clustering (Abel and Mark, 1990; Jagadish, 1990). We analyze the clustering property of the Hilbert space-filling curve by deriving closed-form formulas for the number of clusters in a given query region of an arbitrary shape (e.g., polygons and polyhedra). Both the asymptotic solution for the general case and the exact solution for a special case generalize previous work. They agree with the empirical results that the number of clusters depends on the hypersurface area of the query region and not on its hypervolume. We also show that the Hilbert curve achieves better clustering than the z curve. From a practical point of view, the formulas given provide a simple measure that can be used to predict the required disk access behaviors and, hence, the total access time.

This publication has 24 references indexed in Scilit: