COMMUNICATION PIPELINING IN HYPERCUBES

Abstract
Many parallel algorithms exhibit a hypercube communication topology. Such algorithms can easily be executed on a multicomputer with a hypercube interconnection topology. However, in most cases these parallel algorithms only make use of a small fraction of the interconnection bandwidth offered by the multicomputer. In particular, each processor of a hypercube multicomputer is connected to d different neighbors by d different links. Nevertheless, hypercube algorithms usually do not use more than one of these d links at the same time. This paper presents a technique called communication pipelining that enables a more efficient use of the interconnection network and, in consequence, a significant reduction in the execution time. This technique is based on a transformation of the original algorithm. The resulting equivalent algorithm makes use of several links of each node simultaneously. Given a particular problem and a particular architecture. the degree of pipelining to be applied is a design parameter that must be decided when transforming the original algorithm. The paper presents analytical models that allow for an optimal choice of the degree of pipelining for each problem and a given architecture. To illustrate the performance of the communication pipelining technique, its application to the FFT computation is presented as an example. It is shown that an optimal choice of the degree of pipelining can achieve a reduction by a factor of d in the communication overhead of the algorithm.