Sorting with linear speedup on a pipelined hypercube

Abstract
The authors formally define a distributed-memory parallel architecture called the pipelined hypercube. A coarse-grained parallel sorting algorithm that can be mapped efficiently on such an architecture is also presented. The pipelined hypercube has a more powerful communication mechanism than the traditional binary code architecture, in that it permits communication of blocks of data between processing elements (PEs) to be performed in a pipelined manner. Certain data communication problems which would probably be serialized on the binary code architecture, can be performed optimally on the pipelined hypercube. The sorting algorithm can be mapped efficiently onto a pipelined hypercube of P PEs. It sorts N data items, initially distributed among the PEs, in time O((N log N/P)+log/sup 2/ P), thereby achieving linear speedup when P is O(N/log N).

This publication has 23 references indexed in Scilit: