The Parallel Multipole Method on the Connection Machine

Abstract
This paper reports on a fast implementation of the three-dimensional nonadaptive Parallel Multipole Method (PMM) on the Connection Machine system model CM–2. The data interactions within the decomposition tree are modeled by a hierarchy of three-dimensional grids forming a pyramid in which parent nodes have degree eight. The base of the pyramid is embedded in the Connection Machine as a three-dimensional grid. The standard grid embedding feature is used. For 10 or more particles per processor the communication time is insignificant. The evaluation of the potential field for a system with 128k particles takes 5 seconds, and a system of a million particles about 3 minutes. The maximum number of particles that can be represented in 2G bytes of primary storage is $ \sim 50$ million. The execution rate of this implementation of the PMM is at about 1.7 Gflops/sec for a particle-processor-ratio of 10 or greater. A further speed improvement is possible by an improved use of the memory hierarchy associated with each floating-point unit in the system.

This publication has 8 references indexed in Scilit: