A matching approach to utilizing fine-grained parallelism

Abstract

An overview is presented of a system in which the compiler and the architecture cooperate, through reconfiguration, to match the architecture to the program. The architecture is a reconfigurable long-instruction-word machine that exposes fine-grained parallelism to the compiler, which detects and schedules the parallelism in a program. The compiler uses several techniques to transform both the program code and the architecture to obtain a match: distribution of parallelism using the notion of regions, code scheduling according to regions, and allocation of data into multiple memory modules. Experiments were performed to determine the effects of reconfiguration and of the compiler techniques on the performance of both scientific and nonscientific programs. The results indicate that the reconfigurable nature of the architecture is responsible for a substantial part of the speedup and that the memory-bottleneck problem faced in designing parallel systems is solved.
