The performance potential of multiple functional unit processors

Abstract
The interaction of pipelining and multiple functional units in single-processor machines is investigated, to gain an understanding of how each of these techniques contributes to performance improvement. A CRAY-like processor model is studied, with the issue rate (instructions per clock cycle) as the performance measure. The base, nonpipelined machine is systematically augmented with additional hardware features, and the performance impact of each feature is evaluated. It is found that in nonvector machines, pipelining multiple functional units does not provide significant performance improvements. Dataflow limits are then derived for benchmark programs to determine the performance potential of each benchmark. In addition, other limits are computed that apply more realistic constraints on a computation. Based on these more realistic limits, it is determined to be worthwhile to investigate the performance improvements that can be achieved by issuing multiple instructions during each clock cycle. Several hardware approaches to issuing multiple instructions per clock cycle are evaluated.

Author(s)
Pleszkun, A.R., Dept. of Comput. Sci., Wisconsin Univ., Madison, WI, USA
Sohi, G.S.
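
As an illustration of the dataflow limit and issue-rate measure described in the abstract, the following is a minimal sketch and not the authors' method. It assumes that only true (read-after-write) dependences constrain execution, that hardware resources are unlimited, and that each instruction has a fixed latency. The `Instr` type, the register names, and the latencies are hypothetical and are not taken from the paper or from any CRAY machine.

```python
from typing import NamedTuple, Sequence, Tuple


class Instr(NamedTuple):
    """One instruction in a trace (hypothetical representation)."""
    dest: str               # register written
    srcs: Tuple[str, ...]   # registers read
    latency: int            # cycles until the result is available (assumed)


def dataflow_limit(trace: Sequence[Instr]) -> int:
    """Length in cycles of the longest true-dependence chain in the trace."""
    ready = {}   # register -> cycle at which its latest value is available
    finish = 0
    for ins in trace:
        # An instruction can start once all of its source values exist.
        start = max((ready.get(r, 0) for r in ins.srcs), default=0)
        done = start + ins.latency
        ready[ins.dest] = done
        finish = max(finish, done)
    return finish


def issue_rate_limit(trace: Sequence[Instr]) -> float:
    """Upper bound on instructions per clock cycle under the assumptions above."""
    cycles = dataflow_limit(trace)
    return len(trace) / cycles if cycles else 0.0


# Hypothetical five-instruction trace with made-up latencies.
trace = [
    Instr("a", (), 3),          # load a
    Instr("b", (), 3),          # load b
    Instr("c", ("a", "b"), 2),  # c = a + b
    Instr("d", ("c", "c"), 4),  # d = c * c
    Instr("e", ("a", "a"), 2),  # e = a + a
]

print(dataflow_limit(trace))    # critical path: load, add, multiply
print(issue_rate_limit(trace))
```

For this trace the longest chain is load, add, multiply (3 + 2 + 4 = 9 cycles), so five instructions in nine cycles give a bound of about 0.56 instructions per clock cycle. Adding resource or issue constraints to such a model would lower the bound, which corresponds to the "more realistic limits" the abstract mentions.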