Highly concurrent scalar processing

Abstract
High-speed scalar processing is an essential characteristic of high-performance general-purpose computer systems. Highly concurrent execution of scalar code is difficult due to data dependencies and conditional branches. This paper proposes an architectural concept called guarded instructions to reduce the penalty of conditional branches in deeply pipelined processors. A code generation heuristic, the decision tree scheduling technique, reorders instructions in a complex of basic blocks so as to make efficient use of guarded instructions. Performance evaluations of several benchmarks are presented, including a module from the UNIX kernel. Even with these difficult scalar code examples, a speedup of two is achievable by using conventional pipelined uniprocessors augmented by guarded instructions, and a speedup of three or more can be achieved using processors with parallel instruction pipelines.
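
As an illustration only, the general idea behind a guarded instruction can be sketched as follows: each operation names a guard, and it takes effect only when that guard holds, so a short if/else needs no branch in the instruction stream. The sketch is not the encoding or semantics defined in the paper; the names `GuardedOp`, `run`, `guard`, and `guard_sense` are hypothetical and chosen for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Regs = Dict[str, int]


@dataclass
class GuardedOp:
    """One operation with an optional guard (hypothetical model, not the paper's format)."""
    dest: str                          # register written when the op takes effect
    compute: Callable[[Regs], int]     # value to write, computed from current registers
    guard: Optional[str] = None        # register holding the guard; None means "always execute"
    guard_sense: bool = True           # op takes effect when bool(regs[guard]) == guard_sense


def run(ops: List[GuardedOp], regs: Regs) -> Regs:
    """Execute ops strictly in order; a failed guard turns the op into a no-op."""
    regs = dict(regs)                  # work on a copy so the caller's registers are untouched
    for op in ops:
        if op.guard is None or bool(regs[op.guard]) == op.guard_sense:
            regs[op.dest] = op.compute(regs)
        # No else branch: a squashed op changes nothing, and control never jumps.
    return regs


# Branch-free form of:  if a < b: m = a  else: m = b
program = [
    GuardedOp("p", lambda r: int(r["a"] < r["b"])),                 # set the guard
    GuardedOp("m", lambda r: r["a"], guard="p", guard_sense=True),  # "then" side
    GuardedOp("m", lambda r: r["b"], guard="p", guard_sense=False), # "else" side
]

if __name__ == "__main__":
    print(run(program, {"a": 3, "b": 7})["m"])
    print(run(program, {"a": 9, "b": 2})["m"])
```

Both sides of the conditional occupy slots in a straight-line sequence, which is what lets a scheduler reorder them across what would otherwise be basic-block boundaries.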
