Trace cache: a low latency approach to high bandwidth instruction fetching

24 December 2002

proceedings article
Published by Institute of Electrical and Electronics Engineers (IEEE)

Vol. 25, 24-34
https://doi.org/10.1109/micro.1996.566447

Abstract

As the issue width of superscalar processors is increased, instruction fetch bandwidth requirements will also increase. It will become necessary to fetch multiple basic blocks per cycle. Conventional instruction caches hinder this effort because long instruction sequences are not always in contiguous cache locations. We propose supplementing the conventional instruction cache with a trace cache. This structure caches traces of the dynamic instruction stream, so instructions that are otherwise noncontiguous appear contiguous. For the Instruction Benchmark Suite (IBS) and SPEC92 integer benchmarks, a 4-kilobyte trace cache improves performance on average by 28% over conventional sequential fetching. Further, it is shown that the trace cache's efficient, low-latency approach enables it to outperform more complex mechanisms that work solely out of the instruction cache.
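
The idea in the abstract, storing dynamic instruction sequences so that noncontiguous instructions can be fetched as one contiguous unit, can be illustrated with a small software model. The sketch below is not the authors' hardware design. The names `Trace`, `TraceCache`, `fill`, and `fetch`, the limits of 16 instructions and 3 branches per trace, and the choice of one trace per starting address are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Sequence, Tuple


@dataclass
class Trace:
    start_pc: int                       # address of the first instruction
    branch_outcomes: Tuple[bool, ...]   # taken / not-taken for each branch in the trace
    pcs: List[int]                      # instruction addresses in dynamic order


class TraceCache:
    """Toy model: one trace per starting address (assumed organization)."""

    def __init__(self, max_insts: int = 16, max_branches: int = 3) -> None:
        self.max_insts = max_insts          # assumed line capacity
        self.max_branches = max_branches    # assumed branch limit per trace
        self.lines: Dict[int, Trace] = {}

    def fill(self, executed: Sequence[Tuple[int, bool, bool]]) -> Optional[Trace]:
        """Record a trace from (pc, is_branch, taken) tuples in execution order."""
        pcs: List[int] = []
        outcomes: List[bool] = []
        for pc, is_branch, taken in executed:
            if len(pcs) == self.max_insts:
                break
            pcs.append(pc)
            if is_branch:
                outcomes.append(taken)
                if len(outcomes) == self.max_branches:
                    break
        if not pcs:
            return None
        trace = Trace(pcs[0], tuple(outcomes), pcs)
        self.lines[trace.start_pc] = trace
        return trace

    def fetch(self, pc: int, predictions: Sequence[bool]) -> Optional[List[int]]:
        """Return the stored trace on a hit, or None on a miss.

        A hit requires a trace starting at `pc` whose recorded branch
        outcomes match the leading branch predictions.
        """
        trace = self.lines.get(pc)
        if trace is None:
            return None
        n = len(trace.branch_outcomes)
        if tuple(predictions[:n]) != trace.branch_outcomes:
            return None
        return trace.pcs
```

In this model, a hit delivers instructions from several basic blocks in a single lookup even though their addresses are not adjacent. On a miss, a real design would fall back to the conventional instruction cache, which the sketch leaves out.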
