Title:
Intelligent Cache Management by Exploiting Dynamic UTI/MTI Behavior

Author(s)
Fryman, Joshua Bruce
Huneycutt, Chad Marcus
Snyder, Luke Aron
Loh, Gabriel H.
Lee, Hsien-Hsin Sean
Abstract
This work addresses the growing performance disparity between the microprocessor and the memory subsystem. Current L1 caches fabricated in deep submicron processes must either shrink to maintain timing or suffer higher latencies, exacerbating the problem. We introduce a new classification for the behavior of memory traffic, which we refer to as target behavior. Target behavior falls into two categories: Uni-Targeted Instructions (UTI) and Multi-Targeted Instructions (MTI). On average, 30% of all dynamic memory LD/ST operations come from the execution of UTIs, yet only a few hundred static instructions are actually UTIs. This makes isolation of the UTI targets an avenue for optimization. Adding a small, fast cache structure that contains only UTI data would ideally reduce MTI pollution of UTI information. By intelligently selecting between larger, slower data caches and our UTI cache, we reduce the latency problem while increasing performance. Our distinct contributions fall in three areas, with implications for many others: (1) we present a new characterization of memory traffic based on the number of targets from LD/ST instructions; (2) we explore the underlying nature of the target division and devise a simple mechanism for exploiting regularity based on a UTI cache; (3) we explore a variety of prediction mechanisms and processor configuration options to determine sensitivity and the performance gains actually attainable under different modern processor configurations. We attain IPC improvements of up to 42% on SPEC2000, with a mean improvement of 8%. Our solution also reduces L2 accesses by up to 89% (average 29%), while reducing load-load violation traps by up to 84% (average 13%) and store-load violation traps by up to 43% (average 8%).
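The target classification described in the abstract can be illustrated with a minimal sketch. It assumes that a UTI is a static LD/ST instruction whose dynamic executions all touch a single target address, and that an MTI touches more than one; the abstract does not give the exact definition, so that reading is an assumption. The function name `classify_targets` and the trace format of `(pc, address)` pairs are hypothetical and are not taken from the report.

```python
from collections import defaultdict


def classify_targets(trace):
    """Split static LD/ST instructions into UTI and MTI sets.

    trace: iterable of (pc, address) pairs, one per dynamic LD/ST.
    Assumption: an instruction is a UTI if it only ever accesses one
    distinct address, and an MTI otherwise.
    Returns (uti_pcs, mti_pcs, uti_dynamic_fraction).
    """
    targets = defaultdict(set)   # pc -> distinct addresses accessed
    counts = defaultdict(int)    # pc -> number of dynamic executions

    for pc, addr in trace:
        targets[pc].add(addr)
        counts[pc] += 1

    uti = {pc for pc, addrs in targets.items() if len(addrs) == 1}
    mti = set(targets) - uti

    total = sum(counts.values())
    # Fraction of dynamic LD/ST operations issued by UTIs.
    uti_fraction = sum(counts[pc] for pc in uti) / total if total else 0.0
    return uti, mti, uti_fraction
```

Run over a memory trace, a function like this yields both the static UTI count and the share of dynamic LD/ST operations they account for, which are the two quantities the abstract reports (a few hundred static instructions and roughly 30% of dynamic operations).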
Date Issued
2005
Extent
155253 bytes
Resource Type
Text
Resource Subtype
Technical Report