TIP: Cost-Benefit Analysis for Informed Prefetching & Caching

The cache manager's goal is to allocate cache buffers so as to minimize application execution time. It would like to take advantage of application disclosures for aggressive prefetching and preferential caching of hinted blocks. At the same time, it must continue to cache blocks for unhinted accesses, which rely on the cache for good performance. Without a principled approach, it is not at all clear when a cache buffer should be devoted to prefetching, to caching for hinted re-use, or to caching for unhinted accesses.

Because the goal is to reduce application execution time, we developed a system model that predicts how execution time would change if a cache buffer were allocated to prefetching or de-allocated from caching. This model is the basis for our cost-benefit analysis. We designed a system that continually applies this analysis to balance the benefit of prefetching against the cost of ejecting a cached block and, if the benefit exceeds the cost, to choose for ejection the block whose loss would cost the system the least, as suggested by the figure below.
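
As a rough illustration of that decision rule, here is a minimal sketch in C. It assumes each cached block already carries an ejection-cost estimate and that the prefetch benefit is expressed in the same units; the names and numbers are ours, for illustration only, not TIP-2's.

    #include <stddef.h>
    #include <stdio.h>

    /* Illustrative sketch of the decision rule described above (not the
     * TIP-2 code).  Costs and benefits are changes in I/O service time,
     * expressed in a common unit. */

    /* Index of the cached block that is cheapest to eject, or -1 if none. */
    static int cheapest_block(const double *ejection_cost, size_t nblocks)
    {
        int cheapest = -1;
        for (size_t i = 0; i < nblocks; i++)
            if (cheapest < 0 || ejection_cost[i] < ejection_cost[cheapest])
                cheapest = (int)i;
        return cheapest;
    }

    /* Reallocate a buffer to prefetching only when the estimated benefit
     * exceeds the cost of ejecting the globally cheapest cached block. */
    int should_prefetch(double prefetch_benefit,
                        const double *ejection_cost, size_t nblocks,
                        int *victim)
    {
        *victim = cheapest_block(ejection_cost, nblocks);
        return *victim >= 0 && prefetch_benefit > ejection_cost[*victim];
    }

    int main(void)
    {
        /* Hypothetical ejection costs for three cached blocks, and the
         * benefit of prefetching one hinted block, in the same unit. */
        double cost[] = { 4.0, 0.5, 2.5 };
        int victim;
        if (should_prefetch(3.0, cost, 3, &victim))
            printf("eject block %d and prefetch into its buffer\n", victim);
        else
            printf("keep the cache as it is\n");
        return 0;
    }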

The TIP-2 system architecture includes three key components; a rough sketch of how they fit together follows the list.

  1. Independent estimators dynamically estimate either (1) the benefit (reduction in I/O service time) of allocating a buffer for prefetching or a demand access, or (2) the cost (increase in I/O service time) of taking a buffer from the LRU queue or from the cache of hinted blocks. Because the estimators are independent, they are relatively simple. Estimator independence also makes the system extensible, because adding a new estimator does not require changes to the existing ones.
  2. A common currency for the expression of the cost and benefit estimates allows these independent estimates to be compared at a global level.
  3. An efficient allocation algorithm finds the globally least valuable buffer and reallocates it for the greatest benefit when the estimated benefit exceeds the anticipated cost.
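
The structural sketch below, again with our own naming rather than the TIP-2 interfaces, shows the idea: each estimator reports the value of its least valuable buffer in the common currency, so the allocator can find the globally least valuable buffer without knowing how any estimate was computed, and a new estimator can be added without touching the others.

    #include <stdio.h>

    /* Structural sketch (our naming, not the TIP-2 source): every estimator
     * reports, in a common currency, the cost of surrendering the least
     * valuable buffer in the pool it manages.  The allocator compares these
     * reports without knowing how any of them was computed, so a new
     * estimator can be added without disturbing the existing ones. */

    struct estimator {
        const char *name;
        /* Cost, in common-currency units (increase in I/O service time),
         * of giving up this estimator's least valuable buffer. */
        double (*least_valuable)(void);
    };

    /* Toy cost reporters standing in for the real estimators. */
    static double lru_cost(void)    { return 1.8; }   /* LRU queue            */
    static double hinted_cost(void) { return 0.6; }   /* cache of hinted data */

    int main(void)
    {
        struct estimator estimators[] = {
            { "LRU queue",    lru_cost    },
            { "hinted cache", hinted_cost },
        };
        const int n = sizeof estimators / sizeof estimators[0];

        /* Benefit, in the same units, of allocating one buffer to prefetch
         * the next hinted block (a made-up number for illustration). */
        double prefetch_benefit = 1.1;

        /* Find the globally least valuable buffer across all estimators. */
        int cheapest = 0;
        for (int i = 1; i < n; i++)
            if (estimators[i].least_valuable() < estimators[cheapest].least_valuable())
                cheapest = i;

        if (prefetch_benefit > estimators[cheapest].least_valuable())
            printf("take a buffer from the %s and use it to prefetch\n",
                   estimators[cheapest].name);
        else
            printf("prefetching is not worth a buffer right now\n");
        return 0;
    }
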
All of our test applications spend a significant fraction of their execution time stalled waiting for I/O completion on our test machine, a 150 MHz DEC Alpha (3000/500) running OSF/1 v2.0 with 128 MBytes of RAM (including 12 MBytes of file buffer cache) and a software array of up to 15 disks. We added our informed prefetching and caching resource management and ran two sets of tests. The first shows the ability of informed prefetching to exploit storage parallelism; the second shows the ability of our informed caching replacement policy to adapt to disclosed access patterns.


Figure 1 shows the effect of informed prefetching on the I/O-stall time experienced by each application as the application's data is striped over wider disk arrays. With only one disk, informed prefetching eliminates 50% of the I/O-stall time for two of the test programs, full-text search and database join, because it can schedule disk accesses more effectively and overlap computation with I/O. For all applications except the computational chemistry code, informed prefetching eliminates 75% to 95% of the I/O-stall time when prefetching in parallel from as few as six disks.

For the Davidson computational chemistry application, which repeatedly reads a 17 MByte file sequentially, informed prefetching in fact works quite well, but it does little more than keep up with the aggressive sequential readahead already coded into OSF/1's file system. However, prefetching is only part of the story for this application: because the code repeatedly reads a file large enough to overflow the cache, a caching policy that replaces the least recently used buffer is exactly wrong. Instead, the most recently accessed blocks of this file should be replaced.
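
A toy simulation, ours rather than anything from the paper, makes the point concrete: reading a 32-block file sequentially five times through an 8-buffer cache, LRU replacement never hits again after the first pass, while evicting the most recently used block retains a stable subset of the file and keeps hitting on it every pass.

    #include <stdio.h>

    #define CACHE_BLOCKS 8
    #define FILE_BLOCKS  32
    #define PASSES       5

    /* Toy replacement-policy simulator (ours, not from the paper): a file of
     * FILE_BLOCKS blocks is read sequentially PASSES times through a cache
     * of CACHE_BLOCKS buffers.  evict_mru selects the replacement policy. */
    static long simulate(int evict_mru)
    {
        int  block[CACHE_BLOCKS];
        long last_used[CACHE_BLOCKS];
        int  used = 0;
        long now = 0, hits = 0;

        for (int pass = 0; pass < PASSES; pass++) {
            for (int b = 0; b < FILE_BLOCKS; b++) {
                int slot = -1;
                for (int i = 0; i < used; i++)
                    if (block[i] == b) { slot = i; break; }
                if (slot >= 0) {
                    hits++;                       /* block already cached  */
                } else if (used < CACHE_BLOCKS) {
                    slot = used++;                /* free buffer available */
                    block[slot] = b;
                } else {
                    slot = 0;                     /* choose a victim       */
                    for (int i = 1; i < CACHE_BLOCKS; i++)
                        if (evict_mru ? last_used[i] > last_used[slot]
                                      : last_used[i] < last_used[slot])
                            slot = i;
                    block[slot] = b;
                }
                last_used[slot] = now++;
            }
        }
        return hits;
    }

    int main(void)
    {
        printf("LRU hits: %ld of %d accesses\n", simulate(0), PASSES * FILE_BLOCKS);
        printf("MRU hits: %ld of %d accesses\n", simulate(1), PASSES * FILE_BLOCKS);
        return 0;
    }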




Our informed cache manager's adaptive replacement policy leverages the disclosed access pattern to improve cache performance, as shown in the right-hand figure. Without hints, growing the file cache does not reduce execution time until the entire data file fits. In contrast, the informed cache manager's cost-benefit analysis determines that the most-recently-used block offers the least caching benefit and replaces it first. Hence the informed cache manager 'discovers' a most-recently-used replacement policy for this application and makes effective use of each additional buffer.
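
One way to see why the analysis lands on this policy, sketched below with our own simplifications rather than the TIP-2 estimator: when hints reveal when each cached block will next be read, the block whose next access is farthest away is the cheapest to give up, and during a repeated sequential scan that block is always the one read most recently.

    #include <stdio.h>

    #define FILE_BLOCKS 32

    /* Sketch (ours): with a cyclic sequential hint stream over FILE_BLOCKS
     * blocks, the number of accesses until block b is read again, measured
     * just after block `just_read` was consumed. */
    static int next_access_distance(int b, int just_read)
    {
        int d = b - just_read;
        return d > 0 ? d : d + FILE_BLOCKS;   /* wrap around to the next pass */
    }

    int main(void)
    {
        /* Suppose blocks 0..7 are cached and the application just read 7. */
        int cached[] = { 0, 1, 2, 3, 4, 5, 6, 7 };
        int just_read = 7;
        int victim = cached[0];

        for (int i = 1; i < 8; i++)
            if (next_access_distance(cached[i], just_read) >
                next_access_distance(victim, just_read))
                victim = cached[i];

        /* The farthest next access, hence the lowest caching value, belongs
         * to the block read most recently: block 7. */
        printf("eject block %d (next needed in %d accesses)\n",
               victim, next_access_distance(victim, just_read));
        return 0;
    }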

For more information on this work, please refer to our publications. The best paper to start with is Informed Prefetching and Caching (postscript, pdf).