PARALLEL DATA LAB

Automatic I/O Hint Generation through
Speculative Execution

Many applications issue large numbers of disk requests in order to access data that is too large or too infrequently used to be found in memory. Due to the huge (and growing) disparity between processor cycle times and disk access latencies, these applications waste a significant fraction of their time waiting for disk requests to complete.

Prefetching is a well-known technique for hiding fetch latency, but it is effective only if we can determine what data to prefetch and when. Most file systems perform only sequential prefetching, and so fail to hide the I/O latency of applications with non-trivial access patterns. Previous research on TIP has shown that we can achieve much better performance by manually modifying applications to issue hints that disclose their future I/O accesses to an informed prefetching and caching manager. Manual modification is not an ideal solution, however, because it requires source code and may demand significant programmer effort.
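To make hint disclosure concrete, the sketch below shows the shape of a manually hinted application. The `disclose` function and the hint format are illustrative stand-ins, not TIP's actual interface; a real informed prefetcher would receive these hints in the kernel and overlap the prefetch I/O with computation.

```python
# Hypothetical sketch of manual hint disclosure, TIP-style.
# disclose() stands in for the system's real hint call; the hint format
# (byte offset, length) is illustrative only.

issued_hints = []

def disclose(offset, length):
    """Record a hint describing a future read of `length` bytes at `offset`."""
    issued_hints.append((offset, length))

def run_app(access_pattern, block=4096):
    """The application knows its non-sequential access pattern up front,
    so it discloses every future read before performing any of them."""
    for off in access_pattern:
        disclose(off, block)
    # ... the actual reads follow, overlapping with prefetch I/O ...
    return len(issued_hints)

print(run_app([4096, 0, 12288]))  # strided, non-sequential pattern; prints 3
```

The point of the manual approach is exactly this up-front knowledge: the programmer must already understand the access pattern well enough to disclose it, which is what makes manual modification costly.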

We propose a novel approach to automatically extracting accurate prefetching information that could potentially apply to a wide range of applications. Our approach leverages the observation that processors, particularly on client workstations, are often idle while applications are stalled waiting for I/O requests to complete. We propose that a wide range of disk-bound applications could dynamically discover and disclose their own future I/O accesses by opportunistically exploiting any unused processing cycles to perform speculative execution, an eager pre-execution of application code using the available, incomplete data state. If this approach is successful, then, as illustrated below, the potential benefit may be significant.
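The core intuition can be simulated in a few lines. In this toy model (all names and mechanics are illustrative, not our actual design), the "normal" execution stalls on an unhinted miss, and during that stall the otherwise-idle processor speculatively runs ahead over the access sequence, disclosing the reads it encounters as hints:

```python
# Toy simulation of speculative hint generation: while normal execution is
# stalled on a miss, a speculative copy pre-executes up to `budget` future
# accesses and discloses them as prefetch hints. Illustrative names only.

def speculate(accesses, pos, budget):
    """Pre-execute up to `budget` future accesses starting at `pos`,
    returning them as hints."""
    return accesses[pos:pos + budget]

def run(accesses, cache, budget=2):
    hints, stalls = set(), 0
    for i, blk in enumerate(accesses):
        if blk not in cache and blk not in hints:
            stalls += 1                      # unhinted miss: full stall
            # the CPU is idle during the stall, so speculate ahead
            hints.update(speculate(accesses, i + 1, budget))
        cache.add(blk)                       # block is resident after the read
    return stalls

# Without speculation, all four distinct blocks would stall; with it, later
# misses were already disclosed while the first stall was in progress.
print(run(["a", "b", "c", "d"], cache=set()))  # prints 2
```

In the real setting, of course, speculation pre-executes application code against incomplete data state rather than walking a known trace, which is exactly why the correctness and resynchronization machinery described next is needed.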



We designed a method for transforming application binaries to employ the speculative execution approach using only simple, generic static analyses and transformations. In addition to not requiring source code, this design does not require specialized operating system support. Our implementation takes advantage of the TIP prefetching and caching manager, but similar functionality could be provided in user-level code; no other specialization is used. Our design has three fundamental properties:

  1. It ensures that executing the transformed application will produce the same results as executing the original application by fully isolating the effects of performing speculative execution;
  2. It encourages correct and timely hinting by resynchronizing speculative and normal execution when resynchronization is observed to be necessary; and
  3. It is careful to avoid adding noticeable overhead to normal execution.
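Property (1) can be illustrated with a small sketch. Here, speculative writes go to a copy-on-write overlay rather than the application's real state, so nothing speculation does can change the results of normal execution. This is a conceptual analogy only; the actual SpecHint tool enforces isolation through binary transformation, not a wrapper class like this one.

```python
# Conceptual sketch of isolating speculative side effects: speculation reads
# through to normal state but writes only to a private copy-on-write overlay.
# Hypothetical and simplified; not the actual SpecHint mechanism.

class SpeculativeState:
    def __init__(self, normal_state):
        self._normal = normal_state
        self._shadow = {}                    # copy-on-write overlay

    def read(self, key):
        return self._shadow.get(key, self._normal.get(key))

    def write(self, key, value):
        self._shadow[key] = value            # never touches normal state

state = {"x": 1}
spec = SpeculativeState(state)
spec.write("x", 99)          # speculation mutates only its overlay
assert state["x"] == 1       # normal execution's state is unaffected
assert spec.read("x") == 99  # speculation sees its own writes
```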
We developed a binary modification tool, SpecHint, which modifies Alpha binaries according to our design. We evaluated the design using the TIP benchmark suite on a 233 MHz AlphaStation 255 running Digital Unix 3.2, with the data files striped across HP C2247 disks. We found that, as shown below, our initial, naive design often yields significant performance benefits, especially when the I/O system supports I/O parallelism.



On the other hand, the design incurs a significant performance penalty in one case, and is often much less successful than manual modification. The problem is that our initial, naive design makes no attempt to address speculative execution's inherent sensitivity to the amount of inter-read code and the existence of data dependencies in applications. For example, data dependencies in Gnuld cause speculative execution to mispredict Gnuld's future data needs. The mispredictions result in a significant performance penalty when there is only one disk because prefetches of mispredicted data consume scarce bandwidth. We suggest that we can decrease speculative execution's sensitivity to these factors by making speculation self-aware through a combination of more sophisticated static and dynamic analyses.
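The resynchronization idea of property (2) above, which self-aware speculation builds on, can be sketched as follows. The speculative side's outstanding hints are checked against the reads the application actually performs; a mismatch means speculation has strayed (for example, a data dependency took a path the speculation could not resolve), so the stale hints are discarded and speculation restarts from the current point. The names and the queue representation are illustrative, not the implementation.

```python
# Sketch of hint checking with resynchronization: confirm each actual read
# against the oldest outstanding hint; on a mismatch, drop all stale hints
# so speculation can restart from the application's true position.
# Illustrative only.

def check_and_resync(hint_queue, actual):
    """Return (hit, remaining_hints). A mismatch triggers resynchronization
    by discarding every outstanding hint."""
    if hint_queue and hint_queue[0] == actual:
        return True, hint_queue[1:]          # hint confirmed; consume it
    return False, []                         # mispredicted: resynchronize

hits = 0
queue = ["a", "b", "x"]                      # speculation mispredicted "x"
for blk in ["a", "b", "c"]:
    hit, queue = check_and_resync(queue, blk)
    hits += hit
print(hits)   # prints 2: the first two hints were correct
```

A Gnuld-style misprediction corresponds to the third step here: once the mismatch is detected, the wrong hints stop consuming disk bandwidth and speculation is restarted in a consistent state.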

We are exploring a variety of different techniques. Despite fairly unsophisticated implementations, our results indicate that we can increase the benefits of speculative execution by making speculation self-aware. With our current implementations, for example, the mean performance improvement of our automatic approach is 47%, 78%, and 82% of the mean improvement achieved by manually modifying the applications to issue hints, in the 1-, 2-, and 4-disk configurations, respectively. Furthermore, as shown below, our techniques can decrease the processing cycles needed to generate correct hints, allowing speculative execution to produce significant benefits even when there is substantial competition for processing cycles.



For more information on this work, please refer to:

Fay Chang and Garth A. Gibson. "Automatic I/O Hint Generation through Speculative Execution." Proceedings of the 3rd Symposium on Operating Systems Design and Implementation (OSDI), February 1999.