Many applications issue large numbers of disk requests in order to access data that is too large or too infrequently used to be found in memory. Due to the huge (and growing) disparity between processor cycle times and disk access latencies, these applications waste a significant fraction of their time waiting for disk requests to complete.
Prefetching is a well-known technique for hiding fetch latency, but it is effective only if we can determine what data to prefetch and when. Most file systems perform only sequential prefetching, failing to hide the I/O latency of applications with non-trivial access patterns. Previous research on TIP has shown that we can achieve much better performance by manually modifying applications to issue hints that disclose their future I/O accesses to an informed prefetching and caching manager. Manual modification is not an ideal solution, however, because it requires source code and may require significant programmer effort.
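To make the latency-hiding argument concrete, here is a toy Python model (not the TIP interface; all constants and the steady-state overlap assumption are invented for illustration) comparing aggregate read stall time without prefetching against hint-driven prefetching that overlaps fetches with the compute of earlier reads:

```python
# Toy model: aggregate read stall time with and without disclosed hints.
# The constants and the steady-state assumption are illustrative only.

DISK_LATENCY_MS = 10.0   # latency of one disk block fetch
COMPUTE_MS = 2.0         # CPU work between consecutive reads
PREFETCH_DEPTH = 8       # how far ahead disclosed hints allow fetching

def total_stall_ms(num_reads, hinted):
    """Unhinted reads each pay the full disk latency. With hints and
    enough disk parallelism, each fetch overlaps the compute of the
    PREFETCH_DEPTH reads issued ahead of it, so a read stalls only for
    the uncovered remainder (warm-up stalls are ignored)."""
    if not hinted:
        return num_reads * DISK_LATENCY_MS
    per_read_stall = max(0.0, DISK_LATENCY_MS - PREFETCH_DEPTH * COMPUTE_MS)
    return num_reads * per_read_stall
```

With these example numbers, 100 unhinted reads stall for 1000 ms in total, while hinted reads stall not at all, since 8 × 2 ms of overlapped compute more than covers each 10 ms fetch.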
We propose a novel approach to automatically extracting accurate prefetching
information that could potentially apply to a wide range of applications.
Our approach leverages the observation that processors, particularly
on client workstations, are often idle while applications are stalled
waiting for I/O requests to complete. We propose that a wide range of
disk-bound applications could dynamically discover and disclose their
own future I/O accesses by opportunistically exploiting any unused processing
cycles to perform speculative execution, an eager pre-execution
of application code using the available, incomplete data state. If this
approach is successful, then, as illustrated below, the potential benefit
may be significant.
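The idea of pre-executing application code on an incomplete data state can be sketched in a few lines of Python. This is an illustrative analogue, not the authors' binary transformation: `access_pattern` and `speculate` are invented names, and the data-dependent block computation is made up to show why speculation can mispredict once it must guess values it has not yet read:

```python
# Sketch of speculative pre-execution for hint generation (illustrative,
# not the actual binary-transformation design). A speculative copy of
# the access logic runs ahead on whatever data is already available and
# records the block IDs it would read; those become prefetch hints.

def access_pattern(data, n):
    """Application logic: the next block to read depends on data
    already read (a data dependency), so the pattern is not sequential."""
    block = 0
    for _ in range(n):
        yield block
        block = (block * 7 + data.get(block, 1)) % 64  # data-dependent step

def speculate(known_data, n, horizon):
    """Pre-execute the same logic on the incomplete data state,
    predicting the next `horizon` block IDs as prefetch hints. Blocks
    whose contents are unknown fall back to the default value, which is
    where mispredictions can begin."""
    hints = []
    for block in access_pattern(dict(known_data), n):
        hints.append(block)
        if len(hints) >= horizon:
            break
    return hints
```

When the speculative copy has the same data as the real execution, its hints match the true access sequence exactly; once it must guess the contents of an unread block, the predicted and actual sequences can diverge.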
We designed a method for transforming application binaries to employ the speculative execution approach using only simple, generic static analyses and transformations. In addition to not requiring source code, this design does not require specialized operating system support. Our implementation takes advantage of the TIP prefetching and caching manager, but similar functionality could be provided in user-level code; no other specialization is used. Our design has three fundamental properties:
On the other hand, the design incurs a significant performance penalty in one case, and is often much less successful than manual modification. The problem is that our initial, naive design makes no attempt to address speculative execution's inherent sensitivity to the amount of inter-read code and the existence of data dependencies in applications. For example, data dependencies in Gnuld cause speculative execution to mispredict Gnuld's future data needs. The mispredictions result in a significant performance penalty when there is only one disk because prefetches of mispredicted data consume scarce bandwidth. We suggest that we can decrease speculative execution's sensitivity to these factors by making speculation self-aware through a combination of more sophisticated static and dynamic analyses.
We are exploring a variety of different techniques. Despite fairly
unsophisticated implementations, our results indicate that we can increase
the benefits of speculative execution by making speculation self-aware.
With our current implementations, for example, the mean performance
improvement with our automatic approach is 47%, 78%, and 82%, in 1-, 2-,
and 4-disk configurations respectively, of the mean performance improvement
when the applications were manually modified to issue hints. Furthermore,
as shown below, our techniques can decrease the processing cycles needed
to generate correct hints, allowing speculative execution to produce
significant benefits even when there is substantial competition for
processing cycles.
For more information on this work, please refer to:
Fay Chang and Garth A. Gibson. Automatic I/O hint generation through speculative execution. In Proceedings of the 3rd Symposium on Operating Systems Design and Implementation (OSDI), February 1999.