Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-04-109, August 2004.
Brandon Salmon, Eno Thereska, Craig A.N. Soules, John D. Strunk, Gregory R. Ganger
Electrical and Computer Engineering
Carnegie Mellon University
Pittsburgh, PA 15213
Choosing the correct settings for large systems can be a daunting task. System performance often depends heavily on these settings, and the “correct” settings are often closely coupled with the workload. System designers usually resort to a set of heuristic approaches that are known to work well in some cases, but hand-combining these heuristics is painstaking and fragile. We propose a two-tiered architecture that makes this combination transparent and robust, and describe an application of the architecture to the problem of disk layout optimization. The architecture consists of a set of independent heuristics and an adaptive method of combining them. Building such a system, however, has proved more difficult than expected: each heuristic depends heavily on decisions made by other heuristics, making it difficult to break the problem into smaller pieces. This paper outlines our approaches and how well they have worked, discusses the biggest challenges in building the system, and mentions additional possible solutions. Whether this problem is solvable is still open to debate, but the experiences reported provide a cautionary tale: system policy automation is complex and difficult.
KEYWORDS: disk layout, adaptive, self-managing, self-tuning, learning, automated tuning