PARALLEL DATA LAB 

PDL Abstract

Informed Data Distribution Selection in a Self-predicting Storage System

Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-06-101, January 2006. Superseded by Proceedings of the International Conference on Autonomic Computing (ICAC-06), Dublin, Ireland, June 12-16, 2006.

Eno Thereska1, Michael Abd-El-Malek1, Jay J. Wylie2, Dushyanth Narayanan3, Gregory R. Ganger1

1 Carnegie Mellon University, 2 HP Labs - Palo Alto, 3 Microsoft Research - Cambridge

Parallel Data Laboratory
Electrical and Computer Engineering
Carnegie Mellon University
Pittsburgh, PA 15213

http://www.pdl.cmu.edu/

Systems should be self-predicting. They should continuously monitor themselves and provide quantitative answers to What...if questions about hypothetical workload or resource changes. Self-prediction would significantly simplify administrators’ planning challenges, such as performance tuning and acquisition decisions, by reducing the detailed workload and internal system knowledge required. This paper describes and evaluates support for self-prediction in a cluster-based storage system and its application to What...if questions about data distribution selection.

KEYWORDS: data distribution, end-to-end tracing, self-prediction
