PARALLEL DATA LAB 

PDL Abstract

Dynamic Function Placement for Data-Intensive Cluster Computing

USENIX Annual Technical Conference, San Diego, CA, June 2000. Supersedes Carnegie Mellon University School of Computer Science Technical Report CMU-CS-99-140, June 1999.

Khalil Amiri*, David Petrou, Gregory R. Ganger* and Garth A. Gibson

School of Computer Science
Dept. of Electrical and Computer Engineering*
Carnegie Mellon University
Pittsburgh, PA 15213

http://www.pdl.cmu.edu/

Optimally partitioning application and filesystem functionality within a cluster of clients and servers is a difficult problem due to dynamic variations in application behavior, resource availability, and workload mixes. This paper presents ABACUS, a run-time system that monitors and dynamically changes function placement for applications that manipulate large data sets. Several examples of data-intensive workloads are used to show the importance of proper function placement and its dependence on dynamic run-time characteristics, with performance differences frequently reaching 2-10X. We evaluate how well the ABACUS prototype adapts to run-time system behavior, including both long-term variation (e.g., filter selectivity) and short-term variation (e.g., multi-phase applications and inter-application resource contention). Our experiments with ABACUS indicate that it is possible to adapt in all of these situations and that the adaptation converges most quickly in those cases where the performance impact is most significant.
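To illustrate why the best placement depends on run-time characteristics, the following is a minimal sketch of a placement decision for a single filter function. It is not the ABACUS algorithm or its interface: the cost model, the names (Node, estimate_time, choose_placement), and all the numbers are hypothetical, chosen only to show how the fraction of data that passes the filter and the load on each node can flip the preferred placement.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical description of a client or server node."""
    cpu_bytes_per_sec: float  # rate at which an idle node can run the filter
    load: float               # fraction of CPU used by other work, 0 <= load < 1


def filter_time(node: Node, input_bytes: float) -> float:
    """Time to filter input_bytes using the CPU share left over on the node."""
    available = node.cpu_bytes_per_sec * (1.0 - node.load)
    return input_bytes / available


def estimate_time(placement: str, input_bytes: float, pass_fraction: float,
                  client: Node, server: Node, net_bytes_per_sec: float) -> float:
    """Assumed cost model: filter time plus time to ship data to the client.

    pass_fraction is the fraction of the input that survives the filter.
    """
    if placement == "server":
        # Filter next to the data, then ship only the surviving bytes.
        shipped = input_bytes * pass_fraction
        return filter_time(server, input_bytes) + shipped / net_bytes_per_sec
    if placement == "client":
        # Ship everything over the network, then filter at the client.
        return input_bytes / net_bytes_per_sec + filter_time(client, input_bytes)
    raise ValueError(f"unknown placement: {placement}")


def choose_placement(input_bytes: float, pass_fraction: float,
                     client: Node, server: Node, net_bytes_per_sec: float) -> str:
    """Pick the placement with the lower estimated completion time."""
    return min(
        ("client", "server"),
        key=lambda p: estimate_time(p, input_bytes, pass_fraction,
                                    client, server, net_bytes_per_sec),
    )


if __name__ == "__main__":
    client = Node(cpu_bytes_per_sec=100e6, load=0.0)
    idle_server = Node(cpu_bytes_per_sec=50e6, load=0.0)
    busy_server = Node(cpu_bytes_per_sec=50e6, load=0.95)
    net = 10e6  # bytes per second, hypothetical

    # Highly selective filter, idle server: filtering at the server wins.
    print(choose_placement(1e9, 0.1, client, idle_server, net))
    # Same filter, but the server is contended: the client wins.
    print(choose_placement(1e9, 0.1, client, busy_server, net))
    # Filter that passes everything: no traffic saved, the faster client wins.
    print(choose_placement(1e9, 1.0, client, idle_server, net))
```

In this toy model the decision is computed once from known inputs. A run-time system of the kind described above would instead have to observe such quantities while the application runs and revisit the decision as they change.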

FULL PAPER: pdf / postscript
FULL TR VERSION OF PAPER: pdf / postscript