Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-05-104, April 2005.
Gregory R. Ganger, Michael Abd-El-Malek, Chuck Cranor, James Hendricks, Andrew J. Klosterman,
Michael Mesnier, Manish Prasad, Brandon Salmon, Raja R. Sambasivan, Shafeeq Sinnamohideen,
John D. Strunk, Eno Thereska, Jay J. Wylie
Parallel Data Laboratory,
Carnegie Mellon University
Pittsburgh, PA 15213
No single data encoding scheme or fault model is right for all data. A versatile storage system allows both to be data-specific, so that they can be matched to access patterns, reliability requirements, and cost goals. Ursa Minor is a cluster-based storage system that allows data-specific selection of, and on-line changes to, encoding schemes and fault models. Thus, different data types can share a scalable storage infrastructure and still enjoy customized choices, rather than suffering from “one size fits all.” Experiments with Ursa Minor show performance penalties as high as 2–3× for workloads using poorly matched choices. Experiments also show that a single cluster supporting multiple workloads is much more efficient when the choices are specialized for each workload than when all workloads are forced to use a single “one size fits all” configuration.
KEYWORDS: versatile storage, distributed storage, distributed file system, cluster-based storage