I/O WORKLOAD CHARACTERIZATION
Data Mining meets Traffic Modeling
Traffic modeling of storage workloads is extremely helpful in evaluating
system designs. The work has two aspects. The first is to discover and
quantify the most important features of the traffic data, such as
temporal burstiness and spatial locality. It is harder still to determine
how these features affect the performance of real systems serving that
traffic. The second is to build an efficient statistical model that
generates synthetic workloads that behave like the real ones. Traditional
models such as Poisson are inadequate for generating timestamps for
strongly bursty traffic, let alone for generating multi-dimensional
traffic.
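As a rough illustration of why a Poisson model falls short, the sketch below generates Poisson arrivals and measures the index of dispersion (variance divided by mean) of per-window request counts. For a Poisson process this ratio stays near 1 at every window size, whereas bursty traces typically show values well above 1. The function names, arrival rate, and window sizes are illustrative assumptions and are not taken from the project's code.

```python
import random

def poisson_timestamps(rate, duration, seed=0):
    """Arrival times of a homogeneous Poisson process on [0, duration)."""
    rng = random.Random(seed)
    t, stamps = 0.0, []
    while True:
        t += rng.expovariate(rate)  # exponential inter-arrival gap
        if t >= duration:
            return stamps
        stamps.append(t)

def window_counts(timestamps, duration, window):
    """Number of arrivals in each fixed-size window."""
    n = round(duration / window)
    counts = [0] * n
    for t in timestamps:
        idx = int(t / window)
        if idx < n:  # guard against floating-point edge cases
            counts[idx] += 1
    return counts

def index_of_dispersion(counts):
    """Variance-to-mean ratio of the counts (about 1 for Poisson)."""
    mean = sum(counts) / len(counts)
    var = sum((c - mean) ** 2 for c in counts) / len(counts)
    return var / mean if mean > 0 else 0.0

# Hypothetical parameters: 100 requests/second for 1000 seconds.
ts = poisson_timestamps(rate=100.0, duration=1000.0)
for w in (0.1, 1.0, 10.0):
    ratio = index_of_dispersion(window_counts(ts, 1000.0, w))
    print(f"window={w:>5}s  variance/mean={ratio:.2f}")
```

Running the same measurement on a real trace and comparing the ratios across window sizes is one simple way to see how far the trace departs from Poisson behavior.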
This project addresses both problems. Our previous work has focused on
the spatio-temporal behavior of traffic data, specifically the temporal
burstiness and spatial locality of I/O workloads. Our proposed tool, the
entropy plot, quantifies the temporal burstiness and spatial locality
in traffic data. The B-model generates timestamps for synthetic traffic
that imitate the temporal burstiness of real traffic data. The PQRS
model goes one step further by generating both the timestamps and the
request locations for synthetic traces. Ongoing work augments the model
to handle more dimensions.
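The following is a minimal sketch of how a B-model generator and an entropy plot might look, assuming the B-model is a recursive biased split: each interval's volume is divided between its two halves in proportions b and 1 - b, with the heavier half chosen at random. The entropy plot here is the entropy of the traffic distribution at each aggregation level. All function names and parameters are hypothetical, and the definitions in the papers listed under Publications may differ in detail.

```python
import math
import random

def b_model(bias, levels, total=1.0, seed=0):
    """Split `total` volume over 2**levels intervals with a biased cascade."""
    rng = random.Random(seed)
    cells = [total]
    for _ in range(levels):
        nxt = []
        for v in cells:
            hi, lo = v * bias, v * (1.0 - bias)
            # The heavier share goes left or right with equal probability.
            nxt.extend((hi, lo) if rng.random() < 0.5 else (lo, hi))
        cells = nxt
    return cells

def entropy_plot(volumes):
    """Return [(k, entropy in bits over 2**k intervals)] for each level k."""
    levels = len(volumes).bit_length() - 1
    assert len(volumes) == 1 << levels, "length must be a power of two"
    total = sum(volumes)
    cells, points = list(volumes), []
    for k in range(levels, -1, -1):
        h = sum(-(v / total) * math.log2(v / total) for v in cells if v > 0)
        points.append((k, h))
        # Merge neighbouring intervals to move to the next coarser level.
        cells = [cells[i] + cells[i + 1] for i in range(0, len(cells) - 1, 2)]
    points.reverse()
    return points

def binary_entropy(b):
    """Entropy in bits of a (b, 1 - b) split."""
    return -b * math.log2(b) - (1.0 - b) * math.log2(1.0 - b)

def to_timestamps(volumes, duration, seed=0):
    """Place round(v) requests uniformly inside each interval (approximate)."""
    rng = random.Random(seed)
    width = duration / len(volumes)
    stamps = []
    for i, v in enumerate(volumes):
        for _ in range(round(v)):
            stamps.append((i + rng.random()) * width)
    return sorted(stamps)

# Hypothetical example: 10,000 requests, bias 0.8, 2**10 intervals.
volumes = b_model(bias=0.8, levels=10, total=10000.0)
points = entropy_plot(volumes)
slope = (points[-1][1] - points[0][1]) / 10
print(f"entropy-plot slope: {slope:.3f}")
print(f"binary entropy of the bias: {binary_entropy(0.8):.3f}")
```

In this sketch the entropy grows linearly with the aggregation level, and the slope matches the binary entropy of the bias. A slope of 1 corresponds to uniform traffic (b = 0.5), and smaller slopes indicate burstier traffic, which suggests how a measured slope could be used to pick a bias for the generator.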
2- and 3-dimensional representations of real traffic
data showing burstiness along time and space.
People
Publications
- Storage Device Performance Prediction with CART Models. Mengzhi Wang, Kinman Au, Anastassia Ailamaki, Anthony Brockwell, Christos Faloutsos, and Gregory R. Ganger. Proc. 12th Annual Meeting of the IEEE/ACM International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS). Volendam, The Netherlands. October 5-7, 2004. Supersedes Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-04-103, March 2004.
Abstract / Postscript [908K] / PDF [122K]
- Storage Device Performance Prediction with CART Models [Extended Abstract]. Mengzhi Wang, Kinman Au, Anastassia Ailamaki, Anthony Brockwell, Christos Faloutsos, and Gregory R. Ganger. Proceedings: Poster Session. Joint International Conference on Measurement and Modeling of Computer Systems. ACM SIGMETRICS/Performance 2004. June 12th-16th 2004, Columbia University, New York.
Abstract / Postscript [400K] / PDF [64K]
- SIMFLEX: A Fast, Accurate, Flexible Full-System Simulation Framework for Performance Evaluation of Server Architecture. Nikolaos Hardavellas, Stephen Somogyi, Thomas F. Wenisch, Roland E. Wunderlich, Shelley Chen, Jangwoo Kim, Babak Falsafi, James C. Hoe, and Andreas G. Nowatzyk. ACM SIGMETRICS Performance Evaluation Review (PER) Special Issue on Tools for Computer Architecture Research, Volume 31, Number 4, pages 31-35, March 2004.
Abstract / PDF [96K]
- Capturing the Spatio-Temporal Behavior of Real Traffic Data.
Mengzhi Wang, Anastassia Ailamaki, and Christos Faloutsos. Performance
2002, September 2002, Rome, Italy. Best Student Paper Award.
Abstract / PDF [1.9M]
- Data Mining Meets Performance Evaluation: Fast Algorithms for
Modeling Bursty Traffic. M. Wang, T. Madhyastha, N.H. Chan, S.
Papadimitriou, C. Faloutsos. 18th International Conference on Data
Engineering, February 26-March 1, 2002, San Jose, California. Also
available as technical report CMU-CS-01-101.
Abstract / Postscript [2.25M] / PDF [358K]
Acknowledgements
We thank the members and companies of the PDL Consortium: American Power
Conversion, Data Domain, Inc., EMC Corporation, Facebook, Google,
Hewlett-Packard Labs, Hitachi, IBM, Intel Corporation, LSI, Microsoft
Research, NetApp, Inc., Oracle Corporation, Seagate Technology, Sun
Microsystems, Symantec Corporation, and VMware, Inc. for their interest,
insights, feedback, and support.