PARALLEL DATA LAB

//TRACE

I/O traces play a critical role in storage systems evaluation. They are captured through a variety of mechanisms, analyzed to understand the characteristics and demands of different applications, and replayed against real and simulated storage systems to recreate representative workloads. Often, traces are much easier to work with than actual applications, particularly when the applications are complex to configure and run, or involve confidential data or algorithms.

//TRACE is a new approach for extracting and replaying traces of parallel applications. Its tracing engine (the causality engine) automatically discovers inter-node data dependencies and inter-request compute times for each node (process) in an application. It does so by selectively delaying I/O in order to expose data dependencies among the compute nodes. The learned dependency information is saved in per-node annotated I/O traces. Such annotation allows a parallel replayer to closely mimic the behavior of a traced application.
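As a rough illustration of the throttling idea, the toy simulation below slows down one node's I/O at a time and checks which other nodes finish their I/O later as a result. It is a minimal sketch, not the //TRACE causality engine: the `Op` record, the `run` and `discover` functions, the fixed `io_time`, and the example application are all hypothetical, and the sketch reports only node-level (possibly transitive) dependencies rather than per-request annotations.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Op:
    """One I/O request in a toy application model (hypothetical)."""
    compute: float            # compute time on this node before the I/O is issued
    waits_on: tuple = ()      # hidden (node, op_index) pairs that must finish first

def run(app, throttled=None, delay=0.0, io_time=1.0):
    """Simulate the app and return {(node, op_index): completion_time}.

    If `throttled` names a node, every I/O on that node takes `delay` longer.
    """
    done = {}

    def finish(node, idx):
        key = (node, idx)
        if key not in done:
            op = app[node][idx]
            prev = finish(node, idx - 1) if idx > 0 else 0.0
            # An I/O is issued once local compute is done and all remote
            # operations it depends on have completed.
            issue = max([prev + op.compute] + [finish(n, i) for n, i in op.waits_on])
            service = io_time + (delay if node == throttled else 0.0)
            done[key] = issue + service
        return done[key]

    for node, ops in app.items():
        for idx in range(len(ops)):
            finish(node, idx)
    return done

def discover(app, delay=5.0):
    """Return {(dependent_node, throttled_node)} pairs exposed by throttling."""
    baseline = run(app)
    found = set()
    for target in app:
        slowed = run(app, throttled=target, delay=delay)
        for (node, idx), t in slowed.items():
            # A shift on a node we did not throttle reveals a dependency.
            if node != target and abs(t - baseline[(node, idx)]) > 1e-9:
                found.add((node, target))
    return found

# Hypothetical three-node application: node 1 waits for node 0's first I/O.
app = {
    0: [Op(compute=1.0), Op(compute=1.0)],
    1: [Op(compute=0.5, waits_on=((0, 0),))],
    2: [Op(compute=2.0)],
}
dependencies = discover(app)
print(dependencies)
```

In this toy model the delay must be large enough to outlast any slack in the dependent node's own compute time; otherwise a real dependency would go unnoticed.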

For more information, please see our extended overview of //TRACE.

People

FACULTY

Greg Ganger
David O'Hallaron

STAFF

Gregg Economou
Michael Stroucken

STUDENTS

James Hendricks
Julio López
Mike Mesnier
Raja Sambasivan
Matthew Wachs


Publications

  • Relative Fitness Modeling. Michael P. Mesnier, Matthew Wachs, Raja R. Sambasivan, Alice X. Zheng, and Gregory R. Ganger. Communications of the ACM, Vol. 52 No. 4, April 2009.
    Abstract / PDF [775K]

  • On Modeling the Relative Fitness of Storage. Michael P. Mesnier. Carnegie Mellon University, Dept. ECE Ph.D. Dissertation CMU-PDL-07-108, December 19, 2007.
    Abstract / PDF [1.16M]

  • //TRACE: Parallel Trace Replay with Approximate Causal Events. Michael Mesnier, Matthew Wachs, Raja R. Sambasivan, Julio Lopez, James Hendricks, Gregory R. Ganger. Proceedings of the 5th USENIX Conference on File and Storage Technologies (FAST '07), February 13–16, 2007, San Jose, CA. Supersedes Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-06-108, September 2006.
    Abstract / PDF [187K]


Acknowledgements

We thank the members and companies of the PDL Consortium: Amazon, Datadog, Google, Honda, Intel Corporation, IBM, Jane Street, Meta, Microsoft Research, Oracle Corporation, Pure Storage, Salesforce, Samsung Semiconductor Inc., Two Sigma, and Western Digital for their interest, insights, feedback, and support.