I/O traces play a critical role in storage systems evaluation. They are captured through a variety of mechanisms, analyzed to understand the characteristics and demands of different applications, and replayed against real and simulated storage systems to recreate representative workloads. Often, traces are much easier to work with than actual applications, particularly when the applications are complex to configure and run, or involve confidential data or algorithms.
//TRACE is a new approach for extracting and replaying traces of parallel applications. Its tracing engine (the causality engine) automatically discovers inter-node data dependencies and inter-request compute times for each node (process) in an application. It does so by selectively delaying I/O in order to expose data dependencies among the compute nodes. The learned dependency information is saved in per-node annotated I/O traces. Such annotation allows a parallel replayer to closely mimic the behavior of a traced application.
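The idea of exposing dependencies by delaying I/O can be illustrated with a toy simulation. This is a minimal sketch and not the //TRACE causality engine itself: the `Op` record, the `run` and `discover_dependents` functions, the fixed I/O service time, and the sample application are all hypothetical names and values chosen for illustration. Each node's I/O is slowed in turn, and any other node whose I/O completes later than in the unthrottled baseline is reported as dependent on the slowed node.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

OpId = Tuple[int, int]  # (node id, index of the I/O within that node)


@dataclass
class Op:
    """One I/O request in the toy model (hypothetical structure)."""
    compute: float                    # compute time before the I/O is issued
    waits_for: Optional[OpId] = None  # I/O on another node that must finish first


def run(app: Dict[int, List[Op]],
        delayed_node: Optional[int] = None,
        delay: float = 0.0,
        io_time: float = 1.0) -> Dict[OpId, float]:
    """Simulate the application and return the completion time of every I/O.

    If delayed_node is set, every I/O issued by that node takes `delay` longer.
    """
    done: Dict[OpId, float] = {}
    next_op = {node: 0 for node in app}
    clock = {node: 0.0 for node in app}
    progressed = True
    while progressed:
        progressed = False
        for node, ops in app.items():
            i = next_op[node]
            if i >= len(ops):
                continue
            op = ops[i]
            if op.waits_for is not None and op.waits_for not in done:
                continue  # dependency not yet satisfied; retry on a later pass
            start = clock[node] + op.compute
            if op.waits_for is not None:
                start = max(start, done[op.waits_for])
            service = io_time + (delay if node == delayed_node else 0.0)
            done[(node, i)] = start + service
            clock[node] = done[(node, i)]
            next_op[node] += 1
            progressed = True
    return done


def discover_dependents(app: Dict[int, List[Op]],
                        delay: float = 10.0) -> Dict[int, List[int]]:
    """For each node, list the other nodes that slow down when it is throttled."""
    baseline = run(app)
    dependents: Dict[int, List[int]] = {}
    for victim in app:
        throttled = run(app, delayed_node=victim, delay=delay)
        dependents[victim] = sorted({
            node for (node, i), t in throttled.items()
            if node != victim and t > baseline[(node, i)]
        })
    return dependents


# Sample application: node 1 cannot issue its I/O until node 0's first I/O is done;
# node 2 is independent of both.
app = {
    0: [Op(compute=1.0), Op(compute=1.0)],
    1: [Op(compute=0.5, waits_for=(0, 0))],
    2: [Op(compute=2.0)],
}
print(discover_dependents(app))
```

In this toy model the injected delay must be larger than any slack a dependent node has before it blocks, otherwise the dependency stays hidden; the default of 10.0 is an arbitrary value that is large relative to the sample compute times.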
For more information, please see our extended overview of //TRACE.
FACULTY
STAFF
Gregg Economou
Michael Stroucken
James Hendricks
Julio López
Mike Mesnier
Raja Sambasivan
Matthew Wachs
We thank the members and companies of the PDL Consortium: Amazon, Bloomberg, Datadog, Google, Honda, Intel Corporation, IBM, Jane Street, Meta, Microsoft Research, Oracle Corporation, Pure Storage, Salesforce, Samsung Semiconductor Inc., Two Sigma, and Western Digital for their interest, insights, feedback, and support.