PARALLEL DATA LAB

Attribute-Based Learning Environments (ABLE)

Overview

To tune and manage themselves, file and storage systems must understand key properties (e.g., access pattern, lifetime, popularity) of their files. ABLE allows systems to learn how to automatically classify files and predict the properties of new files, as they are created, by exploiting the strong associations between a file's properties and the names and attributes assigned to it. Such predictions can be used to select policies (e.g., disk allocation schemes and replication factors) for individual files. Further, changes in associations can expose information about applications, helping self-* system components distinguish growth from fundamental change. For further information, see the extended overview.



Models, created from file attributes, are used to classify the properties of existing files and predict the properties of new files when they are created.
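The idea in the overview can be illustrated with a small sketch. This is not ABLE's implementation: it swaps in a simple vote-counting model, and the feature names (`ext`, `uid`, `dotfile`), the labels (`short-lived`, `long-lived`), the training samples, and the policy table are all hypothetical values chosen for illustration. It shows only the general shape of the approach: learn associations between a file's name and attributes and an observed property, predict that property for a new file at creation time, and use the prediction to pick a policy.

```python
import os
from collections import Counter, defaultdict


def name_features(path, uid):
    """Turn a file's name and attributes into simple categorical features.

    The feature set here is an assumption made for illustration only.
    """
    base = os.path.basename(path)
    _, ext = os.path.splitext(base)
    return {"ext": ext.lower(), "uid": uid, "dotfile": base.startswith(".")}


class AttributeModel:
    """Counts how often each (feature, value) pair co-occurs with each label."""

    def __init__(self):
        self.counts = defaultdict(Counter)  # (feature, value) -> label counts
        self.overall = Counter()            # label counts over all samples

    def train(self, samples):
        for features, label in samples:
            self.overall[label] += 1
            for key, value in features.items():
                self.counts[(key, value)][label] += 1

    def predict(self, features):
        # Sum the label counts of every feature value seen during training.
        votes = Counter()
        for key, value in features.items():
            votes.update(self.counts.get((key, value), {}))
        if not votes:
            votes = self.overall  # fall back to the most common label
        if not votes:
            return None           # untrained model: no prediction
        return votes.most_common(1)[0][0]


# Hypothetical policy table keyed by predicted property.
HYPOTHETICAL_POLICIES = {
    "short-lived": {"replicas": 1, "allocation": "log-structured"},
    "long-lived": {"replicas": 3, "allocation": "contiguous"},
}
DEFAULT_POLICY = {"replicas": 2, "allocation": "default"}


def choose_policy(label):
    """Map a predicted property to a per-file policy."""
    return HYPOTHETICAL_POLICIES.get(label, DEFAULT_POLICY)


# Made-up training data: (features of an existing file, observed property).
training = [
    (name_features("/tmp/build/main.o", 1001), "short-lived"),
    (name_features("/tmp/build/util.o", 1001), "short-lived"),
    (name_features("/home/a/thesis.tex", 1001), "long-lived"),
    (name_features("/home/a/.lock", 1002), "short-lived"),
    (name_features("/home/b/notes.tex", 1002), "long-lived"),
]

model = AttributeModel()
model.train(training)

# Predict the property of a new file from its name and owner alone.
new_file = name_features("/tmp/build/parse.o", 1002)
predicted = model.predict(new_file)
policy = choose_policy(predicted)
```

With this made-up training set, the new object file is predicted to be short-lived because its extension matches other short-lived files, so the sketch selects the single-replica policy for it. A real system would need a richer feature set and a properly evaluated learning algorithm.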

 

People

FACULTY

Greg Ganger

STUDENTS

Mike Mesnier
Eno Thereska

 

Publications

  • File Classification in Self-* Storage Systems. Michael Mesnier, Eno Thereska, Daniel Ellard, Gregory R. Ganger, Margo Seltzer. Proceedings of the First International Conference on Autonomic Computing (ICAC-04). New York, NY. May 2004. Supersedes Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-04-101, January 2004.
    Abstract / Postscript [1.6M] / PDF [80K]

  • Attribute-Based Prediction of File Properties. Daniel Ellard, Michael Mesnier, Eno Thereska, Gregory R. Ganger, Margo Seltzer. Harvard Computer Science Group Technical Report TR-14-03, December 2003.
    Abstract / Postscript [850K] / PDF [127K]

Acknowledgements

We thank the members and companies of the PDL Consortium: Amazon, Bloomberg, Datadog, Google, Honda, Intel Corporation, IBM, Jane Street, Meta, Microsoft Research, Oracle Corporation, Pure Storage, Salesforce, Samsung Semiconductor Inc., Two Sigma, and Western Digital for their interest, insights, feedback, and support.