PARALLEL DATA LAB

N-Store

The design of a database management system’s (DBMS) architecture is predicated on the target storage hierarchy. Traditional disk-oriented systems use a two-level hierarchy, with fast volatile memory used for caching and a slower, durable device used for primary storage. As such, these systems use a buffer pool and complex concurrency control schemes to mask disk latencies. In contrast, main memory DBMSs assume that all data can reside in DRAM, and thus do not need these components.

But emerging non-volatile memory (NVM) technologies require us to rethink this dichotomy. Such memory devices are slightly slower than DRAM, but all writes are persistent, even after power loss. These devices promise to overcome the disparity between processor performance and DRAM storage capacity limits that encumber data-centric applications.

We are studying NVMs to understand their performance characteristics in the context of big data systems and to build the groundwork for new DBMS architectures.

People

FACULTY

Andy Pavlo

GRAD STUDENTS

Joy Arulraj

Publications

  • Let’s Talk About Storage & Recovery Methods for Non-Volatile Memory Database Systems. Joy Arulraj, Andrew Pavlo, Subramanya R. Dulloor. Proceedings ACM SIGMOD, Melbourne, Victoria, Australia, May 31-June 4, 2015.

Acknowledgements

This research was funded (in part) by the Intel Science and Technology Center for Big Data.

We thank the members and companies of the PDL Consortium: Amazon, Bloomberg, Datadog, Google, Honda, Intel Corporation, IBM, Jane Street, Meta, Microsoft Research, Oracle Corporation, Pure Storage, Salesforce, Samsung Semiconductor Inc., Two Sigma, and Western Digital for their interest, insights, feedback, and support.