PARALLEL DATA LAB

Peloton

Over the last two decades, both researchers and vendors have built advisory tools to assist database administrators in various aspects of system tuning and physical design. Most of this previous work, however, is incomplete: the tools still require humans to make the final decisions about any changes to the database, and they are reactive measures that fix problems only after they occur.

What is needed for a truly “self-driving” database management system (DBMS) is a new architecture that is designed for autonomous operation. This differs from earlier attempts because all aspects of the system are controlled by an integrated planning component that not only optimizes the system for the current workload, but also predicts future workload trends so that the system can prepare itself accordingly. With this architecture, the DBMS can support all of the previous tuning techniques without requiring a human to determine the right way and proper time to deploy them. It also enables new optimizations that are important for modern high-performance DBMSs, but which are not possible today because the complexity of managing these systems has surpassed the abilities of human experts.

Peloton is a relational database management system designed for fully autonomous optimization of hybrid workloads. It is built by students and researchers at the Carnegie Mellon Database Research Group. See the people page for the full listing of contributors.

Key Features:

  • Postgres wire-protocol and JDBC compatible.
  • Native support for byte-addressable non-volatile memory (NVM) storage technology.
  • Lock-free multi-version concurrency control.
  • Integrated artificial intelligence components that enable autonomous optimizations.
  • High-performance, lock-free Bw-Tree for indexing.
  • 100% open source (Apache License, Version 2.0).

People

FACULTY

Andy Pavlo
Anthony Tomasic
Todd Mowry

GRAD STUDENTS

Leon Ang
Gustavo Angulo
Joy Arulraj
Patrick Huang
Hao Jin
Haibin Lin
Jiexi Lin
Lin Ma
Prashanth Menon
Matt Perron
Ian Quah
Siddharth Santurkar
Bili Sun
Skye Toor
Dana Van Aken
Allison Wang
Ziqi Wang
Yingjun Wu
Ran Xian


Publications

  • Self-Driving Database Management Systems. A. Pavlo, G. Angulo, J. Arulraj, H. Lin, J. Lin, L. Ma, P. Menon, T. Mowry, M. Perron, I. Quah, S. Santurkar, A. Tomasic, S. Toor, D. Van Aken, Z. Wang, Y. Wu, R. Xian, and T. Zhang. In CIDR 2017, Conference on Innovative Data Systems Research. January 8-11, 2017, Chaminade, CA.

  • An Empirical Evaluation of In-Memory Multi-Version Concurrency Control. Yingjun Wu, Joy Arulraj, Jiexi Lin, Ran Xian, Andrew Pavlo. Proceedings of the VLDB Endowment, vol. 10, iss. 7, pp. 781-792, March 2017.

  • Bridging the Archipelago between Row-Stores and Column-Stores for Hybrid Workloads. Joy Arulraj, Andrew Pavlo, Prashanth Menon. SIGMOD’16, June 26-July 1, 2016, San Francisco, CA, USA.

  • Write-Behind Logging. J. Arulraj, M. Perron, A. Pavlo. Proc. VLDB Endow., vol. 10, pp. 337-348, December 2016.

  • Let’s Talk About Storage & Recovery Methods for Non-Volatile Memory Database Systems. Joy Arulraj, Andrew Pavlo, Subramanya R. Dulloor. Proceedings ACM SIGMOD, Melbourne, Victoria, Australia, May 31-June 4, 2015.


Acknowledgements

We thank the members and companies of the PDL Consortium: Amazon, Bloomberg, Datadog, Google, Honda, Intel Corporation, IBM, Jane Street, Meta, Microsoft Research, Oracle Corporation, Pure Storage, Salesforce, Samsung Semiconductor Inc., Two Sigma, and Western Digital for their interest, insights, feedback, and support.