Conference on Innovative Data Systems Research (CIDR) 2021. January 11-15, 2021, Virtual Event.
Ling Zhang1, Matthew Butrovich1, Tianyu Li2, Yash Nannapaneni3, Andrew Pavlo1, John Rollinson4, Huanchen Zhang5, Ambarish Balakumar1, Daniel Biales1, Ziqi Dong1, Emmanuel Eppinger1, Jordi Gonzalez1, Wan Shen Lim1, Jianqiao Liu1, Lin Ma1, Prashanth Menon1, Soumil Mukherjee1, Tanuj Nayak1, Amadou Ngom1, Jeff Niu1, Deepayan Patra1, Poojita Raj1, Stephanie Wang1, Wuwen Wang1, Yao Yu, William Zhang1
1 Carnegie Mellon University
2 Massachusetts Institute of Technology
3 Rockset
4 Army Cyber Institute
5 Tsinghua University
http://www.pdl.cmu.edu/
Almost every transactional database management system (DBMS) created in the last decade implements multi-version concurrency control (MVCC). But these systems rely on physical data structures (e.g., B+trees, hash tables) that do not natively support multi-versioning. As a result, there is a disconnect between the logical semantics of transactions and the DBMS’s underlying implementation. System developers must invest engineering effort to coordinate transactional access to these data structures with non-transactional maintenance tasks. This burden makes it harder to reason about the system’s correctness and performance, and it inhibits modularity. In this paper, we propose the Deferred Action Framework (DAF), a new system architecture for scheduling maintenance tasks in an MVCC DBMS that is integrated with the system’s transactional semantics. DAF allows the system to register arbitrary actions and then defer their processing until transaction processing deems them safe. We show that DAF can support garbage collection and index cleaning without compromising performance while facilitating higher-level implementation goals, such as non-blocking schema changes and self-driving optimizations.
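To make the register-then-defer idea concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes one plausible safety rule that the abstract does not spell out: each deferred action is tagged with a timestamp when it is registered, and it becomes safe to run once no active transaction started before that tag. The class name `DeferredActionQueue` and its methods (`begin`, `finish`, `defer`, `process`) are hypothetical names chosen for illustration.

```python
import itertools
from collections import deque


class DeferredActionQueue:
    """Toy model of deferred maintenance actions in an MVCC system.

    Assumption (not taken from the abstract): an action is safe to run
    once every transaction that was active when it was registered has
    finished.
    """

    def __init__(self):
        self._clock = itertools.count(1)  # hypothetical global timestamp source
        self._active = set()              # start timestamps of running transactions
        self._deferred = deque()          # (tag timestamp, action) in FIFO order

    def begin(self):
        """Start a transaction and return its start timestamp."""
        ts = next(self._clock)
        self._active.add(ts)
        return ts

    def finish(self, ts):
        """Mark the transaction that started at `ts` as committed or aborted."""
        self._active.remove(ts)

    def defer(self, action):
        """Register a zero-argument callable, tagged with the current timestamp."""
        self._deferred.append((next(self._clock), action))

    def process(self):
        """Run every action whose tag precedes the oldest active transaction.

        Returns the number of actions executed.
        """
        # With no active transactions, every registered action is safe.
        horizon = min(self._active) if self._active else float("inf")
        executed = 0
        # Tags increase monotonically, so the queue can be drained from the front.
        while self._deferred and self._deferred[0][0] < horizon:
            _, action = self._deferred.popleft()
            action()
            executed += 1
        return executed


if __name__ == "__main__":
    log = []
    daf = DeferredActionQueue()
    reader = daf.begin()                              # may still see the old version
    daf.defer(lambda: log.append("free old version"))  # e.g. a garbage-collection step
    daf.process()        # reader is still active, so the action stays queued
    daf.finish(reader)
    daf.process()        # no earlier transaction remains, so the action runs
    print(log)
```

In this sketch, garbage collection and index cleaning would both be expressed as callables passed to `defer`, which reflects the abstract's claim that arbitrary maintenance actions can share one scheduling mechanism. A transaction that begins after an action is registered does not delay that action, because it can never observe the state the action cleans up.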