Current database systems use data layouts that can exploit unique features of only one level of the memory hierarchy (cache/main memory or on-line storage). Such layouts optimize for the predominant access pattern of one workload (e.g., DSS), while trading off performance of another workload type (e.g., OLTP). Achieving efficient execution of different workloads without this trade-off or the need to manually re-tune the system for each workload type is still an unsolved problem. The "Fates" database system project answers this challenge.
The Fates Architecture
The goal of the Fates architecture is to offer efficient execution at all levels of the memory hierarchy and to optimize data layout by exploiting the unique characteristics available at each level. This is done primarily by decoupling the in-memory data layout from the on-disk storage layout. Where traditional database systems are forced to fetch and store unnecessary data as an artifact of a chosen data layout, the Fates database system can request, retrieve, and store just the needed data, catering to the needs of a specific query. This conserves storage device bandwidth and memory capacity and avoids cache pollution, all of which improve query execution time.
Borrowing from the Greek mythology of the Three Fates (Clotho, Lachesis, and Atropos), who spin, measure, and cut the thread of life, the three components of our database system (bearing the Fates' respective names) establish proper abstractions in the database query execution engine. These abstractions cleanly separate the functionality of each component while allowing efficient query execution along the entire path through the database system.
Clotho ensures efficient query execution at the cache/main-memory level and comes into play at the inception of a request for particular data. It employs a new in-memory page layout and query-specific organization to offer efficient access to all data. Because the query engine fetches only the data it needs, the layout trade-off between workload types is eliminated.
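The idea of a query-specific in-memory page can be illustrated with a minimal Python sketch. It is not Clotho's actual page format: the `StoredPage` type, the per-attribute value lists, and the functions `build_query_page` and `values_fetched` are hypothetical names introduced here only for illustration, assuming that stored data can be retrieved one attribute at a time.

```python
from typing import Dict, List, Sequence

# Hypothetical stored representation: one list of values per attribute.
# This is an assumption for illustration, not Clotho's real layout.
StoredPage = Dict[str, List[object]]


def build_query_page(stored: StoredPage, needed: Sequence[str]) -> StoredPage:
    """Build an in-memory page holding only the attributes a query needs."""
    missing = [attr for attr in needed if attr not in stored]
    if missing:
        raise KeyError(f"unknown attributes: {missing}")
    # Copy only the requested attributes; everything else stays on storage.
    return {attr: list(stored[attr]) for attr in needed}


def values_fetched(page: StoredPage) -> int:
    """Count how many attribute values a page holds in memory."""
    return sum(len(column) for column in page.values())


# Example: a scan that only touches two of four attributes.
stored_page = {
    "id": [1, 2, 3],
    "name": ["a", "b", "c"],
    "price": [10.0, 20.0, 30.0],
    "comment": ["x", "y", "z"],
}
query_page = build_query_page(stored_page, ["id", "price"])
```

In this sketch the query page holds half as many values as the full stored page, which is the kind of saving in memory capacity and cache footprint that the text describes.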
The Lachesis database storage manager handles the mapping of, and access to, minipages located within the logical block numbers (LBNs) of on-line storage devices. It makes I/O execution efficient for concurrent workloads competing for a storage device by using explicit, device-independent performance hints. It eliminates the need for manual I/O performance tuning and divides responsibilities equally among the storage devices being accessed.
Atropos is a disk array logical volume manager for the orchestrated and efficient use of disks. To achieve this, Atropos provides logical-to-physical mapping, issues I/Os to individual disks, and exposes important device attributes to facilitate efficient queries, such as track-aligned accesses, possible semi-sequential access patterns, and efficient access paths in 2D data structures.
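To make logical-to-physical mapping and track-aligned access concrete, here is a minimal Python sketch. It is not Atropos's actual mapping: it assumes a simple striped array in which every track holds the same number of blocks and one stripe unit equals one track, whereas real disks have varying track sizes. The functions `map_lbn` and `split_track_aligned` and their parameters are hypothetical.

```python
from typing import List, Tuple


def map_lbn(lbn: int, num_disks: int, blocks_per_track: int) -> Tuple[int, int, int]:
    """Map a logical block number to (disk, track, offset within track).

    Assumes track-sized stripe units laid out round-robin across the disks.
    """
    stripe_unit = lbn // blocks_per_track  # which track-sized unit holds the block
    offset = lbn % blocks_per_track        # position of the block inside that track
    disk = stripe_unit % num_disks         # round-robin placement across disks
    track = stripe_unit // num_disks       # track index on the chosen disk
    return disk, track, offset


def split_track_aligned(start: int, count: int, blocks_per_track: int) -> List[Tuple[int, int]]:
    """Split a logical request into (start, length) pieces that never cross a track boundary."""
    requests = []
    end = start + count
    while start < end:
        track_end = (start // blocks_per_track + 1) * blocks_per_track
        length = min(end, track_end) - start
        requests.append((start, length))
        start += length
    return requests


# Example: 4 disks, 100 blocks per track (both values are assumptions).
location = map_lbn(250, num_disks=4, blocks_per_track=100)
pieces = split_track_aligned(150, 200, blocks_per_track=100)
```

Under these assumptions, a request that spans several tracks is broken into pieces that each stay within a single track, so each piece can be issued to one disk without crossing a track boundary.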
FACULTY
Anastassia Ailamaki
Greg Ganger
STUDENTS
Minglong Shao
We thank the members and companies of the PDL Consortium: Amazon, Bloomberg, Datadog, Google, Honda, Intel Corporation, IBM, Jane Street, Meta, Microsoft Research, Oracle Corporation, Pure Storage, Salesforce, Samsung Semiconductor Inc., Two Sigma, and Western Digital for their interest, insights, feedback, and support.