PARALLEL DATA LAB 

PDL Abstract

Survey and Evaluation of Database Management System Extensibility

Carnegie Mellon University School of Computer Science M.S. Thesis CMU-CS-23-144. January 2024.

Abigale Kim

School of Computer Science
Carnegie Mellon University
Pittsburgh, PA 15213

http://www.pdl.cmu.edu

Database management system (DBMS) extensibility is a feature that enables users to extend the DBMS with their own software. However, the DBMS extensibility environment is fraught with perils, and DBMS developers have to resort to unspecified methods of developing extensions, including copying core DBMS source code and special-casing code for different versions of the DBMS. Extending a DBMS to support new functionality is challenging due to the tight coupling between the system’s internal components. This thesis studies and evaluates the design of DBMS extensibility. We first provide a comprehensive taxonomy of the types of extensibility supported by DBMSs and the effects of supporting their functionality within the DBMS. Given that PostgreSQL has the most varied extensibility ecosystem, we also provide an in-depth analysis of it, in which we evaluate how compatible extensions are with one another, extension source code quality, and extension complexity. To assist with this evaluation, we introduce an automated PostgreSQL extension analysis framework that collects information on how an extension integrates into the DBMS. We present results from static and dynamic analysis of over 100 extensions. We show correlations between the lack of compatibility among extensions and several factors related to their complexity and source code. We conclude by discussing the design decisions and trade-offs of supporting extensions in a DBMS.

KEYWORDS: Database management systems, static analysis, system extensibility, database management system extensibility.

FULL PAPER: pdf