Carnegie Mellon Database Application Catalog
Contact: Andy Pavlo
The Carnegie Mellon Database Application Catalog (CMDBAC) is an online repository of open-source database applications that you can use for benchmarking and experimentation. The goal of this project is to provide researchers and practitioners with ready-to-run, real-world applications that go beyond the standard benchmarks.
We built a crawler that finds applications hosted on public repositories (e.g., GitHub). We then created a framework that automatically learns how to deploy and execute an application inside a virtual machine sandbox. You can then safely download the application to your local machine and execute it to collect query traces and other metrics.
The CMDBAC currently contains over 1000 applications of varying complexity. We target Web applications based on popular programming frameworks because (1) they are easier to find and (2) we can automate the deployment process. We support applications that use the Django, Ruby on Rails, Drupal, Node.js, and Grails frameworks.
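As a rough illustration of why framework-based applications are easier to find and deploy automatically, the sketch below guesses a checked-out repository's framework from files that each framework conventionally places in a project. The marker lists and the `guess_framework` function are hypothetical and chosen only for this example; they are not the rules the CMDBAC crawler actually uses.

```python
from pathlib import Path

# Hypothetical marker paths per framework (assumptions for illustration only,
# not CMDBAC's real detection rules). Order matters: more specific checks first.
FRAMEWORK_MARKERS = {
    "Django": ["manage.py"],
    "Ruby on Rails": ["Gemfile", "config/application.rb"],
    "Drupal": ["sites/default"],
    "Grails": ["grails-app"],
    "Node.js": ["package.json"],
}


def guess_framework(repo_root):
    """Return the first framework whose markers all exist under repo_root, else None."""
    root = Path(repo_root)
    for framework, markers in FRAMEWORK_MARKERS.items():
        if all((root / marker).exists() for marker in markers):
            return framework
    return None
```

Calling `guess_framework("path/to/checkout")` on a local clone returns a framework name or `None`. A real crawler would need stronger signals, such as inspecting dependency manifests, since a single marker file can appear in unrelated projects.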
Dana Van Aken
Zeyuan Shang, Tsinghua University
This research was funded (in part) by the National Science Foundation (III-1423210). We thank the members and companies of the PDL Consortium: Broadcom, Ltd., Citadel, Dell EMC, Google, Hewlett-Packard Labs, Hitachi Ltd., Intel Corporation, Microsoft Research, MongoDB, NetApp, Inc., Oracle Corporation, Samsung Information Systems America, Seagate Technology, Tintri, Two Sigma, Uber, Veritas and Western Digital for their interest, insights, feedback, and support.