WeH 7109
Charlotte Yano - (412) 268-7656
Computer Science Department
School of Computer Science
Carnegie Mellon University
5000 Forbes Avenue
Pittsburgh, PA 15213-3891
Associate Professor, SCS and ECE
Database system design and performance, Cache-Resident Databases, Internet querying,
My research spans several areas of database management systems (DBMSs). The goal is to discover innovative ways to improve database system performance, inspired by the behavioral characteristics of popular database applications on modern machines. The long-term research plan is to explore the interaction between database systems and related areas, both to improve performance (compilers, networks) and to expand functionality (artificial intelligence, user interfaces).
I am currently working on the impact of the hardware and operating system on the behavior of database workloads. Hardware has become extremely sophisticated, employing many levels of non-blocking memory and highly parallel instruction execution techniques. The operating system and compilers successfully exploit the underlying hardware to deliver high performance to the application level. This is great, because modern database applications (e.g., geographical/map applications) are computationally intensive and should therefore fully exploit the new capabilities. But do they?
Recent studies show that database workloads do not exploit the capabilities of modern machines. For most database applications, an increase in processor speed does not result in a commensurate improvement in application performance. Commercial database systems such as DB2, Oracle, or SQL Server are easy to install and operate on just about any machine and operating system. In reality, however, database system technology has evolved independently of operating systems and computer architecture, even though its performance still depends heavily on resource management and the underlying hardware. The database system must therefore learn to make better use of the functionality offered by the OS and the hardware. There are two related projects:
Cache-Resident Data Bases (CRDB): It has been shown that database workloads suffer from memory-related delays caused by cache misses. CRDB studies software techniques that provide the performance illusion that the database resides entirely in the cache, i.e., techniques that avoid or hide the latencies of cache misses. Such techniques include dynamic cache block coloring of instructions, data, and metadata, and prefetching of cache blocks likely to be used next by database algorithms.
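As a rough illustration of the prefetching idea only (this is not CRDB code), the sketch below sums the values of selected rows while asking the hardware to fetch the cache block of a later row ahead of time, so that the miss latency overlaps with useful work. The function name sum_selected_rows and the constant PREFETCH_DISTANCE are hypothetical, and __builtin_prefetch is a GCC/Clang builtin rather than standard C; a good prefetch distance depends on the machine and would need to be measured.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tuning knob: how many lookups ahead to prefetch. */
#define PREFETCH_DISTANCE 8

/* Sum values[row_ids[i]] for i in [0, n). The row ids are arbitrary, so
   consecutive lookups may touch unrelated cache blocks. */
int64_t sum_selected_rows(const int64_t *values, const size_t *row_ids, size_t n)
{
    int64_t total = 0;
    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_DISTANCE < n) {
            /* A hint only: rw = 0 (read), locality = 1 (little reuse expected).
               The result is the same whether or not the hint is honored. */
            __builtin_prefetch(&values[row_ids[i + PREFETCH_DISTANCE]], 0, 1);
        }
        total += values[row_ids[i]];
    }
    return total;
}
```

The prefetch never changes the result; it only affects when a cache block arrives. This is why such techniques can hide misses without altering the database algorithm itself.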
Query-Pipeline (Q-Pipe): This large long-term project introduces
a revolutionary staged design for high-performance, evolvable DBMSs
that are easy to tune and maintain. We break the database system into
modules and encapsulate them into self-contained stages connected
to each other through queues. The staged, data-centric design remedies
the weaknesses of modern DBMSs by providing solutions at both a hardware
and a software engineering level. There are multiple problems associated
with this new database system architecture, ranging from job queueing
and scheduling with multiple constraints to multi-query processing