PARALLEL DATA LAB 

PDL Abstract

Simple Adaptive Query Processing vs. Learned Query Optimizers: Observations and Analysis

Proceedings of the VLDB Endowment, Vol. 16, No. 11.

Yunjia Zhang*, Yannis Chronis*, Jignesh M. Patel, Theodoros Rekatsinas*

Carnegie Mellon University
* University of Wisconsin-Madison

http://www.pdl.cmu.edu/

There have been many decades of work on optimizing query processing in database management systems. Recently, modern machine learning (ML), and specifically reinforcement learning (RL), has gained increased attention as a means to develop a query optimizer (QO). In this work, we take a closer look at two recent state-of-the-art (SOTA) RL-based QO methods to better understand their behavior.We find that these RL-based methods do not generalize as well as it seems at first glance. Thus, we ask a simple question: How do SOTA RL-based QOs compare to a simple, modern, adaptive query processing approach? To answer this question, we choose two simple adaptive query processing techniques and implemented them in PostgreSQL. The first adapts an individual join operation on-the-fly and switches between a Nested Loop Join algorithm and a Hash Join algorithm to avoid sub-optimal join algorithm decisions. The second is a technique called Lookahead Information Passing (LIP), in which adaptive semijoin techniques are used to make a pipeline of join operations execute efficiently. To our surprise, we find that this simple adaptive query processing approach is not only competitive to the SOTA RL-based approaches but, in some cases, outperforms the RL-based approaches. The adaptive approach is also appealing because it does not require an expensive training step, and it is fully interpretable compared to the RL-based QO approaches. Further, the adaptive method works across complex query constructs that RL-based QO methods currently cannot optimize.

FULL PAPER: pdf