PARALLEL DATA LAB 

PDL Talk Series

June 12, 2024


TIME
: 12:00 noon - to approximately 1:00 pm EDT
PLACE: Virtual - a zoom link will be emailed closer to the seminar


SPEAKER: Pat Helland, Principal Architect
Salesforce

Scalable OLTP in the Cloud: What's the BIG DEAL?
The pursuit of scalable OLTP systems has been the holy grail of my career. Because OLTP systems are typically split into applications and databases, the isolation semantics provided by the DB and used by the app have a major impact on the scalability of the OLTP system as a whole. The isolation semantics are a BIG DEAL!

This thought experiment explores the asymptotic limits to scale for OLTP systems. An OLTP (OnLine Transaction Processing) system is a domain-specific application using a RCSI (READ COMMIT- TED SNAPSHOT ISOLATION) SQL database to provide transactions across many concurrent users. This interface provides the contractual BIG DEAL between OLTP databases and OLTP applications.

Focusing on the BIG DEAL, shows today’s popular databases unnecessarily limit scale. Similarly, we identify common app pat- terns that inhibit scale. We can reimagine the way we build both databases and applications to empower scale. All while complying with the established SQL and RCSI interface (i.e., the BIG DEAL).

Perhaps, this can provoke discussions within the database community leading to new opportunities for OLTP systems. To me, that would be a big deal!

PAPER: Scalable OLTP in the Cloud: What's the BIG DEAL?

BIO: Pat Helland has been building distributed systems, database systems, high-performance messaging systems, and multiprocessors since 1978, shortly after dropping out of UC Irvine without a bachelor's degree. That hasn't stopped him from having a passion for academics and publication. From 1982 to 1990, Pat was the chief architect for TMF (Transaction Monitoring Facility), the transaction logging and recovery systems for NonStop SQL, a message-based fault-tolerant system providing high-availability solutions for business critical solutions. In 1991, he moved to HaL Computers where he was chief architect for the Mercury Interconnect Architecture, a cache-coherent non-uniform memory architecture multiprocessor. In 1994, Pat moved to Microsoft to help the company develop a business providing enterprise software solutions. He was chief architect for MTS (Microsoft Transaction Server) and DTC (Distributed Transaction Coordinator). Starting in 2000, Pat began the SQL Service Broker project, a high-performance transactional exactly-once in-order message processing and app execution engine built deeply into Microsoft SQL Server 2005. From 2005-2007, he worked at Amazon on scalable enterprise solutions, scale-out user facing services, integrating product catalog feeds from millions of sellers, and highly-available eventually consistent storage. From 2007 to 2011, Pat was back at Microsoft working on a number of projects including Structured Streams in Cosmos. Structured streams kept metadata within the "big data" streams that were typically 10s of terabytes in size. This metadata allowed affinitized placement within the cluster as well as efficient joins across multiple streams. On launch, this doubled the work performed within the 250PB store. Pat also did the initial design for Baja, the distributed transaction support for a distributed event-processing engine implemented as an LSM atop structured streams providing transactional updates targeting the ingestion of "the entire web in one table" with changes visible in seconds. Starting in 2012, Pat has worked at Salesforce on database technology running within cloud environments. His current interests include latency bounding of online enterprise-grade transaction systems in the face of jitter, the management of metastability in complex environments, and zero-downtime upgrades to databases and stateful applications. In his spare time, Pat regularly writes for ACM Queue, Communications of the ACM, and various conferences. He has been deeply involved in the organization of the HPTS (High Performance Transactions Systems - www.hpts.ws) workshop since 1985. His blog is at pathelland.substack.com and he parsimoniously tweets with the handle @pathelland.


CONTACTS


Director, Parallel Data Lab
VOICE: (412) 268-1297


Executive Director, Parallel Data Lab
VOICE: (412) 268-5485


PDL Administrative Manager
VOICE: (412) 268-6716