PARALLEL DATA LAB 

PDL Abstract

Scaling the IO Wall with Declarative IO

20th USENIX Symposium on Operating Systems Design and Implementation (OSDI '26) will take place on July 13–15, 2026, in Seattle, WA.

Sanjith Athlur#*, Sara McAllister#†‡*, Theo Gregersen#, Timothy Kim#, Yiwei Chen#, Sarvesh Tandon#, Lucy Wang#, Daniel S. Berger#§, Saurabh Kadekodi‡, Arif Merchant‡, Benjamin Berg¶, Nathan Beckmann#, Rashmi Vinayak#, George Amvrosiadis#, Gregory R. Ganger#

# Carnegie Mellon University
† University of Wisconsin, Madison
‡ Google
§ Microsoft Azure and University of Washington
¶ UNC Chapel Hill

* equal contribution

http://www.pdl.cmu.edu/

HDD capacities will greatly increase over the next ten years, lowering cost-per-TB in large-scale storage systems. Unfortunately, device bandwidth will not grow proportionally to device capacity. Hence, storage systems will face an IO wall where the demand for HDD IO will outstrip supply.

We find that, surprisingly, between 45% and 70% of after-cache HDD IO demand for 6 hyperscalers comes from crucial maintenance tasks that ensure data reliability and efficiency (e.g., scrubbing, garbage collection). Unfortunately, caching maintenance tasks is ineffective: individual tasks have little reuse, and inter-task reuse is too far apart in time. Fortunately, maintenance tasks are flexible in the timing and ordering of their data accesses, and even in which data they access. However, the current imperative storage interface (e.g., read/write) hides maintenance tasks’ flexible nature. We propose Declarative IO, a new interface for distributed storage systems that allows developers to expose tasks’ flexibility to the storage system. This interface allows tasks to send a declaration to our distributed storage system, DINGO, specifying sets of data and their associated deadlines, such as “process all blocks of this device within 7 days”. In processing declarations, DINGO coordinates IO across different tasks to create timely data reuse. DINGO achieves 26–51% IO savings for maintenance task mixes corresponding to real hyperscalers, enabling the deployment of 1.7× larger HDDs than in imperative systems.
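
To make the idea of a declaration concrete, the following is a minimal illustrative sketch in Python. It is not DINGO's interface or scheduler: the `Declaration` class, the `plan_reads` and `io_savings` functions, the task names, and the earliest-deadline ordering are all assumptions made for illustration. The sketch shows only the general principle stated above: when tasks declare sets of blocks with deadlines instead of issuing individual reads, a block wanted by several tasks can be read once and shared.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Declaration:
    """Hypothetical declaration: a set of blocks plus a deadline."""
    task: str            # illustrative task name, e.g. "scrub" or "gc"
    blocks: frozenset    # IDs of the blocks the task must process
    deadline_days: int   # e.g. 7 for "within 7 days"


def plan_reads(declarations):
    """Return (block, tasks) pairs, one read per distinct block.

    Reads are ordered by the earliest deadline of any task that wants
    the block (an assumed policy), so one read serves every such task.
    """
    demand = {}  # block -> (earliest deadline, list of task names)
    for d in declarations:
        for block in d.blocks:
            deadline, tasks = demand.get(block, (d.deadline_days, []))
            demand[block] = (min(deadline, d.deadline_days), tasks + [d.task])
    ordered = sorted(demand.items(), key=lambda kv: (kv[1][0], kv[0]))
    return [(block, sorted(tasks)) for block, (_, tasks) in ordered]


def io_savings(declarations):
    """Fraction of reads avoided versus each task reading independently."""
    independent = sum(len(d.blocks) for d in declarations)
    if independent == 0:
        return 0.0
    return 1 - len(plan_reads(declarations)) / independent


# Toy input (made-up block IDs and deadlines, not data from the paper).
scrub = Declaration("scrub", frozenset({1, 2, 3, 4}), deadline_days=7)
gc = Declaration("gc", frozenset({3, 4, 5}), deadline_days=2)
plan = plan_reads([scrub, gc])
savings = io_savings([scrub, gc])
```

In this toy input, blocks 3 and 4 are wanted by both tasks, so the plan contains five reads where independent tasks would issue seven. The numbers are made up and say nothing about the savings reported for DINGO.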

FULL PAPER: pdf