PARALLEL DATA LAB 

PDL Talk Series

July 1, 2021


TIME: 12:00 noon to approximately 1:00 pm EDT
PLACE: Virtual - a Zoom link will be emailed closer to the seminar

SPEAKER: Saurabh Kadekodi
Google


DISK-ADAPTIVE REDUNDANCY: Tailoring Data Redundancy to Disk-reliability-heterogeneity in Cluster Storage Systems
Large-scale cluster storage systems typically consist of a heterogeneous mix of storage devices with significantly varying failure rates. Despite reliability differences of over 10x within the same storage cluster and the same storage tier, redundancy settings are generally configured in a one-scheme-for-all fashion. This dissertation paves the way for exploiting disk-reliability heterogeneity to tailor redundancy settings to different disk groups for cost-effective and safer redundancy.

The first contribution is the heterogeneity-aware redundancy tuner (HeART), an online tuning tool that actively tracks the disk hazard (bathtub) curve and identifies its boundaries and the steady-state failure rate for each deployed disk group by make/model. Using this information, HeART suggests the most space-efficient redundancy option allowed that will achieve the specified target data reliability. HeART was evaluated via simulation on a 100,000+ disk production storage cluster, where it met target reliability levels while requiring 11–33% fewer disks than traditional approaches. Despite these substantial space savings, HeART is unusable in certain real-world clusters because the IO load of redundancy transitions overwhelms the storage infrastructure, a problem termed transition overload. The second contribution of this dissertation is an in-depth analysis of millions of disks from Google, NetApp, and Backblaze to understand transition overload as a roadblock for disk-adaptive redundancy. Building on insights drawn from this analysis, the third contribution is Pacemaker, a low-overhead disk-adaptive redundancy orchestrator that mitigates transition overload by initiating transitions proactively and efficiently, in a manner that avoids urgency while preserving high space savings. Simulation of Pacemaker on four large (110K–450K disks) production clusters shows that the transition IO requirement decreases to <5% of cluster IO bandwidth (<0.5% on average). Pacemaker achieves this while providing overall space savings of 14–20% and never leaving data under-protected.
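To make the idea of picking "the most space-efficient redundancy option allowed" for a given disk group concrete, here is a minimal Python sketch. It is not taken from the talk or the dissertation: the loss model (independent disk failures within a fixed repair window), the candidate k-of-n schemes, the 3-day window, and all function names are assumptions chosen purely for illustration.

```python
from math import comb

def window_failure_prob(afr, window_days):
    """Probability that one disk fails within the repair window,
    assuming a constant annualized failure rate (AFR)."""
    return 1.0 - (1.0 - afr) ** (window_days / 365.0)

def stripe_loss_prob(k, n, afr, window_days):
    """Probability that a k-of-n stripe loses data, i.e. that more than
    n - k of its n disks fail independently within one repair window."""
    p = window_failure_prob(afr, window_days)
    tolerated = n - k
    return sum(
        comb(n, i) * p**i * (1.0 - p) ** (n - i)
        for i in range(tolerated + 1, n + 1)
    )

def pick_scheme(schemes, afr, target_loss, window_days=3.0):
    """Return the (k, n) scheme with the lowest storage overhead n / k
    whose modeled loss probability meets the target, or None."""
    feasible = [
        (k, n)
        for k, n in schemes
        if stripe_loss_prob(k, n, afr, window_days) <= target_loss
    ]
    if not feasible:
        return None
    return min(feasible, key=lambda s: s[1] / s[0])

# Hypothetical candidate schemes as (data chunks k, total chunks n).
SCHEMES = [(6, 9), (10, 14), (30, 33)]

if __name__ == "__main__":
    # Two hypothetical disk groups whose steady-state AFRs differ by 10x.
    for afr in (0.01, 0.10):
        print(afr, pick_scheme(SCHEMES, afr, target_loss=1e-11))
```

Under this toy model, a disk group with a lower failure rate can meet the same target with a wider, lower-overhead scheme, which is the intuition behind tailoring redundancy to each disk group rather than using one scheme for all.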

The final contribution is the design and implementation of disk-adaptive redundancy techniques from Pacemaker in the widely used HDFS. This prototype repurposes HDFS’s existing architectural components for disk-adaptive redundancy and leverages the robustness of the existing code. The repurposed components are fundamental to any distributed storage system’s architecture, allowing this prototype to also serve as a guideline for other systems to incorporate disk-adaptive redundancy.

BIO: Saurabh is a visiting faculty researcher at Google Research, where he works in the Storage Analytics team. His research primarily revolves around storage systems, both local and distributed. At Google, he has recently started working on ML-aided fault tolerance. He completed his PhD at CMU at the end of 2020. Before that, he earned his master's degree at Northwestern University and his bachelor's degree at Pune Institute of Computer Technology in Pune, India.


CONTACTS


Director, Parallel Data Lab
VOICE: (412) 268-1297


Executive Director, Parallel Data Lab
VOICE: (412) 268-5485


PDL Administrative Manager
VOICE: (412) 268-6716