Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-21-101, July 2021.
Qing Zheng, Chuck Cranor, Greg Ganger, Garth Gibson, George Amvrosiadis, Brad Settlemyer†, Gary Grider†
Carnegie Mellon University
† Los Alamos National Laboratory
High-Performance Computing (HPC) is synonymous with massive concurrency. Yet on a large computing platform it is difficult for a parallel filesystem's control plane to utilize CPU cores when every process's metadata mutation is globally synchronized and serialized against every other process's mutations. We present DeltaFS, a new paradigm for distributed filesystem metadata. DeltaFS allows jobs to self-commit their namespace changes to logs, avoiding the cost of global synchronization. Follow-up jobs selectively merge logs produced by previous jobs as needed, a principle we term No Ground Truth that enables more efficient data sharing. By following this principle, DeltaFS exploits the parallelism available at the nodes where job processes run, improving metadata operation throughput by up to 98x, a factor that grows as the number of job processes increases. DeltaFS enables efficient inter-job communication, reducing overall workflow runtime by significantly improving client metadata operation latency and reducing resource usage by up to 52.4x.
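As a rough conceptual sketch of the No Ground Truth idea (not DeltaFS's actual API; all names and structures below are hypothetical), each job could append its own namespace mutations to a private log with no global coordination, and a follow-up job could construct its view of the namespace by replaying only the logs of the jobs whose output it needs:

```python
# Hypothetical sketch of per-job, self-committed namespace logs and
# selective merging by a follow-up job. Illustrative only; these are
# not DeltaFS's real interfaces.

class JobNamespaceLog:
    """Each job records its own namespace mutations locally,
    without synchronizing against other jobs."""

    def __init__(self, job_id):
        self.job_id = job_id
        self.entries = []          # ordered list of (op, path, attrs)

    def create(self, path, attrs=None):
        self.entries.append(("create", path, attrs or {}))

    def delete(self, path):
        self.entries.append(("delete", path, {}))


def merge_logs(logs):
    """A follow-up job builds its namespace view by replaying only the
    logs it selects, rather than consulting one globally agreed-upon
    namespace ("no ground truth")."""
    namespace = {}
    for log in logs:               # selection order defines precedence
        for op, path, attrs in log.entries:
            if op == "create":
                namespace[path] = attrs
            elif op == "delete":
                namespace.pop(path, None)
    return namespace


# Example: two producer jobs commit independently; a consumer job
# merges only the logs it cares about.
job_a, job_b = JobNamespaceLog("sim-001"), JobNamespaceLog("sim-002")
job_a.create("/out/particles.0")
job_b.create("/out/particles.1")
view = merge_logs([job_a, job_b])
print(sorted(view))   # ['/out/particles.0', '/out/particles.1']
```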
KEYWORDS: Distributed filesystem metadata, massively-parallel computing, data storage