PARALLEL DATA LAB 

PDL Abstract

Lessons from Profiling and Optimizing Placement in AMR Codes

2025 IEEE International Conference on Cluster Computing (CLUSTER), United Kingdom, 2025.

Ankush Jain†, Charles D. Cranor†, Qing Zheng‡, Dominic Manno‡, George Amvrosiadis†, Gary A. Grider‡

†Carnegie Mellon University,
‡Los Alamos National Laboratory

http://www.pdl.cmu.edu/

Block-structured Adaptive Mesh Refinement (AMR), while essential for improving efficiency in large-scale irregular and dynamic simulations, poses unique optimization challenges. Previous work has identified load imbalance and synchronization overhead as key obstacles to performance, but the deep understanding of complex runtime behavior needed to address them systematically remains elusive. In this paper, we integrate telemetry collection, analysis, and intervention to bridge this understanding gap. Establishing reliable, actionable telemetry required systematic tuning to eliminate cross-stack performance anomalies. Leveraging this foundation, we design CPLX, a tunable placement policy that balances compute load and communication locality, improving runtime by up to 21.6% over optimized baselines. Our experience highlights the empirical nature of placement optimization, which requires theoretical models to be grounded in observed runtime behavior.

FULL PAPER: pdf